developing a geodynamics simulator with...

26
Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1 , Richard F. Katz 2 , and Barry Smith 1 1 Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, [knepley,bsmith]@mcs.anl.gov 2 Department of Earth and Environmental Sciences, Lamont Doherty Earth Observatory, Palisades, NY, [email protected] Summary. Most high-performance simulation codes are not written from scratch but begin as desktop experiments and are subsequently migrated to a scalable, parallel paradigm. This tran- sition can be painful, however, because the restructuring required in conversion forces most authors to abandon their serial code and begin an entirely new parallel code. Starting a parallel code from scratch has many disadvantages, such as the loss of the original test suite and the introduction of new bugs. We present a disciplined, incremental approach to parallelization of existing scientific code using the PETSc framework. In addition to the parallelization, it allows the addition of more physics (in this case strong nonlinearities) without the user having to program anything beyond the new pieces of discretization code. Our approach permits users to easily develop and experiment on the desktop with the same code that scales efficiently to large clusters with excellent parallel performance. As a motivating example, we present work integrating PETSc into an existing plate tectonic subduction code. 1 Geodynamics of Subduction Zones Subduction zones, where one of the Earth’s surface plates collides with another and sinks into the deep mantle (Fig. 1a) [20], are the locus of many of the world’s most devastating natural disasters, especially volcanic eruptions and earthquakes. Subduc- tion zone volcanism such as the 1980 eruption of Mount St. Helens is characterized by violent explosions of ash and rock. Despite the relevance of this type of volcanism to problems ranging from public safety to global climate change and mass-extinction events in the geologic record, a detailed understanding of its source is lacking. Clues are abundant, however, in the rocks erupted from subduction zone volcanos which record a history of formation, transport, and eruption in their distinct geochemistry. The complexity of these processes in terms of the governing reactive thermochemical fluid dynamics and non-Newtonian rheology is significant; simplified models have proven inadequate in explaining basic observations [18]. More sophisticated PDE-based computational models are needed to address the sharp nonlinearities typically unexplored in past work. The large separation in length and time scales of the constituent physics implies a great computational cost due to the need for fine meshes to resolve the small (but relevant) scales. Moreover, the

Upload: others

Post on 17-Jun-2020

9 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

Developing a Geodynamics Simulator with PETSc

Matthew G Knepley1 Richard F Katz2 and Barry Smith1

1 Mathematics and Computer Science Division Argonne National Laboratory Argonne IL[knepleybsmith]mcsanlgov

2 Department of Earth and Environmental Sciences Lamont Doherty Earth ObservatoryPalisades NYkatzldeocolumbiaedu

Summary Most high-performance simulation codes are not written from scratch but begin asdesktop experiments and are subsequently migrated to a scalable parallel paradigm This tran-sition can be painful however because the restructuring required in conversion forces mostauthors to abandon their serial code and begin an entirely new parallel code Starting a parallelcode from scratch has many disadvantages such as the loss of the original test suite and theintroduction of new bugs We present a disciplined incremental approach to parallelizationof existing scientific code using the PETSc framework In addition to the parallelization itallows the addition of more physics (in this case strong nonlinearities) without the user havingto program anything beyond the new pieces of discretization code Our approach permits usersto easily develop and experiment on the desktop with the same code that scales efficiently tolarge clusters with excellent parallel performance As a motivating example we present workintegrating PETSc into an existing plate tectonic subduction code

1 Geodynamics of Subduction Zones

Subduction zones where one of the Earthrsquos surface plates collides with another andsinks into the deep mantle (Fig 1a) [20] are the locus of many of the worldrsquos mostdevastating natural disasters especially volcanic eruptions and earthquakes Subduc-tion zone volcanism such as the 1980 eruption of Mount St Helens is characterizedby violent explosions of ash and rock Despite the relevance of this type of volcanismto problems ranging from public safety to global climate change and mass-extinctionevents in the geologic record a detailed understanding of its source is lacking Cluesare abundant however in the rocks erupted from subduction zone volcanos whichrecord a history of formation transport and eruption in their distinct geochemistryThe complexity of these processes in terms of the governing reactive thermochemicalfluid dynamics and non-Newtonian rheology is significant simplified models haveproven inadequate in explaining basic observations [18]

More sophisticated PDE-based computational models are needed to address thesharp nonlinearities typically unexplored in past work The large separation in lengthand time scales of the constituent physics implies a great computational cost due tothe need for fine meshes to resolve the small (but relevant) scales Moreover the

2 Matthew G Knepley et al

highly nonlinear character of the non-Newtonian temperature-dependent viscositydemands costly solution algorithms This high computational complexity coupledwith long development times have put such models out of reach for most geoscien-tists The Portable Extensible Toolkit for Scientific Computation (PETSc) [5] makesit easier for geoscientists to overcome these barriers In an abstract sense PETSc pro-vides a framework for collaboration between geoscientists with complex modelingproblems and numerical analysts and software engineers who have the encapsulatednumerical methods for solving those problems In the case described here the orig-inal model was a specialized single linear solver serial code capable of calculatingthe thermal structure due to an analytically prescribed isoviscous flow field By port-ing this code into the PETSc framework we were able to extend it to solve for fullycoupled thermal structure and non-Newtonian flow with a choice of many scalableparallel solvers flexible boundary conditions convenient parameter input real-timecode steering and many other features not available with the serial version This wasdone using an incremental process by first replacing the custom linear solver withPETScrsquos general purpose nonlinear solver then replacing the sequential data struc-tures with PETScrsquos parallel ones and then finally adding the additional nonlinearphysics (to the portion of the code that discretizes the PDE)

η=η(PTV)

~100 km

Subducting slab

No Slip

Subducting slab

No Slip

Stre

ss fr

ee

Dislocation and diffusion creep

Distance from trench

Dep

th

Vslab

Mantle wedge

Continental crust

Continental crust

Mantle wedge

Zero stressfault

(a)

(b)

Fig 1 (a) Schematic diagram of a subduction zone Oceanic lithosphere is colliding withcontinental lithosphere and being subducted Volatile compounds are released from the sub-ducting slab at depth and enter the mantle wedge lowering the melting temperature of therock and causing partial melting This melt rises to feed volcanos at the surface(b) Schematicdiagram of the computational domain Boundary conditions on the flow field are imposed atthe bottom of the crust the top of the slab and the intersection of the mantle wedge andthe domain boundary Flow velocity within the mantle wedge is the solution to equation (1)Potential temperature throughout the domain is the solution to equation (3) The ldquozero stressfaultrdquo represents a frictional sliding surface On geologic time scales all stress on this boundaryis relieved in earthquakes and does not cause deformation

Any model of subduction zone volcanism must be based on the thermal andflow structure of the slowly moving solid mantle wedge shown in Fig 1 While themagnesium and iron-rich mantle rock is solid on human time-scales (as evidenced bythe seismic waves it transmits) on geologic time-scales it undergoes two modes of

Developing a Geodynamics Simulator with PETSc 3

solid-state creep [14] Its motion can be described by the Stokes equation for steadyflow of an incompressible highly viscous fluid (with zero Reynolds number)

nablaP = nablamiddot[η

(nablaV + nablaVT

)] st nablamiddot V = 0 (1)

η = (1ηdisl + 1ηdifn)minus1 (2)

whereP is the fluid pressureV is the mantle velocity fieldηdifn is the diffusioncreep viscosity andηdisl is the dislocation creep viscosity Both creep mechanismsgive an Arrhenius-type dependence on pressure and temperature Dislocation creephas an additional non-Newtonian strain rate dependence The strength of the nonlin-earity of viscosity makes this equation difficult to solve without a good initial guessIn our code the initial guess in provided by a continuation method we modify equa-tion (1) to letη rarr ηα whereα is a number between zero and one The continuationmethod described in section 51 variesα over a sequence of solves of increasingnonlinearity to reach natural variation in viscosity The solution at each step is usedas a guess for the following solve The first step in this process is to solve the problemfor constant viscosity (α=0) where the system of equations is linear and analyticallytractable

The production of molten rock depends fundamentally on the mantle temperaturefield The distribution of heat is governed by the conservation of enthalpy expressedas an advection-diffusion equation

partθ

partt+ V middotnablaθ = κnabla2θ (3)

whereθ is the mantle potential temperature andκ is the thermal diffusivity of themantle The mantle has a very low thermal diffusivity compared to materials such asmetal or water and thus the advection term dominates in equation (3)

For 0 lt α le 1 equations (1) and (3) form a nonlinear set coupled through theviscosity The solution of these equations is the first stage in modeling magma gen-esis in subduction zones Future work will incorporate equations of porous flow ofvolatiles and magma through the mantle reactive melting and geochemical trans-port Even without these complexities however we have achieved interesting resultswith the simple though highly nonlinear set of equations given above Some of theseresults are presented below after an in-depth look at the application developmentprocess in the PETSc framework

2 Integrating PETSc

PETSc is a set of library interfaces built in a generally hierarchical fashion A usermay decide to use some libraries and disregard others Fig 2 illustrates this hierarchyof dependencies For instance a user can use only the PETSc linear algebra (vectorsand matrices) libraries or add linear solvers to those or add nonlinear solvers to theentire group Thus integration may proceed in several stages which we discuss inthe following sections The first step is to incorporate the PETSc libraries into the

4 Matthew G Knepley et al

existing compile system (iemake) and to initialize the PETSc runtime This maybe done in two ways

Fig 2 Interface hierarchy in a PETSc application

The simplest approach is to adopt the PETSc makefile structure which is pre-sented in Fig 3 for an application using the nonlinear solver package The user in-cludes thebmakecommonbase which defines both rules for compiling sourceand variables that are used during the compile and link Then only a simple rule forthe executable is necessary using the appropriate PETSc library variable Individualcompilation can be customized with the variables shown at the top of the makefile

Users with large existing build systems may choose not to inherit the PETScmake rules but instead use a lower-level interface based only on PETSc make vari-ables A boilerplate example is given in Fig 4 The user now includes only thebmakecommonvariables file which defines the make variables but does notprescribe any rules for compilation or linking The variables provide informationabout all compilation flags libraries and external packages necessary to link withPETSc

Before any PETSc code can be run the user must callPetscInitialize() and likewise after all PETSc code has completed the user must callPetscFinalize () This is analogous to the requirements for using MPI In fact if the userhas not done so already PETSc will handle the initialization and cleanup of MPIautomatically in these routines A simple C driver is shown in Fig 5

The initialization call provides the command line arguments to PETSc for pro-cessing and can take an optional help string for the user (printed when the-help op-tion is given) The finalization call frees any resources used by PETSc and provides

Developing a Geodynamics Simulator with PETSc 5

CFLAGS =FFLAGS =CPPFLAGS=FPPFLAGS=

include $PETSCDIRbmakecommonbase

ex1 ex1o utilo chkoptsminus$CLINKER minuso $ $ˆ $PETSCSNES LIB$RM $ˆ

ex1f ex1fo physo chkoptsminus$FLINKER minuso $ $ˆ $PETSCFORTRAN LIB $PETSCSNES LIB$RM $ˆ

Fig 3Typical PETSc makefile

include $PETSCDIRbmakecommonvariables

comassageCpl $lt$CC minusc $MY CFLAGS $COPTFLAGS $CFLAGS $CCPPFLAGS $lt

FomassageFortranpl $lt$FC minusc $MY FFLAGS $FOPTFLAGS $FFLAGS $FCPPFLAGS $lt

ex1 ex1o utilominus$CLINKER minuso $ $ˆ $PETSCSNES LIB$RM $lt

ex1f ex1fo physominus$FLINKER minuso $ $ˆ $PETSCFORTRAN LIB $PETSCSNES LIB$RM $lt

Fig 4Boilerplate custom makefile using PETSc

summary logging and diagnostic information most notably performance profilingThe equivalent Fortran driver is shown in Fig 6

Notice that the command line arguments are now obtained directly from the For-tran runtime library rather than the user code Once these calls are inserted intothe application code the user can verify a successful link and run with the PETSclibraries

6 Matthew G Knepley et al

static char help[ ] = Boilerplate PETSc Examplenn

include petsch

int main(int argc char argv)

ierr = PetscInitialize(ampargc ampargv (char ) 0 help)CHKERRQ(ierr)

User code

return PetscFinalize()

Fig 5Boilerplate C driver for PETSc

program mainimplicit none

include includefincludepetsch

integer ierr

call PetscInitialize(PETSCNULL CHARACTER ierr)

User code

call PetscFinalize(ierr)end

Fig 6Boilerplate Fortran driver for PETSc

3 Data Distribution and Linear Algebra

The PETSc solvers (discussed in Section 4) require the user to provide C or Fortranroutines that compute the residual of the equation they wish to solve (after they havediscretized the PDE) and optionally the Jacobian of that residual function We re-fer to these functions generically asFormFunction() andFormJacobian() For nonlinear problems the PETSc solvers use truncated Newton methods with linesearches or trust-regions for robustness and possibly approximate or incomplete Ja-cobians for efficiency

The central objects in any PETSc simulation are the abstractions from linearalgebra vectors and matrices or in PETSc terms theVec andMat classes Theseform the basis of all solver and preconditioner interfaces as well as providing the linkto user-supplied discretization and physics routines Thus the most important stepin the migration of an application to the PETSc framework is the incorporation of its

Developing a Geodynamics Simulator with PETSc 7

linear algebra and data distribution abstractions Two obvious strategies emerge foraccomplishing this integration a gradual approach that seeks to minimize the changeto existing code and a more aggressive approach that leverages as much of the PETSctechnology as possible The next two subsections address these complementary pathstoward integration We use the simple example of the solid-fuel ignition problem orBratu problem to illustrate the gradual evolution of a simulation and then we givean example of mantle subduction for the more aggressive approach

31 Gradual Evolution

A new PETSc user with very complicated code which perhaps was originally writ-ten by someone else may opt to make as few changes as possible when moving toPETSc PETSc facilitates this approach with low-level interfaces to bothVec andMat objects that closely resemble common serial data structures Vectors are espe-cially easy to migrate because the default PETSc storage format contiguous arrayson each process is that most common in serial applications Matrices are somewhatmore complicated and in order to hide the actual matrix data storage format PETScrequires the user to access values through a functional interface [6]

Let us examine the Bratu problem as an example of integrating PETSc vectorsinto an existing simulation [2] This problem is modeled by the partial differentialequation

minus∆uminus λeu = 0 (4)

We take the domain to be the unit square and impose homogeneous Dirichlet bound-ary conditions on the edges We discretize the equation using a 5-point stencil finitedifference scheme which results in a set of nonlinear algebraic equationsF (uh) = 0where we useuh to indicate the discrete solution vector In Fig 7 we present a For-tran 90 routine that calculates the residualF as a function of the input vectoruhIt also takes a user context as input that holds the domain information and problemcoefficient

Notice that the boundary conditions on the residual are of the formuminusuΓ whereuΓ are the specified boundary values because we are driving the residualF (u) tozero An alternative would be to eliminate those variables altogether substitutingthe boundary values in any interior calculation However this technique currentlyimposes a greater burden on the programmer

When integrating PETSc vectors into this code we can let Fortran allocate thestorage or we can use PETSc for allocation In Fig 8 we let PETSc managethe memory The input vectors already have storage allocated by PETSc whichcan be accessed as an F90 array by using theVecGetArrayF90() functionor in C by usingVecGetArray() A sampledriver() routine shows howPETSc allocates vector storage If we let Fortran manage memory then we useVecCreateSeqWithArray() orVecCreateMPIWithArray() in parallelto create theVec objects

During a Newton iteration to solve this equation we would require the JacobianJ of the mappingF Rather than being accessed as a multidimensional array the

8 Matthew G Knepley et al

module f90moduletype userctx

The start end and number of vertices in the xminus and yminusdirectionsinteger xsxeysyeinteger mxmydouble precision lambda

end type userctxcontainsend module f90module

subroutine FormFunction(uFuserierr)use f90moduletype (userctx) userdouble precision u(userxsuserxeuserysuserye)double precision F(userxsuserxeuserysuserye)double precision twoonehxhyhxdhyhydhxscuijuxxuyyinteger ijierr

hx = 10dble(usermxminus1)hy = 10dble(usermyminus1)sc = hxhyuserlambdahxdhy = hxhyhydhx = hyhx

do 20 j=userysuseryedo 10 i=userxsuserxe

Apply boundary conditionsif (i == 1 or j == 1 or i == usermx or j == usermy) then

F(ij) = u(ij) Apply finite difference scheme

elseuij = u(ij)uxx = hydhx (20uij minus u(iminus1j) minus u(i+1j))uyy = hxdhy (20uij minus u(ijminus1) minus u(ij+1))F(ij) = uxx + uyy minus scexp(uij)

endif10 continue20 continue

Fig 7Residual calculation for the Bratu problem

Developing a Geodynamics Simulator with PETSc 9

subroutine driver(uFuserierr)implicit none

Vec uFint Nparameter(N=10000)

call VecCreate(PETSCCOMM WORLDuierr)call VecSetSizes(uPETSCDECIDENierr)call VecDuplicate(uFierr)call SolverLoop(uFierr)call VecDestroy(uierr)call VecDestroy(Fierr)end

subroutine FormFunctionPETSc(uFuserierr)use f90moduleimplicit none

Vec uFtype (userctx) userdouble precisionpointer u v()f v()

call VecGetArrayF90(uu vierr)call VecGetArrayF90(Ff vierr)call FormFunction(u vf vuserierr)call VecRestoreArrayF90(uu vierr)call VecRestoreArrayF90(Ff vierr)end

Fig 8Driver for the Bratu problem using PETSc

PETScMat object provides theMatSetValues () function to set logically denseblocks of values into the structure A function that computes the Jacobian and storesit in a PETSc matrix is shown in Fig 9 With this addition the user can now runserial code using the PETSc solvers (details are given in Section 4) Note that thestorage mechanism is the only change to user code necessary for its incorporationinto the PETSc framework using the wrapper in Fig 8

Although PETSc removes the need for low-level parallel programming such asdirect calls to the MPI library the user must still identify parallelism inherent inthe application and must structure the computation accordingly The first step is topartition the domain and have each process calculate the residual and Jacobian onlyover its local piece This is easily accomplished by redefining thexs xe ys andyevariables on each process converting loops over the entire domain into loops overthe local domain However this method uses nearest-neighbor information and thus

10 Matthew G Knepley et al

subroutine FormJacobian(ujacuserierr)use f90moduletype (userctx) userMat jacdouble precision x(userxsuserxeuserysuserye)double precision twoonehxhyhxdhyhydhxscv(5)integer rowcol(5)ijierr

one = 10two = 20hx = onedble(usermxminus1)hy = onedble(usermyminus1)sc = hxhyuserlambdahxdhy = hxhyhydhx = hyhxdo 20 j=userysuserye

row = (j minus userys)userxm minus 1do 10 i=userxsuserxe

row = row + 1 boundary points

if (i == 1 or j == 1 or i == usermx or j == usermy) thencol(1) = rowv(1) = onecall MatSetValues(jac1row1colvINSERT VALUESierr)

interior grid pointselse

v(1) = minushxdhyv(2) = minushydhxv(3) = two(hydhx + hxdhy) minus scexp(x(ij))v(4) = minushydhxv(5) = minushxdhycol(1) = row minus userxmcol(2) = row minus 1col(3) = rowcol(4) = row + 1col(5) = row + userxmcall MatSetValues(jac1row5colvINSERT VALUESierr)

endif10 continue20 continue

Fig 9Jacobian calculation for the Bratu problem using PETSc

Developing a Geodynamics Simulator with PETSc 11

we must have access to some values not present locally on the process as shown inFig 10

Box-type stencil Star-type stencil

Proc 6

Proc 0 Proc 0Proc 1 Proc 1

Proc 6

Fig 10Star and box stencils using a DA

We must store values for theseghostpoints locally so that they may be used forthe computation In addition we must ensure that these values are identical to the cor-responding values on the neighboring processes At the lowest level PETSc providesghosted vectors created usingVecCreateGhost () with storage for some valuesowned by other processes These values are explicitly indicated during constructionTheVecGhostUpdateBegin () andVecGhostUpdateEnd () functions trans-fer values between ghost storage and the corresponding storage on other processesThis method however can become complicated for the user Therefore for the com-mon case of a logically rectangular grid PETSc provides theDA object to managethe determination allocation and coherence of ghost values TheDA object is dis-cussed fully in Section 32 but we give here a short introduction in the context of theBratu problem

TheDA object represents a distributed structured (possibly staggered) grid andthus has more information than our serial grid If we augment our user context toinclude information about ghost values we can obtain all the grid information fromtheDA Figure 11 shows the creation of aDA when one initially know only the totalnumber of vertices in thex andy directions Alternatively we could have specifiedthe local number of vertices in each direction

Now we need only modify our PETSc residual routine to update the ghost valuesas shown in Fig 12

Notice that the ghost value scatter is a two-step operation This allows thecommunication to overlap with local computation that may be taking place whilethe messages are in transit The only change necessary for the residual calcu-lation itself is the correct declaration of the input arraydouble precision

12 Matthew G Knepley et al

type userctxDA da

The start end and number of vertices in the xminusdirectioninteger xsxexm

The start end and number of ghost vertices in the xminusdirectioninteger gxsgxegxm

The start end and number of vertices in the yminusdirectioninteger ysyeym

The start end and number of ghost vertices in the yminusdirectioninteger gysgyegym

The number of vertices in the xminus and yminusdirectionsinteger mxmy

The MPI rank of this processinteger rank

The coefficient in the Bratu equationdouble precision lambda

end type userctx

call DACreate2d(PETSCCOMM WORLDDA NONPERIODICDA STENCIL STAR ampamp usermxusermyPETSCDECIDEPETSCDECIDE11 ampamp PETSCNULL INTEGERPETSCNULL INTEGERuserdaierr)call DAGetCorners(userdauserxsuserysPETSCNULL INTEGER amp

amp userxmuserymPETSCNULL INTEGERierr)call DAGetGhostCorners(userdausergxsusergys amp

amp PETSCNULL INTEGERusergxmusergym ampamp PETSCNULL INTEGERierr)

Here we shift the starting indices up by one so that we can easily use the Fortran convention of1minusbased indices rather than0minusbased

userxs = userxs+1userys = userys+1usergxs = usergxs+1usergys = usergys+1userye = userys+userymminus1userxe = userxs+userxmminus1usergye = usergys+usergymminus1usergxe = usergxs+usergxmminus1

Fig 11Creation of a DA for the Bratu problem

Developing a Geodynamics Simulator with PETSc 13

subroutine FormFunctionPETSc(uFuserierr)implicit none

Vec uFinteger ierrtype (userctx) user

double precisionpointer lu v()lf v()Vec uLocal

call DAGetLocalVector(userdauLocalierr)call DAGlobalToLocalBegin(userdauINSERT VALUESuLocalierr)call DAGlobalToLocalEnd(userdauINSERT VALUESuLocalierr)

call VecGetArrayF90(uLocallu vierr)call VecGetArrayF90(Flf vierr)

Actually compute the local portion of the residualcall FormFunctionLocal(lu vlf vuserierr)

call VecRestoreArrayF90(uLocallu vierr)call VecRestoreArrayF90(Flf vierr)

call DARestoreLocalVector(userdauLocalierr)

Fig 12Parallel residual calculation for the Bratu problem using PETSc

u(usergxsusergxeusergysusergye) which now includes ghostvalues

The changes toFormJacobian() are similar resizing the input array andchanging slightly the calculation of row and column indices and can be found inthe PETSc example source We have now finished the initial introduction of PETSclinear algebra and the simulation is ready to run in parallel

32 Rapid Evolution

The user who is willing to raise the level of abstraction in a code can start by em-ploying theDA to describe the problem domain and discretization The data interfacemimics a multidimensional array and is thus ideal for finite difference finite volumeand low-order finite element schemes on a logically rectangular grid In fact after thedefinition of aFormFunction() routine the simulation is ready to run becausethe nonlinear solver provides approximate Jacobians automatically as discussed inSection 4

14 Matthew G Knepley et al

DASetLocalFunction(userda (DALocalFunction1) FormFunctionLocal)

Fig 13Using the DA to form a function

int FormFunctionLocal(DALocalInfo infodouble udouble fAppCtx user)double two = 20hxhyhxdhyhydhxscdouble uijuxxuyyint ij

hx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhyuserminusgtlambdahxdhy = hxhyhydhx = hyhx Compute function over the locally owned part of the grid for (j=infominusgtys jltinfominusgtys+infominusgtym j++)

for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++) if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

f[j][i] = u[j][i] else

uij = u[j][i]uxx = (twouij minus u[j][iminus1] minus u[j][i+1])hydhxuyy = (twouij minus u[jminus1][i] minus u[j+1][i])hxdhyf[j][i] = uxx + uyy minus scexp(uij)

Fig 14FormFunctionLocal() for the Bratu problem

We begin by examining the Bratu problem from the last section only this time inC We can now provide our local residual routine to theDA see Fig 13

The grid information is passed toFormFunctionLocal() in a DALocalInfo structure as shown in Fig 14

The fields are passed directly as multidimensional C arrays complete with ghostvalues An application scientist can use a boilerplate example such as the Bratu prob-lem merely altering the local physics calculation inFormFunctionLocal() torapidly obtain a parallel scalable simulation that runs on the desktop as well as onmassively parallel supercomputers

If the user has an expression for the Jacobian of the residual function then thismatrix can also be computed by using theDA interface with the call as in Fig 15and the corresponding routine for the local calculation in Fig 16

Developing a Geodynamics Simulator with PETSc 15

DASetLocalJacobian(userda (DALocalFunction1) FormJacobianLocal)

Fig 15Using the DA to form a Jacobian

int FormJacobianLocal(DALocalInfo infodouble xMat jacAppCtx user)MatStencil col[5]rowdouble lambdav[5]hxhyhxdhyhydhxscint ij

lambda = userminusgtparamhx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhylambdahxdhy = hxhy hydhx = hyhx

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

rowj = j rowi = i boundary points if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

v[0] = 10MatSetValuesStencil(jac1amprow1amprowvINSERT VALUES)

else interior grid points

v[0] = minushxdhy col[0]j = j minus 1 col[0]i = iv[1] = minushydhx col[1]j = j col[1]i = iminus1v[2] = 20(hydhx + hxdhy) minus scexp(x[j][i]) col[2]j = j col[2]i = iv[3] = minushydhx col[3]j = j col[3]i = i+1v[4] = minushxdhy col[4]j = j + 1 col[4]i = iMatSetValuesStencil(jac1amprow5colvINSERT VALUES)

MatAssemblyBegin(jacMAT FINAL ASSEMBLY)MatAssemblyEnd(jacMAT FINAL ASSEMBLY)MatSetOption(jacMAT NEW NONZERO LOCATION ERR)

Fig 16FormJacobianLocal() for the Bratu problem

16 Matthew G Knepley et al

int FormFunctionLocal(DALocalInfo infoField xField fvoid ptr)AppCtx user = (AppCtx)ptrParameter param = userminusgtparamGridInfo grid = userminusgtgriddouble mag w mag uint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1 i j

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

calculateXMomentumResidual(i j x f user)calculateZMomentumResidual(i j x f user)calculatePressureResidual(i j x f user)calculateTemperatureResidual(i j x f user)

Fig 17FormFunctionLocal() for the mantle subduction problem

We have so far presented simple examples involving a perturbed Laplacian butthis development strategy extends far beyond toy problems We have developedlarge-scale parallel geophysical simulations [1] that are producing new results inthe field We began by replacingFormFunctionLocal() in the Bratu examplewith one containing the linear physics of the previous sequential linear subductioncode (dropping the previous codersquos linear solve) Then the PETSc solver results werecompared for correctness with the complete prior subduction code Next the newcode was immediately run in parallel for correctness and efficiency tests Finallythe nonlinear terms from the more complicated physics of the mantle subductionproblem discussed in Section 1 and shown in Figs 17-21 and better quality dis-cretizations of the boundary conditions were added In fact since the structure oftheFormFunction() changed significantly the finalFormFunction() looksnothing like the original but the process of obtaining through the incremental ap-proach reduced the learning curve for PETSc and make correctness checking at thevarious stages much easier

The details of the subduction code are not as important as the recognition thata very complicated physics problem need not involve any more complication in oursimulation infrastructure To give more insight into the actual calculation we showin Fig 22 the explicit calculation of the residual from the continuity equation

Here we have used theDA for a multicomponent problem on a staggered meshwithout alteration because the structure of the discretization remains logically rect-angular

Although theDA manages communication during the solution process data post-processing often necessary for visualization or analysis also requires this function-ality For the mantle subduction simulation we want to calculate both the second in-

Developing a Geodynamics Simulator with PETSc 17

int calculateXMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj) f[j][i]u = x[j][i]u minus SlabVel(rsquoUrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]u = x[j][i]u minus 00

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else

f[j][i]u = XNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]u = XMomentumResidual(xijuser)

else in the mantle wedge f[j][i]u = XMomentumResidual(xijuser)

Fig 18X-momentum residual for the mantle subduction problem

variant of the strain rate tensor and the viscosity over the grid (at both cell centers andcorners) Using theDAGlobalToLocalBegin () andDAGlobalToLocalEnd() functions we transferred the global solution data to local ghosted vectors and pro-ceeded with the calculation We can also transfer data from local vectors to a globalvector usingDALocalToGlobal () The entire viscosity calculation is given inFig 23

Note that this framework allows the user to manage arbitrary fields over the do-main For instance a porous flow simulation might manage material properties ofthe medium in this fashion

4 Solvers

The SNES object in PETSc is an abstraction of the ldquoinverserdquo of a nonlinear oper-ator The user provides aFormFunction() routine as seen in Section 3 whichcomputes the action of the operator on an input vector Linear problems can also be

18 Matthew G Knepley et al

int calculateZMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1jlim = infominusgtmyminus1

if (ilt=j) f[j][i]w = x[j][i]w minus SlabVel(rsquoWrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]w = x[j][i]w minus 00

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else

f[j][i]w = ZNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]w = ZMomentumResidual(xijuser)

else in the mantle wedge f[j][i]w = ZMomentumResidual(xijuser)

Fig 19Z-momentum residual for the mantle subduction problem

int calculatePressureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj | | jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lid or slab f[j][i]p = x[j][i]p

else if ((i==ilim | | j==jlim ) ampamp paramminusgtibound==BC ANALYTIC ) on an analytic boundary f[j][i]p = x[j][i]p minus Pressure(ijuser)

else in the mantle wedge f[j][i]p = ContinuityResidual(xijuser)

Fig 20Pressure or continuity residual for the mantle subduction problem

Developing a Geodynamics Simulator with PETSc 19

int calculateTemperatureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (j==0) on the surface f[j][i]T = x[j][i]T + x[j+1][i]T + PetscMax(x[j][i]T00)

else if (i==0) slab inflow boundary f[j][i]T = x[j][i]T minus PlateModel(jPLATE SLABuser)

else if (i==ilim ) right side boundary mag u = 10 minus pow( (10minusPetscMax(PetscMin(x[j][iminus1]uparamminusgtcb10)00)) 50 )f[j][i]T = x[j][i]T minus mag ux[jminus1][iminus1]T minus (10minusmag u)PlateModel(jPLATE LID user)

else if (j==jlim ) bottom boundary mag w = 10 minus pow( (10minusPetscMax(PetscMin(x[jminus1][i]wparamminusgtsb10)00)) 50 )f[j][i]T = x[j][i]T minus mag wx[jminus1][iminus1]T minus (10minusmag w)

else in the mantle wedge f[j][i]T = EnergyResidual(xijuser)

Fig 21Temperature or energy residual for the mantle subduction problem

double precision ContinuityResidual(Field x int i int j AppCtx user)

GridInfo grid = userminusgtgriddouble precision uEuWwNwSdudxdwdz

uW = x[j][iminus1]u uE = x[j][i]u dudx = (uE minus uW)gridminusgtdxwS = x[jminus1][i]w wN = x[j][i]w dwdz = (wN minus wS)gridminusgtdzreturn dudx + dwdz

Fig 22Calculation of the continuity residual for the mantle subduction problem

solved by usingSNES with no loss of efficiency compared to using the underlyingKSP object (PETSc linear solver object) directly ThusSNES seems the appropri-ate framework for a general-purpose simulator providing the flexibility to add orsubtract nonlinear terms from the equation at will

TheSNES solver does not require a user to implement a Jacobian Default rou-tines are provided to compute a finite difference approximation using coloring toaccount for the sparsity of the matrix The user does not even need the PETScMatinterface but can initially interact only withVec objects making the transition to

PETSc nearly painless However these approximations to the Jacobian can have nu-merical difficulties and are not as efficient as direct evaluation Therefore the user

20 Matthew G Knepley et al

Compute both the second invariant of the strain rate tensor and the viscosity int ViscosityField(DA da Vec X Vec VAppCtx user)

Parameter param = userminusgtparamGridInfo grid = userminusgtgridVec localXField v xdouble eps dx dz T epsC TCint ijisjsimjmilim jlim ivt

ivt = paramminusgtiviscparamminusgtivisc = paramminusgtoutput ivisc

DACreateLocalVector(da amplocalX)DAGlobalToLocalBegin(da X INSERT VALUES localX)DAGlobalToLocalEnd(da X INSERT VALUES localX)DAVecGetArray(dalocalX(void)ampx)DAVecGetArray(daV(void)ampv)

Parameters dx = gridminusgtdx dz = gridminusgtdzilim = gridminusgtniminus1 jlim = gridminusgtnjminus1

Compute real temperature strain rate and viscosity DAGetCorners(daampisampjsPETSCNULL ampimampjmPETSCNULL )for (j=js jltjs+jm j++)

for (i=is iltis+im i++) T = paramminusgtpotentialT x[j][i]T exp( (jminus05)dzparamminusgtz scale )if (iltilim ampamp jltjlim )

TC = paramminusgtpotentialT TInterp(xij) exp( jdzparamminusgtz scale ) else

TC = T Compute the values at both cell centers and cell corners eps = CalcSecInv(xijCELL CENTERuser)epsC = CalcSecInv(xijCELL CORNERuser)v[j][i]u = epsv[j][i]w = epsCv[j][i]p = Viscosity(Tepsdz(jminus05)param)v[j][i]T = Viscosity(TCepsCdzjparam)

DAVecRestoreArray(daV(void)ampv)DAVecRestoreArray(dalocalX(void)ampx)paramminusgtivisc = ivt

Fig 23 Calculation of the second invariant of the strain tensor and viscosity field for themantle subduction problem

Developing a Geodynamics Simulator with PETSc 21

KSP ksp subkspPC bjacobi ilu

SNESGetKSP(snes ampksp)KSPGetPC(ksp ampbjacobi)PCBJacobiGetSubKSP(bjacobi PETSCNULL PETSCNULL ampsubksp)KSPGetPC(subksp[0] ampilu)PCILUSetLevels(ilu levels)

Fig 24Customizing the preconditioner on a single process

has the option of providing a routine or of using the ADIC [15] or similar system forautomatic differentiation The finite difference approximation can be activated byusing the-snes fd and-mat fd coloring freq options or by providing theSNESDefaultComputeJacobianColor () function toSNESSetJacobian()

Through theSNES object the user may also access the great range of directand iterative linear solvers and preconditioners provided by PETSc Sixteen differentKrylov solvers are available including GMRES BiCGStab and LSQR along withinterfaces to popular sparse direct packages such as MUMPS [4] and SuperLU [9]In addition to a variety of incomplete factorization preconditioners including PI-LUT [16] PETSc supports additive Schwartz preconditioning algebraic multigridthrough BoomerAMG [13] and the geometric multigrid discussed below The fullpanoply of solvers and preconditioners available is catalogued on the PETSc Web-site [3]

The modularity of PETSc allows users to easily customize each solver For in-stance suppose the user wishes to increase the number of levels in ILU(k) precon-ditioning on one block of a block-Jacobi scheme The code fragment in Fig 24 willset the number of levels of fill tolevels

PETSc provides an elegant framework for managing geometric multigrid in com-bination with aDA which is abstracted in theDMMGobject In the same way thatany linear solve can be performed withSNES any preconditioner or solver com-bination is available inDMMGby using a single level That is the same user codesupports both geometric multigrid as well as direct methods Krylov methods etcWhen creating aDMMG the user specifies the number of levels of grid refinementand provides the coarse-grid information (see Fig 25) which is always aDA objectat present although unstructured prototypes are being developed Then theDMMGobject calculates the intergrid transfer operators prolongation and restriction and

allocates objects to hold the solution at each level The user must provide the actionof the residual operatorF and can optionally provide a routine to compute the Jaco-bian matrix (which will be called on each level) and a routine to compute the initialguess as in Fig 26

With this information the user can now solve the equation and retrieve the solu-tion vector as in Fig 27

22 Matthew G Knepley et al

DMMGCreate(comm gridmglevels user ampdmmg)DMMGSetDM(dmmg (DM) da)

Fig 25Creating a DMMG

DMMGSetSNESLocal(dmmg FormFunctionLocal FormJacobianLocal 0 0)DMMGSetInitialGuess(dmmg FormInitialGuess)

Fig 26Providing discretization and assembly routines to DMMG

DMMGSolve(dmmg)soln = DMMGGetx(dmmg)

Fig 27Solving the problem with DMMG

5 Extensions

PETSc is not a complete environment for simulating physical phenomena Ratherit is a set of tools that allow the user to assemble such an environment tailored toa specific application As such PETSc will never provide every facility appropriatefor a given simulation However the user can easily extend PETSc and supplementits capabilities We present two examples in the context of the mantle subductionsimulation

51 Simple Continuation

Because of the highly nonlinear dependence of viscosity on temperature and strainrate in equation (1) the iteration in Newtonrsquos method can fail to converge withouta good starting guess of the solution On the other hand for the isoviscous casewhere temperature is coupled to velocity only through advection a solution is easilyreached It is therefore natural to propose a continuation scheme in the viscosity be-ginning with constant viscosity and progressing to full variability A simple adaptivecontinuation model was devised in which the viscosity was raised to a power be-tween zero and oneη rarr ηα whereα=0 corresponds to constant viscosity andα=1corresponds to natural viscosity variation In the continuation loopα was increasedtowards unity by an amount dependent on the rate of convergence of the previousSNES solve

PETSc itself provides no special support for continuation but it is sufficientlymodular that a continuation loop was readily constructed in the application codeusing repeated calls toDMMGSolve() and found to work quite well

Developing a Geodynamics Simulator with PETSc 23

52 Simple Steering

No rigorous justification had been given for the continuation strategy above andconsequently it was possible that for certain initial conditions Newton would fail toconverge thereby resulting in extremely long run times An observant user could de-tect this situation and abort the run but all the potentially useful solution informationup to that point would be lost A clean way to asynchronously abort the continua-tion loop was thus needed On architectures where OS signals are available PETScprovides an interface for registering signal handlers Thus we were able to define ahandler The user wishing to change the control flow of the simulation simply sendsthe appropriate signal to the process This sets a flag in the user data that causes thechange at the next iteration

6 Simulation Results

A long-standing debate has existed between theorists and observationalists over thethermal structure of subduction zones Modelers using constant viscosity flow sim-ulations to compute thermal structure have predicted relatively cold subductionzones [19] Conversely observationalists who use heat flow measurements and min-eralogical thermobarometry to estimate temperatures at depth have long claimedthat subduction zones are hotter than predicted They have invoked the presence ofstrong upwelling of hot mantle material to supply the heat This upwelling was neverpredicted by isoviscous flow models however

The debate remained unresolved until recently when several groups includingour own succeeded in developing simulations with realistically variable viscosity(for other examples [810121821]) A comparison of flow and temperature fieldsfor variable and constant viscosity simulations generated with our code is shownin Fig 28 Results from these simulations are exciting because they close the gapbetween models and observations they predict hotter mantle temperatures steepersurface thermal gradients and upwelling mantle flow

Furthermore these simulations allow for quantitative predictions of variation ofobservables with subduction parameters A recent study of subduction zone earth-quakes has identified an intriguing trend the vertical distance from subduction zonevolcanoes to the surface of the subducting slab is anti-correlated with descent rate ofthe slab [11] Preliminary calculations with our model are consistent with this trendindicating that flow and thermal structure may play an important role in determiningnot only the quantity and chemistry of magmas but also their path of transport to thesurface Further work is required to resolve this issue

Simulating the geodynamics of subduction is an example of our success in port-ing an existing code into the PETSc framework simultaneously parallelizing thecode and increasing its functionality Using this code as a template we rapidly de-veloped a related simulation of another tectonic boundary the mid-ocean ridge Thiswork was done to address a set of observations of systematic morphological asym-metry in the global mid-ocean ridge system [7] Our model confirms the qualitative

24 Matthew G Knepley et al

Dep

th k

m

log10

η and flow field

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24 Temperature degC

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Distance km

Dep

th k

m

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24

Distance km

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Isov

isco

usV

aria

ble

Vis

cosi

ty

(a) (b)

(c) (d)

Fig 282D viscosity and potential temperature fields from simulations on 8 processors with230112 degrees of freedom Panels in the top row are from a simulation withα=1 in equation(1) Panels in bottom row haveα=0 The white box in panels (a) and (c) shows the region inwhich temperature is plotted in panels (b) and (d)(a) Colors showlog10 of the viscosity fieldNote that there are more than five orders of magnitude variation in viscosity Arrows show theflow direction and magnitude (the slab is being subducted at a rate of 6 cmyear) Upwellingis evident in the flow field near the wedge corner(b) Temperature field from the variableviscosity simulation 1100C isotherm is shown as a dashed line(c) (Constant) viscosity andflow field from the isoviscous simulation Strong flow is predicted at the base of the crustdespite the low-temperature rock there No upwelling flow is predicted(d) Temperature fieldfrom isoviscous simulation Note that the mantle wedge corner is much colder than in (b)

mechanism that had been proposed to explain these observations Furthermore itshows a quantitative agreement with trends in the observed data [17] As with ourmodels of subduction the key to demonstrating the validity of the hypothesized dy-namics was simulating them with the strongly nonlinear rheology that characterizesmantle rock on geologic timescales The computational tools provided by PETSc en-abled us easily handle this rheology reducing development time and allowing us tofocus on model interpretation

References

1 PETSc SNES Example 30 httpwwwmcsanlgovpetscpetsc-2

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 2: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

2 Matthew G Knepley et al

highly nonlinear character of the non-Newtonian temperature-dependent viscositydemands costly solution algorithms This high computational complexity coupledwith long development times have put such models out of reach for most geoscien-tists The Portable Extensible Toolkit for Scientific Computation (PETSc) [5] makesit easier for geoscientists to overcome these barriers In an abstract sense PETSc pro-vides a framework for collaboration between geoscientists with complex modelingproblems and numerical analysts and software engineers who have the encapsulatednumerical methods for solving those problems In the case described here the orig-inal model was a specialized single linear solver serial code capable of calculatingthe thermal structure due to an analytically prescribed isoviscous flow field By port-ing this code into the PETSc framework we were able to extend it to solve for fullycoupled thermal structure and non-Newtonian flow with a choice of many scalableparallel solvers flexible boundary conditions convenient parameter input real-timecode steering and many other features not available with the serial version This wasdone using an incremental process by first replacing the custom linear solver withPETScrsquos general purpose nonlinear solver then replacing the sequential data struc-tures with PETScrsquos parallel ones and then finally adding the additional nonlinearphysics (to the portion of the code that discretizes the PDE)

η=η(PTV)

~100 km

Subducting slab

No Slip

Subducting slab

No Slip

Stre

ss fr

ee

Dislocation and diffusion creep

Distance from trench

Dep

th

Vslab

Mantle wedge

Continental crust

Continental crust

Mantle wedge

Zero stressfault

(a)

(b)

Fig 1 (a) Schematic diagram of a subduction zone Oceanic lithosphere is colliding withcontinental lithosphere and being subducted Volatile compounds are released from the sub-ducting slab at depth and enter the mantle wedge lowering the melting temperature of therock and causing partial melting This melt rises to feed volcanos at the surface(b) Schematicdiagram of the computational domain Boundary conditions on the flow field are imposed atthe bottom of the crust the top of the slab and the intersection of the mantle wedge andthe domain boundary Flow velocity within the mantle wedge is the solution to equation (1)Potential temperature throughout the domain is the solution to equation (3) The ldquozero stressfaultrdquo represents a frictional sliding surface On geologic time scales all stress on this boundaryis relieved in earthquakes and does not cause deformation

Any model of subduction zone volcanism must be based on the thermal andflow structure of the slowly moving solid mantle wedge shown in Fig 1 While themagnesium and iron-rich mantle rock is solid on human time-scales (as evidenced bythe seismic waves it transmits) on geologic time-scales it undergoes two modes of

Developing a Geodynamics Simulator with PETSc 3

solid-state creep [14] Its motion can be described by the Stokes equation for steadyflow of an incompressible highly viscous fluid (with zero Reynolds number)

nablaP = nablamiddot[η

(nablaV + nablaVT

)] st nablamiddot V = 0 (1)

η = (1ηdisl + 1ηdifn)minus1 (2)

whereP is the fluid pressureV is the mantle velocity fieldηdifn is the diffusioncreep viscosity andηdisl is the dislocation creep viscosity Both creep mechanismsgive an Arrhenius-type dependence on pressure and temperature Dislocation creephas an additional non-Newtonian strain rate dependence The strength of the nonlin-earity of viscosity makes this equation difficult to solve without a good initial guessIn our code the initial guess in provided by a continuation method we modify equa-tion (1) to letη rarr ηα whereα is a number between zero and one The continuationmethod described in section 51 variesα over a sequence of solves of increasingnonlinearity to reach natural variation in viscosity The solution at each step is usedas a guess for the following solve The first step in this process is to solve the problemfor constant viscosity (α=0) where the system of equations is linear and analyticallytractable

The production of molten rock depends fundamentally on the mantle temperaturefield The distribution of heat is governed by the conservation of enthalpy expressedas an advection-diffusion equation

partθ

partt+ V middotnablaθ = κnabla2θ (3)

whereθ is the mantle potential temperature andκ is the thermal diffusivity of themantle The mantle has a very low thermal diffusivity compared to materials such asmetal or water and thus the advection term dominates in equation (3)

For 0 lt α le 1 equations (1) and (3) form a nonlinear set coupled through theviscosity The solution of these equations is the first stage in modeling magma gen-esis in subduction zones Future work will incorporate equations of porous flow ofvolatiles and magma through the mantle reactive melting and geochemical trans-port Even without these complexities however we have achieved interesting resultswith the simple though highly nonlinear set of equations given above Some of theseresults are presented below after an in-depth look at the application developmentprocess in the PETSc framework

2 Integrating PETSc

PETSc is a set of library interfaces built in a generally hierarchical fashion A usermay decide to use some libraries and disregard others Fig 2 illustrates this hierarchyof dependencies For instance a user can use only the PETSc linear algebra (vectorsand matrices) libraries or add linear solvers to those or add nonlinear solvers to theentire group Thus integration may proceed in several stages which we discuss inthe following sections The first step is to incorporate the PETSc libraries into the

4 Matthew G Knepley et al

existing compile system (iemake) and to initialize the PETSc runtime This maybe done in two ways

Fig 2 Interface hierarchy in a PETSc application

The simplest approach is to adopt the PETSc makefile structure which is pre-sented in Fig 3 for an application using the nonlinear solver package The user in-cludes thebmakecommonbase which defines both rules for compiling sourceand variables that are used during the compile and link Then only a simple rule forthe executable is necessary using the appropriate PETSc library variable Individualcompilation can be customized with the variables shown at the top of the makefile

Users with large existing build systems may choose not to inherit the PETScmake rules but instead use a lower-level interface based only on PETSc make vari-ables A boilerplate example is given in Fig 4 The user now includes only thebmakecommonvariables file which defines the make variables but does notprescribe any rules for compilation or linking The variables provide informationabout all compilation flags libraries and external packages necessary to link withPETSc

Before any PETSc code can be run the user must callPetscInitialize() and likewise after all PETSc code has completed the user must callPetscFinalize () This is analogous to the requirements for using MPI In fact if the userhas not done so already PETSc will handle the initialization and cleanup of MPIautomatically in these routines A simple C driver is shown in Fig 5

The initialization call provides the command line arguments to PETSc for pro-cessing and can take an optional help string for the user (printed when the-help op-tion is given) The finalization call frees any resources used by PETSc and provides

Developing a Geodynamics Simulator with PETSc 5

CFLAGS =FFLAGS =CPPFLAGS=FPPFLAGS=

include $PETSCDIRbmakecommonbase

ex1 ex1o utilo chkoptsminus$CLINKER minuso $ $ˆ $PETSCSNES LIB$RM $ˆ

ex1f ex1fo physo chkoptsminus$FLINKER minuso $ $ˆ $PETSCFORTRAN LIB $PETSCSNES LIB$RM $ˆ

Fig 3Typical PETSc makefile

include $PETSCDIRbmakecommonvariables

comassageCpl $lt$CC minusc $MY CFLAGS $COPTFLAGS $CFLAGS $CCPPFLAGS $lt

FomassageFortranpl $lt$FC minusc $MY FFLAGS $FOPTFLAGS $FFLAGS $FCPPFLAGS $lt

ex1 ex1o utilominus$CLINKER minuso $ $ˆ $PETSCSNES LIB$RM $lt

ex1f ex1fo physominus$FLINKER minuso $ $ˆ $PETSCFORTRAN LIB $PETSCSNES LIB$RM $lt

Fig 4Boilerplate custom makefile using PETSc

summary logging and diagnostic information most notably performance profilingThe equivalent Fortran driver is shown in Fig 6

Notice that the command line arguments are now obtained directly from the For-tran runtime library rather than the user code Once these calls are inserted intothe application code the user can verify a successful link and run with the PETSclibraries

6 Matthew G Knepley et al

static char help[ ] = Boilerplate PETSc Examplenn

include petsch

int main(int argc char argv)

ierr = PetscInitialize(ampargc ampargv (char ) 0 help)CHKERRQ(ierr)

User code

return PetscFinalize()

Fig 5Boilerplate C driver for PETSc

program mainimplicit none

include includefincludepetsch

integer ierr

call PetscInitialize(PETSCNULL CHARACTER ierr)

User code

call PetscFinalize(ierr)end

Fig 6Boilerplate Fortran driver for PETSc

3 Data Distribution and Linear Algebra

The PETSc solvers (discussed in Section 4) require the user to provide C or Fortranroutines that compute the residual of the equation they wish to solve (after they havediscretized the PDE) and optionally the Jacobian of that residual function We re-fer to these functions generically asFormFunction() andFormJacobian() For nonlinear problems the PETSc solvers use truncated Newton methods with linesearches or trust-regions for robustness and possibly approximate or incomplete Ja-cobians for efficiency

The central objects in any PETSc simulation are the abstractions from linearalgebra vectors and matrices or in PETSc terms theVec andMat classes Theseform the basis of all solver and preconditioner interfaces as well as providing the linkto user-supplied discretization and physics routines Thus the most important stepin the migration of an application to the PETSc framework is the incorporation of its

Developing a Geodynamics Simulator with PETSc 7

linear algebra and data distribution abstractions Two obvious strategies emerge foraccomplishing this integration a gradual approach that seeks to minimize the changeto existing code and a more aggressive approach that leverages as much of the PETSctechnology as possible The next two subsections address these complementary pathstoward integration We use the simple example of the solid-fuel ignition problem orBratu problem to illustrate the gradual evolution of a simulation and then we givean example of mantle subduction for the more aggressive approach

31 Gradual Evolution

A new PETSc user with very complicated code which perhaps was originally writ-ten by someone else may opt to make as few changes as possible when moving toPETSc PETSc facilitates this approach with low-level interfaces to bothVec andMat objects that closely resemble common serial data structures Vectors are espe-cially easy to migrate because the default PETSc storage format contiguous arrayson each process is that most common in serial applications Matrices are somewhatmore complicated and in order to hide the actual matrix data storage format PETScrequires the user to access values through a functional interface [6]

Let us examine the Bratu problem as an example of integrating PETSc vectorsinto an existing simulation [2] This problem is modeled by the partial differentialequation

minus∆uminus λeu = 0 (4)

We take the domain to be the unit square and impose homogeneous Dirichlet bound-ary conditions on the edges We discretize the equation using a 5-point stencil finitedifference scheme which results in a set of nonlinear algebraic equationsF (uh) = 0where we useuh to indicate the discrete solution vector In Fig 7 we present a For-tran 90 routine that calculates the residualF as a function of the input vectoruhIt also takes a user context as input that holds the domain information and problemcoefficient

Notice that the boundary conditions on the residual are of the formuminusuΓ whereuΓ are the specified boundary values because we are driving the residualF (u) tozero An alternative would be to eliminate those variables altogether substitutingthe boundary values in any interior calculation However this technique currentlyimposes a greater burden on the programmer

When integrating PETSc vectors into this code we can let Fortran allocate thestorage or we can use PETSc for allocation In Fig 8 we let PETSc managethe memory The input vectors already have storage allocated by PETSc whichcan be accessed as an F90 array by using theVecGetArrayF90() functionor in C by usingVecGetArray() A sampledriver() routine shows howPETSc allocates vector storage If we let Fortran manage memory then we useVecCreateSeqWithArray() orVecCreateMPIWithArray() in parallelto create theVec objects

During a Newton iteration to solve this equation we would require the JacobianJ of the mappingF Rather than being accessed as a multidimensional array the

8 Matthew G Knepley et al

module f90moduletype userctx

The start end and number of vertices in the xminus and yminusdirectionsinteger xsxeysyeinteger mxmydouble precision lambda

end type userctxcontainsend module f90module

subroutine FormFunction(uFuserierr)use f90moduletype (userctx) userdouble precision u(userxsuserxeuserysuserye)double precision F(userxsuserxeuserysuserye)double precision twoonehxhyhxdhyhydhxscuijuxxuyyinteger ijierr

hx = 10dble(usermxminus1)hy = 10dble(usermyminus1)sc = hxhyuserlambdahxdhy = hxhyhydhx = hyhx

do 20 j=userysuseryedo 10 i=userxsuserxe

Apply boundary conditionsif (i == 1 or j == 1 or i == usermx or j == usermy) then

F(ij) = u(ij) Apply finite difference scheme

elseuij = u(ij)uxx = hydhx (20uij minus u(iminus1j) minus u(i+1j))uyy = hxdhy (20uij minus u(ijminus1) minus u(ij+1))F(ij) = uxx + uyy minus scexp(uij)

endif10 continue20 continue

Fig 7Residual calculation for the Bratu problem

Developing a Geodynamics Simulator with PETSc 9

subroutine driver(uFuserierr)implicit none

Vec uFint Nparameter(N=10000)

call VecCreate(PETSCCOMM WORLDuierr)call VecSetSizes(uPETSCDECIDENierr)call VecDuplicate(uFierr)call SolverLoop(uFierr)call VecDestroy(uierr)call VecDestroy(Fierr)end

subroutine FormFunctionPETSc(uFuserierr)use f90moduleimplicit none

Vec uFtype (userctx) userdouble precisionpointer u v()f v()

call VecGetArrayF90(uu vierr)call VecGetArrayF90(Ff vierr)call FormFunction(u vf vuserierr)call VecRestoreArrayF90(uu vierr)call VecRestoreArrayF90(Ff vierr)end

Fig 8Driver for the Bratu problem using PETSc

PETScMat object provides theMatSetValues () function to set logically denseblocks of values into the structure A function that computes the Jacobian and storesit in a PETSc matrix is shown in Fig 9 With this addition the user can now runserial code using the PETSc solvers (details are given in Section 4) Note that thestorage mechanism is the only change to user code necessary for its incorporationinto the PETSc framework using the wrapper in Fig 8

Although PETSc removes the need for low-level parallel programming such asdirect calls to the MPI library the user must still identify parallelism inherent inthe application and must structure the computation accordingly The first step is topartition the domain and have each process calculate the residual and Jacobian onlyover its local piece This is easily accomplished by redefining thexs xe ys andyevariables on each process converting loops over the entire domain into loops overthe local domain However this method uses nearest-neighbor information and thus

10 Matthew G Knepley et al

subroutine FormJacobian(ujacuserierr)use f90moduletype (userctx) userMat jacdouble precision x(userxsuserxeuserysuserye)double precision twoonehxhyhxdhyhydhxscv(5)integer rowcol(5)ijierr

one = 10two = 20hx = onedble(usermxminus1)hy = onedble(usermyminus1)sc = hxhyuserlambdahxdhy = hxhyhydhx = hyhxdo 20 j=userysuserye

row = (j minus userys)userxm minus 1do 10 i=userxsuserxe

row = row + 1 boundary points

if (i == 1 or j == 1 or i == usermx or j == usermy) thencol(1) = rowv(1) = onecall MatSetValues(jac1row1colvINSERT VALUESierr)

interior grid pointselse

v(1) = minushxdhyv(2) = minushydhxv(3) = two(hydhx + hxdhy) minus scexp(x(ij))v(4) = minushydhxv(5) = minushxdhycol(1) = row minus userxmcol(2) = row minus 1col(3) = rowcol(4) = row + 1col(5) = row + userxmcall MatSetValues(jac1row5colvINSERT VALUESierr)

endif10 continue20 continue

Fig 9Jacobian calculation for the Bratu problem using PETSc

Developing a Geodynamics Simulator with PETSc 11

we must have access to some values not present locally on the process as shown inFig 10

Box-type stencil Star-type stencil

Proc 6

Proc 0 Proc 0Proc 1 Proc 1

Proc 6

Fig 10Star and box stencils using a DA

We must store values for theseghostpoints locally so that they may be used forthe computation In addition we must ensure that these values are identical to the cor-responding values on the neighboring processes At the lowest level PETSc providesghosted vectors created usingVecCreateGhost () with storage for some valuesowned by other processes These values are explicitly indicated during constructionTheVecGhostUpdateBegin () andVecGhostUpdateEnd () functions trans-fer values between ghost storage and the corresponding storage on other processesThis method however can become complicated for the user Therefore for the com-mon case of a logically rectangular grid PETSc provides theDA object to managethe determination allocation and coherence of ghost values TheDA object is dis-cussed fully in Section 32 but we give here a short introduction in the context of theBratu problem

TheDA object represents a distributed structured (possibly staggered) grid andthus has more information than our serial grid If we augment our user context toinclude information about ghost values we can obtain all the grid information fromtheDA Figure 11 shows the creation of aDA when one initially know only the totalnumber of vertices in thex andy directions Alternatively we could have specifiedthe local number of vertices in each direction

Now we need only modify our PETSc residual routine to update the ghost valuesas shown in Fig 12

Notice that the ghost value scatter is a two-step operation This allows thecommunication to overlap with local computation that may be taking place whilethe messages are in transit The only change necessary for the residual calcu-lation itself is the correct declaration of the input arraydouble precision

12 Matthew G Knepley et al

type userctxDA da

The start end and number of vertices in the xminusdirectioninteger xsxexm

The start end and number of ghost vertices in the xminusdirectioninteger gxsgxegxm

The start end and number of vertices in the yminusdirectioninteger ysyeym

The start end and number of ghost vertices in the yminusdirectioninteger gysgyegym

The number of vertices in the xminus and yminusdirectionsinteger mxmy

The MPI rank of this processinteger rank

The coefficient in the Bratu equationdouble precision lambda

end type userctx

call DACreate2d(PETSCCOMM WORLDDA NONPERIODICDA STENCIL STAR ampamp usermxusermyPETSCDECIDEPETSCDECIDE11 ampamp PETSCNULL INTEGERPETSCNULL INTEGERuserdaierr)call DAGetCorners(userdauserxsuserysPETSCNULL INTEGER amp

amp userxmuserymPETSCNULL INTEGERierr)call DAGetGhostCorners(userdausergxsusergys amp

amp PETSCNULL INTEGERusergxmusergym ampamp PETSCNULL INTEGERierr)

Here we shift the starting indices up by one so that we can easily use the Fortran convention of1minusbased indices rather than0minusbased

userxs = userxs+1userys = userys+1usergxs = usergxs+1usergys = usergys+1userye = userys+userymminus1userxe = userxs+userxmminus1usergye = usergys+usergymminus1usergxe = usergxs+usergxmminus1

Fig 11Creation of a DA for the Bratu problem

Developing a Geodynamics Simulator with PETSc 13

subroutine FormFunctionPETSc(uFuserierr)implicit none

Vec uFinteger ierrtype (userctx) user

double precisionpointer lu v()lf v()Vec uLocal

call DAGetLocalVector(userdauLocalierr)call DAGlobalToLocalBegin(userdauINSERT VALUESuLocalierr)call DAGlobalToLocalEnd(userdauINSERT VALUESuLocalierr)

call VecGetArrayF90(uLocallu vierr)call VecGetArrayF90(Flf vierr)

Actually compute the local portion of the residualcall FormFunctionLocal(lu vlf vuserierr)

call VecRestoreArrayF90(uLocallu vierr)call VecRestoreArrayF90(Flf vierr)

call DARestoreLocalVector(userdauLocalierr)

Fig 12Parallel residual calculation for the Bratu problem using PETSc

u(usergxsusergxeusergysusergye) which now includes ghostvalues

The changes toFormJacobian() are similar resizing the input array andchanging slightly the calculation of row and column indices and can be found inthe PETSc example source We have now finished the initial introduction of PETSclinear algebra and the simulation is ready to run in parallel

32 Rapid Evolution

The user who is willing to raise the level of abstraction in a code can start by em-ploying theDA to describe the problem domain and discretization The data interfacemimics a multidimensional array and is thus ideal for finite difference finite volumeand low-order finite element schemes on a logically rectangular grid In fact after thedefinition of aFormFunction() routine the simulation is ready to run becausethe nonlinear solver provides approximate Jacobians automatically as discussed inSection 4

14 Matthew G Knepley et al

DASetLocalFunction(userda (DALocalFunction1) FormFunctionLocal)

Fig 13Using the DA to form a function

int FormFunctionLocal(DALocalInfo infodouble udouble fAppCtx user)double two = 20hxhyhxdhyhydhxscdouble uijuxxuyyint ij

hx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhyuserminusgtlambdahxdhy = hxhyhydhx = hyhx Compute function over the locally owned part of the grid for (j=infominusgtys jltinfominusgtys+infominusgtym j++)

for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++) if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

f[j][i] = u[j][i] else

uij = u[j][i]uxx = (twouij minus u[j][iminus1] minus u[j][i+1])hydhxuyy = (twouij minus u[jminus1][i] minus u[j+1][i])hxdhyf[j][i] = uxx + uyy minus scexp(uij)

Fig 14FormFunctionLocal() for the Bratu problem

We begin by examining the Bratu problem from the last section only this time inC We can now provide our local residual routine to theDA see Fig 13

The grid information is passed toFormFunctionLocal() in a DALocalInfo structure as shown in Fig 14

The fields are passed directly as multidimensional C arrays complete with ghostvalues An application scientist can use a boilerplate example such as the Bratu prob-lem merely altering the local physics calculation inFormFunctionLocal() torapidly obtain a parallel scalable simulation that runs on the desktop as well as onmassively parallel supercomputers

If the user has an expression for the Jacobian of the residual function then thismatrix can also be computed by using theDA interface with the call as in Fig 15and the corresponding routine for the local calculation in Fig 16

Developing a Geodynamics Simulator with PETSc 15

DASetLocalJacobian(userda (DALocalFunction1) FormJacobianLocal)

Fig 15Using the DA to form a Jacobian

int FormJacobianLocal(DALocalInfo infodouble xMat jacAppCtx user)MatStencil col[5]rowdouble lambdav[5]hxhyhxdhyhydhxscint ij

lambda = userminusgtparamhx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhylambdahxdhy = hxhy hydhx = hyhx

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

rowj = j rowi = i boundary points if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

v[0] = 10MatSetValuesStencil(jac1amprow1amprowvINSERT VALUES)

else interior grid points

v[0] = minushxdhy col[0]j = j minus 1 col[0]i = iv[1] = minushydhx col[1]j = j col[1]i = iminus1v[2] = 20(hydhx + hxdhy) minus scexp(x[j][i]) col[2]j = j col[2]i = iv[3] = minushydhx col[3]j = j col[3]i = i+1v[4] = minushxdhy col[4]j = j + 1 col[4]i = iMatSetValuesStencil(jac1amprow5colvINSERT VALUES)

MatAssemblyBegin(jacMAT FINAL ASSEMBLY)MatAssemblyEnd(jacMAT FINAL ASSEMBLY)MatSetOption(jacMAT NEW NONZERO LOCATION ERR)

Fig 16FormJacobianLocal() for the Bratu problem

16 Matthew G Knepley et al

int FormFunctionLocal(DALocalInfo infoField xField fvoid ptr)AppCtx user = (AppCtx)ptrParameter param = userminusgtparamGridInfo grid = userminusgtgriddouble mag w mag uint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1 i j

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

calculateXMomentumResidual(i j x f user)calculateZMomentumResidual(i j x f user)calculatePressureResidual(i j x f user)calculateTemperatureResidual(i j x f user)

Fig 17FormFunctionLocal() for the mantle subduction problem

We have so far presented simple examples involving a perturbed Laplacian butthis development strategy extends far beyond toy problems We have developedlarge-scale parallel geophysical simulations [1] that are producing new results inthe field We began by replacingFormFunctionLocal() in the Bratu examplewith one containing the linear physics of the previous sequential linear subductioncode (dropping the previous codersquos linear solve) Then the PETSc solver results werecompared for correctness with the complete prior subduction code Next the newcode was immediately run in parallel for correctness and efficiency tests Finallythe nonlinear terms from the more complicated physics of the mantle subductionproblem discussed in Section 1 and shown in Figs 17-21 and better quality dis-cretizations of the boundary conditions were added In fact since the structure oftheFormFunction() changed significantly the finalFormFunction() looksnothing like the original but the process of obtaining through the incremental ap-proach reduced the learning curve for PETSc and make correctness checking at thevarious stages much easier

The details of the subduction code are not as important as the recognition thata very complicated physics problem need not involve any more complication in oursimulation infrastructure To give more insight into the actual calculation we showin Fig 22 the explicit calculation of the residual from the continuity equation

Here we have used theDA for a multicomponent problem on a staggered meshwithout alteration because the structure of the discretization remains logically rect-angular

Although theDA manages communication during the solution process data post-processing often necessary for visualization or analysis also requires this function-ality For the mantle subduction simulation we want to calculate both the second in-

Developing a Geodynamics Simulator with PETSc 17

int calculateXMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj) f[j][i]u = x[j][i]u minus SlabVel(rsquoUrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]u = x[j][i]u minus 00

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else

f[j][i]u = XNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]u = XMomentumResidual(xijuser)

else in the mantle wedge f[j][i]u = XMomentumResidual(xijuser)

Fig 18X-momentum residual for the mantle subduction problem

variant of the strain rate tensor and the viscosity over the grid (at both cell centers andcorners) Using theDAGlobalToLocalBegin () andDAGlobalToLocalEnd() functions we transferred the global solution data to local ghosted vectors and pro-ceeded with the calculation We can also transfer data from local vectors to a globalvector usingDALocalToGlobal () The entire viscosity calculation is given inFig 23

Note that this framework allows the user to manage arbitrary fields over the do-main For instance a porous flow simulation might manage material properties ofthe medium in this fashion

4 Solvers

The SNES object in PETSc is an abstraction of the ldquoinverserdquo of a nonlinear oper-ator The user provides aFormFunction() routine as seen in Section 3 whichcomputes the action of the operator on an input vector Linear problems can also be

18 Matthew G Knepley et al

int calculateZMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1jlim = infominusgtmyminus1

if (ilt=j) f[j][i]w = x[j][i]w minus SlabVel(rsquoWrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]w = x[j][i]w minus 00

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else

f[j][i]w = ZNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]w = ZMomentumResidual(xijuser)

else in the mantle wedge f[j][i]w = ZMomentumResidual(xijuser)

Fig 19Z-momentum residual for the mantle subduction problem

int calculatePressureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj | | jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lid or slab f[j][i]p = x[j][i]p

else if ((i==ilim | | j==jlim ) ampamp paramminusgtibound==BC ANALYTIC ) on an analytic boundary f[j][i]p = x[j][i]p minus Pressure(ijuser)

else in the mantle wedge f[j][i]p = ContinuityResidual(xijuser)

Fig 20Pressure or continuity residual for the mantle subduction problem

Developing a Geodynamics Simulator with PETSc 19

int calculateTemperatureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (j==0) on the surface f[j][i]T = x[j][i]T + x[j+1][i]T + PetscMax(x[j][i]T00)

else if (i==0) slab inflow boundary f[j][i]T = x[j][i]T minus PlateModel(jPLATE SLABuser)

else if (i==ilim ) right side boundary mag u = 10 minus pow( (10minusPetscMax(PetscMin(x[j][iminus1]uparamminusgtcb10)00)) 50 )f[j][i]T = x[j][i]T minus mag ux[jminus1][iminus1]T minus (10minusmag u)PlateModel(jPLATE LID user)

else if (j==jlim ) bottom boundary mag w = 10 minus pow( (10minusPetscMax(PetscMin(x[jminus1][i]wparamminusgtsb10)00)) 50 )f[j][i]T = x[j][i]T minus mag wx[jminus1][iminus1]T minus (10minusmag w)

else in the mantle wedge f[j][i]T = EnergyResidual(xijuser)

Fig 21Temperature or energy residual for the mantle subduction problem

double precision ContinuityResidual(Field x int i int j AppCtx user)

GridInfo grid = userminusgtgriddouble precision uEuWwNwSdudxdwdz

uW = x[j][iminus1]u uE = x[j][i]u dudx = (uE minus uW)gridminusgtdxwS = x[jminus1][i]w wN = x[j][i]w dwdz = (wN minus wS)gridminusgtdzreturn dudx + dwdz

Fig 22Calculation of the continuity residual for the mantle subduction problem

solved by usingSNES with no loss of efficiency compared to using the underlyingKSP object (PETSc linear solver object) directly ThusSNES seems the appropri-ate framework for a general-purpose simulator providing the flexibility to add orsubtract nonlinear terms from the equation at will

TheSNES solver does not require a user to implement a Jacobian Default rou-tines are provided to compute a finite difference approximation using coloring toaccount for the sparsity of the matrix The user does not even need the PETScMatinterface but can initially interact only withVec objects making the transition to

PETSc nearly painless However these approximations to the Jacobian can have nu-merical difficulties and are not as efficient as direct evaluation Therefore the user

20 Matthew G Knepley et al

Compute both the second invariant of the strain rate tensor and the viscosity int ViscosityField(DA da Vec X Vec VAppCtx user)

Parameter param = userminusgtparamGridInfo grid = userminusgtgridVec localXField v xdouble eps dx dz T epsC TCint ijisjsimjmilim jlim ivt

ivt = paramminusgtiviscparamminusgtivisc = paramminusgtoutput ivisc

DACreateLocalVector(da amplocalX)DAGlobalToLocalBegin(da X INSERT VALUES localX)DAGlobalToLocalEnd(da X INSERT VALUES localX)DAVecGetArray(dalocalX(void)ampx)DAVecGetArray(daV(void)ampv)

Parameters dx = gridminusgtdx dz = gridminusgtdzilim = gridminusgtniminus1 jlim = gridminusgtnjminus1

Compute real temperature strain rate and viscosity DAGetCorners(daampisampjsPETSCNULL ampimampjmPETSCNULL )for (j=js jltjs+jm j++)

for (i=is iltis+im i++) T = paramminusgtpotentialT x[j][i]T exp( (jminus05)dzparamminusgtz scale )if (iltilim ampamp jltjlim )

TC = paramminusgtpotentialT TInterp(xij) exp( jdzparamminusgtz scale ) else

TC = T Compute the values at both cell centers and cell corners eps = CalcSecInv(xijCELL CENTERuser)epsC = CalcSecInv(xijCELL CORNERuser)v[j][i]u = epsv[j][i]w = epsCv[j][i]p = Viscosity(Tepsdz(jminus05)param)v[j][i]T = Viscosity(TCepsCdzjparam)

DAVecRestoreArray(daV(void)ampv)DAVecRestoreArray(dalocalX(void)ampx)paramminusgtivisc = ivt

Fig 23 Calculation of the second invariant of the strain tensor and viscosity field for themantle subduction problem

Developing a Geodynamics Simulator with PETSc 21

KSP ksp subkspPC bjacobi ilu

SNESGetKSP(snes ampksp)KSPGetPC(ksp ampbjacobi)PCBJacobiGetSubKSP(bjacobi PETSCNULL PETSCNULL ampsubksp)KSPGetPC(subksp[0] ampilu)PCILUSetLevels(ilu levels)

Fig 24Customizing the preconditioner on a single process

has the option of providing a routine or of using the ADIC [15] or similar system forautomatic differentiation The finite difference approximation can be activated byusing the-snes fd and-mat fd coloring freq options or by providing theSNESDefaultComputeJacobianColor () function toSNESSetJacobian()

Through theSNES object the user may also access the great range of directand iterative linear solvers and preconditioners provided by PETSc Sixteen differentKrylov solvers are available including GMRES BiCGStab and LSQR along withinterfaces to popular sparse direct packages such as MUMPS [4] and SuperLU [9]In addition to a variety of incomplete factorization preconditioners including PI-LUT [16] PETSc supports additive Schwartz preconditioning algebraic multigridthrough BoomerAMG [13] and the geometric multigrid discussed below The fullpanoply of solvers and preconditioners available is catalogued on the PETSc Web-site [3]

The modularity of PETSc allows users to easily customize each solver For in-stance suppose the user wishes to increase the number of levels in ILU(k) precon-ditioning on one block of a block-Jacobi scheme The code fragment in Fig 24 willset the number of levels of fill tolevels

PETSc provides an elegant framework for managing geometric multigrid in com-bination with aDA which is abstracted in theDMMGobject In the same way thatany linear solve can be performed withSNES any preconditioner or solver com-bination is available inDMMGby using a single level That is the same user codesupports both geometric multigrid as well as direct methods Krylov methods etcWhen creating aDMMG the user specifies the number of levels of grid refinementand provides the coarse-grid information (see Fig 25) which is always aDA objectat present although unstructured prototypes are being developed Then theDMMGobject calculates the intergrid transfer operators prolongation and restriction and

allocates objects to hold the solution at each level The user must provide the actionof the residual operatorF and can optionally provide a routine to compute the Jaco-bian matrix (which will be called on each level) and a routine to compute the initialguess as in Fig 26

With this information the user can now solve the equation and retrieve the solu-tion vector as in Fig 27

22 Matthew G Knepley et al

DMMGCreate(comm gridmglevels user ampdmmg)DMMGSetDM(dmmg (DM) da)

Fig 25Creating a DMMG

DMMGSetSNESLocal(dmmg FormFunctionLocal FormJacobianLocal 0 0)DMMGSetInitialGuess(dmmg FormInitialGuess)

Fig 26Providing discretization and assembly routines to DMMG

DMMGSolve(dmmg)soln = DMMGGetx(dmmg)

Fig 27Solving the problem with DMMG

5 Extensions

PETSc is not a complete environment for simulating physical phenomena Ratherit is a set of tools that allow the user to assemble such an environment tailored toa specific application As such PETSc will never provide every facility appropriatefor a given simulation However the user can easily extend PETSc and supplementits capabilities We present two examples in the context of the mantle subductionsimulation

51 Simple Continuation

Because of the highly nonlinear dependence of viscosity on temperature and strainrate in equation (1) the iteration in Newtonrsquos method can fail to converge withouta good starting guess of the solution On the other hand for the isoviscous casewhere temperature is coupled to velocity only through advection a solution is easilyreached It is therefore natural to propose a continuation scheme in the viscosity be-ginning with constant viscosity and progressing to full variability A simple adaptivecontinuation model was devised in which the viscosity was raised to a power be-tween zero and oneη rarr ηα whereα=0 corresponds to constant viscosity andα=1corresponds to natural viscosity variation In the continuation loopα was increasedtowards unity by an amount dependent on the rate of convergence of the previousSNES solve

PETSc itself provides no special support for continuation but it is sufficientlymodular that a continuation loop was readily constructed in the application codeusing repeated calls toDMMGSolve() and found to work quite well

Developing a Geodynamics Simulator with PETSc 23

52 Simple Steering

No rigorous justification had been given for the continuation strategy above andconsequently it was possible that for certain initial conditions Newton would fail toconverge thereby resulting in extremely long run times An observant user could de-tect this situation and abort the run but all the potentially useful solution informationup to that point would be lost A clean way to asynchronously abort the continua-tion loop was thus needed On architectures where OS signals are available PETScprovides an interface for registering signal handlers Thus we were able to define ahandler The user wishing to change the control flow of the simulation simply sendsthe appropriate signal to the process This sets a flag in the user data that causes thechange at the next iteration

6 Simulation Results

A long-standing debate has existed between theorists and observationalists over thethermal structure of subduction zones Modelers using constant viscosity flow sim-ulations to compute thermal structure have predicted relatively cold subductionzones [19] Conversely observationalists who use heat flow measurements and min-eralogical thermobarometry to estimate temperatures at depth have long claimedthat subduction zones are hotter than predicted They have invoked the presence ofstrong upwelling of hot mantle material to supply the heat This upwelling was neverpredicted by isoviscous flow models however

The debate remained unresolved until recently when several groups includingour own succeeded in developing simulations with realistically variable viscosity(for other examples [810121821]) A comparison of flow and temperature fieldsfor variable and constant viscosity simulations generated with our code is shownin Fig 28 Results from these simulations are exciting because they close the gapbetween models and observations they predict hotter mantle temperatures steepersurface thermal gradients and upwelling mantle flow

Furthermore these simulations allow for quantitative predictions of variation ofobservables with subduction parameters A recent study of subduction zone earth-quakes has identified an intriguing trend the vertical distance from subduction zonevolcanoes to the surface of the subducting slab is anti-correlated with descent rate ofthe slab [11] Preliminary calculations with our model are consistent with this trendindicating that flow and thermal structure may play an important role in determiningnot only the quantity and chemistry of magmas but also their path of transport to thesurface Further work is required to resolve this issue

Simulating the geodynamics of subduction is an example of our success in port-ing an existing code into the PETSc framework simultaneously parallelizing thecode and increasing its functionality Using this code as a template we rapidly de-veloped a related simulation of another tectonic boundary the mid-ocean ridge Thiswork was done to address a set of observations of systematic morphological asym-metry in the global mid-ocean ridge system [7] Our model confirms the qualitative

24 Matthew G Knepley et al

Dep

th k

m

log10

η and flow field

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24 Temperature degC

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Distance km

Dep

th k

m

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24

Distance km

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Isov

isco

usV

aria

ble

Vis

cosi

ty

(a) (b)

(c) (d)

Fig 282D viscosity and potential temperature fields from simulations on 8 processors with230112 degrees of freedom Panels in the top row are from a simulation withα=1 in equation(1) Panels in bottom row haveα=0 The white box in panels (a) and (c) shows the region inwhich temperature is plotted in panels (b) and (d)(a) Colors showlog10 of the viscosity fieldNote that there are more than five orders of magnitude variation in viscosity Arrows show theflow direction and magnitude (the slab is being subducted at a rate of 6 cmyear) Upwellingis evident in the flow field near the wedge corner(b) Temperature field from the variableviscosity simulation 1100C isotherm is shown as a dashed line(c) (Constant) viscosity andflow field from the isoviscous simulation Strong flow is predicted at the base of the crustdespite the low-temperature rock there No upwelling flow is predicted(d) Temperature fieldfrom isoviscous simulation Note that the mantle wedge corner is much colder than in (b)

mechanism that had been proposed to explain these observations Furthermore itshows a quantitative agreement with trends in the observed data [17] As with ourmodels of subduction the key to demonstrating the validity of the hypothesized dy-namics was simulating them with the strongly nonlinear rheology that characterizesmantle rock on geologic timescales The computational tools provided by PETSc en-abled us easily handle this rheology reducing development time and allowing us tofocus on model interpretation

References

1 PETSc SNES Example 30 httpwwwmcsanlgovpetscpetsc-2

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 3: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

Developing a Geodynamics Simulator with PETSc 3

solid-state creep [14] Its motion can be described by the Stokes equation for steadyflow of an incompressible highly viscous fluid (with zero Reynolds number)

nablaP = nablamiddot[η

(nablaV + nablaVT

)] st nablamiddot V = 0 (1)

η = (1ηdisl + 1ηdifn)minus1 (2)

whereP is the fluid pressureV is the mantle velocity fieldηdifn is the diffusioncreep viscosity andηdisl is the dislocation creep viscosity Both creep mechanismsgive an Arrhenius-type dependence on pressure and temperature Dislocation creephas an additional non-Newtonian strain rate dependence The strength of the nonlin-earity of viscosity makes this equation difficult to solve without a good initial guessIn our code the initial guess in provided by a continuation method we modify equa-tion (1) to letη rarr ηα whereα is a number between zero and one The continuationmethod described in section 51 variesα over a sequence of solves of increasingnonlinearity to reach natural variation in viscosity The solution at each step is usedas a guess for the following solve The first step in this process is to solve the problemfor constant viscosity (α=0) where the system of equations is linear and analyticallytractable

The production of molten rock depends fundamentally on the mantle temperaturefield The distribution of heat is governed by the conservation of enthalpy expressedas an advection-diffusion equation

partθ

partt+ V middotnablaθ = κnabla2θ (3)

whereθ is the mantle potential temperature andκ is the thermal diffusivity of themantle The mantle has a very low thermal diffusivity compared to materials such asmetal or water and thus the advection term dominates in equation (3)

For 0 lt α le 1 equations (1) and (3) form a nonlinear set coupled through theviscosity The solution of these equations is the first stage in modeling magma gen-esis in subduction zones Future work will incorporate equations of porous flow ofvolatiles and magma through the mantle reactive melting and geochemical trans-port Even without these complexities however we have achieved interesting resultswith the simple though highly nonlinear set of equations given above Some of theseresults are presented below after an in-depth look at the application developmentprocess in the PETSc framework

2 Integrating PETSc

PETSc is a set of library interfaces built in a generally hierarchical fashion A usermay decide to use some libraries and disregard others Fig 2 illustrates this hierarchyof dependencies For instance a user can use only the PETSc linear algebra (vectorsand matrices) libraries or add linear solvers to those or add nonlinear solvers to theentire group Thus integration may proceed in several stages which we discuss inthe following sections The first step is to incorporate the PETSc libraries into the

4 Matthew G Knepley et al

existing compile system (iemake) and to initialize the PETSc runtime This maybe done in two ways

Fig 2 Interface hierarchy in a PETSc application

The simplest approach is to adopt the PETSc makefile structure which is pre-sented in Fig 3 for an application using the nonlinear solver package The user in-cludes thebmakecommonbase which defines both rules for compiling sourceand variables that are used during the compile and link Then only a simple rule forthe executable is necessary using the appropriate PETSc library variable Individualcompilation can be customized with the variables shown at the top of the makefile

Users with large existing build systems may choose not to inherit the PETScmake rules but instead use a lower-level interface based only on PETSc make vari-ables A boilerplate example is given in Fig 4 The user now includes only thebmakecommonvariables file which defines the make variables but does notprescribe any rules for compilation or linking The variables provide informationabout all compilation flags libraries and external packages necessary to link withPETSc

Before any PETSc code can be run the user must callPetscInitialize() and likewise after all PETSc code has completed the user must callPetscFinalize () This is analogous to the requirements for using MPI In fact if the userhas not done so already PETSc will handle the initialization and cleanup of MPIautomatically in these routines A simple C driver is shown in Fig 5

The initialization call provides the command line arguments to PETSc for pro-cessing and can take an optional help string for the user (printed when the-help op-tion is given) The finalization call frees any resources used by PETSc and provides

Developing a Geodynamics Simulator with PETSc 5

CFLAGS =FFLAGS =CPPFLAGS=FPPFLAGS=

include $PETSCDIRbmakecommonbase

ex1 ex1o utilo chkoptsminus$CLINKER minuso $ $ˆ $PETSCSNES LIB$RM $ˆ

ex1f ex1fo physo chkoptsminus$FLINKER minuso $ $ˆ $PETSCFORTRAN LIB $PETSCSNES LIB$RM $ˆ

Fig 3Typical PETSc makefile

include $PETSCDIRbmakecommonvariables

comassageCpl $lt$CC minusc $MY CFLAGS $COPTFLAGS $CFLAGS $CCPPFLAGS $lt

FomassageFortranpl $lt$FC minusc $MY FFLAGS $FOPTFLAGS $FFLAGS $FCPPFLAGS $lt

ex1 ex1o utilominus$CLINKER minuso $ $ˆ $PETSCSNES LIB$RM $lt

ex1f ex1fo physominus$FLINKER minuso $ $ˆ $PETSCFORTRAN LIB $PETSCSNES LIB$RM $lt

Fig 4Boilerplate custom makefile using PETSc

summary logging and diagnostic information most notably performance profilingThe equivalent Fortran driver is shown in Fig 6

Notice that the command line arguments are now obtained directly from the For-tran runtime library rather than the user code Once these calls are inserted intothe application code the user can verify a successful link and run with the PETSclibraries

6 Matthew G Knepley et al

static char help[ ] = Boilerplate PETSc Examplenn

include petsch

int main(int argc char argv)

ierr = PetscInitialize(ampargc ampargv (char ) 0 help)CHKERRQ(ierr)

User code

return PetscFinalize()

Fig 5Boilerplate C driver for PETSc

program mainimplicit none

include includefincludepetsch

integer ierr

call PetscInitialize(PETSCNULL CHARACTER ierr)

User code

call PetscFinalize(ierr)end

Fig 6Boilerplate Fortran driver for PETSc

3 Data Distribution and Linear Algebra

The PETSc solvers (discussed in Section 4) require the user to provide C or Fortranroutines that compute the residual of the equation they wish to solve (after they havediscretized the PDE) and optionally the Jacobian of that residual function We re-fer to these functions generically asFormFunction() andFormJacobian() For nonlinear problems the PETSc solvers use truncated Newton methods with linesearches or trust-regions for robustness and possibly approximate or incomplete Ja-cobians for efficiency

The central objects in any PETSc simulation are the abstractions from linearalgebra vectors and matrices or in PETSc terms theVec andMat classes Theseform the basis of all solver and preconditioner interfaces as well as providing the linkto user-supplied discretization and physics routines Thus the most important stepin the migration of an application to the PETSc framework is the incorporation of its

Developing a Geodynamics Simulator with PETSc 7

linear algebra and data distribution abstractions Two obvious strategies emerge foraccomplishing this integration a gradual approach that seeks to minimize the changeto existing code and a more aggressive approach that leverages as much of the PETSctechnology as possible The next two subsections address these complementary pathstoward integration We use the simple example of the solid-fuel ignition problem orBratu problem to illustrate the gradual evolution of a simulation and then we givean example of mantle subduction for the more aggressive approach

31 Gradual Evolution

A new PETSc user with very complicated code which perhaps was originally writ-ten by someone else may opt to make as few changes as possible when moving toPETSc PETSc facilitates this approach with low-level interfaces to bothVec andMat objects that closely resemble common serial data structures Vectors are espe-cially easy to migrate because the default PETSc storage format contiguous arrayson each process is that most common in serial applications Matrices are somewhatmore complicated and in order to hide the actual matrix data storage format PETScrequires the user to access values through a functional interface [6]

Let us examine the Bratu problem as an example of integrating PETSc vectorsinto an existing simulation [2] This problem is modeled by the partial differentialequation

minus∆uminus λeu = 0 (4)

We take the domain to be the unit square and impose homogeneous Dirichlet bound-ary conditions on the edges We discretize the equation using a 5-point stencil finitedifference scheme which results in a set of nonlinear algebraic equationsF (uh) = 0where we useuh to indicate the discrete solution vector In Fig 7 we present a For-tran 90 routine that calculates the residualF as a function of the input vectoruhIt also takes a user context as input that holds the domain information and problemcoefficient

Notice that the boundary conditions on the residual are of the formuminusuΓ whereuΓ are the specified boundary values because we are driving the residualF (u) tozero An alternative would be to eliminate those variables altogether substitutingthe boundary values in any interior calculation However this technique currentlyimposes a greater burden on the programmer

When integrating PETSc vectors into this code we can let Fortran allocate thestorage or we can use PETSc for allocation In Fig 8 we let PETSc managethe memory The input vectors already have storage allocated by PETSc whichcan be accessed as an F90 array by using theVecGetArrayF90() functionor in C by usingVecGetArray() A sampledriver() routine shows howPETSc allocates vector storage If we let Fortran manage memory then we useVecCreateSeqWithArray() orVecCreateMPIWithArray() in parallelto create theVec objects

During a Newton iteration to solve this equation we would require the JacobianJ of the mappingF Rather than being accessed as a multidimensional array the

8 Matthew G Knepley et al

module f90moduletype userctx

The start end and number of vertices in the xminus and yminusdirectionsinteger xsxeysyeinteger mxmydouble precision lambda

end type userctxcontainsend module f90module

subroutine FormFunction(uFuserierr)use f90moduletype (userctx) userdouble precision u(userxsuserxeuserysuserye)double precision F(userxsuserxeuserysuserye)double precision twoonehxhyhxdhyhydhxscuijuxxuyyinteger ijierr

hx = 10dble(usermxminus1)hy = 10dble(usermyminus1)sc = hxhyuserlambdahxdhy = hxhyhydhx = hyhx

do 20 j=userysuseryedo 10 i=userxsuserxe

Apply boundary conditionsif (i == 1 or j == 1 or i == usermx or j == usermy) then

F(ij) = u(ij) Apply finite difference scheme

elseuij = u(ij)uxx = hydhx (20uij minus u(iminus1j) minus u(i+1j))uyy = hxdhy (20uij minus u(ijminus1) minus u(ij+1))F(ij) = uxx + uyy minus scexp(uij)

endif10 continue20 continue

Fig 7Residual calculation for the Bratu problem

Developing a Geodynamics Simulator with PETSc 9

subroutine driver(uFuserierr)implicit none

Vec uFint Nparameter(N=10000)

call VecCreate(PETSCCOMM WORLDuierr)call VecSetSizes(uPETSCDECIDENierr)call VecDuplicate(uFierr)call SolverLoop(uFierr)call VecDestroy(uierr)call VecDestroy(Fierr)end

subroutine FormFunctionPETSc(uFuserierr)use f90moduleimplicit none

Vec uFtype (userctx) userdouble precisionpointer u v()f v()

call VecGetArrayF90(uu vierr)call VecGetArrayF90(Ff vierr)call FormFunction(u vf vuserierr)call VecRestoreArrayF90(uu vierr)call VecRestoreArrayF90(Ff vierr)end

Fig 8Driver for the Bratu problem using PETSc

PETScMat object provides theMatSetValues () function to set logically denseblocks of values into the structure A function that computes the Jacobian and storesit in a PETSc matrix is shown in Fig 9 With this addition the user can now runserial code using the PETSc solvers (details are given in Section 4) Note that thestorage mechanism is the only change to user code necessary for its incorporationinto the PETSc framework using the wrapper in Fig 8

Although PETSc removes the need for low-level parallel programming such asdirect calls to the MPI library the user must still identify parallelism inherent inthe application and must structure the computation accordingly The first step is topartition the domain and have each process calculate the residual and Jacobian onlyover its local piece This is easily accomplished by redefining thexs xe ys andyevariables on each process converting loops over the entire domain into loops overthe local domain However this method uses nearest-neighbor information and thus

10 Matthew G Knepley et al

subroutine FormJacobian(ujacuserierr)use f90moduletype (userctx) userMat jacdouble precision x(userxsuserxeuserysuserye)double precision twoonehxhyhxdhyhydhxscv(5)integer rowcol(5)ijierr

one = 10two = 20hx = onedble(usermxminus1)hy = onedble(usermyminus1)sc = hxhyuserlambdahxdhy = hxhyhydhx = hyhxdo 20 j=userysuserye

row = (j minus userys)userxm minus 1do 10 i=userxsuserxe

row = row + 1 boundary points

if (i == 1 or j == 1 or i == usermx or j == usermy) thencol(1) = rowv(1) = onecall MatSetValues(jac1row1colvINSERT VALUESierr)

interior grid pointselse

v(1) = minushxdhyv(2) = minushydhxv(3) = two(hydhx + hxdhy) minus scexp(x(ij))v(4) = minushydhxv(5) = minushxdhycol(1) = row minus userxmcol(2) = row minus 1col(3) = rowcol(4) = row + 1col(5) = row + userxmcall MatSetValues(jac1row5colvINSERT VALUESierr)

endif10 continue20 continue

Fig 9Jacobian calculation for the Bratu problem using PETSc

Developing a Geodynamics Simulator with PETSc 11

we must have access to some values not present locally on the process as shown inFig 10

Box-type stencil Star-type stencil

Proc 6

Proc 0 Proc 0Proc 1 Proc 1

Proc 6

Fig 10Star and box stencils using a DA

We must store values for theseghostpoints locally so that they may be used forthe computation In addition we must ensure that these values are identical to the cor-responding values on the neighboring processes At the lowest level PETSc providesghosted vectors created usingVecCreateGhost () with storage for some valuesowned by other processes These values are explicitly indicated during constructionTheVecGhostUpdateBegin () andVecGhostUpdateEnd () functions trans-fer values between ghost storage and the corresponding storage on other processesThis method however can become complicated for the user Therefore for the com-mon case of a logically rectangular grid PETSc provides theDA object to managethe determination allocation and coherence of ghost values TheDA object is dis-cussed fully in Section 32 but we give here a short introduction in the context of theBratu problem

TheDA object represents a distributed structured (possibly staggered) grid andthus has more information than our serial grid If we augment our user context toinclude information about ghost values we can obtain all the grid information fromtheDA Figure 11 shows the creation of aDA when one initially know only the totalnumber of vertices in thex andy directions Alternatively we could have specifiedthe local number of vertices in each direction

Now we need only modify our PETSc residual routine to update the ghost valuesas shown in Fig 12

Notice that the ghost value scatter is a two-step operation This allows thecommunication to overlap with local computation that may be taking place whilethe messages are in transit The only change necessary for the residual calcu-lation itself is the correct declaration of the input arraydouble precision

12 Matthew G Knepley et al

type userctxDA da

The start end and number of vertices in the xminusdirectioninteger xsxexm

The start end and number of ghost vertices in the xminusdirectioninteger gxsgxegxm

The start end and number of vertices in the yminusdirectioninteger ysyeym

The start end and number of ghost vertices in the yminusdirectioninteger gysgyegym

The number of vertices in the xminus and yminusdirectionsinteger mxmy

The MPI rank of this processinteger rank

The coefficient in the Bratu equationdouble precision lambda

end type userctx

call DACreate2d(PETSCCOMM WORLDDA NONPERIODICDA STENCIL STAR ampamp usermxusermyPETSCDECIDEPETSCDECIDE11 ampamp PETSCNULL INTEGERPETSCNULL INTEGERuserdaierr)call DAGetCorners(userdauserxsuserysPETSCNULL INTEGER amp

amp userxmuserymPETSCNULL INTEGERierr)call DAGetGhostCorners(userdausergxsusergys amp

amp PETSCNULL INTEGERusergxmusergym ampamp PETSCNULL INTEGERierr)

Here we shift the starting indices up by one so that we can easily use the Fortran convention of1minusbased indices rather than0minusbased

userxs = userxs+1userys = userys+1usergxs = usergxs+1usergys = usergys+1userye = userys+userymminus1userxe = userxs+userxmminus1usergye = usergys+usergymminus1usergxe = usergxs+usergxmminus1

Fig 11Creation of a DA for the Bratu problem

Developing a Geodynamics Simulator with PETSc 13

subroutine FormFunctionPETSc(uFuserierr)implicit none

Vec uFinteger ierrtype (userctx) user

double precisionpointer lu v()lf v()Vec uLocal

call DAGetLocalVector(userdauLocalierr)call DAGlobalToLocalBegin(userdauINSERT VALUESuLocalierr)call DAGlobalToLocalEnd(userdauINSERT VALUESuLocalierr)

call VecGetArrayF90(uLocallu vierr)call VecGetArrayF90(Flf vierr)

Actually compute the local portion of the residualcall FormFunctionLocal(lu vlf vuserierr)

call VecRestoreArrayF90(uLocallu vierr)call VecRestoreArrayF90(Flf vierr)

call DARestoreLocalVector(userdauLocalierr)

Fig 12Parallel residual calculation for the Bratu problem using PETSc

u(usergxsusergxeusergysusergye) which now includes ghostvalues

The changes toFormJacobian() are similar resizing the input array andchanging slightly the calculation of row and column indices and can be found inthe PETSc example source We have now finished the initial introduction of PETSclinear algebra and the simulation is ready to run in parallel

32 Rapid Evolution

The user who is willing to raise the level of abstraction in a code can start by em-ploying theDA to describe the problem domain and discretization The data interfacemimics a multidimensional array and is thus ideal for finite difference finite volumeand low-order finite element schemes on a logically rectangular grid In fact after thedefinition of aFormFunction() routine the simulation is ready to run becausethe nonlinear solver provides approximate Jacobians automatically as discussed inSection 4

14 Matthew G Knepley et al

DASetLocalFunction(userda (DALocalFunction1) FormFunctionLocal)

Fig 13Using the DA to form a function

int FormFunctionLocal(DALocalInfo infodouble udouble fAppCtx user)double two = 20hxhyhxdhyhydhxscdouble uijuxxuyyint ij

hx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhyuserminusgtlambdahxdhy = hxhyhydhx = hyhx Compute function over the locally owned part of the grid for (j=infominusgtys jltinfominusgtys+infominusgtym j++)

for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++) if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

f[j][i] = u[j][i] else

uij = u[j][i]uxx = (twouij minus u[j][iminus1] minus u[j][i+1])hydhxuyy = (twouij minus u[jminus1][i] minus u[j+1][i])hxdhyf[j][i] = uxx + uyy minus scexp(uij)

Fig 14FormFunctionLocal() for the Bratu problem

We begin by examining the Bratu problem from the last section only this time inC We can now provide our local residual routine to theDA see Fig 13

The grid information is passed toFormFunctionLocal() in a DALocalInfo structure as shown in Fig 14

The fields are passed directly as multidimensional C arrays complete with ghostvalues An application scientist can use a boilerplate example such as the Bratu prob-lem merely altering the local physics calculation inFormFunctionLocal() torapidly obtain a parallel scalable simulation that runs on the desktop as well as onmassively parallel supercomputers

If the user has an expression for the Jacobian of the residual function then thismatrix can also be computed by using theDA interface with the call as in Fig 15and the corresponding routine for the local calculation in Fig 16

Developing a Geodynamics Simulator with PETSc 15

DASetLocalJacobian(userda (DALocalFunction1) FormJacobianLocal)

Fig 15Using the DA to form a Jacobian

int FormJacobianLocal(DALocalInfo infodouble xMat jacAppCtx user)MatStencil col[5]rowdouble lambdav[5]hxhyhxdhyhydhxscint ij

lambda = userminusgtparamhx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhylambdahxdhy = hxhy hydhx = hyhx

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

rowj = j rowi = i boundary points if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

v[0] = 10MatSetValuesStencil(jac1amprow1amprowvINSERT VALUES)

else interior grid points

v[0] = minushxdhy col[0]j = j minus 1 col[0]i = iv[1] = minushydhx col[1]j = j col[1]i = iminus1v[2] = 20(hydhx + hxdhy) minus scexp(x[j][i]) col[2]j = j col[2]i = iv[3] = minushydhx col[3]j = j col[3]i = i+1v[4] = minushxdhy col[4]j = j + 1 col[4]i = iMatSetValuesStencil(jac1amprow5colvINSERT VALUES)

MatAssemblyBegin(jacMAT FINAL ASSEMBLY)MatAssemblyEnd(jacMAT FINAL ASSEMBLY)MatSetOption(jacMAT NEW NONZERO LOCATION ERR)

Fig 16FormJacobianLocal() for the Bratu problem

16 Matthew G Knepley et al

int FormFunctionLocal(DALocalInfo infoField xField fvoid ptr)AppCtx user = (AppCtx)ptrParameter param = userminusgtparamGridInfo grid = userminusgtgriddouble mag w mag uint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1 i j

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

calculateXMomentumResidual(i j x f user)calculateZMomentumResidual(i j x f user)calculatePressureResidual(i j x f user)calculateTemperatureResidual(i j x f user)

Fig 17FormFunctionLocal() for the mantle subduction problem

We have so far presented simple examples involving a perturbed Laplacian butthis development strategy extends far beyond toy problems We have developedlarge-scale parallel geophysical simulations [1] that are producing new results inthe field We began by replacingFormFunctionLocal() in the Bratu examplewith one containing the linear physics of the previous sequential linear subductioncode (dropping the previous codersquos linear solve) Then the PETSc solver results werecompared for correctness with the complete prior subduction code Next the newcode was immediately run in parallel for correctness and efficiency tests Finallythe nonlinear terms from the more complicated physics of the mantle subductionproblem discussed in Section 1 and shown in Figs 17-21 and better quality dis-cretizations of the boundary conditions were added In fact since the structure oftheFormFunction() changed significantly the finalFormFunction() looksnothing like the original but the process of obtaining through the incremental ap-proach reduced the learning curve for PETSc and make correctness checking at thevarious stages much easier

The details of the subduction code are not as important as the recognition thata very complicated physics problem need not involve any more complication in oursimulation infrastructure To give more insight into the actual calculation we showin Fig 22 the explicit calculation of the residual from the continuity equation

Here we have used theDA for a multicomponent problem on a staggered meshwithout alteration because the structure of the discretization remains logically rect-angular

Although theDA manages communication during the solution process data post-processing often necessary for visualization or analysis also requires this function-ality For the mantle subduction simulation we want to calculate both the second in-

Developing a Geodynamics Simulator with PETSc 17

int calculateXMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj) f[j][i]u = x[j][i]u minus SlabVel(rsquoUrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]u = x[j][i]u minus 00

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else

f[j][i]u = XNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]u = XMomentumResidual(xijuser)

else in the mantle wedge f[j][i]u = XMomentumResidual(xijuser)

Fig 18X-momentum residual for the mantle subduction problem

variant of the strain rate tensor and the viscosity over the grid (at both cell centers andcorners) Using theDAGlobalToLocalBegin () andDAGlobalToLocalEnd() functions we transferred the global solution data to local ghosted vectors and pro-ceeded with the calculation We can also transfer data from local vectors to a globalvector usingDALocalToGlobal () The entire viscosity calculation is given inFig 23

Note that this framework allows the user to manage arbitrary fields over the do-main For instance a porous flow simulation might manage material properties ofthe medium in this fashion

4 Solvers

The SNES object in PETSc is an abstraction of the ldquoinverserdquo of a nonlinear oper-ator The user provides aFormFunction() routine as seen in Section 3 whichcomputes the action of the operator on an input vector Linear problems can also be

18 Matthew G Knepley et al

int calculateZMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1jlim = infominusgtmyminus1

if (ilt=j) f[j][i]w = x[j][i]w minus SlabVel(rsquoWrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]w = x[j][i]w minus 00

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else

f[j][i]w = ZNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]w = ZMomentumResidual(xijuser)

else in the mantle wedge f[j][i]w = ZMomentumResidual(xijuser)

Fig 19Z-momentum residual for the mantle subduction problem

int calculatePressureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj | | jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lid or slab f[j][i]p = x[j][i]p

else if ((i==ilim | | j==jlim ) ampamp paramminusgtibound==BC ANALYTIC ) on an analytic boundary f[j][i]p = x[j][i]p minus Pressure(ijuser)

else in the mantle wedge f[j][i]p = ContinuityResidual(xijuser)

Fig 20Pressure or continuity residual for the mantle subduction problem

Developing a Geodynamics Simulator with PETSc 19

int calculateTemperatureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (j==0) on the surface f[j][i]T = x[j][i]T + x[j+1][i]T + PetscMax(x[j][i]T00)

else if (i==0) slab inflow boundary f[j][i]T = x[j][i]T minus PlateModel(jPLATE SLABuser)

else if (i==ilim ) right side boundary mag u = 10 minus pow( (10minusPetscMax(PetscMin(x[j][iminus1]uparamminusgtcb10)00)) 50 )f[j][i]T = x[j][i]T minus mag ux[jminus1][iminus1]T minus (10minusmag u)PlateModel(jPLATE LID user)

else if (j==jlim ) bottom boundary mag w = 10 minus pow( (10minusPetscMax(PetscMin(x[jminus1][i]wparamminusgtsb10)00)) 50 )f[j][i]T = x[j][i]T minus mag wx[jminus1][iminus1]T minus (10minusmag w)

else in the mantle wedge f[j][i]T = EnergyResidual(xijuser)

Fig 21Temperature or energy residual for the mantle subduction problem

double precision ContinuityResidual(Field x int i int j AppCtx user)

GridInfo grid = userminusgtgriddouble precision uEuWwNwSdudxdwdz

uW = x[j][iminus1]u uE = x[j][i]u dudx = (uE minus uW)gridminusgtdxwS = x[jminus1][i]w wN = x[j][i]w dwdz = (wN minus wS)gridminusgtdzreturn dudx + dwdz

Fig 22Calculation of the continuity residual for the mantle subduction problem

solved by usingSNES with no loss of efficiency compared to using the underlyingKSP object (PETSc linear solver object) directly ThusSNES seems the appropri-ate framework for a general-purpose simulator providing the flexibility to add orsubtract nonlinear terms from the equation at will

TheSNES solver does not require a user to implement a Jacobian Default rou-tines are provided to compute a finite difference approximation using coloring toaccount for the sparsity of the matrix The user does not even need the PETScMatinterface but can initially interact only withVec objects making the transition to

PETSc nearly painless However these approximations to the Jacobian can have nu-merical difficulties and are not as efficient as direct evaluation Therefore the user

20 Matthew G Knepley et al

Compute both the second invariant of the strain rate tensor and the viscosity int ViscosityField(DA da Vec X Vec VAppCtx user)

Parameter param = userminusgtparamGridInfo grid = userminusgtgridVec localXField v xdouble eps dx dz T epsC TCint ijisjsimjmilim jlim ivt

ivt = paramminusgtiviscparamminusgtivisc = paramminusgtoutput ivisc

DACreateLocalVector(da amplocalX)DAGlobalToLocalBegin(da X INSERT VALUES localX)DAGlobalToLocalEnd(da X INSERT VALUES localX)DAVecGetArray(dalocalX(void)ampx)DAVecGetArray(daV(void)ampv)

Parameters dx = gridminusgtdx dz = gridminusgtdzilim = gridminusgtniminus1 jlim = gridminusgtnjminus1

Compute real temperature strain rate and viscosity DAGetCorners(daampisampjsPETSCNULL ampimampjmPETSCNULL )for (j=js jltjs+jm j++)

for (i=is iltis+im i++) T = paramminusgtpotentialT x[j][i]T exp( (jminus05)dzparamminusgtz scale )if (iltilim ampamp jltjlim )

TC = paramminusgtpotentialT TInterp(xij) exp( jdzparamminusgtz scale ) else

TC = T Compute the values at both cell centers and cell corners eps = CalcSecInv(xijCELL CENTERuser)epsC = CalcSecInv(xijCELL CORNERuser)v[j][i]u = epsv[j][i]w = epsCv[j][i]p = Viscosity(Tepsdz(jminus05)param)v[j][i]T = Viscosity(TCepsCdzjparam)

DAVecRestoreArray(daV(void)ampv)DAVecRestoreArray(dalocalX(void)ampx)paramminusgtivisc = ivt

Fig 23 Calculation of the second invariant of the strain tensor and viscosity field for themantle subduction problem

Developing a Geodynamics Simulator with PETSc 21

KSP ksp subkspPC bjacobi ilu

SNESGetKSP(snes ampksp)KSPGetPC(ksp ampbjacobi)PCBJacobiGetSubKSP(bjacobi PETSCNULL PETSCNULL ampsubksp)KSPGetPC(subksp[0] ampilu)PCILUSetLevels(ilu levels)

Fig 24Customizing the preconditioner on a single process

has the option of providing a routine or of using the ADIC [15] or similar system forautomatic differentiation The finite difference approximation can be activated byusing the-snes fd and-mat fd coloring freq options or by providing theSNESDefaultComputeJacobianColor () function toSNESSetJacobian()

Through theSNES object the user may also access the great range of directand iterative linear solvers and preconditioners provided by PETSc Sixteen differentKrylov solvers are available including GMRES BiCGStab and LSQR along withinterfaces to popular sparse direct packages such as MUMPS [4] and SuperLU [9]In addition to a variety of incomplete factorization preconditioners including PI-LUT [16] PETSc supports additive Schwartz preconditioning algebraic multigridthrough BoomerAMG [13] and the geometric multigrid discussed below The fullpanoply of solvers and preconditioners available is catalogued on the PETSc Web-site [3]

The modularity of PETSc allows users to easily customize each solver For in-stance suppose the user wishes to increase the number of levels in ILU(k) precon-ditioning on one block of a block-Jacobi scheme The code fragment in Fig 24 willset the number of levels of fill tolevels

PETSc provides an elegant framework for managing geometric multigrid in com-bination with aDA which is abstracted in theDMMGobject In the same way thatany linear solve can be performed withSNES any preconditioner or solver com-bination is available inDMMGby using a single level That is the same user codesupports both geometric multigrid as well as direct methods Krylov methods etcWhen creating aDMMG the user specifies the number of levels of grid refinementand provides the coarse-grid information (see Fig 25) which is always aDA objectat present although unstructured prototypes are being developed Then theDMMGobject calculates the intergrid transfer operators prolongation and restriction and

allocates objects to hold the solution at each level The user must provide the actionof the residual operatorF and can optionally provide a routine to compute the Jaco-bian matrix (which will be called on each level) and a routine to compute the initialguess as in Fig 26

With this information the user can now solve the equation and retrieve the solu-tion vector as in Fig 27

22 Matthew G Knepley et al

DMMGCreate(comm gridmglevels user ampdmmg)DMMGSetDM(dmmg (DM) da)

Fig 25Creating a DMMG

DMMGSetSNESLocal(dmmg FormFunctionLocal FormJacobianLocal 0 0)DMMGSetInitialGuess(dmmg FormInitialGuess)

Fig 26Providing discretization and assembly routines to DMMG

DMMGSolve(dmmg)soln = DMMGGetx(dmmg)

Fig 27Solving the problem with DMMG

5 Extensions

PETSc is not a complete environment for simulating physical phenomena Ratherit is a set of tools that allow the user to assemble such an environment tailored toa specific application As such PETSc will never provide every facility appropriatefor a given simulation However the user can easily extend PETSc and supplementits capabilities We present two examples in the context of the mantle subductionsimulation

51 Simple Continuation

Because of the highly nonlinear dependence of viscosity on temperature and strainrate in equation (1) the iteration in Newtonrsquos method can fail to converge withouta good starting guess of the solution On the other hand for the isoviscous casewhere temperature is coupled to velocity only through advection a solution is easilyreached It is therefore natural to propose a continuation scheme in the viscosity be-ginning with constant viscosity and progressing to full variability A simple adaptivecontinuation model was devised in which the viscosity was raised to a power be-tween zero and oneη rarr ηα whereα=0 corresponds to constant viscosity andα=1corresponds to natural viscosity variation In the continuation loopα was increasedtowards unity by an amount dependent on the rate of convergence of the previousSNES solve

PETSc itself provides no special support for continuation but it is sufficientlymodular that a continuation loop was readily constructed in the application codeusing repeated calls toDMMGSolve() and found to work quite well

Developing a Geodynamics Simulator with PETSc 23

52 Simple Steering

No rigorous justification had been given for the continuation strategy above andconsequently it was possible that for certain initial conditions Newton would fail toconverge thereby resulting in extremely long run times An observant user could de-tect this situation and abort the run but all the potentially useful solution informationup to that point would be lost A clean way to asynchronously abort the continua-tion loop was thus needed On architectures where OS signals are available PETScprovides an interface for registering signal handlers Thus we were able to define ahandler The user wishing to change the control flow of the simulation simply sendsthe appropriate signal to the process This sets a flag in the user data that causes thechange at the next iteration

6 Simulation Results

A long-standing debate has existed between theorists and observationalists over thethermal structure of subduction zones Modelers using constant viscosity flow sim-ulations to compute thermal structure have predicted relatively cold subductionzones [19] Conversely observationalists who use heat flow measurements and min-eralogical thermobarometry to estimate temperatures at depth have long claimedthat subduction zones are hotter than predicted They have invoked the presence ofstrong upwelling of hot mantle material to supply the heat This upwelling was neverpredicted by isoviscous flow models however

The debate remained unresolved until recently when several groups includingour own succeeded in developing simulations with realistically variable viscosity(for other examples [810121821]) A comparison of flow and temperature fieldsfor variable and constant viscosity simulations generated with our code is shownin Fig 28 Results from these simulations are exciting because they close the gapbetween models and observations they predict hotter mantle temperatures steepersurface thermal gradients and upwelling mantle flow

Furthermore these simulations allow for quantitative predictions of variation ofobservables with subduction parameters A recent study of subduction zone earth-quakes has identified an intriguing trend the vertical distance from subduction zonevolcanoes to the surface of the subducting slab is anti-correlated with descent rate ofthe slab [11] Preliminary calculations with our model are consistent with this trendindicating that flow and thermal structure may play an important role in determiningnot only the quantity and chemistry of magmas but also their path of transport to thesurface Further work is required to resolve this issue

Simulating the geodynamics of subduction is an example of our success in port-ing an existing code into the PETSc framework simultaneously parallelizing thecode and increasing its functionality Using this code as a template we rapidly de-veloped a related simulation of another tectonic boundary the mid-ocean ridge Thiswork was done to address a set of observations of systematic morphological asym-metry in the global mid-ocean ridge system [7] Our model confirms the qualitative

24 Matthew G Knepley et al

Dep

th k

m

log10

η and flow field

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24 Temperature degC

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Distance km

Dep

th k

m

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24

Distance km

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Isov

isco

usV

aria

ble

Vis

cosi

ty

(a) (b)

(c) (d)

Fig 282D viscosity and potential temperature fields from simulations on 8 processors with230112 degrees of freedom Panels in the top row are from a simulation withα=1 in equation(1) Panels in bottom row haveα=0 The white box in panels (a) and (c) shows the region inwhich temperature is plotted in panels (b) and (d)(a) Colors showlog10 of the viscosity fieldNote that there are more than five orders of magnitude variation in viscosity Arrows show theflow direction and magnitude (the slab is being subducted at a rate of 6 cmyear) Upwellingis evident in the flow field near the wedge corner(b) Temperature field from the variableviscosity simulation 1100C isotherm is shown as a dashed line(c) (Constant) viscosity andflow field from the isoviscous simulation Strong flow is predicted at the base of the crustdespite the low-temperature rock there No upwelling flow is predicted(d) Temperature fieldfrom isoviscous simulation Note that the mantle wedge corner is much colder than in (b)

mechanism that had been proposed to explain these observations Furthermore itshows a quantitative agreement with trends in the observed data [17] As with ourmodels of subduction the key to demonstrating the validity of the hypothesized dy-namics was simulating them with the strongly nonlinear rheology that characterizesmantle rock on geologic timescales The computational tools provided by PETSc en-abled us easily handle this rheology reducing development time and allowing us tofocus on model interpretation

References

1 PETSc SNES Example 30 httpwwwmcsanlgovpetscpetsc-2

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 4: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

4 Matthew G Knepley et al

existing compile system (iemake) and to initialize the PETSc runtime This maybe done in two ways

Fig 2 Interface hierarchy in a PETSc application

The simplest approach is to adopt the PETSc makefile structure which is pre-sented in Fig 3 for an application using the nonlinear solver package The user in-cludes thebmakecommonbase which defines both rules for compiling sourceand variables that are used during the compile and link Then only a simple rule forthe executable is necessary using the appropriate PETSc library variable Individualcompilation can be customized with the variables shown at the top of the makefile

Users with large existing build systems may choose not to inherit the PETScmake rules but instead use a lower-level interface based only on PETSc make vari-ables A boilerplate example is given in Fig 4 The user now includes only thebmakecommonvariables file which defines the make variables but does notprescribe any rules for compilation or linking The variables provide informationabout all compilation flags libraries and external packages necessary to link withPETSc

Before any PETSc code can be run the user must callPetscInitialize() and likewise after all PETSc code has completed the user must callPetscFinalize () This is analogous to the requirements for using MPI In fact if the userhas not done so already PETSc will handle the initialization and cleanup of MPIautomatically in these routines A simple C driver is shown in Fig 5

The initialization call provides the command line arguments to PETSc for pro-cessing and can take an optional help string for the user (printed when the-help op-tion is given) The finalization call frees any resources used by PETSc and provides

Developing a Geodynamics Simulator with PETSc 5

CFLAGS =FFLAGS =CPPFLAGS=FPPFLAGS=

include $PETSCDIRbmakecommonbase

ex1 ex1o utilo chkoptsminus$CLINKER minuso $ $ˆ $PETSCSNES LIB$RM $ˆ

ex1f ex1fo physo chkoptsminus$FLINKER minuso $ $ˆ $PETSCFORTRAN LIB $PETSCSNES LIB$RM $ˆ

Fig 3Typical PETSc makefile

include $PETSCDIRbmakecommonvariables

comassageCpl $lt$CC minusc $MY CFLAGS $COPTFLAGS $CFLAGS $CCPPFLAGS $lt

FomassageFortranpl $lt$FC minusc $MY FFLAGS $FOPTFLAGS $FFLAGS $FCPPFLAGS $lt

ex1 ex1o utilominus$CLINKER minuso $ $ˆ $PETSCSNES LIB$RM $lt

ex1f ex1fo physominus$FLINKER minuso $ $ˆ $PETSCFORTRAN LIB $PETSCSNES LIB$RM $lt

Fig 4Boilerplate custom makefile using PETSc

summary logging and diagnostic information most notably performance profilingThe equivalent Fortran driver is shown in Fig 6

Notice that the command line arguments are now obtained directly from the For-tran runtime library rather than the user code Once these calls are inserted intothe application code the user can verify a successful link and run with the PETSclibraries

6 Matthew G Knepley et al

static char help[ ] = Boilerplate PETSc Examplenn

include petsch

int main(int argc char argv)

ierr = PetscInitialize(ampargc ampargv (char ) 0 help)CHKERRQ(ierr)

User code

return PetscFinalize()

Fig 5Boilerplate C driver for PETSc

program mainimplicit none

include includefincludepetsch

integer ierr

call PetscInitialize(PETSCNULL CHARACTER ierr)

User code

call PetscFinalize(ierr)end

Fig 6Boilerplate Fortran driver for PETSc

3 Data Distribution and Linear Algebra

The PETSc solvers (discussed in Section 4) require the user to provide C or Fortranroutines that compute the residual of the equation they wish to solve (after they havediscretized the PDE) and optionally the Jacobian of that residual function We re-fer to these functions generically asFormFunction() andFormJacobian() For nonlinear problems the PETSc solvers use truncated Newton methods with linesearches or trust-regions for robustness and possibly approximate or incomplete Ja-cobians for efficiency

The central objects in any PETSc simulation are the abstractions from linearalgebra vectors and matrices or in PETSc terms theVec andMat classes Theseform the basis of all solver and preconditioner interfaces as well as providing the linkto user-supplied discretization and physics routines Thus the most important stepin the migration of an application to the PETSc framework is the incorporation of its

Developing a Geodynamics Simulator with PETSc 7

linear algebra and data distribution abstractions Two obvious strategies emerge foraccomplishing this integration a gradual approach that seeks to minimize the changeto existing code and a more aggressive approach that leverages as much of the PETSctechnology as possible The next two subsections address these complementary pathstoward integration We use the simple example of the solid-fuel ignition problem orBratu problem to illustrate the gradual evolution of a simulation and then we givean example of mantle subduction for the more aggressive approach

31 Gradual Evolution

A new PETSc user with very complicated code which perhaps was originally writ-ten by someone else may opt to make as few changes as possible when moving toPETSc PETSc facilitates this approach with low-level interfaces to bothVec andMat objects that closely resemble common serial data structures Vectors are espe-cially easy to migrate because the default PETSc storage format contiguous arrayson each process is that most common in serial applications Matrices are somewhatmore complicated and in order to hide the actual matrix data storage format PETScrequires the user to access values through a functional interface [6]

Let us examine the Bratu problem as an example of integrating PETSc vectorsinto an existing simulation [2] This problem is modeled by the partial differentialequation

minus∆uminus λeu = 0 (4)

We take the domain to be the unit square and impose homogeneous Dirichlet bound-ary conditions on the edges We discretize the equation using a 5-point stencil finitedifference scheme which results in a set of nonlinear algebraic equationsF (uh) = 0where we useuh to indicate the discrete solution vector In Fig 7 we present a For-tran 90 routine that calculates the residualF as a function of the input vectoruhIt also takes a user context as input that holds the domain information and problemcoefficient

Notice that the boundary conditions on the residual are of the formuminusuΓ whereuΓ are the specified boundary values because we are driving the residualF (u) tozero An alternative would be to eliminate those variables altogether substitutingthe boundary values in any interior calculation However this technique currentlyimposes a greater burden on the programmer

When integrating PETSc vectors into this code we can let Fortran allocate thestorage or we can use PETSc for allocation In Fig 8 we let PETSc managethe memory The input vectors already have storage allocated by PETSc whichcan be accessed as an F90 array by using theVecGetArrayF90() functionor in C by usingVecGetArray() A sampledriver() routine shows howPETSc allocates vector storage If we let Fortran manage memory then we useVecCreateSeqWithArray() orVecCreateMPIWithArray() in parallelto create theVec objects

During a Newton iteration to solve this equation we would require the JacobianJ of the mappingF Rather than being accessed as a multidimensional array the

8 Matthew G Knepley et al

module f90moduletype userctx

The start end and number of vertices in the xminus and yminusdirectionsinteger xsxeysyeinteger mxmydouble precision lambda

end type userctxcontainsend module f90module

subroutine FormFunction(uFuserierr)use f90moduletype (userctx) userdouble precision u(userxsuserxeuserysuserye)double precision F(userxsuserxeuserysuserye)double precision twoonehxhyhxdhyhydhxscuijuxxuyyinteger ijierr

hx = 10dble(usermxminus1)hy = 10dble(usermyminus1)sc = hxhyuserlambdahxdhy = hxhyhydhx = hyhx

do 20 j=userysuseryedo 10 i=userxsuserxe

Apply boundary conditionsif (i == 1 or j == 1 or i == usermx or j == usermy) then

F(ij) = u(ij) Apply finite difference scheme

elseuij = u(ij)uxx = hydhx (20uij minus u(iminus1j) minus u(i+1j))uyy = hxdhy (20uij minus u(ijminus1) minus u(ij+1))F(ij) = uxx + uyy minus scexp(uij)

endif10 continue20 continue

Fig 7Residual calculation for the Bratu problem

Developing a Geodynamics Simulator with PETSc 9

subroutine driver(uFuserierr)implicit none

Vec uFint Nparameter(N=10000)

call VecCreate(PETSCCOMM WORLDuierr)call VecSetSizes(uPETSCDECIDENierr)call VecDuplicate(uFierr)call SolverLoop(uFierr)call VecDestroy(uierr)call VecDestroy(Fierr)end

subroutine FormFunctionPETSc(uFuserierr)use f90moduleimplicit none

Vec uFtype (userctx) userdouble precisionpointer u v()f v()

call VecGetArrayF90(uu vierr)call VecGetArrayF90(Ff vierr)call FormFunction(u vf vuserierr)call VecRestoreArrayF90(uu vierr)call VecRestoreArrayF90(Ff vierr)end

Fig 8Driver for the Bratu problem using PETSc

PETScMat object provides theMatSetValues () function to set logically denseblocks of values into the structure A function that computes the Jacobian and storesit in a PETSc matrix is shown in Fig 9 With this addition the user can now runserial code using the PETSc solvers (details are given in Section 4) Note that thestorage mechanism is the only change to user code necessary for its incorporationinto the PETSc framework using the wrapper in Fig 8

Although PETSc removes the need for low-level parallel programming such asdirect calls to the MPI library the user must still identify parallelism inherent inthe application and must structure the computation accordingly The first step is topartition the domain and have each process calculate the residual and Jacobian onlyover its local piece This is easily accomplished by redefining thexs xe ys andyevariables on each process converting loops over the entire domain into loops overthe local domain However this method uses nearest-neighbor information and thus

10 Matthew G Knepley et al

subroutine FormJacobian(ujacuserierr)use f90moduletype (userctx) userMat jacdouble precision x(userxsuserxeuserysuserye)double precision twoonehxhyhxdhyhydhxscv(5)integer rowcol(5)ijierr

one = 10two = 20hx = onedble(usermxminus1)hy = onedble(usermyminus1)sc = hxhyuserlambdahxdhy = hxhyhydhx = hyhxdo 20 j=userysuserye

row = (j minus userys)userxm minus 1do 10 i=userxsuserxe

row = row + 1 boundary points

if (i == 1 or j == 1 or i == usermx or j == usermy) thencol(1) = rowv(1) = onecall MatSetValues(jac1row1colvINSERT VALUESierr)

interior grid pointselse

v(1) = minushxdhyv(2) = minushydhxv(3) = two(hydhx + hxdhy) minus scexp(x(ij))v(4) = minushydhxv(5) = minushxdhycol(1) = row minus userxmcol(2) = row minus 1col(3) = rowcol(4) = row + 1col(5) = row + userxmcall MatSetValues(jac1row5colvINSERT VALUESierr)

endif10 continue20 continue

Fig 9Jacobian calculation for the Bratu problem using PETSc

Developing a Geodynamics Simulator with PETSc 11

we must have access to some values not present locally on the process as shown inFig 10

Box-type stencil Star-type stencil

Proc 6

Proc 0 Proc 0Proc 1 Proc 1

Proc 6

Fig 10Star and box stencils using a DA

We must store values for theseghostpoints locally so that they may be used forthe computation In addition we must ensure that these values are identical to the cor-responding values on the neighboring processes At the lowest level PETSc providesghosted vectors created usingVecCreateGhost () with storage for some valuesowned by other processes These values are explicitly indicated during constructionTheVecGhostUpdateBegin () andVecGhostUpdateEnd () functions trans-fer values between ghost storage and the corresponding storage on other processesThis method however can become complicated for the user Therefore for the com-mon case of a logically rectangular grid PETSc provides theDA object to managethe determination allocation and coherence of ghost values TheDA object is dis-cussed fully in Section 32 but we give here a short introduction in the context of theBratu problem

TheDA object represents a distributed structured (possibly staggered) grid andthus has more information than our serial grid If we augment our user context toinclude information about ghost values we can obtain all the grid information fromtheDA Figure 11 shows the creation of aDA when one initially know only the totalnumber of vertices in thex andy directions Alternatively we could have specifiedthe local number of vertices in each direction

Now we need only modify our PETSc residual routine to update the ghost valuesas shown in Fig 12

Notice that the ghost value scatter is a two-step operation This allows thecommunication to overlap with local computation that may be taking place whilethe messages are in transit The only change necessary for the residual calcu-lation itself is the correct declaration of the input arraydouble precision

12 Matthew G Knepley et al

type userctxDA da

The start end and number of vertices in the xminusdirectioninteger xsxexm

The start end and number of ghost vertices in the xminusdirectioninteger gxsgxegxm

The start end and number of vertices in the yminusdirectioninteger ysyeym

The start end and number of ghost vertices in the yminusdirectioninteger gysgyegym

The number of vertices in the xminus and yminusdirectionsinteger mxmy

The MPI rank of this processinteger rank

The coefficient in the Bratu equationdouble precision lambda

end type userctx

call DACreate2d(PETSCCOMM WORLDDA NONPERIODICDA STENCIL STAR ampamp usermxusermyPETSCDECIDEPETSCDECIDE11 ampamp PETSCNULL INTEGERPETSCNULL INTEGERuserdaierr)call DAGetCorners(userdauserxsuserysPETSCNULL INTEGER amp

amp userxmuserymPETSCNULL INTEGERierr)call DAGetGhostCorners(userdausergxsusergys amp

amp PETSCNULL INTEGERusergxmusergym ampamp PETSCNULL INTEGERierr)

Here we shift the starting indices up by one so that we can easily use the Fortran convention of1minusbased indices rather than0minusbased

userxs = userxs+1userys = userys+1usergxs = usergxs+1usergys = usergys+1userye = userys+userymminus1userxe = userxs+userxmminus1usergye = usergys+usergymminus1usergxe = usergxs+usergxmminus1

Fig 11Creation of a DA for the Bratu problem

Developing a Geodynamics Simulator with PETSc 13

subroutine FormFunctionPETSc(uFuserierr)implicit none

Vec uFinteger ierrtype (userctx) user

double precisionpointer lu v()lf v()Vec uLocal

call DAGetLocalVector(userdauLocalierr)call DAGlobalToLocalBegin(userdauINSERT VALUESuLocalierr)call DAGlobalToLocalEnd(userdauINSERT VALUESuLocalierr)

call VecGetArrayF90(uLocallu vierr)call VecGetArrayF90(Flf vierr)

Actually compute the local portion of the residualcall FormFunctionLocal(lu vlf vuserierr)

call VecRestoreArrayF90(uLocallu vierr)call VecRestoreArrayF90(Flf vierr)

call DARestoreLocalVector(userdauLocalierr)

Fig 12Parallel residual calculation for the Bratu problem using PETSc

u(usergxsusergxeusergysusergye) which now includes ghostvalues

The changes toFormJacobian() are similar resizing the input array andchanging slightly the calculation of row and column indices and can be found inthe PETSc example source We have now finished the initial introduction of PETSclinear algebra and the simulation is ready to run in parallel

32 Rapid Evolution

The user who is willing to raise the level of abstraction in a code can start by em-ploying theDA to describe the problem domain and discretization The data interfacemimics a multidimensional array and is thus ideal for finite difference finite volumeand low-order finite element schemes on a logically rectangular grid In fact after thedefinition of aFormFunction() routine the simulation is ready to run becausethe nonlinear solver provides approximate Jacobians automatically as discussed inSection 4

14 Matthew G Knepley et al

DASetLocalFunction(userda (DALocalFunction1) FormFunctionLocal)

Fig 13Using the DA to form a function

int FormFunctionLocal(DALocalInfo infodouble udouble fAppCtx user)double two = 20hxhyhxdhyhydhxscdouble uijuxxuyyint ij

hx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhyuserminusgtlambdahxdhy = hxhyhydhx = hyhx Compute function over the locally owned part of the grid for (j=infominusgtys jltinfominusgtys+infominusgtym j++)

for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++) if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

f[j][i] = u[j][i] else

uij = u[j][i]uxx = (twouij minus u[j][iminus1] minus u[j][i+1])hydhxuyy = (twouij minus u[jminus1][i] minus u[j+1][i])hxdhyf[j][i] = uxx + uyy minus scexp(uij)

Fig 14FormFunctionLocal() for the Bratu problem

We begin by examining the Bratu problem from the last section only this time inC We can now provide our local residual routine to theDA see Fig 13

The grid information is passed toFormFunctionLocal() in a DALocalInfo structure as shown in Fig 14

The fields are passed directly as multidimensional C arrays complete with ghostvalues An application scientist can use a boilerplate example such as the Bratu prob-lem merely altering the local physics calculation inFormFunctionLocal() torapidly obtain a parallel scalable simulation that runs on the desktop as well as onmassively parallel supercomputers

If the user has an expression for the Jacobian of the residual function then thismatrix can also be computed by using theDA interface with the call as in Fig 15and the corresponding routine for the local calculation in Fig 16

Developing a Geodynamics Simulator with PETSc 15

DASetLocalJacobian(userda (DALocalFunction1) FormJacobianLocal)

Fig 15Using the DA to form a Jacobian

int FormJacobianLocal(DALocalInfo infodouble xMat jacAppCtx user)MatStencil col[5]rowdouble lambdav[5]hxhyhxdhyhydhxscint ij

lambda = userminusgtparamhx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhylambdahxdhy = hxhy hydhx = hyhx

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

rowj = j rowi = i boundary points if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

v[0] = 10MatSetValuesStencil(jac1amprow1amprowvINSERT VALUES)

else interior grid points

v[0] = minushxdhy col[0]j = j minus 1 col[0]i = iv[1] = minushydhx col[1]j = j col[1]i = iminus1v[2] = 20(hydhx + hxdhy) minus scexp(x[j][i]) col[2]j = j col[2]i = iv[3] = minushydhx col[3]j = j col[3]i = i+1v[4] = minushxdhy col[4]j = j + 1 col[4]i = iMatSetValuesStencil(jac1amprow5colvINSERT VALUES)

MatAssemblyBegin(jacMAT FINAL ASSEMBLY)MatAssemblyEnd(jacMAT FINAL ASSEMBLY)MatSetOption(jacMAT NEW NONZERO LOCATION ERR)

Fig 16FormJacobianLocal() for the Bratu problem

16 Matthew G Knepley et al

int FormFunctionLocal(DALocalInfo infoField xField fvoid ptr)AppCtx user = (AppCtx)ptrParameter param = userminusgtparamGridInfo grid = userminusgtgriddouble mag w mag uint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1 i j

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

calculateXMomentumResidual(i j x f user)calculateZMomentumResidual(i j x f user)calculatePressureResidual(i j x f user)calculateTemperatureResidual(i j x f user)

Fig 17FormFunctionLocal() for the mantle subduction problem

We have so far presented simple examples involving a perturbed Laplacian butthis development strategy extends far beyond toy problems We have developedlarge-scale parallel geophysical simulations [1] that are producing new results inthe field We began by replacingFormFunctionLocal() in the Bratu examplewith one containing the linear physics of the previous sequential linear subductioncode (dropping the previous codersquos linear solve) Then the PETSc solver results werecompared for correctness with the complete prior subduction code Next the newcode was immediately run in parallel for correctness and efficiency tests Finallythe nonlinear terms from the more complicated physics of the mantle subductionproblem discussed in Section 1 and shown in Figs 17-21 and better quality dis-cretizations of the boundary conditions were added In fact since the structure oftheFormFunction() changed significantly the finalFormFunction() looksnothing like the original but the process of obtaining through the incremental ap-proach reduced the learning curve for PETSc and make correctness checking at thevarious stages much easier

The details of the subduction code are not as important as the recognition thata very complicated physics problem need not involve any more complication in oursimulation infrastructure To give more insight into the actual calculation we showin Fig 22 the explicit calculation of the residual from the continuity equation

Here we have used theDA for a multicomponent problem on a staggered meshwithout alteration because the structure of the discretization remains logically rect-angular

Although theDA manages communication during the solution process data post-processing often necessary for visualization or analysis also requires this function-ality For the mantle subduction simulation we want to calculate both the second in-

Developing a Geodynamics Simulator with PETSc 17

int calculateXMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj) f[j][i]u = x[j][i]u minus SlabVel(rsquoUrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]u = x[j][i]u minus 00

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else

f[j][i]u = XNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]u = XMomentumResidual(xijuser)

else in the mantle wedge f[j][i]u = XMomentumResidual(xijuser)

Fig 18X-momentum residual for the mantle subduction problem

variant of the strain rate tensor and the viscosity over the grid (at both cell centers andcorners) Using theDAGlobalToLocalBegin () andDAGlobalToLocalEnd() functions we transferred the global solution data to local ghosted vectors and pro-ceeded with the calculation We can also transfer data from local vectors to a globalvector usingDALocalToGlobal () The entire viscosity calculation is given inFig 23

Note that this framework allows the user to manage arbitrary fields over the do-main For instance a porous flow simulation might manage material properties ofthe medium in this fashion

4 Solvers

The SNES object in PETSc is an abstraction of the ldquoinverserdquo of a nonlinear oper-ator The user provides aFormFunction() routine as seen in Section 3 whichcomputes the action of the operator on an input vector Linear problems can also be

18 Matthew G Knepley et al

int calculateZMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1jlim = infominusgtmyminus1

if (ilt=j) f[j][i]w = x[j][i]w minus SlabVel(rsquoWrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]w = x[j][i]w minus 00

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else

f[j][i]w = ZNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]w = ZMomentumResidual(xijuser)

else in the mantle wedge f[j][i]w = ZMomentumResidual(xijuser)

Fig 19Z-momentum residual for the mantle subduction problem

int calculatePressureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj | | jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lid or slab f[j][i]p = x[j][i]p

else if ((i==ilim | | j==jlim ) ampamp paramminusgtibound==BC ANALYTIC ) on an analytic boundary f[j][i]p = x[j][i]p minus Pressure(ijuser)

else in the mantle wedge f[j][i]p = ContinuityResidual(xijuser)

Fig 20Pressure or continuity residual for the mantle subduction problem

Developing a Geodynamics Simulator with PETSc 19

int calculateTemperatureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (j==0) on the surface f[j][i]T = x[j][i]T + x[j+1][i]T + PetscMax(x[j][i]T00)

else if (i==0) slab inflow boundary f[j][i]T = x[j][i]T minus PlateModel(jPLATE SLABuser)

else if (i==ilim ) right side boundary mag u = 10 minus pow( (10minusPetscMax(PetscMin(x[j][iminus1]uparamminusgtcb10)00)) 50 )f[j][i]T = x[j][i]T minus mag ux[jminus1][iminus1]T minus (10minusmag u)PlateModel(jPLATE LID user)

else if (j==jlim ) bottom boundary mag w = 10 minus pow( (10minusPetscMax(PetscMin(x[jminus1][i]wparamminusgtsb10)00)) 50 )f[j][i]T = x[j][i]T minus mag wx[jminus1][iminus1]T minus (10minusmag w)

else in the mantle wedge f[j][i]T = EnergyResidual(xijuser)

Fig 21Temperature or energy residual for the mantle subduction problem

double precision ContinuityResidual(Field x int i int j AppCtx user)

GridInfo grid = userminusgtgriddouble precision uEuWwNwSdudxdwdz

uW = x[j][iminus1]u uE = x[j][i]u dudx = (uE minus uW)gridminusgtdxwS = x[jminus1][i]w wN = x[j][i]w dwdz = (wN minus wS)gridminusgtdzreturn dudx + dwdz

Fig 22Calculation of the continuity residual for the mantle subduction problem

solved by usingSNES with no loss of efficiency compared to using the underlyingKSP object (PETSc linear solver object) directly ThusSNES seems the appropri-ate framework for a general-purpose simulator providing the flexibility to add orsubtract nonlinear terms from the equation at will

TheSNES solver does not require a user to implement a Jacobian Default rou-tines are provided to compute a finite difference approximation using coloring toaccount for the sparsity of the matrix The user does not even need the PETScMatinterface but can initially interact only withVec objects making the transition to

PETSc nearly painless However these approximations to the Jacobian can have nu-merical difficulties and are not as efficient as direct evaluation Therefore the user

20 Matthew G Knepley et al

Compute both the second invariant of the strain rate tensor and the viscosity int ViscosityField(DA da Vec X Vec VAppCtx user)

Parameter param = userminusgtparamGridInfo grid = userminusgtgridVec localXField v xdouble eps dx dz T epsC TCint ijisjsimjmilim jlim ivt

ivt = paramminusgtiviscparamminusgtivisc = paramminusgtoutput ivisc

DACreateLocalVector(da amplocalX)DAGlobalToLocalBegin(da X INSERT VALUES localX)DAGlobalToLocalEnd(da X INSERT VALUES localX)DAVecGetArray(dalocalX(void)ampx)DAVecGetArray(daV(void)ampv)

Parameters dx = gridminusgtdx dz = gridminusgtdzilim = gridminusgtniminus1 jlim = gridminusgtnjminus1

Compute real temperature strain rate and viscosity DAGetCorners(daampisampjsPETSCNULL ampimampjmPETSCNULL )for (j=js jltjs+jm j++)

for (i=is iltis+im i++) T = paramminusgtpotentialT x[j][i]T exp( (jminus05)dzparamminusgtz scale )if (iltilim ampamp jltjlim )

TC = paramminusgtpotentialT TInterp(xij) exp( jdzparamminusgtz scale ) else

TC = T Compute the values at both cell centers and cell corners eps = CalcSecInv(xijCELL CENTERuser)epsC = CalcSecInv(xijCELL CORNERuser)v[j][i]u = epsv[j][i]w = epsCv[j][i]p = Viscosity(Tepsdz(jminus05)param)v[j][i]T = Viscosity(TCepsCdzjparam)

DAVecRestoreArray(daV(void)ampv)DAVecRestoreArray(dalocalX(void)ampx)paramminusgtivisc = ivt

Fig 23 Calculation of the second invariant of the strain tensor and viscosity field for themantle subduction problem

Developing a Geodynamics Simulator with PETSc 21

KSP ksp subkspPC bjacobi ilu

SNESGetKSP(snes ampksp)KSPGetPC(ksp ampbjacobi)PCBJacobiGetSubKSP(bjacobi PETSCNULL PETSCNULL ampsubksp)KSPGetPC(subksp[0] ampilu)PCILUSetLevels(ilu levels)

Fig 24Customizing the preconditioner on a single process

has the option of providing a routine or of using the ADIC [15] or similar system forautomatic differentiation The finite difference approximation can be activated byusing the-snes fd and-mat fd coloring freq options or by providing theSNESDefaultComputeJacobianColor () function toSNESSetJacobian()

Through theSNES object the user may also access the great range of directand iterative linear solvers and preconditioners provided by PETSc Sixteen differentKrylov solvers are available including GMRES BiCGStab and LSQR along withinterfaces to popular sparse direct packages such as MUMPS [4] and SuperLU [9]In addition to a variety of incomplete factorization preconditioners including PI-LUT [16] PETSc supports additive Schwartz preconditioning algebraic multigridthrough BoomerAMG [13] and the geometric multigrid discussed below The fullpanoply of solvers and preconditioners available is catalogued on the PETSc Web-site [3]

The modularity of PETSc allows users to easily customize each solver For in-stance suppose the user wishes to increase the number of levels in ILU(k) precon-ditioning on one block of a block-Jacobi scheme The code fragment in Fig 24 willset the number of levels of fill tolevels

PETSc provides an elegant framework for managing geometric multigrid in com-bination with aDA which is abstracted in theDMMGobject In the same way thatany linear solve can be performed withSNES any preconditioner or solver com-bination is available inDMMGby using a single level That is the same user codesupports both geometric multigrid as well as direct methods Krylov methods etcWhen creating aDMMG the user specifies the number of levels of grid refinementand provides the coarse-grid information (see Fig 25) which is always aDA objectat present although unstructured prototypes are being developed Then theDMMGobject calculates the intergrid transfer operators prolongation and restriction and

allocates objects to hold the solution at each level The user must provide the actionof the residual operatorF and can optionally provide a routine to compute the Jaco-bian matrix (which will be called on each level) and a routine to compute the initialguess as in Fig 26

With this information the user can now solve the equation and retrieve the solu-tion vector as in Fig 27

22 Matthew G Knepley et al

DMMGCreate(comm gridmglevels user ampdmmg)DMMGSetDM(dmmg (DM) da)

Fig 25Creating a DMMG

DMMGSetSNESLocal(dmmg FormFunctionLocal FormJacobianLocal 0 0)DMMGSetInitialGuess(dmmg FormInitialGuess)

Fig 26Providing discretization and assembly routines to DMMG

DMMGSolve(dmmg)soln = DMMGGetx(dmmg)

Fig 27Solving the problem with DMMG

5 Extensions

PETSc is not a complete environment for simulating physical phenomena Ratherit is a set of tools that allow the user to assemble such an environment tailored toa specific application As such PETSc will never provide every facility appropriatefor a given simulation However the user can easily extend PETSc and supplementits capabilities We present two examples in the context of the mantle subductionsimulation

51 Simple Continuation

Because of the highly nonlinear dependence of viscosity on temperature and strainrate in equation (1) the iteration in Newtonrsquos method can fail to converge withouta good starting guess of the solution On the other hand for the isoviscous casewhere temperature is coupled to velocity only through advection a solution is easilyreached It is therefore natural to propose a continuation scheme in the viscosity be-ginning with constant viscosity and progressing to full variability A simple adaptivecontinuation model was devised in which the viscosity was raised to a power be-tween zero and oneη rarr ηα whereα=0 corresponds to constant viscosity andα=1corresponds to natural viscosity variation In the continuation loopα was increasedtowards unity by an amount dependent on the rate of convergence of the previousSNES solve

PETSc itself provides no special support for continuation but it is sufficientlymodular that a continuation loop was readily constructed in the application codeusing repeated calls toDMMGSolve() and found to work quite well

Developing a Geodynamics Simulator with PETSc 23

52 Simple Steering

No rigorous justification had been given for the continuation strategy above andconsequently it was possible that for certain initial conditions Newton would fail toconverge thereby resulting in extremely long run times An observant user could de-tect this situation and abort the run but all the potentially useful solution informationup to that point would be lost A clean way to asynchronously abort the continua-tion loop was thus needed On architectures where OS signals are available PETScprovides an interface for registering signal handlers Thus we were able to define ahandler The user wishing to change the control flow of the simulation simply sendsthe appropriate signal to the process This sets a flag in the user data that causes thechange at the next iteration

6 Simulation Results

A long-standing debate has existed between theorists and observationalists over thethermal structure of subduction zones Modelers using constant viscosity flow sim-ulations to compute thermal structure have predicted relatively cold subductionzones [19] Conversely observationalists who use heat flow measurements and min-eralogical thermobarometry to estimate temperatures at depth have long claimedthat subduction zones are hotter than predicted They have invoked the presence ofstrong upwelling of hot mantle material to supply the heat This upwelling was neverpredicted by isoviscous flow models however

The debate remained unresolved until recently when several groups includingour own succeeded in developing simulations with realistically variable viscosity(for other examples [810121821]) A comparison of flow and temperature fieldsfor variable and constant viscosity simulations generated with our code is shownin Fig 28 Results from these simulations are exciting because they close the gapbetween models and observations they predict hotter mantle temperatures steepersurface thermal gradients and upwelling mantle flow

Furthermore these simulations allow for quantitative predictions of variation ofobservables with subduction parameters A recent study of subduction zone earth-quakes has identified an intriguing trend the vertical distance from subduction zonevolcanoes to the surface of the subducting slab is anti-correlated with descent rate ofthe slab [11] Preliminary calculations with our model are consistent with this trendindicating that flow and thermal structure may play an important role in determiningnot only the quantity and chemistry of magmas but also their path of transport to thesurface Further work is required to resolve this issue

Simulating the geodynamics of subduction is an example of our success in port-ing an existing code into the PETSc framework simultaneously parallelizing thecode and increasing its functionality Using this code as a template we rapidly de-veloped a related simulation of another tectonic boundary the mid-ocean ridge Thiswork was done to address a set of observations of systematic morphological asym-metry in the global mid-ocean ridge system [7] Our model confirms the qualitative

24 Matthew G Knepley et al

Dep

th k

m

log10

η and flow field

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24 Temperature degC

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Distance km

Dep

th k

m

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24

Distance km

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Isov

isco

usV

aria

ble

Vis

cosi

ty

(a) (b)

(c) (d)

Fig 282D viscosity and potential temperature fields from simulations on 8 processors with230112 degrees of freedom Panels in the top row are from a simulation withα=1 in equation(1) Panels in bottom row haveα=0 The white box in panels (a) and (c) shows the region inwhich temperature is plotted in panels (b) and (d)(a) Colors showlog10 of the viscosity fieldNote that there are more than five orders of magnitude variation in viscosity Arrows show theflow direction and magnitude (the slab is being subducted at a rate of 6 cmyear) Upwellingis evident in the flow field near the wedge corner(b) Temperature field from the variableviscosity simulation 1100C isotherm is shown as a dashed line(c) (Constant) viscosity andflow field from the isoviscous simulation Strong flow is predicted at the base of the crustdespite the low-temperature rock there No upwelling flow is predicted(d) Temperature fieldfrom isoviscous simulation Note that the mantle wedge corner is much colder than in (b)

mechanism that had been proposed to explain these observations Furthermore itshows a quantitative agreement with trends in the observed data [17] As with ourmodels of subduction the key to demonstrating the validity of the hypothesized dy-namics was simulating them with the strongly nonlinear rheology that characterizesmantle rock on geologic timescales The computational tools provided by PETSc en-abled us easily handle this rheology reducing development time and allowing us tofocus on model interpretation

References

1 PETSc SNES Example 30 httpwwwmcsanlgovpetscpetsc-2

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 5: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

Developing a Geodynamics Simulator with PETSc 5

CFLAGS =FFLAGS =CPPFLAGS=FPPFLAGS=

include $PETSCDIRbmakecommonbase

ex1 ex1o utilo chkoptsminus$CLINKER minuso $ $ˆ $PETSCSNES LIB$RM $ˆ

ex1f ex1fo physo chkoptsminus$FLINKER minuso $ $ˆ $PETSCFORTRAN LIB $PETSCSNES LIB$RM $ˆ

Fig 3Typical PETSc makefile

include $PETSCDIRbmakecommonvariables

comassageCpl $lt$CC minusc $MY CFLAGS $COPTFLAGS $CFLAGS $CCPPFLAGS $lt

FomassageFortranpl $lt$FC minusc $MY FFLAGS $FOPTFLAGS $FFLAGS $FCPPFLAGS $lt

ex1 ex1o utilominus$CLINKER minuso $ $ˆ $PETSCSNES LIB$RM $lt

ex1f ex1fo physominus$FLINKER minuso $ $ˆ $PETSCFORTRAN LIB $PETSCSNES LIB$RM $lt

Fig 4Boilerplate custom makefile using PETSc

summary logging and diagnostic information most notably performance profilingThe equivalent Fortran driver is shown in Fig 6

Notice that the command line arguments are now obtained directly from the For-tran runtime library rather than the user code Once these calls are inserted intothe application code the user can verify a successful link and run with the PETSclibraries

6 Matthew G Knepley et al

static char help[ ] = Boilerplate PETSc Examplenn

include petsch

int main(int argc char argv)

ierr = PetscInitialize(ampargc ampargv (char ) 0 help)CHKERRQ(ierr)

User code

return PetscFinalize()

Fig 5Boilerplate C driver for PETSc

program mainimplicit none

include includefincludepetsch

integer ierr

call PetscInitialize(PETSCNULL CHARACTER ierr)

User code

call PetscFinalize(ierr)end

Fig 6Boilerplate Fortran driver for PETSc

3 Data Distribution and Linear Algebra

The PETSc solvers (discussed in Section 4) require the user to provide C or Fortranroutines that compute the residual of the equation they wish to solve (after they havediscretized the PDE) and optionally the Jacobian of that residual function We re-fer to these functions generically asFormFunction() andFormJacobian() For nonlinear problems the PETSc solvers use truncated Newton methods with linesearches or trust-regions for robustness and possibly approximate or incomplete Ja-cobians for efficiency

The central objects in any PETSc simulation are the abstractions from linearalgebra vectors and matrices or in PETSc terms theVec andMat classes Theseform the basis of all solver and preconditioner interfaces as well as providing the linkto user-supplied discretization and physics routines Thus the most important stepin the migration of an application to the PETSc framework is the incorporation of its

Developing a Geodynamics Simulator with PETSc 7

linear algebra and data distribution abstractions Two obvious strategies emerge foraccomplishing this integration a gradual approach that seeks to minimize the changeto existing code and a more aggressive approach that leverages as much of the PETSctechnology as possible The next two subsections address these complementary pathstoward integration We use the simple example of the solid-fuel ignition problem orBratu problem to illustrate the gradual evolution of a simulation and then we givean example of mantle subduction for the more aggressive approach

31 Gradual Evolution

A new PETSc user with very complicated code which perhaps was originally writ-ten by someone else may opt to make as few changes as possible when moving toPETSc PETSc facilitates this approach with low-level interfaces to bothVec andMat objects that closely resemble common serial data structures Vectors are espe-cially easy to migrate because the default PETSc storage format contiguous arrayson each process is that most common in serial applications Matrices are somewhatmore complicated and in order to hide the actual matrix data storage format PETScrequires the user to access values through a functional interface [6]

Let us examine the Bratu problem as an example of integrating PETSc vectorsinto an existing simulation [2] This problem is modeled by the partial differentialequation

minus∆uminus λeu = 0 (4)

We take the domain to be the unit square and impose homogeneous Dirichlet bound-ary conditions on the edges We discretize the equation using a 5-point stencil finitedifference scheme which results in a set of nonlinear algebraic equationsF (uh) = 0where we useuh to indicate the discrete solution vector In Fig 7 we present a For-tran 90 routine that calculates the residualF as a function of the input vectoruhIt also takes a user context as input that holds the domain information and problemcoefficient

Notice that the boundary conditions on the residual are of the formuminusuΓ whereuΓ are the specified boundary values because we are driving the residualF (u) tozero An alternative would be to eliminate those variables altogether substitutingthe boundary values in any interior calculation However this technique currentlyimposes a greater burden on the programmer

When integrating PETSc vectors into this code we can let Fortran allocate thestorage or we can use PETSc for allocation In Fig 8 we let PETSc managethe memory The input vectors already have storage allocated by PETSc whichcan be accessed as an F90 array by using theVecGetArrayF90() functionor in C by usingVecGetArray() A sampledriver() routine shows howPETSc allocates vector storage If we let Fortran manage memory then we useVecCreateSeqWithArray() orVecCreateMPIWithArray() in parallelto create theVec objects

During a Newton iteration to solve this equation we would require the JacobianJ of the mappingF Rather than being accessed as a multidimensional array the

8 Matthew G Knepley et al

module f90moduletype userctx

The start end and number of vertices in the xminus and yminusdirectionsinteger xsxeysyeinteger mxmydouble precision lambda

end type userctxcontainsend module f90module

subroutine FormFunction(uFuserierr)use f90moduletype (userctx) userdouble precision u(userxsuserxeuserysuserye)double precision F(userxsuserxeuserysuserye)double precision twoonehxhyhxdhyhydhxscuijuxxuyyinteger ijierr

hx = 10dble(usermxminus1)hy = 10dble(usermyminus1)sc = hxhyuserlambdahxdhy = hxhyhydhx = hyhx

do 20 j=userysuseryedo 10 i=userxsuserxe

Apply boundary conditionsif (i == 1 or j == 1 or i == usermx or j == usermy) then

F(ij) = u(ij) Apply finite difference scheme

elseuij = u(ij)uxx = hydhx (20uij minus u(iminus1j) minus u(i+1j))uyy = hxdhy (20uij minus u(ijminus1) minus u(ij+1))F(ij) = uxx + uyy minus scexp(uij)

endif10 continue20 continue

Fig 7Residual calculation for the Bratu problem

Developing a Geodynamics Simulator with PETSc 9

subroutine driver(uFuserierr)implicit none

Vec uFint Nparameter(N=10000)

call VecCreate(PETSCCOMM WORLDuierr)call VecSetSizes(uPETSCDECIDENierr)call VecDuplicate(uFierr)call SolverLoop(uFierr)call VecDestroy(uierr)call VecDestroy(Fierr)end

subroutine FormFunctionPETSc(uFuserierr)use f90moduleimplicit none

Vec uFtype (userctx) userdouble precisionpointer u v()f v()

call VecGetArrayF90(uu vierr)call VecGetArrayF90(Ff vierr)call FormFunction(u vf vuserierr)call VecRestoreArrayF90(uu vierr)call VecRestoreArrayF90(Ff vierr)end

Fig 8Driver for the Bratu problem using PETSc

PETScMat object provides theMatSetValues () function to set logically denseblocks of values into the structure A function that computes the Jacobian and storesit in a PETSc matrix is shown in Fig 9 With this addition the user can now runserial code using the PETSc solvers (details are given in Section 4) Note that thestorage mechanism is the only change to user code necessary for its incorporationinto the PETSc framework using the wrapper in Fig 8

Although PETSc removes the need for low-level parallel programming such asdirect calls to the MPI library the user must still identify parallelism inherent inthe application and must structure the computation accordingly The first step is topartition the domain and have each process calculate the residual and Jacobian onlyover its local piece This is easily accomplished by redefining thexs xe ys andyevariables on each process converting loops over the entire domain into loops overthe local domain However this method uses nearest-neighbor information and thus

10 Matthew G Knepley et al

subroutine FormJacobian(ujacuserierr)use f90moduletype (userctx) userMat jacdouble precision x(userxsuserxeuserysuserye)double precision twoonehxhyhxdhyhydhxscv(5)integer rowcol(5)ijierr

one = 10two = 20hx = onedble(usermxminus1)hy = onedble(usermyminus1)sc = hxhyuserlambdahxdhy = hxhyhydhx = hyhxdo 20 j=userysuserye

row = (j minus userys)userxm minus 1do 10 i=userxsuserxe

row = row + 1 boundary points

if (i == 1 or j == 1 or i == usermx or j == usermy) thencol(1) = rowv(1) = onecall MatSetValues(jac1row1colvINSERT VALUESierr)

interior grid pointselse

v(1) = minushxdhyv(2) = minushydhxv(3) = two(hydhx + hxdhy) minus scexp(x(ij))v(4) = minushydhxv(5) = minushxdhycol(1) = row minus userxmcol(2) = row minus 1col(3) = rowcol(4) = row + 1col(5) = row + userxmcall MatSetValues(jac1row5colvINSERT VALUESierr)

endif10 continue20 continue

Fig 9Jacobian calculation for the Bratu problem using PETSc

Developing a Geodynamics Simulator with PETSc 11

we must have access to some values not present locally on the process as shown inFig 10

Box-type stencil Star-type stencil

Proc 6

Proc 0 Proc 0Proc 1 Proc 1

Proc 6

Fig 10Star and box stencils using a DA

We must store values for theseghostpoints locally so that they may be used forthe computation In addition we must ensure that these values are identical to the cor-responding values on the neighboring processes At the lowest level PETSc providesghosted vectors created usingVecCreateGhost () with storage for some valuesowned by other processes These values are explicitly indicated during constructionTheVecGhostUpdateBegin () andVecGhostUpdateEnd () functions trans-fer values between ghost storage and the corresponding storage on other processesThis method however can become complicated for the user Therefore for the com-mon case of a logically rectangular grid PETSc provides theDA object to managethe determination allocation and coherence of ghost values TheDA object is dis-cussed fully in Section 32 but we give here a short introduction in the context of theBratu problem

TheDA object represents a distributed structured (possibly staggered) grid andthus has more information than our serial grid If we augment our user context toinclude information about ghost values we can obtain all the grid information fromtheDA Figure 11 shows the creation of aDA when one initially know only the totalnumber of vertices in thex andy directions Alternatively we could have specifiedthe local number of vertices in each direction

Now we need only modify our PETSc residual routine to update the ghost valuesas shown in Fig 12

Notice that the ghost value scatter is a two-step operation This allows thecommunication to overlap with local computation that may be taking place whilethe messages are in transit The only change necessary for the residual calcu-lation itself is the correct declaration of the input arraydouble precision

12 Matthew G Knepley et al

type userctxDA da

The start end and number of vertices in the xminusdirectioninteger xsxexm

The start end and number of ghost vertices in the xminusdirectioninteger gxsgxegxm

The start end and number of vertices in the yminusdirectioninteger ysyeym

The start end and number of ghost vertices in the yminusdirectioninteger gysgyegym

The number of vertices in the xminus and yminusdirectionsinteger mxmy

The MPI rank of this processinteger rank

The coefficient in the Bratu equationdouble precision lambda

end type userctx

call DACreate2d(PETSCCOMM WORLDDA NONPERIODICDA STENCIL STAR ampamp usermxusermyPETSCDECIDEPETSCDECIDE11 ampamp PETSCNULL INTEGERPETSCNULL INTEGERuserdaierr)call DAGetCorners(userdauserxsuserysPETSCNULL INTEGER amp

amp userxmuserymPETSCNULL INTEGERierr)call DAGetGhostCorners(userdausergxsusergys amp

amp PETSCNULL INTEGERusergxmusergym ampamp PETSCNULL INTEGERierr)

Here we shift the starting indices up by one so that we can easily use the Fortran convention of1minusbased indices rather than0minusbased

userxs = userxs+1userys = userys+1usergxs = usergxs+1usergys = usergys+1userye = userys+userymminus1userxe = userxs+userxmminus1usergye = usergys+usergymminus1usergxe = usergxs+usergxmminus1

Fig 11Creation of a DA for the Bratu problem

Developing a Geodynamics Simulator with PETSc 13

subroutine FormFunctionPETSc(uFuserierr)implicit none

Vec uFinteger ierrtype (userctx) user

double precisionpointer lu v()lf v()Vec uLocal

call DAGetLocalVector(userdauLocalierr)call DAGlobalToLocalBegin(userdauINSERT VALUESuLocalierr)call DAGlobalToLocalEnd(userdauINSERT VALUESuLocalierr)

call VecGetArrayF90(uLocallu vierr)call VecGetArrayF90(Flf vierr)

Actually compute the local portion of the residualcall FormFunctionLocal(lu vlf vuserierr)

call VecRestoreArrayF90(uLocallu vierr)call VecRestoreArrayF90(Flf vierr)

call DARestoreLocalVector(userdauLocalierr)

Fig 12Parallel residual calculation for the Bratu problem using PETSc

u(usergxsusergxeusergysusergye) which now includes ghostvalues

The changes toFormJacobian() are similar resizing the input array andchanging slightly the calculation of row and column indices and can be found inthe PETSc example source We have now finished the initial introduction of PETSclinear algebra and the simulation is ready to run in parallel

32 Rapid Evolution

The user who is willing to raise the level of abstraction in a code can start by em-ploying theDA to describe the problem domain and discretization The data interfacemimics a multidimensional array and is thus ideal for finite difference finite volumeand low-order finite element schemes on a logically rectangular grid In fact after thedefinition of aFormFunction() routine the simulation is ready to run becausethe nonlinear solver provides approximate Jacobians automatically as discussed inSection 4

14 Matthew G Knepley et al

DASetLocalFunction(userda (DALocalFunction1) FormFunctionLocal)

Fig 13Using the DA to form a function

int FormFunctionLocal(DALocalInfo infodouble udouble fAppCtx user)double two = 20hxhyhxdhyhydhxscdouble uijuxxuyyint ij

hx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhyuserminusgtlambdahxdhy = hxhyhydhx = hyhx Compute function over the locally owned part of the grid for (j=infominusgtys jltinfominusgtys+infominusgtym j++)

for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++) if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

f[j][i] = u[j][i] else

uij = u[j][i]uxx = (twouij minus u[j][iminus1] minus u[j][i+1])hydhxuyy = (twouij minus u[jminus1][i] minus u[j+1][i])hxdhyf[j][i] = uxx + uyy minus scexp(uij)

Fig 14FormFunctionLocal() for the Bratu problem

We begin by examining the Bratu problem from the last section only this time inC We can now provide our local residual routine to theDA see Fig 13

The grid information is passed toFormFunctionLocal() in a DALocalInfo structure as shown in Fig 14

The fields are passed directly as multidimensional C arrays complete with ghostvalues An application scientist can use a boilerplate example such as the Bratu prob-lem merely altering the local physics calculation inFormFunctionLocal() torapidly obtain a parallel scalable simulation that runs on the desktop as well as onmassively parallel supercomputers

If the user has an expression for the Jacobian of the residual function then thismatrix can also be computed by using theDA interface with the call as in Fig 15and the corresponding routine for the local calculation in Fig 16

Developing a Geodynamics Simulator with PETSc 15

DASetLocalJacobian(userda (DALocalFunction1) FormJacobianLocal)

Fig 15Using the DA to form a Jacobian

int FormJacobianLocal(DALocalInfo infodouble xMat jacAppCtx user)MatStencil col[5]rowdouble lambdav[5]hxhyhxdhyhydhxscint ij

lambda = userminusgtparamhx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhylambdahxdhy = hxhy hydhx = hyhx

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

rowj = j rowi = i boundary points if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

v[0] = 10MatSetValuesStencil(jac1amprow1amprowvINSERT VALUES)

else interior grid points

v[0] = minushxdhy col[0]j = j minus 1 col[0]i = iv[1] = minushydhx col[1]j = j col[1]i = iminus1v[2] = 20(hydhx + hxdhy) minus scexp(x[j][i]) col[2]j = j col[2]i = iv[3] = minushydhx col[3]j = j col[3]i = i+1v[4] = minushxdhy col[4]j = j + 1 col[4]i = iMatSetValuesStencil(jac1amprow5colvINSERT VALUES)

MatAssemblyBegin(jacMAT FINAL ASSEMBLY)MatAssemblyEnd(jacMAT FINAL ASSEMBLY)MatSetOption(jacMAT NEW NONZERO LOCATION ERR)

Fig 16FormJacobianLocal() for the Bratu problem

16 Matthew G Knepley et al

int FormFunctionLocal(DALocalInfo infoField xField fvoid ptr)AppCtx user = (AppCtx)ptrParameter param = userminusgtparamGridInfo grid = userminusgtgriddouble mag w mag uint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1 i j

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

calculateXMomentumResidual(i j x f user)calculateZMomentumResidual(i j x f user)calculatePressureResidual(i j x f user)calculateTemperatureResidual(i j x f user)

Fig 17FormFunctionLocal() for the mantle subduction problem

We have so far presented simple examples involving a perturbed Laplacian butthis development strategy extends far beyond toy problems We have developedlarge-scale parallel geophysical simulations [1] that are producing new results inthe field We began by replacingFormFunctionLocal() in the Bratu examplewith one containing the linear physics of the previous sequential linear subductioncode (dropping the previous codersquos linear solve) Then the PETSc solver results werecompared for correctness with the complete prior subduction code Next the newcode was immediately run in parallel for correctness and efficiency tests Finallythe nonlinear terms from the more complicated physics of the mantle subductionproblem discussed in Section 1 and shown in Figs 17-21 and better quality dis-cretizations of the boundary conditions were added In fact since the structure oftheFormFunction() changed significantly the finalFormFunction() looksnothing like the original but the process of obtaining through the incremental ap-proach reduced the learning curve for PETSc and make correctness checking at thevarious stages much easier

The details of the subduction code are not as important as the recognition thata very complicated physics problem need not involve any more complication in oursimulation infrastructure To give more insight into the actual calculation we showin Fig 22 the explicit calculation of the residual from the continuity equation

Here we have used theDA for a multicomponent problem on a staggered meshwithout alteration because the structure of the discretization remains logically rect-angular

Although theDA manages communication during the solution process data post-processing often necessary for visualization or analysis also requires this function-ality For the mantle subduction simulation we want to calculate both the second in-

Developing a Geodynamics Simulator with PETSc 17

int calculateXMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj) f[j][i]u = x[j][i]u minus SlabVel(rsquoUrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]u = x[j][i]u minus 00

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else

f[j][i]u = XNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]u = XMomentumResidual(xijuser)

else in the mantle wedge f[j][i]u = XMomentumResidual(xijuser)

Fig 18X-momentum residual for the mantle subduction problem

variant of the strain rate tensor and the viscosity over the grid (at both cell centers andcorners) Using theDAGlobalToLocalBegin () andDAGlobalToLocalEnd() functions we transferred the global solution data to local ghosted vectors and pro-ceeded with the calculation We can also transfer data from local vectors to a globalvector usingDALocalToGlobal () The entire viscosity calculation is given inFig 23

Note that this framework allows the user to manage arbitrary fields over the do-main For instance a porous flow simulation might manage material properties ofthe medium in this fashion

4 Solvers

The SNES object in PETSc is an abstraction of the ldquoinverserdquo of a nonlinear oper-ator The user provides aFormFunction() routine as seen in Section 3 whichcomputes the action of the operator on an input vector Linear problems can also be

18 Matthew G Knepley et al

int calculateZMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1jlim = infominusgtmyminus1

if (ilt=j) f[j][i]w = x[j][i]w minus SlabVel(rsquoWrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]w = x[j][i]w minus 00

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else

f[j][i]w = ZNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]w = ZMomentumResidual(xijuser)

else in the mantle wedge f[j][i]w = ZMomentumResidual(xijuser)

Fig 19Z-momentum residual for the mantle subduction problem

int calculatePressureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj | | jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lid or slab f[j][i]p = x[j][i]p

else if ((i==ilim | | j==jlim ) ampamp paramminusgtibound==BC ANALYTIC ) on an analytic boundary f[j][i]p = x[j][i]p minus Pressure(ijuser)

else in the mantle wedge f[j][i]p = ContinuityResidual(xijuser)

Fig 20Pressure or continuity residual for the mantle subduction problem

Developing a Geodynamics Simulator with PETSc 19

int calculateTemperatureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (j==0) on the surface f[j][i]T = x[j][i]T + x[j+1][i]T + PetscMax(x[j][i]T00)

else if (i==0) slab inflow boundary f[j][i]T = x[j][i]T minus PlateModel(jPLATE SLABuser)

else if (i==ilim ) right side boundary mag u = 10 minus pow( (10minusPetscMax(PetscMin(x[j][iminus1]uparamminusgtcb10)00)) 50 )f[j][i]T = x[j][i]T minus mag ux[jminus1][iminus1]T minus (10minusmag u)PlateModel(jPLATE LID user)

else if (j==jlim ) bottom boundary mag w = 10 minus pow( (10minusPetscMax(PetscMin(x[jminus1][i]wparamminusgtsb10)00)) 50 )f[j][i]T = x[j][i]T minus mag wx[jminus1][iminus1]T minus (10minusmag w)

else in the mantle wedge f[j][i]T = EnergyResidual(xijuser)

Fig 21Temperature or energy residual for the mantle subduction problem

double precision ContinuityResidual(Field x int i int j AppCtx user)

GridInfo grid = userminusgtgriddouble precision uEuWwNwSdudxdwdz

uW = x[j][iminus1]u uE = x[j][i]u dudx = (uE minus uW)gridminusgtdxwS = x[jminus1][i]w wN = x[j][i]w dwdz = (wN minus wS)gridminusgtdzreturn dudx + dwdz

Fig 22Calculation of the continuity residual for the mantle subduction problem

solved by usingSNES with no loss of efficiency compared to using the underlyingKSP object (PETSc linear solver object) directly ThusSNES seems the appropri-ate framework for a general-purpose simulator providing the flexibility to add orsubtract nonlinear terms from the equation at will

TheSNES solver does not require a user to implement a Jacobian Default rou-tines are provided to compute a finite difference approximation using coloring toaccount for the sparsity of the matrix The user does not even need the PETScMatinterface but can initially interact only withVec objects making the transition to

PETSc nearly painless However these approximations to the Jacobian can have nu-merical difficulties and are not as efficient as direct evaluation Therefore the user

20 Matthew G Knepley et al

Compute both the second invariant of the strain rate tensor and the viscosity int ViscosityField(DA da Vec X Vec VAppCtx user)

Parameter param = userminusgtparamGridInfo grid = userminusgtgridVec localXField v xdouble eps dx dz T epsC TCint ijisjsimjmilim jlim ivt

ivt = paramminusgtiviscparamminusgtivisc = paramminusgtoutput ivisc

DACreateLocalVector(da amplocalX)DAGlobalToLocalBegin(da X INSERT VALUES localX)DAGlobalToLocalEnd(da X INSERT VALUES localX)DAVecGetArray(dalocalX(void)ampx)DAVecGetArray(daV(void)ampv)

Parameters dx = gridminusgtdx dz = gridminusgtdzilim = gridminusgtniminus1 jlim = gridminusgtnjminus1

Compute real temperature strain rate and viscosity DAGetCorners(daampisampjsPETSCNULL ampimampjmPETSCNULL )for (j=js jltjs+jm j++)

for (i=is iltis+im i++) T = paramminusgtpotentialT x[j][i]T exp( (jminus05)dzparamminusgtz scale )if (iltilim ampamp jltjlim )

TC = paramminusgtpotentialT TInterp(xij) exp( jdzparamminusgtz scale ) else

TC = T Compute the values at both cell centers and cell corners eps = CalcSecInv(xijCELL CENTERuser)epsC = CalcSecInv(xijCELL CORNERuser)v[j][i]u = epsv[j][i]w = epsCv[j][i]p = Viscosity(Tepsdz(jminus05)param)v[j][i]T = Viscosity(TCepsCdzjparam)

DAVecRestoreArray(daV(void)ampv)DAVecRestoreArray(dalocalX(void)ampx)paramminusgtivisc = ivt

Fig 23 Calculation of the second invariant of the strain tensor and viscosity field for themantle subduction problem

Developing a Geodynamics Simulator with PETSc 21

KSP ksp subkspPC bjacobi ilu

SNESGetKSP(snes ampksp)KSPGetPC(ksp ampbjacobi)PCBJacobiGetSubKSP(bjacobi PETSCNULL PETSCNULL ampsubksp)KSPGetPC(subksp[0] ampilu)PCILUSetLevels(ilu levels)

Fig 24Customizing the preconditioner on a single process

has the option of providing a routine or of using the ADIC [15] or similar system forautomatic differentiation The finite difference approximation can be activated byusing the-snes fd and-mat fd coloring freq options or by providing theSNESDefaultComputeJacobianColor () function toSNESSetJacobian()

Through theSNES object the user may also access the great range of directand iterative linear solvers and preconditioners provided by PETSc Sixteen differentKrylov solvers are available including GMRES BiCGStab and LSQR along withinterfaces to popular sparse direct packages such as MUMPS [4] and SuperLU [9]In addition to a variety of incomplete factorization preconditioners including PI-LUT [16] PETSc supports additive Schwartz preconditioning algebraic multigridthrough BoomerAMG [13] and the geometric multigrid discussed below The fullpanoply of solvers and preconditioners available is catalogued on the PETSc Web-site [3]

The modularity of PETSc allows users to easily customize each solver For in-stance suppose the user wishes to increase the number of levels in ILU(k) precon-ditioning on one block of a block-Jacobi scheme The code fragment in Fig 24 willset the number of levels of fill tolevels

PETSc provides an elegant framework for managing geometric multigrid in com-bination with aDA which is abstracted in theDMMGobject In the same way thatany linear solve can be performed withSNES any preconditioner or solver com-bination is available inDMMGby using a single level That is the same user codesupports both geometric multigrid as well as direct methods Krylov methods etcWhen creating aDMMG the user specifies the number of levels of grid refinementand provides the coarse-grid information (see Fig 25) which is always aDA objectat present although unstructured prototypes are being developed Then theDMMGobject calculates the intergrid transfer operators prolongation and restriction and

allocates objects to hold the solution at each level The user must provide the actionof the residual operatorF and can optionally provide a routine to compute the Jaco-bian matrix (which will be called on each level) and a routine to compute the initialguess as in Fig 26

With this information the user can now solve the equation and retrieve the solu-tion vector as in Fig 27

22 Matthew G Knepley et al

DMMGCreate(comm gridmglevels user ampdmmg)DMMGSetDM(dmmg (DM) da)

Fig 25Creating a DMMG

DMMGSetSNESLocal(dmmg FormFunctionLocal FormJacobianLocal 0 0)DMMGSetInitialGuess(dmmg FormInitialGuess)

Fig 26Providing discretization and assembly routines to DMMG

DMMGSolve(dmmg)soln = DMMGGetx(dmmg)

Fig 27Solving the problem with DMMG

5 Extensions

PETSc is not a complete environment for simulating physical phenomena Ratherit is a set of tools that allow the user to assemble such an environment tailored toa specific application As such PETSc will never provide every facility appropriatefor a given simulation However the user can easily extend PETSc and supplementits capabilities We present two examples in the context of the mantle subductionsimulation

51 Simple Continuation

Because of the highly nonlinear dependence of viscosity on temperature and strainrate in equation (1) the iteration in Newtonrsquos method can fail to converge withouta good starting guess of the solution On the other hand for the isoviscous casewhere temperature is coupled to velocity only through advection a solution is easilyreached It is therefore natural to propose a continuation scheme in the viscosity be-ginning with constant viscosity and progressing to full variability A simple adaptivecontinuation model was devised in which the viscosity was raised to a power be-tween zero and oneη rarr ηα whereα=0 corresponds to constant viscosity andα=1corresponds to natural viscosity variation In the continuation loopα was increasedtowards unity by an amount dependent on the rate of convergence of the previousSNES solve

PETSc itself provides no special support for continuation but it is sufficientlymodular that a continuation loop was readily constructed in the application codeusing repeated calls toDMMGSolve() and found to work quite well

Developing a Geodynamics Simulator with PETSc 23

52 Simple Steering

No rigorous justification had been given for the continuation strategy above andconsequently it was possible that for certain initial conditions Newton would fail toconverge thereby resulting in extremely long run times An observant user could de-tect this situation and abort the run but all the potentially useful solution informationup to that point would be lost A clean way to asynchronously abort the continua-tion loop was thus needed On architectures where OS signals are available PETScprovides an interface for registering signal handlers Thus we were able to define ahandler The user wishing to change the control flow of the simulation simply sendsthe appropriate signal to the process This sets a flag in the user data that causes thechange at the next iteration

6 Simulation Results

A long-standing debate has existed between theorists and observationalists over thethermal structure of subduction zones Modelers using constant viscosity flow sim-ulations to compute thermal structure have predicted relatively cold subductionzones [19] Conversely observationalists who use heat flow measurements and min-eralogical thermobarometry to estimate temperatures at depth have long claimedthat subduction zones are hotter than predicted They have invoked the presence ofstrong upwelling of hot mantle material to supply the heat This upwelling was neverpredicted by isoviscous flow models however

The debate remained unresolved until recently when several groups includingour own succeeded in developing simulations with realistically variable viscosity(for other examples [810121821]) A comparison of flow and temperature fieldsfor variable and constant viscosity simulations generated with our code is shownin Fig 28 Results from these simulations are exciting because they close the gapbetween models and observations they predict hotter mantle temperatures steepersurface thermal gradients and upwelling mantle flow

Furthermore these simulations allow for quantitative predictions of variation ofobservables with subduction parameters A recent study of subduction zone earth-quakes has identified an intriguing trend the vertical distance from subduction zonevolcanoes to the surface of the subducting slab is anti-correlated with descent rate ofthe slab [11] Preliminary calculations with our model are consistent with this trendindicating that flow and thermal structure may play an important role in determiningnot only the quantity and chemistry of magmas but also their path of transport to thesurface Further work is required to resolve this issue

Simulating the geodynamics of subduction is an example of our success in port-ing an existing code into the PETSc framework simultaneously parallelizing thecode and increasing its functionality Using this code as a template we rapidly de-veloped a related simulation of another tectonic boundary the mid-ocean ridge Thiswork was done to address a set of observations of systematic morphological asym-metry in the global mid-ocean ridge system [7] Our model confirms the qualitative

24 Matthew G Knepley et al

Dep

th k

m

log10

η and flow field

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24 Temperature degC

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Distance km

Dep

th k

m

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24

Distance km

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Isov

isco

usV

aria

ble

Vis

cosi

ty

(a) (b)

(c) (d)

Fig 282D viscosity and potential temperature fields from simulations on 8 processors with230112 degrees of freedom Panels in the top row are from a simulation withα=1 in equation(1) Panels in bottom row haveα=0 The white box in panels (a) and (c) shows the region inwhich temperature is plotted in panels (b) and (d)(a) Colors showlog10 of the viscosity fieldNote that there are more than five orders of magnitude variation in viscosity Arrows show theflow direction and magnitude (the slab is being subducted at a rate of 6 cmyear) Upwellingis evident in the flow field near the wedge corner(b) Temperature field from the variableviscosity simulation 1100C isotherm is shown as a dashed line(c) (Constant) viscosity andflow field from the isoviscous simulation Strong flow is predicted at the base of the crustdespite the low-temperature rock there No upwelling flow is predicted(d) Temperature fieldfrom isoviscous simulation Note that the mantle wedge corner is much colder than in (b)

mechanism that had been proposed to explain these observations Furthermore itshows a quantitative agreement with trends in the observed data [17] As with ourmodels of subduction the key to demonstrating the validity of the hypothesized dy-namics was simulating them with the strongly nonlinear rheology that characterizesmantle rock on geologic timescales The computational tools provided by PETSc en-abled us easily handle this rheology reducing development time and allowing us tofocus on model interpretation

References

1 PETSc SNES Example 30 httpwwwmcsanlgovpetscpetsc-2

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 6: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

6 Matthew G Knepley et al

static char help[ ] = Boilerplate PETSc Examplenn

include petsch

int main(int argc char argv)

ierr = PetscInitialize(ampargc ampargv (char ) 0 help)CHKERRQ(ierr)

User code

return PetscFinalize()

Fig 5Boilerplate C driver for PETSc

program mainimplicit none

include includefincludepetsch

integer ierr

call PetscInitialize(PETSCNULL CHARACTER ierr)

User code

call PetscFinalize(ierr)end

Fig 6Boilerplate Fortran driver for PETSc

3 Data Distribution and Linear Algebra

The PETSc solvers (discussed in Section 4) require the user to provide C or Fortranroutines that compute the residual of the equation they wish to solve (after they havediscretized the PDE) and optionally the Jacobian of that residual function We re-fer to these functions generically asFormFunction() andFormJacobian() For nonlinear problems the PETSc solvers use truncated Newton methods with linesearches or trust-regions for robustness and possibly approximate or incomplete Ja-cobians for efficiency

The central objects in any PETSc simulation are the abstractions from linearalgebra vectors and matrices or in PETSc terms theVec andMat classes Theseform the basis of all solver and preconditioner interfaces as well as providing the linkto user-supplied discretization and physics routines Thus the most important stepin the migration of an application to the PETSc framework is the incorporation of its

Developing a Geodynamics Simulator with PETSc 7

linear algebra and data distribution abstractions Two obvious strategies emerge foraccomplishing this integration a gradual approach that seeks to minimize the changeto existing code and a more aggressive approach that leverages as much of the PETSctechnology as possible The next two subsections address these complementary pathstoward integration We use the simple example of the solid-fuel ignition problem orBratu problem to illustrate the gradual evolution of a simulation and then we givean example of mantle subduction for the more aggressive approach

31 Gradual Evolution

A new PETSc user with very complicated code which perhaps was originally writ-ten by someone else may opt to make as few changes as possible when moving toPETSc PETSc facilitates this approach with low-level interfaces to bothVec andMat objects that closely resemble common serial data structures Vectors are espe-cially easy to migrate because the default PETSc storage format contiguous arrayson each process is that most common in serial applications Matrices are somewhatmore complicated and in order to hide the actual matrix data storage format PETScrequires the user to access values through a functional interface [6]

Let us examine the Bratu problem as an example of integrating PETSc vectorsinto an existing simulation [2] This problem is modeled by the partial differentialequation

minus∆uminus λeu = 0 (4)

We take the domain to be the unit square and impose homogeneous Dirichlet bound-ary conditions on the edges We discretize the equation using a 5-point stencil finitedifference scheme which results in a set of nonlinear algebraic equationsF (uh) = 0where we useuh to indicate the discrete solution vector In Fig 7 we present a For-tran 90 routine that calculates the residualF as a function of the input vectoruhIt also takes a user context as input that holds the domain information and problemcoefficient

Notice that the boundary conditions on the residual are of the formuminusuΓ whereuΓ are the specified boundary values because we are driving the residualF (u) tozero An alternative would be to eliminate those variables altogether substitutingthe boundary values in any interior calculation However this technique currentlyimposes a greater burden on the programmer

When integrating PETSc vectors into this code we can let Fortran allocate thestorage or we can use PETSc for allocation In Fig 8 we let PETSc managethe memory The input vectors already have storage allocated by PETSc whichcan be accessed as an F90 array by using theVecGetArrayF90() functionor in C by usingVecGetArray() A sampledriver() routine shows howPETSc allocates vector storage If we let Fortran manage memory then we useVecCreateSeqWithArray() orVecCreateMPIWithArray() in parallelto create theVec objects

During a Newton iteration to solve this equation we would require the JacobianJ of the mappingF Rather than being accessed as a multidimensional array the

8 Matthew G Knepley et al

module f90moduletype userctx

The start end and number of vertices in the xminus and yminusdirectionsinteger xsxeysyeinteger mxmydouble precision lambda

end type userctxcontainsend module f90module

subroutine FormFunction(uFuserierr)use f90moduletype (userctx) userdouble precision u(userxsuserxeuserysuserye)double precision F(userxsuserxeuserysuserye)double precision twoonehxhyhxdhyhydhxscuijuxxuyyinteger ijierr

hx = 10dble(usermxminus1)hy = 10dble(usermyminus1)sc = hxhyuserlambdahxdhy = hxhyhydhx = hyhx

do 20 j=userysuseryedo 10 i=userxsuserxe

Apply boundary conditionsif (i == 1 or j == 1 or i == usermx or j == usermy) then

F(ij) = u(ij) Apply finite difference scheme

elseuij = u(ij)uxx = hydhx (20uij minus u(iminus1j) minus u(i+1j))uyy = hxdhy (20uij minus u(ijminus1) minus u(ij+1))F(ij) = uxx + uyy minus scexp(uij)

endif10 continue20 continue

Fig 7Residual calculation for the Bratu problem

Developing a Geodynamics Simulator with PETSc 9

subroutine driver(uFuserierr)implicit none

Vec uFint Nparameter(N=10000)

call VecCreate(PETSCCOMM WORLDuierr)call VecSetSizes(uPETSCDECIDENierr)call VecDuplicate(uFierr)call SolverLoop(uFierr)call VecDestroy(uierr)call VecDestroy(Fierr)end

subroutine FormFunctionPETSc(uFuserierr)use f90moduleimplicit none

Vec uFtype (userctx) userdouble precisionpointer u v()f v()

call VecGetArrayF90(uu vierr)call VecGetArrayF90(Ff vierr)call FormFunction(u vf vuserierr)call VecRestoreArrayF90(uu vierr)call VecRestoreArrayF90(Ff vierr)end

Fig 8Driver for the Bratu problem using PETSc

PETScMat object provides theMatSetValues () function to set logically denseblocks of values into the structure A function that computes the Jacobian and storesit in a PETSc matrix is shown in Fig 9 With this addition the user can now runserial code using the PETSc solvers (details are given in Section 4) Note that thestorage mechanism is the only change to user code necessary for its incorporationinto the PETSc framework using the wrapper in Fig 8

Although PETSc removes the need for low-level parallel programming such asdirect calls to the MPI library the user must still identify parallelism inherent inthe application and must structure the computation accordingly The first step is topartition the domain and have each process calculate the residual and Jacobian onlyover its local piece This is easily accomplished by redefining thexs xe ys andyevariables on each process converting loops over the entire domain into loops overthe local domain However this method uses nearest-neighbor information and thus

10 Matthew G Knepley et al

subroutine FormJacobian(ujacuserierr)use f90moduletype (userctx) userMat jacdouble precision x(userxsuserxeuserysuserye)double precision twoonehxhyhxdhyhydhxscv(5)integer rowcol(5)ijierr

one = 10two = 20hx = onedble(usermxminus1)hy = onedble(usermyminus1)sc = hxhyuserlambdahxdhy = hxhyhydhx = hyhxdo 20 j=userysuserye

row = (j minus userys)userxm minus 1do 10 i=userxsuserxe

row = row + 1 boundary points

if (i == 1 or j == 1 or i == usermx or j == usermy) thencol(1) = rowv(1) = onecall MatSetValues(jac1row1colvINSERT VALUESierr)

interior grid pointselse

v(1) = minushxdhyv(2) = minushydhxv(3) = two(hydhx + hxdhy) minus scexp(x(ij))v(4) = minushydhxv(5) = minushxdhycol(1) = row minus userxmcol(2) = row minus 1col(3) = rowcol(4) = row + 1col(5) = row + userxmcall MatSetValues(jac1row5colvINSERT VALUESierr)

endif10 continue20 continue

Fig 9Jacobian calculation for the Bratu problem using PETSc

Developing a Geodynamics Simulator with PETSc 11

we must have access to some values not present locally on the process as shown inFig 10

Box-type stencil Star-type stencil

Proc 6

Proc 0 Proc 0Proc 1 Proc 1

Proc 6

Fig 10Star and box stencils using a DA

We must store values for theseghostpoints locally so that they may be used forthe computation In addition we must ensure that these values are identical to the cor-responding values on the neighboring processes At the lowest level PETSc providesghosted vectors created usingVecCreateGhost () with storage for some valuesowned by other processes These values are explicitly indicated during constructionTheVecGhostUpdateBegin () andVecGhostUpdateEnd () functions trans-fer values between ghost storage and the corresponding storage on other processesThis method however can become complicated for the user Therefore for the com-mon case of a logically rectangular grid PETSc provides theDA object to managethe determination allocation and coherence of ghost values TheDA object is dis-cussed fully in Section 32 but we give here a short introduction in the context of theBratu problem

TheDA object represents a distributed structured (possibly staggered) grid andthus has more information than our serial grid If we augment our user context toinclude information about ghost values we can obtain all the grid information fromtheDA Figure 11 shows the creation of aDA when one initially know only the totalnumber of vertices in thex andy directions Alternatively we could have specifiedthe local number of vertices in each direction

Now we need only modify our PETSc residual routine to update the ghost valuesas shown in Fig 12

Notice that the ghost value scatter is a two-step operation This allows thecommunication to overlap with local computation that may be taking place whilethe messages are in transit The only change necessary for the residual calcu-lation itself is the correct declaration of the input arraydouble precision

12 Matthew G Knepley et al

type userctxDA da

The start end and number of vertices in the xminusdirectioninteger xsxexm

The start end and number of ghost vertices in the xminusdirectioninteger gxsgxegxm

The start end and number of vertices in the yminusdirectioninteger ysyeym

The start end and number of ghost vertices in the yminusdirectioninteger gysgyegym

The number of vertices in the xminus and yminusdirectionsinteger mxmy

The MPI rank of this processinteger rank

The coefficient in the Bratu equationdouble precision lambda

end type userctx

call DACreate2d(PETSCCOMM WORLDDA NONPERIODICDA STENCIL STAR ampamp usermxusermyPETSCDECIDEPETSCDECIDE11 ampamp PETSCNULL INTEGERPETSCNULL INTEGERuserdaierr)call DAGetCorners(userdauserxsuserysPETSCNULL INTEGER amp

amp userxmuserymPETSCNULL INTEGERierr)call DAGetGhostCorners(userdausergxsusergys amp

amp PETSCNULL INTEGERusergxmusergym ampamp PETSCNULL INTEGERierr)

Here we shift the starting indices up by one so that we can easily use the Fortran convention of1minusbased indices rather than0minusbased

userxs = userxs+1userys = userys+1usergxs = usergxs+1usergys = usergys+1userye = userys+userymminus1userxe = userxs+userxmminus1usergye = usergys+usergymminus1usergxe = usergxs+usergxmminus1

Fig 11Creation of a DA for the Bratu problem

Developing a Geodynamics Simulator with PETSc 13

subroutine FormFunctionPETSc(uFuserierr)implicit none

Vec uFinteger ierrtype (userctx) user

double precisionpointer lu v()lf v()Vec uLocal

call DAGetLocalVector(userdauLocalierr)call DAGlobalToLocalBegin(userdauINSERT VALUESuLocalierr)call DAGlobalToLocalEnd(userdauINSERT VALUESuLocalierr)

call VecGetArrayF90(uLocallu vierr)call VecGetArrayF90(Flf vierr)

Actually compute the local portion of the residualcall FormFunctionLocal(lu vlf vuserierr)

call VecRestoreArrayF90(uLocallu vierr)call VecRestoreArrayF90(Flf vierr)

call DARestoreLocalVector(userdauLocalierr)

Fig 12Parallel residual calculation for the Bratu problem using PETSc

u(usergxsusergxeusergysusergye) which now includes ghostvalues

The changes toFormJacobian() are similar resizing the input array andchanging slightly the calculation of row and column indices and can be found inthe PETSc example source We have now finished the initial introduction of PETSclinear algebra and the simulation is ready to run in parallel

32 Rapid Evolution

The user who is willing to raise the level of abstraction in a code can start by em-ploying theDA to describe the problem domain and discretization The data interfacemimics a multidimensional array and is thus ideal for finite difference finite volumeand low-order finite element schemes on a logically rectangular grid In fact after thedefinition of aFormFunction() routine the simulation is ready to run becausethe nonlinear solver provides approximate Jacobians automatically as discussed inSection 4

14 Matthew G Knepley et al

DASetLocalFunction(userda (DALocalFunction1) FormFunctionLocal)

Fig 13Using the DA to form a function

int FormFunctionLocal(DALocalInfo infodouble udouble fAppCtx user)double two = 20hxhyhxdhyhydhxscdouble uijuxxuyyint ij

hx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhyuserminusgtlambdahxdhy = hxhyhydhx = hyhx Compute function over the locally owned part of the grid for (j=infominusgtys jltinfominusgtys+infominusgtym j++)

for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++) if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

f[j][i] = u[j][i] else

uij = u[j][i]uxx = (twouij minus u[j][iminus1] minus u[j][i+1])hydhxuyy = (twouij minus u[jminus1][i] minus u[j+1][i])hxdhyf[j][i] = uxx + uyy minus scexp(uij)

Fig 14FormFunctionLocal() for the Bratu problem

We begin by examining the Bratu problem from the last section only this time inC We can now provide our local residual routine to theDA see Fig 13

The grid information is passed toFormFunctionLocal() in a DALocalInfo structure as shown in Fig 14

The fields are passed directly as multidimensional C arrays complete with ghostvalues An application scientist can use a boilerplate example such as the Bratu prob-lem merely altering the local physics calculation inFormFunctionLocal() torapidly obtain a parallel scalable simulation that runs on the desktop as well as onmassively parallel supercomputers

If the user has an expression for the Jacobian of the residual function then thismatrix can also be computed by using theDA interface with the call as in Fig 15and the corresponding routine for the local calculation in Fig 16

Developing a Geodynamics Simulator with PETSc 15

DASetLocalJacobian(userda (DALocalFunction1) FormJacobianLocal)

Fig 15Using the DA to form a Jacobian

int FormJacobianLocal(DALocalInfo infodouble xMat jacAppCtx user)MatStencil col[5]rowdouble lambdav[5]hxhyhxdhyhydhxscint ij

lambda = userminusgtparamhx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhylambdahxdhy = hxhy hydhx = hyhx

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

rowj = j rowi = i boundary points if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

v[0] = 10MatSetValuesStencil(jac1amprow1amprowvINSERT VALUES)

else interior grid points

v[0] = minushxdhy col[0]j = j minus 1 col[0]i = iv[1] = minushydhx col[1]j = j col[1]i = iminus1v[2] = 20(hydhx + hxdhy) minus scexp(x[j][i]) col[2]j = j col[2]i = iv[3] = minushydhx col[3]j = j col[3]i = i+1v[4] = minushxdhy col[4]j = j + 1 col[4]i = iMatSetValuesStencil(jac1amprow5colvINSERT VALUES)

MatAssemblyBegin(jacMAT FINAL ASSEMBLY)MatAssemblyEnd(jacMAT FINAL ASSEMBLY)MatSetOption(jacMAT NEW NONZERO LOCATION ERR)

Fig 16FormJacobianLocal() for the Bratu problem

16 Matthew G Knepley et al

int FormFunctionLocal(DALocalInfo infoField xField fvoid ptr)AppCtx user = (AppCtx)ptrParameter param = userminusgtparamGridInfo grid = userminusgtgriddouble mag w mag uint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1 i j

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

calculateXMomentumResidual(i j x f user)calculateZMomentumResidual(i j x f user)calculatePressureResidual(i j x f user)calculateTemperatureResidual(i j x f user)

Fig 17FormFunctionLocal() for the mantle subduction problem

We have so far presented simple examples involving a perturbed Laplacian butthis development strategy extends far beyond toy problems We have developedlarge-scale parallel geophysical simulations [1] that are producing new results inthe field We began by replacingFormFunctionLocal() in the Bratu examplewith one containing the linear physics of the previous sequential linear subductioncode (dropping the previous codersquos linear solve) Then the PETSc solver results werecompared for correctness with the complete prior subduction code Next the newcode was immediately run in parallel for correctness and efficiency tests Finallythe nonlinear terms from the more complicated physics of the mantle subductionproblem discussed in Section 1 and shown in Figs 17-21 and better quality dis-cretizations of the boundary conditions were added In fact since the structure oftheFormFunction() changed significantly the finalFormFunction() looksnothing like the original but the process of obtaining through the incremental ap-proach reduced the learning curve for PETSc and make correctness checking at thevarious stages much easier

The details of the subduction code are not as important as the recognition thata very complicated physics problem need not involve any more complication in oursimulation infrastructure To give more insight into the actual calculation we showin Fig 22 the explicit calculation of the residual from the continuity equation

Here we have used theDA for a multicomponent problem on a staggered meshwithout alteration because the structure of the discretization remains logically rect-angular

Although theDA manages communication during the solution process data post-processing often necessary for visualization or analysis also requires this function-ality For the mantle subduction simulation we want to calculate both the second in-

Developing a Geodynamics Simulator with PETSc 17

int calculateXMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj) f[j][i]u = x[j][i]u minus SlabVel(rsquoUrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]u = x[j][i]u minus 00

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else

f[j][i]u = XNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]u = XMomentumResidual(xijuser)

else in the mantle wedge f[j][i]u = XMomentumResidual(xijuser)

Fig 18X-momentum residual for the mantle subduction problem

variant of the strain rate tensor and the viscosity over the grid (at both cell centers andcorners) Using theDAGlobalToLocalBegin () andDAGlobalToLocalEnd() functions we transferred the global solution data to local ghosted vectors and pro-ceeded with the calculation We can also transfer data from local vectors to a globalvector usingDALocalToGlobal () The entire viscosity calculation is given inFig 23

Note that this framework allows the user to manage arbitrary fields over the do-main For instance a porous flow simulation might manage material properties ofthe medium in this fashion

4 Solvers

The SNES object in PETSc is an abstraction of the ldquoinverserdquo of a nonlinear oper-ator The user provides aFormFunction() routine as seen in Section 3 whichcomputes the action of the operator on an input vector Linear problems can also be

18 Matthew G Knepley et al

int calculateZMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1jlim = infominusgtmyminus1

if (ilt=j) f[j][i]w = x[j][i]w minus SlabVel(rsquoWrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]w = x[j][i]w minus 00

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else

f[j][i]w = ZNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]w = ZMomentumResidual(xijuser)

else in the mantle wedge f[j][i]w = ZMomentumResidual(xijuser)

Fig 19Z-momentum residual for the mantle subduction problem

int calculatePressureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj | | jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lid or slab f[j][i]p = x[j][i]p

else if ((i==ilim | | j==jlim ) ampamp paramminusgtibound==BC ANALYTIC ) on an analytic boundary f[j][i]p = x[j][i]p minus Pressure(ijuser)

else in the mantle wedge f[j][i]p = ContinuityResidual(xijuser)

Fig 20Pressure or continuity residual for the mantle subduction problem

Developing a Geodynamics Simulator with PETSc 19

int calculateTemperatureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (j==0) on the surface f[j][i]T = x[j][i]T + x[j+1][i]T + PetscMax(x[j][i]T00)

else if (i==0) slab inflow boundary f[j][i]T = x[j][i]T minus PlateModel(jPLATE SLABuser)

else if (i==ilim ) right side boundary mag u = 10 minus pow( (10minusPetscMax(PetscMin(x[j][iminus1]uparamminusgtcb10)00)) 50 )f[j][i]T = x[j][i]T minus mag ux[jminus1][iminus1]T minus (10minusmag u)PlateModel(jPLATE LID user)

else if (j==jlim ) bottom boundary mag w = 10 minus pow( (10minusPetscMax(PetscMin(x[jminus1][i]wparamminusgtsb10)00)) 50 )f[j][i]T = x[j][i]T minus mag wx[jminus1][iminus1]T minus (10minusmag w)

else in the mantle wedge f[j][i]T = EnergyResidual(xijuser)

Fig 21Temperature or energy residual for the mantle subduction problem

double precision ContinuityResidual(Field x int i int j AppCtx user)

GridInfo grid = userminusgtgriddouble precision uEuWwNwSdudxdwdz

uW = x[j][iminus1]u uE = x[j][i]u dudx = (uE minus uW)gridminusgtdxwS = x[jminus1][i]w wN = x[j][i]w dwdz = (wN minus wS)gridminusgtdzreturn dudx + dwdz

Fig 22Calculation of the continuity residual for the mantle subduction problem

solved by usingSNES with no loss of efficiency compared to using the underlyingKSP object (PETSc linear solver object) directly ThusSNES seems the appropri-ate framework for a general-purpose simulator providing the flexibility to add orsubtract nonlinear terms from the equation at will

TheSNES solver does not require a user to implement a Jacobian Default rou-tines are provided to compute a finite difference approximation using coloring toaccount for the sparsity of the matrix The user does not even need the PETScMatinterface but can initially interact only withVec objects making the transition to

PETSc nearly painless However these approximations to the Jacobian can have nu-merical difficulties and are not as efficient as direct evaluation Therefore the user

20 Matthew G Knepley et al

Compute both the second invariant of the strain rate tensor and the viscosity int ViscosityField(DA da Vec X Vec VAppCtx user)

Parameter param = userminusgtparamGridInfo grid = userminusgtgridVec localXField v xdouble eps dx dz T epsC TCint ijisjsimjmilim jlim ivt

ivt = paramminusgtiviscparamminusgtivisc = paramminusgtoutput ivisc

DACreateLocalVector(da amplocalX)DAGlobalToLocalBegin(da X INSERT VALUES localX)DAGlobalToLocalEnd(da X INSERT VALUES localX)DAVecGetArray(dalocalX(void)ampx)DAVecGetArray(daV(void)ampv)

Parameters dx = gridminusgtdx dz = gridminusgtdzilim = gridminusgtniminus1 jlim = gridminusgtnjminus1

Compute real temperature strain rate and viscosity DAGetCorners(daampisampjsPETSCNULL ampimampjmPETSCNULL )for (j=js jltjs+jm j++)

for (i=is iltis+im i++) T = paramminusgtpotentialT x[j][i]T exp( (jminus05)dzparamminusgtz scale )if (iltilim ampamp jltjlim )

TC = paramminusgtpotentialT TInterp(xij) exp( jdzparamminusgtz scale ) else

TC = T Compute the values at both cell centers and cell corners eps = CalcSecInv(xijCELL CENTERuser)epsC = CalcSecInv(xijCELL CORNERuser)v[j][i]u = epsv[j][i]w = epsCv[j][i]p = Viscosity(Tepsdz(jminus05)param)v[j][i]T = Viscosity(TCepsCdzjparam)

DAVecRestoreArray(daV(void)ampv)DAVecRestoreArray(dalocalX(void)ampx)paramminusgtivisc = ivt

Fig 23 Calculation of the second invariant of the strain tensor and viscosity field for themantle subduction problem

Developing a Geodynamics Simulator with PETSc 21

KSP ksp subkspPC bjacobi ilu

SNESGetKSP(snes ampksp)KSPGetPC(ksp ampbjacobi)PCBJacobiGetSubKSP(bjacobi PETSCNULL PETSCNULL ampsubksp)KSPGetPC(subksp[0] ampilu)PCILUSetLevels(ilu levels)

Fig 24Customizing the preconditioner on a single process

has the option of providing a routine or of using the ADIC [15] or similar system forautomatic differentiation The finite difference approximation can be activated byusing the-snes fd and-mat fd coloring freq options or by providing theSNESDefaultComputeJacobianColor () function toSNESSetJacobian()

Through theSNES object the user may also access the great range of directand iterative linear solvers and preconditioners provided by PETSc Sixteen differentKrylov solvers are available including GMRES BiCGStab and LSQR along withinterfaces to popular sparse direct packages such as MUMPS [4] and SuperLU [9]In addition to a variety of incomplete factorization preconditioners including PI-LUT [16] PETSc supports additive Schwartz preconditioning algebraic multigridthrough BoomerAMG [13] and the geometric multigrid discussed below The fullpanoply of solvers and preconditioners available is catalogued on the PETSc Web-site [3]

The modularity of PETSc allows users to easily customize each solver For in-stance suppose the user wishes to increase the number of levels in ILU(k) precon-ditioning on one block of a block-Jacobi scheme The code fragment in Fig 24 willset the number of levels of fill tolevels

PETSc provides an elegant framework for managing geometric multigrid in com-bination with aDA which is abstracted in theDMMGobject In the same way thatany linear solve can be performed withSNES any preconditioner or solver com-bination is available inDMMGby using a single level That is the same user codesupports both geometric multigrid as well as direct methods Krylov methods etcWhen creating aDMMG the user specifies the number of levels of grid refinementand provides the coarse-grid information (see Fig 25) which is always aDA objectat present although unstructured prototypes are being developed Then theDMMGobject calculates the intergrid transfer operators prolongation and restriction and

allocates objects to hold the solution at each level The user must provide the actionof the residual operatorF and can optionally provide a routine to compute the Jaco-bian matrix (which will be called on each level) and a routine to compute the initialguess as in Fig 26

With this information the user can now solve the equation and retrieve the solu-tion vector as in Fig 27

22 Matthew G Knepley et al

DMMGCreate(comm gridmglevels user ampdmmg)DMMGSetDM(dmmg (DM) da)

Fig 25Creating a DMMG

DMMGSetSNESLocal(dmmg FormFunctionLocal FormJacobianLocal 0 0)DMMGSetInitialGuess(dmmg FormInitialGuess)

Fig 26Providing discretization and assembly routines to DMMG

DMMGSolve(dmmg)soln = DMMGGetx(dmmg)

Fig 27Solving the problem with DMMG

5 Extensions

PETSc is not a complete environment for simulating physical phenomena Ratherit is a set of tools that allow the user to assemble such an environment tailored toa specific application As such PETSc will never provide every facility appropriatefor a given simulation However the user can easily extend PETSc and supplementits capabilities We present two examples in the context of the mantle subductionsimulation

51 Simple Continuation

Because of the highly nonlinear dependence of viscosity on temperature and strainrate in equation (1) the iteration in Newtonrsquos method can fail to converge withouta good starting guess of the solution On the other hand for the isoviscous casewhere temperature is coupled to velocity only through advection a solution is easilyreached It is therefore natural to propose a continuation scheme in the viscosity be-ginning with constant viscosity and progressing to full variability A simple adaptivecontinuation model was devised in which the viscosity was raised to a power be-tween zero and oneη rarr ηα whereα=0 corresponds to constant viscosity andα=1corresponds to natural viscosity variation In the continuation loopα was increasedtowards unity by an amount dependent on the rate of convergence of the previousSNES solve

PETSc itself provides no special support for continuation but it is sufficientlymodular that a continuation loop was readily constructed in the application codeusing repeated calls toDMMGSolve() and found to work quite well

Developing a Geodynamics Simulator with PETSc 23

52 Simple Steering

No rigorous justification had been given for the continuation strategy above andconsequently it was possible that for certain initial conditions Newton would fail toconverge thereby resulting in extremely long run times An observant user could de-tect this situation and abort the run but all the potentially useful solution informationup to that point would be lost A clean way to asynchronously abort the continua-tion loop was thus needed On architectures where OS signals are available PETScprovides an interface for registering signal handlers Thus we were able to define ahandler The user wishing to change the control flow of the simulation simply sendsthe appropriate signal to the process This sets a flag in the user data that causes thechange at the next iteration

6 Simulation Results

A long-standing debate has existed between theorists and observationalists over thethermal structure of subduction zones Modelers using constant viscosity flow sim-ulations to compute thermal structure have predicted relatively cold subductionzones [19] Conversely observationalists who use heat flow measurements and min-eralogical thermobarometry to estimate temperatures at depth have long claimedthat subduction zones are hotter than predicted They have invoked the presence ofstrong upwelling of hot mantle material to supply the heat This upwelling was neverpredicted by isoviscous flow models however

The debate remained unresolved until recently when several groups includingour own succeeded in developing simulations with realistically variable viscosity(for other examples [810121821]) A comparison of flow and temperature fieldsfor variable and constant viscosity simulations generated with our code is shownin Fig 28 Results from these simulations are exciting because they close the gapbetween models and observations they predict hotter mantle temperatures steepersurface thermal gradients and upwelling mantle flow

Furthermore these simulations allow for quantitative predictions of variation ofobservables with subduction parameters A recent study of subduction zone earth-quakes has identified an intriguing trend the vertical distance from subduction zonevolcanoes to the surface of the subducting slab is anti-correlated with descent rate ofthe slab [11] Preliminary calculations with our model are consistent with this trendindicating that flow and thermal structure may play an important role in determiningnot only the quantity and chemistry of magmas but also their path of transport to thesurface Further work is required to resolve this issue

Simulating the geodynamics of subduction is an example of our success in port-ing an existing code into the PETSc framework simultaneously parallelizing thecode and increasing its functionality Using this code as a template we rapidly de-veloped a related simulation of another tectonic boundary the mid-ocean ridge Thiswork was done to address a set of observations of systematic morphological asym-metry in the global mid-ocean ridge system [7] Our model confirms the qualitative

24 Matthew G Knepley et al

Dep

th k

m

log10

η and flow field

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24 Temperature degC

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Distance km

Dep

th k

m

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24

Distance km

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Isov

isco

usV

aria

ble

Vis

cosi

ty

(a) (b)

(c) (d)

Fig 282D viscosity and potential temperature fields from simulations on 8 processors with230112 degrees of freedom Panels in the top row are from a simulation withα=1 in equation(1) Panels in bottom row haveα=0 The white box in panels (a) and (c) shows the region inwhich temperature is plotted in panels (b) and (d)(a) Colors showlog10 of the viscosity fieldNote that there are more than five orders of magnitude variation in viscosity Arrows show theflow direction and magnitude (the slab is being subducted at a rate of 6 cmyear) Upwellingis evident in the flow field near the wedge corner(b) Temperature field from the variableviscosity simulation 1100C isotherm is shown as a dashed line(c) (Constant) viscosity andflow field from the isoviscous simulation Strong flow is predicted at the base of the crustdespite the low-temperature rock there No upwelling flow is predicted(d) Temperature fieldfrom isoviscous simulation Note that the mantle wedge corner is much colder than in (b)

mechanism that had been proposed to explain these observations Furthermore itshows a quantitative agreement with trends in the observed data [17] As with ourmodels of subduction the key to demonstrating the validity of the hypothesized dy-namics was simulating them with the strongly nonlinear rheology that characterizesmantle rock on geologic timescales The computational tools provided by PETSc en-abled us easily handle this rheology reducing development time and allowing us tofocus on model interpretation

References

1 PETSc SNES Example 30 httpwwwmcsanlgovpetscpetsc-2

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 7: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

Developing a Geodynamics Simulator with PETSc 7

linear algebra and data distribution abstractions Two obvious strategies emerge foraccomplishing this integration a gradual approach that seeks to minimize the changeto existing code and a more aggressive approach that leverages as much of the PETSctechnology as possible The next two subsections address these complementary pathstoward integration We use the simple example of the solid-fuel ignition problem orBratu problem to illustrate the gradual evolution of a simulation and then we givean example of mantle subduction for the more aggressive approach

31 Gradual Evolution

A new PETSc user with very complicated code which perhaps was originally writ-ten by someone else may opt to make as few changes as possible when moving toPETSc PETSc facilitates this approach with low-level interfaces to bothVec andMat objects that closely resemble common serial data structures Vectors are espe-cially easy to migrate because the default PETSc storage format contiguous arrayson each process is that most common in serial applications Matrices are somewhatmore complicated and in order to hide the actual matrix data storage format PETScrequires the user to access values through a functional interface [6]

Let us examine the Bratu problem as an example of integrating PETSc vectorsinto an existing simulation [2] This problem is modeled by the partial differentialequation

minus∆uminus λeu = 0 (4)

We take the domain to be the unit square and impose homogeneous Dirichlet bound-ary conditions on the edges We discretize the equation using a 5-point stencil finitedifference scheme which results in a set of nonlinear algebraic equationsF (uh) = 0where we useuh to indicate the discrete solution vector In Fig 7 we present a For-tran 90 routine that calculates the residualF as a function of the input vectoruhIt also takes a user context as input that holds the domain information and problemcoefficient

Notice that the boundary conditions on the residual are of the formuminusuΓ whereuΓ are the specified boundary values because we are driving the residualF (u) tozero An alternative would be to eliminate those variables altogether substitutingthe boundary values in any interior calculation However this technique currentlyimposes a greater burden on the programmer

When integrating PETSc vectors into this code we can let Fortran allocate thestorage or we can use PETSc for allocation In Fig 8 we let PETSc managethe memory The input vectors already have storage allocated by PETSc whichcan be accessed as an F90 array by using theVecGetArrayF90() functionor in C by usingVecGetArray() A sampledriver() routine shows howPETSc allocates vector storage If we let Fortran manage memory then we useVecCreateSeqWithArray() orVecCreateMPIWithArray() in parallelto create theVec objects

During a Newton iteration to solve this equation we would require the JacobianJ of the mappingF Rather than being accessed as a multidimensional array the

8 Matthew G Knepley et al

module f90moduletype userctx

The start end and number of vertices in the xminus and yminusdirectionsinteger xsxeysyeinteger mxmydouble precision lambda

end type userctxcontainsend module f90module

subroutine FormFunction(uFuserierr)use f90moduletype (userctx) userdouble precision u(userxsuserxeuserysuserye)double precision F(userxsuserxeuserysuserye)double precision twoonehxhyhxdhyhydhxscuijuxxuyyinteger ijierr

hx = 10dble(usermxminus1)hy = 10dble(usermyminus1)sc = hxhyuserlambdahxdhy = hxhyhydhx = hyhx

do 20 j=userysuseryedo 10 i=userxsuserxe

Apply boundary conditionsif (i == 1 or j == 1 or i == usermx or j == usermy) then

F(ij) = u(ij) Apply finite difference scheme

elseuij = u(ij)uxx = hydhx (20uij minus u(iminus1j) minus u(i+1j))uyy = hxdhy (20uij minus u(ijminus1) minus u(ij+1))F(ij) = uxx + uyy minus scexp(uij)

endif10 continue20 continue

Fig 7Residual calculation for the Bratu problem

Developing a Geodynamics Simulator with PETSc 9

subroutine driver(uFuserierr)implicit none

Vec uFint Nparameter(N=10000)

call VecCreate(PETSCCOMM WORLDuierr)call VecSetSizes(uPETSCDECIDENierr)call VecDuplicate(uFierr)call SolverLoop(uFierr)call VecDestroy(uierr)call VecDestroy(Fierr)end

subroutine FormFunctionPETSc(uFuserierr)use f90moduleimplicit none

Vec uFtype (userctx) userdouble precisionpointer u v()f v()

call VecGetArrayF90(uu vierr)call VecGetArrayF90(Ff vierr)call FormFunction(u vf vuserierr)call VecRestoreArrayF90(uu vierr)call VecRestoreArrayF90(Ff vierr)end

Fig 8Driver for the Bratu problem using PETSc

PETScMat object provides theMatSetValues () function to set logically denseblocks of values into the structure A function that computes the Jacobian and storesit in a PETSc matrix is shown in Fig 9 With this addition the user can now runserial code using the PETSc solvers (details are given in Section 4) Note that thestorage mechanism is the only change to user code necessary for its incorporationinto the PETSc framework using the wrapper in Fig 8

Although PETSc removes the need for low-level parallel programming such asdirect calls to the MPI library the user must still identify parallelism inherent inthe application and must structure the computation accordingly The first step is topartition the domain and have each process calculate the residual and Jacobian onlyover its local piece This is easily accomplished by redefining thexs xe ys andyevariables on each process converting loops over the entire domain into loops overthe local domain However this method uses nearest-neighbor information and thus

10 Matthew G Knepley et al

subroutine FormJacobian(ujacuserierr)use f90moduletype (userctx) userMat jacdouble precision x(userxsuserxeuserysuserye)double precision twoonehxhyhxdhyhydhxscv(5)integer rowcol(5)ijierr

one = 10two = 20hx = onedble(usermxminus1)hy = onedble(usermyminus1)sc = hxhyuserlambdahxdhy = hxhyhydhx = hyhxdo 20 j=userysuserye

row = (j minus userys)userxm minus 1do 10 i=userxsuserxe

row = row + 1 boundary points

if (i == 1 or j == 1 or i == usermx or j == usermy) thencol(1) = rowv(1) = onecall MatSetValues(jac1row1colvINSERT VALUESierr)

interior grid pointselse

v(1) = minushxdhyv(2) = minushydhxv(3) = two(hydhx + hxdhy) minus scexp(x(ij))v(4) = minushydhxv(5) = minushxdhycol(1) = row minus userxmcol(2) = row minus 1col(3) = rowcol(4) = row + 1col(5) = row + userxmcall MatSetValues(jac1row5colvINSERT VALUESierr)

endif10 continue20 continue

Fig 9Jacobian calculation for the Bratu problem using PETSc

Developing a Geodynamics Simulator with PETSc 11

we must have access to some values not present locally on the process as shown inFig 10

Box-type stencil Star-type stencil

Proc 6

Proc 0 Proc 0Proc 1 Proc 1

Proc 6

Fig 10Star and box stencils using a DA

We must store values for theseghostpoints locally so that they may be used forthe computation In addition we must ensure that these values are identical to the cor-responding values on the neighboring processes At the lowest level PETSc providesghosted vectors created usingVecCreateGhost () with storage for some valuesowned by other processes These values are explicitly indicated during constructionTheVecGhostUpdateBegin () andVecGhostUpdateEnd () functions trans-fer values between ghost storage and the corresponding storage on other processesThis method however can become complicated for the user Therefore for the com-mon case of a logically rectangular grid PETSc provides theDA object to managethe determination allocation and coherence of ghost values TheDA object is dis-cussed fully in Section 32 but we give here a short introduction in the context of theBratu problem

TheDA object represents a distributed structured (possibly staggered) grid andthus has more information than our serial grid If we augment our user context toinclude information about ghost values we can obtain all the grid information fromtheDA Figure 11 shows the creation of aDA when one initially know only the totalnumber of vertices in thex andy directions Alternatively we could have specifiedthe local number of vertices in each direction

Now we need only modify our PETSc residual routine to update the ghost valuesas shown in Fig 12

Notice that the ghost value scatter is a two-step operation This allows thecommunication to overlap with local computation that may be taking place whilethe messages are in transit The only change necessary for the residual calcu-lation itself is the correct declaration of the input arraydouble precision

12 Matthew G Knepley et al

type userctxDA da

The start end and number of vertices in the xminusdirectioninteger xsxexm

The start end and number of ghost vertices in the xminusdirectioninteger gxsgxegxm

The start end and number of vertices in the yminusdirectioninteger ysyeym

The start end and number of ghost vertices in the yminusdirectioninteger gysgyegym

The number of vertices in the xminus and yminusdirectionsinteger mxmy

The MPI rank of this processinteger rank

The coefficient in the Bratu equationdouble precision lambda

end type userctx

call DACreate2d(PETSCCOMM WORLDDA NONPERIODICDA STENCIL STAR ampamp usermxusermyPETSCDECIDEPETSCDECIDE11 ampamp PETSCNULL INTEGERPETSCNULL INTEGERuserdaierr)call DAGetCorners(userdauserxsuserysPETSCNULL INTEGER amp

amp userxmuserymPETSCNULL INTEGERierr)call DAGetGhostCorners(userdausergxsusergys amp

amp PETSCNULL INTEGERusergxmusergym ampamp PETSCNULL INTEGERierr)

Here we shift the starting indices up by one so that we can easily use the Fortran convention of1minusbased indices rather than0minusbased

userxs = userxs+1userys = userys+1usergxs = usergxs+1usergys = usergys+1userye = userys+userymminus1userxe = userxs+userxmminus1usergye = usergys+usergymminus1usergxe = usergxs+usergxmminus1

Fig 11Creation of a DA for the Bratu problem

Developing a Geodynamics Simulator with PETSc 13

subroutine FormFunctionPETSc(uFuserierr)implicit none

Vec uFinteger ierrtype (userctx) user

double precisionpointer lu v()lf v()Vec uLocal

call DAGetLocalVector(userdauLocalierr)call DAGlobalToLocalBegin(userdauINSERT VALUESuLocalierr)call DAGlobalToLocalEnd(userdauINSERT VALUESuLocalierr)

call VecGetArrayF90(uLocallu vierr)call VecGetArrayF90(Flf vierr)

Actually compute the local portion of the residualcall FormFunctionLocal(lu vlf vuserierr)

call VecRestoreArrayF90(uLocallu vierr)call VecRestoreArrayF90(Flf vierr)

call DARestoreLocalVector(userdauLocalierr)

Fig 12Parallel residual calculation for the Bratu problem using PETSc

u(usergxsusergxeusergysusergye) which now includes ghostvalues

The changes toFormJacobian() are similar resizing the input array andchanging slightly the calculation of row and column indices and can be found inthe PETSc example source We have now finished the initial introduction of PETSclinear algebra and the simulation is ready to run in parallel

32 Rapid Evolution

The user who is willing to raise the level of abstraction in a code can start by em-ploying theDA to describe the problem domain and discretization The data interfacemimics a multidimensional array and is thus ideal for finite difference finite volumeand low-order finite element schemes on a logically rectangular grid In fact after thedefinition of aFormFunction() routine the simulation is ready to run becausethe nonlinear solver provides approximate Jacobians automatically as discussed inSection 4

14 Matthew G Knepley et al

DASetLocalFunction(userda (DALocalFunction1) FormFunctionLocal)

Fig 13Using the DA to form a function

int FormFunctionLocal(DALocalInfo infodouble udouble fAppCtx user)double two = 20hxhyhxdhyhydhxscdouble uijuxxuyyint ij

hx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhyuserminusgtlambdahxdhy = hxhyhydhx = hyhx Compute function over the locally owned part of the grid for (j=infominusgtys jltinfominusgtys+infominusgtym j++)

for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++) if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

f[j][i] = u[j][i] else

uij = u[j][i]uxx = (twouij minus u[j][iminus1] minus u[j][i+1])hydhxuyy = (twouij minus u[jminus1][i] minus u[j+1][i])hxdhyf[j][i] = uxx + uyy minus scexp(uij)

Fig 14FormFunctionLocal() for the Bratu problem

We begin by examining the Bratu problem from the last section only this time inC We can now provide our local residual routine to theDA see Fig 13

The grid information is passed toFormFunctionLocal() in a DALocalInfo structure as shown in Fig 14

The fields are passed directly as multidimensional C arrays complete with ghostvalues An application scientist can use a boilerplate example such as the Bratu prob-lem merely altering the local physics calculation inFormFunctionLocal() torapidly obtain a parallel scalable simulation that runs on the desktop as well as onmassively parallel supercomputers

If the user has an expression for the Jacobian of the residual function then thismatrix can also be computed by using theDA interface with the call as in Fig 15and the corresponding routine for the local calculation in Fig 16

Developing a Geodynamics Simulator with PETSc 15

DASetLocalJacobian(userda (DALocalFunction1) FormJacobianLocal)

Fig 15Using the DA to form a Jacobian

int FormJacobianLocal(DALocalInfo infodouble xMat jacAppCtx user)MatStencil col[5]rowdouble lambdav[5]hxhyhxdhyhydhxscint ij

lambda = userminusgtparamhx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhylambdahxdhy = hxhy hydhx = hyhx

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

rowj = j rowi = i boundary points if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

v[0] = 10MatSetValuesStencil(jac1amprow1amprowvINSERT VALUES)

else interior grid points

v[0] = minushxdhy col[0]j = j minus 1 col[0]i = iv[1] = minushydhx col[1]j = j col[1]i = iminus1v[2] = 20(hydhx + hxdhy) minus scexp(x[j][i]) col[2]j = j col[2]i = iv[3] = minushydhx col[3]j = j col[3]i = i+1v[4] = minushxdhy col[4]j = j + 1 col[4]i = iMatSetValuesStencil(jac1amprow5colvINSERT VALUES)

MatAssemblyBegin(jacMAT FINAL ASSEMBLY)MatAssemblyEnd(jacMAT FINAL ASSEMBLY)MatSetOption(jacMAT NEW NONZERO LOCATION ERR)

Fig 16FormJacobianLocal() for the Bratu problem

16 Matthew G Knepley et al

int FormFunctionLocal(DALocalInfo infoField xField fvoid ptr)AppCtx user = (AppCtx)ptrParameter param = userminusgtparamGridInfo grid = userminusgtgriddouble mag w mag uint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1 i j

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

calculateXMomentumResidual(i j x f user)calculateZMomentumResidual(i j x f user)calculatePressureResidual(i j x f user)calculateTemperatureResidual(i j x f user)

Fig 17FormFunctionLocal() for the mantle subduction problem

We have so far presented simple examples involving a perturbed Laplacian butthis development strategy extends far beyond toy problems We have developedlarge-scale parallel geophysical simulations [1] that are producing new results inthe field We began by replacingFormFunctionLocal() in the Bratu examplewith one containing the linear physics of the previous sequential linear subductioncode (dropping the previous codersquos linear solve) Then the PETSc solver results werecompared for correctness with the complete prior subduction code Next the newcode was immediately run in parallel for correctness and efficiency tests Finallythe nonlinear terms from the more complicated physics of the mantle subductionproblem discussed in Section 1 and shown in Figs 17-21 and better quality dis-cretizations of the boundary conditions were added In fact since the structure oftheFormFunction() changed significantly the finalFormFunction() looksnothing like the original but the process of obtaining through the incremental ap-proach reduced the learning curve for PETSc and make correctness checking at thevarious stages much easier

The details of the subduction code are not as important as the recognition thata very complicated physics problem need not involve any more complication in oursimulation infrastructure To give more insight into the actual calculation we showin Fig 22 the explicit calculation of the residual from the continuity equation

Here we have used theDA for a multicomponent problem on a staggered meshwithout alteration because the structure of the discretization remains logically rect-angular

Although theDA manages communication during the solution process data post-processing often necessary for visualization or analysis also requires this function-ality For the mantle subduction simulation we want to calculate both the second in-

Developing a Geodynamics Simulator with PETSc 17

int calculateXMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj) f[j][i]u = x[j][i]u minus SlabVel(rsquoUrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]u = x[j][i]u minus 00

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else

f[j][i]u = XNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]u = XMomentumResidual(xijuser)

else in the mantle wedge f[j][i]u = XMomentumResidual(xijuser)

Fig 18X-momentum residual for the mantle subduction problem

variant of the strain rate tensor and the viscosity over the grid (at both cell centers andcorners) Using theDAGlobalToLocalBegin () andDAGlobalToLocalEnd() functions we transferred the global solution data to local ghosted vectors and pro-ceeded with the calculation We can also transfer data from local vectors to a globalvector usingDALocalToGlobal () The entire viscosity calculation is given inFig 23

Note that this framework allows the user to manage arbitrary fields over the do-main For instance a porous flow simulation might manage material properties ofthe medium in this fashion

4 Solvers

The SNES object in PETSc is an abstraction of the ldquoinverserdquo of a nonlinear oper-ator The user provides aFormFunction() routine as seen in Section 3 whichcomputes the action of the operator on an input vector Linear problems can also be

18 Matthew G Knepley et al

int calculateZMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1jlim = infominusgtmyminus1

if (ilt=j) f[j][i]w = x[j][i]w minus SlabVel(rsquoWrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]w = x[j][i]w minus 00

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else

f[j][i]w = ZNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]w = ZMomentumResidual(xijuser)

else in the mantle wedge f[j][i]w = ZMomentumResidual(xijuser)

Fig 19Z-momentum residual for the mantle subduction problem

int calculatePressureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj | | jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lid or slab f[j][i]p = x[j][i]p

else if ((i==ilim | | j==jlim ) ampamp paramminusgtibound==BC ANALYTIC ) on an analytic boundary f[j][i]p = x[j][i]p minus Pressure(ijuser)

else in the mantle wedge f[j][i]p = ContinuityResidual(xijuser)

Fig 20Pressure or continuity residual for the mantle subduction problem

Developing a Geodynamics Simulator with PETSc 19

int calculateTemperatureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (j==0) on the surface f[j][i]T = x[j][i]T + x[j+1][i]T + PetscMax(x[j][i]T00)

else if (i==0) slab inflow boundary f[j][i]T = x[j][i]T minus PlateModel(jPLATE SLABuser)

else if (i==ilim ) right side boundary mag u = 10 minus pow( (10minusPetscMax(PetscMin(x[j][iminus1]uparamminusgtcb10)00)) 50 )f[j][i]T = x[j][i]T minus mag ux[jminus1][iminus1]T minus (10minusmag u)PlateModel(jPLATE LID user)

else if (j==jlim ) bottom boundary mag w = 10 minus pow( (10minusPetscMax(PetscMin(x[jminus1][i]wparamminusgtsb10)00)) 50 )f[j][i]T = x[j][i]T minus mag wx[jminus1][iminus1]T minus (10minusmag w)

else in the mantle wedge f[j][i]T = EnergyResidual(xijuser)

Fig 21Temperature or energy residual for the mantle subduction problem

double precision ContinuityResidual(Field x int i int j AppCtx user)

GridInfo grid = userminusgtgriddouble precision uEuWwNwSdudxdwdz

uW = x[j][iminus1]u uE = x[j][i]u dudx = (uE minus uW)gridminusgtdxwS = x[jminus1][i]w wN = x[j][i]w dwdz = (wN minus wS)gridminusgtdzreturn dudx + dwdz

Fig 22Calculation of the continuity residual for the mantle subduction problem

solved by usingSNES with no loss of efficiency compared to using the underlyingKSP object (PETSc linear solver object) directly ThusSNES seems the appropri-ate framework for a general-purpose simulator providing the flexibility to add orsubtract nonlinear terms from the equation at will

TheSNES solver does not require a user to implement a Jacobian Default rou-tines are provided to compute a finite difference approximation using coloring toaccount for the sparsity of the matrix The user does not even need the PETScMatinterface but can initially interact only withVec objects making the transition to

PETSc nearly painless However these approximations to the Jacobian can have nu-merical difficulties and are not as efficient as direct evaluation Therefore the user

20 Matthew G Knepley et al

Compute both the second invariant of the strain rate tensor and the viscosity int ViscosityField(DA da Vec X Vec VAppCtx user)

Parameter param = userminusgtparamGridInfo grid = userminusgtgridVec localXField v xdouble eps dx dz T epsC TCint ijisjsimjmilim jlim ivt

ivt = paramminusgtiviscparamminusgtivisc = paramminusgtoutput ivisc

DACreateLocalVector(da amplocalX)DAGlobalToLocalBegin(da X INSERT VALUES localX)DAGlobalToLocalEnd(da X INSERT VALUES localX)DAVecGetArray(dalocalX(void)ampx)DAVecGetArray(daV(void)ampv)

Parameters dx = gridminusgtdx dz = gridminusgtdzilim = gridminusgtniminus1 jlim = gridminusgtnjminus1

Compute real temperature strain rate and viscosity DAGetCorners(daampisampjsPETSCNULL ampimampjmPETSCNULL )for (j=js jltjs+jm j++)

for (i=is iltis+im i++) T = paramminusgtpotentialT x[j][i]T exp( (jminus05)dzparamminusgtz scale )if (iltilim ampamp jltjlim )

TC = paramminusgtpotentialT TInterp(xij) exp( jdzparamminusgtz scale ) else

TC = T Compute the values at both cell centers and cell corners eps = CalcSecInv(xijCELL CENTERuser)epsC = CalcSecInv(xijCELL CORNERuser)v[j][i]u = epsv[j][i]w = epsCv[j][i]p = Viscosity(Tepsdz(jminus05)param)v[j][i]T = Viscosity(TCepsCdzjparam)

DAVecRestoreArray(daV(void)ampv)DAVecRestoreArray(dalocalX(void)ampx)paramminusgtivisc = ivt

Fig 23 Calculation of the second invariant of the strain tensor and viscosity field for themantle subduction problem

Developing a Geodynamics Simulator with PETSc 21

KSP ksp subkspPC bjacobi ilu

SNESGetKSP(snes ampksp)KSPGetPC(ksp ampbjacobi)PCBJacobiGetSubKSP(bjacobi PETSCNULL PETSCNULL ampsubksp)KSPGetPC(subksp[0] ampilu)PCILUSetLevels(ilu levels)

Fig 24Customizing the preconditioner on a single process

has the option of providing a routine or of using the ADIC [15] or similar system forautomatic differentiation The finite difference approximation can be activated byusing the-snes fd and-mat fd coloring freq options or by providing theSNESDefaultComputeJacobianColor () function toSNESSetJacobian()

Through theSNES object the user may also access the great range of directand iterative linear solvers and preconditioners provided by PETSc Sixteen differentKrylov solvers are available including GMRES BiCGStab and LSQR along withinterfaces to popular sparse direct packages such as MUMPS [4] and SuperLU [9]In addition to a variety of incomplete factorization preconditioners including PI-LUT [16] PETSc supports additive Schwartz preconditioning algebraic multigridthrough BoomerAMG [13] and the geometric multigrid discussed below The fullpanoply of solvers and preconditioners available is catalogued on the PETSc Web-site [3]

The modularity of PETSc allows users to easily customize each solver For in-stance suppose the user wishes to increase the number of levels in ILU(k) precon-ditioning on one block of a block-Jacobi scheme The code fragment in Fig 24 willset the number of levels of fill tolevels

PETSc provides an elegant framework for managing geometric multigrid in com-bination with aDA which is abstracted in theDMMGobject In the same way thatany linear solve can be performed withSNES any preconditioner or solver com-bination is available inDMMGby using a single level That is the same user codesupports both geometric multigrid as well as direct methods Krylov methods etcWhen creating aDMMG the user specifies the number of levels of grid refinementand provides the coarse-grid information (see Fig 25) which is always aDA objectat present although unstructured prototypes are being developed Then theDMMGobject calculates the intergrid transfer operators prolongation and restriction and

allocates objects to hold the solution at each level The user must provide the actionof the residual operatorF and can optionally provide a routine to compute the Jaco-bian matrix (which will be called on each level) and a routine to compute the initialguess as in Fig 26

With this information the user can now solve the equation and retrieve the solu-tion vector as in Fig 27

22 Matthew G Knepley et al

DMMGCreate(comm gridmglevels user ampdmmg)DMMGSetDM(dmmg (DM) da)

Fig 25Creating a DMMG

DMMGSetSNESLocal(dmmg FormFunctionLocal FormJacobianLocal 0 0)DMMGSetInitialGuess(dmmg FormInitialGuess)

Fig 26Providing discretization and assembly routines to DMMG

DMMGSolve(dmmg)soln = DMMGGetx(dmmg)

Fig 27Solving the problem with DMMG

5 Extensions

PETSc is not a complete environment for simulating physical phenomena Ratherit is a set of tools that allow the user to assemble such an environment tailored toa specific application As such PETSc will never provide every facility appropriatefor a given simulation However the user can easily extend PETSc and supplementits capabilities We present two examples in the context of the mantle subductionsimulation

51 Simple Continuation

Because of the highly nonlinear dependence of viscosity on temperature and strainrate in equation (1) the iteration in Newtonrsquos method can fail to converge withouta good starting guess of the solution On the other hand for the isoviscous casewhere temperature is coupled to velocity only through advection a solution is easilyreached It is therefore natural to propose a continuation scheme in the viscosity be-ginning with constant viscosity and progressing to full variability A simple adaptivecontinuation model was devised in which the viscosity was raised to a power be-tween zero and oneη rarr ηα whereα=0 corresponds to constant viscosity andα=1corresponds to natural viscosity variation In the continuation loopα was increasedtowards unity by an amount dependent on the rate of convergence of the previousSNES solve

PETSc itself provides no special support for continuation but it is sufficientlymodular that a continuation loop was readily constructed in the application codeusing repeated calls toDMMGSolve() and found to work quite well

Developing a Geodynamics Simulator with PETSc 23

52 Simple Steering

No rigorous justification had been given for the continuation strategy above andconsequently it was possible that for certain initial conditions Newton would fail toconverge thereby resulting in extremely long run times An observant user could de-tect this situation and abort the run but all the potentially useful solution informationup to that point would be lost A clean way to asynchronously abort the continua-tion loop was thus needed On architectures where OS signals are available PETScprovides an interface for registering signal handlers Thus we were able to define ahandler The user wishing to change the control flow of the simulation simply sendsthe appropriate signal to the process This sets a flag in the user data that causes thechange at the next iteration

6 Simulation Results

A long-standing debate has existed between theorists and observationalists over thethermal structure of subduction zones Modelers using constant viscosity flow sim-ulations to compute thermal structure have predicted relatively cold subductionzones [19] Conversely observationalists who use heat flow measurements and min-eralogical thermobarometry to estimate temperatures at depth have long claimedthat subduction zones are hotter than predicted They have invoked the presence ofstrong upwelling of hot mantle material to supply the heat This upwelling was neverpredicted by isoviscous flow models however

The debate remained unresolved until recently when several groups includingour own succeeded in developing simulations with realistically variable viscosity(for other examples [810121821]) A comparison of flow and temperature fieldsfor variable and constant viscosity simulations generated with our code is shownin Fig 28 Results from these simulations are exciting because they close the gapbetween models and observations they predict hotter mantle temperatures steepersurface thermal gradients and upwelling mantle flow

Furthermore these simulations allow for quantitative predictions of variation ofobservables with subduction parameters A recent study of subduction zone earth-quakes has identified an intriguing trend the vertical distance from subduction zonevolcanoes to the surface of the subducting slab is anti-correlated with descent rate ofthe slab [11] Preliminary calculations with our model are consistent with this trendindicating that flow and thermal structure may play an important role in determiningnot only the quantity and chemistry of magmas but also their path of transport to thesurface Further work is required to resolve this issue

Simulating the geodynamics of subduction is an example of our success in port-ing an existing code into the PETSc framework simultaneously parallelizing thecode and increasing its functionality Using this code as a template we rapidly de-veloped a related simulation of another tectonic boundary the mid-ocean ridge Thiswork was done to address a set of observations of systematic morphological asym-metry in the global mid-ocean ridge system [7] Our model confirms the qualitative

24 Matthew G Knepley et al

Dep

th k

m

log10

η and flow field

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24 Temperature degC

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Distance km

Dep

th k

m

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24

Distance km

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Isov

isco

usV

aria

ble

Vis

cosi

ty

(a) (b)

(c) (d)

Fig 282D viscosity and potential temperature fields from simulations on 8 processors with230112 degrees of freedom Panels in the top row are from a simulation withα=1 in equation(1) Panels in bottom row haveα=0 The white box in panels (a) and (c) shows the region inwhich temperature is plotted in panels (b) and (d)(a) Colors showlog10 of the viscosity fieldNote that there are more than five orders of magnitude variation in viscosity Arrows show theflow direction and magnitude (the slab is being subducted at a rate of 6 cmyear) Upwellingis evident in the flow field near the wedge corner(b) Temperature field from the variableviscosity simulation 1100C isotherm is shown as a dashed line(c) (Constant) viscosity andflow field from the isoviscous simulation Strong flow is predicted at the base of the crustdespite the low-temperature rock there No upwelling flow is predicted(d) Temperature fieldfrom isoviscous simulation Note that the mantle wedge corner is much colder than in (b)

mechanism that had been proposed to explain these observations Furthermore itshows a quantitative agreement with trends in the observed data [17] As with ourmodels of subduction the key to demonstrating the validity of the hypothesized dy-namics was simulating them with the strongly nonlinear rheology that characterizesmantle rock on geologic timescales The computational tools provided by PETSc en-abled us easily handle this rheology reducing development time and allowing us tofocus on model interpretation

References

1 PETSc SNES Example 30 httpwwwmcsanlgovpetscpetsc-2

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 8: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

8 Matthew G Knepley et al

module f90moduletype userctx

The start end and number of vertices in the xminus and yminusdirectionsinteger xsxeysyeinteger mxmydouble precision lambda

end type userctxcontainsend module f90module

subroutine FormFunction(uFuserierr)use f90moduletype (userctx) userdouble precision u(userxsuserxeuserysuserye)double precision F(userxsuserxeuserysuserye)double precision twoonehxhyhxdhyhydhxscuijuxxuyyinteger ijierr

hx = 10dble(usermxminus1)hy = 10dble(usermyminus1)sc = hxhyuserlambdahxdhy = hxhyhydhx = hyhx

do 20 j=userysuseryedo 10 i=userxsuserxe

Apply boundary conditionsif (i == 1 or j == 1 or i == usermx or j == usermy) then

F(ij) = u(ij) Apply finite difference scheme

elseuij = u(ij)uxx = hydhx (20uij minus u(iminus1j) minus u(i+1j))uyy = hxdhy (20uij minus u(ijminus1) minus u(ij+1))F(ij) = uxx + uyy minus scexp(uij)

endif10 continue20 continue

Fig 7Residual calculation for the Bratu problem

Developing a Geodynamics Simulator with PETSc 9

subroutine driver(uFuserierr)implicit none

Vec uFint Nparameter(N=10000)

call VecCreate(PETSCCOMM WORLDuierr)call VecSetSizes(uPETSCDECIDENierr)call VecDuplicate(uFierr)call SolverLoop(uFierr)call VecDestroy(uierr)call VecDestroy(Fierr)end

subroutine FormFunctionPETSc(uFuserierr)use f90moduleimplicit none

Vec uFtype (userctx) userdouble precisionpointer u v()f v()

call VecGetArrayF90(uu vierr)call VecGetArrayF90(Ff vierr)call FormFunction(u vf vuserierr)call VecRestoreArrayF90(uu vierr)call VecRestoreArrayF90(Ff vierr)end

Fig 8Driver for the Bratu problem using PETSc

PETScMat object provides theMatSetValues () function to set logically denseblocks of values into the structure A function that computes the Jacobian and storesit in a PETSc matrix is shown in Fig 9 With this addition the user can now runserial code using the PETSc solvers (details are given in Section 4) Note that thestorage mechanism is the only change to user code necessary for its incorporationinto the PETSc framework using the wrapper in Fig 8

Although PETSc removes the need for low-level parallel programming such asdirect calls to the MPI library the user must still identify parallelism inherent inthe application and must structure the computation accordingly The first step is topartition the domain and have each process calculate the residual and Jacobian onlyover its local piece This is easily accomplished by redefining thexs xe ys andyevariables on each process converting loops over the entire domain into loops overthe local domain However this method uses nearest-neighbor information and thus

10 Matthew G Knepley et al

subroutine FormJacobian(ujacuserierr)use f90moduletype (userctx) userMat jacdouble precision x(userxsuserxeuserysuserye)double precision twoonehxhyhxdhyhydhxscv(5)integer rowcol(5)ijierr

one = 10two = 20hx = onedble(usermxminus1)hy = onedble(usermyminus1)sc = hxhyuserlambdahxdhy = hxhyhydhx = hyhxdo 20 j=userysuserye

row = (j minus userys)userxm minus 1do 10 i=userxsuserxe

row = row + 1 boundary points

if (i == 1 or j == 1 or i == usermx or j == usermy) thencol(1) = rowv(1) = onecall MatSetValues(jac1row1colvINSERT VALUESierr)

interior grid pointselse

v(1) = minushxdhyv(2) = minushydhxv(3) = two(hydhx + hxdhy) minus scexp(x(ij))v(4) = minushydhxv(5) = minushxdhycol(1) = row minus userxmcol(2) = row minus 1col(3) = rowcol(4) = row + 1col(5) = row + userxmcall MatSetValues(jac1row5colvINSERT VALUESierr)

endif10 continue20 continue

Fig 9Jacobian calculation for the Bratu problem using PETSc

Developing a Geodynamics Simulator with PETSc 11

we must have access to some values not present locally on the process as shown inFig 10

Box-type stencil Star-type stencil

Proc 6

Proc 0 Proc 0Proc 1 Proc 1

Proc 6

Fig 10Star and box stencils using a DA

We must store values for theseghostpoints locally so that they may be used forthe computation In addition we must ensure that these values are identical to the cor-responding values on the neighboring processes At the lowest level PETSc providesghosted vectors created usingVecCreateGhost () with storage for some valuesowned by other processes These values are explicitly indicated during constructionTheVecGhostUpdateBegin () andVecGhostUpdateEnd () functions trans-fer values between ghost storage and the corresponding storage on other processesThis method however can become complicated for the user Therefore for the com-mon case of a logically rectangular grid PETSc provides theDA object to managethe determination allocation and coherence of ghost values TheDA object is dis-cussed fully in Section 32 but we give here a short introduction in the context of theBratu problem

TheDA object represents a distributed structured (possibly staggered) grid andthus has more information than our serial grid If we augment our user context toinclude information about ghost values we can obtain all the grid information fromtheDA Figure 11 shows the creation of aDA when one initially know only the totalnumber of vertices in thex andy directions Alternatively we could have specifiedthe local number of vertices in each direction

Now we need only modify our PETSc residual routine to update the ghost valuesas shown in Fig 12

Notice that the ghost value scatter is a two-step operation This allows thecommunication to overlap with local computation that may be taking place whilethe messages are in transit The only change necessary for the residual calcu-lation itself is the correct declaration of the input arraydouble precision

12 Matthew G Knepley et al

type userctxDA da

The start end and number of vertices in the xminusdirectioninteger xsxexm

The start end and number of ghost vertices in the xminusdirectioninteger gxsgxegxm

The start end and number of vertices in the yminusdirectioninteger ysyeym

The start end and number of ghost vertices in the yminusdirectioninteger gysgyegym

The number of vertices in the xminus and yminusdirectionsinteger mxmy

The MPI rank of this processinteger rank

The coefficient in the Bratu equationdouble precision lambda

end type userctx

call DACreate2d(PETSCCOMM WORLDDA NONPERIODICDA STENCIL STAR ampamp usermxusermyPETSCDECIDEPETSCDECIDE11 ampamp PETSCNULL INTEGERPETSCNULL INTEGERuserdaierr)call DAGetCorners(userdauserxsuserysPETSCNULL INTEGER amp

amp userxmuserymPETSCNULL INTEGERierr)call DAGetGhostCorners(userdausergxsusergys amp

amp PETSCNULL INTEGERusergxmusergym ampamp PETSCNULL INTEGERierr)

Here we shift the starting indices up by one so that we can easily use the Fortran convention of1minusbased indices rather than0minusbased

userxs = userxs+1userys = userys+1usergxs = usergxs+1usergys = usergys+1userye = userys+userymminus1userxe = userxs+userxmminus1usergye = usergys+usergymminus1usergxe = usergxs+usergxmminus1

Fig 11Creation of a DA for the Bratu problem

Developing a Geodynamics Simulator with PETSc 13

subroutine FormFunctionPETSc(uFuserierr)implicit none

Vec uFinteger ierrtype (userctx) user

double precisionpointer lu v()lf v()Vec uLocal

call DAGetLocalVector(userdauLocalierr)call DAGlobalToLocalBegin(userdauINSERT VALUESuLocalierr)call DAGlobalToLocalEnd(userdauINSERT VALUESuLocalierr)

call VecGetArrayF90(uLocallu vierr)call VecGetArrayF90(Flf vierr)

Actually compute the local portion of the residualcall FormFunctionLocal(lu vlf vuserierr)

call VecRestoreArrayF90(uLocallu vierr)call VecRestoreArrayF90(Flf vierr)

call DARestoreLocalVector(userdauLocalierr)

Fig 12Parallel residual calculation for the Bratu problem using PETSc

u(usergxsusergxeusergysusergye) which now includes ghostvalues

The changes toFormJacobian() are similar resizing the input array andchanging slightly the calculation of row and column indices and can be found inthe PETSc example source We have now finished the initial introduction of PETSclinear algebra and the simulation is ready to run in parallel

32 Rapid Evolution

The user who is willing to raise the level of abstraction in a code can start by em-ploying theDA to describe the problem domain and discretization The data interfacemimics a multidimensional array and is thus ideal for finite difference finite volumeand low-order finite element schemes on a logically rectangular grid In fact after thedefinition of aFormFunction() routine the simulation is ready to run becausethe nonlinear solver provides approximate Jacobians automatically as discussed inSection 4

14 Matthew G Knepley et al

DASetLocalFunction(userda (DALocalFunction1) FormFunctionLocal)

Fig 13Using the DA to form a function

int FormFunctionLocal(DALocalInfo infodouble udouble fAppCtx user)double two = 20hxhyhxdhyhydhxscdouble uijuxxuyyint ij

hx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhyuserminusgtlambdahxdhy = hxhyhydhx = hyhx Compute function over the locally owned part of the grid for (j=infominusgtys jltinfominusgtys+infominusgtym j++)

for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++) if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

f[j][i] = u[j][i] else

uij = u[j][i]uxx = (twouij minus u[j][iminus1] minus u[j][i+1])hydhxuyy = (twouij minus u[jminus1][i] minus u[j+1][i])hxdhyf[j][i] = uxx + uyy minus scexp(uij)

Fig 14FormFunctionLocal() for the Bratu problem

We begin by examining the Bratu problem from the last section only this time inC We can now provide our local residual routine to theDA see Fig 13

The grid information is passed toFormFunctionLocal() in a DALocalInfo structure as shown in Fig 14

The fields are passed directly as multidimensional C arrays complete with ghostvalues An application scientist can use a boilerplate example such as the Bratu prob-lem merely altering the local physics calculation inFormFunctionLocal() torapidly obtain a parallel scalable simulation that runs on the desktop as well as onmassively parallel supercomputers

If the user has an expression for the Jacobian of the residual function then thismatrix can also be computed by using theDA interface with the call as in Fig 15and the corresponding routine for the local calculation in Fig 16

Developing a Geodynamics Simulator with PETSc 15

DASetLocalJacobian(userda (DALocalFunction1) FormJacobianLocal)

Fig 15Using the DA to form a Jacobian

int FormJacobianLocal(DALocalInfo infodouble xMat jacAppCtx user)MatStencil col[5]rowdouble lambdav[5]hxhyhxdhyhydhxscint ij

lambda = userminusgtparamhx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhylambdahxdhy = hxhy hydhx = hyhx

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

rowj = j rowi = i boundary points if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

v[0] = 10MatSetValuesStencil(jac1amprow1amprowvINSERT VALUES)

else interior grid points

v[0] = minushxdhy col[0]j = j minus 1 col[0]i = iv[1] = minushydhx col[1]j = j col[1]i = iminus1v[2] = 20(hydhx + hxdhy) minus scexp(x[j][i]) col[2]j = j col[2]i = iv[3] = minushydhx col[3]j = j col[3]i = i+1v[4] = minushxdhy col[4]j = j + 1 col[4]i = iMatSetValuesStencil(jac1amprow5colvINSERT VALUES)

MatAssemblyBegin(jacMAT FINAL ASSEMBLY)MatAssemblyEnd(jacMAT FINAL ASSEMBLY)MatSetOption(jacMAT NEW NONZERO LOCATION ERR)

Fig 16FormJacobianLocal() for the Bratu problem

16 Matthew G Knepley et al

int FormFunctionLocal(DALocalInfo infoField xField fvoid ptr)AppCtx user = (AppCtx)ptrParameter param = userminusgtparamGridInfo grid = userminusgtgriddouble mag w mag uint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1 i j

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

calculateXMomentumResidual(i j x f user)calculateZMomentumResidual(i j x f user)calculatePressureResidual(i j x f user)calculateTemperatureResidual(i j x f user)

Fig 17FormFunctionLocal() for the mantle subduction problem

We have so far presented simple examples involving a perturbed Laplacian butthis development strategy extends far beyond toy problems We have developedlarge-scale parallel geophysical simulations [1] that are producing new results inthe field We began by replacingFormFunctionLocal() in the Bratu examplewith one containing the linear physics of the previous sequential linear subductioncode (dropping the previous codersquos linear solve) Then the PETSc solver results werecompared for correctness with the complete prior subduction code Next the newcode was immediately run in parallel for correctness and efficiency tests Finallythe nonlinear terms from the more complicated physics of the mantle subductionproblem discussed in Section 1 and shown in Figs 17-21 and better quality dis-cretizations of the boundary conditions were added In fact since the structure oftheFormFunction() changed significantly the finalFormFunction() looksnothing like the original but the process of obtaining through the incremental ap-proach reduced the learning curve for PETSc and make correctness checking at thevarious stages much easier

The details of the subduction code are not as important as the recognition thata very complicated physics problem need not involve any more complication in oursimulation infrastructure To give more insight into the actual calculation we showin Fig 22 the explicit calculation of the residual from the continuity equation

Here we have used theDA for a multicomponent problem on a staggered meshwithout alteration because the structure of the discretization remains logically rect-angular

Although theDA manages communication during the solution process data post-processing often necessary for visualization or analysis also requires this function-ality For the mantle subduction simulation we want to calculate both the second in-

Developing a Geodynamics Simulator with PETSc 17

int calculateXMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj) f[j][i]u = x[j][i]u minus SlabVel(rsquoUrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]u = x[j][i]u minus 00

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else

f[j][i]u = XNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]u = XMomentumResidual(xijuser)

else in the mantle wedge f[j][i]u = XMomentumResidual(xijuser)

Fig 18X-momentum residual for the mantle subduction problem

variant of the strain rate tensor and the viscosity over the grid (at both cell centers andcorners) Using theDAGlobalToLocalBegin () andDAGlobalToLocalEnd() functions we transferred the global solution data to local ghosted vectors and pro-ceeded with the calculation We can also transfer data from local vectors to a globalvector usingDALocalToGlobal () The entire viscosity calculation is given inFig 23

Note that this framework allows the user to manage arbitrary fields over the do-main For instance a porous flow simulation might manage material properties ofthe medium in this fashion

4 Solvers

The SNES object in PETSc is an abstraction of the ldquoinverserdquo of a nonlinear oper-ator The user provides aFormFunction() routine as seen in Section 3 whichcomputes the action of the operator on an input vector Linear problems can also be

18 Matthew G Knepley et al

int calculateZMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1jlim = infominusgtmyminus1

if (ilt=j) f[j][i]w = x[j][i]w minus SlabVel(rsquoWrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]w = x[j][i]w minus 00

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else

f[j][i]w = ZNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]w = ZMomentumResidual(xijuser)

else in the mantle wedge f[j][i]w = ZMomentumResidual(xijuser)

Fig 19Z-momentum residual for the mantle subduction problem

int calculatePressureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj | | jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lid or slab f[j][i]p = x[j][i]p

else if ((i==ilim | | j==jlim ) ampamp paramminusgtibound==BC ANALYTIC ) on an analytic boundary f[j][i]p = x[j][i]p minus Pressure(ijuser)

else in the mantle wedge f[j][i]p = ContinuityResidual(xijuser)

Fig 20Pressure or continuity residual for the mantle subduction problem

Developing a Geodynamics Simulator with PETSc 19

int calculateTemperatureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (j==0) on the surface f[j][i]T = x[j][i]T + x[j+1][i]T + PetscMax(x[j][i]T00)

else if (i==0) slab inflow boundary f[j][i]T = x[j][i]T minus PlateModel(jPLATE SLABuser)

else if (i==ilim ) right side boundary mag u = 10 minus pow( (10minusPetscMax(PetscMin(x[j][iminus1]uparamminusgtcb10)00)) 50 )f[j][i]T = x[j][i]T minus mag ux[jminus1][iminus1]T minus (10minusmag u)PlateModel(jPLATE LID user)

else if (j==jlim ) bottom boundary mag w = 10 minus pow( (10minusPetscMax(PetscMin(x[jminus1][i]wparamminusgtsb10)00)) 50 )f[j][i]T = x[j][i]T minus mag wx[jminus1][iminus1]T minus (10minusmag w)

else in the mantle wedge f[j][i]T = EnergyResidual(xijuser)

Fig 21Temperature or energy residual for the mantle subduction problem

double precision ContinuityResidual(Field x int i int j AppCtx user)

GridInfo grid = userminusgtgriddouble precision uEuWwNwSdudxdwdz

uW = x[j][iminus1]u uE = x[j][i]u dudx = (uE minus uW)gridminusgtdxwS = x[jminus1][i]w wN = x[j][i]w dwdz = (wN minus wS)gridminusgtdzreturn dudx + dwdz

Fig 22Calculation of the continuity residual for the mantle subduction problem

solved by usingSNES with no loss of efficiency compared to using the underlyingKSP object (PETSc linear solver object) directly ThusSNES seems the appropri-ate framework for a general-purpose simulator providing the flexibility to add orsubtract nonlinear terms from the equation at will

TheSNES solver does not require a user to implement a Jacobian Default rou-tines are provided to compute a finite difference approximation using coloring toaccount for the sparsity of the matrix The user does not even need the PETScMatinterface but can initially interact only withVec objects making the transition to

PETSc nearly painless However these approximations to the Jacobian can have nu-merical difficulties and are not as efficient as direct evaluation Therefore the user

20 Matthew G Knepley et al

Compute both the second invariant of the strain rate tensor and the viscosity int ViscosityField(DA da Vec X Vec VAppCtx user)

Parameter param = userminusgtparamGridInfo grid = userminusgtgridVec localXField v xdouble eps dx dz T epsC TCint ijisjsimjmilim jlim ivt

ivt = paramminusgtiviscparamminusgtivisc = paramminusgtoutput ivisc

DACreateLocalVector(da amplocalX)DAGlobalToLocalBegin(da X INSERT VALUES localX)DAGlobalToLocalEnd(da X INSERT VALUES localX)DAVecGetArray(dalocalX(void)ampx)DAVecGetArray(daV(void)ampv)

Parameters dx = gridminusgtdx dz = gridminusgtdzilim = gridminusgtniminus1 jlim = gridminusgtnjminus1

Compute real temperature strain rate and viscosity DAGetCorners(daampisampjsPETSCNULL ampimampjmPETSCNULL )for (j=js jltjs+jm j++)

for (i=is iltis+im i++) T = paramminusgtpotentialT x[j][i]T exp( (jminus05)dzparamminusgtz scale )if (iltilim ampamp jltjlim )

TC = paramminusgtpotentialT TInterp(xij) exp( jdzparamminusgtz scale ) else

TC = T Compute the values at both cell centers and cell corners eps = CalcSecInv(xijCELL CENTERuser)epsC = CalcSecInv(xijCELL CORNERuser)v[j][i]u = epsv[j][i]w = epsCv[j][i]p = Viscosity(Tepsdz(jminus05)param)v[j][i]T = Viscosity(TCepsCdzjparam)

DAVecRestoreArray(daV(void)ampv)DAVecRestoreArray(dalocalX(void)ampx)paramminusgtivisc = ivt

Fig 23 Calculation of the second invariant of the strain tensor and viscosity field for themantle subduction problem

Developing a Geodynamics Simulator with PETSc 21

KSP ksp subkspPC bjacobi ilu

SNESGetKSP(snes ampksp)KSPGetPC(ksp ampbjacobi)PCBJacobiGetSubKSP(bjacobi PETSCNULL PETSCNULL ampsubksp)KSPGetPC(subksp[0] ampilu)PCILUSetLevels(ilu levels)

Fig 24Customizing the preconditioner on a single process

has the option of providing a routine or of using the ADIC [15] or similar system forautomatic differentiation The finite difference approximation can be activated byusing the-snes fd and-mat fd coloring freq options or by providing theSNESDefaultComputeJacobianColor () function toSNESSetJacobian()

Through theSNES object the user may also access the great range of directand iterative linear solvers and preconditioners provided by PETSc Sixteen differentKrylov solvers are available including GMRES BiCGStab and LSQR along withinterfaces to popular sparse direct packages such as MUMPS [4] and SuperLU [9]In addition to a variety of incomplete factorization preconditioners including PI-LUT [16] PETSc supports additive Schwartz preconditioning algebraic multigridthrough BoomerAMG [13] and the geometric multigrid discussed below The fullpanoply of solvers and preconditioners available is catalogued on the PETSc Web-site [3]

The modularity of PETSc allows users to easily customize each solver For in-stance suppose the user wishes to increase the number of levels in ILU(k) precon-ditioning on one block of a block-Jacobi scheme The code fragment in Fig 24 willset the number of levels of fill tolevels

PETSc provides an elegant framework for managing geometric multigrid in com-bination with aDA which is abstracted in theDMMGobject In the same way thatany linear solve can be performed withSNES any preconditioner or solver com-bination is available inDMMGby using a single level That is the same user codesupports both geometric multigrid as well as direct methods Krylov methods etcWhen creating aDMMG the user specifies the number of levels of grid refinementand provides the coarse-grid information (see Fig 25) which is always aDA objectat present although unstructured prototypes are being developed Then theDMMGobject calculates the intergrid transfer operators prolongation and restriction and

allocates objects to hold the solution at each level The user must provide the actionof the residual operatorF and can optionally provide a routine to compute the Jaco-bian matrix (which will be called on each level) and a routine to compute the initialguess as in Fig 26

With this information the user can now solve the equation and retrieve the solu-tion vector as in Fig 27

22 Matthew G Knepley et al

DMMGCreate(comm gridmglevels user ampdmmg)DMMGSetDM(dmmg (DM) da)

Fig 25Creating a DMMG

DMMGSetSNESLocal(dmmg FormFunctionLocal FormJacobianLocal 0 0)DMMGSetInitialGuess(dmmg FormInitialGuess)

Fig 26Providing discretization and assembly routines to DMMG

DMMGSolve(dmmg)soln = DMMGGetx(dmmg)

Fig 27Solving the problem with DMMG

5 Extensions

PETSc is not a complete environment for simulating physical phenomena Ratherit is a set of tools that allow the user to assemble such an environment tailored toa specific application As such PETSc will never provide every facility appropriatefor a given simulation However the user can easily extend PETSc and supplementits capabilities We present two examples in the context of the mantle subductionsimulation

51 Simple Continuation

Because of the highly nonlinear dependence of viscosity on temperature and strainrate in equation (1) the iteration in Newtonrsquos method can fail to converge withouta good starting guess of the solution On the other hand for the isoviscous casewhere temperature is coupled to velocity only through advection a solution is easilyreached It is therefore natural to propose a continuation scheme in the viscosity be-ginning with constant viscosity and progressing to full variability A simple adaptivecontinuation model was devised in which the viscosity was raised to a power be-tween zero and oneη rarr ηα whereα=0 corresponds to constant viscosity andα=1corresponds to natural viscosity variation In the continuation loopα was increasedtowards unity by an amount dependent on the rate of convergence of the previousSNES solve

PETSc itself provides no special support for continuation but it is sufficientlymodular that a continuation loop was readily constructed in the application codeusing repeated calls toDMMGSolve() and found to work quite well

Developing a Geodynamics Simulator with PETSc 23

52 Simple Steering

No rigorous justification had been given for the continuation strategy above andconsequently it was possible that for certain initial conditions Newton would fail toconverge thereby resulting in extremely long run times An observant user could de-tect this situation and abort the run but all the potentially useful solution informationup to that point would be lost A clean way to asynchronously abort the continua-tion loop was thus needed On architectures where OS signals are available PETScprovides an interface for registering signal handlers Thus we were able to define ahandler The user wishing to change the control flow of the simulation simply sendsthe appropriate signal to the process This sets a flag in the user data that causes thechange at the next iteration

6 Simulation Results

A long-standing debate has existed between theorists and observationalists over thethermal structure of subduction zones Modelers using constant viscosity flow sim-ulations to compute thermal structure have predicted relatively cold subductionzones [19] Conversely observationalists who use heat flow measurements and min-eralogical thermobarometry to estimate temperatures at depth have long claimedthat subduction zones are hotter than predicted They have invoked the presence ofstrong upwelling of hot mantle material to supply the heat This upwelling was neverpredicted by isoviscous flow models however

The debate remained unresolved until recently when several groups includingour own succeeded in developing simulations with realistically variable viscosity(for other examples [810121821]) A comparison of flow and temperature fieldsfor variable and constant viscosity simulations generated with our code is shownin Fig 28 Results from these simulations are exciting because they close the gapbetween models and observations they predict hotter mantle temperatures steepersurface thermal gradients and upwelling mantle flow

Furthermore these simulations allow for quantitative predictions of variation ofobservables with subduction parameters A recent study of subduction zone earth-quakes has identified an intriguing trend the vertical distance from subduction zonevolcanoes to the surface of the subducting slab is anti-correlated with descent rate ofthe slab [11] Preliminary calculations with our model are consistent with this trendindicating that flow and thermal structure may play an important role in determiningnot only the quantity and chemistry of magmas but also their path of transport to thesurface Further work is required to resolve this issue

Simulating the geodynamics of subduction is an example of our success in port-ing an existing code into the PETSc framework simultaneously parallelizing thecode and increasing its functionality Using this code as a template we rapidly de-veloped a related simulation of another tectonic boundary the mid-ocean ridge Thiswork was done to address a set of observations of systematic morphological asym-metry in the global mid-ocean ridge system [7] Our model confirms the qualitative

24 Matthew G Knepley et al

Dep

th k

m

log10

η and flow field

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24 Temperature degC

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Distance km

Dep

th k

m

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24

Distance km

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Isov

isco

usV

aria

ble

Vis

cosi

ty

(a) (b)

(c) (d)

Fig 282D viscosity and potential temperature fields from simulations on 8 processors with230112 degrees of freedom Panels in the top row are from a simulation withα=1 in equation(1) Panels in bottom row haveα=0 The white box in panels (a) and (c) shows the region inwhich temperature is plotted in panels (b) and (d)(a) Colors showlog10 of the viscosity fieldNote that there are more than five orders of magnitude variation in viscosity Arrows show theflow direction and magnitude (the slab is being subducted at a rate of 6 cmyear) Upwellingis evident in the flow field near the wedge corner(b) Temperature field from the variableviscosity simulation 1100C isotherm is shown as a dashed line(c) (Constant) viscosity andflow field from the isoviscous simulation Strong flow is predicted at the base of the crustdespite the low-temperature rock there No upwelling flow is predicted(d) Temperature fieldfrom isoviscous simulation Note that the mantle wedge corner is much colder than in (b)

mechanism that had been proposed to explain these observations Furthermore itshows a quantitative agreement with trends in the observed data [17] As with ourmodels of subduction the key to demonstrating the validity of the hypothesized dy-namics was simulating them with the strongly nonlinear rheology that characterizesmantle rock on geologic timescales The computational tools provided by PETSc en-abled us easily handle this rheology reducing development time and allowing us tofocus on model interpretation

References

1 PETSc SNES Example 30 httpwwwmcsanlgovpetscpetsc-2

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 9: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

Developing a Geodynamics Simulator with PETSc 9

subroutine driver(uFuserierr)implicit none

Vec uFint Nparameter(N=10000)

call VecCreate(PETSCCOMM WORLDuierr)call VecSetSizes(uPETSCDECIDENierr)call VecDuplicate(uFierr)call SolverLoop(uFierr)call VecDestroy(uierr)call VecDestroy(Fierr)end

subroutine FormFunctionPETSc(uFuserierr)use f90moduleimplicit none

Vec uFtype (userctx) userdouble precisionpointer u v()f v()

call VecGetArrayF90(uu vierr)call VecGetArrayF90(Ff vierr)call FormFunction(u vf vuserierr)call VecRestoreArrayF90(uu vierr)call VecRestoreArrayF90(Ff vierr)end

Fig 8Driver for the Bratu problem using PETSc

PETScMat object provides theMatSetValues () function to set logically denseblocks of values into the structure A function that computes the Jacobian and storesit in a PETSc matrix is shown in Fig 9 With this addition the user can now runserial code using the PETSc solvers (details are given in Section 4) Note that thestorage mechanism is the only change to user code necessary for its incorporationinto the PETSc framework using the wrapper in Fig 8

Although PETSc removes the need for low-level parallel programming such asdirect calls to the MPI library the user must still identify parallelism inherent inthe application and must structure the computation accordingly The first step is topartition the domain and have each process calculate the residual and Jacobian onlyover its local piece This is easily accomplished by redefining thexs xe ys andyevariables on each process converting loops over the entire domain into loops overthe local domain However this method uses nearest-neighbor information and thus

10 Matthew G Knepley et al

subroutine FormJacobian(ujacuserierr)use f90moduletype (userctx) userMat jacdouble precision x(userxsuserxeuserysuserye)double precision twoonehxhyhxdhyhydhxscv(5)integer rowcol(5)ijierr

one = 10two = 20hx = onedble(usermxminus1)hy = onedble(usermyminus1)sc = hxhyuserlambdahxdhy = hxhyhydhx = hyhxdo 20 j=userysuserye

row = (j minus userys)userxm minus 1do 10 i=userxsuserxe

row = row + 1 boundary points

if (i == 1 or j == 1 or i == usermx or j == usermy) thencol(1) = rowv(1) = onecall MatSetValues(jac1row1colvINSERT VALUESierr)

interior grid pointselse

v(1) = minushxdhyv(2) = minushydhxv(3) = two(hydhx + hxdhy) minus scexp(x(ij))v(4) = minushydhxv(5) = minushxdhycol(1) = row minus userxmcol(2) = row minus 1col(3) = rowcol(4) = row + 1col(5) = row + userxmcall MatSetValues(jac1row5colvINSERT VALUESierr)

endif10 continue20 continue

Fig 9Jacobian calculation for the Bratu problem using PETSc

Developing a Geodynamics Simulator with PETSc 11

we must have access to some values not present locally on the process as shown inFig 10

Box-type stencil Star-type stencil

Proc 6

Proc 0 Proc 0Proc 1 Proc 1

Proc 6

Fig 10Star and box stencils using a DA

We must store values for theseghostpoints locally so that they may be used forthe computation In addition we must ensure that these values are identical to the cor-responding values on the neighboring processes At the lowest level PETSc providesghosted vectors created usingVecCreateGhost () with storage for some valuesowned by other processes These values are explicitly indicated during constructionTheVecGhostUpdateBegin () andVecGhostUpdateEnd () functions trans-fer values between ghost storage and the corresponding storage on other processesThis method however can become complicated for the user Therefore for the com-mon case of a logically rectangular grid PETSc provides theDA object to managethe determination allocation and coherence of ghost values TheDA object is dis-cussed fully in Section 32 but we give here a short introduction in the context of theBratu problem

TheDA object represents a distributed structured (possibly staggered) grid andthus has more information than our serial grid If we augment our user context toinclude information about ghost values we can obtain all the grid information fromtheDA Figure 11 shows the creation of aDA when one initially know only the totalnumber of vertices in thex andy directions Alternatively we could have specifiedthe local number of vertices in each direction

Now we need only modify our PETSc residual routine to update the ghost valuesas shown in Fig 12

Notice that the ghost value scatter is a two-step operation This allows thecommunication to overlap with local computation that may be taking place whilethe messages are in transit The only change necessary for the residual calcu-lation itself is the correct declaration of the input arraydouble precision

12 Matthew G Knepley et al

type userctxDA da

The start end and number of vertices in the xminusdirectioninteger xsxexm

The start end and number of ghost vertices in the xminusdirectioninteger gxsgxegxm

The start end and number of vertices in the yminusdirectioninteger ysyeym

The start end and number of ghost vertices in the yminusdirectioninteger gysgyegym

The number of vertices in the xminus and yminusdirectionsinteger mxmy

The MPI rank of this processinteger rank

The coefficient in the Bratu equationdouble precision lambda

end type userctx

call DACreate2d(PETSCCOMM WORLDDA NONPERIODICDA STENCIL STAR ampamp usermxusermyPETSCDECIDEPETSCDECIDE11 ampamp PETSCNULL INTEGERPETSCNULL INTEGERuserdaierr)call DAGetCorners(userdauserxsuserysPETSCNULL INTEGER amp

amp userxmuserymPETSCNULL INTEGERierr)call DAGetGhostCorners(userdausergxsusergys amp

amp PETSCNULL INTEGERusergxmusergym ampamp PETSCNULL INTEGERierr)

Here we shift the starting indices up by one so that we can easily use the Fortran convention of1minusbased indices rather than0minusbased

userxs = userxs+1userys = userys+1usergxs = usergxs+1usergys = usergys+1userye = userys+userymminus1userxe = userxs+userxmminus1usergye = usergys+usergymminus1usergxe = usergxs+usergxmminus1

Fig 11Creation of a DA for the Bratu problem

Developing a Geodynamics Simulator with PETSc 13

subroutine FormFunctionPETSc(uFuserierr)implicit none

Vec uFinteger ierrtype (userctx) user

double precisionpointer lu v()lf v()Vec uLocal

call DAGetLocalVector(userdauLocalierr)call DAGlobalToLocalBegin(userdauINSERT VALUESuLocalierr)call DAGlobalToLocalEnd(userdauINSERT VALUESuLocalierr)

call VecGetArrayF90(uLocallu vierr)call VecGetArrayF90(Flf vierr)

Actually compute the local portion of the residualcall FormFunctionLocal(lu vlf vuserierr)

call VecRestoreArrayF90(uLocallu vierr)call VecRestoreArrayF90(Flf vierr)

call DARestoreLocalVector(userdauLocalierr)

Fig 12Parallel residual calculation for the Bratu problem using PETSc

u(usergxsusergxeusergysusergye) which now includes ghostvalues

The changes toFormJacobian() are similar resizing the input array andchanging slightly the calculation of row and column indices and can be found inthe PETSc example source We have now finished the initial introduction of PETSclinear algebra and the simulation is ready to run in parallel

32 Rapid Evolution

The user who is willing to raise the level of abstraction in a code can start by em-ploying theDA to describe the problem domain and discretization The data interfacemimics a multidimensional array and is thus ideal for finite difference finite volumeand low-order finite element schemes on a logically rectangular grid In fact after thedefinition of aFormFunction() routine the simulation is ready to run becausethe nonlinear solver provides approximate Jacobians automatically as discussed inSection 4

14 Matthew G Knepley et al

DASetLocalFunction(userda (DALocalFunction1) FormFunctionLocal)

Fig 13Using the DA to form a function

int FormFunctionLocal(DALocalInfo infodouble udouble fAppCtx user)double two = 20hxhyhxdhyhydhxscdouble uijuxxuyyint ij

hx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhyuserminusgtlambdahxdhy = hxhyhydhx = hyhx Compute function over the locally owned part of the grid for (j=infominusgtys jltinfominusgtys+infominusgtym j++)

for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++) if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

f[j][i] = u[j][i] else

uij = u[j][i]uxx = (twouij minus u[j][iminus1] minus u[j][i+1])hydhxuyy = (twouij minus u[jminus1][i] minus u[j+1][i])hxdhyf[j][i] = uxx + uyy minus scexp(uij)

Fig 14FormFunctionLocal() for the Bratu problem

We begin by examining the Bratu problem from the last section only this time inC We can now provide our local residual routine to theDA see Fig 13

The grid information is passed toFormFunctionLocal() in a DALocalInfo structure as shown in Fig 14

The fields are passed directly as multidimensional C arrays complete with ghostvalues An application scientist can use a boilerplate example such as the Bratu prob-lem merely altering the local physics calculation inFormFunctionLocal() torapidly obtain a parallel scalable simulation that runs on the desktop as well as onmassively parallel supercomputers

If the user has an expression for the Jacobian of the residual function then thismatrix can also be computed by using theDA interface with the call as in Fig 15and the corresponding routine for the local calculation in Fig 16

Developing a Geodynamics Simulator with PETSc 15

DASetLocalJacobian(userda (DALocalFunction1) FormJacobianLocal)

Fig 15Using the DA to form a Jacobian

int FormJacobianLocal(DALocalInfo infodouble xMat jacAppCtx user)MatStencil col[5]rowdouble lambdav[5]hxhyhxdhyhydhxscint ij

lambda = userminusgtparamhx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhylambdahxdhy = hxhy hydhx = hyhx

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

rowj = j rowi = i boundary points if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

v[0] = 10MatSetValuesStencil(jac1amprow1amprowvINSERT VALUES)

else interior grid points

v[0] = minushxdhy col[0]j = j minus 1 col[0]i = iv[1] = minushydhx col[1]j = j col[1]i = iminus1v[2] = 20(hydhx + hxdhy) minus scexp(x[j][i]) col[2]j = j col[2]i = iv[3] = minushydhx col[3]j = j col[3]i = i+1v[4] = minushxdhy col[4]j = j + 1 col[4]i = iMatSetValuesStencil(jac1amprow5colvINSERT VALUES)

MatAssemblyBegin(jacMAT FINAL ASSEMBLY)MatAssemblyEnd(jacMAT FINAL ASSEMBLY)MatSetOption(jacMAT NEW NONZERO LOCATION ERR)

Fig 16FormJacobianLocal() for the Bratu problem

16 Matthew G Knepley et al

int FormFunctionLocal(DALocalInfo infoField xField fvoid ptr)AppCtx user = (AppCtx)ptrParameter param = userminusgtparamGridInfo grid = userminusgtgriddouble mag w mag uint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1 i j

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

calculateXMomentumResidual(i j x f user)calculateZMomentumResidual(i j x f user)calculatePressureResidual(i j x f user)calculateTemperatureResidual(i j x f user)

Fig 17FormFunctionLocal() for the mantle subduction problem

We have so far presented simple examples involving a perturbed Laplacian butthis development strategy extends far beyond toy problems We have developedlarge-scale parallel geophysical simulations [1] that are producing new results inthe field We began by replacingFormFunctionLocal() in the Bratu examplewith one containing the linear physics of the previous sequential linear subductioncode (dropping the previous codersquos linear solve) Then the PETSc solver results werecompared for correctness with the complete prior subduction code Next the newcode was immediately run in parallel for correctness and efficiency tests Finallythe nonlinear terms from the more complicated physics of the mantle subductionproblem discussed in Section 1 and shown in Figs 17-21 and better quality dis-cretizations of the boundary conditions were added In fact since the structure oftheFormFunction() changed significantly the finalFormFunction() looksnothing like the original but the process of obtaining through the incremental ap-proach reduced the learning curve for PETSc and make correctness checking at thevarious stages much easier

The details of the subduction code are not as important as the recognition thata very complicated physics problem need not involve any more complication in oursimulation infrastructure To give more insight into the actual calculation we showin Fig 22 the explicit calculation of the residual from the continuity equation

Here we have used theDA for a multicomponent problem on a staggered meshwithout alteration because the structure of the discretization remains logically rect-angular

Although theDA manages communication during the solution process data post-processing often necessary for visualization or analysis also requires this function-ality For the mantle subduction simulation we want to calculate both the second in-

Developing a Geodynamics Simulator with PETSc 17

int calculateXMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj) f[j][i]u = x[j][i]u minus SlabVel(rsquoUrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]u = x[j][i]u minus 00

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else

f[j][i]u = XNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]u = XMomentumResidual(xijuser)

else in the mantle wedge f[j][i]u = XMomentumResidual(xijuser)

Fig 18X-momentum residual for the mantle subduction problem

variant of the strain rate tensor and the viscosity over the grid (at both cell centers andcorners) Using theDAGlobalToLocalBegin () andDAGlobalToLocalEnd() functions we transferred the global solution data to local ghosted vectors and pro-ceeded with the calculation We can also transfer data from local vectors to a globalvector usingDALocalToGlobal () The entire viscosity calculation is given inFig 23

Note that this framework allows the user to manage arbitrary fields over the do-main For instance a porous flow simulation might manage material properties ofthe medium in this fashion

4 Solvers

The SNES object in PETSc is an abstraction of the ldquoinverserdquo of a nonlinear oper-ator The user provides aFormFunction() routine as seen in Section 3 whichcomputes the action of the operator on an input vector Linear problems can also be

18 Matthew G Knepley et al

int calculateZMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1jlim = infominusgtmyminus1

if (ilt=j) f[j][i]w = x[j][i]w minus SlabVel(rsquoWrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]w = x[j][i]w minus 00

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else

f[j][i]w = ZNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]w = ZMomentumResidual(xijuser)

else in the mantle wedge f[j][i]w = ZMomentumResidual(xijuser)

Fig 19Z-momentum residual for the mantle subduction problem

int calculatePressureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj | | jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lid or slab f[j][i]p = x[j][i]p

else if ((i==ilim | | j==jlim ) ampamp paramminusgtibound==BC ANALYTIC ) on an analytic boundary f[j][i]p = x[j][i]p minus Pressure(ijuser)

else in the mantle wedge f[j][i]p = ContinuityResidual(xijuser)

Fig 20Pressure or continuity residual for the mantle subduction problem

Developing a Geodynamics Simulator with PETSc 19

int calculateTemperatureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (j==0) on the surface f[j][i]T = x[j][i]T + x[j+1][i]T + PetscMax(x[j][i]T00)

else if (i==0) slab inflow boundary f[j][i]T = x[j][i]T minus PlateModel(jPLATE SLABuser)

else if (i==ilim ) right side boundary mag u = 10 minus pow( (10minusPetscMax(PetscMin(x[j][iminus1]uparamminusgtcb10)00)) 50 )f[j][i]T = x[j][i]T minus mag ux[jminus1][iminus1]T minus (10minusmag u)PlateModel(jPLATE LID user)

else if (j==jlim ) bottom boundary mag w = 10 minus pow( (10minusPetscMax(PetscMin(x[jminus1][i]wparamminusgtsb10)00)) 50 )f[j][i]T = x[j][i]T minus mag wx[jminus1][iminus1]T minus (10minusmag w)

else in the mantle wedge f[j][i]T = EnergyResidual(xijuser)

Fig 21Temperature or energy residual for the mantle subduction problem

double precision ContinuityResidual(Field x int i int j AppCtx user)

GridInfo grid = userminusgtgriddouble precision uEuWwNwSdudxdwdz

uW = x[j][iminus1]u uE = x[j][i]u dudx = (uE minus uW)gridminusgtdxwS = x[jminus1][i]w wN = x[j][i]w dwdz = (wN minus wS)gridminusgtdzreturn dudx + dwdz

Fig 22Calculation of the continuity residual for the mantle subduction problem

solved by usingSNES with no loss of efficiency compared to using the underlyingKSP object (PETSc linear solver object) directly ThusSNES seems the appropri-ate framework for a general-purpose simulator providing the flexibility to add orsubtract nonlinear terms from the equation at will

TheSNES solver does not require a user to implement a Jacobian Default rou-tines are provided to compute a finite difference approximation using coloring toaccount for the sparsity of the matrix The user does not even need the PETScMatinterface but can initially interact only withVec objects making the transition to

PETSc nearly painless However these approximations to the Jacobian can have nu-merical difficulties and are not as efficient as direct evaluation Therefore the user

20 Matthew G Knepley et al

Compute both the second invariant of the strain rate tensor and the viscosity int ViscosityField(DA da Vec X Vec VAppCtx user)

Parameter param = userminusgtparamGridInfo grid = userminusgtgridVec localXField v xdouble eps dx dz T epsC TCint ijisjsimjmilim jlim ivt

ivt = paramminusgtiviscparamminusgtivisc = paramminusgtoutput ivisc

DACreateLocalVector(da amplocalX)DAGlobalToLocalBegin(da X INSERT VALUES localX)DAGlobalToLocalEnd(da X INSERT VALUES localX)DAVecGetArray(dalocalX(void)ampx)DAVecGetArray(daV(void)ampv)

Parameters dx = gridminusgtdx dz = gridminusgtdzilim = gridminusgtniminus1 jlim = gridminusgtnjminus1

Compute real temperature strain rate and viscosity DAGetCorners(daampisampjsPETSCNULL ampimampjmPETSCNULL )for (j=js jltjs+jm j++)

for (i=is iltis+im i++) T = paramminusgtpotentialT x[j][i]T exp( (jminus05)dzparamminusgtz scale )if (iltilim ampamp jltjlim )

TC = paramminusgtpotentialT TInterp(xij) exp( jdzparamminusgtz scale ) else

TC = T Compute the values at both cell centers and cell corners eps = CalcSecInv(xijCELL CENTERuser)epsC = CalcSecInv(xijCELL CORNERuser)v[j][i]u = epsv[j][i]w = epsCv[j][i]p = Viscosity(Tepsdz(jminus05)param)v[j][i]T = Viscosity(TCepsCdzjparam)

DAVecRestoreArray(daV(void)ampv)DAVecRestoreArray(dalocalX(void)ampx)paramminusgtivisc = ivt

Fig 23 Calculation of the second invariant of the strain tensor and viscosity field for themantle subduction problem

Developing a Geodynamics Simulator with PETSc 21

KSP ksp subkspPC bjacobi ilu

SNESGetKSP(snes ampksp)KSPGetPC(ksp ampbjacobi)PCBJacobiGetSubKSP(bjacobi PETSCNULL PETSCNULL ampsubksp)KSPGetPC(subksp[0] ampilu)PCILUSetLevels(ilu levels)

Fig 24Customizing the preconditioner on a single process

has the option of providing a routine or of using the ADIC [15] or similar system forautomatic differentiation The finite difference approximation can be activated byusing the-snes fd and-mat fd coloring freq options or by providing theSNESDefaultComputeJacobianColor () function toSNESSetJacobian()

Through theSNES object the user may also access the great range of directand iterative linear solvers and preconditioners provided by PETSc Sixteen differentKrylov solvers are available including GMRES BiCGStab and LSQR along withinterfaces to popular sparse direct packages such as MUMPS [4] and SuperLU [9]In addition to a variety of incomplete factorization preconditioners including PI-LUT [16] PETSc supports additive Schwartz preconditioning algebraic multigridthrough BoomerAMG [13] and the geometric multigrid discussed below The fullpanoply of solvers and preconditioners available is catalogued on the PETSc Web-site [3]

The modularity of PETSc allows users to easily customize each solver For in-stance suppose the user wishes to increase the number of levels in ILU(k) precon-ditioning on one block of a block-Jacobi scheme The code fragment in Fig 24 willset the number of levels of fill tolevels

PETSc provides an elegant framework for managing geometric multigrid in com-bination with aDA which is abstracted in theDMMGobject In the same way thatany linear solve can be performed withSNES any preconditioner or solver com-bination is available inDMMGby using a single level That is the same user codesupports both geometric multigrid as well as direct methods Krylov methods etcWhen creating aDMMG the user specifies the number of levels of grid refinementand provides the coarse-grid information (see Fig 25) which is always aDA objectat present although unstructured prototypes are being developed Then theDMMGobject calculates the intergrid transfer operators prolongation and restriction and

allocates objects to hold the solution at each level The user must provide the actionof the residual operatorF and can optionally provide a routine to compute the Jaco-bian matrix (which will be called on each level) and a routine to compute the initialguess as in Fig 26

With this information the user can now solve the equation and retrieve the solu-tion vector as in Fig 27

22 Matthew G Knepley et al

DMMGCreate(comm gridmglevels user ampdmmg)DMMGSetDM(dmmg (DM) da)

Fig 25Creating a DMMG

DMMGSetSNESLocal(dmmg FormFunctionLocal FormJacobianLocal 0 0)DMMGSetInitialGuess(dmmg FormInitialGuess)

Fig 26Providing discretization and assembly routines to DMMG

DMMGSolve(dmmg)soln = DMMGGetx(dmmg)

Fig 27Solving the problem with DMMG

5 Extensions

PETSc is not a complete environment for simulating physical phenomena Ratherit is a set of tools that allow the user to assemble such an environment tailored toa specific application As such PETSc will never provide every facility appropriatefor a given simulation However the user can easily extend PETSc and supplementits capabilities We present two examples in the context of the mantle subductionsimulation

51 Simple Continuation

Because of the highly nonlinear dependence of viscosity on temperature and strainrate in equation (1) the iteration in Newtonrsquos method can fail to converge withouta good starting guess of the solution On the other hand for the isoviscous casewhere temperature is coupled to velocity only through advection a solution is easilyreached It is therefore natural to propose a continuation scheme in the viscosity be-ginning with constant viscosity and progressing to full variability A simple adaptivecontinuation model was devised in which the viscosity was raised to a power be-tween zero and oneη rarr ηα whereα=0 corresponds to constant viscosity andα=1corresponds to natural viscosity variation In the continuation loopα was increasedtowards unity by an amount dependent on the rate of convergence of the previousSNES solve

PETSc itself provides no special support for continuation but it is sufficientlymodular that a continuation loop was readily constructed in the application codeusing repeated calls toDMMGSolve() and found to work quite well

Developing a Geodynamics Simulator with PETSc 23

52 Simple Steering

No rigorous justification had been given for the continuation strategy above andconsequently it was possible that for certain initial conditions Newton would fail toconverge thereby resulting in extremely long run times An observant user could de-tect this situation and abort the run but all the potentially useful solution informationup to that point would be lost A clean way to asynchronously abort the continua-tion loop was thus needed On architectures where OS signals are available PETScprovides an interface for registering signal handlers Thus we were able to define ahandler The user wishing to change the control flow of the simulation simply sendsthe appropriate signal to the process This sets a flag in the user data that causes thechange at the next iteration

6 Simulation Results

A long-standing debate has existed between theorists and observationalists over thethermal structure of subduction zones Modelers using constant viscosity flow sim-ulations to compute thermal structure have predicted relatively cold subductionzones [19] Conversely observationalists who use heat flow measurements and min-eralogical thermobarometry to estimate temperatures at depth have long claimedthat subduction zones are hotter than predicted They have invoked the presence ofstrong upwelling of hot mantle material to supply the heat This upwelling was neverpredicted by isoviscous flow models however

The debate remained unresolved until recently when several groups includingour own succeeded in developing simulations with realistically variable viscosity(for other examples [810121821]) A comparison of flow and temperature fieldsfor variable and constant viscosity simulations generated with our code is shownin Fig 28 Results from these simulations are exciting because they close the gapbetween models and observations they predict hotter mantle temperatures steepersurface thermal gradients and upwelling mantle flow

Furthermore these simulations allow for quantitative predictions of variation ofobservables with subduction parameters A recent study of subduction zone earth-quakes has identified an intriguing trend the vertical distance from subduction zonevolcanoes to the surface of the subducting slab is anti-correlated with descent rate ofthe slab [11] Preliminary calculations with our model are consistent with this trendindicating that flow and thermal structure may play an important role in determiningnot only the quantity and chemistry of magmas but also their path of transport to thesurface Further work is required to resolve this issue

Simulating the geodynamics of subduction is an example of our success in port-ing an existing code into the PETSc framework simultaneously parallelizing thecode and increasing its functionality Using this code as a template we rapidly de-veloped a related simulation of another tectonic boundary the mid-ocean ridge Thiswork was done to address a set of observations of systematic morphological asym-metry in the global mid-ocean ridge system [7] Our model confirms the qualitative

24 Matthew G Knepley et al

Dep

th k

m

log10

η and flow field

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24 Temperature degC

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Distance km

Dep

th k

m

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24

Distance km

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Isov

isco

usV

aria

ble

Vis

cosi

ty

(a) (b)

(c) (d)

Fig 282D viscosity and potential temperature fields from simulations on 8 processors with230112 degrees of freedom Panels in the top row are from a simulation withα=1 in equation(1) Panels in bottom row haveα=0 The white box in panels (a) and (c) shows the region inwhich temperature is plotted in panels (b) and (d)(a) Colors showlog10 of the viscosity fieldNote that there are more than five orders of magnitude variation in viscosity Arrows show theflow direction and magnitude (the slab is being subducted at a rate of 6 cmyear) Upwellingis evident in the flow field near the wedge corner(b) Temperature field from the variableviscosity simulation 1100C isotherm is shown as a dashed line(c) (Constant) viscosity andflow field from the isoviscous simulation Strong flow is predicted at the base of the crustdespite the low-temperature rock there No upwelling flow is predicted(d) Temperature fieldfrom isoviscous simulation Note that the mantle wedge corner is much colder than in (b)

mechanism that had been proposed to explain these observations Furthermore itshows a quantitative agreement with trends in the observed data [17] As with ourmodels of subduction the key to demonstrating the validity of the hypothesized dy-namics was simulating them with the strongly nonlinear rheology that characterizesmantle rock on geologic timescales The computational tools provided by PETSc en-abled us easily handle this rheology reducing development time and allowing us tofocus on model interpretation

References

1 PETSc SNES Example 30 httpwwwmcsanlgovpetscpetsc-2

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 10: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

10 Matthew G Knepley et al

subroutine FormJacobian(ujacuserierr)use f90moduletype (userctx) userMat jacdouble precision x(userxsuserxeuserysuserye)double precision twoonehxhyhxdhyhydhxscv(5)integer rowcol(5)ijierr

one = 10two = 20hx = onedble(usermxminus1)hy = onedble(usermyminus1)sc = hxhyuserlambdahxdhy = hxhyhydhx = hyhxdo 20 j=userysuserye

row = (j minus userys)userxm minus 1do 10 i=userxsuserxe

row = row + 1 boundary points

if (i == 1 or j == 1 or i == usermx or j == usermy) thencol(1) = rowv(1) = onecall MatSetValues(jac1row1colvINSERT VALUESierr)

interior grid pointselse

v(1) = minushxdhyv(2) = minushydhxv(3) = two(hydhx + hxdhy) minus scexp(x(ij))v(4) = minushydhxv(5) = minushxdhycol(1) = row minus userxmcol(2) = row minus 1col(3) = rowcol(4) = row + 1col(5) = row + userxmcall MatSetValues(jac1row5colvINSERT VALUESierr)

endif10 continue20 continue

Fig 9Jacobian calculation for the Bratu problem using PETSc

Developing a Geodynamics Simulator with PETSc 11

we must have access to some values not present locally on the process as shown inFig 10

Box-type stencil Star-type stencil

Proc 6

Proc 0 Proc 0Proc 1 Proc 1

Proc 6

Fig 10Star and box stencils using a DA

We must store values for theseghostpoints locally so that they may be used forthe computation In addition we must ensure that these values are identical to the cor-responding values on the neighboring processes At the lowest level PETSc providesghosted vectors created usingVecCreateGhost () with storage for some valuesowned by other processes These values are explicitly indicated during constructionTheVecGhostUpdateBegin () andVecGhostUpdateEnd () functions trans-fer values between ghost storage and the corresponding storage on other processesThis method however can become complicated for the user Therefore for the com-mon case of a logically rectangular grid PETSc provides theDA object to managethe determination allocation and coherence of ghost values TheDA object is dis-cussed fully in Section 32 but we give here a short introduction in the context of theBratu problem

TheDA object represents a distributed structured (possibly staggered) grid andthus has more information than our serial grid If we augment our user context toinclude information about ghost values we can obtain all the grid information fromtheDA Figure 11 shows the creation of aDA when one initially know only the totalnumber of vertices in thex andy directions Alternatively we could have specifiedthe local number of vertices in each direction

Now we need only modify our PETSc residual routine to update the ghost valuesas shown in Fig 12

Notice that the ghost value scatter is a two-step operation This allows thecommunication to overlap with local computation that may be taking place whilethe messages are in transit The only change necessary for the residual calcu-lation itself is the correct declaration of the input arraydouble precision

12 Matthew G Knepley et al

type userctxDA da

The start end and number of vertices in the xminusdirectioninteger xsxexm

The start end and number of ghost vertices in the xminusdirectioninteger gxsgxegxm

The start end and number of vertices in the yminusdirectioninteger ysyeym

The start end and number of ghost vertices in the yminusdirectioninteger gysgyegym

The number of vertices in the xminus and yminusdirectionsinteger mxmy

The MPI rank of this processinteger rank

The coefficient in the Bratu equationdouble precision lambda

end type userctx

call DACreate2d(PETSCCOMM WORLDDA NONPERIODICDA STENCIL STAR ampamp usermxusermyPETSCDECIDEPETSCDECIDE11 ampamp PETSCNULL INTEGERPETSCNULL INTEGERuserdaierr)call DAGetCorners(userdauserxsuserysPETSCNULL INTEGER amp

amp userxmuserymPETSCNULL INTEGERierr)call DAGetGhostCorners(userdausergxsusergys amp

amp PETSCNULL INTEGERusergxmusergym ampamp PETSCNULL INTEGERierr)

Here we shift the starting indices up by one so that we can easily use the Fortran convention of1minusbased indices rather than0minusbased

userxs = userxs+1userys = userys+1usergxs = usergxs+1usergys = usergys+1userye = userys+userymminus1userxe = userxs+userxmminus1usergye = usergys+usergymminus1usergxe = usergxs+usergxmminus1

Fig 11Creation of a DA for the Bratu problem

Developing a Geodynamics Simulator with PETSc 13

subroutine FormFunctionPETSc(uFuserierr)implicit none

Vec uFinteger ierrtype (userctx) user

double precisionpointer lu v()lf v()Vec uLocal

call DAGetLocalVector(userdauLocalierr)call DAGlobalToLocalBegin(userdauINSERT VALUESuLocalierr)call DAGlobalToLocalEnd(userdauINSERT VALUESuLocalierr)

call VecGetArrayF90(uLocallu vierr)call VecGetArrayF90(Flf vierr)

Actually compute the local portion of the residualcall FormFunctionLocal(lu vlf vuserierr)

call VecRestoreArrayF90(uLocallu vierr)call VecRestoreArrayF90(Flf vierr)

call DARestoreLocalVector(userdauLocalierr)

Fig 12Parallel residual calculation for the Bratu problem using PETSc

u(usergxsusergxeusergysusergye) which now includes ghostvalues

The changes toFormJacobian() are similar resizing the input array andchanging slightly the calculation of row and column indices and can be found inthe PETSc example source We have now finished the initial introduction of PETSclinear algebra and the simulation is ready to run in parallel

32 Rapid Evolution

The user who is willing to raise the level of abstraction in a code can start by em-ploying theDA to describe the problem domain and discretization The data interfacemimics a multidimensional array and is thus ideal for finite difference finite volumeand low-order finite element schemes on a logically rectangular grid In fact after thedefinition of aFormFunction() routine the simulation is ready to run becausethe nonlinear solver provides approximate Jacobians automatically as discussed inSection 4

14 Matthew G Knepley et al

DASetLocalFunction(userda (DALocalFunction1) FormFunctionLocal)

Fig 13Using the DA to form a function

int FormFunctionLocal(DALocalInfo infodouble udouble fAppCtx user)double two = 20hxhyhxdhyhydhxscdouble uijuxxuyyint ij

hx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhyuserminusgtlambdahxdhy = hxhyhydhx = hyhx Compute function over the locally owned part of the grid for (j=infominusgtys jltinfominusgtys+infominusgtym j++)

for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++) if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

f[j][i] = u[j][i] else

uij = u[j][i]uxx = (twouij minus u[j][iminus1] minus u[j][i+1])hydhxuyy = (twouij minus u[jminus1][i] minus u[j+1][i])hxdhyf[j][i] = uxx + uyy minus scexp(uij)

Fig 14FormFunctionLocal() for the Bratu problem

We begin by examining the Bratu problem from the last section only this time inC We can now provide our local residual routine to theDA see Fig 13

The grid information is passed toFormFunctionLocal() in a DALocalInfo structure as shown in Fig 14

The fields are passed directly as multidimensional C arrays complete with ghostvalues An application scientist can use a boilerplate example such as the Bratu prob-lem merely altering the local physics calculation inFormFunctionLocal() torapidly obtain a parallel scalable simulation that runs on the desktop as well as onmassively parallel supercomputers

If the user has an expression for the Jacobian of the residual function then thismatrix can also be computed by using theDA interface with the call as in Fig 15and the corresponding routine for the local calculation in Fig 16

Developing a Geodynamics Simulator with PETSc 15

DASetLocalJacobian(userda (DALocalFunction1) FormJacobianLocal)

Fig 15Using the DA to form a Jacobian

int FormJacobianLocal(DALocalInfo infodouble xMat jacAppCtx user)MatStencil col[5]rowdouble lambdav[5]hxhyhxdhyhydhxscint ij

lambda = userminusgtparamhx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhylambdahxdhy = hxhy hydhx = hyhx

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

rowj = j rowi = i boundary points if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

v[0] = 10MatSetValuesStencil(jac1amprow1amprowvINSERT VALUES)

else interior grid points

v[0] = minushxdhy col[0]j = j minus 1 col[0]i = iv[1] = minushydhx col[1]j = j col[1]i = iminus1v[2] = 20(hydhx + hxdhy) minus scexp(x[j][i]) col[2]j = j col[2]i = iv[3] = minushydhx col[3]j = j col[3]i = i+1v[4] = minushxdhy col[4]j = j + 1 col[4]i = iMatSetValuesStencil(jac1amprow5colvINSERT VALUES)

MatAssemblyBegin(jacMAT FINAL ASSEMBLY)MatAssemblyEnd(jacMAT FINAL ASSEMBLY)MatSetOption(jacMAT NEW NONZERO LOCATION ERR)

Fig 16FormJacobianLocal() for the Bratu problem

16 Matthew G Knepley et al

int FormFunctionLocal(DALocalInfo infoField xField fvoid ptr)AppCtx user = (AppCtx)ptrParameter param = userminusgtparamGridInfo grid = userminusgtgriddouble mag w mag uint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1 i j

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

calculateXMomentumResidual(i j x f user)calculateZMomentumResidual(i j x f user)calculatePressureResidual(i j x f user)calculateTemperatureResidual(i j x f user)

Fig 17FormFunctionLocal() for the mantle subduction problem

We have so far presented simple examples involving a perturbed Laplacian butthis development strategy extends far beyond toy problems We have developedlarge-scale parallel geophysical simulations [1] that are producing new results inthe field We began by replacingFormFunctionLocal() in the Bratu examplewith one containing the linear physics of the previous sequential linear subductioncode (dropping the previous codersquos linear solve) Then the PETSc solver results werecompared for correctness with the complete prior subduction code Next the newcode was immediately run in parallel for correctness and efficiency tests Finallythe nonlinear terms from the more complicated physics of the mantle subductionproblem discussed in Section 1 and shown in Figs 17-21 and better quality dis-cretizations of the boundary conditions were added In fact since the structure oftheFormFunction() changed significantly the finalFormFunction() looksnothing like the original but the process of obtaining through the incremental ap-proach reduced the learning curve for PETSc and make correctness checking at thevarious stages much easier

The details of the subduction code are not as important as the recognition thata very complicated physics problem need not involve any more complication in oursimulation infrastructure To give more insight into the actual calculation we showin Fig 22 the explicit calculation of the residual from the continuity equation

Here we have used theDA for a multicomponent problem on a staggered meshwithout alteration because the structure of the discretization remains logically rect-angular

Although theDA manages communication during the solution process data post-processing often necessary for visualization or analysis also requires this function-ality For the mantle subduction simulation we want to calculate both the second in-

Developing a Geodynamics Simulator with PETSc 17

int calculateXMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj) f[j][i]u = x[j][i]u minus SlabVel(rsquoUrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]u = x[j][i]u minus 00

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else

f[j][i]u = XNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]u = XMomentumResidual(xijuser)

else in the mantle wedge f[j][i]u = XMomentumResidual(xijuser)

Fig 18X-momentum residual for the mantle subduction problem

variant of the strain rate tensor and the viscosity over the grid (at both cell centers andcorners) Using theDAGlobalToLocalBegin () andDAGlobalToLocalEnd() functions we transferred the global solution data to local ghosted vectors and pro-ceeded with the calculation We can also transfer data from local vectors to a globalvector usingDALocalToGlobal () The entire viscosity calculation is given inFig 23

Note that this framework allows the user to manage arbitrary fields over the do-main For instance a porous flow simulation might manage material properties ofthe medium in this fashion

4 Solvers

The SNES object in PETSc is an abstraction of the ldquoinverserdquo of a nonlinear oper-ator The user provides aFormFunction() routine as seen in Section 3 whichcomputes the action of the operator on an input vector Linear problems can also be

18 Matthew G Knepley et al

int calculateZMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1jlim = infominusgtmyminus1

if (ilt=j) f[j][i]w = x[j][i]w minus SlabVel(rsquoWrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]w = x[j][i]w minus 00

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else

f[j][i]w = ZNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]w = ZMomentumResidual(xijuser)

else in the mantle wedge f[j][i]w = ZMomentumResidual(xijuser)

Fig 19Z-momentum residual for the mantle subduction problem

int calculatePressureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj | | jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lid or slab f[j][i]p = x[j][i]p

else if ((i==ilim | | j==jlim ) ampamp paramminusgtibound==BC ANALYTIC ) on an analytic boundary f[j][i]p = x[j][i]p minus Pressure(ijuser)

else in the mantle wedge f[j][i]p = ContinuityResidual(xijuser)

Fig 20Pressure or continuity residual for the mantle subduction problem

Developing a Geodynamics Simulator with PETSc 19

int calculateTemperatureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (j==0) on the surface f[j][i]T = x[j][i]T + x[j+1][i]T + PetscMax(x[j][i]T00)

else if (i==0) slab inflow boundary f[j][i]T = x[j][i]T minus PlateModel(jPLATE SLABuser)

else if (i==ilim ) right side boundary mag u = 10 minus pow( (10minusPetscMax(PetscMin(x[j][iminus1]uparamminusgtcb10)00)) 50 )f[j][i]T = x[j][i]T minus mag ux[jminus1][iminus1]T minus (10minusmag u)PlateModel(jPLATE LID user)

else if (j==jlim ) bottom boundary mag w = 10 minus pow( (10minusPetscMax(PetscMin(x[jminus1][i]wparamminusgtsb10)00)) 50 )f[j][i]T = x[j][i]T minus mag wx[jminus1][iminus1]T minus (10minusmag w)

else in the mantle wedge f[j][i]T = EnergyResidual(xijuser)

Fig 21Temperature or energy residual for the mantle subduction problem

double precision ContinuityResidual(Field x int i int j AppCtx user)

GridInfo grid = userminusgtgriddouble precision uEuWwNwSdudxdwdz

uW = x[j][iminus1]u uE = x[j][i]u dudx = (uE minus uW)gridminusgtdxwS = x[jminus1][i]w wN = x[j][i]w dwdz = (wN minus wS)gridminusgtdzreturn dudx + dwdz

Fig 22Calculation of the continuity residual for the mantle subduction problem

solved by usingSNES with no loss of efficiency compared to using the underlyingKSP object (PETSc linear solver object) directly ThusSNES seems the appropri-ate framework for a general-purpose simulator providing the flexibility to add orsubtract nonlinear terms from the equation at will

TheSNES solver does not require a user to implement a Jacobian Default rou-tines are provided to compute a finite difference approximation using coloring toaccount for the sparsity of the matrix The user does not even need the PETScMatinterface but can initially interact only withVec objects making the transition to

PETSc nearly painless However these approximations to the Jacobian can have nu-merical difficulties and are not as efficient as direct evaluation Therefore the user

20 Matthew G Knepley et al

Compute both the second invariant of the strain rate tensor and the viscosity int ViscosityField(DA da Vec X Vec VAppCtx user)

Parameter param = userminusgtparamGridInfo grid = userminusgtgridVec localXField v xdouble eps dx dz T epsC TCint ijisjsimjmilim jlim ivt

ivt = paramminusgtiviscparamminusgtivisc = paramminusgtoutput ivisc

DACreateLocalVector(da amplocalX)DAGlobalToLocalBegin(da X INSERT VALUES localX)DAGlobalToLocalEnd(da X INSERT VALUES localX)DAVecGetArray(dalocalX(void)ampx)DAVecGetArray(daV(void)ampv)

Parameters dx = gridminusgtdx dz = gridminusgtdzilim = gridminusgtniminus1 jlim = gridminusgtnjminus1

Compute real temperature strain rate and viscosity DAGetCorners(daampisampjsPETSCNULL ampimampjmPETSCNULL )for (j=js jltjs+jm j++)

for (i=is iltis+im i++) T = paramminusgtpotentialT x[j][i]T exp( (jminus05)dzparamminusgtz scale )if (iltilim ampamp jltjlim )

TC = paramminusgtpotentialT TInterp(xij) exp( jdzparamminusgtz scale ) else

TC = T Compute the values at both cell centers and cell corners eps = CalcSecInv(xijCELL CENTERuser)epsC = CalcSecInv(xijCELL CORNERuser)v[j][i]u = epsv[j][i]w = epsCv[j][i]p = Viscosity(Tepsdz(jminus05)param)v[j][i]T = Viscosity(TCepsCdzjparam)

DAVecRestoreArray(daV(void)ampv)DAVecRestoreArray(dalocalX(void)ampx)paramminusgtivisc = ivt

Fig 23 Calculation of the second invariant of the strain tensor and viscosity field for themantle subduction problem

Developing a Geodynamics Simulator with PETSc 21

KSP ksp subkspPC bjacobi ilu

SNESGetKSP(snes ampksp)KSPGetPC(ksp ampbjacobi)PCBJacobiGetSubKSP(bjacobi PETSCNULL PETSCNULL ampsubksp)KSPGetPC(subksp[0] ampilu)PCILUSetLevels(ilu levels)

Fig 24Customizing the preconditioner on a single process

has the option of providing a routine or of using the ADIC [15] or similar system forautomatic differentiation The finite difference approximation can be activated byusing the-snes fd and-mat fd coloring freq options or by providing theSNESDefaultComputeJacobianColor () function toSNESSetJacobian()

Through theSNES object the user may also access the great range of directand iterative linear solvers and preconditioners provided by PETSc Sixteen differentKrylov solvers are available including GMRES BiCGStab and LSQR along withinterfaces to popular sparse direct packages such as MUMPS [4] and SuperLU [9]In addition to a variety of incomplete factorization preconditioners including PI-LUT [16] PETSc supports additive Schwartz preconditioning algebraic multigridthrough BoomerAMG [13] and the geometric multigrid discussed below The fullpanoply of solvers and preconditioners available is catalogued on the PETSc Web-site [3]

The modularity of PETSc allows users to easily customize each solver For in-stance suppose the user wishes to increase the number of levels in ILU(k) precon-ditioning on one block of a block-Jacobi scheme The code fragment in Fig 24 willset the number of levels of fill tolevels

PETSc provides an elegant framework for managing geometric multigrid in com-bination with aDA which is abstracted in theDMMGobject In the same way thatany linear solve can be performed withSNES any preconditioner or solver com-bination is available inDMMGby using a single level That is the same user codesupports both geometric multigrid as well as direct methods Krylov methods etcWhen creating aDMMG the user specifies the number of levels of grid refinementand provides the coarse-grid information (see Fig 25) which is always aDA objectat present although unstructured prototypes are being developed Then theDMMGobject calculates the intergrid transfer operators prolongation and restriction and

allocates objects to hold the solution at each level The user must provide the actionof the residual operatorF and can optionally provide a routine to compute the Jaco-bian matrix (which will be called on each level) and a routine to compute the initialguess as in Fig 26

With this information the user can now solve the equation and retrieve the solu-tion vector as in Fig 27

22 Matthew G Knepley et al

DMMGCreate(comm gridmglevels user ampdmmg)DMMGSetDM(dmmg (DM) da)

Fig 25Creating a DMMG

DMMGSetSNESLocal(dmmg FormFunctionLocal FormJacobianLocal 0 0)DMMGSetInitialGuess(dmmg FormInitialGuess)

Fig 26Providing discretization and assembly routines to DMMG

DMMGSolve(dmmg)soln = DMMGGetx(dmmg)

Fig 27Solving the problem with DMMG

5 Extensions

PETSc is not a complete environment for simulating physical phenomena Ratherit is a set of tools that allow the user to assemble such an environment tailored toa specific application As such PETSc will never provide every facility appropriatefor a given simulation However the user can easily extend PETSc and supplementits capabilities We present two examples in the context of the mantle subductionsimulation

51 Simple Continuation

Because of the highly nonlinear dependence of viscosity on temperature and strainrate in equation (1) the iteration in Newtonrsquos method can fail to converge withouta good starting guess of the solution On the other hand for the isoviscous casewhere temperature is coupled to velocity only through advection a solution is easilyreached It is therefore natural to propose a continuation scheme in the viscosity be-ginning with constant viscosity and progressing to full variability A simple adaptivecontinuation model was devised in which the viscosity was raised to a power be-tween zero and oneη rarr ηα whereα=0 corresponds to constant viscosity andα=1corresponds to natural viscosity variation In the continuation loopα was increasedtowards unity by an amount dependent on the rate of convergence of the previousSNES solve

PETSc itself provides no special support for continuation but it is sufficientlymodular that a continuation loop was readily constructed in the application codeusing repeated calls toDMMGSolve() and found to work quite well

Developing a Geodynamics Simulator with PETSc 23

52 Simple Steering

No rigorous justification had been given for the continuation strategy above andconsequently it was possible that for certain initial conditions Newton would fail toconverge thereby resulting in extremely long run times An observant user could de-tect this situation and abort the run but all the potentially useful solution informationup to that point would be lost A clean way to asynchronously abort the continua-tion loop was thus needed On architectures where OS signals are available PETScprovides an interface for registering signal handlers Thus we were able to define ahandler The user wishing to change the control flow of the simulation simply sendsthe appropriate signal to the process This sets a flag in the user data that causes thechange at the next iteration

6 Simulation Results

A long-standing debate has existed between theorists and observationalists over thethermal structure of subduction zones Modelers using constant viscosity flow sim-ulations to compute thermal structure have predicted relatively cold subductionzones [19] Conversely observationalists who use heat flow measurements and min-eralogical thermobarometry to estimate temperatures at depth have long claimedthat subduction zones are hotter than predicted They have invoked the presence ofstrong upwelling of hot mantle material to supply the heat This upwelling was neverpredicted by isoviscous flow models however

The debate remained unresolved until recently when several groups includingour own succeeded in developing simulations with realistically variable viscosity(for other examples [810121821]) A comparison of flow and temperature fieldsfor variable and constant viscosity simulations generated with our code is shownin Fig 28 Results from these simulations are exciting because they close the gapbetween models and observations they predict hotter mantle temperatures steepersurface thermal gradients and upwelling mantle flow

Furthermore these simulations allow for quantitative predictions of variation ofobservables with subduction parameters A recent study of subduction zone earth-quakes has identified an intriguing trend the vertical distance from subduction zonevolcanoes to the surface of the subducting slab is anti-correlated with descent rate ofthe slab [11] Preliminary calculations with our model are consistent with this trendindicating that flow and thermal structure may play an important role in determiningnot only the quantity and chemistry of magmas but also their path of transport to thesurface Further work is required to resolve this issue

Simulating the geodynamics of subduction is an example of our success in port-ing an existing code into the PETSc framework simultaneously parallelizing thecode and increasing its functionality Using this code as a template we rapidly de-veloped a related simulation of another tectonic boundary the mid-ocean ridge Thiswork was done to address a set of observations of systematic morphological asym-metry in the global mid-ocean ridge system [7] Our model confirms the qualitative

24 Matthew G Knepley et al

Dep

th k

m

log10

η and flow field

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24 Temperature degC

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Distance km

Dep

th k

m

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24

Distance km

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Isov

isco

usV

aria

ble

Vis

cosi

ty

(a) (b)

(c) (d)

Fig 282D viscosity and potential temperature fields from simulations on 8 processors with230112 degrees of freedom Panels in the top row are from a simulation withα=1 in equation(1) Panels in bottom row haveα=0 The white box in panels (a) and (c) shows the region inwhich temperature is plotted in panels (b) and (d)(a) Colors showlog10 of the viscosity fieldNote that there are more than five orders of magnitude variation in viscosity Arrows show theflow direction and magnitude (the slab is being subducted at a rate of 6 cmyear) Upwellingis evident in the flow field near the wedge corner(b) Temperature field from the variableviscosity simulation 1100C isotherm is shown as a dashed line(c) (Constant) viscosity andflow field from the isoviscous simulation Strong flow is predicted at the base of the crustdespite the low-temperature rock there No upwelling flow is predicted(d) Temperature fieldfrom isoviscous simulation Note that the mantle wedge corner is much colder than in (b)

mechanism that had been proposed to explain these observations Furthermore itshows a quantitative agreement with trends in the observed data [17] As with ourmodels of subduction the key to demonstrating the validity of the hypothesized dy-namics was simulating them with the strongly nonlinear rheology that characterizesmantle rock on geologic timescales The computational tools provided by PETSc en-abled us easily handle this rheology reducing development time and allowing us tofocus on model interpretation

References

1 PETSc SNES Example 30 httpwwwmcsanlgovpetscpetsc-2

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 11: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

Developing a Geodynamics Simulator with PETSc 11

we must have access to some values not present locally on the process as shown inFig 10

Box-type stencil Star-type stencil

Proc 6

Proc 0 Proc 0Proc 1 Proc 1

Proc 6

Fig 10Star and box stencils using a DA

We must store values for theseghostpoints locally so that they may be used forthe computation In addition we must ensure that these values are identical to the cor-responding values on the neighboring processes At the lowest level PETSc providesghosted vectors created usingVecCreateGhost () with storage for some valuesowned by other processes These values are explicitly indicated during constructionTheVecGhostUpdateBegin () andVecGhostUpdateEnd () functions trans-fer values between ghost storage and the corresponding storage on other processesThis method however can become complicated for the user Therefore for the com-mon case of a logically rectangular grid PETSc provides theDA object to managethe determination allocation and coherence of ghost values TheDA object is dis-cussed fully in Section 32 but we give here a short introduction in the context of theBratu problem

TheDA object represents a distributed structured (possibly staggered) grid andthus has more information than our serial grid If we augment our user context toinclude information about ghost values we can obtain all the grid information fromtheDA Figure 11 shows the creation of aDA when one initially know only the totalnumber of vertices in thex andy directions Alternatively we could have specifiedthe local number of vertices in each direction

Now we need only modify our PETSc residual routine to update the ghost valuesas shown in Fig 12

Notice that the ghost value scatter is a two-step operation This allows thecommunication to overlap with local computation that may be taking place whilethe messages are in transit The only change necessary for the residual calcu-lation itself is the correct declaration of the input arraydouble precision

12 Matthew G Knepley et al

type userctxDA da

The start end and number of vertices in the xminusdirectioninteger xsxexm

The start end and number of ghost vertices in the xminusdirectioninteger gxsgxegxm

The start end and number of vertices in the yminusdirectioninteger ysyeym

The start end and number of ghost vertices in the yminusdirectioninteger gysgyegym

The number of vertices in the xminus and yminusdirectionsinteger mxmy

The MPI rank of this processinteger rank

The coefficient in the Bratu equationdouble precision lambda

end type userctx

call DACreate2d(PETSCCOMM WORLDDA NONPERIODICDA STENCIL STAR ampamp usermxusermyPETSCDECIDEPETSCDECIDE11 ampamp PETSCNULL INTEGERPETSCNULL INTEGERuserdaierr)call DAGetCorners(userdauserxsuserysPETSCNULL INTEGER amp

amp userxmuserymPETSCNULL INTEGERierr)call DAGetGhostCorners(userdausergxsusergys amp

amp PETSCNULL INTEGERusergxmusergym ampamp PETSCNULL INTEGERierr)

Here we shift the starting indices up by one so that we can easily use the Fortran convention of1minusbased indices rather than0minusbased

userxs = userxs+1userys = userys+1usergxs = usergxs+1usergys = usergys+1userye = userys+userymminus1userxe = userxs+userxmminus1usergye = usergys+usergymminus1usergxe = usergxs+usergxmminus1

Fig 11Creation of a DA for the Bratu problem

Developing a Geodynamics Simulator with PETSc 13

subroutine FormFunctionPETSc(uFuserierr)implicit none

Vec uFinteger ierrtype (userctx) user

double precisionpointer lu v()lf v()Vec uLocal

call DAGetLocalVector(userdauLocalierr)call DAGlobalToLocalBegin(userdauINSERT VALUESuLocalierr)call DAGlobalToLocalEnd(userdauINSERT VALUESuLocalierr)

call VecGetArrayF90(uLocallu vierr)call VecGetArrayF90(Flf vierr)

Actually compute the local portion of the residualcall FormFunctionLocal(lu vlf vuserierr)

call VecRestoreArrayF90(uLocallu vierr)call VecRestoreArrayF90(Flf vierr)

call DARestoreLocalVector(userdauLocalierr)

Fig 12Parallel residual calculation for the Bratu problem using PETSc

u(usergxsusergxeusergysusergye) which now includes ghostvalues

The changes toFormJacobian() are similar resizing the input array andchanging slightly the calculation of row and column indices and can be found inthe PETSc example source We have now finished the initial introduction of PETSclinear algebra and the simulation is ready to run in parallel

32 Rapid Evolution

The user who is willing to raise the level of abstraction in a code can start by em-ploying theDA to describe the problem domain and discretization The data interfacemimics a multidimensional array and is thus ideal for finite difference finite volumeand low-order finite element schemes on a logically rectangular grid In fact after thedefinition of aFormFunction() routine the simulation is ready to run becausethe nonlinear solver provides approximate Jacobians automatically as discussed inSection 4

14 Matthew G Knepley et al

DASetLocalFunction(userda (DALocalFunction1) FormFunctionLocal)

Fig 13Using the DA to form a function

int FormFunctionLocal(DALocalInfo infodouble udouble fAppCtx user)double two = 20hxhyhxdhyhydhxscdouble uijuxxuyyint ij

hx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhyuserminusgtlambdahxdhy = hxhyhydhx = hyhx Compute function over the locally owned part of the grid for (j=infominusgtys jltinfominusgtys+infominusgtym j++)

for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++) if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

f[j][i] = u[j][i] else

uij = u[j][i]uxx = (twouij minus u[j][iminus1] minus u[j][i+1])hydhxuyy = (twouij minus u[jminus1][i] minus u[j+1][i])hxdhyf[j][i] = uxx + uyy minus scexp(uij)

Fig 14FormFunctionLocal() for the Bratu problem

We begin by examining the Bratu problem from the last section only this time inC We can now provide our local residual routine to theDA see Fig 13

The grid information is passed toFormFunctionLocal() in a DALocalInfo structure as shown in Fig 14

The fields are passed directly as multidimensional C arrays complete with ghostvalues An application scientist can use a boilerplate example such as the Bratu prob-lem merely altering the local physics calculation inFormFunctionLocal() torapidly obtain a parallel scalable simulation that runs on the desktop as well as onmassively parallel supercomputers

If the user has an expression for the Jacobian of the residual function then thismatrix can also be computed by using theDA interface with the call as in Fig 15and the corresponding routine for the local calculation in Fig 16

Developing a Geodynamics Simulator with PETSc 15

DASetLocalJacobian(userda (DALocalFunction1) FormJacobianLocal)

Fig 15Using the DA to form a Jacobian

int FormJacobianLocal(DALocalInfo infodouble xMat jacAppCtx user)MatStencil col[5]rowdouble lambdav[5]hxhyhxdhyhydhxscint ij

lambda = userminusgtparamhx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhylambdahxdhy = hxhy hydhx = hyhx

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

rowj = j rowi = i boundary points if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

v[0] = 10MatSetValuesStencil(jac1amprow1amprowvINSERT VALUES)

else interior grid points

v[0] = minushxdhy col[0]j = j minus 1 col[0]i = iv[1] = minushydhx col[1]j = j col[1]i = iminus1v[2] = 20(hydhx + hxdhy) minus scexp(x[j][i]) col[2]j = j col[2]i = iv[3] = minushydhx col[3]j = j col[3]i = i+1v[4] = minushxdhy col[4]j = j + 1 col[4]i = iMatSetValuesStencil(jac1amprow5colvINSERT VALUES)

MatAssemblyBegin(jacMAT FINAL ASSEMBLY)MatAssemblyEnd(jacMAT FINAL ASSEMBLY)MatSetOption(jacMAT NEW NONZERO LOCATION ERR)

Fig 16FormJacobianLocal() for the Bratu problem

16 Matthew G Knepley et al

int FormFunctionLocal(DALocalInfo infoField xField fvoid ptr)AppCtx user = (AppCtx)ptrParameter param = userminusgtparamGridInfo grid = userminusgtgriddouble mag w mag uint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1 i j

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

calculateXMomentumResidual(i j x f user)calculateZMomentumResidual(i j x f user)calculatePressureResidual(i j x f user)calculateTemperatureResidual(i j x f user)

Fig 17FormFunctionLocal() for the mantle subduction problem

We have so far presented simple examples involving a perturbed Laplacian butthis development strategy extends far beyond toy problems We have developedlarge-scale parallel geophysical simulations [1] that are producing new results inthe field We began by replacingFormFunctionLocal() in the Bratu examplewith one containing the linear physics of the previous sequential linear subductioncode (dropping the previous codersquos linear solve) Then the PETSc solver results werecompared for correctness with the complete prior subduction code Next the newcode was immediately run in parallel for correctness and efficiency tests Finallythe nonlinear terms from the more complicated physics of the mantle subductionproblem discussed in Section 1 and shown in Figs 17-21 and better quality dis-cretizations of the boundary conditions were added In fact since the structure oftheFormFunction() changed significantly the finalFormFunction() looksnothing like the original but the process of obtaining through the incremental ap-proach reduced the learning curve for PETSc and make correctness checking at thevarious stages much easier

The details of the subduction code are not as important as the recognition thata very complicated physics problem need not involve any more complication in oursimulation infrastructure To give more insight into the actual calculation we showin Fig 22 the explicit calculation of the residual from the continuity equation

Here we have used theDA for a multicomponent problem on a staggered meshwithout alteration because the structure of the discretization remains logically rect-angular

Although theDA manages communication during the solution process data post-processing often necessary for visualization or analysis also requires this function-ality For the mantle subduction simulation we want to calculate both the second in-

Developing a Geodynamics Simulator with PETSc 17

int calculateXMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj) f[j][i]u = x[j][i]u minus SlabVel(rsquoUrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]u = x[j][i]u minus 00

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else

f[j][i]u = XNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]u = XMomentumResidual(xijuser)

else in the mantle wedge f[j][i]u = XMomentumResidual(xijuser)

Fig 18X-momentum residual for the mantle subduction problem

variant of the strain rate tensor and the viscosity over the grid (at both cell centers andcorners) Using theDAGlobalToLocalBegin () andDAGlobalToLocalEnd() functions we transferred the global solution data to local ghosted vectors and pro-ceeded with the calculation We can also transfer data from local vectors to a globalvector usingDALocalToGlobal () The entire viscosity calculation is given inFig 23

Note that this framework allows the user to manage arbitrary fields over the do-main For instance a porous flow simulation might manage material properties ofthe medium in this fashion

4 Solvers

The SNES object in PETSc is an abstraction of the ldquoinverserdquo of a nonlinear oper-ator The user provides aFormFunction() routine as seen in Section 3 whichcomputes the action of the operator on an input vector Linear problems can also be

18 Matthew G Knepley et al

int calculateZMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1jlim = infominusgtmyminus1

if (ilt=j) f[j][i]w = x[j][i]w minus SlabVel(rsquoWrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]w = x[j][i]w minus 00

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else

f[j][i]w = ZNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]w = ZMomentumResidual(xijuser)

else in the mantle wedge f[j][i]w = ZMomentumResidual(xijuser)

Fig 19Z-momentum residual for the mantle subduction problem

int calculatePressureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj | | jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lid or slab f[j][i]p = x[j][i]p

else if ((i==ilim | | j==jlim ) ampamp paramminusgtibound==BC ANALYTIC ) on an analytic boundary f[j][i]p = x[j][i]p minus Pressure(ijuser)

else in the mantle wedge f[j][i]p = ContinuityResidual(xijuser)

Fig 20Pressure or continuity residual for the mantle subduction problem

Developing a Geodynamics Simulator with PETSc 19

int calculateTemperatureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (j==0) on the surface f[j][i]T = x[j][i]T + x[j+1][i]T + PetscMax(x[j][i]T00)

else if (i==0) slab inflow boundary f[j][i]T = x[j][i]T minus PlateModel(jPLATE SLABuser)

else if (i==ilim ) right side boundary mag u = 10 minus pow( (10minusPetscMax(PetscMin(x[j][iminus1]uparamminusgtcb10)00)) 50 )f[j][i]T = x[j][i]T minus mag ux[jminus1][iminus1]T minus (10minusmag u)PlateModel(jPLATE LID user)

else if (j==jlim ) bottom boundary mag w = 10 minus pow( (10minusPetscMax(PetscMin(x[jminus1][i]wparamminusgtsb10)00)) 50 )f[j][i]T = x[j][i]T minus mag wx[jminus1][iminus1]T minus (10minusmag w)

else in the mantle wedge f[j][i]T = EnergyResidual(xijuser)

Fig 21Temperature or energy residual for the mantle subduction problem

double precision ContinuityResidual(Field x int i int j AppCtx user)

GridInfo grid = userminusgtgriddouble precision uEuWwNwSdudxdwdz

uW = x[j][iminus1]u uE = x[j][i]u dudx = (uE minus uW)gridminusgtdxwS = x[jminus1][i]w wN = x[j][i]w dwdz = (wN minus wS)gridminusgtdzreturn dudx + dwdz

Fig 22Calculation of the continuity residual for the mantle subduction problem

solved by usingSNES with no loss of efficiency compared to using the underlyingKSP object (PETSc linear solver object) directly ThusSNES seems the appropri-ate framework for a general-purpose simulator providing the flexibility to add orsubtract nonlinear terms from the equation at will

TheSNES solver does not require a user to implement a Jacobian Default rou-tines are provided to compute a finite difference approximation using coloring toaccount for the sparsity of the matrix The user does not even need the PETScMatinterface but can initially interact only withVec objects making the transition to

PETSc nearly painless However these approximations to the Jacobian can have nu-merical difficulties and are not as efficient as direct evaluation Therefore the user

20 Matthew G Knepley et al

Compute both the second invariant of the strain rate tensor and the viscosity int ViscosityField(DA da Vec X Vec VAppCtx user)

Parameter param = userminusgtparamGridInfo grid = userminusgtgridVec localXField v xdouble eps dx dz T epsC TCint ijisjsimjmilim jlim ivt

ivt = paramminusgtiviscparamminusgtivisc = paramminusgtoutput ivisc

DACreateLocalVector(da amplocalX)DAGlobalToLocalBegin(da X INSERT VALUES localX)DAGlobalToLocalEnd(da X INSERT VALUES localX)DAVecGetArray(dalocalX(void)ampx)DAVecGetArray(daV(void)ampv)

Parameters dx = gridminusgtdx dz = gridminusgtdzilim = gridminusgtniminus1 jlim = gridminusgtnjminus1

Compute real temperature strain rate and viscosity DAGetCorners(daampisampjsPETSCNULL ampimampjmPETSCNULL )for (j=js jltjs+jm j++)

for (i=is iltis+im i++) T = paramminusgtpotentialT x[j][i]T exp( (jminus05)dzparamminusgtz scale )if (iltilim ampamp jltjlim )

TC = paramminusgtpotentialT TInterp(xij) exp( jdzparamminusgtz scale ) else

TC = T Compute the values at both cell centers and cell corners eps = CalcSecInv(xijCELL CENTERuser)epsC = CalcSecInv(xijCELL CORNERuser)v[j][i]u = epsv[j][i]w = epsCv[j][i]p = Viscosity(Tepsdz(jminus05)param)v[j][i]T = Viscosity(TCepsCdzjparam)

DAVecRestoreArray(daV(void)ampv)DAVecRestoreArray(dalocalX(void)ampx)paramminusgtivisc = ivt

Fig 23 Calculation of the second invariant of the strain tensor and viscosity field for themantle subduction problem

Developing a Geodynamics Simulator with PETSc 21

KSP ksp subkspPC bjacobi ilu

SNESGetKSP(snes ampksp)KSPGetPC(ksp ampbjacobi)PCBJacobiGetSubKSP(bjacobi PETSCNULL PETSCNULL ampsubksp)KSPGetPC(subksp[0] ampilu)PCILUSetLevels(ilu levels)

Fig 24Customizing the preconditioner on a single process

has the option of providing a routine or of using the ADIC [15] or similar system forautomatic differentiation The finite difference approximation can be activated byusing the-snes fd and-mat fd coloring freq options or by providing theSNESDefaultComputeJacobianColor () function toSNESSetJacobian()

Through theSNES object the user may also access the great range of directand iterative linear solvers and preconditioners provided by PETSc Sixteen differentKrylov solvers are available including GMRES BiCGStab and LSQR along withinterfaces to popular sparse direct packages such as MUMPS [4] and SuperLU [9]In addition to a variety of incomplete factorization preconditioners including PI-LUT [16] PETSc supports additive Schwartz preconditioning algebraic multigridthrough BoomerAMG [13] and the geometric multigrid discussed below The fullpanoply of solvers and preconditioners available is catalogued on the PETSc Web-site [3]

The modularity of PETSc allows users to easily customize each solver For in-stance suppose the user wishes to increase the number of levels in ILU(k) precon-ditioning on one block of a block-Jacobi scheme The code fragment in Fig 24 willset the number of levels of fill tolevels

PETSc provides an elegant framework for managing geometric multigrid in com-bination with aDA which is abstracted in theDMMGobject In the same way thatany linear solve can be performed withSNES any preconditioner or solver com-bination is available inDMMGby using a single level That is the same user codesupports both geometric multigrid as well as direct methods Krylov methods etcWhen creating aDMMG the user specifies the number of levels of grid refinementand provides the coarse-grid information (see Fig 25) which is always aDA objectat present although unstructured prototypes are being developed Then theDMMGobject calculates the intergrid transfer operators prolongation and restriction and

allocates objects to hold the solution at each level The user must provide the actionof the residual operatorF and can optionally provide a routine to compute the Jaco-bian matrix (which will be called on each level) and a routine to compute the initialguess as in Fig 26

With this information the user can now solve the equation and retrieve the solu-tion vector as in Fig 27

22 Matthew G Knepley et al

DMMGCreate(comm gridmglevels user ampdmmg)DMMGSetDM(dmmg (DM) da)

Fig 25Creating a DMMG

DMMGSetSNESLocal(dmmg FormFunctionLocal FormJacobianLocal 0 0)DMMGSetInitialGuess(dmmg FormInitialGuess)

Fig 26Providing discretization and assembly routines to DMMG

DMMGSolve(dmmg)soln = DMMGGetx(dmmg)

Fig 27Solving the problem with DMMG

5 Extensions

PETSc is not a complete environment for simulating physical phenomena Ratherit is a set of tools that allow the user to assemble such an environment tailored toa specific application As such PETSc will never provide every facility appropriatefor a given simulation However the user can easily extend PETSc and supplementits capabilities We present two examples in the context of the mantle subductionsimulation

51 Simple Continuation

Because of the highly nonlinear dependence of viscosity on temperature and strainrate in equation (1) the iteration in Newtonrsquos method can fail to converge withouta good starting guess of the solution On the other hand for the isoviscous casewhere temperature is coupled to velocity only through advection a solution is easilyreached It is therefore natural to propose a continuation scheme in the viscosity be-ginning with constant viscosity and progressing to full variability A simple adaptivecontinuation model was devised in which the viscosity was raised to a power be-tween zero and oneη rarr ηα whereα=0 corresponds to constant viscosity andα=1corresponds to natural viscosity variation In the continuation loopα was increasedtowards unity by an amount dependent on the rate of convergence of the previousSNES solve

PETSc itself provides no special support for continuation but it is sufficientlymodular that a continuation loop was readily constructed in the application codeusing repeated calls toDMMGSolve() and found to work quite well

Developing a Geodynamics Simulator with PETSc 23

52 Simple Steering

No rigorous justification had been given for the continuation strategy above andconsequently it was possible that for certain initial conditions Newton would fail toconverge thereby resulting in extremely long run times An observant user could de-tect this situation and abort the run but all the potentially useful solution informationup to that point would be lost A clean way to asynchronously abort the continua-tion loop was thus needed On architectures where OS signals are available PETScprovides an interface for registering signal handlers Thus we were able to define ahandler The user wishing to change the control flow of the simulation simply sendsthe appropriate signal to the process This sets a flag in the user data that causes thechange at the next iteration

6 Simulation Results

A long-standing debate has existed between theorists and observationalists over thethermal structure of subduction zones Modelers using constant viscosity flow sim-ulations to compute thermal structure have predicted relatively cold subductionzones [19] Conversely observationalists who use heat flow measurements and min-eralogical thermobarometry to estimate temperatures at depth have long claimedthat subduction zones are hotter than predicted They have invoked the presence ofstrong upwelling of hot mantle material to supply the heat This upwelling was neverpredicted by isoviscous flow models however

The debate remained unresolved until recently when several groups includingour own succeeded in developing simulations with realistically variable viscosity(for other examples [810121821]) A comparison of flow and temperature fieldsfor variable and constant viscosity simulations generated with our code is shownin Fig 28 Results from these simulations are exciting because they close the gapbetween models and observations they predict hotter mantle temperatures steepersurface thermal gradients and upwelling mantle flow

Furthermore these simulations allow for quantitative predictions of variation ofobservables with subduction parameters A recent study of subduction zone earth-quakes has identified an intriguing trend the vertical distance from subduction zonevolcanoes to the surface of the subducting slab is anti-correlated with descent rate ofthe slab [11] Preliminary calculations with our model are consistent with this trendindicating that flow and thermal structure may play an important role in determiningnot only the quantity and chemistry of magmas but also their path of transport to thesurface Further work is required to resolve this issue

Simulating the geodynamics of subduction is an example of our success in port-ing an existing code into the PETSc framework simultaneously parallelizing thecode and increasing its functionality Using this code as a template we rapidly de-veloped a related simulation of another tectonic boundary the mid-ocean ridge Thiswork was done to address a set of observations of systematic morphological asym-metry in the global mid-ocean ridge system [7] Our model confirms the qualitative

24 Matthew G Knepley et al

Dep

th k

m

log10

η and flow field

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24 Temperature degC

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Distance km

Dep

th k

m

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24

Distance km

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Isov

isco

usV

aria

ble

Vis

cosi

ty

(a) (b)

(c) (d)

Fig 282D viscosity and potential temperature fields from simulations on 8 processors with230112 degrees of freedom Panels in the top row are from a simulation withα=1 in equation(1) Panels in bottom row haveα=0 The white box in panels (a) and (c) shows the region inwhich temperature is plotted in panels (b) and (d)(a) Colors showlog10 of the viscosity fieldNote that there are more than five orders of magnitude variation in viscosity Arrows show theflow direction and magnitude (the slab is being subducted at a rate of 6 cmyear) Upwellingis evident in the flow field near the wedge corner(b) Temperature field from the variableviscosity simulation 1100C isotherm is shown as a dashed line(c) (Constant) viscosity andflow field from the isoviscous simulation Strong flow is predicted at the base of the crustdespite the low-temperature rock there No upwelling flow is predicted(d) Temperature fieldfrom isoviscous simulation Note that the mantle wedge corner is much colder than in (b)

mechanism that had been proposed to explain these observations Furthermore itshows a quantitative agreement with trends in the observed data [17] As with ourmodels of subduction the key to demonstrating the validity of the hypothesized dy-namics was simulating them with the strongly nonlinear rheology that characterizesmantle rock on geologic timescales The computational tools provided by PETSc en-abled us easily handle this rheology reducing development time and allowing us tofocus on model interpretation

References

1 PETSc SNES Example 30 httpwwwmcsanlgovpetscpetsc-2

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 12: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

12 Matthew G Knepley et al

type userctxDA da

The start end and number of vertices in the xminusdirectioninteger xsxexm

The start end and number of ghost vertices in the xminusdirectioninteger gxsgxegxm

The start end and number of vertices in the yminusdirectioninteger ysyeym

The start end and number of ghost vertices in the yminusdirectioninteger gysgyegym

The number of vertices in the xminus and yminusdirectionsinteger mxmy

The MPI rank of this processinteger rank

The coefficient in the Bratu equationdouble precision lambda

end type userctx

call DACreate2d(PETSCCOMM WORLDDA NONPERIODICDA STENCIL STAR ampamp usermxusermyPETSCDECIDEPETSCDECIDE11 ampamp PETSCNULL INTEGERPETSCNULL INTEGERuserdaierr)call DAGetCorners(userdauserxsuserysPETSCNULL INTEGER amp

amp userxmuserymPETSCNULL INTEGERierr)call DAGetGhostCorners(userdausergxsusergys amp

amp PETSCNULL INTEGERusergxmusergym ampamp PETSCNULL INTEGERierr)

Here we shift the starting indices up by one so that we can easily use the Fortran convention of1minusbased indices rather than0minusbased

userxs = userxs+1userys = userys+1usergxs = usergxs+1usergys = usergys+1userye = userys+userymminus1userxe = userxs+userxmminus1usergye = usergys+usergymminus1usergxe = usergxs+usergxmminus1

Fig 11Creation of a DA for the Bratu problem

Developing a Geodynamics Simulator with PETSc 13

subroutine FormFunctionPETSc(uFuserierr)implicit none

Vec uFinteger ierrtype (userctx) user

double precisionpointer lu v()lf v()Vec uLocal

call DAGetLocalVector(userdauLocalierr)call DAGlobalToLocalBegin(userdauINSERT VALUESuLocalierr)call DAGlobalToLocalEnd(userdauINSERT VALUESuLocalierr)

call VecGetArrayF90(uLocallu vierr)call VecGetArrayF90(Flf vierr)

Actually compute the local portion of the residualcall FormFunctionLocal(lu vlf vuserierr)

call VecRestoreArrayF90(uLocallu vierr)call VecRestoreArrayF90(Flf vierr)

call DARestoreLocalVector(userdauLocalierr)

Fig 12Parallel residual calculation for the Bratu problem using PETSc

u(usergxsusergxeusergysusergye) which now includes ghostvalues

The changes toFormJacobian() are similar resizing the input array andchanging slightly the calculation of row and column indices and can be found inthe PETSc example source We have now finished the initial introduction of PETSclinear algebra and the simulation is ready to run in parallel

32 Rapid Evolution

The user who is willing to raise the level of abstraction in a code can start by em-ploying theDA to describe the problem domain and discretization The data interfacemimics a multidimensional array and is thus ideal for finite difference finite volumeand low-order finite element schemes on a logically rectangular grid In fact after thedefinition of aFormFunction() routine the simulation is ready to run becausethe nonlinear solver provides approximate Jacobians automatically as discussed inSection 4

14 Matthew G Knepley et al

DASetLocalFunction(userda (DALocalFunction1) FormFunctionLocal)

Fig 13Using the DA to form a function

int FormFunctionLocal(DALocalInfo infodouble udouble fAppCtx user)double two = 20hxhyhxdhyhydhxscdouble uijuxxuyyint ij

hx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhyuserminusgtlambdahxdhy = hxhyhydhx = hyhx Compute function over the locally owned part of the grid for (j=infominusgtys jltinfominusgtys+infominusgtym j++)

for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++) if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

f[j][i] = u[j][i] else

uij = u[j][i]uxx = (twouij minus u[j][iminus1] minus u[j][i+1])hydhxuyy = (twouij minus u[jminus1][i] minus u[j+1][i])hxdhyf[j][i] = uxx + uyy minus scexp(uij)

Fig 14FormFunctionLocal() for the Bratu problem

We begin by examining the Bratu problem from the last section only this time inC We can now provide our local residual routine to theDA see Fig 13

The grid information is passed toFormFunctionLocal() in a DALocalInfo structure as shown in Fig 14

The fields are passed directly as multidimensional C arrays complete with ghostvalues An application scientist can use a boilerplate example such as the Bratu prob-lem merely altering the local physics calculation inFormFunctionLocal() torapidly obtain a parallel scalable simulation that runs on the desktop as well as onmassively parallel supercomputers

If the user has an expression for the Jacobian of the residual function then thismatrix can also be computed by using theDA interface with the call as in Fig 15and the corresponding routine for the local calculation in Fig 16

Developing a Geodynamics Simulator with PETSc 15

DASetLocalJacobian(userda (DALocalFunction1) FormJacobianLocal)

Fig 15Using the DA to form a Jacobian

int FormJacobianLocal(DALocalInfo infodouble xMat jacAppCtx user)MatStencil col[5]rowdouble lambdav[5]hxhyhxdhyhydhxscint ij

lambda = userminusgtparamhx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhylambdahxdhy = hxhy hydhx = hyhx

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

rowj = j rowi = i boundary points if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

v[0] = 10MatSetValuesStencil(jac1amprow1amprowvINSERT VALUES)

else interior grid points

v[0] = minushxdhy col[0]j = j minus 1 col[0]i = iv[1] = minushydhx col[1]j = j col[1]i = iminus1v[2] = 20(hydhx + hxdhy) minus scexp(x[j][i]) col[2]j = j col[2]i = iv[3] = minushydhx col[3]j = j col[3]i = i+1v[4] = minushxdhy col[4]j = j + 1 col[4]i = iMatSetValuesStencil(jac1amprow5colvINSERT VALUES)

MatAssemblyBegin(jacMAT FINAL ASSEMBLY)MatAssemblyEnd(jacMAT FINAL ASSEMBLY)MatSetOption(jacMAT NEW NONZERO LOCATION ERR)

Fig 16FormJacobianLocal() for the Bratu problem

16 Matthew G Knepley et al

int FormFunctionLocal(DALocalInfo infoField xField fvoid ptr)AppCtx user = (AppCtx)ptrParameter param = userminusgtparamGridInfo grid = userminusgtgriddouble mag w mag uint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1 i j

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

calculateXMomentumResidual(i j x f user)calculateZMomentumResidual(i j x f user)calculatePressureResidual(i j x f user)calculateTemperatureResidual(i j x f user)

Fig 17FormFunctionLocal() for the mantle subduction problem

We have so far presented simple examples involving a perturbed Laplacian butthis development strategy extends far beyond toy problems We have developedlarge-scale parallel geophysical simulations [1] that are producing new results inthe field We began by replacingFormFunctionLocal() in the Bratu examplewith one containing the linear physics of the previous sequential linear subductioncode (dropping the previous codersquos linear solve) Then the PETSc solver results werecompared for correctness with the complete prior subduction code Next the newcode was immediately run in parallel for correctness and efficiency tests Finallythe nonlinear terms from the more complicated physics of the mantle subductionproblem discussed in Section 1 and shown in Figs 17-21 and better quality dis-cretizations of the boundary conditions were added In fact since the structure oftheFormFunction() changed significantly the finalFormFunction() looksnothing like the original but the process of obtaining through the incremental ap-proach reduced the learning curve for PETSc and make correctness checking at thevarious stages much easier

The details of the subduction code are not as important as the recognition thata very complicated physics problem need not involve any more complication in oursimulation infrastructure To give more insight into the actual calculation we showin Fig 22 the explicit calculation of the residual from the continuity equation

Here we have used theDA for a multicomponent problem on a staggered meshwithout alteration because the structure of the discretization remains logically rect-angular

Although theDA manages communication during the solution process data post-processing often necessary for visualization or analysis also requires this function-ality For the mantle subduction simulation we want to calculate both the second in-

Developing a Geodynamics Simulator with PETSc 17

int calculateXMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj) f[j][i]u = x[j][i]u minus SlabVel(rsquoUrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]u = x[j][i]u minus 00

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else

f[j][i]u = XNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]u = XMomentumResidual(xijuser)

else in the mantle wedge f[j][i]u = XMomentumResidual(xijuser)

Fig 18X-momentum residual for the mantle subduction problem

variant of the strain rate tensor and the viscosity over the grid (at both cell centers andcorners) Using theDAGlobalToLocalBegin () andDAGlobalToLocalEnd() functions we transferred the global solution data to local ghosted vectors and pro-ceeded with the calculation We can also transfer data from local vectors to a globalvector usingDALocalToGlobal () The entire viscosity calculation is given inFig 23

Note that this framework allows the user to manage arbitrary fields over the do-main For instance a porous flow simulation might manage material properties ofthe medium in this fashion

4 Solvers

The SNES object in PETSc is an abstraction of the ldquoinverserdquo of a nonlinear oper-ator The user provides aFormFunction() routine as seen in Section 3 whichcomputes the action of the operator on an input vector Linear problems can also be

18 Matthew G Knepley et al

int calculateZMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1jlim = infominusgtmyminus1

if (ilt=j) f[j][i]w = x[j][i]w minus SlabVel(rsquoWrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]w = x[j][i]w minus 00

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else

f[j][i]w = ZNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]w = ZMomentumResidual(xijuser)

else in the mantle wedge f[j][i]w = ZMomentumResidual(xijuser)

Fig 19Z-momentum residual for the mantle subduction problem

int calculatePressureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj | | jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lid or slab f[j][i]p = x[j][i]p

else if ((i==ilim | | j==jlim ) ampamp paramminusgtibound==BC ANALYTIC ) on an analytic boundary f[j][i]p = x[j][i]p minus Pressure(ijuser)

else in the mantle wedge f[j][i]p = ContinuityResidual(xijuser)

Fig 20Pressure or continuity residual for the mantle subduction problem

Developing a Geodynamics Simulator with PETSc 19

int calculateTemperatureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (j==0) on the surface f[j][i]T = x[j][i]T + x[j+1][i]T + PetscMax(x[j][i]T00)

else if (i==0) slab inflow boundary f[j][i]T = x[j][i]T minus PlateModel(jPLATE SLABuser)

else if (i==ilim ) right side boundary mag u = 10 minus pow( (10minusPetscMax(PetscMin(x[j][iminus1]uparamminusgtcb10)00)) 50 )f[j][i]T = x[j][i]T minus mag ux[jminus1][iminus1]T minus (10minusmag u)PlateModel(jPLATE LID user)

else if (j==jlim ) bottom boundary mag w = 10 minus pow( (10minusPetscMax(PetscMin(x[jminus1][i]wparamminusgtsb10)00)) 50 )f[j][i]T = x[j][i]T minus mag wx[jminus1][iminus1]T minus (10minusmag w)

else in the mantle wedge f[j][i]T = EnergyResidual(xijuser)

Fig 21Temperature or energy residual for the mantle subduction problem

double precision ContinuityResidual(Field x int i int j AppCtx user)

GridInfo grid = userminusgtgriddouble precision uEuWwNwSdudxdwdz

uW = x[j][iminus1]u uE = x[j][i]u dudx = (uE minus uW)gridminusgtdxwS = x[jminus1][i]w wN = x[j][i]w dwdz = (wN minus wS)gridminusgtdzreturn dudx + dwdz

Fig 22Calculation of the continuity residual for the mantle subduction problem

solved by usingSNES with no loss of efficiency compared to using the underlyingKSP object (PETSc linear solver object) directly ThusSNES seems the appropri-ate framework for a general-purpose simulator providing the flexibility to add orsubtract nonlinear terms from the equation at will

TheSNES solver does not require a user to implement a Jacobian Default rou-tines are provided to compute a finite difference approximation using coloring toaccount for the sparsity of the matrix The user does not even need the PETScMatinterface but can initially interact only withVec objects making the transition to

PETSc nearly painless However these approximations to the Jacobian can have nu-merical difficulties and are not as efficient as direct evaluation Therefore the user

20 Matthew G Knepley et al

Compute both the second invariant of the strain rate tensor and the viscosity int ViscosityField(DA da Vec X Vec VAppCtx user)

Parameter param = userminusgtparamGridInfo grid = userminusgtgridVec localXField v xdouble eps dx dz T epsC TCint ijisjsimjmilim jlim ivt

ivt = paramminusgtiviscparamminusgtivisc = paramminusgtoutput ivisc

DACreateLocalVector(da amplocalX)DAGlobalToLocalBegin(da X INSERT VALUES localX)DAGlobalToLocalEnd(da X INSERT VALUES localX)DAVecGetArray(dalocalX(void)ampx)DAVecGetArray(daV(void)ampv)

Parameters dx = gridminusgtdx dz = gridminusgtdzilim = gridminusgtniminus1 jlim = gridminusgtnjminus1

Compute real temperature strain rate and viscosity DAGetCorners(daampisampjsPETSCNULL ampimampjmPETSCNULL )for (j=js jltjs+jm j++)

for (i=is iltis+im i++) T = paramminusgtpotentialT x[j][i]T exp( (jminus05)dzparamminusgtz scale )if (iltilim ampamp jltjlim )

TC = paramminusgtpotentialT TInterp(xij) exp( jdzparamminusgtz scale ) else

TC = T Compute the values at both cell centers and cell corners eps = CalcSecInv(xijCELL CENTERuser)epsC = CalcSecInv(xijCELL CORNERuser)v[j][i]u = epsv[j][i]w = epsCv[j][i]p = Viscosity(Tepsdz(jminus05)param)v[j][i]T = Viscosity(TCepsCdzjparam)

DAVecRestoreArray(daV(void)ampv)DAVecRestoreArray(dalocalX(void)ampx)paramminusgtivisc = ivt

Fig 23 Calculation of the second invariant of the strain tensor and viscosity field for themantle subduction problem

Developing a Geodynamics Simulator with PETSc 21

KSP ksp subkspPC bjacobi ilu

SNESGetKSP(snes ampksp)KSPGetPC(ksp ampbjacobi)PCBJacobiGetSubKSP(bjacobi PETSCNULL PETSCNULL ampsubksp)KSPGetPC(subksp[0] ampilu)PCILUSetLevels(ilu levels)

Fig 24Customizing the preconditioner on a single process

has the option of providing a routine or of using the ADIC [15] or similar system forautomatic differentiation The finite difference approximation can be activated byusing the-snes fd and-mat fd coloring freq options or by providing theSNESDefaultComputeJacobianColor () function toSNESSetJacobian()

Through theSNES object the user may also access the great range of directand iterative linear solvers and preconditioners provided by PETSc Sixteen differentKrylov solvers are available including GMRES BiCGStab and LSQR along withinterfaces to popular sparse direct packages such as MUMPS [4] and SuperLU [9]In addition to a variety of incomplete factorization preconditioners including PI-LUT [16] PETSc supports additive Schwartz preconditioning algebraic multigridthrough BoomerAMG [13] and the geometric multigrid discussed below The fullpanoply of solvers and preconditioners available is catalogued on the PETSc Web-site [3]

The modularity of PETSc allows users to easily customize each solver For in-stance suppose the user wishes to increase the number of levels in ILU(k) precon-ditioning on one block of a block-Jacobi scheme The code fragment in Fig 24 willset the number of levels of fill tolevels

PETSc provides an elegant framework for managing geometric multigrid in com-bination with aDA which is abstracted in theDMMGobject In the same way thatany linear solve can be performed withSNES any preconditioner or solver com-bination is available inDMMGby using a single level That is the same user codesupports both geometric multigrid as well as direct methods Krylov methods etcWhen creating aDMMG the user specifies the number of levels of grid refinementand provides the coarse-grid information (see Fig 25) which is always aDA objectat present although unstructured prototypes are being developed Then theDMMGobject calculates the intergrid transfer operators prolongation and restriction and

allocates objects to hold the solution at each level The user must provide the actionof the residual operatorF and can optionally provide a routine to compute the Jaco-bian matrix (which will be called on each level) and a routine to compute the initialguess as in Fig 26

With this information the user can now solve the equation and retrieve the solu-tion vector as in Fig 27

22 Matthew G Knepley et al

DMMGCreate(comm gridmglevels user ampdmmg)DMMGSetDM(dmmg (DM) da)

Fig 25Creating a DMMG

DMMGSetSNESLocal(dmmg FormFunctionLocal FormJacobianLocal 0 0)DMMGSetInitialGuess(dmmg FormInitialGuess)

Fig 26Providing discretization and assembly routines to DMMG

DMMGSolve(dmmg)soln = DMMGGetx(dmmg)

Fig 27Solving the problem with DMMG

5 Extensions

PETSc is not a complete environment for simulating physical phenomena Ratherit is a set of tools that allow the user to assemble such an environment tailored toa specific application As such PETSc will never provide every facility appropriatefor a given simulation However the user can easily extend PETSc and supplementits capabilities We present two examples in the context of the mantle subductionsimulation

51 Simple Continuation

Because of the highly nonlinear dependence of viscosity on temperature and strainrate in equation (1) the iteration in Newtonrsquos method can fail to converge withouta good starting guess of the solution On the other hand for the isoviscous casewhere temperature is coupled to velocity only through advection a solution is easilyreached It is therefore natural to propose a continuation scheme in the viscosity be-ginning with constant viscosity and progressing to full variability A simple adaptivecontinuation model was devised in which the viscosity was raised to a power be-tween zero and oneη rarr ηα whereα=0 corresponds to constant viscosity andα=1corresponds to natural viscosity variation In the continuation loopα was increasedtowards unity by an amount dependent on the rate of convergence of the previousSNES solve

PETSc itself provides no special support for continuation but it is sufficientlymodular that a continuation loop was readily constructed in the application codeusing repeated calls toDMMGSolve() and found to work quite well

Developing a Geodynamics Simulator with PETSc 23

52 Simple Steering

No rigorous justification had been given for the continuation strategy above andconsequently it was possible that for certain initial conditions Newton would fail toconverge thereby resulting in extremely long run times An observant user could de-tect this situation and abort the run but all the potentially useful solution informationup to that point would be lost A clean way to asynchronously abort the continua-tion loop was thus needed On architectures where OS signals are available PETScprovides an interface for registering signal handlers Thus we were able to define ahandler The user wishing to change the control flow of the simulation simply sendsthe appropriate signal to the process This sets a flag in the user data that causes thechange at the next iteration

6 Simulation Results

A long-standing debate has existed between theorists and observationalists over thethermal structure of subduction zones Modelers using constant viscosity flow sim-ulations to compute thermal structure have predicted relatively cold subductionzones [19] Conversely observationalists who use heat flow measurements and min-eralogical thermobarometry to estimate temperatures at depth have long claimedthat subduction zones are hotter than predicted They have invoked the presence ofstrong upwelling of hot mantle material to supply the heat This upwelling was neverpredicted by isoviscous flow models however

The debate remained unresolved until recently when several groups includingour own succeeded in developing simulations with realistically variable viscosity(for other examples [810121821]) A comparison of flow and temperature fieldsfor variable and constant viscosity simulations generated with our code is shownin Fig 28 Results from these simulations are exciting because they close the gapbetween models and observations they predict hotter mantle temperatures steepersurface thermal gradients and upwelling mantle flow

Furthermore these simulations allow for quantitative predictions of variation ofobservables with subduction parameters A recent study of subduction zone earth-quakes has identified an intriguing trend the vertical distance from subduction zonevolcanoes to the surface of the subducting slab is anti-correlated with descent rate ofthe slab [11] Preliminary calculations with our model are consistent with this trendindicating that flow and thermal structure may play an important role in determiningnot only the quantity and chemistry of magmas but also their path of transport to thesurface Further work is required to resolve this issue

Simulating the geodynamics of subduction is an example of our success in port-ing an existing code into the PETSc framework simultaneously parallelizing thecode and increasing its functionality Using this code as a template we rapidly de-veloped a related simulation of another tectonic boundary the mid-ocean ridge Thiswork was done to address a set of observations of systematic morphological asym-metry in the global mid-ocean ridge system [7] Our model confirms the qualitative

24 Matthew G Knepley et al

Dep

th k

m

log10

η and flow field

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24 Temperature degC

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Distance km

Dep

th k

m

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24

Distance km

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Isov

isco

usV

aria

ble

Vis

cosi

ty

(a) (b)

(c) (d)

Fig 282D viscosity and potential temperature fields from simulations on 8 processors with230112 degrees of freedom Panels in the top row are from a simulation withα=1 in equation(1) Panels in bottom row haveα=0 The white box in panels (a) and (c) shows the region inwhich temperature is plotted in panels (b) and (d)(a) Colors showlog10 of the viscosity fieldNote that there are more than five orders of magnitude variation in viscosity Arrows show theflow direction and magnitude (the slab is being subducted at a rate of 6 cmyear) Upwellingis evident in the flow field near the wedge corner(b) Temperature field from the variableviscosity simulation 1100C isotherm is shown as a dashed line(c) (Constant) viscosity andflow field from the isoviscous simulation Strong flow is predicted at the base of the crustdespite the low-temperature rock there No upwelling flow is predicted(d) Temperature fieldfrom isoviscous simulation Note that the mantle wedge corner is much colder than in (b)

mechanism that had been proposed to explain these observations Furthermore itshows a quantitative agreement with trends in the observed data [17] As with ourmodels of subduction the key to demonstrating the validity of the hypothesized dy-namics was simulating them with the strongly nonlinear rheology that characterizesmantle rock on geologic timescales The computational tools provided by PETSc en-abled us easily handle this rheology reducing development time and allowing us tofocus on model interpretation

References

1 PETSc SNES Example 30 httpwwwmcsanlgovpetscpetsc-2

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 13: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

Developing a Geodynamics Simulator with PETSc 13

subroutine FormFunctionPETSc(uFuserierr)implicit none

Vec uFinteger ierrtype (userctx) user

double precisionpointer lu v()lf v()Vec uLocal

call DAGetLocalVector(userdauLocalierr)call DAGlobalToLocalBegin(userdauINSERT VALUESuLocalierr)call DAGlobalToLocalEnd(userdauINSERT VALUESuLocalierr)

call VecGetArrayF90(uLocallu vierr)call VecGetArrayF90(Flf vierr)

Actually compute the local portion of the residualcall FormFunctionLocal(lu vlf vuserierr)

call VecRestoreArrayF90(uLocallu vierr)call VecRestoreArrayF90(Flf vierr)

call DARestoreLocalVector(userdauLocalierr)

Fig 12Parallel residual calculation for the Bratu problem using PETSc

u(usergxsusergxeusergysusergye) which now includes ghostvalues

The changes toFormJacobian() are similar resizing the input array andchanging slightly the calculation of row and column indices and can be found inthe PETSc example source We have now finished the initial introduction of PETSclinear algebra and the simulation is ready to run in parallel

32 Rapid Evolution

The user who is willing to raise the level of abstraction in a code can start by em-ploying theDA to describe the problem domain and discretization The data interfacemimics a multidimensional array and is thus ideal for finite difference finite volumeand low-order finite element schemes on a logically rectangular grid In fact after thedefinition of aFormFunction() routine the simulation is ready to run becausethe nonlinear solver provides approximate Jacobians automatically as discussed inSection 4

14 Matthew G Knepley et al

DASetLocalFunction(userda (DALocalFunction1) FormFunctionLocal)

Fig 13Using the DA to form a function

int FormFunctionLocal(DALocalInfo infodouble udouble fAppCtx user)double two = 20hxhyhxdhyhydhxscdouble uijuxxuyyint ij

hx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhyuserminusgtlambdahxdhy = hxhyhydhx = hyhx Compute function over the locally owned part of the grid for (j=infominusgtys jltinfominusgtys+infominusgtym j++)

for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++) if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

f[j][i] = u[j][i] else

uij = u[j][i]uxx = (twouij minus u[j][iminus1] minus u[j][i+1])hydhxuyy = (twouij minus u[jminus1][i] minus u[j+1][i])hxdhyf[j][i] = uxx + uyy minus scexp(uij)

Fig 14FormFunctionLocal() for the Bratu problem

We begin by examining the Bratu problem from the last section only this time inC We can now provide our local residual routine to theDA see Fig 13

The grid information is passed toFormFunctionLocal() in a DALocalInfo structure as shown in Fig 14

The fields are passed directly as multidimensional C arrays complete with ghostvalues An application scientist can use a boilerplate example such as the Bratu prob-lem merely altering the local physics calculation inFormFunctionLocal() torapidly obtain a parallel scalable simulation that runs on the desktop as well as onmassively parallel supercomputers

If the user has an expression for the Jacobian of the residual function then thismatrix can also be computed by using theDA interface with the call as in Fig 15and the corresponding routine for the local calculation in Fig 16

Developing a Geodynamics Simulator with PETSc 15

DASetLocalJacobian(userda (DALocalFunction1) FormJacobianLocal)

Fig 15Using the DA to form a Jacobian

int FormJacobianLocal(DALocalInfo infodouble xMat jacAppCtx user)MatStencil col[5]rowdouble lambdav[5]hxhyhxdhyhydhxscint ij

lambda = userminusgtparamhx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhylambdahxdhy = hxhy hydhx = hyhx

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

rowj = j rowi = i boundary points if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

v[0] = 10MatSetValuesStencil(jac1amprow1amprowvINSERT VALUES)

else interior grid points

v[0] = minushxdhy col[0]j = j minus 1 col[0]i = iv[1] = minushydhx col[1]j = j col[1]i = iminus1v[2] = 20(hydhx + hxdhy) minus scexp(x[j][i]) col[2]j = j col[2]i = iv[3] = minushydhx col[3]j = j col[3]i = i+1v[4] = minushxdhy col[4]j = j + 1 col[4]i = iMatSetValuesStencil(jac1amprow5colvINSERT VALUES)

MatAssemblyBegin(jacMAT FINAL ASSEMBLY)MatAssemblyEnd(jacMAT FINAL ASSEMBLY)MatSetOption(jacMAT NEW NONZERO LOCATION ERR)

Fig 16FormJacobianLocal() for the Bratu problem

16 Matthew G Knepley et al

int FormFunctionLocal(DALocalInfo infoField xField fvoid ptr)AppCtx user = (AppCtx)ptrParameter param = userminusgtparamGridInfo grid = userminusgtgriddouble mag w mag uint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1 i j

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

calculateXMomentumResidual(i j x f user)calculateZMomentumResidual(i j x f user)calculatePressureResidual(i j x f user)calculateTemperatureResidual(i j x f user)

Fig 17FormFunctionLocal() for the mantle subduction problem

We have so far presented simple examples involving a perturbed Laplacian butthis development strategy extends far beyond toy problems We have developedlarge-scale parallel geophysical simulations [1] that are producing new results inthe field We began by replacingFormFunctionLocal() in the Bratu examplewith one containing the linear physics of the previous sequential linear subductioncode (dropping the previous codersquos linear solve) Then the PETSc solver results werecompared for correctness with the complete prior subduction code Next the newcode was immediately run in parallel for correctness and efficiency tests Finallythe nonlinear terms from the more complicated physics of the mantle subductionproblem discussed in Section 1 and shown in Figs 17-21 and better quality dis-cretizations of the boundary conditions were added In fact since the structure oftheFormFunction() changed significantly the finalFormFunction() looksnothing like the original but the process of obtaining through the incremental ap-proach reduced the learning curve for PETSc and make correctness checking at thevarious stages much easier

The details of the subduction code are not as important as the recognition thata very complicated physics problem need not involve any more complication in oursimulation infrastructure To give more insight into the actual calculation we showin Fig 22 the explicit calculation of the residual from the continuity equation

Here we have used theDA for a multicomponent problem on a staggered meshwithout alteration because the structure of the discretization remains logically rect-angular

Although theDA manages communication during the solution process data post-processing often necessary for visualization or analysis also requires this function-ality For the mantle subduction simulation we want to calculate both the second in-

Developing a Geodynamics Simulator with PETSc 17

int calculateXMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj) f[j][i]u = x[j][i]u minus SlabVel(rsquoUrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]u = x[j][i]u minus 00

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else

f[j][i]u = XNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]u = XMomentumResidual(xijuser)

else in the mantle wedge f[j][i]u = XMomentumResidual(xijuser)

Fig 18X-momentum residual for the mantle subduction problem

variant of the strain rate tensor and the viscosity over the grid (at both cell centers andcorners) Using theDAGlobalToLocalBegin () andDAGlobalToLocalEnd() functions we transferred the global solution data to local ghosted vectors and pro-ceeded with the calculation We can also transfer data from local vectors to a globalvector usingDALocalToGlobal () The entire viscosity calculation is given inFig 23

Note that this framework allows the user to manage arbitrary fields over the do-main For instance a porous flow simulation might manage material properties ofthe medium in this fashion

4 Solvers

The SNES object in PETSc is an abstraction of the ldquoinverserdquo of a nonlinear oper-ator The user provides aFormFunction() routine as seen in Section 3 whichcomputes the action of the operator on an input vector Linear problems can also be

18 Matthew G Knepley et al

int calculateZMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1jlim = infominusgtmyminus1

if (ilt=j) f[j][i]w = x[j][i]w minus SlabVel(rsquoWrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]w = x[j][i]w minus 00

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else

f[j][i]w = ZNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]w = ZMomentumResidual(xijuser)

else in the mantle wedge f[j][i]w = ZMomentumResidual(xijuser)

Fig 19Z-momentum residual for the mantle subduction problem

int calculatePressureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj | | jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lid or slab f[j][i]p = x[j][i]p

else if ((i==ilim | | j==jlim ) ampamp paramminusgtibound==BC ANALYTIC ) on an analytic boundary f[j][i]p = x[j][i]p minus Pressure(ijuser)

else in the mantle wedge f[j][i]p = ContinuityResidual(xijuser)

Fig 20Pressure or continuity residual for the mantle subduction problem

Developing a Geodynamics Simulator with PETSc 19

int calculateTemperatureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (j==0) on the surface f[j][i]T = x[j][i]T + x[j+1][i]T + PetscMax(x[j][i]T00)

else if (i==0) slab inflow boundary f[j][i]T = x[j][i]T minus PlateModel(jPLATE SLABuser)

else if (i==ilim ) right side boundary mag u = 10 minus pow( (10minusPetscMax(PetscMin(x[j][iminus1]uparamminusgtcb10)00)) 50 )f[j][i]T = x[j][i]T minus mag ux[jminus1][iminus1]T minus (10minusmag u)PlateModel(jPLATE LID user)

else if (j==jlim ) bottom boundary mag w = 10 minus pow( (10minusPetscMax(PetscMin(x[jminus1][i]wparamminusgtsb10)00)) 50 )f[j][i]T = x[j][i]T minus mag wx[jminus1][iminus1]T minus (10minusmag w)

else in the mantle wedge f[j][i]T = EnergyResidual(xijuser)

Fig 21Temperature or energy residual for the mantle subduction problem

double precision ContinuityResidual(Field x int i int j AppCtx user)

GridInfo grid = userminusgtgriddouble precision uEuWwNwSdudxdwdz

uW = x[j][iminus1]u uE = x[j][i]u dudx = (uE minus uW)gridminusgtdxwS = x[jminus1][i]w wN = x[j][i]w dwdz = (wN minus wS)gridminusgtdzreturn dudx + dwdz

Fig 22Calculation of the continuity residual for the mantle subduction problem

solved by usingSNES with no loss of efficiency compared to using the underlyingKSP object (PETSc linear solver object) directly ThusSNES seems the appropri-ate framework for a general-purpose simulator providing the flexibility to add orsubtract nonlinear terms from the equation at will

TheSNES solver does not require a user to implement a Jacobian Default rou-tines are provided to compute a finite difference approximation using coloring toaccount for the sparsity of the matrix The user does not even need the PETScMatinterface but can initially interact only withVec objects making the transition to

PETSc nearly painless However these approximations to the Jacobian can have nu-merical difficulties and are not as efficient as direct evaluation Therefore the user

20 Matthew G Knepley et al

Compute both the second invariant of the strain rate tensor and the viscosity int ViscosityField(DA da Vec X Vec VAppCtx user)

Parameter param = userminusgtparamGridInfo grid = userminusgtgridVec localXField v xdouble eps dx dz T epsC TCint ijisjsimjmilim jlim ivt

ivt = paramminusgtiviscparamminusgtivisc = paramminusgtoutput ivisc

DACreateLocalVector(da amplocalX)DAGlobalToLocalBegin(da X INSERT VALUES localX)DAGlobalToLocalEnd(da X INSERT VALUES localX)DAVecGetArray(dalocalX(void)ampx)DAVecGetArray(daV(void)ampv)

Parameters dx = gridminusgtdx dz = gridminusgtdzilim = gridminusgtniminus1 jlim = gridminusgtnjminus1

Compute real temperature strain rate and viscosity DAGetCorners(daampisampjsPETSCNULL ampimampjmPETSCNULL )for (j=js jltjs+jm j++)

for (i=is iltis+im i++) T = paramminusgtpotentialT x[j][i]T exp( (jminus05)dzparamminusgtz scale )if (iltilim ampamp jltjlim )

TC = paramminusgtpotentialT TInterp(xij) exp( jdzparamminusgtz scale ) else

TC = T Compute the values at both cell centers and cell corners eps = CalcSecInv(xijCELL CENTERuser)epsC = CalcSecInv(xijCELL CORNERuser)v[j][i]u = epsv[j][i]w = epsCv[j][i]p = Viscosity(Tepsdz(jminus05)param)v[j][i]T = Viscosity(TCepsCdzjparam)

DAVecRestoreArray(daV(void)ampv)DAVecRestoreArray(dalocalX(void)ampx)paramminusgtivisc = ivt

Fig 23 Calculation of the second invariant of the strain tensor and viscosity field for themantle subduction problem

Developing a Geodynamics Simulator with PETSc 21

KSP ksp subkspPC bjacobi ilu

SNESGetKSP(snes ampksp)KSPGetPC(ksp ampbjacobi)PCBJacobiGetSubKSP(bjacobi PETSCNULL PETSCNULL ampsubksp)KSPGetPC(subksp[0] ampilu)PCILUSetLevels(ilu levels)

Fig 24Customizing the preconditioner on a single process

has the option of providing a routine or of using the ADIC [15] or similar system forautomatic differentiation The finite difference approximation can be activated byusing the-snes fd and-mat fd coloring freq options or by providing theSNESDefaultComputeJacobianColor () function toSNESSetJacobian()

Through theSNES object the user may also access the great range of directand iterative linear solvers and preconditioners provided by PETSc Sixteen differentKrylov solvers are available including GMRES BiCGStab and LSQR along withinterfaces to popular sparse direct packages such as MUMPS [4] and SuperLU [9]In addition to a variety of incomplete factorization preconditioners including PI-LUT [16] PETSc supports additive Schwartz preconditioning algebraic multigridthrough BoomerAMG [13] and the geometric multigrid discussed below The fullpanoply of solvers and preconditioners available is catalogued on the PETSc Web-site [3]

The modularity of PETSc allows users to easily customize each solver For in-stance suppose the user wishes to increase the number of levels in ILU(k) precon-ditioning on one block of a block-Jacobi scheme The code fragment in Fig 24 willset the number of levels of fill tolevels

PETSc provides an elegant framework for managing geometric multigrid in com-bination with aDA which is abstracted in theDMMGobject In the same way thatany linear solve can be performed withSNES any preconditioner or solver com-bination is available inDMMGby using a single level That is the same user codesupports both geometric multigrid as well as direct methods Krylov methods etcWhen creating aDMMG the user specifies the number of levels of grid refinementand provides the coarse-grid information (see Fig 25) which is always aDA objectat present although unstructured prototypes are being developed Then theDMMGobject calculates the intergrid transfer operators prolongation and restriction and

allocates objects to hold the solution at each level The user must provide the actionof the residual operatorF and can optionally provide a routine to compute the Jaco-bian matrix (which will be called on each level) and a routine to compute the initialguess as in Fig 26

With this information the user can now solve the equation and retrieve the solu-tion vector as in Fig 27

22 Matthew G Knepley et al

DMMGCreate(comm gridmglevels user ampdmmg)DMMGSetDM(dmmg (DM) da)

Fig 25Creating a DMMG

DMMGSetSNESLocal(dmmg FormFunctionLocal FormJacobianLocal 0 0)DMMGSetInitialGuess(dmmg FormInitialGuess)

Fig 26Providing discretization and assembly routines to DMMG

DMMGSolve(dmmg)soln = DMMGGetx(dmmg)

Fig 27Solving the problem with DMMG

5 Extensions

PETSc is not a complete environment for simulating physical phenomena Ratherit is a set of tools that allow the user to assemble such an environment tailored toa specific application As such PETSc will never provide every facility appropriatefor a given simulation However the user can easily extend PETSc and supplementits capabilities We present two examples in the context of the mantle subductionsimulation

51 Simple Continuation

Because of the highly nonlinear dependence of viscosity on temperature and strainrate in equation (1) the iteration in Newtonrsquos method can fail to converge withouta good starting guess of the solution On the other hand for the isoviscous casewhere temperature is coupled to velocity only through advection a solution is easilyreached It is therefore natural to propose a continuation scheme in the viscosity be-ginning with constant viscosity and progressing to full variability A simple adaptivecontinuation model was devised in which the viscosity was raised to a power be-tween zero and oneη rarr ηα whereα=0 corresponds to constant viscosity andα=1corresponds to natural viscosity variation In the continuation loopα was increasedtowards unity by an amount dependent on the rate of convergence of the previousSNES solve

PETSc itself provides no special support for continuation but it is sufficientlymodular that a continuation loop was readily constructed in the application codeusing repeated calls toDMMGSolve() and found to work quite well

Developing a Geodynamics Simulator with PETSc 23

52 Simple Steering

No rigorous justification had been given for the continuation strategy above andconsequently it was possible that for certain initial conditions Newton would fail toconverge thereby resulting in extremely long run times An observant user could de-tect this situation and abort the run but all the potentially useful solution informationup to that point would be lost A clean way to asynchronously abort the continua-tion loop was thus needed On architectures where OS signals are available PETScprovides an interface for registering signal handlers Thus we were able to define ahandler The user wishing to change the control flow of the simulation simply sendsthe appropriate signal to the process This sets a flag in the user data that causes thechange at the next iteration

6 Simulation Results

A long-standing debate has existed between theorists and observationalists over thethermal structure of subduction zones Modelers using constant viscosity flow sim-ulations to compute thermal structure have predicted relatively cold subductionzones [19] Conversely observationalists who use heat flow measurements and min-eralogical thermobarometry to estimate temperatures at depth have long claimedthat subduction zones are hotter than predicted They have invoked the presence ofstrong upwelling of hot mantle material to supply the heat This upwelling was neverpredicted by isoviscous flow models however

The debate remained unresolved until recently when several groups includingour own succeeded in developing simulations with realistically variable viscosity(for other examples [810121821]) A comparison of flow and temperature fieldsfor variable and constant viscosity simulations generated with our code is shownin Fig 28 Results from these simulations are exciting because they close the gapbetween models and observations they predict hotter mantle temperatures steepersurface thermal gradients and upwelling mantle flow

Furthermore these simulations allow for quantitative predictions of variation ofobservables with subduction parameters A recent study of subduction zone earth-quakes has identified an intriguing trend the vertical distance from subduction zonevolcanoes to the surface of the subducting slab is anti-correlated with descent rate ofthe slab [11] Preliminary calculations with our model are consistent with this trendindicating that flow and thermal structure may play an important role in determiningnot only the quantity and chemistry of magmas but also their path of transport to thesurface Further work is required to resolve this issue

Simulating the geodynamics of subduction is an example of our success in port-ing an existing code into the PETSc framework simultaneously parallelizing thecode and increasing its functionality Using this code as a template we rapidly de-veloped a related simulation of another tectonic boundary the mid-ocean ridge Thiswork was done to address a set of observations of systematic morphological asym-metry in the global mid-ocean ridge system [7] Our model confirms the qualitative

24 Matthew G Knepley et al

Dep

th k

m

log10

η and flow field

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24 Temperature degC

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Distance km

Dep

th k

m

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24

Distance km

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Isov

isco

usV

aria

ble

Vis

cosi

ty

(a) (b)

(c) (d)

Fig 282D viscosity and potential temperature fields from simulations on 8 processors with230112 degrees of freedom Panels in the top row are from a simulation withα=1 in equation(1) Panels in bottom row haveα=0 The white box in panels (a) and (c) shows the region inwhich temperature is plotted in panels (b) and (d)(a) Colors showlog10 of the viscosity fieldNote that there are more than five orders of magnitude variation in viscosity Arrows show theflow direction and magnitude (the slab is being subducted at a rate of 6 cmyear) Upwellingis evident in the flow field near the wedge corner(b) Temperature field from the variableviscosity simulation 1100C isotherm is shown as a dashed line(c) (Constant) viscosity andflow field from the isoviscous simulation Strong flow is predicted at the base of the crustdespite the low-temperature rock there No upwelling flow is predicted(d) Temperature fieldfrom isoviscous simulation Note that the mantle wedge corner is much colder than in (b)

mechanism that had been proposed to explain these observations Furthermore itshows a quantitative agreement with trends in the observed data [17] As with ourmodels of subduction the key to demonstrating the validity of the hypothesized dy-namics was simulating them with the strongly nonlinear rheology that characterizesmantle rock on geologic timescales The computational tools provided by PETSc en-abled us easily handle this rheology reducing development time and allowing us tofocus on model interpretation

References

1 PETSc SNES Example 30 httpwwwmcsanlgovpetscpetsc-2

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 14: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

14 Matthew G Knepley et al

DASetLocalFunction(userda (DALocalFunction1) FormFunctionLocal)

Fig 13Using the DA to form a function

int FormFunctionLocal(DALocalInfo infodouble udouble fAppCtx user)double two = 20hxhyhxdhyhydhxscdouble uijuxxuyyint ij

hx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhyuserminusgtlambdahxdhy = hxhyhydhx = hyhx Compute function over the locally owned part of the grid for (j=infominusgtys jltinfominusgtys+infominusgtym j++)

for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++) if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

f[j][i] = u[j][i] else

uij = u[j][i]uxx = (twouij minus u[j][iminus1] minus u[j][i+1])hydhxuyy = (twouij minus u[jminus1][i] minus u[j+1][i])hxdhyf[j][i] = uxx + uyy minus scexp(uij)

Fig 14FormFunctionLocal() for the Bratu problem

We begin by examining the Bratu problem from the last section only this time inC We can now provide our local residual routine to theDA see Fig 13

The grid information is passed toFormFunctionLocal() in a DALocalInfo structure as shown in Fig 14

The fields are passed directly as multidimensional C arrays complete with ghostvalues An application scientist can use a boilerplate example such as the Bratu prob-lem merely altering the local physics calculation inFormFunctionLocal() torapidly obtain a parallel scalable simulation that runs on the desktop as well as onmassively parallel supercomputers

If the user has an expression for the Jacobian of the residual function then thismatrix can also be computed by using theDA interface with the call as in Fig 15and the corresponding routine for the local calculation in Fig 16

Developing a Geodynamics Simulator with PETSc 15

DASetLocalJacobian(userda (DALocalFunction1) FormJacobianLocal)

Fig 15Using the DA to form a Jacobian

int FormJacobianLocal(DALocalInfo infodouble xMat jacAppCtx user)MatStencil col[5]rowdouble lambdav[5]hxhyhxdhyhydhxscint ij

lambda = userminusgtparamhx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhylambdahxdhy = hxhy hydhx = hyhx

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

rowj = j rowi = i boundary points if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

v[0] = 10MatSetValuesStencil(jac1amprow1amprowvINSERT VALUES)

else interior grid points

v[0] = minushxdhy col[0]j = j minus 1 col[0]i = iv[1] = minushydhx col[1]j = j col[1]i = iminus1v[2] = 20(hydhx + hxdhy) minus scexp(x[j][i]) col[2]j = j col[2]i = iv[3] = minushydhx col[3]j = j col[3]i = i+1v[4] = minushxdhy col[4]j = j + 1 col[4]i = iMatSetValuesStencil(jac1amprow5colvINSERT VALUES)

MatAssemblyBegin(jacMAT FINAL ASSEMBLY)MatAssemblyEnd(jacMAT FINAL ASSEMBLY)MatSetOption(jacMAT NEW NONZERO LOCATION ERR)

Fig 16FormJacobianLocal() for the Bratu problem

16 Matthew G Knepley et al

int FormFunctionLocal(DALocalInfo infoField xField fvoid ptr)AppCtx user = (AppCtx)ptrParameter param = userminusgtparamGridInfo grid = userminusgtgriddouble mag w mag uint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1 i j

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

calculateXMomentumResidual(i j x f user)calculateZMomentumResidual(i j x f user)calculatePressureResidual(i j x f user)calculateTemperatureResidual(i j x f user)

Fig 17FormFunctionLocal() for the mantle subduction problem

We have so far presented simple examples involving a perturbed Laplacian butthis development strategy extends far beyond toy problems We have developedlarge-scale parallel geophysical simulations [1] that are producing new results inthe field We began by replacingFormFunctionLocal() in the Bratu examplewith one containing the linear physics of the previous sequential linear subductioncode (dropping the previous codersquos linear solve) Then the PETSc solver results werecompared for correctness with the complete prior subduction code Next the newcode was immediately run in parallel for correctness and efficiency tests Finallythe nonlinear terms from the more complicated physics of the mantle subductionproblem discussed in Section 1 and shown in Figs 17-21 and better quality dis-cretizations of the boundary conditions were added In fact since the structure oftheFormFunction() changed significantly the finalFormFunction() looksnothing like the original but the process of obtaining through the incremental ap-proach reduced the learning curve for PETSc and make correctness checking at thevarious stages much easier

The details of the subduction code are not as important as the recognition thata very complicated physics problem need not involve any more complication in oursimulation infrastructure To give more insight into the actual calculation we showin Fig 22 the explicit calculation of the residual from the continuity equation

Here we have used theDA for a multicomponent problem on a staggered meshwithout alteration because the structure of the discretization remains logically rect-angular

Although theDA manages communication during the solution process data post-processing often necessary for visualization or analysis also requires this function-ality For the mantle subduction simulation we want to calculate both the second in-

Developing a Geodynamics Simulator with PETSc 17

int calculateXMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj) f[j][i]u = x[j][i]u minus SlabVel(rsquoUrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]u = x[j][i]u minus 00

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else

f[j][i]u = XNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]u = XMomentumResidual(xijuser)

else in the mantle wedge f[j][i]u = XMomentumResidual(xijuser)

Fig 18X-momentum residual for the mantle subduction problem

variant of the strain rate tensor and the viscosity over the grid (at both cell centers andcorners) Using theDAGlobalToLocalBegin () andDAGlobalToLocalEnd() functions we transferred the global solution data to local ghosted vectors and pro-ceeded with the calculation We can also transfer data from local vectors to a globalvector usingDALocalToGlobal () The entire viscosity calculation is given inFig 23

Note that this framework allows the user to manage arbitrary fields over the do-main For instance a porous flow simulation might manage material properties ofthe medium in this fashion

4 Solvers

The SNES object in PETSc is an abstraction of the ldquoinverserdquo of a nonlinear oper-ator The user provides aFormFunction() routine as seen in Section 3 whichcomputes the action of the operator on an input vector Linear problems can also be

18 Matthew G Knepley et al

int calculateZMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1jlim = infominusgtmyminus1

if (ilt=j) f[j][i]w = x[j][i]w minus SlabVel(rsquoWrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]w = x[j][i]w minus 00

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else

f[j][i]w = ZNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]w = ZMomentumResidual(xijuser)

else in the mantle wedge f[j][i]w = ZMomentumResidual(xijuser)

Fig 19Z-momentum residual for the mantle subduction problem

int calculatePressureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj | | jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lid or slab f[j][i]p = x[j][i]p

else if ((i==ilim | | j==jlim ) ampamp paramminusgtibound==BC ANALYTIC ) on an analytic boundary f[j][i]p = x[j][i]p minus Pressure(ijuser)

else in the mantle wedge f[j][i]p = ContinuityResidual(xijuser)

Fig 20Pressure or continuity residual for the mantle subduction problem

Developing a Geodynamics Simulator with PETSc 19

int calculateTemperatureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (j==0) on the surface f[j][i]T = x[j][i]T + x[j+1][i]T + PetscMax(x[j][i]T00)

else if (i==0) slab inflow boundary f[j][i]T = x[j][i]T minus PlateModel(jPLATE SLABuser)

else if (i==ilim ) right side boundary mag u = 10 minus pow( (10minusPetscMax(PetscMin(x[j][iminus1]uparamminusgtcb10)00)) 50 )f[j][i]T = x[j][i]T minus mag ux[jminus1][iminus1]T minus (10minusmag u)PlateModel(jPLATE LID user)

else if (j==jlim ) bottom boundary mag w = 10 minus pow( (10minusPetscMax(PetscMin(x[jminus1][i]wparamminusgtsb10)00)) 50 )f[j][i]T = x[j][i]T minus mag wx[jminus1][iminus1]T minus (10minusmag w)

else in the mantle wedge f[j][i]T = EnergyResidual(xijuser)

Fig 21Temperature or energy residual for the mantle subduction problem

double precision ContinuityResidual(Field x int i int j AppCtx user)

GridInfo grid = userminusgtgriddouble precision uEuWwNwSdudxdwdz

uW = x[j][iminus1]u uE = x[j][i]u dudx = (uE minus uW)gridminusgtdxwS = x[jminus1][i]w wN = x[j][i]w dwdz = (wN minus wS)gridminusgtdzreturn dudx + dwdz

Fig 22Calculation of the continuity residual for the mantle subduction problem

solved by usingSNES with no loss of efficiency compared to using the underlyingKSP object (PETSc linear solver object) directly ThusSNES seems the appropri-ate framework for a general-purpose simulator providing the flexibility to add orsubtract nonlinear terms from the equation at will

TheSNES solver does not require a user to implement a Jacobian Default rou-tines are provided to compute a finite difference approximation using coloring toaccount for the sparsity of the matrix The user does not even need the PETScMatinterface but can initially interact only withVec objects making the transition to

PETSc nearly painless However these approximations to the Jacobian can have nu-merical difficulties and are not as efficient as direct evaluation Therefore the user

20 Matthew G Knepley et al

Compute both the second invariant of the strain rate tensor and the viscosity int ViscosityField(DA da Vec X Vec VAppCtx user)

Parameter param = userminusgtparamGridInfo grid = userminusgtgridVec localXField v xdouble eps dx dz T epsC TCint ijisjsimjmilim jlim ivt

ivt = paramminusgtiviscparamminusgtivisc = paramminusgtoutput ivisc

DACreateLocalVector(da amplocalX)DAGlobalToLocalBegin(da X INSERT VALUES localX)DAGlobalToLocalEnd(da X INSERT VALUES localX)DAVecGetArray(dalocalX(void)ampx)DAVecGetArray(daV(void)ampv)

Parameters dx = gridminusgtdx dz = gridminusgtdzilim = gridminusgtniminus1 jlim = gridminusgtnjminus1

Compute real temperature strain rate and viscosity DAGetCorners(daampisampjsPETSCNULL ampimampjmPETSCNULL )for (j=js jltjs+jm j++)

for (i=is iltis+im i++) T = paramminusgtpotentialT x[j][i]T exp( (jminus05)dzparamminusgtz scale )if (iltilim ampamp jltjlim )

TC = paramminusgtpotentialT TInterp(xij) exp( jdzparamminusgtz scale ) else

TC = T Compute the values at both cell centers and cell corners eps = CalcSecInv(xijCELL CENTERuser)epsC = CalcSecInv(xijCELL CORNERuser)v[j][i]u = epsv[j][i]w = epsCv[j][i]p = Viscosity(Tepsdz(jminus05)param)v[j][i]T = Viscosity(TCepsCdzjparam)

DAVecRestoreArray(daV(void)ampv)DAVecRestoreArray(dalocalX(void)ampx)paramminusgtivisc = ivt

Fig 23 Calculation of the second invariant of the strain tensor and viscosity field for themantle subduction problem

Developing a Geodynamics Simulator with PETSc 21

KSP ksp subkspPC bjacobi ilu

SNESGetKSP(snes ampksp)KSPGetPC(ksp ampbjacobi)PCBJacobiGetSubKSP(bjacobi PETSCNULL PETSCNULL ampsubksp)KSPGetPC(subksp[0] ampilu)PCILUSetLevels(ilu levels)

Fig 24Customizing the preconditioner on a single process

has the option of providing a routine or of using the ADIC [15] or similar system forautomatic differentiation The finite difference approximation can be activated byusing the-snes fd and-mat fd coloring freq options or by providing theSNESDefaultComputeJacobianColor () function toSNESSetJacobian()

Through theSNES object the user may also access the great range of directand iterative linear solvers and preconditioners provided by PETSc Sixteen differentKrylov solvers are available including GMRES BiCGStab and LSQR along withinterfaces to popular sparse direct packages such as MUMPS [4] and SuperLU [9]In addition to a variety of incomplete factorization preconditioners including PI-LUT [16] PETSc supports additive Schwartz preconditioning algebraic multigridthrough BoomerAMG [13] and the geometric multigrid discussed below The fullpanoply of solvers and preconditioners available is catalogued on the PETSc Web-site [3]

The modularity of PETSc allows users to easily customize each solver For in-stance suppose the user wishes to increase the number of levels in ILU(k) precon-ditioning on one block of a block-Jacobi scheme The code fragment in Fig 24 willset the number of levels of fill tolevels

PETSc provides an elegant framework for managing geometric multigrid in com-bination with aDA which is abstracted in theDMMGobject In the same way thatany linear solve can be performed withSNES any preconditioner or solver com-bination is available inDMMGby using a single level That is the same user codesupports both geometric multigrid as well as direct methods Krylov methods etcWhen creating aDMMG the user specifies the number of levels of grid refinementand provides the coarse-grid information (see Fig 25) which is always aDA objectat present although unstructured prototypes are being developed Then theDMMGobject calculates the intergrid transfer operators prolongation and restriction and

allocates objects to hold the solution at each level The user must provide the actionof the residual operatorF and can optionally provide a routine to compute the Jaco-bian matrix (which will be called on each level) and a routine to compute the initialguess as in Fig 26

With this information the user can now solve the equation and retrieve the solu-tion vector as in Fig 27

22 Matthew G Knepley et al

DMMGCreate(comm gridmglevels user ampdmmg)DMMGSetDM(dmmg (DM) da)

Fig 25Creating a DMMG

DMMGSetSNESLocal(dmmg FormFunctionLocal FormJacobianLocal 0 0)DMMGSetInitialGuess(dmmg FormInitialGuess)

Fig 26Providing discretization and assembly routines to DMMG

DMMGSolve(dmmg)soln = DMMGGetx(dmmg)

Fig 27Solving the problem with DMMG

5 Extensions

PETSc is not a complete environment for simulating physical phenomena Ratherit is a set of tools that allow the user to assemble such an environment tailored toa specific application As such PETSc will never provide every facility appropriatefor a given simulation However the user can easily extend PETSc and supplementits capabilities We present two examples in the context of the mantle subductionsimulation

51 Simple Continuation

Because of the highly nonlinear dependence of viscosity on temperature and strainrate in equation (1) the iteration in Newtonrsquos method can fail to converge withouta good starting guess of the solution On the other hand for the isoviscous casewhere temperature is coupled to velocity only through advection a solution is easilyreached It is therefore natural to propose a continuation scheme in the viscosity be-ginning with constant viscosity and progressing to full variability A simple adaptivecontinuation model was devised in which the viscosity was raised to a power be-tween zero and oneη rarr ηα whereα=0 corresponds to constant viscosity andα=1corresponds to natural viscosity variation In the continuation loopα was increasedtowards unity by an amount dependent on the rate of convergence of the previousSNES solve

PETSc itself provides no special support for continuation but it is sufficientlymodular that a continuation loop was readily constructed in the application codeusing repeated calls toDMMGSolve() and found to work quite well

Developing a Geodynamics Simulator with PETSc 23

52 Simple Steering

No rigorous justification had been given for the continuation strategy above andconsequently it was possible that for certain initial conditions Newton would fail toconverge thereby resulting in extremely long run times An observant user could de-tect this situation and abort the run but all the potentially useful solution informationup to that point would be lost A clean way to asynchronously abort the continua-tion loop was thus needed On architectures where OS signals are available PETScprovides an interface for registering signal handlers Thus we were able to define ahandler The user wishing to change the control flow of the simulation simply sendsthe appropriate signal to the process This sets a flag in the user data that causes thechange at the next iteration

6 Simulation Results

A long-standing debate has existed between theorists and observationalists over thethermal structure of subduction zones Modelers using constant viscosity flow sim-ulations to compute thermal structure have predicted relatively cold subductionzones [19] Conversely observationalists who use heat flow measurements and min-eralogical thermobarometry to estimate temperatures at depth have long claimedthat subduction zones are hotter than predicted They have invoked the presence ofstrong upwelling of hot mantle material to supply the heat This upwelling was neverpredicted by isoviscous flow models however

The debate remained unresolved until recently when several groups includingour own succeeded in developing simulations with realistically variable viscosity(for other examples [810121821]) A comparison of flow and temperature fieldsfor variable and constant viscosity simulations generated with our code is shownin Fig 28 Results from these simulations are exciting because they close the gapbetween models and observations they predict hotter mantle temperatures steepersurface thermal gradients and upwelling mantle flow

Furthermore these simulations allow for quantitative predictions of variation ofobservables with subduction parameters A recent study of subduction zone earth-quakes has identified an intriguing trend the vertical distance from subduction zonevolcanoes to the surface of the subducting slab is anti-correlated with descent rate ofthe slab [11] Preliminary calculations with our model are consistent with this trendindicating that flow and thermal structure may play an important role in determiningnot only the quantity and chemistry of magmas but also their path of transport to thesurface Further work is required to resolve this issue

Simulating the geodynamics of subduction is an example of our success in port-ing an existing code into the PETSc framework simultaneously parallelizing thecode and increasing its functionality Using this code as a template we rapidly de-veloped a related simulation of another tectonic boundary the mid-ocean ridge Thiswork was done to address a set of observations of systematic morphological asym-metry in the global mid-ocean ridge system [7] Our model confirms the qualitative

24 Matthew G Knepley et al

Dep

th k

m

log10

η and flow field

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24 Temperature degC

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Distance km

Dep

th k

m

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24

Distance km

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Isov

isco

usV

aria

ble

Vis

cosi

ty

(a) (b)

(c) (d)

Fig 282D viscosity and potential temperature fields from simulations on 8 processors with230112 degrees of freedom Panels in the top row are from a simulation withα=1 in equation(1) Panels in bottom row haveα=0 The white box in panels (a) and (c) shows the region inwhich temperature is plotted in panels (b) and (d)(a) Colors showlog10 of the viscosity fieldNote that there are more than five orders of magnitude variation in viscosity Arrows show theflow direction and magnitude (the slab is being subducted at a rate of 6 cmyear) Upwellingis evident in the flow field near the wedge corner(b) Temperature field from the variableviscosity simulation 1100C isotherm is shown as a dashed line(c) (Constant) viscosity andflow field from the isoviscous simulation Strong flow is predicted at the base of the crustdespite the low-temperature rock there No upwelling flow is predicted(d) Temperature fieldfrom isoviscous simulation Note that the mantle wedge corner is much colder than in (b)

mechanism that had been proposed to explain these observations Furthermore itshows a quantitative agreement with trends in the observed data [17] As with ourmodels of subduction the key to demonstrating the validity of the hypothesized dy-namics was simulating them with the strongly nonlinear rheology that characterizesmantle rock on geologic timescales The computational tools provided by PETSc en-abled us easily handle this rheology reducing development time and allowing us tofocus on model interpretation

References

1 PETSc SNES Example 30 httpwwwmcsanlgovpetscpetsc-2

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 15: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

Developing a Geodynamics Simulator with PETSc 15

DASetLocalJacobian(userda (DALocalFunction1) FormJacobianLocal)

Fig 15Using the DA to form a Jacobian

int FormJacobianLocal(DALocalInfo infodouble xMat jacAppCtx user)MatStencil col[5]rowdouble lambdav[5]hxhyhxdhyhydhxscint ij

lambda = userminusgtparamhx = 10(double)(infominusgtmxminus1)hy = 10(double)(infominusgtmyminus1)sc = hxhylambdahxdhy = hxhy hydhx = hyhx

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

rowj = j rowi = i boundary points if (i == 0 | | j == 0 | | i == infominusgtmxminus1 | | j == infominusgtmyminus1)

v[0] = 10MatSetValuesStencil(jac1amprow1amprowvINSERT VALUES)

else interior grid points

v[0] = minushxdhy col[0]j = j minus 1 col[0]i = iv[1] = minushydhx col[1]j = j col[1]i = iminus1v[2] = 20(hydhx + hxdhy) minus scexp(x[j][i]) col[2]j = j col[2]i = iv[3] = minushydhx col[3]j = j col[3]i = i+1v[4] = minushxdhy col[4]j = j + 1 col[4]i = iMatSetValuesStencil(jac1amprow5colvINSERT VALUES)

MatAssemblyBegin(jacMAT FINAL ASSEMBLY)MatAssemblyEnd(jacMAT FINAL ASSEMBLY)MatSetOption(jacMAT NEW NONZERO LOCATION ERR)

Fig 16FormJacobianLocal() for the Bratu problem

16 Matthew G Knepley et al

int FormFunctionLocal(DALocalInfo infoField xField fvoid ptr)AppCtx user = (AppCtx)ptrParameter param = userminusgtparamGridInfo grid = userminusgtgriddouble mag w mag uint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1 i j

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

calculateXMomentumResidual(i j x f user)calculateZMomentumResidual(i j x f user)calculatePressureResidual(i j x f user)calculateTemperatureResidual(i j x f user)

Fig 17FormFunctionLocal() for the mantle subduction problem

We have so far presented simple examples involving a perturbed Laplacian butthis development strategy extends far beyond toy problems We have developedlarge-scale parallel geophysical simulations [1] that are producing new results inthe field We began by replacingFormFunctionLocal() in the Bratu examplewith one containing the linear physics of the previous sequential linear subductioncode (dropping the previous codersquos linear solve) Then the PETSc solver results werecompared for correctness with the complete prior subduction code Next the newcode was immediately run in parallel for correctness and efficiency tests Finallythe nonlinear terms from the more complicated physics of the mantle subductionproblem discussed in Section 1 and shown in Figs 17-21 and better quality dis-cretizations of the boundary conditions were added In fact since the structure oftheFormFunction() changed significantly the finalFormFunction() looksnothing like the original but the process of obtaining through the incremental ap-proach reduced the learning curve for PETSc and make correctness checking at thevarious stages much easier

The details of the subduction code are not as important as the recognition thata very complicated physics problem need not involve any more complication in oursimulation infrastructure To give more insight into the actual calculation we showin Fig 22 the explicit calculation of the residual from the continuity equation

Here we have used theDA for a multicomponent problem on a staggered meshwithout alteration because the structure of the discretization remains logically rect-angular

Although theDA manages communication during the solution process data post-processing often necessary for visualization or analysis also requires this function-ality For the mantle subduction simulation we want to calculate both the second in-

Developing a Geodynamics Simulator with PETSc 17

int calculateXMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj) f[j][i]u = x[j][i]u minus SlabVel(rsquoUrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]u = x[j][i]u minus 00

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else

f[j][i]u = XNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]u = XMomentumResidual(xijuser)

else in the mantle wedge f[j][i]u = XMomentumResidual(xijuser)

Fig 18X-momentum residual for the mantle subduction problem

variant of the strain rate tensor and the viscosity over the grid (at both cell centers andcorners) Using theDAGlobalToLocalBegin () andDAGlobalToLocalEnd() functions we transferred the global solution data to local ghosted vectors and pro-ceeded with the calculation We can also transfer data from local vectors to a globalvector usingDALocalToGlobal () The entire viscosity calculation is given inFig 23

Note that this framework allows the user to manage arbitrary fields over the do-main For instance a porous flow simulation might manage material properties ofthe medium in this fashion

4 Solvers

The SNES object in PETSc is an abstraction of the ldquoinverserdquo of a nonlinear oper-ator The user provides aFormFunction() routine as seen in Section 3 whichcomputes the action of the operator on an input vector Linear problems can also be

18 Matthew G Knepley et al

int calculateZMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1jlim = infominusgtmyminus1

if (ilt=j) f[j][i]w = x[j][i]w minus SlabVel(rsquoWrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]w = x[j][i]w minus 00

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else

f[j][i]w = ZNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]w = ZMomentumResidual(xijuser)

else in the mantle wedge f[j][i]w = ZMomentumResidual(xijuser)

Fig 19Z-momentum residual for the mantle subduction problem

int calculatePressureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj | | jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lid or slab f[j][i]p = x[j][i]p

else if ((i==ilim | | j==jlim ) ampamp paramminusgtibound==BC ANALYTIC ) on an analytic boundary f[j][i]p = x[j][i]p minus Pressure(ijuser)

else in the mantle wedge f[j][i]p = ContinuityResidual(xijuser)

Fig 20Pressure or continuity residual for the mantle subduction problem

Developing a Geodynamics Simulator with PETSc 19

int calculateTemperatureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (j==0) on the surface f[j][i]T = x[j][i]T + x[j+1][i]T + PetscMax(x[j][i]T00)

else if (i==0) slab inflow boundary f[j][i]T = x[j][i]T minus PlateModel(jPLATE SLABuser)

else if (i==ilim ) right side boundary mag u = 10 minus pow( (10minusPetscMax(PetscMin(x[j][iminus1]uparamminusgtcb10)00)) 50 )f[j][i]T = x[j][i]T minus mag ux[jminus1][iminus1]T minus (10minusmag u)PlateModel(jPLATE LID user)

else if (j==jlim ) bottom boundary mag w = 10 minus pow( (10minusPetscMax(PetscMin(x[jminus1][i]wparamminusgtsb10)00)) 50 )f[j][i]T = x[j][i]T minus mag wx[jminus1][iminus1]T minus (10minusmag w)

else in the mantle wedge f[j][i]T = EnergyResidual(xijuser)

Fig 21Temperature or energy residual for the mantle subduction problem

double precision ContinuityResidual(Field x int i int j AppCtx user)

GridInfo grid = userminusgtgriddouble precision uEuWwNwSdudxdwdz

uW = x[j][iminus1]u uE = x[j][i]u dudx = (uE minus uW)gridminusgtdxwS = x[jminus1][i]w wN = x[j][i]w dwdz = (wN minus wS)gridminusgtdzreturn dudx + dwdz

Fig 22Calculation of the continuity residual for the mantle subduction problem

solved by usingSNES with no loss of efficiency compared to using the underlyingKSP object (PETSc linear solver object) directly ThusSNES seems the appropri-ate framework for a general-purpose simulator providing the flexibility to add orsubtract nonlinear terms from the equation at will

TheSNES solver does not require a user to implement a Jacobian Default rou-tines are provided to compute a finite difference approximation using coloring toaccount for the sparsity of the matrix The user does not even need the PETScMatinterface but can initially interact only withVec objects making the transition to

PETSc nearly painless However these approximations to the Jacobian can have nu-merical difficulties and are not as efficient as direct evaluation Therefore the user

20 Matthew G Knepley et al

Compute both the second invariant of the strain rate tensor and the viscosity int ViscosityField(DA da Vec X Vec VAppCtx user)

Parameter param = userminusgtparamGridInfo grid = userminusgtgridVec localXField v xdouble eps dx dz T epsC TCint ijisjsimjmilim jlim ivt

ivt = paramminusgtiviscparamminusgtivisc = paramminusgtoutput ivisc

DACreateLocalVector(da amplocalX)DAGlobalToLocalBegin(da X INSERT VALUES localX)DAGlobalToLocalEnd(da X INSERT VALUES localX)DAVecGetArray(dalocalX(void)ampx)DAVecGetArray(daV(void)ampv)

Parameters dx = gridminusgtdx dz = gridminusgtdzilim = gridminusgtniminus1 jlim = gridminusgtnjminus1

Compute real temperature strain rate and viscosity DAGetCorners(daampisampjsPETSCNULL ampimampjmPETSCNULL )for (j=js jltjs+jm j++)

for (i=is iltis+im i++) T = paramminusgtpotentialT x[j][i]T exp( (jminus05)dzparamminusgtz scale )if (iltilim ampamp jltjlim )

TC = paramminusgtpotentialT TInterp(xij) exp( jdzparamminusgtz scale ) else

TC = T Compute the values at both cell centers and cell corners eps = CalcSecInv(xijCELL CENTERuser)epsC = CalcSecInv(xijCELL CORNERuser)v[j][i]u = epsv[j][i]w = epsCv[j][i]p = Viscosity(Tepsdz(jminus05)param)v[j][i]T = Viscosity(TCepsCdzjparam)

DAVecRestoreArray(daV(void)ampv)DAVecRestoreArray(dalocalX(void)ampx)paramminusgtivisc = ivt

Fig 23 Calculation of the second invariant of the strain tensor and viscosity field for themantle subduction problem

Developing a Geodynamics Simulator with PETSc 21

KSP ksp subkspPC bjacobi ilu

SNESGetKSP(snes ampksp)KSPGetPC(ksp ampbjacobi)PCBJacobiGetSubKSP(bjacobi PETSCNULL PETSCNULL ampsubksp)KSPGetPC(subksp[0] ampilu)PCILUSetLevels(ilu levels)

Fig 24Customizing the preconditioner on a single process

has the option of providing a routine or of using the ADIC [15] or similar system forautomatic differentiation The finite difference approximation can be activated byusing the-snes fd and-mat fd coloring freq options or by providing theSNESDefaultComputeJacobianColor () function toSNESSetJacobian()

Through theSNES object the user may also access the great range of directand iterative linear solvers and preconditioners provided by PETSc Sixteen differentKrylov solvers are available including GMRES BiCGStab and LSQR along withinterfaces to popular sparse direct packages such as MUMPS [4] and SuperLU [9]In addition to a variety of incomplete factorization preconditioners including PI-LUT [16] PETSc supports additive Schwartz preconditioning algebraic multigridthrough BoomerAMG [13] and the geometric multigrid discussed below The fullpanoply of solvers and preconditioners available is catalogued on the PETSc Web-site [3]

The modularity of PETSc allows users to easily customize each solver For in-stance suppose the user wishes to increase the number of levels in ILU(k) precon-ditioning on one block of a block-Jacobi scheme The code fragment in Fig 24 willset the number of levels of fill tolevels

PETSc provides an elegant framework for managing geometric multigrid in com-bination with aDA which is abstracted in theDMMGobject In the same way thatany linear solve can be performed withSNES any preconditioner or solver com-bination is available inDMMGby using a single level That is the same user codesupports both geometric multigrid as well as direct methods Krylov methods etcWhen creating aDMMG the user specifies the number of levels of grid refinementand provides the coarse-grid information (see Fig 25) which is always aDA objectat present although unstructured prototypes are being developed Then theDMMGobject calculates the intergrid transfer operators prolongation and restriction and

allocates objects to hold the solution at each level The user must provide the actionof the residual operatorF and can optionally provide a routine to compute the Jaco-bian matrix (which will be called on each level) and a routine to compute the initialguess as in Fig 26

With this information the user can now solve the equation and retrieve the solu-tion vector as in Fig 27

22 Matthew G Knepley et al

DMMGCreate(comm gridmglevels user ampdmmg)DMMGSetDM(dmmg (DM) da)

Fig 25Creating a DMMG

DMMGSetSNESLocal(dmmg FormFunctionLocal FormJacobianLocal 0 0)DMMGSetInitialGuess(dmmg FormInitialGuess)

Fig 26Providing discretization and assembly routines to DMMG

DMMGSolve(dmmg)soln = DMMGGetx(dmmg)

Fig 27Solving the problem with DMMG

5 Extensions

PETSc is not a complete environment for simulating physical phenomena Ratherit is a set of tools that allow the user to assemble such an environment tailored toa specific application As such PETSc will never provide every facility appropriatefor a given simulation However the user can easily extend PETSc and supplementits capabilities We present two examples in the context of the mantle subductionsimulation

51 Simple Continuation

Because of the highly nonlinear dependence of viscosity on temperature and strainrate in equation (1) the iteration in Newtonrsquos method can fail to converge withouta good starting guess of the solution On the other hand for the isoviscous casewhere temperature is coupled to velocity only through advection a solution is easilyreached It is therefore natural to propose a continuation scheme in the viscosity be-ginning with constant viscosity and progressing to full variability A simple adaptivecontinuation model was devised in which the viscosity was raised to a power be-tween zero and oneη rarr ηα whereα=0 corresponds to constant viscosity andα=1corresponds to natural viscosity variation In the continuation loopα was increasedtowards unity by an amount dependent on the rate of convergence of the previousSNES solve

PETSc itself provides no special support for continuation but it is sufficientlymodular that a continuation loop was readily constructed in the application codeusing repeated calls toDMMGSolve() and found to work quite well

Developing a Geodynamics Simulator with PETSc 23

52 Simple Steering

No rigorous justification had been given for the continuation strategy above andconsequently it was possible that for certain initial conditions Newton would fail toconverge thereby resulting in extremely long run times An observant user could de-tect this situation and abort the run but all the potentially useful solution informationup to that point would be lost A clean way to asynchronously abort the continua-tion loop was thus needed On architectures where OS signals are available PETScprovides an interface for registering signal handlers Thus we were able to define ahandler The user wishing to change the control flow of the simulation simply sendsthe appropriate signal to the process This sets a flag in the user data that causes thechange at the next iteration

6 Simulation Results

A long-standing debate has existed between theorists and observationalists over thethermal structure of subduction zones Modelers using constant viscosity flow sim-ulations to compute thermal structure have predicted relatively cold subductionzones [19] Conversely observationalists who use heat flow measurements and min-eralogical thermobarometry to estimate temperatures at depth have long claimedthat subduction zones are hotter than predicted They have invoked the presence ofstrong upwelling of hot mantle material to supply the heat This upwelling was neverpredicted by isoviscous flow models however

The debate remained unresolved until recently when several groups includingour own succeeded in developing simulations with realistically variable viscosity(for other examples [810121821]) A comparison of flow and temperature fieldsfor variable and constant viscosity simulations generated with our code is shownin Fig 28 Results from these simulations are exciting because they close the gapbetween models and observations they predict hotter mantle temperatures steepersurface thermal gradients and upwelling mantle flow

Furthermore these simulations allow for quantitative predictions of variation ofobservables with subduction parameters A recent study of subduction zone earth-quakes has identified an intriguing trend the vertical distance from subduction zonevolcanoes to the surface of the subducting slab is anti-correlated with descent rate ofthe slab [11] Preliminary calculations with our model are consistent with this trendindicating that flow and thermal structure may play an important role in determiningnot only the quantity and chemistry of magmas but also their path of transport to thesurface Further work is required to resolve this issue

Simulating the geodynamics of subduction is an example of our success in port-ing an existing code into the PETSc framework simultaneously parallelizing thecode and increasing its functionality Using this code as a template we rapidly de-veloped a related simulation of another tectonic boundary the mid-ocean ridge Thiswork was done to address a set of observations of systematic morphological asym-metry in the global mid-ocean ridge system [7] Our model confirms the qualitative

24 Matthew G Knepley et al

Dep

th k

m

log10

η and flow field

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24 Temperature degC

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Distance km

Dep

th k

m

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24

Distance km

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Isov

isco

usV

aria

ble

Vis

cosi

ty

(a) (b)

(c) (d)

Fig 282D viscosity and potential temperature fields from simulations on 8 processors with230112 degrees of freedom Panels in the top row are from a simulation withα=1 in equation(1) Panels in bottom row haveα=0 The white box in panels (a) and (c) shows the region inwhich temperature is plotted in panels (b) and (d)(a) Colors showlog10 of the viscosity fieldNote that there are more than five orders of magnitude variation in viscosity Arrows show theflow direction and magnitude (the slab is being subducted at a rate of 6 cmyear) Upwellingis evident in the flow field near the wedge corner(b) Temperature field from the variableviscosity simulation 1100C isotherm is shown as a dashed line(c) (Constant) viscosity andflow field from the isoviscous simulation Strong flow is predicted at the base of the crustdespite the low-temperature rock there No upwelling flow is predicted(d) Temperature fieldfrom isoviscous simulation Note that the mantle wedge corner is much colder than in (b)

mechanism that had been proposed to explain these observations Furthermore itshows a quantitative agreement with trends in the observed data [17] As with ourmodels of subduction the key to demonstrating the validity of the hypothesized dy-namics was simulating them with the strongly nonlinear rheology that characterizesmantle rock on geologic timescales The computational tools provided by PETSc en-abled us easily handle this rheology reducing development time and allowing us tofocus on model interpretation

References

1 PETSc SNES Example 30 httpwwwmcsanlgovpetscpetsc-2

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 16: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

16 Matthew G Knepley et al

int FormFunctionLocal(DALocalInfo infoField xField fvoid ptr)AppCtx user = (AppCtx)ptrParameter param = userminusgtparamGridInfo grid = userminusgtgriddouble mag w mag uint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1 i j

for (j=infominusgtys jltinfominusgtys+infominusgtym j++) for (i=infominusgtxs iltinfominusgtxs+infominusgtxm i++)

calculateXMomentumResidual(i j x f user)calculateZMomentumResidual(i j x f user)calculatePressureResidual(i j x f user)calculateTemperatureResidual(i j x f user)

Fig 17FormFunctionLocal() for the mantle subduction problem

We have so far presented simple examples involving a perturbed Laplacian butthis development strategy extends far beyond toy problems We have developedlarge-scale parallel geophysical simulations [1] that are producing new results inthe field We began by replacingFormFunctionLocal() in the Bratu examplewith one containing the linear physics of the previous sequential linear subductioncode (dropping the previous codersquos linear solve) Then the PETSc solver results werecompared for correctness with the complete prior subduction code Next the newcode was immediately run in parallel for correctness and efficiency tests Finallythe nonlinear terms from the more complicated physics of the mantle subductionproblem discussed in Section 1 and shown in Figs 17-21 and better quality dis-cretizations of the boundary conditions were added In fact since the structure oftheFormFunction() changed significantly the finalFormFunction() looksnothing like the original but the process of obtaining through the incremental ap-proach reduced the learning curve for PETSc and make correctness checking at thevarious stages much easier

The details of the subduction code are not as important as the recognition thata very complicated physics problem need not involve any more complication in oursimulation infrastructure To give more insight into the actual calculation we showin Fig 22 the explicit calculation of the residual from the continuity equation

Here we have used theDA for a multicomponent problem on a staggered meshwithout alteration because the structure of the discretization remains logically rect-angular

Although theDA manages communication during the solution process data post-processing often necessary for visualization or analysis also requires this function-ality For the mantle subduction simulation we want to calculate both the second in-

Developing a Geodynamics Simulator with PETSc 17

int calculateXMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj) f[j][i]u = x[j][i]u minus SlabVel(rsquoUrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]u = x[j][i]u minus 00

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else

f[j][i]u = XNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]u = XMomentumResidual(xijuser)

else in the mantle wedge f[j][i]u = XMomentumResidual(xijuser)

Fig 18X-momentum residual for the mantle subduction problem

variant of the strain rate tensor and the viscosity over the grid (at both cell centers andcorners) Using theDAGlobalToLocalBegin () andDAGlobalToLocalEnd() functions we transferred the global solution data to local ghosted vectors and pro-ceeded with the calculation We can also transfer data from local vectors to a globalvector usingDALocalToGlobal () The entire viscosity calculation is given inFig 23

Note that this framework allows the user to manage arbitrary fields over the do-main For instance a porous flow simulation might manage material properties ofthe medium in this fashion

4 Solvers

The SNES object in PETSc is an abstraction of the ldquoinverserdquo of a nonlinear oper-ator The user provides aFormFunction() routine as seen in Section 3 whichcomputes the action of the operator on an input vector Linear problems can also be

18 Matthew G Knepley et al

int calculateZMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1jlim = infominusgtmyminus1

if (ilt=j) f[j][i]w = x[j][i]w minus SlabVel(rsquoWrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]w = x[j][i]w minus 00

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else

f[j][i]w = ZNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]w = ZMomentumResidual(xijuser)

else in the mantle wedge f[j][i]w = ZMomentumResidual(xijuser)

Fig 19Z-momentum residual for the mantle subduction problem

int calculatePressureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj | | jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lid or slab f[j][i]p = x[j][i]p

else if ((i==ilim | | j==jlim ) ampamp paramminusgtibound==BC ANALYTIC ) on an analytic boundary f[j][i]p = x[j][i]p minus Pressure(ijuser)

else in the mantle wedge f[j][i]p = ContinuityResidual(xijuser)

Fig 20Pressure or continuity residual for the mantle subduction problem

Developing a Geodynamics Simulator with PETSc 19

int calculateTemperatureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (j==0) on the surface f[j][i]T = x[j][i]T + x[j+1][i]T + PetscMax(x[j][i]T00)

else if (i==0) slab inflow boundary f[j][i]T = x[j][i]T minus PlateModel(jPLATE SLABuser)

else if (i==ilim ) right side boundary mag u = 10 minus pow( (10minusPetscMax(PetscMin(x[j][iminus1]uparamminusgtcb10)00)) 50 )f[j][i]T = x[j][i]T minus mag ux[jminus1][iminus1]T minus (10minusmag u)PlateModel(jPLATE LID user)

else if (j==jlim ) bottom boundary mag w = 10 minus pow( (10minusPetscMax(PetscMin(x[jminus1][i]wparamminusgtsb10)00)) 50 )f[j][i]T = x[j][i]T minus mag wx[jminus1][iminus1]T minus (10minusmag w)

else in the mantle wedge f[j][i]T = EnergyResidual(xijuser)

Fig 21Temperature or energy residual for the mantle subduction problem

double precision ContinuityResidual(Field x int i int j AppCtx user)

GridInfo grid = userminusgtgriddouble precision uEuWwNwSdudxdwdz

uW = x[j][iminus1]u uE = x[j][i]u dudx = (uE minus uW)gridminusgtdxwS = x[jminus1][i]w wN = x[j][i]w dwdz = (wN minus wS)gridminusgtdzreturn dudx + dwdz

Fig 22Calculation of the continuity residual for the mantle subduction problem

solved by usingSNES with no loss of efficiency compared to using the underlyingKSP object (PETSc linear solver object) directly ThusSNES seems the appropri-ate framework for a general-purpose simulator providing the flexibility to add orsubtract nonlinear terms from the equation at will

TheSNES solver does not require a user to implement a Jacobian Default rou-tines are provided to compute a finite difference approximation using coloring toaccount for the sparsity of the matrix The user does not even need the PETScMatinterface but can initially interact only withVec objects making the transition to

PETSc nearly painless However these approximations to the Jacobian can have nu-merical difficulties and are not as efficient as direct evaluation Therefore the user

20 Matthew G Knepley et al

Compute both the second invariant of the strain rate tensor and the viscosity int ViscosityField(DA da Vec X Vec VAppCtx user)

Parameter param = userminusgtparamGridInfo grid = userminusgtgridVec localXField v xdouble eps dx dz T epsC TCint ijisjsimjmilim jlim ivt

ivt = paramminusgtiviscparamminusgtivisc = paramminusgtoutput ivisc

DACreateLocalVector(da amplocalX)DAGlobalToLocalBegin(da X INSERT VALUES localX)DAGlobalToLocalEnd(da X INSERT VALUES localX)DAVecGetArray(dalocalX(void)ampx)DAVecGetArray(daV(void)ampv)

Parameters dx = gridminusgtdx dz = gridminusgtdzilim = gridminusgtniminus1 jlim = gridminusgtnjminus1

Compute real temperature strain rate and viscosity DAGetCorners(daampisampjsPETSCNULL ampimampjmPETSCNULL )for (j=js jltjs+jm j++)

for (i=is iltis+im i++) T = paramminusgtpotentialT x[j][i]T exp( (jminus05)dzparamminusgtz scale )if (iltilim ampamp jltjlim )

TC = paramminusgtpotentialT TInterp(xij) exp( jdzparamminusgtz scale ) else

TC = T Compute the values at both cell centers and cell corners eps = CalcSecInv(xijCELL CENTERuser)epsC = CalcSecInv(xijCELL CORNERuser)v[j][i]u = epsv[j][i]w = epsCv[j][i]p = Viscosity(Tepsdz(jminus05)param)v[j][i]T = Viscosity(TCepsCdzjparam)

DAVecRestoreArray(daV(void)ampv)DAVecRestoreArray(dalocalX(void)ampx)paramminusgtivisc = ivt

Fig 23 Calculation of the second invariant of the strain tensor and viscosity field for themantle subduction problem

Developing a Geodynamics Simulator with PETSc 21

KSP ksp subkspPC bjacobi ilu

SNESGetKSP(snes ampksp)KSPGetPC(ksp ampbjacobi)PCBJacobiGetSubKSP(bjacobi PETSCNULL PETSCNULL ampsubksp)KSPGetPC(subksp[0] ampilu)PCILUSetLevels(ilu levels)

Fig 24Customizing the preconditioner on a single process

has the option of providing a routine or of using the ADIC [15] or similar system forautomatic differentiation The finite difference approximation can be activated byusing the-snes fd and-mat fd coloring freq options or by providing theSNESDefaultComputeJacobianColor () function toSNESSetJacobian()

Through theSNES object the user may also access the great range of directand iterative linear solvers and preconditioners provided by PETSc Sixteen differentKrylov solvers are available including GMRES BiCGStab and LSQR along withinterfaces to popular sparse direct packages such as MUMPS [4] and SuperLU [9]In addition to a variety of incomplete factorization preconditioners including PI-LUT [16] PETSc supports additive Schwartz preconditioning algebraic multigridthrough BoomerAMG [13] and the geometric multigrid discussed below The fullpanoply of solvers and preconditioners available is catalogued on the PETSc Web-site [3]

The modularity of PETSc allows users to easily customize each solver For in-stance suppose the user wishes to increase the number of levels in ILU(k) precon-ditioning on one block of a block-Jacobi scheme The code fragment in Fig 24 willset the number of levels of fill tolevels

PETSc provides an elegant framework for managing geometric multigrid in com-bination with aDA which is abstracted in theDMMGobject In the same way thatany linear solve can be performed withSNES any preconditioner or solver com-bination is available inDMMGby using a single level That is the same user codesupports both geometric multigrid as well as direct methods Krylov methods etcWhen creating aDMMG the user specifies the number of levels of grid refinementand provides the coarse-grid information (see Fig 25) which is always aDA objectat present although unstructured prototypes are being developed Then theDMMGobject calculates the intergrid transfer operators prolongation and restriction and

allocates objects to hold the solution at each level The user must provide the actionof the residual operatorF and can optionally provide a routine to compute the Jaco-bian matrix (which will be called on each level) and a routine to compute the initialguess as in Fig 26

With this information the user can now solve the equation and retrieve the solu-tion vector as in Fig 27

22 Matthew G Knepley et al

DMMGCreate(comm gridmglevels user ampdmmg)DMMGSetDM(dmmg (DM) da)

Fig 25Creating a DMMG

DMMGSetSNESLocal(dmmg FormFunctionLocal FormJacobianLocal 0 0)DMMGSetInitialGuess(dmmg FormInitialGuess)

Fig 26Providing discretization and assembly routines to DMMG

DMMGSolve(dmmg)soln = DMMGGetx(dmmg)

Fig 27Solving the problem with DMMG

5 Extensions

PETSc is not a complete environment for simulating physical phenomena Ratherit is a set of tools that allow the user to assemble such an environment tailored toa specific application As such PETSc will never provide every facility appropriatefor a given simulation However the user can easily extend PETSc and supplementits capabilities We present two examples in the context of the mantle subductionsimulation

51 Simple Continuation

Because of the highly nonlinear dependence of viscosity on temperature and strainrate in equation (1) the iteration in Newtonrsquos method can fail to converge withouta good starting guess of the solution On the other hand for the isoviscous casewhere temperature is coupled to velocity only through advection a solution is easilyreached It is therefore natural to propose a continuation scheme in the viscosity be-ginning with constant viscosity and progressing to full variability A simple adaptivecontinuation model was devised in which the viscosity was raised to a power be-tween zero and oneη rarr ηα whereα=0 corresponds to constant viscosity andα=1corresponds to natural viscosity variation In the continuation loopα was increasedtowards unity by an amount dependent on the rate of convergence of the previousSNES solve

PETSc itself provides no special support for continuation but it is sufficientlymodular that a continuation loop was readily constructed in the application codeusing repeated calls toDMMGSolve() and found to work quite well

Developing a Geodynamics Simulator with PETSc 23

52 Simple Steering

No rigorous justification had been given for the continuation strategy above andconsequently it was possible that for certain initial conditions Newton would fail toconverge thereby resulting in extremely long run times An observant user could de-tect this situation and abort the run but all the potentially useful solution informationup to that point would be lost A clean way to asynchronously abort the continua-tion loop was thus needed On architectures where OS signals are available PETScprovides an interface for registering signal handlers Thus we were able to define ahandler The user wishing to change the control flow of the simulation simply sendsthe appropriate signal to the process This sets a flag in the user data that causes thechange at the next iteration

6 Simulation Results

A long-standing debate has existed between theorists and observationalists over thethermal structure of subduction zones Modelers using constant viscosity flow sim-ulations to compute thermal structure have predicted relatively cold subductionzones [19] Conversely observationalists who use heat flow measurements and min-eralogical thermobarometry to estimate temperatures at depth have long claimedthat subduction zones are hotter than predicted They have invoked the presence ofstrong upwelling of hot mantle material to supply the heat This upwelling was neverpredicted by isoviscous flow models however

The debate remained unresolved until recently when several groups includingour own succeeded in developing simulations with realistically variable viscosity(for other examples [810121821]) A comparison of flow and temperature fieldsfor variable and constant viscosity simulations generated with our code is shownin Fig 28 Results from these simulations are exciting because they close the gapbetween models and observations they predict hotter mantle temperatures steepersurface thermal gradients and upwelling mantle flow

Furthermore these simulations allow for quantitative predictions of variation ofobservables with subduction parameters A recent study of subduction zone earth-quakes has identified an intriguing trend the vertical distance from subduction zonevolcanoes to the surface of the subducting slab is anti-correlated with descent rate ofthe slab [11] Preliminary calculations with our model are consistent with this trendindicating that flow and thermal structure may play an important role in determiningnot only the quantity and chemistry of magmas but also their path of transport to thesurface Further work is required to resolve this issue

Simulating the geodynamics of subduction is an example of our success in port-ing an existing code into the PETSc framework simultaneously parallelizing thecode and increasing its functionality Using this code as a template we rapidly de-veloped a related simulation of another tectonic boundary the mid-ocean ridge Thiswork was done to address a set of observations of systematic morphological asym-metry in the global mid-ocean ridge system [7] Our model confirms the qualitative

24 Matthew G Knepley et al

Dep

th k

m

log10

η and flow field

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24 Temperature degC

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Distance km

Dep

th k

m

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24

Distance km

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Isov

isco

usV

aria

ble

Vis

cosi

ty

(a) (b)

(c) (d)

Fig 282D viscosity and potential temperature fields from simulations on 8 processors with230112 degrees of freedom Panels in the top row are from a simulation withα=1 in equation(1) Panels in bottom row haveα=0 The white box in panels (a) and (c) shows the region inwhich temperature is plotted in panels (b) and (d)(a) Colors showlog10 of the viscosity fieldNote that there are more than five orders of magnitude variation in viscosity Arrows show theflow direction and magnitude (the slab is being subducted at a rate of 6 cmyear) Upwellingis evident in the flow field near the wedge corner(b) Temperature field from the variableviscosity simulation 1100C isotherm is shown as a dashed line(c) (Constant) viscosity andflow field from the isoviscous simulation Strong flow is predicted at the base of the crustdespite the low-temperature rock there No upwelling flow is predicted(d) Temperature fieldfrom isoviscous simulation Note that the mantle wedge corner is much colder than in (b)

mechanism that had been proposed to explain these observations Furthermore itshows a quantitative agreement with trends in the observed data [17] As with ourmodels of subduction the key to demonstrating the validity of the hypothesized dy-namics was simulating them with the strongly nonlinear rheology that characterizesmantle rock on geologic timescales The computational tools provided by PETSc en-abled us easily handle this rheology reducing development time and allowing us tofocus on model interpretation

References

1 PETSc SNES Example 30 httpwwwmcsanlgovpetscpetsc-2

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 17: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

Developing a Geodynamics Simulator with PETSc 17

int calculateXMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj) f[j][i]u = x[j][i]u minus SlabVel(rsquoUrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]u = x[j][i]u minus 00

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else

f[j][i]u = XNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]u = x[j][i]u minus HorizVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]u = XMomentumResidual(xijuser)

else in the mantle wedge f[j][i]u = XMomentumResidual(xijuser)

Fig 18X-momentum residual for the mantle subduction problem

variant of the strain rate tensor and the viscosity over the grid (at both cell centers andcorners) Using theDAGlobalToLocalBegin () andDAGlobalToLocalEnd() functions we transferred the global solution data to local ghosted vectors and pro-ceeded with the calculation We can also transfer data from local vectors to a globalvector usingDALocalToGlobal () The entire viscosity calculation is given inFig 23

Note that this framework allows the user to manage arbitrary fields over the do-main For instance a porous flow simulation might manage material properties ofthe medium in this fashion

4 Solvers

The SNES object in PETSc is an abstraction of the ldquoinverserdquo of a nonlinear oper-ator The user provides aFormFunction() routine as seen in Section 3 whichcomputes the action of the operator on an input vector Linear problems can also be

18 Matthew G Knepley et al

int calculateZMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1jlim = infominusgtmyminus1

if (ilt=j) f[j][i]w = x[j][i]w minus SlabVel(rsquoWrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]w = x[j][i]w minus 00

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else

f[j][i]w = ZNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]w = ZMomentumResidual(xijuser)

else in the mantle wedge f[j][i]w = ZMomentumResidual(xijuser)

Fig 19Z-momentum residual for the mantle subduction problem

int calculatePressureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj | | jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lid or slab f[j][i]p = x[j][i]p

else if ((i==ilim | | j==jlim ) ampamp paramminusgtibound==BC ANALYTIC ) on an analytic boundary f[j][i]p = x[j][i]p minus Pressure(ijuser)

else in the mantle wedge f[j][i]p = ContinuityResidual(xijuser)

Fig 20Pressure or continuity residual for the mantle subduction problem

Developing a Geodynamics Simulator with PETSc 19

int calculateTemperatureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (j==0) on the surface f[j][i]T = x[j][i]T + x[j+1][i]T + PetscMax(x[j][i]T00)

else if (i==0) slab inflow boundary f[j][i]T = x[j][i]T minus PlateModel(jPLATE SLABuser)

else if (i==ilim ) right side boundary mag u = 10 minus pow( (10minusPetscMax(PetscMin(x[j][iminus1]uparamminusgtcb10)00)) 50 )f[j][i]T = x[j][i]T minus mag ux[jminus1][iminus1]T minus (10minusmag u)PlateModel(jPLATE LID user)

else if (j==jlim ) bottom boundary mag w = 10 minus pow( (10minusPetscMax(PetscMin(x[jminus1][i]wparamminusgtsb10)00)) 50 )f[j][i]T = x[j][i]T minus mag wx[jminus1][iminus1]T minus (10minusmag w)

else in the mantle wedge f[j][i]T = EnergyResidual(xijuser)

Fig 21Temperature or energy residual for the mantle subduction problem

double precision ContinuityResidual(Field x int i int j AppCtx user)

GridInfo grid = userminusgtgriddouble precision uEuWwNwSdudxdwdz

uW = x[j][iminus1]u uE = x[j][i]u dudx = (uE minus uW)gridminusgtdxwS = x[jminus1][i]w wN = x[j][i]w dwdz = (wN minus wS)gridminusgtdzreturn dudx + dwdz

Fig 22Calculation of the continuity residual for the mantle subduction problem

solved by usingSNES with no loss of efficiency compared to using the underlyingKSP object (PETSc linear solver object) directly ThusSNES seems the appropri-ate framework for a general-purpose simulator providing the flexibility to add orsubtract nonlinear terms from the equation at will

TheSNES solver does not require a user to implement a Jacobian Default rou-tines are provided to compute a finite difference approximation using coloring toaccount for the sparsity of the matrix The user does not even need the PETScMatinterface but can initially interact only withVec objects making the transition to

PETSc nearly painless However these approximations to the Jacobian can have nu-merical difficulties and are not as efficient as direct evaluation Therefore the user

20 Matthew G Knepley et al

Compute both the second invariant of the strain rate tensor and the viscosity int ViscosityField(DA da Vec X Vec VAppCtx user)

Parameter param = userminusgtparamGridInfo grid = userminusgtgridVec localXField v xdouble eps dx dz T epsC TCint ijisjsimjmilim jlim ivt

ivt = paramminusgtiviscparamminusgtivisc = paramminusgtoutput ivisc

DACreateLocalVector(da amplocalX)DAGlobalToLocalBegin(da X INSERT VALUES localX)DAGlobalToLocalEnd(da X INSERT VALUES localX)DAVecGetArray(dalocalX(void)ampx)DAVecGetArray(daV(void)ampv)

Parameters dx = gridminusgtdx dz = gridminusgtdzilim = gridminusgtniminus1 jlim = gridminusgtnjminus1

Compute real temperature strain rate and viscosity DAGetCorners(daampisampjsPETSCNULL ampimampjmPETSCNULL )for (j=js jltjs+jm j++)

for (i=is iltis+im i++) T = paramminusgtpotentialT x[j][i]T exp( (jminus05)dzparamminusgtz scale )if (iltilim ampamp jltjlim )

TC = paramminusgtpotentialT TInterp(xij) exp( jdzparamminusgtz scale ) else

TC = T Compute the values at both cell centers and cell corners eps = CalcSecInv(xijCELL CENTERuser)epsC = CalcSecInv(xijCELL CORNERuser)v[j][i]u = epsv[j][i]w = epsCv[j][i]p = Viscosity(Tepsdz(jminus05)param)v[j][i]T = Viscosity(TCepsCdzjparam)

DAVecRestoreArray(daV(void)ampv)DAVecRestoreArray(dalocalX(void)ampx)paramminusgtivisc = ivt

Fig 23 Calculation of the second invariant of the strain tensor and viscosity field for themantle subduction problem

Developing a Geodynamics Simulator with PETSc 21

KSP ksp subkspPC bjacobi ilu

SNESGetKSP(snes ampksp)KSPGetPC(ksp ampbjacobi)PCBJacobiGetSubKSP(bjacobi PETSCNULL PETSCNULL ampsubksp)KSPGetPC(subksp[0] ampilu)PCILUSetLevels(ilu levels)

Fig 24Customizing the preconditioner on a single process

has the option of providing a routine or of using the ADIC [15] or similar system forautomatic differentiation The finite difference approximation can be activated byusing the-snes fd and-mat fd coloring freq options or by providing theSNESDefaultComputeJacobianColor () function toSNESSetJacobian()

Through theSNES object the user may also access the great range of directand iterative linear solvers and preconditioners provided by PETSc Sixteen differentKrylov solvers are available including GMRES BiCGStab and LSQR along withinterfaces to popular sparse direct packages such as MUMPS [4] and SuperLU [9]In addition to a variety of incomplete factorization preconditioners including PI-LUT [16] PETSc supports additive Schwartz preconditioning algebraic multigridthrough BoomerAMG [13] and the geometric multigrid discussed below The fullpanoply of solvers and preconditioners available is catalogued on the PETSc Web-site [3]

The modularity of PETSc allows users to easily customize each solver For in-stance suppose the user wishes to increase the number of levels in ILU(k) precon-ditioning on one block of a block-Jacobi scheme The code fragment in Fig 24 willset the number of levels of fill tolevels

PETSc provides an elegant framework for managing geometric multigrid in com-bination with aDA which is abstracted in theDMMGobject In the same way thatany linear solve can be performed withSNES any preconditioner or solver com-bination is available inDMMGby using a single level That is the same user codesupports both geometric multigrid as well as direct methods Krylov methods etcWhen creating aDMMG the user specifies the number of levels of grid refinementand provides the coarse-grid information (see Fig 25) which is always aDA objectat present although unstructured prototypes are being developed Then theDMMGobject calculates the intergrid transfer operators prolongation and restriction and

allocates objects to hold the solution at each level The user must provide the actionof the residual operatorF and can optionally provide a routine to compute the Jaco-bian matrix (which will be called on each level) and a routine to compute the initialguess as in Fig 26

With this information the user can now solve the equation and retrieve the solu-tion vector as in Fig 27

22 Matthew G Knepley et al

DMMGCreate(comm gridmglevels user ampdmmg)DMMGSetDM(dmmg (DM) da)

Fig 25Creating a DMMG

DMMGSetSNESLocal(dmmg FormFunctionLocal FormJacobianLocal 0 0)DMMGSetInitialGuess(dmmg FormInitialGuess)

Fig 26Providing discretization and assembly routines to DMMG

DMMGSolve(dmmg)soln = DMMGGetx(dmmg)

Fig 27Solving the problem with DMMG

5 Extensions

PETSc is not a complete environment for simulating physical phenomena Ratherit is a set of tools that allow the user to assemble such an environment tailored toa specific application As such PETSc will never provide every facility appropriatefor a given simulation However the user can easily extend PETSc and supplementits capabilities We present two examples in the context of the mantle subductionsimulation

51 Simple Continuation

Because of the highly nonlinear dependence of viscosity on temperature and strainrate in equation (1) the iteration in Newtonrsquos method can fail to converge withouta good starting guess of the solution On the other hand for the isoviscous casewhere temperature is coupled to velocity only through advection a solution is easilyreached It is therefore natural to propose a continuation scheme in the viscosity be-ginning with constant viscosity and progressing to full variability A simple adaptivecontinuation model was devised in which the viscosity was raised to a power be-tween zero and oneη rarr ηα whereα=0 corresponds to constant viscosity andα=1corresponds to natural viscosity variation In the continuation loopα was increasedtowards unity by an amount dependent on the rate of convergence of the previousSNES solve

PETSc itself provides no special support for continuation but it is sufficientlymodular that a continuation loop was readily constructed in the application codeusing repeated calls toDMMGSolve() and found to work quite well

Developing a Geodynamics Simulator with PETSc 23

52 Simple Steering

No rigorous justification had been given for the continuation strategy above andconsequently it was possible that for certain initial conditions Newton would fail toconverge thereby resulting in extremely long run times An observant user could de-tect this situation and abort the run but all the potentially useful solution informationup to that point would be lost A clean way to asynchronously abort the continua-tion loop was thus needed On architectures where OS signals are available PETScprovides an interface for registering signal handlers Thus we were able to define ahandler The user wishing to change the control flow of the simulation simply sendsthe appropriate signal to the process This sets a flag in the user data that causes thechange at the next iteration

6 Simulation Results

A long-standing debate has existed between theorists and observationalists over thethermal structure of subduction zones Modelers using constant viscosity flow sim-ulations to compute thermal structure have predicted relatively cold subductionzones [19] Conversely observationalists who use heat flow measurements and min-eralogical thermobarometry to estimate temperatures at depth have long claimedthat subduction zones are hotter than predicted They have invoked the presence ofstrong upwelling of hot mantle material to supply the heat This upwelling was neverpredicted by isoviscous flow models however

The debate remained unresolved until recently when several groups includingour own succeeded in developing simulations with realistically variable viscosity(for other examples [810121821]) A comparison of flow and temperature fieldsfor variable and constant viscosity simulations generated with our code is shownin Fig 28 Results from these simulations are exciting because they close the gapbetween models and observations they predict hotter mantle temperatures steepersurface thermal gradients and upwelling mantle flow

Furthermore these simulations allow for quantitative predictions of variation ofobservables with subduction parameters A recent study of subduction zone earth-quakes has identified an intriguing trend the vertical distance from subduction zonevolcanoes to the surface of the subducting slab is anti-correlated with descent rate ofthe slab [11] Preliminary calculations with our model are consistent with this trendindicating that flow and thermal structure may play an important role in determiningnot only the quantity and chemistry of magmas but also their path of transport to thesurface Further work is required to resolve this issue

Simulating the geodynamics of subduction is an example of our success in port-ing an existing code into the PETSc framework simultaneously parallelizing thecode and increasing its functionality Using this code as a template we rapidly de-veloped a related simulation of another tectonic boundary the mid-ocean ridge Thiswork was done to address a set of observations of systematic morphological asym-metry in the global mid-ocean ridge system [7] Our model confirms the qualitative

24 Matthew G Knepley et al

Dep

th k

m

log10

η and flow field

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24 Temperature degC

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Distance km

Dep

th k

m

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24

Distance km

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Isov

isco

usV

aria

ble

Vis

cosi

ty

(a) (b)

(c) (d)

Fig 282D viscosity and potential temperature fields from simulations on 8 processors with230112 degrees of freedom Panels in the top row are from a simulation withα=1 in equation(1) Panels in bottom row haveα=0 The white box in panels (a) and (c) shows the region inwhich temperature is plotted in panels (b) and (d)(a) Colors showlog10 of the viscosity fieldNote that there are more than five orders of magnitude variation in viscosity Arrows show theflow direction and magnitude (the slab is being subducted at a rate of 6 cmyear) Upwellingis evident in the flow field near the wedge corner(b) Temperature field from the variableviscosity simulation 1100C isotherm is shown as a dashed line(c) (Constant) viscosity andflow field from the isoviscous simulation Strong flow is predicted at the base of the crustdespite the low-temperature rock there No upwelling flow is predicted(d) Temperature fieldfrom isoviscous simulation Note that the mantle wedge corner is much colder than in (b)

mechanism that had been proposed to explain these observations Furthermore itshows a quantitative agreement with trends in the observed data [17] As with ourmodels of subduction the key to demonstrating the validity of the hypothesized dy-namics was simulating them with the strongly nonlinear rheology that characterizesmantle rock on geologic timescales The computational tools provided by PETSc en-abled us easily handle this rheology reducing development time and allowing us tofocus on model interpretation

References

1 PETSc SNES Example 30 httpwwwmcsanlgovpetscpetsc-2

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 18: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

18 Matthew G Knepley et al

int calculateZMomentumResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1jlim = infominusgtmyminus1

if (ilt=j) f[j][i]w = x[j][i]w minus SlabVel(rsquoWrsquo ijuser)

else if (jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lithospheric lid f[j][i]w = x[j][i]w minus 00

else if (j==jlim ) on the bottom boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else

f[j][i]w = ZNormalStress(xijCELL CENTERuser) minus EPS ZERO

else if (i==ilim ) on the right side boundary if (paramminusgtibound==BC ANALYTIC )

f[j][i]w = x[j][i]w minus VertVelocity(ijuser) else if (paramminusgtibound==BC NOSTRESS)

f[j][i]w = ZMomentumResidual(xijuser)

else in the mantle wedge f[j][i]w = ZMomentumResidual(xijuser)

Fig 19Z-momentum residual for the mantle subduction problem

int calculatePressureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (iltj | | jlt=gridminusgtjlid | | (jltgridminusgtcorner+gridminusgtinose ampamp iltgridminusgtcorner+gridminusgtinose)) in the lid or slab f[j][i]p = x[j][i]p

else if ((i==ilim | | j==jlim ) ampamp paramminusgtibound==BC ANALYTIC ) on an analytic boundary f[j][i]p = x[j][i]p minus Pressure(ijuser)

else in the mantle wedge f[j][i]p = ContinuityResidual(xijuser)

Fig 20Pressure or continuity residual for the mantle subduction problem

Developing a Geodynamics Simulator with PETSc 19

int calculateTemperatureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (j==0) on the surface f[j][i]T = x[j][i]T + x[j+1][i]T + PetscMax(x[j][i]T00)

else if (i==0) slab inflow boundary f[j][i]T = x[j][i]T minus PlateModel(jPLATE SLABuser)

else if (i==ilim ) right side boundary mag u = 10 minus pow( (10minusPetscMax(PetscMin(x[j][iminus1]uparamminusgtcb10)00)) 50 )f[j][i]T = x[j][i]T minus mag ux[jminus1][iminus1]T minus (10minusmag u)PlateModel(jPLATE LID user)

else if (j==jlim ) bottom boundary mag w = 10 minus pow( (10minusPetscMax(PetscMin(x[jminus1][i]wparamminusgtsb10)00)) 50 )f[j][i]T = x[j][i]T minus mag wx[jminus1][iminus1]T minus (10minusmag w)

else in the mantle wedge f[j][i]T = EnergyResidual(xijuser)

Fig 21Temperature or energy residual for the mantle subduction problem

double precision ContinuityResidual(Field x int i int j AppCtx user)

GridInfo grid = userminusgtgriddouble precision uEuWwNwSdudxdwdz

uW = x[j][iminus1]u uE = x[j][i]u dudx = (uE minus uW)gridminusgtdxwS = x[jminus1][i]w wN = x[j][i]w dwdz = (wN minus wS)gridminusgtdzreturn dudx + dwdz

Fig 22Calculation of the continuity residual for the mantle subduction problem

solved by usingSNES with no loss of efficiency compared to using the underlyingKSP object (PETSc linear solver object) directly ThusSNES seems the appropri-ate framework for a general-purpose simulator providing the flexibility to add orsubtract nonlinear terms from the equation at will

TheSNES solver does not require a user to implement a Jacobian Default rou-tines are provided to compute a finite difference approximation using coloring toaccount for the sparsity of the matrix The user does not even need the PETScMatinterface but can initially interact only withVec objects making the transition to

PETSc nearly painless However these approximations to the Jacobian can have nu-merical difficulties and are not as efficient as direct evaluation Therefore the user

20 Matthew G Knepley et al

Compute both the second invariant of the strain rate tensor and the viscosity int ViscosityField(DA da Vec X Vec VAppCtx user)

Parameter param = userminusgtparamGridInfo grid = userminusgtgridVec localXField v xdouble eps dx dz T epsC TCint ijisjsimjmilim jlim ivt

ivt = paramminusgtiviscparamminusgtivisc = paramminusgtoutput ivisc

DACreateLocalVector(da amplocalX)DAGlobalToLocalBegin(da X INSERT VALUES localX)DAGlobalToLocalEnd(da X INSERT VALUES localX)DAVecGetArray(dalocalX(void)ampx)DAVecGetArray(daV(void)ampv)

Parameters dx = gridminusgtdx dz = gridminusgtdzilim = gridminusgtniminus1 jlim = gridminusgtnjminus1

Compute real temperature strain rate and viscosity DAGetCorners(daampisampjsPETSCNULL ampimampjmPETSCNULL )for (j=js jltjs+jm j++)

for (i=is iltis+im i++) T = paramminusgtpotentialT x[j][i]T exp( (jminus05)dzparamminusgtz scale )if (iltilim ampamp jltjlim )

TC = paramminusgtpotentialT TInterp(xij) exp( jdzparamminusgtz scale ) else

TC = T Compute the values at both cell centers and cell corners eps = CalcSecInv(xijCELL CENTERuser)epsC = CalcSecInv(xijCELL CORNERuser)v[j][i]u = epsv[j][i]w = epsCv[j][i]p = Viscosity(Tepsdz(jminus05)param)v[j][i]T = Viscosity(TCepsCdzjparam)

DAVecRestoreArray(daV(void)ampv)DAVecRestoreArray(dalocalX(void)ampx)paramminusgtivisc = ivt

Fig 23 Calculation of the second invariant of the strain tensor and viscosity field for themantle subduction problem

Developing a Geodynamics Simulator with PETSc 21

KSP ksp subkspPC bjacobi ilu

SNESGetKSP(snes ampksp)KSPGetPC(ksp ampbjacobi)PCBJacobiGetSubKSP(bjacobi PETSCNULL PETSCNULL ampsubksp)KSPGetPC(subksp[0] ampilu)PCILUSetLevels(ilu levels)

Fig 24Customizing the preconditioner on a single process

has the option of providing a routine or of using the ADIC [15] or similar system forautomatic differentiation The finite difference approximation can be activated byusing the-snes fd and-mat fd coloring freq options or by providing theSNESDefaultComputeJacobianColor () function toSNESSetJacobian()

Through theSNES object the user may also access the great range of directand iterative linear solvers and preconditioners provided by PETSc Sixteen differentKrylov solvers are available including GMRES BiCGStab and LSQR along withinterfaces to popular sparse direct packages such as MUMPS [4] and SuperLU [9]In addition to a variety of incomplete factorization preconditioners including PI-LUT [16] PETSc supports additive Schwartz preconditioning algebraic multigridthrough BoomerAMG [13] and the geometric multigrid discussed below The fullpanoply of solvers and preconditioners available is catalogued on the PETSc Web-site [3]

The modularity of PETSc allows users to easily customize each solver For in-stance suppose the user wishes to increase the number of levels in ILU(k) precon-ditioning on one block of a block-Jacobi scheme The code fragment in Fig 24 willset the number of levels of fill tolevels

PETSc provides an elegant framework for managing geometric multigrid in com-bination with aDA which is abstracted in theDMMGobject In the same way thatany linear solve can be performed withSNES any preconditioner or solver com-bination is available inDMMGby using a single level That is the same user codesupports both geometric multigrid as well as direct methods Krylov methods etcWhen creating aDMMG the user specifies the number of levels of grid refinementand provides the coarse-grid information (see Fig 25) which is always aDA objectat present although unstructured prototypes are being developed Then theDMMGobject calculates the intergrid transfer operators prolongation and restriction and

allocates objects to hold the solution at each level The user must provide the actionof the residual operatorF and can optionally provide a routine to compute the Jaco-bian matrix (which will be called on each level) and a routine to compute the initialguess as in Fig 26

With this information the user can now solve the equation and retrieve the solu-tion vector as in Fig 27

22 Matthew G Knepley et al

DMMGCreate(comm gridmglevels user ampdmmg)DMMGSetDM(dmmg (DM) da)

Fig 25Creating a DMMG

DMMGSetSNESLocal(dmmg FormFunctionLocal FormJacobianLocal 0 0)DMMGSetInitialGuess(dmmg FormInitialGuess)

Fig 26Providing discretization and assembly routines to DMMG

DMMGSolve(dmmg)soln = DMMGGetx(dmmg)

Fig 27Solving the problem with DMMG

5 Extensions

PETSc is not a complete environment for simulating physical phenomena Ratherit is a set of tools that allow the user to assemble such an environment tailored toa specific application As such PETSc will never provide every facility appropriatefor a given simulation However the user can easily extend PETSc and supplementits capabilities We present two examples in the context of the mantle subductionsimulation

51 Simple Continuation

Because of the highly nonlinear dependence of viscosity on temperature and strainrate in equation (1) the iteration in Newtonrsquos method can fail to converge withouta good starting guess of the solution On the other hand for the isoviscous casewhere temperature is coupled to velocity only through advection a solution is easilyreached It is therefore natural to propose a continuation scheme in the viscosity be-ginning with constant viscosity and progressing to full variability A simple adaptivecontinuation model was devised in which the viscosity was raised to a power be-tween zero and oneη rarr ηα whereα=0 corresponds to constant viscosity andα=1corresponds to natural viscosity variation In the continuation loopα was increasedtowards unity by an amount dependent on the rate of convergence of the previousSNES solve

PETSc itself provides no special support for continuation but it is sufficientlymodular that a continuation loop was readily constructed in the application codeusing repeated calls toDMMGSolve() and found to work quite well

Developing a Geodynamics Simulator with PETSc 23

52 Simple Steering

No rigorous justification had been given for the continuation strategy above andconsequently it was possible that for certain initial conditions Newton would fail toconverge thereby resulting in extremely long run times An observant user could de-tect this situation and abort the run but all the potentially useful solution informationup to that point would be lost A clean way to asynchronously abort the continua-tion loop was thus needed On architectures where OS signals are available PETScprovides an interface for registering signal handlers Thus we were able to define ahandler The user wishing to change the control flow of the simulation simply sendsthe appropriate signal to the process This sets a flag in the user data that causes thechange at the next iteration

6 Simulation Results

A long-standing debate has existed between theorists and observationalists over thethermal structure of subduction zones Modelers using constant viscosity flow sim-ulations to compute thermal structure have predicted relatively cold subductionzones [19] Conversely observationalists who use heat flow measurements and min-eralogical thermobarometry to estimate temperatures at depth have long claimedthat subduction zones are hotter than predicted They have invoked the presence ofstrong upwelling of hot mantle material to supply the heat This upwelling was neverpredicted by isoviscous flow models however

The debate remained unresolved until recently when several groups includingour own succeeded in developing simulations with realistically variable viscosity(for other examples [810121821]) A comparison of flow and temperature fieldsfor variable and constant viscosity simulations generated with our code is shownin Fig 28 Results from these simulations are exciting because they close the gapbetween models and observations they predict hotter mantle temperatures steepersurface thermal gradients and upwelling mantle flow

Furthermore these simulations allow for quantitative predictions of variation ofobservables with subduction parameters A recent study of subduction zone earth-quakes has identified an intriguing trend the vertical distance from subduction zonevolcanoes to the surface of the subducting slab is anti-correlated with descent rate ofthe slab [11] Preliminary calculations with our model are consistent with this trendindicating that flow and thermal structure may play an important role in determiningnot only the quantity and chemistry of magmas but also their path of transport to thesurface Further work is required to resolve this issue

Simulating the geodynamics of subduction is an example of our success in port-ing an existing code into the PETSc framework simultaneously parallelizing thecode and increasing its functionality Using this code as a template we rapidly de-veloped a related simulation of another tectonic boundary the mid-ocean ridge Thiswork was done to address a set of observations of systematic morphological asym-metry in the global mid-ocean ridge system [7] Our model confirms the qualitative

24 Matthew G Knepley et al

Dep

th k

m

log10

η and flow field

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24 Temperature degC

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Distance km

Dep

th k

m

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24

Distance km

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Isov

isco

usV

aria

ble

Vis

cosi

ty

(a) (b)

(c) (d)

Fig 282D viscosity and potential temperature fields from simulations on 8 processors with230112 degrees of freedom Panels in the top row are from a simulation withα=1 in equation(1) Panels in bottom row haveα=0 The white box in panels (a) and (c) shows the region inwhich temperature is plotted in panels (b) and (d)(a) Colors showlog10 of the viscosity fieldNote that there are more than five orders of magnitude variation in viscosity Arrows show theflow direction and magnitude (the slab is being subducted at a rate of 6 cmyear) Upwellingis evident in the flow field near the wedge corner(b) Temperature field from the variableviscosity simulation 1100C isotherm is shown as a dashed line(c) (Constant) viscosity andflow field from the isoviscous simulation Strong flow is predicted at the base of the crustdespite the low-temperature rock there No upwelling flow is predicted(d) Temperature fieldfrom isoviscous simulation Note that the mantle wedge corner is much colder than in (b)

mechanism that had been proposed to explain these observations Furthermore itshows a quantitative agreement with trends in the observed data [17] As with ourmodels of subduction the key to demonstrating the validity of the hypothesized dy-namics was simulating them with the strongly nonlinear rheology that characterizesmantle rock on geologic timescales The computational tools provided by PETSc en-abled us easily handle this rheology reducing development time and allowing us tofocus on model interpretation

References

1 PETSc SNES Example 30 httpwwwmcsanlgovpetscpetsc-2

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 19: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

Developing a Geodynamics Simulator with PETSc 19

int calculateTemperatureResidual(int i int j Field xField fAppCtx user)Parameterparam = userminusgtparamGridInfo grid = userminusgtgridint ilim = infominusgtmxminus1 jlim = infominusgtmyminus1

if (j==0) on the surface f[j][i]T = x[j][i]T + x[j+1][i]T + PetscMax(x[j][i]T00)

else if (i==0) slab inflow boundary f[j][i]T = x[j][i]T minus PlateModel(jPLATE SLABuser)

else if (i==ilim ) right side boundary mag u = 10 minus pow( (10minusPetscMax(PetscMin(x[j][iminus1]uparamminusgtcb10)00)) 50 )f[j][i]T = x[j][i]T minus mag ux[jminus1][iminus1]T minus (10minusmag u)PlateModel(jPLATE LID user)

else if (j==jlim ) bottom boundary mag w = 10 minus pow( (10minusPetscMax(PetscMin(x[jminus1][i]wparamminusgtsb10)00)) 50 )f[j][i]T = x[j][i]T minus mag wx[jminus1][iminus1]T minus (10minusmag w)

else in the mantle wedge f[j][i]T = EnergyResidual(xijuser)

Fig 21Temperature or energy residual for the mantle subduction problem

double precision ContinuityResidual(Field x int i int j AppCtx user)

GridInfo grid = userminusgtgriddouble precision uEuWwNwSdudxdwdz

uW = x[j][iminus1]u uE = x[j][i]u dudx = (uE minus uW)gridminusgtdxwS = x[jminus1][i]w wN = x[j][i]w dwdz = (wN minus wS)gridminusgtdzreturn dudx + dwdz

Fig 22Calculation of the continuity residual for the mantle subduction problem

solved by usingSNES with no loss of efficiency compared to using the underlyingKSP object (PETSc linear solver object) directly ThusSNES seems the appropri-ate framework for a general-purpose simulator providing the flexibility to add orsubtract nonlinear terms from the equation at will

TheSNES solver does not require a user to implement a Jacobian Default rou-tines are provided to compute a finite difference approximation using coloring toaccount for the sparsity of the matrix The user does not even need the PETScMatinterface but can initially interact only withVec objects making the transition to

PETSc nearly painless However these approximations to the Jacobian can have nu-merical difficulties and are not as efficient as direct evaluation Therefore the user

20 Matthew G Knepley et al

Compute both the second invariant of the strain rate tensor and the viscosity int ViscosityField(DA da Vec X Vec VAppCtx user)

Parameter param = userminusgtparamGridInfo grid = userminusgtgridVec localXField v xdouble eps dx dz T epsC TCint ijisjsimjmilim jlim ivt

ivt = paramminusgtiviscparamminusgtivisc = paramminusgtoutput ivisc

DACreateLocalVector(da amplocalX)DAGlobalToLocalBegin(da X INSERT VALUES localX)DAGlobalToLocalEnd(da X INSERT VALUES localX)DAVecGetArray(dalocalX(void)ampx)DAVecGetArray(daV(void)ampv)

Parameters dx = gridminusgtdx dz = gridminusgtdzilim = gridminusgtniminus1 jlim = gridminusgtnjminus1

Compute real temperature strain rate and viscosity DAGetCorners(daampisampjsPETSCNULL ampimampjmPETSCNULL )for (j=js jltjs+jm j++)

for (i=is iltis+im i++) T = paramminusgtpotentialT x[j][i]T exp( (jminus05)dzparamminusgtz scale )if (iltilim ampamp jltjlim )

TC = paramminusgtpotentialT TInterp(xij) exp( jdzparamminusgtz scale ) else

TC = T Compute the values at both cell centers and cell corners eps = CalcSecInv(xijCELL CENTERuser)epsC = CalcSecInv(xijCELL CORNERuser)v[j][i]u = epsv[j][i]w = epsCv[j][i]p = Viscosity(Tepsdz(jminus05)param)v[j][i]T = Viscosity(TCepsCdzjparam)

DAVecRestoreArray(daV(void)ampv)DAVecRestoreArray(dalocalX(void)ampx)paramminusgtivisc = ivt

Fig 23 Calculation of the second invariant of the strain tensor and viscosity field for themantle subduction problem

Developing a Geodynamics Simulator with PETSc 21

KSP ksp subkspPC bjacobi ilu

SNESGetKSP(snes ampksp)KSPGetPC(ksp ampbjacobi)PCBJacobiGetSubKSP(bjacobi PETSCNULL PETSCNULL ampsubksp)KSPGetPC(subksp[0] ampilu)PCILUSetLevels(ilu levels)

Fig 24Customizing the preconditioner on a single process

has the option of providing a routine or of using the ADIC [15] or similar system forautomatic differentiation The finite difference approximation can be activated byusing the-snes fd and-mat fd coloring freq options or by providing theSNESDefaultComputeJacobianColor () function toSNESSetJacobian()

Through theSNES object the user may also access the great range of directand iterative linear solvers and preconditioners provided by PETSc Sixteen differentKrylov solvers are available including GMRES BiCGStab and LSQR along withinterfaces to popular sparse direct packages such as MUMPS [4] and SuperLU [9]In addition to a variety of incomplete factorization preconditioners including PI-LUT [16] PETSc supports additive Schwartz preconditioning algebraic multigridthrough BoomerAMG [13] and the geometric multigrid discussed below The fullpanoply of solvers and preconditioners available is catalogued on the PETSc Web-site [3]

The modularity of PETSc allows users to easily customize each solver For in-stance suppose the user wishes to increase the number of levels in ILU(k) precon-ditioning on one block of a block-Jacobi scheme The code fragment in Fig 24 willset the number of levels of fill tolevels

PETSc provides an elegant framework for managing geometric multigrid in com-bination with aDA which is abstracted in theDMMGobject In the same way thatany linear solve can be performed withSNES any preconditioner or solver com-bination is available inDMMGby using a single level That is the same user codesupports both geometric multigrid as well as direct methods Krylov methods etcWhen creating aDMMG the user specifies the number of levels of grid refinementand provides the coarse-grid information (see Fig 25) which is always aDA objectat present although unstructured prototypes are being developed Then theDMMGobject calculates the intergrid transfer operators prolongation and restriction and

allocates objects to hold the solution at each level The user must provide the actionof the residual operatorF and can optionally provide a routine to compute the Jaco-bian matrix (which will be called on each level) and a routine to compute the initialguess as in Fig 26

With this information the user can now solve the equation and retrieve the solu-tion vector as in Fig 27

22 Matthew G Knepley et al

DMMGCreate(comm gridmglevels user ampdmmg)DMMGSetDM(dmmg (DM) da)

Fig 25Creating a DMMG

DMMGSetSNESLocal(dmmg FormFunctionLocal FormJacobianLocal 0 0)DMMGSetInitialGuess(dmmg FormInitialGuess)

Fig 26Providing discretization and assembly routines to DMMG

DMMGSolve(dmmg)soln = DMMGGetx(dmmg)

Fig 27Solving the problem with DMMG

5 Extensions

PETSc is not a complete environment for simulating physical phenomena Ratherit is a set of tools that allow the user to assemble such an environment tailored toa specific application As such PETSc will never provide every facility appropriatefor a given simulation However the user can easily extend PETSc and supplementits capabilities We present two examples in the context of the mantle subductionsimulation

51 Simple Continuation

Because of the highly nonlinear dependence of viscosity on temperature and strainrate in equation (1) the iteration in Newtonrsquos method can fail to converge withouta good starting guess of the solution On the other hand for the isoviscous casewhere temperature is coupled to velocity only through advection a solution is easilyreached It is therefore natural to propose a continuation scheme in the viscosity be-ginning with constant viscosity and progressing to full variability A simple adaptivecontinuation model was devised in which the viscosity was raised to a power be-tween zero and oneη rarr ηα whereα=0 corresponds to constant viscosity andα=1corresponds to natural viscosity variation In the continuation loopα was increasedtowards unity by an amount dependent on the rate of convergence of the previousSNES solve

PETSc itself provides no special support for continuation but it is sufficientlymodular that a continuation loop was readily constructed in the application codeusing repeated calls toDMMGSolve() and found to work quite well

Developing a Geodynamics Simulator with PETSc 23

52 Simple Steering

No rigorous justification had been given for the continuation strategy above andconsequently it was possible that for certain initial conditions Newton would fail toconverge thereby resulting in extremely long run times An observant user could de-tect this situation and abort the run but all the potentially useful solution informationup to that point would be lost A clean way to asynchronously abort the continua-tion loop was thus needed On architectures where OS signals are available PETScprovides an interface for registering signal handlers Thus we were able to define ahandler The user wishing to change the control flow of the simulation simply sendsthe appropriate signal to the process This sets a flag in the user data that causes thechange at the next iteration

6 Simulation Results

A long-standing debate has existed between theorists and observationalists over thethermal structure of subduction zones Modelers using constant viscosity flow sim-ulations to compute thermal structure have predicted relatively cold subductionzones [19] Conversely observationalists who use heat flow measurements and min-eralogical thermobarometry to estimate temperatures at depth have long claimedthat subduction zones are hotter than predicted They have invoked the presence ofstrong upwelling of hot mantle material to supply the heat This upwelling was neverpredicted by isoviscous flow models however

The debate remained unresolved until recently when several groups includingour own succeeded in developing simulations with realistically variable viscosity(for other examples [810121821]) A comparison of flow and temperature fieldsfor variable and constant viscosity simulations generated with our code is shownin Fig 28 Results from these simulations are exciting because they close the gapbetween models and observations they predict hotter mantle temperatures steepersurface thermal gradients and upwelling mantle flow

Furthermore these simulations allow for quantitative predictions of variation ofobservables with subduction parameters A recent study of subduction zone earth-quakes has identified an intriguing trend the vertical distance from subduction zonevolcanoes to the surface of the subducting slab is anti-correlated with descent rate ofthe slab [11] Preliminary calculations with our model are consistent with this trendindicating that flow and thermal structure may play an important role in determiningnot only the quantity and chemistry of magmas but also their path of transport to thesurface Further work is required to resolve this issue

Simulating the geodynamics of subduction is an example of our success in port-ing an existing code into the PETSc framework simultaneously parallelizing thecode and increasing its functionality Using this code as a template we rapidly de-veloped a related simulation of another tectonic boundary the mid-ocean ridge Thiswork was done to address a set of observations of systematic morphological asym-metry in the global mid-ocean ridge system [7] Our model confirms the qualitative

24 Matthew G Knepley et al

Dep

th k

m

log10

η and flow field

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24 Temperature degC

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Distance km

Dep

th k

m

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24

Distance km

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Isov

isco

usV

aria

ble

Vis

cosi

ty

(a) (b)

(c) (d)

Fig 282D viscosity and potential temperature fields from simulations on 8 processors with230112 degrees of freedom Panels in the top row are from a simulation withα=1 in equation(1) Panels in bottom row haveα=0 The white box in panels (a) and (c) shows the region inwhich temperature is plotted in panels (b) and (d)(a) Colors showlog10 of the viscosity fieldNote that there are more than five orders of magnitude variation in viscosity Arrows show theflow direction and magnitude (the slab is being subducted at a rate of 6 cmyear) Upwellingis evident in the flow field near the wedge corner(b) Temperature field from the variableviscosity simulation 1100C isotherm is shown as a dashed line(c) (Constant) viscosity andflow field from the isoviscous simulation Strong flow is predicted at the base of the crustdespite the low-temperature rock there No upwelling flow is predicted(d) Temperature fieldfrom isoviscous simulation Note that the mantle wedge corner is much colder than in (b)

mechanism that had been proposed to explain these observations Furthermore itshows a quantitative agreement with trends in the observed data [17] As with ourmodels of subduction the key to demonstrating the validity of the hypothesized dy-namics was simulating them with the strongly nonlinear rheology that characterizesmantle rock on geologic timescales The computational tools provided by PETSc en-abled us easily handle this rheology reducing development time and allowing us tofocus on model interpretation

References

1 PETSc SNES Example 30 httpwwwmcsanlgovpetscpetsc-2

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 20: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

20 Matthew G Knepley et al

Compute both the second invariant of the strain rate tensor and the viscosity int ViscosityField(DA da Vec X Vec VAppCtx user)

Parameter param = userminusgtparamGridInfo grid = userminusgtgridVec localXField v xdouble eps dx dz T epsC TCint ijisjsimjmilim jlim ivt

ivt = paramminusgtiviscparamminusgtivisc = paramminusgtoutput ivisc

DACreateLocalVector(da amplocalX)DAGlobalToLocalBegin(da X INSERT VALUES localX)DAGlobalToLocalEnd(da X INSERT VALUES localX)DAVecGetArray(dalocalX(void)ampx)DAVecGetArray(daV(void)ampv)

Parameters dx = gridminusgtdx dz = gridminusgtdzilim = gridminusgtniminus1 jlim = gridminusgtnjminus1

Compute real temperature strain rate and viscosity DAGetCorners(daampisampjsPETSCNULL ampimampjmPETSCNULL )for (j=js jltjs+jm j++)

for (i=is iltis+im i++) T = paramminusgtpotentialT x[j][i]T exp( (jminus05)dzparamminusgtz scale )if (iltilim ampamp jltjlim )

TC = paramminusgtpotentialT TInterp(xij) exp( jdzparamminusgtz scale ) else

TC = T Compute the values at both cell centers and cell corners eps = CalcSecInv(xijCELL CENTERuser)epsC = CalcSecInv(xijCELL CORNERuser)v[j][i]u = epsv[j][i]w = epsCv[j][i]p = Viscosity(Tepsdz(jminus05)param)v[j][i]T = Viscosity(TCepsCdzjparam)

DAVecRestoreArray(daV(void)ampv)DAVecRestoreArray(dalocalX(void)ampx)paramminusgtivisc = ivt

Fig 23 Calculation of the second invariant of the strain tensor and viscosity field for themantle subduction problem

Developing a Geodynamics Simulator with PETSc 21

KSP ksp subkspPC bjacobi ilu

SNESGetKSP(snes ampksp)KSPGetPC(ksp ampbjacobi)PCBJacobiGetSubKSP(bjacobi PETSCNULL PETSCNULL ampsubksp)KSPGetPC(subksp[0] ampilu)PCILUSetLevels(ilu levels)

Fig 24Customizing the preconditioner on a single process

has the option of providing a routine or of using the ADIC [15] or similar system forautomatic differentiation The finite difference approximation can be activated byusing the-snes fd and-mat fd coloring freq options or by providing theSNESDefaultComputeJacobianColor () function toSNESSetJacobian()

Through theSNES object the user may also access the great range of directand iterative linear solvers and preconditioners provided by PETSc Sixteen differentKrylov solvers are available including GMRES BiCGStab and LSQR along withinterfaces to popular sparse direct packages such as MUMPS [4] and SuperLU [9]In addition to a variety of incomplete factorization preconditioners including PI-LUT [16] PETSc supports additive Schwartz preconditioning algebraic multigridthrough BoomerAMG [13] and the geometric multigrid discussed below The fullpanoply of solvers and preconditioners available is catalogued on the PETSc Web-site [3]

The modularity of PETSc allows users to easily customize each solver For in-stance suppose the user wishes to increase the number of levels in ILU(k) precon-ditioning on one block of a block-Jacobi scheme The code fragment in Fig 24 willset the number of levels of fill tolevels

PETSc provides an elegant framework for managing geometric multigrid in com-bination with aDA which is abstracted in theDMMGobject In the same way thatany linear solve can be performed withSNES any preconditioner or solver com-bination is available inDMMGby using a single level That is the same user codesupports both geometric multigrid as well as direct methods Krylov methods etcWhen creating aDMMG the user specifies the number of levels of grid refinementand provides the coarse-grid information (see Fig 25) which is always aDA objectat present although unstructured prototypes are being developed Then theDMMGobject calculates the intergrid transfer operators prolongation and restriction and

allocates objects to hold the solution at each level The user must provide the actionof the residual operatorF and can optionally provide a routine to compute the Jaco-bian matrix (which will be called on each level) and a routine to compute the initialguess as in Fig 26

With this information the user can now solve the equation and retrieve the solu-tion vector as in Fig 27

22 Matthew G Knepley et al

DMMGCreate(comm gridmglevels user ampdmmg)DMMGSetDM(dmmg (DM) da)

Fig 25Creating a DMMG

DMMGSetSNESLocal(dmmg FormFunctionLocal FormJacobianLocal 0 0)DMMGSetInitialGuess(dmmg FormInitialGuess)

Fig 26Providing discretization and assembly routines to DMMG

DMMGSolve(dmmg)soln = DMMGGetx(dmmg)

Fig 27Solving the problem with DMMG

5 Extensions

PETSc is not a complete environment for simulating physical phenomena Ratherit is a set of tools that allow the user to assemble such an environment tailored toa specific application As such PETSc will never provide every facility appropriatefor a given simulation However the user can easily extend PETSc and supplementits capabilities We present two examples in the context of the mantle subductionsimulation

51 Simple Continuation

Because of the highly nonlinear dependence of viscosity on temperature and strainrate in equation (1) the iteration in Newtonrsquos method can fail to converge withouta good starting guess of the solution On the other hand for the isoviscous casewhere temperature is coupled to velocity only through advection a solution is easilyreached It is therefore natural to propose a continuation scheme in the viscosity be-ginning with constant viscosity and progressing to full variability A simple adaptivecontinuation model was devised in which the viscosity was raised to a power be-tween zero and oneη rarr ηα whereα=0 corresponds to constant viscosity andα=1corresponds to natural viscosity variation In the continuation loopα was increasedtowards unity by an amount dependent on the rate of convergence of the previousSNES solve

PETSc itself provides no special support for continuation but it is sufficientlymodular that a continuation loop was readily constructed in the application codeusing repeated calls toDMMGSolve() and found to work quite well

Developing a Geodynamics Simulator with PETSc 23

52 Simple Steering

No rigorous justification had been given for the continuation strategy above andconsequently it was possible that for certain initial conditions Newton would fail toconverge thereby resulting in extremely long run times An observant user could de-tect this situation and abort the run but all the potentially useful solution informationup to that point would be lost A clean way to asynchronously abort the continua-tion loop was thus needed On architectures where OS signals are available PETScprovides an interface for registering signal handlers Thus we were able to define ahandler The user wishing to change the control flow of the simulation simply sendsthe appropriate signal to the process This sets a flag in the user data that causes thechange at the next iteration

6 Simulation Results

A long-standing debate has existed between theorists and observationalists over thethermal structure of subduction zones Modelers using constant viscosity flow sim-ulations to compute thermal structure have predicted relatively cold subductionzones [19] Conversely observationalists who use heat flow measurements and min-eralogical thermobarometry to estimate temperatures at depth have long claimedthat subduction zones are hotter than predicted They have invoked the presence ofstrong upwelling of hot mantle material to supply the heat This upwelling was neverpredicted by isoviscous flow models however

The debate remained unresolved until recently when several groups includingour own succeeded in developing simulations with realistically variable viscosity(for other examples [810121821]) A comparison of flow and temperature fieldsfor variable and constant viscosity simulations generated with our code is shownin Fig 28 Results from these simulations are exciting because they close the gapbetween models and observations they predict hotter mantle temperatures steepersurface thermal gradients and upwelling mantle flow

Furthermore these simulations allow for quantitative predictions of variation ofobservables with subduction parameters A recent study of subduction zone earth-quakes has identified an intriguing trend the vertical distance from subduction zonevolcanoes to the surface of the subducting slab is anti-correlated with descent rate ofthe slab [11] Preliminary calculations with our model are consistent with this trendindicating that flow and thermal structure may play an important role in determiningnot only the quantity and chemistry of magmas but also their path of transport to thesurface Further work is required to resolve this issue

Simulating the geodynamics of subduction is an example of our success in port-ing an existing code into the PETSc framework simultaneously parallelizing thecode and increasing its functionality Using this code as a template we rapidly de-veloped a related simulation of another tectonic boundary the mid-ocean ridge Thiswork was done to address a set of observations of systematic morphological asym-metry in the global mid-ocean ridge system [7] Our model confirms the qualitative

24 Matthew G Knepley et al

Dep

th k

m

log10

η and flow field

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24 Temperature degC

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Distance km

Dep

th k

m

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24

Distance km

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Isov

isco

usV

aria

ble

Vis

cosi

ty

(a) (b)

(c) (d)

Fig 282D viscosity and potential temperature fields from simulations on 8 processors with230112 degrees of freedom Panels in the top row are from a simulation withα=1 in equation(1) Panels in bottom row haveα=0 The white box in panels (a) and (c) shows the region inwhich temperature is plotted in panels (b) and (d)(a) Colors showlog10 of the viscosity fieldNote that there are more than five orders of magnitude variation in viscosity Arrows show theflow direction and magnitude (the slab is being subducted at a rate of 6 cmyear) Upwellingis evident in the flow field near the wedge corner(b) Temperature field from the variableviscosity simulation 1100C isotherm is shown as a dashed line(c) (Constant) viscosity andflow field from the isoviscous simulation Strong flow is predicted at the base of the crustdespite the low-temperature rock there No upwelling flow is predicted(d) Temperature fieldfrom isoviscous simulation Note that the mantle wedge corner is much colder than in (b)

mechanism that had been proposed to explain these observations Furthermore itshows a quantitative agreement with trends in the observed data [17] As with ourmodels of subduction the key to demonstrating the validity of the hypothesized dy-namics was simulating them with the strongly nonlinear rheology that characterizesmantle rock on geologic timescales The computational tools provided by PETSc en-abled us easily handle this rheology reducing development time and allowing us tofocus on model interpretation

References

1 PETSc SNES Example 30 httpwwwmcsanlgovpetscpetsc-2

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 21: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

Developing a Geodynamics Simulator with PETSc 21

KSP ksp subkspPC bjacobi ilu

SNESGetKSP(snes ampksp)KSPGetPC(ksp ampbjacobi)PCBJacobiGetSubKSP(bjacobi PETSCNULL PETSCNULL ampsubksp)KSPGetPC(subksp[0] ampilu)PCILUSetLevels(ilu levels)

Fig 24Customizing the preconditioner on a single process

has the option of providing a routine or of using the ADIC [15] or similar system forautomatic differentiation The finite difference approximation can be activated byusing the-snes fd and-mat fd coloring freq options or by providing theSNESDefaultComputeJacobianColor () function toSNESSetJacobian()

Through theSNES object the user may also access the great range of directand iterative linear solvers and preconditioners provided by PETSc Sixteen differentKrylov solvers are available including GMRES BiCGStab and LSQR along withinterfaces to popular sparse direct packages such as MUMPS [4] and SuperLU [9]In addition to a variety of incomplete factorization preconditioners including PI-LUT [16] PETSc supports additive Schwartz preconditioning algebraic multigridthrough BoomerAMG [13] and the geometric multigrid discussed below The fullpanoply of solvers and preconditioners available is catalogued on the PETSc Web-site [3]

The modularity of PETSc allows users to easily customize each solver For in-stance suppose the user wishes to increase the number of levels in ILU(k) precon-ditioning on one block of a block-Jacobi scheme The code fragment in Fig 24 willset the number of levels of fill tolevels

PETSc provides an elegant framework for managing geometric multigrid in com-bination with aDA which is abstracted in theDMMGobject In the same way thatany linear solve can be performed withSNES any preconditioner or solver com-bination is available inDMMGby using a single level That is the same user codesupports both geometric multigrid as well as direct methods Krylov methods etcWhen creating aDMMG the user specifies the number of levels of grid refinementand provides the coarse-grid information (see Fig 25) which is always aDA objectat present although unstructured prototypes are being developed Then theDMMGobject calculates the intergrid transfer operators prolongation and restriction and

allocates objects to hold the solution at each level The user must provide the actionof the residual operatorF and can optionally provide a routine to compute the Jaco-bian matrix (which will be called on each level) and a routine to compute the initialguess as in Fig 26

With this information the user can now solve the equation and retrieve the solu-tion vector as in Fig 27

22 Matthew G Knepley et al

DMMGCreate(comm gridmglevels user ampdmmg)DMMGSetDM(dmmg (DM) da)

Fig 25Creating a DMMG

DMMGSetSNESLocal(dmmg FormFunctionLocal FormJacobianLocal 0 0)DMMGSetInitialGuess(dmmg FormInitialGuess)

Fig 26Providing discretization and assembly routines to DMMG

DMMGSolve(dmmg)soln = DMMGGetx(dmmg)

Fig 27Solving the problem with DMMG

5 Extensions

PETSc is not a complete environment for simulating physical phenomena Ratherit is a set of tools that allow the user to assemble such an environment tailored toa specific application As such PETSc will never provide every facility appropriatefor a given simulation However the user can easily extend PETSc and supplementits capabilities We present two examples in the context of the mantle subductionsimulation

51 Simple Continuation

Because of the highly nonlinear dependence of viscosity on temperature and strainrate in equation (1) the iteration in Newtonrsquos method can fail to converge withouta good starting guess of the solution On the other hand for the isoviscous casewhere temperature is coupled to velocity only through advection a solution is easilyreached It is therefore natural to propose a continuation scheme in the viscosity be-ginning with constant viscosity and progressing to full variability A simple adaptivecontinuation model was devised in which the viscosity was raised to a power be-tween zero and oneη rarr ηα whereα=0 corresponds to constant viscosity andα=1corresponds to natural viscosity variation In the continuation loopα was increasedtowards unity by an amount dependent on the rate of convergence of the previousSNES solve

PETSc itself provides no special support for continuation but it is sufficientlymodular that a continuation loop was readily constructed in the application codeusing repeated calls toDMMGSolve() and found to work quite well

Developing a Geodynamics Simulator with PETSc 23

52 Simple Steering

No rigorous justification had been given for the continuation strategy above andconsequently it was possible that for certain initial conditions Newton would fail toconverge thereby resulting in extremely long run times An observant user could de-tect this situation and abort the run but all the potentially useful solution informationup to that point would be lost A clean way to asynchronously abort the continua-tion loop was thus needed On architectures where OS signals are available PETScprovides an interface for registering signal handlers Thus we were able to define ahandler The user wishing to change the control flow of the simulation simply sendsthe appropriate signal to the process This sets a flag in the user data that causes thechange at the next iteration

6 Simulation Results

A long-standing debate has existed between theorists and observationalists over thethermal structure of subduction zones Modelers using constant viscosity flow sim-ulations to compute thermal structure have predicted relatively cold subductionzones [19] Conversely observationalists who use heat flow measurements and min-eralogical thermobarometry to estimate temperatures at depth have long claimedthat subduction zones are hotter than predicted They have invoked the presence ofstrong upwelling of hot mantle material to supply the heat This upwelling was neverpredicted by isoviscous flow models however

The debate remained unresolved until recently when several groups includingour own succeeded in developing simulations with realistically variable viscosity(for other examples [810121821]) A comparison of flow and temperature fieldsfor variable and constant viscosity simulations generated with our code is shownin Fig 28 Results from these simulations are exciting because they close the gapbetween models and observations they predict hotter mantle temperatures steepersurface thermal gradients and upwelling mantle flow

Furthermore these simulations allow for quantitative predictions of variation ofobservables with subduction parameters A recent study of subduction zone earth-quakes has identified an intriguing trend the vertical distance from subduction zonevolcanoes to the surface of the subducting slab is anti-correlated with descent rate ofthe slab [11] Preliminary calculations with our model are consistent with this trendindicating that flow and thermal structure may play an important role in determiningnot only the quantity and chemistry of magmas but also their path of transport to thesurface Further work is required to resolve this issue

Simulating the geodynamics of subduction is an example of our success in port-ing an existing code into the PETSc framework simultaneously parallelizing thecode and increasing its functionality Using this code as a template we rapidly de-veloped a related simulation of another tectonic boundary the mid-ocean ridge Thiswork was done to address a set of observations of systematic morphological asym-metry in the global mid-ocean ridge system [7] Our model confirms the qualitative

24 Matthew G Knepley et al

Dep

th k

m

log10

η and flow field

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24 Temperature degC

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Distance km

Dep

th k

m

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24

Distance km

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Isov

isco

usV

aria

ble

Vis

cosi

ty

(a) (b)

(c) (d)

Fig 282D viscosity and potential temperature fields from simulations on 8 processors with230112 degrees of freedom Panels in the top row are from a simulation withα=1 in equation(1) Panels in bottom row haveα=0 The white box in panels (a) and (c) shows the region inwhich temperature is plotted in panels (b) and (d)(a) Colors showlog10 of the viscosity fieldNote that there are more than five orders of magnitude variation in viscosity Arrows show theflow direction and magnitude (the slab is being subducted at a rate of 6 cmyear) Upwellingis evident in the flow field near the wedge corner(b) Temperature field from the variableviscosity simulation 1100C isotherm is shown as a dashed line(c) (Constant) viscosity andflow field from the isoviscous simulation Strong flow is predicted at the base of the crustdespite the low-temperature rock there No upwelling flow is predicted(d) Temperature fieldfrom isoviscous simulation Note that the mantle wedge corner is much colder than in (b)

mechanism that had been proposed to explain these observations Furthermore itshows a quantitative agreement with trends in the observed data [17] As with ourmodels of subduction the key to demonstrating the validity of the hypothesized dy-namics was simulating them with the strongly nonlinear rheology that characterizesmantle rock on geologic timescales The computational tools provided by PETSc en-abled us easily handle this rheology reducing development time and allowing us tofocus on model interpretation

References

1 PETSc SNES Example 30 httpwwwmcsanlgovpetscpetsc-2

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 22: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

22 Matthew G Knepley et al

DMMGCreate(comm gridmglevels user ampdmmg)DMMGSetDM(dmmg (DM) da)

Fig 25Creating a DMMG

DMMGSetSNESLocal(dmmg FormFunctionLocal FormJacobianLocal 0 0)DMMGSetInitialGuess(dmmg FormInitialGuess)

Fig 26Providing discretization and assembly routines to DMMG

DMMGSolve(dmmg)soln = DMMGGetx(dmmg)

Fig 27Solving the problem with DMMG

5 Extensions

PETSc is not a complete environment for simulating physical phenomena Ratherit is a set of tools that allow the user to assemble such an environment tailored toa specific application As such PETSc will never provide every facility appropriatefor a given simulation However the user can easily extend PETSc and supplementits capabilities We present two examples in the context of the mantle subductionsimulation

51 Simple Continuation

Because of the highly nonlinear dependence of viscosity on temperature and strainrate in equation (1) the iteration in Newtonrsquos method can fail to converge withouta good starting guess of the solution On the other hand for the isoviscous casewhere temperature is coupled to velocity only through advection a solution is easilyreached It is therefore natural to propose a continuation scheme in the viscosity be-ginning with constant viscosity and progressing to full variability A simple adaptivecontinuation model was devised in which the viscosity was raised to a power be-tween zero and oneη rarr ηα whereα=0 corresponds to constant viscosity andα=1corresponds to natural viscosity variation In the continuation loopα was increasedtowards unity by an amount dependent on the rate of convergence of the previousSNES solve

PETSc itself provides no special support for continuation but it is sufficientlymodular that a continuation loop was readily constructed in the application codeusing repeated calls toDMMGSolve() and found to work quite well

Developing a Geodynamics Simulator with PETSc 23

52 Simple Steering

No rigorous justification had been given for the continuation strategy above andconsequently it was possible that for certain initial conditions Newton would fail toconverge thereby resulting in extremely long run times An observant user could de-tect this situation and abort the run but all the potentially useful solution informationup to that point would be lost A clean way to asynchronously abort the continua-tion loop was thus needed On architectures where OS signals are available PETScprovides an interface for registering signal handlers Thus we were able to define ahandler The user wishing to change the control flow of the simulation simply sendsthe appropriate signal to the process This sets a flag in the user data that causes thechange at the next iteration

6 Simulation Results

A long-standing debate has existed between theorists and observationalists over thethermal structure of subduction zones Modelers using constant viscosity flow sim-ulations to compute thermal structure have predicted relatively cold subductionzones [19] Conversely observationalists who use heat flow measurements and min-eralogical thermobarometry to estimate temperatures at depth have long claimedthat subduction zones are hotter than predicted They have invoked the presence ofstrong upwelling of hot mantle material to supply the heat This upwelling was neverpredicted by isoviscous flow models however

The debate remained unresolved until recently when several groups includingour own succeeded in developing simulations with realistically variable viscosity(for other examples [810121821]) A comparison of flow and temperature fieldsfor variable and constant viscosity simulations generated with our code is shownin Fig 28 Results from these simulations are exciting because they close the gapbetween models and observations they predict hotter mantle temperatures steepersurface thermal gradients and upwelling mantle flow

Furthermore these simulations allow for quantitative predictions of variation ofobservables with subduction parameters A recent study of subduction zone earth-quakes has identified an intriguing trend the vertical distance from subduction zonevolcanoes to the surface of the subducting slab is anti-correlated with descent rate ofthe slab [11] Preliminary calculations with our model are consistent with this trendindicating that flow and thermal structure may play an important role in determiningnot only the quantity and chemistry of magmas but also their path of transport to thesurface Further work is required to resolve this issue

Simulating the geodynamics of subduction is an example of our success in port-ing an existing code into the PETSc framework simultaneously parallelizing thecode and increasing its functionality Using this code as a template we rapidly de-veloped a related simulation of another tectonic boundary the mid-ocean ridge Thiswork was done to address a set of observations of systematic morphological asym-metry in the global mid-ocean ridge system [7] Our model confirms the qualitative

24 Matthew G Knepley et al

Dep

th k

m

log10

η and flow field

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24 Temperature degC

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Distance km

Dep

th k

m

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24

Distance km

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Isov

isco

usV

aria

ble

Vis

cosi

ty

(a) (b)

(c) (d)

Fig 282D viscosity and potential temperature fields from simulations on 8 processors with230112 degrees of freedom Panels in the top row are from a simulation withα=1 in equation(1) Panels in bottom row haveα=0 The white box in panels (a) and (c) shows the region inwhich temperature is plotted in panels (b) and (d)(a) Colors showlog10 of the viscosity fieldNote that there are more than five orders of magnitude variation in viscosity Arrows show theflow direction and magnitude (the slab is being subducted at a rate of 6 cmyear) Upwellingis evident in the flow field near the wedge corner(b) Temperature field from the variableviscosity simulation 1100C isotherm is shown as a dashed line(c) (Constant) viscosity andflow field from the isoviscous simulation Strong flow is predicted at the base of the crustdespite the low-temperature rock there No upwelling flow is predicted(d) Temperature fieldfrom isoviscous simulation Note that the mantle wedge corner is much colder than in (b)

mechanism that had been proposed to explain these observations Furthermore itshows a quantitative agreement with trends in the observed data [17] As with ourmodels of subduction the key to demonstrating the validity of the hypothesized dy-namics was simulating them with the strongly nonlinear rheology that characterizesmantle rock on geologic timescales The computational tools provided by PETSc en-abled us easily handle this rheology reducing development time and allowing us tofocus on model interpretation

References

1 PETSc SNES Example 30 httpwwwmcsanlgovpetscpetsc-2

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 23: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

Developing a Geodynamics Simulator with PETSc 23

52 Simple Steering

No rigorous justification had been given for the continuation strategy above andconsequently it was possible that for certain initial conditions Newton would fail toconverge thereby resulting in extremely long run times An observant user could de-tect this situation and abort the run but all the potentially useful solution informationup to that point would be lost A clean way to asynchronously abort the continua-tion loop was thus needed On architectures where OS signals are available PETScprovides an interface for registering signal handlers Thus we were able to define ahandler The user wishing to change the control flow of the simulation simply sendsthe appropriate signal to the process This sets a flag in the user data that causes thechange at the next iteration

6 Simulation Results

A long-standing debate has existed between theorists and observationalists over thethermal structure of subduction zones Modelers using constant viscosity flow sim-ulations to compute thermal structure have predicted relatively cold subductionzones [19] Conversely observationalists who use heat flow measurements and min-eralogical thermobarometry to estimate temperatures at depth have long claimedthat subduction zones are hotter than predicted They have invoked the presence ofstrong upwelling of hot mantle material to supply the heat This upwelling was neverpredicted by isoviscous flow models however

The debate remained unresolved until recently when several groups includingour own succeeded in developing simulations with realistically variable viscosity(for other examples [810121821]) A comparison of flow and temperature fieldsfor variable and constant viscosity simulations generated with our code is shownin Fig 28 Results from these simulations are exciting because they close the gapbetween models and observations they predict hotter mantle temperatures steepersurface thermal gradients and upwelling mantle flow

Furthermore these simulations allow for quantitative predictions of variation ofobservables with subduction parameters A recent study of subduction zone earth-quakes has identified an intriguing trend the vertical distance from subduction zonevolcanoes to the surface of the subducting slab is anti-correlated with descent rate ofthe slab [11] Preliminary calculations with our model are consistent with this trendindicating that flow and thermal structure may play an important role in determiningnot only the quantity and chemistry of magmas but also their path of transport to thesurface Further work is required to resolve this issue

Simulating the geodynamics of subduction is an example of our success in port-ing an existing code into the PETSc framework simultaneously parallelizing thecode and increasing its functionality Using this code as a template we rapidly de-veloped a related simulation of another tectonic boundary the mid-ocean ridge Thiswork was done to address a set of observations of systematic morphological asym-metry in the global mid-ocean ridge system [7] Our model confirms the qualitative

24 Matthew G Knepley et al

Dep

th k

m

log10

η and flow field

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24 Temperature degC

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Distance km

Dep

th k

m

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24

Distance km

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Isov

isco

usV

aria

ble

Vis

cosi

ty

(a) (b)

(c) (d)

Fig 282D viscosity and potential temperature fields from simulations on 8 processors with230112 degrees of freedom Panels in the top row are from a simulation withα=1 in equation(1) Panels in bottom row haveα=0 The white box in panels (a) and (c) shows the region inwhich temperature is plotted in panels (b) and (d)(a) Colors showlog10 of the viscosity fieldNote that there are more than five orders of magnitude variation in viscosity Arrows show theflow direction and magnitude (the slab is being subducted at a rate of 6 cmyear) Upwellingis evident in the flow field near the wedge corner(b) Temperature field from the variableviscosity simulation 1100C isotherm is shown as a dashed line(c) (Constant) viscosity andflow field from the isoviscous simulation Strong flow is predicted at the base of the crustdespite the low-temperature rock there No upwelling flow is predicted(d) Temperature fieldfrom isoviscous simulation Note that the mantle wedge corner is much colder than in (b)

mechanism that had been proposed to explain these observations Furthermore itshows a quantitative agreement with trends in the observed data [17] As with ourmodels of subduction the key to demonstrating the validity of the hypothesized dy-namics was simulating them with the strongly nonlinear rheology that characterizesmantle rock on geologic timescales The computational tools provided by PETSc en-abled us easily handle this rheology reducing development time and allowing us tofocus on model interpretation

References

1 PETSc SNES Example 30 httpwwwmcsanlgovpetscpetsc-2

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 24: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

24 Matthew G Knepley et al

Dep

th k

m

log10

η and flow field

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24 Temperature degC

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Distance km

Dep

th k

m

0 100 200 300

0

50

100

150

200

250

300 19

20

21

22

23

24

Distance km

1100

50 100 150 200

50

100

150

0

200

400

600

800

1000

1200

1400

Isov

isco

usV

aria

ble

Vis

cosi

ty

(a) (b)

(c) (d)

Fig 282D viscosity and potential temperature fields from simulations on 8 processors with230112 degrees of freedom Panels in the top row are from a simulation withα=1 in equation(1) Panels in bottom row haveα=0 The white box in panels (a) and (c) shows the region inwhich temperature is plotted in panels (b) and (d)(a) Colors showlog10 of the viscosity fieldNote that there are more than five orders of magnitude variation in viscosity Arrows show theflow direction and magnitude (the slab is being subducted at a rate of 6 cmyear) Upwellingis evident in the flow field near the wedge corner(b) Temperature field from the variableviscosity simulation 1100C isotherm is shown as a dashed line(c) (Constant) viscosity andflow field from the isoviscous simulation Strong flow is predicted at the base of the crustdespite the low-temperature rock there No upwelling flow is predicted(d) Temperature fieldfrom isoviscous simulation Note that the mantle wedge corner is much colder than in (b)

mechanism that had been proposed to explain these observations Furthermore itshows a quantitative agreement with trends in the observed data [17] As with ourmodels of subduction the key to demonstrating the validity of the hypothesized dy-namics was simulating them with the strongly nonlinear rheology that characterizesmantle rock on geologic timescales The computational tools provided by PETSc en-abled us easily handle this rheology reducing development time and allowing us tofocus on model interpretation

References

1 PETSc SNES Example 30 httpwwwmcsanlgovpetscpetsc-2

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 25: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

Developing a Geodynamics Simulator with PETSc 25

snapshotspetsc-currentsrcsnesexamplestutorialsex30chtml

2 PETSc SNES Example 5 httpwwwmcsanlgovpetscpetsc-2snapshotspetsc-currentsrcsnesexamplestutorialsex5f90Fhtml

3 PETSc Solvers httpwwwmcsanlgovpetscpetsc-2documentationlinearsolvertablehtml

4 Amestoy P R Duff I S LrsquoExcellent J-Y and Koster J A fully asynchronous mul-tifrontal solver using distributed dynamic schedulingSIAM Journal on Matrix Analysisand Applications 23(1)15ndash41 (2001)

5 Balay S Buschelman K Eijkhout V Gropp W D Kaushik D Knepley M GMcInnes L C Smith B F and Zhang H PETSc Web pagehttpwwwmcsanlgovpetsc

6 Balay S Gropp W D McInnes L C and Smith B F Efficient management of par-allelism in object oriented numerical software libraries In Arge E Bruaset A M andLangtangen H P editorsModern Software Tools in Scientific Computing pages 163ndash202 Birkhauser Press (1997)

7 Carbotte S Small C and Donnelly K The influence of ridge migration on the mag-matic segmentation of mid-ocean ridgesNature 429743ndash746 (2004)

8 Conder J Wiens D and Morris J On the decompression melting structure at volcanicarcs and back-arc spreading centersGeophys Res Letts 29 (2002)

9 Demmel J W Gilbert J R and Li X S SuperLU userrsquos guide Technical ReportLBNL-44289 Lawrence Berkeley National Laboratory (2003)

10 Eberle M Grasset O and Sotin C A numerical study of the interaction of the man-tle wedge subducting slab and overriding platePhys Earth Planet In 134191ndash202(2002)

11 England P Engdahl R and Thatcher W Systematic variation in the depth of slabsbeneath arc volcanosGeophys J Int 156(2)377ndash408 (2003)

12 Furukawa Y Depth of the decoupling plate interface and thermal structure under arcsJ Geophys Res 9820005ndash20013 (1993)

13 Henson V E and Yang U M BoomerAMG A parallel algebraic multigrid solver andpreconditioner Technical Report UCRL-JC-133948 Lawrence Livermore National Lab-oratory (2000)

14 Hirth G and Kohlstedt D Rheology of the upper mantle and the mantle wedge A viewfrom the experimentalists InInside the Subduction Factory volume 138 ofGeophysicalMonograph American Geophysical Union (2003)

15 Hovland P Norris B and Smith B Making automatic differentiation truly automaticCoupling PETSc with ADIC InProceedings of ICCS2002 (2002)

16 Hysom D and Pothen A A scalable parallel algorithm for incomplete factor precondi-tioning SIAM Journal on Scientific Computing 222194ndash2215 (2001)

17 Katz R Spiegelman M and Carbotte S Ridge migration asthenospheric flow and theorigin of magmatic segmentation in the global mid-ocean ridge systemGeophys ResLetts 31 (2004)

18 Kelemen P Rilling J Parmentier E Mehl L and Hacker B Thermal structure dueto solid-state flow in the mantle wedge beneath arcs InInside the Subduction Factoryvolume 138 ofGeophysical Monograph American Geophysical Union (2003)

19 Peacock S and Wang K Seismic consequences of warm versus cool subduction meta-morphism Examples from southwest and northeast JapanScience 286937ndash939 (1999)

20 Stern R Subduction zonesRev Geophys 40(4) (2002)

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)

Page 26: Developing a Geodynamics Simulator with PETScfoalab.earth.ox.ac.uk/publications/Knepley_etal_Springer06.pdf · Developing a Geodynamics Simulator with PETSc Matthew G. Knepley 1,

26 Matthew G Knepley et al

21 van Keken P Kiefer B and Peacock S High-resolution models of subduction zonesImplications for mineral dehydration reactions and the transport of water into the deepmantleGeochem Geophys Geosys 3(10) (2003)