Massively Parallel Magnetohydrodynamics on the Cray XT3
Joshua Breslau and Jin Chen
Princeton Plasma Physics Laboratory
Cray XT3 Technical Workshop
Nashville, TN
February 28, 2007
Motivation: Modeling Magnetic Confinement Fusion Experiments
NSTX (Spherical Torus)
NCSX (Compact Stellarator)
ITER (Advanced Tokamak)
Characteristics of Magnetic Confinement Fusion Experiments
• Multispecies hydrogen plasma at ~10⁸ °C (Te and Ti may differ): low collisionality, high electrical conductivity.
• Toroidal topology with complex boundary geometry.
• Strong toroidal magnetic field giving highly anisotropic transport, low β.
• Rotational transform gives nested flux surfaces.
• Spatial scales range from electron skin depth, ~10⁻⁴ m, to major radius, ~6 m.
• Time scales range from Alfvén wave transit time (~μs) to discharge time, ~100 s.
• Susceptible to microinstabilities leading to loss of energy confinement, and to macroinstabilities leading to large-scale rearrangement of plasma and possible disruption.
Extended MHD Equations
The M3D Code
• Physics models include ideal and resistive MHD, two-fluid, or hybrid with kinetic ions.
• Field and velocity variables are expressed in terms of potentials, keeping B divergence-free and separating compressible and incompressible components of flow.
• Uses linear, 2nd, or 3rd-order finite elements in-plane on an unstructured triangular mesh.
• Uses 4th-order finite differences between planes or pseudo-spectral derivatives.
• Partially implicit treatment allows efficient advance over dissipative time scales but requires small time steps relative to τA.
• Linear and nonlinear modes of operation are available.
• The PETSc library is used for parallelization and linear solves with Krylov methods.
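The fourth-order differencing between planes listed above can be illustrated with the standard five-point central stencil on a periodic angle grid. This is a minimal sketch, not M3D's implementation: the function names `ddphi_4th` and `max_error` and the sine test field are hypothetical, and only NumPy is assumed.

```python
import numpy as np

def ddphi_4th(f, dphi):
    """Fourth-order central difference along axis 0 (the plane index),
    with periodic wraparound in the toroidal angle."""
    f_p1 = np.roll(f, -1, axis=0)  # value on plane i+1
    f_p2 = np.roll(f, -2, axis=0)  # value on plane i+2
    f_m1 = np.roll(f, 1, axis=0)   # value on plane i-1
    f_m2 = np.roll(f, 2, axis=0)   # value on plane i-2
    return (-f_p2 + 8.0 * f_p1 - 8.0 * f_m1 + f_m2) / (12.0 * dphi)

def max_error(n_planes, mode=2):
    """Largest error when differentiating sin(mode*phi) on n_planes planes."""
    phi = 2.0 * np.pi * np.arange(n_planes) / n_planes
    approx = ddphi_4th(np.sin(mode * phi), 2.0 * np.pi / n_planes)
    return np.max(np.abs(approx - mode * np.cos(mode * phi)))
```

For a smooth field, doubling the number of planes should cut `max_error` by roughly a factor of 16, which is the signature of a fourth-order scheme.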
M3D (multi-level 3D) is a 3D nonlinear extended MHD code in toroidal geometry maintained by a multi-institutional collaboration, designed for the study of macroscopic instabilities in tokamaks and stellarators.
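The statement that a potential representation keeps B divergence-free can be checked numerically in a simplified setting. The sketch below assumes a periodic Cartesian box and a random vector potential A, which is not M3D's toroidal representation; the helper names `d`, `curl`, and `div` are hypothetical. It shows that the discrete divergence of a discrete curl vanishes to round-off when the same central difference is used for every derivative.

```python
import numpy as np

def d(f, axis, h):
    """Second-order central difference along one axis, periodic."""
    return (np.roll(f, -1, axis=axis) - np.roll(f, 1, axis=axis)) / (2.0 * h)

def curl(A, h):
    """Discrete curl of a vector field A = (Ax, Ay, Az)."""
    Ax, Ay, Az = A
    return (d(Az, 1, h) - d(Ay, 2, h),
            d(Ax, 2, h) - d(Az, 0, h),
            d(Ay, 0, h) - d(Ax, 1, h))

def div(B, h):
    """Discrete divergence of a vector field B = (Bx, By, Bz)."""
    return d(B[0], 0, h) + d(B[1], 1, h) + d(B[2], 2, h)

# Hypothetical test data: a random vector potential on an 8x8x8 periodic grid.
rng = np.random.default_rng(0)
h = 2.0 * np.pi / 8
A = rng.standard_normal((3, 8, 8, 8))
B = curl(A, h)                      # B is built from the potential, never stored directly
max_div = np.max(np.abs(div(B, h)))  # stays at round-off level
```

Because the constraint holds by construction, no separate divergence-cleaning step is needed in this toy setting.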
M3D Mesh
Single plane
Full torus
Radial zones align with flux surfaces. 2n+1 planes needed to resolve toroidal mode #n.
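The 2n+1 plane count follows from a sampling argument: mode n has a cosine and a sine component, and both must be recoverable from the plane values. The sketch below is illustrative only; the function names `sample_planes` and `mode_amplitudes` and the amplitudes are hypothetical. It recovers both components from 2n+1 equally spaced planes and shows that with only 2n planes the sine component is sampled as zero everywhere.

```python
import numpy as np

def sample_planes(n_planes, n, a, b):
    """Sample a*cos(n*phi) + b*sin(n*phi) on equally spaced planes."""
    phi = 2.0 * np.pi * np.arange(n_planes) / n_planes
    return a * np.cos(n * phi) + b * np.sin(n * phi)

def mode_amplitudes(samples, n):
    """Return (cosine, sine) amplitudes of toroidal mode n.
    Valid when n is strictly below half the number of planes."""
    c = np.fft.rfft(samples)[n] / len(samples)
    return 2.0 * c.real, -2.0 * c.imag

n = 3
a, b = 0.7, -0.4                      # hypothetical amplitudes
recovered = mode_amplitudes(sample_planes(2 * n + 1, n, a, b), n)

# With only 2n planes, sin(n*phi) is zero on every plane and cannot be seen.
sine_on_2n_planes = sample_planes(2 * n, n, 0.0, 1.0)
```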
Domain Decomposition
Poloidal (cross-section view)
Toroidal (overhead view)
or
D = 1, F = 5
D = 3, F = 3
B = 16
Linear solves independent on each processor
Linear solves parallel over processors
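A two-level decomposition of this kind can be sketched as a mapping from a flat processor rank to a (toroidal, poloidal) pair. The rank ordering, the function names, and the even division of planes are all assumptions made for illustration; the slides do not specify how M3D numbers its processors.

```python
def rank_to_coords(rank, n_toroidal, n_poloidal):
    """Map a flat rank to (toroidal index, poloidal index).
    Assumes the poloidal index varies fastest (hypothetical ordering)."""
    if not 0 <= rank < n_toroidal * n_poloidal:
        raise ValueError("rank out of range")
    return divmod(rank, n_poloidal)

def planes_for(tor_idx, n_planes, n_toroidal):
    """Planes owned by one toroidal group, assuming an even split."""
    if n_planes % n_toroidal:
        raise ValueError("planes must divide evenly among toroidal groups")
    per_group = n_planes // n_toroidal
    return list(range(tor_idx * per_group, (tor_idx + 1) * per_group))

def plane_group(rank, n_toroidal, n_poloidal):
    """Ranks that share the same planes and would therefore have to
    cooperate on any in-plane linear solve."""
    tor_idx, _ = rank_to_coords(rank, n_toroidal, n_poloidal)
    return [tor_idx * n_poloidal + p for p in range(n_poloidal)]
```

With a single poloidal processor per plane group, each group contains only one rank, so its in-plane solves need no communication.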
Porting M3D to the XT3
• Previously run on Cray T3E, IBM SP, SGI Origin 2000.
• Few modifications to source code were necessary for the new platform.
• Installing the HYPRE preconditioner in the PETSc library enabled faster inversion of the symmetric form of the linear operators using CG.
• Reducing interprocessor communication was key to improving scaling.
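The structure of a preconditioned CG solve for a symmetric operator can be sketched in a few lines. M3D calls PETSc with HYPRE for this; the sketch below swaps in plain NumPy and a simple Jacobi (diagonal) preconditioner purely to show the shape of the iteration. The function name `pcg` and its parameters are hypothetical.

```python
import numpy as np

def pcg(A, b, inv_diag, tol=1e-10, max_iter=1000):
    """Preconditioned conjugate gradients for symmetric positive definite A.
    inv_diag holds 1/diag(A), i.e. a Jacobi preconditioner (an assumption
    standing in for a stronger preconditioner such as HYPRE)."""
    x = np.zeros_like(b, dtype=float)
    b_norm = np.linalg.norm(b)
    if b_norm == 0.0:
        return x, 0
    r = b.astype(float)            # residual for the zero initial guess
    z = inv_diag * r               # apply the preconditioner
    p = z.copy()
    rz = r @ z
    for it in range(1, max_iter + 1):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * b_norm:
            return x, it
        z = inv_diag * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p  # new search direction
        rz = rz_new
    return x, max_iter
```

Each iteration needs one operator application and a few global dot products; in a distributed solve those dot products are collective operations, which is one reason interprocessor communication affects scaling.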
1D Weak Scaling, Single Core (SN)
560 radial zones (626,081 vertices/plane), 16 poloidal processors, 4 planes/processor, 4 to 320 toroidal processors
poloidal domain decomposition
1D Weak Scaling, Dual Core (VN)
398 radial zones (316,013 vertices/plane), 16 poloidal processors, 4 planes/processor, 4 to 640 toroidal processors
3D Weak Scaling, Single Core
Smallest run has 64 planes, 160,000 vertices/plane on 16 toroidal x 4 poloidal processors.
Successive runs increase number of poloidal processors by 4, number of toroidal processors by 12, while maintaining 4 planes, 40,000 vertices/processor.
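The run sizes in the 3D weak-scaling series follow directly from the stated increments. The sketch below tabulates them; the function name `weak_scaling_runs` is hypothetical, and the number of runs is left as a parameter because the slides do not say how many were performed.

```python
def weak_scaling_runs(n_runs, planes_per_proc=4, verts_per_plane_per_proc=40_000):
    """Enumerate 3D weak-scaling configurations: start at 16 toroidal x 4
    poloidal processors, then add 12 toroidal and 4 poloidal per run,
    holding the per-processor workload fixed."""
    runs = []
    for i in range(n_runs):
        n_tor = 16 + 12 * i
        n_pol = 4 + 4 * i
        runs.append({
            "toroidal": n_tor,
            "poloidal": n_pol,
            "processors": n_tor * n_pol,
            "planes": n_tor * planes_per_proc,
            "vertices_per_plane": n_pol * verts_per_plane_per_proc,
        })
    return runs
```

The first entry reproduces the smallest run quoted in the slides (64 processors, 64 planes, 160,000 vertices/plane); the second would use 28 x 8 = 224 processors.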
3D Strong Scaling, Single Core
All runs have 32 planes, 474,151 vertices/plane. Smallest has 8 toroidal x 12 poloidal processors.
Successive runs increase number of poloidal processors by 6, double number of toroidal processors.
3D Strong Scaling, Single Core
All runs have 128 planes, 474,151 vertices/plane. Smallest has 32 toroidal x 12 poloidal processors.
Successive runs increase number of poloidal processors by 6, double number of toroidal processors.
3D Strong Scaling, Single Core
All runs have 208 planes, 474,151 vertices/plane. Smallest has 52 toroidal x 12 poloidal processors.
Successive runs increase number of poloidal processors by 6, double number of toroidal processors.
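The processor counts in the three strong-scaling series can be tabulated from the stated rule. The sketch below assumes each series stops once every toroidal processor holds a single plane, which the slides do not state; the function names are hypothetical, and `strong_efficiency` is the usual textbook definition rather than anything taken from the measurements.

```python
def strong_scaling_runs(n_planes, n_tor_start, n_pol_start=12):
    """Enumerate (toroidal, poloidal, total processors, planes per processor)
    for a fixed-size problem: toroidal count doubles, poloidal count grows by 6.
    Assumption: stop when there would be less than one plane per processor."""
    runs = []
    n_tor, n_pol = n_tor_start, n_pol_start
    while n_tor <= n_planes:
        runs.append((n_tor, n_pol, n_tor * n_pol, n_planes // n_tor))
        n_tor *= 2
        n_pol += 6
    return runs

def strong_efficiency(t_ref, p_ref, t, p):
    """Strong-scaling efficiency relative to a reference run:
    1.0 means time fell in exact proportion to the added processors."""
    return (t_ref * p_ref) / (t * p)

# Total processor counts under the stated assumption.
totals_32 = [r[2] for r in strong_scaling_runs(32, 8)]
totals_128 = [r[2] for r in strong_scaling_runs(128, 32)]
totals_208 = [r[2] for r in strong_scaling_runs(208, 52)]
```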
Sample Application: CDX Sawteeth
High temperature core
X-point (site of reconnection)
Low temperature m=1, n=1 island
q=1 surface (inversion radius)
• Small laboratory tokamak
• Oscillations in X-ray signal during discharge consistent with sudden outward shift of hot plasma
• Objective: predict effect and conditions for onset of instability.
Initialization
• Equilibrium taken from a transport-timescale code. β ~ 3.3%
• q0 ≈ 0.922
• Sawtooth instability is predicted when q0 is sufficiently below 1.
toroidal current density
Linear n=1 eigenmode: γτA ≈ 6 × 10⁻⁴
Perturbed temperature Perturbed current density Velocity stream function
Nonlinear Results
24 planes, 79 radial grids
24 toroidal x 6 poloidal processors
221,856 vertices on 144 Jaguar CPUs (VN mode)
Kinetic energy, by toroidal mode number
Poincaré Sections
13,920 CPU hours (96:40 wallclock hours)
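The quoted cost is consistent with the processor layout: 24 toroidal x 6 poloidal processors gives 144 CPUs, and 144 CPUs for 96 hours 40 minutes gives 13,920 CPU hours. A short check, with the hypothetical helper `cpu_hours`:

```python
def cpu_hours(n_cpus, wall_hours, wall_minutes=0):
    """CPU hours charged for a run, computed from whole minutes
    to avoid rounding a fractional hour."""
    total_minutes = wall_hours * 60 + wall_minutes
    return n_cpus * total_minutes / 60

n_cpus = 24 * 6                      # toroidal x poloidal processors
total = cpu_hours(n_cpus, 96, 40)    # 96:40 wallclock
```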
Conclusions
• The XT3 has been a productive environment for tokamak simulations with M3D.
• Improved scaling can be expected with the faster interconnects on the XT4.
• Scaling to thousands of processors has been demonstrated, but may be impractical for real applications while the code remains explicit.