Massively Parallel Magnetohydrodynamics on the Cray XT3
Joshua Breslau and Jin Chen
Princeton Plasma Physics Laboratory
Cray XT3 Technical Workshop
Nashville, TN
February 28, 2007
Motivation: Modeling Magnetic Confinement Fusion Experiments
NSTX (Spherical Torus)
NCSX (Compact Stellarator)
ITER (Advanced Tokamak)
Characteristics of Magnetic Confinement Fusion Experiments
• Multispecies hydrogen plasma at ~10^8 °C (Te and Ti may differ): low collisionality, high electrical conductivity.
• Toroidal topology with complex boundary geometry.
• Strong toroidal magnetic field giving highly anisotropic transport, low β.
• Rotational transform gives nested flux surfaces.
• Spatial scales range from electron skin depth, ~10^-4 m, to major radius, ~6 m.
• Time scales range from Alfvén wave transit time (~μs) to discharge time, ~100 s.
• Susceptible to microinstabilities leading to loss of energy confinement, and to macroinstabilities leading to large-scale rearrangement of plasma and possible disruption.
The M3D Code
• Physics models include ideal and resistive MHD; two-fluid; or hybrid with kinetic ions.
• Field and velocity variables are expressed in terms of potentials, keeping B divergence-free and separating compressible and incompressible components of flow.
• Uses linear, 2nd, or 3rd-order finite elements in-plane on an unstructured triangular mesh.
• Uses 4th-order finite differences between planes or pseudo-spectral derivatives.
• Partially implicit treatment allows efficient advance over dissipative time scales but requires small time steps relative to the Alfvén time τ_A.
• Linear and nonlinear modes of operation are available.
• The PETSc library is used for parallelization and linear solves with Krylov methods.
M3D (multi-level 3D) is a 3D nonlinear extended MHD code in toroidal geometry maintained by a multi-institutional collaboration, designed for the study of macroscopic instabilities in tokamaks and stellarators.
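As a rough illustration of the 4th-order finite differences between planes mentioned above, here is a minimal Python sketch of a centered 4th-order derivative in the toroidal angle on equally spaced, periodic planes. It is not taken from M3D; the function names `ddphi_fourth_order` and `max_error` are hypothetical, and the standard five-point stencil is assumed.

```python
import math

def ddphi_fourth_order(values):
    """Centered 4th-order derivative of samples on equally spaced planes
    covering one full period 0 <= phi < 2*pi (indices wrap around)."""
    n = len(values)
    h = 2.0 * math.pi / n  # angular spacing between neighboring planes
    deriv = []
    for k in range(n):
        fm2 = values[(k - 2) % n]
        fm1 = values[(k - 1) % n]
        fp1 = values[(k + 1) % n]
        fp2 = values[(k + 2) % n]
        # Standard five-point stencil: (f[-2] - 8 f[-1] + 8 f[+1] - f[+2]) / (12 h)
        deriv.append((fm2 - 8.0 * fm1 + 8.0 * fp1 - fp2) / (12.0 * h))
    return deriv

def max_error(num_planes, mode=1):
    """Largest error when differentiating sin(mode * phi) on num_planes planes."""
    phis = [2.0 * math.pi * k / num_planes for k in range(num_planes)]
    f = [math.sin(mode * p) for p in phis]
    exact = [mode * math.cos(mode * p) for p in phis]
    approx = ddphi_fourth_order(f)
    return max(abs(a - e) for a, e in zip(approx, exact))

if __name__ == "__main__":
    # Doubling the number of planes should cut the error by roughly 2**4.
    print(max_error(32), max_error(64))
```

Doubling the number of planes reduces the error by about a factor of 16, which is the signature of a 4th-order scheme.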
M3D Mesh
Single plane: radial zones align with flux surfaces.
Full torus: 2n+1 planes needed to resolve toroidal mode number n.
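The 2n+1 rule can be written out as a small Python helper. This is only a sketch of the arithmetic stated on the slide; the function names are hypothetical.

```python
def min_planes_for_mode(n):
    """Minimum number of toroidal planes needed to resolve mode number n (2n+1 rule)."""
    return 2 * n + 1

def max_resolved_mode(num_planes):
    """Largest toroidal mode number n satisfying 2n+1 <= num_planes."""
    return (num_planes - 1) // 2

if __name__ == "__main__":
    print(min_planes_for_mode(1))
    print(max_resolved_mode(24))
```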
Domain Decomposition
Poloidal (cross-section view)
Toroidal (overhead view)
or
D = 1, F = 5
D = 3, F = 3
B = 16
Linear solves independent on each processor
Linear solves parallel over processors
Porting M3D to the XT3
• Previously run on Cray T3E, IBM SP, SGI Origin 2000.
• Few modifications to source code were necessary for the new platform.
• Installing the HYPRE preconditioner in the PETSc library enabled faster inversion of the symmetric form of the linear operators using CG.
• Reducing interprocessor communication was key to improving scaling.
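To make the CG solve mentioned above concrete, here is a minimal pure-Python sketch of preconditioned conjugate gradients on a small symmetric positive-definite tridiagonal system. It does not use PETSc or HYPRE. A simple Jacobi (diagonal) preconditioner stands in for the HYPRE preconditioner purely for illustration, and the test matrix and all function names are assumptions.

```python
def matvec(diag, off, x):
    """y = A x for a symmetric tridiagonal A with main diagonal `diag`
    and a constant off-diagonal entry `off`."""
    n = len(x)
    y = [diag[i] * x[i] for i in range(n)]
    for i in range(n - 1):
        y[i] += off * x[i + 1]
        y[i + 1] += off * x[i]
    return y

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def pcg(diag, off, b, tol=1e-10, max_iter=500):
    """Preconditioned CG with a Jacobi preconditioner M = diag(A).
    Returns the approximate solution and the number of iterations used."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                                # residual for the zero initial guess
    z = [r[i] / diag[i] for i in range(n)]     # apply M^-1
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5
    for it in range(1, max_iter + 1):
        ap = matvec(diag, off, p)
        alpha = rz / dot(p, ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * ap[i] for i in range(n)]
        if dot(r, r) ** 0.5 <= tol * b_norm:   # relative residual test
            return x, it
        z = [r[i] / diag[i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [z[i] + beta * p[i] for i in range(n)]
        rz = rz_new
    return x, max_iter

if __name__ == "__main__":
    n = 50
    diag = [2.0 + 0.1 * i for i in range(n)]   # assumed SPD test matrix
    x_true = [1.0] * n
    b = matvec(diag, -1.0, x_true)
    x, iterations = pcg(diag, -1.0, b)
    print(iterations, max(abs(xi - 1.0) for xi in x))
```

CG requires a symmetric positive-definite operator, which is why the symmetric form of the linear operators matters here.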
1D Weak Scaling, Single Core (SN)
560 radial zones (626,081 vertices/plane), 16 poloidal processors, 4 planes/processor, 4-320 toroidal processors
poloidal domain decomposition
1D Weak Scaling, Dual Core (VN)
398 radial zones (316,013 vertices/plane), 16 poloidal processors, 4 planes/processor, 4-640 toroidal processors
3D Weak Scaling, Single Core
Smallest run has 64 planes, 160,000 vertices/plane on 16 toroidal x 4 poloidal processors.
Successive runs increase number of poloidal processors by 4, number of toroidal processors by 12, while maintaining 4 planes, 40,000 vertices/processor.
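The per-processor workload quoted for the smallest 3D weak-scaling run follows from simple arithmetic on the decomposition, sketched below in Python. The function name and dictionary keys are hypothetical. The sketch assumes planes divide evenly among toroidal processors and that in-plane vertices are shared equally among poloidal processors.

```python
def decomposition(planes, vertices_per_plane, toroidal_procs, poloidal_procs):
    """Per-processor workload for a toroidal x poloidal domain decomposition."""
    if planes % toroidal_procs != 0:
        raise ValueError("planes must divide evenly among toroidal processors")
    return {
        "processors": toroidal_procs * poloidal_procs,
        "planes_per_processor": planes // toroidal_procs,
        # In-plane vertices owned by each processor (may be fractional on average).
        "vertices_per_plane_per_processor": vertices_per_plane / poloidal_procs,
    }

if __name__ == "__main__":
    # Smallest 3D weak-scaling run: 64 planes, 160,000 vertices/plane, 16 x 4 processors.
    print(decomposition(64, 160_000, 16, 4))
```

For the smallest run this gives 64 processors, 4 planes per processor, and 40,000 in-plane vertices per processor, matching the figures on the slide.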
3D Strong Scaling, Single Core
All runs have 32 planes, 474,151 vertices/plane. Smallest has 8 toroidal x 12 poloidal processors.
Successive runs increase number of poloidal processors by 6, double number of toroidal processors.
3D Strong Scaling, Single Core
All runs have 128 planes, 474,151 vertices/plane. Smallest has 32 toroidal x 12 poloidal processors.
Successive runs increase number of poloidal processors by 6, double number of toroidal processors.
3D Strong Scaling, Single Core
All runs have 208 planes, 474,151 vertices/plane. Smallest has 52 toroidal x 12 poloidal processors.
Successive runs increase number of poloidal processors by 6, double number of toroidal processors.
Sample Application: CDX Sawteeth
High temperature core
X-point (site of reconnection)
Low temperature m=1, n=1 island
q=1 surface (inversion radius)
• Small laboratory tokamak
• Oscillations in X-ray signal during discharge consistent with sudden outward shift of hot plasma
• Objective: predict effect and conditions for onset of instability.
Initialization
• Equilibrium taken from a transport-timescale code.
• β ~ 3.3%
• q0 ≈ 0.922
• Sawtooth instability is predicted when q0 is sufficiently below 1.
toroidal current density
Linear n=1 eigenmode: γτ_A ≈ 6 × 10^-4
Panels: perturbed temperature; perturbed current density; velocity stream function
Nonlinear Results
24 planes, 79 radial grids
24 toroidal x 6 poloidal processors
221,856 vertices on 144 Jaguar CPUs (VN mode)
Kinetic energy, by toroidal mode number
Poincaré Sections
13,920 CPU hours (96:40 wallclock hours)
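The CPU-hour total is consistent with the 144 CPUs and the 96:40 wallclock time quoted for this run, as the short Python check below shows. The helper name is hypothetical.

```python
def cpu_hours(num_cpus, wallclock_hours, wallclock_minutes=0):
    """Total CPU hours: number of CPUs times wallclock time."""
    total_minutes = 60 * wallclock_hours + wallclock_minutes
    return num_cpus * total_minutes / 60

if __name__ == "__main__":
    # 144 CPUs for 96 hours 40 minutes of wallclock time.
    print(cpu_hours(144, 96, 40))
```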
Conclusions
• The XT3 has been a productive environment for tokamak simulations with M3D.
• Improved scaling can be expected with the faster interconnects on the XT4.
• Scaling to thousands of processors has been demonstrated, but may be impractical for real applications while the code remains explicit.