
Gyrokinetic Particle-in-Cell Simulations of Plasma Microturbulence on Advanced Computing Platforms

Stephane Ethier

Princeton Plasma Physics Laboratory

SCIDAC 2005 Conference

San Francisco, CA

Work performed under the DOE SCIDAC Center for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

The Ultimate Burning Plasma

Fusion Powers the Sun and Stars

Can we harness Fusion power on Earth?

The Case for Fusion Energy

• Worldwide demand for energy continues to increase
  – Due to population increases and economic development
• Worldwide oil and gas production is near or past peak
  – Need for alternative sources: coal, fission, fusion
• Increasing evidence that the release of greenhouse gases is causing global climate change
  – This makes nuclear (fission or fusion) preferable to fossil fuels (coal)
• Fusion has clear advantages over fission
  – Inherently safe (no China syndrome)
  – No weapons proliferation considerations (security)
  – Greatly reduced waste disposal problems (no Yucca Mountain)
• Can produce electricity and hydrogen
• Abundant fuel, available to all nations
  – Deuterium and lithium supplies will last thousands of years

Fusion Reaction

• The two ions need to break the Coulomb barrier to get close enough to fuse together
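For reference (standard reaction values, not taken from the original slide), the deuterium-tritium reaction targeted by ITER and most magnetic fusion concepts is

$$\mathrm{D} + \mathrm{T} \;\longrightarrow\; {}^{4}\mathrm{He}\,(3.5\ \mathrm{MeV}) + \mathrm{n}\,(14.1\ \mathrm{MeV})$$

and the roughly 17.6 MeV released per reaction is what makes overcoming the Coulomb barrier worthwhile.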

Putting the Sun in a Bottle

The Most Successful Magnetic Confinement Configuration is the Tokamak

[Diagram: tokamak configuration showing the magnets, the plasma, and the magnetic field]

Fusion Experiments in the World

J. B. Lister

We know we can make Fusion Energy – the challenge now is to make it practical!

Fusion: DOE OFES #1 Item on List of Priorities

November 10, 2003: Energy Secretary Spencer Abraham announces the Department of Energy 20-Year Science Facility Plan, setting priorities for 28 new, major science research facilities.

#1 on the list of priorities is ITER, an unprecedented international collaboration on the next major step for the development of fusion

#2 is UltraScale Scientific Computing Capability

Plasma Science Challenges

• Macroscopic Stability

– What limits the pressure in plasmas?

• Wave-particle Interactions

– How do particles and plasma waves interact?

• Microturbulence & Transport

– What causes plasma transport?

• Plasma-material Interactions

– How can high-temperature plasma and material surfaces co-exist?

Challenge to Theory & Simulations

• Huge range of spatial and temporal scales

• Overlap in scales often means a strong (simplifying) ordering is not possible

[Chart: Spatial scales (m), spanning roughly 10⁻⁶ to 10² m – electron gyroradius, Debye length, ion gyroradius, tearing length, skin depth, atomic mean free path, electron-ion mean free path, system size]

[Chart: Temporal scales (s), spanning roughly 10⁻¹⁰ to 10⁵ s – electron gyroperiod, inverse electron plasma frequency, ion gyroperiod, inverse ion plasma frequency, electron collision time, ion collision time, confinement time, current diffusion time, pulse length]

Major Fusion Codes

Importance of Turbulence in Fusion Plasmas

• Turbulence is believed to be the mechanism for cross-field transport in magnetically confined plasmas:
  – Size and cost of a fusion reactor are determined by the particle and energy confinement time and fusion self-heating.
• Plasma turbulence is a complex nonlinear phenomenon:
  – Large time and spatial scale separations, similar to fluid turbulence.
  – Self-consistent electromagnetic fields: many-body problem.
  – Strong nonlinear wave-particle interactions: kinetic effects.
  – Plasma spatial inhomogeneities, coupled with complex confining magnetic fields, act as drivers for microinstabilities and the ensuing plasma turbulence.

The Fundamental Equations for Plasma Physics: Boltzmann+Maxwell

• Complete but impractical
• Cannot solve on all time and length scales
• Can eliminate dimensions by integrating over velocity space (assuming a Maxwellian)

The kinetic equation for the distribution function f(x, u, t) – 6D phase space + time:

$$\frac{\partial f}{\partial t} + \mathbf{u}\cdot\nabla f + \frac{q}{m}\left(\mathbf{E} + \mathbf{u}\times\mathbf{B}\right)\cdot\frac{\partial f}{\partial \mathbf{u}} = C(f)$$

coupled self-consistently to Maxwell's equations:

$$\nabla\times\mathbf{B} = \mu_0\mathbf{J} + \frac{1}{c^2}\frac{\partial \mathbf{E}}{\partial t}, \qquad \nabla\times\mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \qquad \nabla\cdot\mathbf{E} = \frac{\rho}{\varepsilon_0}, \qquad \nabla\cdot\mathbf{B} = 0$$

with the charge and current densities $\rho = \sum_s q_s \int f_s\, d\mathbf{u}$ and $\mathbf{J} = \sum_s q_s \int \mathbf{u}\, f_s\, d\mathbf{u}$.

Gyrokinetic Approximation for Low Frequency Modes

• Gyrokinetic ordering:

$$\frac{\omega}{\Omega_i} \sim \frac{k_\parallel}{k_\perp} \sim \frac{\rho_i}{L} \sim \frac{e\,\delta\phi}{T_e} \ll 1, \qquad k_\perp \rho_i \sim 1$$

• Gyro-motion: guiding center drifts + charged ring
  – Parallel to B: mirror force, magnetically trapped
  – Perpendicular: E x B, polarization, gradient, and curvature drifts
• Gyrophase-averaged 5D gyrokinetic equation
  – Suppress plasma oscillation and gyro-motion
  – Larger time step and grid size, smaller number of particles
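For orientation, a schematic form of the collisionless 5D gyrokinetic equation referred to above, written for the guiding-center distribution f(R, v∥, μ; t) (a standard textbook form, not copied from the slide):

$$\frac{\partial f}{\partial t} + \left( v_\parallel \hat{\mathbf{b}} + \mathbf{v}_E + \mathbf{v}_d \right)\cdot\nabla f + \dot{v}_\parallel\,\frac{\partial f}{\partial v_\parallel} = 0$$

Here v_E is the E x B drift, v_d collects the gradient and curvature drifts, and the magnetic moment μ enters only as a parameter (an adiabatic invariant), which is why only five phase-space dimensions remain.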

The Gyrokinetic Toroidal Code (GTC)

• Description:
  – Particle-in-cell (PIC) code
  – Developed by Zhihong Lin (now at UC Irvine)
  – Nonlinear gyrokinetic simulation of microturbulence [Lee, 1983]
  – Particle-electric field interaction treated self-consistently
  – Uses magnetic field-line-following coordinates
  – Guiding center Hamiltonian [White and Chance, 1984]
  – Non-spectral Poisson solver [Lin and Lee, 1995]
  – Low numerical noise algorithm (δf method)
  – Full torus (global) simulation

The Particle-in-cell Method

• Particles sample the distribution function
• Interactions happen via the grid, on which the potential is calculated (from the deposited charges)

The PIC Steps (a minimal sketch of one cycle follows this list)
• “SCATTER”, or deposit, charges on the grid (nearest neighbors)
• Solve the Poisson equation
• “GATHER” forces on each particle from the potential
• Move particles (PUSH)
• Repeat…
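To make the cycle concrete, here is a minimal 1D electrostatic PIC loop in Python/numpy. It is an illustration only: GTC itself is a parallel Fortran gyrokinetic δf code, and all names and simplifications below are ours.

```python
import numpy as np

nx, n_part = 64, 10000
dx, dt, q_m = 1.0, 0.1, -1.0
L = nx * dx
x = np.random.uniform(0, L, n_part)      # particle positions
v = np.random.normal(0.0, 1.0, n_part)   # particle velocities

for step in range(100):
    # SCATTER: deposit charge on the two nearest grid points (linear weighting)
    cell = (x / dx).astype(int) % nx
    frac = (x / dx) % 1.0
    rho = np.zeros(nx)
    np.add.at(rho, cell, 1.0 - frac)
    np.add.at(rho, (cell + 1) % nx, frac)

    # SOLVE: periodic Poisson equation d2(phi)/dx2 = -rho, via FFT
    k = 2.0 * np.pi * np.fft.fftfreq(nx, d=dx)
    k[0] = 1.0                            # avoid division by zero for the mean mode
    phi = np.real(np.fft.ifft(np.fft.fft(rho) / k**2))
    E = -np.gradient(phi, dx)

    # GATHER: interpolate the field back to the particle positions
    E_p = E[cell] * (1.0 - frac) + E[(cell + 1) % nx] * frac

    # PUSH: advance velocities and positions, with periodic wrap in x
    v += q_m * E_p * dt
    x = (x + v * dt) % L
```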

Charge Deposition: 4-Point Average Method

[Figure: classic PIC point deposition vs. the 4-point average gyrokinetic deposition (W. W. Lee)]

Charge Deposition Step (SCATTER operation)

[Figure: the SCATTER operation in GTC]
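In gyro-averaged deposition each marker represents a charged ring rather than a point, so its charge is spread over (typically) four points on the gyro-ring, each then deposited onto the grid with the usual area weighting. A small Python/numpy sketch of the idea, with hypothetical names and a simple 2D Cartesian grid standing in for GTC's field-aligned mesh:

```python
import numpy as np

def deposit_gyro_ring(rho, xg, yg, r_larmor, weight, dx, dy):
    """Deposit one marker's charge at 4 points on its gyro-ring (2D periodic grid)."""
    nx, ny = rho.shape
    for angle in (0.0, 0.5 * np.pi, np.pi, 1.5 * np.pi):
        # One of the four points on the ring around the guiding center (xg, yg)
        xp = xg + r_larmor * np.cos(angle)
        yp = yg + r_larmor * np.sin(angle)
        # Bilinear (area-weighting) deposition of a quarter of the marker weight
        i, j = int(xp // dx) % nx, int(yp // dy) % ny
        fx, fy = (xp / dx) % 1.0, (yp / dy) % 1.0
        for di, wx in ((0, 1.0 - fx), (1, fx)):
            for dj, wy in ((0, 1.0 - fy), (1, fy)):
                rho[(i + di) % nx, (j + dj) % ny] += 0.25 * weight * wx * wy

# Example: one marker with a gyroradius of 1.5 grid cells on a 32 x 32 grid
rho = np.zeros((32, 32))
deposit_gyro_ring(rho, xg=10.2, yg=7.7, r_larmor=1.5, weight=1.0, dx=1.0, dy=1.0)
```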

Quasi-2D Structure of Electrostatic Potential

Global Field-aligned Mesh (ψ, α, ζ), with α = θ − ζ/q

Saves a factor of about 100 in CPU time

Domain Decomposition

• Domain decomposition:
  – each MPI process holds a toroidal section
  – each particle is assigned to a processor according to its position
• Initial memory allocation is done locally on each processor to maximize efficiency
• Communication between domains is done with MPI calls (runs on most parallel computers); a sketch of the particle-shift pattern follows this list
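A minimal illustration of that particle-shift communication, written with mpi4py for readability (GTC's own implementation is Fortran + MPI; the routine and variable names here are ours, and the periodic wrap-around at the toroidal branch cut is not handled):

```python
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
left, right = (rank - 1) % size, (rank + 1) % size   # neighboring toroidal domains

def shift_particles(zeta, zeta_min, zeta_max):
    """Ship particles that left this domain's toroidal section to its neighbors.

    Assumes a particle crosses at most one domain boundary per time step.
    """
    go_right = zeta >= zeta_max
    go_left = zeta < zeta_min
    keep = ~(go_right | go_left)

    # Pairwise exchange with both neighbors; sendrecv avoids deadlock.
    from_left = comm.sendrecv(zeta[go_right], dest=right, source=left)
    from_right = comm.sendrecv(zeta[go_left], dest=left, source=right)
    return np.concatenate([zeta[keep], from_left, from_right])
```

In the real code the full set of particle attributes (positions, velocities, weights) moves together; only the toroidal coordinate is shown here.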

2nd Level of Parallelism: Loop-level with OpenMP

[Diagram: between MPI_Init and MPI_Finalize, each MPI process repeatedly starts OpenMP threads for a parallel loop and merges them afterwards]

New MPI-based particle decomposition

• Each domain in the 1D (and soon 2D) domain decomposition can have more than 1 processor associated with it.

• Each processor holds a fraction of the total number of particles in that domain.

• Scales well when using a large number of particles
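Since several processes now deposit charge into the same toroidal section, the partial grid charges have to be summed across the processes sharing that domain. A sketch of the pattern with mpi4py (our illustration with assumed names, not GTC source):

```python
import numpy as np
from mpi4py import MPI

npe_per_domain = 2                               # assumed processes per toroidal domain
world = MPI.COMM_WORLD
domain_id = world.Get_rank() // npe_per_domain
domain_comm = world.Split(color=domain_id, key=world.Get_rank())

ngrid = 100_000
rho_local = np.zeros(ngrid)                      # charge deposited by this process's particles
# ... each process deposits only its share of the domain's particles into rho_local ...

# Sum the partial charge arrays so every process in the domain sees the full charge
rho = np.empty_like(rho_local)
domain_comm.Allreduce(rho_local, rho, op=MPI.SUM)
```

The cost of this reduction is amortized over the large per-process particle work, which is why the scheme scales well when the particle count is large.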

[Diagram: four processors (0 to 3) sharing the particles of the decomposed toroidal domains]

Main Computing Platform: NERSC's IBM SP Seaborg

• 416 x 16-processor SMP nodes (with 64G, 32G, or 16G memory)

• 380 compute nodes (6,080 processors)

• 375 MHz POWER 3+ processors with 1.5 GFlops/sec/proc peak

CRAY X1 at ORNL

• 512 multi-streaming vector processors (MSPs)
• 12.8 Gflops/sec peak performance per MSP
• Currently being upgraded to X1E (1,024 MSPs at 18 GF/MSP)

Earth Simulator

• Collaboration with Dr. Leonid Oliker of LBL/NERSC

• Only US team doing performance study on ES

• Many thanks to Dr. Sato

• 5,120 vector processors
• 8 Gflops/sec per processor
• 40 Tflops/sec peak

Optimization Challenges

• “Gather-Scatter” operation in PIC codes
  – The particles are randomly distributed in the simulation volume (grid).
  – Particle charge deposition on the grid leads to indirect addressing in memory.
  – Not cache friendly.
  – Needs to be tuned differently depending on the architecture.

[Diagram: scatter operation from the particle array into the grid array through indirect addressing]

Vectorization Work

• Main challenge: charge deposition (scatter)
  – Need to avoid memory dependencies
  – Solved with the work-vector method
  – Each element in the processor register has a private copy of the local grid (see the sketch after this list)
• ES: minimize memory bank conflicts
  – Use the “duplicate” directive (thanks to David Parks)
• X1: streaming + vector
  – Straightforward since GTC already had loop-level parallelism
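The following numpy analogue illustrates the idea behind the work-vector method (our illustration; on the Earth Simulator and X1 this happens per vector-register element rather than through numpy arrays): each “lane” accumulates into its own private copy of the grid so that no two lanes write the same memory location, and the copies are reduced at the end.

```python
import numpy as np

def scatter_work_vector(cell, weight, ngrid, nvec=256):
    """Deposit particle weights into nvec private grid copies, then reduce."""
    rho_private = np.zeros((nvec, ngrid))
    lane = np.arange(cell.size) % nvec     # which private copy each particle writes to
    np.add.at(rho_private, (lane, cell), weight)
    return rho_private.sum(axis=0)         # merge the private copies into one grid

# Example: 1 million particles with random cells (indirect addressing) on 1,000 grid points
rng = np.random.default_rng(0)
cell = rng.integers(0, 1000, size=1_000_000)
rho = scatter_work_vector(cell, np.ones(cell.size), ngrid=1000)
```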

GTC Performance

  # of      # particles   IBM SP3 (Seaborg)   Itanium 2 + Quadrics     CRAY X1          Earth Simulator
  procs     (billion)     Gflops    %Pk       Gflops    %Pk           Gflops    %Pk     Gflops    %Pk
     64       0.207          9.0    9.3         25.0    6.9             82.6   10.1      102.4   20.0
    128       0.414         17.9    9.3         49.9    6.9            156.2    9.6      199.7   19.5
    256       0.828         35.8    9.3         97.3    6.9            299.5    9.1      396.8   19.4
    512       1.657         71.7    9.4        194.6    6.8               -      -       783.4   19.1
   1024       3.314        143.4    8.7        378.9    6.7               -      -     1,925    23.5
   2048       6.627        266.2    8.4        757.8    6.7               -      -     3,727    22.7

3.7 Teraflops achieved on the Earth Simulator with 2,048 processors using 6.6 billion particles!!

Performance Results

[Plot: compute power, in millions of particles moved one step in one second (log scale, 1 to 10,000), vs. number of processors (64 to 2,048), for the Earth Simulator, NEC SX8, CRAY X1, Thunder, Seaborg, and Blue Gene/L]

Device-size Scans: ITER-size Simulations

• ITER-size simulation using 1 billion particles (GC), 125 M spatial grid points, and 7000 time steps --- leading to important (previously inaccessible) new results

• Made possible by mixed-model MPI-OpenMP on Seaborg

[Plot: ion thermal diffusivity χ_i/χ_GB (0.5 to 1.5) vs. device size a/ρ_i (0 to 1000), comparing the GTC simulation with an empirical formula]

Continuous Improvements in GTC bring new Computational Challenges

• Recent full kinetic electron simulations of electron temperature gradient instability required 8 billion particles!

• Electron-wave interaction has sharp resonances that require higher phase-space resolution

• Fully electromagnetic version requires new solver (multi-grid)

Look for many GPS-related presentations during this conference

• Scientific accomplishments with enhanced versions of GTC (Z. Lin et al., presented by G. Rewoldt)

• Shaped plasma device simulations with general geometry GTC (W. Wang)

• New electromagnetic solver for the kinetic electron capability (M. Adams)

• Visualization techniques (K.-L. Ma)
• Data management and workflows (S. Klasky)

Conclusions

• Simulating fusion experiments is very challenging
• It involves multiscale physics
• Gyrokinetic particle-in-cell simulation is a very powerful method to study plasma micro-turbulence
• The GTC code can efficiently use the available computing power
• New and exciting discoveries are continuously being made with GTC through advanced computing
