Beam Dynamic Calculation by NVIDIA® CUDA Technology

E. Perepelkin, V. Smirnov, and S. Vorozhtsov
JINR, Dubna

7 July 2009

Introduction

• Cyclotron beam dynamics problems [1]:
  • Losses on geometry
  • Space charge (SC) effects
  • Optimization of the central region [2]
• CBDA [3] code calculations:
  • OpenMP (on CPU)
  • CUDA (on GPU)

[1] A. Goto, "Beam injection and extraction of RIKEN AVF cyclotron", CNS-RIKEN Workshop on Upgrade of AVF Cyclotron, CNS Wako Campus, 3-4 March 2008.
[2] E. Perepelkin, A. Vorozhtsov, S. Vorozhtsov, P. Beličev, V. Jocić, N. Nešković, et al., "Spiral inflectors and electrodes in the central region of the VINCY cyclotron", Cyclotrons and Their Applications 2007, Eighteenth International Conference.
[3] E. Perepelkin, S. Vorozhtsov, "CBDA - Cyclotron Beam Dynamics Analysis code", RuPAC 2008, Zvenigorod, Russia.

Computer model of the cyclotron

[Figure: computer model of the cyclotron; labeled parts: injection line, ESD, dee, magnet sectors]

Regions of the field maps

• Inflector: electric field
• Axial channel: magnetic field
• G1: magnetic field

Axial injection line

Cyclotron

Central region optimization

[Figure panels: φRF = 13°, φRF = 15°, φRF = 28°, φRF = 10°]

Particle losses

Bunch acceleration

Optimization process

[Figure labels: S0, S1, S2, S3, S4]

Acceleration field map

Very time-consuming problem

• At least 5 different variants
• Many ion species are accelerated
• Very complicated structure
• Multi-macro-particle simulations for SC-dominated beams

One run requires several days of computer time

Open Multi-Processing (OpenMP)

Spiral inflector

Beam phase space projections at the inflector entrance

Beam phase space projections at the inflector exit

• Blue points: PIC by FFT (grid: 25 x 25 x 25)
• Red points: PP

Calculation time

10,000 particles, no geometry losses

Method              Without OpenMP   With OpenMP   Computer platform
PP                  4 h 53 min       2 h 34 min    AMD Turion 64 X2, 1.60 GHz
PP                  4 h 38 min       1 h 25 min    Intel Core Quad, 2.4 GHz
PIC, 25 x 25 x 25   ~11 min          ~6 min        AMD Turion 64 X2, 1.60 GHz
PIC, 25 x 25 x 25   7 min            ~2 min        Intel Core Quad, 2.4 GHz

Compute Unified Device Architecture (CUDA)

GeForce 8800 GTX (price ~ $300)

GPU structure

128 SP (streaming processors)

Kernel functions

• __global__ void Track ( field maps, particle coordinates )
  • Calculates particle motion in the electromagnetic field maps
• __global__ void Losses ( geometry, particle coordinates )
  • Calculates particle losses on the structure
• __global__ void Rho ( particle coordinates )
  • Produces the charge density for SC effects

Kernel functions

• __global__ FFT ( charge density )
  • FFT method (analysis / synthesis)
• __global__ PoissonSolver ( Fourier coefficients )
  • Finds the solution of the Poisson equation
• __global__ E_SC ( electric potential )
  • Calculates the electric field by E = -grad( U )

__global__ void Track ( )

• The function has many parameters. Use variables of type __constant__:
  • __device__ __constant__ float d_float[200];
  • __device__ __constant__ int d_int[80];
• The particle number corresponds to the global thread index:
  • int n = threadIdx.x+blockIdx.x*blockDim.x;
• The number of “if, goto, for” statements should be reduced.
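The one-thread-per-particle mapping can be sketched on the host in plain C++. This is a minimal illustration and not the CBDA code: the helper names `globalIndex`, `blocksFor`, and `ownsParticle` are hypothetical, and the block size used in the note below is an assumption, since the slides do not state the launch configuration.

```cpp
#include <cassert>

// Global index of a thread in a 1D launch, as on the slide:
// n = threadIdx.x + blockIdx.x * blockDim.x
int globalIndex(int threadIdxX, int blockIdxX, int blockDimX) {
    return threadIdxX + blockIdxX * blockDimX;
}

// Number of blocks needed so that every particle gets a thread (rounds up).
int blocksFor(int nParticles, int blockDimX) {
    assert(nParticles >= 0 && blockDimX > 0);
    return (nParticles + blockDimX - 1) / blockDimX;
}

// Threads of the last block may fall past the end of the particle array;
// a kernel would typically return early for them.
bool ownsParticle(int n, int nParticles) {
    return n < nParticles;
}
```

With an assumed block size of 256, 100,000 particles would need 391 blocks, and the last 96 threads of the final block would own no particle.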

__global__ void Losses ( )

• The geometry structure consists of triangles. The triangle coordinates are stored in __shared__ variables, which drastically increased performance.
  • int tid = threadIdx.x; is used for parallel copying of the data to shared memory
• The particle number corresponds to
  • int n = threadIdx.x+blockIdx.x*blockDim.x;
• Check each particle against the triangles for a match
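The parallel copy to shared memory might look like the strided pattern below, emulated on the host in plain C++. This is a sketch under assumptions: the slides only say that `tid` is used for the copy, so the stride-by-block-size loop and the names `copyStrided` and `loadTile` are illustrative, and the particle/triangle test itself is not shown because the slides do not describe it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Emulates what one thread of a block would do: copy every blockDim-th
// value of the triangle data, starting at its own index tid.
void copyStrided(const std::vector<float>& global, std::vector<float>& shared,
                 std::size_t tid, std::size_t blockDim) {
    for (std::size_t i = tid; i < global.size(); i += blockDim) {
        shared[i] = global[i];
    }
}

// Emulates the whole block by running its threads one after another.
// On a GPU the threads run concurrently, and a __syncthreads() barrier
// would follow before any thread reads the shared data.
std::vector<float> loadTile(const std::vector<float>& global, std::size_t blockDim) {
    assert(blockDim > 0);
    std::vector<float> shared(global.size(), 0.0f);
    for (std::size_t tid = 0; tid < blockDim; ++tid) {
        copyStrided(global, shared, tid, blockDim);
    }
    return shared;
}
```

Each element is written by exactly one thread, so the copy needs no synchronization until the threads start reading the shared data.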

__global__ void Rho

• Calculate the charge contribution to the mesh nodes from the particle with number int n = threadIdx.x+blockIdx.x*blockDim.x;

[Figure: a mesh node and the cells that share it; labels: Node, Cell 1, Cell 2, Cell 3, Cell 5, Cell 6, Cell 7, Cell 8]
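A host-side C++ sketch of the charge deposition follows. The slides do not name the weighting scheme, so trilinear (cloud-in-cell) weights are assumed here as a common PIC choice. The `Mesh` type, the uniform step `h`, and the function `deposit` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Mesh {
    int NX, NY, NZ;            // cells per axis; nodes per axis are N + 1
    double h;                  // uniform cell size (hypothetical)
    std::vector<double> rho;   // charge collected at the nodes

    Mesh(int nx, int ny, int nz, double step)
        : NX(nx), NY(ny), NZ(nz), h(step),
          rho(static_cast<std::size_t>(nx + 1) * (ny + 1) * (nz + 1), 0.0) {}

    // Linear node index, same layout as ind(i,j,k) on the PoissonSolver slide.
    std::size_t ind(int i, int j, int k) const {
        return static_cast<std::size_t>(i + j * (NX + 1) + k * (NX + 1) * (NY + 1));
    }
};

// Splits the charge q of one particle among the 8 nodes of the cell that
// contains it, using trilinear (cloud-in-cell) weights. Dividing by the
// cell volume afterwards would turn the node charge into a density.
void deposit(Mesh& m, double x, double y, double z, double q) {
    int i = static_cast<int>(std::floor(x / m.h));
    int j = static_cast<int>(std::floor(y / m.h));
    int k = static_cast<int>(std::floor(z / m.h));
    assert(i >= 0 && i < m.NX && j >= 0 && j < m.NY && k >= 0 && k < m.NZ);
    double fx = x / m.h - i, fy = y / m.h - j, fz = z / m.h - k;
    for (int c = 0; c < 2; ++c)
        for (int b = 0; b < 2; ++b)
            for (int a = 0; a < 2; ++a) {
                double w = (a ? fx : 1.0 - fx) * (b ? fy : 1.0 - fy) * (c ? fz : 1.0 - fz);
                m.rho[m.ind(i + a, j + b, k + c)] += q * w;
            }
}
```

The eight weights sum to one, so the total charge on the mesh equals the deposited charge. On a GPU, several threads may update the same node at once, so the accumulation would need atomic adds or a per-node reduction.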

__global__ FFT ( )

• A real FFT is used, with sin(πn/N) basis functions;
• The 3D transform consists of three 1D FFTs, one for each axis: X, Y, Z
• int n = threadIdx.x+blockIdx.x*blockDim.x;

k=(int)(n/(NY+1));
j=n-k*(NY+1);
m=j*(NX+1)+k*(NX+1)*(NY+1);
FFT_X[i+1]=Rho[i+m];

[Figure: the lines along X are numbered in the Y-Z plane (axes NY, NZ): n = j + k*(NY+1)]
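The line indexing for the X pass can be checked with a small host-side C++ sketch. It follows the formulas on the slide. The loop range `i = 0..NX` and the names `lineOffset` and `gatherLineX` are assumptions, since the slide shows only the assignment itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offset of the first node of X-line number n, following the slide:
// k = n / (NY+1), j = n - k*(NY+1), m = j*(NX+1) + k*(NX+1)*(NY+1)
std::size_t lineOffset(std::size_t n, std::size_t NX, std::size_t NY) {
    std::size_t k = n / (NY + 1);
    std::size_t j = n - k * (NY + 1);
    return j * (NX + 1) + k * (NX + 1) * (NY + 1);
}

// Gathers the NX+1 values of one X-line into a 1-based work buffer,
// mirroring FFT_X[i+1] = Rho[i+m] on the slide (element 0 stays unused).
std::vector<double> gatherLineX(const std::vector<double>& rho, std::size_t n,
                                std::size_t NX, std::size_t NY) {
    std::size_t m = lineOffset(n, NX, NY);
    assert(m + NX < rho.size());
    std::vector<double> fftX(NX + 2, 0.0);
    for (std::size_t i = 0; i <= NX; ++i) {
        fftX[i + 1] = rho[i + m];
    }
    return fftX;
}
```

Each thread n then handles one line of NX+1 contiguous values, and there are (NY+1)*(NZ+1) such lines per pass.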

__global__ PoissonSolver ( )

• int n = threadIdx.x+blockIdx.x*blockDim.x;
• $U_{\mathrm{ind}(i,j,k)} = U_{\mathrm{ind}(i,j,k)} / ( k_{x_i}^2 + k_{y_j}^2 + k_{z_k}^2 )$

ind(i,j,k)=i+j*(NX+1)+k*(NX+1)*(NY+1);

k=(int)(n/((NX+1)*(NY+1)));
j=(int)((n-k*(NX+1)*(NY+1))/(NX+1));
i=n-j*(NX+1)-k*(NX+1)*(NY+1);
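The mapping from the linear thread index n back to the node indices (i, j, k) can be verified with a host-side round trip in C++. The `Node` struct and the function name `decompose` are hypothetical. Note that the divisor (NX+1)*(NY+1) has to be parenthesized as a whole, because `/` and `*` have equal precedence in C and C++ and are evaluated left to right.

```cpp
#include <cassert>

struct Node { int i, j, k; };

// Linear node index, as on the slide.
int ind(int i, int j, int k, int NX, int NY) {
    return i + j * (NX + 1) + k * (NX + 1) * (NY + 1);
}

// Inverse mapping: recover (i, j, k) from the thread's linear index n.
Node decompose(int n, int NX, int NY) {
    assert(n >= 0);
    int plane = (NX + 1) * (NY + 1);   // number of nodes in one Z-layer
    Node p;
    p.k = n / plane;
    p.j = (n - p.k * plane) / (NX + 1);
    p.i = n - p.j * (NX + 1) - p.k * plane;
    return p;
}
```

For the 25 x 25 x 25 mesh used in the performance measurements, n = 1000 maps to i = 12, j = 12, k = 1.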

__global__ E_SC ( )

• int n = threadIdx.x+blockIdx.x*blockDim.x+st_ind

[Figure: node U_n and its six neighbors: U_{n+1}, U_{n-1}, U_{n+(NX+1)}, U_{n-(NX+1)}, U_{n+(NX+1)(NY+1)}, U_{n-(NX+1)(NY+1)}]
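The six-neighbor stencil suggests a central-difference evaluation of E = -grad( U ). The host-side C++ sketch below assumes that scheme and a uniform mesh step `h`, neither of which is stated on the slides. The names `Vec3` and `fieldAtNode` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; };

// Central-difference estimate of E = -grad(U) at node n.
// The neighbors are reached with the index offsets from the slide:
// +-1 along X, +-(NX+1) along Y, +-(NX+1)(NY+1) along Z.
// The caller must pass an interior node; h is an assumed uniform mesh step.
Vec3 fieldAtNode(const std::vector<double>& U, std::size_t n,
                 std::size_t NX, std::size_t NY, double h) {
    const std::size_t sy = NX + 1;
    const std::size_t sz = (NX + 1) * (NY + 1);
    assert(n >= sz && n + sz < U.size());   // all six neighbors lie inside the array
    Vec3 E;
    E.x = -(U[n + 1] - U[n - 1]) / (2.0 * h);
    E.y = -(U[n + sy] - U[n - sy]) / (2.0 * h);
    E.z = -(U[n + sz] - U[n - sz]) / (2.0 * h);
    return E;
}
```

Central differences are exact for a linear potential, which gives a simple check: U = 2x + 3y + 5z should return E = (-2, -3, -5) at any interior node.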

Performance

Function*      CPU time [ms]   GPU time [ms]   Ratio [x]
Track          486             30              16
Losses         6997            75              93
Rho            79              6               14
Poisson/FFT    35              3               13
E_SC           1.2             0.8             1.4
Total          7598            114             67

* Mesh size: 25 x 25 x 25. Particles: 100,000. Triangles: 2054.

Comparison

Number of particles   CPU time            GPU time   Rate [x]
1,000                 3 min 19 sec        12 sec     17
10,000                34 min 14 sec       42 sec     49
100,000               5 h 41 min          ~6 min     56
1,000,000             2 days 8 h 53 min   1 h        60
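The rate column is the ratio of CPU time to GPU time. A small C++ helper, with the hypothetical names `toSeconds` and `speedup`, shows the arithmetic for the rows whose times are given to the second.

```cpp
#include <cassert>

// Converts a duration given as days, hours, minutes, seconds to seconds.
long toSeconds(long days, long hours, long minutes, long seconds) {
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds;
}

// Ratio of CPU time to GPU time.
double speedup(long cpuSeconds, long gpuSeconds) {
    assert(gpuSeconds > 0);
    return static_cast<double>(cpuSeconds) / static_cast<double>(gpuSeconds);
}
```

For 1,000 particles this gives 199 s / 12 s ≈ 16.6, and for 10,000 particles 2054 s / 42 s ≈ 48.9, matching the listed rates of 17 and 49 after rounding.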

SC effect

• No SC: losses 24%
• SC: losses 94%

I = 4 mA

Conclusions

• Very cheap technology
• A performance increase of about 1.5 orders of magnitude made complex cyclotron modeling possible
• Careful programming is required
• The method is to be extended to the calculation of beam halo, etc.