accelerator modeling through high performance...
TRANSCRIPT
NERSC,LBNL
NCCS, ORNL
Accelerator Modeling Through High Performance Computing
Z. LiAdvanced Computations DepartmentStanford Linear Accelerator Center
Presented at Jefferson Lab, 9-24-2007
Work supported by U.S. DOE ASCR, BES & HEP Divisions under contract DE-AC02-76SF00515
Contributions To This TalkA. CandelA. KabelK. KoZ. LiC. NgL. Xiao
V. AkcelikS. ChenL. GeL. LeeE. PrudencioG. SchussmanR. Uplenchwar
Advanced Computations DepartmentWork supported by U.S. DOE ASCR, BES & HEP Divisions under contract DE-AC02-76SF00515
Outline
DOE SciDAC Program
Parallel Code Development under SciDAC
Applications to DOE Accelerator Projects
Collaborations in Computational Science Research
SciDAC ProgramSciDAC: Scientific Discovery through Advanced Computing
DOE Office of Science (SC) Simulation Initiative Promotes application of High Performance Computing to SC programs across BES/NP/HEP Offices
Multi-disciplinary approach – computational scientists (CS & AM) work alongside application scientistsAccelerator project started as Accelerator Simulation and Technology (AST) in SciDAC1, and continues as Community Petascale Project for Accelerator Science and Simulation (COMPASS) in SciDAC2
Goal – To develop next generation simulation tools to improve the performance of present accelerators and optimize the design of future machines using flagship supercomputers at NERSC (LBNL) and NLCF (ORNL)
SLAC SciDAC ActivitiesParallel code development in electromagnetics and beam dynamics for accelerator design, optimization and analysis
Application to accelerator projects across HEP/BES/NPsuch as ILC, LHC, LCLS, SNS, etc…
Petascale simulations under SciDAC2 on DOE’ssupercomputers - currently 3 allocation awards at NERSC (Seaborg, Bassi, Jacquard) and NCCS (Phoenix)
Computational science research through collaborations with SciDAC CET/Institutes’ computer scientists and applied mathematicians
SLAC Parallel Codes under SciDAC1Electromagnetic codes in production mode:
Omega3P – frequency domain eigensolver for mode and damping calculations
S3P – frequency domain S-parameter computations
T3P – time domain solver for transient effects and wakefield computations with beam excitation
Track3P – particle tracking for dark current and multipacting simulations
V3D – visualization of meshes, fields and particles
SLAC Parallel Codes under SciDAC2Codes under development:
ElectromagneticsGun3P – 3D electron trajectory code for beam
formation and transport Pic3P – self-consistent particle-in-cell code for RF
gun and klystron (LSBK) simulations TEM3P – integrated EM/thermal/mechanical
analysis for cavity design
Beam dynamicsNimzovich – particle-in-cell strong-strong beam-
beam simulation
SciDAC Tools for Accelerator ApplicationsILC
Accelerating Cavity (DESY, KEK, JLab)– TDR, Low-loss, ICHIRO & cryomodule designs
Input Coupler (SLAC, LLNL) – TTFIII multipacting studiesCrab Crossing (FNAL/UK) - Deflecting cavity design Damping Ring (LBNL) – Impedance calculationsL-Band Sheet Beam Klystron – Gun and window modeling
LHCBeam-beam simulations
LCLSRF gun – emittance calculations using PIC codes
SNSBeta 0.81 cavity – end-group heating and multipacting
iWSMP MUMPS SuperLU
Krylov Subspace Methods
Domain-specific preconditioners
- Calculating HOM damping in the ILC cavities requires a nonlinear eigensolver when modeling the coupling to external waveguides (FP & HOM couplers) to obtain the complex mode frequencies as a result of power outflow
Omega3P
Lossless LossyMaterial
PeriodicStructure
ExternalCoupling
ESIL/withRestart
ISIL w/ refinement
Implicit/ExplicitRestartedArnoldi SOAR Self-Consistent
LoopNonlinearArnoldi/JD
Problems and Solver Options
Advances In Accelerator Simulation
1.0E+02
1.0E+03
1.0E+04
1.0E+05
1.0E+06
1.0E+07
1.0E+08
1.0E+09
1.0E+10
1990 1995 2000 2005 2010 2015 2020
Deg
rees
of F
reed
om
2D Cell
3D DampedCell
2D detuned structure
3D detuned structure with coupler
9-cell ILC cavity
ILC 3-module RF unit
ILC cryomodule
Modeling ILC RF systems under operation conditions
Moore’s law
End Group Design – HOM Damping
LL Cavity End-group Design
1.E+01
1.E+02
1.E+03
1.E+04
1.E+05
1.E+06
1.600 2.100 2.600 3.100F (GHz)
R/Q
(ohm
/m^2
/cav
ity) 1st
2nd3rd4th5th
LL Shape>15% higher R/Q (1177 ohm/cavity)>12% lower Bpeak/Eacc ratio20% lower cryogenic heating
Most important modes are 0-mode in the 3rd band
High R/Q in the 1st&2nd bands are up to 1/3 of the 3rd band
Beam pipe tapers down to 30-mm, 3rd band damped locally by HOM couplers
Damping criteria: 3rd band mode Qext<105 (?)
High R/Q 3rd Band Modes
Qext=4.6x105 Qext=1.4x104
-40
-35
-30
-25
-20
-15
-10
-5
0
5
10
-0.4 -0.2 0 0.2 0.4 0.6 0.8 1 1.2 1.4z (m)
Er (
a.u.
)
r=37mmr=38mmr=39mmr=41mmCoupler region
41mm end pipe radiusLow field in the coupler region
38mm pipe radiusField significantly improvedEnd-group modified to enhance damping
LL Cavity End-group
1.0E+03
1.0E+04
1.0E+05
1.0E+06
1.630 2.130 2.630F (GHz)
Qex
t
New LL designInitial with TTF coupler
370
540
3mm
-30
0
30
60
90
120
150
1.600 1.800 2.000 2.200 2.400 2.600 2.800 3.000F (GHz)
Mod
e po
lariz
atio
n an
gle Polarization
Qext
Effective damping achieved by optimizing:End-group geometry to increase fields in coupler regionLoop shape and orientation to enhance couplingOptimized azimuthal coupler orientation for x-y mode polarization
-120
-100
-80
-60
-40
-20
03.6E+09 3.7E+09 3.8E+09 3.9E+09 4.0E+09 4.1E+09 4.2E+09
F (Hz)
S12
(dB
)
notch gap=3.0mmnotch gap=3.1mm
Notch gap sensitivityQext in Crab-cavity
1.E+02
1.E+03
1.E+04
1.E+05
1.E+06
1.E+07
1.E+08
2.70E+09 3.30E+09 3.90E+09 4.50E+09 5.10E+09F (Hz)
Qex
t
Qext in the original design
Qext in the new design
Omega3P damping calculation
A copper model is being built in UK lab based on this design.
Crab Cavity Design for ILC BDSImproved FNAL design• better HOM, LOM and SOM damping• reduced HOM notch gap sensitivity(to 0.1 MHz/μm from original 1.6 MHz/μm)
• eliminates LOM notch filter• avoids x-y SOM coupling
SOM x-y coupling
original
modified
Original Design
New Design
Cavity Imperfection
HOM dampingX-Y couplingEffects on beam emittance
TESLA cavity imperfection studyTDR prototype cavity
Idea TDR cavity Omega3p model
The actual cell shape of the TESLA cavities differ from the idea due to fabrication errors, the addition of stiffening rings and the frequency tuning process.
TESLA cavity Measurement Data Study
20
25
30
35
40
45
700 720 740 760 780 800 820 840 860 880
Eac
c(M
V/m
)
fπ -f 8π/9
(KHz)
Idea: 753KHz
TDR cavity: operating mode from 80 cavities
The mode spacing increases.
• omega3p Calculation
TTF module 5: 1st/2nd dipole band
1.E+03
1.E+04
1.E+05
1.E+06
1.E+07
1600 1650 1700 1750 1800 1850 1900
F (MHz)
Qex
t
1.E+03
1.E+04
1.E+05
1700 1701 1702 1703 1704 1705 1706
1st band 6th pair
1.E+04
1.E+05
1.E+06
1877 1878 1879 1880 1881 1882
2nd band 6th pair
Dipole mode frequencies shift and Qext scatter.
(Neubauer, Michael L.)
Modeling Imperfection Of ILC TDR Cavity
Determine shape deformation from measured cavity data, inverse and forward methods Important to understand effect on Qext and x-y coupling of beam dynamicsActual deformation? – geometry measurement data will be very helpful
Ideal Cavity
Qext
shift
splitscatter
Welding stiffening ring deforms disk f
Stretching cavity f
Stiffening ring
Cell elliptically deformed
Red: ideal cavityBlue: deformed cavity
Cylindrical Symmetric Deformation (200micro on top/607micro on disk)
- cause frequency shift
Cavity stretching
Stiffening ring
TTF module 5: 1st/2nd dipole band meas. data
1.E+03
1.E+04
1.E+05
1.E+06
1.E+07
1600 1650 1700 1750 1800 1850 1900
F (MHz)
Qex
t
Module 5: 8 cavities :1st/2nd dipole band mode frequency shift
-12
-10
-8
-6
-4
-2
0
2
0 4 8 12 16 20 24 28 32 36Mode Index
F(re
al)-
F(id
ea) (
MH
z)
ac62 ac61ac65 ac66ac79 ac77ac63 ac60defomed surface
1st/2nd dipole band modes
1.E+03
1.E+04
1.E+05
1.E+06
1.E+07
1.60E+09 1.70E+09 1.80E+09 1.90E+09F (Hz)
Qex
t
idea-cavitydeform ed surface
1 .E + 0 4
1 .E + 0 5
1 .E + 0 6
1 .8 7 9 E + 0 9 1 .8 8 1 E + 0 9F (H z )
Qex
t
id
1 .E + 0 3
1 .E + 0 4
1 .E + 0 5
1 .7 0 0 E + 0 9 1 .7 0 5 E + 0 9F (H z )
Qex
t
id
8-cavity measurement v.s. simulation
Ideal v.s. deformed
fπ-f8π/9=772KHz within meas. Range. 1st/2nd dipole band mode freq. shift roughly fit measurement data.
Cell elliptical deformation (dr=250micro)
- cause mode Mode x-y coupling& Qext scattering
TDR cavity with elliptical cell shape
1.E+03
1.E+04
1.E+05
1.E+06
1.E+07
1.60E+09 1.65E+09 1.70E+09 1.75E+09 1.80E+09 1.85E+09 1.90E+09
F (Hz)
Qex
t
2nd band: 6th pair
1.E+04
1.E+05
1704.0 1705.0 1706.0 1707.0 1708.0F (MHz)
Qex
t
defor cell1 along xdefor cell4 along xdefor cell1 and 4 along xidea cavity
1st band 6th
pair 2nd band 6th
pair
1.E+03
1.E+04
1.E+05
1787.0 1787.5 1788.0 1788.5 1789.0 1789.5 1790.0
defor cell1 along xdefor cell4 along xdefor cell1 and 4 along xidea cavity
elliptically deformed
cavity
ideal cavity
End Group RF Study
Notch filterPeak surface fieldMultipacting
Crab Cavity: HOM Notch Filter Sensitivity
-120
-100
-80
-60
-40
-20
03.6E+09 3.7E+09 3.8E+09 3.9E+09 4.0E+09 4.1E+09 4.2E+09
F (Hz)
S12
(dB
)
notch gap=0.6mm
notch gap=0.64mm
1.6MHz/micron
-120
-100
-80
-60
-40
-20
03.6E+09 3.7E+09 3.8E+09 3.9E+09 4.0E+09 4.1E+09 4.2E+09
F (Hz)
S12
(dB)
notch gap=3.0mmnotch gap=3.1mm
0.1MHz/micron
Original Design
New Design
SLAC-PUB-12409
Very sensitive tuning was found in the original design
−1.6MHz/micron−0.1MHz/micron for TESLA TDR
Resonator geometry was modified to improve the tunability
−0.1MHz/micron achieved
Multipacting in HOM Coupler
Re-optimized loop: with round surfaces and a larger gap.
• No multipacting up to 50MV/m.• Qext for the 3rd
band mode is 3.4x104
larger gap round surfaces
MP trajectories at 15-MV/m.
Initial optimized design: multipacting in the gap between the flat surface and outer cylinder at field levels starting from 10- MV/m and up.
Multipacting in SNS HOM Coupler
0 5 10 15 200
0.5
1
1.5
Field Level (MV/m)
Del
ta
SNS-beta=0.81 cavity@16MV/m
0
2
4
6
8
10
12
14
16
0.48 0.49 0.50 0.51 0.52 0.53 0.54HOM coupler notch gap (mm)
max
E a
t HO
M n
otch
gap
(MV
/m) HOM2 coupler
HOM1 coupler
Field level in HOM couplers
• SNS SCRF cavity experienced RF heating at HOM coupler
• 3D MP simulations showed MP barriers closed to measurements
• Similar analysis are carried out for ILC ICHIRO and crab cavity
Track3P
Expt. MP bands
HOM2
HOM1 HOM2
Multipacting Simulation – Track3P3D parallel high-order finite-element particle tracking code for dark current and multipacting simulations (developed under SciDAC)
Track3P − traces particles in resonant modes, steady state or transient fields− accommodates several emission models: thermal, field and secondary
MP simulation procedure− Launch electrons on specified surfaces with different RF phase, energy and
emission angle− Record impact position, energy and RF phase; generate secondary electrons
based on SEY according to impact energy− Determine “resonant” trajectories by consecutive impact phase and position− Calculate MP order (#RF cycles/impact) and MP type (#impacts /MP cycle)
Track3P benchmarked extensively− Rise time effects on dark current for an X-band 30-cell structure − Prediction of MP barriers in the KEK ICHIRO cavity
TTFIII Coupler – Multipacting AnalysisMP simulations are carried out in support of ILC test stand at SLAC (LLNL) to study the cause of the TTFIII coupler’s long processing time
Track3P model
RF In RF Out
Mulitpacting in Coax of TTFIII Coupler
0 0.2 0.4 0.6 0.8 11
1.2
1.4
1.6
1.8
2
RF Input Power (MW)
Ave
rage
Del
ta
third orderfourth orderfifth ordersixth orderseven order
Simulated power (kW) 170~190 230~270 350~390 510~590 830~1000Power in Coupler (kW) 43~170 280~340 340~490 530~660 850~1020
klystron power (kW) 50~200 330~400 400~580 620~780 1000~1200
After high power processing
(F. Wang, C. Adolphsen, et. al)
Track3P simulation
Cold coax
PAC07, Albuquerque, Jun 25-29, 2007
Parallel Finite Element Particle-In-Cell Codefor Simulations of Space-Charge
Dominated Beam-Cavity Interactions
Arno Candel
Andreas Kabel, Liequan Lee, Zenghai Li, Cho Ng,Ernesto Prudencio, Greg Schussman, Ravi Uplenchwar and Kwok Ko
ACD, Stanford Linear Accelerator Center
Cecile LimborgLCLS, Stanford Linear Accelerator Center
* Work supported by U.S. DOE ASCR & HEP Divisions under contract DE-AC02-76SF00515
Parallel Finite Element Time-Domain Maxwell’s Wave Equation in Time-Domain:
Time integration - Unconditionally stable implicitNewmark scheme (to do: solve Ax=b)
Parallelization - MPI on distributed memory platforms
Higher-order (p=1…6) Whitney basis functions:
N1 N2 N3 … N76
Spatial discretization -Conformal, unstructured gridwith curved surfaces (q=1…2)
LCLSRF Gun
SciDAC Codes – Pic3P/Pic2P
1st successful implementation of self-consistent, charge-conserving PIC code with conformal Whitney elements on unstructured FE grid
Higher-order particle-fieldcoupling, no interpolationrequired
2) Calculate EM fields from Maxwell’s Eqs.
3) Push particles
1) Compute particle current
Pic2P – Parallel 2.5D FE PIC CodePic3P – Parallel 3D FE PIC Code
Pic2P Simulation of LCLS RF Gun
Pic2P – Code from 1st principles, accurately includes effects of space charge, retardation, and wakefieldsUses conformal grid, higher-order particle-field coupling and parallel computing for large, fast and accurate simulations
Drive + Scattered fields Scattered fields only
LCLS RF Gun Bunch Radius
LCLS RF Gun Emittance
PARMELA:No retardation
MAFIA PARMELAPic2P
Pic2P PARMELAMAFIA
LCLS RF Gun Phasespace (1.5 nC) Long.
Transv.
δEδE
Pic2P - Performance
10 minutes!
Pic2P with parallel computing:Highly accurate results during a coffee break!
LCLS Injector Modeling
Adaptive refinement – Efficient simulations of long structures
PIC in long structures – Klystrons, injectors, … Active research
LCLS injector
1m
LCLS Injector Modeling
Adaptive refinement – Efficient simulations of long structures
PIC in long structures – Klystrons, injectors, … Active research
only scattered fields shown
RF gun + drift with focusing solenoid Z=60 cm
LCLS injector
1m
L-Band Sheet Beam KlystronInput: 115 kVOutput: 129 A
Parallel scaling
Bassi at NERSC
LBSK gun –• Simulated using GUN3P, a parallel,
3D, finite-element (up to 4th order) electron trajectory code
• Parallel computation allows high resolution simulation with fast turnaround time
144K tets4.5M DOF’s
Multi-physics Analysis for Accelerator Components
Virtual prototyping through computing− RF design− RF heating− Thermal radiation− Lorentz force detuning− Mechanical stress − Optimization
Large-scale parallel computing enables:− Large system optimization − Accurate and reliable multi-physics analysis− Fast turn around time
TEM3P – integrated parallel multi-physics tools
TEM3P: Multi-Physics Analysis
CAD Model
EM Analysis
Thermal Analysis
Mechanical Analysis
Finite element based with high-order basis functions− Natural choice: FEM originated from
structural analysis! Use the same software infrastructure as Omega3P− Reuse solvers framework− Mesh data structures and format
Parallel
TEM3P for LCLS RF Gun
EM Domain
Thermal/MechanicalDomain
Benchmark TEM3P against ANSYS
CAD Model (courtesy of Eric Jongewaard)
RF Gun EM Thermal/Mechanical Analysis
Operating mode: 2.856GHz
Mesh for Thermal/Mechanical analysisMesh: 0.6 million nodes.Materials: Copper + Stainless steelThermal analysis: 7 cooling channels
Magnetic field on the cavity inner surface generates RF heat load
Mesh for RF analysis
Thermal Analysis Benchmarked With ANSYS
ANSYSMaximal Temperature 49.82 C
Temperature Distribution
RF heat load: 4000 WattCooling channels: with given temperature, Robin BCThermal conductivity: copper 391; stainless steel 16.2
TEM3PMaximal Temperature 49.96 C
Mechanical Analysis With Thermal Load
ANSYS TEM3PMaximal displacement: 37.1 μm Maximal displacement: 36.99 μm
Future work: compute stress and drift frequency
Multi-physics Analysis for SRF Cavities and Cryomodules
Thermal behaviors are highly nonlinearMeshing thin shell geometry
− Anisotropic high-order mesh will reduce significant amount of computing
− Working with RPI/ITAPS
Modeling ILC Cryomodule & RF Unit
Physics Goal: Calculate wakefield effects in the 3-cryomodule RF unit with realistic 3D dimensions and misalignments
• Trapped mode and damping• Cavity imperfection effects on HOM damping• Wakefield effect on beam dynamics• Effectiveness of beam line aborsorber
cryomodule
rf unit cavity
ILC 8-Cavity Module
A dipole mode in 8-cavity cryomodule at 3rd band
First ever calculation of a 8 cavity cryomodule~ 20 M DOFs~ 1 hour per mode on 1024 CPUs for the cryomodule
To model a 3-module RF unit would require• >200 M DOFs• Advances in algorithm and solvers• Petascale computing resources
TDR 8-Cavity Module 3rd Band Modes From Omega3P Calculation(R. Lee)
Calculated on NERSC Seaborg: 1500 CPUs, over one hour per mode
Kick Factor Of One Set Of 3rd Band Modes in the 8-Cavity TDR Module
1.0E+05
1.0E+06
1.0E+07
1.0E+08
1.0E+09
1.0E+10
1.0E+11
1.0E+12
1.0E+13
1.0E+14
2.5770 2.5772 2.5774 2.5776 2.5778 2.5780 2.5782 2.5784F (GHz)
Kick
Fac
tor (
V/C
/m/m
odul
e)
K_XK_YKx[y]=Ky[x]_ampKx[0]_ampKy[0]_amp
• Modes above cutoff frequency are coupled through out 8 cavities• Modes are generally x/y-tilted & twisted due to 3D end-group geometry• Both tilted and twisted modes cause x-y coupling
TDR 8-Cavity Module
1.0E+03
1.0E+04
1.0E+05
1.0E+06
1.0E+07
2.577 2.5775 2.578 2.5785F (GHz)
Qex
t
1.E+03
1.E+04
1.E+05
1.E+06
1.E+07
2.576E+09 2.581E+09 2.586E+09 2.591E+09F (Hz)
Qex
tmeasured data in some cavities
calculated
One polarization mode is well damped.
The calculated trapped mode damping in 8-cavity
1.E+03
1.E+04
1.E+05
1.E+06
1.E+07
2.58E+09 2.58E+09 2.58E+09 2.58E+09 2.58E+09 2.58E+09
F (Hz)
Qex
t
calculated
Recent Advances in Solver and Meshing
Invalid tets (yellow) Corrected mesh
Linear SolverSimulation capabilities limited by memory available even on DOE flagship supercomputers – develop methods for reducing memory usage
Method Memory (GB) Runtime (s)
MUMPS 155.3 293.3MUMPS + single precision factorization
82.3 450.1
Meshing• Invalid quadratic tets
generated on curved surface
• Collaborated with RPI on amesh correction tool
• Runtime of corrected modelfaster by 30% (T3P)
SciDAC CS/AM ActivitiesShape Determination & Optimization (TOPS/UT Austin, LBNL) –Obtain cavity deformations from measured mode data through solving a weighted least square minimization problem
Parallel Complex Nonlinear Eigensolver/Linear Solver (TOPS/LBNL)– Develop scalable algorithms for solving LARGE, complex, nonlinear eigenvalue problems to find mode damping in the rf unit complete with input/HOM couplers and external beampipes
Parallel Adaptive Mesh Refinement and Meshing (ITAPS/RPI, ANL)– Optimize computing resources and increase solution accuracy through adaptive mesh refinement using local error indicator based on gradient of electromagnetic energy in curved domain
Parallel and Interactive Visualization (ISUV/UC Davis) –Visualize complex electromagnetic fields and particles with largecomplex geometries and large aspect ratio
SummaryA suite of parallel codes in electromagnetics and beam dynamics was developed for accelerator design, optimization and analysis
Important contributions have been made using these codes to accelerator projects such as ILC, LHC, LCLS, SNS, etc…
Through the SciDAC support and collaborations, advances in applied math and computer science are being made towards Petascale computing of large accelerator systems such as the ILC RF unit, etc
ILC Damping Ring Impedance Calculations
DR Cavity (scaled Cornell): sigma_z=0.5mm
-1.0-0.8-0.6
-0.4-0.20.00.20.4
0.60.81.0
0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07s (m)
W_L
, Q
Long. WakeCharge
σ = 1 mm
• Components scaled from existing machines• Determine pseudo Green’s function wakefield for beam stability studies
RF cavity BPM
T3P