Parallel Implementation of Impact Simulation
Seung Hoon Paik*, Ji-Joong Moon* and Seung Jo Kim**
* School of Mechanical and Aerospace Engineering, Seoul National University, Seoul, Korea
** School of Mechanical and Aerospace Engineering, Flight Vehicle Research Center, Seoul National University, Seoul, Korea
Supercomputing Korea 2006
November 21, 2006
Outline
Introduction
Lagrangian Scheme
- FE Calculation Parallelization
- Contact Parallelization
- Verification & Performance Evaluation: Taylor Impact Test, Oblique Impact of Metal Sphere

Eulerian Scheme
- Two-step Strategy
- Remap Module
- Eulerian Scheme Parallelization
- Verification: 2-D/3-D Square Bar Impact
- Parallel Performance Evaluation: 3-D Square Bar Impact
Introduction
Impact problems include complex phenomena, so various numerical schemes are required.

Lagrangian scheme
- Material velocity = mesh velocity
- Instability due to large distortion [Scheffler, 2000]

Eulerian scheme
- Fixed mesh
- Ambiguous material interface

Applications of the Eulerian Scheme in Aerospace Engineering
- Bird strike on a composite plate [Hassen, 2006], LS-DYNA
- Impact of a fuel-filled wing [Anderson, 1999], CTH
An impact problem demands:
- Many time steps
- Complex contact
- Nonlinear material behavior
- A large-scale FE model of the whole structure

It therefore requires long computing times, which motivates parallel computing.

Objectives
- Development of an impact code based on the Eulerian and Lagrangian schemes
- Implementation of an efficient parallel algorithm and achievement of good performance
IPSAP (Internet Parallel Structural Analysis Program, http://ipsap.snu.ac.kr)

IPSAP/Standard: Solver Engine
- Linear Equation Solver
- Eigenvalue Solver
- Non-linear Solver (under development)

IPSAP/Explicit: FEM Modules
- Lagrangian Scheme
- Eulerian Scheme
Linux Cluster Supercomputer: Pegasus System

Hardware
- Unit node: Intel Xeon 2.2/2.4/2.8/3.0 GHz CPUs, DDR ECC 3 GB/6 GB RAM, IDE 80 GB/160 GB HDD, E7500/7501 dual M/B
- Total CPUs: 520 (256 at 2.2 GHz, 112 at 2.4 GHz, 64 at 2.8 GHz, 88 at 3.0 GHz)
- Total nodes: 260
- Total memory / storage: 1.02 TB / 25 TB
- Network: Gigabit Ethernet (Intel NIC e1000) and Fast Ethernet; 7 NFS servers
- Performance: 1.283 Tflops (Rmax), 2.5 Tflops (Rpeak)

Software
- OS: Windows Advanced Server 2000, Red Hat Linux 9.0
- Compilers: gcc 3.3, Intel 8.0, Visual Studio 6.0
- MPI: LAM/MPI 7.0.6, MPICH 1.2.5.2, MPI/Pro, NT-MPICH
- Job schedulers: OpenPBS, Condor
- Grid middleware: Globus 2.4

[Figure: network topology. Racks of 20 nodes with multi-trunking (4 GB uplink); Nortel 380-24T and 5510-48T Gigabit switches, Intel 24T Fast Ethernet switches; local Gigabit/Fast networks; NFS & gatekeeper nodes; external network]
IPSAP/Explicit - Lagrangian Scheme
- FE Calculation Parallelization
- Contact Parallelization
- Verification & Performance Evaluation: Taylor Impact Test, Oblique Impact of Metal Sphere
IPSAP/Explicit (Internet Parallel Structural Analysis Program): Lagrangian Scheme

• Explicit time integration, automatic time step control (see the sketch after this list)
• Material models: Elastic, Orthotropic, Elastoplastic, Johnson-Cook
• EOS (Equation of State): Polynomial Model, JWL, Grüneisen
• FE models: 8-node hexahedron and 4-node BLT shell, 1-point integration with hourglass control
• Objective stress update: Jaumann rate stress update
• Artificial bulk viscosity
• Contact treatment: bucket-sort contact search, master-slave algorithm, penalty method, single-surface (self) contact
• Element erosion and automatic exterior contact surface update
• MPP parallelization
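The first bullet pairs the central-difference update with a CFL-limited automatic step. A minimal sketch of that pairing, assuming a hypothetical lumped-mass data layout (the slides do not show IPSAP's actual source):

```c
/* Sketch of explicit central-difference time integration with automatic
 * (CFL-limited) time step control. Data layout is hypothetical; this is
 * not IPSAP/Explicit's actual code. */
#include <math.h>

/* One central-difference step: a_n = M^-1 (f_ext - f_int),
 * v_{n+1/2} = v_{n-1/2} + a_n * dt, u_{n+1} = u_n + v_{n+1/2} * dt. */
void explicit_step(int ndof, const double *mass, const double *f_ext,
                   const double *f_int, double *vel, double *disp, double dt)
{
    for (int i = 0; i < ndof; ++i) {
        double accel = (f_ext[i] - f_int[i]) / mass[i];
        vel[i]  += accel * dt;
        disp[i] += vel[i] * dt;
    }
}

/* Automatic time step: smallest element characteristic length divided by
 * the dilatational wave speed c = sqrt((K + 4G/3)/rho), scaled down. */
double stable_dt(int nelem, const double *char_len, const double *wave_speed)
{
    double dt = 1.0e30;
    for (int e = 0; e < nelem; ++e) {
        double dte = char_len[e] / wave_speed[e];
        if (dte < dt) dt = dte;
    }
    return 0.9 * dt; /* 0.9 is a typical safety factor, assumed here */
}
```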
FE Calculation Parallelization

1. Compute at each processor (domain) independently.
2. Interface values are swapped and added (see the sketch below).

Each processor (domain) knows
(1) the list of processors that share a common interface, and
(2) the list of nodes on each shared interface.

Both lists are built at the initialization stage and are not changed during the computation (static).

[Figure: 3x3 grid of processor domains; array structures of the send and receive buffers]
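A hedged sketch of this static exchange, assuming hypothetical neighbor/interface structures (the slides show only the buffer layout, not the code): each processor packs its interface nodal forces into per-neighbor send buffers, posts nonblocking sends and receives, and adds the received contributions.

```c
/* Sketch of the static interface exchange: pack interface nodal forces,
 * swap with each neighbor, and accumulate. The Neighbor struct and names
 * are illustrative, not IPSAP's actual data structures. */
#include <mpi.h>

typedef struct {
    int rank;        /* neighboring processor */
    int nnode;       /* number of nodes on the shared interface */
    int *node;       /* local indices of those nodes (built at init, static) */
    double *sendbuf; /* nnode * 3 entries */
    double *recvbuf;
} Neighbor;

void swap_and_add_forces(Neighbor *nbr, int nnbr, double *force)
{
    MPI_Request req[2 * 64]; /* assume nnbr <= 64 for brevity */
    for (int p = 0; p < nnbr; ++p) {
        for (int i = 0; i < nbr[p].nnode; ++i)       /* pack */
            for (int d = 0; d < 3; ++d)
                nbr[p].sendbuf[3*i + d] = force[3*nbr[p].node[i] + d];
        MPI_Irecv(nbr[p].recvbuf, 3*nbr[p].nnode, MPI_DOUBLE,
                  nbr[p].rank, 0, MPI_COMM_WORLD, &req[2*p]);
        MPI_Isend(nbr[p].sendbuf, 3*nbr[p].nnode, MPI_DOUBLE,
                  nbr[p].rank, 0, MPI_COMM_WORLD, &req[2*p + 1]);
    }
    MPI_Waitall(2 * nnbr, req, MPI_STATUSES_IGNORE);
    for (int p = 0; p < nnbr; ++p)                   /* add contributions */
        for (int i = 0; i < nbr[p].nnode; ++i)
            for (int d = 0; d < 3; ++d)
                force[3*nbr[p].node[i] + d] += nbr[p].recvbuf[3*i + d];
}
```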
Contact Parallelization (Computers and Structures, 2006)

- Contact segments are partitioned identically to the FE decomposition, and the data of contact nodes that enter the expanded spatial region occupied by each segment are sent and received using unstructured communication.
- Generalized so that both two-body contact and single-surface contact are handled.
- The FE internal force vector and the master nodes' contact force vectors are communicated together.
- The send/receive data structures stay consistent, and the transmitted data are minimized even under unstructured communication.
- Large-scale parallel performance results are presented on an ordinary Linux cluster, rather than on vendor hardware or a specially optimized OS.
- Contact load balancing (a sketch of the bucket-sort contact search follows below).
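The Lagrangian feature list names bucket sorting for the contact search. A minimal sketch of that idea, with illustrative names (not IPSAP's implementation): nodes are hashed into uniform grid cells so that a contact segment only tests nodes from nearby cells.

```c
/* Minimal bucket-sort contact search sketch: hash each node into a uniform
 * grid cell so only nodes in neighboring cells are tested against a contact
 * segment. Illustrative only; not IPSAP's actual implementation. */
typedef struct {
    int nx, ny, nz;       /* number of cells per direction */
    double x0, y0, z0, h; /* grid origin and cell size */
    int *head;            /* per-cell list head, length nx*ny*nz */
    int *next;            /* per-node link, length nnode */
} Buckets;

/* Build linked lists of node indices per cell (head/next arrays). */
void bucket_build(Buckets *b, int nnode, const double *xyz)
{
    int ncell = b->nx * b->ny * b->nz;
    for (int c = 0; c < ncell; ++c) b->head[c] = -1;
    for (int n = 0; n < nnode; ++n) {
        int i = (int)((xyz[3*n + 0] - b->x0) / b->h);
        int j = (int)((xyz[3*n + 1] - b->y0) / b->h);
        int k = (int)((xyz[3*n + 2] - b->z0) / b->h);
        int c = (k * b->ny + j) * b->nx + i;
        b->next[n] = b->head[c]; /* push node onto its cell's list */
        b->head[c] = n;
    }
}
```

A segment then gathers candidate slave nodes by walking the head/next lists of its own cell and the 26 surrounding cells, which keeps the search cost roughly linear in the number of nodes.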
Verification: Taylor Bar Impact Test

Analysis conditions
- Material model: elastic-plastic with linear hardening
- Termination time: 80 μs
- Constraints: sliding condition on the bottom surface
- FE model: 1,369 nodes, 972 elements

Material constants & geometric configuration
- Density (g/cm^3): 8.93
- Young's modulus (GPa): 117
- Poisson's ratio: 0.35
- Initial yield stress (GPa): 0.4
- Hardening modulus (GPa): 0.1
- Initial velocity (km/s): 0.227
- Initial length (mm): 32.4
- Initial radius (mm): 3.2

Results (final configuration)

Code              Length (mm)   Radius (mm)
ABAQUS/Explicit   21.48         7.08
LS-DYNA           21.23         6.18
IPSAP/Explicit    21.52         7.00

[Figure: initial & deformed configurations]
Verification: Oblique Impact of Metal Sphere

Comparison with experiment (Finnegan SA, Dimaranan LG, Heimdahl OER, 1993)

Model configuration
- Material model: Johnson-Cook
- Sphere diameter (mm): 6.35
- Mass (g): 1.04
- Plate length/width (mm): 50/40
- Erosion EPS (effective plastic strain at erosion): 2.0
- Impact angle: 60°

[Figures: experiment vs. IPSAP/Explicit at (a) 610 m/s and (b) 910 m/s]
Parallel Performance Evaluation: Taylor Impact Test

Domain decomposition: graph partitioning scheme (METIS)

Fixed-size speedup (10 million DOF)
- 1 CPU/node: speedup of 122 at 128 CPUs
- 2 CPUs/node: speedup of 105 at 128 CPUs, 151 at 256 CPUs

Scaled speedup (55,296 elements per CPU; 7 million elements at 128 CPUs)
- Speedup of 128 at 128 CPUs (1 CPU/node)

[Figures: fixed-size speedup, scaled speedup]
Parallel Performance Evaluation: Oblique Impact of Metal Sphere

Parallel performance of the finite element and contact-treatment calculations
- The contact-treatment calculation reduces parallel efficiency.
- Load balancing for the contact treatment (Contact L/B) grows as the number of CPUs increases: the contact region does not span the whole domain but occurs locally at the impact site, so the contact calculation is imbalanced.
- The contact treatment itself (Contact Force) takes only a small share of the total computation time.
- Communication of contact data (Contact Comm.)
Eulerian Scheme
- Two-step Strategy
- Remap Module
- Eulerian Scheme Parallelization
- Verification: 2-D/3-D Square Bar Impact
- Parallel Performance Evaluation: 3-D Square Bar Impact
Two-step strategy
The Eulerian equations in conservation form,

$$\frac{\partial \phi}{\partial t} + \nabla \cdot (\phi \mathbf{u}) = S,$$

are operator-split into two equations that are solved sequentially:

- Eulerian (Lagrangian) step: $\partial \phi / \partial t = S$
- Remap step/part/module: $\partial \phi / \partial t + \nabla \cdot (\phi \mathbf{u}) = 0$

Two-step strategy: operator split (a sketch of the resulting cycle follows below).

Eulerian codes such as CELL, JOY, HULL, PISCES, CSQ, CTH, MESA, and KRAKEN are two-step codes.
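Schematically, the resulting cycle can be written as below; the function names are placeholders for the two operator-split solves, not IPSAP's API.

```c
/* Schematic two-step (operator-split) Eulerian cycle: a Lagrangian step
 * advances the solution with source terms on a deforming mesh, then the
 * remap step advects the state back onto the fixed Eulerian mesh.
 * All names are placeholders for illustration. */
typedef struct State State; /* cell/vertex fields: density, stress, velocity, ... */

extern double stable_time_step(const State *s);
extern void lagrangian_step(State *s, double dt); /* d(phi)/dt = S               */
extern void remap_step(State *s);                 /* d(phi)/dt + div(phi*u) = 0  */

void run(State *s, double t_end)
{
    double t = 0.0;
    while (t < t_end) {
        double dt = stable_time_step(s); /* CFL-limited automatic step        */
        lagrangian_step(s, dt);          /* mesh follows the material         */
        remap_step(s);                   /* conservative advection back to
                                            the fixed mesh                    */
        t += dt;
    }
}
```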
Eulerian Scheme - Remap Module

- Compute material flux: compute the volume flux, then compute the material flux using an interface tracking algorithm.
- Material(element)-centered advection: advect density, stress, strain, and energy (a 1-D advection sketch follows the table below).
- Vertex-centered advection: advect momentum and kinetic energy, then compute the nodal velocity.

Remap choices in related codes:

                              ALEGRA                 RHALE                  CTH
                              (MMALE code, 1997)     (ALE code, 1993)       (Eulerian code, 1989)
Interface tracking algorithm  Young's, SLIC          Young's, SLIC          Young's, SLIC
Advection (element-centered)  Van Leer's MUSCL,      Van Leer's MUSCL,      Van Leer's MUSCL
                              Super-B                Super-B
Advection (vertex-centered)   SALE HIS               Modified HIS           SHALE HIS
Developer                     Sandia National Lab.   Sandia National Lab.   Sandia National Lab.

(The HIS vertex-centered advection conserves momentum but leaves a kinetic energy mismatch.)
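For the element-centered advection, here is a hedged 1-D sketch of a MUSCL-type upwind flux with a simple minmod limiter (the codes above use Van Leer's MUSCL and Super-B limiters in full multi-dimensional, multi-material form; this is illustrative only):

```c
/* 1-D MUSCL-type advection sketch: limited linear reconstruction in each
 * cell, upwind flux at the faces. Illustrative only. */
#include <math.h>

static double minmod(double a, double b) /* a simple TVD slope limiter */
{
    if (a * b <= 0.0) return 0.0;
    return (fabs(a) < fabs(b)) ? a : b;
}

/* Advance cell averages one step; u > 0 assumed for brevity, and the two
 * boundary cells on each side are left untouched. Face i sits between
 * cells i-1 and i; n <= 1024 assumed for the local flux array. */
void muscl_advect(int n, double *q, double u, double dx, double dt)
{
    double flux[1024];
    double nu = u * dt / dx; /* Courant number */
    for (int i = 2; i < n; ++i) {
        double s = minmod(q[i-1] - q[i-2], q[i] - q[i-1]); /* upwind slope */
        /* face value from the upwind (left) cell, advanced a half step */
        flux[i] = u * (q[i-1] + 0.5 * (1.0 - nu) * s);
    }
    for (int i = 2; i < n - 1; ++i)
        q[i] -= (dt / dx) * (flux[i+1] - flux[i]);
}
```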
Two-step Eulerian Code Structure

Structure of the program: IPSAP/Explicit comprises serial Lagrangian, serial Eulerian, parallel Lagrangian, and parallel Eulerian modules.
Eulerian Scheme Parallelization

Step 1. Communication of cell values: VOF, density, ST*, EPS, ABV, IE (see the exchange sketch below)
Step 2. Calculation of material flux: volume flux, material interface tracking, material flux at the cell faces
Step 3. Advection of cell-centered variables: VOF, density, ST, EPS, ABV, IE
Step 4. Advection of vertex-centered variables: calculate momentum and mass at the vertices, communicate vertex values (momentum in the I, J, K directions and mass), then calculate the velocity at the vertices

*ST: stress tensor; EPS: effective plastic strain; ABV: artificial bulk viscosity; IE: internal energy

[Figure: 1-D stencil of cells [i-2] [i-1] [i] [i+1] [i+2] in the I-J plane]
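A sketch of Step 1 for a 1-D decomposition along the impact direction (as used in the performance tests below); NG, the array layout, and the names are assumptions, not IPSAP's actual code:

```c
/* Sketch of Step 1: exchange ghost-cell layers of a cell-centered variable
 * (VOF, density, stress component, ...) with the +/- neighbors along the
 * decomposed direction. 1-D decomposition assumed for brevity. */
#include <mpi.h>

#define NG 2 /* two ghost layers per side, enough for a MUSCL stencil */

/* q holds ncell + 2*NG values: ghosts [0,NG), interior [NG, NG+ncell),
 * right ghosts [NG+ncell, NG+ncell+NG). MPI_PROC_NULL neighbors make
 * physical boundaries no-ops. */
void exchange_ghosts(double *q, int ncell, int left, int right, MPI_Comm comm)
{
    /* send first NG interior cells left, receive into right ghosts */
    MPI_Sendrecv(&q[NG],         NG, MPI_DOUBLE, left,  0,
                 &q[NG + ncell], NG, MPI_DOUBLE, right, 0,
                 comm, MPI_STATUS_IGNORE);
    /* send last NG interior cells right, receive into left ghosts */
    MPI_Sendrecv(&q[ncell],      NG, MPI_DOUBLE, right, 1,
                 &q[0],          NG, MPI_DOUBLE, left,  1,
                 comm, MPI_STATUS_IGNORE);
}
```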
Verification: Square Bar Impact 2D

Model configuration
- Geometry: 32 x 1 x 10 mm; cell length 1 mm (320 cells in total)
- Constraints: sliding BC on the exterior surface of the model
- Impact velocity: 200 m/s
- Termination time: 80 μs

Deformation results

             IPSAP/Explicit        LS-DYNA
             Lag.      Eul.        Lag.      Eul.
Length (mm)  22.5      22.0        22.5      22.7
Width (mm)   9.6       9.5         9.5       9.2

[Figure: deformed configurations at 0, 20, 40, and 80 μs (Lagrangian: left, Eulerian: right)]

Note: the contour plot (VOF = 0.5) is not identical to the material interface.
Verification: Square Bar Impact 3D

Model configuration
- Geometry: 32 x 10 x 10 mm; cell length 1 mm (3,200 cells in total)
- Constraints: sliding BC on the exterior surface of the model
- Impact velocity: 200 m/s
- Termination time: 80 μs

Deformation results

             IPSAP/Explicit        LS-DYNA
             Lag.      Eul.        Lag.      Eul.
Length (mm)  21.9      21.5        21.8      23.5
Width (mm)   6.12      6.1         6.0       5.5

Timing (IPSAP/Explicit)

                          Lagrangian   Eulerian
Elapsed time (s)          2.94         4.91
Number of steps (Ncycle)  2311         415

[Figures: IPSAP/Explicit deformed shapes at 0, 10, 20, and 40 μs (Lagrangian: left, Eulerian: right); LS-DYNA results at 80 μs (Lagrangian: left, Eulerian: right)]
Parallel Performance Evaluation: 3-D Square Bar Impact

Model configuration
- Example: 3-D bar impact, 1024 x 20 x 20 mesh (409,600 elements)
- 10 μs simulated (1,500 cycles)
- Domains are decomposed along the impact direction

IPSAP/Explicit vs. LS-DYNA (double precision)

*Grind time: clock time per zone cycle = total elapsed time / (total elements x Nsteps)

       IPSAP/Explicit                          LS-DYNA_double
NCPU   Elapsed (s)  Grind (ns)  Speedup        Elapsed (s)  Grind (ns)*  Speedup
1      4.666e+03    7.584e+03   1.00           7.698e+03    1.222e+04    1.00
2      3.589e+03    5.842e+03   1.30           5.240e+03    8.171e+03    1.50
4      1.804e+03    2.936e+03   2.58           2.533e+03    3.887e+03    3.14
8      8.596e+02    1.399e+03   5.42           1.722e+03    2.606e+03    4.69
16     4.630e+02    7.536e+02   10.1           1.343e+03    1.988e+03    6.15
32     2.902e+02    4.723e+02   16.1           6.956e+02    9.387e+02    13.0
64     1.840e+02    2.995e+02   25.3           5.61e+02     7.048e+02    17.3

IPSAP/Explicit's elapsed times are two to three times smaller than LS-DYNA's.
LS-DYNA: uses the HIS algorithm, MM
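As a worked check of the grind-time definition, using the 1-CPU IPSAP/Explicit row:

$$\text{grind time} = \frac{4.666 \times 10^{3}\ \mathrm{s}}{409{,}600 \times 1500} \approx 7.59 \times 10^{-6}\ \mathrm{s} \approx 7.59 \times 10^{3}\ \mathrm{ns},$$

which agrees with the tabulated 7.584e+03 ns.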
Elapsed time of each sub-function (s)

       Lagrangian part            Remap part                 Time           Total
NCPU   Internal F   Comm. IF      Remap        Comm. Remap   integration    (elapsed)
1      3.230e+02    0.0           4.062e+03    0.0           3.257e+02      4.711e+03
2      1.957e+02    6.773e-01     3.034e+03    4.408e+01     3.482e+02      3.587e+03
4      1.056e+02    1.191e+00     1.502e+03    5.873e+01     1.842e+02      1.803e+03
8      6.301e+01    1.381e+00     6.829e+02    6.038e+01     1.046e+02      8.591e+02
16     3.899e+01    1.505e+00     3.555e+02    5.639e+01     6.226e+01      4.630e+02
32     2.625e+01    1.642e+00     2.172e+02    5.762e+01     4.103e+01      2.902e+02
64     1.910e+01    1.820e+00     1.282e+02    5.736e+01     3.010e+01      1.840e+02

Speed-up of the internal force calculation and the remap

NCPU   Internal Force   Remap   Total
1      1.00             1.00    1.00
2      1.65             1.34    1.30
4      3.06             2.70    2.58
8      5.13             5.95    5.42
16     8.28             11.4    10.1
32     12.3             18.7    16.1
64     16.9             31.7    25.3
Observations
- The remap part, including its communication time, accounts for about 90% of the total elapsed time.
- The communication time of the remap part is 30-40 times larger than that of the Lagrangian part.
- The parallel efficiency of the remap part is nevertheless better than that of the internal force calculation, because the internal force calculation skips void cells, which leaves its load unevenly distributed across processors.
Summary & Future Work
Summary
- A newly developed Lagrangian/Eulerian code has been described, and its parallel procedure has been presented.
- Parallel performance was compared with a commercial code and remains efficient as the number of CPUs increases.
- The remap part is identified as the most influential part for both serial and parallel performance, since it takes over 90% of the total elapsed time.
- This is the first parallel two-step Eulerian code developed in Korea.

Future work
- Multi-material capability
- Second-order accuracy
- Lagrangian-Eulerian interface
Thank You