High‐Performance Computing for Mechanical Simulations using ANSYS
Jeff Beisheim, ANSYS, Inc.
High Performance Computing (HPC) at ANSYS:
An ongoing effort designed to remove computing limitations from engineers who use computer aided engineering in all phases of design, analysis, and testing.
It is a hardware and software initiative!
HPC Defined
Need for Speed
• Impact product design
• Enable large models
• Allow parametric studies

Modal · Nonlinear · Multiphysics · Dynamics
Assemblies · CAD-to-mesh · Capture fidelity
A History of HPC Performance
1980s ► Vector processing on mainframes
1990 ► Shared Memory Multiprocessing (SMP) available
1994 ► Iterative PCG solver introduced for large analyses
1999-2000 ► 64-bit large memory addressing
2004 ► 1st company to solve 100M structural DOF
2005-2007 ► Distributed PCG solver ► Distributed ANSYS (DMP) released ► Distributed sparse solver ► Variational Technology ► Support for clusters using Windows HPC
2007-2009 ► Optimized for multicore processors ► Teraflop performance at 512 cores
2010 ► GPU acceleration (single GPU; SMP)
2012 ► GPU acceleration (multiple GPUs; DMP)
HPC Revolution
• Recent advancements have revolutionized the computational speed available on the desktop
– Multi-core processors (every core is really an independent processor)
– Large amounts of RAM and SSDs
– GPUs
Parallel Processing – Hardware
• 2 types of memory systems
– Shared memory parallel (SMP): single box, workstation/server
– Distributed memory parallel (DMP): multiple boxes, cluster
Parallel Processing – Software
• 2 types of parallel processing for Mechanical APDL
– Shared memory parallel (-np > 1)
• First available in v4.3
• Can only be used on a single machine
– Distributed memory parallel (-dis -np > 1)
• First available in v6.0 with the DDS solver
• Can be used on a single machine or a cluster
• GPU acceleration (-acc)
• First available in v13.0 using NVIDIA GPUs
• Supports either a single GPU or multiple GPUs
• Can be used on a single machine or a cluster
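As a concrete illustration of the launch options above, here is a minimal Python sketch that assembles the corresponding command-line argument list. The executable name (`ansys145`) and the `-b` batch flag are illustrative assumptions; the `-np`, `-dis`, and `-acc` flags are the ones named on this slide.

```python
# Sketch: building a Mechanical APDL launch command from the parallel options
# above. The executable name and "-b" (batch) are illustrative assumptions;
# -np, -dis, and -acc are the flags described in the slides.

def mapdl_args(cores=1, distributed=False, gpu=False, exe="ansys145"):
    """Assemble an argument list for an MAPDL batch run."""
    args = [exe, "-b"]               # assumed batch-mode invocation
    if distributed:
        args.append("-dis")          # distributed memory parallel (DMP)
    args += ["-np", str(cores)]      # number of cores (SMP or DMP)
    if gpu:
        args.append("-acc")          # GPU acceleration
    return args

print(mapdl_args(cores=4))                                # SMP, 4 cores
print(mapdl_args(cores=8, distributed=True, gpu=True))    # DMP + GPU, 8 cores
```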
Distributed ANSYS Design Requirements
• No limitation in simulation capability
– Must support all features
– Continually working to add more functionality with each release
• Reproducible and consistent results
– Same answers achieved using 1 core or 100 cores
– Same quality checks and testing are done as with the SMP version
– Uses the same code base as the SMP version of ANSYS
• Support all major platforms
– Most widely used processors, operating systems, and interconnects
– Supports the same platforms that the SMP version supports
– Uses the latest versions of MPI software, which support the latest interconnects
Distributed ANSYS Design
• Distributed steps (-dis -np N)
– At the start of the first load step, decompose the FEA model into N pieces (domains)
– Each domain goes to a different core to be solved
– The solution is not independent!
• Lots of communication is required to achieve the solution
• Lots of synchronization is required to keep all processes together
– Each process writes its own set of files (file0*, file1*, file2*, …, file[N-1]*)
– Results are automatically combined at the end of solution
• Facilitates post-processing in /POST1, /POST26, or Workbench
Distributed ANSYS Capabilities
• Static linear or nonlinear analyses
• Buckling analyses
• Modal analyses
• Harmonic response analyses using the FULL method
• Transient response analyses using the FULL method
• Single-field structural and thermal analyses
• Low-frequency electromagnetic analyses
• High-frequency electromagnetic analyses
• Coupled-field analyses
• All widely used element types and materials
• Superelements (use pass)
• NLGEOM, SOLC, LNSRCH, AUTOTS, IC, INISTATE, …
• Linear perturbation
• Multiframe restarts
• Cyclic symmetry analyses
• User programmable features (UPFs)

A wide variety of features and analysis capabilities are supported.
Distributed ANSYS Equation Solvers
• Sparse direct solver (default)
– Supports SMP, DMP, and GPU acceleration
– Can handle all analysis types and options
– Foundation for the Block Lanczos, Unsymmetric, Damped, and QR damped eigensolvers
• PCG iterative solver
– Supports SMP, DMP, and GPU acceleration
– Symmetric, real-valued matrices only (i.e., static/full transient)
– Foundation for the PCG Lanczos eigensolver
• JCG/ICCG iterative solvers
– Support SMP only
Distributed ANSYS Eigensolvers
• Block Lanczos eigensolver (including QRdamp)
– Supports SMP and GPU acceleration
• PCG Lanczos eigensolver
– Supports SMP, DMP, and GPU acceleration
– Great for large models (>5 MDOF) with relatively few modes (<50)
• Supernode eigensolver
– Supports SMP only
– Optimal choice when requesting hundreds or thousands of modes
• Subspace eigensolver
– Supports SMP, DMP, and GPU acceleration
– Currently only supports buckling analyses; “beta” for modal in R14.5
• Unsymmetric/Damped eigensolvers
– Support SMP, DMP, and GPU acceleration
Distributed ANSYS Benefits
• Better architecture
– More computations performed in parallel → faster solution time
• Better speedups than SMP
– Can achieve >10x on 16 cores (try getting that with SMP!)
– Can be used for jobs running on 1000+ CPU cores
• Can take advantage of resources on multiple machines
– Memory usage and bandwidth scale
– Disk (I/O) usage scales
– A whole new class of problems can be solved!
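The ">10x on 16 cores" claim can be put in perspective with Amdahl's law, which bounds speedup by the fraction of the run that parallelizes. A small sketch (the parallel fractions below are illustrative, not measured ANSYS values):

```python
# Sketch: Amdahl's law, speedup = 1 / ((1 - p) + p / n), where p is the
# parallel fraction of the run and n is the core count.

def amdahl_speedup(p, n):
    return 1.0 / ((1.0 - p) + p / n)

# Exceeding 10x on 16 cores requires a very high parallel fraction:
print(round(amdahl_speedup(0.95, 16), 2))  # ~9.14x, falls short of 10x
print(round(amdahl_speedup(0.97, 16), 2))  # ~11.03x, clears 10x
```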
Distributed ANSYS Performance
• Need fast interconnects to feed fast processors
– Two main characteristics for each interconnect: latency and bandwidth
– Distributed ANSYS is highly bandwidth bound
+--------- D I S T R I B U T E D A N S Y S S T A T I S T I C S ------------+
Release: 14.5 Build: UP20120802 Platform: LINUX x64 Date Run: 08/09/2012 Time: 23:07
Processor Model: Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz
Total number of cores available    : 32
Number of physical cores available : 32
Number of cores requested          : 4 (Distributed Memory Parallel)
MPI Type: INTELMPI

Core   Machine Name   Working Directory
----------------------------------------
 0     hpclnxsmc00    /data1/ansyswork
 1     hpclnxsmc00    /data1/ansyswork
 2     hpclnxsmc01    /data1/ansyswork
 3     hpclnxsmc01    /data1/ansyswork

Latency time from master to core 1 = 1.171 microseconds
Latency time from master to core 2 = 2.251 microseconds
Latency time from master to core 3 = 2.225 microseconds

Communication speed from master to core 1 = 7934.49 MB/sec  (same machine)
Communication speed from master to core 2 = 3011.09 MB/sec  (QDR InfiniBand)
Communication speed from master to core 3 = 3235.00 MB/sec  (QDR InfiniBand)
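A first-order model makes the latency/bandwidth distinction concrete: transferring s bytes costs roughly latency + s/bandwidth. A sketch using the figures reported above (a simplified model, not ANSYS's internal accounting):

```python
# Sketch: first-order interconnect cost model, t = latency + size/bandwidth,
# using the latency and communication-speed figures from the listing above.

def transfer_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    return latency_s + size_bytes / bandwidth_bytes_per_s

LAT = 2.25e-6    # ~2.25 microseconds, master to a core on the 2nd machine
BW  = 3011e6     # ~3011 MB/s over QDR InfiniBand

# A tiny 8-byte message is almost pure latency; an 80 MB block is almost
# pure bandwidth, which is why the slides call DMP "highly bandwidth bound".
print(transfer_time(8, LAT, BW))      # ~2.25e-6 s (latency dominated)
print(transfer_time(80e6, LAT, BW))   # ~0.0266 s (bandwidth dominated)
```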
Distributed ANSYS Performance
[Chart: Interconnect Performance: rating (runs/day, 0 to 60) at 8, 16, 32, 64, and 128 cores; Gigabit Ethernet vs. DDR InfiniBand]
• Need fast interconnects to feed fast processors
• Turbine model
• 2.1 million DOF
• SOLID187 elements
• Nonlinear static analysis
• Sparse solver (DMP)
• Linux cluster (8 cores per node)
Distributed ANSYS Performance
• Need fast hard drives to feed fast processors
– Check the bandwidth specs
– ANSYS Mechanical can be highly I/O bandwidth bound
• The sparse solver in out-of-core memory mode does lots of I/O
– Distributed ANSYS can be highly I/O latency bound
• Seek time to read/write each set of files causes overhead
– Consider SSDs: high bandwidth and extremely low seek times
– Consider RAID configurations:
• RAID 0 – for speed
• RAID 1, 5 – for redundancy
• RAID 10 – for speed and redundancy
Distributed ANSYS Performance
Beisheim, J.R., “Boosting Memory Capacity with SSDs”, ANSYS Advantage Magazine, Volume IV, Issue 1, pp. 37, 2010.
• Need fast hard drives to feed fast processors
[Chart: Hard Drive Performance: rating (runs/day, 0 to 30) at 1, 2, 4, and 8 cores; HDD vs. SSD]
• 8 million DOF
• Linear static analysis
• Sparse solver (DMP)
• Dell T5500 workstation (12 Intel Xeon X5675 cores, 48 GB RAM, single 7.2k rpm HDD, single SSD, Win7)
• Avoid waiting for I/O to complete!
• Check to see if the job is I/O bound or compute bound
– Check the output file for CPU and elapsed times
– When elapsed time >> main thread CPU time → I/O bound
• Consider adding more RAM or a faster hard drive configuration
– When elapsed time ≈ main thread CPU time → compute bound
• Consider moving the simulation to a machine with faster processors
• Consider using Distributed ANSYS (DMP) instead of SMP
• Consider running on more cores or possibly using GPU(s)
Distributed ANSYS Performance
Total CPU time for main thread : 167.8 seconds
. . . . . .
Elapsed Time (sec) = 388.000    Date = 08/21/2012
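Applying the rule of thumb to the sample output above, a small sketch (the 2x threshold is an illustrative choice, not an ANSYS-documented cutoff):

```python
# Sketch: classifying a run as I/O bound or compute bound from the main
# thread CPU time and the elapsed (wall-clock) time, per the rule above.
# The 2x ratio is an assumed, illustrative threshold.

def bound_by(cpu_time_s, elapsed_s, ratio=2.0):
    return "I/O bound" if elapsed_s > ratio * cpu_time_s else "compute bound"

print(bound_by(167.8, 388.0))   # the sample run above: elapsed >> CPU
print(bound_by(360.0, 388.0))   # elapsed ~ CPU: compute bound
```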
All runs with the Sparse solver
Hardware 12.0: dual X5460 (3.16 GHz Harpertown Intel Xeon), 64 GB RAM per node
Hardware 12.1 + 13.0: dual X5570 (2.93 GHz Nehalem Intel Xeon), 72 GB RAM per node
ANSYS 12.0 to 14.0 runs with DDR InfiniBand® interconnect
ANSYS 14.0 creep runs with NROPT,CRPL + DDOPT,METIS

Distributed ANSYS Performance
Columns: ANSYS 11.0 / 12.0 / 12.1 / 13.0 SP2 / 14.0 (rows with six entries include an additional GPU-accelerated run)

Thermal (full model), 3 MDOF
  Time:  4 hours / 4 hours / 4 hours / 4 hours / 1 hour / 0.8 hour
  Cores: 8 / 8 / 8 / 8 / 8 + 1 GPU / 32

Thermomechanical simulation (full model), 7.8 MDOF
  Time:       ~5.5 days / 34.3 hours / 12.5 hours / 9.9 hours / 7.5 hours
  Iterations: 163 / 164 / 195 / 195 / 195
  Cores:      8 / 20 / 64 / 64 / 128

Interpolation of boundary conditions
  Time:       37 hours / 37 hours / 37 hours / 0.2 hour / 0.2 hour
  Load steps: 16 / 16 / 16 / improved algorithm / 16

Submodel: creep-strain analysis, 5.5 MDOF
  Time:       ~5.5 days / 38.5 hours / 8.5 hours / 6.1 hours / 5.9 hours / 4.2 hours
  Iterations: 492 / 492 / 492 / 488 / 498 / 498
  Cores:      18 / 16 / 76 / 128 / 64 + 8 GPU / 256

Total time: 2 weeks / 5 days / 2 days / 1 day / 0.5 day

Results courtesy of MicroConsult Engineering, GmbH
Distributed ANSYS Performance
[Chart: Solution Scalability: speedup (0 to 25) vs. number of cores (0 to 64)]
• Minimum time to solution is more important than scaling
• Turbine model
• 2.1 million DOF
• Nonlinear static analysis
• 1 load step, 7 substeps, 25 equilibrium iterations
• Linux cluster (8 cores per node)
Distributed ANSYS Performance
[Chart: Solution Scalability: solution elapsed time vs. number of cores (0 to 64); annotated times: 11 hrs, 48 mins; 1 hr, 20 mins; 30 mins]
• Minimum time to solution is more important than scaling
• Turbine model
• 2.1 million DOF
• Nonlinear static analysis
• 1 load step, 7 substeps, 25 equilibrium iterations
• Linux cluster (8 cores per node)
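Converting the elapsed times annotated on the chart into speedups shows the gains behind the scaling curve:

```python
# Sketch: speedups implied by the elapsed times annotated on the chart,
# taking the 11 hrs, 48 mins run as the baseline.

def speedup(base_seconds, run_seconds):
    return base_seconds / run_seconds

base = 11 * 3600 + 48 * 60                 # 11 hrs, 48 mins in seconds
print(f"{speedup(base, 80 * 60):.1f}x")    # vs 1 hr, 20 mins: ~8.8x
print(f"{speedup(base, 30 * 60):.1f}x")    # vs 30 mins: ~23.6x
```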
• Graphics processing units (GPUs)
– Widely used for gaming and graphics rendering
– Recently made available as general-purpose “accelerators”
• Support for double-precision computations
• Performance exceeding the latest multicore CPUs

So how can ANSYS make use of this new technology to reduce the overall time to solution?
GPU Accelerator Capability
• “Accelerate” sparse direct solver (SMP & DMP)
– The GPU is used to factor many dense “frontal” matrices
– The decision on when to send data to the GPU is made automatically
• Frontal matrix too small → too much overhead; stays on the CPU
• Frontal matrix too large → exceeds GPU memory; only partially accelerated
• “Accelerate” PCG/JCG iterative solvers (SMP & DMP)
– The GPU is only used for the sparse matrix-vector multiply (SpMV kernel)
– The decision on when to send data to the GPU is made automatically
• Model too small → too much overhead; stays on the CPU
• Model too large → exceeds GPU memory; only partially accelerated
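For reference, the SpMV kernel mentioned above is just a row-by-row product over a sparse matrix. A plain-Python sketch over CSR (compressed sparse row) storage; GPU implementations compute many of these row products concurrently:

```python
# Sketch: the sparse matrix-vector multiply (SpMV) kernel that the PCG
# solver offloads to the GPU, over a CSR-format matrix.

def spmv_csr(values, col_idx, row_ptr, x):
    """y = A @ x for A stored in compressed sparse row (CSR) form."""
    n = len(row_ptr) - 1
    y = [0.0] * n
    for i in range(n):                          # one output entry per row
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y

# A = [[4, 1, 0],
#      [1, 3, 0],
#      [0, 0, 2]]   (symmetric, as PCG requires)
values  = [4.0, 1.0, 1.0, 3.0, 2.0]
col_idx = [0, 1, 0, 1, 2]
row_ptr = [0, 2, 4, 5]
print(spmv_csr(values, col_idx, row_ptr, [1.0, 1.0, 1.0]))  # [5.0, 4.0, 2.0]
```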
GPU Accelerator Capability
• Supported hardware
– Currently supports NVIDIA Tesla 20-series, Quadro 6000, and Quadro K5000 cards
– Next-generation NVIDIA Tesla cards (Kepler) should work with R14.5
– Installing a GPU requires the following:
• Larger power supply (a single card needs ~250 W)
• Open 2x form factor PCIe x16 2.0 (or 3.0) slot
• Supported platforms
– Windows and Linux 64-bit platforms only
• Does not include the Linux Itanium (IA-64) platform
GPU Accelerator Capability
• Targeted hardware

                            Tesla     Tesla     Quadro    Quadro    Tesla     Tesla
                            C2075     M2090     6000      K5000†    K10       K20†
Power (W)                   225       250       225       122       250       250
Memory                      6 GB      6 GB      6 GB      4 GB      8 GB      6 to 24 GB
Memory Bandwidth (GB/s)     144       177.4     144       173       320       288
Peak Speed SP/DP (GFlops)   1030/515  1331/665  1030/515  2290/95   4577/190  5184/1728
† These NVIDIA “Kepler” based products are not released yet, so specifications may be incorrect
GPU Accelerator Capability
GPU Accelerator Capability
[Chart: GPU Performance: relative speedup of 1.0x at 2 cores (no GPU), 2.6x at 8 cores (no GPU), and 3.8x at 8 cores (1 GPU)]
• 6.5 million DOF
• Linear static analysis
• Sparse solver (DMP)
• 2 Intel Xeon E5-2670 (2.6 GHz, 16 cores total), 128 GB RAM, SSD, 4 Tesla C2075, Win7
• GPUs can offer significantly faster time to solution
GPU Accelerator Capability
• GPUs can offer significantly faster time to solution
[Chart: GPU Performance: relative speedup of 1.0x at 2 cores (no GPU), 2.7x at 8 cores (1 GPU), and 5.2x at 16 cores (4 GPUs)]
• 11.8 million DOF
• Linear static analysis
• PCG solver (DMP)
• 2 Intel Xeon E5-2670 (2.6 GHz, 16 cores total), 128 GB RAM, SSD, 4 Tesla C2075, Win7
• Supports the majority of ANSYS users
– Covers both the sparse direct and PCG iterative solvers
– Only a few minor limitations
• Ease of use
– Requires at least one supported GPU card to be installed
– Requires at least one HPC Pack license
– No rebuild, no additional installation steps
• Performance
– ~10-25% reduction in time to solution when using 8 CPU cores
– Should never slow down your simulation!
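To translate that performance figure: an r-fraction reduction in time to solution is equivalent to a speedup factor of 1/(1 - r), since new_time = old_time * (1 - r). A quick sketch:

```python
# Sketch: converting "% reduction in time to solution" into an equivalent
# speedup factor, to compare with the per-solver speedup charts above.

def reduction_to_speedup(r):
    return 1.0 / (1.0 - r)

print(round(reduction_to_speedup(0.10), 2))  # 10% faster -> 1.11x
print(round(reduction_to_speedup(0.25), 2))  # 25% faster -> 1.33x
```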
GPU Accelerator Capability
• How will you use all of this computing power?
Design optimization studies:
• Higher fidelity
• Full assemblies
• More nonlinear
• ANSYS HPC Packs enable high-fidelity insight
– Each simulation consumes one or more packs
– Parallel capability enabled increases quickly with added packs
• Single solution for all physics and any level of fidelity
• Flexibility as your HPC resources grow
– Reallocate packs as resources allow
Packs per Simulation       1        2        3         4         5
Parallel Enabled (Cores)   8        32       128       512       2048
                           + 1 GPU  + 4 GPU  + 16 GPU  + 64 GPU  + 256 GPU
• Scalable, like ANSYS HPC Packs
– Enhances the customer's ability to include many design points as part of a single study
– Ensures sound product decision making
• Amplifies the complete workflow
– Design points can include execution of multiple products (pre, solve, HPC, post)
– Packaged to encourage adoption of the path to robust design!
Number of HPC Parametric Pack Licenses         1    2    3    4    5
Number of Simultaneous Design Points Enabled   4    8    16   32   64
HPC Parametric Pack Licensing
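The progressions in the two licensing tables (8/32/128/512/2048 cores for 1-5 HPC Packs; 4/8/16/32/64 simultaneous design points for 1-5 Parametric Packs) follow simple geometric growth. A sketch of that observed pattern (an observation from the tables, not an official ANSYS formula):

```python
# Sketch: the growth patterns observed in the licensing tables above.
# HPC Pack cores quadruple per added pack; Parametric Pack design points
# double per added pack. These closed forms are inferred, not official.

def hpc_pack_cores(packs):
    return 8 * 4 ** (packs - 1)        # 8, 32, 128, 512, 2048

def parametric_design_points(packs):
    return 4 * 2 ** (packs - 1)        # 4, 8, 16, 32, 64

print([hpc_pack_cores(p) for p in range(1, 6)])
print([parametric_design_points(p) for p in range(1, 6)])
```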
HPC Revolution
The right combination of algorithms and hardware leads to maximum efficiency:
• SMP vs. DMP
• HDD vs. SSD
• Interconnects
• Clusters
• GPUs
HPC Revolution
Every computer today is a parallel computer
Every simulation in ANSYS can benefit from parallel processing