Post on 26-Mar-2015
IBM Confidential
© 2013 IBM Corporation
Technical Computing for Engineering Analysis in Industry Today
Martin Feyereisen, Ph.D.
Auto/Aero Sales Lead, Technical Computing, IBM
feyer@us.ibm.com
Tel: +01 (715) 410 1276
CAE in Industry: definitions
CAE – Computer Aided Engineering
– Engineering virtual design, prototyping, testing, and manufacturing
Automotive/Aerospace
– Largest revenue in the CAE segment (90%)
Others
– Consumer products
– Packaging
– Universities/Labs
CAE in use: http://www.physicsgames.net/game/Cargo_Bridge.html
CAE in Industry: Uses of CAE in Automotive
brake design
safety
aerodynamics
engine design
CAE in Industry: Uses of CAE in Aerospace
Wing Development
Engine design
Weapons Integration
Landing gear, etc.
CAE in Industry: Automotive Engineering Workloads
Explicit Analysis
– Crash/Impact testing: typically 60% of cycles
– Applications scale well in parallel, typically to 128 cores
– Memory and IO requirements are average
– Important ISV Applications:
• LS-DYNA, PAM-CRASH, RADIOSS, ABAQUS/Explicit
Implicit Analysis / Structural Mechanics
– Structural integrity, vibration analysis, acoustics, etc.: typically 15% of cycles
– Applications typically run in a single server
– Memory and IO requirements are moderate to extreme
– Important ISV Applications:
• Nastran, ABAQUS, ANSYS, Optistruct
Computational Fluid Dynamics
– Aerodynamics, cooling, HVAC, combustion, etc.: typically 25% of cycles
– Applications scale well in parallel, typically to 128 cores
– Memory and IO requirements are average
– Important ISV Applications:
• STAR-CD, STAR-CCM+, FLUENT, PowerFLOW, OpenFOAM
[Chart: share of cycles by workload. Segments: Crash/Explicit, Structures/Implicit, CFD.]
CAE in Industry: Aerospace Engineering Workloads
Explicit Analysis
– Crash/Impact testing: typically 10% of cycles
– Applications scale well in parallel, typically to 64 cores
– Memory and IO requirements are average
– Important ISV Applications:
• LS-DYNA, ABAQUS/Explicit
Implicit Analysis / Structural Mechanics
– Structural integrity, vibration analysis, acoustics, etc.: typically 25% of cycles
– Applications typically run in a single server
– Memory and IO requirements are moderate to extreme
– Important ISV Applications:
• Nastran, ABAQUS, ANSYS
Computational Fluid Dynamics
– Aerodynamics, cooling, HVAC, combustion, etc.: typically 65% of cycles
– Applications scale well in parallel, typically to 256 cores and above
– Memory and IO requirements are average
– Often a significant amount of “in-house” CFD workload
– Important ISV Applications:
• FLUENT, CFX, CFD++
[Chart: share of cycles by workload. Segments: Crash/Explicit, Structures/Implicit, CFD.]
CAE in Industry: Trends in CAE servers
General Purpose
– crash, CFD
– diskless
– 16 core, 64 GB memory
High-End General Purpose
– scalable linear, non-linear, vibro-acoustics
– 2 HDDs: ~1TB
– 16 core, 128 GB memory
Structural Mechanics
– NVH
– 4-16 HDDs: ~3TB
– 8-16 core, 128-256 GB memory
Graphics Servers
– PrePost, Remote Desktop
– 1-2 HDD: ~1TB
– 16 core, 256 GB memory
– 1-2 GPUs
CAE Industry Trends: Customer pain points
While HPC hardware costs are declining, overall HPC workloads are increasing on fixed budgets, putting extreme pressure on all aspects of HPC
– as a result, hardware costs are typically no longer the “throttle point” for HPC capacity
Customer data centers are often near capacity
– which can be particularly troublesome when deploying new systems while current systems are still in production
Many customers are trying to consolidate data centers
– but find poor network bandwidth between users and data centers an obstacle to consolidation
Many customers have gigabit Ethernet based campus networks
– which are not able to keep up with ever-increasing HPC data flow
CAE in Industry: Trends in CAE Cluster Growth
• Clusters continue to grow in size and complexity
• 1985: 1 core (CRAY XMP)
• 1992: 10 cores (CRAY YMP)
• 1998: 100 cores (SGI PowerChallenge)
• 2004: 1,000 cores (POWER4, Itanium, x86)
• 2010: 10,000+ cores (x86_64)
• 2015: 100,000 cores (???)
• How to effectively build, use, and manage such systems?
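The timeline above implies a roughly constant growth rate in cluster size. As a rough sketch, the Python below derives the annual growth factor and the doubling time from the slide's (year, cores) pairs. The function names are illustrative, the 2015 entry is the slide's projection rather than a measurement, and "10,000+" is treated as exactly 10,000.

```python
import math

# (year, cores) pairs copied from the slide. The last entry is a projection,
# and "10,000+" for 2010 is rounded down to 10,000.
CLUSTER_SIZES = [
    (1985, 1),
    (1992, 10),
    (1998, 100),
    (2004, 1_000),
    (2010, 10_000),
    (2015, 100_000),
]


def annual_growth_factor(start, end):
    """Geometric mean growth per year between two (year, cores) points."""
    (year0, cores0), (year1, cores1) = start, end
    if year1 <= year0:
        raise ValueError("end year must be later than start year")
    return (cores1 / cores0) ** (1.0 / (year1 - year0))


def doubling_time_years(growth_factor):
    """Years needed for the core count to double at a given annual factor."""
    return math.log(2.0) / math.log(growth_factor)


if __name__ == "__main__":
    factor = annual_growth_factor(CLUSTER_SIZES[0], CLUSTER_SIZES[4])
    print(f"1985-2010: x{factor:.2f} per year, "
          f"doubling every {doubling_time_years(factor):.1f} years")
```

Using the 1985 and 2010 points keeps the estimate on the slide's historical entries and leaves the 2015 projection out of the fit.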
CAE Infrastructure Challenges
• Servers
‒ Space, energy, and cooling constraints as size of clusters increases
‒ Reliability, manageability, and serviceability of servers
• Network
‒ Cost of high speed networks as size of clusters increases
‒ Connectivity between servers, storage, and users as amount of data flow increases
• Storage
‒ Scalability and performance of file systems as amount of data and number of users increase
‒ Availability of data across multiple sites
CAE servers: Recent Server Performance Gains
Processor “core” power has remained constant
‒ Application performance gains are mainly due to better algorithms and increased parallelism
‒ Turnaround time is still critical for product development cycles
Capacity of servers continues to increase with improved chip technology (gate count)
• Increased ability to exploit gains in application parallel scalability
• Ability to carry out more simulations per day is helpful for computer based optimization
[Figure: CAE performance on a single server*, 2008 to 2014. Series: Time/job, Jobs/day. Vertical axis from 0 to 4; higher is better.]
* - LS-DYNA Neon-Revised BMT (www.topcrunch.org)
CAE servers: Processor speed effect on Application Performance
Intel’s E5 processors are available with several different options
‒ wattage, cores, base clock frequency
‒ turbo mode allows for increased performance, with a complicated set of factors involved
‒ typically CAE users prefer the highest speed cores to make the most effective use of ISV license costs
Benchmarks
• single node CAE benchmarks carried out at different clock speeds and with turbo enabled (*)
• effective frequency determined from a 1200 MHz baseline benchmark
• most applications show nearly linear benefit from clock frequency up to almost 3 GHz
• turbo benefit is substantial
[Figure: CAE Application Performance. Horizontal axis: clock settings 2, 2.2, 2.4, 2.6, 2.6*, 2.9, 2.9* (* = turbo enabled). Vertical axis: effective clock frequency (based on 1.2 GHz), from 1.7 to 3.5. Series: ABAQUS, DYNA, FLUENT.]
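The slide does not spell out how the effective frequency is derived from the 1200 MHz baseline. One plausible reading is that the speedup over the baseline run is scaled by the baseline clock. The sketch below assumes that definition; the function names and the sample timings are hypothetical and are not taken from the benchmarks.

```python
BASELINE_GHZ = 1.2  # baseline clock stated on the slide (1200 MHz)


def effective_frequency_ghz(baseline_seconds, measured_seconds,
                            baseline_ghz=BASELINE_GHZ):
    """Scale the baseline clock by the speedup over the baseline run.

    Assumed definition (not confirmed by the slide):
        f_eff = f_base * t_base / t_measured
    """
    if baseline_seconds <= 0 or measured_seconds <= 0:
        raise ValueError("elapsed times must be positive")
    return baseline_ghz * baseline_seconds / measured_seconds


def scaling_efficiency(effective_ghz, nominal_ghz):
    """Effective frequency as a fraction of the nominal clock setting."""
    return effective_ghz / nominal_ghz


if __name__ == "__main__":
    # Hypothetical timings, for illustration only.
    t_base, t_run, nominal = 1000.0, 480.0, 2.6
    f_eff = effective_frequency_ghz(t_base, t_run)
    print(f"effective {f_eff:.2f} GHz at nominal {nominal} GHz "
          f"({scaling_efficiency(f_eff, nominal):.0%} of linear)")
```

Under this reading, an application that scales perfectly with clock speed has an effective frequency equal to its nominal one, and a value above the nominal setting would reflect the turbo benefit.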
CAE servers: Use of GPUs in CAE
• Growing interest in using GPUs for both remote desktop visualization and computing
• Outlook
‒ lack of broad functionality is a severe concern for many customers who require broad solutions
‒ performance can be excellent, but spotty
‒ currently, dual-processor servers with shared memory parallelism typically outperform GPUs
• Many customers are keenly interested in GPUs as a way to reduce hardware costs, but reluctant to purchase at this time
• Primarily for Structural Analysis
‒ available with limited functionality in MSC.Nastran, ABAQUS and ANSYS Mechanical
‒ general purpose CFD and Explicit not expected any time soon
[Figure: GPU BMTs. Elapsed time (vertical axis from 0 to 1) for V13cg, V13sp1, V13sp2, V13sp3, V13sp4, V13sp5, and cust1. Series: SMP-1, SMP-1+GPU, SMP-8.]
CAE servers: IBM CAE servers
Rack Dense
─ form factor: 19” rack (1-4U)
─ typical deployment: 8-64 nodes
─ typical workloads: simulation, pre/post, storage
─ benefits: maximum flexibility
Flex System
─ form factor: 19” rack (10U)
─ typical deployment: 200-800 nodes
─ typical configuration: simulation
─ benefits: integrated solution, manageability, reliability
NeXtScale System
─ form factor: 19” rack (6U)
─ typical deployment: 1-4 Racks
─ typical configuration: simulation, pre/post
─ benefits: price/performance, GPU capabilities
HPC Networks: High Speed Network Options for CAE
10 Gigabit Ethernet
– typically more expensive for large HPC deployments
– significantly lagging InfiniBand in performance
– non-standard SW stack for HPC applications
InfiniBand
– Fat Tree
• uniform performance
• significant cost and extra hardware
• difficult to expand
– Cluster Islands
• minimal cost and space
• easy to expand
• workflow coordination required
• storage access can be complicated
– Torus
• mostly uniform performance
• uses existing IB hardware for minimal cost and no additional footprint/power
• easy to expand
HPC Networks: Reducing Network Complexity
• Existing HPC network designs
‒ currently most HPC clusters have an Ethernet network for management and administration and an InfiniBand network to handle MPI and IO traffic
• Network benchmarks
‒ use of emerging Ethernet hardware and software technology (such as iWARP and RoCE) should erase the performance delta between Ethernet and InfiniBand
‒ many HPC applications are latency sensitive and run well with higher blocking factors
• Transforming HPC network designs
‒ use of a single Ethernet network for all network traffic to reduce network complexity and cost
‒ improved network topologies with increased blocking factors could significantly reduce network complexity and costs
‒ use of Blades to improve network reliability and reduce network complexity

Qlogic# 256-way FLUENT Benchmarks
Dataset      Max ISL Utilization   Max BW (Gb/s)   Full Bisection Performance*   Oversubscribed Performance*
Eddy_417K    2%                    0.8             12387                         12387 (8:1)
Truck_111m   15%                   6.0             226                           225 (4:1)
* - FLUENT rating (higher is better)
# - IBM iDataPlex dx360M2 with Qlogic TrueScale QDR QLE7340

48-way$ Customer Benchmarks
Benchmark       QDR-IB     10-GE(tcp)   10-GE(iWARP)
FLUENT          24,200s    25,000s      24,200s
STAR-CCM+       26,300s    62,700s      27,000s
ABAQUS_exp_dp   18,400s    D.N.F        19,200s
LS-DYNA         23,700s    59,800s      24,300s
(unlabeled)     21,500s    24,000s      21,300s
$ - IBM iDataPlex dx360M3 with X5670 hex-core processors
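To make the 48-way customer benchmark comparison easier to read, the elapsed times can be expressed as a slowdown relative to QDR InfiniBand. The sketch below does that in Python with the figures from the table. The function and variable names are illustrative, the row without a benchmark label is left out, and D.N.F is treated as a run that did not finish, which is an assumption about the abbreviation.

```python
# Elapsed times in seconds, copied from the 48-way customer benchmark table.
# None marks the entry reported as D.N.F (assumed to mean "did not finish").
ELAPSED_SECONDS = {
    "FLUENT": {"QDR-IB": 24_200, "10-GE(tcp)": 25_000, "10-GE(iWARP)": 24_200},
    "STAR-CCM+": {"QDR-IB": 26_300, "10-GE(tcp)": 62_700, "10-GE(iWARP)": 27_000},
    "ABAQUS_exp_dp": {"QDR-IB": 18_400, "10-GE(tcp)": None, "10-GE(iWARP)": 19_200},
    "LS-DYNA": {"QDR-IB": 23_700, "10-GE(tcp)": 59_800, "10-GE(iWARP)": 24_300},
}


def slowdown_vs_infiniband(times, baseline="QDR-IB"):
    """Return elapsed time on each fabric divided by the baseline time.

    A value of 1.0 means parity with the baseline; None means no result.
    """
    base = times[baseline]
    return {
        fabric: (None if seconds is None else seconds / base)
        for fabric, seconds in times.items()
        if fabric != baseline
    }


if __name__ == "__main__":
    for name, times in ELAPSED_SECONDS.items():
        ratios = slowdown_vs_infiniband(times)
        shown = ", ".join(
            f"{fabric}: {'n/a' if r is None else format(r, '.2f') + 'x'}"
            for fabric, r in ratios.items()
        )
        print(f"{name:14s} {shown}")
```

Read this way, the iWARP column stays within a few percent of InfiniBand for every labeled benchmark, while plain TCP more than doubles the run time for STAR-CCM+ and LS-DYNA.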
System storage: I/O and Storage for CAE
Historically most storage for CAE has been handled with a combination of local direct-attached storage and NFS network storage
Implicit structural mechanics applications (e.g. Nastran) often generate significant IO, which would overwhelm a shared file system and is best handled with servers using direct-attached disk arrays
– Solid State Disks (SSDs) are rapidly replacing spinning disk for local scratch I/O storage
Slow movement to high-performance shared file systems, such as GPFS
– often a viable option in conjunction with diskless/stateless clusters, particularly for CFD
System storage: General CAE I/O Patterns
Typical Structural Mechanics Job
– 1 node
– 10 hours
– 1 GB input, 10 GB output
– Total I/O = 1000 GB
– AVG BW important for sustained scratch I/O
• local disks are best
Typical CFD & Explicit/Impact Job
– 4 nodes
– 10 hours
– 10 GB input; 90 GB output
– Total I/O = 100 GB
– Peak BW important for file copy
• high performance shared file system is best

              Local      GPFS
Peak BW/job   200 MB/s   3500 MB/s
AVG BW/job    200 MB/s   50 MB/s
[Figures: IO rate vs. time for each job type.]
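The two job profiles differ mainly in how much scratch traffic they generate relative to the data that enters and leaves the job. The Python sketch below works through the slide's numbers. It assumes decimal units (1 GB = 1000 MB) and I/O spread evenly over the run, which real jobs do not do; the function names are illustrative.

```python
# Job profiles copied from the slide.
JOBS = {
    "structural mechanics": {
        "hours": 10, "input_gb": 1, "output_gb": 10, "total_io_gb": 1000,
    },
    "CFD / explicit": {
        "hours": 10, "input_gb": 10, "output_gb": 90, "total_io_gb": 100,
    },
}


def average_bandwidth_mb_s(total_io_gb, wall_hours):
    """Average I/O rate in MB/s, assuming I/O is spread evenly over the run."""
    return total_io_gb * 1000.0 / (wall_hours * 3600.0)


def io_amplification(total_io_gb, input_gb, output_gb):
    """Total I/O divided by the data that enters or leaves the job."""
    return total_io_gb / (input_gb + output_gb)


if __name__ == "__main__":
    for name, job in JOBS.items():
        bw = average_bandwidth_mb_s(job["total_io_gb"], job["hours"])
        amp = io_amplification(job["total_io_gb"], job["input_gb"],
                               job["output_gb"])
        print(f"{name}: {bw:.1f} MB/s average, {amp:.1f}x amplification")
```

The structural job moves roughly 90 times more data through scratch than it reads or writes as input and output, which is why sustained local bandwidth matters there. The CFD job only moves its input and output, so the time spent copying files at peak rate dominates.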
System Storage: Use of SSDs for local scratch I/O
• Transformation from SAS hard drives to SSDs
‒ Structural mechanics applications tend to do heavy local I/O
‒ Substructure methods are increasingly popular but randomize data access patterns
‒ SSDs offer the potential to increase application performance and server reliability while decreasing server footprint
• Benchmarks
‒ Altair Optistruct Ver11
‒ AMLS 4.2.r33
• Server A
‒ 2.6GHz E5-2670 SNB, 384GB memory
‒ 8x SAS@10Krpm
• Server B
‒ 2.9GHz E5-2670 SNB, 128 GB memory
‒ 8x SAS@15krpm
‒ 4x 400GB Intel DC S3700 SSDs
• Lanczos modal analysis
‒ 1, 2, 4 jobs at a time
• Frequency response with AMLS
‒ 1, 2, 4 jobs at a time
• Observations
‒ multiple jobs stress I/O as the OS runs out of memory to buffer I/O
‒ lots of memory is still very good
‒ SSDs are particularly useful for large AMLS problems, which tend to have smaller random I/O blocks than typical Lanczos methods
[Figure: Elapsed time (relative to Server-A), vertical axis from 0 to 4.5. Series: A-SAS8, B-SAS8, B-SSD4.]
System Storage: IBM System x GPFS Storage Server (GSS)
GSS replaces hardware controller with software controller
[Diagram: GPFS Storage System. Compute Node, User Space: Client Application, GPFS NSD Client (GPFS). Link labels: Control RPC, Data RDMA. IO Node, User Space: GPFS NSD Server. IO Node, Kernel Space: GPFS Kernel IO Layer, OS Device Driver, HBA Device Driver. Below the IO Node: Disk Array Controller, Disks.]
[Diagram: GSS Storage System. Compute Node, User Space: Client Application, GPFS NSD Client (GPFS). Link labels: Control RPC, Data RDMA. IO Node, User Space: GPFS NSD Server. IO Node, Kernel Space: GPFS Kernel IO Layer, GPFS Vdisk (PERSEUS), OS Device Driver, HBA Device Driver. Below the IO Node: Disks.]
Embedded within Network Shared Disk (NSD) layer of GPFS
Utilizes generic servers with direct-attach SBOD disks
Scalable from small systems to large supercomputers (10 - 100,000 disks)
GSS 24, GSS 26
System Storage: GSS Declustering for low rebuild overhead
Traditional RAID: Narrow data+parity arrays
– 20 disks, 5 disks per traditional RAID array
– 4x4 RAID stripes (data plus parity)
– Rebuild uses the IO capacity of only the array’s 4 (surviving) disks
– Striping across all arrays, all file accesses are throttled by array 2’s rebuild overhead.
Declustered RAID: Data+parity distributed over all disks
– 20 disks / 1 declustered array
– 16 RAID stripes (data plus parity)
– Rebuild uses the IO capacity of the array’s 19 (surviving) disks
– Load on file accesses is reduced by 4.8x (=19/4) during array rebuild.
[Diagram: disk layouts for traditional and declustered RAID, each with one Failed Disk marked.]
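The 4.8x figure follows from counting how many surviving disks share the rebuild work in each layout. A minimal sketch of that arithmetic in Python, using the slide's 20-disk example and assuming a single failed disk; the function names are illustrative.

```python
def surviving_disks(disks_in_array):
    """Disks left to serve rebuild I/O after one disk in the array fails."""
    if disks_in_array < 2:
        raise ValueError("an array needs at least two disks to rebuild")
    return disks_in_array - 1


def rebuild_load_reduction(total_disks, disks_per_traditional_array):
    """Ratio of disks sharing the rebuild: declustered vs. traditional.

    Traditional RAID rebuilds within one narrow array; declustered RAID
    spreads the same rebuild over every surviving disk in the pool.
    """
    if total_disks < disks_per_traditional_array:
        raise ValueError("pool cannot be smaller than one array")
    declustered = surviving_disks(total_disks)
    traditional = surviving_disks(disks_per_traditional_array)
    return declustered / traditional


if __name__ == "__main__":
    ratio = rebuild_load_reduction(total_disks=20,
                                   disks_per_traditional_array=5)
    print(f"rebuild load per disk reduced by {ratio:.2f}x")
```

With 20 disks and 5-disk arrays the ratio is 19/4, which the slide rounds to 4.8x.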
System Storage: IBM SONAS
Designed as a “Simple” Appliance or a Gateway
XIV-like GUI
Scale Out: predictable, linear performance
Scale Up: manage many PBs with one administrator
Reduces Capital Expenditures: # of filers, software licenses, maintenance, power/cooling
Improves capacity utilization
Transformation of CAE infrastructure from “Clusters” to “Compute Systems”
Traditional Clusters
‒ Focused separately on optimized servers, network and storage
‒ Difficulty with upgrading
• Rip out of existing infrastructure often required for expansion
• Inability to access growing data between sub-clusters
Compute systems
‒ Combination of server, network, and storage designed to maximize reliability with minimal cost and footprint
‒ Sizing
• Size dictated by extendibility of high-speed network and shared file system used for working storage
• Must be large enough to accommodate largest jobs
• Sized appropriately to hardware acquisition patterns
[Diagram: Traditional CAE Infrastructure (compute servers, network backbone, centralized storage) compared with a Data Center of compute systems, each combining servers, network, and storage.]
CAE in Industry: Cloud Computing
Customers are looking to Cloud Computing as a way to control costs
– both public and private clouds
Existing public clouds typically lack HPC hardware suitable for CAE
Ambiguity on use of Cloud Computing to avoid direct ISV licensing costs
– current license models require customers to provide application licenses
While interest in Cloud is high, adoption has been slow
[Diagram labels: Private or Private Hosted Cloud; Mechanical CAD/CAE Cloud.]
CAE in Industry: Trends in Data Centers
Consolidation
‒ focus on optimizing IT capabilities with a fixed budget
‒ growing requirement for 3D remote visualization to handle remote engineering
‒ issues with software licensing for use in different geographies
‒ Addressed with: GPFS-AFM, SoNAS ACE
Disaster Recovery
‒ multiple sites, with increased attention to spill-over of HPC and enterprise workloads onto a common hardware platform to minimize total costs
‒ Addressed with: GPFS-AFM, SoNAS ACE
Cloud
‒ not a quantum step for most companies, but a gradual change of best practices with increased use and acceptance of more cloud based software
‒ requirement for development of a strong internal cloud framework to allow efficient bursting to public clouds with minimal disruption to the line of business
‒ Addressed with: Platform HPC Cloud products
CAE in Industry: IBM Application Ready Solutions for CAE
Easy to deploy and manage solutions, optimized for price/performance
Improves productivity, reduces training requirements
Based on reference architectures from industry experts
Leading-edge technology to ensure optimal performance for your budget, comprising
– IBM Systems: IBM Flex System™, IBM NeXtScale™, IBM System Storage®
– IBM cluster, workload and file management: IBM Platform™ HPC, IBM Platform LSF™; IBM GPFS™ (opt.)
– ISV Applications (sold separately)
Small, medium, large cluster configurations for
– ANSYS: CFD and structural mechanics
– Abaqus: FEA solver, pre- and post-processing
– MSC Software: MSC Nastran, MSC Patran, MSC SimManager
– Remote 3D Visualization
[Documents shown: Abaqus Solution Brief, Abaqus Ref. Arch., ANSYS Ref. Arch., MSC Ref. Arch.]
CAE in Industry: Transformation of Engineering Design Mechanism
Acoustic Response Optimization
─ Objective: Minimize maximum acoustic pressure at driver’s ear
Problem Characteristics
─ Structure DOFs: 11.5M
─ Acoustic DOFs: 1.2M
─ # Str. Modes: 2950
─ # Fluid Modes: 99
─ # Exc. Freq: 200
─ # Loads: 4
─ # Design variables: 190 (Panel thickness +/- 20%)
Optimization Run
─ IBM x3650M3
• MSC.Nastran V2010, CDH/AMLS, FastFRS, (MIO, libIBM)
─ 11 optimization steps, 93 hrs; 115TB of IO
─ improvement of 7.1 dB in maximum pressure
Outlook
─ increased complexity of CAE models is reducing the ability of engineers to intuitively steer the design process
─ computer based design optimization will increasingly drive product design
─ data center design for computer based design optimization will look different than one designed for engineer based optimization
• focus shifts from single job performance to overall throughput
[Figure: Pressure (dBA) vs. Frequency (Hz), 0 to 200 Hz. Series: S1003_70001951_opt, S1003_70003932_opt. Annotation: 7.1 dB.]
Mladen Chargin, CDH AG, Advanced Engineering, Am Marktplatz 6, 79336 Herbolzheim, Germany
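The optimization run's totals (11 steps, 93 hours, 115 TB of I/O) translate into a sustained I/O rate that helps explain the shift in focus to overall throughput. The Python sketch below does the arithmetic. It assumes decimal units (1 TB = 1,000,000 MB) and an even spread of I/O over the run; the function names are illustrative.

```python
# Totals copied from the slide's optimization run.
OPTIMIZATION_STEPS = 11
WALL_HOURS = 93
TOTAL_IO_TB = 115


def sustained_io_mb_s(total_io_tb, wall_hours):
    """Average I/O rate in MB/s over the whole run (decimal units)."""
    return total_io_tb * 1e6 / (wall_hours * 3600.0)


def per_step(total, steps):
    """Even share of a run total attributed to each optimization step."""
    return total / steps


if __name__ == "__main__":
    rate = sustained_io_mb_s(TOTAL_IO_TB, WALL_HOURS)
    hours = per_step(WALL_HOURS, OPTIMIZATION_STEPS)
    io_tb = per_step(TOTAL_IO_TB, OPTIMIZATION_STEPS)
    print(f"sustained I/O: {rate:.0f} MB/s")
    print(f"per step: {hours:.1f} hours, {io_tb:.1f} TB of I/O")
```

Averaged over the run, the server sustains several hundred MB/s of I/O for nearly four days, which is the kind of load that favors fast local scratch storage.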
CAE in Industry: Delivering HPC Infrastructure for Energy
Customer Business Problem/Pain
• Wind forecasting around the world, using (primarily) the WRF application
• Analysis of wind data archived for locations around the world
• Mechanical design of huge turbine blades, using CFD (computational fluid dynamics) software, primarily STAR-CD
Value Proposition of the IBM Solution
• Deploy large compute cluster
• Demonstrate high efficiency through LINPACK testing
• Deploy large storage system, based on GPFS
• Demonstrate performance with throughput test
• Smooth installation and setup, thanks to our business partner Gridcore
• Ability to perform data analysis, from the IBM BigInsights team
IBM Solution
• IBM System x iDataPlex dx360M3
‒ 1,306 nodes
‒ 2:1 4X QDR InfiniBand Network
• IBM GPFS file System
‒ 2.8 PB Capacity
‒ 20+ GB/s I/O Bandwidth
• IBM InfoSphere® BigInsights Enterprise Edition
CAE in Industry: Summary
Hardware
‒ continued reliance on Intel x86_64 multi-core processors
‒ interest in GPUs for computing
‒ increased use of SSDs
Servers
‒ trade-offs between application specific servers and general purpose servers to maximize utilization
‒ transition from engineering workstations to VDI
Clusters
‒ pressure on network performance and cost as size of cluster grows
‒ increased demands for shared file systems
‒ Application Ready Solutions for simplification, improved productivity
Data Centers
‒ economics of data center consolidation
‒ desire for failover capability
‒ remote access performance and security