ibm confidential © 2013 ibm corporation 1 technical computing for engineering analysis in industry...

IBM Confidential

Technical Computing for Engineering Analysis in Industry Today

Martin Feyereisen, Ph.D.
Auto/Aero Sales Lead
Technical Computing
IBM

feyer@us.ibm.com
Tel: +01 (715) 410 1276


CAE – Computer Aided Engineering
– Engineering virtual design, prototyping, testing, and manufacturing

Automotive/Aerospace
– Largest revenue in the CAE segment (90%)

Others
– Consumer products
– Packaging
– Universities/Labs

CAE in use: http://www.physicsgames.net/game/Cargo_Bridge.html

CAE in Industry: definitions


brake design

safety

aerodynamics

engine design

CAE in Industry: Uses of CAE in Automotive


Wing Development

Engine design

Weapons Integration

Landing gear, etc.

CAE in Industry: Uses of CAE in Aerospace


Explicit Analysis
– Crash/Impact testing: typically 60% of cycles
– Applications scale well in parallel, typically to 128 cores
– Memory and IO requirements are average
– Important ISV Applications:
• LS-DYNA, PAM-CRASH, RADIOSS, ABAQUS/Explicit

Implicit Analysis / Structural Mechanics
– Structural integrity, vibration analysis, acoustics, etc.: typically 15% of cycles
– Applications typically run in a single server
– Memory and IO requirements are moderate to extreme
– Important ISV Applications:
• Nastran, ABAQUS, ANSYS, Optistruct

Computational Fluid Dynamics
– Aerodynamics, cooling, HVAC, combustion, etc.: typically 25% of cycles
– Applications scale well in parallel, typically to 128 cores
– Memory and IO requirements are average
– Important ISV Applications:
• STAR-CD, STAR-CCM+, FLUENT, PowerFLOW, OpenFOAM

[Chart legend: Crash/Explicit, Structures/Implicit, CFD]

CAE in Industry: Automotive Engineering Workloads


Explicit Analysis
– Crash/Impact testing: typically 10% of cycles
– Applications scale well in parallel, typically to 64 cores
– Memory and IO requirements are average
– Important ISV Applications:
• LS-DYNA, ABAQUS/Explicit

Implicit Analysis / Structural Mechanics
– Structural integrity, vibration analysis, acoustics, etc.: typically 25% of cycles
– Applications typically run in a single server
– Memory and IO requirements are moderate to extreme
– Important ISV Applications:
• Nastran, ABAQUS, ANSYS

Computational Fluid Dynamics
– Aerodynamics, cooling, HVAC, combustion, etc.: typically 65% of cycles
– Applications scale well in parallel, typically to 256 cores and above
– Memory and IO requirements are average
– Often a significant amount of “in-house” CFD workload
– Important ISV Applications:
• FLUENT, CFX, CFD++

[Chart legend: Crash/Explicit, Structures/Implicit, CFD]

CAE in Industry: Aerospace Engineering Workloads


General Purpose
– crash, CFD
– diskless
– 16 cores, 64 GB memory

High-End General Purpose
– scalable linear, non-linear, vibro-acoustics
– 2 HDDs: ~1 TB
– 16 cores, 128 GB memory

Structural Mechanics
– NVH
– 4-16 HDDs: ~3 TB
– 8-16 cores, 128-256 GB memory

Graphics Servers
– PrePost, Remote Desktop
– 1-2 HDDs: ~1 TB
– 16 cores, 256 GB memory
– 1-2 GPUs

CAE in Industry: Trends in CAE servers


CAE Industry Trends: Customer pain points

While HPC hardware costs are declining, overall HPC workloads are increasing on fixed budgets, putting extreme pressure on all aspects of HPC
– as a result, hardware costs are typically no longer the “throttle point” for HPC capacity

Customer data centers are often near capacity
– which can be particularly troublesome when deploying new systems while current systems are still in production

Many customers are trying to consolidate data centers
– but are finding poor network bandwidth between users and data centers an obstacle to consolidation

Many customers have gigabit Ethernet based campus networks
– which are not able to keep up with ever-increasing HPC dataflow


• Clusters continue to grow in size and complexity

• 1985: 1 core (CRAY XMP)
• 1992: 10 cores (CRAY YMP)
• 1998: 100 cores (SGI PowerChallenge)
• 2004: 1,000 cores (POWER4, Itanium, x86)
• 2010: 10,000+ cores (x86_64)
• 2015: 100,000 cores (???)

• How to effectively build, use, and manage such systems?

CAE in Industry: Trends in CAE Cluster Growth
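
The growth rate implied by the timeline can be estimated with a small least-squares fit. The following is a minimal sketch, not part of the original deck: it treats "10,000+" as exactly 10,000, fits only the five historical points, and leaves the 2015 entry out because it is a projection. The function and variable names are illustrative.

```python
import math

# (year, cores) pairs as listed on the slide; "10,000+" is taken as 10,000.
# The 2015 entry is a projection, so it is excluded from the fit.
HISTORY = [(1985, 1), (1992, 10), (1998, 100), (2004, 1_000), (2010, 10_000)]


def fit_log10_growth(points):
    """Least-squares fit of log10(cores) = slope * year + intercept."""
    xs = [year for year, _ in points]
    ys = [math.log10(cores) for _, cores in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_log10_growth(HISTORY)
years_per_10x = 1.0 / slope                 # years for a tenfold increase
doubling_years = math.log10(2) / slope      # years for core counts to double
projected_2015 = 10 ** (slope * 2015 + intercept)

print(f"tenfold growth roughly every {years_per_10x:.1f} years")
print(f"doubling roughly every {doubling_years:.1f} years")
print(f"fit extrapolated to 2015: about {projected_2015:,.0f} cores")
```

Under these assumptions the historical points correspond to a tenfold increase roughly every six years, and the extrapolation to 2015 lands somewhat below the 100,000 cores shown on the slide.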


CAE Infrastructure Challenges

• Servers

‒ Space, energy, and cooling constraints as size of clusters increases

‒ Reliability, manageability, and serviceability of servers

• Network

‒ Cost of high speed networks as size of clusters increases

‒ Connectivity between servers, storage, and users as amount of data flow increases

• Storage

‒ Scalability and performance of file systems as amount of data and number of users increases

‒ Availability of data across multiple sites


CAE servers: Recent Server Performance Gains

Processor “core” power has remained constant
‒ Application performance gains are mainly due to better algorithms and increased parallelism
‒ Turnaround time is still critical for product development cycles

Capacity of servers continues to increase with improved chip technology (gate count)
• Increased ability to exploit increases in application parallel scalability
• Ability to carry out more simulations per day is helpful for computer-based optimization

[Chart: CAE performance on a single server*: time/job and jobs/day for 2008, 2010, 2012, 2014]

* - LS-DYNA Neon-Revised BMT (www.topcrunch.org)


CAE servers: Processor speed effect on Application Performance

Intel’s E5 processors are available with several different options
‒ wattage, cores, base clock frequency
‒ turbo mode allows for increased performance, with a complicated set of factors involved
‒ typically CAE users prefer the highest speed cores to make the most effective use of ISV license costs

Benchmarks
• single node CAE benchmarks carried out at different clock speeds and with turbo enabled (*)
• effective frequency determined from a 1200 MHz baseline benchmark
• most applications show nearly linear benefit from clock frequency up to almost 3 GHz
• turbo benefit is substantial

[Chart: CAE Application Performance for ABAQUS, DYNA, and FLUENT at 2, 2.2, 2.4, 2.6, 2.6*, 2.9, and 2.9* GHz; y-axis label truncated in source ("Effective C")]
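
The slide does not spell out how the effective frequency is derived from the 1200 MHz baseline. One plausible reading, shown here as a minimal sketch, is that run time is assumed to scale inversely with clock frequency, so the effective frequency is the baseline clock multiplied by the speedup over the baseline run. The function name and the timings are hypothetical and not taken from the benchmarks.

```python
BASELINE_MHZ = 1200.0  # baseline clock named on the slide


def effective_frequency_mhz(baseline_seconds, run_seconds, baseline_mhz=BASELINE_MHZ):
    """Clock the baseline configuration would need to match the measured run,
    assuming elapsed time scales inversely with clock frequency."""
    if baseline_seconds <= 0 or run_seconds <= 0:
        raise ValueError("elapsed times must be positive")
    return baseline_mhz * baseline_seconds / run_seconds


# Hypothetical elapsed times in seconds (illustrative only).
baseline_s = 3000.0       # job at the 1200 MHz baseline
nominal_2600_s = 1500.0   # same job at a nominal 2600 MHz

eff = effective_frequency_mhz(baseline_s, nominal_2600_s)
scaling_efficiency = eff / 2600.0  # 1.0 would mean perfectly linear scaling

print(f"effective frequency: {eff:.0f} MHz")
print(f"fraction of nominal clock realized: {scaling_efficiency:.2f}")
```

With this definition, an application that scales linearly with clock speed shows an effective frequency equal to the nominal one, and a turbo-enabled run shows an effective frequency above the nominal base clock.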


• Growing interest in using GPUs for both remote desktop visualization and computing

• Outlook
‒ lack of broad functionality is a severe concern for many customers who require broad solutions
‒ performance can be excellent, but spotty
‒ currently dual-processor servers with shared memory parallelism typically outperform GPUs

• Many customers are keenly interested in GPUs as a way to reduce hardware costs, but are reluctant to purchase at this time

• Primarily for Structural Analysis
‒ available with limited functionality in MSC.Nastran, ABAQUS and ANSYS Mechanical
‒ general purpose CFD and Explicit not expected any time soon

[Chart: GPU BMTs comparing SMP-1, SMP-1+GPU, and SMP-8]

CAE servers: Use of GPUs in CAE


Rack Dense
─ form factor: 19” rack (1-4U)
─ typical deployment: 8-64 nodes
─ typical workloads: simulation, pre/post, storage
─ benefits: maximum flexibility

Flex System
─ form factor: 19” rack (10U)
─ typical deployment: 200-800 nodes
─ typical configuration: simulation
─ benefits: integrated solution, manageability, reliability

NeXtScale System
─ form factor: 19” rack (6U)
─ typical deployment: 1-4 racks
─ typical configuration: simulation, pre/post
─ benefits: price/performance, GPU capabilities

CAE servers: IBM CAE servers


HPC Networks: High Speed Network Options for CAE

10 Gigabit Ethernet
– typically more expensive for large HPC deployments
– significantly lagging InfiniBand in performance
– non-standard SW stack for HPC applications

InfiniBand
– Fat Tree
  • uniform performance
  • significant cost and extra hardware
  • difficult to expand
– Cluster Islands
  • minimal cost and space
  • easy to expand
  • workflow coordination required
  • storage access can be complicated
– Torus
  • mostly uniform performance
  • using existing IB hardware for minimal cost and no additional footprint/power
  • easy to expand


• existing HPC network designs
  • currently most HPC clusters have an Ethernet network for management and administration and an InfiniBand network to handle MPI and IO traffic

• network benchmarks
  • use of emerging Ethernet hardware and software technology (such as iWARP and RoCE) should erase the performance delta between Ethernet and InfiniBand
  • many HPC applications are latency sensitive and run well with higher blocking factors

• transforming HPC network designs
  • use of a single Ethernet network for all network traffic to reduce network complexity and cost
  • improved network topologies with increased blocking factors could significantly reduce network complexity and costs
  • use of Blades to improve network reliability and reduce network complexity

Qlogic# 256-way FLUENT Benchmarks

Dataset      Max ISL Utilization   Max BW (Gb/s)   Full Bisection Performance*   Oversubscribed Performance*
Eddy_417K    2%                    0.8             12387                         12387 (8:1)
Truck_111m   15%                   6.0             226                           225 (4:1)

* - FLUENT rating (higher is better)
# - IBM iDataPlex dx360M2 with Qlogic TrueScale QDR QLE7340

48-way$ Customer Benchmarks

Benchmark       QDR-IB    10-GE(tcp)   10-GE(iWARP)
FLUENT          24,200s   25,000s      24,200s
STAR-CCM+       26,300s   62,700s      27,000s
ABAQUS_exp_dp   18,400s   D.N.F        19,200s
LS-DYNA         23,700s   59,800s      24,300s
(unlabeled)     21,500s   24,000s      21,300s

$ - IBM iDataPlex dx360M3 with X5670 hex-core processors

HPC Networks: Reducing Network Complexity
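
To illustrate why higher blocking factors reduce network hardware, the following minimal sketch counts leaf switches and uplink cables for a two-level network at different blocking factors. It is an illustration under stated assumptions, not a design from the deck: the 36-port leaf switch, the node count used in the example, and both function names are hypothetical, and every uplink port is assumed to be cabled.

```python
import math


def leaf_port_split(ports_per_switch, blocking_factor):
    """Split leaf switch ports into (downlinks, uplinks) for a b:1 blocking factor."""
    if blocking_factor < 1:
        raise ValueError("blocking factor must be at least 1")
    if ports_per_switch % (blocking_factor + 1) != 0:
        raise ValueError("ports must divide evenly for this blocking factor")
    uplinks = ports_per_switch // (blocking_factor + 1)
    downlinks = ports_per_switch - uplinks
    return downlinks, uplinks


def leaf_layer(nodes, ports_per_switch, blocking_factor):
    """Leaf switches and uplink cables needed to attach `nodes` servers."""
    downlinks, uplinks = leaf_port_split(ports_per_switch, blocking_factor)
    leaves = math.ceil(nodes / downlinks)
    return {"leaf_switches": leaves, "uplink_cables": leaves * uplinks}


# Hypothetical 256-node cluster built from 36-port leaf switches.
for b in (1, 2, 8):
    print(f"{b}:1 ->", leaf_layer(256, 36, b))
```

In this example the move from a non-blocking (1:1) layout to an 8:1 layout cuts the number of leaf switches roughly in half and removes most of the uplink cabling, which is the kind of saving the slide refers to.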


Historically, most storage for CAE has been handled with a combination of local direct-attached storage and NFS network storage

Implicit structural mechanics applications (e.g., Nastran) often generate significant IO, which would overwhelm a shared file system and is best handled by servers using direct-attached disk arrays

– Solid State Disks (SSDs) are rapidly replacing spinning disks for local scratch I/O storage

Slow movement to high-performance shared file systems, such as GPFS

– often a viable option in conjunction with diskless/stateless clusters, particularly for CFD

System storage: I/O and Storage for CAE


Typical Structural Mechanics Job
– 1 node
– 10 hours
– 1 GB input, 10 GB output
– Total I/O = 1000 GB
– AVG BW important for sustained scratch I/O
  • local disks are best

Typical CFD & Explicit/Impact Job
– 4 nodes
– 10 hours
– 10 GB input; 90 GB output
– Total I/O = 100 GB
– Peak BW important for file copy
  • high performance shared file system is best

              Local      GPFS
Peak BW/job   200MB/s    3500MB/s
AVG BW/job    200MB/s    50MB/s

[Two charts with time on the x-axis]

System storage: General CAE I/O Patterns
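
The two job profiles differ mainly in how much of the total I/O is scratch traffic rather than input and output files. The following is a minimal sketch of that arithmetic using the figures on the slide; decimal gigabytes are assumed, and the class and method names are illustrative. The averages are spread over the whole run and are not an attempt to reproduce the per-job bandwidth figures in the Local/GPFS table.

```python
from dataclasses import dataclass

GB = 10**9  # decimal gigabytes assumed
MB = 10**6


@dataclass(frozen=True)
class JobProfile:
    name: str
    hours: float
    input_gb: float
    output_gb: float
    total_io_gb: float

    def scratch_gb(self):
        """I/O traffic not explained by reading the input and writing the output."""
        return self.total_io_gb - (self.input_gb + self.output_gb)

    def average_mb_per_s(self):
        """Total I/O spread evenly over the whole run."""
        return self.total_io_gb * GB / (self.hours * 3600) / MB


structural = JobProfile("structural mechanics", 10, 1, 10, 1000)
cfd = JobProfile("CFD / explicit", 10, 10, 90, 100)

for job in (structural, cfd):
    print(f"{job.name}: scratch {job.scratch_gb():.0f} GB, "
          f"run-average {job.average_mb_per_s():.1f} MB/s")
```

For the structural job nearly all of the 1000 GB is scratch traffic sustained throughout the run, which favors local disks; for the CFD job the total equals input plus output, so the I/O is concentrated in file copies, which favors high peak bandwidth.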


• Benchmarks
  • Altair Optistruct Ver11
  • AMLS 4.2.r33

• Server A
  • 2.6GHz E5-2670 SNB, 384GB memory
  • 8x SAS@10Krpm

• Server B
  • 2.9GHz E5-2670 SNB, 128 GB memory
  • 8x SAS@15krpm
  • 4x 400GB Intel DC S3700 SSDs

• Lanczos modal analysis
  • 1, 2, 4 jobs at a time

• Frequency response with AMLS
  • 1, 2, 4 jobs at a time

• Observations
  • multiple jobs stress I/O as the OS runs out of memory to buffer I/O
  • lots of memory is still very good
  • SSDs are particularly useful for large AMLS problems, which tend to have smaller random I/O blocks than typical Lanczos methods

[Chart: time for configurations A-SAS8, B-SAS8, and B-SSD4; axis label truncated in source]

• Transformation from SAS hard drives to SSDs
  • Structural mechanics applications tend to do heavy local I/O
  • Substructure methods are increasingly popular but randomize data access patterns
  • SSDs offer the potential to increase application performance and server reliability while decreasing server footprint

System Storage: Use of SSDs for local scratch I/O


GSS replaces hardware controller with software controller

GPFS Storage System (I/O stack, application to disks)
– Client Application
– GPFS NSD Client
– GPFS NSD Server
– GPFS Kernel IO Layer
– OS Device Driver
– HBA Device Driver
– Disk Array Controller
– Disks

GSS Storage System (I/O stack, application to disks)
– Client Application
– GPFS NSD Client
– GPFS NSD Server
– GPFS Kernel IO Layer
– GPFS Vdisk (PERSEUS)
– OS Device Driver
– HBA Device Driver
– Disks

– Embedded within Network Shared Disk (NSD) layer of GPFS
– Utilizes generic servers with direct-attach SBOD disks
– Scalable from small systems to large supercomputers (10 - 100,000 disks)

System Storage: IBM System x GPFS Storage Server (GSS)

[Models pictured: GSS 24, GSS 26]


Traditional RAID: Narrow data+parity arrays
– 20 disks, 5 disks per traditional RAID array
– 4x4 RAID stripes (data plus parity)
– Rebuild uses IO capacity of only the array’s 4 (surviving) disks
– Striping across all arrays, all file accesses are throttled by array 2’s rebuild overhead

Declustered RAID: Data+parity distributed over all disks
– 20 disks / 1 declustered array
– 16 RAID stripes (data plus parity)
– Rebuild uses IO capacity of the array’s 19 (surviving) disks
– Load on file accesses is reduced by 4.8x (=19/4) during array rebuild

[Diagrams: each layout shown with one failed disk]

System Storage: GSS Declustering for low rebuild overhead
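
The 4.8x figure follows from comparing how many surviving disks share the rebuild work in each layout. A minimal sketch of that arithmetic, using the disk counts from the slide (the function names are illustrative):

```python
def surviving_disks(array_size):
    """Disks left to serve rebuild I/O after one disk in the array fails."""
    if array_size < 2:
        raise ValueError("an array needs at least two disks to rebuild")
    return array_size - 1


def rebuild_load_reduction(total_disks, traditional_array_size):
    """Ratio of rebuild participants: declustered array vs. one traditional array."""
    declustered = surviving_disks(total_disks)
    traditional = surviving_disks(traditional_array_size)
    return declustered / traditional


# Slide figures: 20 disks in total, 5 disks per traditional RAID array.
ratio = rebuild_load_reduction(20, 5)  # 19 surviving disks vs. 4
print(f"rebuild load per disk reduced by about {ratio:.2f}x")
```

The exact ratio is 19/4 = 4.75, which the slide rounds to 4.8x.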


System Storage: IBM SONAS

Designed as a “Simple” Appliance or a Gateway

XIV like GUI

Scale Out: predictable, linear performance

Scale Up: manage many PBs with one administrator

Reduces Capital Expenditures: # of filers, software licenses, maintenance, power/cooling

Improves capacity utilization


Traditional Clusters

‒ Focused separately on optimized servers, network and storage

‒ Difficulty with upgrading

• Rip out of existing infrastructure often required for expansion

• Inability to access growing data between sub-clusters

Compute systems

‒ Combination of server, network, and storage designed to maximize reliability with minimal cost and footprint

‒ Sizing

• Size dictated by extendibility of high-speed network and shared file system used for working storage

• Must be large enough to accommodate largest jobs

• Sized appropriately for hardware acquisition patterns

[Diagram: compute systems, each combining servers, network, and storage]

Transformation of CAE infrastructure from “Clusters” to “Compute Systems”

[Diagram: Traditional CAE Infrastructure. Labels: Data Center, Compute servers, Network backbone, centralized storage, with repeated servers / network / storage groups]


Customers are looking to Cloud Computing as a way to control costs

– both public and private clouds

Existing public clouds typically are lacking HPC hardware suitable for CAE

Ambiguity on use of Cloud Computing to avoid direct ISV licensing costs

– current license models require customers to provide application licenses

While interest in Cloud is high, adoption has been slow

[Diagram labels: Private or Private Hosted Cloud; Mechanical CAD/CAE Cloud]

CAE in Industry: Cloud Computing


Consolidation

‒ focus on optimizing I/T capabilities with a fixed budget

‒ growing requirement for 3D remote visualization to handle remote engineering

‒ issues with software licensing for use in different geographies

‒ Solutioned with: GPFS-AFM, SoNAS ACE

Disaster Recovery

‒ multiple sites with an increased attention to spill-over of HPC and enterprise workloads onto common hardware platform to minimize total costs

‒ Solutioned with: GPFS-AFM, SoNAS ACE

Cloud

‒ not a quantum step with most companies, but a gradual change of best practices with increased use and acceptance of more cloud based software

‒ requirement for development of strong internal cloud framework to allow efficient bursting to public clouds with minimal disruption to line of business

‒ Solutioned with: Platform HPC Cloud products

CAE in Industry: Trends in Data Centers


Easy to deploy and manage solutions, optimized for price/performance

Improves productivity, reduces training requirements

Based on reference architectures from industry experts

Leading-edge technology to ensure optimal performance for your budget, comprising
– IBM Systems: IBM Flex System™, IBM NeXtScale™, IBM System Storage®
– IBM cluster, workload and file management: IBM Platform™ HPC, IBM Platform LSF™; IBM GPFS™ (opt.)
– ISV Applications (sold separately)

Small, medium, large cluster configurations for
– ANSYS: CFD and structural mechanics
– Abaqus: FEA solver, pre- and post-processing
– MSC Software: MSC Nastran, MSC Patran, MSC SimManager
– Remote 3D Visualization

[Document links: Abaqus Solution Brief, Abaqus Ref. Arch., ANSYS Ref. Arch., MSC Ref. Arch.]

CAE in Industry: IBM Application Ready Solutions for CAE


Acoustic Response Optimization
─ Objective: Minimize maximum acoustic pressure at driver’s ear

Problem Characteristics
─ Structure DOFs: 11.5M
─ Acoustic DOFs: 1.2M
─ # Str. Modes: 2950
─ # Fluid Modes: 99
─ # Exc. Freq: 200
─ # Loads: 4
─ # Design variables: 190 (Panel thickness +/- 20%)

Optimization Run
─ IBM x3650M3
  • MSC.Nastran V2010, CDH/AMLS, FastFRS, (MIO, libIBM)
─ 11 optimization steps, 93 hrs; 115TB of IO
─ improvement of 7.1 dB in maximum pressure

Outlook
─ increased complexity of CAE models is reducing the ability of engineers to intuitively steer the design process
─ computer-based design optimization will increasingly drive product design
─ data center design for computer-based design optimization will look different from one designed for engineer-based optimization
  • focus shifts from single job performance to overall throughput

[Chart: response vs. Frequency (Hz), 0 to 200 Hz; curves S1003_70001951_opt and S1003_70003932_opt; 7.1 dB difference marked]

Mladen Chargin, CDH AG
Advanced Engineering
Am Marktplatz 6
79336 Herbolzheim
Germany

CAE in Industry: Transformation of Engineering Design Mechanism
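
The optimization run figures give a sense of the sustained throughput such a workload demands. A minimal sketch of the arithmetic, assuming decimal terabytes and an even spread of I/O and time across the 11 steps (an assumption; the slide gives only totals, and the variable names are illustrative):

```python
TB = 10**12  # decimal terabytes assumed
MB = 10**6

# Totals reported on the slide.
total_io_tb = 115
total_hours = 93
steps = 11

# Sustained I/O rate if the traffic were spread evenly over the whole run.
average_mb_per_s = total_io_tb * TB / (total_hours * 3600) / MB

# Per-step averages, assuming all steps are the same size.
hours_per_step = total_hours / steps
io_tb_per_step = total_io_tb / steps

print(f"sustained I/O: about {average_mb_per_s:.0f} MB/s")
print(f"per optimization step: about {hours_per_step:.1f} h and {io_tb_per_step:.1f} TB")
```

A run of this kind keeps a server's storage busy at several hundred MB/s for days, which is consistent with the slide's point that throughput across many long jobs matters more than the speed of any single job.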


CAE in Industry: Delivering HPC Infrastructure for Energy

Customer Business Problem/Pain

• Wind forecasting around the world, using (primarily) the WRF application
• Analysis of wind data archived for locations around the world
• Mechanical design of huge turbine blades, using CFD (computational fluid dynamics) software, primarily STAR-CD

Value Proposition of the IBM Solution

• Deploy large compute cluster
• Demonstrate high efficiency through LINPACK testing
• Deploy large storage system, based on GPFS
• Demonstrate performance with throughput test
• Smooth installation and setup, thanks to our business partner Gridcore
• Ability to perform data analysis, from the IBM BigInsights team

IBM Solution

• IBM System x iDataPlex dx360M3
  • 1,306 nodes
  • 2:1 4X QDR InfiniBand Network
• IBM GPFS file system
  • 2.8 PB Capacity
  • 20+ GB/s I/O Bandwidth
• IBM InfoSphere® BigInsights Enterprise Edition


Hardware

‒ continued reliance on Intel x86_64 multi-core processors

‒ interest in GPUs for computing

‒ increased use of SSDs

Servers

‒ trade-offs for application specific servers and general purpose servers to maximize utilization

‒ transition from engineering workstations to VDI

Clusters

‒ pressure on network performance and cost as size of cluster grows

‒ increased demands for shared file systems

‒ Application Ready Solutions for simplification, improved productivity

Data Centers

‒ economics of data center consolidation

‒ desire for failover capability

‒ remote access performance and security

CAE in Industry: Summary
