Performance Study: Abaqus/Standard 6.8-3
Stan Posey, Director, Industry and Applications Market Development, Panasas, Fremont, CA, USA
Bill Loewe, Ph.D., Sr. Applications Engineer, Panasas, Fremont, CA, USA
Slide 2 | Please Keep Confidential Between CSC and Panasas | Please Keep Confidential to Customer and Panasas
Background on Abaqus/Standard Study
Abaqus is an application from SIMULIA -- not a benchmark kernel
The FEA model and tests are relevant to customer practice
All tests were run on a dedicated system at Panasas
The results were validated by SIMULIA
Since Apr 2007, SIMULIA and Panasas have made joint investments in a business and technical alliance that ensures Abaqus will fully leverage Panasas PanFS
This study demonstrates benefits of Panasas parallel file system and parallel storage for Abaqus/Standard 6.8-3 with tests for both single job and multi-job computing
Motivation
Considerations
Slide 3
Abaqus/Standard 6.8-3: Model S4b 5M DOF Non-linear Static Analysis
Automotive engine block cylinder head bolt-up
Panasas Study on Abaqus/Standard 6.8-3
Slide 4
Abaqus/Standard I/O Scheme
CSM implicit solver
Abaqus/Standard is direct and single-step, with out-of-core READS and WRITES
-- I/O occurs in the sparse factor phase of the solver
-- this scheme is for static analysis; for an eigen (Lanczos) solution, I/O can be VERY heavy
-- NOTE: Abaqus also has an implicit iterative solver
Job Task | IO Scheme | IO Operation
start
Element matrix generation and assembly into global matrix | Work Dir: serial IO | Read nodes, elements and control file
Matrix factor (dominant phase, as much as 85% of total time, often I/O wait) | Scratch Dir: parallel IO | Factor matrix out-of-core, reads/writes
FBS solve phase, stress recovery, multiple RHS’s | Work Dir: serial IO | Write solution results [100’s of GB’s of I/O]
complete
Slide 5
Features of the Hardware System Configurations
Panasas Study on Abaqus/Standard 6.8-3
Features of Penguin cluster configuration:
Processors: 2.3GHz QC AMD Opteron
Nodes: 8 x 2 Sockets x 4 cores; 2 GB/core
Interconnect: 10GigE
Local FS: Ext3, single drive per node, 160 GB SATA, 7200 RPM
Features of the Panasas storage system:
3 shelves: 1 director + 10 storage blades
Each shelf 10 TB, total of 30 TB
NOTE: Panasas total 30 TB in 12U, installed and operational in just 1 hr!
[Diagram labels: 8 nodes, 64 cores; 10 GigE; Cisco Systems]
Slide 6
Abaqus/Standard 6.8-3: Comparison of PanFS vs. Local FS
S4b Performance for Single Core
Times for Single Job on a Single Core
1 Job x 1 Core x 1 Node
5M DOF Engine Block
[Bar chart: Total Time in Seconds, lower is better]
File System | Num Ops (s) | Total Time (s)
PanFS | 12732 | 13674
Local FS | 12770 | 15108
NOTE: Num-Ops times within 1%; difference is IO
NOTE: PanFS 11% Advantage in Total Time vs. Local FS
Slide 7
Numerical vs. IO Computational Profile
Abaqus/Standard 6.8-3: Comparison of PanFS vs. Local FS
Job Profiles of Numerical Ops % vs. IO %
1 Job x 1 Core x 1 Node
5M DOF Engine Block
[Stacked bar chart: Total Time in Seconds, lower is better]
File System | Total Time (s) | Numerical Operations | IO
PanFS | 13674 | 93% | 7%
Local FS | 15108 | 84% | 16%
NOTE: PanFS 11% Advantage in Total Time vs. Local FS
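The IO percentages in this profile can be approximately reproduced from the Num Ops and Total Time figures reported for the single-core case, assuming IO time is simply total time minus numerical-operations time (the slides do not state how the split was measured). A minimal Python sketch; `io_share` is a hypothetical helper name, not part of any Abaqus or Panasas tool:

```python
def io_share(num_ops_seconds, total_seconds):
    """Fraction of total time not spent in numerical operations.

    Assumption: everything outside numerical operations is IO.
    """
    return (total_seconds - num_ops_seconds) / total_seconds


# Times in seconds for the 1 Job x 1 Core x 1 Node case, as reported on the slides.
runs = {
    "PanFS": {"num_ops": 12732, "total": 13674},
    "Local FS": {"num_ops": 12770, "total": 15108},
}

shares = {name: io_share(r["num_ops"], r["total"]) for name, r in runs.items()}

for name, share in shares.items():
    print(f"{name}: IO {share:.1%}, numerical {1 - share:.1%}")
```

Under this assumption the IO share comes out at roughly 7% for PanFS and between 15% and 16% for Local FS, in line with the chart labels.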
Slide 8
Abaqus/Standard 6.8-3: Comparison of PanFS vs. Local FS
S4b Performance for 1 Core x 8 Nodes
Average Times of 8 Simultaneous Jobs
8 Jobs x 1 Core x 8 Nodes
Average of 8 Jobs | Each on 1 Core | Each on 1 Node | 7 Cores Idle on Each Node
5M DOF Engine Block
[Bar chart: Total Time in Seconds, lower is better]
File System | Num Ops (s) | Total Time (s)
PanFS | 12641 | 14373
Local FS | 12654 | 15064
NOTE: N-Ops times within 1%; difference is IO
NOTE: PanFS 5% Advantage in Total Time vs. Local FS
Slide 9
Numerical vs. IO Computational Profile
Abaqus/Standard 6.8-3: Comparison of PanFS vs. Local FS
Job Profiles of Numerical Ops % vs. IO %
8 Jobs x 1 Core x 8 Nodes
Average of 8 Jobs | Each on 1 Core | Each on 1 Node | 7 Cores Idle on Each Node
5M DOF Engine Block
[Stacked bar chart: Total Time in Seconds, lower is better]
File System | Total Time (s) | Numerical Operations | IO
PanFS | 14373 | 88% | 12%
Local FS | 15064 | 81% | 19%
NOTE: PanFS 5% Advantage in Total Time vs. Local FS
Slide 10
Abaqus/Standard 6.8-3: Comparison of PanFS vs. Local FS
S4b Performance for 8 Cores x 1 Node
Single Job on Single 8-Core Node
1 Job x 8 Cores x 1 Node
5M DOF Engine Block
[Bar chart: Total Time in Seconds, lower is better]
File System | Total Time (s)
PanFS | 2946
Local FS | 5495
NOTE: PanFS 46% Advantage in Total Time vs. Local FS
Slide 11
Abaqus/Standard 6.8-3: Comparison of PanFS vs. Local FS
S4b Performance for Single Job Scaling
Scalability of Single Job from 1 to 8 Cores
5M DOF Engine Block
[Bar chart: Total Time in Seconds, lower is better]
File System | 1 Job, 1 Core (s) | 1 Job, 8 Cores (s) | Speedup on 8 Cores
PanFS | 13674 | 2946 | 4.6x
Local FS | 15108 | 5495 | 2.8x
NOTE: PanFS 58% in Parallel Efficiency vs. 35% for Local FS
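The speedup and parallel-efficiency figures follow from the 1-core and 8-core total times, assuming the usual definitions (speedup = T1 / TN, efficiency = speedup / N). The slides do not spell out the formulas, so the sketch below is an assumption-labeled check rather than the authors' own calculation; the function names are illustrative.

```python
def speedup(t_one_core, t_n_cores):
    """Classic speedup: single-core time divided by N-core time."""
    return t_one_core / t_n_cores


def parallel_efficiency(t_one_core, t_n_cores, n_cores):
    """Speedup normalized by the number of cores used."""
    return speedup(t_one_core, t_n_cores) / n_cores


N_CORES = 8

# (1-core total time, 8-core total time) in seconds, as reported on the slides.
times = {
    "PanFS": (13674, 2946),
    "Local FS": (15108, 5495),
}

results = {
    name: {
        "speedup": speedup(t1, t8),
        "efficiency": parallel_efficiency(t1, t8, N_CORES),
    }
    for name, (t1, t8) in times.items()
}

for name, r in results.items():
    print(f"{name}: {r['speedup']:.2f}x on {N_CORES} cores, "
          f"efficiency {r['efficiency']:.1%}")
```

With these definitions PanFS comes out at about 4.6x and 58%; the Local FS values land slightly below the rounded 2.8x and 35% shown on the slide.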
Slide 12
Abaqus/Standard 6.8-3: Comparison of PanFS vs. Local FS
S4b Performance for 8 Cores x 4 Nodes
Average Times of 4 Simultaneous Jobs
4 Jobs x 8 Cores x 4 Nodes
Average of 4 Jobs | Each Job on 8 Cores | Each Job on 1 Node Using All 8 Cores
5M DOF Engine Block
[Bar chart: Total Time in Seconds, lower is better]
File System | Total Time (s)
PanFS | 3773
Local FS | 5289
NOTE: PanFS 40% Advantage in Total Time vs. Local FS
Slide 13
Abaqus/Standard 6.8-3: Comparison of PanFS vs. Local FS
S4b Performance for Single vs. Multi-Job
Times of Single 8-way and Multi 8-way Jobs
5M DOF Engine Block
[Bar chart: Total Time in Seconds, lower is better]
File System | 1 Job, 8-way (s) | 4 Jobs, 8-way (s)
PanFS | 2946 | 3773
Local FS | 5495 | 5289
NOTE: PanFS degrades 22% for 1 to 4 nodes
NOTE: Local FS about the same for 1 to 4 nodes, each FS on node is independent
NOTE: PanFS 40% Advantage in Total Time vs. Local FS
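The slides do not say how the percentage figures are derived from the total times. As a hedged illustration, the sketch below evaluates two common conventions, both assumptions here: time reduction, (slow - fast) / slow, and relative slowdown, slow / fast - 1. The function names and case labels are hypothetical.

```python
def time_reduction(t_fast, t_slow):
    """Fraction of the slower time that is saved: (slow - fast) / slow."""
    return (t_slow - t_fast) / t_slow


def relative_slowdown(t_fast, t_slow):
    """How much longer the slower run takes: slow / fast - 1."""
    return t_slow / t_fast - 1.0


# (faster time, slower time) in seconds, taken from the 8-way results on the slides.
cases = {
    "1 job x 8 cores: PanFS vs. Local FS": (2946, 5495),
    "4 jobs x 8 cores: PanFS vs. Local FS": (3773, 5289),
    "PanFS: single job vs. multi-job": (2946, 3773),
}

summary = {
    label: (time_reduction(fast, slow), relative_slowdown(fast, slow))
    for label, (fast, slow) in cases.items()
}

for label, (reduction, slowdown) in summary.items():
    print(f"{label}: reduction {reduction:.1%}, slowdown {slowdown:.1%}")
```

Under these assumed definitions, the 46% and 22% figures correspond to the time-reduction form, while the 40% figure corresponds to the relative-slowdown form.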
Slide 14
Panasas and Intel Abaqus S4b Study
Intel ENDEAVOR Xeon Cluster
Location: Intel HPC Customer Enabling Center, DuPont, WA
Vendor: Intel; 80 nodes; 640 cores; 18 GB memory per node
CPU: Intel Xeon (Nehalem) QC, 2.8 GHz, 8 cores per node
Interconnect: InfiniBand
File Systems: Panasas PanFS; Lustre on DDN; Local disk
Operating System: RHEL Linux v5.2
ENDEAVOR File Systems and Storage
PanFS: 2 Shelves AS6000 (1+10 and 2+9), 38 TB FS; network connected through 10GigE switches and IB router, ~ 1.2 GB/s
Lustre: DDN storage, 100 TB FS, ~ 5 GB/s
Local FS: Ext2 FS, 370 GB SATA drive, 80 MB/s per disk
Measured throughput
Panasas: 16 client iozone, 1180 MB/s write, 1260 MB/s read
DDN/Lustre: 16 client iozone, 5390 MB/s write, 3370 MB/s read
Local FS: Ext2, ~80 MB/s per disk
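To put the quoted bandwidths in perspective, the sketch below estimates how long a purely sequential write of a given volume would take at each rate. The 100 GB volume is a hypothetical round number, decimal units (1 GB = 1000 MB) are assumed, and the comparison mixes aggregate 16-client iozone rates with a single-disk rate, so this is only back-of-the-envelope arithmetic, not a prediction of Abaqus run time.

```python
MB_PER_GB = 1000  # decimal units assumed

# Write bandwidths in MB/s as quoted on the slide.
write_bandwidth_mb_s = {
    "PanFS (16-client iozone)": 1180,
    "DDN/Lustre (16-client iozone)": 5390,
    "Local FS (per disk)": 80,
}

HYPOTHETICAL_VOLUME_GB = 100  # illustrative volume, not a measured value


def transfer_seconds(volume_gb, mb_per_s):
    """Idealized time to move volume_gb at a sustained rate of mb_per_s."""
    return volume_gb * MB_PER_GB / mb_per_s


estimates = {
    name: transfer_seconds(HYPOTHETICAL_VOLUME_GB, bw)
    for name, bw in write_bandwidth_mb_s.items()
}

for name, seconds in estimates.items():
    print(f"{name}: {seconds:.0f} s for {HYPOTHETICAL_VOLUME_GB} GB")
```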
Slide 15
Abaqus/Standard 6.8-3: Comparison of PanFS vs. Local FS Ext2
S4b Performance for Single Job Scaling
Single Job Scalability 8 to 32 Cores; Memory 90%
5M DOF Engine Block
[Bar chart: Total Time in Seconds vs. Number of Cores, lower is better]
Number of Cores | PanFS (s) | Local FS (s)
8 | 2613 | 4180
16 | 1268 | 1289
32 | 639 | 574
Annotated difference: 60%
NOTE: PanFS advantage over Local for single node case when IO is heavy – in the same range for 2-4 nodes when job goes in-memory
Slide 16
Abaqus/Standard 6.8-3: Comparison of PanFS vs. Local FS Ext2
S4b Performance for Single Job Scaling
Single Job Scalability on 16 Cores; Memory 90%/70%
16 Cores Each Case
5M DOF Engine Block
[Bar chart: Total Time in Seconds, lower is better]
Memory Setting | PanFS (s) | Local FS (s)
Memory 90% | 1268 | 1289
Memory 70% | 1219 | 1219
NOTE: Effect of memory setting
Slide 17
Abaqus/Standard 6.8-3: Comparison of PanFS vs. Local FS Ext2
S4b Performance for Multi-Job Thru-put
Average Times for 8 Jobs, Each 16 Cores; Mem 90%
8 Jobs x 16 Nodes x 128 Cores
Average Times for 8 Jobs | Each Job on 2 Nodes | Each Job on 16 Cores | Total 128 Cores
5M DOF Engine Block
[Bar chart: Total Time in Seconds, average of 8 jobs, lower is better]
File System | Total Time (s)
PanFS | 1294
Local FS | 1360
NOTE: PanFS and Local FS difference ~ 5%
Slide 18
Questions
Thank You
For more information, call Panasas at:
1-888-PANASAS (US & Canada)
00 (800) PANASAS2 (UK & France)
00 (800) 787-702 (Italy)
+001 (510) 608-7790 (All Other Countries)