optimizing lustre and gpfs with ddn
TRANSCRIPT
Optimizing Lustre and GPFS Solutions with DDN
Robert Triendl, VP of Worldwide HPC Strategy, DataDirect Networks
© 2014 DataDirect Networks, Inc. * Other names and brands may be claimed as the property of others. Any statements or representations around future events are subject to change. ddn.com
File Systems @ DDN
File System Basics
• File systems are where your data lives
• File systems are complex software-level technologies…
• … so there are always surprises!
• There are huge differences in performance, functionality, and reliability
• When it comes to performance, no file system fits all requirements
Test and Benchmark Labs
GTLS | Benchmark Lab Sites
EMEA Lab: Dusseldorf, Germany
Asia Pacific Lab: Tokyo, Japan
East Coast Lab: Columbia, MD
West Coast Lab: Sunnyvale, CA
DDN and Lustre
• Started with Lustre 0.6 and the first commercial Lustre support contract with CFS
• Over 250 EXAScaler customers worldwide today and many more using DDN storage for Lustre
• Customers in many industries (HPC centers, Large Experimental Facilities, Oil & Gas, Life Science, Automotive, etc.)
• Very broad set of applications supported
Corp Data 4%
Government Security 17%
Research Data Analysis 28%
HPC Archive 18%
HPC Work 20%
HPC Work Corp 12%
Project Quota
Metadata Perf
SSD Acceleration
Fine-Grained Monitoring
NFS/CIFS Access
Management
Connectors
Object/Cloud Links
Data Management
Backup/Replication
HSM
Client Performance
Cluster Integration
Large I/O
IME Caching
Security Features
Lustre WAN
RAS
Small File I/O
DDN Open Source Lustre Contributions
[Chart: contributions per Lustre release; y-axis 0 to 180; releases shown: 2.1, 2.4.0, 2.3.50-2.4.0, 2.5.0, 2.5.50-2.6.0; contributors: EMC, CEA, SUSE, Bull, Other, Cray, LLNL, Xyratex, DDN]
Large RPC Size Effects
[Charts: two panels, WRITE and READ; x-axis: number of processes (0 to 700); y-axis: 0% to 120%; series in each panel: 7.2K SAS (1MB RPC), 7.2K SAS (4MB RPC), SSD (1MB RPC)]
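For context on the 1MB and 4MB RPC sizes compared here: Lustre clients express the bulk RPC size as a number of pages, so converting between the two is simple arithmetic. The sketch below assumes 4 KiB client pages, which the slide does not state. The client tunable is commonly `osc.*.max_pages_per_rpc`, but check the name and the supported maximum against your Lustre release before relying on it.

```python
# Sketch: convert an RPC size in bytes to a page count.
# PAGE_SIZE is an assumption (4 KiB); the slide does not give the page size.
PAGE_SIZE = 4096
MIB = 1024 * 1024


def pages_per_rpc(rpc_bytes, page_size=PAGE_SIZE):
    """Return how many pages make up one bulk RPC of rpc_bytes."""
    if rpc_bytes <= 0 or rpc_bytes % page_size != 0:
        raise ValueError("RPC size must be a positive multiple of the page size")
    return rpc_bytes // page_size


if __name__ == "__main__":
    for label, size in (("1MB RPC", 1 * MIB), ("4MB RPC", 4 * MIB)):
        print(f"{label}: {pages_per_rpc(size)} pages")
```

With 4 KiB pages, a 1MB RPC is 256 pages and a 4MB RPC is 1024 pages.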
Lustre Metadata
• Limited single client scaling
• Good scaling with clock speed
• Good scaling with core count and HT
• Great scaling with DNE
• Limitations on Dir Creates (TBD)
mmap() I/O Performance Improvements
[Chart 1: mmap() Read Performance (1MB block size); y-axis 0 to 500 (units not given); series: lustre-1.8.9, lustre-2.5.2, DDN branch]
[Chart 2: mmap() Read Performance by block size (32K, 128K, 512K, 1024K); y-axis 0 to 500 (units not given); series: Lustre-1.8.9, DDN branch]
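As a rough illustration of the access pattern being measured, the sketch below maps a file read-only and walks it in fixed-size blocks, then reports throughput. It is not the benchmark behind the charts, because the slide does not name the tool. The function name `mmap_read_throughput` and the 1 MiB default block size are illustrative choices.

```python
import mmap
import os
import tempfile
import time


def mmap_read_throughput(path, block_size=1024 * 1024):
    """Read a file through mmap in block_size chunks.

    Returns (bytes_read, mib_per_sec).
    """
    total = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            return 0, 0.0  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            start = time.perf_counter()
            for offset in range(0, size, block_size):
                # Slicing copies the bytes out of the mapping, which
                # forces the underlying pages to be faulted in and read.
                chunk = m[offset:offset + block_size]
                total += len(chunk)
            elapsed = time.perf_counter() - start
    return total, total / (1024 * 1024) / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Small self-contained demo on a temporary file.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(8 * 1024 * 1024))
    try:
        nbytes, rate = mmap_read_throughput(tmp.name)
        print(f"read {nbytes} bytes at {rate:.1f} MiB/s")
    finally:
        os.unlink(tmp.name)
```

On a real file system, use a file larger than client memory or drop caches between runs. Otherwise the numbers mostly reflect the page cache rather than the file system.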
EXAScaler Monitoring
[Diagram: OSS/MDS nodes and Lustre clients run collectd with the DDN monitoring plugin and a graphite plugin; the monitoring server runs collectd with a Graphite plugin and graphite. Transfer between them: UDP (TCP)/IP based small text messages.]
• Lightweight
• Near real-time
• Massive scale
• Customizable
• File system, OST Pool, OST/MDT stats, etc.
• JOB ID, UID/GID, aggregation of application's stats, etc.
• Archive of data by policy
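The "small text message" transfer in the diagram resembles Graphite's plaintext protocol, in which each data point is one line of the form `path value timestamp`. A minimal sketch of such a sender follows. The metric path, the host, and port 2003 (Graphite's usual plaintext port) are assumptions for illustration, and this is not the DDN plugin's code.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one Graphite plaintext line: '<path> <value> <timestamp>\\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric_udp(line, host="127.0.0.1", port=2003):
    """Send one metric line as a single UDP datagram.

    Returns the number of bytes handed to the socket. UDP gives no
    delivery guarantee; that is the trade-off for low overhead.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(line.encode("ascii"), (host, port))


if __name__ == "__main__":
    # Hypothetical metric name; real names depend on the plugin config.
    line = format_metric("lustre.ost0000.write_bytes", 1048576)
    send_metric_udp(line)
    print(line, end="")
```

One datagram per statistic keeps the sender stateless and cheap, which suits the lightweight, near-real-time goals listed above. A receiver has to be configured to listen on UDP for this to work.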
EXAScaler Monitoring
• Running at TITECH: over 112 Object Storage Targets across 1700 clients
• That’s around 1M statistics
• Need to store every few seconds
• Demo of over 10M stats at DDN Booth
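The slide does not say how the 1M figure is derived. One plausible reading, stated here as an assumption, is that each client keeps a handful of counters per OST, so the total grows with the product of clients and targets. The counters-per-pair value and the 5-second interval below are hypothetical, chosen only to show the order of magnitude.

```python
def stat_count(osts, clients, counters_per_pair):
    """Total statistics if every client tracks counters for every OST."""
    return osts * clients * counters_per_pair


def ingest_rate(total_stats, interval_sec):
    """Data points per second the monitoring server must absorb."""
    return total_stats / interval_sec


if __name__ == "__main__":
    osts, clients = 112, 1700          # figures from the slide
    counters = 5                       # hypothetical counters per client/OST pair
    interval = 5                       # hypothetical sampling interval in seconds

    total = stat_count(osts, clients, counters)
    print(f"client/OST pairs: {osts * clients}")
    print(f"statistics:       {total}")
    print(f"points per sec:   {ingest_rate(total, interval):.0f}")
```

Under these assumptions, 190,400 client/OST pairs with five counters each give about 952,000 statistics, close to the 1M on the slide. Sampling them every five seconds means about 190,000 data points per second.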
VMs on GRIDScaler: 256 VMs on 16 Clients
[Charts: two panels of throughput (MB/sec) against number of processes (1, 2, 4, 8, 16, 32, 64, 128, 256); y-axis 0 to 16000 in the first panel and 0 to 12000 in the second]
GRIDScaler for OpenStack vbench Results
[Chart 1: Total Bandwidth; series: Read Bandwidth, Write Bandwidth; y-axis 0 to 1000; x-axis 1, 10, 20, 30, 40]
[Chart 2: Total IOPS; series: Read IOPS, Write IOPS; y-axis 0 to 6000; x-axis 1, 10, 20, 30, 40]