Xyratex Update – ClusterStor · HPC Advisory Council · Dr. Torben …
TRANSCRIPT
2
So I’m walking down the street when a guy steps out of a dark alley, points a gun at me and says … “YOUR MONEY OR YOUR LIFE!”
So I showed him my Xyratex badge.
“Why’d you do that?”
Well, it proves I work for Xyratex.
“So?”
Which means I have no money and no life.
Introduction
3
• Balanced hardware – interconnect to disk
– Hardware design team with more than 20 years of experience
• Integrated software
– HAL stack for HW management/monitoring
• ONE software stack
– World-class software development team
• Intelligent integration
– Extensive testing …
It all started with a vision of good design …
4
• CS-6000 Overview
– Proven performance: >42 GB/s per rack
– Overall performance scalable to >1 TB/s bandwidth
– Overall capacity scalable to >100 PB
– ClusterStor is a complete, ready-to-run Lustre solution
– Up to 560 HDDs per rack (42RU)
– Up to 1.8 PB usable per rack (with 4 TB HDDs)
– Up to 14 embedded servers (OSSes) per rack
– Up to 14 high-bandwidth network ports per rack
• Factory Integration & Staging
– Rack integration & cabling
– Entire storage software stack factory pre-installed and pre-configured
• System burn-in and benchmark testing at the Xyratex factory
• “Rack’n’Roll” installation – hours vs. days or weeks
ClusterStor Lustre Appliance
So what’s new?
5
• Expansion Storage Unit (ESU)
– Optional ESU building block doubles the number of drives controlled by a Scalable Storage Unit (SSU)
– Higher capacity while maintaining good performance
• Hardware-assisted CRC-32
– Enables several categories of enhanced functionality: security, data integrity, efficiency and performance
– Leverages Intel® Xeon® processor instruction sets on SSU embedded server modules (see the sketch after this slide)
• New base OS – Scientific Linux 6.2
• Email alerts
• Authentication: NIS support
ClusterStor Software Update
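As a rough illustration of what “hardware-assisted CRC-32” means in practice, here is a minimal C sketch using the SSE4.2 CRC32 instruction available on the Intel® Xeon® processors in the SSU server modules. Note that the instruction implements the CRC-32C (Castagnoli) polynomial; the slide does not say which CRC-32 variant or buffer layout ClusterStor uses, so the function and variable names below are illustrative.

/* Minimal sketch: CRC-32C over a buffer using the SSE4.2 CRC32
 * instruction.  Compile with: gcc -msse4.2 crc32c.c */
#include <stdint.h>
#include <stddef.h>
#include <string.h>
#include <nmmintrin.h>   /* SSE4.2 intrinsics */

uint32_t crc32c_hw(const void *buf, size_t len, uint32_t seed)
{
    const uint8_t *p = buf;
    uint64_t crc = seed;
    uint64_t chunk;

    /* Consume 8 bytes per instruction while we can. */
    while (len >= 8) {
        memcpy(&chunk, p, 8);            /* avoid unaligned access */
        crc = _mm_crc32_u64(crc, chunk);
        p += 8;
        len -= 8;
    }
    /* Finish off any remaining tail bytes one at a time. */
    while (len--)
        crc = _mm_crc32_u8((uint32_t)crc, *p++);

    return (uint32_t)crc;
}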
6
• Backports from Lustre® 2.2 and 2.3:
– SMP scalability improvements
• Wide stripe
– Improved file system performance
– Increases the number of OSTs that a single file may span (see the sketch after this slide)
– Expands the supported number of SSUs and ESUs from 40 (160 object storage targets) to 500 (2,000 object storage targets)
• Improved HA
– Faster and more reliable failovers
– Extended component tests:
• fabric failure, SAS interruption to disks, heartbeat compromised, software interrupt
• HW failure of any kind
ClusterStor Software Update
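To make “wide stripe” concrete, below is a minimal C sketch, assuming Lustre’s liblustreapi interface, that creates a file striped across all available OSTs (a stripe count of -1). The mount path and the 1 MiB stripe size are illustrative assumptions, not values from the slides.

/* Minimal sketch: create a widely striped Lustre file.
 * Build with: gcc widestripe.c -llustreapi */
#include <stdio.h>
#include <lustre/lustreapi.h>

int main(void)
{
    /* Arguments: name, stripe_size (bytes), stripe_offset
     * (-1 = any starting OST), stripe_count (-1 = stripe over
     * all OSTs), stripe_pattern (0 = default RAID0). */
    int rc = llapi_file_create("/mnt/lustre/wide_file",
                               1 << 20,  /* 1 MiB stripes */
                               -1, -1, 0);
    if (rc)
        fprintf(stderr, "llapi_file_create: %d\n", rc);
    return rc ? 1 : 0;
}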
8
• CS6000 comparison:
– Same embedded servers
– Same high-availability features
– Same software stack
– Same management SW
• Differences:
– Standard 19″ rack
– Less dense storage
• Front-loaded drives
– Based on expansion units
ClusterStor 1500 – Entry-level Lustre appliance
[Rack diagram: 4U24 SSU and 2U24 CMU, each with 2x CS6000 ESMs, plus 4U24 EBOD expansion enclosures #1–#3]
9
Global Name Space / Single File System – Performance and Capacity Scalability

Building blocks:
• High Availability MDS/MGS Pair(s) – Qty 1
• High Availability OSS Pair + Capacity – up to Qty 22
• Capacity + Performance expansion units – up to Qty 66

Configuration            ~Performance (w/ IOR benchmark)   Capacity
Base                     up to 1.25 GB/sec                 42 TB to 84 TB
Base + 1 option          up to 2.50 GB/sec                 84 TB to 168 TB
Base + 2 options         up to 3.75 GB/sec                 168 TB to 256 TB
Base + 3 options         up to 5.00 GB/sec                 256 TB to 336 TB
Maximum configuration    up to 110 GB/sec                  up to 7,392 TB

CS1500 Configuration Options
Performance and Capacity Scalability
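For reference, the maximum-configuration figures follow from the per-pair numbers above: 22 OSS pairs × 5.00 GB/sec = 110 GB/sec, 22 × 336 TB = 7,392 TB, and 22 pairs × 3 expansion options = 66 Capacity + Performance units.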
11
• Performance is still important, but …
• Support for small files
• Enterprise reliability
– Simplified management
• Backup
• Desktop connectivity
• Energy-efficient storage
– Accurate measurements in real time
• Statistics and reporting
• Capacity and performance
Requirements are changing
12
• A significant part of the data consists of sub-256 KB files
• A “small file” repository should be part of the file system
• SSDs can help, but …
– SSDs are not as reliable as expected
– Lustre is part of the problem
– as are SAS/SATA/FC-AL and other protocols
– PCIe direct attach is coming
– SAS 12G is around the corner
– eMLC SSDs show great promise
• New ideas and methods are needed
– But anything new needs tight integration
Better support for Small File sizes
13
• Data integrity – silent data corruption
– T10-DIF (see the sketch after this slide)
– ZFS
• Hardware resiliency
– HA on ALL components …
– Self-healing software …
• Manageability
– Better tools
– Better reporting
– Lights-out management
• Security (Kerberos, role-based access control, etc.)
• No longer just scratch
Enterprise Reliability
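For context, T10-DIF guards against silent data corruption by appending 8 bytes of protection information to each 512-byte sector. The C sketch below shows the standard tuple layout; the struct name is illustrative.

#include <stdint.h>

/* T10-DIF: 8 bytes of protection information per 512-byte sector
 * (fields are big-endian on the wire). */
struct t10_dif_tuple {
    uint16_t guard_tag;  /* CRC-16 over the sector's data        */
    uint16_t app_tag;    /* application/owner-defined tag        */
    uint32_t ref_tag;    /* typically the low 32 bits of the LBA */
};

/* At each hop (HBA, expander, drive) the guard CRC can be
 * recomputed and the tags checked, so corruption is caught in
 * flight instead of surfacing later as silent data corruption. */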
14
• /home and /project are increasingly common on Lustre
• Backing up more than just the data is imperative
– Restore needs to be demonstrated (not just stated)
– Restore must also restore striping etc., not just file contents (see the sketch after this slide)
• HSM can (and will) help, but we need new tools
• Replication of entire file systems will be required
– … and rsync won’t cut it …
• Disaster recovery is also on the horizon
– But tens of PBs take time, even on fully saturated FDR InfiniBand
Backing up and restoring data
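As a sketch of why restore must also restore striping: a plain file copy silently falls back to the directory’s default layout. The minimal example below, assuming liblustreapi’s llapi_file_get_stripe/llapi_file_create interfaces, records a file’s stripe layout and recreates it before the data is copied back. Paths and buffer sizing are illustrative assumptions.

/* Minimal sketch: preserve Lustre striping across backup/restore.
 * Build with: gcc restore_stripe.c -llustreapi */
#include <stdio.h>
#include <stdlib.h>
#include <lustre/lustreapi.h>

int main(void)
{
    /* Buffer large enough for the layout of a widely striped file. */
    size_t sz = sizeof(struct lov_user_md) +
                LOV_MAX_STRIPE_COUNT * sizeof(struct lov_user_ost_data_v1);
    struct lov_user_md *lum = calloc(1, sz);

    /* 1. At backup time, record the source file's stripe layout. */
    if (llapi_file_get_stripe("/mnt/lustre/project/data.bin", lum)) {
        fprintf(stderr, "llapi_file_get_stripe failed\n");
        return 1;
    }

    /* 2. At restore time, recreate the file with the same stripe
     *    size and count before writing the data back. */
    if (llapi_file_create("/mnt/lustre/project/data.bin.restored",
                          lum->lmm_stripe_size, -1,
                          lum->lmm_stripe_count, 0)) {
        fprintf(stderr, "llapi_file_create failed\n");
        return 1;
    }
    free(lum);
    return 0;
}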
15
• “I want to mount my Lustre file system on my Mac/Win PC”
– A roll-your-own Samba server works, right?
– Native Win/Mac clients?
• Complex workflows require non-Linux platforms
– DCC (digital content creation)
– O&G (oil & gas)
– Genomics
– Financial
• CIFS/NFS connectivity is no longer an add-on
– Needs a supported platform
– Needs scalability
Desktop connectivity
16
• Power costs BIG money
– Storage is taking a bigger piece of the budget
• New metrics are showing up (a small worked example follows this slide)
– Watts per GB/s
– Watts per TB
• EU legislation on power-capping datacenters will make this worse as we get closer to exascale
• We now have conferences dedicated to the subject:
International Conference on Energy-Aware High Performance Computing (EnA-HPC), September 2–3, 2013
http://www.ena-hpc.org
Energy efficiency
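A small worked example of the two metrics. The rack wattage below is an illustrative assumption, NOT a measured ClusterStor number; only the 42 GB/s and 1.8 PB per-rack figures come from slide 4.

#include <stdio.h>

int main(void)
{
    double rack_watts = 20000.0;  /* ASSUMED rack power draw (W)          */
    double rack_gbps  = 42.0;     /* GB/s per rack (slide 4)              */
    double rack_tb    = 1800.0;   /* usable TB per rack (slide 4: 1.8 PB) */

    printf("Watts per GB/s: %.1f\n", rack_watts / rack_gbps);
    printf("Watts per TB:   %.2f\n", rack_watts / rack_tb);
    return 0;
}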
17
ClusterStor Manager
• Provides extensive end-to-end management of the entire Lustre™ storage cluster
• Automated installation and maintenance
– Up and running in hours vs. days/weeks
– No need to go offline for updates
– Integrated cluster and block storage management
• Health monitoring and notification
– Health status of the entire storage cluster
• Error logging at every layer (Lustre™, RAID, enclosure, …)
– E-mail alerts & single-button bug submission and support via Bugzilla (OEM and/or XRTX)
– Root-cause diagnostics and error message interpretation
• Leverages proven open-source infrastructure
– LMT – LLNL’s GUI tool for Lustre™ monitoring
– YACI – cluster provisioning
– Puppet – cluster configuration
– Ganglia/Cerebro – scalable cluster monitoring
– PowerMan – power management
– Pacemaker – failover clustering
– DRBD – configuration database failover
• Does not interfere with open-source and/or third-party tools or scripts
19
• The ClusterStor Lustre® appliance:
– An enterprise-class Lustre solution
– Factory built and tested – a turnkey solution
– Single source for product and support
– Highest-density Lustre solution on the market
– Most energy-efficient Lustre® solution*
– Moving toward even more enterprise features
Summary
Some Partners
Some Customers
*Watts per GB/s