Xyratex Update – ClusterStor · HPC Advisory Council · Dr. Torben …
TRANSCRIPT
2
So I’m walking down the street when a guy steps out of a dark alley, points a gun at me and says … “YOUR MONEY OR YOUR LIFE!”
So I showed him my Xyratex badge.
“Why’d you do that?”
Well, it proves I work for Xyratex.
“So?”
Which means I have no money and no life.
Introduction
3
• Balanced hardware – interconnect to disk
– Hardware design team with more than 20 years of experience
• Integrated software
– HAL stack for HW management/monitoring
• ONE software stack
– World-class software development team
• Intelligent integration
– Extensive testing …
It all started with a vision of good design …
4
• CS-6000 Overview
– Proven performance: >42 GB/s per rack
– Overall performance scalable to >1 TB/s bandwidth
– Overall capacity scalable to >100 PB
– ClusterStor is a complete, ready-to-run Lustre solution
– Up to 560 HDDs per rack (42RU)
– Up to 1.8 PB usable per rack (with 4 TB HDDs)
– Up to 14 embedded servers (OSSes) per rack
– Up to 14 high-bandwidth network ports per rack
• Factory Integration & Staging
– Rack integration & cabling
– Entire storage software stack factory pre-installed and pre-configured
• System burn-in and benchmark testing at the Xyratex factory
• “Rack’n’Roll” installation – hours vs. days or weeks
ClusterStor Lustre Appliance
So what’s new?
5
• Expansion Storage Unit (ESU)
– Optional ESU building block doubles the number of drives controlled by a Scalable Storage Unit (SSU)
– Higher capacity while maintaining good performance
• Hardware-assisted CRC-32
– Enables several categories of enhanced functionality: security, data integrity, efficiency and performance
– Leverages Intel® Xeon® processor instruction sets on SSU embedded server modules (see the sketch after this slide)
• New base OS – Scientific Linux 6.2
• Email alerts
• Authentication: NIS support
ClusterStor Software Update
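As a rough illustration of what “hardware-assisted CRC-32” means in practice, here is a minimal C sketch using the SSE4.2 CRC32 instruction available on the Intel® Xeon® processors in the SSU server modules. Note that the instruction implements the CRC-32C (Castagnoli) polynomial; the slide does not say which CRC-32 variant or buffer layout ClusterStor uses, so the function and variable names below are illustrative.

/* Minimal sketch: CRC-32C over a buffer using the SSE4.2 CRC32
 * instruction.  Compile with: gcc -msse4.2 crc32c.c */
#include <stdint.h>
#include <stddef.h>
#include <string.h>
#include <nmmintrin.h>   /* SSE4.2 intrinsics */

uint32_t crc32c_hw(const void *buf, size_t len, uint32_t seed)
{
    const uint8_t *p = buf;
    uint64_t crc = seed;
    uint64_t chunk;

    /* Consume 8 bytes per instruction while we can. */
    while (len >= 8) {
        memcpy(&chunk, p, 8);            /* avoid unaligned access */
        crc = _mm_crc32_u64(crc, chunk);
        p += 8;
        len -= 8;
    }
    /* Finish off any remaining tail bytes one at a time. */
    while (len--)
        crc = _mm_crc32_u8((uint32_t)crc, *p++);

    return (uint32_t)crc;
}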
6
• Backports from Lustre® 2.2 and 2.3:
– SMP scalability improvements
• Wide stripe
– Improved file system performance
– Increases the number of OSTs that a single file may span (see the sketch after this slide)
– Expands the supported number of SSUs and ESUs from 40 (160 object storage targets) to 500 (2,000 object storage targets)
• Improved HA
– Faster and more reliable failovers
– Extended component tests:
• fabric failure, SAS interruption to disks, heartbeat compromised, software interrupt
• HW failure of any kind
ClusterStor Software Update
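To make “wide stripe” concrete, below is a minimal C sketch, assuming Lustre’s liblustreapi interface, that creates a file striped across all available OSTs (a stripe count of -1). The mount path and the 1 MiB stripe size are illustrative assumptions, not values from the slides.

/* Minimal sketch: create a widely striped Lustre file.
 * Build with: gcc widestripe.c -llustreapi */
#include <stdio.h>
#include <lustre/lustreapi.h>

int main(void)
{
    /* Arguments: name, stripe_size (bytes), stripe_offset
     * (-1 = any starting OST), stripe_count (-1 = stripe over
     * all OSTs), stripe_pattern (0 = default RAID0). */
    int rc = llapi_file_create("/mnt/lustre/wide_file",
                               1 << 20,  /* 1 MiB stripes */
                               -1, -1, 0);
    if (rc)
        fprintf(stderr, "llapi_file_create: %d\n", rc);
    return rc ? 1 : 0;
}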
8
• CS6000 comparison:
– Same embedded servers
– Same high-availability features
– Same software stack
– Same management SW
• Differences:
– Standard 19″ rack
– Less dense storage
• Front-loaded drives
– Based on expansion units
ClusterStor 1500 – Entry-level Lustre appliance
[Rack diagram: 4U24 SSU and 2U24 CMU, each with 2x CS6000 ESMs, plus 4U24 EBOD expansion enclosures #1–#3]
9
Global Name Space / Single File System – Performance and Capacity Scalability

Building blocks:
• High Availability MDS/MGS Pair(s) – Qty 1
• High Availability OSS Pair + Capacity – up to Qty 22
• Capacity + Performance expansion units – up to Qty 66

Configuration            ~Performance (w/ IOR benchmark)   Capacity
Base                     up to 1.25 GB/sec                 42 TB to 84 TB
Base + 1 option          up to 2.50 GB/sec                 84 TB to 168 TB
Base + 2 options         up to 3.75 GB/sec                 168 TB to 256 TB
Base + 3 options         up to 5.00 GB/sec                 256 TB to 336 TB
Maximum configuration    up to 110 GB/sec                  up to 7,392 TB

CS1500 Configuration Options
Performance and Capacity Scalability
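For reference, the maximum-configuration figures follow from the per-pair numbers above: 22 OSS pairs × 5.00 GB/sec = 110 GB/sec, 22 × 336 TB = 7,392 TB, and 22 pairs × 3 expansion options = 66 Capacity + Performance units.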
11
• Performance is still important, but …
• Support for small files
• Enterprise reliability
– Simplified management
• Backup
• Desktop connectivity
• Energy-efficient storage
– Accurate measurements in real time
• Statistics and reporting
• Capacity and performance
Requirements are changing
12
• A significant part of the data consists of sub-256 KB files
• A “small file” repository should be part of the file system
• SSDs can help, but …
– SSDs are not as reliable as expected
– Lustre is part of the problem
– as are SAS/SATA/FC-AL and other protocols
– PCIe direct attach is coming
– SAS 12G is around the corner
– eMLC SSDs show great promise
• New ideas and methods are needed
– But anything new needs tight integration
Better support for Small File sizes
13
• Data integrity – silent data corruption
– T10-DIF (see the sketch after this slide)
– ZFS
• Hardware resiliency
– HA on ALL components …
– Self-healing software …
• Manageability
– Better tools
– Better reporting
– Lights-out management
• Security (Kerberos, role-based access control, etc.)
• No longer just scratch
Enterprise Reliability
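For context, T10-DIF guards against silent data corruption by appending 8 bytes of protection information to each 512-byte sector. The C sketch below shows the standard tuple layout; the struct name is illustrative.

#include <stdint.h>

/* T10-DIF: 8 bytes of protection information per 512-byte sector
 * (fields are big-endian on the wire). */
struct t10_dif_tuple {
    uint16_t guard_tag;  /* CRC-16 over the sector's data        */
    uint16_t app_tag;    /* application/owner-defined tag        */
    uint32_t ref_tag;    /* typically the low 32 bits of the LBA */
};

/* At each hop (HBA, expander, drive) the guard CRC can be
 * recomputed and the tags checked, so corruption is caught in
 * flight instead of surfacing later as silent data corruption. */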
14
• /home and /project are increasingly common on Lustre
• Backing up more than just the data is imperative
– Restore needs to be demonstrated (not just stated)
– Restore must also restore striping etc., not just file contents (see the sketch after this slide)
• HSM can (and will) help, but we need new tools
• Replication of entire file systems will be required
– … and rsync won’t cut it …
• Disaster recovery is also on the horizon
– But tens of PBs take time, even on fully saturated FDR InfiniBand
Backing up and restoring data
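As a sketch of why restore must also restore striping: a plain file copy silently falls back to the directory’s default layout. The minimal example below, assuming liblustreapi’s llapi_file_get_stripe/llapi_file_create interfaces, records a file’s stripe layout and recreates it before the data is copied back. Paths and buffer sizing are illustrative assumptions.

/* Minimal sketch: preserve Lustre striping across backup/restore.
 * Build with: gcc restore_stripe.c -llustreapi */
#include <stdio.h>
#include <stdlib.h>
#include <lustre/lustreapi.h>

int main(void)
{
    /* Buffer large enough for the layout of a widely striped file. */
    size_t sz = sizeof(struct lov_user_md) +
                LOV_MAX_STRIPE_COUNT * sizeof(struct lov_user_ost_data_v1);
    struct lov_user_md *lum = calloc(1, sz);

    /* 1. At backup time, record the source file's stripe layout. */
    if (llapi_file_get_stripe("/mnt/lustre/project/data.bin", lum)) {
        fprintf(stderr, "llapi_file_get_stripe failed\n");
        return 1;
    }

    /* 2. At restore time, recreate the file with the same stripe
     *    size and count before writing the data back. */
    if (llapi_file_create("/mnt/lustre/project/data.bin.restored",
                          lum->lmm_stripe_size, -1,
                          lum->lmm_stripe_count, 0)) {
        fprintf(stderr, "llapi_file_create failed\n");
        return 1;
    }
    free(lum);
    return 0;
}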
15
• “I want to mount my Lustre file system on my Mac/Win PC”
– A roll-your-own Samba server works, right?
– Native Win/Mac clients?
• Complex workflows require non-Linux platforms
– DCC (digital content creation)
– O&G (oil & gas)
– Genomics
– Financial
• CIFS/NFS connectivity is no longer an add-on
– Needs a supported platform
– Needs scalability
Desktop connectivity
16
• Power costs BIG money
– Storage is taking a bigger piece of the budget
• New metrics are showing up (a small worked example follows this slide)
– Watts per GB/s
– Watts per TB
• EU legislation on power-capping datacenters will make this worse as we get closer to exascale
• We now have conferences dedicated to the subject:
International Conference on Energy-Aware High Performance Computing (EnA-HPC), September 2–3, 2013
http://www.ena-hpc.org
Energy efficiency
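A small worked example of the two metrics. The rack wattage below is an illustrative assumption, NOT a measured ClusterStor number; only the 42 GB/s and 1.8 PB per-rack figures come from slide 4.

#include <stdio.h>

int main(void)
{
    double rack_watts = 20000.0;  /* ASSUMED rack power draw (W)          */
    double rack_gbps  = 42.0;     /* GB/s per rack (slide 4)              */
    double rack_tb    = 1800.0;   /* usable TB per rack (slide 4: 1.8 PB) */

    printf("Watts per GB/s: %.1f\n", rack_watts / rack_gbps);
    printf("Watts per TB:   %.2f\n", rack_watts / rack_tb);
    return 0;
}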
17
ClusterStor Manager
• Provides extensive end-to-end management of the entire Lustre™ storage cluster
• Automated installation and maintenance
– Up and running in hours vs. days/weeks
– No need to go offline for updates
– Integrated cluster and block storage management
• Health monitoring and notification
– Health status of the entire storage cluster
• Error logging at every layer (Lustre™, RAID, enclosure, …)
– E-mail alerts & single-button bug submission and support via Bugzilla (OEM and/or XRTX)
– Root-cause diagnostics and error message interpretation
• Leverages proven open-source infrastructure
– LMT – LLNL’s GUI tool for Lustre™ monitoring
– YACI – cluster provisioning
– Puppet – cluster configuration
– Ganglia/Cerebro – scalable cluster monitoring
– PowerMan – power management
– Pacemaker – failover clustering
– DRBD – configuration database failover
• Does not interfere with open-source and/or third-party tools or scripts
19
• The ClusterStor Lustre® appliance:
– An enterprise-class Lustre solution
– Factory built and tested – a turnkey solution
– Single source for product and support
– Highest-density Lustre solution on the market
– Most energy-efficient Lustre® solution*
– Moving toward even more enterprise features
Summary
Some Partners
Some Customers
*Watts per GB/s