dlr portal - high performance storage systemkonferenz-nz.dlr.de/pages/storage2016/present/2... ·...

HPSS High Performance Storage System An overview… Jim Gerry IBM Senior IT Architect


Page 1: DLR Portal - High Performance Storage Systemkonferenz-nz.dlr.de/pages/storage2016/present/2... · 2016. 6. 20. · Brookhaven National Lab (BNL) 75.69 104.2 Lawrence Berkeley National

HPSS: High Performance Storage System

An overview…

Jim Gerry, IBM Senior IT Architect


IBM Systems

Today’s agenda

• What is HPSS?

• The HPSS collaboration

• What is HPSS software?

• Who is using HPSS?

• Key strengths

• HPSS interfaces

• HPSS and cloud storage

• Migrating from a legacy system to HPSS


What is HPSS?

• HPSS is an IBM Software as a Service offering with a historically high rate of deployment success that maximizes the value of tape within large-scale storage deployments

• HPSS service is focused on the success of big data storage deployments with demanding requirements

o High data transfer rates

o Data integrity

o Exabytes of data within a single name space

o Petascale bulk data ingest of legacy data

• HPSS competitive deployment price is NOT based on volume

o capacity

o processors

o users

o number of files


HPSS Collaboration

• The HPSS Collaboration is a visibly trusted member of the HPC and technical computing big data deployment community

• The HPSS Collaboration has been a software-defined storage technology leader and pioneer in big data deployment solutions since 1992

• 23-year IBM and US DoE collaborative effort

o Actively solving future exascale requirements

o End user success is a collaboration priority

• Over 50 people in the HPSS development collaboration


What is HPSS software?

• Hierarchical Storage Management (HSM) software for customers demanding high performance data movement

• HPSS presents ONE exascale file system name space:

o Virtualization of storage with computers

o Active files remain on disk for low latency access

o Dormant files rest on tape for economical near-zero-watt storage

o Automated movement of files between disk and tape

o Familiar interface for storing, retrieving and sharing files

• European Centre for Medium-Range Weather Forecasts (ECMWF) sustained 211 TB/day, growing from 165 PB to 184 PB in a single instance of HPSS during 1Q 2016
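The ECMWF figures can be cross-checked with a little arithmetic. The sketch below derives the average net growth from the two capacity figures on the slide. It assumes decimal units (1 PB = 1000 TB) and a 91-day quarter; neither assumption is stated in the source.

```python
# Sanity check of the ECMWF ingest figure, using only numbers from the slide.
# Assumptions (not from the source): decimal units (1 PB = 1000 TB) and a
# 91-day quarter (1 January to 31 March 2016, a leap year).

PB_START = 165.0
PB_END = 184.0
DAYS_IN_1Q2016 = 31 + 29 + 31  # January, February (leap year), March
TB_PER_PB = 1000.0


def net_growth_tb_per_day(pb_start: float, pb_end: float, days: int) -> float:
    """Average net archive growth in TB/day over the period."""
    return (pb_end - pb_start) * TB_PER_PB / days


rate = net_growth_tb_per_day(PB_START, PB_END, DAYS_IN_1Q2016)
print(f"average net growth: {rate:.1f} TB/day")
```

Under these assumptions the result is roughly 209 TB/day, close to the quoted 211 TB/day. The small gap is plausibly due to the rounded petabyte figures.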


Largest HPSS sites world wide (1Q 2016)

Per single HPSS system and namespace:

Site                                                        Petabytes  Millions of files
European Centre for Medium-Range Weather Forecasts (ECMWF)     184.09              238.5
NOAA Research & Development Facility                            86.61               76.6
United Kingdom Meteorological Office (UKMO)                     78.40              113.0
Brookhaven National Lab (BNL)                                   75.69              104.2
Lawrence Berkeley National Lab (LBNL) – User Facility           72.85              209.1
Los Alamos National Lab (LANL) – Secure Facility                61.92              611.3
Oak Ridge National Lab (ORNL)                                   60.54               61.8
National Center for Atmospheric Research (NCAR)                 58.83              233.0
Lawrence Livermore National Lab (LLNL) – Secure Facility        56.73              852.2
French Atomic Energy Commission (CEA) – TERA                    54.05               12.0
German Climate Research Center (DKRZ)                           48.45               19.6
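The two columns above also imply a mean file size per site, which hints at how different the workloads are. The sketch below derives it for a few of the listed sites, assuming decimal units (1 PB = 1e9 MB). The derived values are not stated in the source.

```python
# Mean file size per site, derived from the petabyte and file-count columns.
# Assumption (not from the source): decimal units, 1 PB = 1e9 MB.

MB_PER_PB = 1e9

# site -> (petabytes, millions of files), copied from the slide
SITES = {
    "ECMWF": (184.09, 238.5),
    "LLNL": (56.73, 852.2),
    "CEA TERA": (54.05, 12.0),
    "DKRZ": (48.45, 19.6),
}


def mean_file_size_mb(petabytes: float, millions_of_files: float) -> float:
    """Average file size in MB for one HPSS namespace."""
    return petabytes * MB_PER_PB / (millions_of_files * 1e6)


for site, (pb, mfiles) in SITES.items():
    print(f"{site:10s} {mean_file_size_mb(pb, mfiles):8.1f} MB per file")
```

On these numbers the mean file size ranges from tens of megabytes (LLNL) to several gigabytes (CEA TERA).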


Publicly disclosed sites…


HPSS strengths – “best of breed” for tape

• Parallel movement of large files

• RAIT to reduce cost of redundant tape

o Oak Ridge National Laboratory cuts redundant tape costs by 75% with 4+P HPSS RAIT

o Enjoy file transfers reaching 872 MB/s – 87% of native transfer rate in parallel for large files

• Small file aggregation

o Météo France demonstrated tape ingest of 2.6 MB files at 219 MB/s per tape drive

• Enterprise tape Recommended Access Order (RAO) support for 30% to 60% faster tape recalls

• End-to-end data integrity with file checksums + T10 tape Logical Block Protection

• Open tape format

o The right to read and use your data without HPSS

[Figure: RAIT stripe of data, data and parity tapes; parallel I/O = FAST]

• It’s your cartridge

• It’s your data

• It’s your right

• Know your rights
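The RAIT cost claim above follows from simple arithmetic. The sketch below assumes that "redundant tape cost" means the extra tape needed beyond a single copy of the data, and that the baseline is dual-copy tape (100% extra). This interpretation and the function names are illustrative, not taken from HPSS.

```python
# Redundancy overhead of N+P RAIT compared with dual-copy tape.
# Assumption (not from the source): "redundant tape cost" is the extra tape
# needed beyond one copy of the data, and the baseline is a full second copy.

DUAL_COPY_OVERHEAD = 1.0  # a second full copy costs 100% extra tape


def rait_overhead(data_stripes: int, parity_stripes: int = 1) -> float:
    """Extra tape needed for parity, as a fraction of the data volume."""
    if data_stripes < 1 or parity_stripes < 0:
        raise ValueError("need at least one data stripe and non-negative parity")
    return parity_stripes / data_stripes


def savings_vs_dual_copy(data_stripes: int, parity_stripes: int = 1) -> float:
    """Fraction of the dual-copy redundancy cost that RAIT avoids."""
    return 1.0 - rait_overhead(data_stripes, parity_stripes) / DUAL_COPY_OVERHEAD


print(f"4+P saves {savings_vs_dual_copy(4):.0%} of the redundant tape cost")

# Aggregate native rate implied by "872 MB/s is 87% of native"
implied_native_mb_s = 872 / 0.87
print(f"implied native aggregate rate: {implied_native_mb_s:.0f} MB/s")
```

With four data stripes and one parity stripe, the parity tape is 25% of the data volume, which is 75% less than a full second copy.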


HPSS strengths

• Hardware vendor neutral

• Proven legacy system migration strategies

• Reputation and value

o Proven HPSS collaboration – longevity

o Proven HPSS community – sharing

o Proven support – responsive

o Proven delivery – unfailing

o Proven failure recovery – fast

o Proven value at scale – flat support fee

[Diagram: hardware-neutral block or filesystem disk tiers; hardware-vendor-neutral enterprise and LTO tape from IBM, Oracle and Spectra Logic; RHEL on Power or Intel]


Data integrity and redundancy

• HPSS looks for data corruption during migration

o File checksum validation verifies the data read from disk

o T10 Logical Block Protection verifies the data written to tape

o Data corruption prevents successful migration

• Only checksum-validated data makes it to HPSS tape.

• Redundant tape protects data on tape

o Dual-copy tape doubles tape costs

o 2+P RAIT cuts redundant tape costs by 50%

• Validation and re-validation of data on tape

o Cartridge validation tool looks for data corruption

o HPSS tape repack looks for data corruption

[Figure: 2+P RAIT (data, data, parity) compared with dual-copy tape (1st copy, 2nd copy)]
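The "only checksum-validated data makes it to tape" rule can be illustrated with a minimal sketch. This is not the HPSS implementation: ordinary files stand in for the disk cache and the tape, SHA-256 is an arbitrary choice of checksum, and the names `migrate` and `CorruptionError` are hypothetical. The T10 Logical Block Protection step happens between host and tape drive and is not modelled here.

```python
import hashlib
import os
import tempfile

CHUNK = 1 << 20  # read in 1 MiB chunks


class CorruptionError(Exception):
    """Raised when data read from disk does not match its stored checksum."""


def file_digest(path: str, algorithm: str = "sha256") -> str:
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(CHUNK), b""):
            h.update(chunk)
    return h.hexdigest()


def migrate(src_path: str, dst_path: str, expected_digest: str,
            algorithm: str = "sha256") -> None:
    """Copy src to dst, checksumming the bytes as they are read from disk.

    The destination only appears if the digest matches; otherwise the
    partial copy is discarded and the migration fails.
    """
    h = hashlib.new(algorithm)
    partial = dst_path + ".partial"
    with open(src_path, "rb") as src, open(partial, "wb") as dst:
        for chunk in iter(lambda: src.read(CHUNK), b""):
            h.update(chunk)
            dst.write(chunk)
    if h.hexdigest() != expected_digest:
        os.remove(partial)
        raise CorruptionError(f"checksum mismatch for {src_path}")
    os.replace(partial, dst_path)


with tempfile.TemporaryDirectory() as workdir:
    disk_file = os.path.join(workdir, "disk_cache.dat")
    tape_file = os.path.join(workdir, "tape_image.dat")
    with open(disk_file, "wb") as f:
        f.write(b"climate model output" * 1000)
    stored_digest = file_digest(disk_file)  # recorded when the file was stored

    migrate(disk_file, tape_file, stored_digest)
    ok_migrated = file_digest(tape_file) == stored_digest

    with open(disk_file, "r+b") as f:  # simulate silent corruption on disk
        f.seek(10)
        f.write(b"X")
    try:
        migrate(disk_file, tape_file + ".2", stored_digest)
        corruption_detected = False
    except CorruptionError:
        corruption_detected = True

print(ok_migrated, corruption_detected)
```

Writing to a `.partial` file and renaming only after the digest matches is one way to ensure a failed migration leaves nothing behind at the destination.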


Tape ordered recalls – offset order

• LTO tape drives support offset ordered recalls

• The serpentine nature of tape may still result in long seek times between file reads

o This example illustrates 5:16 of tape movement without tape I/O

[Figure: serpentine tape path from Beginning of Tape to End of Tape with Files 1 to 5 recalled in offset order; seek segments of 48, 72, 45, 39, 93 and 19 sec]


Tape ordered recalls – recommended access order

• Enterprise tape drives now support Recommended Access Ordering (RAO)

• Multiple tape recalls are properly ordered by the tape drive to reduce recall time

• Tests show that RAO improves multiple file recalls by 30% to 60%

o This SAME example illustrates 2:06 of tape movement without tape I/O

[Figure: the same serpentine tape path from Beginning of Tape to End of Tape with the files labeled File 2, File 4, File 1, File 5, File 3; seek segments of 13, 58, 21, 14, 12 and 8 sec]
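The totals quoted on the two recall slides can be reproduced from the per-segment seek times in the figures. The sketch below sums both sets of segments and computes the reduction in tape movement. The list names are illustrative only.

```python
# Seek segments (seconds) read off the two recall figures.
OFFSET_ORDER_SEEKS = [48, 72, 45, 39, 93, 19]
RAO_SEEKS = [13, 58, 21, 14, 12, 8]


def mmss(seconds: int) -> str:
    """Format a duration in seconds as m:ss."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}:{secs:02d}"


offset_total = sum(OFFSET_ORDER_SEEKS)
rao_total = sum(RAO_SEEKS)
reduction = 1 - rao_total / offset_total

print(f"offset order: {mmss(offset_total)} of tape movement")
print(f"RAO:          {mmss(rao_total)} of tape movement")
print(f"reduction:    {reduction:.0%}")
```

For this example the reduction in tape movement is about 60%, at the upper end of the quoted 30% to 60% range.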


HPSS is an IBM service offering

• Proven turnkey delivery of complex storage solutions

o System engineering of the storage solution to meet customer’s expectations

o Install, configure, and validate the compute, storage and interconnects

o Multiple review cycles to proactively find and resolve problems to eliminate their production impact

• Proactive and personalized support

o Proactive monitoring, planning, change management

o Personal relationship with your customer representative and project manager

o HPSS User Forum brings administrators, developers, project managers, and HPSS leaders together annually to share experiences and lessons learned.

• We do not charge customers based on the amount of data stored in HPSS

o If you have 10 PB or 1,000 PB, HPSS costs about the same


HPSS – interfaces

[Diagram: layered HPSS architecture.
Interfaces: Spectrum Scale Interface, SwiftOnHPSS, FUSE Filesystem, Parallel FTP and Client API, built on the HPSS Client API; the Client API is also available for 3rd party applications.
Core: RHEL Core Server & Mover computers on Intel or Power; massively scalable global HPSS namespace enabled by DB2; extreme-scale high-performance automated HSM across disk and tape.
Storage: hardware-neutral block or filesystem disk tiers, including Spectrum Scale; hardware-vendor-neutral enterprise and LTO tape from IBM, Oracle and Spectra Logic.]


OpenStack Swift (object storage)

[Diagram: Proxy/S3 Server; Account, Container and Object Servers]

• Leverage the virtually unlimited scalability of the cloud interface, the cloud metadata servers, and the immediate storage (Flash, SSD, HDD, and optical)

o Incremental horizontal growth of the cloud is accomplished by adding resources

o Erasure codes or multiple copies for local redundancy or redundancy spanning geography

• The number of users of a single storage repository may be global: a huge number

• Flash or HDD are good for interactive workloads and user requests, but expensive for idle data

[Diagram: Swift object storage on native Swift disk]


OpenStack Swift with SwiftOnHPSS: Available on GitHub today!

[Diagram: Swift object storage; Proxy/S3 Server; Account, Container and Object Servers]

• SwiftOnHPSS is Swift middleware that allows Swift to store containers and objects to HPSS using HPSS VFS

• Containers and objects are stored into HPSS in a non-obfuscated manner

o /account/container/object

• OpenStack Swift can enjoy the benefits of HSM coupled with the strengths of HPSS, including:

o Extreme scale HSM tiered storage

o End-to-end data integrity

o Tape ordered recall

o Small file tape aggregation

o RAIT

• https://github.com/openstack/swiftonhpss

[Diagram: Swift + SwiftOnHPSS in front of HPSS disk cache and HPSS tape]

• Excellent performance when writing all file sizes to tape

• An excellent solution for very large files

• Individual access to small and medium objects – not efficient for tape
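The non-obfuscated /account/container/object layout means an object's location in the HPSS namespace can be derived directly from its Swift name. The sketch below illustrates that mapping only. The mount point `/hpss/swift`, the function name and the validation rules are hypothetical and are not taken from SwiftOnHPSS.

```python
from pathlib import PurePosixPath

# Hypothetical root under which the HPSS namespace is visible; the slide only
# specifies the /account/container/object part of the layout.
HPSS_ROOT = PurePosixPath("/hpss/swift")


def object_path(account: str, container: str, obj: str,
                root: PurePosixPath = HPSS_ROOT) -> PurePosixPath:
    """Map a Swift (account, container, object) triple to a plain file path."""
    for label, part in (("account", account), ("container", container)):
        if not part or "/" in part or part in (".", ".."):
            raise ValueError(f"invalid {label} name: {part!r}")
    # Swift object names may contain '/', which becomes directory nesting here.
    obj_parts = obj.split("/")
    if any(p in ("", ".", "..") for p in obj_parts):
        raise ValueError(f"invalid object name: {obj!r}")
    return root.joinpath(account, container, *obj_parts)


print(object_path("AUTH_demo", "climate", "2016/run42.nc"))
```

Because the path is readable as-is, the same file can in principle be reached through the other HPSS interfaces without consulting Swift.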


OpenStack Swift with SwiftOnHPSS: Available on GitHub today!

[Diagram: Swift object storage (Proxy/S3 Server; Account, Container and Object Servers) + SwiftOnHPSS in front of HPSS disk cache and HPSS tape]

• Flash or HDD are good for interactive workloads and user requests, but tape is not

• Today, SwiftOnHPSS and other object storage solutions allow objects to be individually stored and accessed on tape

• To improve the effective use of tape, the object storage solution should perform bulk I/O of containers on tape

• Excellent performance when writing all file sizes to tape

• An excellent solution for very large files

• Individual access to small and medium objects – not efficient for tape


OpenStack Swift with HPSS SwiftCAT: Coming soon!

[Diagram: Swift object storage; Proxy/S3 Server; Account, Container and Object Servers]

• HPSS SwiftCAT is Swift middleware for managing objects by container on tape, and other economical, higher latency storage

• HPSS SwiftCAT allows the customer to supplement an existing OpenStack Swift object store with tape, leveraging the initial disk investment

• Containers of objects are moved on and off of the low latency disk to more cost effective, higher latency tape and optical

o Archive and Recall for space management

o Snapshot and Restore for disaster recovery

[Diagram: Swift + HPSS SwiftCAT; containers of objects on vendor-neutral native Swift disks are moved through object, filesystem and HPSS interfaces]
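Moving whole containers rather than individual objects turns many small tape operations into one sequential stream. The sketch below illustrates the idea by bundling a container's objects into a single tar archive and unpacking it again. It is not how HPSS SwiftCAT is implemented, which the slides do not describe. The tar format, the in-memory dictionaries and the function names are all assumptions made for illustration.

```python
import io
import tarfile
from typing import Dict


def archive_container(objects: Dict[str, bytes]) -> bytes:
    """Bundle all objects of one container into a single sequential blob."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in sorted(objects):
            data = objects[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def recall_container(blob: bytes) -> Dict[str, bytes]:
    """Unpack a blob produced by archive_container back into objects."""
    objects: Dict[str, bytes] = {}
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
        for member in tar.getmembers():
            extracted = tar.extractfile(member)
            if extracted is not None:  # regular files only
                objects[member.name] = extracted.read()
    return objects


container = {
    "run42/temperature.nc": b"\x00" * 2048,
    "run42/pressure.nc": b"\x01" * 1024,
    "README": b"small object",
}
blob = archive_container(container)      # one bulk write instead of three
restored = recall_container(blob)        # one bulk read on recall
print(len(blob), sorted(restored))
```

The trade-off is granularity: recalling a single object from such a bundle means reading the bundle, which is why this pattern suits archive and recall of whole containers rather than interactive access.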


HPSS SwiftCAT ISC16 demo

[Diagram: Swift object storage (Proxy/S3 Server; Account, Container and Object Servers) + HPSS SwiftCAT; containers of objects on vendor-neutral native Swift disks are archived to or recalled from LTFS tape through a filesystem interface]

• HPSS SwiftCAT is Swift middleware

• We are demonstrating the concept of operation of managing containers of objects via the archive/recall method


Migrate to HPSS the FAST way

• Bulk data and metadata migration, where HPSS takes ownership of legacy tape cartridges

o Oracle SAM-FS metadata migration

▪ German Weather Service (DWD)

▪ Migrating a new customer today (2 billion files)

o EMC UniTree/DXUL metadata migration

▪ Purdue University

▪ German Climate Computing Centre (DKRZ)

▪ NASA Langley Research Center (LaRC)

▪ Lawrence Livermore National Laboratory (LLNL)

▪ National Climatic Data Center (NCDC)

▪ National Center for Supercomputing Applications (NCSA)

▪ Oak Ridge National Laboratory (ORNL)

▪ San Diego Supercomputer Center (SDSC)

• Bulk metadata migration, where HPSS does NOT take ownership of legacy tape cartridges

o Users only interact with HPSS

o HPSS is fully automated to retrieve data from the legacy tape system when data are not in HPSS

o Data are migrated from legacy tape to HPSS

o IBM TSM

▪ Max Planck Computing and Data Facility (MPCDF)

o SGI DMF

▪ French National Meteorological Service (Météo-France)
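The second migration mode, where metadata moves in bulk but data stays on legacy tape until needed, behaves like a read-through cache. The sketch below models that behaviour with plain dictionaries. The class and method names are hypothetical and nothing here reflects actual HPSS, TSM or DMF interfaces.

```python
from typing import Dict, Iterable, List


class ReadThroughArchive:
    """Toy model: namespace migrated up front, data fetched on first access."""

    def __init__(self, legacy_store: Dict[str, bytes]) -> None:
        self._legacy = legacy_store
        # Bulk metadata migration: every legacy path is known immediately.
        self.namespace = set(legacy_store)
        self._data: Dict[str, bytes] = {}
        self.legacy_fetches: List[str] = []

    def read(self, path: str) -> bytes:
        """Users only call this; the legacy system is never visible to them."""
        if path not in self.namespace:
            raise FileNotFoundError(path)
        if path not in self._data:
            self._migrate_one(path)
        return self._data[path]

    def migrate(self, paths: Iterable[str]) -> None:
        """Background migration of data that has not been requested yet."""
        for path in paths:
            if path not in self._data:
                self._migrate_one(path)

    def _migrate_one(self, path: str) -> None:
        self.legacy_fetches.append(path)
        self._data[path] = self._legacy[path]

    def fully_migrated(self) -> bool:
        return set(self._data) == self.namespace


legacy = {"/archive/a.dat": b"alpha", "/archive/b.dat": b"beta"}
archive = ReadThroughArchive(legacy)

first = archive.read("/archive/a.dat")    # fetched from the legacy system
second = archive.read("/archive/a.dat")   # served from the new system
archive.migrate(sorted(archive.namespace))
print(archive.legacy_fetches, archive.fully_migrated())
```

Each file is fetched from the legacy system at most once, whether the trigger is a user read or the background migration.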


Thank you! Any questions?

Jim Gerry

Senior IT Architect and Consultant

IBM Global Business Services

(720) 430-0017 (o), (713) 256-8516 (c)

[email protected]

http://www.hpss-collaboration.org