intel storage software - opensfscdn.opensfs.org/wp-content/uploads/2017/06/wed01-neitzel... ·...


Page 1: INTEL STORAGE software - OpenSFScdn.opensfs.org/wp-content/uploads/2017/06/Wed01-Neitzel... · 2017. 6. 16. · • Involvement in Lustrecommunity eventsand groups like OpenSFS, EOFS,

INTEL STORAGE software

Bryon Neitzel
Director, High Performance Data Division


Intel High Performance Data Division

• Mission:
  • Develop high-performance IO software solutions for the world's most challenging data movement and storage problems
• Scope:
  • Lustre* Feature and Maintenance Releases, Lustre L3 Support
  • Complete IO stack for large-scale deployments
  • Future Storage Software Technology (DAOS)


Lustre Business Model Changes

How we deliver products to the marketplace has changed:
• All Intel contributions go directly to Open Source projects
  • Moving away from Intel-branded releases
  • All formerly proprietary components from Intel-branded releases have been open sourced (HAL, HAM, IML)
  • https://github.com/intel-hpdd/
• Consolidating Sales Functions with other Intel organizations
  • Focus on Level 3 support for future customers
  • Continued support for existing customers
• Enhanced testing and stability for Lustre Community Edition
  • One release means more focus on LTS Lustre stabilization and hardening, plus free maintenance releases
  • (Community, Foundation, Enterprise) -> Community


Lustre Business Model

What we build did not change:
• Ongoing delivery of feature releases
• Support for existing and ongoing installations of Intel-branded releases
• Lustre development, test, release, support, and R&D teams
• Intel-funded hardware for development, build, and testing of Lustre
• Involvement in large-scale machine deployments
• Involvement in Lustre community events and groups like OpenSFS, EOFS, LUG, LAD, etc.


Community Release Roadmap

Timeline shown: 2017 Q1 to Q4, 2018 Q1 to Q4

• 2.9: UID/GID Mapping, Shared Key Crypto, Large Block IO, Subdirectory Mounts
• 2.10*: ZFS Snapshots, Multi-rail LNET, Progressive File Layouts, Project Quotas
• 2.11: Data on MDT, FLR – Delayed Resync, Lock Ahead
• 2.12: FLR – Immediate Resync, LNET Network Health

*LTS Release with maintenance releases provided

Estimates are not commitments and are provided for informational purposes only.

Fuller details of features in development are available at http://wiki.lustre.org/Projects

Last updated: April 20th 2017


Improvements in Lustre Performance - Today

[Diagram: Lustre deployment. Components shown: Lustre Clients (~50,000); High Performance Data Network (OmniPath, FDR IB, 40GbE); Management Network; IML; Metadata Servers (~10's) with Metadata Targets (MDTs) and Management Target (MGT); Object Storage Servers (~1000's) with Object Storage Targets (OSTs).]

Performance figures shown on the diagram:
• 8GB/s
• ZFS HDD: 10k create/s/MDS
• ZFS HDD: 5-8GB/s/OSS
• SSD: 50k 4-32KB IOPS/OSS


Improvements in Lustre Performance – 2.12 targets

[Diagram: Lustre deployment. Components shown: Lustre Clients (~50,000); High Performance Data Network (OmniPath, EDR IB, 40GbE); Management Network; IML; Metadata Servers (~10's) with Metadata Targets (MDTs) and Management Target (MGT); Object Storage Servers (~1000's) with Object Storage Targets (OSTs).]

Performance targets shown on the diagram:
• 20GB/s+ with Multi-Rail
• ZFS SSD: 150k create/s/MDS
• SSD: 100k 4-32KB small file IOPS/MDS
• ZFS SSD: 18GB/s/OSS
• SSD: 150k 4-32KB IOPS/OSS
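
As a rough illustration, the per-server figures on the "Today" and "2.12 targets" diagrams can be turned into improvement factors. The sketch below uses only numbers shown on the two slides; the dictionary keys and the `improvement` helper are names made up for this example, and pairing 8GB/s with 20GB/s+ as data-network figures is an assumption based on where the labels sit on the diagrams.

```python
# Figures copied from the "Today" and "2.12 targets" slides as (today, target).
# Keys are illustrative labels chosen for this sketch, not Lustre terminology.
FIGURES = {
    "file creates/s per MDS": (10_000, 150_000),
    "4-32KB IOPS per OSS": (50_000, 150_000),
    "GB/s per OSS (today: low end of 5-8)": (5, 18),
    "GB/s per OSS (today: high end of 5-8)": (8, 18),
    # Assumption: both labels describe the data network; "20GB/s+" is a floor.
    "GB/s data network": (8, 20),
}


def improvement(today, target):
    """Return target/today as a multiplicative factor."""
    return target / today


if __name__ == "__main__":
    for name, (today, target) in FIGURES.items():
        factor = improvement(today, target)
        print(f"{name}: {today} -> {target} ({factor:.2f}x)")
```

On these numbers the largest jump is the metadata create rate, at 15x per MDS.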


Intel Focus on Scalability and Performance
DNE Phase I - File Create: ZFS 0.6.5.7-1 vs. 0.7.0-rc1

$MPICMD ./mdtest -i 3 -I 10000 -F -C -T -r -u -d /mnt_point/@/mnt/point2/@etc.
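
A small sketch of how the command line above is put together, for anyone reproducing the run. The flag meanings in the comments are taken from mdtest's usage text and should be checked against `mdtest -h` for the version in use. The mount points in the usage example are hypothetical, `build_mdtest_command` is a helper name invented here, and the launcher behind `$MPICMD` is not shown on the slide, so it is left out.

```python
import shlex

# Flags and values as they appear on the slide; comments paraphrase mdtest's help.
MDTEST_FLAGS = [
    ("-i", "3"),      # number of iterations
    ("-I", "10000"),  # number of items per directory
    ("-F", None),     # operate on files only
    ("-C", None),     # run the create phase
    ("-T", None),     # run the stat phase
    ("-r", None),     # run the remove phase
    ("-u", None),     # unique working directory per task
]


def build_mdtest_command(mount_points, mdtest="./mdtest"):
    """Return the mdtest argv, joining the test directories with '@'."""
    if not mount_points:
        raise ValueError("at least one mount point is required")
    argv = [mdtest]
    for flag, value in MDTEST_FLAGS:
        argv.append(flag)
        if value is not None:
            argv.append(value)
    # One directory per MDT, separated by '@' as in the slide's -d argument.
    argv += ["-d", "@".join(mount_points)]
    return argv


if __name__ == "__main__":
    # Hypothetical mount points; the slide elides the real ones.
    print(shlex.join(build_mdtest_command(["/mnt/example_mdt0", "/mnt/example_mdt1"])))
```

Passing one directory per MDT is what lets the same command scale from 1 to 8 servers along the chart's x-axis.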

[Chart: file create rate in Ops/s (y-axis, 0 to 140000) against Number of Servers / MDT's (x-axis, 1 to 8), with two series: 0.6.5.7-1 File Create and 0.7.0-rc1 File Create. Data labels shown on the chart: 5787.325, 37540.875, 15459.028, 118426.283.]

* Intel measured or estimated as of September 2016. Please see configuration details at end of deck

Source: https://www.eofs.eu/_media/events/lad16/17_dne_analysis_roe_2_.pdf


IML: Community-based Lustre Manager
Management and Monitoring Tool

Intuitive, browser-based administration

Lustre installation and configuration

Real-time system monitoring

Extensible through open, documented APIs

[Screenshots: OST Capacity, Metadata Operations, Read/Write Bandwidth, Read/Write Heat Map]

* Other names and brands may be claimed as the property of others.


IML: Community-based Lustre Manager
Details of Open Source Project

IML is now available under the MIT license at https://github.com/intel-hpdd/

• IML is a monorepo with a series of collaborator repos
• Each has a CI mechanism running tests over changes: https://travis-ci.org/intel-hpdd/
• Providing a convenient way to demo the tool and test proposed changes using Vagrant
  • https://atlas.hashicorp.com/boxes/search?utf8=%E2%9C%93&sort=&provider=&q=manager-for-lustre
• Following the typical GitHub workflow; issues and pull requests can be opened against specific repos. Use these to communicate with, or propose changes to, the IML team
  • Examples: https://github.com/intel-hpdd/intel-manager-for-lustre/issues, https://github.com/intel-hpdd/intel-manager-for-lustre/pulls

Developing open-source roadmap; input from community greatly appreciated

IML 4.0 will be compatible with Lustre 2.10.x LTS releases; targeted for early Q3 release


HPDD Storage Software

[Diagram: IO stack spanning a Compute Node, a Burst Buffer / IO Node, and a Lustre OSS. Component labels: Application, I/O Library*, Function Shipping Client, Function Shipping Server, Burst Buffer Service, Mercury, CPPR, Lustre Client, Lustre Server, ZFS, NW driver, NW HCA, HCA, PCIe SSD, SAS HDD, 3D XPoint.]

HPDD Aurora I/O Programs    Open Source Landing Zones
ZFS                         OpenZFS, ZFSonLinux
Lustre                      Lustre
IO Forwarding               Mercury, OpenHPC
CPPR                        OpenHPC
I/O Lib Tuning              HDF5, Various MPI


The Future is both Evolutionary & Revolutionary

Lustre evolving in response to:
• A growing customer base
• Evolving use cases
• Emerging HW capabilities

DAOS exploring new territory:
• What may lie beyond POSIX
• Use new HW capabilities as storage
• Object storage model exposes new capability for scalable consistency

[Diagram labels: IML, DAOS Tier, Pool, Pool Container, Object, DKey, Akey[i]]
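
The labels on the DAOS diagram (Pool, Container, Object, DKey, Akey) suggest a nested key-value hierarchy. The toy in-memory model below is only a way to picture that nesting. It is not the DAOS API, every class and method name is invented for illustration, and it says nothing about how DAOS distributes, replicates, or versions data.

```python
class ToyContainer:
    """Objects addressed by id, each holding dkey -> akey -> value."""

    def __init__(self):
        self._objects = {}

    def put(self, oid, dkey, akey, value):
        # Create the intermediate levels on first use.
        self._objects.setdefault(oid, {}).setdefault(dkey, {})[akey] = value

    def get(self, oid, dkey, akey):
        # Raises KeyError if any level of the path is missing.
        return self._objects[oid][dkey][akey]

    def akeys(self, oid, dkey):
        """List the attribute keys stored under one distribution key."""
        return sorted(self._objects[oid][dkey])


class ToyPool:
    """A pool is modelled here as nothing more than a set of named containers."""

    def __init__(self):
        self._containers = {}

    def container(self, name):
        return self._containers.setdefault(name, ToyContainer())


if __name__ == "__main__":
    pool = ToyPool()
    ckpt = pool.container("checkpoints")  # hypothetical container name
    ckpt.put(oid=1, dkey="step-100", akey="pressure", value=b"\x00\x01")
    ckpt.put(oid=1, dkey="step-100", akey="temperature", value=b"\x02\x03")
    print(ckpt.akeys(1, "step-100"))
```

The point of the nesting is that a single object can hold many independently addressable records, rather than one opaque byte stream as in a POSIX file.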


Extreme Scale Storage IO (ESSIO)

Joint Project with HDF Group to explore:
§ New architectural directions
  – Massively distributed storage
  – Hot tier close to compute nodes
§ Future programming models, runtimes and workflows
  – Legion
  – Asynchronous producer/consumer
§ Analytics
  – Capture & index metadata
  – Help to derive value from data being produced as volumes explode

Pre-Productization version capability
§ Data Model and KV interface
§ Data replication / Online Rebuild
§ Large & Small record support
§ Metadata replication
§ Snapshots (aka Epochs)
§ Libfabric support
§ HDF5 Support

[Diagram: layered stack of Application, I/O Middleware, Storage Backend]


Summary

• Mission:
  • Develop a rich portfolio of high-performance storage products to solve the world's most challenging data storage and IO problems
• Scope:
  • Lustre is the future of scalable POSIX storage
    • Advancing the Roadmap, Feature and Maintenance Releases, Commercial Level 3 Support
  • DAOS is the future of scalable Object storage
    • Complete IO stack for pre-Exascale Deployments, Rich High-Level Object Interfaces
  • Next Generation Storage R&D Projects (including both Lustre and DAOS)
• See Peter, Micah or Bryon during the break to ask questions – Thanks!


Notices and Disclaimers

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.

This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps.

The products and services described may contain defects or errors known as errata which may cause deviations from published specifications. Current characterized errata are available on request.

Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit www.intel.com/benchmarks

© Intel Corporation. Intel, Intel Inside, the Intel logo, Xeon, Intel Xeon Phi, Intel Xeon Phi logos and Xeon logos are trademarks of Intel Corporation or its subsidiaries in the United States and/or other countries.


Testbed Architecture

[Diagram: nodes Bw-1-00, Bw-1-01, ... Bw-1-15 and an Intel® HPC Orchestrator Headnode, together with Lustre servers GenericLustre1, GenericLustre2, ... GenericLustre10, attached to Intel® Omni-Path Switching (100Gbps) and 10G Base-T Ethernet Switching. Legend: Intel® Omni-Path Cabling; 10G Base-T RJ45.]


Testbed Architecture (Cont.)

Server
§ 10x Generic Lustre servers with two slightly different configurations
  – Each system comprises:
    – 2x Intel® Xeon E5-2697v3 (Haswell) CPUs
    – 1x Intel® Omni-Path x16 HFI
    – 128GB DDR4 2133MHz Memory
  – Eight of the nodes contain 4x Intel® P3600 2.0TB 2.5” (U.2) NVMe devices, while the other two have 4x Intel® P3700 800GB 2.5” (U.2) NVMe devices
  – One node equipped with 2x Intel® S3700 400GB drives for MGT
§ 16x 2S Intel® Xeon E5v4 (Broadwell) Compute nodes
  – 1x Intel® HPC Orchestrator (Beta 2) Headnode
  – Hardware Components:
    – 2x Intel® Xeon E5-2697v4 (Broadwell) CPUs
    – 1x Intel® Omni-Path x16 HFI
    – 128GB DDR4 2400MHz Memory
    – Local boot SSD
§ 100Gbps Intel® Omni-Path Fabric
  – Non-blocking fabric with single switch design
  – Server side optimisations: “options hfi1 sge_copy_mode=2 krcvqs=4 wss_threshold=70”
    – Improves generic RDMA performance on the Lustre server side; generally you can be more aggressive with krcvqs on the server side
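
The quoted string is in modprobe options syntax. As a sketch, the snippet below renders and parses such a line. The parameter values are the ones on the slide, while the helper names are invented here and the `/etc/modprobe.d/hfi1.conf` location mentioned in the comment is a common convention rather than something the slide states.

```python
# Server-side hfi1 module parameters exactly as quoted on the slide.
HFI1_SERVER_PARAMS = {"sge_copy_mode": 2, "krcvqs": 4, "wss_threshold": 70}


def render_options_line(module, params):
    """Format a modprobe 'options' line from a name -> value mapping."""
    settings = " ".join(f"{name}={value}" for name, value in params.items())
    return f"options {module} {settings}"


def parse_options_line(line):
    """Split an 'options' line into (module, {name: value-as-string})."""
    keyword, module, *settings = line.split()
    if keyword != "options":
        raise ValueError(f"not an options line: {line!r}")
    params = dict(setting.split("=", 1) for setting in settings)
    return module, params


if __name__ == "__main__":
    line = render_options_line("hfi1", HFI1_SERVER_PARAMS)
    # Assumption: a line like this would typically live in a file such as
    # /etc/modprobe.d/hfi1.conf and take effect when the module is reloaded.
    print(line)
    print(parse_options_line(line))
```

Keeping the parameters in a mapping makes it easy to hold a separate, less aggressive krcvqs value for clients, in line with the slide's remark about the server side.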
