
Lenovo Distributed Storage Solution for Ceph

2016 Lenovo Unclassified. All rights reserved.

Lenovo Solutions Development

November 2016, Salt Lake City


Agenda

• Enterprise Storage – current state and future

• What is Ceph?

• Lenovo Portfolio for SAP HANA

• Lenovo Value Proposition

• Lenovo Architectures

• Summary


Welcome to the age of endless data growth

• 2016: ~59 Exabytes of external storage sales (IDC forecast)

– 22 EB for FibreChannel

– 18 EB for Network Attached (NAS filer)

– 9 EB for iSCSI

– 8 EB for Direct Attached

… sold by the storage companies we all know

• Data created every day: ~2.5 Exabyte


The commoditization of Enterprise storage


[Charts: external storage systems; overall storage market. Source: IDC]

Traditional storage vendors are losing market share to ODM Direct vendors (ODM = Original Design Manufacturer).

IDC March press release, covering FY2015:

"The enterprise storage market closed out 2015 on a slight downturn, as spending on traditional external arrays continues to decline," said Liz Conner, Research Manager, Storage Systems. "Over the past year, end user focus has shifted towards server-based storage, software-defined storage, and cloud-based storage. As a result, traditional enterprise storage vendors are forced to revamp and update their product portfolios to meet these shifting demands."


Storage Types

[Figure: three storage types compared]

• Files (NAS): a directory tree under a root dir (e.g. /a/b, /c, /d/e, /f/g); each file is a sequence of bytes (byte 0 … byte n); read / write a byte range

• Blocks (SAN): volumes (volume 1 … volume m), each a sequence of blocks (block 0 … block n; block = 4K bytes); read / write a block range

• Objects: objects (object 1 … object m), each with metadata attributes (attr-1=val-1, attr-2=val-2, …); read / write entire objects


Evolution of Storage Topologies

[Figure: three storage topologies]

• Internal (DAS): application on storage-rich server. Hard to provision right amount of storage; doesn't scale

• Shared, Networked (some is SDS): application servers connected via Ethernet / FibreChannel to networked storage appliances

• Hyper-converged (mostly SDS, virtualized): combined application/storage servers connected via Ethernet


Survey: Legacy vs Emerging Providers


Source: Tintri State of Storage Survey

https://www.tintri.com/news/tintri-state-storage-survey-reveals-biggest-pain-points-finds-buying-criteria-mismatched-today


Object-based storage is the future for Exascale storage:

"The future of storage is software based."

"FOBS [file- and object-based storage] solutions are much more versatile and will quickly outpace more rigid, hardware-based options."

"Scale-up solutions, including unitary file servers and scale-up appliances and gateways, will fall on hard times throughout the forecast period […] and will experience only sluggish growth through 2016 before beginning to decline in 2017."

Ashish Nadkarni, IDC Storage Systems Research Director (2013)


SDS market outlook

Source: IT Brand Pulse


Market share trends per storage type

[Chart: market share trends per storage type. Axis labels: 2014, 2016, 2026. Media: HDD, flash, NVRAM. Categories: file-based storage (including SDS), networked, server-attached. Annotations: "Growth driven by automatically generated, unstructured data"; "Growth driven by virtualization, SDS, cloud storage". Examples named: Hadoop, Ceph, Nutanix; EMC, HP, NetApp.]


New storage technologies IT professionals expect to evaluate or deploy in 2015

Source: IT Brand Pulse


Why software-defined storage?

• Cost savings & flexibility:

– Avoid large markup by storage vendor on hardware

– Share hardware resources between storage and application; increases utilization; get more work out of less hardware

– More customer flexibility in choosing the best hardware for their needs

• Disadvantages, when done by yourself:

– Customer is responsible for selecting and installing hardware; may not provision adequately for the needs of the software → Lenovo Solution Architecture & Support

– Customer is responsible for debugging problems and then working with the server, storage, OS, or networking vendor → single point of contact via SAP OSS ticketing system

– Storage vendor has to be prepared to support their software running on almost any reasonable hardware → SUSE and Lenovo have defined a portfolio using only a few building blocks


What is Ceph?


Looking under the hood of Ceph

RADOS: a software-based, reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage servers, specifically tuned to run SAP HANA workload

Hardware: Server 1, Server 2, Server 3, … Server X

Software interfaces towards clients:

– rbd (RADOS block device)

– cephfs (distributed POSIX file system)

– API

– REST gateway to object store (S3/Swift compatible)


The heart of Ceph: RADOS

RADOS elements:

• OSDs

– Each corresponds to one storage device in Linux (JBOD, RAID (HDD or SSD), NVMe)

– Store objects physically (see next slide)

– Act as fully autonomous devices to provide linear scalability and no SPOF

• Monitors

– Manage cluster membership and cluster state

– Create a quorum of cluster nodes


OSD = a single Object Storage Device, fed into RADOS


The heart of Ceph: RADOS (cont)

• RADOS provides a hash-based placement algorithm (called CRUSH)

– Foundation for linear scalability (each client can compute placement independently)

– Direct client to server data path

– Distributes data pseudo-randomly among OSDs within the cluster

– Allows data placement rules and constraints

• Stores content of rbd images in 4 MB flat files

• Work is spread across all spindles in a cluster (much better utilization than traditional RAID arrays on Enterprise storage systems)

• Server or disk failure not fatal – one or more remaining OSDs still have the data
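The placement idea above can be illustrated with a small sketch. It is not CRUSH itself: it uses rendezvous (highest-random-weight) hashing as a simplified stand-in, and the OSD names, image name, and object naming scheme are made up for illustration. It shows how a byte offset in an rbd image maps to a 4 MB object, and how any client can compute the same set of OSDs for that object without consulting a central lookup service.

```python
import hashlib

OBJECT_SIZE = 4 * 1024 * 1024  # 4 MB objects, as stated on the slide


def object_name(image: str, offset: int) -> str:
    """Name of the object holding the given byte offset (hypothetical naming scheme)."""
    return f"{image}.{offset // OBJECT_SIZE:016x}"


def place(obj: str, osds: list[str], replicas: int = 3) -> list[str]:
    """Pick `replicas` OSDs for an object by rendezvous hashing (stand-in for CRUSH)."""
    def score(osd: str) -> int:
        digest = hashlib.sha256(f"{obj}/{osd}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    # Highest score first; the first entry plays the role of the primary.
    return sorted(osds, key=score, reverse=True)[:replicas]


osds = [f"osd.{i}" for i in range(12)]            # hypothetical cluster of 12 OSDs
name = object_name("hana-data", 9 * 1024 * 1024)  # offset 9 MB falls into object index 2
targets = place(name, osds)
print(name, "->", targets)
```

Because every client evaluates the same function over the same OSD list, no lookup table sits on the data path. Real CRUSH additionally honors placement rules and constraints, for example keeping replicas on different servers.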


Ceph Data Placement

• Default configuration for redundancy: size=3

• Copies are taken from the primary replica

• Data location determined by CRUSH map
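A toy model of the write path implied above, assuming primary-copy replication with size=3: the client sends the write to the primary OSD only, and the primary forwards it to the other replicas before the write is acknowledged. The class and function names are hypothetical; this is a sketch, not Ceph code.

```python
class ToyOSD:
    """In-memory stand-in for one OSD (hypothetical, for illustration only)."""

    def __init__(self, name: str):
        self.name = name
        self.objects: dict[str, bytes] = {}

    def store(self, obj: str, data: bytes) -> None:
        self.objects[obj] = data


def replicated_write(acting_set: list[ToyOSD], obj: str, data: bytes) -> int:
    """Write via the primary, which then copies the object to the other replicas.

    Returns the number of copies that exist once the write is acknowledged.
    """
    primary, *secondaries = acting_set
    primary.store(obj, data)              # client talks to the primary only
    for osd in secondaries:               # primary fans the write out
        osd.store(obj, primary.objects[obj])
    return sum(obj in osd.objects for osd in acting_set)


acting_set = [ToyOSD("osd.3"), ToyOSD("osd.7"), ToyOSD("osd.9")]  # size=3
copies = replicated_write(acting_set, "obj-1", b"payload")
print(copies)  # 3
```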


Use cases (1/4) – Storage Snapshotting

• A snapshot is a read-only copy of the rbd image state at a particular point in time

• Snapshotting is built in; no extra license fee, unlike with Enterprise Storage systems

• Revert-to-snapshot support

• Usage:

– Backup

– Fetch test data

– Save state before a HANA table modification

– … very popular

[Figure: rbd with XFS]

Snapshot is triggered in SDS software, and executed by all OSDs in parallel.
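To make the snapshot semantics above concrete, here is a minimal sketch. It assumes an image is a mapping from object index to data and that a snapshot freezes that mapping; the class and method names are invented for illustration and do not mirror the rbd API.

```python
from typing import Optional


class ToyImage:
    """Toy block image with read-only snapshots and revert (illustrative only)."""

    def __init__(self):
        self.head: dict[int, bytes] = {}               # live, writable state
        self.snaps: dict[str, dict[int, bytes]] = {}   # frozen states by name

    def write(self, index: int, data: bytes) -> None:
        self.head[index] = data

    def snapshot(self, name: str) -> None:
        # Copy only the index; the bytes objects are immutable and stay shared,
        # so unchanged data is not duplicated.
        self.snaps[name] = dict(self.head)

    def read(self, index: int, snap: Optional[str] = None) -> Optional[bytes]:
        source = self.head if snap is None else self.snaps[snap]
        return source.get(index)

    def revert(self, name: str) -> None:
        self.head = dict(self.snaps[name])


img = ToyImage()
img.write(0, b"table v1")
img.snapshot("before-modification")
img.write(0, b"table v2")
print(img.read(0), img.read(0, snap="before-modification"))
img.revert("before-modification")
print(img.read(0))
```

The snapshot keeps returning the old content after the live image changes, and reverting restores the live image to that state, which is the "save state before a HANA table modification" use case.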


Use cases (2/4) – Storage mirroring

• For synchronous operation, just increase the number of data copies

• For asynchronous operation, there is a mirroring capability at rbd image level

[Figure: local site replicating to remote site]

Replication on pool level, affecting all rbd images in this pool. Primary site uses RBD journaling image feature to ensure crash-consistency. Remote site pulls journal from time to time and applies it locally.
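The asynchronous scheme described above can be sketched as a journal that the primary appends to and the remote site replays. This toy model assumes an in-memory journal with sequence numbers; all names are hypothetical, and it does not use the rbd-mirror daemon or its API.

```python
class PrimaryImage:
    """Local-site image that records every write in a journal before applying it."""

    def __init__(self):
        self.data: dict[int, bytes] = {}
        self.journal: list[tuple[int, int, bytes]] = []  # (sequence, offset, data)

    def write(self, offset: int, data: bytes) -> None:
        self.journal.append((len(self.journal), offset, data))
        self.data[offset] = data


class RemoteImage:
    """Remote-site copy that periodically pulls and replays new journal entries."""

    def __init__(self):
        self.data: dict[int, bytes] = {}
        self.applied = 0  # number of journal entries already replayed

    def pull(self, primary: PrimaryImage) -> int:
        new_entries = primary.journal[self.applied:]
        for _seq, offset, data in new_entries:   # replay in original write order
            self.data[offset] = data
        self.applied += len(new_entries)
        return len(new_entries)


local, remote = PrimaryImage(), RemoteImage()
local.write(0, b"log-1")
local.write(4096, b"log-2")
lagging = remote.data != local.data      # remote has not pulled yet
replayed = remote.pull(local)            # remote catches up
print(lagging, replayed, remote.data == local.data)
```

Replaying entries in write order is what keeps the remote copy crash-consistent: at any moment it reflects some earlier point in time of the primary.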


Use cases (3/4) – Security features

Data encryption at OSD level

• Security comes at no extra license fee; it is built in

– Encryption of data at rest

– Checksumming of data at rest

Checksumming & scrubbing (data copies match, etc.)

“[In addition to making multiple copies of objects,]… Ceph insures data integrity by scrubbing placement groups. Ceph scrubbing is analogous to fsck on the object storage layer. Ceph generates a catalog of all objects and compares each primary object and its replicas to ensure that no objects are missing or mismatched.”
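The comparison step in the quote can be sketched as follows, assuming each replica is represented as a mapping from object name to content and that a SHA-256 digest is used for the comparison (an assumption of this sketch, not a statement about Ceph's internal checksums). Function and variable names are illustrative.

```python
import hashlib


def catalog(replica: dict[str, bytes]) -> dict[str, str]:
    """Build a catalog of object name -> content digest for one replica."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in replica.items()}


def scrub(primary: dict[str, bytes], replicas: list[dict[str, bytes]]) -> list[str]:
    """Compare each replica against the primary and report inconsistencies."""
    reference = catalog(primary)
    problems = []
    for i, replica in enumerate(replicas, start=1):
        seen = catalog(replica)
        for name, digest in reference.items():
            if name not in seen:
                problems.append(f"replica {i}: {name} missing")
            elif seen[name] != digest:
                problems.append(f"replica {i}: {name} mismatched")
    return problems


primary = {"obj-a": b"alpha", "obj-b": b"beta"}
replica_1 = {"obj-a": b"alpha", "obj-b": b"beta"}
replica_2 = {"obj-a": b"alpha", "obj-b": b"bet4"}   # silently corrupted copy
replica_3 = {"obj-a": b"alpha"}                     # lost object
print(scrub(primary, [replica_1, replica_2, replica_3]))
```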


Use cases (4/4) – Efficient redundancy for scale-out env.

• Traditional RAID gets very inefficient when scaling big

– Lots of capacity is lost to overhead

– RAID over dozens of disks not recommended

– During rebuild, only a fraction of overall spindles is involved

– Rebuild puts heavy load on the storage subsystem (guess at what point HDDs fail …)

• New method: erasure-coding

– Each object is stored as K+M chunks: K data chunks plus M coding chunks

– Can sustain the loss of M chunks

– Example: K+M = 4+2 can sustain the loss of two devices
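A minimal illustration of the K+M idea, restricted to the simplest case M=1 with a single XOR parity chunk. This is a deliberately simplified stand-in: codes that tolerate M>1 losses (such as the 4+2 example) need a more general scheme like Reed-Solomon, which is not shown here. Function names are invented for this sketch.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally long chunks."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal data chunks and append one XOR parity chunk (M=1)."""
    size = -(-len(data) // k)                      # ceiling division
    padded = data.ljust(size * k, b"\0")
    chunks = [padded[i * size:(i + 1) * size] for i in range(k)]
    return chunks + [reduce(xor_bytes, chunks)]


def recover(chunks: list, lost: int) -> bytes:
    """Rebuild the chunk at index `lost` by XOR-ing all surviving chunks."""
    survivors = [c for i, c in enumerate(chunks) if i != lost]
    return reduce(xor_bytes, survivors)


chunks = encode(b"SAP HANA data volume", k=4)      # 4 data chunks + 1 parity chunk
damaged = list(chunks)
damaged[2] = None                                  # simulate one failed device
assert recover(damaged, lost=2) == chunks[2]
print(f"raw/usable overhead: {len(chunks) / 4:.2f}x")  # 1.25x for 4+1
```

For comparison, 4+2 stores 6 chunks for every 4 chunks of data (1.5x raw capacity), while three-way replication needs 3x.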


Lenovo Portfolio for SAP HANA

Going into the details …


Lenovo Storage Solution for SAP HANA – Portfolio

• High Availability thru redundant arrays

• Scalability thru additional disks or arrays beyond the defined node limitations

• Unlimited growth thru scale out and replication

Each SES array has a minimum of three x3650 M5 servers, which is the required SUSE minimum for a SES cluster. All solutions available today under CA with Lenovo development support.

Lenovo Storage Solution H (HDD)

– Server: x3650 M5 (8871-AC1)

– Up to 4 nodes per array

– CPU: 2x E5-2630v4 (min.)

– Memory: 256 GB DDR4

– Drives: 12x 1.2TB HDD 2.5" with FlashCache (6x 400GB SSD)

– XFS on SUSE Enterprise Storage

– Capacity: 10.8 TB data/log

– Network: 4x 40GbE, 4x 1GigE

Lenovo Storage Solution F (Flash)

– Server: x3650 M5 (8871-AC1)

– Up to 12 nodes per array

– CPU: 2x E5-2690v4

– Memory: 256 GB DDR4

– Drives: 5x 3.84TB SSD 2.5" (max. 24)

– XFS on SUSE Enterprise Storage

– Capacity: max. 92.1 TB data/log

– Network: 4x 40GbE, 4x 1GigE

Lenovo Storage Solution C (Capacity)

– Server: x3650 M5 (8871-AC1)

– Up to 16 nodes per array

– CPU: 2x E5-2690v4

– Memory: 256 GB DDR4

– Drives: 12x (max. 36x) 2-10TB HDD 3.5" with FlashCache (6x 400GB SSD, max. 24)

– XFS on SUSE Enterprise Storage

– Capacity: max. 288 TB data/log

– Network: 4x 40GbE, 4x 1GigE

Internal upgradability: +6 SSD (8 nodes)


Certified SAP HANA Hardware Directory


High-level view

Server with local storage devices (HDD, SSD, NVMe, ...) plus software to turn it into SDS


Lenovo Storage Solution for SAP HANA

CephFS


Lenovo Value Proposition


Why Open-Source Storage Software?

Mostly due to the commoditization of storage:

• Repeating what happened with Unix (Linux) and C compilers (gcc).

• Adherence to standards (NFS, CIFS, iSCSI, FCoE):

– Makes it feasible to clone the functionality – not a moving target

– Makes different implementations interchangeable, i.e. commodities

• No one company can compete with the productivity and innovation of an active world-wide community of open-source developers.

• Cost: the price of any commoditized software eventually approaches $0.

• Final trigger for adoption is when a paid support model emerges from a trusted source (e.g. Red Hat or SUSE).

• Gartner predicts that 20% of storage will be open source as early as 2017.


Why Ceph as an engineered solution?

• Cost savings & flexibility:

– Avoid large markup by storage vendor on hardware

– Share hardware resources between storage and application; increases utilization; get more work out of less hardware

– More customer flexibility in choosing the best hardware for their needs

• Disadvantages, when done by yourself:

– Customer is responsible for selecting and installing hardware; may not provision adequately for the needs of the software → Lenovo Solution Architecture & Support

– Customer is responsible for debugging problems and then working with the server, storage, OS, or networking vendor → single point of contact for support

– Storage vendor has to be prepared to support their software running on almost any reasonable hardware → SUSE and Lenovo have defined a portfolio using only a few building blocks


Lenovo Storage Solution for Ceph

• Simple: based on proven x3650 M5 server technology

• Seamless: providing block storage, configured e.g. with XFS

• Software Defined: based on SUSE Enterprise Storage

• Safe: High Availability thru replication (r2, sync or async)

• Scalable: scale up and/or scale out

• Superior Technology: scalable Flash acceleration, redundant 40GbE, RDMA etc.; replication, snapshotting, encryption, Object Storage etc.

• Suitable Models: HANA - Entry; Flash - Performance; Capacity - Density

• SAP HANA ready: certified Enterprise Storage for up to 64 nodes


Lenovo Storage Solution – Engineered end-to-end (1/2)

• Detailed workload analysis, example: Ceph internal behaviour for O_DIRECT 16k blocks


Lenovo Storage Solution – Engineered end-to-end (2/2)

• Detailed workload analysis, example: CPU bottleneck analysis (“How many cores do I really need?”)

[Charts: CPU bottleneck analysis, with panels for 4K random writes, 64K random writes, and 1M streaming]


Lenovo Architectures


Ceph for SAP HANA: Details

All servers running SUSE Enterprise Storage (SES)


SUSE Enterprise Storage for SAP HANA


Each HANA server has:

• two rbd images (data and log), which are XFS formatted.

• access to a CephFS distributed file system (HANA traces, …)

rbd with XFS (GB .. TB)


42U Rack

Example: Ceph for nextScale

• Design criteria: $/TB – big data read intensive (genomics, bio, imaging)

• up to 2 PB net usable in a single rack (10+2 EC) plus 15 TB Flash cache

• Net capacity (TB) using different redundancy levels (# coding chunks)

Basic building block: 6U / 420 TB raw

– 3.5" SATA for capacity

– Flash write cache

Automatic pro/de-mote based on: age, access frequency, percentage full

Performance and capacity layer can be scaled independently.
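The net-capacity figures can be approximated with simple arithmetic: with K data chunks and M coding chunks, the usable fraction of raw capacity is K/(K+M). The sketch below applies this to the 420 TB building block, assuming K stays at 10 while the number of coding chunks varies, and ignoring file system and other overheads. The function name is made up for illustration.

```python
def net_capacity_tb(raw_tb: float, k: int, m: int) -> float:
    """Usable capacity of an erasure-coded pool: raw * K / (K + M)."""
    return raw_tb * k / (k + m)


RAW_PER_BLOCK_TB = 420   # 6U basic building block, from the slide
K = 10                   # data chunks, as in the 10+2 example

for m in (1, 2, 3, 4):   # number of coding chunks = tolerated device losses
    print(f"{K}+{m}: {net_capacity_tb(RAW_PER_BLOCK_TB, K, m):.1f} TB net per block")
```

With 10+2, one block yields 350 TB net. Six such blocks (36U) would give about 2.1 PB, in line with the "up to 2 PB" per rack stated above; the number of blocks per rack is an assumption here, not a figure from the slide.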


Summary

• The storage market is changing heavily

– Trend towards software-defined, better utilized (think of Uber or Airbnb)

– External storage systems are in trouble

• Don’t pay expensive license fees for storage features that you can get for free

• Avoid vendor lock-in, free your storage

• Lenovo, in collaboration with SUSE, has pioneered a new software-defined storage solution for use with SAP HANA

– Get ready for the future of storage

– Come and talk to us at the booth
