hybridstore: a cost efficient, high performance storage...

28
HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs and HDDs Youngjae Kim, Aayush Gupta, Bhuvan Urgaonkar, Piotr Berman, and Anand Sivasubramaniam Oak Ridge National Laboratory Pennsylvania State University July 26 th 2011

Upload: others

Post on 05-Apr-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs and HDDs

Youngjae Kim, Aayush Gupta, Bhuvan Urgaonkar, Piotr Berman, and Anand Sivasubramaniam

Oak Ridge National Laboratory Pennsylvania State University

!

July!26th!2011!

!

!

!

Page 2: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Enterprise-scale Storage Systems

2

Enterprise-scale Hard Disk Drive

!  Enterprise-scale Storage Systems •  Information technology focusing on storage, protection, retrieval of data in

LARGE-SCALE environments

!  Data-centric services •  File, web & media servers, transaction processing servers

Google's!massive!server!farms!!

Page 3: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

A Persistent Hurdle in Enterprise Computing

3

Huge Performance Discrepancy Between CPU and HDD !  Normalized CPU Performance and Media Access Time

Jan-96

Jan-97

Jan-98

Jan-99

Jan-00

Jan-01

Jan-02

Jan-03

Jan-04

Jan-05

Jan-06

Jan-07

Jan-08

Jan-09

0

20

40

60

80

100

120

140

160

180

200

Sca

ling

Fact

or

Multicore CPU

CPU

Source: Intel Measurements

Measured CPU Performance scaling

= 175X

Measured HDD Performance scaling = 1.3X since Jan’96

HDD

o  HDD Performance (Random Read IOPs) has been stagnant for decades.

o  I/O bottleneck has become increasingly worse over time.

Intel® X25-E Seagate Cheetah 15K.6

4KB$Random$Read:$$35,000$IOPs$ 4,000$IOPs$

Flash SSD Potential !!

Page 4: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Fusion-IO’s ioDrive Duo (MLC) (100KIOPS for Reads, 141KIOPS for Writes)

Scalable Memory Architecture (VXM) 84 VIMMs (Violin Intelligent memory Modules) (1M random IOPs, PCIe x4/x8 I/F, DRAM/Flash SSD)

Violin Memory Inc – Violin 1010

!  Enterprise scale storage •  Fusion-io’s ioDrive, Texas Memory System’s RamSan-500,

Symmetrix DMX-4 from EMC

!  Embedded Storage •  PDAs, mobile phones, digital cameras

!  Desktop storage •  MacBook Air, One Laptop Per Child (OLPC),

game consoles, Intel’s X25-E Extreme SATA Solid-State Drive

Emergence of NAND Flash

4

Embedded, Desktop, and Enterprise

Intel 320 MLC Series (38K IOPS for Reads, 1.4K IOPS for Writes)

Unknown $8,335 / 320GB

$219 / 120GB

Page 5: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Contents

!  Introduction

!  Background •  NAND Flash based SSDs versus HDD •  Motivation for HybridStore and Related Works

!  Overview of HybridStore •  Capacity Planner

− Workload Analyzer − Storage Optimization Solver

!  Experimental Results

!  Conclusion

5!

Page 6: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Emergence of NAND Flash based SSD

!  NAND Flash vs. Hard Disk Drives • Pros:

− Semi-conductor technology, no mechanical parts − Offers lower and more predictable access latencies

" Microseconds (45us Reads / 200us Writes) vs. Milliseconds for Hard Disks

− Lower power consumption − Higher robustness to vibrations and temperature

• Cons:

− Limited lifetime " 10K - 1M erases per block

− High cost " About 8X more expensive than current hard disks

− Random writes can be sometimes slow

6

Page 7: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

NAND Flash based SSD

7

SATA 2.5” 32GB SSD

Process Process Process

File System (FAT, Ext2, NTFS…)

Block Device Driver

fwrite (file,data)

Block write (LBA, size)

Control Signal

Page write (Bank, Block, Page)

Application

OS

Flash Banks

Block Interface (SATA, SCSI, etc)

CPU (FTL) Write Buffer Read Cache

Memory Garbage Collector

Address Translation

Wear- Leveler

Device

System Architecture

Page 8: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Existing Storage Server Platform

!  Examples of Storage Server Platform •  Various network interface

− Fibre Channel, SAS etc •  Various types of hard disk drives

− 2.5” SAS drive, 3.5” SATA drive, etc

8!

Page 9: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Can SSDs replace HDDs?

!  Challenges •  Unique performance characteristics of SSD

− SSD may become worse than HDD due to GC. •  Reliability Concerns

− Lifetime of SSDs is limited by the write rates. •  Cost Concerns

− NAND Flash is still expensive over HDD.

!  HybridStore •  Hybrid storage systems that combine HDDs and SSDs.

Homogeneous!SSD!based!

Storage!Server!PlaAorm!

Homogeneous!HDD!based!

Storage!Server!PlaAorm!

HybridStore:!Heterogeneous!!

Storage!Server!PlaAorm!

Page 10: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Existing Proposals in Enterprise

!  Hybrid Hard Disk •  NAND Flash is on-bard cache in HDD.

!  Intel Turbo Memory (ITM) [ACM TOS’08] •  Support for the ReadyBoost and Ready Drive of Microsoft

!  Two-tier Architecture from Microsoft [Eurosys’09] •  Use SSDs as Long-Term read Cache and Short Write Buffer

!  ZFS (designed by Sun) •  ReadZilla & LogZilla (Implementation of read cache and write buffer)

10!

Hybrid!HDD,!FlashON™!

Intel!Turbo!Memory!

`

Disk Tier

Solid-state Tier

Write!Buffer! Read Cache

Cache miss

Read Write

Flush Update

Storage Tier

TwoOPer!storage!architecture!

Page 11: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Contents

!  Introduction

!  Background •  NAND Flash based SSDs versus HDD •  Motivation and Related Works

!  HybridStore •  Overview of HybridStore •  Capacity Planner

− Workload Analyzer − Storage Optimization Solver

!  Experimental Results •  Synthetic and Realistic Workloads

!  Conclusion

11!

Page 12: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Overview of HybridStore

12

Capacity Planner and Dynamic Controller

Flash!Flash!Flash!Flash!

SSD Array HDD Array

Flash!Flash!Flash!Flash!

Capacity(Planner(

Requirement!Lists!

O !ApplicaPons!O !Available!Budget!O !Performance!

Requirement,!etc!

Capacity!Planner!

!

!!

!

!

!

!

!

OpPmal!!

SSD,!HDD!

ConfiguraPon!

Admin!

Dynamic(Controller(

Device Driver

Dynamic Controller

Performance!

LifePme!

Estimator

Dynamic Planning

Time

Capacity Planner Capacity Planner

Long-Term

Perf. Predictor Write Regulator Frag. Busting

Short-Term

Performance!Predictor!

FragmentaPon!Buster!(Front)!

AdapPve!WearOleveler!(Front)!

Write!Regulator!

FragmentaPon!!

Buster!(Back)!

AdapPve!WearO!

leveler!(Back)!

Re-Capacity Planning

Monitor!

Dispatcher!

Client’s requests

Data!ParPPon!

Table!

Performance Budget

Violation Violation

Lifetime Budget

Page 13: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Capacity Planning

13

Problem Formulation: Goal and Constraints?

Cost of HybridStore = Cost SSDs + Cost HDDs + Cost Recur

Capacity Planner

1.  Capacity of SSD 2.  Workload

Partitioning

Inputs 1. Workload Characteristics 2. Hardware Properties (SSD and HDD)

Constraints 1. Performance requirement 2. Lifetime requirement

1) Perf. of HybridStore > Perf. Budget

2) Lifetime of HybridStore > Lifetime Budget Constraints

Performance Budget

Lifetime Budget

Minimize Cost of HybridStore Goal

Page 14: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Capacity Planning

14

HybridStore Hardware Model !  Provisioning SSDs •  Find storage capacity of SSDs and HDDs •  Find out the amount of data partition sent to SSDs for a given workload

!  Storage Model & Data Partitioning

Page 15: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Finding Workload Attributes

!  I/O workloads can be characterized by •  Hot (highly accessed) and cold (rarely accessed) data, Read/write ratio,

Sequentiality, Request arrival rate, etc

!  Data Classification •  A methodology to partition a workload into smaller subsets.

!  Finding workload attributes •  The entire logical address space of the workload is divided into fixed-size

chunks (or records), then, mapped to different data classes. − 1MB record size is used because 1MB roughly corresponds to the granularity

of data prefetching doe by HDDs/SSDs. •  Each data record is represented by the following workload attributes

− Temporality (frequency of accesses per unit time) − Read/write ratio − Request size (spatial locality) – sequential, partially sequential, partially

random, and random. − Request arrival rate

15!

Page 16: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Data Classification

16!

Hierarchical Data Classification

16

•  Tuples (Hot or cold, Read ratio, Sequentiality (request size), Arrival rate)

Read (0) or Write (1) ?

0 1

Intensity of Arrival Rate? Eg. Lower (25th perc) Middle (50th perc) Upper (75th perc)

….

…. Sequentiality (Request Size)? Eg. <16KB, <32KB, <64KB, others

….

….

CLA

SS

1

CLA

SS

2

CLA

SS

3

CLA

SS

4 ….

`!Cold (0) or Hot (1)?

0 1 2 3 0 1 2 3

0 1 2 3 0 1 2 3

0 1

Partitioned logical address by records …. …. ….

Record (1MB)

Page 17: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Capacity Planning

!  Declaration of Variables

17

!  Decision Variables

xij = data of class j on yi devices of device type i

yi = number of devices of device type i

Capacity Planner: Problem Formulation

Ci =Capacity of device type i

Ui =Utilization of device type i

Bi =Maximum Bandwidth of device type i

•  Properties of data class j!

S j = Size of data class j

Fi = Frequency of data class j

Wij =Weight factor for bandwidth of data cass j on yi devices of device type i

•  Properties of device type i!

Integer variable

Page 18: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Capacity Planning

!  Objective Function

18

CostHybridStore = CostInstallation + CostRecurring

= ( y(i)i=1

I

∑ ×D$(i) ×Ci + (K$ × y(i)i=1

I

∑ × P(t)dt)t∫ )

xij = S ji∑ , (∀j ∈ J)

xijj∑ ≤ (Ui ×Ci) × yi, (∀i ∈ I)

Fj ×xijS j

≤ Bij × yi, (∀i ∈ I, ∀j ∈ J)

Lifetime(i,x) ≤Useful Lifetime of HDD (i ∈ Flash based SSDs)

!  Constraints

Mixed Integer Linear Programming

Expected lifetime =(Size of NAND flash × # of erase cycles)

bytes written per day

Space constraint

Performance constraint

Page 19: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Evaluating HybridPlan !  Solver development

•  Developed a trace analyzer (lines of codes less than 500) •  Developed the solver of HybridPlan using CPLEX

− CPLEX, a well-regarded Integer Linear Programming (ILP) solver

!  Workloads •  Synthetic workloads •  Realistic workloads

− MSR Cambridge traces, and Microsoft Exchange server Traces

!  Devices

19!

Device( Type( Capacity((

(GB)(

Per7GB(

($)(

U:liza

:on(

Read((

(MB/s)(

Write((

(MB/s)(

Latency(

(ms)(

Erase((

(#)(

Power((

(W)(

Seagate(

Cheetah(

15K((

HDD(

146( 1.80( 0.8( 171( 171( 3.6( 7( 12.92(

Seagate(

Barracuda(

7.2K((

HDD(

750( 0.17( 0.8( 125( 125( 4.2( 7( 9.4(

Intel((

X(257E(

SLC((

SSD(

32( 11.96( 0.5( 230( 200( 0.125( 100K( 2(

Intel((

X7257M(

MLC((

SSD(

80( 3.22( 0.5( 220( 80( 0.25( 10K( 2(

Page 20: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Synthetic Workloads

!  Description of Synthetic Workloads

20!

Workloads( Index( Read(

(%)(

Size(

(KB)(

Inter7Arrival( I/O(Bandwidth(

Time!(ms)! MB/s! IOPs!

Sequen:al(

Read(

SR1( 80( 128( 100((L)( 1.25( 7(

SR2( 80( 128( 2((M)( 62.5( 7(

SR3( 80( 128( 0.2((H)( 1,250( 7(

Random(

Read(

RR1( 80( 4( 100((L)( 7( 10(

RR2( 80( 4( 2((M)( 7( 500(

RR3( 80( 4( 0.2((H)( 7( 10,000(

Sequen:al(

Write(

SW1( 20( 128( 100((L)( 1.25( 7(

SW2( 20( 128( 2((M)( 62.5( 7(

SW3( 20( 128( 0.2((H)( 1,250( 7(

Random(

Write(

RW1( 20( 4( 100((L)( 7( 10(

RW2( 20( 4( 2((M)( 7( 500(

RW3( 20( 4( 0.2((H)( 7( 10,000(

Page 21: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Impact of I/O Intensity

21!

!  Results for Sequential Read Dominant Workloads •  SR1, SR2, SR3

0

1

2

3

4

5

6

7

8

9

10

SR1 15K HDD Only

SR2 SR3

Num

ber

of D

evic

es (

#)

7.2K HDD

15K HDD

7.2K HDD

M-SSD

7.2K HDD

M-SSD

15K RPM HDD7.2K RPM HDD

SLC SSDMLC SSD

0

200

400

600

800

1000

1200

1400

1600

1800

2000

2200

2400

2600

SR1 15K HDD Only

SR2 SR3

Tota

l Cost

($)

15K RPM HDD (I)7.2K RPM HDD (I)

SLC SSD (I)MLC SSD (I)

15K RPM HDD (R)7.2K RPM HDD (R)

SLC SSD (R)MLC SSD (R)

•  SR1!only!requires!2!slow!7.2K!RPM!HDDs!whereas!it!requires!9!fast!15K!RPM!HDDs.!

•  Our!solver!determines!the!right!devices!to!meet!the!capacity!needs.!!

•  As!the!arrival!rate!increases,!we!observe!the!need!for!MLC!SSDs!(considering!$/GB!for!

SLC!SSD,!it!is!not!efficient!to!use!compared!to!using!MLC!SSD).!!

•  Recurring!cost!(Electricity!cost)!are!quite!small!compared!to!device!installaPon!cost.!!

Page 22: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Impact of I/O Intensity

22!

!  Results for Sequential Write Dominant Workloads •  SW1, SW2, SW3

•  For!write!dominant!SW3,!unlike!observaPon!from!SR!workloads,!the!solver!suggests!

to!use!one!SLC!SSD!instead!of!the!MLC!ones!for!its!readOintensive!counterpart!(SR3).!

It’s!because!SLC!SSD!that!we!use!is!2.5!Pmes!faster!than!the!MLC!one.!!

•  Also!it!needs!a!sharp!increase!in!the!number!of!slow!HDDs!because!of!the!vast!$/GB!

difference!between!SLC!SSDs!and!slow!HDDs.!

0

2

4

6

8

10

12

14

16

18

20

SW1 SW2 SW3

Num

ber o

f Dev

ices

(#)

7.2K HDD 7.2K HDDM-SSD

7.2K HDD

S-SSD

15K RPM HDD7.2K RPM HDD

SLC SSDMLC SSD

0 200 400 600 800

1000 1200 1400 1600 1800 2000 2200 2400 2600 2800 3000 3200 3400 3600

SW1 SW2 SW3

Tota

l Cos

t ($)

15K RPM HDD (I)7.2K RPM HDD (I)

SLC SSD (I)MLC SSD (I)

15K RPM HDD (R)7.2K RPM HDD (R)

SLC SSD (R)MLC SSD (R)

Page 23: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Impact of Sequentiality

!  Results for Sequential and Random Workloads •  SR, SW

23!

0

1

2

3

4

5

6

7

8

9

10

SR2 RR2 SW3RW3

De

vice

(#

)

7.2K HDD

M-SSD

7.2KHDD

M-SSD

7.2K HDD

M-SSD

7.2K HDD

S-SSD

M-SSD

15K RPM HDD7.2K RPM HDD

SLC SSDMLC SSD

•  We!clearly!see!the!needs!of!the!larger!number!of!SSDs!as!the!workloads!are!random.!!

•  For!RW3,!we!observe!the!needs!of!SLCOSSDs!to!meet!the!high!IOPS!requirement.!!

•  As!a!storage!administrator,!it!is!highly!advisable!to!increase!the!sequenPality!of!

incoming!workloads!so!as!not!to!employ!expensive!SSDs.!!

0

500

1000

1500

2000

2500

SR2 RR2 SW3 RW3

Tota

l Cos

t ($)

15K RPM HDD (I)7.2K RPM HDD (I)

SLC SSD (I)MLC SSD (I)

15K RPM HDD (R)7.2K RPM HDD (R)

SLC SSD (R)MLC SSD (R)

Page 24: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Impact of Lifetime Constraint

!  Results for without and with lifetime constraints •  denoted as (A) and (B) respectively

24!

0 200 400 600 800

1000 1200 1400 1600 1800 2000 2200 2400 2600 2800 3000 3200 3400 3600 3800 4000

RW3(A)RW3(B)

SW3(A)SW3(B)

Tota

l Cos

t ($)

15K RPM HDD (I)7.2K RPM HDD (I)

SLC SSD (I)MLC SSD (I)

15K RPM HDD (R)7.2K RPM HDD (R)

SLC SSD (R)MLC SSD (R)

•  LifePme!constraint!is!an!important!metric!in!capacity!provisioning.!!

•  Without!lifePme!constraint,!we!see!a!greater!porPon!of!SSDs!being!used!than!with!the!

lifePme!constraint.!!

•  For!SW3,!without!lifePme!constraint,!we!may!have!lower!number!of!devices!as!well!as!

the!overall!cost!compared!to!when!the!lifePme!constraint!is!forced,!however,!the!

storage!administrator!needs!to!reOprovision!prematurely,!eventually!increase!the!

overall!costs!over!the!iniPal!esPmated!period.!!

0 1 2 3 4 5 6 7 8 9

10 11 12 13 14 15 16 17 18 19 20

RW3(A)RW3(B)

SW3(A)SW3(B)

Num

ber o

f Dev

ices

(#)

7.2KHDD

M-SSD

7.2K HDD

S-SSDM-SSD

7.2K HDDS-SSD

M-SSD

7.2K HDD

S-SSD

15K RPM HDD7.2K RPM HDD

SLC SSDMLC SSD

Page 25: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Realistic Workloads

!  Description of Realistic Workloads

25!

Workload( Size(

(TB)(

Read((

(%)(

Request(Size(

(KB)(

IOPS(

MSR!Trace! 5.7!TB! 68.1! 23.32! 823!

Exchange!

Server!

750GB! 38.3! 16.54! 3,692!

Page 26: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Can SSDs replace HDDs?

!  Results for MSR Traces

26!

0

1

10

100

1000

15K HDD Only

7.2K HDD Only

SLC SSD Only

MLC SSD Only

HybridStor

Num

ber o

f Dev

ices

(#)

50

14

364146

10

1

15K RPM HDD7.2K RPM HDD

SLC SSDMLC SSD

1

10

100

1000

10000

100000

1e+06

15K HDD Only

7.2K HDD Only

SLC SSD Only

MLC SSD Only

HybridStore

Tota

l Cos

t ($)

11.3K

2.8K 1.8K576

139.4K

3.2K

37.7K

1.0K 1.3K258412

7

15K HDD (I)7.2K HDD (I)SLC SSD (I)

MLC SSD (I)15K HDD (R)7.2K HDD (R)

SLC SSD (R)MLC SSD (R)

•  Employing!7.2K!RPM!HDDs!is!more!economically!efficient!than!employing!15K!RPM!

HDDs.!!

•  In!case!of!SSD!systems,!it!requires!several!hundreds!of!SSDs!to!saPsfy!the!capacity!

requirement.!

•  The!bounding!factor!for!decisionOmaking!of!HybridPlan!is!not!I/O!bandwidth!but!storage!

capacity!requirement.!!

Page 27: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Efficacy of HybridStore

!  Results for MSR Traces

27!

0

1

10

100

1000

15K HDD Only

7.2K HDD Only

SLC SSD Only

MLC SSD Only

HybridStor

Num

ber o

f Dev

ices

(#)

50

14

364146

10

1

15K RPM HDD7.2K RPM HDD

SLC SSDMLC SSD

1

10

100

1000

10000

100000

1e+06

15K HDD Only

7.2K HDD Only

SLC SSD Only

MLC SSD Only

HybridStore

Tota

l Cos

t ($)

11.3K

2.8K 1.8K576

139.4K

3.2K

37.7K

1.0K 1.3K258412

7

15K HDD (I)7.2K HDD (I)SLC SSD (I)

MLC SSD (I)15K HDD (R)7.2K HDD (R)

SLC SSD (R)MLC SSD (R)

•  HybridPlan!can!find!the!most!economic!storage!composiPon.!!

•  HybridPlan!suggests!2!x!7.2K!RPM!HDDs!and!1!MLC!SSD!for!MSR!Trace.!!

•  Total!cost!saving!of!HybridStore!is!about!85%!compared!to!highOend!HDD!only!system.!!

•  99%!data!are!classified!into!C32,!a!data!class!storing!data!rarely!accessed.!!

0.001 0.01

0.1 1

10 100

1000 10000

C0 C2 C4 C6 C8 C10C12

C14C16

C18C20

C22C24

C26C28

C30C32D

ata

Cla

ss D

istri

butio

n (G

B)

7.2K RPM HDD MLC SSD

Page 28: HybridStore: A Cost Efficient, High Performance Storage ...users.nccs.gov/~yk7/papers/mascots11.pdf · HybridStore: A Cost Efficient, High Performance Storage System Combining SSDs

Lessons Learned

28

I.  We developed an capacity planner that finds the most economically efficient storage configurations while meeting the performance and lifetime requirements of devices

II.  We provided a general form of comprehensive methodology using a well-known technique for optimization problems, Mixed Integer Linear Programming (LP)

III.  Experiments showed that our capacity planner is able to identify close to minimum SSD capacity needed to meet a specified performance goal for realistic workloads while ensuring similar performance as compared to a comparatively more over-provisioned system