
Solid State Storage Technologies

Jin-Soo Kim ([email protected])
Computer Systems Laboratory
Sungkyunkwan University
http://csl.skku.edu

ICE3028: Embedded Systems Design | Spring 2017 | Jin-Soo Kim ([email protected])

NVMe (1)

▪ The industry standard interface for high-performance NVM storage

• NVMe 1.0 specification in 2011 (now 1.3)

• Supported by major OSes: Windows, Linux, Solaris, …

▪ PCIe-based

• Low latency: direct connection to CPU

• Scalable performance: 1GB/s per lane, up to 32 lanes

• No HBA required: reduced power & cost

▪ Form factors

• Add-in-Card, M.2, BGA, etc.
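
The "1GB/s per lane" figure in the scalable-performance bullet can be checked with a little arithmetic. The sketch below assumes PCIe 3.0 (the slide does not name a generation), which signals at 8 GT/s per lane with 128b/130b encoding, and it ignores packet and protocol overhead. The function names are illustrative only.

```python
# Back-of-the-envelope PCIe bandwidth per direction.
# Assumption: PCIe 3.0 (8 GT/s per lane, 128b/130b encoding);
# packet headers and flow-control overhead are ignored.
GT_PER_S = 8e9           # raw transfers (bits) per second on one lane
ENCODING = 128 / 130     # fraction of raw bits that carry payload


def lane_bytes_per_s() -> float:
    """Usable bytes per second on one lane, one direction."""
    return GT_PER_S * ENCODING / 8


def link_gb_per_s(lanes: int) -> float:
    """Usable GB/s (10^9 bytes) for a link with the given lane count."""
    return lanes * lane_bytes_per_s() / 1e9


if __name__ == "__main__":
    for lanes in (1, 4, 16, 32):
        print(f"x{lanes}: {link_gb_per_s(lanes):.2f} GB/s")
```

Under these assumptions one lane carries a little under 1 GB/s, so an x4 device tops out just below 4 GB/s and a 32-lane link just below 32 GB/s.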


NVMe (2)

▪ Deep queue: 64K commands/queue, up to 64K queues

▪ Streamlined command set: only 13 required commands

▪ One register write to issue a command (“doorbell”)

▪ Support for MSI-X and interrupt aggregation
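
The doorbell bullet can be illustrated with a toy queue pair. This is a simplified model, not the NVMe register layout: the class and method names are hypothetical, the ring is tiny, and the device side is simulated by a function call instead of DMA and interrupts.

```python
class QueuePair:
    """Toy model of one submission/completion queue pair (hypothetical names)."""

    def __init__(self, depth=8):
        self.depth = depth
        self.sq = [None] * depth   # submission ring in host memory
        self.cq = []               # completed commands, in completion order
        self.sq_tail = 0           # host side: next free slot
        self.sq_head = 0           # device side: next slot to fetch
        self.doorbell = 0          # last value written to the tail doorbell
        self.doorbell_writes = 0

    def issue(self, command):
        """Place a command in the ring, then notify the device."""
        next_tail = (self.sq_tail + 1) % self.depth
        if next_tail == self.sq_head:
            raise RuntimeError("submission queue full")
        self.sq[self.sq_tail] = command
        self.sq_tail = next_tail
        self.doorbell = self.sq_tail   # the single register write per command
        self.doorbell_writes += 1

    def device_poll(self):
        """Simulated device: fetch every command up to the doorbell value."""
        while self.sq_head != self.doorbell:
            self.cq.append(self.sq[self.sq_head])
            self.sq_head = (self.sq_head + 1) % self.depth


qp = QueuePair()
for cmd in ("read lba=0", "write lba=8", "flush"):
    qp.issue(cmd)
qp.device_poll()
print(qp.cq, qp.doorbell_writes)
```

The command itself stays in host memory; the only device register touched on the submission path is the tail doorbell, which is what keeps the per-command cost low.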



NVMeDirect Framework

[Figure: NVMeDirect framework. User level: NVMeDirect Library (I/O Handles, I/O Queues, Block Cache, I/O Scheduler, I/O Completion Thread), Admin Tool, NVMeDirect API. Kernel level: NVMe Driver. HW level: NVMe Controller. Queues shown: Default Queues and User Queues.]

H.-J. Kim, Y.-S. Lee, and J.-S. Kim, “NVMeDirect: A User-space I/O Framework for Application-specific Optimization on NVMe SSDs,” HotStorage, 2016.


All-Flash Array

▪ Interfaces

• 10Gb/40Gb Ethernet (iSCSI), 16Gb Fibre Channel, or PCIe

• SAS or NVMe SSDs

▪ Functionalities

• Volume management

• Virtualization support

• RAID

• Snapshot

• Deduplication

• Compression, …


Traditional Block Interface

▪ SATA/SCSI/SAS

• Read (sector #, length), Write (sector #, length, data)

• No block-level liveness information

• No high-level semantics on data

• Several “unwritten contracts” do not hold for SSDs

– Sequential accesses are several tens of times better than random accesses

– Distant LBNs lead to longer seek times

– Data written is equal to data issued

– …

[Figure: the Host (File system, Block device driver) connects over the Block I/F to the SSD, where the FTL drives NAND Flash through the Flash I/F.]


Extending Block I/F

▪ TRIM command

• “The data in the specified sectors is no longer needed”

• ATA interface standard (T13 technical committee)

• Non-queued command

• SATA 3.1 introduces the Queued TRIM command

[Figure: the Host (File system, Block device driver) connects over the Block I/F + SSD-Specific I/F to the SSD, where the FTL drives NAND Flash through the Flash I/F.]
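
Why TRIM matters to the FTL can be shown with a toy page-mapping table. This is a minimal sketch under simplifying assumptions: there is no real flash, no block structure, and no wear leveling, and the class and method names are hypothetical. It only counts how many pages garbage collection would have to copy before erasing a page range.

```python
class ToyFTL:
    """Page-mapping table only (hypothetical; not a real FTL)."""

    def __init__(self):
        self.l2p = {}         # logical sector -> physical page
        self.valid = set()    # physical pages the FTL believes are live
        self.next_page = 0    # next free physical page (append-only)

    def write(self, sector):
        old = self.l2p.get(sector)
        if old is not None:
            self.valid.discard(old)      # an overwrite invalidates the old copy
        self.l2p[sector] = self.next_page
        self.valid.add(self.next_page)
        self.next_page += 1

    def trim(self, sector):
        """Host says this sector's data is no longer needed."""
        page = self.l2p.pop(sector, None)
        if page is not None:
            self.valid.discard(page)

    def pages_to_copy(self, first_page, count):
        """Pages GC must relocate before erasing this physical range."""
        return sum(1 for p in range(first_page, first_page + count)
                   if p in self.valid)


ftl = ToyFTL()
for sector in range(8):          # a file occupying sectors 0..7
    ftl.write(sector)
before = ftl.pages_to_copy(0, 8)  # file deleted in the file system only
for sector in range(8):          # file system now also issues TRIM
    ftl.trim(sector)
after = ftl.pages_to_copy(0, 8)
print(before, after)
```

Without TRIM the device still treats all eight pages of the deleted file as live and would copy them during garbage collection; after TRIM none of them need to be moved.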


Atomic Write

▪ Transaction support for multi-block writes

• Simplifies file systems and DBMSes

X. Ouyang, et al., “Beyond Block I/O: Rethinking Traditional Storage Primitives,” HPCA, 2011.
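
The all-or-nothing behavior of a multi-block atomic write can be sketched as follows. This is a toy model, not the mechanism from the cited paper: the class, its methods, and the `crash_before_commit` flag are hypothetical, and the commit is modeled as a single mapping update.

```python
class AtomicWriteDevice:
    """Toy device whose multi-block writes become visible all at once."""

    def __init__(self):
        self.mapping = {}   # committed view: LBA -> data

    def atomic_write(self, blocks, crash_before_commit=False):
        staged = dict(blocks)         # written out of place, not yet visible
        if crash_before_commit:
            return False              # staged copies are dropped on recovery
        self.mapping.update(staged)   # commit point: switch all mappings
        return True

    def read(self, lba):
        return self.mapping.get(lba)


dev = AtomicWriteDevice()
dev.atomic_write({1: "a", 2: "b"})
dev.atomic_write({1: "x", 2: "y"}, crash_before_commit=True)
print(dev.read(1), dev.read(2))
```

After the simulated crash, readers still see the old contents of both blocks and never a mix of old and new. That guarantee is what lets a file system or DBMS drop its own journaling or double-write logic for such updates.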


Multi-streamed SSD (1)

▪ Previous write patterns (= current state) matter


Multi-streamed SSD (2)

▪ Mapping data with different lifetime to different streams

▪ Standardized in T10 SCSI/SAS (2015), NVMe 1.3 (2017)
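
The benefit of separating data by lifetime can be seen in a small simulation. This is a sketch under strong assumptions: four pages per block, only two lifetimes ("short" and "long"), and every short-lived page is invalidated before garbage collection runs. The function names and the use of the lifetime label as the stream ID are illustrative choices, not part of either standard.

```python
from collections import defaultdict

PAGES_PER_BLOCK = 4


def place(writes, use_streams):
    """Fill flash blocks in arrival order; one open block per stream."""
    open_blocks = defaultdict(list)   # stream id -> block being filled
    full_blocks = []
    for name, lifetime in writes:
        stream = lifetime if use_streams else "single"
        block = open_blocks[stream]
        block.append((name, lifetime))
        if len(block) == PAGES_PER_BLOCK:
            full_blocks.append(block)
            open_blocks[stream] = []
    return full_blocks + [b for b in open_blocks.values() if b]


def gc_copies(blocks):
    """Live pages GC must copy to reclaim every block holding dead data,
    once all short-lived pages have been invalidated."""
    copies = 0
    for block in blocks:
        live = sum(1 for _, lifetime in block if lifetime == "long")
        if live < len(block):     # block contains reclaimable space
            copies += live
    return copies


# Short-lived and long-lived pages arrive interleaved.
writes = [(f"p{i}", "short" if i % 2 == 0 else "long") for i in range(16)]
mixed = gc_copies(place(writes, use_streams=False))
separated = gc_copies(place(writes, use_streams=True))
print(mixed, separated)
```

With a single stream every block ends up half dead and half live, so reclaiming space means copying the live half. With one stream per lifetime the short-lived blocks become entirely invalid and can be erased without any copying.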


Multi-streamed SSD (3)

▪ Cassandra with Multi-streamed SSD


Multi-streamed SSD (4)

▪ Cassandra’s normalized update throughput with 5 streams


OSSD: Object-based SSD (1)

▪ OSD (Object-based Storage Device)

• Virtualizes physical storage as a pool of objects

• Offloads space management to storage devices

• Standardized as a subset of SCSI command set

[Figure: block interface vs. object interface]
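
What "offloads space management to storage devices" means can be sketched with a toy object device. The method names below are hypothetical and are not the SCSI OSD command set. The point is only that the host names objects while the device decides which blocks hold them.

```python
class ToyObjectDevice:
    """Toy object store: block allocation happens inside the device."""

    def __init__(self, num_blocks):
        self.free = list(range(num_blocks))   # device-private free list
        self.objects = {}                     # object id -> [(block, data)]
        self.next_oid = 0

    def create(self):
        oid = self.next_oid
        self.next_oid += 1
        self.objects[oid] = []
        return oid

    def append(self, oid, data):
        if not self.free:
            raise RuntimeError("device full")
        block = self.free.pop(0)              # host never sees this number
        self.objects[oid].append((block, data))

    def read(self, oid):
        return [data for _, data in self.objects[oid]]

    def remove(self, oid):
        # The device learns immediately which blocks are dead.
        for block, _ in self.objects.pop(oid):
            self.free.append(block)


dev = ToyObjectDevice(num_blocks=4)
oid = dev.create()
dev.append(oid, "x")
dev.append(oid, "y")
print(dev.read(oid), len(dev.free))
dev.remove(oid)
print(len(dev.free))
```

With a block interface the host file system would own the free list and the device would only see sector numbers. Here, removing an object gives the device liveness information directly, with no separate TRIM-style hint.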


OSSD: Object-based SSD (2)

▪ OSD storage model

[Figure: traditional vs. OSD storage stacks.
Traditional: Application, System Call Interface, File System User Component, File System Storage Management, Sector/LBA Interface, Block I/O Manager, Physical Media. The host/storage device boundary is the Sector/LBA Interface.
OSD: Application, System Call Interface, File System User Component, OSD Interface, OSD Storage Management, Block I/O Manager, Physical Media. The host/storage device boundary is the OSD Interface.]


OSSD: Object-based SSD (3)

[Figure: OSSD prototype architecture. Host side: VFS, OFS, iSCSI Initiator. Target side: iSCSI Target Daemon, OSSD Framework, RawSSD. Layers shown: OAQ, OML, FML, FAL. Links: OSD Interface (iSCSI) over TCP at 1 Gbps between host and target; READ/WRITE/ERASE over SATA-2 to the RawSSD. OAQ inset: an OID-to-context hash (e.g., oid = 7, oid = 46), I/O contexts holding object I/O instances, and priority queues Q0 ... Qn. OML/FML inset: allocation bitmap, descriptor, object attributes, object data, μ-Tree (extents), object data buffer.]

Y.-S. Lee, S.-H. Kim, J.-S. Kim, J. Lee, C. Park, and S. Maeng, “OSSD: A Case for Object-based Solid State Drives,” MSST, 2013.


OSSD: Object-based SSD (4)

▪ Simplified host file system

• No need for SSD-specific parameter tuning

▪ More efficient management of flash storage

• Block-level liveness

• Metadata separation

• Object-aware storage management (allocation, dedup, ...)

▪ Application-aware storage management

• Application hints, QoS

▪ Storage virtualization

• Pooling, tiering, caching, backup, replication, etc.


In-Storage Computing (1)

▪ Samsung ISC SSD Prototype

• Commodity SSD: Samsung PM1725 NVMe with the ISC feature

• PCIe 3.0 x4

• 800 GB

▪ Software

• C++11

• C++ STL

• G++

• Software emulator


In-Storage Computing (2)

▪ ISC Application Development Process


In-Storage Computing (3)

▪ ISC Dataflow Programming Model


In-Storage Computing (4)

▪ Example: Simple Key-Value Store