Page 1: Consistent and Durable Data Structures for Non-Volatile ...shivaram.org/talks/nvm-fast11-talk.pdf · Redis - Hashtable+Logging Tembo - CDDS BTree 2M insert operations, two client

Consistent and Durable Data Structures for Non-Volatile Byte-Addressable Memory

Shivaram Venkataraman*†, Niraj Tolia‡, Parthasarathy Ranganathan* and Roy H. Campbell†

*HP Labs, Palo Alto, ‡Maginatics, and †University of Illinois, Urbana-Champaign

Page 2

Non-Volatile Byte-Addressable Memory (NVBM)

Phase Change Memory, Memristor

3/4/11 2

Page 3

Non-Volatile Byte-Addressable Memory (NVBM)

50-150 nanoseconds

Scalable

Non-Volatile

Lower energy

Memristor

Page 4

Access Times (nanoseconds, log scale)

Processor clock cycle – 1 ns
Access L2 cache – 10 ns
Update DRAM – 55 ns
Write to SLC Flash – 200 μs
Hard Disk Writes – 3 ms

Page 5

Access Times (nanoseconds, log scale)

Processor clock cycle – 1 ns
Access L2 cache – 10 ns
Update DRAM – 55 ns
Writes to PCM / Memristor – 100-150 ns
Write to SLC Flash – 200 μs
Hard Disk Writes – 3 ms
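To put these figures on one scale, the short sketch below converts each latency to nanoseconds and expresses it as a multiple of a PCM / Memristor write. The latencies are the ones listed on the slide; the dictionary name, the function name, and the choice of the 150 ns upper bound as the baseline are illustrative assumptions.

```python
# Latencies from the slide, converted to nanoseconds.
LATENCY_NS = {
    "Processor clock cycle": 1,
    "Access L2 cache": 10,
    "Update DRAM": 55,
    "Write to PCM / Memristor": 150,      # assumption: upper bound of 100-150 ns
    "Write to SLC Flash": 200 * 1_000,    # 200 microseconds
    "Hard disk write": 3 * 1_000_000,     # 3 milliseconds
}


def slowdown_vs(baseline: str) -> dict:
    """Return every latency as a multiple of the baseline latency."""
    base = LATENCY_NS[baseline]
    return {name: ns / base for name, ns in LATENCY_NS.items()}


ratios = slowdown_vs("Write to PCM / Memristor")
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(f"{name:26s} {ratio:12.2f}x")
```

Under these assumptions a hard disk write is four orders of magnitude slower than an NVBM write, while DRAM is within a small constant factor of it.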

Page 6

Data Stores - Disk

[Diagram labels: Core1, Core2, L1 Cache (per core), L2 Cache, DRAM, Disk. Examples: Traditional DB, File systems]

Page 7

Data Stores - DRAM

[Diagram labels: Core1, Core2, L1 Cache (per core), L2 Cache, DRAM, Commit Log - Disk. Examples: RAMCloud, memcached, Memory-based DB]

Page 8

Data Stores - NVBM

[Diagram labels: Core1, Core2, L1 Cache (per core), L2 Cache, DRAM, Non-Volatile Memory. Single-level store]

Page 9

Challenges

[Figure: example data structure with keys 10, 5, 20, 15, 2, 1]

Consistency, Durability

Page 10

Outline

§  Motivation
§  Consistent durable data structures
    §  Consistent durable B-Tree
    §  Tembo – Distributed Data Store Implementation
§  Evaluation

Page 11

Consistent Durable Data Structures

§  Versioning for consistency across failures

§  Restore to last consistent version on recovery

§  Atomic change across versions

§  No new processor extensions!

Page 12

Versioning

§  Totally ordered – Increasing natural numbers

§  Every update creates a new version

§  Last consistent version
    §  Stored in a well-known location
    §  Used by reader threads and for recovery
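The versioning rules above can be sketched in a few lines. This is a toy model under stated assumptions, not the CDDS B-Tree itself: the class `VersionedStore`, its methods, and the `crash_before_commit` flag are hypothetical names introduced for illustration, entries live in a plain Python list, and the flushing needed to make each step durable on NVBM is not modeled. It only shows how one version counter can act as the commit point and how recovery can fall back to the last consistent version.

```python
class VersionedStore:
    """Toy versioned key-value store (illustrative, not the CDDS B-Tree)."""

    def __init__(self):
        # Each entry is [key, value, start, end]; end is None while the entry is live.
        self.entries = []
        # Stands in for the "well-known location" holding the last consistent version.
        self.last_consistent = 0

    def put(self, key, value, crash_before_commit=False):
        new_version = self.last_consistent + 1
        # Step 1: end the live entry for this key (if any) at the new version.
        for entry in self.entries:
            if entry[0] == key and entry[3] is None:
                entry[3] = new_version
        # Step 2: add the new entry, visible from the new version onward.
        self.entries.append([key, value, new_version, None])
        if crash_before_commit:
            return  # simulated failure: the version number is never advanced
        # Step 3: a single assignment publishes the whole update.
        self.last_consistent = new_version

    def get(self, key, version=None):
        # Readers default to the last consistent version.
        v = self.last_consistent if version is None else version
        for k, value, start, end in self.entries:
            if k == key and start <= v and (end is None or v < end):
                return value
        return None

    def recover(self):
        v = self.last_consistent
        # Discard entries created by an unfinished update...
        self.entries = [e for e in self.entries if e[2] <= v]
        # ...and revive entries that the unfinished update had ended.
        for entry in self.entries:
            if entry[3] is not None and entry[3] > v:
                entry[3] = None


store = VersionedStore()
store.put("a", 1)                            # becomes version 1
store.put("a", 2)                            # becomes version 2
store.put("a", 3, crash_before_commit=True)  # interrupted before the commit step
store.recover()
print(store.get("a"), store.last_consistent)
```

Because readers only look at versions up to `last_consistent`, the half-applied third update is never visible, before or after `recover()` runs.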

Page 13

Consistent Durable B-Tree

B – Size of a B-Tree node

[Figure legend: each entry is shown as Key [start, end); entries are marked as Live entry or Deleted entry]

Page 14

Lookup

Find key 20 at version 5
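A lookup such as this one can be sketched for a single node, assuming each entry carries the `[start, end)` version range shown in the legend on the previous slide. The `Entry` class, the `visible` and `lookup` helpers, and the sample node contents are hypothetical; a real B-Tree lookup would also descend through inner nodes, which this sketch leaves out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    key: int
    start: int                  # first version in which the entry is visible
    end: Optional[int] = None   # first version in which it is gone; None = live
    value: str = ""


def visible(entry: Entry, version: int) -> bool:
    """An entry is visible at `version` if start <= version < end."""
    return entry.start <= version and (entry.end is None or version < entry.end)


def lookup(node: List[Entry], key: int, version: int) -> Optional[Entry]:
    """Linear scan of one node for `key` as it existed at `version`."""
    for entry in node:
        if entry.key == key and visible(entry, version):
            return entry
    return None


# Hypothetical node contents, made up for this example.
node = [
    Entry(5, start=2, value="a"),
    Entry(20, start=3, end=6, value="old"),   # deleted entry: ended at version 6
    Entry(20, start=6, value="new"),          # live entry
    Entry(99, start=4, end=5, value="gone"),
]

found = lookup(node, key=20, version=5)
print(found.value if found else None)
```

With this sample data, key 20 at version 5 resolves to the entry whose range is `[3, 6)`, while the same key at version 6 resolves to the live entry.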

Page 15

Insert / Split

Page 16

Garbage Collection

Page 17

Tembo – Distributed Data Store Implementation

Based on open source key-value store

Widely used in production

In-memory dataset

Page 18

Tembo – Distributed Data Store Implementation

Key Value Server

Consistent durable B-Tree

Single writer, shared reader

Consistent Hashing
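The slide names consistent hashing as the way keys are spread across key-value servers but does not say how Tembo implements it. The following is a generic sketch of the technique under assumed details: the `HashRing` class, the SHA-256 based hash, the `replicas` count, and the server names are all hypothetical.

```python
import bisect
import hashlib


class HashRing:
    """Generic consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, servers, replicas=64):
        # Place `replicas` points on the ring for every server.
        ring = []
        for server in servers:
            for i in range(replicas):
                ring.append((self._hash(f"{server}#{i}"), server))
        ring.sort()
        self._points = [point for point, _ in ring]
        self._owners = [server for _, server in ring]

    @staticmethod
    def _hash(text):
        # Use the first 8 bytes of SHA-256 as a 64-bit position on the ring.
        return int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")

    def server_for(self, key):
        # The owner is the first point clockwise from the key's position.
        idx = bisect.bisect_right(self._points, self._hash(key)) % len(self._points)
        return self._owners[idx]


servers = [f"server-{i}" for i in range(13)]
ring = HashRing(servers)
smaller = HashRing(servers[:-1])   # same ring with "server-12" removed

keys = [f"key-{i}" for i in range(1000)]
moved = sum(1 for k in keys if ring.server_for(k) != smaller.server_for(k))
print(f"{moved} of {len(keys)} keys change owner when one server leaves")
```

The property that matters for a distributed store is that removing a server only reassigns the keys that server owned; every other key keeps its owner.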

Page 19

Outline

§  Motivation
§  Consistent durable data structures
    §  Consistent durable B-Tree
    §  Tembo – Distributed Data Store Implementation
§  Evaluation

Page 20

Ease of Integration

Component              Lines of Code
Original STX B-Tree    2110
CDDS Modifications     1902 (90%)
Redis (v2.0.0-rc4)     18539
Tembo Modifications    321 (1.7%)

Page 21

Evaluation - Setup

§  API Microbenchmarks
    §  Compare with Berkeley DB
    §  Tembo: Versioning vs. write-ahead logging
§  End-to-End Comparison
    §  NoSQL systems – Cassandra
    §  Yahoo Cloud Serving Benchmark
§  15 node test cluster
    §  13 servers, 2 clients
    §  720 GB RAM, 120 cores

Page 22

Durability - Logging vs. Versioning

[Chart: Throughput (Ops/sec), axis 0 to 14000, versus Value size (bytes) at 256, 1024, and 4096. Series: Redis - BTree+Logging, Redis - Hashtable+Logging, Tembo - CDDS BTree]

2M insert operations, two client threads

Page 23

Yahoo Cloud Serving Benchmark

[Chart: Ops/sec, axis 0 to 160000, versus Client Threads at 2, 10, 20, and 30. Series: Tembo, Cassandra-inmemory, Cassandra-disk. Annotations: 286%, 44%]

Page 24

Furthermore

§  Algorithms for deletion

§  Analysis for space usage and height of B-Tree

§  Durability techniques for current processors

Page 25

Related Work

§  Multi-version data structures
    §  Used in transaction time databases
§  NVBM based systems
    §  BPFS – File system (SOSP 2009)
    §  NV-Heaps – Transaction Interface (ASPLOS 2011)
§  In-memory data stores
    §  H-Store – MIT, Brown University, Yale University
    §  RAMCloud – Stanford University

Page 26

Work-in-progress

§  Robust reliability testing

§  Support for transaction-like operations

§  Integration of versioning and wear-leveling

Page 27

Conclusion

§  Changes in storage media
    §  Rethink software stack
§  Consistent Durable Data Structures
    §  Single-level store
    §  Durability through versioning
    §  Up to 286% faster than memory-backed systems