imc summit 2016 breakout - andy pavlo - what non-volatile memory means for the future of database...

Post on 09-Jan-2017

213 Views

Category:

Devices & Hardware

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

@ANDY_PAVLO

WHATNON-VOLATILEMEMORYMEANS FOR THE FUTURE OFDATABASESYSTEMSSee all the presentations from the In-Memory Computing Summit at http://imcsummit.org

1973

1974

1978

1986

1994

2010

2016

The Future

Non-Volatile Memory

1 1

• Persistent storage with byte-addressable operations.• Fast read/write latencies.• No difference between

random vs. sequential access.

1 2

What does NVM mean for DBMSs?• Thinking of NVM as just a

faster SSD is not interesting.• We want to use NVM as

permanent storage for the database, but this has major implications.–Operating System Support–Cloud Provider Provisioning–Database Management System Architectures

Existing Systems

NVM-Only Storage

Hybrid DBMS 1 3

Existing Systems

NVM-Only Storage

Hybrid DBMS

1 4

Chapter I – Existing Systems• Investigate how existing

systems perform with NVM for write-heavy transaction processing (OLTP) workloads.• Evaluate two types of DBMS

architectures.–Disk-oriented (MySQL)–In-Memory (H-Store)

A PROLEGOMENON ON OLTP DATABASE SYSTEMS FOR NON-VOLATILE MEMORYADMS@VLDB 2015

1 5

DISK-ORIENTEDBuffer Pool

Table Heap

Log Snapshots

IN-MEMORYTable Heap

Log Snapshots

1 6

Intel Labs NVM Emulator• Instrumented motherboard

that slows down access to the memory controller with tunable latencies.• Special assembly to emulate

upcoming Xeon instructions for flushing cache lines.

STORE STOREL1

CacheL2 Cach

ePCOMMIT

1 7

Experimental Evaluation• Compare architectures on Intel

Labs NVM emulator.• Yahoo! Cloud Serving

Benchmark:–10 million records (~10GB)–8x database / memory–Variable skew

1 8

YCSB //

0

10,000

20,000

30,000

40,000

TXN/SEC

MySQLH-Store

50% Reads / 50% Writes Workload2x Latency Relative to DRAM

SKEW AMOUNT (HIGH→LOW)

8x Latency

LESS

ONS

1 9

Logging is a major performance bottleneck.

2

1 NVM Latency does not have a large impact.Legacy DBMSs are not prepared to run on NVM.

3

What wouldLarry

Ellison do?

Chapter II – NVM-only Storage• Evaluate storage and

recovery methods for a system that only has NVM.• Testbed DBMS with a

pluggable storage engines.• We had to build our own NVM-

aware memory allocator.LET'S TALK ABOUT STORAGE & RECOVERY METHODS FOR NON-VOLATILE MEMORY DATABASE SYSTEMSSIGMOD 2015 2 2

Copy-on-WriteTable HeapNo Logging

Log-StructuredNo Table HeapLog-only Storage

DBMS Architectures

2 3

In-PlaceTable Heap

Log + Snapshots

Copy-on-WriteTable HeapNo Logging

Log-StructuredNo Table HeapLog-only Storage

2 4

In-Place EngineTable Heap

Log Snapshots1

2

3

UPDATE table SET val=ABC WHERE id=123

Delta Record

New Tuple

New TupleNVM

2 5

NVM-Optimized Architectures• Use non-volatile pointers to

only record what changed rather than how it changed.• Be careful about how & when

things get flushed from CPU caches to NVM.

2 6

NVM-Aware In-Place Engine

Table Heap

Log1

2

Tuple Pointers

New TupleLog Record

TxnId Pointer

UPDATE table SET val=ABC WHERE id=123

2 7

Evaluation• Testbed system using the Intel

NVM hardware emulator.• Yahoo! Cloud Serving

Benchmark–2 million records + 1 million transactions–High-skew setting

2 8

YCSB //

In-Pla

ce

Copy-o

n-Write

Log-St

ructur

ed0

400,000800,000

1,200,000NVM-OptimizedTraditional

10% Reads / 90% Writes Workload2x Latency Relative to DRAM

↑63%

↑122%↑50%

TXN/SEC

2 9

YCSB //

In-Pla

ce

Copy-o

n-Write

Log-St

ructur

ed0

150300

NVM-OptimizedTraditional

10% Reads / 90% Writes Workload2x Latency Relative to DRAM

NVM STORES

(M)

↓40%

↓25%↓20%

3 0

YCSB //NVM-OptimizedTraditional

Elapsed time to replay log with varying log sizes2x Latency Relative to DRAM

RECOVERY TIME (MS)

10^3

10^4

10^5

10^3

10^4

10^5

10^3

10^4

10^5

In-Place Copy-on-Write Log-Structured

0.010.1

110

1001000

No Recovery Needed

LESS

ONS

3 1

Avoid block-oriented components.2

1 Using NVM correctly improves throughput & reduces weadown.NVM-only systems are15-20 years away

3

What wouldNikita Kahn

do?

Chapter III – Hybrid DBMS• Design and build a new in-

memory DBMS that will be ready for NVM when it becomes available.• Hybrid Storage + Hybrid

Workloads–DRAM + NVM oriented architecture–Fast Transactions + Real-time Analytics

3 4

3 5

Adaptive StorageOriginal Data

Adapted Data

SELECT AVG(B) FROM myTable WHERE C < “yyy”

UPDATE myTable SET A = 123, B = 456, C = 789 WHERE D = “xxx”

A B C D

BRIDGING THE ARCHIPELAGO BETWEEN ROW-STORES AND COLUMN-STORES FOR HYBRID WORKLOADSSIGMOD 2016

A B C D

Cold

Hot

A B C D

LESS

ONS

3 7

PelotonThe Self-Driving Database Management System

AnthonyTomasic

ToddMowry

JoyArulraj

PrashanthMenon

MichaelZhang

LinMa

MatthewPerron

DanaVan Aken

YingjunWu

RanXian

RunshenZhu

JiexiLin

JianhongLi

ZiqiWang

http://pelotondb.org

@ANDY_PAVLOEND

top related