@andy_pavlo: Automatic Database Partitioning in Parallel OLTP Systems (SIGMOD, May 22nd, 2012)

Upload: nathaniel-shepherd

Post on 18-Jan-2018

Page 1: @andy_pavlo Automatic Database Partitioning in Parallel OLTP Systems SIGMOD May 22nd, 2012

Page 6:

Main Memory • Parallel • Shared-Nothing Transaction Processing

H-Store: A High-Performance, Distributed Main Memory Transaction Processing System. Proc. VLDB Endow., vol. 1, iss. 2, pp. 1496-1499, 2008.

Page 7:

Client Application → Database Cluster: Procedure Name, Input Parameters

Database Cluster: Transaction Execution

Database Cluster → Client Application: Transaction Result
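
The request flow on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ToyCluster`, `register`, `call`, and the body of `new_order` are hypothetical names invented here, not the H-Store client API. The one idea taken from the slide is that the client sends a procedure name plus input parameters and gets a transaction result back.

```python
from typing import Any, Callable, Dict, List, Tuple


class ToyCluster:
    """Hypothetical stand-in for a database cluster that runs named procedures."""

    def __init__(self) -> None:
        self._procedures: Dict[str, Callable[..., Any]] = {}
        self.orders: List[Tuple[int, int, int]] = []

    def register(self, name: str, proc: Callable[..., Any]) -> None:
        # Procedures are installed ahead of time; clients refer to them by name.
        self._procedures[name] = proc

    def call(self, name: str, *params: Any) -> Any:
        # The whole transaction runs inside the cluster and only the result returns.
        return self._procedures[name](self, *params)


def new_order(cluster: ToyCluster, w_id: int, d_id: int, c_id: int) -> dict:
    # Illustrative body only: record the order and report success.
    cluster.orders.append((w_id, d_id, c_id))
    return {"status": "committed", "orders": len(cluster.orders)}


cluster = ToyCluster()
cluster.register("NewOrder", new_order)
result = cluster.call("NewOrder", 10, 9, 12345)
```
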

Page 8:

TPC-C NewOrder

Page 11:

Automatic Database Design Tool for Parallel Systems

Skew-Aware Automatic Database Partitioning in Shared-Nothing, Parallel OLTP Systems. SIGMOD 2012.

Page 12:

Schema (DDL): CUSTOMER, ORDERS, ITEM

Workload (NewOrder):

SELECT * FROM WAREHOUSE WHERE W_ID = 10;
SELECT * FROM DISTRICT WHERE D_W_ID = 10 AND D_ID = 9;
INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID, …) VALUES (10, 9, 12345, …);

Partitions (each): CUSTOMER, ORDERS, ITEM

Page 13:

CUSTOMER

c_id   c_w_id   c_last    …
1001   5        RZA       -
1002   3        GZA       -
1003   12       Raekwon   -
1004   5        Deck      -
1005   6        Killah    -
1006   7        ODB       -

ORDERS

o_id    o_c_id   o_w_id   …
78703   1004     5        -
78704   1002     3        -
78705   1006     7        -
78706   1005     6        -
78707   1005     6        -
78708   1003     12       -

ITEM

i_id     i_name   i_price   …
603514   XXX      23.99     -
267923   XXX      19.99     -
475386   XXX      14.99     -
578945   XXX      9.98      -
476348   XXX      103.49    -
784285   XXX      69.99     -

Three partitions, each holding CUSTOMER, ORDERS, and ITEM
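
The layout this slide appears to depict can be sketched with the rows from the tables above. The sketch rests on assumptions that the slide does not state: CUSTOMER and ORDERS are split on their warehouse id columns, ITEM is copied in full to every partition, there are three partitions, and placement uses a simple modulo. The function names are hypothetical.

```python
NUM_PARTITIONS = 3  # assumption: the slide draws three partitions

CUSTOMER = [(1001, 5, "RZA"), (1002, 3, "GZA"), (1003, 12, "Raekwon"),
            (1004, 5, "Deck"), (1005, 6, "Killah"), (1006, 7, "ODB")]
ORDERS = [(78703, 1004, 5), (78704, 1002, 3), (78705, 1006, 7),
          (78706, 1005, 6), (78707, 1005, 6), (78708, 1003, 12)]
ITEM = [(603514, "XXX", 23.99), (267923, "XXX", 19.99), (475386, "XXX", 14.99),
        (578945, "XXX", 9.98), (476348, "XXX", 103.49), (784285, "XXX", 69.99)]


def partition_of(w_id: int) -> int:
    # Assumption: modulo placement on the warehouse id.
    return w_id % NUM_PARTITIONS


def build_partitions() -> list:
    # ITEM is replicated: every partition starts with a full copy.
    partitions = [{"CUSTOMER": [], "ORDERS": [], "ITEM": list(ITEM)}
                  for _ in range(NUM_PARTITIONS)]
    for row in CUSTOMER:  # split on c_w_id (second column)
        partitions[partition_of(row[1])]["CUSTOMER"].append(row)
    for row in ORDERS:    # split on o_w_id (third column)
        partitions[partition_of(row[2])]["ORDERS"].append(row)
    return partitions


partitions = build_partitions()
```

Because each order carries the same warehouse id as its customer in this sample data, every order lands on the partition that also holds its customer row.
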

Page 14:

Three partitions, each holding CUSTOMER, ORDERS, and ITEM

Client Application: NewOrder(5, "Method Man", 1234)
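
One way to read this slide is that the cluster picks a partition for the call from one of its input parameters. The sketch below assumes, without confirmation from the slide, that the first parameter of NewOrder is a warehouse id and that placement is a modulo over three partitions. `ROUTING_PARAM` and `route` are hypothetical names.

```python
NUM_PARTITIONS = 3  # assumption: matches the three partitions drawn on the slide

# Hypothetical routing table: which input parameter decides the home partition.
ROUTING_PARAM = {"NewOrder": 0}  # assumption: parameter 0 is the warehouse id


def route(procedure: str, params: tuple) -> int:
    """Return the partition that should run this procedure call."""
    key = params[ROUTING_PARAM[procedure]]
    return key % NUM_PARTITIONS  # assumption: modulo placement on the key


home = route("NewOrder", (5, "Method Man", 1234))
print(home)  # 2, because 5 % 3 == 2
```

If the routing parameter matches the column the tables are split on, the call can run on a single partition without coordinating with the others.
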

Page 15:

Large-Neighborhood Search

Input: Workload, Schema (DDL)

Initial Design → Relaxation → Local Search → Restart

Best Design
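
The loop named on this slide can be illustrated with a toy version of large-neighborhood search. This is a generic sketch, not the tool's actual algorithm: the candidate columns, the `toy_cost` function, the exhaustive local search, and the fixed round count are all assumptions made for the example.

```python
import itertools
import random

# Hypothetical candidate partitioning columns per table.
CANDIDATES = {
    "CUSTOMER": ["c_id", "c_w_id", "c_last"],
    "ORDERS": ["o_id", "o_c_id", "o_w_id"],
    "ITEM": ["i_id", "i_name", "i_price"],
}


def toy_cost(design: dict) -> int:
    # Hypothetical cost: one penalty point per table not on its preferred column.
    preferred = {"CUSTOMER": "c_w_id", "ORDERS": "o_w_id", "ITEM": "i_id"}
    return sum(design[table] != column for table, column in preferred.items())


def large_neighborhood_search(candidates, cost, rounds=20, relax_size=2, seed=0):
    rng = random.Random(seed)
    tables = sorted(candidates)
    # Initial design: the first candidate column of every table.
    best = {table: candidates[table][0] for table in tables}
    best_cost = cost(best)
    for _ in range(rounds):  # each round restarts from the best design so far
        # Relaxation: free the choices for a few randomly picked tables.
        relaxed = rng.sample(tables, relax_size)
        # Local search: try every combination for the freed tables only.
        for choice in itertools.product(*(candidates[t] for t in relaxed)):
            trial = dict(best)
            trial.update(zip(relaxed, choice))
            trial_cost = cost(trial)
            if trial_cost < best_cost:
                best, best_cost = trial, trial_cost
    return best, best_cost


best_design, best_design_cost = large_neighborhood_search(CANDIDATES, toy_cost)
```

Freeing only a few tables per round keeps each local search small while the repeated restarts let the search reach designs far from the initial one.
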

Page 16:

Cost Model

Distributed Transactions + Workload Skew Factor
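
A cost model that combines these two terms might look like the sketch below. It is a simplified illustration under stated assumptions: both helper metrics, the weights `alpha` and `beta`, and the representation of a transaction as the list of partitions it touches are invented here and are not the definitions used by the tool.

```python
from collections import Counter


def distributed_fraction(txns: list) -> float:
    """Fraction of transactions that touch more than one partition."""
    return sum(len(set(t)) > 1 for t in txns) / len(txns)


def skew_factor(txns: list, num_partitions: int) -> float:
    """0.0 for perfectly even load, 1.0 when one partition gets every access."""
    if num_partitions < 2:
        return 0.0  # a single partition cannot be skewed
    load = Counter(p for t in txns for p in set(t))
    total = sum(load.values())
    uniform = 1.0 / num_partitions
    # Total variation distance from the uniform distribution, rescaled to [0, 1].
    distance = 0.5 * sum(abs(load.get(p, 0) / total - uniform)
                         for p in range(num_partitions))
    return distance / (1.0 - uniform)


def cost(txns: list, num_partitions: int, alpha: float = 1.0, beta: float = 1.0) -> float:
    # Hypothetical weighted average of the two terms; lower is better.
    return (alpha * distributed_fraction(txns)
            + beta * skew_factor(txns, num_partitions)) / (alpha + beta)


balanced = [[0], [1], [2]]         # single-partition, evenly spread
skewed = [[0], [0], [0]]           # single-partition, all on partition 0
scattered = [[0, 1], [1, 2], [0, 2]]  # every transaction is distributed
```

With equal weights, the balanced workload scores best, while the skewed and scattered workloads are each penalized by one of the two terms.
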

Page 17:

Algorithm Comparison (cost estimate, lower is better)

Benchmarks: TATP, TPC-C, TPC-C Skewed

Series: Horticulture, State-of-the-Art

Page 18:

Throughput (txn/sec, higher is better)

Benchmarks: TATP, TPC-C, TPC-C Skewed

Improvement labels, in slide order: +88%, +16%, +183%

Series: Horticulture, State-of-the-Art

Page 20:

Conclusion: Dating scene is still difficult. But partitioning your database is now easier.

Page 21:

http://github.com/apavlo/h-store

http://hstore.cs.brown.edu