Elastic Data Partitioning for Cloud-based SQL Processing Systems

Lipyeow Lim

Information & Computer Science Department

University of Hawai`i at Mānoa

9/8/2010 Lipyeow Lim -- University of Hawai`i at Manoa


Outline


Shared Nothing Parallel DBMS

[Figure: query and results pass through a Parallel DB layer, which is connected by a Network to several DBMS nodes.]

1/14/2013 Lipyeow Lim -- University of Hawaii at Manoa


Cloud-based Architecture

[Figure: Amazon EC2. Four Disk/Memory/CPU stacks joined by a (virtualized) network, with layers labeled Physical Resources and Virtual Machines.]


“Scaling” Up and Down

[Figure: query and results pass through the Parallel DB layer and the Network; one configuration shows two DBMS nodes and another shows four.]


Problem Statement

Given
• A relation T
• A partitioning function F on a fixed partitioning key
• An initial number p of partitions/fragments
• An initial mapping of p fragments to p database nodes
• A target number q of partitions

Find
• a mapping of {T1, T2, ..., Tp} to {T1, T2, ..., Tq} and
• an assignment of the q fragments to q database nodes

Such that we minimize
• The number of tuples re-partitioned
• The number of tuples moved between database nodes
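The two quantities to minimize can be made concrete with a small sketch. It is only an illustration under stated assumptions: the partitioning function is taken to be key mod n (the form used in the hash partition example in this deck), a tuple counts as "re-partitioned" when its fragment number changes, and the helper name `resize_cost` and the node names "A", "B", "C" are hypothetical.

```python
def resize_cost(keys, old_frag, new_frag, old_node, new_node):
    """Return (tuples re-partitioned, tuples moved between nodes).

    old_frag/new_frag map a key to a fragment number.
    old_node/new_node map a fragment number to a database node.
    Assumption: a tuple is re-partitioned when its fragment number changes.
    """
    repartitioned = 0
    moved = 0
    for k in keys:
        f_old, f_new = old_frag(k), new_frag(k)
        if f_old != f_new:
            repartitioned += 1
        if old_node[f_old] != new_node[f_new]:
            moved += 1
    return repartitioned, moved


# Rating column of the Sailors example, resized from p = 2 to q = 3 fragments.
ratings = [7, 1, 8, 4, 10, 7]
p, q = 2, 3
old_node = {0: "A", 1: "B"}

# Same partitioning, two different assignments of new fragments to nodes.
cost_1 = resize_cost(ratings, lambda k: k % p, lambda k: k % q,
                     old_node, {0: "A", 1: "B", 2: "C"})
cost_2 = resize_cost(ratings, lambda k: k % p, lambda k: k % q,
                     old_node, {0: "C", 1: "B", 2: "A"})
print(cost_1, cost_2)
```

Both calls re-partition the same tuples, but the second assignment moves fewer of them between nodes, which is why the problem asks for the fragment-to-node assignment as well as the fragment mapping.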


Partitioning a Relation

• Partitioning attribute/key.
• Partitioning type. E.g. range or hash.
• Partitioning constraint. E.g. equi-width, equi-size.
• Number of partitions/fragments.
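The difference between the two partitioning constraints can be sketched as follows. This assumes that "equi-width" means ranges of equal key width and "equi-size" means ranges holding roughly the same number of tuples; the function names are hypothetical, and the sid values are taken from the Sailors example used later in the deck.

```python
def equi_width_bounds(lo, hi, n):
    """Split [lo, hi] into n ranges of equal width; returns n + 1 boundaries."""
    width = (hi - lo) / n
    return [lo + i * width for i in range(n + 1)]


def equi_size_bounds(values, n):
    """Return n - 1 split points so each range holds about len(values) / n tuples.

    A value v belongs to range i when split[i-1] <= v < split[i].
    """
    s = sorted(values)
    return [s[(i * len(s)) // n] for i in range(1, n)]


sids = [22, 29, 31, 32, 58, 64]
width_bounds = equi_width_bounds(min(sids), max(sids), 2)
size_splits = equi_size_bounds(sids, 2)
print(width_bounds)  # boundaries of two equal-width ranges
print(size_splits)   # split point giving two equally populated ranges
```

On this data the equal-width split at 43 puts four tuples in the first range and two in the second, while the equal-size split at 32 puts three tuples in each.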

[Figure: key values assigned to partitions by a hash function.]


Horizontal Fragmentation: Range Partition

sid  sname    rating  age
22   dustin   7       45
29   brutus   1       33
31   lubber   8       55
32   andy     4       23
58   rusty    10      35
64   horatio  7       35

Range Partition on rating column
• Partition 1: 0 <= rating < 5
• Partition 2: 5 <= rating <= 10

Partition 1
sid  sname    rating  age
29   brutus   1       33
32   andy     4       23

Partition 2
sid  sname    rating  age
22   dustin   7       45
31   lubber   8       55
58   rusty    10      35
64   horatio  7       35
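The range split on this slide can be reproduced with a short sketch. The function name `range_partition` and the tuple layout are assumptions made for illustration; the data and the split point at rating 5 come from the slide.

```python
from bisect import bisect_right

# (sid, sname, rating, age) rows of the Sailors example.
sailors = [
    (22, "dustin", 7, 45),
    (29, "brutus", 1, 33),
    (31, "lubber", 8, 55),
    (32, "andy", 4, 23),
    (58, "rusty", 10, 35),
    (64, "horatio", 7, 35),
]


def range_partition(rows, split_points, key_index):
    """Assign each row to a range partition.

    split_points is an ascending list; a key k goes to partition i
    (0-based) where i is the number of split points <= k.
    """
    parts = [[] for _ in range(len(split_points) + 1)]
    for row in rows:
        parts[bisect_right(split_points, row[key_index])].append(row)
    return parts


# One split point at 5: ratings below 5 versus ratings of 5 and above.
partition_1, partition_2 = range_partition(sailors, [5], key_index=2)
print([r[1] for r in partition_1])
print([r[1] for r in partition_2])
```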


Range Partition: Query Processing
• Which partitions?
• Better than non-parallel?

Partition 1
sid  sname    rating  age
29   brutus   1       33
32   andy     4       23

Partition 2
sid  sname    rating  age
22   dustin   7       45
31   lubber   8       55
58   rusty    10      35
64   horatio  7       35

SELECT *
FROM Sailors S

SELECT *
FROM Sailors S
WHERE rating = 2

SELECT *
FROM Sailors S
WHERE rating < 2 AND age < 30

SELECT *
FROM Sailors S
WHERE age > 30
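One way to approach the "which partitions?" question for the four queries is to compare each query's predicate on rating with the range of each partition. The sketch below assumes integer ratings, so that rating < 5 is the same as rating <= 4; the names `RANGES` and `partitions_for` are hypothetical.

```python
# Inclusive rating range of each partition, assuming integer ratings.
RANGES = {1: (0, 4), 2: (5, 10)}


def partitions_for(rating_lo=None, rating_hi=None):
    """Return partitions whose rating range overlaps [rating_lo, rating_hi].

    None means the query places no bound on that side of rating.
    """
    hit = []
    for pid, (lo, hi) in sorted(RANGES.items()):
        below_upper = rating_hi is None or lo <= rating_hi
        above_lower = rating_lo is None or rating_lo <= hi
        if below_upper and above_lower:
            hit.append(pid)
    return hit


no_predicate = partitions_for()                         # SELECT * FROM Sailors S
rating_eq_2 = partitions_for(rating_lo=2, rating_hi=2)  # WHERE rating = 2
rating_lt_2 = partitions_for(rating_hi=1)               # WHERE rating < 2 AND age < 30
age_only = partitions_for()                             # WHERE age > 30
print(no_predicate, rating_eq_2, rating_lt_2, age_only)
```

A predicate on age alone gives no bound on the partitioning key, so it cannot rule out any partition; the age condition in the third query is applied inside whichever partitions survive the rating check.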


Horizontal Fragmentation: Hash Partition

• Hash partitioning using hash function
  – Partition = rating mod 2

sid  sname    rating  age
22   dustin   7       45
29   brutus   1       33
31   lubber   8       55
32   andy     4       23
58   rusty    10      35
64   horatio  7       35

Partition 1 (rating mod 2 = 0)
sid  sname    rating  age
31   lubber   8       55
32   andy     4       23
58   rusty    10      35

Partition 2 (rating mod 2 = 1)
sid  sname    rating  age
22   dustin   7       45
29   brutus   1       33
64   horatio  7       35
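The hash split can be sketched in a few lines. The function name `hash_partition` is an assumption; the hash function rating mod 2 and the data are from the slide. Fragments are numbered from 0 here, so fragment 0 holds the even ratings and fragment 1 the odd ones.

```python
# (sid, sname, rating, age) rows of the Sailors example.
sailors = [
    (22, "dustin", 7, 45),
    (29, "brutus", 1, 33),
    (31, "lubber", 8, 55),
    (32, "andy", 4, 23),
    (58, "rusty", 10, 35),
    (64, "horatio", 7, 35),
]


def hash_partition(rows, n, key_index):
    """Place each row in fragment (key mod n), numbered 0 .. n - 1."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[row[key_index] % n].append(row)
    return parts


even_ratings, odd_ratings = hash_partition(sailors, 2, key_index=2)
print([r[1] for r in even_ratings])
print([r[1] for r in odd_ratings])
```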


Hash Partition: Query Processing
• Which partitions?
• Better than non-parallel?

SELECT *
FROM Sailors S

SELECT *
FROM Sailors S
WHERE rating = 2

SELECT *
FROM Sailors S
WHERE rating < 2 AND age < 30

SELECT *
FROM Sailors S
WHERE age > 30

Partition 1
sid  sname    rating  age
31   lubber   8       55
32   andy     4       23
58   rusty    10      35

Partition 2
sid  sname    rating  age
22   dustin   7       45
29   brutus   1       33
64   horatio  7       35
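For hash partitioning, a sketch of the same "which partitions?" check looks different from the range case: only an equality predicate on the partitioning key pins down a single fragment. The function name `partitions_for_hash` is hypothetical, fragments are numbered from 0 (fragment 0 holds the even ratings), and treating every non-equality predicate as touching all fragments is a simplifying assumption.

```python
def partitions_for_hash(n, rating_equals=None):
    """Return the fragments (0 .. n - 1) a query must read under rating mod n.

    rating_equals is the constant in a 'rating = c' predicate, or None
    when the query has no equality predicate on rating.
    """
    if rating_equals is not None:
        return [rating_equals % n]
    # Range predicates on rating and predicates on other columns
    # do not identify a fragment, so all fragments are read.
    return list(range(n))


no_predicate = partitions_for_hash(2)                   # SELECT * FROM Sailors S
rating_eq_2 = partitions_for_hash(2, rating_equals=2)   # WHERE rating = 2
rating_lt_2 = partitions_for_hash(2)                    # WHERE rating < 2 AND age < 30
age_only = partitions_for_hash(2)                       # WHERE age > 30
print(no_predicate, rating_eq_2, rating_lt_2, age_only)
```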


Method N: Naive Resize


Method C: Chunk-based


Method T: Tree-based


Method H: Hash-based
