Elasca: Workload-Aware Elastic Scalability for Partition Based Database Systems Taha Rafiq MMath Thesis Presentation 24/04/2013


Elasca: Workload-Aware Elastic Scalability for Partition Based Database Systems

Taha Rafiq

MMath Thesis Presentation

24/04/2013


Outline

1. Introduction & Motivation
2. VoltDB & Elastic Scale-Out Mechanism
3. Partition Placement Problem
4. Workload-Aware Optimizer
5. Experiments & Results
6. Supporting Multi-Partition Transactions
7. Conclusion


INTRODUCTION & MOTIVATION


DBMS Scalability

Replication

Partitioning


Traditional (DBMS) Scalability

Higher Load

Add Resources

Better Performance

Ability of a system to be enlarged to handle a growing amount of work

Expensive Downtime


Elastic (DBMS) Scalability

Higher Load

Dynamically Add Resources

Better Performance

Use of computer resources which vary dynamically to meet a variable workload

No Downtime


Elastically Scaling a Partition Based DBMS

Re-Partitioning

[Figure: before scale out, Node 1 holds Partition 1; after scale out, Node 1 holds Partition 1 and Node 2 holds Partition 2. Scale in reverses the step.]


Elastically Scaling a Partition Based DBMS

Partition Migration

[Figure: before scale out, Node 1 holds P1, P2, P3, and P4; after scale out, Node 1 holds P1 and P2 and Node 2 holds P3 and P4. Scale in reverses the step.]


Partition Migration for Elastic Scalability

Mechanism: How to add/remove nodes and move partitions

Policy/Strategy: Which partitions to move, when, and where during scale out/scale in


Elasca = Elastic Scale-Out Mechanism + Partition Placement & Migration Optimizer


VOLTDB & ELASTIC SCALE-OUT MECHANISM


What is VoltDB?

• In memory, partition based DBMS
  – No disk access = very fast
• Shared nothing architecture, serial execution
  – No locks
• Stored procedures
  – No arbitrary transactions
• Replication
  – Fault tolerance & durability


VoltDB Architecture

[Figure: three nodes, each with a Client Interface, an Initiator, and two Execution Site threads (ES1, ES2). The nodes host partitions P1 P2, P3 P1, and P2 P3 respectively, and serve four clients.]


Single-Partition Transactions

[Figure: the three-node VoltDB architecture diagram (four clients), illustrating a single-partition transaction.]


Multi-Partition Transactions

[Figure: the three-node VoltDB architecture diagram (four clients), illustrating a multi-partition transaction, with an additional ES1 label.]


Elastic Scale-Out Mechanism

[Figure: two nodes, each with a Client Interface and an Initiator, hosting P3 P4 (ES3, ES4) and P1 P2 (ES1, ES2), plus a Scale-Out Node marked "(Failed)" shown with ES1, ES4, P1, and P4.]


Overcommitting Cores

• VoltDB suggests: Partitions per node < Cores per node
• Wasted resources when load is low or data access is skewed

Idea: Aggregate extra partitions on each node and scale out when load increases


PARTITION PLACEMENT PROBLEM


Given…

Cluster and System Specifications

• Number of CPU cores
• Memory
• Max. number of nodes


Given…

[Figure: bar chart "Load Per Partition". X-axis: Partition (P1 to P8). Y-axis: Requests Per Second (0 to 3000).]


Given…

[Figure: bar chart "Size of Each Partition". X-axis: Partition (P1 to P8). Y-axis: Size in MB (0 to 1200).]


Given…

[Table: "Current Partition-to-Node Assignment". Rows: P1 to P8. Columns: Node 1, Node 2, Node 3.]


Find…

Partition Node 1 Node 2 Node 3

P1 ? ? ?

P2 ? ? ?

P3 ? ? ?

P4 ? ? ?

P5 ? ? ?

P6 ? ? ?

P7 ? ? ?

P8 ? ? ?

Optimal Partition-to-Node Assignment (For Next Time Interval)


Optimization Objectives

Maximize Throughput: Match the performance of a static, fully provisioned system

Minimize Resources Used: Use the minimum number of nodes required to meet performance demands


Optimization Objectives

Minimize Data Movement: Data movement adversely affects system performance and incurs network costs

Balance Load Effectively: Minimizes the risk of overloading a node during the next time interval


WORKLOAD-AWARE OPTIMIZER

System Overview


Statistics Collected

α. Maximum number of transactions that can be executed on a partition per second
  – Max capacity of Execution Sites

β. CPU overhead of host-level tasks
  – How much CPU capacity the Initiator uses

Effect of β

Estimating CPU Load

CPU Load Generated by Each Partition

Average CPU Load of Host-Level Tasks Per Node

Average CPU Load Per Node
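
The formulas on this slide did not survive in the transcript, so the following is only a rough sketch of how the three quantities could be related. It assumes (this is not confirmed by the slides) that a partition's CPU load is its request rate divided by α, that host-level overhead grows linearly with the node's request rate at β cores per request per second, and that node load is expressed as a fraction of all cores. The function names are hypothetical.

```python
from typing import List


def partition_cpu_load(request_rate: float, alpha: float) -> float:
    """Cores used by one partition (assumption: rate / alpha)."""
    return request_rate / alpha


def host_level_cpu_load(request_rates: List[float], beta: float) -> float:
    """Cores used by host-level tasks (assumption: linear in total rate)."""
    return beta * sum(request_rates)


def node_cpu_load(request_rates: List[float], alpha: float,
                  beta: float, cores: int) -> float:
    """Average CPU load of a node as a fraction of its total capacity."""
    partitions = sum(partition_cpu_load(r, alpha) for r in request_rates)
    host = host_level_cpu_load(request_rates, beta)
    return (partitions + host) / cores
```

For example, two partitions receiving 1000 and 500 requests per second with α = 2000, β = 0.0001, and 4 cores give a node load of roughly 0.225 under these assumptions.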


Optimizer Details

• Mathematical Optimization vs. Heuristics
• Mixed-Integer Linear Programming (MILP)
• Can be solved using any general-purpose solver (we use IBM ILOG CPLEX)
• Applicable for a wide variety of scenarios

Objective Function

Minimizes data movement as primary objective and balances load as secondary objective
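
The objective function itself is not in the transcript. As a hedged illustration of the idea, the sketch below scores an assignment as data moved plus ε times a load-imbalance term, and finds the best assignment for a tiny instance by brute force instead of MILP. The imbalance measure (maximum minus minimum node load), the role of ε as a small weight, and all names are assumptions.

```python
from itertools import product
from typing import List, Optional, Sequence, Tuple


def node_loads(assignment: Sequence[int], loads: List[float],
               num_nodes: int) -> List[float]:
    """Total partition load placed on each node."""
    totals = [0.0] * num_nodes
    for partition, node in enumerate(assignment):
        totals[node] += loads[partition]
    return totals


def cost(assignment: Sequence[int], current: Sequence[int],
         sizes_mb: List[float], loads: List[float],
         num_nodes: int, epsilon: float) -> float:
    """Primary term: MB moved. Secondary term: epsilon * load imbalance."""
    moved = sum(sizes_mb[p] for p in range(len(current))
                if assignment[p] != current[p])
    totals = node_loads(assignment, loads, num_nodes)
    return moved + epsilon * (max(totals) - min(totals))


def best_assignment(current: Sequence[int], sizes_mb: List[float],
                    loads: List[float], num_nodes: int, capacity: float,
                    epsilon: float) -> Tuple[Optional[Tuple[int, ...]], float]:
    """Enumerate every assignment; only practical for tiny instances."""
    best, best_cost = None, float("inf")
    for candidate in product(range(num_nodes), repeat=len(current)):
        if max(node_loads(candidate, loads, num_nodes)) > capacity:
            continue  # skip assignments that overload a node
        c = cost(candidate, current, sizes_mb, loads, num_nodes, epsilon)
        if c < best_cost:
            best, best_cost = candidate, c
    return best, best_cost
```

With four equally loaded partitions on one overloaded node, this picks the two smallest partitions to move, because movement dominates the cost when ε is small.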

Effect of ε


Minimizing Resources Used

• Calculate the minimum number of nodes that can handle the load of all the partitions
  – Non-integer assignment
• Explicitly tell optimizer how many nodes to use
• If optimizer can’t find solution with minimum nodes, it tries again with N + 1 nodes
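
The retry loop described above can be sketched as follows. The lower bound treats load as divisible across nodes (the "non-integer assignment"), and a first-fit-decreasing placement stands in for the real optimizer purely for illustration. All function names are hypothetical, and loads are given in percent of one node's capacity to keep the arithmetic exact.

```python
import math
from typing import Dict, List, Optional, Tuple


def min_nodes_lower_bound(loads: List[int], node_capacity: int) -> int:
    """Fewest nodes needed if partitions could be split across nodes."""
    return max(1, math.ceil(sum(loads) / node_capacity))


def try_place(loads: List[int], num_nodes: int,
              node_capacity: int) -> Optional[Dict[int, int]]:
    """Stand-in for the optimizer: first-fit decreasing, or None if it fails."""
    remaining = [node_capacity] * num_nodes
    assignment: Dict[int, int] = {}
    for p in sorted(range(len(loads)), key=lambda i: -loads[i]):
        for n in range(num_nodes):
            if loads[p] <= remaining[n]:
                remaining[n] -= loads[p]
                assignment[p] = n
                break
        else:
            return None  # partition p fits on no node
    return assignment


def place_with_min_nodes(loads: List[int], node_capacity: int,
                         max_nodes: int) -> Optional[Tuple[int, Dict[int, int]]]:
    """Start at the lower bound N and retry with N + 1 until a placement works."""
    n = min_nodes_lower_bound(loads, node_capacity)
    while n <= max_nodes:
        assignment = try_place(loads, n, node_capacity)
        if assignment is not None:
            return n, assignment
        n += 1
    return None
```

Three partitions at 60% load each have a fractional lower bound of 2 nodes, but no integer placement fits on 2 nodes, so the loop settles on 3.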


Constraints

• Replication: Replicas of a given partition must be assigned to different nodes

• CPU Capacity: Sum of the load of partitions must be less than capacity of node

• Memory Capacity: All the partitions assigned to a node must fit in its memory

• Host-Level Tasks: The overhead of host-level tasks must not exceed capacity of single core
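
As a rough illustration, the four constraints can be expressed as a feasibility check on a candidate placement. This is a sketch rather than the MILP formulation: it assumes every replica receives the partition's full request rate, that partition load is rate / α in cores, and that host-level overhead is β times the node's request rate. The function and parameter names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def check_constraints(placement: Dict[str, List[int]],
                      rate: Dict[str, float],
                      size_mb: Dict[str, float],
                      alpha: float, beta: float,
                      cores: int, memory_mb: float) -> List[str]:
    """Return the names of violated constraints (empty list = feasible).

    placement maps each partition to the nodes hosting its replicas.
    """
    violated = set()
    cpu = defaultdict(float)   # cores used by partitions, per node
    mem = defaultdict(float)   # MB used by partitions, per node
    req = defaultdict(float)   # requests per second, per node
    for part, nodes in placement.items():
        if len(set(nodes)) != len(nodes):
            violated.add("replication")  # two replicas share a node
        for n in nodes:
            cpu[n] += rate[part] / alpha
            mem[n] += size_mb[part]
            req[n] += rate[part]
    for n in cpu:
        if cpu[n] > cores:
            violated.add("cpu")
        if mem[n] > memory_mb:
            violated.add("memory")
        if beta * req[n] > 1.0:  # host-level tasks limited to one core
            violated.add("host-level")
    return sorted(violated)
```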


Staggering Scale In

• Fluctuating workload can result in excessive data movement
• Staggering scale in mitigates this problem
• Delay scaling in by s time steps
• Slightly higher resources used to provide stability
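
One simple way to read "delay scaling in by s time steps" is to keep, at each step, the largest node count required over the last s + 1 steps, so scale out is immediate and scale in only happens after the lower requirement has held for s steps. The sketch below implements that interpretation. It is an assumption, not necessarily the exact rule used by ELASCA-S, and the function name is hypothetical.

```python
from typing import List


def staggered_node_counts(required: List[int], s: int) -> List[int]:
    """Node count per step when scale in is delayed by s time steps."""
    return [max(required[max(0, t - s): t + 1])
            for t in range(len(required))]


def count_changes(counts: List[int]) -> int:
    """Number of time steps at which the node count changes."""
    return sum(1 for a, b in zip(counts, counts[1:]) if a != b)
```

For a requirement that fluctuates as 2, 3, 2, 3, 2, 2, 2 with s = 2, the staggered sequence is 2, 3, 3, 3, 3, 3, 2: the cluster changes size twice instead of four times, at the price of holding a third node for a few extra steps.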


EXPERIMENTAL EVALUATION


Optimizers Evaluated

• ELASCA: Our workload-aware optimizer
• ELASCA-S: ELASCA with staggered scale in
• OFFLINE: Offline optimizer that minimizes resources used and data movement
• GREEDY: A greedy first-fit optimizer
• SCO: Static, fully provisioned system (no optimization)


Benchmarks Used

• TPC-C: Modified to make it cleanly partitioned and fit in memory (3.6 GB)

• TATP: Telecommunication Application Transaction Processing Benchmark (250 MB)

• YCSB: Yahoo! Cloud Serving Benchmark with 50/50 read/write ratio (1 GB)


Dynamic Workloads

• Varying the aggregate request rate
  – Periodic waveforms
    • Sine, Triangle, Sawtooth
• Skewing the data access
  – Temporal skew
  – Statistical distributions
    • Uniform, Normal, Categorical, Zipfian
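
A workload of this kind could be generated along the following lines. This is an illustrative sketch only: it assumes f is the number of waveform periods over the whole experiment, scales each waveform between a minimum and maximum aggregate rate, and uses a Zipfian distribution with a configurable exponent to split the aggregate rate across partitions. None of these names or parameter choices come from the slides.

```python
import math
from typing import List


def wave_level(phase: float, shape: str) -> float:
    """Map a phase in [0, 1) to a load level in [0, 1]."""
    if shape == "sine":
        return 0.5 * (1.0 + math.sin(2.0 * math.pi * phase))
    if shape == "triangle":
        return 1.0 - abs(2.0 * phase - 1.0)
    if shape == "sawtooth":
        return phase
    raise ValueError(f"unknown shape: {shape}")


def aggregate_rate(t: int, total_steps: int, f: int,
                   min_rate: float, max_rate: float, shape: str) -> float:
    """Aggregate request rate at step t for f periods over total_steps."""
    phase = (f * t / total_steps) % 1.0
    return min_rate + (max_rate - min_rate) * wave_level(phase, shape)


def zipfian_weights(num_partitions: int, exponent: float = 1.0) -> List[float]:
    """Share of requests per partition, most popular partition first."""
    raw = [1.0 / (rank ** exponent) for rank in range(1, num_partitions + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def per_partition_rates(total_rate: float, weights: List[float]) -> List[float]:
    """Split the aggregate rate across partitions according to the skew."""
    return [total_rate * w for w in weights]
```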


Temporal Skew

[Figure: five bar charts of Load per partition (P1 to P8), labeled t = 1, t = 2, t = 3, t = 4, and t = 1 again.]


Experimental Setup

• Each experiment run for 1 hour
• 15 time intervals
  – Optimizer run every four minutes
• Combination of simulation and actual runs
  – Exact numbers for data movement, resources used and load balance through simulation
• Cluster has 4 nodes, 2 separate client machines

Data Movement (TPC-C)

Triangle Wave (f = 1)

Data Movement (TPC-C)

Triangle Wave (f = 1), Zipfian Skew

Data Movement (TPC-C)

Triangle Wave (f = 4)

Computing Resources Saved (TPC-C)

Triangle Wave (f = 1)

Load Balance (TPC-C)

Triangle Wave (f = 1)

Database Throughput (TPC-C)

Sine Wave (f = 2)

Database Throughput (TPC-C)

Sine Wave (f = 2), Normal Skew

Database Throughput (TATP)

Sine Wave (f = 2)

Database Throughput (YCSB)

Sine Wave (f = 2)

Database Throughput (TPC-C)

Triangle Wave (f = 4)

Optimizer Scalability


SUPPORTING MULTI-PARTITION TRANSACTIONS


Factors Affecting Performance

• Maximum MPT Throughput (η): The maximum number of transactions an execution site can coordinate per second

• Probability of MPTs (p_mpt): Percentage of transactions that are MPTs

• Partitions Involved in MPTs: The number of partitions involved in MPTs


Changes to Model

CPU load generated by each partition is equal to the sum of:

1. Load due to transaction work (same as SPTs)
2. Load due to coordinating MPTs
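
The exact formula is not in the transcript. A minimal sketch of the two-term model, assuming transaction work costs rate / α cores, coordination costs (p_mpt × rate) / η cores, and p_mpt is expressed as a fraction between 0 and 1, could look like this (the function name is hypothetical):

```python
def partition_cpu_load_with_mpts(request_rate: float, alpha: float,
                                 eta: float, p_mpt: float) -> float:
    """Cores used by a partition when some transactions are multi-partition.

    Assumed model: transaction work plus MPT coordination overhead.
    """
    if not 0.0 <= p_mpt <= 1.0:
        raise ValueError("p_mpt must be a fraction between 0 and 1")
    work = request_rate / alpha                  # same as the SPT-only model
    coordination = (p_mpt * request_rate) / eta  # cost of coordinating MPTs
    return work + coordination
```

Under these assumptions, 1000 requests per second with α = 2000, η = 500, and 10% MPTs give 0.5 + 0.2 = 0.7 cores, compared with 0.5 cores when there are no MPTs.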

Maximum MPT Throughput

Probability of MPTs

Effect on Resources Saved

Effect on Data Movement


CONCLUSION


Related Work

• Data replication and partitioning
• Database consolidation
• Live database migration
• Key-value stores
• Data placement


Elasca = Elastic Scale-Out Mechanism + Partition Placement & Migration Optimizer


Conclusion

• Elasca = Mechanism + Optimizer
• Workload-Aware Optimizer
  – Meets performance demands
  – Minimizes computing resources used
  – Minimizes data movement
  – Effectively balances load
• Scalable to large problem sizes for online setting


Future Work

• Migrating to VoltDB 3.0
  – Intelligent client routing, master/slave partitions
• Supporting multi-partition transactions
• Automated parameter tuning
• Transaction mixes
• Workload prediction


Thank You

Questions?