the power of both choices: practical load balancing for distributed stream processing engines

The Power of Both ChoicesPractical Load Balancing for Distributed Stream

Processing Engines Muhammad Anis Uddin Nasir, Gianmarco De Francisci Morales, David Garcia-Soriano

Nicolas Kourtellis, Marco Serafini

International Conference on Data Engineering (ICDE 2015)

Stream Processing Engines

• Streaming Application

– Online Machine Learning

– Real Time Query Processing

– Continuous Computation

• Streaming Frameworks

– Storm, Borealis, S4, Samza, Spark Streaming

2The Power of Both Choices

Stream Processing Model

• Streaming Applications are represented by Directed Acyclic Graphs (DAGs)

Worker

Worker

Worker

Source

Source


Data Stream

Operators

Data Channels

Stream Grouping

• Key or Fields Grouping– Hash-based assignment

– Stateful operations, e.g., page rank, degree count

• Shuffle Grouping– Round-robin assignment

– Stateless operations, e.g., data logging, OLTP


Key Grouping


• Key Grouping• Scalable✔• Low Memory ✔• Load Imbalance ✖

Shuffle Grouping


• Shuffle Grouping• Load Balance ✔• Memory O(W) ✖• Aggregation O(W) ✖

Problem Formulation

• Input is a unbounded sequence of messages from a key distribution

• Each message is assigned to a worker for processing

• Load balance properties– Memory Load Balance– Network Load Balance– Processing Load Balance

• Metric: Load Imbalance

The Power of Both Choices 7

Power of two choices

• Balls-and-bins problem

• Algorithm– For each ball, pick two bins uniformly at random– Assign the ball to least loaded of the two bins

• Issues– Distributed ✖– Consensus on Keys ✖– Skewed distribution ✖– Continuous Data✖– Load Information ✖


Img source: http://s17.postimg.org/qqctbpftr/Galton_prime_box.jpg

Partial Key Grouping (PKG)

• Key Splitting

– Split each key into two server

– Assign each instance using power of two choices

• Local Load Estimation

– each source estimates load on

– using the local routing history



• Key Splitting

• Local Load Estimation


Source

Source

Worker

Worker

Worker

2 0 1

1 0 2

2 0 2


• Key Splitting

– Distributed

– Stateless

– Handle Skew

• Local load estimation

– No coordination among sources

– No communication with workers


Partial Key Grouping


• PKG• Load Balance ✔• Memory O(1) ✔• Aggregation O(1) ✔

Analysis: Chromatic Balls and Bins

• Problem Formulation– If messages are drawn from a key distribution where

probabilities of keys are p1≥p2≥p3….. ≥pn

– Each key has d choices out of n workers

• Minimize the difference between maximum and average workload


Analysis

• Necessary Condition: If pi represents the probabilityof occurrence of a key i

• Bounds:


Streaming Applications

• Most algorithms that use Shuffle Grouping can be expressed using Partial Key Grouping to reduce:

– Memory footprint

– Aggregation overhead

• Algorithms that use Key Grouping can be rewritten to achieve load balance


Streaming Examples

• Naïve Bayes Classifier

• Streaming Parallel Decision Trees

• Heavy Hitters and Space Saving


Stream Grouping: Summary

Grouping • Pros • Cons

Key Grouping • Scalable• Memory

• Load Imbalance

Shuffle Grouping • Load Balance • Memory O(W)• Aggregation O(W)

Partial Key Grouping • Scalable• Load Balance• Memory O(1)• Aggregation O(1)


Experiments

• What is the effect of key splitting on POTC?

• How does local estimation compare to aglobal oracle?

• How does PKG perform on a real deploymenton Apache Storm?


Experimental Setup

• Metric– the difference of maximum and the average load of the workers

at time t

• Datasets– Twitter, 1.2G tweets (crawled July 2012)– Wikipedia, 22M access logs– Twitter, 690K cashtags (crawled Nov 2013)– Social Networks, 69M edges– Synthetic, 10M keys


Effect of Key Splitting


Local Load Estimation


Real deployment: Apache Storm


0

200

400

600

800

1000

1200

1400

1600

0 0.2 0.4 0.6 0.8 1

Th

rou

ghp

ut

(ke

ys/s

)

(a) CPU delay (ms)

PKGSGKG

1000

1100

1200

0.100

2.106

4.106

6.106

(b) Memory (keys)

10s

10s30s

30s 60s

60s

300s

300s

600s

600s

PKGSGKG

Conclusion

• Partial Key Grouping (PKG) reduces the load imbalance byup to seven orders of magnitude compared to KeyGrouping

• PKG imposes constant memory and aggregation overhead,i.e., O(1), compared to Shuffle Grouping that is O(W)

• Apache Storm– 60% improvement in throughput– 45% improvement in latency

• PKG has been integrated in Apache Storm ver 0.10.


Future Work

• Load Balancing for Stateful Operators using key migration

• Adaptive Load Balancing for highly skewed data

• Load Balancing for graph processing systems


The Power of Both ChoicesPractical Load Balancing for Distributed Stream

Processing Engines Muhammad Anis Uddin Nasir, Gianmarco De Francisci Morales, David Garcia-Soriano

Nicolas Kourtellis, Marco Serafini

International Conference on Data Engineering (ICDE) 2015

the power of both choices: practical load balancing for distributed stream processing engines

Engineering

various stream

incoming stream

need of real time processing

real time data

common stream transformation

high throughput processing

incoming data

data logging