An Adaptive Multi-Objective Scheduling Selection Framework For Continuous Query Processing

Timothy M. Sutherland, Bradford Pielech, Yali Zhu, Luping Ding and Elke Rundensteiner
Worcester Polytechnic Institute, Worcester, MA, USA

A Presentation @ IDEAS, Montreal, Canada, July 27, 2005
Continuous Query (CQ) Processing

[Diagram: users register continuous queries with a stream query engine; streaming data flows in and streaming results flow out.]

- Queries may have different QoS requirements.
- Streams may have time-varying rates and data distributions.
- The resources available for executing each operator may vary over time.
- Hence run-time adaptation is required in a stream query engine.
Runtime Adaptation Techniques

- Operator scheduling
- Query optimization
- Distribution
- Load shedding
- Others
Operator Scheduling for CQ

- Operator scheduling determines the order in which operators execute and allocates resources to query operators.
- Existing scheduling algorithms: Round Robin (RR), First In First Out (FIFO), Most Tuples In Queue (MTIQ), Chain [BBM03], and others.

[Diagram: a scheduler assigns an execution order (1, 2, 3, 4) to the operators of a query plan over streams A and B.]
Properties of Existing Scheduling Algorithms

- Uni-objective: designed for a single performance objective, such as increasing throughput, reducing memory, or reducing tuple delay.
- Fixed objective: cannot change the objective during a query run.
- These properties may be insufficient for CQ processing.
Performance Requirements in CQ

- May be multi-objective. Example: running time-critical queries on a memory-limited machine involves two performance goals, less result delay and less memory.
- May vary over time. Example: system resource availability may change during a query run; under light workload, favor faster throughput; under heavy workload, favor less memory.
- Existing scheduling algorithms are not designed for multiple, changing objectives; as a result, each has its strengths and weaknesses.
Scheduling Example: FIFO

FIFO: start at the leaf and process the oldest tuple through the plan until completion.

[Animation: a three-operator chain from the stream to the end user (leaf operator: σ = 0.9, T = 1; middle operator: σ = 0.1, T = 0.25; top operator: σ = 1, T = 0.75), showing queue contents at each time step.]

- FIFO's queue size grows quickly.
- FIFO produces its first outputs fast.
Scheduling Example: MTIQ

MTIQ: schedule the operator with the largest input queue.

[Animation: the same three-operator chain as in the FIFO example, showing queue contents at each time step.]

- MTIQ's queue size grows at a slower rate.
- Tuples remain queued for a longer time.
Performance Comparison

[Charts: throughput vs. time and queue size vs. time for FIFO and MTIQ.]
Summary of Problem

What does CQ processing need?

- Runtime adaptation
- Multiple prioritized optimization goals
- Dynamically changing optimization goals

Our solution: AMoS, an Adaptive Multi-Objective Scheduling Selection Framework.
Outline

- Introduction and Motivation
- The AMoS Framework
- Experimental Evaluation
- Conclusion
General Idea of the AMoS Framework

- A meta-scheduler:
  - the user specifies multiple performance objectives;
  - scheduling algorithms are ranked based on their observed performance;
  - the current best algorithm is selected dynamically.
- Design logic: simple, low-overhead, and effective.

[Architecture diagram: an Algorithm Evaluator consumes Statistics and the Performance Requirements; an Algorithm Selector, guided by a Selecting Strategy, draws on the Scheduling Algorithm Library and answers each Decision Request with a Scheduler Decision.]
Specifying Performance Objectives

- Metric: any statistic calculated by the system.
- Quantifier: maximize or minimize.
- Weight: the relative weight / importance of this metric. The sum of all weights is exactly 1.

Metric       Quantifier  Weight
Output Rate  Maximize    0.60
Memory       Minimize    0.25
Delay        Minimize    0.15
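The objective specification above could be captured roughly as follows (a minimal sketch; the `Objective` class and field names are illustrative, not part of AMoS or CAPE):

```python
from dataclasses import dataclass

@dataclass
class Objective:
    """One row of the performance-objectives table."""
    metric: str        # any statistic the system calculates
    maximize: bool     # quantifier: True = maximize, False = minimize
    weight: float      # relative importance of this metric

objectives = [
    Objective("output_rate", maximize=True,  weight=0.60),
    Objective("memory",      maximize=False, weight=0.25),
    Objective("delay",       maximize=False, weight=0.15),
]

# Sanity check: the weights must sum to exactly 1.
assert abs(sum(o.weight for o in objectives) - 1.0) < 1e-9
```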
Adapting Scheduler

- Step 1: Periodically score statistics
- Step 2: Score the scheduling algorithms
- Step 3: Select a scheduling algorithm
Step 1: Scoring Statistics

Performance Objectives:

Metric       Quantifier  Weight
Output Rate  Maximize    0.60
Memory       Minimize    0.25
Delay        Minimize    0.15

Stats Score Matrix:

       Output rate  Memory  Delay
MTIQ   0.10         0.12    0.46
FIFO   0.20         -0.31   -0.16
Chain  0.15         -0.50   0.21

Update the stats scores of the current scheduling algorithm A_current:

    z_i_new = ((μ_i^C − μ_i^H) / (max_i^H − min_i^H)) · (1 − decay) + z_i_old · decay

- z_i_new: score of stat i for A_current.
- μ_i^H, max_i^H, min_i^H: mean, max and min history values of stat i.
- μ_i^C: most recently collected value of stat i.
- decay: decay factor in (0, 0.5); exponentially decays out-of-date data.
- z_i_new ∈ (−1, 1).

Quantifier  z_i_new  Meaning
Maximize    > 0      Performs above average
Maximize    < 0      Performs below average
Minimize    > 0      Performs below average
Minimize    < 0      Performs above average
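The step-1 update rule transcribes to a few lines (a sketch; the function name is illustrative, and weighting the new observation by 1 − decay and the old score by decay follows the reading that a decay factor below 0.5 ages out stale data quickly):

```python
def update_stat_score(z_old, stat_now, hist_mean, hist_max, hist_min, decay=0.25):
    """Decayed, normalized score of one statistic for the current algorithm.

    (stat_now - hist_mean) / (hist_max - hist_min) lies in (-1, 1), so
    blending it with the old score keeps the result in (-1, 1).
    """
    fraction = (stat_now - hist_mean) / (hist_max - hist_min)
    return fraction * (1 - decay) + z_old * decay

# Example: latest value 10, history mean 5, history range [0, 15].
z = update_stat_score(z_old=0.0, stat_now=10.0,
                      hist_mean=5.0, hist_max=15.0, hist_min=0.0, decay=0.2)
```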
Step 2: Scoring Scheduling Algorithms

Update the score of scheduling algorithm A:

    score_A = Σ_{i=0}^{I} (q_i · z_i + 1) · w_i

- score_A: score for A.
- z_i: score of stat i for A.
- q_i: −1 for minimize, +1 for maximize.
- w_i: weight from the table of objectives.
- Adding 1 shifts each term from [−1, 1] to [0, 2]; since the weights sum to 1, score_A ∈ (0, 2).

Combining the Stats Score Matrix with the Performance Objectives (Output Rate maximized with weight 0.60, Memory minimized with weight 0.25, Delay minimized with weight 0.15):

       Output rate  Memory  Delay  Score
MTIQ   0.10         0.12    0.46   0.96
FIFO   0.20         -0.31   -0.16  1.22
Chain  0.17         -0.50   -0.21  1.65

Quantifier  z_i   Meaning        q_i · z_i  Effect on score_A
Maximize    > 0   above average  > 0        Increases score
Maximize    < 0   below average  < 0        Decreases score
Minimize    > 0   below average  < 0        Decreases score
Minimize    < 0   above average  > 0        Increases score
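The scoring formula can be checked against the example tables in a few lines (a sketch; the function and variable names are illustrative):

```python
def score_algorithm(stat_scores, objectives):
    """score_A = sum over i of (q_i * z_i + 1) * w_i, where q_i is +1 for a
    maximized metric and -1 for a minimized one.  With weights summing to 1
    and each z_i in (-1, 1), the result lies in (0, 2)."""
    total = 0.0
    for metric, (maximize, weight) in objectives.items():
        q = 1.0 if maximize else -1.0
        total += (q * stat_scores[metric] + 1.0) * weight
    return total

objectives = {
    "output_rate": (True, 0.60),   # maximize
    "memory":      (False, 0.25),  # minimize
    "delay":       (False, 0.15),  # minimize
}

mtiq = {"output_rate": 0.10, "memory": 0.12, "delay": 0.46}
print(round(score_algorithm(mtiq, objectives), 2))  # 0.96, matching the table
```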
Issues in the Scheduler Selection Process

- The framework needs to learn each algorithm's behavior. Solution: all algorithms are initially run once.
- An algorithm that did poorly earlier may be good now. Solution: periodically explore the other algorithms. This is the reason for adopting the Roulette Wheel strategy.
Step 3: Selecting the Next Algorithm

Roulette Wheel selection [MIT99]:

- Chooses the next algorithm with a probability proportional to its score.
- Favors the better-scoring algorithms, but will still pick the others.

[Pie chart: the scores of 4 scheduling algorithms, each slice sized by its score.]

- Well-performing algorithms have better chances.
- The others still have a chance to be explored.
- Lightweight, so the overhead is very low.
- Proven effective in our experimental study.
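Roulette-wheel selection over the algorithm scores can be sketched as follows (the function name is illustrative; the scores are the step-2 example values):

```python
import random

def roulette_select(scores, rng=random):
    """Choose a key with probability proportional to its (positive) score."""
    total = sum(scores.values())
    r = rng.uniform(0, total)
    cumulative = 0.0
    for name, score in scores.items():
        cumulative += score
        if r <= cumulative:
            return name
    return name  # floating-point guard for r at the upper boundary

# Using the algorithm scores from the Step 2 example:
scores = {"MTIQ": 0.96, "FIFO": 1.22, "Chain": 1.65}
# Chain (score 1.65) is picked with probability 1.65 / 3.83, about 43%,
# yet MTIQ still gets about 25% -- poorer performers keep being explored.
```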
Overall Flow of the Adapting Process

Input: performance objectives and candidate scheduling algorithms.

- Initially run all algorithms once.
- Periodically score statistics.
- When a scheduler change is requested, rank the candidate algorithms and select the next algorithm to run.
- Repeat until the query is done.
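The flow above can be put together in one loop (a sketch; every callable passed in is a hypothetical placeholder for a hook the real engine would provide, not the CAPE API):

```python
import random

def roulette_pick(scores, rng=random):
    """Pick an algorithm with probability proportional to its score."""
    r = rng.uniform(0, sum(scores.values()))
    acc = 0.0
    for algo, s in scores.items():
        acc += s
        if r <= acc:
            return algo
    return algo  # floating-point guard

def amos_loop(algorithms, score_of, change_requested, query_done):
    """Sketch of the adapting process: score_of stands in for running an
    algorithm and scoring its collected statistics; change_requested and
    query_done stand in for the engine's decision and termination hooks."""
    # Initially score every algorithm once (standing in for an initial run
    # of each candidate) so that each one has a score.
    scores = {a: score_of(a) for a in algorithms}
    current = roulette_pick(scores)
    while not query_done():
        # Periodically score statistics for the running algorithm.
        scores[current] = score_of(current)
        if change_requested():
            # Rank the candidates and select the next algorithm to run.
            current = roulette_pick(scores)
    return current
```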
Summary of the AMoS Framework

- Lightweight: uses the runtime statistics already collected by the system; the ranking formulas are simple yet effective.
- Self-learning: no a priori information is needed; the behavior of each scheduling algorithm is learned on the fly.
- Easily extendable: add more scheduling algorithms, more performance objectives, or more selecting strategies.
Outline

- Introduction and Motivation
- The AMoS Framework
- Experimental Evaluation
- Conclusion
Experimental Setup

- Evaluated in the CAPE system [cape04], a prototype continuous query system.
- Query plans consist of join and select operators.
- Input streams are simulated with a Poisson arrival pattern.
- Performance objectives: varied in both number and weight.
One Performance Objective

100% focus on minimizing tuples in memory:

[Chart: tuples in memory vs. time (s) for AMoS, FIFO, MTIQ, RR and Chain.]

100% focus on minimizing tuple delay:

[Chart: average tuple delay (ms) vs. time (s) for AMoS, FIFO, MTIQ, RR and Chain.]
Two Performance Objectives

50% focus on output rate, 50% focus on tuple delay:

[Charts: output rate (tuples/ms) and average tuple delay in the query plan (ms) vs. time (s) for AMoS, FIFO, MTIQ, RR and Chain.]
Two Performance Objectives (cont.)

70% focus on tuple delay, 30% focus on output rate:

[Charts: average tuple delay in the query plan (ms) and average tuple output rate (tuples/ms) vs. time (s) for AMoS, FIFO, MTIQ, RR and Chain.]

30% focus on tuple delay, 70% focus on output rate:

[Charts: average tuple delay in the query plan (ms) and average tuple output rate (tuples/ms) vs. time (s) for AMoS, FIFO, MTIQ, RR and Chain.]
Three Performance Objectives

Equal focus (33%) on output rate, memory and tuple delay:

[Charts: output rate (tuples/ms), average tuple delay in the query plan (ms), and tuples in memory vs. time (s) for AMoS, FIFO, MTIQ, RR and Chain.]
Conclusions

- Identified the lack of support for multi-objective adaptation: existing approaches focus on a single objective and cannot change objectives during a query run.
- Proposed a novel scheduling framework that:
  - allows applications to control the performance objectives;
  - alters the scheduling algorithm based on run-time performance;
  - is independent of the particular scheduling algorithms and performance objectives.
- The AMoS strategy shows very promising experimental results:
  - developed and evaluated in the CAPE system;
  - with a single objective, it performs as well as the best individual algorithm;
  - with multiple objectives, it performs better overall than any individual algorithm.
Thank You!

For more information, please visit: davis.wpi.edu/~dsrg