
Xiaodan Wang, Randal Burns
Department of Computer Science
Johns Hopkins University

Tanu Malik
Cyber Center
Purdue University

LifeRaft: Data-Driven, Batch Processing for the Exploration of Scientific Databases


BETTER LUCK NEXT TIME!

Problem

Eliminate redundant I/O to improve query throughput

Batch queries that exhibit data sharing
– Pre-process queries to identify data sharing
– Co-schedule queries that access the same data
– Access contentious data first to maximize sharing

Starvation resistance
– Avoid indefinite queuing times (response time)
– Enforce some constraints on completion order

Target Applications

Data intensive scan queries
– Executed against a clustered index
– Clustered and federated databases (e.g. joins that correlate multiple nodes)

Peta-scale astronomy (Pan-STARRS)
– Data are partitioned spatially
– Many queries scan full DB and last hours or days

Cross-match
– Probabilistic spatial join across multiple databases

Filter and Refine

Filter queries
– Pre-process queries to determine join buckets
– Buckets B1,…,Bn and queries Q1,…,Qm
– Workload Wij denotes the objects from Qi that overlap Bj

Refinement
– Read buckets one at a time
– Sort-merge join (sort by HTM ID)
– Query-specific predicates applied on output tuples
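
The filter and refine steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the LifeRaft implementation: objects are reduced to integer HTM IDs that are unique within each input, buckets are assumed to be contiguous HTM ID ranges, the join is simplified to ID equality (the real cross-match is a probabilistic spatial join), and the names `assign_workload` and `refine_bucket` are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def assign_workload(queries, bucket_starts):
    """Filter step: map each query object to the bucket covering its HTM ID.

    queries: dict of query name -> list of HTM IDs (assumed representation).
    bucket_starts: sorted lower bounds of the buckets (assumed layout).
    Returns W with W[j][i] = sorted HTM IDs of query i that overlap bucket j.
    """
    workload = defaultdict(lambda: defaultdict(list))
    for qname, htm_ids in queries.items():
        for htm_id in htm_ids:
            j = bisect_right(bucket_starts, htm_id) - 1
            if j >= 0:  # IDs below the first bucket overlap nothing
                workload[j][qname].append(htm_id)
    # Sort each per-bucket workload so it is ready for a sort-merge join.
    for per_query in workload.values():
        for ids in per_query.values():
            ids.sort()
    return workload


def refine_bucket(bucket_ids, query_ids):
    """Refinement step: sort-merge join of one bucket with one query's objects.

    Both inputs must be sorted by HTM ID; returns the IDs present in both.
    """
    out, a, b = [], 0, 0
    while a < len(bucket_ids) and b < len(query_ids):
        if bucket_ids[a] < query_ids[b]:
            a += 1
        elif bucket_ids[a] > query_ids[b]:
            b += 1
        else:
            out.append(bucket_ids[a])
            a += 1
            b += 1
    return out
```

Because each bucket is read once and joined against every query's workload for that bucket, a single I/O can serve all queries that share the bucket.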

Workload Throughput Metric

Schedule buckets greedily in order of decreasing workload throughput
Exploits data regions that experience contention
May starve requests
– Favors buckets experiencing frequent reuse
– No guarantee a particular bucket or query receives service
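
A greedy choice by workload throughput might look like the sketch below. The slides do not give the metric's formula, so the form used here (pending objects divided by estimated service cost, where a cached bucket skips the read) is an assumption, as are the function names and the default cost values.

```python
def workload_throughput(pending, cached, io_cost=1.0, join_cost=0.001):
    """Objects joined per second if this bucket is serviced next (assumed form).

    pending: number of queued query objects that overlap the bucket.
    cached: True if the bucket is already in memory, so no read is needed.
    io_cost, join_cost: hypothetical costs in seconds (per bucket, per object).
    """
    if pending == 0:
        return 0.0
    cost = (0.0 if cached else io_cost) + join_cost * pending
    return pending / cost


def pick_bucket_greedy(pending_by_bucket, cache):
    """Return the bucket with the highest workload throughput, or None."""
    candidates = {b: n for b, n in pending_by_bucket.items() if n > 0}
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda b: workload_throughput(candidates[b], b in cache),
    )
```

Under this rule a bucket with only a few pending objects is never chosen while busier buckets keep receiving new work, which is the starvation risk noted above.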

Aged Workload Throughput Metric

Inspired by disk-drive head scheduling
Balance arrival order (low response time) with contention (high throughput)
Adaptive trade-offs based on workload saturation
– Maximize rate at which objects are joined during saturated workloads
– Enforce completion order (queuing times) to prevent indefinite starvation during low saturation
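
One way to blend contention with age is a weighted sum controlled by an age bias, sketched below. The slides only establish the end points (bias 1 is age based, bias 0 is contention based); the linear blend, the normalization by the current maxima, and the names `aged_score` and `pick_bucket_aged` are assumptions made for illustration.

```python
def aged_score(throughput, age, bias, max_throughput, max_age):
    """Blend normalized throughput and normalized age (assumed linear form).

    bias = 0 ranks purely by contention; bias = 1 ranks purely by age.
    """
    t = throughput / max_throughput if max_throughput > 0 else 0.0
    a = age / max_age if max_age > 0 else 0.0
    return (1.0 - bias) * t + bias * a


def pick_bucket_aged(buckets, bias):
    """Pick the bucket with the highest aged score.

    buckets: dict of bucket name -> (workload throughput,
             age in seconds of the oldest pending request).
    """
    if not buckets:
        return None
    max_t = max(t for t, _ in buckets.values())
    max_a = max(a for _, a in buckets.values())
    return max(
        buckets,
        key=lambda b: aged_score(*buckets[b], bias, max_t, max_a),
    )
```

With a heavily shared bucket and a lightly used bucket holding an old request, bias 0 services the shared bucket first and bias 1 services the old request first; intermediate values trade one for the other.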

Scheduling Behavior

[Figure: queries Qi, Qj, Qk laid over buckets B1 through B8]

Sub-divide queries by bucket:
Qi – Qi1, Qi2, Qi3
Qj – Qj3, Qj4, Qj5, Qj6, Qj7, Qj8
Qj – Qj5, Qj6, Qj7, Qj8

Assumptions:
- Inter-query time of 1 sec
- I/O for each bucket of 1 sec
- Cache size of 2
- Join cost is negligible

Arrival order with no sharing

[Figure: timeline over buckets B1 through B8 marking the arrival (Arr) and end (End) of Qi, Qj, Qk]

Completion Times:
Qi – 3 sec
Qj – 8 sec
Qk – 13 sec
Avg – 8 sec
Tp – .2 qry/sec

Age based scheduling (bias 1)

[Figure: timeline over buckets B1 through B8 marking the arrival (Arr) and end (End) of Qi, Qj, Qk, with sub-queries that share a bucket read drawn together, e.g. Qi3/Qj3, Qj7/Qk7, Qj8/Qk8]

Completion Times:
Qi – 3 sec
Qj – 7 sec
Qk – 7 sec
Avg – 5.6 sec
Tp – .33 qry/sec

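Contention based scheduling (bias 0)

[Figure: timeline over buckets B1 through B8 marking the arrival (Arr) and end (End) of Qi, Qj, Qk, with sub-queries that share a bucket read drawn together, e.g. Qi3/Qj3, Qj7/Qk7, Qj8/Qk8]

Completion Times:
Qi – 7 sec
Qj – 5 sec
Qk – 6 sec
Avg – 6 sec (5.6 with age based scheduling)
Tp – .38 qry/sec (.33 with age based scheduling)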
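
The averages and throughput figures for the three schedules can be checked with a few lines of arithmetic. The sketch assumes that completion times are measured from each query's arrival, that queries arrive 1 sec apart starting at time 0, and that throughput is the number of queries divided by the time at which the last one ends. The slides do not state these definitions, but they reproduce the reported values. The function name `summarize` is hypothetical.

```python
def summarize(response_times, inter_query_time=1.0):
    """Return (average response time, throughput) for one schedule.

    response_times: seconds from each query's arrival to its completion,
    listed in arrival order (Qi, Qj, Qk).
    """
    arrivals = [i * inter_query_time for i in range(len(response_times))]
    finish = [a + r for a, r in zip(arrivals, response_times)]
    avg = sum(response_times) / len(response_times)
    throughput = len(response_times) / max(finish)
    return avg, throughput


schedules = {
    "arrival order": [3, 8, 13],
    "age based (bias 1)": [3, 7, 7],
    "contention based (bias 0)": [7, 5, 6],
}
for name, times in schedules.items():
    avg, tp = summarize(times)
    print(f"{name}: avg {avg:.2f} sec, throughput {tp:.3f} qry/sec")
```

The age based average comes out at 5.67 sec, which the slide shows as 5.6, and the contention based throughput at 0.375 qry/sec, which the slide rounds to .38.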

Throughput Performance

Tuning the age bias

Throughput performance gap grows while response time gap is insensitive to saturation

Increasing age bias is more attractive at low saturation

Parameter tuning using trade-off curves

Discussion

Impact of caching strategies

Workload overflow
– Large intermediate join results
– Migrate pairs of workload and bucket

Beyond completion order
– Higher priority for interactive queries

Batch processing in a clustered environment

P. Agrawal, D. Kifer, and C. Olston. Scheduling Shared Scans of Large Data Files. In VLDB, 2008.

WHAT ABOUT US?

Filter and refine

Partition data into buckets

Average Response Time

Outline

Motivation
– Goals for data-driven, batch scheduling
– Target application (SkyQuery)

LifeRaft scheduler
– Filter and refine queries
– Throughput maximizing metric
– Starvation resistance
– Differences in outcomes

Workload adaptive parameter selection
