
PREDIcT: Towards Predicting the Runtime of Iterative Analytics

Adrian Popescu¹, Andrey Balmin², Vuk Ercegovac³, Anastasia Ailamaki¹

Predicting Runtime of Iterative Analytics

Requirements:
• # of iterations
• per-iteration resources (key features), i.e., for Bulk Synchronous Parallel (BSP): computation, messaging, synch
• cost model

Challenges:
• dependence on prior iterations
• variable resource requirements

[Figure: workers processing partitioned input]

PREDIcT at a Glance

• Prediction methodology for iterative analytics on graphs:
  • Transformations:
    • Input dataset: sampling
    • Parameters: transform function
  • Proportionality for resources, similarity for # of iterations

• Cost model for BSP execution model

[Figure: iterations of the sample run and iterations of the actual run]

Supported Analytics

Global convergence metric: e.g., an average, a ratio, a fixed point

Ranking (e.g., PageRank)

Graph processing (e.g., neighborhood estimation)

Graph clustering (e.g., semi-clustering)

Example: PageRank

• PageRank of a page: given by the rank of its inbound pages

• Rank computation: iterative

• Convergence: RankChange < τG

1. graph structure: connectivity, degree ratio, diameter ⇒ Sampling technique

2. parameters: N, τG ⇒ Transform function
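The convergence rule on this slide can be made concrete with a small sketch. It assumes RankChange is the average absolute rank change per vertex, that ranks are normalized to sum to 1, and that the damping factor is 0.85; none of these are stated on the slide, and `pagerank_iterations` is a hypothetical helper name.

```python
def pagerank_iterations(out_edges, tau, damping=0.85, max_iters=1000):
    """Run PageRank until the average per-vertex rank change drops below tau.

    out_edges maps every vertex to the list of vertices it links to
    (every link target must also appear as a key).
    Returns (ranks, number_of_iterations).
    """
    vertices = list(out_edges)
    n = len(vertices)
    ranks = {v: 1.0 / n for v in vertices}
    for iteration in range(1, max_iters + 1):
        # Rank held by vertices without out-links is spread uniformly (assumption).
        dangling = sum(ranks[v] for v in vertices if not out_edges[v])
        incoming = {v: 0.0 for v in vertices}
        for v in vertices:
            for target in out_edges[v]:
                incoming[target] += ranks[v] / len(out_edges[v])
        new_ranks = {
            v: (1.0 - damping) / n + damping * (incoming[v] + dangling / n)
            for v in vertices
        }
        # Global convergence metric: average absolute rank change.
        rank_change = sum(abs(new_ranks[v] - ranks[v]) for v in vertices) / n
        ranks = new_ranks
        if rank_change < tau:
            return ranks, iteration
    return ranks, max_iters
```

The returned iteration count depends on both the graph structure and the threshold, which is why a sample run needs a structure-preserving sample and an adjusted threshold.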

Sampling: Biased Random Jump

• Variation of Random Jump (RJ) / random walk

Sampling scale-free graphs: e.g., web graphs

• Seed vertices: k high out degree nodes (hubs)

[Figure: RJ yields a disconnected sample; BRJ yields a connected sample]

BRJ: Improving connectivity at the same sampling ratio
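A minimal sketch of the sampling idea above, under stated assumptions: the walk follows out-edges, and with probability `jump_prob` (or at a dead end) it jumps to one of the k highest out-degree seed vertices instead of to a uniformly random vertex. The jump probability of 0.15, the step cap, and the name `biased_random_jump_sample` are illustrative choices, not taken from the slides.

```python
import random


def biased_random_jump_sample(out_edges, sampling_ratio, k=3, jump_prob=0.15, seed=0):
    """Sample a subgraph by a random walk whose jumps are biased toward hubs.

    out_edges maps every vertex to the list of vertices it links to.
    Returns the induced subgraph over the sampled vertices.
    """
    rng = random.Random(seed)
    vertices = list(out_edges)
    target_size = max(1, round(sampling_ratio * len(vertices)))
    # Seed vertices: the k vertices with the highest out-degree (hubs).
    hubs = sorted(vertices, key=lambda v: len(out_edges[v]), reverse=True)[:k]
    current = rng.choice(hubs)
    sampled = {current}
    steps, max_steps = 0, 100 * len(vertices)  # cap so the walk always ends
    while len(sampled) < target_size and steps < max_steps:
        steps += 1
        neighbors = out_edges[current]
        # Jump back to a hub with probability jump_prob, or when stuck.
        if not neighbors or rng.random() < jump_prob:
            current = rng.choice(hubs)
        else:
            current = rng.choice(neighbors)
        sampled.add(current)
    # Keep only edges whose endpoints were both sampled.
    return {v: [t for t in out_edges[v] if t in sampled] for v in sampled}
```

Restarting from hubs keeps the visited vertices close to well-connected regions, which is one way to obtain a connected sample at the same sampling ratio.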

Transformations: Preserving Iterations

Sample S of graph G, Sampling Ratio (SR) = 50%

S maintains: connectivity, in/out degree ratio, effective diameter

Average rank change: RankChange(S) proportional to RankChange(G)

Convergence: RankChange(G) < τG

Transform function T: τS = τG / SR

Sample and transform function preserve iterations
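The transform function is simple enough to write down directly. One way to read it, as an assumption: if ranks are normalized to sum to 1, the average rank (and therefore the average rank change) on the sample is 1/SR times larger than on the full graph, so the threshold has to be scaled by the same factor. The function name below is hypothetical.

```python
def transform_threshold(tau_g, sampling_ratio):
    """Scale the convergence threshold of the full graph G for the sample S.

    Implements tau_S = tau_G / SR from the slide.
    """
    if not 0.0 < sampling_ratio <= 1.0:
        raise ValueError("sampling_ratio must be in (0, 1]")
    return tau_g / sampling_ratio
```

For example, with τG = 0.001 and SR = 50%, the sample run uses τS = 0.002.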

Prediction

[Diagram: sample run → profiled features → Extrapolator → scaled features → Cost Model F(X1, …, Xk) → runtime of the estimated actual run]

Two extrapolation factors:
• on edges
• on vertices

Customized cost model for the Bulk Synchronous Parallel execution model: i.e., Giraph BSP
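The extrapolation step can be sketched as follows. The two factors come from the slide; which feature is scaled by which factor is an assumption made here for illustration (for instance, message counts by the edge factor and active vertices by the vertex factor), as are the feature names and the function name.

```python
def extrapolate_features(profiled, sample_size, full_size, edge_scaled):
    """Scale per-iteration features profiled on the sample run to the full graph.

    profiled: list with one {feature_name: value} dict per iteration.
    sample_size / full_size: dicts with "vertices" and "edges" counts.
    edge_scaled: names of features scaled by the edge factor; all other
    features are scaled by the vertex factor (illustrative assumption).
    """
    vertex_factor = full_size["vertices"] / sample_size["vertices"]
    edge_factor = full_size["edges"] / sample_size["edges"]
    scaled = []
    for features in profiled:
        scaled.append({
            name: value * (edge_factor if name in edge_scaled else vertex_factor)
            for name, value in features.items()
        })
    return scaled
```

The number of iterations is not scaled: it is carried over from the sample run unchanged, and only the per-iteration features are multiplied.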

Workers

Partitioned Input

Cost Model: Translating Features into Time

Features per phase:
• computation: active vertices, message counts
• messaging: message counts / sizes, locality of messages
• synch

• Each phase but synch: multivariate linear regression

• Synchronization: identifying critical path
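A minimal sketch of how such a cost model could turn features into time. It assumes each phase model is a (coefficients, intercept) pair from an already fitted linear regression, and it reads "identifying critical path" as "each iteration takes as long as its slowest worker"; both the reading and all names below are assumptions.

```python
def phase_time(features, coefficients, intercept=0.0):
    """Linear model for one phase: intercept + sum of coefficient * feature."""
    return intercept + sum(coefficients[name] * features[name] for name in coefficients)


def iteration_time(worker_features, compute_model, messaging_model):
    """Time of one iteration given one feature dict per worker.

    Each model is a (coefficients, intercept) pair.
    """
    per_worker = []
    for features in worker_features:
        compute = phase_time(features, *compute_model)
        messaging = phase_time(features, *messaging_model)
        per_worker.append(compute + messaging)
    # Synchronization barrier: the iteration ends when the slowest worker is done.
    return max(per_worker)


def predicted_runtime(iterations, compute_model, messaging_model):
    """Sum the critical-path time over all iterations."""
    return sum(iteration_time(w, compute_model, messaging_model) for w in iterations)
```

Taking the maximum per iteration, rather than the average, is what lets skew across workers show up in the prediction.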

Bulk Synchronous Parallel Model

Experimental Evaluation

• Setup: 10 machines, 6-core Intel X5660 CPUs, 48 GB RAM, 1 Gbps

• Datasets: real graph datasets: Wikipedia (Wiki), Twitter (TW), UK-2002 (UK), LiveJournal (LJ), with sizes in [1, 25] GB

• Representative algorithms: PageRank (PR), top-k ranking, and semi-clustering (SC)

• Default transformations: BRJ and Tr = (IDConf, τS = τG / SR)

• Metrics: signed relative error: RE = (Predicted - Actual) * 100% / Actual (i.e., “+” = over-prediction, “-” = under-prediction)
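The metric above, written out as a small helper (the function name is illustrative):

```python
def signed_relative_error(predicted, actual):
    """Signed relative error in percent.

    Positive values mean over-prediction, negative values under-prediction.
    """
    return (predicted - actual) * 100.0 / actual
```

For instance, predicting 110 iterations when the actual run takes 100 gives +10%, and predicting 90 gives -10%.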

Predicting Features (Iterations)

Giraph BSP, 10 machines, real datasets in [1, 25] GB

[Figure: PageRank iterations on LJ, Wiki, UK-2002, and Twitter: Actual vs. UpBound vs. PREDIcT; sampling ratio = 0.1]

PREDIcT reduces relative error from [104, 168]% to [0, 11]%

Predicting Features (Iterations)

[Figure: Predicting iterations for semi-clustering: τ = 0.01 (left), and τ = 0.001 (right).]

[Figure: Predicting key features for top-k ranking: predicting iterations (left), and predicting remote message bytes (right).]

Predicting Time

[Figure: relative error vs. sampling ratio for LJ, Wiki, and UK-2002; panels: semi-clustering and neighborhood estimation]

[10, 30]% relative error for 15% sample

Algorithms with variable work/iteration

• Cumulated impact of: # of iterations and per-iteration resources

Impact

• PREDIcT: experimental methodology for estimating key features and runtime for iterative analytics on graphs

• Enables key feature prediction (pluggable transformations) and runtime prediction (cost model)

• Accurate empirical solution:
  • Iterations: [0, 11]% relative error (as opposed to [104, 168]%)
  • Time: [10, 30]% relative error

http://dias.epfl.ch/predict

Thank you!

Backup slides

Cost Model: Model Fitting

[Diagram: a pool of BSP features, the sample run, and historical runs feed model fitting (multivariate regression), which produces the cost model F(X1, …, Xk)]

• Training data: sample run + historical runs (if such runs exist)

• Customizable cost model (per input algorithm)
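Model fitting of this kind can be sketched with ordinary least squares. The version below is a self-contained illustration that solves the normal equations in plain Python; it is not taken from the slides, the function names are hypothetical, and a real setup would more likely rely on a numerical library. Each training row holds the feature values X1, …, Xk of one observed iteration, whether it comes from the sample run or from a historical run.

```python
def fit_linear_model(rows, targets):
    """Least-squares fit of target ~ b0 + b1*x1 + ... + bk*xk.

    rows: one sequence of feature values per observation.
    targets: observed phase times, one per row.
    Returns [b0, b1, ..., bk].
    """
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    # Augmented normal-equation system (X^T X | X^T y).
    system = [
        [sum(r[i] * r[j] for r in design) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(design, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(system[r][col]))
        if abs(system[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; cannot fit")
        system[col], system[pivot] = system[pivot], system[col]
        for row in range(k):
            if row != col:
                factor = system[row][col] / system[col][col]
                system[row] = [a - factor * b for a, b in zip(system[row], system[col])]
    return [system[i][k] / system[i][i] for i in range(k)]


def predict(coefficients, row):
    """Apply a fitted model to one row of feature values."""
    return coefficients[0] + sum(c * x for c, x in zip(coefficients[1:], row))
```

Fitting one such model per phase and per algorithm matches the "customizable cost model" bullet: the pool of candidate features stays the same, while the coefficients change.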

Phases and their features (workers W1, W2, W3):
• compute: active vertices, message counts
• message: message counts, message sizes, locality of messages
• sync: partitioning scheme / skew

• Bulk Synchronous Parallel execution model

• Specialized for network intensive algorithms

• Each phase but sync: multivariate regression

• Synchronization modeled implicitly

Customized Cost Model for Bulk Synchronous Parallel Execution Model

Feasibility Analysis

[Figure: actual run vs. sample run for PR (UK), PR (TW), SC (UK), CC (UK), CC (TW)]

Feasible for algorithms dominated by iteration time

Context: BSP Processing Model

Giraph BSP

• Vertex centric model: each vertex performs local processing, then messaging

• Algorithms in BSP are inherently iterative

[Figure: Bulk Synchronous Parallel (BSP): workers W1 to W4, each iteration with compute, message, and sync phases]

