Transcript

Flexible Cluster Computing: Dryad andDryadLINQ

Yuan Yu, Michael Isard, Dennis Fetterly, Mihai Budiu,Ulfar Erlingsson, Pradeep Kumar Gunda, Jon Currey,

Andrew Birrell

presented by Michael George

Distributed Storage Systems SeminarApril 2, 2009

Computing on Clusters

How to crunch lots of data?I Explicit distribution

I Write it by handI Hard! failure, resource allocation, scheduling, . . .

I Implicit distributionI MapReduce, DryadLINQI Easy! As long as your computation is expressible. . .

I Virtualized distributionI DryadI In between. Programmer specifies data flow, system handles

details

Dryad Overview — Jobs

Dryad Job is a Directed Acyclic Graph

I Vertices are subcomputations

I Edges are data channels

I Graph is virtual — may be more orfewer vertices than cluster nodes.

in.txt

distribute

sort

uniq

sort

uniq

sort

uniq

sort

uniq

count

out.txt

Dryad Overview — Execution

Centralized Job Manager distributes virtual graph to actual cluster

Files, TCP, FIFO

jobgraph data plane

control plane

NS PD PDPD

V V V

Job manager cluster

Writing Vertex Programs

I A Vertex Program is a class that extends the base VP classI Base class provides typed I/O channelsI Specialized abstract subclasses available

I Map, Reduce, Distribute, . . .

I Special support for legacy executablesI grep, perl, legacyApp, . . .

I Asynchronous I/O API available for vertices that require itI Runtime distinguishes asynch vertices, executes them

efficiently on thread pool

Composing Vertex Programs

Vertex programs are joined into graphs

I edges are local files by default

I can also be TCP pipes or in-memory FIFOs

Predefined operators for common composition patterns:

I Clone (G^n) ^3 =

I Merge (G1||G2) || =

I Pointwise composition (G1>=G2) >= =

I Bipartite composition (G1>>G2) >> =

Composing Vertex Programs

Vertex programs are joined into graphs

I edges are local files by default

I can also be TCP pipes or in-memory FIFOs

Predefined operators for common composition patterns:

I Clone (G^n) ^3 =

I Merge (G1||G2) || =

I Pointwise composition (G1>=G2) >= =

I Bipartite composition (G1>>G2) >> =

Composing Vertex Programs

Vertex programs are joined into graphs

I edges are local files by default

I can also be TCP pipes or in-memory FIFOs

Predefined operators for common composition patterns:

I Clone (G^n) ^3 =

I Merge (G1||G2) || =

I Pointwise composition (G1>=G2) >= =

I Bipartite composition (G1>>G2) >> =

Composing Vertex Programs

Vertex programs are joined into graphs

I edges are local files by default

I can also be TCP pipes or in-memory FIFOs

Predefined operators for common composition patterns:

I Clone (G^n) ^3 =

I Merge (G1||G2) || =

I Pointwise composition (G1>=G2) >= =

I Bipartite composition (G1>>G2) >> =

Composing Vertex Programs

Vertex programs are joined into graphs

I edges are local files by default

I can also be TCP pipes or in-memory FIFOs

Predefined operators for common composition patterns:

I Clone (G^n) ^3 =

I Merge (G1||G2) || =

I Pointwise composition (G1>=G2) >= =

I Bipartite composition (G1>>G2) >> =

Example Job

M S

M S

U M S

X D M S Y

N U

n n 4n 4n n n H out

U U

X D M S Y

N M S

M S

M S((((U >= X) || (N >= X)) >= D)^n >> (((M >= S)^4 >> Y) || U >= Y)^n) >> H >= out

Running a Job

Vertices are instantiated on nodes

I May be multiple execution records due to failureI Node placement handled by Job Manager

I Applications can specify locality “hints” or “requirements”I Edge requirements may force vertices to co-locate

Job manager is notified of node transitions

I May rerun failed vertices

I Can rewrite the graph

I May run duplicate process to route around slow nodes

Callback Example: Dynamic Optimization

Client may not know size or distribution of data at load time

I Can dynamically rewrite the graph

A A A A A A

B B

C C C C

k

MapReduce in Dryad

So Far. . .

I Dryad provides a mid-level execution platform

I Good performance possible

I FlexibilityI But

I Data parallelization done by handI Optimization done by handI Unfamiliar programming style

DryadLINQ

LINQ (Language INtegrated Query)

I Standard .NET extenstion

I Embeds SQL-like operators into programming languages

I Developers can mix declarative, functional, and imperativestatements

DryadLINQ

I Compiles LINQ statements to run on Dryad

I Provides simple, flexible, efficient access to cluster computing

LINQ Example

v a r a d j u s t e d S c o r e T r i p l e s =from d i n s c o r e T r i p l e sj o i n r i n s t a t i c R a n k on d . docID e q u a l s r . keys e l e c t new Q u e r y S c o r e D e c I D T r i p l e ( d , r ) ;

v a r r a n k e d Q u e r i e s =from s i n a d j u s t e d S c o r e T r i p l e sgroup s by s . q u e r y i n t o gs e l e c t TakeTopQueryResults ( g ) ;

v a r a d j u s t e d S c o r e T r i p l e s =s c o r e T r i p l e s . j o i n ( s t a t i c R a n k ,

d => d . docID , r => r . key ,( d , r ) => new Q u e r y S c o r e D o c I D T r i p l e ( d , r ) ) ;

v a r g r o u p e d Q u e r i e s =a d j u s t e d S c o r e T r i p l e s . groupBy ( s => s . q u e r y ) ;

v a r r a n k e d Q u e r i e s =g r o u p e d Q u e r i e s . s e l e c t (

g => TakeTopQueryResults ( g ) ) ;

LINQ Example

v a r a d j u s t e d S c o r e T r i p l e s =from d i n s c o r e T r i p l e sj o i n r i n s t a t i c R a n k on d . docID e q u a l s r . keys e l e c t new Q u e r y S c o r e D e c I D T r i p l e ( d , r ) ;

v a r r a n k e d Q u e r i e s =from s i n a d j u s t e d S c o r e T r i p l e sgroup s by s . q u e r y i n t o gs e l e c t TakeTopQueryResults ( g ) ;

v a r a d j u s t e d S c o r e T r i p l e s =s c o r e T r i p l e s . j o i n ( s t a t i c R a n k ,

d => d . docID , r => r . key ,( d , r ) => new Q u e r y S c o r e D o c I D T r i p l e ( d , r ) ) ;

v a r g r o u p e d Q u e r i e s =a d j u s t e d S c o r e T r i p l e s . groupBy ( s => s . q u e r y ) ;

v a r r a n k e d Q u e r i e s =g r o u p e d Q u e r i e s . s e l e c t (

g => TakeTopQueryResults ( g ) ) ;

DryadLINQ Constructs

Types:IEnumerable<T>

IQueryable<T>

DryadTable<T>

NTFS GFS SQL Table

Data Partitioning Operators:

I HashPartition<T,K>

I RangePartition<T,K>

Escape Hatches:

I Apply (f)

I Fork (f)

Execution Overview

Client machine

CompileOutput

DryadTable

ToDryadTable foreach

DryadLINQ

.NET

OutputTables

Inputtables

Execplan

DryadExecution

Data center

Results

.NETObjects

LINQExpr

JM

Invoke

Vertexcode

Execution Details

1. LINQ expression is compiled to an Execution Plan GraphI EPG is a skeleton of a job

2. EPG is optimized using term rewritingI Pipelining addedI Redundancy redundancy removedI Aggregation made eagerI TCP/FIFO annotations added where possible

3. Code GeneratedI Partially evaluated LINQ subexpressionsI Serialization code

4. Dynamically, EPG is executed by DryadLINQ job managerI vertices replicated to match dataI dynamic optimizations automatedI vertices use local LINQ execution engines (e.g. PLINQ)

Execution Details

1. LINQ expression is compiled to an Execution Plan GraphI EPG is a skeleton of a job

2. EPG is optimized using term rewritingI Pipelining addedI Redundancy redundancy removedI Aggregation made eagerI TCP/FIFO annotations added where possible

3. Code GeneratedI Partially evaluated LINQ subexpressionsI Serialization code

4. Dynamically, EPG is executed by DryadLINQ job managerI vertices replicated to match dataI dynamic optimizations automatedI vertices use local LINQ execution engines (e.g. PLINQ)

Execution Details

1. LINQ expression is compiled to an Execution Plan GraphI EPG is a skeleton of a job

2. EPG is optimized using term rewritingI Pipelining addedI Redundancy redundancy removedI Aggregation made eagerI TCP/FIFO annotations added where possible

3. Code GeneratedI Partially evaluated LINQ subexpressionsI Serialization code

4. Dynamically, EPG is executed by DryadLINQ job managerI vertices replicated to match dataI dynamic optimizations automatedI vertices use local LINQ execution engines (e.g. PLINQ)

Execution Details

1. LINQ expression is compiled to an Execution Plan GraphI EPG is a skeleton of a job

2. EPG is optimized using term rewritingI Pipelining addedI Redundancy redundancy removedI Aggregation made eagerI TCP/FIFO annotations added where possible

3. Code GeneratedI Partially evaluated LINQ subexpressionsI Serialization code

4. Dynamically, EPG is executed by DryadLINQ job managerI vertices replicated to match dataI dynamic optimizations automatedI vertices use local LINQ execution engines (e.g. PLINQ)

MapReduce on DryadLINQp u b l i c s t a t i c MapReduce ( source , mapper , k e yS e l e c t o r , r e du c e r ) {

va r mapped = sou r c e . s e l ec tMany ( mapper ) ;va r g roups = mapped . groupBy ( k e y S e l e c t o r ) ;r e t u r n groups . s e l ec tMany ( r e du c e r ) ;

}

SM

R

G

SM

S

G

R

D

MS

G

R

(1) (2) (3)

X

X

SM

S

G

R

D

MS

G

R

X

SM

S

G

R

D

MS

G

R

X

SM

S

G

R

D

SM

S

G

R

D

MS

G

R

X

SM

S

G

R

D

MS

G

R

X

SM

S

G

R

D

MS

G

R

MS

G

R

map

sort

groupby

reduce

distribute

mergesort

groupby

reduce

mergesort

groupby

reduce

consumer

map

part

ial a

ggre

gatio

nre

duce

Scalability Evaluation — TeraSort

50

100

150

200

250

300

350

0

0Gb

50

208Gb

100

416Gb

150

624Gb

200

832Gb

240

1000Gb

Machines:

Data:

Exe

cuti

onT

ime

319s

Overhead Evaluation — SkyServer

5

10

15

20

25

0 5 10 15 20 25 30 35 40

Machines

Sp

eed

up

Dryad Two-pass

SQL Server

Dryad In-memory

DryadLINQ

Conclusions and Discussion

Dryad meets its goals:

I efficient

I flexible

I programmable

DryadLINQ builds on Dryad:

I almost as efficient

I concise

I familiar


Top Related