Page 1: Carnegie Mellon University Joseph Gonzalez Joint work with Yucheng Low Aapo Kyrola Danny Bickson Carlos Guestrin Joe Hellerstein Alex Smola The Next Generation

The Next Generation of the GraphLab Abstraction

Joseph Gonzalez, Carnegie Mellon University

Joint work with Yucheng Low, Aapo Kyrola, Danny Bickson, Carlos Guestrin, Joe Hellerstein, Alex Smola, and Jay Gu

Page 2:

How will we design and implement parallel learning systems?

Page 3:

... a popular answer:

Map-Reduce / Hadoop

Build learning algorithms on top of high-level parallel abstractions

Page 4:

Map-Reduce for Data-Parallel ML

Excellent for large data-parallel tasks!

Data-Parallel (Map Reduce):
• Cross Validation
• Feature Extraction
• Computing Sufficient Statistics

Graph-Parallel:
• Belief Propagation
• Label Propagation
• Kernel Methods
• Deep Belief Networks
• Neural Networks
• Tensor Factorization
• PageRank
• Lasso

Page 5:

Example of Graph Parallelism

Page 6:

PageRank Example

Iterate:

$R[i] = \alpha + (1 - \alpha) \sum_{j \text{ links to } i} \frac{R[j]}{L[j]}$

Where:
• α is the random reset probability
• L[j] is the number of links on page j

[Figure: example graph with pages 1 to 6]
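To make the iteration concrete, here is a minimal C++ sketch of one synchronous sweep in which every page receives α plus (1 − α) times the rank shares R[j] / L[j] of the pages linking to it. The function name `pagerank_sweep` and the adjacency-list layout are assumptions made for this illustration; they do not come from the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the slides): one synchronous PageRank sweep.
// out_links[j] lists the pages that page j links to, so L[j] = out_links[j].size().
std::vector<double> pagerank_sweep(
    const std::vector<std::vector<std::size_t>>& out_links,
    const std::vector<double>& rank, double alpha) {
  assert(rank.size() == out_links.size());
  std::vector<double> next(rank.size(), alpha);  // every page starts at alpha
  for (std::size_t j = 0; j < out_links.size(); ++j) {
    if (out_links[j].empty()) continue;  // a page with no links sends nothing
    const double share = (1.0 - alpha) * rank[j] / out_links[j].size();
    for (std::size_t i : out_links[j]) next[i] += share;
  }
  return next;
}
```

Calling the function repeatedly on its own output gives the "Iterate" step; a real implementation would also need a stopping rule, which is left out here.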

Page 7:

Properties of Graph Parallel Algorithms

• Dependency Graph
• Iterative Computation
• Factored Computation

[Figure labels: My Rank, Friends Rank]

Page 8:

Map-Reduce for Data-Parallel ML

Excellent for large data-parallel tasks!

Data-Parallel (Map Reduce):
• Cross Validation
• Feature Extraction
• Computing Sufficient Statistics

Graph-Parallel (Map Reduce? Pregel (Giraph)?):
• Belief Propagation
• SVM
• Kernel Methods
• Deep Belief Networks
• Neural Networks
• Tensor Factorization
• PageRank
• Lasso

Page 9:

Pregel (Giraph)

Bulk Synchronous Parallel Model:
• Compute
• Communicate
• Barrier

Page 10:

PageRank in Giraph (Pregel)

public void compute(Iterator<DoubleWritable> msgIterator) {
    // Sum the rank contributions received from in-neighbors.
    double sum = 0;
    while (msgIterator.hasNext())
        sum += msgIterator.next().get();
    DoubleWritable vertexValue = new DoubleWritable(0.15 + 0.85 * sum);
    setVertexValue(vertexValue);
    if (getSuperstep() < getConf().getInt(MAX_STEPS, -1)) {
        // Send an equal share of the new rank along every out-edge.
        long edges = getOutEdgeMap().size();
        sendMsgToAllEdges(
            new DoubleWritable(getVertexValue().get() / edges));
    } else {
        voteToHalt();
    }
}

Page 11:

Problem: Bulk synchronous computation can be inefficient.

Page 12:

Curse of the Slow Job

[Figure: blocks of data are processed by CPU 1, CPU 2, and CPU 3 over successive iterations, with a barrier after each iteration.]

Page 13:

Curse of the Slow Job

Assuming runtime is drawn from an exponential distribution with mean 1.

[Plot: runtime multiple vs. number of jobs]
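One way to quantify the effect, under the added assumption that the jobs are independent: the expected time until the slowest of n exponential jobs with mean 1 finishes is the n-th harmonic number, H_n = 1 + 1/2 + … + 1/n. This is a standard result for the maximum of independent exponentials. The short C++ sketch below computes it; the function name is hypothetical.

```cpp
#include <cassert>

// Expected runtime of the slowest of n independent jobs, each with an
// exponentially distributed runtime of mean 1: the harmonic number H_n.
double expected_slowest_job(int n) {
  assert(n >= 0);
  double h = 0.0;
  for (int k = 1; k <= n; ++k) h += 1.0 / k;
  return h;
}
```

Under these assumptions a barrier over 12 jobs waits a little over three times the mean job runtime on average, even though each job takes one unit of time in expectation.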

Page 14:

Problem with Messaging

• Storage overhead: requires keeping old and new messages [2x overhead]
• Redundant messages: in PageRank, send a copy of your own rank to all neighbors, O(|V|) → O(|E|)
• Often requires complex protocols: when will my neighbors need information about me?
• Unable to constrain neighborhood state: how would you implement graph coloring?

[Figure: a vertex on CPU 1 sends the same message three times to neighbors on CPU 2.]

Page 15:

Converge More Slowly

[Plot: runtime in seconds vs. number of CPUs, comparing Optimized in Memory Bulk Synchronous with Asynchronous Splash BP]

Page 16:

Problem: Bulk synchronous computation can be wrong!

Page 17:

The problem with Bulk Synchronous Gibbs

Adjacent variables cannot be sampled simultaneously.

[Figure: two variables start with a strong positive correlation at t=0. Sequential execution keeps the strong positive correlation; parallel execution (t=1, t=2, t=3) produces a strong negative correlation.]

Page 18:

The Need for a New Abstraction

If not Pregel, then what?

Data-Parallel (Map Reduce):
• Cross Validation
• Feature Extraction
• Computing Sufficient Statistics

Graph-Parallel (Pregel (Giraph)):
• Belief Propagation
• SVM
• Kernel Methods
• Deep Belief Networks
• Neural Networks
• Tensor Factorization
• PageRank
• Lasso

Page 19:

What is GraphLab?

Page 20:

The GraphLab Framework

• Graph Based Data Representation
• Update Functions (User Computation)
• Scheduler
• Consistency Model

Page 21:

Data Graph

A graph with arbitrary data (C++ Objects) associated with each vertex and edge.

Graph:
• Social Network

Vertex Data:
• User profile text
• Current interests estimates

Edge Data:
• Similarity weights

Page 22:

Comparison with Pregel

Pregel: Data is associated only with vertices

GraphLab: Data is associated with both vertices and edges

Page 23:

Update Functions

An update function is a user-defined program which, when applied to a vertex, transforms the data in the scope of the vertex.

pagerank(i, scope) {
    // Get neighborhood data
    (R[i], Wij, R[j]) ← scope;

    // Update the vertex data
    $R[i] \leftarrow \alpha + (1 - \alpha) \sum_{j \in N[i]} W_{ji} \, R[j]$;

    // Reschedule neighbors if needed
    if R[i] changes then
        reschedule_neighbors_of(i);
}

Page 24:

PageRank in GraphLab2

struct pagerank : public iupdate_functor<graph, pagerank> {
    void operator()(icontext_type& context) {
        vertex_data& vdata = context.vertex_data();
        // Sum the rank shares of all in-neighbors (1.0 avoids integer division).
        double sum = 0;
        foreach (edge_type edge, context.in_edges())
            sum += 1.0 / context.num_out_edges(edge.source()) *
                   context.vertex_data(edge.source()).rank;
        double old_rank = vdata.rank;
        vdata.rank = RESET_PROB + (1 - RESET_PROB) * sum;
        // Reschedule out-neighbors only if this vertex changed enough.
        double residual = std::fabs(vdata.rank - old_rank) /
                          context.num_out_edges();
        if (residual > EPSILON)
            context.reschedule_out_neighbors(pagerank());
    }
};

Page 25:

Comparison with Pregel

Pregel: Data must be sent to adjacent vertices. The user code describes the movement of data as well as computation.

GraphLab: Data is read from adjacent vertices. User code only describes the computation.

Page 26:

The Scheduler

The scheduler determines the order in which vertices are updated.

[Figure: CPU 1 and CPU 2 pull vertices (a, b, c, …) from the scheduler queue, and updates can place vertices back on the queue.]

The process repeats until the scheduler is empty.
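The loop described here can be sketched in a few lines of C++. This is a single-threaded illustration with a FIFO order, both of which are assumptions (the slides do not fix an ordering, and the real engine runs on several CPUs). `run_scheduler` and its callback signature are hypothetical names, not part of GraphLab.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <vector>

// Minimal FIFO scheduler sketch. update(v) performs the user's update on
// vertex v and returns the vertices it wants rescheduled. The function
// returns the total number of updates executed.
std::size_t run_scheduler(
    std::size_t num_vertices, std::deque<std::size_t> queue,
    const std::function<std::vector<std::size_t>(std::size_t)>& update) {
  std::vector<bool> pending(num_vertices, false);
  for (std::size_t v : queue) {
    assert(v < num_vertices);
    pending[v] = true;
  }
  std::size_t updates = 0;
  while (!queue.empty()) {  // repeat until the scheduler is empty
    const std::size_t v = queue.front();
    queue.pop_front();
    pending[v] = false;
    ++updates;
    for (std::size_t u : update(v)) {
      assert(u < num_vertices);
      if (!pending[u]) {  // keep at most one pending entry per vertex
        pending[u] = true;
        queue.push_back(u);
      }
    }
  }
  return updates;
}
```

Because an update decides which vertices to reschedule, vertices that have converged simply stop appearing in the queue.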

Page 27:

The GraphLab Framework

• Graph Based Data Representation
• Update Functions (User Computation)
• Scheduler
• Consistency Model

Page 28:

Ensuring Race-Free Code

How much can computation overlap?

Page 29:

GraphLab Ensures Sequential Consistency

For each parallel execution, there exists a sequential execution of update functions which produces the same result.

[Figure: a parallel execution on CPU 1 and CPU 2 over time, and an equivalent sequential execution on a single CPU.]

Page 30:

Consistency Rules

Guaranteed sequential consistency for all update functions

Page 31:

Full Consistency

Page 32:

Obtaining More Parallelism

Page 33:

Edge Consistency

[Figure: CPU 1 and CPU 2 update vertices concurrently; the overlap between their scopes is labeled "Safe Read".]
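The difference between the two models can be illustrated with a small C++ check. It assumes the definitions from the original GraphLab abstraction, which these slides do not spell out. Under full consistency, two updates may overlap only if their scopes share no vertex. Under edge consistency, an update writes its own vertex and adjacent edges but only reads adjacent vertices, so two non-adjacent vertices may be updated at the same time. The names below are hypothetical and not a GraphLab API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

enum class Consistency { Full, Edge };

// adj[v] is the sorted neighbor list of vertex v in an undirected graph.
bool adjacent(const std::vector<std::vector<std::size_t>>& adj,
              std::size_t a, std::size_t b) {
  return std::binary_search(adj[a].begin(), adj[a].end(), b);
}

// May update functions on vertices a and b run at the same time?
bool can_run_concurrently(const std::vector<std::vector<std::size_t>>& adj,
                          std::size_t a, std::size_t b, Consistency model) {
  assert(a < adj.size() && b < adj.size());
  if (a == b || adjacent(adj, a, b)) return false;  // forbidden in both models
  if (model == Consistency::Edge) return true;      // shared neighbors are only read
  // Full consistency: the two scopes must not share any vertex.
  for (std::size_t n : adj[a])
    if (adjacent(adj, n, b)) return false;
  return true;
}
```

On a path 0-1-2-3, vertices 0 and 2 may run together under edge consistency but not under full consistency, because both scopes contain vertex 1.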

Page 34:

In Summary …

GraphLab is pretty neat!

Page 35:

Pregel vs. GraphLab

Multicore PageRank (25M Vertices, 355M Edges)

[Plot: L1 error (log scale) vs. runtime (s), GraphLab and Pregel]

[Plot: L1 error (log scale) vs. updates, GraphLab and Pregel]

Pregel [Simulated]:
• Synchronous schedule
• No skipping [unfair updates comparison]
• No combiner [unfair runtime comparison]

Page 36:

Update Count Distribution

[Plot: number of vertices vs. number of updates]

Most vertices need to be updated infrequently

Page 37:

Bayesian Tensor Factorization

Gibbs Sampling

Dynamic Block Gibbs Sampling

Matrix Factorization

Lasso

SVM

Belief Propagation

PageRank

CoEM

K-Means

SVD

LDA

…Many others…

Page 38:

Startups using GraphLab

Companies experimenting with GraphLab

Academic projects exploring GraphLab

1600++ unique downloads tracked (possibly many more from direct repository checkouts)

Page 39:

Why do we need a NEW GraphLab?

Page 40:

Natural Graphs

Page 41:

Natural Graphs → Power Law

The top 1% of vertices are adjacent to 53% of the edges!

[Plot: Yahoo! Web Graph, “Power Law”]

Page 42:

Problem: High Degree Vertices

High degree vertices limit parallelism:
• Touch a large amount of state
• Require heavy locking
• Processed sequentially

Page 43:

High Degree Vertices are Common

[Figure: Netflix graph of Users and Movies; labels: “Social” People, Popular Movies.]

[Figure: graph over Docs and Words (θ, Z, w) with Hyper Parameters; labels: Common Words, Freq., Obama.]

Page 44:

Proposed Four Solutions

• Decomposable Update Functors: Expose greater parallelism by further factoring update functions
• Commutative-Associative Update Functors: Transition from stateless to stateful update functions
• Abelian Group Caching (concurrent revisions): Allows for controllable races through diff operations
• Stochastic Scopes: Reduce degree through sampling

Page 46:

PageRank in GraphLab

struct pagerank : public iupdate_functor<graph, pagerank> {
    void operator()(icontext_type& context) {
        vertex_data& vdata = context.vertex_data();
        // Parallel “Sum” Gather
        double sum = 0;
        foreach (edge_type edge, context.in_edges())
            sum += 1.0 / context.num_out_edges(edge.source()) *
                   context.vertex_data(edge.source()).rank;
        // Atomic Single Vertex Apply
        double old_rank = vdata.rank;
        vdata.rank = RESET_PROB + (1 - RESET_PROB) * sum;
        double residual = std::fabs(vdata.rank - old_rank) /
                          context.num_out_edges();
        // Parallel Scatter [Reschedule]
        if (residual > EPSILON)
            context.reschedule_out_neighbors(pagerank());
    }
};

Page 47:

Decomposable Update Functors

Decompose update functions into 3 phases:

• Gather (user defined): Gather( ) → Δ, evaluated over the scope of Y and combined with a parallel sum, Δ1 + Δ2 + … → Δ
• Apply (user defined): Apply(Y, Δ) → Y. Apply the accumulated value to the center vertex.
• Scatter (user defined): Scatter( ). Update adjacent edges and vertices.

Locks are acquired only for a region within a scope → Relaxed Consistency
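The three phases can be sketched as one self-contained C++ function for PageRank. This is an illustration only: the `Graph` struct, the function name, and the residual test (simplified here to the absolute change in rank) are assumptions, and the loop runs sequentially where a real engine could run gather and scatter in parallel.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Graph {
  std::vector<std::vector<std::size_t>> in;   // in[v]: sources of edges into v
  std::vector<std::vector<std::size_t>> out;  // out[v]: targets of edges out of v
  std::vector<double> rank;
};

// One gather-apply-scatter update of vertex v. Returns the out-neighbors
// that the scatter phase would reschedule.
std::vector<std::size_t> gas_update(Graph& g, std::size_t v,
                                    double reset_prob, double epsilon) {
  assert(v < g.rank.size());
  // Gather: a commutative, associative sum over the in-edges.
  double accum = 0.0;
  for (std::size_t src : g.in[v]) accum += g.rank[src] / g.out[src].size();
  // Apply: the only phase that writes the center vertex.
  const double old_rank = g.rank[v];
  g.rank[v] = reset_prob + (1.0 - reset_prob) * accum;
  // Scatter: reschedule out-neighbors when the change is large enough.
  const double residual = std::fabs(g.rank[v] - old_rank);
  if (residual > epsilon) return g.out[v];
  return {};
}
```

Because gather only reads neighbors and apply only writes the center vertex, each phase needs a lock on a smaller region than the whole scope, which is the relaxed consistency the slide refers to.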

Page 48:

Factorized PageRank

struct pagerank : public iupdate_functor<graph, pagerank> {
    double accum = 0, residual = 0;
    // Gather: accumulate the rank share of one in-neighbor.
    void gather(icontext_type& context, const edge_type& edge) {
        accum += 1.0 / context.num_out_edges(edge.source()) *
                 context.vertex_data(edge.source()).rank;
    }
    // Merge partial sums computed in parallel.
    void merge(const pagerank& other) { accum += other.accum; }
    // Apply: write the new rank to the center vertex.
    void apply(icontext_type& context) {
        vertex_data& vdata = context.vertex_data();
        double old_value = vdata.rank;
        vdata.rank = RESET_PROB + (1 - RESET_PROB) * accum;
        residual = fabs(vdata.rank - old_value) / context.num_out_edges();
    }
    // Scatter: reschedule an out-neighbor if the change was large.
    void scatter(icontext_type& context, const edge_type& edge) {
        if (residual > EPSILON)
            context.schedule(edge.target(), pagerank());
    }
};

Page 49:

Decomposable Execution Model

Split computation across machines:

[Figure: the update on vertex Y is split into partial computations F1 and F2 and composed as (F1 ∘ F2)(Y).]

Page 50:

Weaker Consistency

Neighboring vertices may be updated simultaneously:

[Figure: vertices A, B, and C with overlapping Gather and Apply phases.]

Page 51:

Other Decomposable Algorithms

Loopy Belief Propagation
• Gather: Accumulates product (log sum) of in messages
• Apply: Updates central belief
• Scatter: Computes out messages and schedules adjacent vertices

Alternating Least Squares (ALS)

[Figure: Netflix Users × Movies matrix approximated by the product of User Factors (W) and Movie Factors (X); graph with user factors w1, w2, movie factors x1, x2, x3, and ratings y1 to y4.]

Page 52:

Convergent Gibbs Sampling

Cannot be done:

[Figure: vertices A, B, and C with overlapping Gather phases, marked Unsafe.]

Page 53:

Decomposable Functors

Fits many algorithms:
• Loopy Belief Propagation, Label Propagation, PageRank…

Addresses the earlier concerns:
• Large State → Distributed Gather and Scatter
• Heavy Locking → Fine Grained Locking
• Sequential → Parallel Gather and Scatter

Problem: Does not exploit asynchrony at the vertex level.

Page 54:

Need for Vertex Level Asynchrony

Exploit commutative associative “sum”

[Figure: vertex Y gathers a sum (+ + + + +) over all of its neighbors.]

Costly gather for a single change!

Page 58:

Need for Vertex Level Asynchrony

Exploit commutative associative “sum”

[Figure: vertex Y adds a Δ to its old (cached) sum instead of gathering again; neighbors send Δ values.]
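The idea of adding a Δ to an old (cached) sum, rather than gathering over every neighbor again, can be shown with a small C++ sketch. `CachedSum` is a hypothetical helper written for this illustration and is not part of GraphLab.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Caches sum(values) so that changing one entry costs O(1) instead of a
// full gather over all entries.
struct CachedSum {
  std::vector<double> values;
  double sum = 0.0;

  explicit CachedSum(std::vector<double> v) : values(std::move(v)) {
    for (double x : values) sum += x;  // one full gather up front
  }

  // Change entry i by adding only the difference to the cached sum.
  void set(std::size_t i, double new_value) {
    assert(i < values.size());
    sum += new_value - values[i];  // old (cached) sum + delta
    values[i] = new_value;
  }
};
```

This only works because the sum is commutative and associative and has an inverse (subtraction), which is why the talk singles out such operations.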

Page 59:

Commutative-Associative Update

struct pagerank : public iupdate_functor<graph, pagerank> {
    double delta;
    pagerank(double d) : delta(d) { }
    // Compose two pending updates by adding their deltas.
    void operator+=(const pagerank& other) { delta += other.delta; }
    void operator()(icontext_type& context) {
        vertex_data& vdata = context.vertex_data();
        vdata.rank += delta;
        if (std::fabs(delta) > EPSILON) {
            // Split the damped delta evenly across this vertex's out-edges.
            double out_delta = delta * (1 - RESET_PROB) /
                               context.num_out_edges();
            context.schedule_out_neighbors(pagerank(out_delta));
        }
    }
};
// Initial Rank: R[i] = 0;
// Initial Schedule: pagerank(RESET_PROB);

Page 60:

Scheduling Composes Updates

Calling reschedule neighbors forces update function composition:

reschedule_out_neighbors(pagerank(3))

• Neighbor with Pending: pagerank(7) receives pagerank(3) → Pending: pagerank(10)
• Neighbor with nothing pending receives pagerank(3) → Pending: pagerank(3)
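A minimal C++ sketch of this composition rule follows. It assumes a table holding at most one pending update per vertex; `DeltaUpdate` and `PendingTable` are hypothetical stand-ins for the update functor and the scheduler, not GraphLab types.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// A delta update reduced to its payload; += composes two updates.
struct DeltaUpdate {
  double delta;
  DeltaUpdate& operator+=(const DeltaUpdate& other) {
    delta += other.delta;
    return *this;
  }
};

struct PendingTable {
  std::map<std::size_t, DeltaUpdate> pending;

  // Scheduling onto a vertex that already has a pending update merges them.
  void schedule(std::size_t vertex, const DeltaUpdate& update) {
    auto it = pending.find(vertex);
    if (it == pending.end()) pending.emplace(vertex, update);
    else it->second += update;
  }

  // Remove and return the pending update of a vertex that has one.
  DeltaUpdate take(std::size_t vertex) {
    auto it = pending.find(vertex);
    assert(it != pending.end());
    DeltaUpdate update = it->second;
    pending.erase(it);
    return update;
  }
};
```

Scheduling a delta of 3 onto a vertex with a pending delta of 7 leaves a single pending update with delta 10, matching the example on the slide.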

Page 61:

Experimental Comparison

Page 62:

Comparison of Abstractions: Multicore PageRank (25M Vertices, 355M Edges)

[Plot: L1 error (log scale) vs. runtime (s) for GraphLab1, Factorized, and Delta]

Page 63:

Comparison of Abstractions: Distributed PageRank (25M Vertices, 355M Edges)

[Plot: runtime (s) vs. # machines (8 CPUs per machine) for GL 1 (Chromatic) and GL 2 Delta (Asynchronous)]

[Plot: total communication (GB) vs. # machines (8 CPUs per machine) for GL 1 (Chromatic) and GL 2 Delta (Asynchronous)]

Page 64:

PageRank on the Web circa 2000

Invented Comparison:

[Bar chart: runtime in seconds for GraphLab2 (512), Pegasus (800), Priter (200), and Dryad (960)]

Page 65:

Ongoing work

• Extending all of GraphLab2 to the distributed setting
  • Implemented push based engines (chromatic)
  • Need to build GraphLab2 distributed locking engine
• Improving storage efficiency of the distributed data-graph
• Porting large set of Danny’s applications

Page 66:

Questions

http://graphlab.org