x-stream: edge-centric graph processing using streaming ... · presented by: amir h. payberah...

Post on 25-Jul-2020

6 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

X-Stream: Edge-centric Graph Processing usingStreaming Partitions

Amitabha Roy, Ivo Mihailovic, Willy Zwaenepoel (EPFL - SOSP’13)

Presented by: Amir H. Payberahamir@sics.se

Feb. 7, 2014

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 1 / 32

Graphs

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 2 / 32

Big Graphs

I Large graphs are a subset of the big data problem.

I Billions of vertices and edges, hundreds of gigabytes.

I Normally tackled on large clusters.• Pregel, Giraph, GraphLab, PowerGraph ...• Complexity, power consumption ...

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 3 / 32

Question

Could we compute Big Graphs on a single machine?

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 4 / 32

Challenges

I Disk-based processing• Problem: graph traversal = random access• Random access is inefficient for storage

Eiko Y., and Roy A., “Scale-up Graph Processing: A Storage-centric View”, 2013.

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 5 / 32

Challenges

I Disk-based processing• Problem: graph traversal = random access• Random access is inefficient for storage

Eiko Y., and Roy A., “Scale-up Graph Processing: A Storage-centric View”, 2013.

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 5 / 32

Proposed Solution

Solution

X-Stream makes graph accesses sequential.

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 6 / 32

X-Stream Contribution

I Edge-centric scatter-gather model

I Streaming partitions

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 7 / 32

Edge-CentricScatter-Gather Model

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 8 / 32

Scatter-Gather Programming Model

I State stored in vertices.

I Vertex operations:• Scatter updates along outgoing edges• Gather updates from incoming edges

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 9 / 32

Vertex-Centric Scatter

I Iterates over vertices

for each vertex v

if v has update

for each edge e from v

scatter update along e

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 10 / 32

Vertex-Centric Scatter-Gather (1/5)

for each vertex v

if v has update

for each edge e from v

scatter update along e

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 11 / 32

Vertex-Centric Scatter-Gather (2/5)

for each vertex v

if v has update

for each edge e from v

scatter update along e

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 12 / 32

Vertex-Centric Scatter-Gather (3/5)

for each vertex v

if v has update

for each edge e from v

scatter update along e

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 13 / 32

Vertex-Centric Scatter-Gather (4/5)

for each vertex v

if v has update

for each edge e from v

scatter update along e

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 14 / 32

Vertex-Centric Scatter-Gather (5/5)

for each vertex v

if v has update

for each edge e from v

scatter update along e

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 15 / 32

Vertex-Centric vs. Edge-Centric Access

Vertex-centric Edge-centric

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 16 / 32

Edge-Centric Scatter

I Iterates over edges

for each edge e

if e.src has update

scatter update along e

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 17 / 32

Edge-Centric Scatter-Gather (1/5)

for each edge e

if e.src has update

scatter update along e

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 18 / 32

Edge-Centric Scatter-Gather (2/5)

for each edge e

if e.src has update

scatter update along e

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 19 / 32

Edge-Centric Scatter-Gather (3/5)

for each edge e

if e.src has update

scatter update along e

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 20 / 32

Edge-Centric Scatter-Gather (4/5)

for each edge e

if e.src has update

scatter update along e

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 21 / 32

Edge-Centric Scatter-Gather (5/5)

for each edge e

if e.src has update

scatter update along e

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 22 / 32

Vertex-Centric vs. Edge-Centric Tradeoff

I Vertex-centric scatter-gather: EdgeDataRandomAccessBandwidth

I Edge-centric scatter-gather: Scatters×EdgeDataSequentialAccessBandwidth

I Sequential Access Bandwidth � Random Access Bandwidth.

I Few scatter gather iterations for real world graphs.

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 23 / 32

Streaming Partitions

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 24 / 32

Problem

I Problem: still have random access to vertex set.

Solution

Partition the graph into streaming partitions.

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 25 / 32

Problem

I Problem: still have random access to vertex set.

Solution

Partition the graph into streaming partitions.

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 25 / 32

Partitioning the Graph (1/2)

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 26 / 32

Partitioning the Graph (2/2)

Random access for free.

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 27 / 32

Streaming Partition

I A subset of the vertices that fits in RAM.

I All edges whose source vertex is in that subset.

I No requirement on quality of the partition, e.g., sorting edges.

I Consists of three sets: vertex set, edge list, and update list.

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 28 / 32

Streaming Partition Scatter-Gather

I The scatter phase iterates over all streaming partitions, rather thanover all edges.

I The gather phase iterates over all streaming partitions, rather thanover all updates.

I The vertex sets and edge lists remain fixed during the entire com-putation.

I The update list of a partition varies over time.

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 29 / 32

X-Stream Edge-Centric Scatter-Gather

// Scatter phase

for each streaming_partition p {

read in vertex set of p

for each edge e in edge list of p

edge_scatter(e): append update to Uout

}

// Shuffle phase

for each update u in Uout {

let p = partition containing target of u

append u to Uin(p)

}

destroy Uout

// Gatter phase

for each streaming_partition p {

read in vertex set of p

for each update u in Uin(p)

edge_gather(u): apply update u to u.destination

destroy Uin(p)

}

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 30 / 32

Summary

I X-Stream

I Scatter-Gather model

I Edge-centric: sequential access to the graph edges

I Streaming partition: vertex set, edge list, and update list

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 31 / 32

Questions?

Acknowledgement

Some slides were derived from the slides of Amitabha Roy (EPFL)

Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 32 / 32

top related