x-stream: edge-centric graph processing using streaming ... · presented by: amir h. payberah...
Post on 25-Jul-2020
6 Views
Preview:
TRANSCRIPT
X-Stream: Edge-centric Graph Processing usingStreaming Partitions
Amitabha Roy, Ivo Mihailovic, Willy Zwaenepoel (EPFL - SOSP’13)
Presented by: Amir H. Payberahamir@sics.se
Feb. 7, 2014
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 1 / 32
Graphs
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 2 / 32
Big Graphs
I Large graphs are a subset of the big data problem.
I Billions of vertices and edges, hundreds of gigabytes.
I Normally tackled on large clusters.• Pregel, Giraph, GraphLab, PowerGraph ...• Complexity, power consumption ...
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 3 / 32
Question
Could we compute Big Graphs on a single machine?
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 4 / 32
Challenges
I Disk-based processing• Problem: graph traversal = random access• Random access is inefficient for storage
Eiko Y., and Roy A., “Scale-up Graph Processing: A Storage-centric View”, 2013.
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 5 / 32
Challenges
I Disk-based processing• Problem: graph traversal = random access• Random access is inefficient for storage
Eiko Y., and Roy A., “Scale-up Graph Processing: A Storage-centric View”, 2013.
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 5 / 32
Proposed Solution
Solution
X-Stream makes graph accesses sequential.
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 6 / 32
X-Stream Contribution
I Edge-centric scatter-gather model
I Streaming partitions
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 7 / 32
Edge-CentricScatter-Gather Model
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 8 / 32
Scatter-Gather Programming Model
I State stored in vertices.
I Vertex operations:• Scatter updates along outgoing edges• Gather updates from incoming edges
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 9 / 32
Vertex-Centric Scatter
I Iterates over vertices
for each vertex v
if v has update
for each edge e from v
scatter update along e
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 10 / 32
Vertex-Centric Scatter-Gather (1/5)
for each vertex v
if v has update
for each edge e from v
scatter update along e
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 11 / 32
Vertex-Centric Scatter-Gather (2/5)
for each vertex v
if v has update
for each edge e from v
scatter update along e
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 12 / 32
Vertex-Centric Scatter-Gather (3/5)
for each vertex v
if v has update
for each edge e from v
scatter update along e
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 13 / 32
Vertex-Centric Scatter-Gather (4/5)
for each vertex v
if v has update
for each edge e from v
scatter update along e
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 14 / 32
Vertex-Centric Scatter-Gather (5/5)
for each vertex v
if v has update
for each edge e from v
scatter update along e
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 15 / 32
Vertex-Centric vs. Edge-Centric Access
Vertex-centric Edge-centric
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 16 / 32
Edge-Centric Scatter
I Iterates over edges
for each edge e
if e.src has update
scatter update along e
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 17 / 32
Edge-Centric Scatter-Gather (1/5)
for each edge e
if e.src has update
scatter update along e
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 18 / 32
Edge-Centric Scatter-Gather (2/5)
for each edge e
if e.src has update
scatter update along e
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 19 / 32
Edge-Centric Scatter-Gather (3/5)
for each edge e
if e.src has update
scatter update along e
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 20 / 32
Edge-Centric Scatter-Gather (4/5)
for each edge e
if e.src has update
scatter update along e
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 21 / 32
Edge-Centric Scatter-Gather (5/5)
for each edge e
if e.src has update
scatter update along e
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 22 / 32
Vertex-Centric vs. Edge-Centric Tradeoff
I Vertex-centric scatter-gather: EdgeDataRandomAccessBandwidth
I Edge-centric scatter-gather: Scatters×EdgeDataSequentialAccessBandwidth
I Sequential Access Bandwidth � Random Access Bandwidth.
I Few scatter gather iterations for real world graphs.
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 23 / 32
Streaming Partitions
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 24 / 32
Problem
I Problem: still have random access to vertex set.
Solution
Partition the graph into streaming partitions.
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 25 / 32
Problem
I Problem: still have random access to vertex set.
Solution
Partition the graph into streaming partitions.
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 25 / 32
Partitioning the Graph (1/2)
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 26 / 32
Partitioning the Graph (2/2)
Random access for free.
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 27 / 32
Streaming Partition
I A subset of the vertices that fits in RAM.
I All edges whose source vertex is in that subset.
I No requirement on quality of the partition, e.g., sorting edges.
I Consists of three sets: vertex set, edge list, and update list.
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 28 / 32
Streaming Partition Scatter-Gather
I The scatter phase iterates over all streaming partitions, rather thanover all edges.
I The gather phase iterates over all streaming partitions, rather thanover all updates.
I The vertex sets and edge lists remain fixed during the entire com-putation.
I The update list of a partition varies over time.
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 29 / 32
X-Stream Edge-Centric Scatter-Gather
// Scatter phase
for each streaming_partition p {
read in vertex set of p
for each edge e in edge list of p
edge_scatter(e): append update to Uout
}
// Shuffle phase
for each update u in Uout {
let p = partition containing target of u
append u to Uin(p)
}
destroy Uout
// Gatter phase
for each streaming_partition p {
read in vertex set of p
for each update u in Uin(p)
edge_gather(u): apply update u to u.destination
destroy Uin(p)
}
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 30 / 32
Summary
I X-Stream
I Scatter-Gather model
I Edge-centric: sequential access to the graph edges
I Streaming partition: vertex set, edge list, and update list
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 31 / 32
Questions?
Acknowledgement
Some slides were derived from the slides of Amitabha Roy (EPFL)
Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 32 / 32
top related