PAGE: A Partition Aware Graph Computation Engine

Yingxia Shao, Junjie Yao, Bin Cui, Lin Ma
EECS, Peking University, China


Post on 24-Feb-2016


TRANSCRIPT

Page 1: PAGE: A Partition Aware Graph Computation Engine

PAGE: A Partition Aware Graph Computation Engine

Yingxia Shao, Junjie Yao, Bin Cui, Lin Ma
EECS, Peking University, China

Page 2: PAGE: A Partition Aware Graph Computation Engine

Agenda

• Background
• Design of PAGE
• Experiment result
• Conclusion

Page 3: PAGE: A Partition Aware Graph Computation Engine

Background

• Prevalent large-scale graphs
  – Social networks
  – Web graph
  – …
• Graph computing systems
  – Pregel (Google)
  – Giraph (Apache)
  – GPS (Stanford)
  – GraphLab (CMU)
  – …

Page 4: PAGE: A Partition Aware Graph Computation Engine

Background

• Graph partitioning
  – Offline approach
    • METIS (Karypis Lab)
  – Online approach
    • Streaming partitioning
    • Linear Deterministic Greedy (LDG) algorithm (I. Stanton)
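The LDG heuristic named above can be sketched as follows. This is a minimal single-pass sketch, assuming the commonly described scoring rule: count the neighbors already placed in each partition and damp that count by how full the partition is. The function name `ldg_partition`, the `slack` parameter, and the tie-breaking order are illustrative choices and are not taken from the slides.

```python
from typing import Dict, Iterable, List, Tuple


def ldg_partition(
    stream: Iterable[Tuple[int, List[int]]],
    num_vertices: int,
    k: int,
    slack: float = 1.0,
) -> Dict[int, int]:
    """Greedily assign each streamed vertex to one of k partitions.

    stream yields (vertex, neighbor_list) pairs in arrival order.
    slack scales the per-partition capacity (a hypothetical knob).
    """
    capacity = slack * num_vertices / k
    sizes = [0] * k
    assignment: Dict[int, int] = {}

    for vertex, neighbors in stream:
        # Count neighbors that have already been placed in each partition.
        placed = [0] * k
        for n in neighbors:
            if n in assignment:
                placed[assignment[n]] += 1

        # Score = co-located neighbors, damped as the partition fills up.
        # Ties go to the emptier partition, then to the lower index.
        best = max(
            range(k),
            key=lambda p: (placed[p] * (1.0 - sizes[p] / capacity), -sizes[p], -p),
        )
        assignment[vertex] = best
        sizes[best] += 1

    return assignment
```

For example, streaming two triangles joined by a single edge (vertices 0 to 5, with the bridge between 2 and 3) into `k = 2` partitions keeps each triangle together, so only the bridge edge is cut.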


Problem: Existing graph computation systems cannot efficiently integrate high-quality graph partitioning.

Page 5: PAGE: A Partition Aware Graph Computation Engine

Inefficient partition integration

[Figure: average time (s/iteration) for each partition scheme, broken down into overall cost, sync remote comm. cost, and local comm. cost]

Higher-quality graph partitioning leads to worse overall performance.

Graph partitioning quality improves from left to right.

PageRank running on Giraph with six graph partitionings of different quality.

Page 6: PAGE: A Partition Aware Graph Computation Engine

Motivation of PAGE

We need a novel graph computation engine that efficiently integrates graph partitionings of varying quality.

[Diagram: a novel graph computation engine that takes both high-quality and low-quality graph partitions as input]

Page 7: PAGE: A Partition Aware Graph Computation Engine

Agenda

• Background
• Design of PAGE
• Experiment result
• Conclusion

Page 8: PAGE: A Partition Aware Graph Computation Engine

Message processor

[Diagram: a Message Processor made up of several Message Process Units; the units process Message Blocks, each consisting of a Header followed by messages]

Page 9: PAGE: A Partition Aware Graph Computation Engine

Inefficient partition integration

[Figure: average time (s/iteration) for each partition scheme, broken down into overall cost, sync remote comm. cost, and local comm. cost]

The local message processing cost dominates the overall cost.

Existing systems do not provide enough local message processors.

PageRank running on Giraph with six graph partitionings of different quality.

Page 10: PAGE: A Partition Aware Graph Computation Engine

Overview of PAGE

[Diagram: three PAGE workers (worker1, worker2, worker3), each with a Computation component and a Partition Aware Comm. component, on top of a distributed in-memory partitioned graph]

PAGE applies an adaptive tuning mechanism and new cooperation methods.

Page 11: PAGE: A Partition Aware Graph Computation Engine

Newly Designed PAGE Worker

[Diagram: a PAGE worker with a Partition Aware module (Monitor, DCCM), a Communication module (Sender, Receiver, and a Dual Concurrent MP made up of a Remote MP and a Local MP), and a Computation module]

Page 12: PAGE: A Partition Aware Graph Computation Engine

Dual Concurrent MP

[Diagram: the Dual Concurrent Message Processor, made up of a Remote MP and a Local MP]

• First type of concurrency
  – A remote MP and a local MP are embedded
• Second type of concurrency
  – Each message processor contains a set of message process units
• The degree of concurrency is determined automatically by the system itself.
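The two levels of concurrency described on this slide can be sketched as follows. This is a minimal sketch, assuming each message processor is modeled as a thread pool and each message process unit as one worker thread in that pool; the class name `DualConcurrentMP`, its methods, and the `handle_block` callback are hypothetical and do not come from the PAGE implementation.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, List, Sequence


class DualConcurrentMP:
    """Two message processors (local and remote), each a pool of process units."""

    def __init__(
        self,
        n_local_units: int,
        n_remote_units: int,
        handle_block: Callable[[Sequence[float]], float],
    ) -> None:
        # First type of concurrency: a local and a remote processor side by side.
        # Second type: each processor owns several units (worker threads).
        self._local = ThreadPoolExecutor(max_workers=n_local_units)
        self._remote = ThreadPoolExecutor(max_workers=n_remote_units)
        self._handle_block = handle_block
        self._pending: List[Future] = []

    def submit(self, block: Sequence[float], is_local: bool) -> None:
        """Queue one message block on the matching processor."""
        pool = self._local if is_local else self._remote
        self._pending.append(pool.submit(self._handle_block, block))

    def drain(self) -> List[float]:
        """Wait for all queued blocks; results come back in submission order."""
        results = [f.result() for f in self._pending]
        self._pending.clear()
        return results

    def shutdown(self) -> None:
        """Release the worker threads of both processors."""
        self._local.shutdown()
        self._remote.shutdown()
```

A caller would create the processor with the unit counts chosen by the system, submit blocks as they arrive (marking each as local or remote), and call `drain()` at the end of an iteration.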

Page 13: PAGE: A Partition Aware Graph Computation Engine

Dynamic Concurrency Control Model

• The DCCM determines the proper parameters, such as n_mp, n_mpl, and n_mpr.
• The DCCM is built on two heuristic rules.
  – Ability lower bound
  – Workload balance ratio
• Monitor
  – Tracks the necessary metrics

[Diagram: the Partition Aware module, containing the Monitor and the DCCM]
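The slides name the two heuristic rules but do not give their formulas, so the following is only one plausible reading, not the DCCM itself. It assumes that n_mp is the total number of message process units and that n_mpl and n_mpr are the units given to the local and remote processors. It also assumes that the "ability lower bound" means having enough units to keep up with incoming messages, and that the "workload balance ratio" means splitting units in proportion to local and remote load. The `Metrics` fields and `choose_concurrency` are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Metrics:
    """Hypothetical per-worker metrics from the monitor (messages/second)."""
    local_rate: float   # incoming rate of local messages
    remote_rate: float  # incoming rate of remote messages
    unit_rate: float    # rate a single process unit can handle (must be > 0)


def choose_concurrency(m: Metrics) -> Tuple[int, int, int]:
    """Return (n_mp, n_mpl, n_mpr) under the two assumed rules."""
    total = m.local_rate + m.remote_rate
    # Assumed "ability lower bound": enough units to keep up with arrivals,
    # with at least one unit for each of the two processors.
    n_mp = max(2, math.ceil(total / m.unit_rate))
    # Assumed "workload balance ratio": split units in proportion to load.
    share = m.local_rate / total if total > 0 else 0.5
    n_mpl = min(n_mp - 1, max(1, round(n_mp * share)))
    n_mpr = n_mp - n_mpl
    return n_mp, n_mpl, n_mpr
```

Under these assumptions, a well-partitioned graph (mostly local messages) shifts units toward the local processor, while a poorly partitioned one shifts them toward the remote processor.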

Page 14: PAGE: A Partition Aware Graph Computation Engine

Agenda

• Background
• Design of PAGE
• Experiment result
• Conclusion

Page 15: PAGE: A Partition Aware Graph Computation Engine

Environment & Datasets

• Experiment environment
  – A 24-node cluster
• Dataset: uk-2007-05-u
  – Undirected
  – Vertex #: 105,153,952
  – Edge #: 6,603,753,128
• Benchmark: PageRank

Partition qualities

Scheme    Edge cut
Random    98.52%
LDG1      82.88%
LDG2      75.69%
LDG3      66.37%
LDG4      56.34%
METIS     3.48%

Balance factor: < 1%.
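The edge cut percentages in the table can be reproduced for any partitioning with a short helper. This sketch assumes the usual definition of edge cut, the fraction of edges whose endpoints lie in different partitions; the slides do not spell the definition out, and `edge_cut_ratio` is an illustrative name.

```python
from typing import Dict, Iterable, Tuple


def edge_cut_ratio(
    edges: Iterable[Tuple[int, int]],
    assignment: Dict[int, int],
) -> float:
    """Fraction of edges whose two endpoints lie in different partitions."""
    total = 0
    cut = 0
    for u, v in edges:
        total += 1
        if assignment[u] != assignment[v]:
            cut += 1
    # An empty edge list has no cut edges by convention.
    return cut / total if total else 0.0
```

A lower ratio means more messages stay inside a partition, which is why the schemes in the table are ordered from Random (almost every edge cut) to METIS (very few edges cut).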

Page 16: PAGE: A Partition Aware Graph Computation Engine

Partition Awareness in PAGE

[Figure: two panels, PAGE and Giraph, each plotting average time (s/iteration) for each partition scheme, broken down into overall cost, sync remote comm. cost, and sync local comm. cost]

Page 17: PAGE: A Partition Aware Graph Computation Engine

Comparison with the naive solution

[Figure: average time (s/iteration) for each partition scheme, comparing Giraph, Giraph-GPSop, and PAGE]

* Giraph-GPSop is the naive solution.

Page 18: PAGE: A Partition Aware Graph Computation Engine

Contribution & Conclusion

• We identify the problem of partition-unaware inefficiency.

• We build a new partition-aware graph computation engine, PAGE.

• We design a Dynamic Concurrency Control Model, based on several heuristic rules, to better capture the characteristics of graph partitions.

• Finally, we demonstrate PAGE's robustness and efficiency on graph partitions of different quality.

Page 19: PAGE: A Partition Aware Graph Computation Engine

Thanks!


Email: [email protected]