Opportunities and Challenges in Scaling Up Graph Analytics

UK System Research Challenges Workshop, March 22, 2018

Priyank Faldu and Boris Grot

The University of Edinburgh

* This work is partially supported by Oracle Labs



Why Graphs?

Graphs are a natural way to represent pair-wise relationships among objects in the real world

• Each object is a vertex

• A relationship between a pair of objects is represented with an edge

Example: Facebook Social Network

[Figure: vertices "Boris Grot", "Priyank Faldu", and "UoE" connected by edges labeled "friends" and "follows"]
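As a minimal sketch of the vertex-and-edge model above, the following Python builds an adjacency list for the three entities in the example. The edge list and the `build_adjacency` helper are illustrative assumptions: the slide shows the labels "friends" and "follows" but the transcript does not say which pairs each edge connects.

```python
# Hypothetical edge list; the pairing of "friends" and "follows" edges
# is an assumption, not something the slide transcript states.
EDGES = [
    ("Boris Grot", "Priyank Faldu", "friends"),
    ("Boris Grot", "UoE", "follows"),
    ("Priyank Faldu", "UoE", "follows"),
]

def build_adjacency(edges):
    """Map each vertex to a list of (neighbor, relationship) pairs."""
    adjacency = {}
    for src, dst, relation in edges:
        adjacency.setdefault(src, []).append((dst, relation))
        # Make sure the destination exists even if it has no outgoing edges.
        adjacency.setdefault(dst, [])
        # Treat "friends" as symmetric; "follows" stays directed.
        if relation == "friends":
            adjacency[dst].append((src, relation))
    return adjacency

ADJACENCY = build_adjacency(EDGES)
for vertex, neighbors in ADJACENCY.items():
    print(vertex, "->", neighbors)
```

Each object becomes a dictionary key (a vertex) and each relationship becomes an entry in a neighbor list (an edge).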


Graph Analytics

Extracting meaningful information out of complex many-to-many relationships among entities in a graph

Applications

• Label Propagation

[Figure: web graph of sites with labels: wordpress.com "blog", blogger.com "blog", bcc.com "news", cnn.com "news", nytimes.com "news", asda.com "groceries", tesco.com "groceries", ebay.com "shopping", amazon.com "shopping"; x.com and y.com are unlabeled ("?")]

• Centrality Analysis: most influential people and information in social media

• Community Analysis: identify customers with similar interests

• Connectivity Analysis: find weaknesses in a network

• Path Analysis: route optimization for distribution and supply chain
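To make the label propagation application concrete, here is a small sketch in which each unlabeled site takes the most common label among its labeled neighbors. The link structure in `NEIGHBORS` is hypothetical (the slide shows the sites but the transcript does not preserve their edges), and the majority-vote rule is one simple variant, not necessarily the one the authors had in mind.

```python
from collections import Counter

def propagate_labels(neighbors, labels, max_rounds=10):
    """Give each unlabeled vertex the most common label among its labeled neighbors."""
    labels = dict(labels)  # do not mutate the caller's mapping
    for _ in range(max_rounds):
        updates = {}
        for vertex, adjacent in neighbors.items():
            if labels.get(vertex) is not None:
                continue  # seed labels stay fixed
            votes = Counter(labels[n] for n in adjacent if labels.get(n) is not None)
            if votes:
                updates[vertex] = votes.most_common(1)[0][0]
        if not updates:
            break  # nothing changed, so the labeling has converged
        labels.update(updates)
    return labels

# Hypothetical links for the two unlabeled sites.
NEIGHBORS = {
    "x.com": ["cnn.com", "nytimes.com", "ebay.com"],
    "y.com": ["tesco.com", "asda.com"],
}
SEED_LABELS = {
    "cnn.com": "news", "nytimes.com": "news", "ebay.com": "shopping",
    "tesco.com": "groceries", "asda.com": "groceries",
    "x.com": None, "y.com": None,
}
RESULT = propagate_labels(NEIGHBORS, SEED_LABELS)
print(RESULT["x.com"], RESULT["y.com"])
```

With these assumed links, x.com has two "news" neighbors and one "shopping" neighbor, and y.com has only "groceries" neighbors.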


Graphs are huge and growing …

[Figure. Image Credit: https://techcrunch.com/2017/06/27/facebook-2-billion-users/]

• Billions of vertices

• Trillions of edges

• Memory footprint in terabytes

Graphs don’t fit in the main memory of a single server


To Scale-Out or To Scale-Up?


Scale-Out Graph Analytics

Graph is partitioned across a number of nodes

• Graph is stored in the combined memory of all nodes; this enables in-memory processing

Example systems: GraphX [OSDI’14], Pregel [SIGMOD’10], PGX.D [SC’15], PowerGraph [OSDI’12], GraphLab [VLDB’12], PEGASUS [ICDM’09], GPS [SSDBM’13], GRAM [SoCC’15], Gemini [OSDI’16]

Is scale-out graph analytics a good idea?


Scale-Out Graph Analytics Not Desirable

In-memory scale-out processing with 10s-100s of nodes can be outperformed by disk-based scale-up graph processing

[GraphChi OSDI’12]

Reason for inefficiency of scale-out graph analytics:

High inter-node communication & synchronization overhead

• “Small World” property of graphs makes them difficult to partition
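The communication overhead can be illustrated by counting how many edges cross node boundaries. The sketch below assumes a simple modulo (hash-style) assignment of vertices to nodes and a synthetic graph with uniformly random endpoints; both are assumptions for illustration, not the partitioning scheme or data of any system named in the talk.

```python
import random

def cut_fraction(edges, num_partitions):
    """Fraction of edges whose endpoints land on different nodes under modulo partitioning."""
    crossing = sum(
        1 for src, dst in edges
        if src % num_partitions != dst % num_partitions
    )
    return crossing / len(edges)

# Synthetic graph with uniformly random endpoints (illustrative assumption).
rng = random.Random(0)
NUM_VERTICES = 1000
EDGES = [
    (rng.randrange(NUM_VERTICES), rng.randrange(NUM_VERTICES))
    for _ in range(10_000)
]

for parts in (2, 4, 16):
    print(parts, "partitions:", round(cut_fraction(EDGES, parts), 3))
```

When endpoints are unrelated to the partitioning, roughly (k-1)/k of the edges cross partitions for k nodes, so every such edge implies inter-node traffic. Smarter partitioners reduce this, but the talk's point is that small-world graphs leave little room to do so.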


So, why not scale-up graph analytics?


Need large per-node memory capacity

DRAM: a limiting factor in scaling up

• Poor technology scaling

• High cost for large capacity

Alternative technology for main memory?

• Emerging solution: Storage Class Memory (SCM)

• Terabytes of capacity at affordable price

[Figure (includes an "SSD" label). Image Credit: https://marketrealist.com/2016/03/microns-3d-xpoint-launch-stands-now]


SCM: Enabler for scale-up graph analytics


SCM: No Free Lunch

2x-4x higher access latency than DRAM

4x-8x lower bandwidth than DRAM

Multiple orders of magnitude lower write endurance than DRAM


Implications for scale-up graph analytics


CPI Stack of DRAM-based Scale-Up Graph Analytics

Cycles spent in DRAM vs. elsewhere

[Figure: stacked bars showing the share of cycles (0% to 100%) spent in DRAM vs. elsewhere for CC, SSSP, BFS, and PR]

DRAM stalls account for 40-65% of total time

SCM will exacerbate the problem
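A back-of-the-envelope estimate shows why. The sketch below assumes that memory stall time scales linearly with access latency and that all other cycles are unchanged; this is a deliberate simplification (it ignores bandwidth limits, caching effects, and memory-level parallelism), and `estimated_slowdown` is a hypothetical helper, not a model from the talk.

```python
def estimated_slowdown(dram_stall_fraction, latency_factor):
    """Execution-time ratio when memory stall time grows by latency_factor.

    Assumes stall cycles scale linearly with access latency and all other
    cycles stay the same (a simplification, not a measured result).
    """
    if not 0.0 <= dram_stall_fraction <= 1.0:
        raise ValueError("dram_stall_fraction must be in [0, 1]")
    return (1.0 - dram_stall_fraction) + dram_stall_fraction * latency_factor

# Stall fractions (40-65%) and latency factors (2x-4x) quoted in the slides.
for fraction in (0.40, 0.65):
    for factor in (2, 4):
        ratio = estimated_slowdown(fraction, factor)
        print(f"stall={fraction:.0%}, latency x{factor}: {ratio:.2f}x")
```

Under these assumptions the estimated slowdown ranges from 1.4x (40% stalls, 2x latency) to 2.95x (65% stalls, 4x latency).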


Mitigating Higher Latency of SCM

Data prefetching for latency hiding

Challenges:

Graphs exhibit random access patterns

Simple hardware prefetchers are inadequate for graphs

Potential Solutions:

Hardware

Graph-specific hardware prefetcher [Ainsworth et al. ICS’16]

Software

Employ a software prefetcher in the framework

• CPU is often idle waiting on memory, so there are plenty of idle cycles to burn on extra instructions
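Python cannot issue hardware prefetch instructions, so the following sketch only models the scheduling decision a software prefetcher would make: how far ahead to look, and which neighbor to request while the current one is processed. The latency and per-iteration figures are made-up illustrative numbers, and both helper functions are hypothetical. In a native framework the "target" would typically be passed to a compiler prefetch intrinsic.

```python
import math

def prefetch_distance(memory_latency_ns, work_per_iteration_ns):
    """Iterations to look ahead so a prefetch completes before its data is used."""
    if memory_latency_ns <= 0 or work_per_iteration_ns <= 0:
        raise ValueError("latency and per-iteration work must be positive")
    return math.ceil(memory_latency_ns / work_per_iteration_ns)

def prefetch_schedule(neighbor_ids, distance):
    """Pair each neighbor being processed with the neighbor to prefetch alongside it."""
    schedule = []
    for i, current in enumerate(neighbor_ids):
        ahead = i + distance
        # Near the end of the list there is nothing left to prefetch.
        target = neighbor_ids[ahead] if ahead < len(neighbor_ids) else None
        schedule.append((current, target))
    return schedule

# Illustrative numbers only: 100 ns vs. 300 ns access latency, 25 ns of work per edge.
print(prefetch_distance(100, 25), prefetch_distance(300, 25))
print(prefetch_schedule([7, 3, 9, 1], 2))
```

The takeaway is that a higher memory latency calls for a proportionally longer look-ahead, which the framework can compute because it knows the neighbor list in advance.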


Mitigating Lower Bandwidth of SCM

Data transfer between DRAM & CPU happens at cache-line granularity

Observation: Graphs exhibit poor spatial locality

[Figure: % of evicted cachelines in LLC (y-axis, 0% to 100%) by bytes accessed per cacheline (x-axis, 4 to 64 in steps of 4)]

~85% of data transferred from off-chip memory remains unused


Potential Solutions:

Software

Exploit graph topology to reorder vertices for improved spatial locality

Hardware

Revisit sectored caches to fetch only the required words into on-chip caches
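As one possible instance of topology-aware reordering, the sketch below relabels vertices in descending order of degree, so that the most frequently referenced vertices receive adjacent IDs and their per-vertex data can share cache lines. Degree sorting is an assumption chosen for illustration; the slides do not name a specific reordering scheme.

```python
from collections import Counter

def degree_order_mapping(edges):
    """Return {old_id: new_id}, giving the smallest new IDs to the highest-degree vertices."""
    degree = Counter()
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    # Sort by descending degree; break ties by old ID to keep the result deterministic.
    ranked = sorted(degree, key=lambda v: (-degree[v], v))
    return {old: new for new, old in enumerate(ranked)}

def relabel(edges, mapping):
    """Rewrite an edge list using the new vertex IDs."""
    return [(mapping[src], mapping[dst]) for src, dst in edges]

EDGES = [(0, 9), (5, 9), (7, 9), (5, 7), (2, 5)]
MAPPING = degree_order_mapping(EDGES)
print(MAPPING)
print(relabel(EDGES, MAPPING))
```

In this toy graph the two degree-3 vertices (5 and 9) become IDs 0 and 1, so their properties would sit next to each other in a property array.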


Mitigating Lower Write Endurance of SCM

New design goal: Reduce off-chip write traffic

Potential Solutions:

Software:

Approaches to improve write locality in on-chip caches

• E.g., pull-based vs push-based approach

Hardware:

Aggressively retain cachelines that accumulate writes

• Even at the expense of short temporal reuse of some cachelines
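The pull-based vs. push-based distinction can be sketched by counting writes to the result array in one iteration of a simple "sum the values of in-neighbors" computation. The toy graph and function names are hypothetical, and the counters track program-level writes only; whether a write actually reaches off-chip memory depends on cache behavior, which this sketch does not model.

```python
def push_step(out_neighbors, values):
    """Each vertex adds its value to every out-neighbor: one scattered write per edge."""
    result = {v: 0.0 for v in values}  # initialization writes are not counted
    writes = 0
    for src, targets in out_neighbors.items():
        for dst in targets:
            result[dst] += values[src]
            writes += 1
    return result, writes

def pull_step(in_neighbors, values):
    """Each vertex sums its in-neighbors' values and writes its own entry once."""
    result = {}
    writes = 0
    for dst in values:
        result[dst] = sum(values[src] for src in in_neighbors.get(dst, []))
        writes += 1
    return result, writes

def transpose(out_neighbors):
    """Build the in-neighbor lists from the out-neighbor lists."""
    in_neighbors = {}
    for src, targets in out_neighbors.items():
        for dst in targets:
            in_neighbors.setdefault(dst, []).append(src)
    return in_neighbors

OUT = {0: [1, 2, 3], 1: [2, 3], 2: [3], 3: []}
VALUES = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}
PUSHED, PUSH_WRITES = push_step(OUT, VALUES)
PULLED, PULL_WRITES = pull_step(transpose(OUT), VALUES)
print(PUSH_WRITES, PULL_WRITES)
```

Both variants compute the same result, but push issues one write per edge to scattered destinations while pull issues one write per vertex, trading writes for extra reads.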


Conclusion

The introduction of SCM will provide large-capacity main memory at an affordable price in commodity servers

SCM will enable in-memory scale-up graph analytics even for extremely large graphs

• Open research questions remain on how to address the weaknesses of SCM to achieve high performance for scale-up graph analytics


Thank You


Priyank Faldu, PhD Student

The University of Edinburgh

www.faldupriyank.com

Boris Grot, Lecturer

The University of Edinburgh

http://homepages.inf.ed.ac.uk/bgrot