graphchi: large-scale graph computation on just a pc · – new connections made, some...

52
GraphChi: Large-Scale Graph Computation on Just a PC Aapo Kyrölä (CMU) Guy Blelloch (CMU) Carlos Guestrin (UW) OSDI’12 In coopera+on with the GraphLab team.

Upload: others

Post on 05-Aug-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

GraphChi: Large-Scale Graph Computation on Just a PC

Aapo Kyrölä (CMU) Guy Blelloch (CMU)

Carlos Guestrin (UW)

OSDI’12  

In  co-­‐opera+on  with  the  GraphLab  team.  

Page 2: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

BigData with Structure: BigGraph

social  graph   social  graph   follow-­‐graph   consumer-­‐products  graph  

user-­‐movie  ra;ngs  graph  

DNA  interac;on  

graph  

WWW  link  graph  

etc.  

Page 3: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Big Graphs != Big Data

3  GraphChi  –  Aapo  Kyrola  

Data  size:   140  billion  connec;ons    

≈  1  TB  

Not  a  problem!  

Computa;on:  

Hard  to  scale  

TwiQer  network  visualiza;on,  by  Akshay  Java,  2009  

Page 4: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Could we compute Big Graphs on a single machine?

Disk-based Graph Computation

Can’t    we  just  use  the  Cloud?  

Page 5: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Wri;ng  distributed  applica;ons  remains  cumbersome.  

5  GraphChi  –  Aapo  Kyrola  

Cluster  crash   Crash  in  your  IDE  

Distributed State is Hard to Program

Page 6: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Efficient Scaling

•  Businesses need to compute hundreds of distinct tasks on the same graph –  Example: personalized recommendations.

Parallelize  each  task   Parallelize  across  tasks  

Task  

Task  

Task  

Task  

Task  

Task  

Task  

Task  

Task    

Task    

Task    

Task  Task  

Complex   Simple  

Expensive  to  scale  

2x  machines  =  2x  throughput  

Page 7: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Other Benefits

•  Costs – Easier management, simpler hardware.

•  Energy Consumption – Full utilization of a single computer.

•  Embedded systems, mobile devices – A basic flash-drive can fit a huge graph.

Page 8: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Research Goal

Compute  on  graphs  with  billions  of  edges,  in  a  reasonable  *me,  on  a  single  PC.  

– Reasonable  =  close  to  numbers  previously  reported  for  distributed  systems  in  the  literature.  

Experiment  PC:  Mac  Mini  (2012)  

Page 9: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Computational Model

GraphChi  –  Aapo  Kyrola   9  

Page 10: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Computational Model

•  Graph G = (V, E) –  directed edges: e = (source,

destination) –  each edge and vertex associated

with a value (user-defined type) –  vertex and edge values can be

modified •  (structure modification also

supported) Data  

Data   Data  

Data  

Data  Data  Data  

Data  Data  

Data  

10  GraphChi  –  Aapo  Kyrola  

A   B  e  

Terms:    e  is  an  out-­‐edge  of  A,  and  in-­‐edge  of  B.  

Page 11: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Data  

Data   Data  

Data  

Data  Data  Data  

Data  

Data  

Data  

Vertex-centric Programming

•  “Think like a vertex” •  Popularized by the Pregel and GraphLab projects

–  Historically, systolic computation and the Connection Machine

MyFunc(vertex)    {  //  modify  neighborhood  }  Data  

Data   Data  

Data  

Data  

Page 12: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

The Main Challenge of Disk-based Graph Computation:

Random Access

Page 13: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Random Access Problem

vertex   in-­‐neighbors   out-­‐neighbors  

5   3:2.3,  19:  1.3,  49:  0.65,...   781:  2.3,  881:  4.2..  

....  

19   3:  1.4,  9:  12.1,  ...   5:  1.3,  28:  2.2,  ...  

•  ... or with file index pointers vertex   in-­‐neighbor-­‐ptr   out-­‐neighbors  

5   3:  881,  19:  10092,  49:  20763,...   781:  2.3,  881:  4.2..  

....  

19   3:  882,  9:  2872,  ...   5:  1.3,  28:  2.2,  ...  

Random  write  

Random  read  read  

synchronize  

•  Symmetrized adjacency file with values,

5 19

For  sufficient  performance,  millions  of  random  accesses  /  second  would  be  needed.  Even  for  SSD,  this  is  too  much.  

Page 14: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Possible Solutions

1.  Use  SSD  as  a  memory-­‐extension?      

[SSDAlloc,  NSDI’11]  

2.  Compress  the  graph  structure  to  fit  into  RAM?  [  WebGraph  framework]  

3.  Cluster  the  graph  and  handle  each  cluster  separately  in  RAM?  

4.  Caching  of  hot  nodes?  

Too  many  small  objects,  need  millions  /  sec.  

Associated  values  do  not  compress  well,  and  are  mutated.  

Expensive;  The  number  of  inter-­‐cluster  edges  is  big.  

Unpredictable  performance.    

Page 15: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Our Solution

Parallel Sliding Windows (PSW)

Page 16: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Parallel Sliding Windows: Phases

•  PSW processes the graph one sub-graph a time:

•  In one iteration, the whole graph is processed.

–  And typically, next iteration is started.

1. Load

2. Compute

3. Write

Page 17: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

•  Vertices are numbered from 1 to n –  P intervals, each associated with a shard on disk. –  sub-graph = interval of vertices

PSW: Shards and Intervals

shard(1)  

interval(1)   interval(2)   interval(P)  

shard(2)   shard(P)  

1   n  v1   v2  

17  GraphChi  –  Aapo  Kyrola  

1. Load

2. Compute

3. Write

Page 18: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

PSW: Layout

Shard  1  

Shards  small  enough  to  fit  in  memory;  balance  size  of  shards  

Shard:  in-­‐edges  for  interval  of  ver;ces;  sorted  by  source-­‐id  

in-­‐edges  fo

r  ver;ces  1..100  

sorted

 by  source_id  

Ver;ces  1..100  

Ver;ces  101..700  

Ver;ces  701..1000  

Ver;ces  1001..10000  

Shard  2   Shard  3   Shard  4  Shard  1  

1. Load

2. Compute

3. Write

Page 19: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Ver;ces  1..100  

Ver;ces  101..700  

Ver;ces  701..1000  

Ver;ces  1001..10000  

Load  all  in-­‐edges  in  memory    

Load  subgraph  for  verJces  1..100  

What  about  out-­‐edges?                Arranged  in  sequence  in  other  shards  

Shard  2   Shard  3   Shard  4  

PSW: Loading Sub-graph

Shard  1  

in-­‐edges  fo

r  ver;ces  1..100  

sorted

 by  source_id  

1. Load

2. Compute

3. Write

Page 20: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Shard  1  

Load all in-edges in memory

Load  subgraph  for  verJces  101..700  

Shard  2   Shard  3   Shard  4  

PSW: Loading Sub-graph

Ver;ces  1..100  

Ver;ces  101..700  

Ver;ces  701..1000  

Ver;ces  1001..10000  

Out-edge blocks in memory

in-­‐edges  fo

r  ver;ces  1..100  

sorted

 by  source_id  

1. Load

2. Compute

3. Write

Page 21: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

PSW Load-Phase

Only P large reads for each interval.

P2 reads on one full pass.

21  GraphChi  –  Aapo  Kyrola  

1. Load

2. Compute

3. Write

Page 22: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

PSW: Execute updates

•  Update-function is executed on interval’s vertices •  Edges have pointers to the loaded data blocks

–  Changes take effect immediately asynchronous.

&Data  

&Data   &Data  

&Data  

&Data  &Data  &Data  

&Data  &Data  

&Data  

Block  X  

Block  Y  

Determinis;c  scheduling  prevents  races  between  neighboring  ver;ces.  

22  GraphChi  –  Aapo  Kyrola  

1. Load

2. Compute

3. Write

Page 23: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

PSW: Commit to Disk

•  In write phase, the blocks are written back to disk –  Next load-phase sees the preceding writes

asynchronous.

23  GraphChi  –  Aapo  Kyrola  

1. Load

2. Compute

3. Write

&Data  

&Data   &Data  

&Data  

&Data  &Data  &Data  

&Data  &Data  

&Data  

Block  X  

Block  Y  

In  total:    P2  reads  and  writes  /  full  pass  on  the  graph.    Performs  well  on  both  SSD  and  hard  drive.  

Page 24: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

GraphChi: Implementation

Evaluation & Experiments

Page 25: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

GraphChi

•  C++ implementation: 8,000 lines of code –  Java-implementation also available (~ 2-3x slower),

with a Scala API.

•  Several optimizations to PSW (see paper).

Source  code  and  examples:  hQp://graphchi.org  

Page 26: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

EVALUATION: APPLICABILITY

Page 27: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Evaluation: Is PSW expressive enough?

Graph Mining –  Connected components –  Approx. shortest paths –  Triangle counting –  Community Detection

SpMV –  PageRank –  Generic

Recommendations –  Random walks

Collaborative Filtering (by Danny Bickson)

–  ALS –  SGD –  Sparse-ALS –  SVD, SVD++ –  Item-CF

Probabilistic Graphical Models –  Belief Propagation

Algorithms  implemented  for  GraphChi  (Oct  2012)  

Page 28: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

IS GRAPHCHI FAST ENOUGH?

Comparisons to existing systems

Page 29: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Experiment Setting

•  Mac Mini (Apple Inc.) –  8 GB RAM –  256 GB SSD, 1TB hard drive –  Intel Core i5, 2.5 GHz

•  Experiment graphs:

Graph   VerJces   Edges   P  (shards)   Preprocessing  

live-­‐journal   4.8M   69M   3   0.5  min  

nevlix   0.5M   99M   20   1  min  

twiQer-­‐2010   42M   1.5B   20   2  min  

uk-­‐2007-­‐05   106M   3.7B   40   31  min  

uk-­‐union   133M   5.4B   50   33  min  

yahoo-­‐web   1.4B   6.6B   50   37  min  

Page 30: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Comparison to Existing Systems

Notes:  comparison  results  do  not  include  +me  to  transfer  the  data  to  cluster,  preprocessing,  or  the  +me  to  load  the  graph  from  disk.  GraphChi  computes  asynchronously,  while  all  but  GraphLab  synchronously.    

PageRank  

See  the  paper  for  more  comparisons.  

WebGraph  Belief  PropagaJon  (U  Kang  et  al.)  

Matrix  FactorizaJon  (Alt.  Least  Sqr.)   Triangle  CounJng  

GraphLab  v1  (8  cores)  

GraphChi  (Mac  Mini)  

0   2   4   6   8   10   12  Minutes  

NeDlix  (99B  edges)  

Spark  (50  machines)  

GraphChi  (Mac  Mini)  

0   2   4   6   8   10   12   14  Minutes  

TwiKer-­‐2010  (1.5B  edges)  

Pegasus  /  Hadoop  (100  

machines)  

GraphChi  (Mac  Mini)  

0   5   10   15   20   25   30  Minutes  

Yahoo-­‐web  (6.7B  edges)  

Hadoop  (1636  

machines)  

GraphChi  (Mac  Mini)  

0   50   100   150   200   250   300   350   400   450  Minutes  

twiKer-­‐2010  (1.5B  edges)  

On  a  Mac  Mini:   GraphChi  can  solve  as  big  problems  as  exis;ng  large-­‐scale  systems.  

 Comparable  performance.    

Page 31: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

PowerGraph Comparison •  PowerGraph / GraphLab 2

outperforms previous systems by a wide margin on natural graphs.

•  With 64 more machines, 512 more CPUs: – Pagerank: 40x faster than

GraphChi – Triangle counting: 30x faster

than GraphChi.

OSDI’12  

GraphChi  has  state-­‐of-­‐the-­‐art  performance  /  CPU.  

vs.  

GraphChi  

Page 32: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

SYSTEM EVALUATION Sneak peek

Consult  the  paper  for  a  comprehensive  evalua+on:  •  HD  vs.  SSD  •  Striping  data  across  mul+ple  

hard  drives  •  Comparison  to  an  in-­‐memory  

version  •  BoKlenecks  analysis  •  Effect  of  the  number  of  shards  •  Block  size  and  performance.  

Page 33: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Scalability / Input Size [SSD]

•  Throughput: number of edges processed / second.

Conclusion:  the  throughput  remains  roughly  constant  when  graph  size  is  increased.      

GraphChi  with  hard-­‐drive  is  ~  2x  slower  than  SSD  (if  computa;onal  cost  low).  

Graph size à

domain

twitter-2010

uk-2007-05 uk-union

yahoo-web

0.00E+00  

5.00E+06  

1.00E+07  

1.50E+07  

2.00E+07  

2.50E+07  

0.00E+00   2.00E+00   4.00E+00   6.00E+00   8.00E+00  Billions  

PageRank -- throughput (Mac Mini, SSD)

Perf

orm

ance

à

Paper:  scalability  of  other  applica;ons.  

Page 34: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Bottlenecks / Multicore

Experiment  on  MacBook  Pro  with  4  cores  /  SSD.  

•  Computationally intensive applications benefit substantially from parallel execution.

•  GraphChi saturates SSD I/O with 2 threads.

0  20  40  60  80  100  120  140  160  

1   2   4  RunJ

me  (secon

ds)  

Number  of  threads  

Matrix  Factoriza*on  (ALS)  

Loading   ComputaJon  

0  

200  

400  

600  

800  

1000  

1   2   4  RunJ

me  (secon

ds)  

Number  of  threads  

Connected  Components  

Loading   ComputaJon  

Page 35: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Evolving Graphs

Graphs whose structure changes over time

Page 36: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Evolving Graphs: Introduction

•  Most interesting networks grow continuously: – New connections made, some ‘unfriended’.

•  Desired functionality: – Ability to add and remove edges in streaming

fashion; –  ... while continuing computation.

•  Related work: – Kineograph (EuroSys ‘12), distributed system for

computation on a changing graph.

Page 37: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

PSW and Evolving Graphs

•  Adding edges –  Each (shard, interval) has an associated edge-buffer.

•  Removing edges: Edge flagged as “removed”.

interval(1)

interval(2)

interval(P)

shard(j)

edge-buffer(j, 1)

edge-buffer(j, 2)

edge-buffer(j, P)

New  edges   (for  example)  TwiQer  “firehose”  

Page 38: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Recreating Shards on Disk

•  When buffers fill up, shards a recreated on disk –  Too big shards are split.

•  During recreation, deleted edges are permanently removed.

interval(1)

interval(2)

interval(P)

shard(j)

Re-­‐create  &  Split  

interval(1)

interval(2)

interval(P+1)

shard(j)

interval(1)

interval(2)

interval(P+1)

shard(j+1)

Page 39: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

EVALUATION: EVOLVING GRAPHS

Streaming Graph experiment

Page 40: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Streaming Graph Experiment •  On the Mac Mini:

–  Streamed edges in random order from the twitter-2010 graph (1.5 B edges)

•  With maximum rate of 100K or 200K edges/sec. (very high rate)

–  Simultaneously run PageRank. – Data layout:

•  Edges were streamed from hard drive •  Shards were stored on SSD.

edges  

Page 41: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Ingest Rate

0  

50000  

100000  

150000  

200000  

250000  

0 1 2 3

Edges  /

 sec  

Time  (hours)  

actual-­‐100K   actual-­‐200K   100K   200K  

When  graph  grows,  shard  recrea;ons  become  more  expensive.  

Page 42: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Streaming: Computational Throughput

Throughput  varies  strongly  due  to  shard  rewrites  and  asymmetric  computa;on.  

0.00E+00  

2.00E+06  

4.00E+06  

6.00E+06  

8.00E+06  

1.00E+07  

0 1 2 3 4

Throuh

gput  Edges/sec  

Time  (hours)  

Throughput  (100K)   Throughput  (200K)   Fixed  graph  

Page 43: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Conclusion

Page 44: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Future Directions

Come  to  the  the  poster  on  Monday  to  discuss!  

•  This work: small amount of memory. •  What if have hundreds of GBs of RAM?

Graph  working  memory  (PSW)   disk

Computation 1 state Computation 2 state

Computational state

Graph  working  memory  (PSW)   disk

Computation 1 state Computation 2 state

Computational state

Graph  working  memory  (PSW)   disk

computational state

Computation 1 state Computation 2 state

RAM  

Page 45: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Conclusion

•  Parallel Sliding Windows algorithm enables processing of large graphs with very few non-sequential disk accesses.

•  For the system researchers, GraphChi is a solid baseline for system evaluation –  It can solve as big problems as distributed systems.

•  Takeaway: Appropriate data structures as an alternative to scaling up.

Source code and examples: http://graphchi.org License: Apache 2.0

Page 46: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Extra Slides

Page 47: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

PSW is Asynchronous

•  If V > U, and there is edge (U,V, &x) = (V, U, &x), update(V) will observe change to x done by update(U): –  Memory-shard for interval (j+1) will contain writes to shard(j) done

on previous intervals. –  Previous slide: If U, V in the same interval.

•  PSW implements the Gauss-Seidel (asynchronous) model of computation –  Shown to allow, in many cases, clearly faster convergence of

computation than Bulk-Synchronous Parallel (BSP). –  Each edge stored only once.

V ti ( F (V t

0 , V t1 , . . . , V t

i�1, Vt�1i , V t�1

i+1 , ...)

Extended  edi;on  

47  GraphChi  –  Aapo  Kyrola  

Page 48: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Number of Shards

101 102 1030

0.5

1

1.5

2x 107#Shards and Performance

Number of shards (P)

Thro

ughp

ut (e

dges

/sec

) Conn comp. (SSD)

Pagerank (SSD)

Pagerank (HD)

Conn comp. (HD)

Student Version of MATLAB

•  If P is in the “dozens”, there is not much effect on performance.

Page 49: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

I/O Complexity

•  See the paper for theoretical analysis in the Aggarwal-Vitter’s I/O model. – Worst-case only 2x best-case.

•  Intuition:

shard(1)  

interval(1)   interval(2)   interval(P)  

shard(2)   shard(P)  

1   |V|  v1   v2  

Inter-­‐interval  edge  is  loaded  from  disk  only  once  /  itera;on.  

Edge  spanning  intervals  is  loaded  twice  /  itera;on.  

Page 50: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Multiple hard-drives (RAIDish)

•  GraphChi supports striping shards to multiple disks Parallel I/O.

0  

500  

1000  

1500  

2000  

2500  

3000  

Pagerank   Conn.  components  

Second

s  /  itera;

on  

1  hard  drive   2  hard  drives   3  hard  drives  

Experiment  on  a  16-­‐core  AMD  server  (from  year  2007).  

Page 51: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Bottlenecks

0  

500  

1000  

1500  

2000  

2500  

1  thread   2  threads   4  threads  

Disk  IO   Graph  construc;on   Exec.  updates  

Connected  Components  on  Mac  Mini  /  SSD  

•  Cost of constructing the sub-graph in memory is almost as large as the I/O cost on an SSD –  Graph construction requires a lot of random access in RAM memory

bandwidth becomes a bottleneck.

Page 52: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming

Computational Setting

•  Constraints: A.  Not enough memory to store the whole graph in

memory, nor all the vertex values. B.  Enough memory to store one vertex and its edges

w/ associated values.

•  Largest example graph used in the experiments: – Yahoo-web graph: 6.7 B edges, 1.4 B vertices

Recently  GraphChi  has  been  used  on  a  MacBook  Pro  to  compute  with  the  most  recent  TwiQer  follow-­‐graph  (last  year:  15  B  edges)