thorny path to the large scale graph processing, Алексей Зиновьев (Тамтэк)
DESCRIPTION
Доклад Алексея Зиновьева на HighLoad++ 2014.TRANSCRIPT
![Page 1: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/1.jpg)
Thorny path to the Large-Scale Graph ProcessingZinoviev Alexey
![Page 2: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/2.jpg)
About
• I am a <graph theory, machine learning, traffic jams prediction, BigData
algorythms> scientist
• But I'm a <Java, JavaScript, Android, NoSQL, Hadoop, Spark>
programmer
![Page 3: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/3.jpg)
3/65
BigData & Graph Theory
![Page 4: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/4.jpg)
Big Data of old times• Astronomy
• Weather
• Trading
• Sea routes
• Battles
![Page 5: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/5.jpg)
And now ...• Web graph
• Facebook friend network
• Gmail email graph
• EU road network
• Citation graph
• PayPal transaction graph
![Page 6: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/6.jpg)
Graph Number of vertexes
Number of edges
Volume Data/per day
Web-graph 1,5 * 10^12 1,2 * 10^13 100 PB 300 TB
Facebook (friends graph)
1,1 * 10^9 160 * 10^9 1 PB 15 TB
Road graph of EU
18 * 10^6 42 * 10^6 20 GB 50 MB
Road graph of this city
250 000 460 000 500 MB 100 KB
![Page 7: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/7.jpg)
Problems• Popularity rank (page rank)
• Determining popular users, news, jobs, etc.
• Shortest paths
• Max flow
• How are users, groups connected?
• Clustering, semi-clustering
• Max clique, triangle closure, label propagation algorithms
• Finding related people, groups, interests
![Page 8: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/8.jpg)
![Page 9: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/9.jpg)
![Page 10: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/10.jpg)
![Page 11: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/11.jpg)
![Page 12: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/12.jpg)
Node Centrality Problem• Verticies with high impact
• Removal of important vertices reduces the reliability
Cases:
• Bioinformatics
• Social connections
• Road network
• Spam detection
• Recommendation system
![Page 13: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/13.jpg)
![Page 14: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/14.jpg)
Small World Problem
Facebook 4.74 712 M 69 G
Twitter 3.67 ---- 5G follows
MSN Messenger (1 month)
6.6 180 M 1.3 G arcs
![Page 15: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/15.jpg)
15/65
Large graph processing tools
![Page 16: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/16.jpg)
Think like a vertex…• Majority of graph algorithms are iterative and traverse the graph in
some way
• Classic map-reduce overheads (job startup/shutdown, reloading data
from HDFS, shuffling)
• High complexity of graph problem reduction to key-value model
• Iteration algorythms, but multiple chained jobs in M/R with full saving
and reading of each state
![Page 17: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/17.jpg)
Why not use MapReduce/Hadoop?• Example: PageRank, Google‘s
famous algorithm for measuring the
authority of a webpage based on the
underlying network of hyperlinks
• defined recursively: each vertex
distributes its authority to its neighbors
in equal proportions
![Page 18: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/18.jpg)
Google Pregel• Distributed system especially developed for large scale graph
processing
• Bulk Synchronous Parallel (BSP) as execution model
• Supersteps are atomic units of parallel computation
• Any superstep can be restarted from a checkpoint (need not be user
defined)
• A new superstep provides an opportunity for rebalancing of
components among available resources
![Page 19: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/19.jpg)
Superstep in BSP
![Page 20: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/20.jpg)
Vertex-centric BSP• Each vertex has an id, a value, a list of its adjacent vertex ids and the
corresponding edge values
• Each vertex is invoked in each superstep, can recompute its value and
send messages to other vertices, which are delivered over superstep
barriers
• Advanced features : termination votes, combiners, aggregators,
topology mutations
![Page 21: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/21.jpg)
C++ API, Pregel
![Page 22: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/22.jpg)
![Page 23: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/23.jpg)
23/65
Apache Giraph
![Page 24: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/24.jpg)
Why Apache GiraphPregel is proprietary, but:
• Apache Giraph is an open source implementation of Pregel
• Runs on standard Hadoop infrastructure
• Computation is executed in memory
• Can be a job in a pipeline(MapReduce, Hive)
• Uses Apache ZooKeeperfor synchronization
![Page 25: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/25.jpg)
![Page 26: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/26.jpg)
Why Apache Giraph
• No locks: message-based communication
• No semaphores: global synchronization
• Iteration isolation: massively parallelizable
![Page 27: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/27.jpg)
![Page 28: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/28.jpg)
![Page 29: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/29.jpg)
ZooKeeper in Apache GiraphZooKeeper: responsible for
computation state
• Partition/worker mapping
• Global state: superstep
• Checkpoint paths, aggregator
values, statistics
![Page 30: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/30.jpg)
Master in Apache GiraphMaster: responsible for coordination
• Assigns partitions to workers
• Coordinates synchronization
• Requests checkpoints
• Aggregates aggregator values
• Collects health statuses
![Page 31: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/31.jpg)
Worker in Apache GiraphWorker: responsible for vertices
• Invokes active vertices
compute() function
• Sends, receives and assigns
messages
• Computes local aggregation
values
![Page 32: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/32.jpg)
Scaling Giraph to a trillion edges
![Page 33: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/33.jpg)
Fault toleranceNo single point of failure from Giraph threads
• With multiple master threads, if the current master dies, a new one will automatically take over.
• If a worker thread dies, the application is rolled back to a previously checkpointed superstep.
• If a zookeeper server dies, as long as a quorum remains, the application can proceed
Hadoop single points of failure still exist (Namenode, jobtracker)
![Page 34: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/34.jpg)
Worker Scalability, 250m nodes
![Page 35: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/35.jpg)
Vertex scalability, 300 workers
![Page 36: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/36.jpg)
Vertex/workers scalability
![Page 37: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/37.jpg)
MapReduce vs Giraph6 machines with 2x8core Opteron CPUs, 4x1TB disks and 32GB RAM each, ran 1 Giraph worker per core
Wikipedia page link graph (6 million vertices, 200 million edges)
PageRank on Hadoop/Mahout
• 10 iterations approx. 29 minutes• average time per iteration: approx. 3 minutes
PageRank on Giraph
• 30 iterations took approx. 15 minutes• average time per iteration: approx. 30 seconds
10x performance improvement
![Page 38: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/38.jpg)
Okapi• Apache Mahout for graphs
• Graph-based recommenders: ALS,
SGD, SVD++, etc.
• Graph analytics: Graph
partitioning, Community Detection,
K-Core, etc.
![Page 39: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/39.jpg)
Giraph’s killer
![Page 40: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/40.jpg)
Spark• MapReduce in memory
• Up to 50x faster than Hadoop
• Support for Shark (like Hive), MLlib
(Machine learning), GraphX (graph
processing)
• RDD is a basic building block
(immutable distributed collections of
objects)
![Page 41: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/41.jpg)
Spark in Hadoop old family
![Page 42: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/42.jpg)
GraphXSupported algorythms
● PageRank
● Connected components
● Label propagation
● SVD++
● Strongly connected components
● Triangle count
![Page 43: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/43.jpg)
GraphChi• Asynchronous Disk-based version of GraphLab
• Utilizing parallel sliding window
• Very small number of non-sequential accessesto the disk
• Graph does not fit in memory
• Input graph is split into P disjoint intervals to balance edges,
each associated with a shard
• For Home deals ...
![Page 44: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/44.jpg)
GraphChi
![Page 45: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/45.jpg)
GraphChi
![Page 46: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/46.jpg)
46/65
Road Networks
![Page 47: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/47.jpg)
Definition• Edge weights > 0
• A few classes of roads
• Lat/Lon attributes for each vertex
• Subgraphs for cross-roads
• Not so big as web graph
• Static
![Page 48: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/48.jpg)
Shortest path problem
![Page 49: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/49.jpg)
AI
![Page 50: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/50.jpg)
Full
![Page 51: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/51.jpg)
Dijkstra
![Page 52: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/52.jpg)
Bi-Directional
![Page 53: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/53.jpg)
We need in fast system!• Response < 10 ms (with high accuracy)
• Shortest path (SP) with O(n)
• Preprocessing phase
• Don’t keep all SP - O(n^2)
• Use geo attributes
• Using compression and recoding for
disk storage
• Network is stable
![Page 54: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/54.jpg)
EU Road network
Dijkstra ALT RE HH CH TN HL
2 008 300 24 656 2444 462.0 94.0 1.8 0.3
• ALT: [Goldberg & Harrelson 05], [Delling & Wagner 07]• RE: [Gutman 05], [Goldberg et al. 07]• HH: [Sanders & Schultes 06]• CH: [Geisberger et al. 08]• TN: [Geisberger et al. 08]• HL: [Abraham et al. 11]
![Page 55: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/55.jpg)
A* with landmarks (ALT)
![Page 56: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/56.jpg)
Reach (RE)
![Page 57: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/57.jpg)
Transit nodes (TN)• Divide graph G on subgraphs G_i
• Find R (subset of G_i) for each G_i
• All sortest path in G_i across R
• Build pairs (v_i, r_k) for each v_i where
r_k is closest Transit Node
• Calculate shortest paths between transit
nodes in R
• Save it!
![Page 58: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/58.jpg)
TN + ALT
![Page 59: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/59.jpg)
59/65
Special Cases
![Page 60: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/60.jpg)
Optimization problems• Unstable graph
• Prerpocessing phase is meaningless
• How to invest 1B $ in road network to minimize human time in
traffic jams
• How to invest 1M $ in road network to improve reliability before
the flooding
![Page 61: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/61.jpg)
Last steps ...
• I/O Efficient Algorythms and Data Structures
• Graphs and Memory Errors
![Page 62: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/62.jpg)
Omsk
![Page 63: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/63.jpg)
Novosibirsk
![Page 64: Thorny Path to the Large Scale Graph Processing, Алексей Зиновьев (Тамтэк)](https://reader033.vdocument.in/reader033/viewer/2022060116/557f479cd8b42aba678b529e/html5/thumbnails/64.jpg)
Novosibirsk, TN preprocessing