Hornet: An Efficient Data Structure for
Dynamic Sparse Graphs and Matrices
Oded Green
Hornet
• A scalable and dynamic data structure for
– Sparse data
– Graph algorithms
– Linear algebra based problems
• Formerly known as cuSTINGER
– Hornet initialization is hundreds of times faster
– Hornet updates are 4X-10X faster
– The Hornet data structure is more robust and
scalable than cuSTINGER.
• Essentially a dynamic CSR data structure
• Easy to use
Oded Green, GTC-18 2
“Separation of powers”
• Dynamic graph data structure and dynamic
graph algorithms are in two different
repositories
– Easy to integrate with external library
– Can also be used with matrices
• This talk focuses on the data structure
Graph Primitives – Upfront summary
• Great performance for static and dynamic
graph algorithms
• Scalable
• Simple to use
• Will discuss algorithm framework later today
– 1:00pm
– Same room as this talk
Hornet – Upfront Summary
• Can support over 150 million updates per second
• Can easily scale to graphs with billions of vertices
• CSR comparison
– Initialization is relatively inexpensive: usually less than
3X slower
– Hornet requires 30% more storage
– Identical performance
• COO (edge-list) comparison
– Hornet requires 20% less storage
– Hornet has better locality
Big Data problems
need Graph Analysis
Communication networks:
• World-wide connectivity
• High velocity changes
• Different types of extracted data:
– Physical communication network.
– Person-to-person communication network.
Health-Care networks:
• Various players.
• Pattern matching and
epidemic monitoring.
• Problem sizes have
doubled in last 5 years.
Financial networks:
• Transactions between
players.
• Different transactions
types (property graph)
Hornet Properties
✓ A Simple programming model
✓ Enable algorithm designers to implement dynamic & streaming graph
algorithms with ease.
✓ Can easily grow to 1000X its initial size (no restart needed)
✓ Millions of updates per second to the graph
✓ Updates are not a bottleneck for analytics.
✓ Automated data management
✓ Transfers data between host and device automatically
✓ Reduces fragmentation
✓ Supports memory reclamation
• Scalable data structure
cuSTINGER paper: [Green & Bader; HPEC, 2016]:
cuSTINGER: Supporting dynamic graph algorithms for GPUs
Definitions
• Dynamic graphs
– Graph can change over time.
– Changes can be to topology, edges, or vertices.
• For example new edges between two vertices.
– Changes to edge or vertex weights
• Streaming graphs:
– Graphs changing at high rates.
– 100s of thousands of updates per second.
• Dynamic matrices
– Adding a perturbation to the matrix
Dynamic graph example
• Only a subset of the entire graph…
• Dynamic:
– At time t:
  • v and w become friends.
  • insert_edge(v, w)
– At time t̂:
  • u and v are no longer friends.
  • delete_edge(u, v)
• Additional operations include vertex insertions & deletions
[Figure: a graph fragment with vertices u, v, and w.]
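The two updates in this example can be sketched on a minimal host-side adjacency structure. This is an illustrative C++ sketch, not Hornet's API: the class `DynamicGraph` and its members `insert_edge`, `delete_edge`, `has_edge`, and `degree` are hypothetical names, and friendship is assumed to be an undirected edge.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical host-side dynamic graph; not part of Hornet.
class DynamicGraph {
public:
    explicit DynamicGraph(std::size_t num_vertices) : adj_(num_vertices) {}

    // Insert the undirected edge (a, b); std::set ignores duplicates.
    void insert_edge(std::size_t a, std::size_t b) {
        assert(a < adj_.size() && b < adj_.size());
        adj_[a].insert(b);
        adj_[b].insert(a);
    }

    // Delete the undirected edge (a, b); deleting a missing edge is a no-op.
    void delete_edge(std::size_t a, std::size_t b) {
        assert(a < adj_.size() && b < adj_.size());
        adj_[a].erase(b);
        adj_[b].erase(a);
    }

    bool has_edge(std::size_t a, std::size_t b) const {
        assert(a < adj_.size() && b < adj_.size());
        return adj_[a].count(b) > 0;
    }

    std::size_t degree(std::size_t a) const {
        assert(a < adj_.size());
        return adj_[a].size();
    }

private:
    std::vector<std::set<std::size_t>> adj_;  // one neighbor set per vertex
};
```

With u, v, w numbered 0, 1, 2, calling `insert_edge(1, 2)` and then `delete_edge(0, 1)` reproduces the two events above. A per-vertex `std::set` is easy to update but scatters neighbors in memory.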
Widely used graph data structures
| Names | Pros | Cons |
|---|---|---|
| Dense adjacency matrix | Supports updates | Poor locality; massive storage requirements |
| Linked lists | Flexible | Poor locality; limited parallelism; allocation time is costly |
| COO (edge list), unsorted | Has some flexibility; updates are simple; lots of parallelism | Poor locality; stores both the source and destination |
| CSR | Uses exact amount of memory; good locality; lots of parallelism | Inflexible |

These data structures don’t cut it
Compressed Sparse Row (CSR)
Pros:
• Uses precise storage
requirements
• Great locality
– Good for GPUs
• Handful of arrays
– Simple to use and
manage
Cons:
• Inflexible.
• Network growth
unsupported
• Topology changes
unsupported
• Property graphs not supported
Src/Row:    0 1 2 3 4 5 6 7
Offset:     0 2 4 7 9 11 13 14 14
Dest./Col.: 1 2 0 5 0 3 4 2 6 2 5 1 4 3
Value:      2 5 2 7 4 1 4 1 2 4 1 7 1 2
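To make the CSR layout concrete, here is a small C++ sketch that reads the offset, destination, and value arrays from this slide. The array contents come from the slide; the names `kOffset`, `kDest`, `kValue`, `degree`, and `neighbors` are illustrative assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// CSR arrays copied from the slide: 8 vertices, 14 edges.
const std::vector<int> kOffset = {0, 2, 4, 7, 9, 11, 13, 14, 14};
const std::vector<int> kDest   = {1, 2, 0, 5, 0, 3, 4, 2, 6, 2, 5, 1, 4, 3};
const std::vector<int> kValue  = {2, 5, 2, 7, 4, 1, 4, 1, 2, 4, 1, 7, 1, 2};

// Number of neighbors of `src`: the difference of two consecutive offsets.
int degree(int src) {
    assert(src >= 0 && static_cast<std::size_t>(src) + 1 < kOffset.size());
    return kOffset[src + 1] - kOffset[src];
}

// Collect (destination, value) pairs for all edges leaving `src`.
std::vector<std::pair<int, int>> neighbors(int src) {
    std::vector<std::pair<int, int>> out;
    const int begin = kOffset[src];
    const int end = begin + degree(src);
    for (int e = begin; e < end; ++e) {
        out.emplace_back(kDest[e], kValue[e]);
    }
    return out;
}
```

The edges of vertex 2 occupy positions 4 to 6 of the packed arrays. Adding a fourth edge to vertex 2 would mean shifting every later entry and rewriting the offsets, which is the inflexibility listed under Cons.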
Hornet – A High Level View
• Every vertex points at its own array
• Many edge arrays (blocks)
• Block size is determined by the number of neighbors (always a power of 2)
• Extra space is left at the end of the block
USER-INTERFACE
Vertex Id: 0 1 2 3 4 5 6 7
Used:      2 2 3 2 2 2 1 0
Pointer:   one per vertex, to that vertex's edge block
Edge blocks (Dest./Col. and Value):
Vertex 0: Dest./Col. 1 2; Value 2 5
Vertex 1: Dest./Col. 0 5; Value 2 7
Vertex 2: Dest./Col. 0 3 4; Value 4 1 4
Vertex 3: Dest./Col. 2 6; Value 1 2
Vertex 4: Dest./Col. 2 5; Value 4 1
Vertex 5: Dest./Col. 1 4; Value 7 1
Vertex 6: Dest./Col. 3; Value 2
[The figure marks the over-allocated space at the end of the blocks.]
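The power-of-two rule can be sketched as follows. This is a host-side C++ illustration under stated assumptions, not Hornet's implementation: `block_size_for` and `VertexBlock` are hypothetical names, and a `std::vector` stands in for a block that the real data structure would obtain from its GPU memory manager.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Smallest power of two that is >= n (minimum block size of 1).
std::size_t block_size_for(std::size_t n) {
    std::size_t size = 1;
    while (size < n) size *= 2;
    return size;
}

// Hypothetical per-vertex record: a "used" count plus the vertex's own block.
struct VertexBlock {
    std::size_t used = 0;
    std::vector<int> dest;  // block storage; its size is always a power of two

    VertexBlock() : dest(1) {}

    std::size_t capacity() const { return dest.size(); }

    // Append a neighbor. Returns true if the block was full and had to be
    // replaced by the next larger power-of-two block.
    bool insert(int neighbor) {
        bool grew = false;
        if (used == capacity()) {
            dest.resize(block_size_for(used + 1));  // existing edges are kept
            grew = true;
        }
        assert(used < capacity());
        dest[used++] = neighbor;
        return grew;
    }
};
```

A vertex with three neighbors, such as vertex 2 here, ends up in a block of four slots, so its next insertion fills the spare slot without any reallocation.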
Hornet – Property Graph Support
[Figure: the vertex array (Vertex Id, Used, Pointer) and the per-vertex edge blocks, labeled USER-INTERFACE. Each edge carries the fields Dest./Col., Weight, Type, Time 1, User 1, User 2, ….]
• Programmers can add fields per edge
• Easy to manage for static graph data structures
• Hornet manages the data movement
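One way to picture user-defined edge fields is a structure-of-arrays container with one array per field. The sketch below is a hypothetical C++14 illustration, not Hornet's interface: `EdgeArrays`, `add_edge`, `dest`, and `field` are assumed names, and the field types (weight, type, time) are chosen only for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical structure-of-arrays edge storage: one array per edge field.
template <typename... Fields>
class EdgeArrays {
public:
    // Append one edge: its destination plus one value per user-defined field.
    void add_edge(int dest, const Fields&... values) {
        dest_.push_back(dest);
        append(std::index_sequence_for<Fields...>{}, values...);
    }

    std::size_t size() const { return dest_.size(); }

    int dest(std::size_t e) const {
        assert(e < size());
        return dest_[e];
    }

    // Read field number I of edge e (returned by value).
    template <std::size_t I>
    typename std::tuple_element<I, std::tuple<Fields...>>::type
    field(std::size_t e) const {
        assert(e < size());
        return std::get<I>(fields_)[e];
    }

private:
    template <std::size_t... Is>
    void append(std::index_sequence<Is...>, const Fields&... values) {
        // One push_back per field array (C++14-compatible pack expansion).
        int expand[] = {0, (std::get<Is>(fields_).push_back(values), 0)...};
        (void)expand;
    }

    std::vector<int> dest_;                      // Dest./Col.
    std::tuple<std::vector<Fields>...> fields_;  // user-defined fields
};
```

For example, `EdgeArrays<float, int, long>` would hold a weight, a type, and a timestamp per edge. Keeping each field in its own array means an algorithm that reads only destinations does not touch the other fields.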
Hornet in Detail
[Figure: Hornet in detail, in two layers.
USER-INTERFACE: a vertex array with rows Vertex Id (0 to 7), Used (#Neighbors/nnz: 2 2 3 2 2 2 1 0), and Pointer, followed by over-allocated space for vertex insertions.
MEMORY MANAGER: block arrays BA_{0,1} (bsize=1), BA_{1,1} (bsize=2), BA_{1,2} (bsize=2), and BA_{2,1} (bsize=4). Each holds edge blocks with Dest./Col. and Weight rows and has a bit status row. Blocks include over-allocated space for the power-of-two rule. The figure also labels a Vec-Tree.]
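The block arrays and their bit status can be sketched as a small pool allocator. This is a simplified host-side C++ illustration, not Hornet's memory manager: `BlockArray`, `allocate`, `release`, and `block` are hypothetical names, and a linear scan over the bits replaces whatever lookup structure the real implementation uses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical block array: `num_blocks` edge blocks of `block_size` slots each.
class BlockArray {
public:
    static constexpr std::size_t kNone = static_cast<std::size_t>(-1);

    BlockArray(std::size_t block_size, std::size_t num_blocks)
        : block_size_(block_size),
          in_use_(num_blocks, false),
          slots_(block_size * num_blocks, 0) {}

    // Claim the first free block and return its index, or kNone if none is free.
    std::size_t allocate() {
        for (std::size_t b = 0; b < in_use_.size(); ++b) {
            if (!in_use_[b]) {
                in_use_[b] = true;
                return b;
            }
        }
        return kNone;
    }

    // Return a block to the pool so that a later allocate() can reuse it.
    void release(std::size_t b) {
        assert(b < in_use_.size() && in_use_[b]);
        in_use_[b] = false;
    }

    // Pointer to the first slot of block b.
    int* block(std::size_t b) {
        assert(b < in_use_.size());
        return slots_.data() + b * block_size_;
    }

    std::size_t block_size() const { return block_size_; }

private:
    std::size_t block_size_;
    std::vector<bool> in_use_;  // the "bit status" row: one bit per block
    std::vector<int> slots_;    // contiguous storage shared by all blocks
};
```

Because every block in one array has the same size, a freed block can be handed to any other vertex that needs that size, which is one way to read the memory reclamation and reduced fragmentation claims made earlier.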
Hornet Performance
• Memory Utilization
– Independent of the GPU being used
• Initialization overhead
• Update rate
Hornet Performance Analysis
• All performance analysis is for the P100
– 56 SMs
– 3584 SPs
– 16GB HBM2 memory
Input Graphs
• 10th DIMACS Implementation Challenge
• SNAP – Stanford Network Analysis Project
• Florida Matrix Collection
The following is only a subset of these graphs:

| Name | Type | \|V\| | \|E\|* | Source |
|---|---|---|---|---|
| coAuthorsDBLP | Collaboration | 299k | 1.95M | DIMACS |
| as-skitter | Trace route | 1.69M | 11.1M | SNAP |
| kron_21 | Random | 2M | 201M | DIMACS |
| cit-patents | Citation | 3.77M | 16.5M | SNAP |
| cage15 | Matrix | 5.15M | 94M | DIMACS |
| uk-2002 | Webcrawl | 18.52M | 523M | DIMACS |
Memory Utilization - Overall
• BlockArrays of size 2^16
• 70% average utilization relative to CSR
• Better utilization than COO, cuSTINGER, and AIM
– AIM allocates all GPU memory
[Figure: bar chart of space efficiency (0% to 100%) for Hornet (BlockArray size 2^16), COO, cuSTINGER, and AIM.]
Initialization overhead
• Time to initialize data structure in comparison to CSR
• In most cases 2X-3X slower
– One time penalty
• Much faster than cuSTINGER
[Figure: slowdown versus CSR (log scale, 1 to 1,000) for Hornet and cuSTINGER.]
Insertion Rates
• Supports over 150M updates per second
• Hornet
– 4X-10X faster than cuSTINGER
– Does not have a performance dip like cuSTINGER
• Scalable growth in update rate
[Figure: two panels, cuSTINGER and Hornet, showing update rate (edges per second, log scale from 10^3 to 10^9) for in-2004, soc-LiveJournal1, cage15, and kron_g500-logn21.]
Take away
• Anything you can do with CSR you can also do with
Hornet (the reverse is not true)
• Supports high update rates
• Scalable in both data size and in performance
• Simple and high-level programming model
– See you at 1:00pm
• Also, look for James Fox’s talk on a cool algorithm
for finding the maximal K-Truss in a graph
– Uses dynamic triangle counting and the Hornet’s deletion…
Hornet Team (Current & Alumni)
Thank you
•Email: [email protected]
• Hornet:
– https://github.com/hornet-gt/hornet
• HornetsNest:
– https://github.com/hornet-gt/hornetsnest
Backup slides
Memory Utilization - Overall
• 70% average utilization of CSR
• Better utilization in comparison to: COO,
cuSTINGER, AIM
[Figure: bar chart of space efficiency (0% to 100%) for Hornet with BlockArray sizes 2^16, 2^18, and 2^22, COO, cuSTINGER, and AIM.]
Part 2: HornetsNest
• Algorithm framework for Hornet data
structure
– We support CSR as well
• All algorithms are implemented using a small
set of operations
– We show that these operators are efficient for
static graph algorithms and can be used for
dynamic graph algorithms
• Uses features from C++11 and C++14
Sparse Matrix Vector Multiplication
• In comparison to DCSR [King et al; 2016; ISC]
– DCSR requires customized SpMV
• Hornet uses the same algorithm code as CSR.
[Figure: speedup versus DCSR (log scale, 1 to 100) for CSR and Hornet.]
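The idea of one SpMV routine serving both layouts can be sketched on the host in C++. This is an illustration under assumptions, not HornetsNest code: `CsrMatrix`, `BlockedMatrix`, and `spmv` are hypothetical names, and the only requirement on a layout is the small row interface (`rows`, `nnz`, `col_at`, `val_at`) that both structs provide.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CSR layout: one offset array plus packed column and value arrays.
// `offset` must hold rows + 1 entries.
struct CsrMatrix {
    std::vector<std::size_t> offset;
    std::vector<std::size_t> col;
    std::vector<double> val;

    std::size_t rows() const { return offset.size() - 1; }
    std::size_t nnz(std::size_t r) const { return offset[r + 1] - offset[r]; }
    std::size_t col_at(std::size_t r, std::size_t i) const { return col[offset[r] + i]; }
    double val_at(std::size_t r, std::size_t i) const { return val[offset[r] + i]; }
};

// Block layout: every row owns a separate, possibly over-allocated, block.
struct BlockedMatrix {
    struct Row {
        std::size_t used;              // number of valid entries in the block
        std::vector<std::size_t> col;  // block storage, may have spare slots
        std::vector<double> val;
    };
    std::vector<Row> row;

    std::size_t rows() const { return row.size(); }
    std::size_t nnz(std::size_t r) const { return row[r].used; }
    std::size_t col_at(std::size_t r, std::size_t i) const { return row[r].col[i]; }
    double val_at(std::size_t r, std::size_t i) const { return row[r].val[i]; }
};

// y = A * x, written once against the shared row interface.
template <typename Matrix>
std::vector<double> spmv(const Matrix& a, const std::vector<double>& x) {
    std::vector<double> y(a.rows(), 0.0);
    for (std::size_t r = 0; r < a.rows(); ++r) {
        for (std::size_t i = 0; i < a.nnz(r); ++i) {
            assert(a.col_at(r, i) < x.size());
            y[r] += a.val_at(r, i) * x[a.col_at(r, i)];
        }
    }
    return y;
}
```

Only the first `used` entries of each block are read, so spare slots left by over-allocation never affect the result.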