Schism: Graph Partitioning for OLTP Databases in a Relational Cloud. Implications for the design of GraphLab. Samuel Madden, MIT CSAIL; Director, Intel ISTC in Big Data. GraphLab Workshop 2012.
![Page 1: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/1.jpg)
Samuel Madden, MIT CSAIL
Director, Intel ISTC in Big Data
Schism: Graph Partitioning for OLTP Databases in a Relational Cloud
Implications for the design of GraphLab
GraphLab Workshop 2012
![Page 2: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/2.jpg)
The Problem with Databases
• Tend to proliferate inside organizations
– Many applications use DBs
• Tend to be given dedicated hardware
– Often not heavily utilized
• Don’t virtualize well
• Difficult to scale
This is expensive & wasteful
– Servers, administrators, software licenses, network ports, racks, etc.
![Page 3: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/3.jpg)
RelationalCloud Vision
• Goal: A database service that exposes self-serve usage model
– Rapid provisioning: users don’t worry about DBMS & storage configurations
Example:
• User specifies type and size of DB and SLA (“100 txns/sec, replicated in US and Europe”)
• User given a JDBC/ODBC URL
• System figures out how & where to run user’s DB & queries
![Page 4: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/4.jpg)
Before: Database Silos and Sprawl
[Diagram: Application #1 through Application #4, each paired with its own Database #1 through Database #4, each marked $$]
• Must deal with many one-off database configurations
• And provision each for its peak load
![Page 5: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/5.jpg)
After: A Single Scalable Service
[Diagram: App #1, App #2, App #3, and App #4 sharing a single service]
• Reduces server hardware by aggressive workload-aware multiplexing
• Automatically partitions databases across multiple HW resources
• Reduces operational costs by automating service management tasks
![Page 6: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/6.jpg)
What about virtualization?
• Could run each DB in a separate VM
• Existing database services (Amazon RDS) do this
– Focus is on simplified management, not performance
• Doesn’t provide scalability across multiple nodes
• Very inefficient
[Chart: Max throughput w/ 20:1 consolidation (us vs. VMware ESXi); cases: one DB 10x loaded, all DBs equal load]
![Page 7: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/7.jpg)
Key Ideas in this Talk: Schism
• How to automatically partition transactional (OLTP) databases in a database service
• Some implications for GraphLab
![Page 8: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/8.jpg)
System Overview
Schism
Not going to talk about:
- Database migration
- Security
- Placement of data
![Page 9: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/9.jpg)
This is your OLTP Database
Curino et al., VLDB 2010
![Page 10: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/10.jpg)
This is your OLTP database on Schism
![Page 11: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/11.jpg)
Schism
New graph-based approach to automatically partition OLTP workloads across many machines
Input: trace of transactions and the DB
Output: partitioning plan
Results: As good as or better than the best manual partitioning
Static partitioning – not automatic repartitioning.
![Page 12: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/12.jpg)
Challenge: Partitioning
Goal: Linear performance improvement when adding machines
Requirement: independence and balance
Simple approaches:
• Total replication
• Hash partitioning
• Range partitioning
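As a rough illustration of the three simple approaches, the sketch below maps a tuple key to the partition(s) that would store it. The function names, the CRC32 hash, and the split-point representation are assumptions made for this example and are not taken from the talk.

```python
import bisect
import zlib

def total_replication(key, num_partitions):
    """Total replication: every partition stores a copy of every tuple."""
    return list(range(num_partitions))

def hash_partition(key, num_partitions):
    """Hash partitioning: CRC32 is used here because it is stable across runs."""
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions

def range_partition(key, split_points):
    """Range partitioning: split_points is a sorted list of boundaries.

    Partition i holds keys k with split_points[i-1] <= k < split_points[i].
    """
    return bisect.bisect_right(split_points, key)

print(total_replication(42, 3))           # [0, 1, 2]
print(hash_partition(42, 4))              # some value in 0..3
print(range_partition(150, [100, 200]))   # 1
```

None of these looks at the workload, which is why tuples used by the same transaction can easily end up on different machines.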
![Page 13: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/13.jpg)
Partitioning Challenges
Transactions access multiple records?
– Distributed transactions
– Replicated data
Workload skew?
– Unbalanced load on individual servers
Many-to-many relations?
– Unclear how to partition effectively
![Page 14: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/14.jpg)
Many-to-Many: Users/Groups
![Page 15: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/15.jpg)
Many-to-Many: Users/Groups
![Page 16: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/16.jpg)
Many-to-Many: Users/Groups
![Page 17: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/17.jpg)
Distributed Txn Disadvantages
Require more communication
– At least 1 extra message; maybe more
Hold locks for longer time
– Increases chance for contention
Reduced availability
– Failure if any participant is down
![Page 18: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/18.jpg)
Example
Single partition: 2 tuples on 1 machine
Distributed: 2 tuples on 2 machines
Each transaction writes two different tuples
Same issue would arise in distributed GraphLab
![Page 19: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/19.jpg)
Schism Overview
![Page 20: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/20.jpg)
Schism Overview
1. Build a graph from a workload trace
– Nodes: Tuples accessed by the trace
– Edges: Connect tuples accessed in the same txn
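The graph-building step can be sketched as follows. This is a minimal illustration, assuming the trace is already available as a list of transactions, each a list of tuple ids; weighting a node by its access count and an edge by the number of transactions that touch both endpoints is an assumption of this sketch.

```python
from collections import Counter
from itertools import combinations

def build_coaccess_graph(trace):
    """Return (node_weights, edge_weights) for a list of transactions.

    Each transaction is an iterable of the tuple ids it accessed.
    """
    node_weights = Counter()
    edge_weights = Counter()
    for txn in trace:
        tuples = sorted(set(txn))
        node_weights.update(tuples)          # one access per transaction
        for a, b in combinations(tuples, 2):
            edge_weights[(a, b)] += 1        # tuples a and b were co-accessed
    return node_weights, edge_weights

# Hypothetical trace: five transactions over tuples 1..5.
trace = [[1, 2], [1, 2], [2, 3], [4, 5], [4, 5, 1]]
nodes, edges = build_coaccess_graph(trace)
print(dict(nodes))
print(dict(edges))
```

A transaction touching k tuples contributes k(k-1)/2 edges, which is one reason very large transactions are expensive to represent.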
![Page 21: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/21.jpg)
Schism Overview
1. Build a graph from a workload trace
2. Partition to minimize distributed txns
Idea: min-cut minimizes distributed txns
![Page 22: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/22.jpg)
Schism Overview
1. Build a graph from a workload trace
2. Partition to minimize distributed txns
3. “Explain” partitioning in terms of the DB
![Page 23: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/23.jpg)
Building a Graph
![Page 24: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/24.jpg)
Building a Graph
![Page 25: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/25.jpg)
Building a Graph
![Page 26: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/26.jpg)
Building a Graph
![Page 27: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/27.jpg)
Building a Graph
![Page 28: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/28.jpg)
Building a Graph
![Page 29: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/29.jpg)
Replicated Tuples
![Page 30: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/30.jpg)
Replicated Tuples
![Page 31: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/31.jpg)
Partitioning
Use the METIS graph partitioner:
– min-cut partitioning with balance constraint
Node weight:
– # of accesses → balance workload
– data size → balance data size
Output: Assignment of nodes to partitions
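To make the objective concrete, here is a brute-force sketch of min-cut partitioning under a balance constraint on a tiny graph. It is not METIS and does not use METIS's algorithm; exhaustive search is only feasible for a handful of nodes. The `imbalance` parameter and the example graph are assumptions for illustration.

```python
from itertools import product

def cut_weight(assignment, edge_weights):
    """Total weight of edges whose endpoints land in different partitions."""
    return sum(w for (a, b), w in edge_weights.items()
               if assignment[a] != assignment[b])

def balanced_min_cut(node_weights, edge_weights, k=2, imbalance=1.0):
    """Try every assignment; keep the cheapest cut that respects the balance limit."""
    nodes = sorted(node_weights)
    limit = imbalance * sum(node_weights.values()) / k
    best = None
    for labels in product(range(k), repeat=len(nodes)):
        load = [0] * k
        for node, part in zip(nodes, labels):
            load[part] += node_weights[node]
        if max(load) > limit:
            continue  # violates the balance constraint
        assignment = dict(zip(nodes, labels))
        cost = cut_weight(assignment, edge_weights)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best

# Two tightly connected triangles joined by one light edge.
node_weights = {n: 1 for n in range(1, 7)}
edge_weights = {(1, 2): 3, (2, 3): 3, (1, 3): 3,
                (4, 5): 3, (5, 6): 3, (4, 6): 3,
                (3, 4): 1}
cost, assignment = balanced_min_cut(node_weights, edge_weights)
print(cost, assignment)
```

The output has the form described on the slide: an assignment of nodes to partitions, here separating the two triangles at a cut cost of 1.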
![Page 32: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/32.jpg)
Graph Size Reduction Heuristics
Coalescing: tuples always accessed together → single node (lossless)
Blanket Statement Filtering: Remove statements that access many tuples
Sampling: Use a subset of tuples or transactions
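Two of these heuristics can be sketched in a few lines. The sketch assumes the same list-of-transactions trace format as before; the function names and the `max_tuples` threshold are hypothetical, and the filter is applied per access set rather than per SQL statement for simplicity.

```python
from collections import defaultdict

def coalesce(trace):
    """Group tuples that are accessed by exactly the same set of transactions.

    Each group can become a single graph node without losing information.
    """
    accessed_by = defaultdict(set)
    for txn_id, txn in enumerate(trace):
        for t in txn:
            accessed_by[t].add(txn_id)
    groups = defaultdict(list)
    for t, txns in accessed_by.items():
        groups[frozenset(txns)].append(t)
    return sorted(sorted(g) for g in groups.values())

def filter_blanket(trace, max_tuples):
    """Drop access sets that touch more than max_tuples distinct tuples."""
    return [txn for txn in trace if len(set(txn)) <= max_tuples]

trace = [[1, 2, 3], [1, 2], [3, 4], [1, 2, 4]]
print(coalesce(trace))                                # [[1, 2], [3], [4]]
print(filter_blanket([[1, 2], [1, 2, 3, 4, 5]], 3))   # [[1, 2]]
```

Tuples 1 and 2 always appear together, so they collapse into one node; tuples 3 and 4 each have a distinct access pattern and stay separate.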
![Page 33: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/33.jpg)
Explanation Phase
Goal: Compact rules to represent partitioning

| Users (tuple ID) | Partition |
|---|---|
| 4 | 1 |
| 2 | 2 |
| 5 | 1 |
| 1 | 2 |
![Page 34: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/34.jpg)
Explanation Phase
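Goal: Compact rules to represent partitioning
Classification problem: tuple attributes → partition mappings

Users table with assigned partition:

| ID | Name | Position | Salary | Partition |
|---|---|---|---|---|
| 4 | Carlo | Post Doc. | $20,000 | 1 |
| 2 | Evan | PhD Student | $12,000 | 2 |
| 5 | Sam | Professor | $30,000 | 1 |
| 1 | Yang | PhD Student | $10,000 | 2 |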
![Page 35: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/35.jpg)
Decision Trees
Machine learning tool for classification
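Candidate attributes: attributes used in WHERE clauses
Output: predicates that approximate partitioning

Users table with assigned partition:

| ID | Name | Position | Salary | Partition |
|---|---|---|---|---|
| 4 | Carlo | Post Doc. | $20,000 | 1 |
| 2 | Evan | PhD Student | $12,000 | 2 |
| 5 | Sam | Professor | $30,000 | 1 |
| 1 | Yang | PhD Student | $10,000 | 2 |

IF (Salary > $12,000) P1 ELSE P2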
![Page 36: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/36.jpg)
Evaluation: Partitioning Strategies
Schism: Plan produced by our tool
Manual: Best plan found by experts
Replication: Replicate all tables
Hashing: Hash partition all tables
![Page 37: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/37.jpg)
Benchmark Results: Simple
[Chart: % distributed transactions (0% to 100%) on YahooBench-A and YahooBench-E for Schism, Manual, Replication, and Hashing]
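The metric on the y-axis of these charts can be computed directly from a trace and a partition assignment. The sketch below assumes no replication (each tuple lives on exactly one partition) and uses a hypothetical trace; it is meant only to show how the percentage is defined.

```python
def distributed_fraction(trace, assignment):
    """Fraction of transactions whose tuples span more than one partition."""
    distributed = sum(
        1 for txn in trace if len({assignment[t] for t in txn}) > 1
    )
    return distributed / len(trace)

trace = [[1, 2], [1, 2], [2, 3], [4, 5], [4, 5, 1]]

# A workload-aware plan keeps co-accessed tuples together.
workload_aware = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1}
# A workload-oblivious plan (here: tuple id modulo 2) scatters them.
by_modulo = {t: t % 2 for t in range(1, 6)}

print(distributed_fraction(trace, workload_aware))  # 0.2
print(distributed_fraction(trace, by_modulo))       # 1.0
```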
![Page 38: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/38.jpg)
Benchmark Results: TPC
[Chart: % distributed transactions (0% to 100%) for Schism, Manual, Replication, and Hashing]
![Page 39: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/39.jpg)
Benchmark Results: Complex
[Chart: % distributed transactions (0% to 100%) for Schism, Manual, Replication, and Hashing]
![Page 40: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/40.jpg)
Implications for GraphLab (1)
• Shared architectural components for placement, migration, security, etc.
• Would be great to look at building a database-like store as a backing engine for GraphLab
![Page 41: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/41.jpg)
Implications for GraphLab (2)
• Data-driven partitioning
– Can co-locate data that is accessed together
• Edge weights can encode frequency of reads/writes from adjacent nodes
– Adaptively choose between replication and distributed depending on read/write frequency
– Requires a workload trace and periodic repartitioning
– If accesses are random, will not be a win
– Requires heuristics to deal with massive graphs, e.g., ideas from GraphBuilder
![Page 42: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/42.jpg)
Implications for GraphLab (3)
• Transactions and 2PC for serializability
– Acquire locks as data is accessed, rather than acquiring read/write locks on all neighbors in advance
– Introduces deadlock possibility
– Likely a win if adjacent updates are infrequent, or not all neighbors accessed on each iteration
– Could also be implemented using optimistic concurrency control schemes
![Page 43: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/43.jpg)
Schism
Automatically partitions OLTP databases as well as or better than experts
Graph partitioning combined with decision trees finds good partitioning plans for many applications
Suggests some interesting directions for distributed GraphLab; would be fun to explore!
![Page 44: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/44.jpg)
Graph Partitioning Time
![Page 45: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/45.jpg)
Collecting a Trace
Need trace of statements and transaction ids (e.g. MySQL general_log)
Extract read/write sets by rewriting statements into SELECTs
Can be applied offline: Some data lost
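One way to picture the rewriting step is a small pattern-based translator that turns simple UPDATE and DELETE statements into a SELECT of the affected rows' keys. This is only a sketch for single-table statements with a WHERE clause; the regular expressions, the `to_select` helper, and the default key column `id` are assumptions, and a real tool would need a proper SQL parser.

```python
import re

UPDATE_RE = re.compile(
    r"^\s*UPDATE\s+(\w+)\s+SET\s+.+?\s+WHERE\s+(.+?);?\s*$",
    re.IGNORECASE | re.DOTALL,
)
DELETE_RE = re.compile(
    r"^\s*DELETE\s+FROM\s+(\w+)\s+WHERE\s+(.+?);?\s*$",
    re.IGNORECASE | re.DOTALL,
)

def to_select(statement, key="id"):
    """Rewrite a statement into a SELECT that returns the keys it touches.

    Returns None for statements this sketch does not understand.
    """
    if statement.lstrip().upper().startswith("SELECT"):
        return statement.strip().rstrip(";")
    for pattern in (UPDATE_RE, DELETE_RE):
        match = pattern.match(statement)
        if match:
            table, predicate = match.groups()
            return f"SELECT {key} FROM {table} WHERE {predicate}"
    return None

print(to_select("UPDATE users SET salary = 15000 WHERE id = 2;"))
print(to_select("DELETE FROM users WHERE salary > 20000"))
```

Running the rewritten SELECT against the database yields the write set of the original statement. When this is done offline, rows that have since changed or been deleted may no longer match, which is presumably the data loss the slide refers to.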
![Page 46: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/46.jpg)
Effect of Latency
![Page 47: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/47.jpg)
Replicated Data
Read: Access the local copy
Write: Write all copies (distributed txn)
• Add n + 1 nodes for each tuple, where n = # of transactions accessing the tuple
• Connected as a star with edge weight = # of writes
Cut a replication edge: cost = # of writes
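The star construction can be sketched as below. Node naming (a center node plus one replica node per accessing transaction) is an assumption of this example; the edge weight and the cut cost follow the slide.

```python
def replication_star(tuple_id, accessing_txns, num_writes):
    """Expand one tuple into n + 1 nodes: a center plus one replica per txn.

    Every center-to-replica edge carries weight = number of writes to the tuple.
    """
    center = (tuple_id, "center")
    replicas = [(tuple_id, txn) for txn in accessing_txns]
    edges = {(center, replica): num_writes for replica in replicas}
    return [center] + replicas, edges

def replication_cost(assignment, edges):
    """Sum the weights of replication edges that cross partitions."""
    return sum(w for (a, b), w in edges.items()
               if assignment[a] != assignment[b])

# Tuple "user:1" is accessed by three transactions and written twice.
nodes, edges = replication_star("user:1", ["t0", "t1", "t4"], num_writes=2)
assignment = {node: 0 for node in nodes}
assignment[("user:1", "t4")] = 1   # place one replica on another partition
print(len(nodes), replication_cost(assignment, edges))  # 4 2
```

Because the partitioner pays the write count for every replica it splits off, rarely written tuples are cheap to replicate while frequently written ones tend to stay in one place.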
![Page 48: Schism: Graph Partitioning for OLTP Databases in a Relational Cloud Implications for the design of GraphLab](https://reader036.vdocument.in/reader036/viewer/2022062302/5681692d550346895de07076/html5/thumbnails/48.jpg)
Partitioning Advantages
Performance:
• Scale across multiple machines
• More performance per dollar
• Scale incrementally
Management:
• Partial failure
• Rolling upgrades
• Partial migrations