![Page 1: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/1.jpg)
The Art of Database Sharding
Maxym Kharchenko
Amazon.com
![Page 2: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/2.jpg)
Whoami
• Started as a database kernel developer
  – Network database: db_VISTA
• ORACLE DBA for ~10-12 years
  – Starting with ORACLE 8
• Last 3 years: Sr. Persistence Engineer @Amazon.com
• OCM, ORACLE Ace Associate
• Blog: http://intermediatesql.com
• Twitter: @maxymkh
![Page 3: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/3.jpg)
Agenda
• The “big data” scaling problem
• Solving scaling with “sharding”
• Practical sharding
• Your sharding experience: Good and bad
![Page 4: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/4.jpg)
How to scale a database
[Timeline figure, 2013 to 2017. Labels: Old System, New System, Problem]
![Page 5: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/5.jpg)
The Big Data problem
![Page 6: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/6.jpg)
Vertical Scaling
![Page 7: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/7.jpg)
Scaling Up …
![Page 8: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/8.jpg)
Scaling Up …
![Page 9: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/9.jpg)
Scaled!
![Page 10: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/10.jpg)
“Scaling up” math: System capabilities
2+2=3
![Page 11: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/11.jpg)
“Scaling up” math: System cost
2+2=7
![Page 12: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/12.jpg)
Scale out, not up
![Page 13: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/13.jpg)
Use lots of cheap machines
Not bigger machines
![Page 14: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/14.jpg)
Commodity hardware
=
$$$$$ $$
![Page 15: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/15.jpg)
Distributed System
![Page 16: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/16.jpg)
Distributed System
![Page 17: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/17.jpg)
Distributed System
![Page 18: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/18.jpg)
Distributed computing is hard
![Page 19: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/19.jpg)
Shared Nothing (“Sharded”) System
![Page 20: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/20.jpg)
Sharding is (relatively) easy
![Page 21: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/21.jpg)
Split your data into small independent chunks
And run each chunk on cheap commodity hardware
![Page 22: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/22.jpg)
How to split your data
Data
![Page 23: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/23.jpg)
How to split your data
![Page 24: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/24.jpg)
How to split your data
![Page 25: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/25.jpg)
How to split your data
![Page 26: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/26.jpg)
How to split your data
![Page 27: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/27.jpg)
Vertical Partitioning
![Page 28: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/28.jpg)
Vertical Partitioning
![Page 29: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/29.jpg)
Vertical Partitioning
![Page 30: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/30.jpg)
Horizontal Partitioning
![Page 31: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/31.jpg)
Sharding
![Page 32: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/32.jpg)
Sharding
CREATE TABLE books (
    id     number PRIMARY KEY,
    title  varchar2(200),
    author varchar2(200)
);
![Page 33: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/33.jpg)
Sharding
CREATE TABLE books (
    id     number PRIMARY KEY,
    title  varchar2(200),
    author varchar2(200)
) SHARD BY <method> (<shard_key>) (
    SPLIT SIZE evenly
    SPLIT LOAD evenly
    DISCOURAGE CROSS SHARD ACCESS
    DISCOURAGE DATA MOVE
    USING 4 DATABASES
);
![Page 34: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/34.jpg)
Split size evenly
SHARD BY LIST ( first_letter(author) ) ( SPLIT SIZE evenly);
A-G H-M N-T U-Z
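The `SHARD BY` clause on these slides reads as illustrative pseudo-syntax, so the application has to do the routing itself. A minimal Python sketch of list sharding by the author's first letter, assuming the four letter ranges shown above; the function names and the sample authors are made up for illustration. Counting rows per shard shows whether the split is actually even in size.

```python
from collections import Counter

# Letter ranges taken from the slide: A-G, H-M, N-T, U-Z.
LETTER_RANGES = [("A", "G"), ("H", "M"), ("N", "T"), ("U", "Z")]


def list_shard(author: str) -> int:
    """Return the index of the shard whose letter range holds the author's first letter."""
    first = author.strip()[:1].upper()
    for shard, (low, high) in enumerate(LETTER_RANGES):
        if low <= first <= high:
            return shard
    raise ValueError(f"no shard configured for author {author!r}")


def shard_sizes(authors):
    """Count rows per shard so that size skew becomes visible."""
    return Counter(list_shard(a) for a in authors)


# Hypothetical sample data, only to demonstrate the skew check.
sample = ["Isaac Asimov", "Arthur Clarke", "Ray Bradbury", "Ursula Le Guin", "Frank Herbert"]
sizes = shard_sizes(sample)
```

With real author data the four counts are unlikely to match, because first letters are not uniformly distributed; that is the risk the "split size evenly" goal points at.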
![Page 35: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/35.jpg)
Split load evenly
SHARD BY RANGE (id) ( SPLIT SIZE evenly SPLIT LOAD evenly);
1-100 101-200 201-300 301-400
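A hedged sketch of range routing for the four id ranges above, using `bisect` from the Python standard library. It assumes ids are handed out sequentially (the slide does not say so); under that assumption the ranges hold equal numbers of rows, but the newest rows all land in the last range, so load is not evenly split.

```python
from bisect import bisect_left

# Inclusive upper bound of each id range from the slide: 1-100, 101-200, 201-300, 301-400.
RANGE_UPPER_BOUNDS = [100, 200, 300, 400]


def range_shard(book_id: int) -> int:
    """Return the index of the range shard that contains book_id."""
    if book_id < 1:
        raise ValueError(f"id {book_id} is below the first range")
    shard = bisect_left(RANGE_UPPER_BOUNDS, book_id)
    if shard == len(RANGE_UPPER_BOUNDS):
        raise ValueError(f"id {book_id} is above the last range")
    return shard


# Assumption: the ten most recently inserted books have ids 391..400.
shards_hit_by_new_rows = {range_shard(i) for i in range(391, 401)}
```

Here `shards_hit_by_new_rows` contains only the last shard, which would take all of the insert traffic for fresh data.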
![Page 36: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/36.jpg)
Split load evenly
SHARD BY HASH (id) ( SPLIT SIZE evenly SPLIT LOAD evenly);
0 1 2 3
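A minimal sketch of hash routing into the four shards numbered 0 to 3 above. The slide does not name a hash function; `zlib.crc32` is used here as an assumption because it returns the same value in every process, unlike Python's built-in `hash()`, which is salted for strings.

```python
import zlib
from collections import Counter

NUM_SHARDS = 4  # shards 0..3, as on the slide


def hash_shard(book_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map an id to a shard number using a process-independent hash."""
    digest = zlib.crc32(str(book_id).encode("utf-8"))
    return digest % num_shards


# Assumption: ids 391..400 are the most recently inserted rows.
spread = Counter(hash_shard(i) for i in range(391, 401))
```

Printing `spread` shows how the newest ids scatter across shards instead of piling onto one range; the exact counts depend on the hash function chosen.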
![Page 37: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/37.jpg)
Discourage cross shard access
SHARD BY HASH (id) ( DISCOURAGE CROSS SHARD ACCESS);
SELECT title FROM books
WHERE id = 34567876;
![Page 38: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/38.jpg)
Discourage cross shard access
SHARD BY HASH (id) ( DISCOURAGE CROSS SHARD ACCESS);
SELECT title FROM books
WHERE author = 'Isaac Asimov'
ORDER BY title;
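When the table is sharded by `id`, a lookup by `id` touches one shard, but a query by `author` has to visit every shard and merge the partial results. The sketch below illustrates that difference with in-memory lists standing in for databases; the rows, the helper names, and the use of `id % 4` in place of a real hash are all assumptions made for brevity.

```python
import heapq

NUM_SHARDS = 4

# Hypothetical rows: (id, title, author).
BOOKS = [
    (1, "Foundation", "Isaac Asimov"),
    (2, "Dune", "Frank Herbert"),
    (3, "I, Robot", "Isaac Asimov"),
    (4, "The Caves of Steel", "Isaac Asimov"),
    (6, "Fahrenheit 451", "Ray Bradbury"),
]


def build_shards(rows, num_shards=NUM_SHARDS):
    """Distribute rows across shards by id (id % num_shards stands in for a hash)."""
    shards = [[] for _ in range(num_shards)]
    for row in rows:
        shards[row[0] % num_shards].append(row)
    return shards


def titles_by_id(shards, book_id):
    """Single-shard access: the shard key tells us exactly where to look."""
    shard = shards[book_id % len(shards)]
    return [title for (row_id, title, _author) in shard if row_id == book_id]


def titles_by_author(shards, author):
    """Cross-shard access: query every shard, sort locally, then merge the sorted lists."""
    partials = [
        sorted(title for (_id, title, row_author) in shard if row_author == author)
        for shard in shards
    ]
    return list(heapq.merge(*partials))


shards = build_shards(BOOKS)
asimov_titles = titles_by_author(shards, "Isaac Asimov")
```

The `ORDER BY` has to be finished in the application here, because no single shard sees all of the author's rows.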
![Page 39: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/39.jpg)
Discourage cross shard access
SHARD BY HASH (author) ( DISCOURAGE CROSS SHARD ACCESS);
0 1 2 3
SELECT title FROM books
WHERE author = 'Isaac Asimov'
ORDER BY title;
![Page 40: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/40.jpg)
Discourage data move
SHARD BY mod(hash(author), 4) ( DISCOURAGE DATA MOVE);
0 1 2 3
![Page 41: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/41.jpg)
Discourage data move
SHARD BY mod(hash_function(author), 6) ( DISCOURAGE DATA MOVE);
0 1 2 3
4 5
![Page 42: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/42.jpg)
Resharding

| Hash | Mod/4 | Mod/6 |
|------|-------|-------|
| 1    | 1     | 1     |
| 2    | 2     | 2     |
| 3    | 3     | 3     |
| 4    | 0     | 4     |
| 5    | 1     | 5     |
| 6    | 2     | 0     |
| 7    | 3     | 1     |
| 8    | 0     | 2     |
| 9    | 1     | 3     |
| 10   | 2     | 4     |
| 11   | 3     | 5     |
| 12   | 0     | 0     |
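The cost of changing the modulus can be counted directly. This short sketch reuses the hash values 1 to 12 from the table and lists the ones whose shard number differs between mod 4 and mod 6; the function name is made up for illustration.

```python
def moved_keys(hashes, old_shards, new_shards):
    """Return the hash values whose shard number changes after resharding."""
    return [h for h in hashes if h % old_shards != h % new_shards]


hashes = range(1, 13)  # the twelve hash values from the table
moved = moved_keys(hashes, old_shards=4, new_shards=6)
stayed = [h for h in hashes if h not in moved]
```

Eight of the twelve values (4 through 11) change shard and only 1, 2, 3 and 12 stay put, so going from 4 to 6 databases this way relocates two thirds of the data.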
![Page 43: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/43.jpg)
Physical and Logical shards
SHARD BY mod(hash(author), 1200) ( DISCOURAGE DATA MOVE);
DB 1 DB 2 DB 3 DB 4
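One way to read this slide: keys are hashed into a fixed set of 1200 logical shards, and a separate lookup table says which physical database holds each logical shard. Adding a database then means reassigning some logical shards rather than rehashing every key. The sketch below is an assumption about how such a table could be maintained; the even-share rebalancing rule and the function names are not from the slides.

```python
from collections import Counter

TOTAL_BUCKETS = 1200  # fixed number of logical shards, as on the slide


def initial_map(num_dbs):
    """Assign logical buckets to databases 1..num_dbs in equal contiguous runs.

    Assumes TOTAL_BUCKETS is divisible by num_dbs.
    """
    per_db = TOTAL_BUCKETS // num_dbs
    return {bucket: bucket // per_db + 1 for bucket in range(TOTAL_BUCKETS)}


def add_database(bucket_map, new_db):
    """Move an equal share of buckets from every existing database to new_db.

    Returns the list of buckets that moved; all other buckets stay in place.
    """
    dbs = sorted(set(bucket_map.values()))
    take_each = TOTAL_BUCKETS // (len(dbs) + 1) // len(dbs)
    moved = []
    for db in dbs:
        owned = [b for b, owner in bucket_map.items() if owner == db]
        for bucket in owned[:take_each]:
            bucket_map[bucket] = new_db
            moved.append(bucket)
    return moved


bucket_map = initial_map(4)                  # DB 1 .. DB 4 hold 300 buckets each
moved = add_database(bucket_map, new_db=5)   # hypothetical fifth database
counts = Counter(bucket_map.values())
```

In this sketch 240 of the 1200 buckets move and every database ends up with 240, while a key's logical bucket number never changes.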
![Page 44: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/44.jpg)
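Executing queries
def shard_query(sql, binds, shard_key):
    """Execute the query in the database that owns shard_key."""
    # hash() must return the same value in every process;
    # Python's built-in hash() is salted for strings, so use a stable hash here
    shard_hash = hash(shard_key)
    # Map the hash to a fixed logical bucket, then look up the physical DB for that bucket
    logical_bucket = shard_hash % TOTAL_BUCKETS
    physical_db = memcached_get_db(logical_bucket)
    return execute_query(physical_db, sql, binds)

SELECT title FROM books
WHERE author = 'Isaac Asimov'
ORDER BY title;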
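A self-contained version of the same routing idea that can be run as-is. Everything standing in for real infrastructure is an assumption: in-memory SQLite databases replace the physical shards, a plain dict replaces the memcached lookup, and `zlib.crc32` is the stable hash. Only the flow (hash the shard key, pick a logical bucket, look up the physical database, run the query there) follows the slide.

```python
import sqlite3
import zlib

TOTAL_BUCKETS = 1200
NUM_DBS = 4

# Stand-ins for the physical shards: four in-memory SQLite databases.
databases = {db: sqlite3.connect(":memory:") for db in range(1, NUM_DBS + 1)}
for conn in databases.values():
    conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, author TEXT)")

# Stand-in for the memcached lookup: logical bucket -> physical database.
bucket_to_db = {bucket: bucket % NUM_DBS + 1 for bucket in range(TOTAL_BUCKETS)}


def stable_hash(shard_key: str) -> int:
    """Hash that is identical across processes (unlike the built-in hash())."""
    return zlib.crc32(shard_key.encode("utf-8"))


def shard_query(sql, binds, shard_key):
    """Run sql in the database that owns shard_key and return all rows."""
    logical_bucket = stable_hash(shard_key) % TOTAL_BUCKETS
    physical_db = bucket_to_db[logical_bucket]
    return databases[physical_db].execute(sql, binds).fetchall()


# Writes and reads for one author are routed by the same shard key.
for book_id, title in enumerate(["I, Robot", "Foundation"], start=1):
    shard_query(
        "INSERT INTO books (id, title, author) VALUES (?, ?, ?)",
        (book_id, title, "Isaac Asimov"),
        shard_key="Isaac Asimov",
    )

rows = shard_query(
    "SELECT title FROM books WHERE author = ? ORDER BY title",
    ("Isaac Asimov",),
    shard_key="Isaac Asimov",
)
```

Because both inserts and the select use the author as the shard key, the query is answered by a single database and the `ORDER BY` runs entirely inside it.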
![Page 45: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/45.jpg)
Implementing Shards: Standbys
Unsharded Standby Shard 1 Shard 2
Apps
Read Only
Drop non-qualifying data Drop non-qualifying data
![Page 46: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/46.jpg)
Implementing Shards: Tables
Shard 1
Apps
TabA
Shard 2
MVA
TabA
Create materialized view … as select … from a@shard1
Drop materialized view … preserve table
Read Only
![Page 47: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/47.jpg)
Implementing Shards: Moving “data head”
Shard 1
Apps
Shard 2
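| Logical Shard | Physical Shard |
|---------------|----------------|
| (1,2,3,4)     | 1              |
| (5,6,7,8)     | 2              |

| Time | Logical Shard | Physical Shard |
|------|---------------|----------------|
| 2011 | (1,2,3,4)     | 1              |
| 2011 | (5,6,7,8)     | 2              |

| Time | Logical Shard | Physical Shard |
|------|---------------|----------------|
| 2011 | (1,2,3,4)     | 1              |
| 2011 | (5,6,7,8)     | 2              |
| 2012 | (1,2)         | 1              |
| 2012 | (3,4)         | 3              |
| 2012 | (5,6)         | 2              |
| 2012 | (7,8)         | 4              |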
Shard 3 Shard 4
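A possible reading of the "data head" tables: the mapping from logical to physical shard is versioned by time, so data written in 2011 stays where it is while data written in 2012 follows the wider four-shard layout. The lookup below is a sketch under that interpretation; the table contents come from the slide, while the function name and the assumption that each row carries its year are mine.

```python
# (year, logical shards covered, physical shard), copied from the slide's final table.
SHARD_MAP = [
    (2011, (1, 2, 3, 4), 1),
    (2011, (5, 6, 7, 8), 2),
    (2012, (1, 2), 1),
    (2012, (3, 4), 3),
    (2012, (5, 6), 2),
    (2012, (7, 8), 4),
]


def physical_shard(year: int, logical_shard: int) -> int:
    """Return the physical shard holding the given logical shard for data written in year."""
    for map_year, logical_shards, physical in SHARD_MAP:
        if map_year == year and logical_shard in logical_shards:
            return physical
    raise KeyError(f"no mapping for year={year}, logical shard={logical_shard}")


old_location = physical_shard(2011, 3)  # 2011 data for logical shard 3
new_location = physical_shard(2012, 3)  # 2012 data for the same logical shard
```

Under this reading no existing rows are copied when Shard 3 and Shard 4 are added; only new writes are directed to them.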
![Page 48: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/48.jpg)
Data protection
Shard 1 Shard 2 Shard 3 Shard 4
Stb 1 Stb 2 Stb 3 Stb 4
App App
![Page 49: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/49.jpg)
Why shards are awesome
• (potentially) Unlimited scaling
• Local ACID + relational
• Better maintenance
• Eggs not in one basket
• “Apples to apples comparison” with other shards
![Page 50: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/50.jpg)
Why shards are NOT so great
• More systems
  – Power, rack space etc.
  – Needs automation … bad
  – More likely to fail overall
• Some operations become difficult:
  – Transactions across shards
  – Foreign keys across shards
• More work:
  – Applications, developers, DBAs
  – High skill, DIY everything
![Page 51: The Art of Database Sharding](https://reader030.vdocument.in/reader030/viewer/2022033106/56812ba1550346895d8fcbb7/html5/thumbnails/51.jpg)
Takeaways
More > Bigger
ORACLE is still cool