Sharding Methods for MongoDB

Agenda
• Customer Stories
• Sharding for Performance/Scale
  – When to shard?
  – How many shards do I need?
• Types of Sharding
• How to Pick a Shard Key
• Sharding for Other Reasons

Customer Stories

Foursquare
• 50M users
• 6B check-ins to date (6M per day growth)
• 55M points of interest / venues
• 1.7M merchants using the platform for marketing
• Operations per second: 300,000
• Documents: 5.5B

Foursquare Clusters
• 11 MongoDB clusters
  – 8 are sharded
• Largest cluster has 15 shards (check-ins)
  – Sharded on user ID

CarFax
• Large data set

CarFax Shards
• 13 billion+ documents
  – 1.5 billion documents added every year
• 1 vehicle history report is > 200 documents
• 12 shards
• 9-node replica sets
• Replicas distributed across 3 data centers

What is Sharding?

Sharding Overview
[Diagram: the application talks through its driver to one or more query routers, which route operations to Shard 1, Shard 2, Shard 3, … Shard N. Each shard is a replica set with one primary and two secondaries.]

Scaling: Sharding
[Diagram in three steps: read/write scalability grows as the key range is split across more mongod instances.]
• 1 mongod: key range 0..100
• 2 mongod: key ranges 0..50 and 51..100
• 4 mongod: key ranges 0..25, 26..50, 51..75, and 76..100

How do I know I need to shard?

Does one server/replica set…
• Have enough disk space to store all my data? (server spec: disk capacity)
• Handle my query throughput (operations per second)? (server specs: disk IOPS, RAM, network)
• Respond to queries fast enough (latency)? (server specs: disk IOPS, RAM, network)

How many shards do I need?

Disk Space: How Many Shards Do I Need?
• Sum of disk space across shards must be greater than the required storage size
Example
Storage size = 3 TB
Server disk capacity = 2 TB
2 shards required
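
The disk-space rule comes down to a ceiling division. A minimal sketch in plain JavaScript (it should run in Node or in the mongo shell); `shardsForCapacity` is an illustrative helper name, not a MongoDB API, and it assumes every shard runs on identical servers.

```javascript
// Smallest number of shards whose combined capacity covers the requirement.
// Hypothetical helper for illustration; not part of MongoDB.
function shardsForCapacity(required, perServer) {
  if (perServer <= 0) throw new RangeError("perServer must be positive");
  return Math.max(1, Math.ceil(required / perServer));
}

// Slide example: 3 TB of data on servers with 2 TB of disk each.
const diskShards = shardsForCapacity(3, 2); // 2
```

When the requirement is an exact multiple of the server capacity, this result leaves no headroom, so in practice you would likely round up further.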

RAM: How Many Shards Do I Need?
• Working set should fit in RAM
  – Sum of RAM across shards > working set
• Working set = indexes plus the set of documents accessed frequently
• Working set in RAM
  – Lower latency
  – Higher throughput
• Measuring index size and working set
  – db.stats(): total index size for the database (db.collection.stats() for a single collection)
  – db.serverStatus({ workingSet: 1 }): working set size estimate
Example
Working set = 428 GB
Server RAM = 128 GB
428 / 128 = 3.34
4 shards required
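
One way to turn the working-set definition into a number is sketched below. It assumes a stats object shaped like the output of `db.stats()` (`indexSize` and `dataSize` in bytes) and a `hotFraction` that you would have to measure or guess; the function name and the sample figures are hypothetical, chosen only so the result matches the 428 GB example.

```javascript
// Estimate the working set as all indexes plus the "hot" share of the data.
// `stats` mirrors the indexSize/dataSize fields (bytes) reported by db.stats().
// `hotFraction` is an assumed input: the share of documents accessed frequently.
function estimateWorkingSetBytes(stats, hotFraction) {
  return stats.indexSize + stats.dataSize * hotFraction;
}

const GB = 1024 ** 3;

// Made-up numbers: 128 GB of indexes, 1,200 GB of data, 25% of it hot.
const sampleStats = { indexSize: 128 * GB, dataSize: 1200 * GB };
const workingSetBytes = estimateWorkingSetBytes(sampleStats, 0.25); // 428 GB

// Servers with 128 GB of RAM each: 428 / 128 = 3.34, rounded up.
const ramShards = Math.ceil(workingSetBytes / (128 * GB)); // 4
```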

Disk Throughput: How Many Shards Do I Need?
• Sum of IOPS across shards must be greater than the required IOPS
• IOPS are difficult to estimate
  – Update doc
  – Update indexes
  – Append to journal
  – Log entry?
• Best approach: build a prototype and measure
Example
Required IOPS = 11,000
Server disk IOPS = 5,000
3 shards required

OPS: How Many Shards Do I Need?
• S = ops/sec of a single server
• G = required ops/sec
• N = # of shards
• G = N * S * 0.7 (the 0.7 factor accounts for sharding overhead)
• N = G / (0.7 * S)
Example
S = 4,000
G = 10,000
N = 10,000 / (0.7 * 4,000) = 3.57
4 shards
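
The throughput formula can be checked with a few lines of JavaScript. This is a sketch only: `shardsForOps` is a made-up helper name, and the 0.7 efficiency default is the slide's rule of thumb for sharding overhead, not a measured constant.

```javascript
// N = G / (0.7 * S), rounded up to a whole shard.
// `efficiency` models sharding overhead; 0.7 is the deck's rule of thumb.
function shardsForOps(requiredOps, opsPerServer, efficiency = 0.7) {
  return Math.ceil(requiredOps / (efficiency * opsPerServer));
}

// Slide example: S = 4,000 and G = 10,000 give 3.57, so 4 shards.
const opsShards = shardsForOps(10000, 4000);
```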
Types of Sharding

Sharding Types
• Range
• Tag-Aware
• Hashed

Range Sharding
[Diagram: four mongod instances, each owning one key range: 0..25, 26..50, 51..75, and 76..100. Read/write scalability.]
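
To make the routing idea concrete, here is a small sketch that looks up the owner of a key in the slide's range table. It is a simulation in plain JavaScript, not how mongos is implemented; the shard names are placeholders, and the ranges are inclusive as drawn on the slide (real MongoDB chunks are half-open, with an inclusive lower bound and an exclusive upper bound).

```javascript
// Chunk table from the slide: each mongod owns one inclusive key range.
// Shard names are placeholders for illustration.
const chunks = [
  { min: 0, max: 25, shard: "mongod-1" },
  { min: 26, max: 50, shard: "mongod-2" },
  { min: 51, max: 75, shard: "mongod-3" },
  { min: 76, max: 100, shard: "mongod-4" },
];

// Return the shard owning `key`, or null if no range covers it.
function routeByRange(key) {
  const chunk = chunks.find((c) => key >= c.min && key <= c.max);
  return chunk ? chunk.shard : null;
}

// A range query only needs the shards whose ranges overlap [lo, hi].
function shardsForRange(lo, hi) {
  return chunks.filter((c) => c.max >= lo && c.min <= hi).map((c) => c.shard);
}

const ownerOf42 = routeByRange(42);        // "mongod-2"
const targets = shardsForRange(20, 30);    // ["mongod-1", "mongod-2"]
```

Because neighboring keys share a chunk, a range query touches only the few shards that overlap the requested range.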

Tag-Aware Sharding
[Diagram: four mongod instances carrying the shard tags Winter, Spring, Summer, and Fall.]
Tag ranges:
| Shard Tag | Start | End |
|-----------|--------|--------|
| Winter | 23 Dec | 21 Mar |
| Spring | 22 Mar | 21 Jun |
| Summer | 21 Jun | 23 Sep |
| Fall | 24 Sep | 22 Dec |

Hash Sharding
[Diagram: four mongod instances, each owning one hash range: 0000..4444, 4445..8000, 8001..aaaa, and aaab..ffff.]

Hashed Shard Key
• Pros:
  – Evenly distributed writes
• Cons:
  – Random data (and index) updates can be I/O intensive
  – Range-based queries turn into scatter-gather
[Diagram: mongos fanning out to Shard 1, Shard 2, Shard 3, … Shard N.]
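
The trade-off can be seen in a toy simulation. The sketch below uses 32-bit FNV-1a as a stand-in hash and a simple modulo to pick a shard; both are assumptions made for illustration, since MongoDB uses its own hash function and assigns hash ranges to chunks rather than taking a modulo.

```javascript
// 32-bit FNV-1a: a stand-in hash for illustration only.
// MongoDB's hashed indexes use a different hash function.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Pick one of `shardCount` shards from the hash of the key (simplified).
function routeByHash(key, shardCount) {
  return fnv1a(String(key)) % shardCount;
}

// Four consecutive keys sit side by side under range sharding,
// but their hashes scatter them across shards.
const keys = [1000, 1001, 1002, 1003];
const hashedShards = keys.map((k) => routeByHash(k, 4));
const distinctShards = new Set(hashedShards).size;
```

Consecutive inserts spread across shards, which is the write-distribution benefit. The same scattering is why a query for keys 1000 through 1003 can no longer be sent to a single shard.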

[Figure: Range sharding document distribution]
[Figure: Hashed sharding document distribution]


How Do I Pick a Shard Key?

Shard Key Characteristics
• A good shard key has:
  – Sufficient cardinality
  – Distributed writes
  – Targeted reads ("query isolation")
• Shard key should be in every query if possible
  – Scatter-gather otherwise
• Choosing a good shard key is important!
  – Affects performance and scalability
  – Changing it later is expensive

Low Cardinality Shard Key
• Induces "jumbo chunks"
• Example: boolean field
[Diagram: mongos in front of Shard 1, Shard 2, Shard 3, … Shard N, with a single chunk [ a, b ) sitting on one shard.]
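
Why low cardinality hurts can be shown with a short calculation: all documents with the same shard key value belong to the same chunk, so the number of distinct values caps how many shards can ever hold data. The helper name and the `active` field below are hypothetical.

```javascript
// Documents sharing a shard key value cannot be split across chunks,
// so distinct values put an upper bound on the number of useful shards.
function maxUsableShards(keyValues, shardCount) {
  const distinct = new Set(keyValues).size;
  return Math.min(distinct, shardCount);
}

// Hypothetical collection sharded on a boolean `active` field.
const activeFlags = [true, false, true, true, false, true];
const usableShards = maxUsableShards(activeFlags, 8); // 2 of 8 shards
```

With only two possible values, each chunk grows without being splittable, which is how the jumbo chunks arise.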

Ascending Shard Key
• Monotonically increasing shard key values cause "hot spots" on inserts
• Examples: timestamps, _id
[Diagram: mongos in front of Shard 1, Shard 2, Shard 3, … Shard N, with every insert going to the chunk [ ISODate(…), $maxKey ).]
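
The hot spot is easy to reproduce in a simulation. In this sketch the chunk boundaries, shard names, and insert timestamps are all made up; the point is only that every new timestamp is larger than the last boundary, so it falls in the final chunk.

```javascript
// Chunk lower bounds on a timestamp shard key; the last chunk runs to $maxKey.
// Boundaries and shard names are invented for illustration.
const lowerBounds = [
  { from: -Infinity, shard: "shard-1" },
  { from: Date.UTC(2014, 0, 1), shard: "shard-2" },
  { from: Date.UTC(2014, 3, 1), shard: "shard-3" },
  { from: Date.UTC(2014, 6, 1), shard: "shard-4" }, // [2014-07-01, $maxKey)
];

// The owner is the chunk with the greatest lower bound not above the timestamp.
function shardForTimestamp(ts) {
  let owner = lowerBounds[0].shard;
  for (const bound of lowerBounds) {
    if (ts >= bound.from) owner = bound.shard;
  }
  return owner;
}

// 1,000 inserts stamped one second apart, all newer than the last boundary.
const firstInsert = Date.UTC(2014, 8, 1);
const insertCounts = {};
for (let i = 0; i < 1000; i++) {
  const shard = shardForTimestamp(firstInsert + i * 1000);
  insertCounts[shard] = (insertCounts[shard] || 0) + 1;
}
// insertCounts now holds a single entry: one shard took every write.
```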
Reasons to Shard
• Scale
  – Data volume
  – Query volume
• Global deployment with local writes
  – Geography-aware sharding
• Tiered storage
• Fast backup restore

Global Deployment/Local Writes
[Diagram: replica set members spread across New York (NYC), London (LON), and Sydney (SYD). Each city hosts one primary and two secondaries.]

Tiered Storage
• Save hardware costs
• Put frequently accessed documents on fast servers
  – Infrequently accessed documents on less capable servers
• Use tag-aware sharding
[Diagram: four mongod instances; two tagged "Current" on SSD and two tagged "Archive" on HDD.]
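
The placement rule behind tiered storage can be sketched as a lookup from a date shard key to a tag, and from the tag to the shards that carry it. This is a plain JavaScript simulation under assumed names: the cutoff date and shard names are invented, and in a real cluster the tags and ranges would be configured with shell helpers such as `sh.addShardTag()` and `sh.addTagRange()` rather than with this code.

```javascript
// Tag ranges on a date shard key, half-open [min, max).
// The cutoff date and shard names are assumptions for this sketch.
const cutoff = Date.UTC(2014, 0, 1);
const tagRanges = [
  { min: -Infinity, max: cutoff, tag: "Archive" },
  { min: cutoff, max: Infinity, tag: "Current" },
];
const shardTags = {
  "shard-ssd-1": "Current",
  "shard-ssd-2": "Current",
  "shard-hdd-1": "Archive",
  "shard-hdd-2": "Archive",
};

// Shards eligible to store a document with the given date key.
function eligibleShards(dateMs) {
  const range = tagRanges.find((r) => dateMs >= r.min && dateMs < r.max);
  return Object.keys(shardTags).filter((s) => shardTags[s] === range.tag);
}

const recentHome = eligibleShards(Date.UTC(2014, 5, 15)); // the two SSD shards
const oldHome = eligibleShards(Date.UTC(2012, 5, 15));    // the two HDD shards
```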

Fast Restore
• 40 TB database
• 2 shards of 20 TB each
• Challenge
  – Cannot meet restore SLA after data loss
[Diagram: two mongod instances holding 20 TB each.]

Fast Restore
• 40 TB database
• 4 shards of 10 TB each
• Solution
  – Reduce the restore time by 50%
[Diagram: four mongod instances holding 10 TB each.]
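
The 50% figure follows if shards are restored in parallel, so that wall-clock time depends on the data held by one shard. A minimal sketch under that assumption; `restoreHours` and the 1 TB/hour throughput are hypothetical values picked to keep the arithmetic simple.

```javascript
// Wall-clock restore time when every shard is restored in parallel:
// it is bounded by the amount of data on a single shard.
// `tbPerHour` is an assumed per-shard restore throughput.
function restoreHours(totalTB, shardCount, tbPerHour) {
  return totalTB / shardCount / tbPerHour;
}

const twoShardHours = restoreHours(40, 2, 1);  // 20 hours
const fourShardHours = restoreHours(40, 4, 1); // 10 hours
const reduction = 1 - fourShardHours / twoShardHours; // 0.5
```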
Summary

Determining the # of Shards
• To determine the required # of shards, determine:
  – Storage requirements
  – Latency requirements
  – Throughput requirements
• Derive total:
  – Disk capacity
  – Disk throughput
  – RAM
• Calculate # of shards based upon individual server specs
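
Putting the steps together, one plausible reading is to size each resource separately and take the largest result. The sketch below does that with this deck's example numbers; the field names are illustrative, and taking the maximum is an assumption about how to combine the dimensions, not something the slides state.

```javascript
// Shards needed = the largest requirement across all resource dimensions.
// Field names are illustrative; the inputs reuse the deck's example numbers.
function shardsNeeded(required, server, opsEfficiency = 0.7) {
  const perDimension = {
    disk: Math.ceil(required.diskTB / server.diskTB),
    ram: Math.ceil(required.workingSetGB / server.ramGB),
    iops: Math.ceil(required.iops / server.iops),
    ops: Math.ceil(required.opsPerSec / (opsEfficiency * server.opsPerSec)),
  };
  return { perDimension, shards: Math.max(...Object.values(perDimension)) };
}

const plan = shardsNeeded(
  { diskTB: 3, workingSetGB: 428, iops: 11000, opsPerSec: 10000 },
  { diskTB: 2, ramGB: 128, iops: 5000, opsPerSec: 4000 }
);
// plan.perDimension: disk 2, ram 4, iops 3, ops 4
// plan.shards: 4
```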

Leverage Sharding For
• Scalability
• Geo-aware clusters
• Tiered storage
• Reduced backup restore times

Sharding: Where to go from here…
• MongoDB Manual: http://docs.mongodb.org/manual/sharding/
• Other Webinars:
  – How to Achieve Scale With MongoDB
• White Papers:
  – MongoDB Performance Best Practices
  – MongoDB Architecture Guide

Get Expert Advice on Scaling. For Free.
For a limited time, if you’re considering a commercial relationship with MongoDB, you can sign up for a free one-hour consult about scaling with one of our MongoDB engineers.
Sign up: http://bit.ly/1rkXcfN

Webinar Q&A
[email protected]
@jayrunkel
Stay tuned after the webinar and take our survey for your chance to win MongoDB swag.

Thank You