sharding
DESCRIPTION
TRANSCRIPT
![Page 1: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/1.jpg)
Software Engineer, 10gen
Derick Rethans
#MongoDBTokyo
Sharding
![Page 2: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/2.jpg)
Agenda
• Short history of scaling data• Why shard• MongoDB's approach• Architecture• Configuration• Mechanics• Solutions
![Page 3: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/3.jpg)
The story of scaling data
![Page 4: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/4.jpg)
1970 - 2000: Vertical Scalability (scale up)
![Page 5: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/5.jpg)
Visual representation of horizontal scaling
Google, ~2000: Horizontal Scalability (scale out)
![Page 6: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/6.jpg)
Data Store Scalability in 2005
• Custom Hardware– Oracle
• Custom Software– Facebook + MySQL
![Page 7: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/7.jpg)
Data Store Scalability Today
• MongoDB auto-sharding available in 2009
• A data store that is– Free– Publicly available– Open source
(https://github.com/mongodb/mongo)– Horizontally scalable– Application independent
![Page 8: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/8.jpg)
Why shard?
![Page 9: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/9.jpg)
Working Set Exceeds Physical Memory
![Page 10: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/10.jpg)
Read/Write Throughput Exceeds I/O
![Page 11: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/11.jpg)
MongoDB's approach to sharding
![Page 12: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/12.jpg)
Partition data based on ranges
• User defines shard key• Shard key defines range of data• Key space is like points on a line• Range is a segment of that line
![Page 13: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/13.jpg)
Distribute data in chunks across nodes
• Initially 1 chunk• Default max chunk size: 64mb• MongoDB automatically splits & migrates chunks when max reached
![Page 14: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/14.jpg)
MongoDB manages data
• Queries routed to specific shards
• MongoDB balances cluster
• MongoDB migrates data to new nodes
![Page 15: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/15.jpg)
MongoDB Auto-Sharding
• Minimal effort required– Same interface as single mongod
• Two steps– Enable Sharding for a database– Shard collection within database
![Page 16: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/16.jpg)
Architecture
![Page 17: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/17.jpg)
Data stored in shard
• Shard is a node of the cluster• Shard can be a single mongod or a replica
set
![Page 18: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/18.jpg)
• Config Server–Stores cluster chunk ranges and locations–Can have only 1 or 3 (production must
have 3)–Two phase commit (not a replica set)
Config server stores meta data
![Page 19: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/19.jpg)
MongoS manages the data
• Mongos–Acts as a router / balancer–No local data (persists to config database)–Can have 1 or many
![Page 20: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/20.jpg)
Sharding infrastructure
![Page 21: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/21.jpg)
Configuration
![Page 22: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/22.jpg)
Example cluster setup
• Don’t use this setup in production!- Only one Config server (No Fault Tolerance)- Shard not in a replica set (Low Availability)- Only one Mongos and shard (No Performance Improvement)- Useful for development or demonstrating configuration
mechanics
![Page 23: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/23.jpg)
Start the config server
• “mongod --configsvr”• Starts a config server on the default port
(27019)
![Page 24: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/24.jpg)
Start the mongos router
• “mongos --configdb <hostname>:27019”• For 3 config servers: “mongos --configdb
<host1>:<port1>,<host2>:<port2>,<host3>:<port3>”
• This is always how to start a new mongos, even if the cluster is already running
![Page 25: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/25.jpg)
Start the shard database
• “mongod --shardsvr”• Starts a mongod with the default shard port (27018)• Shard is not yet connected to the rest of the cluster• Shard may have already been running in production
![Page 26: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/26.jpg)
Add the shard
• On mongos: “sh.addShard(‘<host>:27018’)”• Adding a replica set:
“sh.addShard(‘<rsname>/<seedlist>’)• In 2.2 and later can use
sh.addShard(‘<host>:<port>’)
![Page 27: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/27.jpg)
Verify that the shard was added
• db.runCommand({ listshards:1 })• { "shards" : • [ { "_id”: "shard0000”, "host”: ”<hostname>:27018” } ],• "ok" : 1 }
![Page 28: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/28.jpg)
Enabling Sharding
• Enable sharding on a database– sh.enableSharding(“<dbname>”)
• Shard a collection with the given key– sh.shardCollection(“<dbname>.people”,
{“country”:1})• Use a compound shard key to prevent duplicates– sh.shardCollection(“<dbname>.cars”,
{“year”:1, ”uniqueid”:1})
![Page 29: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/29.jpg)
Shard Key
![Page 30: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/30.jpg)
Shard Key
• Choose a field common used in queries
• Shard key is immutable• Shard key values are immutable• Shard key requires index on fields contained in key
• Uniqueness of `_id` field is only guaranteed within individual shard
• Shard key limited to 512 bytes in size
![Page 31: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/31.jpg)
Shard Key Considerations
• Cardinality• Write distribution• Query isolation• Data Distribution
![Page 32: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/32.jpg)
Mechanics
![Page 33: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/33.jpg)
Partitioning
• Remember it's based on ranges
![Page 34: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/34.jpg)
Chunk is a section of the entire range
![Page 35: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/35.jpg)
Chunk splitting
• A chunk is split once it exceeds the maximum size• There is no split point if all documents have the same shard
key• Chunk split is a logical operation (no data is moved)• If split creates too large of a discrepancy of chunk count
across cluster a balancing round starts
![Page 36: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/36.jpg)
Balancing
• Balancer is running on mongos• Once the difference in chunks between the most
dense shard and the least dense shard is above the migration threshold, a balancing round starts
![Page 37: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/37.jpg)
Acquiring the Balancer Lock
• The balancer on mongos takes out a “balancer lock”
• To see the status of these locks:- use config- db.locks.find({ _id: “balancer” })
![Page 38: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/38.jpg)
Moving the chunk
• The mongos sends a “moveChunk” command to source shard
• The source shard then notifies destination shard• The destination claims the chunk shard-key range• Destination shard starts pulling documents from source
shard
![Page 39: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/39.jpg)
Committing Migration
• When complete, destination shard updates config server- Provides new locations of the chunks
![Page 40: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/40.jpg)
Cleanup
• Source shard deletes moved data- Must wait for open cursors to either close or time out- NoTimeout cursors may prevent the release of the lock
• Mongos releases the balancer lock after old chunks are deleted
![Page 41: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/41.jpg)
Routing Requests
![Page 42: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/42.jpg)
Cluster Request Routing
• Targeted Queries• Scatter Gather Queries• Scatter Gather Queries with Sort
![Page 43: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/43.jpg)
Cluster Request Routing: Targeted Query
![Page 44: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/44.jpg)
Routable request received
![Page 45: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/45.jpg)
Request routed to appropriate shard
![Page 46: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/46.jpg)
Shard returns results
![Page 47: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/47.jpg)
Mongos returns results to client
![Page 48: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/48.jpg)
Cluster Request Routing: Non-Targeted Query
![Page 49: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/49.jpg)
Non-Targeted Request Received
![Page 50: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/50.jpg)
Request sent to all shards
![Page 51: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/51.jpg)
Shards return results to mongos
![Page 52: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/52.jpg)
Mongos returns results to client
![Page 53: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/53.jpg)
Cluster Request Routing: Non-Targeted Query with Sort
![Page 54: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/54.jpg)
Non-Targeted request with sort received
![Page 55: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/55.jpg)
Request sent to all shards
![Page 56: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/56.jpg)
Query and sort performed locally
![Page 57: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/57.jpg)
Shards return results to mongos
![Page 58: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/58.jpg)
Mongos merges sorted results
![Page 59: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/59.jpg)
Mongos returns results to client
![Page 60: Sharding](https://reader031.vdocument.in/reader031/viewer/2022013102/54b795c94a795920598b4574/html5/thumbnails/60.jpg)
Software Engineer, 10gen
Derick Rethans
#MongoDBTokyo
Thank You