a journey through the mongodb internals? · pdf filea journey through the mongodb internals?...
TRANSCRIPT
![Page 1: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/1.jpg)
Engineering Team lead, MonboDB Inc, @christkv
Christian Amor Kvalheim
#nosql13
A Journey through the MongoDB Internals?
![Page 2: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/2.jpg)
A Journey through the MongoDB Internals?
Who Am I• Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver • Started working for MongoDB Inc June 2011 • @christkv
![Page 3: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/3.jpg)
A Journey through the MongoDB Internals?
Why Pop the Hood?• Understanding data safety • Estimating RAM / disk requirements • Optimizing performance
![Page 4: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/4.jpg)
Storage Layout
![Page 5: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/5.jpg)
drwxr-xr-x 136 Nov 19 10:12 journal -rw------- 16777216 Oct 25 14:58 test.0 -rw------- 134217728 Mar 13 2012 test.1 -rw------- 268435456 Mar 13 2012 test.2 -rw------- 536870912 May 11 2012 test.3 -rw------- 1073741824 May 11 2012 test.4 -rw------- 2146435072 Nov 19 10:14 test.5 -rw------- 16777216 Nov 19 10:13 test.ns
Directory Layout
A Journey through the MongoDB Internals?
![Page 6: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/6.jpg)
Directory Layout
• Aggressive pre-allocation (always 1 spare file) • There is one namespace file per db which can
hold 18000 entries per default • A namespace is a collection or an index
A Journey through the MongoDB Internals?
![Page 7: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/7.jpg)
Tuning with Options
• Use --directoryperdb to separate dbs into own folders which allows to use different volumes (isolation, performance)
• Use --smallfiles to keep data files smaller • If using many databases, use –nopreallocate and --
smallfiles to reduce storage size • If using thousands of collections & indexes,
increase namespace capacity with --nssize
A Journey through the MongoDB Internals?
![Page 8: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/8.jpg)
Internal Structure
![Page 9: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/9.jpg)
Internal File Format
A Journey through the MongoDB Internals?
![Page 10: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/10.jpg)
Extent Structure
A Journey through the MongoDB Internals?
![Page 11: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/11.jpg)
Extents and Records
Journaling and Storage
![Page 12: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/12.jpg)
To Sum Up: Internal File Format
• Files on disk are broken into extents which contain the documents
• A collection has one or more extents • Extent grow exponentially up to 2GB • Namespace entries in the ns file point to the first
extent for that collection
A Journey through the MongoDB Internals?
![Page 13: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/13.jpg)
What About Indexes?
![Page 14: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/14.jpg)
Indexes
• Indexes are BTree structures serialized to disk • They are stored in the same files as data but using
own extents
A Journey through the MongoDB Internals?
![Page 15: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/15.jpg)
The DB Stats
A Journey through the MongoDB Internals?
> db.stats() { "db" : "test", "collections" : 22, "objects" : 17000383, ## number of documents "avgObjSize" : 44.33690276272011, "dataSize" : 753744328, ## size of data "storageSize" : 1159569408, ## size of all containing extents "numExtents" : 81, "indexes" : 85, "indexSize" : 624204896, ## separate index storage size "fileSize" : 4176478208, ## size of data files on disk "nsSizeMB" : 16, "ok" : 1 }
![Page 16: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/16.jpg)
The Collection Stats
A Journey through the MongoDB Internals?
> db.large.stats() { "ns" : "test.large", "count" : 5000000, ## number of documents "size" : 280000024, ## size of data "avgObjSize" : 56.0000048, "storageSize" : 409206784, ## size of all containing extents "numExtents" : 18, "nindexes" : 1, "lastExtentSize" : 74846208, "paddingFactor" : 1, ## amount of padding "systemFlags" : 0, "userFlags" : 0, "totalIndexSize" : 162228192, ## separate index storage size "indexSizes" : { "_id_" : 162228192 }, "ok" : 1 }
![Page 17: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/17.jpg)
What’s Memory Mapping?
![Page 18: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/18.jpg)
Memory Mapped Files
• All data files are memory mapped to Virtual Memory by the OS
• MongoDB just reads / writes to RAM in the filesystem cache
• OS takes care of the rest!
• Virtual process size = total files size + overhead (connections, heap)
• If journal is on, the virtual size will be roughly doubled
A Journey through the MongoDB Internals?
![Page 19: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/19.jpg)
Virtual Address Space
A Journey through the MongoDB Internals?
![Page 20: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/20.jpg)
Memory Map, Love It or Hate It
• Pros: – No complex memory / disk code in MongoDB, huge win! – The OS is very good at caching for any type of storage – Least Recently Used behavior – Cache stays warm across MongoDB restarts
• Cons: – RAM usage is affected by disk fragmentation – RAM usage is affected by high read-ahead – LRU behavior does not prioritize things (like indexes)
A Journey through the MongoDB Internals?
![Page 21: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/21.jpg)
How Much Data is in RAM?• Resident memory the best indicator of how much
data in RAM • Resident is: process overhead (connections, heap) +
FS pages in RAM that were accessed • Means that it resets to 0 upon restart even though
data is still in RAM due to FS cache • Use free command to check on FS cache size • Can be affected by fragmentation and read-ahead
A Journey through the MongoDB Internals?
![Page 22: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/22.jpg)
Journaling
![Page 23: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/23.jpg)
The Problem
Changes in memory mapped files are not applied in order and different parts of the file can be from different points in time
!
You want a consistent point-in-time snapshot when restarting after a crash
A Journey through the MongoDB Internals?
![Page 24: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/24.jpg)
?
![Page 25: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/25.jpg)
corruption
![Page 26: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/26.jpg)
Solution – Use a Journal
• Data gets written to a journal before making it to the data files
• Operations written to a journal buffer in RAM that gets flushed every 100ms by default or 100MB
• Once the journal is written to disk, the data is safe • Journal prevents corruption and allows durability • Can be turned off, but don’t!
A Journey through the MongoDB Internals?
![Page 27: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/27.jpg)
Journal Format
A Journey through the MongoDB Internals?
![Page 28: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/28.jpg)
Can I Lose Data on a Hard Crash?
• Maximum data loss is 100ms (journal flush). This can be reduced with –journalCommitInterval
• For durability (data is on disk when ack’ed) use the JOURNAL_SAFE write concern (“j” option).
• Note that replication can reduce the data loss further. Use the REPLICAS_SAFE write concern (“w” option).
• As write guarantees increase, latency increases. To maintain performance, use more connections!
A Journey through the MongoDB Internals?
![Page 29: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/29.jpg)
What is the Cost of a Journal?
• On read-heavy systems, no impact • Write performance is reduced by 5-30% • If using separate drive for journal, as low as 3% • For apps that are write-heavy (1000+ writes per
server) there can be slowdown due to mix of journal and data flushes. Use a separate drive!
A Journey through the MongoDB Internals?
![Page 30: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/30.jpg)
Fragmentation
![Page 31: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/31.jpg)
A Journey through the MongoDB Internals?
What it Looks Like
Both on disk and in RAM!
![Page 32: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/32.jpg)
Fragmentation
• Files can get fragmented over time if remove() and update() are issued.
• It gets worse if documents have varied sizes • Fragmentation wastes disk space and RAM • Also makes writes scattered and slower • Fragmentation can be checked by comparing size
to storageSize in the collection’s stats.
A Journey through the MongoDB Internals?
![Page 33: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/33.jpg)
How to Combat Fragmentation
• compact command (maintenance op) • Normalize schema more (documents don’t grow) • Pre-pad documents (documents don’t grow) • Use separate collections over time, then use
collection.drop() instead of collection.remove(query) • --usePowerOf2sizes option makes disk buckets
more reusable
A Journey through the MongoDB Internals?
![Page 34: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/34.jpg)
In Review
![Page 35: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/35.jpg)
In Review
• Understand disk layout and footprint • See how much data is actually in RAM • Memory mapping is cool • Answer how much data is ok to lose • Check on fragmentation and avoid it • https://github.com/10gen-labs/storage-viz
A Journey through the MongoDB Internals?
![Page 36: A Journey through the MongoDB Internals? · PDF fileA Journey through the MongoDB Internals? Who Am I • Driver Engineering lead at MongoDB Inc • Work on the Mongo DB Node.js driver](https://reader033.vdocument.in/reader033/viewer/2022051508/5a79a5f77f8b9ade698dd2ae/html5/thumbnails/36.jpg)
Engineering Team lead, MongoDB Inc, @christkv
Christian Amor Kvalheim
#nosql13
Questions?