SLM-DB: Single Level Merge Key-Value Store with Persistent Memory
Olzhas Kaiyrakhmet, Songyi Lee, Beomseok Nam, Sam H. Noh, Young-ri Choi
Outline
• Background
• Contributions
• Architecture
• Evaluation
• Conclusion
FAST 2019 2
Key-Value Databases
Key: Value
“100”: {[Green, Word, Gates]}
“html_doc”: <html><head>…..</body></html>
“linux_logo”: [image]
Log-Structured Merge (LSM) Tree
• Optimized for heavy write application usage
• Designed for slow hard drives
[Figure: C0 is an in-memory buffer in Memory; C1 … CK are on Disk; each component is merged into the next; components are sorted]
LSM-tree: disadvantages
[Figure: Get(key) triggers Search(key) across the components C0 (Memory) and C1 … CK (Disk), which are linked by merges]
• Large overhead to locate needed data
• High disk write amplification
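Both costs can be made concrete with a toy model. The following is a minimal sketch, not LevelDB or the authors' code: the class name `TinyLSM`, its parameters, and the capacity rule are all assumptions chosen for illustration. It counts how many components a lookup searches and how many pairs are written to disk in total.

```python
from bisect import bisect_left


class TinyLSM:
    """Toy LSM-tree (hypothetical): C0 is a dict, C1..CK are sorted lists."""

    def __init__(self, c0_limit=2, fanout=2, levels=3):
        self.c0 = {}
        self.c0_limit = c0_limit
        self.fanout = fanout
        self.disk = [[] for _ in range(levels)]  # disk[0] plays the role of C1
        self.pairs_written = 0  # pairs written to disk, rewrites included

    def put(self, key, value):
        self.c0[key] = value
        if len(self.c0) >= self.c0_limit:
            self._merge_into(0, sorted(self.c0.items()))
            self.c0 = {}

    def _merge_into(self, level, incoming):
        merged = dict(self.disk[level])
        merged.update(incoming)            # newer pairs win
        run = sorted(merged.items())
        self.pairs_written += len(run)     # the whole component is rewritten
        # Assumed capacity rule: each component is `fanout` times larger.
        capacity = self.c0_limit * self.fanout ** (level + 1)
        if len(run) > capacity and level + 1 < len(self.disk):
            self.disk[level] = []
            self._merge_into(level + 1, run)
        else:
            self.disk[level] = run

    def get(self, key):
        """Return (value, number_of_components_searched)."""
        searched = 1
        if key in self.c0:
            return self.c0[key], searched
        for run in self.disk:
            searched += 1
            i = bisect_left(run, (key,))   # binary search inside one component
            if i < len(run) and run[i][0] == key:
                return run[i][1], searched
        return None, searched


db = TinyLSM()
for k in range(8):
    db.put(k, f"v{k}")
print(db.pairs_written)   # disk writes for 8 user puts
print(db.get(0))          # value plus components searched
```

In this toy run, 8 user puts cause 20 pair writes to disk, and looking up key 0 searches three components before finding it.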
State-of-the-art LSM-tree: LevelDB
• Application
• Memory: MemTable and Immutable MemTable
• MemTable: in-memory skiplist to buffer updates; marked immutable when becoming full
• WAL: Write-Ahead Log (no fsync)
• MANIFEST: stores file organization and metadata
• Flush
• Sequential write to the disk
• Disk: Sorted String Tables (SST) in Level 0, Level 1, Level 2
• Each level is 10x larger than the previous
• Compaction: merge from Level N to Level N+1
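The 10x growth rule determines how many levels a store needs, and therefore how many merges a pair may pass through. A small arithmetic sketch follows; the 10 MB base size and the function names are assumptions for illustration, not values taken from the slide.

```python
def level_capacities(base_mb, levels, growth=10):
    """Capacity of each level when every level is `growth` times the previous."""
    return [base_mb * growth ** i for i in range(levels)]


def levels_needed(data_mb, base_mb, growth=10):
    """Smallest number of levels whose combined capacity holds `data_mb`."""
    levels, total, capacity = 0, 0, base_mb
    while total < data_mb:
        total += capacity
        capacity *= growth
        levels += 1
    return levels


# Hypothetical base size of 10 MB for the first sized level.
print(level_capacities(10, 4))       # [10, 100, 1000, 10000]
print(levels_needed(100_000, 10))    # levels needed for about 100 GB of data
```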
LSM-tree optimizations
• Improve parallelism:
  • RocksDB (Facebook)
  • HyperLevelDB
• Reduce write amplification:
  • PebblesDB (SOSP ’17)
• Optimize for hardware (SSD):
  • VT-tree (FAST ’13)
  • WiscKey (FAST ’16)
New era
[Figure: speed axis from fast to slow; labels: Byte addressable, Persistent storage, Persistent Memory]
Simple approach
[Figure: the same multi-component LSM-tree (C0, C1, …, CK, linked by merges) with Disk on one side and Memory replaced by Persistent Memory on the other]
Our approach
[Figure: C0 and an Index on the memory side, with Memory replaced by Persistent Memory; C1 on Disk; C0 merges into C1, and C1 merges with itself]
• Single disk component C1 that does self-merging
• B+-tree to manage data stored in the disk
Single-Level Merge DB (SLM-DB)
• Application
• Persistent Memory: MemTable, Immutable MemTable, Global B+-Tree
• Disk: Level 0 (… Data), MANIFEST
• No WAL
• MANIFEST: similar to LevelDB
• Global B+-Tree: per-key index of the data stored in the disk
• Flush
• Compaction: select candidate files and merge them together
• Level 0: single level of SST files
Contributions
Persistent MemTable
• No Write-Ahead Logging (WAL)
• Stronger consistency compared to LevelDB
Persistent B+-tree Index
• Per-key index for fast search
• No multi-leveled merge structure
Selective Compaction
• Live-key ratio of a Sorted-String Table
• Leaf node scan in the B+-tree
• Degree of sequentiality per range query
Persistent MemTable
[Figure: skiplist over keys 0 1 2 3 5 6 7 8 9, with one part labeled “Consistency guaranteed” and the other “No consistency guaranteed”]
• Recoverable after failure
Insert into Persistent MemTable
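(1) Create node (key 4)
(2) Assign next pointer and clflush()
(3) Atomically change next pointer
[Figure: skiplist over keys 0 1 2 3 5 6 7 8 9, with one part labeled “Consistency guaranteed” and the other “No consistency guaranteed”; the new node 4 is linked in between 3 and 5]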
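The three steps can be sketched on a single sorted linked list. This is an illustrative model only: Python has no `clflush()`, so the hypothetical `persist` helper merely records which node would be flushed, and the final flush of the predecessor after step (3) is an assumption rather than something the slide states.

```python
class Node:
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.next = None


def persist(node, log):
    """Hypothetical stand-in for clflush() plus a fence; only records the call."""
    log.append(node.key)


def insert(head, key, value, log):
    # Find the predecessor in the sorted list.
    pred = head
    while pred.next is not None and pred.next.key < key:
        pred = pred.next
    node = Node(key, value)   # (1) create node
    node.next = pred.next     # (2) assign its next pointer ...
    persist(node, log)        #     ... and flush the new node
    pred.next = node          # (3) one pointer store makes the node reachable
    persist(pred, log)        # assumption: flush the changed pointer as well
    return node


def keys(head):
    out, cur = [], head.next
    while cur is not None:
        out.append(cur.key)
        cur = cur.next
    return out


head, log = Node(None, None), []   # head is a sentinel without a key
for k in [0, 1, 2, 3, 5, 6, 7, 8, 9]:
    insert(head, k, str(k), log)
insert(head, 4, "4", log)
print(keys(head))
```

Because the new node is fully written before it becomes reachable, a crash before step (3) leaves the old list intact, and a crash after it leaves the new list intact.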
Flush
• Key-Index insertion into B+-tree happens during Immutable MemTable flush to disk
• FAST-FAIR B+-tree (Hwang et al., FAST ’18)
• Flush steps: File creation, Index insertion, Save to MANIFEST
[Figure: timeline (Time axis) of a flush, showing the file creation thread and the B+-tree insertion thread]
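The flush order (file creation, index insertion, save to MANIFEST) can be sketched as follows. This is a simplified single-threaded model under stated assumptions: a plain `dict` stands in for the FAST-FAIR B+-tree, in-memory lists stand in for SST files, and the names `flush`, `get`, and the `("add_file", id)` record are hypothetical.

```python
def flush(immutable_memtable, file_id, files, index, manifest):
    # File creation: write the pairs in key order as one sorted file.
    table = sorted(immutable_memtable.items())
    files[file_id] = table
    # Index insertion: point every key at its newest location.
    for offset, (key, _value) in enumerate(table):
        index[key] = (file_id, offset)
    # Save to MANIFEST: record the file after its keys are indexed.
    manifest.append(("add_file", file_id))


def get(key, files, index):
    """Look up a key through the per-key index instead of searching levels."""
    location = index.get(key)
    if location is None:
        return None
    file_id, offset = location
    return files[file_id][offset][1]


files, index, manifest = {}, {}, []
flush({3: "c", 1: "a"}, 0, files, index, manifest)
flush({1: "A"}, 1, files, index, manifest)
print(get(1, files, index), get(3, files, index))
```

The older copy of key 1 stays in file 0 but is no longer referenced by the index, which is what later makes it garbage.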
Why do we need compaction?
File#0: 1 10 17
File#1: 11 13 19
File#2: 6 14 35
File#3 (new file): 1 11 14
File#4 (new file): 12 17 35
[Legend: valid KV pair / obsolete KV pair]
• KV pairs became obsolete
• Need garbage collection (GC)
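Which pairs are obsolete can be derived from the file contents alone, assuming a higher file number means a newer file (an assumption; the slide only marks File#3 and File#4 as new). A minimal sketch with a hypothetical helper `obsolete_keys`:

```python
# Keys per file, as listed on the slide.
files = {
    0: [1, 10, 17],
    1: [11, 13, 19],
    2: [6, 14, 35],
    3: [1, 11, 14],    # new file
    4: [12, 17, 35],   # new file
}


def obsolete_keys(files):
    """Map each file to the keys whose newest copy lives in another file."""
    newest = {}
    for file_id in sorted(files):          # assumption: larger id is newer
        for key in files[file_id]:
            newest[key] = file_id
    return {
        file_id: [k for k in keys if newest[k] != file_id]
        for file_id, keys in files.items()
    }


print(obsolete_keys(files))
```

Under that assumption, File#0 holds two obsolete pairs (1 and 17), File#1 one (11), and File#2 two (14 and 35), all of which occupy disk space until they are collected.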
Why else?
File#0: 1 10 17
File#1: 11 13 19
File#2: 6 14 35
File#3: 14 21 32
File#4: 2 8 17
RangeQuery(5, 12)
• Need to improve sequentiality
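The cost of the range query can be counted directly from the file contents. The sketch below assumes inclusive bounds and that a higher file number is newer; a `dict` stands in for the B+-tree, and `range_hits` is a hypothetical helper.

```python
# Keys per file, as listed on the slide.
files = {
    0: [1, 10, 17],
    1: [11, 13, 19],
    2: [6, 14, 35],
    3: [14, 21, 32],
    4: [2, 8, 17],
}


def range_hits(files, lo, hi):
    """Return (key, file) for every live key in [lo, hi], in key order."""
    index = {}
    for file_id in sorted(files):          # assumption: larger id is newer
        for key in files[file_id]:
            index[key] = file_id
    return [(key, index[key]) for key in sorted(index) if lo <= key <= hi]


hits = range_hits(files, 5, 12)
print(hits)
print(len({file_id for _key, file_id in hits}), "files read")
```

Here the four keys in the range (6, 8, 10, 11) live in four different files, so a short scan turns into four separate file reads.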
Selective compaction
• Selectively pick SSTable files
• Mark those files as compaction candidates
• Merge the most-overlapping compaction candidates together
• Selection schemes for compaction candidates:
  o Live-key ratio selection of an SSTable (for GC)
  o Leaf node scans in the B+-tree (for sequentiality) [see paper]
  o Degree of sequentiality per range query (for sequentiality) [see paper]
Live-key ratio selection
• To collect garbage
• If live (valid) to total key ratio is below threshold, then add to candidates
• Ratio threshold: 50%
[Figure: a PM B+-tree in PM indexes the files on Disk; a “Compaction Candidates” area; legend: valid KV pair / obsolete KV pair]
File 1 (1 3 5): ratio 66.6%
File 2 (1 2 4): ratio 0.0%
File 3 (2 6 7): ratio 33.3%
File 4 (1 2 4): ratio 100.0%
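The ratio can be maintained incrementally: whenever the index is redirected to a newer copy of a key, the file that held the old copy loses one live key. The sketch below uses hypothetical file names and keys rather than the slide's figure, and the class `LiveKeyTracker` with a `dict` index is an assumption, not the paper's data structure.

```python
class LiveKeyTracker:
    """Tracks live/total keys per file and collects compaction candidates."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.index = {}          # key -> file holding the live copy
        self.total = {}          # file -> keys written to it
        self.live = {}           # file -> keys still live
        self.candidates = set()

    def ratio(self, file_id):
        return self.live[file_id] / self.total[file_id]

    def add_file(self, file_id, keys):
        self.total[file_id] = len(keys)
        self.live[file_id] = len(keys)
        for key in keys:
            old = self.index.get(key)
            self.index[key] = file_id
            if old is not None:
                self.live[old] -= 1                  # old copy became obsolete
                if self.ratio(old) < self.threshold:
                    self.candidates.add(old)         # below threshold: candidate


tracker = LiveKeyTracker(threshold=0.5)
tracker.add_file("a", [1, 3, 5])    # hypothetical files and keys
tracker.add_file("b", [2, 4, 6])
tracker.add_file("c", [2, 4, 7])    # overwrites keys 2 and 4 of file "b"
print(tracker.ratio("b"), tracker.candidates)
```

File "b" drops to one live key out of three, falls below the 50% threshold, and becomes the only compaction candidate.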
Compaction
• Compaction triggered when there are too many compaction candidate files
[Figure: files File#0 to File#6, some of them marked as compaction candidate files]
[Figure: timeline (Time axis) with two threads]
• Pick, then Merge
• File creation thread: File#7 creation, File#8 creation
• B+-tree insertion thread: Index File#7, Index File#8
• Save to MANIFEST
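The two-thread timeline can be sketched with a queue between a file creation thread and an index insertion thread, so that indexing one output file overlaps with creating the next. This is a toy model under stated assumptions: in-memory lists stand in for SST files, a `dict` stands in for the B+-tree, and the function name, parameters, and MANIFEST record format are hypothetical.

```python
import queue
import threading


def compact(files, index, candidates, first_new_id, pairs_per_file, manifest):
    # Pick + merge: keep only pairs the index still points at, in key order.
    live = sorted(
        (key, value)
        for file_id in candidates
        for key, value in files[file_id]
        if index.get(key) == file_id
    )
    created = queue.Queue()
    new_ids = []

    def create_files():
        for n, start in enumerate(range(0, len(live), pairs_per_file)):
            file_id = first_new_id + n
            files[file_id] = live[start:start + pairs_per_file]
            new_ids.append(file_id)
            created.put(file_id)       # hand the finished file to the indexer
        created.put(None)              # signal: no more files

    def insert_into_index():
        while True:
            file_id = created.get()
            if file_id is None:
                break
            for key, _value in files[file_id]:
                index[key] = file_id   # redirect the key to the new file

    workers = [threading.Thread(target=create_files),
               threading.Thread(target=insert_into_index)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    for file_id in candidates:         # old files are no longer referenced
        del files[file_id]
    manifest.append(("compaction", sorted(candidates), new_ids))


files = {0: [(1, "a"), (10, "b")], 1: [(1, "A"), (11, "c")], 2: [(10, "B"), (12, "d")]}
index = {1: 1, 10: 2, 11: 1, 12: 2}
manifest = []
compact(files, index, [0, 1], first_new_id=7, pairs_per_file=1, manifest=manifest)
print(sorted(files), index, manifest)
```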
General operations
• Put
• Put if exists / Put if not exists
• Get
• Scan
Put(key, value)
[Figure: Client, MemTable, Immutable MemTable and B+-tree Index in PM; Files on Disk; a K/V pair and a separate K are shown moving through the components]
B+-tree Index
Files
ImmutableMemTable
MemTable
Put(key, value) if exists/if not exists
FAST 2019 25
Disk PM
ClientK V
Query
Make sure if statement is
fulfilled before Put()
![Page 70: SLM-DB: Single Level Merge Key-Value Store with Persistent ... · Merge from Level N to Level N+1 Flush WAL Write-Ahead-MANIFEST Log (no fsync) Store file organization and metadata](https://reader034.vdocument.in/reader034/viewer/2022042621/5f75f8c112402753b72cc5a6/html5/thumbnails/70.jpg)
B+-tree Index
Files
ImmutableMemTable
MemTable
Put(key, value) if exists/if not exists
FAST 2019 25
Disk PM
ClientK V
Query
Make sure if statement is
fulfilled before Put()
Statement is true
![Page 71: SLM-DB: Single Level Merge Key-Value Store with Persistent ... · Merge from Level N to Level N+1 Flush WAL Write-Ahead-MANIFEST Log (no fsync) Store file organization and metadata](https://reader034.vdocument.in/reader034/viewer/2022042621/5f75f8c112402753b72cc5a6/html5/thumbnails/71.jpg)
Put(key, value) if exists/if not exists
• Make sure the if statement is fulfilled before Put()
• Query for the key first; in the example the statement is true
[Diagram: Client issues K, V · MemTable · Immutable MemTable · B+-tree Index · Files · regions labeled Disk and PM]
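The conditional Put can be sketched as a query followed by a guarded write. This is only an illustration: `put_if`, `must_exist`, and the plain dicts standing in for the MemTable and the B+-tree index are hypothetical, and the sketch ignores concurrency.

```python
def key_exists(key, memtable, index):
    """Query step: the key may be buffered or already indexed."""
    return key in memtable or key in index


def put_if(key, value, memtable, index, must_exist):
    """Write only when the 'if exists' / 'if not exists' statement holds."""
    if key_exists(key, memtable, index) != must_exist:
        return False          # statement is false, nothing is written
    memtable[key] = value     # statement is true, proceed with the Put
    return True


memtable = {}
index = {"a": (0, 0)}         # key "a" already lives in file 0, offset 0
updated = put_if("a", 10, memtable, index, must_exist=True)
rejected = put_if("b", 20, memtable, index, must_exist=True)
inserted = put_if("b", 20, memtable, index, must_exist=False)
```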
![Page 78: SLM-DB: Single Level Merge Key-Value Store with Persistent ... · Merge from Level N to Level N+1 Flush WAL Write-Ahead-MANIFEST Log (no fsync) Store file organization and metadata](https://reader034.vdocument.in/reader034/viewer/2022042621/5f75f8c112402753b72cc5a6/html5/thumbnails/78.jpg)
Get(key)
• Query for key K
• Key exists, so value V is returned to the Client
[Diagram: Client · MemTable · Immutable MemTable · B+-tree Index · Files · regions labeled Disk and PM]
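A rough sketch of the lookup, assuming the buffers are checked before the index and that the index maps a key directly to its file and offset. The function name, argument layout, and dict-based index are illustrative only.

```python
def get(key, memtable, immutable, index, files):
    """Return the value for key, or None when the key is absent."""
    # Newest data first: the mutable buffer, then the frozen one.
    for table in (memtable, immutable):
        if table is not None and key in table:
            return table[key]
    # Otherwise ask the index where the key lives on disk.
    location = index.get(key)
    if location is None:
        return None
    file_no, offset = location
    _, value = files[file_no][offset]
    return value


files = [[("y", 2)]]
index = {"y": (0, 0)}
memtable = {"x": 1}
```

Because the index points at a single location per key, the lookup touches at most one file.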
![Page 82: SLM-DB: Single Level Merge Key-Value Store with Persistent ... · Merge from Level N to Level N+1 Flush WAL Write-Ahead-MANIFEST Log (no fsync) Store file organization and metadata](https://reader034.vdocument.in/reader034/viewer/2022042621/5f75f8c112402753b72cc5a6/html5/thumbnails/82.jpg)
Scan(key_i, key_j)
• Create iterator
[Diagram: Client scans from K_i to K_j · keys K_i, K_i+1, K_i+2, K_i+3, … located across Files · MemTable · Immutable MemTable · B+-tree Index · regions labeled Disk and PM]
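The iterator-based scan can be sketched as a walk over the sorted keys that fetches each value from whichever file holds it. The sorted key list stands in for the B+-tree leaf level, and treating the range as inclusive is an assumption; all names are hypothetical.

```python
from bisect import bisect_left


def scan(start, end, sorted_keys, index, files):
    """Yield (key, value) pairs with start <= key <= end, in key order."""
    pos = bisect_left(sorted_keys, start)   # seek to the first key >= start
    while pos < len(sorted_keys) and sorted_keys[pos] <= end:
        key = sorted_keys[pos]
        file_no, offset = index[key]        # neighbouring keys may sit in different files
        yield key, files[file_no][offset][1]
        pos += 1


files = [[(1, "a"), (4, "d")], [(2, "b"), (3, "c")]]
index = {1: (0, 0), 4: (0, 1), 2: (1, 0), 3: (1, 1)}
sorted_keys = sorted(index)
result = list(scan(2, 4, sorted_keys, index, files))
```

In this toy layout the scan of keys 2 to 4 has to visit both files, which is the kind of low sequentiality the compaction slides aim to reduce.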
![Page 83: SLM-DB: Single Level Merge Key-Value Store with Persistent ... · Merge from Level N to Level N+1 Flush WAL Write-Ahead-MANIFEST Log (no fsync) Store file organization and metadata](https://reader034.vdocument.in/reader034/viewer/2022042621/5f75f8c112402753b72cc5a6/html5/thumbnails/83.jpg)
Evaluation
• CPU: Intel Xeon E5-2640 v3
• DRAM: 4GB; emulated PM: 7GB
• Storage: Intel SSD DC S3520
• OS: Ubuntu 18.04, kernel 4.15
• DB size: 8GB/20GB; MemTable: 64MB
• PM write latency: 500ns (5x DRAM write latency)
• PM read latency and bandwidth: same as DRAM's
• Intel's PMDK used to control the PM pool
![Page 85: SLM-DB: Single Level Merge Key-Value Store with Persistent ... · Merge from Level N to Level N+1 Flush WAL Write-Ahead-MANIFEST Log (no fsync) Store file organization and metadata](https://reader034.vdocument.in/reader034/viewer/2022042621/5f75f8c112402753b72cc5a6/html5/thumbnails/85.jpg)
db_bench microbenchmark
[Figure: three panels: Random write, Random read, Range query (range size = 100)]
Annotations on the plots:
• Overhead amortized from large value size
• Low sequentiality
• Steady performance increase
• Low file locating overhead
Summary:
• ~2.56x less disk write amplification
• Max 700MB used in PM
![Page 86: SLM-DB: Single Level Merge Key-Value Store with Persistent ... · Merge from Level N to Level N+1 Flush WAL Write-Ahead-MANIFEST Log (no fsync) Store file organization and metadata](https://reader034.vdocument.in/reader034/viewer/2022042621/5f75f8c112402753b72cc5a6/html5/thumbnails/86.jpg)
PM sensitivity
• PM write latency sensitivity: emulated by inserting a CPU pause after clflush()
• PM bandwidth sensitivity: emulated using thermal throttling
[Figure labels: Random write benchmark, db_bench]
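The latency emulation idea, a busy-wait added after each flush, can be illustrated as follows. This is only a sketch of the technique: Python has no `clflush`, so `flush` is a stand-in callable, and `persist_with_delay` and `extra_latency_ns` are hypothetical names. Python timers are also far too coarse for realistic PM latencies.

```python
import time


def spin_delay(ns):
    """Busy-wait for at least ns nanoseconds (stands in for a CPU pause loop)."""
    deadline = time.perf_counter_ns() + ns
    while time.perf_counter_ns() < deadline:
        pass


def persist_with_delay(flush, data, extra_latency_ns):
    """Flush data, then stall to mimic slower persistent-memory writes."""
    flush(data)                    # stand-in for the store plus cache-line flush
    spin_delay(extra_latency_ns)   # the injected extra write latency


log = []
start = time.perf_counter_ns()
persist_with_delay(log.append, b"kv", extra_latency_ns=200_000)
elapsed_ns = time.perf_counter_ns() - start
```

Sweeping `extra_latency_ns` while rerunning a write benchmark is the shape of a latency sensitivity study.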
![Page 91: SLM-DB: Single Level Merge Key-Value Store with Persistent ... · Merge from Level N to Level N+1 Flush WAL Write-Ahead-MANIFEST Log (no fsync) Store file organization and metadata](https://reader034.vdocument.in/reader034/viewer/2022042621/5f75f8c112402753b72cc5a6/html5/thumbnails/91.jpg)
YCSB
[Figure: per-workload results. Operation mixes as labeled: 100% I; 50% R, 50% U; 95% R, 5% U; 95% R, 5% U; 100% I; 95% LR, 5% U; 95% S, 5% U; 50% R, 50% RMW]
• Better write performance
• Very fast on update operations
• Only the 1KB case is slower
• On average, outperforms LevelDB on every workload
• Up to 7.7x less disk write amplification
![Page 92: SLM-DB: Single Level Merge Key-Value Store with Persistent ... · Merge from Level N to Level N+1 Flush WAL Write-Ahead-MANIFEST Log (no fsync) Store file organization and metadata](https://reader034.vdocument.in/reader034/viewer/2022042621/5f75f8c112402753b72cc5a6/html5/thumbnails/92.jpg)
Conclusion
• Novel design of Key-Value stores with Persistent Memory
• High write/read performance compared to LevelDB
• Comparable scan performance
• Low write amplification
• Near-optimal read amplification
![Page 93: SLM-DB: Single Level Merge Key-Value Store with Persistent ... · Merge from Level N to Level N+1 Flush WAL Write-Ahead-MANIFEST Log (no fsync) Store file organization and metadata](https://reader034.vdocument.in/reader034/viewer/2022042621/5f75f8c112402753b72cc5a6/html5/thumbnails/93.jpg)
Thanks! Questions?
![Page 95: SLM-DB: Single Level Merge Key-Value Store with Persistent ... · Merge from Level N to Level N+1 Flush WAL Write-Ahead-MANIFEST Log (no fsync) Store file organization and metadata](https://reader034.vdocument.in/reader034/viewer/2022042621/5f75f8c112402753b72cc5a6/html5/thumbnails/95.jpg)
db_bench microbenchmark (20GB)
[Figure: three panels: Random write, Random read, Range query]
![Page 96: SLM-DB: Single Level Merge Key-Value Store with Persistent ... · Merge from Level N to Level N+1 Flush WAL Write-Ahead-MANIFEST Log (no fsync) Store file organization and metadata](https://reader034.vdocument.in/reader034/viewer/2022042621/5f75f8c112402753b72cc5a6/html5/thumbnails/96.jpg)
Effect of persistent MemTable
[Figure: two panels: Random write performance, Total disk write]
![Page 97: SLM-DB: Single Level Merge Key-Value Store with Persistent ... · Merge from Level N to Level N+1 Flush WAL Write-Ahead-MANIFEST Log (no fsync) Store file organization and metadata](https://reader034.vdocument.in/reader034/viewer/2022042621/5f75f8c112402753b72cc5a6/html5/thumbnails/97.jpg)
B+-tree leaf node scan
• Goal: increase the sequentiality of key-values by scanning B+-tree leaf nodes in round-robin fashion
• If the number of unique file accesses is above the threshold, add the files to the compaction candidates
• Threshold = 2 in the example
[Diagram: B+-tree over keys 1 to 16 · Files · Compaction Candidates]
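One way to read the leaf node scan rule is sketched below: visit leaf nodes in round-robin order, count how many distinct files a leaf's keys point to, and nominate those files when the count exceeds the threshold. The `LeafScanner` class, the `count` parameter, and the key-to-file mapping are assumptions made for illustration.

```python
class LeafScanner:
    """Round-robin scan over leaf nodes; all names here are hypothetical."""

    def __init__(self, leaf_nodes, key_to_file, threshold=2):
        self.leaf_nodes = leaf_nodes      # each leaf is a list of keys
        self.key_to_file = key_to_file    # where each key's value is stored
        self.threshold = threshold
        self.cursor = 0                   # next leaf to visit

    def scan_next(self, count=1):
        """Visit `count` leaves and return the files nominated for compaction."""
        candidates = set()
        for _ in range(count):
            leaf = self.leaf_nodes[self.cursor]
            self.cursor = (self.cursor + 1) % len(self.leaf_nodes)  # wrap around
            files = {self.key_to_file[key] for key in leaf}
            if len(files) > self.threshold:   # keys of this leaf are too scattered
                candidates |= files
        return candidates


leaves = [[1, 2, 3, 4], [5, 6, 7, 8]]
key_to_file = {1: "A", 2: "A", 3: "B", 4: "B", 5: "A", 6: "B", 7: "C", 8: "C"}
scanner = LeafScanner(leaves, key_to_file, threshold=2)
first = scanner.scan_next()    # leaf touches 2 files, not above the threshold
second = scanner.scan_next()   # leaf touches 3 files, so they are nominated
```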
![Page 108: SLM-DB: Single Level Merge Key-Value Store with Persistent ... · Merge from Level N to Level N+1 Flush WAL Write-Ahead-MANIFEST Log (no fsync) Store file organization and metadata](https://reader034.vdocument.in/reader034/viewer/2022042621/5f75f8c112402753b72cc5a6/html5/thumbnails/108.jpg)
Degree of sequentiality per range query
• Goal: increase the sequentiality of key-values during range query operations
• If the maximum number of unique file accesses over the subranges is above the threshold, add the files to the compaction candidates
• Threshold = 2 in the example
[Diagram: B+-tree over keys 1 to 16 · RangeQuery(7, 14) covers keys 7 to 14, with keys 1 to 6 and 15 to 16 outside the range · Files · Compaction Candidates]
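The per-query rule can be sketched by splitting the queried keys into subranges, counting distinct files per subrange, and nominating the files of the worst subrange when that count exceeds the threshold. The subrange size is not stated on the slide, so `subrange_size` and the other names below are assumptions.

```python
def range_query_candidates(keys_in_range, key_to_file, subrange_size, threshold=2):
    """Return the files of the most scattered subrange, or an empty set."""
    worst = set()
    for start in range(0, len(keys_in_range), subrange_size):
        subrange = keys_in_range[start:start + subrange_size]
        files = {key_to_file[key] for key in subrange}
        if len(files) > len(worst):    # keep the subrange touching the most files
            worst = files
    return worst if len(worst) > threshold else set()


# RangeQuery(7, 14) from the slide, with a made-up key placement.
keys = list(range(7, 15))
key_to_file = {7: "A", 8: "A", 9: "B", 10: "B", 11: "A", 12: "C", 13: "D", 14: "D"}
candidates = range_query_candidates(keys, key_to_file, subrange_size=4)
```

With this placement, keys 7 to 10 touch two files and keys 11 to 14 touch three, so only the second subrange crosses the threshold of 2.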