![Page 1: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/1.jpg)
An Efficient Backup and Replication of Storage
Takashi HOSHINO
Cybozu Labs, Inc.
2013-05-29@LinuxConJapan
rev.20130529a
![Page 2: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/2.jpg)
Self-introduction
• Takashi HOSHINO
– Works at Cybozu Labs, Inc.
• Technical interests
– Database, storage, distributed algorithms
• Current work
– WalB: Today’s talk!
2
![Page 3: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/3.jpg)
Contents
• Motivation
• Architecture
• Alternative solutions
• Algorithm
• Performance evaluation
• Summary
3
![Page 4: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/4.jpg)
Contents
• Motivation
• Architecture
• Alternative solutions
• Algorithm
• Performance evaluation
• Summary
4
![Page 5: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/5.jpg)
Motivation
• Backup and replication are vital
– We have our own cloud infrastructure
– Requiring high availability of customers’ data
– With cost-effective commodity hardware and software
5
Operating data
Backup data
Replicated data
Primary DC Secondary DC
![Page 6: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/6.jpg)
Requirements
• Functionality
– Getting consistent diff data for backup/replication achieving small RPO
• Performance
– Without usual full-scans
– Small overhead
• Various kinds of data support
– Databases
– Blob data
– Full-text-search indexes
– ... 6
![Page 7: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/7.jpg)
• A Linux kernel device driver to provide wrapper block devices and related userland tools
• Provides consistent diff extraction functionality for efficient backup and replication
• “WalB” means “Block-level WAL”
A solution: WalB
7
WalB storage
Backup storage
Replicated storage
Diffs
![Page 8: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/8.jpg)
WAL (Write-Ahead Logging)
8
Ordinary Storage WAL Storage
Write at 0
Write at 2
Read at 2
Write at 2
0
0 2
0 2
0 2 2
Time
![Page 9: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/9.jpg)
How to get diffs
• Full-scan and compare
• Partial-scan with bitmaps (or indexes)
• Scan logs directly from WAL storages
9
a b c d e a b’ c d e’ b’ e’ 1 4
a b c d e a b’ c d e’ b’ e’ 1 4
00000 01001
a b c d e a b’ c d e’ b’ e’ 1 4
b’ e’ 1 4
Data at t0 Data at t1 Diffs from t0 to t1
![Page 10: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/10.jpg)
Contents
• Motivation
• Architecture
• Alternative solutions
• Algorithm
• Performance evaluation
• Summary
10
![Page 11: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/11.jpg)
WalB architecture
A walb device as a wrapper
A block device for data (Data device)
A block device for logs (Log device)
Read Write Logs
Not special format An original format
Any application (File system, DBMS, etc)
Walb log Extractor
Walb dev controller
Control
11
![Page 12: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/12.jpg)
Log device format
12
Ring buffer Metadata
Address
Superblock (physical block size)
Unused (4KiB)
Log address = (log sequence id) % (ring buffer size) + (ring buffer start address)
![Page 13: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/13.jpg)
Checksum
Ring buffer inside
13
2nd written data
Ring buffer
Log pack Logpack header block
1st written data …
The oldest logpack The latest logpack
1st log record
IO address IO size
...
Logpack lsid Log pack header block
Num of records
Total IO size
2nd log record
IO address IO size
...
...
![Page 14: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/14.jpg)
Redo/undo logs
• Redo logs
– to go forward
• Undo logs
– to go backward
14
0 2 Data at t0
0 2
Data at t1
Redo logs
Undo logs
![Page 15: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/15.jpg)
How to create redo/undo logs
• Undo logs requires additional read IOs
• Walb devices do not generate undo-logs
15
Data Log
WalB
(1) write IOs submitted
(2) read current data
(3) write undo logs
Data Log
WalB
(1) write IOs submitted
(2) write redo logs
Generating redo logs Generating undo logs
![Page 16: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/16.jpg)
Consistent full backup
• The ring buffer must have enough capacity to store the logs generated from t0 to t1
Primary host Backup host
Walb device (online)
Log device
Full archive
Logs
Apply
(A)
(B)
Read the online volume (inconsistent)
t1
Get consistent image at t1
(C)
(A) (B) (C)
Time
t0 t2
Get logs from t0 to t1
16
![Page 17: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/17.jpg)
Consistent incremental backup
• To manage multiple backup generations, defer application or generate undo-logs during application
Primary host Backup host
Walb device (online)
Log device
Full archive
Logs
Apply (A)
Get logs from t0 to t1
(B)
(A) (B)
Time
t0 t2 t1
Previous backup timestamp
Backuped Application can be deferred
17
![Page 18: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/18.jpg)
Backup and replication
18
Primary host Backup host
Walb device (online)
Log device
Full archive
Logs
Apply (A)
Get logs from t0 to t1 Replicated
(A’)
(A)
Time
t1 t3
Remote host
Full archive
Logs
Apply (B’) (B)
(B) t2
Backuped
t0
Replication delay
![Page 19: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/19.jpg)
Contents
• Motivation
• Architecture
• Alternative solutions
• Algorithm
• Performance evaluation
• Summary
19
![Page 20: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/20.jpg)
Alternative solutions
• DRBD
– Specialized for replication
– DRBD proxy is required for long-distance replication
• Dm-snap
– Snapshot management using COW (copy-on-write)
– Full-scans are required to get diffs
• Dm-thin
– Snapshot management using COW and reference counters
– Fragmentation is inevitable
20
![Page 21: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/21.jpg)
Alternatives comparison
WalB
dm-snap
Capability
dm-thin
21
DRBD
Incr. backup
Sync repl-
ication
Async repl-
ication
Performance
Negligible
Search idx
Negligible
Read response overhead
Search idx
Write response overhead
Fragment-ation
Write log instead data
Modify idx (+COW)
Send IOs to slaves
(async repl.)
Modify idx (+COW)
Never
Inevitable
Never
Never (original lv)
![Page 22: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/22.jpg)
WalB pros and cons
• Pros
– Small response overhead
– Fragmentation never occur
– Logs can be retrieved with sequential scans
• Cons
– 2x bandwidth is required for writes
22
![Page 23: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/23.jpg)
WalB source code statistics
23
File systems Block device wrappers
![Page 24: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/24.jpg)
Contents
• Motivation
• Architecture
• Alternative solutions
• Algorithm
• Performance evaluation
• Summary
24
![Page 25: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/25.jpg)
Requirements to be a block device
• Read/write consistency
– It must read the latest written data
• Storage state uniqueness
– It must replay logs not changing the history
• Durability of flushed data
– It must make flushed write IOs be persistent
• Crash recovery without undo
– It must recover the data using redo-logs only
25
![Page 26: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/26.jpg)
WalB algorithm
• Two IO processing methods
– Easy: very simple, large overhead
– Fast: a bit complex, small overhead
• Overlapped IO serialization
• Flush/fua for durability
• Crash recovery and checkpointing
26
![Page 27: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/27.jpg)
IO processing flow (easy algorithm)
27
Submitted
Time
Completed
Packed
Log submitted Log completed
Wait for log flushed and overlapped IOs done
Data submitted Data completed
Write
Submitted
Time
Completed
Data submitted Data completed
Read
Log IO response Data IO response
WalB write IO response
Data IO response
![Page 28: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/28.jpg)
IO processing flow (fast algorithm)
28
Submitted
Time
Completed
Packed
Log submitted Log completed
Wait for log flushed and overlapped IOs done
Data submitted Data completed
Log IO response Data IO response
WalB write IO response
Pdata inserted Pdata deleted
Write
Submitted
Time
Completed
(Data submitted) (Data completed)
Read
Pdata copied
Data IO response
![Page 29: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/29.jpg)
Pending data
• A red-black tree
– provided as a kernel library
• Sorted by IO address
– to find overlapped IOs quickly
• A spinlock
– for exclusive accesses
29
Node0
addr size data
NodeN
addr size data
Node1
addr size data
...
...
![Page 30: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/30.jpg)
Overlapped IO serialization
• Required for storage state uniqueness
• Oldata (overlapped data)
– similar to pdata
– counter for each IO
– FIFO constraint
31
Time
Wait for overlapped IOs done
Data submitted Data completed
Data IO response
Oldata inserted Oldata deleted Got notice Sent notice
![Page 31: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/31.jpg)
Flush/fua for durability
• The condition for a log to be persistent
– All logs before the log and itself are persistent
• Neither FLUSH nor FUA is required for data device IOs
– Data device persistence will be guaranteed by checkpointing
34
Property to guarantee
All write IOs submitted before the flush IO are
persistent REQ_FLUSH
What WalB need to do
The fua IO is persistent REQ_FUA
Set FLUSH flag of the corresponding log IOs
Set FLUSH and FUA flags to the corresponding log IOs
![Page 32: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/32.jpg)
Crash recovery and checkpointing
• Crash recovery
– Crash will make recent write IOs not be persistent in the data device
– Generate write IOs from recent logs and execute them in the data device
• Checkpointing
– Sync the data device and update the superblock periodically
– The superblock contains recent logs information
35
![Page 33: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/33.jpg)
Contents
• Motivation
• Architecture
• Alternative solutions
• Algorithm
• Performance evaluation
• Summary
36
![Page 34: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/34.jpg)
Experimental environment
• Host
– CPU: Intel core i7 3930K
– Memory: DDR3-10600 32GB (8GBx4)
– OS: Ubuntu 12.04 x86_64
– Kernel: Custom build 3.2.24
• Storage HBA
– Intel X79 Internal SATA controller
– Using SATA2 interfaces for the experiment
37
![Page 35: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/35.jpg)
Benchmark software
• Self-developed IO benchmark for block devices
– https://github.com/starpos/ioreth
– uses DIRECT_IO to eliminate buffer cache effects
• Parameters
– Pattern: Random/sequential
– Mode: Read/write
– Block size: 512B, 4KB, 32KB, 256KB
– Concurrency: 1-32
38
![Page 36: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/36.jpg)
Target storage devices
• MEM
– Memory block devices (self-implemented)
• HDD
– Seagate Barracuda 500GB (ST500DM002)
– Up to 140MB/s
• SSD
– Intel 330 60GB (SSDSC2CT060A3)
– Up to 250MB/s with SATA2
39
![Page 37: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/37.jpg)
Storage settings
• Baseline
– A raw storage device
• Wrap
– Self-developed simple wrapper
– Request/bio interfaces
• WalB
– Easy/easy-ol/fast/fast-ol
– Request/bio interfaces
40
Log Data
Data
Baseline
WalB driver
WalB
Data
Wrap
Wrap driver
![Page 38: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/38.jpg)
WalB parameters
• Log-flush-disabled experiments
– Target storage: MEMs/HDDs/SSDs
– Pdata size: 4-128MiB
• Log-flush-enabled experiments
– Target storage: HDDs/SSDs
– Pdata size: 4-64MiB
– Flush interval size: 4-32MiB
– Flush interval period: 10ms, 100ms, 1s
41
![Page 39: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/39.jpg)
MEM 512B random: response
42
Read Write
• Smaller overhead with bio interface than with request interface
• Serializing write IOs of WalB seems to enlarge response time as number of threads increases
![Page 40: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/40.jpg)
MEM 512B random: IOPS
43
Read Write
• Large overhead with request interface
• Pdata search seems overhead of walb-bio
• IOPS of walb decreases as num of threads increases due to decrease of cache-hit ratio or increase of spinlock wait time
Queue length Queue length
![Page 41: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/41.jpg)
Pdata and oldata overhead
44
MEM 512B random write
Kernel 3.2.24, walb req prototype
Overhead of oldata
Overhead of pdata
Queue length
![Page 42: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/42.jpg)
HDD 4KB random: IOPS
45
Read Write
• Negligible overhead • IO scheduling effect was observed especially with walb-req
Queue length Queue length
![Page 43: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/43.jpg)
HDD 256KB sequential: Bps
46
Read Write
• Request interface is better
• IO size is limit to 32KiB with bio interface
• Additional log header blocks decrease throughput
Queue length Queue length
![Page 44: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/44.jpg)
SSD 4KB random: IOPS
47
Read Write
• WalB performance is almost the same as that of wrap req
• Fast algo. is better then easy algo.
• IOPS overhead is large with smaller number of threads
Queue length Queue length
![Page 45: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/45.jpg)
SSD 4KB random: IOPS (two partitions in a SSD)
48
Read Write
• Almost the same result as with two SSD drives
• A half throughput was observed
• Bandwidth of a SSD is the bottleneck
Queue length Queue length
![Page 46: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/46.jpg)
SSD 256KB sequential: Bps
49
Read Write
• Non-negligible overhead was observed with queue length 1
• Fast is better than easy
• Larger overhead of walb with smaller queue length
Queue length Queue length
![Page 47: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/47.jpg)
Log-flush effects with HDDs
50
IOPS (4KB random write) Bps (256KB sequential write)
• IO sorting for the data device is effective for random writes
• Enough memory for pdata is required to minimize the log-flush overhead
Queue length Queue length
![Page 48: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/48.jpg)
Log-flush effects with SSDs
51
IOPS (4KB random write) Bps (32KB sequential write)
• Small flush interval is better
• Log flush increases IOPS with single-thread
• Log-flush effect is negligible
Queue length Queue length
![Page 49: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/49.jpg)
Evaluation summary
• WalB overhead
– Non-negligible for writes with small concurrency
– Log-flush overhead is large with HDDs, negligible with SSDs
• Request vs bio interface
– bio is better except for workloads with large IO size
• Easy vs fast algorithm
– Fast is better
52
![Page 50: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/50.jpg)
Contents
• Motivation
• Architecture
• Alternative solutions
• Algorithm
• Performance evaluation
• Summary
53
![Page 51: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/51.jpg)
WalB summary
• A wrapper block device driver for
– incremental backup
– asynchronous replication
• Small performance overhead with
– No persistent indexes
– No undo-logs
– No fragmentation
54
![Page 52: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/52.jpg)
Current status
• Version 1.0
– For Linux kernel 3.2+ and x86_64 architecture
– Userland tools are minimal
• Improve userland tools
– Faster extraction/application of logs
– Logical/physical compression
– Backup/replication managers
• Submit kernel patches
55
![Page 53: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/53.jpg)
Future work
• Add all-zero flag to the log record format
– to avoid all-zero blocks storing to the log device
• Add bitmap management
– to avoid full-scans in ring buffer overflow
• (Support snapshot access)
– by implementing pluggable persistent address indexes
• (Support thin provisioning)
– if a clever defragmentation algorithm was available
56
![Page 54: An Efficient Backup and Replication of Storage...•Add bitmap management –to avoid full-scans in ring buffer overflow •(Support snapshot access) –by implementing pluggable persistent](https://reader035.vdocument.in/reader035/viewer/2022071103/5fdd3aef010c806e7c203bd1/html5/thumbnails/54.jpg)
Thank you for your attention!
• GitHub repository:
– https://github.com/starpos/walb/
• Contact to me:
– Email: hoshino AT labs.cybozu.co.jp
– Twitter: @starpoz (hashtag: #walbdev)
57