btrfs specific dedup liu bo · 2017. 12. 14. · default dedup(bs=4k) dedup(bs=8k) dedup(bs=64k)...


Btrfs Specific Dedup

Liu Bo


Why does btrfs need dedup?


What is Dedup?


Dedup

• A specialized compression technique

• Eliminate duplicate copies

• Improve storage utilization


But we already have compression?


A Good FS For Backup!


Btrfs:
● CoW B+tree
● 2^64 byte == 16 EiB maximum file size
● Dynamic inode allocation
● Checksum on both data and metadata
● Compression (zlib, lzo supported)
● Integrated multiple device support
● Subvolume, writable/readonly snapshot
● Send/receive
● Etc


Btrfs Deduplication:
● Inline
● Block level


Back Reference:


● Fingerprint

● Hash algorithm: crc32c vs sha256
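
A minimal sketch of block fingerprinting, assuming fixed-size blocks. The function names are illustrative and not taken from the btrfs patches. Python's standard library has no CRC-32C, so `zlib.crc32` (plain CRC-32, a different polynomial but the same 32-bit width) stands in for crc32c here. The point is the trade-off the slide names: a 32-bit checksum is cheap but can collide, while a 256-bit hash costs more per block.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed dedup block size; the slides mention 4k up to 128k


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Yield consecutive fixed-size blocks; the last one may be short."""
    for off in range(0, len(data), block_size):
        yield data[off:off + block_size]


def fingerprint_sha256(block: bytes) -> bytes:
    """32-byte cryptographic fingerprint of one block."""
    return hashlib.sha256(block).digest()


def fingerprint_crc32(block: bytes) -> int:
    """32-bit checksum (CRC-32 standing in for crc32c): cheap, collisions plausible."""
    return zlib.crc32(block) & 0xFFFFFFFF


# Three identical blocks plus one different block.
data = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE
blocks = list(split_blocks(data))
unique = {fingerprint_sha256(b) for b in blocks}
print(len(blocks), len(unique))  # 4 2
```

Only two of the four blocks carry distinct fingerprints, so only two would need to be stored.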


● B+tree: dedup tree

● Keys: dedup keys
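
The slide does not spell out the key layout, so the following is only a toy model of the idea: an index keyed by fingerprint that answers "is this block already on disk, and where?". A Python dict stands in for the B+tree, and the class name, the `bytenr` allocator and the refcount field are all hypothetical.

```python
import hashlib


class DedupIndex:
    """Toy stand-in for a dedup tree: fingerprint -> [disk address, refcount]."""

    def __init__(self):
        self._by_hash = {}      # fingerprint -> [bytenr, refs]
        self._next_bytenr = 0   # hypothetical allocator cursor

    def write_block(self, block: bytes) -> int:
        """Return the address holding `block`, allocating only if it is new."""
        key = hashlib.sha256(block).digest()
        entry = self._by_hash.get(key)
        if entry is not None:
            entry[1] += 1       # duplicate: add a reference, write nothing
            return entry[0]
        bytenr = self._next_bytenr
        self._next_bytenr += len(block)
        self._by_hash[key] = [bytenr, 1]
        return bytenr

    def refs(self, block: bytes) -> int:
        """Number of references to the stored copy of `block` (0 if absent)."""
        entry = self._by_hash.get(hashlib.sha256(block).digest())
        return entry[1] if entry else 0


index = DedupIndex()
a = index.write_block(b"x" * 4096)
b = index.write_block(b"y" * 4096)
c = index.write_block(b"x" * 4096)  # same content as the first write
print(a, b, c, index.refs(b"x" * 4096))  # 0 4096 0 2
```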


Dedup Engine:
● Dedup is an I/O filter, like compression
● Takes a bunch of locked pages to process
● Asynchronous helper threads, aiming to work across all online processors
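
A user-space analogy of the engine described above, assuming the expensive step (hashing) is farmed out to helper threads while the decision about what to write is made in order afterwards. This is a sketch of the pattern only, not the kernel implementation; `dedup_filter` and `hash_chunk` are illustrative names.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor


def hash_chunk(chunk: bytes) -> bytes:
    """Fingerprint one chunk; this is the work handed to helper threads."""
    return hashlib.sha256(chunk).digest()


def dedup_filter(chunks, workers=None):
    """Hash chunks in parallel, then return indexes of chunks that must be written."""
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        digests = list(pool.map(hash_chunk, chunks))  # map() keeps input order
    seen = set()
    to_write = []
    for i, digest in enumerate(digests):
        if digest not in seen:  # first time this content appears
            seen.add(digest)
            to_write.append(i)
    return to_write


chunks = [b"a" * 4096, b"b" * 4096, b"a" * 4096]
print(dedup_filter(chunks))  # [0, 1]
```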


Flexible Control:
● Register (create the dedup tree)
● Unregister (delete the out-of-date dedup tree)
● Mount options
  – "-o dedup"
  – "-o dedup_bs=xxx", e.g. 4k, 128k
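
To make the option semantics concrete, here is a small sketch that turns an option string such as `dedup,dedup_bs=128k` into settings. Only the option names `dedup` and `dedup_bs` come from the slide; the parsing rules (suffixes `k` and `m`, comma-separated options) are assumptions, and the real kernel-side parsing may differ.

```python
def parse_size(text: str) -> int:
    """Convert '4k' or '128K' to bytes; plain digits are taken as bytes."""
    units = {"k": 1024, "m": 1024 ** 2}
    text = text.strip().lower()
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)


def parse_mount_options(opts: str) -> dict:
    """Pick the dedup-related settings out of a comma-separated option string."""
    cfg = {"dedup": False, "dedup_bs": None}
    for opt in filter(None, opts.split(",")):
        name, _, value = opt.partition("=")
        if name == "dedup":
            cfg["dedup"] = True
        elif name == "dedup_bs":
            cfg["dedup_bs"] = parse_size(value)
    return cfg


print(parse_mount_options("dedup,dedup_bs=128k"))
# {'dedup': True, 'dedup_bs': 131072}
```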


Conclusion:
● Transparent dedup
● Synchronous, block level
● Compression support
● Tunable granularity, i.e. dedup blocksize
● Not default, easy to control
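
Why granularity matters can be shown with a small experiment, assuming fixed-size blocks hashed with sha256. The data is synthetic: an "original" of 16 distinct 4 KiB pages followed by a "backup" that differs in a single byte. A smaller block size finds more duplicates but has to track more fingerprints.

```python
import hashlib


def unique_blocks(data: bytes, block_size: int):
    """Return (total blocks, distinct blocks) for a given dedup block size."""
    total = (len(data) + block_size - 1) // block_size
    distinct = {
        hashlib.sha256(data[off:off + block_size]).digest()
        for off in range(0, len(data), block_size)
    }
    return total, len(distinct)


PAGE = 4096
original = b"".join(bytes([i]) * PAGE for i in range(16))  # 16 distinct pages
backup = bytearray(original)
backup[5 * PAGE] ^= 0xFF                                   # change one byte in page 5
stream = original + bytes(backup)

for bs in (4096, 8192, 65536):
    print(bs, *unique_blocks(stream, bs))
# 4096 32 17
# 8192 16 9
# 65536 2 2
```

At 4 KiB, 15 of the 16 backup pages are recognised as duplicates; at 64 KiB the single changed byte makes the whole backup block unique, so nothing is shared.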


Limits:

● Effective on backup, virtualization
● Ineffective on structured data


Performance

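[Figure: bar chart "1G Zero Write (compress: OFF)". Configurations: default, dedup(bs=4k), dedup(bs=8k), dedup(bs=64k), dedup(bs=128). Series: First write, Backup-1, Backup-2. Axis range 0 to 700; unit not stated. Bar labels in extraction order: 85.9, 136, 163, 195, 199; 88.7, 155, 175, 199, 243; 83.8, 178, 440, 602, 648.]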


Performance, cont

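[Figure: bar chart "1G Zero Write (compress: ON)". Configurations: default, dedup(bs=4k), dedup(bs=8k), dedup(bs=64k), dedup(bs=128). Series: First write, Backup-1, Backup-2. Axis range 0 to 900; unit not stated. Bar labels in extraction order: 323, 136, 163, 195, 199; 327, 154, 175, 198, 202; 843, 155, 207, 239, 290.]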


Demo


Known Issues:

● ENOSPC
● A byte to byte comparison
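
The slide does not say whether the byte to byte comparison is missing or simply costly, so this sketch only illustrates what such a check guards against. It uses a deliberately weak, hypothetical fingerprint (a byte sum) so that a collision is easy to produce; with a real hash the structure of the check would be the same.

```python
def weak_fingerprint(block: bytes) -> int:
    """Deliberately weak stand-in hash (byte sum) so a collision is easy to show."""
    return sum(block) & 0xFFFF


def can_share(new_block: bytes, stored_block: bytes) -> bool:
    """Share storage only if fingerprints match AND the bytes are identical."""
    if weak_fingerprint(new_block) != weak_fingerprint(stored_block):
        return False                  # cheap rejection, no data read needed
    return new_block == stored_block  # the byte to byte comparison


# b"ab" and b"ba" collide under the weak fingerprint but differ in content.
print(weak_fingerprint(b"ab") == weak_fingerprint(b"ba"))  # True
print(can_share(b"ab", b"ba"))                             # False
print(can_share(b"ab", b"ab"))                             # True
```

Without the final comparison, the colliding pair would be treated as one block and the second write would silently read back as the first.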


QA


Reference
● http://en.wikipedia.org/wiki/Data_deduplication
● http://media.netapp.com/documents/tr-3505.pdf
● http://www.druva.com/blog/2009/01/09/understanding-data-deduplication
● https://btrfs.wiki.kernel.org/index.php/Main_Page
● https://communities.netapp.com/community/netapp-blogs/drdedupe/blog/2010/04/07/how-netapp-deduplication-works--a-primer
● http://en.wikipedia.org/wiki/Fingerprint_%28computing%29


Thank you!

Liu Bo

<[email protected]>