

Page 1

Title: FAST17_slides_Duggal

© Copyright 2017 Dell Inc.

The Logic of Physical Garbage Collection in Deduplicating Storage

Fred Douglis, Abhinav Duggal, Philip Shilane, Tony Wong (Dell EMC)

Shiqin Yan (University of Chicago)

Fabiano Botelho (Rubrik)

Page 2

Deduplication in Data Domain Filesystem (DDFS)

[Figure: File 1 (R S T W) and File 2 (W X Y Z) are split into variable-sized chunks, and a fingerprint is generated for each chunk (Rfp, Sfp, Tfp, Wfp for File 1; Wfp, Xfp, Yfp, Zfp for File 2). Containers holding chunks: C1 = {R, S}, C2 = {T, W}, C3 = {X, Y}, C4 = {Z}.]

Fingerprint Index

fp   CID
R    C1
S    C1
T    C2
W    C2
X    C3
Y    C3
Z    C4
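The write path in this slide can be sketched in a few lines of Python. This is only an illustration, not the DDFS implementation: the `DedupStore` class, the two-chunk container capacity, and the use of SHA-1 as the fingerprint function are assumptions made for the example, and chunking is skipped (the caller passes ready-made chunks).

```python
import hashlib

CONTAINER_CAPACITY = 2  # assumption: tiny containers, chosen to mirror the slide


class DedupStore:
    """Hypothetical toy store: a fingerprint index plus append-only containers."""

    def __init__(self):
        self.index = {}       # fingerprint -> container ID (CID)
        self.containers = []  # containers[i] holds the chunks of container C(i+1)

    def fingerprint(self, chunk: bytes) -> str:
        # Assumption: SHA-1 stands in for the fingerprint function.
        return hashlib.sha1(chunk).hexdigest()

    def write_file(self, chunks):
        """Store a file given its chunks; return its recipe (list of fingerprints)."""
        recipe = []
        for chunk in chunks:
            fp = self.fingerprint(chunk)
            if fp not in self.index:
                # New chunk: append to the open container, or start a new one.
                if not self.containers or len(self.containers[-1]) >= CONTAINER_CAPACITY:
                    self.containers.append([])
                self.containers[-1].append(chunk)
                self.index[fp] = f"C{len(self.containers)}"
            # Duplicate chunks are only referenced by fingerprint, not stored again.
            recipe.append(fp)
        return recipe


store = DedupStore()
store.write_file([b"R", b"S", b"T", b"W"])  # File 1
store.write_file([b"W", b"X", b"Y", b"Z"])  # File 2: W is already stored
```

With these inputs the toy store ends up with the same layout as the slide: W is stored once in C2, and only X, Y and Z are newly written for File 2.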

Page 3

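File Representation in DDFS

Files represented as a Merkle tree of fingerprints

[Figure: a file's tree runs from its L6 root through L5, L4, L3 and L2 down to L1 chunks, for example L1: Rfp Sfp Tfp Ufp Vfp Wfp Xfp Yfp. The L1 to L6 levels are Lp chunks (metadata). L0: chunks stored on disk in containers (R, S, ..., Y). A second file has its own L6, L2 … and L1: Rfp Sfp Zfp. COPY: “fastcopy” creates new root into same tree.]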

Page 4

Deduplication Workloads on Data Domain

• Traditional backups
  – Weekly full and daily incremental backups
    › Full backups tend to be very large: 100s of GB to TBs
    › Much content in full backups repeats previous full
  – Typically, 10-20x total compression (TC)
    › 20x TC = 10x dedup and 2x compression
• New workloads
  – “Synthetic” full backups
    › Send changes and a recipe to create a single full backup from some previous backup
    › Daily fulls
    › High TC (100x-400x or higher)
  – High file count
    › 100M to 1 billion small files
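The "20x TC = 10x dedup and 2x compression" line treats total compression as the product of the deduplication factor and the local compression factor. A minimal sketch of that arithmetic follows; the function names and the 100 TB logical size are hypothetical and used only for illustration.

```python
def total_compression(dedup_factor: float, local_compression_factor: float) -> float:
    """Total compression (TC) as the product of the two factors, as on the slide."""
    return dedup_factor * local_compression_factor


def physical_tb(logical_tb: float, tc: float) -> float:
    """Physical space needed to hold logical_tb of backups at a given TC."""
    return logical_tb / tc


tc = total_compression(10, 2)   # the slide's example: 10x dedup, 2x compression
space = physical_tb(100, tc)    # hypothetical 100 TB of logical backup data
```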

Page 5

Garbage Collection in a Deduplication Filesystem

[Figure: File 1, File 2 and File 3 reference chunks in the containers holding chunks: C1 = {R, S}, C2 = {T, W}, C3 = {X, Y}, C4 = {Z}, C5 = {Q, Y}. The legend marks a shared chunk and a duplicate chunk; Y is stored in both C3 and C5.]

Fingerprint Index

fp   CID
R    C1
S    C1
T    C2
W    C2
X    C3
Y    C3
Z    C4
Q    C5
Y    C5

Duplicates are sometimes written to improve throughput

Page 6

Evolution of GC in DDFS

• Logical GC (LGC)
  – Depth-first traversal of per-file Merkle tree on disk to mark live chunks in memory
  – In-memory data structures may not allow system to track all chunks, so an extra mark phase (“pre-phases”) is used when necessary
• Physical GC (PGC)
  – Breadth-first traversal of the physical layout of Merkle trees to mark live chunks in memory
  – Similar to LGC, pre-phases may be needed
• Phase-optimized Physical GC (PGC+)
  – Improvement over PGC by removing pre-phases, plus other optimizations

Page 7

Logical GC Phases

• Merge
  – Merge the in-memory index into the on-disk index
• Enumeration
  – Depth-first walk and mark live chunks in an in-memory Bloom filter called live vector
• Filter
  – Create live instance vector (also a Bloom filter) from live vector to remove the duplicates
• Select
  – Select best containers to compact
• Copy
  – Copy live chunks from selected containers into new containers and delete old containers

[Slide annotations: the phases are grouped into a mark phase and a sweep phase.]
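The enumeration and copy steps listed above can be sketched as a toy mark-and-sweep over a Bloom filter. This is a simplified illustration under stated assumptions, not the DDFS code: `BloomFilter`, `enumerate_live`, `copy_forward`, the filter size, and the dictionary-based tree are all hypothetical, and the merge, filter and select phases are left out.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter; the sizes are arbitrary illustration values."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key: str):
        for i in range(self.num_hashes):
            digest = hashlib.sha1(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def __contains__(self, key: str):
        return all(self.bits[pos] for pos in self._positions(key))


def enumerate_live(roots, children, live):
    """Mark: depth-first walk from every file root, marking reachable fingerprints.

    Shared subtrees are walked once per file that references them.
    """
    for root in roots:
        stack = [root]
        while stack:
            fp = stack.pop()
            live.add(fp)
            stack.extend(children.get(fp, []))  # L0 chunks have no children


def copy_forward(containers, live):
    """Sweep: keep the chunks that test live, container by container."""
    return {cid: [fp for fp in fps if fp in live] for cid, fps in containers.items()}


# Hypothetical tree: one file root (L6) pointing at one L1, which lists two L0 chunks.
children = {"L6a": ["L1a"], "L1a": ["R", "S"]}
live_vector = BloomFilter()
enumerate_live(["L6a"], children, live_vector)
kept = copy_forward({"C1": ["R", "S"], "C2": ["T"]}, live_vector)
```

A Bloom filter has no false negatives, so every live chunk is kept; a false positive can only cause a dead chunk to be retained, never a live one to be dropped.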

Page 8

Enumeration Phase (Logical GC)

[Figure: files F1 and F1’ with Merkle trees L6, L2, L1 and L6’, L2’, L1’, L1’’ above the L0 chunks; part of the tree is marked as shared.]

Only Lp chunks are traversed

Page 9

Logical GC → Physical GC

• Logical enumeration performance is sensitive to the following parameters
  – Total compression factor
  – Number of small files
  – Spatial locality of Lp

Physical GC addresses these performance issues

Page 10

Physical GC (PGC)

• Uses breadth-first walk instead of per-file depth-first walk during enumeration
• Uses Perfect Hash Vector (PHV) to store LPs for assisting the breadth-first walk
  – Uses less memory
  – Needed for doing checksums to prevent corruption
• New analysis phase to build Perfect Hash Functions for LPs
• Remaining phases are same as logical GC

[Figure: LGC uses two Bloom filters (live vector and live instance vector). PGC uses the same two Bloom filters plus a PHV (walk vector).]
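A level-by-level enumeration of this kind can be sketched as follows. It is a simplification under stated assumptions: plain Python sets stand in for the PHV-based walk vector and the Bloom-filter live vector, and `physical_enumeration` with its `containers_by_level` layout is a hypothetical interface, not the DDFS one.

```python
def physical_enumeration(containers_by_level, roots):
    """Breadth-first mark over Lp chunks, one level at a time (L6 down to L1).

    containers_by_level: {level: [(fp, child_fps), ...]} in on-disk scan order.
    roots: fingerprints of the live file roots.
    Returns the set of live fingerprints (Lp and L0).
    """
    walk = set(roots)  # stand-in for the walk vector (Lp fingerprints to visit)
    live = set(roots)  # stand-in for the live vector
    for level in range(6, 0, -1):
        # Sequential scan: each Lp chunk at this level is read exactly once.
        for fp, child_fps in containers_by_level.get(level, []):
            if fp not in walk:
                continue  # not referenced by any live parent: dead metadata
            for child in child_fps:
                live.add(child)
                if level > 1:
                    walk.add(child)  # children of L1 are L0 data, never walked
    return live


# Two file roots share one L1 (as after a fastcopy); a third root is dead.
layout = {
    6: [("L6a", ["L1x"]), ("L6b", ["L1x"]), ("L6dead", ["L1y"])],
    1: [("L1x", ["R", "S"]), ("L1y", ["Z"])],
}
live = physical_enumeration(layout, roots=["L6a", "L6b"])
```

The shared `L1x` is scanned once even though two roots reference it, which is the property the breadth-first walk is meant to provide.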

Page 11

Collision Free - Perfect Hashing Vector (PHvec)

[Figure: a perfect hash function (PHF) maps the fingerprint set S = {s1, s2, …, sn} (indexed 0 to n - 1) onto a bit vector with positions 0 to m - 1, where m ≥ n.]

Collision-free hash function which maps a fingerprint to a unique position in a bit vector
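To make the idea concrete, here is a small perfect hash vector built with a per-bucket seed search. It is a sketch only: the slides do not say which construction DDFS uses, so the bucketed seed search, the `PerfectHashVector` class, and the 0.7 load factor are assumptions, and this toy version makes no attempt to match the memory footprint of a production structure.

```python
import hashlib


def _h(seed: int, key: str, modulus: int) -> int:
    digest = hashlib.sha1(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % modulus


class PerfectHashVector:
    """Collision-free map from a fixed key set to positions in a bit vector."""

    def __init__(self, keys, load=0.7, bucket_size=4):
        keys = list(dict.fromkeys(keys))                 # drop duplicate keys
        self.m = max(1, int(len(keys) / load) + 1)       # m >= n positions
        self.num_buckets = max(1, len(keys) // bucket_size)
        buckets = [[] for _ in range(self.num_buckets)]
        for key in keys:
            buckets[_h(0, key, self.num_buckets)].append(key)
        self.seeds = [0] * self.num_buckets
        taken = [False] * self.m
        # Place the largest buckets first, while the vector is still mostly free.
        for b in sorted(range(self.num_buckets), key=lambda i: -len(buckets[i])):
            seed = 1
            while True:
                slots = [_h(seed, key, self.m) for key in buckets[b]]
                if len(set(slots)) == len(slots) and not any(taken[s] for s in slots):
                    break
                seed += 1  # collision: try the next seed for this bucket
            for s in slots:
                taken[s] = True
            self.seeds[b] = seed
        self.bits = bytearray(self.m)

    def position(self, key: str) -> int:
        return _h(self.seeds[_h(0, key, self.num_buckets)], key, self.m)

    def mark(self, key: str):
        self.bits[self.position(key)] = 1

    def is_marked(self, key: str) -> bool:
        return bool(self.bits[self.position(key)])


fingerprints = [f"fp{i}" for i in range(200)]  # hypothetical fingerprint set S
phv = PerfectHashVector(fingerprints)
phv.mark("fp7")
```

Unlike a Bloom filter, the mapping is only meaningful for keys that were in the set when the function was built; a key outside the set lands on an arbitrary position. That is consistent with the slides building the hash functions from the full fingerprint set in a separate analysis phase.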

Page 12

Analysis Phase

[Figure: the on-disk container index is read to build in-memory perfect hash functions of Lp, shown as entries numbered 1, 2, 3, 4, ..., #fps.]

On-disk container index

FP    CID   type
fp1   10    L0
fp2   5     LP
fp3   30    LP
…     …     …
fpn   40

Page 13

Benefits & Costs of Physical Enumeration

• Pro: Sequential scan of containers on disk
  – All L6, then all L5, down to L1s
  – Relatively few containers store high-level metadata
  – No need to keep revisiting same Lp containers due to fastcopy (high deduplication)
• Con: extra analysis cost doesn’t help “traditional” workloads
• … and due to pre-phases we may have to run analysis twice!

Page 14

LGC and PGC phases (including pre-phases)

• Physical GC
  1. Pre-merge
  2. Pre-analysis
  3. Pre-enumeration
  4. Pre-filter
  5. Pre-select
  6. Merge
  7. Analysis
  8. Candidate
  9. Enumeration
  10. Filter
  11. Copy
  12. Summary
• Logical GC
  1. Pre-merge
  2. Pre-enumeration
  3. Pre-filter
  4. Pre-select
  5. Candidate
  6. Enumeration
  7. Merge
  8. Filter
  9. Copy
  10. Summary

[Slide annotation on both lists: Pre-phases / sampling phases]

Page 15

Physical GC → Phase-optimized Physical GC

• Limitations of Physical GC
  – Adds 2 extra phases (pre-analysis and analysis)
  – Slightly degrades GC performance for customers with traditional backup workloads
• Motivation for Phase-optimized Physical GC (PGC+)
  – Avoid pre-phases by representing all chunks in memory
  – Can we use Perfect hash as a live vector?
    › Need only 2.7 bits per fingerprint instead of 6 bits in a Bloom filter
  – Can we maintain duplicate recipe without using a Bloom filter?
    › Get 50% memory back

[Figure: PGC uses Bloom filters for the live vector and live instance vector plus a PHV for the walk vector. PGC+ uses a PHV for the walk vector and a PHV for the live vector.]
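The per-fingerprint figures on this slide translate directly into vector sizes. The arithmetic below is a rough sketch: the two bit costs come from the slide, while the fingerprint count is a hypothetical value chosen only to make the numbers concrete.

```python
BLOOM_BITS_PER_FP = 6.0  # from the slide
PHV_BITS_PER_FP = 2.7    # from the slide


def vector_bytes(num_fingerprints: int, bits_per_fp: float) -> float:
    """Memory needed to track num_fingerprints at a given per-fingerprint cost."""
    return num_fingerprints * bits_per_fp / 8


num_fps = 10_000_000_000  # hypothetical: ten billion fingerprints on one system
bloom_gb = vector_bytes(num_fps, BLOOM_BITS_PER_FP) / 1e9
phv_gb = vector_bytes(num_fps, PHV_BITS_PER_FP) / 1e9
saving = 1 - PHV_BITS_PER_FP / BLOOM_BITS_PER_FP  # fraction saved per live vector
```

Under these assumptions a single live vector shrinks by a little over half, which is what makes it plausible to represent all chunks in memory and avoid the pre-phases.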

Page 16

Phase-optimized Physical GC (PGC+) Phases

1. Merge
2. Analysis
3. Enumeration
4. Select
5. Copy
6. Summary

Page 17

PGC+ Analysis and Enumeration

• Replace Bloom filter with Perfect Hash vector for tracking live and dead chunks
• In analysis phase build two Perfect Hash vectors
  – Lp vector called the walk vector (similar to PGC)
  – All fingerprints (Lp + L0) based Perfect Hash vector called live vector
• Perfect hashing optimizations
  – NUMA-aware Perfect Hashing
  – Cache prefetching of Perfect Hash functions and values in the Perfect Hash Vector

Page 18

PGC+ Copy phase

Dynamically remove duplicates during Copy phase

[Figure: containers C1 = {fp1, fp2} and C2 = {fp1, fp3}, with the live vector bits for fp1, fp2, fp3 shown at each step.]

Step            Live vector (fp1 fp2 fp3)
Initial state   1 1 1
Process C2      0 1 0
Process C1      0 0 0
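The three states in this slide can be reproduced with a short sketch. It assumes, for illustration only, that a dictionary of booleans stands in for the PHV-based live vector and that every container shown is selected for copying; `copy_phase` is a hypothetical name.

```python
def copy_phase(containers, live_bits):
    """Copy live chunks forward, clearing each bit once its chunk is copied.

    containers: ordered list of (cid, [fingerprints]) to process.
    live_bits: fingerprint -> bool, a stand-in for the live vector.
    Returns the (fingerprint, source cid) pairs that were copied.
    """
    copied = []
    for cid, fps in containers:
        for fp in fps:
            if live_bits.get(fp):
                copied.append((fp, cid))
                live_bits[fp] = False  # a later duplicate of fp is now skipped
    return copied


live_bits = {"fp1": True, "fp2": True, "fp3": True}  # initial state: 1 1 1
copied = copy_phase(
    [("C2", ["fp1", "fp3"]),   # after C2: 0 1 0
     ("C1", ["fp1", "fp2"])],  # after C1: 0 0 0
    live_bits,
)
```

Because the bit for fp1 is cleared while C2 is processed, the second copy of fp1 in C1 is dropped without a separate duplicate-tracking structure.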

Page 19

Evaluation

• Deployed systems
  – Comparison of GC runs for systems upgraded from LGC to PGC
• Controlled experiments on 4 systems
  – Comparison of LGC vs PGC vs PGC+
    › One phase versus two phase GC
  – DD860 used as default for all experiments
  – Workload used was a synthetic dataset similar to some past deduplication work (e.g., Botelho, et al., FAST 2012)

Systems                  DD2500   DD860    DD890   DD990
CPU (cores × GHz)        8 × 2.2  16 × 2.53  24 × 2.8  40 × 2.4
Memory (GB)              64       70       94      256
Physical capacity (TB)   122      126      167     319

Page 20

Deployed System Results- LGC vs PGC

• For high TC workloads, PGC improved over LGC by up to 20x
• For high file count workload, PGC improved over LGC by 7x
• 75% of systems upgraded from LGC to PGC suffered from some degradation but usually not much
  – Hard to compare LGC vs PGC systems because of some other performance changes introduced with PGC
• Lab experiments to compare all GC variants with same performance parameters

Page 21

GC on Different Platforms (36.6x TC)

For this dedup, LGC2 is slightly better than PGC2 but PGC+ is better than LGC2/PGC2

Page 22

High Total compression Workload

[Figure: bar chart of GC duration in hours (axis 0 to 120, with one off-scale value labeled 250) for LGC, PGC and PGC+ at total compression factors (TC) of 36.6x, 73.2x, 147x, 293x, 586x, 1170x and 2340x. Legend: LGC2, LGC1, PGC2, PGC1, PGC+.]

LGC duration scales with TC

PGC/PGC+ remain flat

Page 23

High file Count Workload

[Figure: bar chart of GC duration in hours (axis 0 to 100, with one off-scale value labeled 187) for LGC, PGC and PGC+ on a high file count (900M) workload. Legend: LGC2, LGC1, PGC2, PGC1, PGC+.]

LGC1/LGC2 is orders of magnitude slower than PGC

Page 24

Conclusions

• Shift in workloads required moving from depth-first based mark phase to breadth-first based mark phase
• PGC works better than LGC for very high TC datasets and large number of small files
• Due to extra phases and performance constraints introduced in PGC, PGC is not uniformly faster than LGC
• PGC+ uses various optimizations to improve over PGC, primarily by avoiding multiple mark phases
• PGC+ is significantly faster than LGC when 2 mark phases are required and orders of magnitude faster for problematic workloads
