strongbox: towards reliability and consistency in cloud-based
*-Box (star-box)
Towards Reliability and Consistency in Dropbox-like File Synchronization Services
Yupu Zhang, Chris Dragga,
Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau
University of Wisconsin - Madison
6/27/2013 1
Cloud-Based File Synchronization Services
• Exploding in popularity
– Numerous providers: Dropbox, Google Drive, SkyDrive …
– Large user base: Dropbox has more than 100 million users
• Key benefit
– Automatic synchronization across clients/devices
– Reliable data storage on the server through replication
“your stuff is safe in Dropbox and will never be lost” - Dropbox tour, page 1
Is Your Data Really Safe?
• Data corruption
– Uploaded from local machine to cloud
– Propagated to other devices/clients
Is Your Data Really Safe?
• Crash consistency
– Inconsistent data ends up everywhere
– “Out-of-sync” synchronization
[Figure: copies of a file labeled with versions 1 and 2. After reboot, the sync client thinks everything is in sync.]
Your Data is NOT Really Safe
• False sense of safety
– Many copies do NOT always make your data safe
• Why?
– Semantic gap between local file system and cloud
– Separately designed and loosely linked
Project *-Box (star-box)
• Goal
– Close the gap between local file system and cloud
– Provide * without too many infrastructure changes
• * represents desired properties
– e.g., reliable, consistent, fast, private …
• Currently focus on two properties
– Reliable: Data corruption
– Consistent: Crash consistency
Outline
• Introduction
• Data Corruption
• Crash Consistency
• Current Status
• Conclusion
Corruption Problem
• Data corruption is not uncommon
– Comes from disk media, firmware, controllers [Bairavasundaram07, Anderson03]
– Remains local without synchronization
• With synchronization
– Corruption may propagate and pollute other copies
• Synchronization is a double-edged sword
– Make sure synchronized data is good
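Background
[Diagram: file foo (inode I; 4KB data blocks D1, D2, …) sits on disk under the file system. The Dropbox client (local database, dedup, rsync) watches it through inotify; the local database maps name to attributes (foo → I). The Dropbox server stores foo [v0] as 4MB data chunks C1 C2 C3.]
• Client start: NO offline changes
• overwrite D2 with D2’: inotify reports “foo was modified”; the inode becomes I’
• Client reads foo chunk by chunk; dedup labels each chunk as a duplicate or NOT a duplicate, and C1’ is NOT a duplicate
• C1’ is sent to the server (rsync); the server now holds foo [v1] (C1’ C2 C3) and the local database records foo → I’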
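The chunk-level deduplication on this slide can be sketched as follows. This is a minimal illustration, not Dropbox's actual protocol: the use of SHA-256 and the helper names `chunk_hashes` and `chunks_to_upload` are assumptions, and the demo uses 4-byte chunks in place of the 4MB chunks from the slide.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4MB chunks, as on the slide


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Split data into fixed-size chunks and return one digest per chunk.

    SHA-256 is an assumption; the slides do not name the hash function.
    """
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def chunks_to_upload(data: bytes, server_hashes: set,
                     chunk_size: int = CHUNK_SIZE) -> list:
    """Return the indices of chunks whose digest the server does not hold."""
    return [
        idx
        for idx, digest in enumerate(chunk_hashes(data, chunk_size))
        if digest not in server_hashes
    ]


# Tiny demo: 4-byte "chunks" stand in for the 4MB chunks C1, C2, C3.
v0 = b"AAAABBBBCCCC"
server = set(chunk_hashes(v0, 4))

v1 = b"AAxABBBBCCCC"  # one block inside the first chunk was overwritten
changed = chunks_to_upload(v1, server, 4)
print(changed)  # only the first chunk is NOT a duplicate
```

Note that the client sees only the bytes it reads back: any difference in a chunk, whether from a legitimate write or from corruption, makes that chunk "NOT a duplicate".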
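Inject Corruption
[Diagram: data block D1 of foo (inode plus 4KB data blocks) is corrupted on disk (label: Corrupt D1). The file system and inotify are shown; the Dropbox server still holds foo [v0] as 4MB data chunks C1 C2 C3. No Dropbox client is shown on this slide.]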
Start Client
[Diagram: the Dropbox client starts with the corrupted D1 on disk. The local database lists foo → I, matching the inode I on disk, and the client reports NO offline changes. The server keeps foo [v0] (4MB data chunks C1 C2 C3).]
Data Write
[Diagram: overwrite D2 with D2’ while D1 is corrupted. Inotify reports “foo was modified” (inode I → I’); the client reads foo chunk by chunk, dedup finds that C1’ is NOT a duplicate, and C1’, which contains both D2’ and the corrupted D1, reaches the server as part of foo [v1]. The local database entry changes from foo → I to foo → I’.]
Metadata Change
[Diagram: touch -m changes only foo’s metadata while D1 is corrupted. Inotify reports “foo’s metadata was changed” (inode I → I’); the client still reads foo chunk by chunk, dedup finds that C1’ is NOT a duplicate, and C1’, carrying the corrupted D1, reaches the server as part of foo [v1]. The local database entry changes from foo → I to foo → I’.]
More Results
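| FS | Service | Data Writes | Metadata Changes (mtime) | Metadata Changes (ctime) | Metadata Changes (atime) |
|---|---|---|---|---|---|
| ext4 (Linux) | Dropbox | L G | L G | L G | L |
| ext4 (Linux) | ownCloud | L G | L G | L | L |
| ext4 (Linux) | FileRock | L G | L G | L | L |
| HFS+ (Mac OS X) | Dropbox | L G | L G | L | L |
| HFS+ (Mac OS X) | ownCloud | L G | L G | L | L |
| HFS+ (Mac OS X) | GoogleDrive | L G | L G | L | L |
| HFS+ (Mac OS X) | SugarSync | L G | L | L | L |
| HFS+ (Mac OS X) | Syncplicity | L G | L G | L | L |

L: Local corruption G: Global corruption
• Corruption is propagated when there is a change to file data
• Even if there is no data change, corruption may still be uploaded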
Summary
• Bad bits are promoted to resilient bad bits
– ALL copies polluted
– Cloud copies protected by checksum
• Fundamental problem, not implementation bugs
– FS monitoring services only provide file-level notification
– Sync clients cannot tell legitimate changes from corruption
• Redundant data on the cloud is not fully utilized
– If corruption can be detected, local FS can recover from corruption using cloud copies
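The last point can be illustrated with a toy model: if each local block carries a checksum that only legitimate writes update, a client can tell corruption from real changes and repair corrupted blocks from the cloud copy before uploading anything. The sketch below is an assumption-laden illustration, not the *-Box implementation: `ChecksummedFile`, `sync_or_recover`, the use of CRC32, and the 4-byte block size are all hypothetical.

```python
import zlib

BLOCK = 4  # stand-in for a 4KB data block


class ChecksummedFile:
    """Toy file whose per-block checksums change only on legitimate writes."""

    def __init__(self, data: bytes):
        self.blocks = [bytearray(data[i:i + BLOCK])
                       for i in range(0, len(data), BLOCK)]
        self.sums = [zlib.crc32(b) for b in self.blocks]

    def write_block(self, idx: int, payload: bytes) -> None:
        """Legitimate write: data and checksum are updated together."""
        self.blocks[idx] = bytearray(payload)
        self.sums[idx] = zlib.crc32(self.blocks[idx])

    def corrupt_block(self, idx: int) -> None:
        """Silent corruption: the data changes, the checksum does not."""
        self.blocks[idx][0] ^= 0xFF

    def bad_blocks(self) -> list:
        """Indices of blocks whose contents no longer match their checksum."""
        return [i for i, b in enumerate(self.blocks)
                if zlib.crc32(b) != self.sums[i]]


def sync_or_recover(local: ChecksummedFile, cloud: list):
    """Repair corrupted blocks from the cloud, then upload verified changes."""
    repaired = local.bad_blocks()
    for i in repaired:
        local.blocks[i] = bytearray(cloud[i])  # recover using the cloud copy
    uploaded = []
    for i, block in enumerate(local.blocks):
        if bytes(block) != cloud[i]:
            cloud[i] = bytes(block)  # legitimate change: propagate it
            uploaded.append(i)
    return repaired, uploaded


cloud = [b"AAAA", b"BBBB", b"CCCC"]
local = ChecksummedFile(b"AAAABBBBCCCC")
local.corrupt_block(0)          # corruption, as in "Corrupt D1"
local.write_block(1, b"bbbb")   # legitimate overwrite
repaired, uploaded = sync_or_recover(local, cloud)
print(repaired, uploaded)
```

In this toy run the corrupted block is restored from the cloud rather than uploaded, while the legitimate overwrite is propagated.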
Crash Recovery Techniques
• Copy-on-write (e.g., ZFS, btrfs)
– Always roll back to a consistent version
• Journaling (e.g., ext4)
– Data journaling mode
  • Both data and metadata are logged
  • Provide data consistency
– Ordered journaling mode
  • Only journal metadata
  • Data blocks are written before metadata is logged
  • Cannot guarantee data consistency
Ordered Mode
[Diagram: overwrite D1 & D2 of foo (inode I; 4KB data blocks). The disk holds the home locations and the journal / log; C marks the journal commit block, and the numbers 1 to 4 mark the steps below.]
1. Write dirty data blocks to home locations
2. Write metadata blocks to journal
3. Write journal commit block to the journal
4. Checkpoint journaled metadata blocks to home locations
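The four ordered-mode steps can be replayed in a small simulation to see what survives a crash at each point. This is a simplified sketch, not ext4 code: the dictionary-based disk model and the names `ordered_mode_ops`, `run_until_crash`, and `recover` are assumptions made for illustration.

```python
def ordered_mode_ops(new_data: dict, new_inode: str) -> list:
    """Return the individual writes of one ordered-mode update, in order."""
    ops = [("data", blk, val) for blk, val in new_data.items()]  # step 1
    ops.append(("journal", "inode", new_inode))                   # step 2
    ops.append(("commit", None, None))                            # step 3
    ops.append(("checkpoint", "inode", new_inode))                # step 4
    return ops


def run_until_crash(disk: dict, ops: list, completed: int):
    """Apply only the first `completed` writes, then 'crash'."""
    disk = dict(disk)
    journal = {}
    committed = False
    for kind, key, val in ops[:completed]:
        if kind == "data":
            disk[key] = val            # data goes straight to its home location
        elif kind == "journal":
            journal[key] = val         # metadata is logged first
        elif kind == "commit":
            committed = True           # commit block makes the log entry valid
        elif kind == "checkpoint":
            disk[key] = journal[key]   # metadata copied to its home location
    return disk, journal, committed


def recover(disk: dict, journal: dict, committed: bool) -> dict:
    """Replay the journal only if its commit block reached the disk."""
    disk = dict(disk)
    if committed:
        disk.update(journal)
    return disk


disk0 = {"inode": "I", "D1": "D1", "D2": "D2"}
ops = ordered_mode_ops({"D1": "D1'", "D2": "D2'"}, "I'")

# Crash during step 1, after only the first data block was written.
print(recover(*run_until_crash(disk0, ops, completed=1)))
```

With `completed=1` the recovered state mixes the new D1' with the old D2 and the old inode, which is the case where ordered mode cannot guarantee data consistency.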
Crash in Ordered Mode
• Crash during step 1
– Write dirty data blocks to home locations
[Diagram: overwrite D1 & D2 of foo (inode I; 4KB data blocks). Only D1’ reaches disk before the crash, so the file holds D1’ together with the old D2. Legend: Consistent Data / Inconsistent Data.]
Ext4 Ordered Mode + Dropbox
• Case 1
– Inconsistent data is propagated
• Case 2
– Consistent data is NOT synchronized
Case 1 Crash
[Diagram: overwrite D1 & D2 of foo. Inotify reports “foo was modified” and the local database entry changes from foo → I to foo → I’. The crash comes AFTER the database is changed, when only D1’ has reached disk: inconsistent data on disk, NOT fully updated. The server still holds foo [v0] (4MB data chunks C1 C2 C3).]
Case 1 Reboot
[Diagram: after reboot the inode on disk is I while the local database lists foo → I’, and the client syncs (Sync!). C1’, built from the NOT fully updated file, is uploaded as part of foo [v1]: inconsistent data D1’ D2 on cloud, inconsistent data on disk.]
Case 2 Crash
[Diagram: overwrite D1 & D2 of foo with O_SYNC; D1’ and D2’ both reach disk. The crash comes BEFORE the database is changed, so the local database still lists foo → I: consistent data on disk. The server still holds foo [v0] (4MB data chunks C1 C2 C3).]
Case 2 Reboot
[Diagram: after reboot the local database lists foo → I and the client finds NO offline changes: NO sync! The disk holds consistent data, so this machine has v1 while the server and other devices still have v0.]
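Both cases can be reproduced with a very small model of the client's reboot check. The sketch assumes, as the diagrams suggest but do not state outright, that the client decides whether a file changed offline by comparing the attributes stored in its local database with the file's current attributes. The function `client_on_reboot` and the string labels are hypothetical.

```python
def client_on_reboot(disk_attrs, disk_data, db_attrs, server_data):
    """Return (action, data held by the server afterwards).

    Assumption: the client only compares attributes; it never inspects
    whether the data on disk is crash-consistent.
    """
    if disk_attrs != db_attrs:
        # Attributes differ: treat it as an offline change and upload.
        return "sync", disk_data
    # Attributes match: the client believes nothing changed.
    return "no sync", server_data


v0 = ("D1", "D2")

# Case 1: crash AFTER the database recorded I'; only D1' reached disk.
case1 = client_on_reboot("I", ("D1'", "D2"), "I'", v0)

# Case 2: crash BEFORE the database changed; O_SYNC put D1' and D2' on disk.
case2 = client_on_reboot("I", ("D1'", "D2'"), "I", v0)

print(case1)
print(case2)
```

In case 1 the inconsistent pair D1', D2 ends up on the server; in case 2 the server keeps v0 even though the local disk holds a consistent v1.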
Recover using Data on Cloud?
• Data on server does not always reflect a consistent state on disk
– Dropbox uploads data asynchronously
– Dropbox reorders file uploading
– Actively modified files may get delayed
• When a crash occurs, files on the server could be inconsistent with respect to the disk
Summary
• Inconsistent content gets propagated
• “Out-of-sync” files may exist
– Different clients/devices see different versions of the same file
• Need in-depth communication between local FS and cloud
– Cloud has very weak sense of actual FS state
Current Status of *-Box
• Finished
– Data checksumming in ext4
– Dropbox-aided corruption recovery in ext4
– Fine-grained inotify in Linux
  • Add a ranged file-update notification
  • Adapt ownCloud (an open-source sync service) to use it
• In-progress
– Use in-memory snapshot to facilitate crash recovery in ext4 ordered mode
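To show what a ranged file-update notification could buy a sync client, the sketch below maps a reported byte range to the 4MB chunks it touches, so that only those chunks need to be re-read. The event shape (an offset and a length) and the function name `chunks_touched` are assumptions; the slides do not describe the actual interface added to inotify.

```python
CHUNK_SIZE = 4 * 1024 * 1024  # 4MB sync chunks, as on the Background slide


def chunks_touched(offset: int, length: int,
                   chunk_size: int = CHUNK_SIZE) -> list:
    """Return the indices of the chunks overlapped by [offset, offset + length)."""
    if length <= 0:
        return []
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    return list(range(first, last + 1))


# A 4KB overwrite of the second block of the file touches only chunk 0.
print(chunks_touched(4096, 4096))

# A write that straddles a chunk boundary touches two chunks.
print(chunks_touched(CHUNK_SIZE - 1, 2))
```

With file-level notification the client has to treat the whole file as changed; a range narrows that to the chunks a write actually covered.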
Conclusion
• Many copies do NOT always make your data safe
– Propagation of corrupt data and inconsistent state
– Synchronized files are out-of-sync
• Propose *-Box project
– Solve problems by reducing the semantic gap between existing local FS and cloud storage
• Ultimately may need a cohesive system that provides capabilities unachievable in isolation
Thank you!
Questions?
Advanced Systems Lab (ADSL)
University of Wisconsin-Madison
http://www.cs.wisc.edu/adsl
Wisconsin Institute on Software-defined Datacenters in Madison
http://wisdom.cs.wisc.edu/