HARDFS: Hardening HDFS with Selective and Lightweight Versioning
Thanh Do, Tyler Harter, Yingchao Liu, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, and Haryadi S. Gunawi
Cloud Reliability
• Cloud systems
  - Complex software
  - Thousands of commodity machines
  - “Rare failures become frequent” [Hamilton]
• Failure detection and recovery
  - “… has to come from the software” [Dean]
  - “… must be a first-class operation” [Ramakrishnan et al.]
Fail-stop failures
• Machine crashes, disk failures
• Pretty much handled
• Current systems have sophisticated crash-recovery machinery
  - Data replication
  - Logging
  - Fail-over
Fail-silent failures
• The system exhibits incorrect behavior instead of crashing
• Caused by memory corruption or software bugs
• Crash recovery is useless if the fault can spread
[Figure: a master and its workers]
Fail-silent failure headlines
Current approaches
• Replicated state machine using a BFT library
• N-version programming
[Figure: Ver. 1, Ver. 2, and Ver. 3 are checked for agreement]
Drawbacks:
• High resource consumption
• High engineering effort
• Rare deployment
Selective and Lightweight Versioning (SLEEVE)
• 2nd version models basic protocols of the system
• Detects and isolates fail-silent behaviors
• Exploits crash recovery machinery for recovery
[Figure: master and workers reloading state from trusted sources during reboot]
Selective and lightweight versioning (SLEEVE)
• Selective
  - Goal: small engineering effort
  - Protects important parts
    - Bug sensitive
    - Frequently changed
    - Currently unprotected
• Lightweight
  - Avoids replicating full state
  - Encodes states to reduce space
[Figure: full state A B C D; selected state A D, encoded as a bit array]
HARDFS
• HARDFS: a hardened version of HDFS
  - Namespace management
  - Replica management
  - Read/write protocol
• HARDFS detects and recovers from:
  - 90% of the faults caused by random memory corruption
  - 100% of the faults caused by targeted memory corruption
  - 5 injected software bugs
• Fast recovery using micro-recovery
  - 3 orders of magnitude faster than full reboot
• Little space and performance overhead
Outline
✓ Introduction
• HARDFS Design
• HARDFS Implementation
• Evaluation
• Conclusion
Case study: namespace integrity
[Figure, three panels with a client, the NameNode, and a trusted source. Normal operation: Create(F) is logged as txCreate(F). Corrupted HDFS: F is corrupted to G and exists(F) returns No (incorrect behavior). HARDFS: exists(F) returns Yes.]
SLEEVE layer components
• Interposition module
• State manager
• Action verifier
• Recovery module
State manager
• Replicates a subset of the state of the main version
  - Directory entries without modification time
• Adds new state incrementally
  - Adds permissions for security checks
• Understands the semantics of various protocol messages and thread events to update state correctly
• Compresses state using a compact encoding
Naïve: Full replication
• HDFS master manages millions of files
• 100% memory overhead reduces HDFS master scalability [;login: ’11]
[Figure: F stored twice, 100% memory overhead]
Lightweight: Counting Bloom Filters
• Space-efficient data structure
• Supports 3 APIs
  - insert(“A fact”)
  - delete(“A fact”)
  - exists(“A fact”)
Lightweight: Counting Bloom Filters
• Suitable for boolean checking
  - Does F exist?
  - Does F have length X?
  - Has block B been allocated?
[Figure: the 2nd version has recorded insert(“F is 10 bytes”). A client asks “Give me length of F”; the main version, whose copy is corrupted from F:10 to F:5, answers 5; exists(“F is 5 bytes”) → NO, so a disagreement is detected.]
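The three calls and the length check can be sketched as follows. This is a minimal illustration, not the HARDFS implementation: the class name, the counter-array size, the number of hashes, and the salted SHA-256 hashing are all assumptions chosen for brevity.

```python
import hashlib


class CountingBloomFilter:
    """Toy counting Bloom filter: an array of counters instead of bits."""

    def __init__(self, num_counters=1024, num_hashes=4):
        self.counters = [0] * num_counters
        self.num_hashes = num_hashes

    def _positions(self, fact):
        # Derive num_hashes counter indexes from salted SHA-256 digests.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{fact}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % len(self.counters)

    def insert(self, fact):
        for pos in self._positions(fact):
            self.counters[pos] += 1

    def delete(self, fact):
        # Only valid for facts that were inserted earlier.
        for pos in self._positions(fact):
            self.counters[pos] -= 1

    def exists(self, fact):
        # True may be a false positive; False is reliable as long as
        # delete() is only called for previously inserted facts.
        return all(self.counters[pos] > 0 for pos in self._positions(fact))


bf = CountingBloomFilter()
bf.insert("F is 10 bytes")
main_version_answer = "F is 5 bytes"   # corrupted reply from the main version
disagreement = not bf.exists(main_version_answer)
```

Counters rather than single bits are what make delete possible: removing a fact decrements only its own contribution to each shared position.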
Challenges of using Counting Bloom Filters
• Hard to check a stateful system
• False positives
Non-boolean verification
[Figure: the update “F is 20 bytes” arrives; the filter must go from F:10 (before) to F:20 (after)]
X = returnSize(F)
delete(F:X)
insert(F:20)
• A Bloom filter does not support returnSize(F)
Non-boolean verification
[Figure: the update “F is 20 bytes” arrives; the filter goes from F:10 (before) to F:20 (after)]
Ask-Then-Check
X ← MainVersion.returnSize(F);
IF exists(F:X)
    delete(F:X);
    insert(F:20);
ELSE
    initiate recovery;
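A runnable sketch of ask-then-check, under stated assumptions: `FactStore` is an exact stand-in for the counting Bloom filter (same three calls, no false positives), and `update_size`, `RecoveryNeeded`, and the `name:size` fact encoding are hypothetical names used only for illustration.

```python
from collections import Counter


class FactStore:
    """Exact stand-in for a counting Bloom filter (same three calls)."""

    def __init__(self):
        self._facts = Counter()

    def insert(self, fact):
        self._facts[fact] += 1

    def delete(self, fact):
        self._facts[fact] -= 1

    def exists(self, fact):
        return self._facts[fact] > 0


class RecoveryNeeded(Exception):
    """Raised when the main version's answer is not backed by a recorded fact."""


def update_size(main_sizes, facts, name, new_size):
    # Ask: the filter cannot return the old size, so ask the main version.
    old_size = main_sizes[name]
    # Check: the answer must match a fact the 2nd version recorded earlier.
    if not facts.exists(f"{name}:{old_size}"):
        raise RecoveryNeeded(f"main version claims {name}:{old_size}")
    facts.delete(f"{name}:{old_size}")
    facts.insert(f"{name}:{new_size}")


facts = FactStore()
facts.insert("F:10")
update_size({"F": 10}, facts, "F", 20)   # the store now holds F:20 instead of F:10
```

If the main version had answered 5 instead of 10, the check would fail and `RecoveryNeeded` would be raised before any fact was changed.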
Stateful verification
• Bloom filter (boolean verification) combined with ask-then-check enables checking stateful systems
Dealing with false positives
• Bloom filters can give false positives
  - 4 per billion
  - 1 false positive per month (given 100 op/s)
• Only leads to unnecessary recovery
[Figure: reloading state from the trusted source]
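The "1 false positive per month" figure follows from the two numbers on the slide. The short calculation below assumes a 30-day month and one filter lookup per operation; both are assumptions, not statements from the talk.

```python
FALSE_POSITIVE_RATE = 4e-9               # "4 per billion", from the slide
OPS_PER_SECOND = 100                     # workload given on the slide
SECONDS_PER_MONTH = 30 * 24 * 60 * 60    # assumption: 30-day month

ops_per_month = OPS_PER_SECOND * SECONDS_PER_MONTH
expected_false_positives = FALSE_POSITIVE_RATE * ops_per_month

print(f"{ops_per_month:,} operations per month")
print(f"{expected_false_positives:.2f} expected false positives per month")
```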
Outline
✓ Introduction
• HARDFS Design
  ✓ Lightweight
  - Selective
  - Recovery
• HARDFS Implementation
• Evaluation
• Conclusion
Selective Checks
• Goal: small engineering effort
• Selectively chooses namespace protection
• Excludes security checks
[Figure: a client sends create(F) and the HDFS master records txCreate(F) in the operation log. A client later sends exists(F): the main version, where F is corrupted to G, says No; the 2nd version says Yes; disagreement detected!]
X ← mainVersion.exists(F);
Y ← bloomFilter.exists(F);
IF X != Y THEN handleDisagreement();
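The comparison above can be written as a small runnable check. Plain Python sets stand in for both the main version's namespace and the Bloom filter, and `checked_exists` and `Disagreement` are hypothetical names; this is a sketch of the idea, not HARDFS code.

```python
class Disagreement(Exception):
    """Raised when the main version and the 2nd version give different answers."""


def checked_exists(main_namespace, second_version, path):
    x = path in main_namespace    # mainVersion.exists(F)
    y = path in second_version    # stand-in for bloomFilter.exists(F)
    if x != y:
        raise Disagreement(f"exists({path!r}): main={x}, 2nd version={y}")
    return x


second_version = {"/F"}
healthy_main = {"/F"}
corrupted_main = {"/G"}           # F was silently corrupted into G

answer = checked_exists(healthy_main, second_version, "/F")
```

Calling `checked_exists(corrupted_main, second_version, "/F")` raises `Disagreement`, which is the point where recovery would be triggered.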
Incorrect action examples
[Figure: five panels pairing client requests with main-version actions]
• Normal correct action: Create(F) → txCreate(F)
• Corrupt action: Create(F) → reject
• Missing action: Create(F) → no action
• Orphan action: txCreate(F) without a matching request
• Out-of-order action: Mkdir(D), Create(D/F) → txCreate(D/F), txMkdir(D)
• All of these happen in practice
Action verifier
• Set of micro-checks to detect incorrect actions of the main version
• Mechanisms:
  - Expected-action list
  - Action dependency checking
  - Timeout
  - Domain knowledge to handle disagreement
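Two of these mechanisms, the expected-action list and the timeout, can be sketched together. The class and method names, the string encoding of actions, and the 5-second default are illustrative assumptions; dependency checking and domain-specific disagreement handling are not covered.

```python
import time


class ActionVerifier:
    """Tracks actions the main version is expected to perform."""

    def __init__(self, timeout_s=5.0):
        self.timeout_s = timeout_s
        self.expected = {}            # action -> deadline (monotonic seconds)

    def expect(self, action, now=None):
        now = time.monotonic() if now is None else now
        self.expected[action] = now + self.timeout_s

    def observe(self, action):
        # An action that nobody asked for is treated as unexpected.
        if action not in self.expected:
            return "unexpected"
        del self.expected[action]
        return "ok"

    def overdue(self, now=None):
        # Expected actions that never happened in time (missing actions).
        now = time.monotonic() if now is None else now
        return [a for a, deadline in self.expected.items() if now >= deadline]


verifier = ActionVerifier(timeout_s=5.0)
verifier.expect("txCreate(F)", now=0.0)
first = verifier.observe("txCreate(F)")     # matches the expectation
second = verifier.observe("txCreate(F)")    # nothing is expected any more
```

In this sketch a missing action shows up through `overdue()`, while an action with no matching expectation is reported by `observe()`.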
Outline
✓ Introduction
• HARDFS Design
  ✓ Lightweight
  ✓ Selective
  - Recovery
• HARDFS Implementation
• Evaluation
• Conclusion
Recovery
• A crash is good provided there is no fault propagation
• Detects and turns bad behaviors into crashes
• Exploits HDFS crash recovery machinery
[Figure: master and workers reloading state from trusted sources during reboot]
HARDFS Recovery
• Full recovery (crash and reboot)
• Micro-recovery
  - Repairing the main version
  - Repairing the 2nd version
Crash and Reboot
• Full state is reconstructed from trusted sources
• Full recovery may be expensive
  - Restarting an HDFS master could take hours
[Figure: reloading full state]
Micro-recovery
• Repairs only corrupted state from trusted sources
• Falls back to full reboot when micro-recovery fails
Repairing main version
[Figure: the main version holds F:200 while the 2nd version and the trusted source (checkpoint file) hold F:100; a direct update F:200 ← F:100 repairs the main version]
Repairing 2nd version
[Figure: the main version and the trusted source (checkpoint file) hold F:100 while the 2nd version holds F:200]
• Must:
  1. Delete(“F is 200 bytes”)
  2. Insert(“F is 100 bytes”)
• Solution:
  1. Start with an empty BF
  2. Add facts as they are verified
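Both repairs can be sketched in a few lines. A dict stands in for the checkpoint file and for the main version's state, a set stands in for a fresh Bloom filter, and the function names are hypothetical; how HARDFS actually reads its checkpoint is not shown in the slides.

```python
def repair_main_version(main_state, checkpoint, name):
    """Direct update: overwrite the suspect entry with the trusted value."""
    main_state[name] = checkpoint[name]


def learn_if_verified(facts, checkpoint, name, claimed_size):
    """Add a fact to the rebuilt filter only after the trusted source confirms it."""
    if checkpoint.get(name) != claimed_size:
        return False
    facts.add(f"{name}:{claimed_size}")
    return True


checkpoint = {"F": 100}       # trusted source (checkpoint file)
main_state = {"F": 200}       # corrupted main version
repair_main_version(main_state, checkpoint, "F")

facts = set()                 # stand-in for a fresh, empty Bloom filter
accepted = learn_if_verified(facts, checkpoint, "F", main_state["F"])
```

Starting from an empty filter sidesteps the need to know which stale fact to delete: only facts confirmed by the trusted source are ever added back.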
Outline
✓ Introduction
✓ HARDFS Design
• HARDFS Implementation
• Evaluation
• Conclusion
Implementation
• Hardens three functionalities of HDFS
  - Namespace management (HARDFS-N)
  - Replica management (HARDFS-R)
  - Read/write protocol of datanodes (HARDFS-D)
• Uses the 3 Bloom filter APIs
  - insert(“a fact”), delete(“a fact”), exists(“a fact”)
• Uses ask-then-check for non-boolean verification
Protecting namespace integrity
• Guards namespace structures necessary for reaching data:
  - File hierarchy
  - File-to-block mapping
  - File length information
• Detects and recovers from namespace-related problems:
  - Corrupt file-to-block mapping
  - Unreachable files
Namespace management
Message and logic of the secondary version:
Create(F): client requests NN to create F
  Entry:
    If exists(F) Then reject;
    Else insert(F); generateAction(txCreate[F]);
  Return:
    check response;
AddBlock(F): client requests NN to allocate a block to file F
  Entry:
    F:X = ask-then-check(F);
  Return:
    B = addBlk(F);
    If exists(F) & !exists(B) Then
      X′ = X ∪ {B}; delete(F:X); insert(F:X′); insert(B@0);
    Else declare error;
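The two handlers can be sketched as a small class. Several things here are assumptions rather than HARDFS details: a set stands in for the counting Bloom filter, facts are encoded as strings, an empty mapping fact is recorded at create time so the first AddBlock can be checked, and "B exists" is tested through the `B@0` fact.

```python
class VerificationError(Exception):
    """Raised where the slide says 'declare error'."""


def mapping_fact(path, blocks):
    # Encodes the file-to-block mapping F:X as one string fact.
    return f"{path}:{','.join(sorted(blocks))}"


class SecondVersion:
    def __init__(self):
        self.facts = set()            # stand-in for the counting Bloom filter
        self.expected_actions = []    # actions the main version must now perform

    def on_create(self, path):
        if path in self.facts:
            return "reject"
        self.facts.add(path)
        self.facts.add(mapping_fact(path, []))   # assumption: empty mapping
        self.expected_actions.append(("txCreate", path))
        return "accept"

    def on_add_block(self, path, blocks_from_main, new_block):
        # Ask-then-check: the old mapping X is supplied by the main version.
        old = mapping_fact(path, blocks_from_main)
        if old not in self.facts:
            raise VerificationError(f"unverified mapping {old}")
        if path not in self.facts or f"{new_block}@0" in self.facts:
            raise VerificationError(f"bad AddBlock({path}) -> {new_block}")
        self.facts.remove(old)
        self.facts.add(mapping_fact(path, list(blocks_from_main) + [new_block]))
        self.facts.add(f"{new_block}@0")


sv = SecondVersion()
created = sv.on_create("F")
duplicate = sv.on_create("F")
sv.on_add_block("F", [], "b1")
```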
Outline
✓ Introduction
✓ HARDFS Design
✓ HARDFS Implementation
• Evaluation and Conclusion
Evaluation
• Is HARDFS robust against fail-silent faults?
• How much time and space overhead is incurred?
• Is micro-recovery efficient?
• How much engineering effort is required?
Random memory corruption results

| Outcome | HDFS | HARDFS |
| --- | --- | --- |
| Silent failure | 117 | 9 |
| Detect and reboot | - | 140 |
| Detect and micro-recover | - | 107 |
| Crash | 133 | 268 |
| Hang | 22 | 16 |
| No problem observed | 728 | 460 |

• Number of fail-silent failures reduced by more than a factor of 10 (117 → 9)
• Crashes happen twice as often (133 → 268)
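The two summary claims can be checked directly against the table. The snippet below only restates the slide's numbers; the dictionary keys are shorthand labels chosen here.

```python
hdfs = {"silent": 117, "crash": 133, "hang": 22, "no problem": 728}
hardfs = {"silent": 9, "reboot": 140, "micro-recover": 107,
          "crash": 268, "hang": 16, "no problem": 460}

total_hdfs = sum(hdfs.values())        # outcomes counted in the HDFS column
total_hardfs = sum(hardfs.values())    # outcomes counted in the HARDFS column
silent_reduction = hdfs["silent"] / hardfs["silent"]
crash_ratio = hardfs["crash"] / hdfs["crash"]

print(total_hdfs, total_hardfs)
print(f"silent failures reduced {silent_reduction:.1f}x, crashes up {crash_ratio:.2f}x")
```

Both columns sum to the same total, so the counts are directly comparable.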
Silent failures

| Field | HDFS | HARDFS |
| --- | --- | --- |
| pathname | 95 | 0 |
| replication | 1 | 0 |
| modification time | 6 | 8 |
| permission | 3 | 0 |
| block size | 12 | 1 |
Namespace management space overhead
[Figure: memory allocated (MB, 0 to 800) versus file system size (200K to 1000K files) for HDFS, HARDFS + Concrete State, and HARDFS + Bloom Filters]
• HARDFS with Bloom filters incurs little space overhead (2.6%)
Recovery Time
[Figure: recovery time (seconds, log scale from 1 to 10000) versus file system size (200K to 1000K files) for Reboot, Micro-recovery, and Optimized Micro-recovery]
• Rebooting the NameNode is expensive
• Micro-recovery is 3 orders of magnitude faster
Complexity (LOC)

| Functionality | HDFS | HARDFS | HARDFS / HDFS |
| --- | --- | --- | --- |
| Namespace management | 10114 | 1751 | 17% |
| Replica management | 2342 | 934 | 40% |
| Read/write protocol | 5050 | 944 | 19% |
| Others | 13339 | - | - |

• Lightweight versions are smaller
Injecting software bugs

| Bug | Year | Priority | Description | HARDFS |
| --- | --- | --- | --- | --- |
| HADOOP-1135 | 2007 | Major | Blocks in block report wrongly marked for deletion | ✔ |
| HADOOP-3002 | 2008 | Blocker | Blocks removed during safemode | ✔ |
| HDFS-900 | 2010 | Blocker | Valid replica deleted rather than corrupt replica | ✔ |
| HDFS-1250 | 2010 | Major | Namenode processes block report from dead datanode | ✔ |
| HDFS-3087 | 2012 | Critical | Decommission before replication during namenode restart | ✔ |
Conclusion
• Crashing is good
• To die (and be reborn) is better than to lie
• But lies do happen in reality
• HARDFS turns lies into crashes
• Leverages existing crash recovery techniques to resurrect
Thank you! Questions?
http://research.cs.wisc.edu/adsl/
http://ucare.cs.uchicago.edu/
http://wisdom.cs.wisc.edu/