Hope for the Best, Expect the Worst - Cornell University
Hope for the Best, Expect the Worst
or what happens when
E[ f(good event) ] > E[ f(bad event) ]
Lukas Kroc
October 12, 2006
Outline
● Overview of file systems
● The basic idea: speculation
● Applying the idea to file systems:
  – local file systems
  – distributed file systems
● Implementation issues
  – performance results
● Conclusion
File Systems: What They Are
● allow and organize access to data
  – operations: create, write, read, delete
● physical scenarios:
  – local file systems
  – distributed file systems
● goal:
  – provide durability and performance given physical limitations (latency, bandwidth)
● consistency added for distributed systems
File Systems: How They Work
stolen from Paul Francis' CS414 lecture notes
Main Issues
● a trade-off between durability and performance
  – durability calls for immediate access to the medium
    ● synchronous access
  – performance calls for caching
    ● asynchronous access
● file system speedups:
  – local: use memory cache and disk buffer to delay access
  – distributed: cache fetched files on clients
Papers for Discussion
● Nightingale et al.: Speculative Execution in a Distributed File System (SOSP'05)
  – new way of dealing with issues of distributed file systems
● Nightingale et al.: Rethink the Sync (OSDI'06)
  – applies ideas from above to issues of local file systems
● same basic idea, different scenarios
  – will reverse the order of presentation, easier first
Basic Idea
“Expect the best, be prepared for the worst”
● best = no power failure, cached data is valid
● worst = power fails, cached data is invalid
● prepared = able to recover a consistent state after a bad event has happened
● expect = speculate that the best case will happen
Conditions for the Basic Idea to Work
● highly predictable results of speculations
  – crash will most likely not occur in the next 5 seconds
  – data in the cache is most likely valid
● computers have spare CPU cycles
  – to perform “free” speculative computation
● local overhead of speculating is lower than the cost of remote I/O
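These conditions can be read as a small expected-cost calculation, in the spirit of the inequality on the title slide. The sketch below is only an illustration under an assumed cost model that is not taken from the slides: the checkpoint is always paid for, a correct guess hides the remote latency completely, and a wrong guess pays the remote latency plus a roll-back. The function names and the numbers are hypothetical.

```python
def expected_speculative_cost(p_success, checkpoint_cost, remote_cost, rollback_cost):
    """Expected time per operation when speculating instead of blocking.

    Assumed model (not from the slides): the checkpoint is always paid for;
    a correct guess hides the remote latency entirely, a wrong guess pays
    the remote latency plus the cost of rolling back.
    """
    failure_cost = remote_cost + rollback_cost
    return checkpoint_cost + (1.0 - p_success) * failure_cost


def speculation_pays_off(p_success, checkpoint_cost, remote_cost, rollback_cost):
    """True when speculating is cheaper on average than blocking on remote I/O."""
    speculative = expected_speculative_cost(
        p_success, checkpoint_cost, remote_cost, rollback_cost
    )
    return speculative < remote_cost


if __name__ == "__main__":
    # Illustrative numbers in milliseconds, chosen only for this example.
    print(speculation_pays_off(0.99, 0.05, 1.0, 0.5))  # predictable outcome
    print(speculation_pays_off(0.10, 0.05, 1.0, 0.5))  # unpredictable outcome
```

Under this model speculation wins only when the outcome is predictable and the checkpoint is cheap relative to the remote operation, which is what the three conditions say.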
Local File Systems: Traditional Approach (ext3)
● i-node based
● added journaling for increased durability
  – meta-data only for performance reasons
● 2 modes of operation:
  – synchronous: system call returns only after the write is done
  – asynchronous: system call returns immediately
stolen from Paul Francis' CS414 lecture notes
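From an application's point of view, the two modes differ in when the write call is allowed to return. The sketch below approximates them in user space with Python's standard library. It illustrates the distinction and is not how ext3 implements its modes; the helper names `write_async` and `write_sync` are made up for this example.

```python
import os
import tempfile


def write_async(path, data):
    """Asynchronous-style write: returns once the data is in the OS cache."""
    with open(path, "wb") as f:
        f.write(data)  # may still be only in memory when this returns


def write_sync(path, data):
    """Synchronous-style write: returns only after fsync() has completed."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # push Python's own buffer to the OS
        os.fsync(f.fileno())  # ask the OS to push the data to the device


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        write_async(os.path.join(directory, "fast.bin"), b"maybe lost on a crash")
        write_sync(os.path.join(directory, "safe.bin"), b"flushed before returning")
```

Both calls leave the same bytes in the file. The difference is only visible if the machine crashes shortly after the call returns, and even `os.fsync` depends on the disk not holding the data in its own volatile buffer.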
Problems of Traditional Approach
● synchronous mode:
  – durable (but only if using write barriers, or with disk buffer disabled), but very slow
● asynchronous mode:
  – not durable, but fast
Local File Systems: New Approach
● shift of paradigm: don't promise anything to the application, promise it to the user
  – the promise = synchronous guarantees
  – the user = any external entity observing the process
⇒ external synchrony
  – asynchronous internal workings, synchronous external guarantees
  – combines performance and durability benefits of both
External Synchrony
● Idea:
  – speculate that everything will be properly written to disk
● Overview:
  – immediately return from write call (asynchrony)
  – buffer all external output of the application until the write successfully happens
  – if write fails, discard the buffers
● Result:
  – better guarantees AND performance than ext3
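The three overview steps can be sketched as a toy user-space model. This is an illustration, not the kernel mechanism from the paper: the class name, its fields, and the explicit `commit()` / `fail()` calls are assumptions made for the example, and the "disk" and the "outside world" are plain Python lists.

```python
class ExternallySynchronousProcess:
    """Toy model: writes return at once, output waits for the commit."""

    def __init__(self):
        self.pending_writes = []   # accepted but not yet durable
        self.durable_writes = []   # stands in for the disk
        self.buffered_output = []  # output withheld from the outside world
        self.visible_output = []   # what an external observer has seen

    def write(self, data):
        # Return immediately (asynchrony); durability comes later.
        self.pending_writes.append(data)

    def emit(self, message):
        # Output that depends on uncommitted writes must not be observable yet.
        if self.pending_writes:
            self.buffered_output.append(message)
        else:
            self.visible_output.append(message)

    def commit(self):
        # The writes reached the disk: release the withheld output in order.
        self.durable_writes.extend(self.pending_writes)
        self.pending_writes.clear()
        self.visible_output.extend(self.buffered_output)
        self.buffered_output.clear()

    def fail(self):
        # The writes were lost: the observer never sees the dependent output.
        self.pending_writes.clear()
        self.buffered_output.clear()


if __name__ == "__main__":
    process = ExternallySynchronousProcess()
    process.write(b"report.txt contents")
    process.emit("File saved.")
    print(process.visible_output)  # nothing is visible before the commit
    process.commit()
    print(process.visible_output)  # the message appears only after the commit
```

The observer can never see "File saved." unless the data it refers to is already durable, which is the guarantee a synchronous write would have given, while the process itself never blocked.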
External Synchrony: Schema
External Synchrony: Performance
Distributed File Systems: Traditional Approach (NFS)
● client-server approach
● synchronous I/O operations required for coherence
  – using RPC
● offers close-to-open consistency
  – weaker than local file systems
Problems of Traditional Approach
● at least 2 round-trip-times required per close
  – very slow
● close-to-open consistency isn't very good
  – for how slow it is
Distributed File Systems: New Approach
● Idea:
  – speculate that close is successful, that cached data is valid, ...
● Overview:
  – use asynchronous RPCs, immediately returning
  – checkpoint the application (store its state) and buffer all subsequent output
  – on success: release the buffered output; on failure: roll back
● Result:
  – better guarantees AND performance than NFS
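A minimal sketch of the overview above, assuming a user-space stand-in for each piece: a thread pool plays the asynchronous RPC, `copy.deepcopy` plays the checkpoint, and a returned list plays the buffered output. The function `run_speculatively` and its parameters are hypothetical. The real system does this inside the kernel with process checkpoints.

```python
import copy
from concurrent.futures import ThreadPoolExecutor


def run_speculatively(state, rpc, predicted, continuation):
    """Run `continuation` on a predicted RPC result while the RPC is in flight.

    continuation(state, result) mutates `state` and returns a list of outputs.
    Returns (state, outputs, rolled_back).
    """
    checkpoint = copy.deepcopy(state)              # store the application state
    with ThreadPoolExecutor(max_workers=1) as pool:
        in_flight = pool.submit(rpc)               # asynchronous RPC
        buffered = continuation(state, predicted)  # speculative execution
        actual = in_flight.result()                # the real answer arrives
    if actual == predicted:
        return state, buffered, False              # success: release the output
    # Failure: discard speculative state and output, redo with the real result.
    state = checkpoint
    return state, continuation(state, actual), True


def use_file(state, version):
    # Hypothetical application step that depends on the file's version.
    state["seen"] = version
    return ["using file at version %d" % version]


if __name__ == "__main__":
    # The cached version (7) matches what the server reports.
    print(run_speculatively({"seen": None}, lambda: 7, 7, use_file))
    # The server reports a newer version (8), so the speculation is rolled back.
    print(run_speculatively({"seen": None}, lambda: 8, 7, use_file))
```

In both runs the caller ends up with the state and output it would have had with a blocking RPC. The difference is that in the common case the application did useful work while the request was in flight.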
Speculative NFS: Schema
Speculative NFS: Performance
Overview of the Technique
Speculate on...
  power failure not occurring, cache being valid
...by means of...
  buffering externalized output, checkpointing the process
...in order to...
  improve performance, increase consistency
Implementation: Buffering Externalized Output
● any kernel object with commit dependencies is uncommitted
  – any process that accesses an uncommitted object is marked uncommitted, and vice versa
  – any external output of such a process is buffered by the kernel
  – logs are used to track dependencies
● once commit dependencies are removed, the buffers are output to external devices
  – also allows commits to be grouped
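The propagation rule can be sketched with sets of transaction identifiers. This is a simplified user-space model, not the kernel's data structures: the class, the string names for processes and objects, and the integer transaction ids are all assumptions made for the example.

```python
from collections import defaultdict


class CommitDependencyTracker:
    """Toy model of how commit dependencies spread and are cleared."""

    def __init__(self):
        # entity name -> set of uncommitted transaction ids it depends on
        self.deps = defaultdict(set)

    def fs_write(self, process, txn):
        # A file system modification ties the process to a journal transaction.
        self.deps[process].add(txn)

    def write(self, process, obj):
        # Writing to an object (pipe, file, socket) passes dependencies to it.
        self.deps[obj] |= self.deps[process]

    def read(self, process, obj):
        # Reading from an uncommitted object makes the reader uncommitted too.
        self.deps[process] |= self.deps[obj]

    def is_uncommitted(self, entity):
        return bool(self.deps[entity])

    def commit(self, txn):
        # The transaction is durable: drop it everywhere, report who is clean.
        released = []
        for entity, pending in self.deps.items():
            if txn in pending:
                pending.discard(txn)
                if not pending:
                    released.append(entity)
        return sorted(released)


if __name__ == "__main__":
    tracker = CommitDependencyTracker()
    tracker.fs_write("editor", 1)      # editor modified a file in transaction 1
    tracker.write("editor", "pipe")    # editor then wrote to a pipe
    tracker.read("logger", "pipe")     # logger read from that pipe
    print(tracker.is_uncommitted("logger"))
    print(tracker.commit(1))           # entities whose output can be released
```

Here the logger never touched the file system, yet its output must be buffered until transaction 1 commits, because what it prints may be causally derived from the editor's uncommitted write.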
Buffering Externalized Output (1)
Buffering Externalized Output (2)
Result: xsyncfs
● adapted ext3 file system to use external synchrony
  – internally works asynchronously, but looks synchronous
● commits journal transaction when:
  – journal space exhausted, journal old, ...
  – user calls fsync()
  – buffered output is waiting on it (output-triggered commit)
● adapts for throughput/latency optimization
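The commit triggers listed above can be written as a single decision function. This is a hedged sketch, not xsyncfs code: the function, its parameters, the order of the checks, and the 5-second age limit are assumptions chosen for illustration.

```python
def should_commit(journal_used, journal_capacity, journal_age_s,
                  fsync_requested, output_waiting, max_age_s=5.0):
    """Return the reason to commit the journal transaction now, or None.

    All thresholds are hypothetical; only the list of triggers comes
    from the slide.
    """
    if fsync_requested:
        return "fsync"             # the user explicitly asked for durability
    if output_waiting:
        return "output-triggered"  # someone is waiting to observe output
    if journal_used >= journal_capacity:
        return "journal-space"     # no room left to keep batching
    if journal_age_s >= max_age_s:
        return "journal-age"       # the transaction has been open too long
    return None                    # keep batching writes for throughput


if __name__ == "__main__":
    print(should_commit(10, 100, 1.0, False, False))  # nothing forces a commit
    print(should_commit(10, 100, 1.0, False, True))   # buffered output is waiting
```

One way to read the throughput/latency bullet: while nobody is waiting on output, the transaction can keep growing and commits are batched, and as soon as output is pending the commit happens right away to keep latency low.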
xsyncfs: Performance
(figures: PostMark benchmark, Apache build)
Implementation: Checkpointing a Process
● checkpoint: a state-image of a process
  – copy-on-write fork of the process
  – not placed on the run queue
● output of the running process is buffered while the process is speculative (with a checkpoint)
● depending on the result of the speculation:
  – success: the checkpoint is discarded
  – failure: the process is terminated; the checkpoint assumes its identity and is placed on the run queue
Propagating Causal Dependencies
Result: SpecNFS
● preserves existing NFS semantics
  – including close-to-open consistency
● offers much better performance than NFS
● implemented using the same RPCs
  – but in an asynchronous, speculative manner
● follows the external-synchrony paradigm
  – what is observed has been committed
Result: BlueFS
● strong consistency and safety guarantees
  – single-copy file semantics (shared local disk)
● still good performance
  – still outperforms NFS
● prior to read/write, cached versions are speculated to be valid
  – in case of access conflict, roll-back occurs
SpecNFS & BlueFS: Performance
(figures: PostMark benchmark, Apache build)
SpecNFS & BlueFS: Performance
(figure: Apache build)
Conclusions
● Concept of speculation/roll-back introduced
  – known in fault tolerance research already
  – applicable to general I/O issues
  – “Expecting the best, being prepared for the worst”
● Might help resolve the tension between performance and durability in file systems
  – not “proven by time” yet, but looks good
● The idea is applicable in a broader context
  – distributed simulations, processor cache warm-up