Towards Automatically Checking
Thousands of Failures with Micro-Specifications
Haryadi S. Gunawi, Thanh Do†, Pallavi Joshi,
Joseph M. Hellerstein, Andrea C. Arpaci-Dusseau†,
Remzi H. Arpaci-Dusseau†, Koushik Sen
University of California, Berkeley
† University of Wisconsin, Madison
Cloud Era
Solve bigger human problems
Use clusters of thousands of machines
Failures in The Cloud
“The future is a world of failures everywhere” - Garth Gibson
“Recovery must be a first-class operation” - Raghu Ramakrishnan
“Reliability has to come from the software” - Jeffrey Dean
Why Is Failure Recovery Hard?
• Testing is not advanced enough against complex failures
  – Diverse, frequent, and multiple failures
  – Facebook photo loss
• Recovery is underspecified
  – Need to specify failure recovery behaviors
  – Customized well-grounded protocols
• Example: Paxos Made Live – An Engineering Perspective [PODC ’07]
Our Solutions
• FTS (“FATE”) – Failure Testing Service
  – New abstraction for failure exploration
  – Systematically exercise 40,000 unique combinations of failures
• DTS (“DESTINI”) – Declarative Testing Specification
  – Enable concise recovery specifications
  – We have written 74 checks (3 lines / check)
• Note: Names have changed since the paper
Summary of Findings
• Applied FATE and DESTINI to three cloud systems: HDFS, ZooKeeper, Cassandra
• Found 16 new bugs
• Reproduced 74 bugs
• Problems found
  – Inconsistency
  – Data loss
  – Rack awareness broken
  – Unavailability
Outline
Introduction
• FATE
• DESTINI
• Evaluation
• Summary
[Figure: write pipeline with master M, client C, and nodes 1, 2, 3 (plus node 4 after recovery), with failures X1, X2, X3 injected at the allocation request, setup stage, and data transfer stage. Panels: No failures; Setup stage recovery: recreate fresh pipeline; Data transfer stage recovery: continue on surviving nodes; Bug in data transfer stage recovery]
Failures at DIFFERENT STAGES lead to DIFFERENT FAILURE BEHAVIORS
Goal: Exercise different failure recovery paths
FATE
• A failure injection framework
  – Targets I/O points
  – Systematically explores failures
  – Multiple failures
• New abstraction of failure scenario
  – Remembers injected failures
  – Increases failure coverage
[Figure: pipeline M, C, 1, 2, 3 with failures (X) injected at multiple I/O points]
Failure ID
[Figure: I/O between Node 2 and Node 3]
Category          Field            Value
Static            Func. call       OutputStream.read()
                  Source file      BlockReceiver.java
Dynamic           Stack trace      …
Domain specific   Source           Node 2
                  Destination      Node 3
                  Net. message     Data Packet
Failure type                       Crash After
Hash                               12348729
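A failure ID of this shape can be modelled as a small record whose hash identifies one failure scenario. The following is only a minimal Python sketch, not FATE's actual implementation (which the slides say collects this information with AspectJ). The class name `FailureID`, its field names, and the choice of a truncated SHA-1 digest are assumptions made for illustration, and the dynamic stack trace field is left out.

```python
import hashlib
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class FailureID:
    # Static information about the I/O call site
    func_call: str
    source_file: str
    # Domain-specific information recovered from the I/O
    source_node: str
    dest_node: str
    net_message: str
    # Kind of failure to inject at this point
    failure_type: str

    def hash_id(self) -> str:
        """Stable digest over all fields; equal IDs always hash the same."""
        joined = "|".join(astuple(self))
        return hashlib.sha1(joined.encode("utf-8")).hexdigest()[:8]


fid = FailureID("OutputStream.read()", "BlockReceiver.java",
                "Node 2", "Node 3", "Data Packet", "Crash After")
same = FailureID("OutputStream.read()", "BlockReceiver.java",
                 "Node 2", "Node 3", "Data Packet", "Crash After")
other = FailureID("OutputStream.read()", "BlockReceiver.java",
                  "Node 1", "Node 2", "Data Packet", "Crash After")
```

Two IDs built from the same fields hash identically, so an injected failure can be remembered and recognized in later runs, while a change in any field (here the source and destination nodes) yields a different ID.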
How Do Developers Build a Failure ID?
• FATE intercepts all I/Os
• Uses AspectJ to collect information at every I/O point
  – I/O buffers (e.g., file buffer, network buffer)
  – Target I/O (e.g., file name, IP address)
• Reverse engineer the collected I/O information to obtain domain-specific information
Exploring Failure Space
[Figure: pipeline M, C, 1, 2, 3 with failure points A, B, C]
Single failures:
Exp #1: A
Exp #2: B
Exp #3: C
Combinations of two failures: AB, AC, BC
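The exploration shown on this slide (single failures A, B, C, then pairs AB, AC, BC) can be sketched as a plain enumeration of combinations. This is an illustrative Python sketch under a simplifying assumption: the set of failure points is fixed and known up front. The function name `failure_experiments` is hypothetical and is not part of FATE.

```python
from itertools import combinations


def failure_experiments(failure_ids, max_failures):
    """Enumerate every combination of 1..max_failures distinct failure IDs."""
    experiments = []
    for k in range(1, max_failures + 1):
        experiments.extend(combinations(failure_ids, k))
    return experiments


# Three failure points, up to two failures per experiment.
exps = failure_experiments(["A", "B", "C"], 2)
```

With three failure points and at most two failures per run, this yields the three single-failure experiments followed by the three pairs, matching the slide.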
Outline
Introduction
FATE
• DESTINI
• Evaluation
• Summary
DESTINI
• Enable concise recovery specifications
• Check if expected behaviors match actual behaviors
• Important elements:
  – Expectations
  – Facts
  – Failure events
  – Check timing
• Interpose network and disk protocols
Writing specifications
“Violation if expectation is different from actual facts”
violationTable():- expectationTable(), NOT-IN actualTable()
Datalog syntax:
:-  derivation
,   AND
[Figure: pipeline M, C, 1, 2, 3 with a failed node (X). Panels: Correct recovery; Incorrect recovery]
expectedNodes(Block, Node)
B    Node 1
B    Node 2
actualNodes(Block, Node)
B    Node 1
B    Node 2
incorrectNodes(Block, Node)
(no rows)
incorrectNodes(B, N) :- expectedNodes(B, N), NOT-IN actualNodes(B, N);
[Figure: pipeline M, C, 1, 2, 3 with a failed node (X). Panels: Correct recovery; Incorrect recovery]
BUILD EXPECTATIONS
expectedNodes(Block, Node)
B    Node 1
B    Node 2
CAPTURE FACTS
actualNodes(Block, Node)
B    Node 1
incorrectNodes(Block, Node)
B    Node 2
incorrectNodes(B, N) :- expectedNodes(B, N), NOT-IN actualNodes(B, N);
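The `incorrectNodes` rule is a set difference: every expected (block, node) pair that has no matching actual pair is a violation. A minimal Python sketch of that reading follows. It mirrors the rule with sets of tuples and is not how DESTINI evaluates Datalog; the function name `incorrect_nodes` is an assumption for illustration.

```python
def incorrect_nodes(expected_nodes, actual_nodes):
    """incorrectNodes(B, N) :- expectedNodes(B, N), NOT-IN actualNodes(B, N)."""
    return expected_nodes - actual_nodes


expected = {("B", "Node 1"), ("B", "Node 2")}

# Correct recovery: both surviving nodes hold the block.
correct_actual = {("B", "Node 1"), ("B", "Node 2")}

# Incorrect recovery: only Node 1 holds the block.
buggy_actual = {("B", "Node 1")}
```

With the tables from the two slides, the correct recovery produces an empty violation table, and the incorrect recovery reports Node 2 for block B.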
Building Expectations
expectedNodes(B, N) :- getBlockPipe(B, N);
expectedNodes(Block, Node)
B    Node 1
B    Node 2
B    Node 3
[Figure: pipeline M, C, 1, 2, 3; the client asks the master for the pipeline of block B]
Client to Master: “Give me list of nodes for B”
Master to Client: [Node 1, Node 2, Node 3]
Updating Expectation
DEL expectedNodes(B, N) :- fateCrashNode(N), writeStage(B, Stage),
    Stage == "Data Transfer", expectedNodes(B, N);
expectedNodes(Block, Node)
B    Node 1
B    Node 2
B    Node 3
[Figure: pipeline M, C, 1, 2, 3 with a crashed node (X)]
• “Client receives all acks from setup stage” → writeStage enters the Data Transfer stage
• Precise failure events
  - Different stages → different recovery behaviors → different specifications
  - FATE and DESTINI must work hand in hand
setupAcks (B, Pos, Ack) :- cdpSetupAck (B, Pos, Ack);
goodAcksCnt (B, COUNT<Ack>) :- setupAcks (B, Pos, Ack), Ack == 'OK';
nodesCnt (B, COUNT<Node>) :- pipeNodes (B, _, N, _);
writeStage (B, Stg) :- nodesCnt (NCnt), goodAcksCnt (ACnt), NCnt == ACnt, Stg := "Data Transfer";
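The stage-dependent update can be read as two steps: derive the write stage from the setup acks, then delete a crashed node from the expectation only if the crash happened during data transfer. The Python sketch below illustrates that reading under stated assumptions. The function names, the dictionary of acks keyed by pipeline position, and the label "Setup" for the earlier stage are all hypothetical, not taken from DESTINI.

```python
def write_stage(pipe_nodes, setup_acks):
    """Return "Data Transfer" once every pipeline node has sent an 'OK' setup ack."""
    good_acks = sum(1 for ack in setup_acks.values() if ack == "OK")
    return "Data Transfer" if good_acks == len(pipe_nodes) else "Setup"


def on_crash(expected, block, node, stage):
    """DEL expectedNodes(B, N) only when the crash happens during data transfer."""
    if stage == "Data Transfer":
        return expected - {(block, node)}
    return expected


pipe = ["Node 1", "Node 2", "Node 3"]
expected = {("B", n) for n in pipe}

all_acked = write_stage(pipe, {0: "OK", 1: "OK", 2: "OK"})
partly_acked = write_stage(pipe, {0: "OK", 1: "OK"})

after_transfer_crash = on_crash(expected, "B", "Node 3", all_acked)
after_setup_crash = on_crash(expected, "B", "Node 3", partly_acked)
```

A crash of Node 3 during data transfer shrinks the expectation to Nodes 1 and 2, while the same crash during setup leaves the expectation untouched, because a different specification covers that stage.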
Capture Facts
actualNodes(B, N) :- blocksLocation(B, N, Gs), latestGenStamp(B, Gs)
actualNodes(Block, Node)
B    Node 1
blocksLocation(B, N, Gs)
B    Node 1    2
B    Node 2    1
B    Node 3    1
latestGenStamp(B, Gs)
B    2
[Figure: pipeline M, C, 1, 2, 3 with a failed node (X). Panels: Correct recovery; Incorrect recovery. Block copies on the three nodes: B_gs2, B_gs1, B_gs1]
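The `actualNodes` rule is a join: a node counts as holding block B only if its copy carries the latest generation stamp. A small Python sketch with the values from the tables on this slide follows. It only illustrates the join and is not DESTINI's evaluation engine; the function name `actual_nodes` is an assumption.

```python
def actual_nodes(blocks_location, latest_gen_stamp):
    """actualNodes(B, N) :- blocksLocation(B, N, Gs), latestGenStamp(B, Gs)."""
    return {
        (block, node)
        for (block, node, gs) in blocks_location
        if latest_gen_stamp.get(block) == gs
    }


# Values from the slide: only Node 1 has the copy with generation stamp 2.
blocks_location = {("B", "Node 1", 2), ("B", "Node 2", 1), ("B", "Node 3", 1)}
latest_gen_stamp = {"B": 2}

facts = actual_nodes(blocks_location, latest_gen_stamp)
```

Nodes 2 and 3 still store a copy, but with the stale generation stamp 1, so they do not appear in `actualNodes`.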
Violation and Check-Timing
actualNodes(Block, Node)
B    Node 1
expectedNodes(Block, Node)
B    Node 1
B    Node 2
incorrectNodes(Block, Node)
B    Node 2
incorrectNodes(B, N) :- expectedNodes(B, N), NOT-IN actualNodes(B, N),
    cnpComplete(B);
• While recovery is still ongoing, specifications are temporarily violated
• Need precise events to decide when the check should be done
  – In this example, upon block completion
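The effect of the check-timing event can be sketched by gating the violation check on block completion. This Python sketch is an illustration only: the function name `check_on_complete` and the set `completed_blocks` (standing in for the `cnpComplete` event) are assumptions.

```python
def check_on_complete(expected, actual, completed_blocks):
    """Report missing (block, node) pairs only for blocks that have completed."""
    return {
        (block, node)
        for (block, node) in expected - actual
        if block in completed_blocks
    }


expected = {("B", "Node 1"), ("B", "Node 2")}
actual = {("B", "Node 1")}

# Recovery still ongoing: block B has not completed, so nothing is reported.
during_recovery = check_on_complete(expected, actual, set())

# Block B completed: the missing replica is now a real violation.
after_completion = check_on_complete(expected, actual, {"B"})
```

Without the completion gate, the same tables would raise a false alarm while recovery is still in progress.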
Rules
r1   incorrectNodes (B, N) :- cnpComplete (B), expectedNodes (B, N), NOT-IN actualNodes (B, N);
r2   pipeNodes (B, Pos, N) :- getBlkPipe (UFile, B, Gs, Pos, N);
r3   expectedNodes (B, N) :- getBlkPipe (UFile, B, Gs, Pos, N);
r4   DEL expectedNodes (B, N) :- fateCrashNode (N), pipeStage (B, Stg), Stg == 2, expectedNodes (B, N);
r5   setupAcks (B, Pos, Ack) :- cdpSetupAck (B, Pos, Ack);
r6   goodAcksCnt (B, COUNT<Ack>) :- setupAcks (B, Pos, Ack), Ack == 'OK';
r7   nodesCnt (B, COUNT<Node>) :- pipeNodes (B, _, N, _);
r8   pipeStage (B, Stg) :- nodesCnt (NCnt), goodAcksCnt (ACnt), NCnt == ACnt, Stg := 2;
r9   blkGenStamp (B, Gs) :- dnpNextGenStamp (B, Gs);
r10  blkGenStamp (B, Gs) :- cnpGetBlkPipe (UFile, B, Gs, _, _);
r11  diskFiles (N, File) :- fsCreate (N, File);
r12  diskFiles (N, Dst) :- fsRename (N, Src, Dst), diskFiles (N, Src, Type);
r13  DEL diskFiles (N, Src) :- fsRename (N, Src, Dst), diskFiles (N, Src, Type);
r14  fileTypes (N, File, Type) :- diskFiles (N, File), Type := Util.getType(File);
r15  blkMetas (N, B, Gs) :- fileTypes (N, File, Type), Type == metafile, Gs := Util.getGs(File);
r16  actualNodes (B, N) :- blkMetas (N, B, Gs), blkGenStamp (B, Gs);
• Capture facts, build expectations from I/O events
  - No need to interpose internal functions
• Specification reuse
  - For the first check, the #rules : #checks ratio is 16:1
  - Overall, the #rules : #checks ratio is 3:1
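Rules r11 to r13 show how facts are captured from I/O events alone: a create event adds a file to a node's view of the disk, and a rename event replaces the old name with the new one. The following Python sketch replays such events in order. The event tuple layout, the function name `apply_fs_events`, and the file names are hypothetical and chosen only for illustration.

```python
def apply_fs_events(events):
    """Replay fsCreate / fsRename events into a set of (node, file) facts."""
    disk_files = set()
    for event in events:
        if event[0] == "fsCreate":            # r11: a created file exists on disk
            _, node, name = event
            disk_files.add((node, name))
        elif event[0] == "fsRename":          # r12 and r13: move a known file
            _, node, src, dst = event
            if (node, src) in disk_files:
                disk_files.discard((node, src))
                disk_files.add((node, dst))
    return disk_files


# Hypothetical event trace for one node.
events = [
    ("fsCreate", "Node 1", "tmp/blk_1.meta"),
    ("fsRename", "Node 1", "tmp/blk_1.meta", "current/blk_1.meta"),
    ("fsRename", "Node 1", "tmp/unknown.meta", "current/unknown.meta"),
]
files = apply_fs_events(events)
```

Only the renamed file remains in the result. The rename of a file that was never created is ignored, matching the `diskFiles (N, Src, ...)` condition in the rule bodies.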
Outline
Introduction
FATE
DESTINI
• Evaluation
• Summary
Evaluation
• FATE: 3900 lines, DESTINI: 1200 lines
• Applied FATE and DESTINI to three cloud systems
  – HDFS, ZooKeeper, Cassandra
• 40,000 unique combinations of failures
• Found 16 new bugs, reproduced 74 bugs
• 74 recovery specifications
  – 3 lines / check
Bugs found
• Reduced availability and performance
• Data loss due to multiple failures
• Data loss in log recovery protocol
• Data loss in append protocol
• Rack awareness property is broken
Conclusion
• FATE explores multiple failures systematically
• DESTINI enables concise recovery specifications
• FATE and DESTINI: a unified framework
  – Testing recovery specifications requires a failure service
  – Failure service needs recovery specifications to catch recovery bugs
Thank you!
QUESTIONS?
Download our full TR paper from these websites:
The Advanced Systems Laboratory: http://www.cs.wisc.edu/adsl
Berkeley Orders of Magnitude: http://boom.cs.berkeley.edu
New Challenges
• Exponential growth of multiple failures
  – FATE exercised 40,000 failure combinations in 80 hours
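A quick back-of-the-envelope calculation makes the challenge concrete. The Python sketch below derives the average time per experiment from the two figures on the slide, then counts the combinations for a hypothetical system with 100 failure points. That number is an assumption chosen for illustration and is not a measurement from the talk.

```python
import math

# From the slide: 40,000 failure combinations in 80 hours.
seconds_per_experiment = 80 * 3600 / 40_000


def num_combinations(failure_points, failures_per_run):
    """Number of ways to pick failures_per_run distinct failure points."""
    return math.comb(failure_points, failures_per_run)


# Hypothetical system with 100 failure points.
growth = [num_combinations(100, k) for k in (1, 2, 3)]
```

At roughly 7.2 seconds per experiment, moving from two to three simultaneous failures in this hypothetical setting multiplies the number of runs by more than thirty.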
DESTINI vs. Related works
Framework     # Checks    Lines/check
D3S           10          53
Pip           44          43
WiDS          15          22
P2 Monitor    11          12
DESTINI       74          3
FATE Architecture
[Figure: components: HDFS, Java SDK, Failure Surface, Filters, Failure Server (decides “Fail / No Fail?”), Workload Driver]
Workload Driver:
while (server injects new failureIDs) {
    runWorkload(); // e.g., hdfs.write
}
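The workload driver loop can be sketched as a server that remembers which failure IDs it has already injected, and a driver that re-runs the workload until no new failure ID is injected. This is a simplified Python illustration, not FATE's code. The class `FailureServer`, the functions `run_workload` and `drive`, the one-failure-per-run policy, and the fact that the workload stops at the injected failure are all assumptions.

```python
class FailureServer:
    """Remembers injected failure IDs so each run exercises a new one."""

    def __init__(self):
        self.explored = set()
        self.injected_now = None

    def start_experiment(self):
        self.injected_now = None

    def should_fail(self, failure_id):
        # Inject at most one failure per run, and only one not yet explored.
        if self.injected_now is None and failure_id not in self.explored:
            self.explored.add(failure_id)
            self.injected_now = failure_id
            return True
        return False


def run_workload(server, io_points):
    """Hypothetical workload: walks its I/O points until a failure is injected."""
    for point in io_points:
        if server.should_fail(point):
            return point
    return None


def drive(io_points):
    """Re-run the workload while the server keeps injecting new failure IDs."""
    server = FailureServer()
    injected = []
    while True:
        server.start_experiment()
        failed_at = run_workload(server, io_points)
        if failed_at is None:  # no new failure ID was injected: exploration done
            break
        injected.append(failed_at)
    return injected


order = drive(["A", "B", "C"])
```

Each run fails at the first I/O point that has not been exercised yet, so three I/O points take three failing runs plus one final clean run that ends the loop.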
DESTINI
stateY(..) :- cnpEv(..), state(X);
[Figure: DESTINI and FATE attached to the nodes labeled N, D, and C]
Current State of the Art:
• Failure exploration
  - Rarely deals with multiple failures
  - Or uses a random approach
• System specifications
  - Unit test checking: cumbersome
  - WiDS, Pip: not integrated with a failure service
[Figure: pipeline M, C, 1, 2, 3 (plus node 4 after recovery) with failures X1, X2, X3. Panels: No failures; Recovery 1: Recreate fresh pipeline; Recovery 2: Continue on surviving nodes; Bug in recovery 2]
Failure IDs shown in the figure:
Static: InputStream.read(); Domain: Src: Node 1, Dest: Node 2, Type: Data Transfer
Static: InputStream.read(); Domain: Src: Node 2, Dest: Node 3, Type: Data Transfer
Static: InputStream.read(); Domain: Src: Node 1, Dest: Node 2, Type: Setup