fault tolerance chapter 7. topics basic concepts failure models redundancy agreement and consensus...

Fault Tolerance

Chapter 7

Topics

Basic Concepts

Failure Models

Redundancy

Agreement and Consensus

Client Server Communication

Group Communication and Virtual Synchrony

Atomic Commit, Recovery, Checkpointing

Basic Concepts

An important goal in DS is to make the system resilient to failures of some of the components. Fault tolerance (FT) is frequently one of the reasons for making it distributed in the first place.

Dependability includes:

• Availability

• Reliability

• Safety

• Maintainability

Goals

• Availability: Can I use it now? Probability of being up at any given time.

• Reliability: Will it be up as long as I need it? Ability to run continuously without failure. If a system crashes briefly every hour, it may still have good availability (it is up most of the time) but poor reliability, because it cannot run for very long before crashing.

• Safety: If it fails, can we ensure nothing bad happens?

• Maintainability: How easy is it to fix if it breaks?
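The difference between availability and reliability can be made concrete with a little arithmetic. The sketch below assumes a hypothetical system that is down for 1 second in every hour; the figures and the function name are illustrative and do not come from the slides.

```python
def availability(uptime_seconds: float, downtime_seconds: float) -> float:
    """Fraction of total time during which the system is up."""
    return uptime_seconds / (uptime_seconds + downtime_seconds)


# Hypothetical system: crashes for 1 second once every hour.
hourly = availability(uptime_seconds=3599, downtime_seconds=1)

# The longest failure-free run is still shorter than one hour,
# no matter how close the availability is to 1.
longest_run_hours = 3599 / 3600
```

Under these assumed numbers the availability is above 99.9%, yet no job that needs more than an hour of uninterrupted service can complete.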

Definitions

• FAULT - A fault is the cause of an error.

• FAULT TOLERANCE - A system can continue to function even in the presence of faults.

• Classification of faults:

– Transient faults - occur once, then disappear.

– Intermittent faults - occur, go away, come back, go away again, and so on.

– Permanent faults - do not go away by themselves, like disk failures.

Failure Models

Different types of failures.

Type of failure: Description

• Crash failure or fail-stop: A server halts, but is working correctly until it halts.

• Omission failure: A server fails to respond to incoming requests.
– Receive omission: A server fails to receive incoming messages.
– Send omission: A server fails to send messages.

• Timing failure: A server's response lies outside the specified time interval.

• Response failure: The server's response is incorrect.
– Value failure: The value of the response is wrong.
– State transition failure: The server deviates from the correct flow of control.

• Arbitrary or Byzantine failure: A server may produce arbitrary responses at arbitrary times.

Network Failures

• Link failure (one-way or two-way): for example, 5 can talk to 6, but 6 cannot talk to 5.

• Network partitions: the network {1,2,3,4,5,6} is partitioned into {1,2,3,4} and {5,6}.

[Figure: a network of six nodes, numbered 1 to 6.]

Are The Models Realistic?

No, of course not!

Synchronous vs. asynchronous:

– The asynchronous model is too weak (real systems have clocks, and "most" timing meets expectations, but with heavy tails).

– The synchronous model is too strong (real systems lack a way to implement synchronized rounds).

Failure types:

– The crash-fail (fail-stop) model is too weak (systems usually display some odd behavior before dying).

– The Byzantine model is too strong (it assumes an adversary of arbitrary speed who designs the "ultimate attack").

Models: Justification

• If we can do something in the asynchronous model, we can probably do it even better in a real network.
– Clocks and a-priori knowledge can only help.

• If we can't do something in the synchronous model, we can't do it in a real network.
– After all, synchronized rounds are a powerful, if unrealistic, capability to introduce.

• If we can survive Byzantine failures, we can probably survive a real distributed system.

Fault Tolerance Strategies

• Redundancy
– Hardware, software, informational, temporal

• Hierarchy
– Confinement of errors

Failure Masking by Redundancy

Triple modular redundancy. Voter circuits choose the majority of their inputs to determine the correct output.
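A voter of this kind can be sketched in a few lines. The function below is an illustrative software model of a bitwise majority voter over three module outputs, not a description of any particular circuit; the name `tmr_vote` is an assumption.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three module outputs.

    Each output bit is 1 exactly when at least two of the three
    corresponding input bits are 1, so one faulty module is masked.
    """
    return (a & b) | (b & c) | (a & c)


# Two modules agree on 0b1010; the third one is faulty.
masked = tmr_vote(0b1010, 0b1010, 0b0110)
```

If two modules fail in different ways, the voter can no longer be relied on, which is why a single stage masks only one fault.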

Flat Groups versus Hierarchical Groups

a) Communication in a flat group.

b) Communication in a simple hierarchical group.

Identical Processes, Fail-stop

• A system is K fault tolerant if it can withstand faults in K components and still produce correct results.

• Example: FT through replication - each replica reports a result. If the nodes in a DS are fail-stop and there are K+1 identical processes, then the system can tolerate K failures: the result comes from the remaining one.


Identical Processes, Byzantine Failures

• If K failures are Byzantine (with K-collusion) then 2K+1 processes are needed for K FT.

• Example: K processes can be faulty and "lie" about their result. (If they simply fail to report a result, that is not a problem). If there are 2K+1 processes, at least K+1 will be correct and report the same correct answer. So by taking the result reported by at least K+1 (which is a majority), we get the correct answer.
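The K+1 rule can be written as a small decision function. This is a minimal sketch assuming each replica reports at most one result and that silent replicas are simply missing from the input; the name `decide` is hypothetical.

```python
from collections import Counter
from typing import Hashable, Iterable, Optional


def decide(reports: Iterable[Hashable], k: int) -> Optional[Hashable]:
    """Return the value reported by at least k+1 replicas, or None.

    With 2k+1 replicas and at most k liars, a wrong value can collect
    at most k reports, so only the correct value can reach k+1.
    """
    counts = Counter(reports)
    for value, n in counts.items():
        if n >= k + 1:
            return value
    return None


# K = 2, so 2K+1 = 5 replicas; two of them lie and report 7.
answer = decide([42, 42, 42, 7, 7], k=2)
```

The same function covers replicas that fail to report: `decide([42, 42, 42], k=2)` still returns 42, because the three correct reports already reach K+1.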

Agreement section 7.2.3

• Distributed agreement or "distributed consensus" is the fundamental problem in DS.

– Distributed mutual exclusion and election are basically getting processes to agree on something.

– Agreeing on time or the update of replicated data are special cases of the distributed consensus problem.

• Agreement sometimes means one process proposes a value and the others agree on it while consensus means all processes propose values and all agree on some function of those values.

Consensus (Agreement)

• There are M processes, P1, P2, …, PM, in a DS that are trying to reach agreement. A subset F of the processes is faulty. Each process Pi stores a value Vi. During agreement, each process calculates a value Ai. At the end of the algorithm:

– All non-faulty processes reach a decision.

– For every pair of non-faulty processes Pi and Pj, Ai = Aj. This is the agreement value.

– The agreement value is a function of the initial values {Vi} of the non-faulty processes.

• The function is often max (as in the case of election) or average or one of the Vi. If all non-faulty processes have the same Vi, then that must be the agreement value.
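The conditions above can be phrased as a checker over the outcome of a run. The sketch below assumes the run is summarized by three hypothetical inputs: the initial values Vi, the decided values Ai (absent if a process never decided), and the faulty set F. It checks only the "same Vi" special case of the third condition.

```python
def check_consensus(initial: dict, decided: dict, faulty: set) -> bool:
    """Check one run against the consensus conditions.

    initial: {process: Vi}, decided: {process: Ai}, faulty: the set F.
    """
    correct = [p for p in initial if p not in faulty]

    # Every non-faulty process must reach a decision.
    if any(p not in decided for p in correct):
        return False

    # All non-faulty processes must decide the same value.
    decisions = {decided[p] for p in correct}
    if len(decisions) != 1:
        return False

    # If all non-faulty processes start with the same Vi,
    # that value must be the agreement value.
    starts = {initial[p] for p in correct}
    if len(starts) == 1 and decisions != starts:
        return False

    return True


ok = check_consensus(
    initial={"P1": 3, "P2": 3, "P3": 9},
    decided={"P1": 3, "P2": 3, "P3": 0},
    faulty={"P3"},
)
```

Values held or decided by faulty processes are ignored, which matches the definition: only the non-faulty processes are constrained.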

Consensus: Easy Case: No Failures

• No failures, synchronous, M processes.

• If there can be no failures, reaching consensus is easy. Every process sends its value to every other process. All processes now have identical information.

• All processes do the same calculation and come up with the same value. Processes need to maintain an array of M values.

P1 has {1,2,3,4}

P2 has {1,2,3,4}

P3 has {1,2,3,4}

P4 has {1,2,3,4}


Consensus: Fail-stop

• Fairly Easy case: fail-stop, synchronous

• If faulty processes are fail-stop, reaching consensus is reasonably easy: all non-faulty processes send their values to all others. However, K of them may fail at some time during the process.

P1 has {1,2,3,4}

P2 has {1,2,3,4}

P3 has {x,2,3,4}

P4 has {x,2,3,4}

(Here x marks a value that was never received: P1 crashed after sending its value to P2 only.)


• Solution: after all processes have sent their values to all others, every process broadcasts all the values it received (and from whom).

• This continues for f+1 rounds where f = |F|. Processes maintain a tree of values.

• After the second round, P4 has:

– 1st round: {x,2,3,4}

– from P2: {1,2,3,4}

– from P3: {x,2,3,4}


• If M=4 and f=1, then we need f+1=2 rounds to reach consensus (previous example).

• Do we really need f+1 rounds? Consider M=4, f=2.

• P1 crashes during the 1st round after sending to P2. P2 crashes during the 2nd round after sending to P3.

After the 1st round:

P2: {1,2,3,4}

P3: {x,2,3,4}

P4: {x,2,3,4}

Consensus: Fail stop

What do P3 and P4 see?

          P2                     P3           P4
Round 1   {1,2,3,4}              {x,2,3,4}    {x,2,3,4}
Round 2   sends to P3 and dies   {1,2,3,4}    {x,2,3,4}
Round 3   -                      {1,2,3,4}    {1,2,3,4}

If processes are fail-stop, we can tolerate any number of faulty processes; however, we need f+1 rounds.
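The M=4, f=2 scenario can be replayed with a small simulation. This is a simplified sketch: instead of the tree of values mentioned earlier, each process just merges everything it hears into one set of known values, which is enough when failures are crashes only. The function name and the `crashes` schedule format are assumptions made for the example.

```python
def flood_consensus(values: dict, f: int, crashes: dict) -> dict:
    """Run f+1 synchronous rounds of 'broadcast everything you know'.

    values:  {pid: initial value}
    crashes: {pid: (round, recipients reached before crashing)},
             a hypothetical crash schedule.
    Returns {pid: known values} for the processes still alive.
    """
    known = {p: {p: v} for p, v in values.items()}
    alive = set(values)
    for rnd in range(1, f + 2):  # rounds 1 .. f+1
        outbox = []
        for sender in sorted(alive):
            if sender in crashes and crashes[sender][0] == rnd:
                targets = crashes[sender][1]  # partial send, then crash
            else:
                targets = [p for p in values if p != sender]
            snapshot = dict(known[sender])  # state at the start of the round
            outbox.extend((t, snapshot) for t in targets)
        alive -= {p for p, (r, _) in crashes.items() if r == rnd}
        for target, snapshot in outbox:
            if target in alive:
                known[target].update(snapshot)
    return {p: known[p] for p in alive}


values = {1: 1, 2: 2, 3: 3, 4: 4}
crashes = {1: (1, [2]), 2: (2, [3])}  # P1 reaches only P2; P2 reaches only P3

three_rounds = flood_consensus(values, f=2, crashes=crashes)
two_rounds = flood_consensus(values, f=1, crashes=crashes)
```

With f+1 = 3 rounds, P3 and P4 end up with the same set and can apply the same function to it. Stopping after 2 rounds leaves P4 without P1's value while P3 has it, which is why the extra round is needed.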

Difficult Case: Agreement with Byzantine Failures

• We will look at agreement (single proposer) rather than consensus (all propose values).

• A faulty process may respond like a non-faulty process, so the non-faulty processes do not know who is faulty. A faulty process can send a fake value to throw off the calculation, and can send one value to some processes and a different value to others.

• A faulty process is an adversary and can see the global state: it has more information than the non-faulty nodes. However, it can only control the behavior of the faulty processes.

Variations on Byzantine Agreement

• Process always knows who sent the received message.

• Default value - some algorithms assume a default value (retreat) when there is no agreement.

• Oral messages - message content is controlled by the latest sender (relayer), so the receiver doesn't know whether or not it was tampered with.

• Signed messages - messages can be authenticated with digital signatures. Assume faulty processes can send arbitrary messages but they cannot forge signatures.

BA with Oral Messages(1)

Commanding general coordinates other generals.

If all loyal generals attack, victory is certain.

If none attack, the Empire survives.

If some attack, the Empire is lost.

Gong keeps time.

Attack!


How it works.

• Disloyal generals have corrupt soldiers.

• Orders are distributed by exchange of messages; corrupt soldiers violate protocol at will.

• But corrupt soldiers can't intercept and modify messages between loyal generals.

• The gong sounds slowly: there is ample time for exchange of messages.

• Commanding general sends his order. Then all other generals relay to all what they received.


• Limitations

• Let t be the maximum number of faulty processes (disloyal generals).

• Byzantine agreement is not possible with fewer than 3t+1 processes

• Same result holds for fault-tolerant consensus in the Byzantine model

Byzantine Consensus Oral Messages(1)

The Byzantine generals problem for 3 loyal generals and 1 traitor.

a) The generals announce their troop strengths (in units of 1 kilosoldiers) to all other generals.

b) The vectors that each general assembles based on (a).

c) Additional vectors that each general receives in the next round (all send what they received to all). Decide by majority. If no majority, use default value.
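The two-round exchange can be simulated for a single traitor. The sketch below uses one common way of counting votes (an assumption, since the slide only says "decide by majority"): for each other general, a loyal general combines the value it heard directly with what the remaining generals relay, and does not ask a general to relay its own value. All names, and the traitor's lies, are hypothetical.

```python
from collections import Counter

UNKNOWN = None  # stands in for the default value


def majority(votes):
    """Strict majority of the votes, or UNKNOWN if there is none."""
    value, n = Counter(votes).most_common(1)[0]
    return value if n > len(votes) / 2 else UNKNOWN


def oral_messages(strengths, traitor, direct_lies, relay_lies):
    """strengths: {general: true value}; one traitor.

    direct_lies[g]:    what the traitor tells loyal general g in round 1.
    relay_lies[g][s]:  what the traitor tells g that general s said.
    Returns the vector each loyal general decides on.
    """
    generals = sorted(strengths)
    loyal = [g for g in generals if g != traitor]

    def told(sender, receiver):  # round 1
        return direct_lies[receiver] if sender == traitor else strengths[sender]

    def relayed(relayer, source, receiver):  # round 2
        if relayer == traitor:
            return relay_lies[receiver][source]
        return told(source, relayer)

    decided = {}
    for g in loyal:
        vector = {}
        for s in generals:
            if s == g:
                vector[s] = strengths[g]
                continue
            votes = [told(s, g)]
            votes += [relayed(r, s, g) for r in generals if r not in (s, g)]
            vector[s] = majority(votes)
        decided[g] = vector
    return decided


four = oral_messages(
    strengths={1: 1, 2: 2, 3: 3, 4: 4},
    traitor=3,
    direct_lies={1: "x", 2: "y", 4: "z"},
    relay_lies={1: {2: "a", 4: "b"}, 2: {1: "c", 4: "d"}, 4: {1: "e", 2: "f"}},
)
```

In this run the three loyal generals all end with the same vector: the true strengths of generals 1, 2 and 4, and UNKNOWN for the traitor. Running the same function with only two loyal generals and one traitor leaves a loyal general's value without a majority, so agreement on it is not guaranteed.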

Byzantine Consensus Oral Messages(2)

The same as in previous slide, except now with 2 loyal generals and one traitor. Majority decision does not guarantee consensus.

BA with Signed Messages (1)

• Faulty process can send arbitrary message, but cannot forge signatures. All messages are digitally signed for authentication.

• Assume at most f faulty nodes. At the start, coordinator sends signed message to each node.

• Each process at round i endorses (authenticates) and forwards all messages received in round i-1.

BA with Signed Messages (2)

• At round f+1, either:

– one value is endorsed by at least f+1 nodes: decide on that (majority) value,

– or else the coordinator is faulty.

• If the coordinator is faulty:

– either abort,

– or retry after a leader election to choose a new coordinator.

• f+1 rounds proven to be necessary and sufficient. Must have f+2 processes.
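The decision step at round f+1 can be sketched as follows. The sketch assumes signature checking has already happened elsewhere, so the input is simply a hypothetical map from each value to the set of nodes whose valid endorsements it carries.

```python
def decide_signed(endorsements: dict, f: int):
    """Decision rule applied at round f+1.

    endorsements: {value: set of node ids that validly endorsed it}.
    Returns the decided value, or None when the coordinator must be
    treated as faulty (no value, or more than one, reaches f+1).
    """
    winners = [v for v, nodes in endorsements.items() if len(nodes) >= f + 1]
    return winners[0] if len(winners) == 1 else None


f = 1
agreed = decide_signed({"attack": {"n1", "n2", "n3"}}, f)
split = decide_signed({"attack": {"n1"}, "retreat": {"n2"}}, f)
```

A `None` result corresponds to the "coordinator is faulty" branch, after which the caller would abort or retry with a newly elected coordinator.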

Consensus in Asynchronous Systems

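• All of the preceding agreement and consensus algorithms are for synchronous systems; that is, the algorithms work by sending messages in rounds or phases.

• What about Byzantine consensus in an asynchronous system?

• Provably impossible: no deterministic algorithm can guarantee consensus in an asynchronous system even if only a single process may crash [FLP1985].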

Client-Server Communications

Possible problems:

1. client unable to locate server

2. request message from client to server gets lost

3. server crashes after receiving request

4. reply message from the server to client is lost

5. client crashes after sending request



Possible Solutions

1. client cannot locate server: client reports exception to user.

2. Request message lost: use timeouts and message numbers

3. Server crashes: the client cannot distinguish cases 2, 3, and 4. What to do is application dependent.

4. Reply lost: see #3: timeout and try again (resend original request and hope that it is recognized as a duplicate and that reply needs to be sent again).


5. Client crashes before reply is received; resources are locked up; orphan processes may exist. Upon recovery, release resources and kill processes?

Solution 1 "log and exterminate", keep log of activity and write to stable storage before you send each request - drawback: expense of writing to disk.

Solution 2 "reincarnation": release everything, kill local processes, broadcast msg to kill orphans associated with this process.

Solution 3 "gentle reincarnation": remote process killed if owner cannot be found.

Solution 4 "expiration": remote processes get a timeout value, if not renewed, they can be killed.
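Going back to items 2 and 4 above (lost request, lost reply), the combination of timeouts, message numbers and duplicate recognition can be sketched as below. Everything here is illustrative: `Server`, `LossyChannel` and `call_with_retry` are hypothetical names, the "network" is an in-process object that drops replies on scripted attempts, and the operation is a simple counter update chosen because executing it twice would be visible.

```python
class Server:
    """Executes each numbered request once and caches its reply."""

    def __init__(self):
        self.balance = 0
        self.replies = {}  # (client_id, msg_no) -> cached reply

    def handle(self, client_id, msg_no, amount):
        key = (client_id, msg_no)
        if key in self.replies:
            # Duplicate request: resend the old reply, do not re-execute.
            return self.replies[key]
        self.balance += amount
        self.replies[key] = self.balance
        return self.balance


class LossyChannel:
    """Hypothetical channel that loses the reply on chosen attempts."""

    def __init__(self, server, drop_replies):
        self.server = server
        self.drop_replies = set(drop_replies)
        self.attempt = 0

    def call(self, client_id, msg_no, amount):
        self.attempt += 1
        reply = self.server.handle(client_id, msg_no, amount)
        if self.attempt in self.drop_replies:
            raise TimeoutError("no reply before the timeout")
        return reply


def call_with_retry(channel, client_id, msg_no, amount, max_tries=3):
    """Resend the same numbered request until a reply arrives."""
    for _ in range(max_tries):
        try:
            return channel.call(client_id, msg_no, amount)
        except TimeoutError:
            continue
    raise TimeoutError("gave up after repeated timeouts")


server = Server()
channel = LossyChannel(server, drop_replies={1})  # first reply is lost
reply = call_with_retry(channel, "c1", msg_no=1, amount=100)
```

The client sends the request twice, but the server applies it once, because the second copy carries the same message number and is answered from the reply cache.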
