TRANSCRIPT
Distributed Resilient Consensus
Nitin Vaidya
University of Illinois at Urbana-Champaign
Net-X: Multi-Channel Mesh … Theory to Practice

[Slide: Net-X overview — software architecture (user applications, IP stack, ARP, multi-channel protocol with a channel abstraction module over the interface device drivers), OS improvements, capacity bounds (capacity vs. number of channels), insights on protocol design, and the Net-X testbed at CSL: Linux boxes A–F with fixed and switchable interfaces]
Acknowledgments
• Byzantine consensus
  – Vartika Bhandari
  – Guanfeng Liang
  – Lewis Tseng
• Consensus over lossy links
  – Prof. Alejandro Dominguez-Garcia
  – Prof. Chris Hadjicostis
Consensus … Dictionary Definition
• General agreement
Many Faces of Consensus
• What time is it?
• Network of clocks … agree on a common notion of time
Many Faces of Consensus
• Commit or abort?
• Network of databases … agree on a common action
Many Faces of Consensus
• What is the temperature?
• Network of sensors … agree on the current temperature
Many Faces of Consensus
• Should we trust … ?
• Web of trust … agree whether … is good or evil
Many Faces of Consensus
• Which way?
Many Faces of Consensus
• Which cuisine for dinner tonight?
Korean
Thai
Chinese
Consensus Requires Communication
Korean
Thai
Chinese
• Exchange preferences with each other
Consensus Requires Communication
Korean
Thai
Chinese
CKT
CKT
CKT
• Exchange preferences with each other
Consensus Requires Communication
Korean
Thai
Chinese
CKT → C
CKT → C
CKT → C
• Exchange preferences with each other
• Choose “smallest” proposal
Complications
Most environments are not benign
• Faults
• Asynchrony
Crash Failure
Korean
Thai
Chinese
Round 1
CKT
KT
… fails without sending own preference to …
Crash Failure
Korean
Thai
Chinese
One more round of exchange among fault-free nodes
Round 1
CKT
KT
Round 2
CKT
CKT
Crash Failure
Korean
Thai
Chinese
One more round of exchange among fault-free nodes
Round 1
CKT
KT
Round 2
CKT → C
CKT → C
Crash Failures
Well-known result
… need f+1 rounds of communication in the worst case
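A minimal synchronous sketch of why the extra rounds help. The crash model (a node that delivers to only some receivers, then goes silent) and the inputs are illustrative, not from the talk:

```python
def crash_consensus(inputs, f, crash=None):
    """Flooding min-consensus in a fully connected synchronous system.

    `crash` is an optional, illustrative triple (node, round, receivers):
    in that round the node delivers its message only to `receivers`,
    then stays silent forever (a crash mid-broadcast). After f+1 rounds
    of exchange, all surviving nodes decide on the smallest value seen.
    """
    n = len(inputs)
    known = [{v} for v in inputs]   # proposals each node has seen so far
    dead = set()
    for r in range(f + 1):          # f+1 rounds suffice in the worst case
        snapshot = [s.copy() for s in known]
        for sender in range(n):
            if sender in dead:
                continue
            receivers = range(n)
            if crash and crash[0] == sender and crash[1] == r:
                receivers = crash[2]    # partial broadcast, then crash
                dead.add(sender)
            for i in receivers:
                known[i] |= snapshot[sender]
    return {i: min(known[i]) for i in range(n) if i not in dead}

# node 0 ("Chinese") crashes in round 0 after reaching only node 1;
# the second round lets node 1 relay "Chinese", so survivors agree
decisions = crash_consensus(["Chinese", "Korean", "Thai"], f=1,
                            crash=(0, 0, [1]))
print(decisions)  # {1: 'Chinese', 2: 'Chinese'}
```

With only one round, node 2 would decide "Korean" while node 1 decides "Chinese" — the second round is what repairs the partial broadcast.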
Complications
Most environments are not benign
• Faults
• Asynchrony
Asynchrony
• Message delays arbitrarily large
• Difficult to distinguish between a slow message and a message that is never sent (due to a faulty sender)
Asynchrony + Crash Failure
Korean
Thai
Chinese
Round 1
KT
KT
CKT
Messages from … slow to reach others.
Others wait a while, and give up … suspecting … faulty.
Asynchrony + Crash Failure
Korean
Thai
Chinese
Messages from … slow to reach others
Round 1
KT
KT
CKT
Round 2
KT → K
KT → K
CKT → C
Asynchrony + Crash Failures
Another well-known (disappointing) result
… consensus impossible with asynchrony + failure
Asynchrony + Failures
Impossibility result applies to exact consensus;
approximate consensus is still possible,
even if failures are Byzantine
Byzantine Failure
Korean
Thai
Chinese
Round 1
Byzantine faulty node sends different preferences to different neighbors: Indian, Chinese
Byzantine Failures
Yet another well-known result
… 3f+1 nodes necessary to achieve consensus in the presence of Byzantine faults
Related Work
30+ years of research
• Distributed computing
• Decentralized control
• Social science (opinion dynamics, network cascades)

Timeline:
Pre-history — Hajnal 1958 (weak ergodicity of non-homogeneous Markov chains)
1980: Byzantine exact consensus, 3f+1 nodes
1983: Impossibility of exact consensus with asynchrony & failure
1986: Approximate consensus with asynchrony & failure
Tsitsiklis 1984: Decentralized control (approximate) [Jadbabaei 2003]
Consensus
• 30+ years of research
• Anything new under the sun?
… more refined network models
Our Contributions
• Average consensus over lossy links
• Byzantine consensus
  – Directed graphs
  – Capacitated links
  – Vector inputs
Average Consensus
• Each node has an input
• Nodes agree (approximately) on the average of the inputs
• No faulty nodes
Distributed Iterative Solution … Local Computation
• Initial state a, b, c = input

[Diagram: three nodes a, b, c]
Distributed Iterative Solution
• State update (iteration):

a = 3a/4 + c/4
b = 3b/4 + c/4
c = a/4 + b/4 + c/2
In matrix form, with state vector V = [a b c]^T, the update rules

a = 3a/4 + c/4
b = 3b/4 + c/4
c = a/4 + b/4 + c/2

become

        [3/4   0   1/4]
  V :=  [ 0   3/4  1/4]  V  =  M V
        [1/4  1/4  1/2]

After 1 iteration the state is M V; after 2 iterations, M^2 V; after k iterations, M^k V.
Well-Known Results
Reliable links & nodes, with suitably chosen transition matrix M:
• Consensus achievable iff at least one node can reach all other nodes … row stochastic M
• Average consensus achievable iff the graph is strongly connected … doubly stochastic M
Doubly stochastic M: the example update rules above (a = 3a/4 + c/4, etc.) use a doubly stochastic M — every row and every column sums to 1 — which is what makes M^k drive every state to the average.
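The slide's example matrix can be checked numerically; a small sketch (the inputs 5, 11, 2 are arbitrary illustrative values, not from the talk):

```python
import numpy as np

# Transition matrix encoding the slide's update rules:
#   a := 3a/4 + c/4,  b := 3b/4 + c/4,  c := a/4 + b/4 + c/2
M = np.array([[0.75, 0.00, 0.25],
              [0.00, 0.75, 0.25],
              [0.25, 0.25, 0.50]])

print(M.sum(axis=1))  # rows sum to 1 (row stochastic -> consensus)
print(M.sum(axis=0))  # columns sum to 1 (column stochastic -> mass conserved)

v = np.array([5.0, 11.0, 2.0])   # illustrative inputs for a, b, c
for _ in range(100):             # v := M^k v
    v = M @ v

print(v)  # every entry ≈ 6.0 = (5 + 11 + 2) / 3, the average
```

Because M is doubly stochastic (both checks pass), iterating preserves the sum and pulls all three states to the same value, so the common value must be the average.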
Asynchrony
• Asynchrony results in time-varying transition matrices
• Results hold under mild conditions [Hajnal58]
An Implementation: Mass Transfer + Accumulation
• Each node “transfers mass” to neighbors via messages
• Next state = total received mass

[Diagram: c keeps c/2 and sends c/4 to each of a and b; a keeps 3a/4 and sends a/4 to c; b keeps 3b/4 and sends b/4 to c]

a = 3a/4 + c/4
b = 3b/4 + c/4
c = a/4 + b/4 + c/2
Conservation of Mass
• a+b+c constant after each iteration
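The mass-transfer view can be sketched directly as message passing. The weights are the slide's update rules; the inputs are illustrative:

```python
# weights[sender][receiver]: share of the sender's mass shipped per round.
# Each sender's outgoing shares sum to 1 (column stochastic), so the
# total mass a+b+c is conserved by every iteration.
weights = {
    "a": {"a": 0.75, "c": 0.25},
    "b": {"b": 0.75, "c": 0.25},
    "c": {"a": 0.25, "b": 0.25, "c": 0.50},
}

state = {"a": 5.0, "b": 11.0, "c": 2.0}   # illustrative inputs
total = sum(state.values())

for _ in range(50):
    inbox = {v: 0.0 for v in state}       # mass accumulating at each node
    for sender, shares in weights.items():
        for receiver, w in shares.items():
            inbox[receiver] += w * state[sender]   # "transfer mass"
    state = inbox                          # next state = total received mass
    assert abs(sum(state.values()) - total) < 1e-9  # conservation of mass

print(state)  # each value ≈ 6.0, the average
```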
Wireless Transmissions Unreliable

a = 3a/4 + c/4
b = 3b/4 + c/4
c = a/4 + b/4 + c/2

[Diagram: c’s c/4 transfer is lost in transmission]
Impact of Unreliability

[Diagram: c’s c/4 message to b is lost, so b’s update misses that term]

Conservation of Mass … violated: the lost c/4 simply disappears, so a+b+c shrinks.
Average consensus over lossy links ?
Existing Solution
When mass is not transferred to a neighbor, keep it to yourself

a = 3a/4 + c/4
b = 3b/4 + c/4
c = a/4 + b/4 + c/2 + c/4

[Diagram: the c/4 share meant for b is lost, so c adds it back to itself]
Existing Solutions … Link Model
Common knowledge on whether a message is delivered:
S knows
R knows that S knows
S knows that R knows that S knows
R knows that S knows that R knows that …

[Diagram: sender S, receiver R]
Reality in Wireless Networks
No common knowledge on whether a message is delivered.
Two scenarios:
• A’s message to B lost … B does not send Ack
• A’s message received by B … Ack lost

Need solutions that tolerate lack of common knowledge
Our Solution
• Average consensus without common knowledge … using additional per-neighbor state
Solution Sketch
• S = mass C wanted to transfer to node A in total so far
• R = mass A has received from node C in total so far
• Node C transmits the quantity S … the message may be lost
• When it is received, node A accumulates (S − R)

[Diagram: C holds counter S, A holds counter R; the link carries S, and A credits S − R]
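A tiny single-link sketch of the counter mechanism (the loss rate and the per-round share are illustrative): C always transmits its running total S, and on a successful delivery A credits the difference S − R, so losses only delay mass, they never destroy it.

```python
import random

random.seed(1)
S = 0.0            # total mass C has wanted to transfer to A so far
R = 0.0            # total mass A has credited from C so far
accumulated = 0.0  # what A has actually added to its state

for _ in range(1000):
    S += 0.25                     # C earmarks another c/4-style share for A
    if random.random() < 0.7:     # the message carrying S gets through (70%)
        accumulated += S - R      # A accumulates everything outstanding
        R = S                     # A's counter catches up

assert accumulated == R           # A credited exactly what it recorded
# S - R is only the mass still "in flight" since the last delivery;
# nothing is lost permanently, unlike the naive scheme.
print(S, R, S - R)
```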
What Does That Do ?
• Implements virtual buffers

[Diagram: nodes a, b, c with virtual buffers d, e, f, g on the links]
Dynamic Topology
• When the c→b transmission is unreliable, mass is transferred to buffer d:  d = d + c/4
  … no loss of mass even with message loss
• When the c→b transmission is reliable, mass is transferred to b:  b = 3b/4 + c/4 + d

[Diagram: nodes a, b, c with virtual buffer d on the c→b link]
Time-Varying Column Stochastic Matrix
• Mass is conserved
• Time-varying network … matrix varies over iterations
• Mi = transition matrix for the i-th iteration
State Transitions
• V = state vector
• V[0] = initial state vector
• V[t] = state vector after iteration t

• V[1] = M1 V[0]
• V[2] = M2 V[1] = M2 M1 V[0]
  …
• V[t] = Mt Mt-1 … M2 M1 V[0]
State Transitions
• V[t] = Mt Mt-1 … M2 M1 V[0]
• The matrix product converges to a column stochastic matrix with identical columns

After k iterations the product Mk … M1 has (approximately) identical columns [z w …]^T, so the state vector becomes
  V[k] ≈ [z * sum, w * sum, …]^T,  where sum = sum of the inputs.
State Transitions
• After k iterations, the state of the first node has the form
  z(k) * (sum of inputs)
  where z(k) changes each iteration k
• Does not converge to the average
Solution
• Run two iterations in parallel
  – First: original inputs
  – Second: input = 1
• After k iterations …
  first algorithm: z(k) * (sum of inputs)
  second algorithm: z(k) * (number of nodes)
  ratio = average
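The two parallel runs can be sketched in push-sum style. The random-neighbor gossip and the inputs below are illustrative, not the talk's exact protocol; the point is that the ratio converges even though numerator and denominator individually do not:

```python
import random

random.seed(0)
n = 4
inputs = [3.0, 7.0, 1.0, 9.0]
num = inputs[:]      # first run: original inputs  -> z(k) * (sum of inputs)
den = [1.0] * n      # second run: input = 1       -> z(k) * (number of nodes)

for _ in range(200):
    new_num = [0.0] * n
    new_den = [0.0] * n
    for i in range(n):
        j = random.randrange(n)          # push half the mass to a random node
        new_num[i] += num[i] / 2; new_num[j] += num[i] / 2
        new_den[i] += den[i] / 2; new_den[j] += den[i] / 2
    num, den = new_num, new_den

ratios = [num[i] / den[i] for i in range(n)]
print(ratios)  # every ratio ≈ 5.0 = (3 + 7 + 1 + 9) / 4
```

Both runs use identical (column stochastic) transfers, so each node's num and den carry the same unknown factor z(k), and dividing cancels it.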
[Plot: numerator, denominator, and their ratio vs. time — the ratio converges]
Byzantine Consensus
a = 3a/4 + c/4
b = 3b/4 + c/4
c = a/4 + b/4 + c/2

Byzantine node: c sends c = 2 to one fault-free node and c = 1 to the other … no consensus !
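The failed run can be replayed in a few lines (the initial values, and which neighbor receives which lie, are illustrative): the Byzantine node reports a different state to each neighbor, pulling a and b toward different fixed points.

```python
a, b = 5.0, 11.0   # fault-free nodes' states (illustrative inputs)

for _ in range(100):
    a = 0.75 * a + 0.25 * 2.0   # Byzantine c claims "c = 2" to node a
    b = 0.75 * b + 0.25 * 1.0   # ... but claims "c = 1" to node b

print(a, b)   # a -> 2.0, b -> 1.0: the fault-free nodes never agree
```

Each update contracts toward the claimed value of c (the fixed point of x = 3x/4 + v/4 is v), so the two lies pin a and b a full unit apart.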
Prior Results
• Necessary and sufficient conditions on the undirected communication graph to achieve Byzantine consensus
  – Synchronous systems
  – Asynchronous systems
• 3f+1 nodes, 2f+1 connectivity
Our Results
• Conditions on directed graphs to achieve Byzantine consensus
• Motivated by wireless networks
Link Capacity Constraints
[Diagram: capacitated network with source S; links labeled with capacities such as 1 and 10]
Byzantine Consensus … Lower Bound
• Ω(n²) messages in the worst case
Link Capacity Constraints
• How to quantify the impact of capacity constraints ?
Throughput
• Borrow the notion of throughput from networking
• b(t) = number of bits agreed upon in [0, t]
• Throughput = lim (t→∞) b(t)/t
Problem Definition
• What is the achievable throughput of consensus over a capacitated network ?
• Results so far
  … optimal algorithm for 4 nodes
  … within a factor of 3 in general
Summary
• Many applications of consensus
• Large body of work
• Many interesting problems still open