Introduction to Self-Stabilization
Stéphane Devismes
27/03/2008 2
Self-Stabilization [Dijkstra, 1974]
Example: Dijkstra’s Token Ring
[Figure: a ring of processes, each labeled with its counter value]
Starting from an arbitrary state
[Figure: the same ring started with arbitrary counter values between 0 and 5]
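A minimal sketch of Dijkstra's K-state token ring, started from an arbitrary state. The ring size (5), the value K = 6, the random scheduler, and the function names `privileged`, `step` and `stabilize` are assumptions made for illustration; they are not taken from the slides.

```python
import random

def privileged(x):
    """Return the indices of the processes that currently hold a token."""
    holders = [0] if x[0] == x[-1] else []   # process 0 compares with the last process
    holders += [i for i in range(1, len(x)) if x[i] != x[i - 1]]
    return holders

def step(x, K, rng):
    """Let one privileged process, picked at random, make its move."""
    i = rng.choice(privileged(x))            # at least one process is always privileged
    x[i] = (x[0] + 1) % K if i == 0 else x[i - 1]

def stabilize(x, K, rng, max_steps=10_000):
    """Run until exactly one token remains; return the number of moves taken."""
    steps = 0
    while len(privileged(x)) != 1:
        step(x, K, rng)
        steps += 1
        if steps > max_steps:
            raise RuntimeError("did not stabilize within max_steps")
    return steps

rng = random.Random(2008)
K = 6                                        # assumed number of counter values (K >= ring size)
ring = [rng.randrange(K) for _ in range(5)]  # arbitrary initial state
print("initial state:", ring)
print("moves until a single token remains:", stabilize(ring, K, rng))
print("state after stabilization:", ring)
```

Once a single token remains, each further call to `step` passes it to the next process and never creates a second one.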
Definition: Closure + Convergence
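[Figure: the states of the system, split into illegitimate states and legitimate states]
Convergence: from any illegitimate state, the system reaches a legitimate state in finite time
Closure: from a legitimate state, the system only moves to legitimate states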
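Closure and convergence can be checked exhaustively on a very small instance. The sketch below does so for Dijkstra's token ring with 3 processes and 3 counter values under a central scheduler (one move at a time). The instance size, the scheduler, and all function names are assumptions chosen to keep the state space tiny.

```python
from itertools import product

N, K = 3, 3  # assumed ring size and number of counter values

def enabled(s):
    """Indices of the processes allowed to move in state s."""
    movers = [0] if s[0] == s[-1] else []
    movers += [i for i in range(1, len(s)) if s[i] != s[i - 1]]
    return movers

def successors(s):
    """States reachable from s in one move of one enabled process."""
    result = []
    for i in enabled(s):
        t = list(s)
        t[i] = (s[0] + 1) % K if i == 0 else s[i - 1]
        result.append(tuple(t))
    return result

def legitimate(s):
    """A state is legitimate when exactly one process is enabled."""
    return len(enabled(s)) == 1

def closure_holds():
    """Every move from a legitimate state leads to a legitimate state."""
    return all(legitimate(t)
               for s in product(range(K), repeat=N) if legitimate(s)
               for t in successors(s))

def convergence_holds():
    """Every execution from every state reaches a legitimate state."""
    states = list(product(range(K), repeat=N))
    good = {s for s in states if legitimate(s)}
    changed = True
    while changed:                 # grow the set of states that cannot avoid legitimacy
        changed = False
        for s in states:
            if s not in good and all(t in good for t in successors(s)):
                good.add(s)
                changed = True
    return len(good) == len(states)

print("closure:", closure_holds(), "convergence:", convergence_holds())
```

The convergence check relies on the fact that every state of this ring has at least one enabled process, so a state is only added to `good` when all of its real successors are already there.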
Why Self-Stabilization?
Advantages: Tolerance to transient faults; No initialization; Dynamicity
Drawbacks: Eventually safe; Overcost; No detection of stability
Protocols for:
– Resource Allocation (Mutual Exclusion)
– Broadcast
– Routing
– Overlay (Spanning trees, Routing table)
– …
Around Self-Stabilization (1/2)
Weaker Properties:
– K-Stabilization (no more than K faults)
– Weak-Stabilization (possible convergence)
– Probabilistic Stabilization (probabilistic convergence)
– Pseudo-Stabilization
Aim: circumvent impossibility results
Example: alternating bit protocol
Pseudo-Stabilization?
Self-Stabilization [Dijkstra, 1974]:
Starting from any configuration, a self-stabilizing system reaches in finite time a configuration c such that any suffix starting from c satisfies the intended specification.
Pseudo-Stabilization [Burns, Gouda, and Miller, 1993]:
Starting from any configuration, any execution of a pseudo-stabilizing system has a non-empty suffix that satisfies the intended specification.
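One possible way to write the two definitions side by side is given below. The notation is an assumption introduced here, not taken from the slides: \mathcal{C} is the set of configurations, \mathcal{E}(c) the set of executions starting in c, e = c_0 c_1 c_2 ... an execution, e_{\ge i} its suffix starting at c_i, and SP the set of executions that satisfy the specification.

```latex
% Assumed notation (not from the slides): \mathcal{C}, \mathcal{E}(c), e_{\ge i}, SP as described above.
\begin{align*}
\text{Self-stabilization:}\quad
  & \forall c_0 \in \mathcal{C},\ \forall e \in \mathcal{E}(c_0),\ \exists i \ge 0 :\
    \forall e' \in \mathcal{E}(c_i),\ e' \in SP \\
\text{Pseudo-stabilization:}\quad
  & \forall c_0 \in \mathcal{C},\ \forall e \in \mathcal{E}(c_0),\ \exists i \ge 0 :\
    e_{\ge i} \in SP
\end{align*}
```

Under this reading, the only difference is the last quantifier: self-stabilization constrains every execution that could start from c_i, while pseudo-stabilization constrains only the suffix that actually occurs.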
Self- vs. Pseudo- Stabilization
[Figure: illegitimate states and legitimate states, drawn once for each notion]
Strong Closure (self-stabilization) vs. Ultimate Closure (pseudo-stabilization)
Self- vs. Pseudo- Stabilization
Example: Leader Election
Self-Stabilizing Leader Election: Eventually there is a unique leader that cannot change
Pseudo-Stabilizing Leader Election: We never have the guarantee that the leader will no longer change, but eventually it no longer changes
Remark: no stabilization time in pseudo-stabilization
Around Self-Stabilization (2/2)
Stronger Properties:
– Fault-containment (Quick stabilization when there are few faults)
– Snap-Stabilization (Safety for the tasks started after the faults)
– Byzantine-Tolerant Stabilization
– Fault-Tolerant Stabilization (Stabilization despite crashes)
Aim: circumvent the drawbacks
Fault-Tolerant Stabilizing Leader Election
Carole Delporte-Gallet (LIAFA)
Stéphane Devismes (CNRS, LRI)
Hugues Fauconnier (LIAFA)
Fault-Tolerant Stabilization
– Gopal and Perry, PODC’93
– Beauquier and Kekkonen-Moneta, JSS’97
– Anagnostou and Hadzilacos, WDAG’93
In a partially synchronous model?
Leader Election
Fault-Tolerant Stabilizing Leader Election with:
weak reliability and synchrony assumptions
Model
Network: fully-connected
n Processes: timely; may crash (an arbitrary number of processes may crash)
Variables: initially arbitrarily assigned
Links:
– Unidirectional
– Initially not necessarily empty
– No order on message delivery
– Variable reliability and timeliness assumptions
Communication-Efficiency
[Larrea, Fernandez, and Arevalo, 2000]: « An algorithm is communication-efficient if it eventually only uses n - 1 unidirectional links »
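As a rough illustration of this definition, the sketch below checks a finite trace of sends. "Eventually" cannot be decided on a finite trace, so the check only looks at the suffix after a round index supplied by the caller. The trace format, the function name `is_communication_efficient` and the parameter `stable_from` are assumptions.

```python
def is_communication_efficient(trace, n, stable_from):
    """Check that the rounds from index stable_from on use at most n - 1 links.

    trace: list of rounds, each round being a list of (sender, receiver) sends.
    n: number of processes.
    """
    links = {link for rnd in trace[stable_from:] for link in rnd}
    return len(links) <= n - 1

# Hypothetical 4-process trace: processes 0 and 1 both send to everyone in
# round 0, then only process 0 keeps sending.
everyone = range(4)
round0 = [(s, r) for s in (0, 1) for r in everyone if r != s]
later = [(0, r) for r in everyone if r != 0]
trace = [round0, later, later, later]

print(is_communication_efficient(trace, 4, stable_from=1))
print(is_communication_efficient(trace, 4, stable_from=0))
```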
Self-Stabilizing Leader Election in a full timely network?
Yes + communication-efficiency
Principles of the algorithm
A process p periodically sends ALIVE to every other process if Leader = p
[Figure: processes 1 to 4 with their Leader values; processes 1 and 2 send Alive,1 and Alive,2 to the others]
Principles of the algorithm
When a process p such that Leader = p receives ALIVE from q, then Leader := q if q < p
[Figure: processes 1 to 4 exchanging Alive,1 and Alive,2 messages; a Leader value changes to 1]
Principles of the algorithm
Any process q such that Leader ≠ q always chooses as leader the process from which it most recently received ALIVE
[Figure: processes 1 to 4; only Alive,1 messages are sent and Leader values change to 1]
Principles of the algorithm
On time-out, a process p sets Leader to p
[Figure: processes 1 to 4 starting with Leader=3, Leader=2 and Leader=4; Leader=1 and Leader=2 then appear, and Alive,1 and Alive,2 messages are sent]
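The four principles above can be put together in a small round-based simulation. This is only a sketch under simplifying assumptions that are not in the slides: rounds are synchronous, every link delivers within the round, processes are numbered from 0, crashes happen only before the run starts, and `TIMEOUT`, `run_round` and `simulate` are hypothetical names.

```python
import random

TIMEOUT = 3  # assumed number of silent rounds before a process elects itself

def run_round(leader, timer, alive):
    """Play one synchronous round in place; return the set of links used."""
    n = len(leader)
    senders = [p for p in range(n) if alive[p] and leader[p] == p]
    received = [False] * n
    used = set()
    for q in senders:                      # q believes it is the leader: send ALIVE
        for p in range(n):
            if p == q:
                continue
            used.add((q, p))
            if not alive[p]:
                continue                   # crashed processes ignore messages
            received[p] = True
            # A self-proclaimed leader yields only to a smaller ID;
            # any other process adopts the most recent sender.
            if leader[p] != p or q < p:
                leader[p] = q
    for p in range(n):
        if alive[p]:
            timer[p] = 0 if received[p] else timer[p] + 1
            if timer[p] >= TIMEOUT:        # time-out: p sets Leader to p
                leader[p] = p
                timer[p] = 0
    return used

def simulate(n, seed, rounds=20):
    """Start from an arbitrary state and return (leader, alive, links of the last round)."""
    rng = random.Random(seed)
    alive = [rng.random() < 0.7 for _ in range(n)]
    alive[rng.randrange(n)] = True         # keep at least one process alive
    leader = [rng.randrange(n) for _ in range(n)]
    timer = [rng.randrange(TIMEOUT) for _ in range(n)]
    used = set()
    for _ in range(rounds):
        used = run_round(leader, timer, alive)
    return leader, alive, used

leader, alive, used = simulate(6, seed=0)
print("leaders of alive processes:", [leader[p] for p in range(6) if alive[p]])
print("links used in the last round:", sorted(used))
```

In this simplified setting, only the elected process keeps sending once the system has settled, so the last round uses n - 1 links.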
Communication-Efficient Self-Stabilizing Leader Election
in a system where at most one link is asynchronous?
No
Impossibility of Communication-Efficiency in a system with at most one asynchronous link
Claim: Any process p such that Leader ≠ p must periodically receive messages within a bounded time; otherwise, it chooses another leader
The process chooses another leader
Self-Stabilizing (non communication-efficient) Leader Election
in a system where some links are asynchronous?
Yes
Self-Stabilizing Leader Election in a system with a timely routing overlay
For each pair of alive processes (p,q), there exist at least two paths of timely links: one from p to q and one from q to p
Principle of the algorithm
Each process computes the set of alive processes and chooses as leader the smallest process of this set
To compute the set:
– Each process p periodically sends ALIVE,p to every other process
– Any ALIVE,p message is repeated n - 1 times
(any other process periodically receives such a message)
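The principle above can be sketched as a round-based flood in which each ALIVE,p message may cross at most n - 1 links. The simplifications are assumptions, not part of the slides: rounds are synchronous, a timely link delivers within one round, every other link is treated as never delivering, a process is considered alive if it was heard of within the last n rounds, and `elect`, `timely` and `stale` are hypothetical names.

```python
def elect(n, timely, alive, rounds, stale=()):
    """Return {p: leader chosen by p} for every alive process p.

    timely: set of directed links (u, v) that deliver within one round.
    stale: garbage (origin, hops_left) messages initially present in every inbox.
    """
    window = n                                   # a message needs at most n - 1 rounds
    heard = [[0] * n for _ in range(n)]          # round at which p last heard of q
    inbox = [dict(stale) for _ in range(n)]      # origin -> hops left
    for now in range(1, rounds + 1):
        outbox = [dict() for _ in range(n)]
        for p in range(n):
            if not alive[p]:
                continue
            forward = {p: n - 1}                 # fresh ALIVE,p allowed to cross n - 1 links
            for origin, hops in inbox[p].items():
                if origin == p:
                    continue
                heard[p][origin] = now
                if hops > 1:                     # repeat the message while hops remain
                    forward[origin] = hops - 1
            for q in range(n):
                if (p, q) in timely:
                    for origin, hops in forward.items():
                        outbox[q][origin] = max(outbox[q].get(origin, 0), hops)
        inbox = outbox
    leaders = {}
    for p in range(n):
        if alive[p]:
            recent = [q for q in range(n) if rounds - heard[p][q] < window]
            leaders[p] = min([p] + recent)       # smallest process believed alive
    return leaders

# Hypothetical system: process 0 has crashed, processes 1 to 4 form a timely ring.
alive = [False, True, True, True, True]
ring = {(1, 2), (2, 3), (3, 4), (4, 1)}
print(elect(5, ring, alive, rounds=15, stale=((0, 4),)))
```

The stale message from the crashed process 0 dies out after n - 1 hops, and the time window then removes 0 from every computed set.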
Self-Stabilizing Leader Election in a system without a timely routing overlay?
No
Pseudo-Stabilizing Leader Election in a system where Self-Stabilizing Leader Election is not possible?
Yes + communication-efficiently
In a system having a source and fair links
Algorithm for systems with Source + fair links
A process p periodically sends ALIVE to every other if Leader = p
Each process stores in Active its ID + the IDs of each process from which it recently received ALIVE
Each process chooses its leader among the processes in its Active set
Problem: we cannot use the IDs to choose a leader
[Figure: processes 1 and 2, one of them the source, with their Active sets (<1>, <2>, <1,2>) and the Alive,1 and Alive,2 messages]
Accusation Counter
p stores in Counter[p] how many times it was suspected of having crashed
When a process suspects its leader: it sends an ACCUSATION to Leader, and chooses as new leader the process in Active with the smallest accusation counter
p periodically sends ALIVE,Counter[p] to every other process if Leader = p
Problem: the accusation counter of the source can increase infinitely often
[Figure: processes 1, 2 and 3 (the source) with their Active sets; ALIVE messages carry 3,C=2 and 1,C=1; an Accuse message is sent]
Phase Counter
Each process maintains in Phase[p] the number of times it loses the leadership
p periodically sends ALIVE,Counter[p],Phase[p] to every other process if Leader = p
p increments Counter[p] only when receiving ACCUSATION,ph with ph = Phase[p]
[Figure: processes 1, 2 and 3 (the source) with their Active sets, counters (3,C=2 and 1,C=1) and phases (Ph=1, Ph=2, Ph=3); an Accuse,3 message is sent to a process whose phase is Ph=4 (previously 3)]
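The accusation and phase counters can be sketched as the local state of one process. Everything beyond the slide text is an assumption made for illustration: the class and method names, breaking ties between equal counters by smallest ID, and returning the accusation as a (target, phase) pair instead of sending it over a link.

```python
from dataclasses import dataclass, field

@dataclass
class Process:
    pid: int
    counter: int = 0                             # Counter[p]: accusations accepted so far
    phase: int = 0                               # Phase[p]: times p lost the leadership
    leader: int = -1                             # Leader; defaults to pid below
    active: dict = field(default_factory=dict)   # q -> (Counter[q], Phase[q]) from last ALIVE

    def __post_init__(self):
        if self.leader == -1:
            self.leader = self.pid

    def _choose(self):
        """Pick the process in Active (plus self) with the smallest accusation counter."""
        candidates = {q: c for q, (c, _) in self.active.items()}
        candidates[self.pid] = self.counter
        new = min(candidates, key=lambda q: (candidates[q], q))  # ties: smallest ID (assumed)
        if self.leader == self.pid and new != self.pid:
            self.phase += 1                      # p loses the leadership
        self.leader = new

    def on_alive(self, q, counter_q, phase_q):
        """Handle ALIVE,Counter[q],Phase[q] received from q."""
        self.active[q] = (counter_q, phase_q)
        self._choose()

    def suspect_leader(self):
        """Time-out on the leader: return (target, phase) for the ACCUSATION, or None."""
        if self.leader == self.pid:
            return None
        target = self.leader
        _, ph = self.active.pop(target, (0, 0))
        self._choose()
        return target, ph

    def on_accusation(self, ph):
        """Handle ACCUSATION,ph: count it only if it refers to the current phase."""
        if ph == self.phase:
            self.counter += 1
            self._choose()

p = Process(pid=3)
p.on_accusation(0)      # current phase: counted
p.on_alive(1, 0, 0)     # process 1 has a smaller counter: p loses the leadership
p.on_accusation(0)      # stale phase: ignored
print(p.leader, p.counter, p.phase)
```

The phase check is what stops accusations that refer to an earlier leadership period from raising the counter again.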
Communication-Efficient Pseudo-Stabilizing Leader Election
in a system having only a source?
No, but a non communication-efficient pseudo-stabilizing leader election can be done
Result Summary
System                ce-FTSS  FTSS  ce-FTPS  FTPS
Full-Timely           Yes      Yes   Yes      Yes
Bi-source             No       Yes   Yes      Yes
Timely routing        No       Yes   ?        Yes
Source + fair links   No       No    Yes      Yes
Source                No       No    No       Yes
Totally asynchronous  No       No    No       No
Perspectives
– Communication-efficient FTPS leader election in a system with timely routing overlay
– Extend these results to other topologies and models
– Fault-tolerant stabilizing decision problems?
Thank You!