1 theory of distributed systems (tds) group leaders: nancy lynch, idit keidar phd students: victor...
TRANSCRIPT
![Page 1: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/1.jpg)
1
Theory of Distributed Systems (TDS)
Group Leaders: Nancy Lynch, Idit Keidar
PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas, Rui Fan
Meng and UROP students: Laura Dean, Andrej Bogdanov, Michael Tsai, Chris Luhrs,
Affiliates: Alex Shvartsman (U. Conn.), Alan Fekete (U. Sydney), Paul Attie (Northeastern), Ziv Bar-Joseph
![Page 2: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/2.jpg)
2
Research Overview
Algorithms Applications
Math Models, Proof and Analysis Methods
Building blocksProject (AFOSR)
![Page 3: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/3.jpg)
3
Math Models, Proof and Analysis
Methods• I/O Automata
– Mathematical model for asynchronously interacting system components.
• Timed I/O Automata, Hybrid I/O Automata [Lynch, Segala, Vaandrager], Probabilistic [Segala], Dynamic [Lynch, Attie]
• Invariant assertions• Levels of abstraction (simulation relations)• Compositional reasoning• Conditional performance analysis• Object inheritance [Keidar, Khazan, Shvartsman]
![Page 4: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/4.jpg)
4
IOA Language and Toolset
• Language design [Garland, Lynch]• Front end [Garland]• Simulator [Chefter, Ramirez, Dean]• Code generator [Tauber, Tsai]• Theorem prover [Garland, Fan, Bogdanov]• Test examples [Luhrs, Dean, Rosen, Karlovich]
![Page 5: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/5.jpg)
5
Algorithms
• Communication [Keidar, Khazan, Livadas, Bar-Joseph, Lynch]
• Distributed data management [Keidar, Shvartsman, De Prisco, Lynch]
• Consensus [Bar-Joseph, Keidar]• Load balancing [Shvartsman, Khazan, Fekete]
• Lower bounds, impossibility results
• Distributed algorithms => Distributed systems– Generating code from models [Tauber, Tsai]
![Page 6: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/6.jpg)
6
Applications
• Multiprocessor memories [Luchangco]• Hybrid systems
– Automated transportation: Planes (TCAS), Trains (Raytheon PRT), and Automobiles (California PATH) [Livadas]
– Mobile computing
• Middleware (building blocks) for fault-tolerant distributed computing [Keidar, Shvartsman, Fekete, De Prisco, Khazan]
• Communication [Livadas, Bar-Joseph]• Object-oriented programming [Fekete, Lynch]• Agents [Attie, Luchangco, Lynch, Keidar]
![Page 7: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/7.jpg)
7
![Page 8: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/8.jpg)
8
Reliable Group Communication: a Mathematical Approach
Nancy Lynch
Theory of Distributed Systems
MIT LCS
AFOSR Review
November 15, 2000GC
…
![Page 9: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/9.jpg)
9
Dynamic Distributed Systems
• Modern distributed systems are dynamic.• Set of clients participating in an application
changes, because of: – Network, processor failure, recovery– Changing client requirements
• To cope with changes:– Use abstract groups of client processes
with changing membership sets.– Processes communicate with group members by
sending messages to the group as a whole.
? ?
?? ?
![Page 10: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/10.jpg)
10
Group Communication Services
• Support management of groups• Maintain membership info • Manage communication• Make guarantees about ordering, reliability of
message delivery, e.g.:– Best-effort: IP Multicast
– Strong consistency guarantees: Isis, Transis, Ensemble
• Hide complexity of coping with changes
GC
…
![Page 11: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/11.jpg)
11
This Talk
• Describe– Group communication systems
– A mathematical approach to designing, modeling, analyzing GC systems.
– Our accomplishments and ideas for future work.
• Collaborators: Idit Keidar, Alan Fekete, Alex Shvartsman, Roger Khazan, Roberto De Prisco, Jason Hickey, Robert van Renesse, Carl Livadas, Ziv Bar-Joseph, Kyle Ingols, Igor Tarashchanskiy
![Page 12: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/12.jpg)
12
Talk Outline
I. Background: Group Communication
II. Our Approach
III. Projects and Results1. View Synchrony
2. Ensemble
3. Dynamic Views
4. Scalable Group Communication
IV. Conclusions
![Page 13: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/13.jpg)
13
I. Background: Group Communication
![Page 14: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/14.jpg)
14
The Setting
• Dynamic distributed system, changing set of participating clients.
• Applications:– Replicated databases, file systems
– Distributed interactive games
– Multi-media conferencing, collaborative work
– …
? ?
?? ?
![Page 15: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/15.jpg)
15
Groups
• Abstract, named groups of client processes, changing membership.
• Client processes send messages to the group (multicast).
• Early 80s: Group idea used in replicated data management system designs
• Late 80s: Separate group communication services.
![Page 16: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/16.jpg)
16
Group Communication Service
• Communication middleware
• Manages group membership, current views
View = membership set + identifier
• Manages multicast communication
among group members– Multicasts respect views
– Guarantees within each view:
• Reliability constraints
• Ordering constraints, e.g., FIFO from each sender, causal, common total order
• Global service
GC
…AB
![Page 17: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/17.jpg)
17
Group Communication Service
GCS
new-view
new-viewmcast
mcastreceive
receive
Client A Client B
![Page 18: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/18.jpg)
18
Isis [Birman, Joseph 87]
• Primary component group membership• Several reliable multicast services, different ordering
guarantees, e.g.:– Atomic Broadcast: Common total order, no gaps
– Causal Broadcast:
• When partition is repaired, primary processes send state information to rejoining processes.
• Virtually Synchronous message delivery
A B
![Page 19: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/19.jpg)
19
Example: Interactive Game
• Alice, Bob, Carol, Dan in view {A,B,C,D}• Primary component membership– {A}{B,C,D} split;
only {B,C,D} may continue.
• Atomic Broadcast– A fires, B moves away;
need consistent order
A B C D
A B C D
![Page 20: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/20.jpg)
20
Interactive Game
• Causal Broadcast– C sees A enter a room; locks door.
• Virtual Synchrony– {A}{BCD} split; B sees A shoot; so do C, D.
A B C D
A B C D
![Page 21: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/21.jpg)
21
Applications
• Replicated data management– State machine replication
[Lamport 78] , [Schneider 90]
– Atomic Broadcast provides support
– Same sequence of actions performed everywhere.
– Example: Interactive game state machine
• Stock market• Air-traffic control
![Page 22: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/22.jpg)
22
Transis [Amir, Dolev, Kramer, Malkhi 92]
• Partitionable group membership
• When components merge, processes exchange state information.
• Virtual synchrony reduces amount of data exchanged.
• Applications– Highly available servers
– Collaborative computing, e.g. shared whiteboard
– Video, audio conferences
– Distributed jam sessions
– Replicated data management [Keidar , Dolev 96]
![Page 23: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/23.jpg)
23
Other Systems• Totem [Amir, Melliar-Smith, Moser, et al., 95]
– Transitional views, useful with virtual synchrony
• Horus [Birman, van Renesse, Maffeis 96]
• Ensemble [Birman, Hayden 97]– Layered architecture– Composable building blocks
• Phoenix, Consul, RMP, Newtop, RELACS,…• Partitionable
![Page 24: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/24.jpg)
24
Service Specifications
• Precise specifications needed for GC services– Help application programmers write programs that use the
services correctly, effectively
– Help system maintainers make changes correctly
– Safety, performance, fault-tolerance
• But difficult:– Many different services; different guarantees about
membership, reliability, ordering
– Complicated
– Specs based on implementations might not be optimal for application programmers.
![Page 25: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/25.jpg)
25
Early Work on GC Service Specs
• [Ricciardi 92]• [Jahanian, Fakhouri, Rajkumar 93]• [Moser, Amir, Melliar-Smith, Agrawal 94]• [Babaoglu et al. 95, 98] • [Friedman, van Renesse 95]• [Hiltunin, Schlichting 95]• [Dolev, Malkhi, Strong 96]• [Cristian 96]• [Neiger 96]
• Impossibility results [Chandra, Hadzilacos, et al. 96]
• But still difficult…
![Page 26: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/26.jpg)
26
II. Our Approach
![Page 27: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/27.jpg)
27
Approach
• Model everything:– Applications
• Requirements, algorithms
– Service specifications
• Work backwards, see what
the applications need
– Implementations of the services
• State, prove correctness theorems:– For applications, implementations.
– Methods: Composition, invariants, simulation relations
• Analyze performance, fault-tolerance.
• Layered proofs, analyses
Application
Application Algorithm
Service
![Page 28: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/28.jpg)
28
Math Foundation: I/O Automata
• Nondeterministic state machines
• Not necessarily finite-state
• Input/output/internal actions (signature)
• Transitions, executions, traces
• System modularity:– Composition, respecting traces
– Levels of abstraction, respecting traces
• Language-independent, math model
![Page 29: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/29.jpg)
29
Modeling Style• Describe interfaces, behavior• Program-like behavior descriptions:
– Precondition/effect style– Pseudocode or IOA language
• Abstract models for algorithms, services• Model several levels of abstraction,
– High-level, global service specs
…
– Detailed distributed algorithms
![Page 30: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/30.jpg)
30
Modeling Style
• Very nondeterministic:– Constrain only what must be constrained.
– Simpler
– Allows alternative implementations
![Page 31: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/31.jpg)
31
Describing Timing Features
• TIOAs [Lynch, Vaandrager 93] – For describing:
• Timeout-based algorithms.
• Clocks, clock synchronization
• Performance properties
![Page 32: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/32.jpg)
32
Describing Failures
• Basic or timed I/O automata, with fail, recover input actions.
• Included in traces, can use them in specs.
failrecover
fail
recover
![Page 33: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/33.jpg)
33
Using I/O Automata for Group Communication Systems
• Use for global services + distributed algorithms• Define safety properties separately from
performance/fault-tolerance properties.– Safety:
• Basic I/O automata; trace properties
– Performance/fault-tolerance:
• Timed I/O automata with failure actions; timed trace properties
![Page 34: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/34.jpg)
34
III. Projects and Results
![Page 35: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/35.jpg)
35
Projects
1. View Synchrony
2. Ensemble
3. Dynamic Views
4. Scalable Group Communication
![Page 36: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/36.jpg)
36
1. View Synchrony (VS) [Fekete, Lynch, Shvartsman 97, 00]
Goals:• Develop prototypes:– Specifications for typical GC services
– Descriptions for typical GC algorithms
– Correctness proofs
– Performance analyses
• Design simple math foundation for the area. • Try out, evaluate our approach.
![Page 37: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/37.jpg)
37
View Synchrony
What we did:• Talked with system developers (Isis, Transis)• Defined I/O automaton models for:– VS, prototype partitionable GC service– TO, non-view-oriented totally ordered bcast service– VStoTO, application algorithm based on
[Amir, Dolev, Keidar, Melliar-Smith, Moser]
• Proved correctness• Analyzed performance/ fault-tolerance.
![Page 38: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/38.jpg)
38
VStoTO Architecture
VS
VStoTO VStoTO
TObcast
brcv
gpsnd
gprcvnewview
![Page 39: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/39.jpg)
39
TO Broadcast Specification
Delivers messages to everyone, in the same order.Safety: TO-Machine
Signature: input: bcast(a,p) output: brcv(a,p,q) internal: to-order(a,p)State: queue, sequence of (a,p), initially empty for each p: pending[p], sequence of a, initially empty next[p], positive integer, initially 1
TO
![Page 40: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/40.jpg)
40
TO-Machine
Transitions:
bcast(a,p)Effect:
append a to pending[p]
to-order(a,p)Precondition:
a is head of pending[p]Effect:
remove head of pending[p]append (a,p) to queue
brcv(a,p,q)Precondition:
queue[next[q]] = (a,p)Effect:
next[q] := next[q] + 1
![Page 41: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/41.jpg)
41
Performance/Fault-Tolerance
TO-Property(b,d,C): If C stabilizes, then within time b, a point is reached after which any message sent or received anywhere in C is received everywhere in C, within time d.
stabilize
b
send
d
receive
![Page 42: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/42.jpg)
42
VS Specification
• Partitionable view-oriented service• Safety: VS-Machine– Views presented in consistent order, possible gaps
– Messages respect views
– Messages in consistent order
– Causality
– Prefix property
– Safe indication
• Doesn’t guarantee Virtual Synchrony • Like TO-Machine, but per view
VS
![Page 43: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/43.jpg)
43
Performance/Fault-Tolerance
VS-Property(b,d,C):
If C stabilizes, then within time b, a point is reached where views known within C are consistent, and messages sent in the final view v are delivered everywhere in C, within time d.
mcast(v) receive(v)
d
stabilize
b
newview( v)
![Page 44: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/44.jpg)
44
VStoTO Algorithm• TO must deliver messages in order, no gaps.• VS delivers messages in order per view.• Problems arise from view changes:
– Processes moving between views could have different prefixes.– Processes could skip views.
• Algorithm: – Real work done in majority views only– Processes in majority views totally order messages, and deliver
to clients messages that VS has said are safe.– At start of new view, processes exchange state, to reconcile
progress made in different majority views.
![Page 45: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/45.jpg)
45
Correctness (Safety) Proof
• Show composition of VS-Machine and VStoTO machines implements TO-Machine.
• Trace inclusion
• Use simulation relation proof:– Relate start states, steps of composition
to those of TO-Machine– Invariants, e.g.: Once a message is ordered everywhere in some majority view, its order is determined forever.
• Checked using PVS theorem-prover, TAME [Archer]
TO
Composition
![Page 46: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/46.jpg)
46
Performance Analysis• Assume VS satisfies VS-Property(b,d,C):– If C stabilizes, then within time b, views known
within C become consistent, and messages sent in the final view are delivered everywhere in C, within time d.
• And VStoTO satisfies:– Simple timing and fault-tolerance assumptions.
• Then TO satisfies TO-Property(b+d,d,C):– If C stabilizes, then within time b+d, any message
sent or delivered anywhere in C is delivered everywhere in C, within time d.
![Page 47: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/47.jpg)
47
Conclusions: VS• Models for VS, TO, VStoTO• Proofs, performance analysis• Tractable, understandable, modular • [PODC 97], [TOCS 00]• Follow-on work:
– Algorithm for VS [Fekete, Lesley]– Load balancing using VS [Khazan]– Models for other Transis algorithms [Chockler]
• But: VS is only a prototype; lacks some key features, like Virtual Synchrony
• Next: Try a real system!
![Page 48: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/48.jpg)
48
2. Ensemble [Hickey, Lynch, van Renesse 99]
Goals:• Try, evaluate our approach on a real system • Develop techniques for modeling, verifying,
analyzing more features, of GC systems, including Virtual Synchrony
• Improve on prior system validation methods
![Page 49: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/49.jpg)
49
Ensemble
• Ensemble system [Birman, Hayden 97]– Virtual Synchrony– Layered design, building blocks– Coded in ML [Hayden]
• Prior verification work for Ensemble and predecessors: – Proving local properties using
Nuprl [Hickey]– [Ricciardi], [Friedman]
![Page 50: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/50.jpg)
50
Ensemble• What we did:– Worked with developers
– Followed VS example
– Developed global specs for key layers:
• Virtual Synchrony
• Total Order with Virtual Synchrony
– Modeled Ensemble algorithm spanning between layers
– Attempted proof; found logical error in state exchange algorithm (repaired)
– Developed models, proofs for repaired system
![Page 51: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/51.jpg)
51
Conclusions: Ensemble
• Models for two layers, algorithm • Tractable, easily understandable by developers• Error, proofs• Low-level models similar to actual ML code (4 to 1)• [TACAS 99]• Follow-on:
– Same error found in Horus.– Incremental models, proofs [Hickey]
• Next: Use our approach to design new services.
![Page 52: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/52.jpg)
52
3. Dynamic Views [De Prisco, Fekete, Lynch, Shvartsman 98]
Goals:• Define GC services that cope with both:– Long-term changes:• Permanent failures, new joins• Changes in the “universe” of processes
– Transient changes
• Use these to design consistent total order and consistent replicated data algorithms that tolerate both long-term and transient changes.
![Page 53: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/53.jpg)
53
Dynamic Views
• Many applications with strong consistency requirements make progress only in primary views:– Consistent replicated data management – Totally ordered broadcast
• Can use static notion of allowable primaries, e.g., majorities of universe, quorums– All intersect.– Only one exists at a time.– Information flows from each to the next.
• But: Static notion not good forlong-term changes
A B C D E
![Page 54: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/54.jpg)
54
Dynamic Views
• For long-term changes, want dynamic notion of allowable primaries.
• E.g., each primary might contain majority of previous:
• But: Some might not intersect.Makes it hard to maintain consistency.
FA B C D E
![Page 55: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/55.jpg)
55
Dynamic Views
• Key problem: – Processes may have different opinions about which
is the previous primary
– Could even be disjoint.
• [Yeger-Lotem, Keidar, Dolev 97] algorithm– Keeps track of all possible previous primaries.
– Ensures intersection with all of them.
![Page 56: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/56.jpg)
56
Dynamic Views
What we did:
• Defined Dynamic View Service, DVS, based on [YKD]
• Designed to tolerate long-term failures
• Membership:– Views delivered in consistent order, with possible gaps.
– Ensures new primary intersects all possible previous primaries.
• Communication: Similar toVS– Messages delivered within views,
– Prefix property, safe notifications.
![Page 57: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/57.jpg)
57
Dynamic Views
• What we did:– Modeled, proved implementing algorithm
– Modeled, proved TO-Broadcast application
– Distributed implementation [Ingols 00]
DVS
TO
![Page 58: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/58.jpg)
58
Handling Transient Failures: Dynamic Configurations
• Configuration = Set of processes plus structure, e.g., set of quorums, leader,…
• Application: Highly available consistent replicated data management:– Paxos [Lamport], uses leader, quorums
– [Attiya,Bar-Noy, Dolev], uses read quorums and write quorums
• Quorums allow flexibility, availability in the face of transient failures.
![Page 59: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/59.jpg)
59
Dynamic Configurations [De Prisco, Fekete, Lynch, Shvartsman 99, 00]
• Combine ideas/benefits of– Dynamic views, for long-term failures, and– Static configurations, for transient failures
• Idea:– Allow configuration to change (reconfiguration).– Each configuration satisfies intersection properties with
previous configuration
• Example:– Config = (membership set, read quorums, write quorums)– Membership set of new configuration contains read quorum
and write quorum of previous configuration
![Page 60: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/60.jpg)
60
Dynamic Configurations
What we did:• Defined dynamic configuration service DCS,
guaranteeing intersection properties w.r.t. all possible previous configurations.
• Designed implementing algorithm, extending [YKD]• Developed application: Replicated data
– Dynamic version of [Paxos]– Dynamic version of [Attiya, BarNoy, Dolev] – Tolerate
• Transient failures, using quorums• Longer-term failures, using reconfiguration
![Page 61: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/61.jpg)
61
Conclusions: Dynamic Views
• New DVS, DCS services for long-term changes in set of processes
• Applications, implementations• Decomposed complex algorithms into tractable pieces:
– Service specification, implementation, application– Static algorithm vs. reconfiguration
• Couldn’t have done it without the formal framework.• [PODC 98], [DISC 99]
![Page 62: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/62.jpg)
62
4. Scalable Group Communication [Keidar, Khazan 99],
[Keidar, Khazan, Lynch, Shvartsman 00]
Goal: • Make GC work in wide area networks
What we did:• Defined desired properties for GC services • Defined spec for scalable group membership service
[Keidar, Sussman, Marzullo, Dolev 00], implemented on small set of membership servers
![Page 63: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/63.jpg)
63
Scalable Group Communication
What we did:• Developed new, scalable GC algorithms:
– Use scalable GM service
– Multicast implemented on clients
– Efficient: Algorithm for virtual synchrony uses only one round for state exchange, in parallel with membership service’s agreement on views.
– Processes can join during reconfiguration.
• Distributed implementation [Tarashchanskiy]
![Page 64: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/64.jpg)
64
S
A A’
S’
Scalable GC
What we did:
• Developed new incremental modeling, proof methods [Keidar, Khazan, Lynch, Shvartsman 00]– Proof Extension Theorem
• Developed models, proofs (safety and liveness) , using the new methods.
![Page 65: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/65.jpg)
65
Conclusions: Scalable GC
• Specs, new algorithms, proofs
• New incremental proof methods
• Couldn’t have done it without the formal framework.
• [ICDCS 99], [ICSE 00]
![Page 66: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/66.jpg)
66
IV. Conclusions
![Page 67: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/67.jpg)
67
Practical GC Systems: Current Status [Birman 99]
• Commercial successes: – Stock exchange (Zurich, New York)– Air-traffic control (France)
• Problems:– Performance, for strong guarantees like Virtual Synchrony– Not integrated with OO programming technologies.
• Trends: – Flexible services– Weaker guarantees; better performance– Integration with OO technologies
![Page 68: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/68.jpg)
68
Summary
• GC services help in programming dynamic distributed systems, though scalability, integration problems remain.
• Our contributions:– Modeling style: Automata + performance properties
– Techniques: Conditional performance analysis, incremental modeling/proof
– Models, proofs for key services
– Discovered errors
– New services: Dynamic views, scalable GC
• Mathematical framework makes it possible to design more complex systems correctly.
GC
![Page 69: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/69.jpg)
69
Conditional Performance Analysis• Make conditional claims about system behavior, under
various assumptions about behavior of environment, network.
• Include timing, performance, failures.
• Benefits:– Formal performance predictions
– Says when system makes specific guarantees
• Normal case + failure cases
• Parameters, sensitivity analysis
– Composable
– Get probabilistic claims as corollaries
![Page 70: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/70.jpg)
70
CP Analysis: Typical Hypotheses
• Stabilization of underlying network.
• Limited rate of change.
• Bounds on message delay.
• Limited amount of failure (number, density).
• Limit input arrivals (number, density).
• Method allows focus on tractable cases.
![Page 71: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/71.jpg)
71
Future Work
• Consider more dynamic settings:– Join, leave requests– Processor failures: Volatile memory, Byzantine behavior– Communication failures– Changing network topology– Mobility
• Design, model, analyze more services, applications– Communication, data sharing services– Other services
• Lower bounds, impossibility results• Math foundations
![Page 72: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/72.jpg)
72
Math Foundations• Models:– General models for timing behavior, hybrid
continuous/discrete behavior, failures, probabilistic behavior, process creation,…
– Models for mobility (based on Hybrid I/O Automata)
– Combined models
• Proof and performance analysis methods:– Measures– Conditional performance analysis– Incremental modeling, proof
![Page 73: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/73.jpg)
73
Communication: View-oriented GC
• Scalable GC– Experiments
– CP analysis
– Compare predicted, observed performance
– Applications
• Dynamic views, dynamic configurations– Experiments, Analysis, …
• Self-stabilization• Lower bounds, impossibility results
GC
![Page 74: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/74.jpg)
74
Scalable Reliable Multicast [Floyd, Jacobson, et al. 95]
• IP Multicast+ retransmission protocol• Modeling, analysis [Livadas, Keidar, Lynch]
IPMcastSRM
![Page 75: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/75.jpg)
75
Other communication services
• Bimodal Multicast [Birman, Hayden, et al. 99]
• Compare GC services with SRM, Bimodal Multicast
• Study tradeoffs between strength of ordering and reliability guarantees vs. performance
• Multicast with QoS [Bar-Joseph, Keidar, Anker, Lynch 00]
– Models for:
• Bandwidth reservation service
• TO Multicast service with QoS (latency, bandwidth)
– Algorithms implementing TO-QoS using reservation service
![Page 76: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/76.jpg)
76
Data Sharing
• Coherent– Analyze, compare implementations
– Lower bounds, impossibility results
• Less coherent• Tradeoffs between coherence guarantees and
performance
![Page 77: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/77.jpg)
77
Other services for dynamic systems:
• Registry/location services• Leader election • Consensus• Resource allocation• Establishing spanning trees, overlay networks• Load-balancing• …
![Page 78: 1 Theory of Distributed Systems (TDS) Group Leaders: Nancy Lynch, Idit Keidar PhD students: Victor Luchangco, Josh Tauber, Roger Khazan, Carl Livadas,](https://reader035.vdocument.in/reader035/viewer/2022081511/5697bf9a1a28abf838c920c1/html5/thumbnails/78.jpg)
78
Mobility
• Mobile system models– Ad hoc, cellular, mixed (Oxygen)
– Failure models
– Formalized using Hybrid I/O Automata
• Mobile system services– Location services
– Group membership, group communication
– Data sharing
– Applications: Games, military operations