![Page 1: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/1.jpg)
Intro to Distributed Systems Concepts
Jordan Halterman
![Page 5: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/5.jpg)
I N T R O D U C T I O N
5
• What is a distributed system?
• Anatomy of a distributed system
• Fallacies of distributed computing
![Page 6: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/6.jpg)
A collection of independent computers that appear to users as a single coherent system
W H A T I S A D I S T R I B U T E D S Y S T E M ?
6
![Page 7: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/7.jpg)
“A collection of independent computers that appear to the users of the system as a single computer”
— Andrew Tanenbaum
W H A T I S A D I S T R I B U T E D S Y S T E M ?
7
![Page 8: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/8.jpg)
“A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable”
— Leslie Lamport
W H A T I S A D I S T R I B U T E D S Y S T E M ?
8
![Page 9: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/9.jpg)
• Scalability and fault tolerance
• Memory, disk, and CPU are finite resources
• Computers crash and networks fail
• Science hasn’t kept up with technological needs
W H A T I S A D I S T R I B U T E D S Y S T E M ?
9
![Page 10: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/10.jpg)
B U T D I S T R I B U T E D
S Y S T E M S A R E
H A R D !
1 0
![Page 11: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/11.jpg)
T H E T W O G E N E R A L S P R O B L E M
1 1
• Two generals on opposite sides of a valley must coordinate to decide when to attack
• Each general must be sure the other made the same decision
• Generals can only communicate through messages
• Messengers sent through the valley can be captured
![Page 12: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/12.jpg)
A N A T O M Y O F A D I S T R I B U T E D S Y S T E M
1 2
![Page 15: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/15.jpg)
Nodes Networks Protocols
A N A T O M Y O F A D I S T R I B U T E D S Y S T E M
1 5
![Page 16: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/16.jpg)
• Each independent component of a distributed system is called a node
• Also known as a process, agent or actor
• Operations within a node are fast
• Communication between nodes is slow
• Operations within a single node generally occur in order
N O D E S
1 6
![Page 17: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/17.jpg)
S Y S T E M M O D E L
1 7
SYNCHRONOUS
• Bounded message delays
• Accurate global clock
• Easy to reason about
• You don’t have one
ASYNCHRONOUS
• Processes execute independently
• Unbounded message delays
• No global clock
• Difficult to reason about
• You have one
![Page 18: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/18.jpg)
• Nodes communicate via messages
• Examples: UDP, TCP, HTTP
M E S S A G E P A S S I N G
1 8
![Page 19: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/19.jpg)
F A L L A C I E S O F D I S T R I B U T E D C O M P U T I N G
1 9
![Page 27: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/27.jpg)
The network is reliable
Latency is zero
Bandwidth is infinite
The network is secure
Topology doesn’t change
There is one administrator
Transport cost is zero
The network is homogeneous
F A L L A C I E S O F D I S T R I B U T E D C O M P U T I N G
2 7
![Page 28: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/28.jpg)
FALLACY #1
THE NETWORK IS RELIABLE
F A L L A C I E S O F D I S T R I B U T E D C O M P U T I N G
2 8
![Page 29: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/29.jpg)
• On average, 5.2 devices and 40.8 links fail per day in Microsoft data centers
• The majority of Google’s outages that lasted more than 30 seconds were due to network maintenance or connectivity issues
• If network hardware doesn’t fail, software will
• We cannot rely on the network to deliver our communications
F A L L A C I E S O F D I S T R I B U T E D C O M P U T I N G
2 9
![Page 30: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/30.jpg)
FALLACY #2
LATENCY IS ZERO
F A L L A C I E S O F D I S T R I B U T E D C O M P U T I N G
3 0
![Page 31: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/31.jpg)
• Latency is the time it takes for a signal to travel from one computer to another
• Latency is bounded below by the speed of light
• It takes roughly 40 milliseconds for light to travel from New York to Paris and back
• By comparison, a JVM on modern hardware can execute billions of instructions per second
F A L L A C I E S O F D I S T R I B U T E D C O M P U T I N G
3 1
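A quick back-of-the-envelope check of the round-trip figure on this slide. This is only a sketch: the New York to Paris distance (about 5,840 km great-circle) and the two-thirds factor for light in optical fiber are assumptions that do not come from the slides.

```python
# Back-of-the-envelope lower bound on round-trip latency.
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in a vacuum
NY_PARIS_KM = 5_840                # assumed great-circle distance, approximate


def round_trip_ms(distance_km: float, speed_km_s: float = SPEED_OF_LIGHT_KM_S) -> float:
    """Time for a signal to cover the distance twice, in milliseconds."""
    return 2 * distance_km / speed_km_s * 1000


vacuum_ms = round_trip_ms(NY_PARIS_KM)
# Light in optical fiber travels at roughly two thirds of c (assumed factor).
fiber_ms = round_trip_ms(NY_PARIS_KM, SPEED_OF_LIGHT_KM_S * 2 / 3)

print(f"vacuum: {vacuum_ms:.1f} ms, fiber: {fiber_ms:.1f} ms")
```

Under these assumptions the vacuum bound comes out just under 40 ms, and no protocol or hardware improvement can push a real round trip below it.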
![Page 32: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/32.jpg)
FALLACY #3
BANDWIDTH IS INFINITE
F A L L A C I E S O F D I S T R I B U T E D C O M P U T I N G
3 2
![Page 33: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/33.jpg)
• Bandwidth is roughly the amount of information that can be transmitted each second
• Networks are limited by hardware
• Applications are limited by software
F A L L A C I E S O F D I S T R I B U T E D C O M P U T I N G
3 3
![Page 34: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/34.jpg)
FALLACY #4
THE NETWORK IS SECURE
F A L L A C I E S O F D I S T R I B U T E D C O M P U T I N G
3 4
![Page 35: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/35.jpg)
• We see hacks of major corporations’ networks seemingly on a weekly basis
• In 2015, FoxGlove Security published exploits for a major Java deserialization vulnerability
• Allowing remote access to friendly users opens systems up to unfriendly ones
F A L L A C I E S O F D I S T R I B U T E D C O M P U T I N G
3 5
![Page 36: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/36.jpg)
F A L L A C I E S O F D I S T R I B U T E D C O M P U T I N G
3 6
D A T A B R E A C H E S S I N C E 2 0 0 5
![Page 37: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/37.jpg)
FALLACY #5
TOPOLOGY DOESN’T CHANGE
F A L L A C I E S O F D I S T R I B U T E D C O M P U T I N G
3 7
![Page 38: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/38.jpg)
• Administrators add and remove servers from networks
• We cannot depend on machines always being in the same place
• Service discovery and routing layers solve this problem
F A L L A C I E S O F D I S T R I B U T E D C O M P U T I N G
3 8
![Page 39: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/39.jpg)
FALLACY #6
THERE IS ONE ADMINISTRATOR
F A L L A C I E S O F D I S T R I B U T E D C O M P U T I N G
3 9
![Page 40: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/40.jpg)
• Production systems are often maintained and managed by numerous people
• Multiple administrators may institute conflicting policies
F A L L A C I E S O F D I S T R I B U T E D C O M P U T I N G
4 0
![Page 41: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/41.jpg)
FALLACY #7
TRANSPORT COST IS ZERO
F A L L A C I E S O F D I S T R I B U T E D C O M P U T I N G
4 1
![Page 42: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/42.jpg)
• Local processing is cheap
• Network communication is expensive
• Latency and bandwidth ensure transport cost is never zero
F A L L A C I E S O F D I S T R I B U T E D C O M P U T I N G
4 2
![Page 43: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/43.jpg)
FALLACY #8
THE NETWORK IS HOMOGENEOUS
F A L L A C I E S O F D I S T R I B U T E D C O M P U T I N G
4 3
![Page 44: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/44.jpg)
• Applications must be designed to work in a variety of environments
• Wired networks
• Wireless networks
• Cellular networks
• Satellite networks
F A L L A C I E S O F D I S T R I B U T E D C O M P U T I N G
4 4
![Page 45: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/45.jpg)
C O N C E P T S
4 5
![Page 48: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/48.jpg)
C O N C E P T S
4 8
• Time in distributed systems
• Consistency in distributed systems
• Partitioning and replication
![Page 49: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/49.jpg)
| System | Consistency | Availability | Partition tolerance |
| --- | --- | --- | --- |
| ZooKeeper | Strong | Quorum | Yes |
| Dynamo | Eventually strong | High | Yes |
| MySQL | Strong | High | No |
T H E C A P T H E O R E M
4 9
T R A D E O F F S I N D I S T R I B U T E D S Y S T E M S
![Page 50: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/50.jpg)
O R D E R I N D I S T R I B U T E D S Y S T E M S
5 0
![Page 51: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/51.jpg)
• Order is necessary to enforce causal relationships
• Two types of order in distributed systems
• Partial order: the order of dependent events
• Total order: the order of all events
• Single-threaded applications are totally ordered
O R D E R I N D I S T R I B U T E D S Y S T E M S
5 1
![Page 52: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/52.jpg)
T I M E I N D I S T R I B U T E D S Y S T E M S
5 2
![Page 53: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/53.jpg)
• Time can be used to enforce order
• Time can be used to enforce bounds on communications
• But time progresses independently in asynchronous systems
• Clocks suffer from clock drift
• Even NTP can only synchronize clocks to within a few milliseconds of each other
T I M E I N D I S T R I B U T E D S Y S T E M S
5 3
![Page 54: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/54.jpg)
T I M E I N D I S T R I B U T E D S Y S T E M S
5 4
![Page 55: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/55.jpg)
• “Time, Clocks, and the Ordering of Events in a Distributed System”
• Developed by Leslie Lamport in 1978
• One of the seminal papers in distributed systems
• Determines partial ordering of events in a distributed system
• Also referred to as logical clocks
T I M E I N D I S T R I B U T E D S Y S T E M S
5 5
L A M P O R T C L O C K S
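The logical clock rules from Lamport's paper can be sketched in a few lines. This is a minimal illustration, not a production implementation; the class and method names are hypothetical.

```python
class LamportClock:
    """Logical clock following the rules in Lamport's 1978 paper."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """Advance the clock for a local event."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Advance the clock and return the timestamp to attach to a message."""
        return self.tick()

    def receive(self, message_time: int) -> int:
        """Move past both the local time and the sender's timestamp."""
        self.time = max(self.time, message_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()                     # local event on a: a.time == 1
stamp = a.send()             # a sends a message stamped 2
b.tick()                     # unrelated local event on b: b.time == 1
received = b.receive(stamp)  # b jumps to max(1, 2) + 1 == 3
print(stamp, received)
```

The receive event always gets a larger timestamp than the matching send. The converse does not hold: a smaller timestamp does not by itself prove that one event caused another.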
![Page 56: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/56.jpg)
T I M E I N D I S T R I B U T E D S Y S T E M S
5 6
![Page 57: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/57.jpg)
• “Timestamps in Message Passing Systems That Preserve the Partial Ordering” - Colin J. Fidge
• “Virtual Time and Global States of Distributed Systems” - Friedemann Mattern
• Independently developed by two researchers in 1988
• Determines causal ordering of events in a distributed system
• Closely related to version vectors
T I M E I N D I S T R I B U T E D S Y S T E M S
5 7
V E C T O R C L O C K S
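A minimal sketch of vector clocks, assuming each clock is a dictionary from node name to counter. The function names are hypothetical and chosen for illustration only.

```python
def increment(clock: dict, node: str) -> dict:
    """Return a copy of the clock with the node's own counter advanced."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(local: dict, remote: dict) -> dict:
    """Element-wise maximum of two clocks, applied when a message is received."""
    return {n: max(local.get(n, 0), remote.get(n, 0)) for n in local.keys() | remote.keys()}


def leq(x: dict, y: dict) -> bool:
    """True if every counter in x is less than or equal to its counterpart in y."""
    return all(x.get(n, 0) <= y.get(n, 0) for n in x.keys() | y.keys())


def happened_before(x: dict, y: dict) -> bool:
    """True if the event stamped x causally precedes the event stamped y."""
    return leq(x, y) and not leq(y, x)


def concurrent(x: dict, y: dict) -> bool:
    """True if neither event causally precedes the other."""
    return not leq(x, y) and not leq(y, x)


a1 = increment({}, "a")             # event on node a: {"a": 1}
b1 = increment({}, "b")             # independent event on node b: {"b": 1}
b2 = increment(merge(b1, a1), "b")  # b receives a's message: {"a": 1, "b": 2}
print(happened_before(a1, b2), concurrent(a1, b1))
```

Unlike a single Lamport timestamp, comparing two vectors can tell causally ordered events apart from concurrent ones.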
![Page 58: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/58.jpg)
T I M E I N D I S T R I B U T E D S Y S T E M S
5 8
![Page 59: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/59.jpg)
C O N S I S T E N C Y
5 9
![Page 60: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/60.jpg)
• Linearizability
• Sequential consistency
• Causal consistency
• Eventual strong consistency
• Eventual consistency
C O N S I S T E N C Y M O D E L S
6 0
![Page 61: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/61.jpg)
• Monotonic read consistency
• Monotonic write consistency
• Read-your-writes consistency
• Writes follow reads consistency
• Serializability
C O N S I S T E N C Y M O D E L S
6 1
M O R E C O N S I S T E N C Y M O D E L S
![Page 62: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/62.jpg)
P A R T I T I O N I N G
6 2
![Page 63: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/63.jpg)
• Split data across multiple machines
• Reduces the amount of data each node must handle
• Reduces the amount of network I/O for certain algorithms
P A R T I T I O N I N G
6 3
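One common way to split data across machines is hash partitioning. The sketch below assumes string keys and a fixed number of partitions; the function names are hypothetical, and SHA-256 is used only because it is stable across processes.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index using a stable hash (not Python's salted hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def partition(keys, num_partitions):
    """Group keys by the partition that owns them."""
    buckets = {i: [] for i in range(num_partitions)}
    for key in keys:
        buckets[partition_for(key, num_partitions)].append(key)
    return buckets


keys = [f"user-{i}" for i in range(1000)]
buckets = partition(keys, 4)
for index, owned in buckets.items():
    print(index, len(owned))
```

Each node now handles only the keys in its own bucket. The drawback of the modulo scheme is that changing the number of partitions remaps most keys.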
![Page 64: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/64.jpg)
R E P L I C A T I O N
6 4
![Page 65: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/65.jpg)
• Sharing information to ensure consistency between redundant services
• Active replication — push
• Passive replication — pull
• Quorum-based
• Gossip
R E P L I C A T I O N
6 5
![Page 66: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/66.jpg)
R E P L I C A T I O N
6 6
S Y N C H R O N O U S
• Nodes updated between the request and response
• Consistency over performance
A S Y N C H R O N O U S
• State persisted locally and replicated after response
• Performance over consistency
![Page 67: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/67.jpg)
[Table: PRIMARY-BACKUP, GOSSIP, 2PC, and QUORUM replication compared by CONSISTENCY, TRANSACTIONS, LATENCY, THROUGHPUT, and DATA LOSS. Surviving cell values: EVENTUAL, STRONG; FULL, LOCAL; LOW, MEDIUM, HIGH; NONE, SOME; READ ONLY, READ/WRITE.]
R E P L I C A T I O N
6 7
T R A D E O F F S I N D I S T R I B U T E D S Y S T E M S
![Page 68: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/68.jpg)
• Gossip is one of the simplest distributed communication algorithms
• Inspired by the gossip that takes place in human communication
• Each node periodically chooses a random set of neighbors with which to exchange information
• Information propagates through the system quickly
• Version vectors can be used to resolve conflicts in updates
R E P L I C A T I O N
6 8
G O S S I P
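How quickly information spreads can be seen in a small simulation. This sketch simplifies the slide's description to push-only gossip with a fixed fanout; the function name, the fanout of 3, and the seeded random generator are all assumptions made for illustration.

```python
import random


def gossip_rounds(num_nodes: int, fanout: int, seed: int = 0) -> int:
    """Count rounds until every node has heard a rumor that starts at node 0."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    informed = {0}
    rounds = 0
    while len(informed) < num_nodes:
        newly_informed = set()
        for node in informed:
            # Each informed node pushes the rumor to `fanout` random peers.
            peers = [n for n in range(num_nodes) if n != node]
            newly_informed.update(rng.sample(peers, fanout))
        informed |= newly_informed
        rounds += 1
    return rounds


print(gossip_rounds(num_nodes=1000, fanout=3))
```

The informed set can at most quadruple per round with a fanout of 3, so 1,000 nodes need at least five rounds. In practice the count stays close to logarithmic in the cluster size.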
![Page 69: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/69.jpg)
C O N S I S T E N T H A S H I N G
6 9
![Page 70: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/70.jpg)
• Map each object to a point on the edge of a circle
• Map each machine to a pseudo-random point on the same circle
• To find the node on which an object is stored, find the location of the object on the edge of the circle and walk around the circle until the first node is found
C O N S I S T E N T H A S H I N G
7 0
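The three steps above can be sketched as a small hash ring. This is a minimal illustration under stated assumptions: the circle is the range of a 64-bit hash, "walking around the circle" means moving clockwise toward larger positions, and the class and node names are hypothetical. Virtual nodes are omitted.

```python
import bisect
import hashlib


def point(value: str) -> int:
    """Hash a string to a position on the circle (here, a 64-bit integer ring)."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes):
        # Sorted list of (position, node) pairs around the circle.
        self._ring = sorted((point(node), node) for node in nodes)

    def node_for(self, key: str) -> str:
        """Walk clockwise from the key's position to the first node."""
        positions = [position for position, _ in self._ring]
        index = bisect.bisect_right(positions, point(key))
        # Past the last node, wrap around to the start of the circle.
        return self._ring[index % len(self._ring)][1]


keys = [f"object-{i}" for i in range(1000)]
before = HashRing(["node-a", "node-b", "node-c"])
after = HashRing(["node-a", "node-b", "node-c", "node-d"])
moved = [k for k in keys if before.node_for(k) != after.node_for(k)]
print(f"{len(moved)} of {len(keys)} keys moved after adding node-d")
```

Every key that moves lands on the new node, and keys owned by the other nodes stay where they were. That is the property that distinguishes consistent hashing from modulo-based partitioning.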
![Page 71: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/71.jpg)
C O N S I S T E N T H A S H I N G
7 1
![Page 72: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/72.jpg)
F A I L U R E D E T E C T I O N
7 2
![Page 73: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/73.jpg)
• Failure detectors are characterized in terms of completeness and accuracy
• In a synchronous system, failure detection is solvable
• Certain problems are not solvable without failure detection in an asynchronous system
• A partitioned process is indistinguishable from a crashed process
• Thus reliable failure detection is impossible in an asynchronous system
• Failure detection is usually based on time
F A I L U R E D E T E C T I O N
7 3
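A time-based failure detector can be sketched with heartbeats and a timeout. The class name and the 3-second timeout are assumptions; timestamps are passed in explicitly so the example is deterministic.

```python
class HeartbeatFailureDetector:
    """Suspect a node when no heartbeat has arrived within the timeout."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.last_seen = {}

    def heartbeat(self, node: str, now: float) -> None:
        """Record that a heartbeat from the node arrived at time `now`."""
        self.last_seen[node] = now

    def suspected(self, now: float) -> set:
        """Nodes whose last heartbeat is older than the timeout."""
        return {n for n, seen in self.last_seen.items() if now - seen > self.timeout}


detector = HeartbeatFailureDetector(timeout=3.0)
detector.heartbeat("a", now=0.0)
detector.heartbeat("b", now=0.0)
detector.heartbeat("a", now=2.0)    # b's next heartbeat is lost or delayed
print(detector.suspected(now=4.0))  # b is suspected, but it may only be slow or partitioned
detector.heartbeat("b", now=4.5)    # a late heartbeat clears the suspicion
print(detector.suspected(now=5.0))
```

The detector can only suspect a node, never confirm a crash. A shorter timeout detects failures sooner but produces more false suspicions, which is the completeness versus accuracy trade-off.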
![Page 74: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/74.jpg)
L E A D E R E L E C T I O N
7 4
![Page 75: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/75.jpg)
L E A D E R E L E C T I O N
7 5
• The process of selecting a single node to coordinate a cluster
• Difficult to account for failures
• Electing a leader allows a single process to control a cluster
• Frequently used in consensus algorithms
• But a single leader can limit throughput
![Page 76: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/76.jpg)
L E A D E R E L E C T I O N
7 6
B U L L Y A L G O R I T H M
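The slide names the bully algorithm without spelling it out, so here is a simplified sketch of its core idea: the live node with the highest ID wins. Message passing and timeouts are modeled by a set of responsive node IDs, which is an assumption made to keep the example short. The function name is hypothetical.

```python
def bully_election(initiator: int, alive: set) -> int:
    """Return the coordinator chosen when `initiator` starts an election.

    `alive` is the set of node IDs that currently respond to messages.
    """
    if initiator not in alive:
        raise ValueError("a crashed node cannot start an election")
    candidate = initiator
    while True:
        # ELECTION messages go to every node with a higher ID.
        responders = [node for node in alive if node > candidate]
        if not responders:
            # Nobody higher answered, so the candidate announces itself as coordinator.
            return candidate
        # A higher node answered and takes over the election.
        candidate = min(responders)


print(bully_election(initiator=1, alive={1, 2, 3, 5}))
```

In a real deployment, "nobody answered" is inferred from a timeout. An election therefore inherits the failure-detection limitations described earlier.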
![Page 77: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/77.jpg)
C O N S E N S U S
7 7
![Page 78: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/78.jpg)
• Single-system view, shared state
• Key to building consistent storage systems
C O N S E N S U S
7 8
![Page 79: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/79.jpg)
• Agreement: every correct process must agree on the same value
• Integrity: every correct process decides at most one value, and only a value that was proposed
• Termination: every correct process eventually decides some value
• Validity: if all correct processes propose the same value v, then all correct processes decide v
C O N S E N S U S
7 9
![Page 80: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/80.jpg)
• “Impossibility of Distributed Consensus with One Faulty Process” - Fischer, Lynch, and Paterson
• Commonly referred to as the FLP Impossibility Result
• No deterministic algorithm can guarantee consensus in an asynchronous system if even one process may crash
• In practice, consensus can be reached, for example by relying on timeouts or randomization
C O N S E N S U S
8 0
![Page 81: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/81.jpg)
ZooKeeper Atomic Broadcast: “ZooKeeper: Wait-free Coordination for Internet-scale Systems” - Hunt, Konar et al.
Viewstamped Replication: “Viewstamped Replication” - Brian M. Oki and Barbara H. Liskov
Raft: “In Search of an Understandable Consensus Algorithm” - Diego Ongaro and John Ousterhout
Paxos: “The Part-Time Parliament” - Leslie Lamport
Paxos: “Paxos Made Simple” - Leslie Lamport
C O N S E N S U S
8 1
![Page 82: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/82.jpg)
• Leader election
• Log replication
• Failure detection
• Log compaction
• Membership changes
C O N S E N S U S
8 2
![Page 83: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/83.jpg)
Distributed systems in practice
N E X T T I M E
8 3
![Page 84: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/84.jpg)
Q & A
![Page 85: Distributed Systems Concepts](https://reader031.vdocument.in/reader031/viewer/2022030305/587138dd1a28abf0568b6487/html5/thumbnails/85.jpg)
HaltermanJordan
T H A N K Y O U !