Distributed Systems, Fall 2011: Gossip and Highly Available Services
Outline
• Quality of Service (QoS) vs. High availability
• Gossip
– Basic concept
– Architecture
• Cloud computing
Quality of Service
• Quality of Service means different things depending on context
– Low number of crashes / high uptime
– Messages delivered in “reasonable” time (e.g. live streaming data)
– Many more interpretations
• Service Level Agreements (SLAs)
• Compensation for broken SLAs
– E.g. accounting in Grids / Clouds
High availability
• Uptime does not imply availability!
• Some level of service is better than none
• We do not always need replicas with sequential consistency or linearizability properties
– “Fresh enough” rather than “the freshest”
• Many modern (huge) systems have similar requirements
Quick refresher
• Sequential consistency
– Interleaving of operations performed (as if) on a single copy of the object
– Consistent with program order of the invoking client (as opposed to real time in linearizability)
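The refresher above can be made concrete: a sequentially consistent execution is any interleaving of the clients' operations that respects each client's program order. A minimal sketch (names are illustrative) that enumerates exactly those legal interleavings:

```python
def interleavings(*clients):
    """Enumerate every interleaving of the clients' operation sequences
    that preserves each client's program order -- the executions a
    sequentially consistent service is allowed to (appear to) produce."""
    seqs = [list(c) for c in clients]
    if all(not s for s in seqs):
        return [()]
    result = []
    for i, s in enumerate(seqs):
        if s:
            rest = [t[:] for t in seqs]
            head = rest[i].pop(0)
            for tail in interleavings(*rest):
                result.append((head,) + tail)
    return result

# Two operations by client A, one by client B: three legal orders,
# all keeping a1 before a2.
print(interleavings(["a1", "a2"], ["b1"]))
```

Real time never enters the check; that is the difference from linearizability, which additionally constrains the order by when operations actually happened.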
Gossip
• Framework, Architecture, and/or Protocol
• “Lazy” synchronization between replica managers
– Eventual consistency
• Very fault-tolerant
– Crashes are OK
– Clients may contact any replica manager for any operation
– Clients keep track of their own state
Basic ideas of Gossip
• Consistent service over time
– Provide clients with data at least as recent as what they have already observed
– Clients can work with a different replica for each operation
• Relaxed inter-replica consistency
– Not sequential consistency in general
Basic ideas contd.
• Update operations
– Change some value in the system
– Accepted immediately (carried out later)
• Query operations
– Read value(s) from the system
– Block until the replica manager is able to respond to the client
– Need fresh enough data to respond
Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 4
© Pearson Education 2005
Figure 15.6: Query and update operations in a gossip service
[Diagram: clients issue operations through front ends (FEs) to replica managers (RMs). A query carries the FE's previous timestamp ("Query, prev") and returns a value with a new timestamp ("Val, new"); an update ("Update, prev") returns an update id. The RMs exchange gossip messages among themselves.]
Message ordering
• Causal update ordering
• Forced (total and causal) update ordering
• Immediate update ordering
– Applied in a consistent order relative to any other update at all replica managers, regardless of that other update's own ordering guarantee
• Tradeoff: consistency vs. cost
– Causal is cheap and easy
– Causal order for queries to a single RM
Message ordering example
• Discussion forum (bulletin board)
• Causal order for discussion threads– Preserves conversation structures
• Forced order for registration– Clear order of who joined when
• Immediate for unregistering– No messages sent to ex-subscriber
Front ends
• Much more intelligent than in active/passive replication!
• Clients always use front ends– Even for inter-client messages (allows causally related messages and information dissemination)
Queries
• Client state (vector-clock) included in the call
• The RM returns values that are at least as recent as the client's state
– If the client's state is more recent than the RM's, the RM either requests the missing updates or waits for them
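The freshness rule above reduces to a dominance test between vector timestamps. A minimal sketch in the spirit of the textbook's gossip protocol (class and attribute names are illustrative, not from any real implementation):

```python
def dominates(a, b):
    """True if vector timestamp a >= b in every component."""
    return all(x >= y for x, y in zip(a, b))

class ReplicaManager:
    """Toy replica manager holding a single value."""
    def __init__(self, n_replicas):
        self.value = None
        self.value_ts = (0,) * n_replicas  # timestamp of the value

    def query(self, prev):
        """Answer only if the value is at least as recent as the
        client's timestamp `prev`; a real RM would otherwise fetch
        the missing updates from its peers or block the query."""
        if dominates(self.value_ts, prev):
            return self.value, self.value_ts
        return None  # not fresh enough yet

rm = ReplicaManager(3)
rm.value, rm.value_ts = "v", (2, 5, 6)
print(rm.query((2, 4, 6)))   # fresh enough: ('v', (2, 5, 6))
print(rm.query((3, 0, 0)))   # RM lags in the first component: None
```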
Updates
• Causal order
• Each update has a unique identifier
• Client state is included in the call to support the message orderings
• Updates are never blocking, only enqueued
• Due to the ordering guarantees, a client can update (or query) any RM and get the same result
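The non-blocking update path above can be sketched as follows, assuming the textbook's timestamping scheme: the RM increments its own entry in its replica timestamp, stamps the update with the client's timestamp plus that entry, logs it, and returns immediately (names are illustrative):

```python
class ReplicaManager:
    """Toy update path: accept immediately, apply later."""
    def __init__(self, n, my_index):
        self.i = my_index
        self.replica_ts = [0] * n   # updates this RM has *seen*
        self.value_ts = [0] * n     # updates applied to the value
        self.log = []               # pending (timestamp, op) entries

    def update(self, op, prev):
        """Accept an update without blocking: stamp it, enqueue it in
        the log, and return an id the FE merges into its timestamp."""
        self.replica_ts[self.i] += 1
        ts = list(prev)
        ts[self.i] = self.replica_ts[self.i]
        self.log.append((tuple(ts), op))
        return tuple(ts)            # the update id is its timestamp

rm = ReplicaManager(3, 0)
uid = rm.update("append(x)", (0, 0, 0))
print(uid)           # (1, 0, 0): accepted and logged, not yet applied
print(len(rm.log))   # 1
```

Applying the logged update to the value is deferred until the ordering guarantees allow it, which is why the call itself never blocks.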
Replication phases
• Request
– Normally the FE sends to a single RM (use more for higher fault tolerance)
• Update response
• Coordination
– Apply updates once the ordering allows
– No explicit coordination: only gossip messages required
• Execution
• Query response
• Agreement
– Lazy: RMs can wait and send updates in batches
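The lazy agreement phase might be sketched as a pairwise log exchange; this is a simplification (a real RM consults its timestamp table to send a peer only the entries it is missing, rather than the whole log):

```python
def gossip_exchange(log_a, ts_a, log_b, ts_b):
    """One lazy exchange between two RMs (sketch): B receives A's
    batched log, keeps the entries it has not seen, and merges the
    replica timestamps component-wise."""
    merged_ts = tuple(max(x, y) for x, y in zip(ts_a, ts_b))
    new_for_b = [entry for entry in log_a if entry not in log_b]
    return log_b + new_for_b, merged_ts

# RM A has seen updates u1 and u2; RM B has only seen u1.
log_b, ts_b = gossip_exchange(
    [((1, 0), "u1"), ((1, 1), "u2")], (1, 1),   # A's log and timestamp
    [((1, 0), "u1")], (1, 0))                   # B's log and timestamp
print(log_b)   # B now also holds u2
print(ts_b)    # (1, 1)
```

Because the exchange is idempotent and commutative, it can run at any frequency and in any peer order without further coordination.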
Front end timestamp
• FE embeds own vector clock with each message
• Updates
– The RM infers the order relative to other updates
• Queries
– The RM returns the oldest data that is still fresh enough
– E.g. the FE is at (2,4,6) and the RM's value timestamp is (2,5,5): the RM lags in the third component, so it waits for the missing update; once its value timestamp reaches (2,5,6) it replies, and the FE merges the returned timestamp into its own, giving (2,5,6)
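The merge the FE performs on every reply is simply a component-wise maximum of the two vector timestamps; with the slide's numbers:

```python
def merge(a, b):
    """Component-wise max: the FE's timestamp after seeing a reply."""
    return tuple(max(x, y) for x, y in zip(a, b))

# FE at (2, 4, 6) combines with a reply timestamped (2, 5, 5).
print(merge((2, 4, 6), (2, 5, 5)))  # (2, 5, 6)
```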
Front end timestamps contd.
• Clients can communicate with each other
– This creates causal relationships between messages
– Lets FEs update their vector timestamps so that future queries to the Gossip service will return more up-to-date data
Figure 15.8: A gossip replica manager, showing its main state components
[Diagram: each replica manager keeps a value, a value timestamp, an update log (a replica log of ⟨OperationID, Update, Prev, FE⟩ entries), a replica timestamp, an executed operation table, and a timestamp table for the other replica managers. FEs submit updates; stable updates are applied to the value; gossip messages are exchanged with the other replica managers.]
Gossiping
• The architecture does not specify when and with which peers to gossip
• Delay depends on:
– Frequency and duration of partitions
– Frequency of gossip exchanges (application dependent)
– Peer-selection policy
• Random
• Deterministic
• Topological
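The peer-selection policies listed above could be sketched as follows (illustrative only; a topological policy would instead choose among the RM's neighbours in some overlay, which is omitted here):

```python
import random

def pick_peer(peers, policy="random", round_no=0):
    """Choose a gossip partner for one round (illustrative policies)."""
    if policy == "random":
        return random.choice(peers)          # uniform random partner
    if policy == "deterministic":
        return peers[round_no % len(peers)]  # e.g. round-robin
    raise ValueError(f"unknown policy: {policy}")

peers = ["rm1", "rm2", "rm3"]
print(pick_peer(peers, "deterministic", round_no=4))  # rm2
print(pick_peer(peers, "random"))
```

Random selection spreads updates with high probability in O(log n) rounds; deterministic schedules are predictable but can propagate more slowly after partitions.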
Advanced Gossiping
• In some applications (think Facebook), replication can be geographically biased
• Read-only RMs close to clients can improve scalability a lot for read-intensive applications
Facebook and Gossip
• Inbox search problem: 25TB of data
• Facebook's solution: Cassandra
– Similar to distributed hash tables (DHTs); more about these in the next lecture
– Conceptually, a hash table with N replicas using lazy updates to share data
• Facebook uses Gossip-like algorithms for failure handling
– If a node in the DHT crashes, the event is propagated to the other replicas
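As a rough illustration of "a hash table with N replicas": a consistent-hashing-style sketch, where each key maps to a ring position and its N replicas live on the next N nodes clockwise. This is an assumption-laden toy, not Cassandra's actual placement code:

```python
import hashlib

def ring_position(name, ring_size=2**32):
    """Place a name on the ring via a stable hash."""
    return int(hashlib.md5(name.encode()).hexdigest(), 16) % ring_size

def replicas_for(key, nodes, n=3):
    """The n nodes clockwise from the key's ring position hold the
    key's replicas (consistent-hashing sketch)."""
    ring = sorted(nodes, key=ring_position)
    pos = ring_position(key)
    start = next((i for i, node in enumerate(ring)
                  if ring_position(node) >= pos), 0)  # wrap to node 0
    return [ring[(start + j) % len(ring)] for j in range(n)]

nodes = [f"node{i}" for i in range(5)]
print(replicas_for("alice:inbox", nodes, n=3))
```

Lazy (gossip-style) updates between the chosen replicas then give the eventual consistency discussed earlier in the lecture.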
Cloud computing
• Servers and tasks are run inside virtual machines
– Virtual machines can be moved and placed dynamically
– A master image is used to start new copies of virtual machines
– Elasticity: virtual machines are added or removed depending on certain system conditions
• A simple example follows
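The example itself is not part of this transcript; as a stand-in, a minimal threshold-based elasticity rule might look like this (the thresholds and limits are invented for illustration):

```python
def scale_decision(vm_count, avg_load, low=0.3, high=0.8,
                   min_vms=1, max_vms=10):
    """Threshold-based elasticity rule (illustrative numbers):
    clone a VM from the master image when average load is high,
    retire one when it is low, otherwise hold steady."""
    if avg_load > high and vm_count < max_vms:
        return vm_count + 1   # scale out
    if avg_load < low and vm_count > min_vms:
        return vm_count - 1   # scale in
    return vm_count           # steady state

print(scale_decision(3, 0.9))   # 4: scale out
print(scale_decision(3, 0.1))   # 2: scale in
print(scale_decision(3, 0.5))   # 3: steady
```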