global state recording


Upload: blythe

Post on 05-Feb-2016


Page 1: Global State Recording

Global State Recording

definitions
global state recording
• FIFO: Chandy-Lamport’s algorithm, collecting global state, incremental snapshot
• non-FIFO: Lai-Yang two-color algorithm, Mattern’s vector clocks algorithm
consistent global snapshots
• causality and zigzag paths
• rollback dependency graph

Page 2: Global State Recording

Local State

local state LSi of a site (process) Si is an assignment of values to the variables of Si
sending send(mij) and receiving rec(mij) of a message mij from Si to Sj may influence LSi
we denote
• time(send(mij)) or time(rec(mij)): the sequence number of the state in the computation after which the send/receive occurs (note the difference with Singhal)
• time(LSi): the state at which Si was recorded
to aid the reasoning, we consider the messages sent/received by a process as belonging to its local state
for recorded local states LSi and LSj we define
• the message mij is in transit if it was sent but not received
• the message mij is inconsistent if it was received but never sent
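The formulas that accompanied these definitions did not survive in the transcript. A plausible reconstruction from the verbal definitions is sketched below. The set names transit and inconsistent and the membership rule are assumptions borrowed from the notation of the Singhal textbook that the slide refers to, not text recovered from the slide.

```latex
\begin{align*}
send(m_{ij}) \in LS_i &\iff time(send(m_{ij})) < time(LS_i)\\
rec(m_{ij}) \in LS_j &\iff time(rec(m_{ij})) < time(LS_j)\\
transit(LS_i, LS_j) &= \{\, m_{ij} \mid send(m_{ij}) \in LS_i \wedge rec(m_{ij}) \notin LS_j \,\}\\
inconsistent(LS_i, LS_j) &= \{\, m_{ij} \mid send(m_{ij}) \notin LS_i \wedge rec(m_{ij}) \in LS_j \,\}
\end{align*}
```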

Page 3: Global State Recording

Global State

global state is a collection of
• local states of all processes and
• the set of messages in the channels
notice that Singhal does not use messages in his definition; ours is more precise
global state is consistent if it does not have any inconsistent messages, that is, no message is recorded as received without also being recorded as sent
global state is transitless if there are no messages in transit, that is, every message recorded as sent is also recorded as received
note that a consistent state is not necessarily transitless and vice versa

what are the global states on the picture above?
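Both properties can be checked mechanically once the recorded sends and receipts are known. The following is a minimal sketch, assuming every message has a unique id; the function name and the `sent` / `received` sets are hypothetical and only illustrate the definitions.

```python
def classify_global_state(sent, received):
    """Classify a recorded global state.

    sent / received are sets of message ids (an assumed representation):
    ids whose send, respectively receipt, is part of a recorded local state.
    """
    in_transit = sent - received      # sent but not received
    inconsistent = received - sent    # received but never sent
    return {
        "in_transit": in_transit,
        "inconsistent": inconsistent,
        "consistent": not inconsistent,
        "transitless": not in_transit,
    }


# m2 is still in a channel: consistent but not transitless
a = classify_global_state({"m1", "m2"}, {"m1"})
# m3 is recorded as received but not as sent: transitless but inconsistent
b = classify_global_state({"m1"}, {"m1", "m3"})
```

The two example states illustrate why neither property implies the other.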

Page 4: Global State Recording

Chandy-Lamport’s Global State Recording Algorithm

works on a system of arbitrary topology with FIFO channels and an arbitrary algorithm whose snapshot is taken (the basic algorithm)
does not interfere with the operation of the basic algorithm (does not delay, reorder or drop basic messages)
one process initiates recording by sending control messages (markers); multiple processes can also initiate

• can C-L record an inconsistent state?
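The marker rules themselves are not in the transcript. The sketch below follows the standard formulation of the algorithm: record the local state on the first marker, forward markers on all outgoing channels, and record each incoming channel until its marker arrives. The `Process` class, the toy account balances and the helper functions are assumptions made for illustration only.

```python
from collections import deque

MARKER = "MARKER"  # control message; the name is an assumption of this sketch


class Process:
    def __init__(self, pid, balance, peers):
        self.pid = pid
        self.balance = balance        # basic-algorithm state (a toy account balance)
        self.peers = peers            # ids of neighbours (channels in both directions)
        self.recorded = None          # recorded local state, None until recorded
        self.channel_state = {}       # incoming channel (sender id) -> recorded messages
        self.open_channels = set()    # incoming channels still being recorded

    def record(self, net):
        """Record the local state, then send a marker on every outgoing channel."""
        self.recorded = self.balance
        self.channel_state = {p: [] for p in self.peers}
        self.open_channels = set(self.peers)
        for p in self.peers:
            net[(self.pid, p)].append(MARKER)

    def receive(self, sender, msg, net):
        if msg == MARKER:
            if self.recorded is None:            # first marker: record state now
                self.record(net)
            self.open_channels.discard(sender)   # channel from sender is complete
        else:
            self.balance += msg                  # basic message is handled as usual
            if sender in self.open_channels:
                self.channel_state[sender].append(msg)  # it was in transit


def send(net, procs, src, dst, amount):
    procs[src].balance -= amount
    net[(src, dst)].append(amount)


def deliver(net, procs, src, dst):
    procs[dst].receive(src, net[(src, dst)].popleft(), net)


procs = {0: Process(0, 100, [1]), 1: Process(1, 50, [0])}
net = {(0, 1): deque(), (1, 0): deque()}   # FIFO channels

send(net, procs, 1, 0, 5)     # 5 units are now in transit from P1 to P0
procs[0].record(net)          # P0 initiates the snapshot
deliver(net, procs, 1, 0)     # P0 receives 5 while recording channel 1 -> 0
deliver(net, procs, 0, 1)     # P1 receives the marker and records its state
deliver(net, procs, 1, 0)     # P0 receives P1's marker: recording is complete

snapshot_total = (procs[0].recorded + procs[1].recorded
                  + sum(procs[0].channel_state[1]))
```

In this run the recorded local states plus the recorded channel contents add up to the 150 units present in the system, which is what a consistent recording should preserve.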

Page 5: Global State Recording

Global State Recorded by Chandy-Lamport’s Algorithm

does C-L record a (global) state that occurs in the computation? not necessarily
however, C-L records a state of a computation that is equivalent to the original computation
moreover, this equivalent computation shares with the original computation
• a prefix up to the start of the snapshot
• a suffix after the end of the snapshot
the recorded state lies between the state at the start of the snapshot and the state at its end

• can C-L record a state where some P have messages in every channel? if yes which one?

• can several independent snapshots run in parallel?

Page 6: Global State Recording

Collecting Global State

based on a spanning tree constructed on the fly
the sender of the first marker to arrive at a process is its parent
each marker carries the sender’s parent
by receiving markers, a process learns if it has children
if a process is a leaf, after finishing state recording, it sends its state to its parent
each process waits for its children’s states, appends its own and forwards all the info to its parent
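The collection phase amounts to a convergecast over the marker tree. The sketch below replays it sequentially rather than with real messages; `parent` and `local_state` are assumed inputs, and the function name is hypothetical.

```python
def collect_global_state(parent, local_state):
    """Gather recorded states up the marker spanning tree (sequential sketch).

    parent[p] is the sender of the first marker that reached p (None for the
    initiator); local_state[p] is the state recorded by p.
    """
    children = {p: [] for p in parent}
    root = None
    for p, par in parent.items():
        if par is None:
            root = p
        else:
            children[par].append(p)   # p named par as its parent, so par has a child

    def report(p):
        # what p forwards to its parent: its own state plus its children's reports
        info = {p: local_state[p]}
        for c in children[p]:
            info.update(report(c))
        return info

    return report(root)


parent = {"A": None, "B": "A", "C": "A", "D": "B"}
states = {"A": 10, "B": 20, "C": 30, "D": 40}
collected = collect_global_state(parent, states)
```

A leaf such as D reports only its own state, while the initiator A ends up with the states of all four processes.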

Page 7: Global State Recording

Non-FIFO: Lai-Yang Algorithm

non-FIFO channel is a set (rather than a queue) of messages; any message in the set can be received
• messages can overtake one another
• fair message receipt is assumed: eventually a sent message is received
two colors for processes and basic messages, white and red; no explicit markers
all processes start as white; when a process sends a basic message, it attaches its color
when a process receives a differently colored message, it changes color itself
while white (red), a process records all messages sent/received; after changing color, the process sends its message history to the initiator
based on the sent/received histories, the initiator calculates the messages in transit
if only the number of messages in transit is needed, processes may maintain counters instead of histories
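The initiator's calculation can be sketched as a per-channel difference between what one side recorded as sent and what the other recorded as received. The dictionary layout and the function name below are assumptions, and message ids are taken to be unique per channel.

```python
def messages_in_transit(sent, received):
    """Per-channel in-transit messages from the reported histories.

    sent[i][j]: ids of messages Pi sent to Pj before changing color.
    received[j][i]: ids of messages Pj received from Pi before changing color.
    """
    transit = {}
    for i, per_dest in sent.items():
        for j, msgs in per_dest.items():
            got = set(received.get(j, {}).get(i, []))
            transit[(i, j)] = [m for m in msgs if m not in got]
    return transit


# non-FIFO: P2 received "c" before "a", and "b" is still in the channel
sent = {"P1": {"P2": ["a", "b", "c"]}, "P2": {"P1": ["x"]}}
received = {"P2": {"P1": ["c", "a"]}, "P1": {"P2": ["x"]}}
transit = messages_in_transit(sent, received)

# counter variant: only the number of in-transit messages per channel
counts = {(i, j): len(sent[i][j]) - len(received[j][i])
          for i in sent for j in sent[i]}
```

The counter variant needs only two integers per channel, which is why it suffices when the message contents are not required.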

Page 8: Global State Recording

State Recording Using Causal Message Delivery

initiator broadcasts a token to all processes
each process records its state on receiving the token and sends it to the initiator
processes do not send markers or coordinate local state recording
due to causal ordering of messages (→ denotes happened-before)
• if for Pi: rec(tokeni) → send(mij)
• then send(tokenj) → send(mij)
• therefore rec(tokenj) → rec(mij)
• hence state recording at Pj happens before the receipt of mij
channel state recording (Acharya-Badrinath)
• append sequence numbers to all messages
• at each process, record the highest sent/received SN together with the local state
• send the sent/received records to the initiator to determine the messages in transit
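With sequence numbers, the initiator can derive each channel's contents from two integers. The sketch below assumes sequence numbers are kept per channel and start at 1, and that messages on one channel arrive in order (causal delivery implies this); the names `sent_sn`, `recd_sn` and `channel_contents` are hypothetical.

```python
def channel_contents(sent_sn, recd_sn):
    """Sequence numbers of the messages in transit on every channel.

    sent_sn[i][j]: highest SN that Pi had sent to Pj when it recorded its state.
    recd_sn[j][i]: highest SN that Pj had received from Pi when it recorded.
    """
    transit = {}
    for i, per_dest in sent_sn.items():
        for j, last_sent in per_dest.items():
            last_recd = recd_sn[j][i]
            # everything sent but not yet received at recording time
            transit[(i, j)] = list(range(last_recd + 1, last_sent + 1))
    return transit


sent_sn = {1: {2: 5}, 2: {1: 3}}
recd_sn = {2: {1: 3}, 1: {2: 3}}
in_transit = channel_contents(sent_sn, recd_sn)
```

Here messages 4 and 5 from P1 to P2 form the recorded channel state, and the channel from P2 to P1 is empty.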

Page 9: Global State Recording

Consistent Global Snapshot

processes periodically and asynchronously record local states (local checkpoints)
a global snapshot is a collection of local checkpoints, one per process
needed in distributed failure recovery, distributed event monitoring, debugging, etc.
a global snapshot is consistent if no two of its checkpoints are causally related
even though two checkpoints are not causally related, they may not be part of any consistent global snapshot
ex: C11 and C32 are concurrent, yet there is no consistent global snapshot that contains both of them
the objective in global snapshot recording is to select (out of the available checkpoints) a set of concurrent checkpoints
note that, unlike global state recording, the algorithm does not have control over the time when snapshots are taken
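One way to test the consistency condition is to compare vector timestamps. This is a sketch under the assumption that every checkpoint is an event stamped with a vector clock, which the slide itself does not state; the function names are hypothetical.

```python
from itertools import combinations


def happened_before(u, v):
    """Vector-clock test: u causally precedes v."""
    return all(a <= b for a, b in zip(u, v)) and u != v


def is_consistent(checkpoints):
    """True if no two checkpoints (given as vector timestamps) are causally related."""
    return not any(happened_before(u, v) or happened_before(v, u)
                   for u, v in combinations(checkpoints, 2))


concurrent = [(2, 0, 0), (1, 3, 0), (0, 0, 1)]
related = [(2, 0, 0), (3, 4, 0), (0, 0, 1)]   # the second checkpoint has seen the first
ok = is_consistent(concurrent)
bad = is_consistent(related)
```

Pairwise concurrency is what makes the set itself consistent; whether a given pair of concurrent checkpoints can be extended to a full consistent snapshot is a separate question.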

Page 10: Global State Recording

Zigzag Path

checkpoint interval: part of the computation between two successive checkpoints at the same process
zigzag path exists between checkpoints Cxi and Cyj if there exists a sequence of messages m1, …, mn such that
• m1 is sent by Px after Cxi
• if mk is received by Pz, then mk+1 is sent by Pz in the same or a later checkpoint interval (mk+1 may be sent before mk is received)
• mn is received by Py before Cyj
causal path: same as a zigzag path, but the messages are causally related (each mk+1 is sent after mk is received)
zigzag cycle is a zigzag path from a checkpoint to itself
[Figure: example computation showing a causal path and a zigzag path]
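The definition translates into a search over messages. The sketch below assumes that interval k of a process lies between its checkpoints k and k+1, and that the next message may be sent in the same or a later interval than the one in which the previous message was received, as in Netzer and Xu's definition. The `Msg` record and the function name are hypothetical.

```python
from collections import deque, namedtuple

# src/dst: process ids; send_iv/recv_iv: checkpoint interval of the send/receipt
Msg = namedtuple("Msg", "src send_iv dst recv_iv")


def zigzag_path_exists(msgs, p, x, q, y):
    """Is there a zigzag path from checkpoint x of p to checkpoint y of q?"""
    # m1 must be sent by p after its checkpoint x
    frontier = deque(m for m in msgs if m.src == p and m.send_iv >= x)
    seen = set(frontier)
    while frontier:
        m = frontier.popleft()
        if m.dst == q and m.recv_iv < y:      # mn received by q before checkpoint y
            return True
        for n in msgs:
            # next message leaves m's receiver in the same or a later interval
            if n not in seen and n.src == m.dst and n.send_iv >= m.recv_iv:
                seen.add(n)
                frontier.append(n)
    return False


msgs = [Msg("P0", 0, "P1", 1),   # m1
        Msg("P1", 1, "P2", 0)]   # m2, sent in the interval in which m1 arrives
forward = zigzag_path_exists(msgs, "P0", 0, "P2", 1)
too_late = zigzag_path_exists(msgs, "P0", 1, "P2", 1)
backward = zigzag_path_exists(msgs, "P2", 0, "P0", 1)
```

The search does not care whether m2 was sent before or after m1 arrived, which is exactly what separates a zigzag path from a causal path.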

Page 11: Global State Recording


[Figure: example computation showing a zigzag cycle and a zigzag path]

Page 12: Global State Recording

Necessary and Sufficient Condition for Consistent Snapshot

a consistent snapshot can be formed to include a set S of checkpoints if and only if no zigzag path exists between any two checkpoints in S [Netzer and Xu]
snapshot line: a line drawn through a set of checkpoints
if a zigzag path exists between two checkpoints, any snapshot line through them crosses a message, making two checkpoints on the line causally related and the resultant snapshot inconsistent
constructing a consistent snapshot requires choosing checkpoints with no zigzag path between them

[Figure: zigzag path]

Page 13: Global State Recording

Runtime Consistent Snapshot Construction Using R-Graph

definition of rollback-dependency graph (R-graph) [Wang]

basic message carries its checkpoint interval number

there is a zigzag path between two checkpoints if there is a path in the R-graph

[Figure: example computation and the corresponding R-graph]