November, 19th GDS meeting, LIP6, Paris

Hierarchical Synchronization and Consistency in GDS
Sébastien Monnet, IRISA, Rennes
JuxMem Consistency Protocol: Currently Home-Based
- The home node is responsible for a piece of data
- Actions on the piece of data <=> communication with the home node
[Diagram: a client communicating with the home node]
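The home-based scheme can be sketched as follows; `HomeNode` and `Client` are hypothetical names for illustration, not the JuxMem API:

```python
# Minimal sketch of a home-based protocol (hypothetical names): every
# action on a piece of data becomes a request sent to the home node,
# which holds the authoritative copy.
class HomeNode:
    def __init__(self, value=None):
        self.value = value            # authoritative copy of the data

    def handle(self, op, new_value=None):
        if op == "read":
            return self.value
        if op == "write":
            self.value = new_value
            return "ok"
        raise ValueError(op)

class Client:
    def __init__(self, home):
        self.home = home              # all actions go through the home node

    def read(self):
        return self.home.handle("read")

    def write(self, value):
        return self.home.handle("write", value)
```

Because every client funnels its operations through the same home node, two clients always observe the same state of the data.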
Replicated Home
- The home node is replicated to tolerate failures
- Thanks to active replication, all replicas are up-to-date
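Active replication can be illustrated with a toy sketch (hypothetical names): every write is applied to all replicas, so any replica holds up-to-date data and can take over after a failure:

```python
# Toy sketch of active replication (hypothetical names): the "home" is
# a group of replicas and every write is applied to all of them, so all
# replicas stay up-to-date and any one of them can serve a read.
class ReplicatedHome:
    def __init__(self, n_replicas):
        self.replicas = [dict() for _ in range(n_replicas)]

    def write(self, key, value):
        for replica in self.replicas:      # active replication: apply everywhere
            replica[key] = value

    def read(self, key, replica_index=0):
        # any replica is up-to-date, so any of them can answer
        return self.replicas[replica_index][key]
```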
Replication
- Two-layered architecture
- Replication based on classical fault-tolerant distributed algorithms
- Implies a consensus between all nodes
- Need for replicas in several clusters (locality)
[Diagram: two-layered architecture: a fault-tolerance layer (communications, failure detector, consensus, group communication and group membership, atomic multicast, adapter) connected to the consistency layer through a junction layer]
Hierarchical
[Diagram: a client accesses the GDG, which federates several LDGs]
GDG: Global Data Group; LDG: Local Data Group
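The hierarchy can be sketched as a data structure (hypothetical names): a Global Data Group federates one Local Data Group per cluster, and a global update reaches every replica through its LDG:

```python
# Hedged sketch of the GDG/LDG hierarchy (hypothetical names): the GDG
# holds one LDG per cluster; applying an update globally means one hop
# per cluster, then local fan-out inside each LDG.
class LDG:
    def __init__(self, cluster, n_replicas):
        self.cluster = cluster
        self.replicas = [[] for _ in range(n_replicas)]  # per-replica update logs

    def apply(self, update):
        for log in self.replicas:          # local fan-out inside the cluster
            log.append(update)

class GDG:
    def __init__(self):
        self.ldgs = {}                     # cluster name -> LDG

    def add_ldg(self, ldg):
        self.ldgs[ldg.cluster] = ldg

    def apply(self, update):
        for ldg in self.ldgs.values():     # one hop per cluster
            ldg.apply(update)
```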
Synchronization Point of View
- Naturally similar to data management
- 1 lock per piece of data
- Pieces of data are strongly linked to their locks
[Diagram: a client interacting with the synchronization manager (SM)]
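The "one lock per piece of data" idea can be sketched like this (hypothetical names, local threading locks standing in for distributed ones):

```python
import threading

# Sketch (hypothetical names): the synchronization manager keeps exactly
# one lock per piece of data, so each datum stays strongly linked to its
# lock. Local threading.Lock objects stand in for distributed locks.
class SynchronizationManager:
    def __init__(self):
        self._locks = {}                   # data id -> its unique lock
        self._guard = threading.Lock()

    def lock_for(self, data_id):
        with self._guard:                  # create the lock lazily, once
            return self._locks.setdefault(data_id, threading.Lock())

sm = SynchronizationManager()
with sm.lock_for("piece-1"):
    pass                                   # critical section on piece-1
```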
Synchronization Point of View
- The synchronization manager is replicated the same way
Synchronization Point of View
[Diagram: a client accessing the replicated synchronization manager]
In Case of Failure
- Failure of a provider (group member)
  - Handled by the proactive group membership: the faulty provider is replaced by a new one
- Failure of a client
  - With a lock => regenerate the token
  - Without a lock => do nothing
- Failure of a whole local group
  - Very low probability
  - Treated as if it were a client (as it is for the global group)
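The client-failure rule can be sketched as follows (hypothetical names): only a failure of the current lock holder forces the manager to regenerate the token:

```python
# Sketch of the client-failure rule (hypothetical names): if the failed
# client held the lock, the token is regenerated; otherwise nothing
# needs to be done.
class TokenLock:
    def __init__(self):
        self.holder = None
        self.token_version = 0      # bumped each time the token is regenerated

    def acquire(self, client):
        assert self.holder is None
        self.holder = client

    def release(self, client):
        if self.holder == client:
            self.holder = None

    def on_client_failure(self, client):
        if self.holder == client:          # client failed with the lock
            self.holder = None
            self.token_version += 1        # regenerate the token
        # client failed without the lock: do nothing
```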
False Detection
- Blocking unlock with a return code
- To be sure that an operation was performed, a client has to do something like:

    do {
        lock(data)
        process(data)
    } while (unlock(data) is not ok)
    // here we are sure that the action has been taken into account
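A runnable version of this loop, with stubbed `lock`/`process`/`unlock` (the stubs are assumptions, not the JuxMem API; `unlock` fails once here to mimic a false detection, so the critical section is redone):

```python
# Runnable sketch of the retry loop above. lock/process/unlock are stubs
# (assumptions, not the JuxMem API); unlock reports whether the operation
# was taken into account, and "fails" on its first call to mimic a false
# failure detection, so the client redoes the whole critical section.
state = {"count": 0, "unlock_calls": 0}

def lock(data):
    pass                                   # acquire the lock (stub)

def process(data):
    data["count"] += 1                     # the actual work on the data

def unlock(data):
    data["unlock_calls"] += 1
    return data["unlock_calls"] >= 2       # first unlock is "lost"

while True:
    lock(state)
    process(state)
    if unlock(state):                      # blocking unlock with return code
        break
# here we are sure that the action has been taken into account
```

Note that `process` runs twice in this run: after a suspected failure the client cannot tell whether the first attempt was applied, so it redoes the whole section until `unlock` confirms.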
Current JuxMem Synchronization (Summary)
- Authorization-based
  - Exclusive (acquire)
  - Non-exclusive (acquireR)
- Centralized (active replication)
- Strongly coupled with data management
- Hierarchical and fault-tolerant
Data Updates: When?
- Eager (current version):
  - When a lock is released, update all replicas
  - High fault-tolerance level / low performance
Data Updates: When?
- Lazy (possible implementation):
  - Update a local data group when a lock is acquired
Data Updates: When?
- Intermediate (possible implementation):
  - Allow a limited number of local updates before propagating all the updates to the global level
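The intermediate policy can be sketched as a bounded local buffer (hypothetical names): updates are applied locally and pushed to the global level only once a limit is reached:

```python
# Sketch of the intermediate update policy (hypothetical names): a local
# data group applies updates locally and only propagates them to the
# global level once a bounded number of local updates has accumulated.
class LocalDataGroup:
    def __init__(self, global_log, max_local_updates=3):
        self.global_log = global_log       # stands in for the global level
        self.pending = []
        self.max_local = max_local_updates

    def update(self, change):
        self.pending.append(change)        # apply locally first
        if len(self.pending) >= self.max_local:
            self.propagate()

    def propagate(self):
        self.global_log.extend(self.pending)   # push everything globally
        self.pending.clear()
```

Tuning `max_local_updates` trades fault tolerance (eager, limit 1) against performance (lazy, large limit).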
Data Updates: When?
- A hierarchical consistency model?
  - Local lock
  - Global lock
Distributed Synchronization Algorithms
- Naïmi-Trehel's algorithm
  - Token-based
  - Mutual exclusion
- Extended by REGAL
  - Hierarchical (Marin, Luciana, Pierre)
  - Fault-tolerant (Julien)
  - Both?
- A fault-tolerant, grid-aware synchronization module used by JuxMem?
Open Questions and Future Work
- Interface between JuxMem providers and the synchronization module
  - Providers have to be informed of synchronization operations to perform updates
  - Future work (Julien & Sébastien)
- Centralized data / distributed locks?
  - Data may become distributed in JuxMem (epidemic protocols, migratory replication, etc.)
- Algorithms for token-based non-exclusive locks?
  - May allow more flexibility for replication techniques (passive or quorum-based)
Other Open Issues in JuxMem
Junction Layer
- Decoupled design
- Need to refine the junction layer
[Diagram: fault-tolerance and consistency layers connected through the junction layer (send/receive)]
Replication Degree
- Current features: the client specifies
  - The global data group cardinality (i.e. the number of clusters)
  - The local data group cardinality (i.e. the number of replicas in each cluster)
- Desirable features: the client specifies
  - The criticality degree of the piece of data
  - The access needs (model, required performance)
- A monitoring module
  - Integrated with Marin's failure detectors?
  - Current MTBF, message losses, etc.
  - May allow JuxMem to dynamically deduce the replication degree for each piece of data
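The current client-side specification can be sketched as a small request record (hypothetical names, not the JuxMem API):

```python
from dataclasses import dataclass

# Hedged sketch (hypothetical names, not the JuxMem API): the two
# cardinalities the client currently specifies when allocating a piece
# of data determine the total number of replicas.
@dataclass
class AllocationRequest:
    size: int
    gdg_cardinality: int   # number of clusters hosting replicas
    ldg_cardinality: int   # number of replicas inside each cluster

    def total_replicas(self):
        return self.gdg_cardinality * self.ldg_cardinality

req = AllocationRequest(size=1024, gdg_cardinality=3, ldg_cardinality=2)
```

Under the desirable scheme, the client would instead state criticality and access needs, and a monitoring module would derive these two numbers.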
Application Needs
- Access model
  - Data grain?
  - Access patterns
    - Multiple readers?
    - Locks shared across multiple clusters?
- Data criticality
  - Are there different levels of criticality?
- What kind of advice can the application give concerning those two aspects?
- Duration of the application?
- Traces: latency, crashes, message losses?