Replicated Databases
Reading
Textbook: Ch. 13
Farkas CSCE 824 - Spring 2011
Review
Centralized DBMS
Distributed DBMS
– Data fragmentation and allocation
  Top-down design
  Bottom-up design
– Transaction processing
  Serializability theorem
  Locking protocols
  Reliability
Replicated Databases
Multiple copies of the same data items (databases)
Consistency:
– Local consistency
– Mutual consistency
Why Replication?
System availability
Performance
Scalability
Application requirements
Risk of Replication
Worse performance: updates must be applied to all replicas and synchronized
Worse availability: some algorithms require multiple replicas to be operational for any of them to be used
Transaction Correctness
2-Phase Locking – serializability
2-Phase Commit – reliability
Replica control – mutual consistency
– Database design: local vs. global transactions
– Database consistency: strong consistency vs. weak consistency
– Location of updates: master vs. distributed
– Update propagation: eager vs. lazy
– Degree of transparency: limited vs. full
Mutual Consistency vs. Transaction Consistency
Transaction consistency: global serializability
Mutual consistency: replicas having the same values
– Strong: all replicas have the same value at the end of the execution of an update transaction
– Quorum: a quorum of replicas have the same value
– Weak: eventually the values of all replicas become identical
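The strong and quorum levels above can be checked on a snapshot of replica values. The following is a minimal sketch, not part of the lecture: the function names and the dictionary layout (replica name mapped to current value) are assumptions made for illustration. Weak consistency is a statement about eventual convergence, so it cannot be decided from a single snapshot and is not modeled here.

```python
from collections import Counter

def strongly_consistent(replicas):
    """Strong: every replica holds the same value."""
    return len(set(replicas.values())) == 1

def quorum_consistent(replicas, quorum_size):
    """Quorum: at least quorum_size replicas agree on one value."""
    largest_group = Counter(replicas.values()).most_common(1)[0][1]
    return largest_group >= quorum_size

# Hypothetical snapshot: x3 has not yet received the latest update.
snapshot = {"x1": 10, "x2": 10, "x3": 7}
print(strongly_consistent(snapshot))    # False
print(quorum_consistent(snapshot, 2))   # True
```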
Replica Control
Hides replication from transaction
Knows location of all replicas
Translates transaction’s request to access an item into request to access particular replica(s)
Maintains some form of mutual consistency
One-Copy Serializability (1SR)
Extension of the serializability theory
Effects of transactions on replicated data items should be the same as if they had been performed one at a time on a single set of data items
Example Replication
[Figure: a transaction accessing replicas x1, x2, x3]
Issues
– May reduce performance (complex operations)
– Too expensive
– Can’t control when replicas are updated
Replica Control
Pessimistic replica control: at most one group can make an update – mutual consistency at all times
Optimistic replica control: system must be available at all times; any violation of mutual consistency is corrected afterwards
Read One / Write All Replica Control
Pessimistic approach
Read the nearest replica
Write all replicas
– Synchronous case: before the transaction commits
– Asynchronous case: eventually
Advantages:
– Mutual consistency
– Performance benefits for read transactions
Disadvantage: availability is not always guaranteed
E.g., Primary site approach
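A minimal sketch of the read-one/write-all idea, covering the synchronous case only. The class names, the `distance` field used to pick the "nearest" replica, and the `up` flag are all hypothetical; locking and commit protocols are left out.

```python
class Replica:
    def __init__(self, name, distance):
        self.name = name
        self.distance = distance  # hypothetical network distance from the client
        self.up = True            # whether the site is operational
        self.data = {}

class ReadOneWriteAll:
    def __init__(self, replicas):
        self.replicas = list(replicas)

    def read(self, key):
        # Read one: use the nearest replica that is operational.
        live = [r for r in self.replicas if r.up]
        if not live:
            raise RuntimeError("no replica available for reading")
        return min(live, key=lambda r: r.distance).data.get(key)

    def write(self, key, value):
        # Write all (synchronous case): reject the update unless
        # every replica is operational, so replicas never diverge.
        if not all(r.up for r in self.replicas):
            raise RuntimeError("write rejected: not all replicas are operational")
        for r in self.replicas:
            r.data[key] = value
```

With three replicas, a single failed site still allows reads but blocks every write, which is the availability disadvantage noted above.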
Primary Site – static
Primary site: most recent copy
What happens if the network is partitioned?
[Figure: sites DB0 through DB6, one of them marked Primary; the network is split into partitions 1 and 2]
Majority Approach
The group that contains the majority of the sites can process an update
[Figure: sites DB0 through DB6 split into partitions 1 and 2; majority threshold: (N+1)/2 sites]
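A small sketch of the majority test. The threshold (N+1)/2 from the figure is read here as "strictly more than half of the N sites", which for odd N is the same number and for even N rules out ties; that reading and the function names are assumptions.

```python
def has_majority(partition_size, total_sites):
    """True if a partition holds strictly more than half of all sites."""
    return partition_size >= total_sites // 2 + 1

def partitions_allowed_to_update(partition_sizes):
    """Return the indexes of partitions that may process an update."""
    total = sum(partition_sizes)
    return [i for i, size in enumerate(partition_sizes)
            if has_majority(size, total)]

# Seven sites split 4 / 3: only the first group may update.
print(partitions_allowed_to_update([4, 3]))     # [0]
# Seven sites split 3 / 2 / 2: no group has a majority.
print(partitions_allowed_to_update([3, 2, 2]))  # []
```

Because two disjoint groups cannot both hold more than half of the sites, at most one partition is ever returned.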
Majority Approach
Advantages: more flexible than primary site
Disadvantages: zero availability may still happen
Who has the most recent copy?
– Version number:
  Each site assigns a version number to the copy (initially VN=0)
  After an update, the VN is incremented by 1
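A minimal sketch of version numbers. The slide only says the VN starts at 0 and is incremented by 1 after an update; taking the new VN as one more than the highest VN found in the updating group is an assumption, as are the function names and the `site -> (VN, value)` layout.

```python
def most_recent(copies):
    """copies maps site -> (version_number, value); return the newest copy."""
    site = max(copies, key=lambda s: copies[s][0])
    vn, value = copies[site]
    return site, vn, value

def apply_update(copies, group, new_value):
    """Install new_value at every site in group with an incremented VN.

    Assumption: the new VN is one more than the highest VN in the group.
    """
    new_vn = max(copies[s][0] for s in group) + 1
    for s in group:
        copies[s] = (new_vn, new_value)

copies = {s: (0, "v0") for s in "ABCDE"}
apply_update(copies, ["A", "B", "C"], "v1")   # majority group 1
apply_update(copies, ["C", "D", "E"], "v2")   # majority group 2 shares site C
print(most_recent(copies)[1:])                # (2, 'v2')
```

Since any two majority groups share at least one site, every majority group contains a copy carrying the highest VN.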
Quorum Consensus
Sites are not equal: each site is assigned a weight (W)
Generalizes the majority approach (majority is the special case in which all weights are equal)
[Figure: sites DB0 through DB6, each assigned a weight; weights shown: W=5, W=2, W=15, W=1, W=1, W=1, W=3]
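A sketch of a weighted quorum test using the seven weights from the figure. Which weight belongs to which site cannot be recovered from the transcript, so the site-to-weight mapping below is hypothetical. The rule "a group may update if it holds more than half of the total weight" is also an assumption; separate read and write quorums are not modeled.

```python
def has_weighted_quorum(group, weights):
    """True if the sites in group hold more than half of the total weight."""
    total = sum(weights.values())
    return 2 * sum(weights[s] for s in group) > total

# Hypothetical assignment of the figure's weights to sites (total = 28).
weights = {"DB0": 5, "DB1": 2, "DB2": 15, "DB3": 1,
           "DB4": 1, "DB5": 1, "DB6": 3}

print(has_weighted_quorum(["DB2"], weights))   # True: 15 > 14
others = [s for s in weights if s != "DB2"]
print(has_weighted_quorum(others, weights))    # False: 13 < 14
```

With these weights, the site weighted 15 can update on its own, while the other six sites together cannot.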
Other Approaches
Dynamic Linear: order sites linearly to calculate majority
Token-based primary site (moving token): change the location of the primary site
Pessimistic Replica Control
Advantages:
– Mutual consistency at all times
– Know the latest version (between two consecutive updates, there is a site in common)
Disadvantage:
– May result in zero availability
Optimistic Replica Control
Goal: availability at all times
Issues: consistency may not be guaranteed
– Need an algorithm to detect whether an inconsistency occurred
– Take actions to fix any inconsistencies
Example Optimistic Alg.
Two partitions P1, P2
Assumption: separately, P1 and P2 produce serializable histories
Need: after P1 and P2 join again, detect which transactions violate global serializability
Example cont.
Items read by transaction T: read(T)
Items written by transaction T: write(T)
Assume: $write(T) \subseteq read(T)$
Transactions in P1: $T_{1i}$, in P2: $T_{2i}$
Example cont.
Precedence graph: G
– Nodes: $\{T_{11}, \ldots, T_{1n}, T_{21}, \ldots, T_{2m}\}$
– Edges:
  Dependency edge (ripple effect): there is an edge $T_{ij} \rightarrow T_{ik}$ if $j < k$ and there is a data item d s.t. $d \in write(T_{ij}) \cap read(T_{ik})$ and there is no l s.t. $j < l < k$ and d is in the write set of $T_{il}$ (to consider dirty read within the same partition)
Example cont.
Precedence edges: there is an edge $T_{ij} \rightarrow T_{ik}$ if $j < k$ and there is a data item d s.t. $d \in read(T_{ij}) \cap write(T_{ik})$ and there is no l s.t. $j < l < k$ and d is in the write set of $T_{il}$ (to consider the first transaction to write a data item after a read within the same partition)
Example cont.
Interference edges: there is an edge $T_{1i} \rightarrow T_{2j}$ if there is a data item d s.t. $d \in read(T_{1i}) \cap write(T_{2j})$, or vice versa (to consider when a transaction in one partition reads an item that is written by a transaction in the other partition)
Example cont.
Theorem: The combined histories are correct iff the precedence graph is acyclic
Correct inconsistencies: remove (undo) transactions that make the graph cyclic
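The graph construction and the acyclicity test can be sketched as follows. This is an illustrative reading of the three edge definitions, not the lecture's own code: each partition history is assumed to be an ordered list of `(name, read_set, write_set)` tuples with unique names, and all function names are hypothetical. Choosing which transactions to undo is not shown.

```python
def local_edges(history):
    """Dependency and precedence edges within one partition."""
    edges = set()
    for j, (tj, rj, wj) in enumerate(history):
        for k in range(j + 1, len(history)):
            tk, rk, wk = history[k]
            # Items written by some transaction strictly between j and k.
            between = set().union(*(w for _, _, w in history[j + 1:k]))
            dependency = (wj & rk) - between   # T_ik reads what T_ij wrote
            precedence = (rj & wk) - between   # T_ik overwrites what T_ij read
            if dependency or precedence:
                edges.add((tj, tk))
    return edges

def interference_edges(h1, h2):
    """Edges between partitions: the reader precedes the other side's writer."""
    edges = set()
    for a, ra, wa in h1:
        for b, rb, wb in h2:
            if ra & wb:
                edges.add((a, b))
            if rb & wa:
                edges.add((b, a))
    return edges

def has_cycle(nodes, edges):
    """Depth-first search for a cycle in a directed graph."""
    succ = {n: [] for n in nodes}
    for u, v in edges:
        succ[u].append(v)
    state = {n: 0 for n in nodes}  # 0 = unvisited, 1 = on stack, 2 = done

    def visit(n):
        state[n] = 1
        for m in succ[n]:
            if state[m] == 1 or (state[m] == 0 and visit(m)):
                return True
        state[n] = 2
        return False

    return any(state[n] == 0 and visit(n) for n in nodes)

def combined_history_is_correct(h1, h2):
    nodes = [name for name, _, _ in h1 + h2]
    edges = local_edges(h1) | local_edges(h2) | interference_edges(h1, h2)
    return not has_cycle(nodes, edges)

# Both partitions read and update x independently: the graph has a cycle.
p1 = [("T11", {"x"}, {"x"})]
p2 = [("T21", {"x"}, {"x"})]
print(combined_history_is_correct(p1, p2))   # False
```

If T21 instead reads x and y but writes only y, the single interference edge runs from T21 to T11 and the combined history is accepted, with T21 serialized before T11.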
Summary
Correctness: If the transactions are ACID, local execution is serializable, distributed transactions are reliable, and update replication is synchronous, then distributed transactions are globally atomic & serializable
Performance:
– Applications: transactions are not always serializable (e.g., WS-transactions)
– Replication: update propagation is not always synchronous
Compensating transactions
Next Class
Review distributed databases
– Design
– Concurrency control
– Reliability
– Replication