
More About Transaction Management

Chapter 10

Contents

• Transactions that Read Uncommitted Data

• View Serializability

• Resolving Deadlocks

• Distributed Databases

• Long-Duration Transactions

The Dirty-Data Problem

Data is “dirty” if it has been written by a transaction that has not yet committed.

Example 1: The value of A written by T1 is dirty data; when T2 reads it, the database is left in an inconsistent state.

T1                      T2                      A     B

                                                25    25
l1(A); r1(A);
A := A+100;
w1(A); l1(B); u1(A);                            125
                        l2(A); r2(A);
                        A := A*2;
                        w2(A);                  250
                        l2(B); Denied
r1(B);
Abort; u1(B);
                        l2(B); u2(A); r2(B);
                        B := B*2;
                        w2(B); u2(B);                 50

T1 writes dirty data and then aborts.

Example 2:

T1        T2        T3        A         B         C

200       150       175       RT=0      RT=0      RT=0
                              WT=0      WT=0      WT=0
          w2(B)                         WT=150
r1(B)
          r2(A)               RT=150
                    r3(C)                         RT=175
          w2(C)
          Abort                         WT=0
                    w3(A)     WT=175

T1 has read dirty data from T2 and must abort when T2 does.

Cascading Rollback

• When transaction T aborts, we must find each transaction U that read dirty data from T, abort U, find any transaction V that read dirty data from U, abort V, and so on

• Both a timestamp-based scheduler with a commit bit and a validation-based scheduler avoid cascading rollback
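The search for transactions U, V, and so on is a graph traversal. A minimal sketch, assuming the scheduler already records which transactions read dirty data from which writer (the `dirty_reads` mapping and the function name are hypothetical, not part of any particular DBMS):

```python
from collections import deque

def cascading_aborts(read_from, aborted):
    """Return every transaction that must abort once `aborted` aborts.

    read_from maps a writer to the set of transactions that read its dirty data.
    """
    to_abort = {aborted}
    queue = deque([aborted])
    while queue:
        writer = queue.popleft()
        for reader in read_from.get(writer, ()):
            if reader not in to_abort:
                to_abort.add(reader)
                queue.append(reader)
    return to_abort

# Hypothetical dependencies: U read dirty data from T, V read dirty data from U.
dirty_reads = {"T": {"U"}, "U": {"V"}, "W": {"X"}}
print(sorted(cascading_aborts(dirty_reads, "T")))  # ['T', 'U', 'V']
```

W and X are untouched because neither read anything written by T.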

Managing Rollbacks

• Strict Locking: A transaction must not release any write locks (or other locks, such as increment locks, that allow values to be changed) until the transaction has either committed or aborted, and the commit or abort log record has been flushed to disk

• A schedule of transactions that obey the strict locking rule is called a strict schedule; every strict schedule is recoverable and avoids cascading rollback

View Serializability

Conflict equivalent View equivalent

Conflict serializable View serializable

Motivating example

Schedule Q

T1          T2          T3
Read(A)
            Write(A)
Write(A)
                        Write(A)

Same as

Q = r1(A) w2(A) w1(A) w3(A)

P(Q): T1 → T2, T2 → T1, T1 → T3, T2 → T3

The arcs between T1 and T2 form a cycle. Not conflict serializable!
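The precedence-graph test can be checked mechanically. The following is a small sketch, assuming a schedule is encoded as a list of `(transaction, "r" or "w", element)` tuples (an encoding chosen here for illustration):

```python
def precedence_graph(schedule):
    """Arcs Ti -> Tj for each pair of conflicting actions with Ti's action first."""
    arcs = set()
    for pos, (ti, op_i, x_i) in enumerate(schedule):
        for tj, op_j, x_j in schedule[pos + 1:]:
            # Two actions conflict if they come from different transactions,
            # touch the same element, and at least one of them is a write.
            if ti != tj and x_i == x_j and "w" in (op_i, op_j):
                arcs.add((ti, tj))
    return arcs

def has_cycle(nodes, arcs):
    """Repeatedly strip nodes with no incoming arc; a leftover means a cycle."""
    remaining = set(nodes)
    while remaining:
        sources = {n for n in remaining
                   if not any(v == n and u in remaining for u, v in arcs)}
        if not sources:
            return True
        remaining -= sources
    return False

Q = [(1, "r", "A"), (2, "w", "A"), (1, "w", "A"), (3, "w", "A")]
arcs = precedence_graph(Q)
print(sorted(arcs))              # [(1, 2), (1, 3), (2, 1), (2, 3)]
print(has_cycle({1, 2, 3}, arcs))  # True
```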

But now compare Q to Ss, a serial schedule:

Q:
T1          T2          T3
Read(A)
            Write(A)
Write(A)
                        Write(A)

Ss:
T1          T2          T3
Read(A)
Write(A)
            Write(A)
                        Write(A)

• T1 reads same thing in Q, Ss

• T2, T3 read same thing (nothing?)

• After Q or Ss, DB is left in the same state

So what is wrong with Q?

Definition: Schedules S1, S2 are View Equivalent if:

(1) If in S1: wj(A) ⇒ ri(A), then in S2: wj(A) ⇒ ri(A)

(2) If in S1: ri(A) reads initial DB value, then in S2: ri(A) also reads initial DB value

(3) If in S1: Ti does last write on A, then in S2: Ti also does last write on A

Here wj(A) ⇒ ri(A) means “ri(A) reads the value produced by wj(A)”

Definition

Schedule S1 is View Serializable if it is view equivalent to some serial schedule
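For very small schedules the definition can be applied directly by trying every serial order. This is only a brute-force sketch (factorial in the number of transactions), with schedules encoded as lists of `(transaction, "r" or "w", element)` tuples; the encoding and function names are assumptions made for illustration:

```python
from itertools import permutations

def view_profile(schedule):
    """Return (who each read reads from, who writes each element last)."""
    last_writer = {}   # element -> transaction of the most recent write
    reads_from = {}    # (transaction, index of its read) -> writer or None
    read_count = {}
    for txn, op, elem in schedule:
        if op == "r":
            k = read_count.get(txn, 0)
            read_count[txn] = k + 1
            # None stands for the initial DB value (condition 2).
            reads_from[(txn, k)] = last_writer.get(elem)
        else:
            last_writer[elem] = txn
    return reads_from, last_writer

def view_serial_order(schedule):
    """Return a serial order that is view equivalent to schedule, or None."""
    txns = sorted({t for t, _, _ in schedule})
    target = view_profile(schedule)
    for order in permutations(txns):
        serial = [a for t in order for a in schedule if a[0] == t]
        if view_profile(serial) == target:
            return order
    return None

Q = [(1, "r", "A"), (2, "w", "A"), (1, "w", "A"), (3, "w", "A")]
print(view_serial_order(Q))  # (1, 2, 3)
```

Dropping w3(A) from Q makes the function return None, since w1(A) is then the last write and no serial order preserves both that and T1's read of the initial value.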

View Serializable ⇔ Conflict Serializable ?

• View Serializable ⇏ Conflict Serializable

e.g., See Schedule Q

• Conflict Serializable ⇒ View Serializable ?

Lemma

Conflict Serializable ⇒ View Serializable

Proof:

Swapping non-conflicting actions does not change what transactions read nor final DB state

Venn Diagram

[Figure: nested sets. Conflict Serializable ⊂ View Serializable ⊂ All schedules]

Note: All view serializable schedules that are not conflict serializable involve a useless write

S = … w2(A) … w3(A) … (no reads of A in between)

How do we test for view-serializability?

P(S) not good enough… (see schedule Q)

• One problem: some swaps involving conflicting actions are OK… e.g.:

S = … w2(A) … r1(A) … w3(A) … w4(A)

Here w3(A) can move, provided the later write w4(A) exists

• Another problem: useless writes

S = … w2(A) … w1(A) … (no reads of A)

To check if S is View Serializable:

(1) Add final transaction Tf that reads all DB (eliminates condition 3 of V-S definition)

E.g.: S = … w1(A) … w2(A) … rf(A)

Here w2(A) is the last write of A and rf(A) is the added action

(2) Add initial transaction Tb that writes all DB (eliminates condition 2 of V-S definition)

E.g.: S = wb(A) … r1(A) … w2(A) …

Here wb(A) is the added action

(3) Create labeled precedence graph of S:

(3a) If wi(A) ⇒ rj(A) in S, add arc Ti → Tj with label 0

(3b) For each wi(A) ⇒ rj(A), consider each wk(A) with Tk ≠ Tb:

- If Ti ≠ Tb and Tj ≠ Tf, then insert the pair of arcs Tk → Ti and Tj → Tk, both labeled with some new p

- If Ti = Tb and Tj ≠ Tf, then insert Tj → Tk with label 0

- If Ti ≠ Tb and Tj = Tf, then insert Tk → Ti with label 0

(4) Check if there is some selection of one arc from each arc pair that turns LP(S) into an “acyclic” graph (if so, S is V-S)

Example: check if Q is V-S:

Q = r1(A) w2(A) w1(A) w3(A)

Q’ = wb(A) r1(A) w2(A) w1(A) w3(A) rf(A)

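[Figure: LP(Q’) over nodes Tb, T1, T2, T3, Tf; every arc has label 0]

Rule 3(a): Tb → T1, T3 → Tf

Rule 3(b), from wb(A) ⇒ r1(A): T1 → T2, T1 → T3

Rule 3(b), from w3(A) ⇒ rf(A): T1 → T3, T2 → T3

LP(Q’) is acyclic, so Q is V-S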

Another example:

Z=wb(A) r1(A) w2(A) r3(A) w1(A) w3(A) rf(A)

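[Figure: LP(Z) over nodes Tb, T1, T2, T3, Tf]

Rule 3(a), label 0: Tb → T1, T2 → T3, T3 → Tf

Rule 3(b), from wb(A) ⇒ r1(A), label 0: T1 → T2, T1 → T3

Rule 3(b), from w2(A) ⇒ r3(A), arc pair with label 1: T1 → T2 and T3 → T1

Rule 3(b), from w3(A) ⇒ rf(A), label 0: T1 → T3, T2 → T3

Do not pick T3 → T1 from the “1” pair, since it would close a cycle with T1 → T3

LP(Z) acyclic, so Z is V-S (equivalent to Tb T1 T2 T3 Tf)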
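The labeled-precedence-graph test can be sketched in code as follows. This is an illustrative implementation under stated assumptions: schedules are lists of `(transaction, "r" or "w", element)` tuples, `"b"` and `"f"` stand for Tb and Tf, and every combination of arc-pair choices is tried, which is exponential in the number of pairs.

```python
from itertools import product

def labeled_graph(schedule):
    """Return (arcs labeled 0, list of arc pairs) following rules 3a and 3b."""
    elems = sorted({x for _, _, x in schedule})
    # Steps (1) and (2): Tb writes every element first, Tf reads them last.
    full = ([("b", "w", x) for x in elems] + list(schedule)
            + [("f", "r", x) for x in elems])
    last, sources = {}, []
    for t, op, x in full:
        if op == "w":
            last[x] = t
        elif last[x] != t:                 # skip a read of one's own write
            sources.append((last[x], t, x))
    writers = {(t, x) for t, op, x in full if op == "w"}
    fixed, pairs = set(), []
    for i, j, x in sources:
        fixed.add((i, j))                  # rule 3a
        for k, y in writers:
            if y != x or k in (i, j, "b"):
                continue
            if i == "b":
                fixed.add((j, k))          # Tk must follow the reader Tj
            elif j == "f":
                fixed.add((k, i))          # Tk must precede the last writer Ti
            else:
                pairs.append(((k, i), (j, k)))  # choose one arc of the pair
    return fixed, pairs

def acyclic(nodes, arcs):
    """Strip nodes with no incoming arc until none are left or we get stuck."""
    remaining = set(nodes)
    while remaining:
        sources = {n for n in remaining
                   if not any(v == n and u in remaining for u, v in arcs)}
        if not sources:
            return False
        remaining -= sources
    return True

def is_view_serializable(schedule):
    fixed, pairs = labeled_graph(schedule)
    nodes = {t for t, _, _ in schedule} | {"b", "f"}
    # Step (4): try every way of picking one arc from each pair.
    return any(acyclic(nodes, fixed | set(choice)) for choice in product(*pairs))

Q = [(1, "r", "A"), (2, "w", "A"), (1, "w", "A"), (3, "w", "A")]
Z = [(1, "r", "A"), (2, "w", "A"), (3, "r", "A"), (1, "w", "A"), (3, "w", "A")]
print(is_view_serializable(Q), is_view_serializable(Z))  # True True
```

For Z the only arc pair is (T1 → T2, T3 → T1), matching the pair labeled 1 in the example; choosing T1 → T2 leaves the graph acyclic.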

Deadlocks

• Detection
  – Wait-for graph

• Prevention
  – Resource ordering
  – Timeout
  – Wait-die
  – Wound-wait

Deadlock Detection

• Build Wait-For graph

• Use lock table structures

• Build incrementally or periodically

• When a cycle is found, roll back a victim

[Figure: wait-for graph over transactions T1 through T7]

The Waits-For Graph

In the waits-for graph there is an arc from node (transaction) T to node U if there is some database element A such that

– U holds a lock on A,
– T is waiting for a lock on A, and
– T cannot get a lock on A in its desired mode unless U first releases its lock on A

• If there are no cycles in the waits-for graph, then each transaction can eventually complete

• If there is a cycle, then no transaction in the cycle can ever make progress, so there is a deadlock

     T1               T2               T3               T4

1)   l1(A); r1(A)
2)                    l2(C); r2(C)
3)                                     l3(B); r3(B)
4)                                                      l4(D); r4(D)
5)                    l2(A); Denied
6)                                     l3(C); Denied
7)                                                      l4(A); Denied
8)   l1(B); Denied

Beginning of a schedule with a deadlock

[Figure: arcs T2 → T1, T3 → T2, T4 → T1, T1 → T3]

Waits-for graph with a cycle caused by step (8)

[Figure: remaining arc T3 → T2]

Waits-for graph after T1 is rolled back
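Detecting the deadlock amounts to finding a cycle in the waits-for graph. The sketch below uses a depth-first search over a dictionary mapping each waiting transaction to the transactions it waits for; the dictionary encoding and the function name are assumptions for illustration, and the arcs are read off the denied requests in steps (5) through (8).

```python
def find_cycle(waits_for):
    """Return a list of transactions forming a cycle, or None if there is none."""
    visited, path = set(), []

    def dfs(t):
        if t in path:                      # back on the current path: a cycle
            return path[path.index(t):]
        if t in visited:
            return None
        visited.add(t)
        path.append(t)
        for u in waits_for.get(t, ()):
            cycle = dfs(u)
            if cycle:
                return cycle
        path.pop()
        return None

    for t in waits_for:
        cycle = dfs(t)
        if cycle:
            return cycle
    return None

# Step 5: T2 waits for T1; step 6: T3 waits for T2;
# step 7: T4 waits for T1; step 8: T1 waits for T3.
graph = {"T2": ["T1"], "T3": ["T2"], "T4": ["T1"], "T1": ["T3"]}
print(find_cycle(graph))               # ['T2', 'T1', 'T3']

# After T1 is rolled back, only T3 -> T2 remains.
print(find_cycle({"T3": ["T2"]}))      # None
```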

Deadlock Prevention By Resource Ordering

• Order all elements A1, A2, …, An

• Every transaction is required to request locks on elements in order.

Problem: Ordered lock requests are not realistic in most cases

Timeout

• If a transaction waits more than L sec., roll it back!

• Simple scheme

• Hard to select L

Wait-die

• Transactions are given a timestamp when they arrive: ts(Ti)

• Ti can only wait for Tj if ts(Ti) < ts(Tj); otherwise Ti dies (is rolled back)

Example: T1 (ts = 10), T2 (ts = 20), T3 (ts = 25)

[Figure: two arcs labeled “wait” and a third labeled “wait?”]
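The wait-die rule reduces to a single timestamp comparison. A minimal sketch, assuming a smaller timestamp means an older transaction (the function name is hypothetical):

```python
def wait_die(ts_requester, ts_holder):
    """Decide what the requesting transaction does under wait-die."""
    # An older requester may wait for a younger holder; otherwise it dies.
    return "wait" if ts_requester < ts_holder else "die"

print(wait_die(10, 20))  # wait: ts 10 is older than ts 20
print(wait_die(25, 10))  # die: ts 25 is younger than ts 10
```

Since every waiting transaction is older than the one it waits for, a cycle of waits cannot form.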

Wound-wait

• Transactions are given a timestamp when they arrive: ts(Ti)

• Ti wounds Tj if ts(Ti) < ts(Tj); otherwise Ti waits

“Wound”: Tj rolls back and gives its lock to Ti

Example: T1 (ts = 25), T2 (ts = 20), T3 (ts = 10)

[Figure: three arcs labeled “wait”]
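Wound-wait is the mirror image of wait-die and again needs only one comparison. A minimal sketch, assuming a smaller timestamp means an older transaction (the function name is hypothetical):

```python
def wound_wait(ts_requester, ts_holder):
    """Decide the outcome when a requester finds the lock held under wound-wait."""
    # An older requester wounds (rolls back) the younger holder;
    # a younger requester waits for the older holder.
    return "wound holder" if ts_requester < ts_holder else "wait"

print(wound_wait(10, 20))  # wound holder
print(wound_wait(25, 20))  # wait
```

Here every waiting transaction is younger than the one it waits for, so again no cycle of waits can form.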

Comparison of Deadlock Management

• Both wound-wait and wait-die are easier to implement than the waits-for graph method.

• The waits-for graph method aborts transactions only when there is a deadlock. However, either wound-wait or wait-die will sometimes roll back a transaction when there was no deadlock.

Distributed Databases

[Figure: Distributed Database System with four sites, each with its own DBMS and data]

Advantages of a DDBS

• Speedy Queries by Parallelism

• Fault Tolerance by Data Replication

Cost: increased complexity and communication cost

Data Distribution

• A bank with many branches

• A chain store with many individual stores

• A digital library with a consortium of universities

Partitioning a relation among many sites

• Horizontal Decomposition

• Vertical Decomposition

Parallelism: Pipelining

• Example:

– T1 ← SELECT * FROM A WHERE cond

– T2 ← JOIN T1 and B

[Figure: a select on A feeds its output to a join with B (with index)]

Parallelism: Concurrent Operations

• Example: SELECT * FROM A WHERE cond

[Figure: A is split into three fragments (A.x < 10, 10 ≤ A.x < 20, 20 ≤ A.x); a select runs on each fragment and the results are merged]

Data location is important...
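The select-then-merge pattern can be sketched as below. Threads stand in for the separate sites here, and the fragment contents and the condition are made up for illustration; in a real distributed system each select would run at the site holding the fragment.

```python
from concurrent.futures import ThreadPoolExecutor

def select(fragment, cond):
    """Apply the selection condition to one fragment of A."""
    return [row for row in fragment if cond(row)]

def parallel_select(fragments, cond):
    """Run select on every fragment concurrently, then merge the results."""
    with ThreadPoolExecutor() as pool:
        parts = list(pool.map(lambda frag: select(frag, cond), fragments))
    return [row for part in parts for row in part]

# Hypothetical fragments of A, range-partitioned on x.
fragments = [
    [{"x": 3, "y": 1}, {"x": 7, "y": 0}],    # A.x < 10
    [{"x": 12, "y": 1}],                     # 10 <= A.x < 20
    [{"x": 25, "y": 1}, {"x": 40, "y": 0}],  # 20 <= A.x
]
result = parallel_select(fragments, lambda row: row["y"] == 1)
print([row["x"] for row in result])  # [3, 12, 25]
```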

Join Processing

• Example: JOIN A, B over attribute X

[Figure: fragments A1 (A.x < 10), A2 (A.x ≥ 10), B1 (B.x < 10), B2 (B.x ≥ 10); which join strategy?]

Join Processing

• Example: JOIN A, B over attribute X

[Figure: fragments A1 (A.z < 10), A2 (A.z ≥ 10), B1 (B.z < 10), B2 (B.z ≥ 10); which join strategy?]

Data Replication

• Fault Tolerance

• Query Speedup

Some Problems

1. How to keep copies identical
2. How to place copies properly
3. How to handle communication failure

Distributed Transactions

• A transaction has components at different sites

• Each site has its own local scheduler and logger

Distributed Commit -- Executing Atomically

[Figure: component T0 at the Office exchanges messages and reports with components T1, …, Ti, …, Tn at Store 1, …, Store i, …, Store n]

Two-Phase Commit

• Phase One: A coordinator component polls the components whether to commit or abort.

• Phase Two: The coordinator tells the components to commit if and only if all have expressed a willingness to commit.

• Each site logs actions at that site but there is no global log.

• One site, called the coordinator, plays a special role in deciding whether or not the distributed transaction can commit.

• The two-phase commit protocol involves sending certain messages between the coordinator and the other sites. As each message is sent, it is logged at the sending site, to aid in recovery should it be necessary.
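The two phases can be sketched as a small simulation. This is a simplified, failure-free model: the `Site` class, its `willing` flag, and the in-memory logs are assumptions for illustration, and real sites would flush each log record to disk before sending the corresponding message.

```python
class Site:
    """One participating site with its own local log (there is no global log)."""
    def __init__(self, name, willing):
        self.name, self.willing, self.log = name, willing, []

    def prepare(self):
        # Phase 1 reply: log the vote, then send it to the coordinator.
        self.log.append("<Ready T>" if self.willing else "<Don't commit T>")
        return self.willing

    def finish(self, decision):
        # Phase 2: record the coordinator's decision locally.
        self.log.append(f"<{decision} T>")

def two_phase_commit(coordinator_log, sites):
    coordinator_log.append("<Prepare T>")
    votes = [site.prepare() for site in sites]           # phase 1: poll
    decision = "Commit" if all(votes) else "Abort"       # commit iff all ready
    coordinator_log.append(f"<{decision} T>")
    for site in sites:                                   # phase 2: broadcast
        site.finish(decision)
    return decision

log = []
sites = [Site("store1", True), Site("store2", False)]
print(two_phase_commit(log, sites))  # Abort
print(log)                           # ['<Prepare T>', '<Abort T>']
```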

An Example

2PC: ATM Withdrawal

[Figure: an ATM connected to the bank mainframe]

• Mainframe is coordinator

• Phase 1: ATM checks if money available; mainframe checks if account has funds (money and funds are “reserved”)

• Phase 2: ATM releases funds; mainframe debits account

• Messages in phase 1 of two-phase commit

[Figure: the coordinator logs <Prepare T> and sends “prepare”; the site logs <Ready T> or <Don’t commit T> and replies “ready” or “don’t commit”]

• Messages in phase 2 of two-phase commit

[Figure: the coordinator logs <Commit T> or <Abort T> and sends “commit” or “abort”; the site logs <Commit T> or <Abort T>]

Recovery of Distributed Transaction

• Last log record <Commit T>

• Last log record <Abort T>

• Last log record <Don’t commit T>

• Last log record <Ready T>

• No log record about T

Coordinator Failure

• Wait for it to recover

• Elect a new coordinator and poll all the sites

(1) If some site has <Commit T>, commit T

(2) If some site has <Abort T>, abort T

(3) If no site has <Commit T> or <Abort T> and at least one site does not have <Ready T>, it is safe to abort T

(4) If there is no <Commit T> or <Abort T> but every surviving site has <Ready T>, must wait until the original coordinator recovers.
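The four polling rules translate into a short decision function. In this sketch each surviving site's log is modeled as a set of record strings about T; that representation and the function name are assumptions for illustration.

```python
def new_coordinator_decision(site_logs):
    """Apply rules (1) to (4) to the log records found at the surviving sites."""
    if any("<Commit T>" in log for log in site_logs):
        return "commit"                                   # rule (1)
    if any("<Abort T>" in log for log in site_logs):
        return "abort"                                    # rule (2)
    if any("<Ready T>" not in log for log in site_logs):
        return "abort"                                    # rule (3)
    return "wait for original coordinator"                # rule (4)

print(new_coordinator_decision([{"<Ready T>"}, {"<Ready T>", "<Commit T>"}]))
# commit
print(new_coordinator_decision([{"<Ready T>"}, set()]))
# abort
print(new_coordinator_decision([{"<Ready T>"}, {"<Ready T>"}]))
# wait for original coordinator
```

Rule (4) is the blocking case: the failed coordinator may already have decided either way, so the surviving sites cannot safely choose.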

Distributed Locking -- Executing Serializably

• Locking Replicated Elements

• Centralized Lock Systems

• Primary-Copy Locking

• Global Locks From Local Locks

A Cost Model for Distributed Locking Algorithms

• Assign one component of a transaction as the lock coordinator to gather all the locks it wants

• Lock data elements at its own site without messages.

• Lock data elements at another site with three messages each: requesting, granting and releasing.
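Under this cost model the message count depends only on how many of the needed elements are remote. A small sketch, where the mapping from elements to sites is a hypothetical placement:

```python
def lock_message_count(element_sites, coordinator_site):
    """Count the messages needed to lock and release every listed element.

    element_sites maps each element to the site where its lock is obtained.
    """
    remote = [e for e, site in element_sites.items() if site != coordinator_site]
    return 3 * len(remote)  # request + grant + release per remote element

# Hypothetical placement: A is local to site 1, B and C are at other sites.
placement = {"A": 1, "B": 2, "C": 3}
print(lock_message_count(placement, coordinator_site=1))  # 6
```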

Locking Replicated Elements

Global locks on an element must be obtained through locks on one or more replicas.

Centralized Lock Systems

• Designate one site , the lock site, to maintain a lock table for logical elements, whether or not they have copies at that site.

• When a transaction wants a lock on logical element X, it sends a request to the lock site, which grants or denies the lock, as appropriate.

Primary-Copy Locking

• Each logical element X has one of its copies designated the “primary copy”.

• A transaction sends a request to the site of the primary copy of X to get a lock on X.

• The site of the primary copy maintains an entry for X in its lock table and grants or denies the request as appropriate.

Global Locks From Local Locks

Suppose A has n copies.

1. s is the number of copies of A that must be locked in shared mode in order for a transaction to have a global shared lock on A.

2. x is the number of copies of A that must be locked in exclusive mode in order for a transaction to have a global exclusive lock on A.

If 2x > n, there can be only one global exclusive lock on A.

If s + x > n, there cannot be both a global shared and a global exclusive lock on A.

• Read-Locks-One; Write-Locks-All (s = 1, x = n)

A global read lock is obtained through a read lock on any copy, while a global write lock requires write locks on every copy.

• Majority Locking (s = x = ⌈(n+1)/2⌉)

Require a read- or write-lock on a majority of the replicas to obtain a global lock.
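Both schemes can be checked against the two inequalities 2x > n and s + x > n. A minimal sketch (function names are chosen here for illustration):

```python
def is_safe(n, s, x):
    """True if (s, x) rules out two global exclusive locks and a shared/exclusive mix."""
    return 2 * x > n and s + x > n

def read_locks_one_write_locks_all(n):
    return 1, n

def majority_locking(n):
    m = n // 2 + 1  # equals ceil((n + 1) / 2) for any integer n >= 1
    return m, m

for n in (4, 5):
    print(n, read_locks_one_write_locks_all(n), majority_locking(n))
# 4 (1, 4) (3, 3)
# 5 (1, 5) (3, 3)

print(is_safe(5, 2, 3))  # False: 2 + 3 is not greater than 5
```

With s = 2 and x = 3 on five copies, a reader could lock two copies while a writer locks the other three, so the scheme is rejected.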

Long-Duration Transactions

A long transaction is one that takes too long to be allowed to hold locks that another transaction needs.

Three Applications Involving Long Transactions

• Conventional DBMS Applications

• Design Systems

• Workflow Systems

Workflow diagram for a traveler requesting expense reimbursement

[Figure: actions A1 (create travel report), A2 (reserve money), A3 (dept. authorization), A4 (corporate approval), A5 (assistant approval) and A6 (write check), running from Start to Complete or Abort. Arc labels: available, not enough, give to assistant, approve, deny]

Sagas

A saga is a collection of actions that together form a long-duration “transaction”.

Concurrency control for sagas is managed by two facilities:

• Each action may be considered itself a (short) transaction, that when executed uses a conventional concurrency-control mechanism, such as locking.

• The overall transaction is managed through the mechanism of “compensating transactions”, which are inverses to the transactions at the nodes of the saga.

Compensating Transactions

• A compensating transaction undoes the effects of a transaction on the database state.

If a saga execution leads to the Abort node, then we roll back the saga by executing the compensating transactions for each executed action, in the reverse order of those actions.
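Rolling back a saga can be sketched as a list of executed actions that is unwound in reverse. This sketch assumes a linear saga in which each action is a callable returning False when the saga must go to Abort, and that a failing action leaves no effects of its own; the `step` helper and the action names are hypothetical.

```python
def run_saga(actions):
    """actions: list of (name, do, compensate). Returns 'completed' or 'aborted'."""
    executed = []
    for name, do, compensate in actions:
        if not do():
            # Abort: compensate every executed action, most recent first.
            for _, undo in reversed(executed):
                undo()
            return "aborted"
        executed.append((name, compensate))
    return "completed"

trace = []

def step(name, ok=True):
    """Build a hypothetical action that records itself in `trace`."""
    return (name,
            lambda: trace.append(name) or ok,
            lambda: trace.append("undo " + name))

saga = [step("A1"), step("A2"), step("A3", ok=False), step("A6")]
print(run_saga(saga))  # aborted
print(trace)           # ['A1', 'A2', 'A3', 'undo A2', 'undo A1']
```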

Exercises for Storage Management

• EX 2.2.1

• EX 2.2.2

• EX 2.6.7

• EX 3.2.2

• EX 3.3.4

• Ex 4.1.2

• Ex 4.3.1

• Ex 4.4.6

• Ex 5.2.7

• Ex 5.4.2

Exercises for Query Processing

• Ex 6.1.6 (a)(d)

• Ex 6.5.3

• Ex 6.6.2

• Ex 6.7.2

• Ex 6.8.1

• Ex 7.1.3

• Ex 7.4.1 (c) , (d), (e),

• Ex 7.5.1

• Ex 7.6.1

• Ex 7.7.1 (b), (c)

Exercises for Transaction Management

• Ex 8.2.7 (a), (e)

• Ex 8.3.3

• Ex 8.4.5 (c) , (d)

• Ex 9.2.1

• Ex 9.8.2 (b)

• Ex 9.9.1 (b) (c)

• Ex 10.1.2 (b) (c)

• Ex 10.2.1 (b) (c)

• Ex 10.3.1 (b) (c)

• Ex 10.6.2
