transactions and concurrency control fall 2007 himanshu bajpai hbajpa1@student.gsu.edu

Post on 19-Jan-2016


Transactions and Concurrency Control

Fall 2007

Himanshu Bajpai

hbajpa1@student.gsu.edu

Topics

Transaction Services

Serialization

Concurrency Control Protocols

The Concept of ‘Transaction’ and Transaction services[1]

A transaction is the basic logical unit of execution in an information system

A transaction is a sequence of operations that must be executed as a whole

Example: transfer $500 from Account A to Account B

ACCOUNT A: Fred Bloggs, $1000

ACCOUNT B: Fred Bloggs, $0

1. Debit A

2. Credit B


The database system must ensure that either both (1) and (2) happen or that neither happens. Otherwise the database becomes inconsistent.
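As a minimal sketch of this all-or-nothing requirement, the transfer can be wrapped in a single database transaction. The example assumes Python's standard sqlite3 module; the `account` table, the helper names and the `fail_after_debit` flag are hypothetical and exist only to simulate a failure between the debit and the credit.

```python
import sqlite3

def make_db():
    """Create an in-memory database with the two accounts from the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
    with conn:  # commit the initial balances
        conn.executemany("INSERT INTO account VALUES (?, ?)",
                         [("A", 1000), ("B", 0)])
    return conn

def transfer(conn, src, dst, amount, fail_after_debit=False):
    """Debit src and credit dst as one unit; return True if committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            if fail_after_debit:  # hypothetical failure injected for illustration
                raise RuntimeError("simulated crash between debit and credit")
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
        return True
    except RuntimeError:
        return False

def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account"))
```

If the simulated failure occurs after the debit, the rollback undoes the debit as well, so neither step is visible afterwards.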

Requirements for Database Consistency[3]

The simultaneous execution of many different application programs must be such that each transaction does not interfere with another transaction.

The concurrent execution of transactions must be such that each transaction appears to execute in isolation.

Desirable Properties of Transactions (ACID)

Atomicity: a transaction is an atomic unit of processing and it is either performed entirely or not at all

Consistency Preservation: a transaction's correct execution must take the database from one correct state to another

Isolation: the updates of a transaction must not be made visible to other transactions until it is committed (solves the temporary update problem)


Durability or Permanency: if a transaction changes the database and is committed, the changes must never be lost because of subsequent failure

Serializability: transactions are considered serializable if the effect of running them in an interleaved fashion is equivalent to running them serially in some order

Transaction as a Concurrency Unit[2]

Transactions must be synchronised correctly to guarantee database consistency

Initial balances:

ACCOUNT A: Fred Bloggs, $1000

ACCOUNT B: Fred Bloggs, $0

ACCOUNT C: Fred Bloggs, $200

T1: transfer $500 from A to B (1. Debit A, 2. Credit B)

T2: transfer $300 from B to C (1. Debit B, 2. Credit C)

T1 and T2 execute simultaneously.

Net Result:

ACCOUNT A: Fred Bloggs, $500

ACCOUNT B: Fred Bloggs, $200

ACCOUNT C: Fred Bloggs, $500

Serialization[4]

A mechanism controls concurrency among database transactions through the use of serial ordering relations. The ordering relations are computed dynamically in response to patterns of use. An embodiment of the present invention serializes a transaction that accesses a resource before a transaction that modifies the resource, even if the accessor starts after the modifier starts or commits after the modifier commits.


A method of concurrency control for a database transaction in a distributed database system stores an intended use of a database system resource by the database transaction in a serialization graph. A serialization ordering is asserted between the database transaction and other database transactions based on the intended use of the database system resource by the database transaction.


The serialization ordering is then communicated to any node in the distributed database system that needs to know it to perform concurrency control. Cycles in the serialization graph are detected based on the asserted serialization order. To break such cycles and ensure transaction serializability, a database transaction that is a member of a cycle in the serialization graph is identified.

Concurrency Control Protocols[4]

Concurrency in Transaction Execution

There is a need to ensure that concurrent transactions do not interfere with each other's operations

Most DBMS are multi-user systems

Transaction scheduling algorithms

Transaction Serializability

The effect on a database of any number of transactions executing in parallel must be the same as if they were executed one after another

The Need for Concurrency Control

The concurrent execution of transactions may lead, if uncontrolled, to problems such as an inconsistent database

Concurrency control techniques are used to ensure that multiple transactions submitted by various users do not interfere with one another in a way that produces incorrect results

Read and Write Operations of a Transaction

read_item(X): reads a database item named X into a program variable also named X. Execution of the command includes the following steps:

find the address of the disk block that contains item X

copy that disk block into a buffer in the main memory

copy item X from the buffer to the program variable named X

write_item(X): writes the value of program variable X into the database item named X. Execution of the command includes the following steps:

find the address of the disk block that contains item X

copy that disk block into a buffer in the main memory

copy item X from the program variable named X into its current location in the buffer

store the updated block in the buffer back to disk (this step updates the database on disk)
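The steps of read_item and write_item can be sketched as follows. This is only an illustration under stated assumptions: the `TinyStore` class, the `BLOCK_SIZE` constant and the dictionary-based "disk" and "buffer" are hypothetical stand-ins for real disk blocks and main-memory buffers, and the sketch skips the copy into the buffer when the block is already there.

```python
BLOCK_SIZE = 2  # hypothetical: number of items stored per disk block

class TinyStore:
    """Toy model of a database with disk blocks and a main-memory buffer."""

    def __init__(self, items):
        names = sorted(items)
        # Assign each item to a block number (its "disk address").
        self.block_of = {name: i // BLOCK_SIZE for i, name in enumerate(names)}
        self.disk = {}
        for name, value in items.items():
            self.disk.setdefault(self.block_of[name], {})[name] = value
        self.buffer = {}  # block number -> in-memory copy of that block

    def _fetch(self, name):
        block_no = self.block_of[name]  # find the address of the block
        if block_no not in self.buffer:  # assumption: reuse a buffered copy
            self.buffer[block_no] = dict(self.disk[block_no])  # copy block to buffer
        return block_no

    def read_item(self, name):
        block_no = self._fetch(name)
        return self.buffer[block_no][name]  # copy item from buffer to the caller

    def write_item(self, name, value):
        block_no = self._fetch(name)
        self.buffer[block_no][name] = value  # copy the value into the buffer
        self.disk[block_no] = dict(self.buffer[block_no])  # store block back to disk
```

A transaction step such as X := X - N then becomes `store.write_item("X", store.read_item("X") - n)`.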

Problems due to the Concurrent Execution of Transactions[5]

The Lost Update Problem

The Temporary Update (uncommitted dependency) Problem

The Incorrect Summary (inconsistent analysis) Problem

The Lost Update Problem

Two transactions accessing the same database item have their operations interleaved in a way that makes the database item incorrect.

Schedule (time runs downward):

T1: read_item(X);
T1: X := X - N;
T2: read_item(X);
T2: X := X + M;
T1: write_item(X);
T1: read_item(Y);
T2: write_item(X);
T1: Y := Y + N;
T1: write_item(Y);

After T2's write_item(X), item X has an incorrect value because its update from T1 is lost.

If transactions T1 and T2 are submitted at approximately the same time and their operations are interleaved as shown, the value of database item X will be incorrect: T2 reads the value of X before T1 changes it in the database, so the updated database value resulting from T1 is lost.
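The interleaving above can be replayed step by step. The following sketch uses a plain dictionary as the database and local variables as each transaction's program variables; the function names and the sample values are assumptions chosen for illustration.

```python
def lost_update(x, y, n, m):
    """Replay the interleaved schedule in which T1's update of X is lost."""
    db = {"X": x, "Y": y}
    t1_x = db["X"]   # T1: read_item(X)
    t1_x -= n        # T1: X := X - N
    t2_x = db["X"]   # T2: read_item(X), still sees the old value
    t2_x += m        # T2: X := X + M
    db["X"] = t1_x   # T1: write_item(X)
    t1_y = db["Y"]   # T1: read_item(Y)
    db["X"] = t2_x   # T2: write_item(X), overwrites T1's update
    t1_y += n        # T1: Y := Y + N
    db["Y"] = t1_y   # T1: write_item(Y)
    return db

def serial_t1_then_t2(x, y, n, m):
    """Run T1 to completion, then T2, for comparison."""
    db = {"X": x, "Y": y}
    db["X"] -= n     # T1 updates X
    db["Y"] += n     # T1 updates Y
    db["X"] += m     # T2 updates X
    return db
```

With X = 80, Y = 20, N = 5 and M = 4, the serial execution leaves X at 79, while the interleaved one leaves X at 84 because the subtraction of N is overwritten.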

The Temporary Update Problem

One transaction updates a database item and then, for some reason, fails. The updated item is accessed by another transaction before it is changed back to its original value.

Schedule (time runs downward):

T1: read_item(X);
T1: X := X - N;
T1: write_item(X);
T2: read_item(X);
T2: X := X - N;
T2: write_item(X);
T1: read_item(Y);

Transaction T1 fails and must change the value of X back to its old value; meanwhile T2 has read the "temporary" incorrect value of X.
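A small replay makes the problem concrete. In this sketch T2 is simplified to a single read, and the "before image" variable used to undo T1 is an assumption about how the rollback is carried out; names and values are illustrative only.

```python
def temporary_update(x, n):
    """Return (value seen by T2, value of X after T1 is rolled back)."""
    db = {"X": x}
    before_image = db["X"]   # old value kept so that T1 can be undone
    t1_x = db["X"]           # T1: read_item(X)
    t1_x -= n                # T1: X := X - N
    db["X"] = t1_x           # T1: write_item(X), not yet committed
    t2_seen = db["X"]        # T2: read_item(X) sees the uncommitted value
    db["X"] = before_image   # T1 fails; X is changed back to its old value
    return t2_seen, db["X"]
```

With X = 80 and N = 5, T2 has read 75 even though the database ends up holding 80 again.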

The Incorrect Summary Problem

One transaction is calculating an aggregate summary function on a number of records while other transactions are updating some of these records. The aggregate function may calculate some values before they are updated and others after.

Schedule (time runs downward):

T3: sum := 0;
T3: read_item(A);
T3: sum := sum + A;
T1: read_item(X);
T1: X := X - N;
T1: write_item(X);
T3: read_item(X);
T3: sum := sum + X;
T3: read_item(Y);
T3: sum := sum + Y;
T1: read_item(Y);
T1: Y := Y + N;
T1: write_item(Y);

T3 reads X after N is subtracted and reads Y before N is added, so the resulting summary is wrong (off by N).
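The same schedule can be replayed numerically. The function name and the sample values below are assumptions for illustration; the second return value is the total actually stored in the database once T1 has finished.

```python
def incorrect_summary(a, x, y, n):
    """Return (sum computed by T3, true total after T1 completes)."""
    db = {"A": a, "X": x, "Y": y}
    total = 0           # T3: sum := 0
    total += db["A"]    # T3: read_item(A); sum := sum + A
    db["X"] -= n        # T1: read_item(X); X := X - N; write_item(X)
    total += db["X"]    # T3: reads X after N was subtracted
    total += db["Y"]    # T3: reads Y before N is added
    db["Y"] += n        # T1: read_item(Y); Y := Y + N; write_item(Y)
    return total, sum(db.values())
```

With A = 10, X = 100, Y = 50 and N = 20, T3 reports 140 although the items add up to 160 both before and after the transfer.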

Schedules of Transactions[3]

A schedule S of n transactions is a sequential ordering of the operations of the n transactions.

A schedule maintains the order of operations within the individual transaction. It is subject to the constraint that for each transaction T participating in S, if operation i is performed in T before operation j, then operation i will be performed before operation j in S.

The serializability theory attempts to determine the 'correctness' of the schedules.

Serial, Nonserial and Serializable Schedules

A schedule S is serial if, for every transaction T participating in S all of T's operations are executed consecutively in the schedule; otherwise it is called nonserial.

A schedule S of n transactions is serializable if it is equivalent to some serial schedule of the same n transactions.

Example of Serial Schedules

Schedule A (time runs downward):

T1: read_item(X);
T1: X := X - N;
T1: write_item(X);
T1: read_item(Y);
T1: Y := Y + N;
T1: write_item(Y);
T2: read_item(X);
T2: X := X + M;
T2: write_item(X);

Example of Nonserial Schedules

Schedule A (time runs downward):

T1: read_item(X);
T1: X := X - N;
T2: read_item(X);
T2: X := X + M;
T1: write_item(X);
T1: read_item(Y);
T2: write_item(X);
T1: Y := Y + N;
T1: write_item(Y);

The Constrained Write Assumption

The new value of a data item is dependent only on its old value, and thus the concern is only for the read_item(X) and write_item(X) operations.

Problems:

(a) the value of the data item may depend on the values of other database items (in addition to its old value)

(b) the value of the data item may be independent of any other database items

Example of the Constrained Write Assumption

read_item(X);

... (includes X := f(X))

write_item(X);

The Unconstrained Write Assumption

This is included only for completeness; the 'constrained write' assumption is the one used in precedence graphs.

The new value of each database item in the set of all items written by a transaction (write set) is dependent on the values of some of the items found in the set of all items read by the transaction (read set)

Testing for Serializability of a Schedule (Under Constrained Write)

(1) for each transaction Ti participating in schedule S, create a node labelled Ti in the precedence graph;

(2) for each case in S where Tj executes a read_item(X) that reads the value of item X written by a write_item(X) command executed by Ti, create an edge (Ti -> Tj) in the precedence graph;

(3) for each case in S where Tj executes write_item(X) after Ti executes read_item(X), create an edge (Ti -> Tj) in the precedence graph;

(4) the schedule S is serializable if and only if the precedence graph has no cycles.
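The four steps above can be sketched in a few lines. The representation of a schedule as a list of (transaction, operation, item) tuples and the function names are assumptions made for this example; the cycle test is an ordinary depth-first search.

```python
def precedence_graph(schedule):
    """Build (nodes, edges) from a list of (txn, "read"|"write", item) tuples."""
    nodes = {txn for txn, _, _ in schedule}          # step 1: one node per transaction
    edges = set()
    for j, (tj, op_j, item_j) in enumerate(schedule):
        if op_j == "read":
            # Step 2: Tj reads the value written by the most recent writer Ti.
            for ti, op_i, item_i in reversed(schedule[:j]):
                if op_i == "write" and item_i == item_j:
                    if ti != tj:
                        edges.add((ti, tj))
                    break
        elif op_j == "write":
            # Step 3: Tj writes the item after Ti has read it.
            for ti, op_i, item_i in schedule[:j]:
                if op_i == "read" and item_i == item_j and ti != tj:
                    edges.add((ti, tj))
    return nodes, edges

def has_cycle(nodes, edges):
    """Step 4: depth-first search for a cycle in the directed graph."""
    successors = {n: [] for n in nodes}
    for a, b in edges:
        successors[a].append(b)
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, finished
    colour = {n: WHITE for n in nodes}

    def visit(n):
        colour[n] = GREY
        for m in successors[n]:
            if colour[m] == GREY:         # back edge: m is on the current path
                return True
            if colour[m] == WHITE and visit(m):
                return True
        colour[n] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in nodes)
```

A schedule is then reported as serializable exactly when `has_cycle` returns False for its precedence graph. Assignments such as X := X - N are omitted from the tuples because only reads and writes contribute edges.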

Example

Schedule A (serial):

T1: read_item(X); X := X - N; write_item(X); read_item(Y); Y := Y + N; write_item(Y);

T2: read_item(X); X := X + M; write_item(X);

Precedence graph for schedule A (serial): T1 -> T2 (edge labelled X)

Example

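Schedule A (nonserial, time runs downward):

T1: read_item(X);
T1: X := X - N;
T2: read_item(X);
T2: X := X + M;
T1: write_item(X);
T1: read_item(Y);
T2: write_item(X);
T1: Y := Y + N;
T1: write_item(Y);

Precedence graph for schedule A (nonserial): T1 -> T2 and T2 -> T1 (edges labelled X), which form a cycle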

Methods for Serializability[6]

Protocols are sets of rules that, if followed by every transaction, will ensure serializability of all schedules in which the transactions participate. They may lock data items to prevent multiple transactions from accessing the items concurrently.

Timestamps are unique identifiers for each transaction and are generated by the system. Transactions can then be ordered according to their timestamps to ensure serializability.

Multiversion Concurrency Control Techniques keep the old values of a data item when that item is updated.

Locking Techniques for Concurrency Control

The concept of locking data items is one of the main techniques used for controlling the concurrent execution of transactions.

A lock is a variable associated with a data item in the database. Generally there is a lock for each data item in the database.

A lock describes the status of the data item with respect to possible operations that can be applied to that item. It is used for synchronising the access by concurrent transactions to the database items.

Types of Locks

Binary (Exclusive) locks have two possible states: locked (lock_item(X) operation) and unlocked (unlock_item(X) operation).

Multiple-mode (Shared) locks allow concurrent access to the same item by several transactions. They have three possible states: read locked or shared locked (other transactions are allowed to read the item), write locked or exclusive locked (a single transaction exclusively holds the lock on the item), and unlocked.

Lock Type Compatibility Matrix

Y = yes (requests compatible); N = no (requests incompatible)

X = binary (exclusive) lock; S = multiple-mode (shared) lock; - = no lock

      X   S   -
X     N   N   Y
S     N   Y   Y
-     Y   Y   Y
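The compatibility rules above can be turned into a small lock table. This is a sketch under stated assumptions: the `LockTable` class and its methods are hypothetical, requests are answered immediately with granted or refused rather than being queued, and an item with no holders corresponds to the unlocked state.

```python
# Compatibility of a held lock mode with a requested mode.
COMPATIBLE = {
    ("X", "X"): False, ("X", "S"): False,
    ("S", "X"): False, ("S", "S"): True,
}

class LockTable:
    """Tracks which transactions hold which lock mode on each item."""

    def __init__(self):
        self.held = {}  # item -> list of (transaction, mode)

    def request(self, txn, item, mode):
        """Grant the lock if it is compatible with all locks held by others."""
        holders = self.held.setdefault(item, [])
        for other, held_mode in holders:
            if other != txn and not COMPATIBLE[(held_mode, mode)]:
                return False  # incompatible: the request is refused
        holders.append((txn, mode))
        return True

    def release(self, txn, item):
        """Drop every lock that txn holds on item."""
        self.held[item] = [(t, m) for t, m in self.held.get(item, []) if t != txn]
```

Two readers can share an item, while a writer is refused until both readers have released it. A real lock manager would normally make the refused transaction wait instead of returning False.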

Two-Phase Locking

All locking operations (read_lock, write_lock) precede the first unlock operation in the transaction. Two phases:

• expanding phase: new locks on items can be acquired but none can be released

• shrinking phase: existing locks can be released but no new ones can be acquired

Not two-phase locking:

read_lock(Y);
read_lock(X);
read_item(Y);
unlock(Y);
read_item(X);
X := X + Y;
write_lock(X);
write_item(X);
unlock(X);

Two-phase locking:

read_lock(X);
read_item(X);
write_lock(Y);
unlock(X);
read_item(Y);
Y := X + Y;
write_item(Y);
unlock(Y);
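Whether a transaction obeys the two-phase rule can be checked mechanically: no lock request may appear after the first unlock. The sketch below assumes a transaction is given as a list of operation strings such as "read_lock(X)"; the function name and this representation are illustrative only.

```python
def is_two_phase(operations):
    """Return True if no read_lock/write_lock follows the first unlock."""
    shrinking = False  # becomes True once the first unlock has been seen
    for op in operations:
        name = op.split("(")[0]
        if name == "unlock":
            shrinking = True
        elif name in ("read_lock", "write_lock") and shrinking:
            return False  # a lock was acquired in the shrinking phase
    return True
```

A transaction that issues write_lock(X) after unlock(Y) is rejected, while one that acquires all of its locks before its first unlock is accepted.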

Locking Problems

Deadlock: each of two transactions is waiting for the other to release an item.

Approaches for solution:

deadlock prevention protocol: every transaction must lock all the items it needs in advance

deadlock detection: suitable if the transaction load is light, or if transactions are short and lock only a few items

Livelock: a transaction cannot proceed for an indefinite period of time while other transactions in the system continue normally.

Solution: fair waiting schemes (e.g. first-come-first-served)
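The slides do not say how deadlock detection is carried out. One common approach, shown here purely as an assumption, is to record which transaction each blocked transaction is waiting for and to look for a cycle in that "waits-for" relation. The sketch further assumes each transaction waits for at most one other; the function name is hypothetical.

```python
def find_deadlock(waits_for):
    """Return a list of transactions forming a cycle, or None if there is none.

    waits_for maps a blocked transaction to the transaction it is waiting for.
    """
    for start in waits_for:
        path = []
        current = start
        # Follow the chain of waits until it ends or revisits a transaction.
        while current in waits_for and current not in path:
            path.append(current)
            current = waits_for[current]
        if current in path:
            return path[path.index(current):]  # the transactions in the cycle
    return None
```

Once a cycle is found, a system would typically pick one of the returned transactions as a victim and abort it so that the others can proceed.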

References

[1] http://www.cs.duke.edu/~junyang/courses/cps216-2001-fall/lectures/07-cc.pdf

[2] www.itu.dk/courses/DS/F2002/week11/concurrencycontrol-final.ppt

[3] http://en.wikipedia.org/wiki/Concurrency_control

[4] Transaction Management and Concurrency control by Connolly & Begg. Chapter 19. Third edition


[5] http://publib.boulder.ibm.com/infocenter/wasinfo/v6r0/index.jsp?topic=/com.ibm.websphere.express.doc/info/exp/ae/cejb_cncr.html

[6] Scott W. Ambler, 2006. http://www.agiledata.org/essays/concurrencyControl.html
