cs 405g: introduction to database systems 20. concurrent control, storage

CS 405G: Introduction to Database Systems

20. Concurrent control, Storage

04/20/23 Chen Qian @ University of Kentucky 2

Conflicting operations

Two operations on the same data item conflict if at least one of the operations is a write r(X) and w(X) conflict w(X) and r(X) conflict w(X) and w(X) conflict r(X) and r(X) do not r/w(X) and r/w(Y) do not

Order of conflicting operations matters E.g., if T1.r(A) precedes T2.w(A), then conceptually, T1

should precede T2


Locking

Rules If a transaction wants to read an object, it must first request a

shared lock (S mode) on that object If a transaction wants to modify an object, it must first request

an exclusive lock (X mode) on that object Allow one exclusive lock, or multiple shared locks

Mode of lock(s)currently held

by other transactions

Mode of the lock requested

Grant the lock?

Compatibility matrix

S X

S Y N

X N N


Basic locking is not enough

T1 T2

r(A)w(A)

r(A)w(A)

r(B)w(B)

r(B)w(B)

lock-X(A)

lock-X(B)

unlock(B)

unlock(A)lock-X(A)

unlock(A)

unlock(B)lock-X(B)

Possible scheduleunder locking

But still notconflict-

serializable!

T1

T2

Read 100

Write 100+1

Read 101

Write 101*2

Read 100

Write 100*2

Read 200

Write 200+1

Add 1 to both A and B(preserve A=B)

Multiply both A and B by 2(preserves A=B)

A B!


Two-phase locking (2PL)

All lock requests precede all unlock requests Phase 1: obtain locks, phase 2: release locks

T1 T2

r(A)w(A)

r(A)w(A)

r(B) w(B) r(B)w(B)

lock-X(A)

lock-X(B)

unlock(B)

unlock(A)lock-X(A)

lock-X(B)

Cannot obtain the lock on Buntil T1 unlocks

T1 T2

r(A)w(A)

r(A)w(A)

r(B)w(B) r(B) w(B)

2PL guarantees aconflict-serializable

schedule


Problem of 2PL

T2 has read uncommitted data written by T1

If T1 aborts, then T2 must abort as well

Cascading aborts possible if other transactions have read data written by T2

Even worse, what if T2 commits before T1? Schedule is not recoverable if the system crashes right

after T2 commits

T1 T2

r(A)w(A)

r(A)w(A)

r(B)w(B) r(B) w(B)

Abort!


Strict 2PL

Only release locks at commit/abort time A writer will block all other readers until the writer

commits or aborts

Used in most commercial DBMS

Deadlock Detection

Create and maintain a “waits-for” graph Periodically check for cycles in graph


Deadlock Detection (Continued)

Example:

T1: S(A), S(D), S(B)T2: X(B) X(C)T3: S(D), S(C), X(A)T4: X(B)

T1 T2

T4 T3

Deadlock!


Deadlock Prevention

Assign priorities based on timestamps. Say Ti wants a lock that Tj holds

Two policies are possible:Wait-Die: If Ti has higher priority, Ti waits for Tj; otherwise Ti aborts

Wound-wait: If Ti has higher priority, Tj aborts; otherwise Ti waits



Storage

It’s all about disks! That’s why we always draw databases as And why the single most important metric in database

processing is the number of disk I/Os performed

The Storage Hierarchy

Source: Operating Systems Concepts 5th Edition

Main memory (RAM) for currently used dataDisk for the main database (secondary storage).Tapes for archiving older versions of the data (tertiary storage).

Smaller, Faster

Bigger, Slower

04/20/23 12Chen Qian @ University of Kentucky


Jim Gray’s Storage Latency Analogy: How Far Away is the Data?

RegistersOn Chip CacheOn Board Cache

Memory

Disk

12

10

100

Tape /Optical Robot

10 9

10 6

Lexington

This Lecture Hall

This RoomMy Head

10 min

1.5 hr

2 Years

1 min

Pluto

2,000 Years

Andromeda


A typical disk

Spindle rotation

Sp

ind

le

Platter

Tracks

Arm movement

Disk arm

Disk headCylinders

“Moving parts” are slow


Disk access time

Sum of: Seek time: time for disk heads to move to the correct

cylinder Rotational delay: time for the desired block to rotate

under the disk head Transfer time: time to read/write data in the block (=

time for disk to rotate over the block)


Random disk access

Seek time + rotational delay + transfer time Average seek time

Time to skip one half of the cylinders? Not quite; should be time to skip a third of them (why?)

“Typical” value: 5 ms Average rotational delay

Time for a half rotation (a function of RPM) “Typical” value: 4.2 ms (7200 RPM)

Typical transfer time .08msec per 8K block


Sequential Disk Access Improves Performance

Seek time + rotational delay + transfer time Seek time

0 (assuming data is on the same track) Rotational delay

0 (assuming data is in the next block on the track) Easily an order of magnitude faster than random disk

access!


Performance tricks

Disk layout strategy Keep related things (what are they?) close together: same

sector/block ! same track ! same cylinder ! adjacent cylinder

Double buffering While processing the current block in memory, prefetch

the next block from disk (overlap I/O with processing) Disk scheduling algorithm Track buffer

Read/write one entire track at a time Parallel I/O

More disk heads working at the same time


Records and Files

A row in a table, when located on disks, is called A record

Two types of record: Fixed-length Variable-length


In an abstract sense, a file is A set of “records” on a disk

In reality, a file is A set of disk pages

Each record lives on A page

Physical Record ID (RID) A tuple of <page#, slot#>

Files

Higher levels of DBMS operate on records, and files of records.

FILE: A collection of pages, containing a collection of records. Record = row in a table

Must support: insert/delete/modify record fetch a particular record (specified using record id) scan all records (possibly with some conditions on the records

to be retrieved)


Unordered (Heap) Files

Simplest file structure contains records in no particular order.

As file grows and shrinks, disk pages are allocated and de-allocated.

To support record level operations, we must: keep track of the pages in a file keep track of free space on pages keep track of the records on a page

There are many alternatives for keeping track of this. We’ll consider 2


Heap File Implemented as a List

The header page id and Heap file name must be stored someplace. Database “catalog”

Each page contains 2 `pointers’ plus data.

HeaderPage

DataPage

DataPage

DataPage

DataPage

DataPage

DataPage Pages with

Free Space

Full Pages


Heap File Using a Page Directory

DBMS must remember where the first directory page is located. The entry for a page can include the number of free bytes on the

page. The directory itself is a collection of pages and shown as a

linked list. Much smaller than linked list of all HF pages!

DataPage 1

DataPage 2

DataPage N

HeaderPage

DIRECTORY


Trade-off

Compared to linked list, the directory based approaches have:

Pros: Fast access to a particular record Cons: Extra space


Summary

Disks provide cheap, non-volatile storage. Random access, but cost depends on the location of page

on disk; important to arrange data sequentially to minimize seek and rotation delays.


cs 405g: introduction to database systems 20. concurrent control, storage

Documents

storagechen qian

t2chen qian

commercial dbmschen

ti waits

lock requestedgrant

ti abortswound

data tertiary storage

t1if t1 aborts