cs 405g: introduction to database systems 20. concurrent control, storage
TRANSCRIPT
CS 405G: Introduction to Database Systems
20. Concurrent control, Storage
04/20/23 Chen Qian @ University of Kentucky 2
Conflicting operations
Two operations on the same data item conflict if at least one of the operations is a write r(X) and w(X) conflict w(X) and r(X) conflict w(X) and w(X) conflict r(X) and r(X) do not r/w(X) and r/w(Y) do not
Order of conflicting operations matters E.g., if T1.r(A) precedes T2.w(A), then conceptually, T1
should precede T2
04/20/23 Chen Qian @ University of Kentucky 3
Locking
Rules If a transaction wants to read an object, it must first request a
shared lock (S mode) on that object If a transaction wants to modify an object, it must first request
an exclusive lock (X mode) on that object Allow one exclusive lock, or multiple shared locks
Mode of lock(s)currently held
by other transactions
Mode of the lock requested
Grant the lock?
Compatibility matrix
S X
S Y N
X N N
04/20/23 Chen Qian @ University of Kentucky 4
Basic locking is not enough
T1 T2
r(A)w(A)
r(A)w(A)
r(B)w(B)
r(B)w(B)
lock-X(A)
lock-X(B)
unlock(B)
unlock(A)lock-X(A)
unlock(A)
unlock(B)lock-X(B)
Possible scheduleunder locking
But still notconflict-
serializable!
T1
T2
Read 100
Write 100+1
Read 101
Write 101*2
Read 100
Write 100*2
Read 200
Write 200+1
Add 1 to both A and B(preserve A=B)
Multiply both A and B by 2(preserves A=B)
A B!
04/20/23 Chen Qian @ University of Kentucky 5
Two-phase locking (2PL)
All lock requests precede all unlock requests Phase 1: obtain locks, phase 2: release locks
T1 T2
r(A)w(A)
r(A)w(A)
r(B) w(B) r(B)w(B)
lock-X(A)
lock-X(B)
unlock(B)
unlock(A)lock-X(A)
lock-X(B)
Cannot obtain the lock on Buntil T1 unlocks
T1 T2
r(A)w(A)
r(A)w(A)
r(B)w(B) r(B) w(B)
2PL guarantees aconflict-serializable
schedule
04/20/23 Chen Qian @ University of Kentucky 6
Problem of 2PL
T2 has read uncommitted data written by T1
If T1 aborts, then T2 must abort as well
Cascading aborts possible if other transactions have read data written by T2
Even worse, what if T2 commits before T1? Schedule is not recoverable if the system crashes right
after T2 commits
T1 T2
r(A)w(A)
r(A)w(A)
r(B)w(B) r(B) w(B)
Abort!
04/20/23 Chen Qian @ University of Kentucky 7
Strict 2PL
Only release locks at commit/abort time A writer will block all other readers until the writer
commits or aborts
Used in most commercial DBMS
Deadlock Detection
Create and maintain a “waits-for” graph Periodically check for cycles in graph
04/20/23 Chen Qian @ University of Kentucky 8
Deadlock Detection (Continued)
Example:
T1: S(A), S(D), S(B)T2: X(B) X(C)T3: S(D), S(C), X(A)T4: X(B)
T1 T2
T4 T3
Deadlock!
04/20/23 Chen Qian @ University of Kentucky 9
Deadlock Prevention
Assign priorities based on timestamps. Say Ti wants a lock that Tj holds
Two policies are possible:Wait-Die: If Ti has higher priority, Ti waits for Tj; otherwise Ti aborts
Wound-wait: If Ti has higher priority, Tj aborts; otherwise Ti waits
04/20/23 Chen Qian @ University of Kentucky 10
04/20/23 Chen Qian @ University of Kentucky 11
Storage
It’s all about disks! That’s why we always draw databases as And why the single most important metric in database
processing is the number of disk I/Os performed
The Storage Hierarchy
Source: Operating Systems Concepts 5th Edition
Main memory (RAM) for currently used dataDisk for the main database (secondary storage).Tapes for archiving older versions of the data (tertiary storage).
Smaller, Faster
Bigger, Slower
04/20/23 12Chen Qian @ University of Kentucky
04/20/23 Chen Qian @ University of Kentucky 13
Jim Gray’s Storage Latency Analogy: How Far Away is the Data?
RegistersOn Chip CacheOn Board Cache
Memory
Disk
12
10
100
Tape /Optical Robot
10 9
10 6
Lexington
This Lecture Hall
This RoomMy Head
10 min
1.5 hr
2 Years
1 min
Pluto
2,000 Years
Andromeda
04/20/23 Chen Qian @ University of Kentucky 14
A typical disk
Spindle rotation
Sp
ind
le
Platter
Tracks
Arm movement
Disk arm
Disk headCylinders
“Moving parts” are slow
04/20/23 Chen Qian @ University of Kentucky 15
Disk access time
Sum of: Seek time: time for disk heads to move to the correct
cylinder Rotational delay: time for the desired block to rotate
under the disk head Transfer time: time to read/write data in the block (=
time for disk to rotate over the block)
04/20/23 Chen Qian @ University of Kentucky 16
Random disk access
Seek time + rotational delay + transfer time Average seek time
Time to skip one half of the cylinders? Not quite; should be time to skip a third of them (why?)
“Typical” value: 5 ms Average rotational delay
Time for a half rotation (a function of RPM) “Typical” value: 4.2 ms (7200 RPM)
Typical transfer time .08msec per 8K block
04/20/23 Chen Qian @ University of Kentucky 17
Sequential Disk Access Improves Performance
Seek time + rotational delay + transfer time Seek time
0 (assuming data is on the same track) Rotational delay
0 (assuming data is in the next block on the track) Easily an order of magnitude faster than random disk
access!
04/20/23 Chen Qian @ University of Kentucky 18
Performance tricks
Disk layout strategy Keep related things (what are they?) close together: same
sector/block ! same track ! same cylinder ! adjacent cylinder
Double buffering While processing the current block in memory, prefetch
the next block from disk (overlap I/O with processing) Disk scheduling algorithm Track buffer
Read/write one entire track at a time Parallel I/O
More disk heads working at the same time
04/20/23 Chen Qian @ University of Kentucky 19
Records and Files
A row in a table, when located on disks, is called A record
Two types of record: Fixed-length Variable-length
04/20/23 Chen Qian @ University of Kentucky 20
In an abstract sense, a file is A set of “records” on a disk
In reality, a file is A set of disk pages
Each record lives on A page
Physical Record ID (RID) A tuple of <page#, slot#>
Files
Higher levels of DBMS operate on records, and files of records.
FILE: A collection of pages, containing a collection of records. Record = row in a table
Must support: insert/delete/modify record fetch a particular record (specified using record id) scan all records (possibly with some conditions on the records
to be retrieved)
04/20/23 21Chen Qian @ University of Kentucky
Unordered (Heap) Files
Simplest file structure contains records in no particular order.
As file grows and shrinks, disk pages are allocated and de-allocated.
To support record level operations, we must: keep track of the pages in a file keep track of free space on pages keep track of the records on a page
There are many alternatives for keeping track of this. We’ll consider 2
04/20/23 22Chen Qian @ University of Kentucky
Heap File Implemented as a List
The header page id and Heap file name must be stored someplace. Database “catalog”
Each page contains 2 `pointers’ plus data.
HeaderPage
DataPage
DataPage
DataPage
DataPage
DataPage
DataPage Pages with
Free Space
Full Pages
04/20/23 23Chen Qian @ University of Kentucky
Heap File Using a Page Directory
DBMS must remember where the first directory page is located. The entry for a page can include the number of free bytes on the
page. The directory itself is a collection of pages and shown as a
linked list. Much smaller than linked list of all HF pages!
DataPage 1
DataPage 2
DataPage N
HeaderPage
DIRECTORY
04/20/23 24Chen Qian @ University of Kentucky
Trade-off
Compared to linked list, the directory based approaches have:
Pros: Fast access to a particular record Cons: Extra space
04/20/23 Chen Qian @ University of Kentucky 25
Summary
Disks provide cheap, non-volatile storage. Random access, but cost depends on the location of page
on disk; important to arrange data sequentially to minimize seek and rotation delays.
04/20/23 26Chen Qian @ University of Kentucky