chapter19

Elmasri/Navathe, Fundamentals of Database Systems, 4th Edition


Database Recovery Techniques

Chapter 19


Recovery Concepts

Recovery Techniques based on Deferred Update

Recovery Techniques Based on Immediate Update

Shadow Paging

The ARIES Recovery Algorithm

Recovery in Multidatabase Systems

Chapter Outline


Recovery from transaction failures means that the database is restored to the most recent consistent state just before the time of failure.

System must keep information that enables recovery from failures.

A typical strategy for recovery includes: If the database is physically damaged (e.g., by a disk crash), then the recovery method restores a past copy of the database.

Recovery from non-catastrophic (transaction) failures may imply a redo/undo of some operations (this can be done by information kept in the system log).

Recovery Concepts


Writing and reading to secondary storage is a performance bottleneck. To increase performance, the DBMS maintains a collection of in-memory buffers, called the DBMS cache.

A directory is used to keep track of which database items are in the buffers (cache).

When the DBMS requests action on some item, it first checks the cache directory. If it is not in the cache, then the appropriate disk pages must be copied into the cache.

A dirty bit (associated with each buffer), indicates whether or not the buffer has been modified.

A negative aspect of caching is that it makes recovery from failure more complex.

Caching (Buffering) of Disk Blocks


It may be necessary to replace (or flush) some of the cache buffers to make space available for the new item.

Two main strategies for flushing a modified buffer:

1) In-place updating: writes the buffer back to the same original disk location (overwriting old values).

2) Shadowing: writes an updated buffer at a different disk location. Multiple versions of the data can be maintained.
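The cache directory, dirty bit, and in-place flush described above can be sketched as follows. This is a minimal illustration, not a real buffer manager: the class name `BufferCache`, its methods, and the dictionary standing in for the disk are all assumptions made for the example, and only the in-place strategy is shown.

```python
class BufferCache:
    """Toy DBMS cache: a directory maps page ids to buffers, each with a dirty bit."""

    def __init__(self, disk):
        self.disk = disk        # dict standing in for secondary storage (assumption)
        self.directory = {}     # page_id -> {"data": ..., "dirty": bool}

    def read(self, page_id):
        # Check the cache directory first; on a miss, copy the page in from disk.
        if page_id not in self.directory:
            self.directory[page_id] = {"data": self.disk[page_id], "dirty": False}
        return self.directory[page_id]["data"]

    def write(self, page_id, value):
        self.read(page_id)              # make sure the page is cached
        buf = self.directory[page_id]
        buf["data"] = value
        buf["dirty"] = True             # the buffer now differs from the disk copy

    def flush(self, page_id):
        # In-place updating: a dirty buffer overwrites its original disk location.
        buf = self.directory.pop(page_id)
        if buf["dirty"]:
            self.disk[page_id] = buf["data"]


disk = {"P1": 10, "P2": 20}
cache = BufferCache(disk)
cache.write("P1", 11)   # disk still holds 10 until the buffer is flushed
cache.flush("P1")       # now the disk copy is overwritten with 11
```

Between `write` and `flush` the new value exists only in memory, which is why a crash at that point complicates recovery.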



The old value of a data item before updating is called the before image (BFIM), and the new value after updating is called the after image (AFIM).

The shadowing strategy can keep both the BFIM and the AFIM for recovery.

In-place updating requires use of a log for recovery. The BFIM of an updated item must be recorded in the log before BFIM is overwritten with the AFIM (this is known as write-ahead logging).

Two types of log entry information include:

1) Information needed for UNDO; contains the old value (BFIM) of the item,

2) Information needed for REDO; contains the new value (AFIM) of the item.
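A small sketch of write-ahead logging with in-place updating may help here. The list `log`, the dictionary `disk`, and the function names are hypothetical stand-ins chosen for the example; a real DBMS would force the log record to stable storage before touching the data page.

```python
log = []           # stand-in for the append-only log on disk (assumption)
disk = {"X": 5}    # stand-in for the database on disk (assumption)


def write_item(txn, item, new_value):
    bfim = disk[item]
    # Write-ahead rule: record the log entry (including the BFIM) first ...
    log.append({"txn": txn, "item": item, "bfim": bfim, "afim": new_value})
    # ... and only then overwrite the old value in place with the AFIM.
    disk[item] = new_value


def undo(entry):
    # UNDO restores the before image.
    disk[entry["item"]] = entry["bfim"]


def redo(entry):
    # REDO reinstalls the after image; repeating it leaves the same result.
    disk[entry["item"]] = entry["afim"]


write_item("T1", "X", 8)
```

Because each entry carries both images, the same record serves UNDO and REDO.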



If a buffer needs to be reused, but it has data written by a transaction that has not committed:

A steal approach writes the data anyway so the buffer can be reused (i.e., the buffer is stolen)

A no-steal approach does not allow the data to be written until the transaction commits

When a transaction commits:

In a force approach, all buffers updated by a transaction are immediately written to disk

In a no-force approach, they could be written later
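The consequences of these policies for recovery can be written down as a tiny decision function. This is only an illustrative sketch: the function names are made up for the example, and the mapping reflects the usual reasoning that steal can leave uncommitted data on disk (so UNDO may be needed) while no-force can leave committed data off disk (so REDO may be needed).

```python
def may_write_dirty_buffer(txn_committed: bool, steal: bool) -> bool:
    """May a modified buffer be written to disk so it can be reused?"""
    return txn_committed or steal


def recovery_needs(steal: bool, force: bool) -> tuple:
    """Which recovery actions a steal/force combination may require."""
    undo = "UNDO" if steal else "NO-UNDO"       # uncommitted data may be on disk
    redo = "NO-REDO" if force else "REDO"       # committed data may be missing from disk
    return (undo, redo)


# A no-steal/no-force policy needs REDO but never UNDO.
policy = recovery_needs(steal=False, force=False)
```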



Checkpoint is also a type of entry in the log. It is written into the log when the system writes out to the database on disk all DBMS buffers that have been modified.

Transactions whose [COMMIT, T] entries appear in the log before a checkpoint entry do not need to have their WRITE operations redone in case of a system crash.

The recovery manager of a DBMS must decide at what intervals to take a checkpoint.

Checkpoints in the system log


Taking a checkpoint consists of the following actions:

1) Suspend execution of transactions temporarily,

2) Force-write all main memory buffers that have been modified to disk,

3) Write a [CHECKPOINT] record to the log, and force-write the log to disk.

4) Resume execution of transactions.

Checkpoints in the system log
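The four checkpoint actions can be sketched as a single function. Everything here is a simplified assumption for illustration: `buffers`, `disk`, and `log` are plain Python containers, and `scheduler` is a hypothetical dictionary whose `running` flag stands in for suspending and resuming transactions.

```python
def take_checkpoint(buffers, disk, log, scheduler):
    scheduler["running"] = False              # 1) suspend transaction execution
    for page_id, buf in buffers.items():      # 2) force-write modified buffers
        if buf["dirty"]:
            disk[page_id] = buf["data"]
            buf["dirty"] = False
    log.append(("CHECKPOINT",))               # 3) write [CHECKPOINT] to the log
    scheduler["running"] = True               # 4) resume transaction execution


buffers = {"P1": {"data": 11, "dirty": True}, "P2": {"data": 20, "dirty": False}}
disk = {"P1": 10, "P2": 20}
log = [("START", "T1"), ("WRITE", "T1", "P1", 11), ("COMMIT", "T1")]
scheduler = {"running": True}
take_checkpoint(buffers, disk, log, scheduler)
```

After the call, every committed write that precedes the checkpoint record is already on disk, which is why such writes need not be redone.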


If a transaction fails for whatever reason after updating the database, it may be necessary to roll back the transaction.

A transaction may also need to be rolled back because of the cascading rollback phenomenon.

In practice, cascading rollback of transactions is never required because practical recovery methods guarantee cascadeless or strict schedules.

Transaction Rollback


FIGURE 19.1 Illustrating cascading rollback (a process that never occurs in strict or cascadeless schedules). (a) The read and write operations of three transactions. (b) System log at point of crash.



FIGURE 19.1 (continued) Illustrating cascading rollback (a process that never occurs in strict or cascadeless schedules). (c) Operations before the crash.



Some approaches for recovery from non-catastrophic (transaction) failures can be classified as either:

Deferred update: (NO-UNDO, REDO)

The database is not physically updated until after a transaction reaches its commit point. UNDO is not needed. REDO may be necessary.

Immediate update: (UNDO, NO-REDO)

The database is physically updated before the transaction reaches its commit point. UNDO may be necessary when a transaction fails. REDO is not needed.

Recovery Techniques


A deferred update method can be classified as a no-steal approach. It obeys the following rules:

1) A transaction cannot change the database on disk until it reaches its commit point,

2) A transaction does not reach its commit point until all its update operations are recorded in the log and the log is force-written to disk.

Because the database is never updated on disk until after the transaction commits, there is never a need to UNDO any operations (that is why deferred update is known as the NO-UNDO/REDO recovery algorithm).

Recovery Techniques Based on Deferred Update


Recovery using deferred update in a single user environment uses the following procedure:

PROCEDURE RDU: Apply the REDO operation to all the WRITE_ITEM operations of the committed transactions from the log, in the order in which they were written to the log. Restart the active transactions.

The REDO operation must be idempotent – that is, executing it over and over is equivalent to executing it just once.
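Procedure RDU can be sketched in a few lines. The tuple layout of the log entries and the function name `recover_deferred` are assumptions made for this example; under deferred update the log only needs the new value of each written item.

```python
def recover_deferred(log, db):
    """REDO the writes of committed transactions in log order.

    Returns the transactions that were still active and must be restarted.
    """
    committed = {entry[1] for entry in log if entry[0] == "COMMIT"}
    started = [entry[1] for entry in log if entry[0] == "START"]
    for entry in log:
        if entry[0] == "WRITE" and entry[1] in committed:
            _, _, item, new_value = entry
            db[item] = new_value      # REDO installs the new value; it is idempotent
    return [txn for txn in started if txn not in committed]


# Hypothetical log at the point of crash: T1 committed, T2 did not.
crash_log = [
    ("START", "T1"), ("WRITE", "T1", "D", 20), ("COMMIT", "T1"),
    ("START", "T2"), ("WRITE", "T2", "B", 10), ("WRITE", "T2", "D", 25),
]
database = {"B": 15, "D": 12}
to_restart = recover_deferred(crash_log, database)
```

Running the procedure a second time leaves `database` unchanged, which is the idempotence property in practice: a crash during recovery is handled by simply running recovery again.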



FIGURE 19.2 An example of recovery using deferred update in a single-user environment. (a) The READ and WRITE operations of two transactions. (b) The system log at the point of crash.



FIGURE 19.3 An example of recovery in a multiuser environment.



FIGURE 19.4 An example of recovery using deferred update with concurrent transactions. (a) The READ and WRITE operations of four transactions. (b) System log at the point of crash.



In immediate update techniques, although the database can be updated immediately, update operations must be recorded in the log (on disk) before being applied to the database.

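If all updates of a transaction are written to the database on disk before the transaction commits, there is never a need to REDO any operations of committed transactions (this is called the UNDO/NO-REDO recovery algorithm).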

The effect of all active transactions at the time of failure must be undone.



Recovery using immediate update uses the following recovery procedure:

PROCEDURE RIU: UNDO all the WRITE_ITEM operations of the active (uncommitted) transactions from the log. The operations should be done in the reverse of the order in which they were written into the log.

Restart the active transactions.
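Procedure RIU can be sketched in the same style. The log layout (each WRITE entry carrying both the old and the new value) and the function name `recover_immediate` are assumptions for this example.

```python
def recover_immediate(log, db):
    """UNDO the writes of uncommitted transactions in reverse log order.

    Returns the transactions that were active and must be restarted.
    """
    committed = {entry[1] for entry in log if entry[0] == "COMMIT"}
    active = [e[1] for e in log if e[0] == "START" and e[1] not in committed]
    for entry in reversed(log):
        if entry[0] == "WRITE" and entry[1] not in committed:
            _, _, item, old_value, _new_value = entry
            db[item] = old_value      # UNDO restores the before image
    return active


# Hypothetical log at the point of crash: T2 wrote A twice and never committed.
crash_log = [
    ("START", "T1"), ("WRITE", "T1", "A", 1, 2), ("COMMIT", "T1"),
    ("START", "T2"), ("WRITE", "T2", "A", 2, 3),
    ("WRITE", "T2", "B", 7, 8), ("WRITE", "T2", "A", 3, 4),
]
database = {"A": 4, "B": 8}
to_restart = recover_immediate(crash_log, database)
```

The reverse scan matters here: T2 wrote A twice, so undoing in log order would leave A at 3 instead of the committed value 2.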



Shadow Paging is a NO-UNDO, NO-REDO approach to recovery.

In a single-user environment, it does not require the use of a log. In a multiuser environment, a log may be needed for the concurrency control method.

It considers the database to be made up of a number of fixed-size disk pages, tracked through a directory that is kept in main memory if it is not too large.

When a transaction begins, it creates its own current and shadow directories

When a WRITE_ITEM operation is performed, a new copy of the modified DB page is created and the current directory is modified

Recovery from a failure requires freeing the modified pages and discarding the current directory.

Shadow Paging
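A toy model of shadow paging is sketched below. The class `ShadowPagedDB`, the slot-numbering scheme, and the `commit` method are assumptions made for the example (the slide only describes begin, write, and recovery); the point is that a write never touches the old page, so recovery just discards the current directory.

```python
class ShadowPagedDB:
    def __init__(self, pages):
        self.disk = dict(enumerate(pages))          # slot -> page contents
        self.current = {i: i for i in self.disk}    # page number -> slot
        self.shadow = None

    def begin(self):
        # The shadow directory is saved and never modified during the transaction.
        self.shadow = dict(self.current)

    def write_item(self, page_no, value):
        if self.current[page_no] == self.shadow[page_no]:
            # First write to this page: allocate a new copy in an unused slot.
            self.current[page_no] = max(self.disk) + 1
        self.disk[self.current[page_no]] = value    # the old page stays untouched

    def read_item(self, page_no):
        return self.disk[self.current[page_no]]

    def commit(self):
        # Assumed commit step: free the superseded pages, drop the shadow directory.
        for page_no, slot in self.shadow.items():
            if self.current[page_no] != slot:
                del self.disk[slot]
        self.shadow = None

    def recover(self):
        # Free the modified pages and discard the current directory.
        for page_no, slot in self.current.items():
            if self.shadow[page_no] != slot:
                del self.disk[slot]
        self.current = dict(self.shadow)
        self.shadow = None


db = ShadowPagedDB(["a", "b"])
db.begin()
db.write_item(1, "B")
db.recover()            # page 1 reads "b" again; nothing was undone or redone
```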


FIGURE 19.5 An example of shadow paging.

Shadow Paging


ARIES is a REDO/UNDO approach to recovery that uses a steal/no-force approach for writing buffers to disk.

Recovery procedure has three main phases:

1) Analysis phase: identifies the dirty (updated) pages in the buffer, and the set of active transactions. Finds where in the log to begin REDO.

2) REDO phase: reapplies updates from the log in order to bring the database to the state it was in before failure.

3) UNDO phase: the log is scanned backwards and the operations of transactions that were active at the time of the crash are undone in reverse order.

The ARIES Recovery Algorithm
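The analysis phase can be illustrated with a heavily simplified sketch. It is not the full ARIES algorithm: the record layout `(lsn, txn, kind, page)` and the function name `analysis` are assumptions, the scan starts at the beginning of the log rather than at the last checkpoint, and committed transactions are simply dropped from the transaction table.

```python
def analysis(log):
    """Rebuild the transaction table and dirty page table from the log."""
    txn_table = {}      # active txn -> LSN of its most recent log record
    dirty_pages = {}    # page -> LSN of the first record that dirtied it
    for lsn, txn, kind, page in log:
        if kind == "update":
            txn_table[txn] = lsn
            dirty_pages.setdefault(page, lsn)
        elif kind == "commit":
            txn_table.pop(txn, None)    # simplification: committed txns leave the table
    # REDO starts at the earliest LSN that may not have reached disk.
    redo_start = min(dirty_pages.values(), default=None)
    return txn_table, dirty_pages, redo_start


# Hypothetical log at the point of crash.
crash_log = [
    (1, "T1", "update", "C"),
    (2, "T2", "update", "B"),
    (3, "T1", "commit", None),
    (4, "T3", "update", "A"),
    (5, "T2", "update", "C"),
]
active, dirty, redo_start = analysis(crash_log)
```

In this example T2 and T3 remain in the transaction table, so they are the transactions the UNDO phase would roll back, and REDO would begin at LSN 1.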


FIGURE 19.6 An example of recovery in ARIES. (a) The log at point of crash. (b) Transaction and Dirty Page Tables at time of checkpoint. (c) The Transaction and Dirty Page Tables after the analysis phase.

The ARIES Recovery Algorithm


A multidatabase transaction is a transaction that requires access to multiple databases. A global recovery manager (or coordinator) performs the two-phase commit protocol:

1) When all participating databases signal the coordinator that the part of the multidatabase transaction involving each has concluded, the coordinator sends a message “prepare for commit” to each participant,

2) If all participating databases reply “OK”, the transaction is successful and the coordinator sends a “commit” signal to the participating databases; otherwise it sends each participating database a “roll back” message, telling it to UNDO the local effect of the transaction.

Recovery in Multidatabase Systems
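The coordinator's side of two-phase commit can be sketched as follows. The `Participant` class, its `can_commit` flag, and the function name `two_phase_commit` are hypothetical; a real protocol also has to handle timeouts, lost messages, and logging at each site, which this sketch omits.

```python
class Participant:
    """Hypothetical participating database that votes during the prepare phase."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Reply "OK" (True) only if the local part of the transaction can commit.
        self.state = "prepared" if self.can_commit else "failed"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled back"   # UNDO the local effect of the transaction


def two_phase_commit(participants):
    # Phase 1: send "prepare for commit" to every participant and collect replies.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if every reply was "OK"; otherwise roll everyone back.
    decision = "commit" if all(votes) else "roll back"
    for p in participants:
        if decision == "commit":
            p.commit()
        else:
            p.rollback()
    return decision


sites = [Participant("db1"), Participant("db2", can_commit=False)]
outcome = two_phase_commit(sites)   # one "not OK" reply forces a global roll back
```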


A key assumption in our discussion of non-catastrophic failures has been that the system log is maintained on the disk and is not lost as a result of the failure.

The recovery manager of a DBMS must also be equipped to handle catastrophic failures.

The main technique used to handle catastrophic failures (such as disk crashes) is database backup: the whole database and the log are periodically copied onto a cheap storage medium such as magnetic tape, so that after a catastrophic failure the latest backup copy can be reloaded.

Recovery from Catastrophic Failures
