part14 crash

Upload: sukhpreet-kaur

Post on 05-Apr-2018


  • 8/2/2019 Part14 Crash

    1/28

    Part 14 -crash 1

    Crash Recovery

    in case of a system crash (failure) we require a recovery scheme to:
      - detect failures
      - restore the database to a consistent state

    Failure Types

    Volatile Storage
      - main memory and cache
      - normally does not survive a crash


    Failure Types

    Nonvolatile Storage
      - usually survives a crash (except for a head crash, etc.)
      - example: disk and magnetic tape

    Stable Storage
      - "never" lost (??)
      - can replicate on several nonvolatile media with independent failure modes


    Failure Types

    Logical Error
      - program-related error: divide by zero, overflow, access to non-existent memory, etc.
      - can often be restarted after a software fix is made

    System Error
      - example: deadlock, or some undesirable system state entered
      - re-execution often possible


    Failure Types

    System Crash
      - some hardware problem, volatile memory lost, ...

    Disk Failure
      - head crash, etc.
      - error during data transfer, sometimes recoverable


    Basic Terminology

    input(X)
      transfer the physical block where data item X resides into main memory

    output(X)
      transfer the buffer block on which X resides onto its physical block (disk)

    read(X, xi)
      assign the value of X to local variable xi:
      - if the block in which X resides is not in main memory, then issue an input(X).
      - assign xi the value of data item X from the buffer block.


    Basic Terminology

    write(X, xi)
      assign the value of the local variable xi to data item X in the buffer block:
      - if the block in which X resides is not in main memory, then issue an input(X) first.
      - assign xi to X in buffer memory.
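The four operations can be sketched as a toy buffer manager. This is only an illustration under simplifying assumptions: each data item occupies its own block, the disk is a plain dictionary, and the class name `BufferManager` and the starting balances are hypothetical.

```python
class BufferManager:
    """Toy model of input/output/read/write with one data item per block."""

    def __init__(self, disk):
        self.disk = disk      # nonvolatile storage: item name -> value
        self.buffer = {}      # volatile storage: blocks currently in main memory

    def input(self, x):
        # Transfer the physical block holding x into main memory.
        self.buffer[x] = self.disk[x]

    def output(self, x):
        # Transfer the buffer block holding x back onto disk.
        self.disk[x] = self.buffer[x]

    def read(self, x):
        # Issue input(x) only if the block is not already in memory.
        if x not in self.buffer:
            self.input(x)
        return self.buffer[x]

    def write(self, x, value):
        # Bring the block in first if needed, then update the buffer copy only.
        if x not in self.buffer:
            self.input(x)
        self.buffer[x] = value


disk = {"A": 1000, "B": 2000}   # assumed starting values
bm = BufferManager(disk)
a1 = bm.read("A")
bm.write("A", a1 - 50)
value_on_disk_before_output = disk["A"]   # still the old value
bm.output("A")
print(value_on_disk_before_output, disk["A"])
```

Note that write only touches the buffer; the disk copy changes when output is called, which is the gap the recovery schemes have to cover.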


    EXAMPLE

    consider the following example from a banking system, where $50 is withdrawn from account A and deposited into account B:

    read( A, a1 )

    a1 = a1 - 50

    write( A, a1 )

    read( B, b1 )

    b1 = b1 + 50

    write( B, b1 )


    Failure Modes

    a failure can leave the database in an inconsistent state, e.g.:
      - failure after output(A) but before output(B)
      - before output(A) and output(B) are executed, the physical database blocks and the memory blocks differ, which is a problem if a crash occurs
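A minimal sketch of the first failure mode, assuming balances of A = 1000 and B = 2000 and modelling the crash as the loss of the in-memory buffer. The helper functions are hypothetical stand-ins for the operations defined earlier.

```python
disk = {"A": 1000, "B": 2000}   # assumed starting balances
buffer = {}                      # volatile main-memory copies

def read(x):
    if x not in buffer:
        buffer[x] = disk[x]      # input(x)
    return buffer[x]

def write(x, value):
    if x not in buffer:
        buffer[x] = disk[x]      # input(x) first
    buffer[x] = value

def output(x):
    disk[x] = buffer[x]

# The transfer transaction runs to completion in memory.
a1 = read("A"); write("A", a1 - 50)
b1 = read("B"); write("B", b1 + 50)

output("A")
buffer.clear()   # simulated crash: volatile memory is lost before output(B)

total_after_crash = disk["A"] + disk["B"]
print(disk, total_after_crash)
```

The total on disk drops from 3000 to 2950: the withdrawal reached the disk but the deposit did not.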


    Transaction

    a basic program unit

    its execution preserves the database consistency
      - the database is consistent both before and after its execution.

    a transaction may not always complete
      - it may become aborted for various reasons
      - the database must be restored (rolled back) to the state before the transaction started

    the transaction must be atomic
      - either all the instructions are completed or none are performed


    Crash Recovery Methods

    Incremental Log with Deferred Updates
      - during the transaction execution, all writes are deferred until the partial commit stage
      - all updates are recorded on the log and written to stable storage
      - the log is used to update the database after the transaction commits.

    for example: let A = 1000, B = 2000 at the start

    T1                 Log
    read(A, a)         <T1, start>
    a := a - 50
    write(A, a)        <T1, A, 950>
    read(B, b)
    b := b + 50
    write(B, b)        <T1, B, 2050>
                       <T1, commit>
    . . .              other transactions


    Recovery Procedure

    redo(Ti)
      - sets all data values updated by Ti to their new values
      - Ti needs a redo if both <Ti, start> and <Ti, commit> are found in the log.
      - redo is idempotent: it can be executed more than once with the same final result. This matters, for example, when the system crashes while performing a recovery.
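A minimal sketch of deferred updates and redo, assuming log records are modelled as Python tuples: `(T, "start")`, `(T, item, new_value)` and `(T, "commit")`. The function names and the uncommitted transaction T2 are hypothetical.

```python
def run_deferred(log, tid, updates):
    """Record the intended writes in the log only; the database is untouched."""
    log.append((tid, "start"))
    for item, new_value in updates:
        log.append((tid, item, new_value))
    log.append((tid, "commit"))   # the log is assumed to sit on stable storage

def redo_committed(db, log):
    """Redo every transaction that has both a start and a commit record."""
    started = {r[0] for r in log if r[1:] == ("start",)}
    committed = {r[0] for r in log if r[1:] == ("commit",)}
    complete = started & committed
    for rec in log:
        if len(rec) == 3 and rec[0] in complete:
            _, item, new_value = rec
            db[item] = new_value

db = {"A": 1000, "B": 2000}
log = []
run_deferred(log, "T1", [("A", 950), ("B", 2050)])
log.append(("T2", "start"))
log.append(("T2", "A", 0))        # T2 never commits, so it is ignored

redo_committed(db, log)
first_pass = dict(db)
redo_committed(db, log)           # second pass, e.g. after a crash during recovery
print(first_pass, db)
```

Because redo only assigns new values, running it twice leaves the same state as running it once.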


    Crash Recovery Methods

    Incremental Log with Immediate Update
      - all updates are applied to the database; we keep an incremental log of all changes.
      - <Ti, start> is written to stable storage when Ti begins.
      - for each write: <Ti, data_item, old_value, new_value> is written to stable storage before any output(X) is performed.
        e.g. write(X, 950) is logged with the old value of X and the new value 950
      - when Ti partially commits, <Ti, commit> is written to the log.


    Recovery Procedure

    [Incremental Log with Immediate Update]

    redo(Ti)
      - the same as before: set updated items to their new values

    undo(Ti)
      - needed if the log contains a <Ti, start> but no <Ti, commit> is found.
      - restore the value of items updated by Ti to their old values.
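The undo and redo rules can be sketched as follows, assuming update records are tuples of the form `(T, item, old_value, new_value)`. The item C, its values, and the disk state at crash time are made up for illustration.

```python
def recover(db, log):
    """Undo uncommitted transactions, then redo committed ones."""
    started = {r[0] for r in log if len(r) == 2 and r[1] == "start"}
    committed = {r[0] for r in log if len(r) == 2 and r[1] == "commit"}

    # undo: scan backwards, restoring old values of incomplete transactions
    for rec in reversed(log):
        if len(rec) == 4 and rec[0] in started - committed:
            _, item, old_value, _ = rec
            db[item] = old_value

    # redo: scan forwards, reapplying new values of committed transactions
    for rec in log:
        if len(rec) == 4 and rec[0] in committed:
            _, item, _, new_value = rec
            db[item] = new_value
    return db

log = [
    ("T0", "start"),
    ("T0", "A", 1000, 950),
    ("T0", "B", 2000, 2050),
    ("T0", "commit"),
    ("T1", "start"),
    ("T1", "C", 700, 600),   # T1 had not committed when the crash happened
]
# Assumed disk state at the crash: T0's write to B was not yet output,
# while T1's write to C already was.
db = {"A": 950, "B": 2000, "C": 600}
print(recover(db, log))
```

Here T0 is redone (B becomes 2050) and T1 is undone (C goes back to 700).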


    Checkpoints

    recovery with logs requires the entire log to be scanned.
      - the search time grows with log size.
      - many redone transactions are unnecessary since their updates have already been written to disk.

    we can maintain periodic checkpoints:
      - save all log records currently residing in main memory (if any) onto stable storage.
      - output all modified buffer blocks to disk.
      - output a <checkpoint> record to the log on stable storage.


    Checkpoints - recovery

    Recovery:

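      - find the last transaction Ti that started executing before the last checkpoint: scan the log backwards to the most recent <checkpoint> record, then continue back to the nearest <Ti, start>.
      - all the redo and undo operations apply only to Ti and subsequent Tjs.
      - much less time consuming.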
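A small sketch of how a checkpoint shortens the scan, assuming transactions run one at a time, the log holds at least one checkpoint record, and records are modelled as tuples with `("checkpoint",)` standing for the checkpoint record. The function name is hypothetical.

```python
def records_to_examine(log):
    """Return the suffix of the log that recovery has to look at."""
    # position of the most recent checkpoint record
    cp = max(i for i, rec in enumerate(log) if rec == ("checkpoint",))
    # nearest start record before that checkpoint (fall back to the checkpoint)
    start = max((i for i in range(cp) if log[i][1:] == ("start",)), default=cp)
    return log[start:]

log = [
    ("T0", "start"), ("T0", "A", 1000, 950), ("T0", "commit"),
    ("T1", "start"), ("T1", "B", 2000, 2050),
    ("checkpoint",),
    ("T1", "commit"),
    ("T2", "start"), ("T2", "C", 700, 600),
]
suffix = records_to_examine(log)
print(suffix[0], len(suffix), "of", len(log))
```

T0 committed before the checkpoint, so its records are skipped entirely; only T1 and T2 need a redo or undo decision.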


    Buffer Management

    OSs with virtual memory have paging schemes to evict resident pages as required.

    this may work against us:
      - the OS may evict a modified block before Ti commits.
      - as well, log records are often kept in main memory until a buffer block is full before being sent to stable storage.
      - if Ti now crashes, an inconsistency may result.

    most OSs rarely support database requirements


    Buffer Management

    it may be possible for the db manager to allocate an area of memory and manage it independently of the OS (i.e. memory reserved for database use only).

    thus <Ti, data_item, old_value, new_value> must be written to stable storage before output of the block on which the item resides (all entries).

    before the output of a block in main memory, all log records pertaining to the block must be written to stable storage first.
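The ordering rule above can be sketched with a toy buffer that refuses to output a block until its log records are on stable storage. The class `WalBuffer` and its fields are hypothetical, one item per block is assumed, and for simplicity the whole log buffer is flushed at once so that record order is preserved.

```python
class WalBuffer:
    def __init__(self):
        self.disk = {}          # database blocks on disk
        self.stable_log = []    # log records already on stable storage
        self.log_buffer = []    # log records still in main memory
        self.pages = {}         # modified buffer blocks

    def write(self, tid, item, new_value):
        old_value = self.pages.get(item, self.disk.get(item))
        self.log_buffer.append((tid, item, old_value, new_value))
        self.pages[item] = new_value

    def flush_log(self):
        self.stable_log.extend(self.log_buffer)
        self.log_buffer.clear()

    def output(self, item):
        # Log first: force pending log records for this block to stable storage.
        if any(rec[1] == item for rec in self.log_buffer):
            self.flush_log()
        self.disk[item] = self.pages[item]

buf = WalBuffer()
buf.disk["A"] = 1000
buf.write("T0", "A", 950)
buf.output("A")
print(buf.stable_log, buf.disk)
```

With this ordering, any value that reaches the disk always has its old value recorded on stable storage, so an undo remains possible.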


    Shadow Paging

    the database is partitioned into a number of fixed-length blocks (pages).

    we can use a page table to translate each logical block into its physical block.

    [Figure: a logical page table with entries 1, 2, 3, ..., n pointing to physical pages on disk]

    we maintain two page tables:
      - current page table: used by Ti.
      - shadow page table: a copy of the table before Ti executes, never changed during execution of Ti, and stored in stable storage.


    Shadow Paging

    example:

    a write(X, xi) is issued and X resides on the k-th page:
      - if the k-th page is not in memory, then issue an input(X).
      - if this is the first write to the k-th page:
          - find a free page on disk.
          - modify the current page table so the k-th entry points to the new page.
      - assign xi to X in the buffer page.


    Shadow Paging

    the shadow page table is stored in non-volatile memory just prior to the execution of Ti. We can recover the shadow page table on a crash.

    when Ti commits, the current page table becomes the new shadow page table.

    if the current page table is lost in a crash, it is simple to roll the system back to the last consistent state.

    the overhead of log records is eliminated.


    Shadow Paging

    recovery is fast since there are no redos or undos to perform.

    In order to commit a transaction:
      - all modified buffer pages in main memory are output to disk.
      - output the current page table to disk (do not overwrite the shadow page table; we may need it to recover if a crash occurs now).
      - send the disk address of the current page table to stable storage; this overwrites the address of the previous shadow page table.
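A minimal sketch of the shadow-paging idea, under strong simplifications: buffer pages and disk pages are collapsed into one dictionary, a free page is simply the next unused page number, and the atomic write of the page-table address is modelled by replacing a Python list. The class and method names are hypothetical.

```python
class ShadowPagedDb:
    def __init__(self, pages):
        self.disk = dict(enumerate(pages))       # physical page number -> contents
        self.shadow = list(range(len(pages)))    # logical -> physical, on stable storage
        self.current = list(self.shadow)         # working table used by the transaction
        self.dirty = set()                       # logical pages already copied

    def write(self, k, value):
        if k not in self.dirty:                  # first write to logical page k
            free_page = max(self.disk) + 1       # stands in for "find a free page on disk"
            self.current[k] = free_page
            self.dirty.add(k)
        self.disk[self.current[k]] = value       # the old physical page stays untouched

    def read(self, k):
        return self.disk[self.current[k]]

    def commit(self):
        self.shadow = list(self.current)         # stands in for the atomic address update
        self.dirty.clear()

    def crash(self):
        self.current = list(self.shadow)         # fall back to the last committed table
        self.dirty.clear()

db = ShadowPagedDb(["a0", "b0", "c0"])
db.write(1, "b1")
db.crash()                     # crash before commit: the update disappears
after_crash = db.read(1)
db.write(1, "b2")
db.commit()
db.crash()                     # crash after commit: the update survives
after_commit = db.read(1)
print(after_crash, after_commit, len(db.disk))
```

After the run the disk holds five physical pages for three logical ones: the extra pages are no longer referenced by any page table and would have to be reclaimed.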


    Shadow Paging

    Disadvantages:
      - data fragmentation: the database becomes scattered over the disk (slow sequential access); we may need to repack to maintain fast sequential access.
      - garbage collection: after a commit, the old version of the data is not reachable (unreferenced) and is not part of the free space. We must perform periodic garbage collections to recover the lost disk space.


    Loss of Non-volatile Storage

    typically does not occur frequently

    do a periodic dump from disk to magnetic tape (?)

    recover to the point of the last dump, then follow the log to restore the database.


    Recovery with Concurrent Transactions

    the scheme depends on the concurrency-control scheme used.

    Basically, to roll back a transaction, we must undo its updates.

    situation:
      - T0 is rolled back: a data item, B, that it updated must be restored to its old value. We can use the undo information in the log for log-based recovery systems.
      - But if T1 did another update to B before T0 is rolled back, then T1's update is lost when T0 is rolled back.

    thus we require that if T updates data item B, then no other transaction may update B until T either commits or is rolled back.

    This can be ensured with a strict two-phase locking scheme (exclusive locks held until the end of a transaction).


    Recovery with Concurrent Transactions

    Transaction Rollbacks

    transaction Ti is rolled back by scanning the log backwards.
      - for every entry <Ti, Xj, V1, V2> found, the data item Xj is restored to its old value V1. (it is possible that Ti performed several updates to Xj)
      - continue the scan until <Ti, start> is found.
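The backward scan can be sketched like this, assuming update records are tuples `(T, item, old_value, new_value)`. The items X and Y and their values are invented for the example.

```python
def rollback(db, log, tid):
    """Scan the log backwards, restoring old values until the start record of tid."""
    for rec in reversed(log):
        if rec[0] != tid:
            continue                      # record of another transaction
        if rec[1:] == ("start",):
            break                         # reached the start record: done
        if len(rec) == 4:
            _, item, old_value, _ = rec
            db[item] = old_value

log = [
    ("T0", "start"),
    ("T0", "X", 10, 20),
    ("T1", "start"),
    ("T1", "Y", 5, 6),
    ("T0", "X", 20, 30),      # T0 updates X a second time
]
db = {"X": 30, "Y": 6}
rollback(db, log, "T0")
print(db)
```

Scanning backwards matters when a transaction updated the same item several times: X passes through 20 and ends at 10, its value before T0 started, while T1's update to Y is left alone.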


    Recovery with Concurrent Transactions

    Checkpoints
      - the recovery scheme is more complex with concurrent transaction execution than in the previous form. Several transactions may have been active at the last checkpoint.
      - we require that the checkpoint log entry be <checkpoint L>, where L is a list of the transactions active at the time of the checkpoint.
      - as before, it is assumed that the transactions do not perform updates to either the log or to buffer blocks during the checkpoint.


    Recovery with Concurrent Transactions

    Restart Recovery
      - initially, create two empty lists, undo-list and redo-list, for transactions requiring these operations.
      - next, scan the log backwards until the first <checkpoint L> record is found. During this scan:
          - for each <Ti, commit> found, add Ti to the redo-list.
          - for each <Ti, start> found, if Ti is not in the redo-list, then add it to the undo-list.
      - next, check the list L in the checkpoint record: for each Ti in L, if Ti is not in the redo-list then add Ti to the undo-list.


    Recovery with Concurrent Transactions

    once the two lists have been constructed:
      - Rescan the log from the most recent record backwards, performing an undo for each log record that belongs to a transaction on the undo-list (the log records for redo-list transactions are ignored). Stop the scan when <Ti, start> records have been found for every transaction in the undo-list.
      - Locate the most recent <checkpoint L> record again.
      - Scan the log forward and perform a redo for each record that belongs to a transaction on the redo-list. Ignore log records of transactions in the undo-list.
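The whole restart procedure can be sketched in three phases. This is an illustration only: records are modelled as tuples, the checkpoint record is `("checkpoint", L)`, the log is assumed to contain at least one checkpoint, and the sample log and disk state are made up.

```python
def restart_recovery(db, log):
    redo_list, undo_list = set(), set()

    # Phase 1: scan backwards to the most recent checkpoint, building both lists.
    cp = len(log) - 1
    while log[cp][0] != "checkpoint":
        rec = log[cp]
        if len(rec) == 2 and rec[1] == "commit":
            redo_list.add(rec[0])
        elif len(rec) == 2 and rec[1] == "start" and rec[0] not in redo_list:
            undo_list.add(rec[0])
        cp -= 1
    undo_list |= set(log[cp][1]) - redo_list     # transactions active at the checkpoint

    # Phase 2: undo backwards until every undo-list start record has been seen.
    pending = set(undo_list)
    for rec in reversed(log):
        if not pending:
            break
        if rec[0] in undo_list:
            if len(rec) == 2 and rec[1] == "start":
                pending.discard(rec[0])
            elif len(rec) == 4:
                db[rec[1]] = rec[2]              # restore the old value

    # Phase 3: redo forwards from the checkpoint for redo-list transactions.
    for rec in log[cp + 1:]:
        if len(rec) == 4 and rec[0] in redo_list:
            db[rec[1]] = rec[3]                  # reapply the new value
    return undo_list, redo_list

log = [
    ("T1", "start"),
    ("T1", "A", 1000, 950),
    ("T2", "start"),
    ("T2", "B", 2000, 2050),
    ("checkpoint", ["T1", "T2"]),
    ("T1", "A", 950, 900),
    ("T1", "commit"),
    ("T3", "start"),
    ("T3", "C", 700, 600),
]
db = {"A": 950, "B": 2050, "C": 600}   # assumed disk state at the crash
undo_list, redo_list = restart_recovery(db, log)
print(sorted(undo_list), sorted(redo_list), db)
```

T1 committed after the checkpoint and is redone; T2 (active at the checkpoint, never committed) and T3 (started after it, never committed) are undone, even though T2's records lie before the checkpoint.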