DBMS 2001 Notes 7: Crash Recovery

Principles of Database Management Systems

7: Crash Recovery

Pekka Kilpeläinen (after Stanford CS245 slide originals by Hector Garcia-Molina, Jeff Ullman and Jennifer Widom)


PART II of Course

• Major concern so far:
  – How to manipulate the database efficiently?
• Now shift focus to:
  – How to ensure correctness of database manipulation?

Integrity or correctness of data

• Would like data to be “accurate” or “correct” at all times

EMP
  Name    Age
  White   52
  Green   3421
  Gray    1

Integrity or consistency constraints

• Predicates that the data must satisfy
• Examples:
  - x is key of relation R
  - x → y holds in R
  - Domain(x) = {Red, Blue, Green}
  - no employee should make more than twice the average salary


Definition:

• Consistent state: satisfies all constraints

• Consistent DB: DB in a consistent state

Constraints (as we use here) may not capture “full correctness”

Example 1: Transaction constraints
• When salary is updated, new salary > old salary
• When account record is deleted, balance = 0

In general, various policies to ensure that the DB corresponds to "real world"

Obs: DB cannot be consistent always!

Example: a1 + a2 + … + an = TOT (constraint)

Deposit $100 in a2:
  a2 ← a2 + 100
  TOT ← TOT + 100

States of (a2, TOT): (50, 1000) → (150, 1000) → (150, 1100)
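A minimal sketch of the deposit above, checking the constraint after each step. The account a1 and its balance are assumptions added only so that the sum works out; the helper name `consistent` is hypothetical.

```python
def consistent(accounts, tot):
    """The constraint from the slide: a1 + a2 + ... + an = TOT."""
    return sum(accounts.values()) == tot

# a2 = 50 and TOT = 1000 as on the slide; a1 = 950 is an assumed filler account.
accounts = {"a1": 950, "a2": 50}
tot = 1000
states = [consistent(accounts, tot)]      # before the deposit

accounts["a2"] += 100                     # step 1: a2 <- a2 + 100
states.append(consistent(accounts, tot))  # constraint is violated here

tot += 100                                # step 2: TOT <- TOT + 100
states.append(consistent(accounts, tot))

print(states)  # [True, False, True]
```

The middle state is unavoidable: the two assignments cannot happen at the same instant, so consistency can only be required before and after the whole deposit.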

Transaction: collection of actions that preserve consistency (≈ process)

Consistent DB → T → Consistent DB’

SQL: each query or modification statement
Embedded SQL: sequence of DB operations up to COMMIT or ROLLBACK ("abort")

Big assumption:

If transaction T starts with a consistent state + executes in isolation
⇒ T leaves a consistent state

Correctness (informally)

• If we stop running transactions, DB left consistent
• Each transaction sees a consistent DB

How can constraints be violated?

• Transaction bug
• DBMS bug
• Hardware failure
  e.g., disk crash alters balance of account
• Simultaneous transactions accessing shared data
  e.g.: T1: give 10% raise to programmers
        T2: change programmers → systems analysts

How can we prevent/fix violations?

• This lecture: due to system failures only [Chapter 8]
• Chapter 9: due to concurrency only
• Chapter 10: due to failures and concurrency

Will not consider:

• How to write correct transactions
• How to write correct DBMS
• Constraint checking & repair

That is, solutions studied here do not need to know constraints

Failure Model

Events
  – Desired
  – Undesired
      – Expected
      – Unexpected

Context of failure model

CPU (processor), M (memory), D (disk)
• memory: volatile, disappears when power off
• disk: nonvolatile

Desired events: see product manuals…

Undesired expected events: System failures
  - SW errors, power loss
    -> memory and transaction state lost
  that’s it!!

Undesired Unexpected: Everything else!

Undesired Unexpected: Everything else!

Examples:
• Disk data is lost
• Memory lost without CPU halt
• Aliens attack and destroy the building…
  (can protect against this by archive backups)

Is this model reasonable?

Approach: Add low level checks + redundancy to increase probability that the model holds

E.g.,
• Replicate disk storage (stable store)
• Memory parity
• CPU checks

Interacting Address Spaces: Storage hierarchy

Memory (buffer copy of x, transaction variable t) and Disk (x):
Input(x) and Output(x) move blocks between disk and memory;
Read(x,t) and Write(x,t) move values between the buffer and t.

• Input(x): block with x from disk to buffer
• Output(x): block with x from buffer to disk
• Read(x,t): transaction variable t := x (performs Input(x) if needed)
• Write(x,t): x (in buffer) := t
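The four primitives can be sketched as a toy two-level store. This is only an illustration under simplifying assumptions: each database element is its own block, and the class name `Storage` and the method `crash` are hypothetical additions.

```python
class Storage:
    """Toy two-level store; each database element is its own block (an assumption)."""

    def __init__(self, disk):
        self.disk = dict(disk)   # nonvolatile
        self.buffer = {}         # volatile buffer pool in memory

    def input(self, x):
        """Input(x): copy the block holding x from disk to the buffer."""
        self.buffer[x] = self.disk[x]

    def output(self, x):
        """Output(x): copy the block holding x from the buffer to disk."""
        self.disk[x] = self.buffer[x]

    def read(self, x):
        """Read(x, t): return the buffered value of x, doing Input(x) if needed."""
        if x not in self.buffer:
            self.input(x)
        return self.buffer[x]

    def write(self, x, t):
        """Write(x, t): set x in the buffer to t; the disk is not touched."""
        self.buffer[x] = t

    def crash(self):
        """System failure: memory contents are lost, the disk survives."""
        self.buffer.clear()


store = Storage({"A": 8, "B": 8})
t = store.read("A")
store.write("A", t * 2)
print(store.buffer["A"], store.disk["A"])  # 16 8
store.output("A")
print(store.disk["A"])  # 16
```

Note that Write changes only the buffer; a value becomes durable only when Output is performed.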

Note on "database elements" X:

• Anything that can have a value and be accessed or modified by transactions:
  – a relation (or extent of an object class)
  – a disk block
  – a record
• Easiest to use blocks as database elements

Key problem: Unfinished transaction

Example: Constraint: A=B (*)

T1: A ← A × 2
    B ← B × 2

(*) simplification; a more realistic example: the sum of loan balances = the total debt of a bank

T1: Read(A,t); t ← t×2;
    Write(A,t);
    Read(B,t); t ← t×2;
    Write(B,t);
    Output(A);
    Output(B);

memory: A: 8 → 16, B: 8 → 16
disk:   A: 8 → 16, B: 8

failure! (after Output(A), before Output(B)) A != B
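The failure can be reproduced with a small simulation. This is a sketch, not part of the original notes: `run_t1` and its `crash_after` parameter (the number of steps executed before the crash) are hypothetical names.

```python
def run_t1(disk, crash_after=None):
    """Run T1 (A <- A*2; B <- B*2) on a dict-based 'disk', optionally crashing.

    crash_after is a hypothetical knob: the number of steps executed before the crash.
    """
    buffer = {}
    steps = [
        lambda: buffer.__setitem__("A", disk["A"] * 2),  # Read(A,t); t <- t*2; Write(A,t)
        lambda: buffer.__setitem__("B", disk["B"] * 2),  # Read(B,t); t <- t*2; Write(B,t)
        lambda: disk.__setitem__("A", buffer["A"]),      # Output(A)
        lambda: disk.__setitem__("B", buffer["B"]),      # Output(B)
    ]
    for i, step in enumerate(steps):
        if crash_after is not None and i == crash_after:
            break          # crash: the buffer (memory) is simply lost
        step()
    return disk


crashed = run_t1({"A": 8, "B": 8}, crash_after=3)
finished = run_t1({"A": 8, "B": 8})
print(crashed)   # {'A': 16, 'B': 8}
print(finished)  # {'A': 16, 'B': 16}
```

A crash after three steps leaves the disk with A != B, and nothing on disk says which transaction was responsible.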

• Need atomicity: either execute all actions of a transaction or none at all

One solution: undo logging (immediate modification)

Log: sequence of log records, recording actions of transactions Ti:

<Ti, start>: transaction has begun
<Ti, commit>: completed successfully
<Ti, abort>: could not complete successfully
<Ti, X, v>: Ti has updated element X, whose old value was v

Undo logging (Immediate modification)

T1: Read(A,t); t ← t×2;
    Write(A,t);
    Read(B,t); t ← t×2;
    Write(B,t);
    Output(A);
    Output(B);

constraint: A=B

memory: A: 8 → 16, B: 8 → 16
disk:   A: 8 → 16, B: 8 → 16
log:    <T1, start>
        <T1, A, 8>
        <T1, B, 8>
        <T1, commit>

One “complication”

• Log is first written in memory
• Not written to disk on every action

memory:     A: 8 → 16, B: 8 → 16
            Log: <T1,start> <T1, A, 8> <T1, B, 8>
DB (disk):  A: 8 → 16, B: 8
Log (disk): (nothing written yet)

BAD STATE #1

One “complication”

• Log is first written in memory
• Not written to disk on every action

memory:     A: 8 → 16, B: 8 → 16
            Log: <T1,start> <T1, A, 8> <T1, B, 8> <T1, commit>
DB (disk):  A: 8 → 16, B: 8
Log (disk): ... <T1, B, 8> <T1, commit>

BAD STATE #2


Undo logging rules

(1) Generate an undo log record for every update action (with the old value)

(2) Before modifying element X on disk, any log records for X must be flushed to disk (write-ahead logging)

(3) All WRITEs of transaction T must be OUTPUT to disk before <T, commit> is flushed to log
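One possible ordering of actions that satisfies the three rules, sketched for the transaction T1: A ← A×2; B ← B×2. The function name, the tuple encoding of log records, and the in-memory/on-disk log lists are illustrative assumptions.

```python
def run_t1_with_undo_log(disk, log_on_disk):
    """Run T1 (A <- A*2; B <- B*2) following the three undo logging rules."""
    buffer, log_in_memory = {}, []

    def flush_log():
        # Move buffered log records to the on-disk log, preserving their order.
        log_on_disk.extend(log_in_memory)
        log_in_memory.clear()

    log_in_memory.append(("T1", "start"))
    for x in ("A", "B"):
        old = disk[x]
        log_in_memory.append(("T1", x, old))  # rule 1: log the old value
        buffer[x] = old * 2                   # Write(x, t) touches the buffer only

    for x in ("A", "B"):
        flush_log()                           # rule 2: log records reach disk first
        disk[x] = buffer[x]                   # Output(x)

    log_in_memory.append(("T1", "commit"))    # rule 3: commit only after all outputs
    flush_log()


disk, log = {"A": 8, "B": 8}, []
run_t1_with_undo_log(disk, log)
print(disk)  # {'A': 16, 'B': 16}
print(log)   # [('T1', 'start'), ('T1', 'A', 8), ('T1', 'B', 8), ('T1', 'commit')]
```

With this ordering neither bad state can arise: every change on disk has its old value in the on-disk log, and the commit record appears only once both outputs are done.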

Recovery rules: Undo logging

• For every Ti with <Ti, start> in log:
  - If <Ti,commit> or <Ti,abort> in log, do nothing
  - Else for all <Ti, X, v> in log:
      write(X, v)
      output(X)
    Write <Ti, abort> to log

IS THIS CORRECT??

Recovery with Undo logging

T1: A ← A+1; B ← B+1;
T2: A ← A+1; B ← B+1;

constraint: A=B

DB on disk: A: 10 → 11, B: 10
log: <T1, start>
     <T1, A, 10>
     <T2, start>
     <T2, A, 11>

CRASH => A should be restored to 10, not 11
(undoing T1 first and T2 second would leave A = 11, so the order of undo matters)

Undo logging Recovery: (more precisely)

(1) Let S = set of transactions with <Ti, start> in log, but no <Ti, commit> (or <Ti, abort>) record in log
(2) For each <Ti, X, v> in log, in reverse order (latest → earliest) do:
    - if Ti ∈ S then
        - write(X, v)
        - output(X)
(3) For each Ti ∈ S do
    - write <Ti, abort> to log
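The three steps can be sketched as follows. The tuple encoding of log records and the name `undo_recover` are assumptions made for illustration; write(X, v) and output(X) are collapsed into a single assignment to a dict standing in for the disk.

```python
def undo_recover(log, disk):
    """Undo-logging recovery. Records: (T, 'start'|'commit'|'abort') or (T, X, old)."""
    started = {r[0] for r in log if r[1:] == ("start",)}
    finished = {r[0] for r in log if r[1:] in (("commit",), ("abort",))}
    incomplete = started - finished              # step (1): the set S

    for rec in reversed(log):                    # step (2): latest -> earliest
        if len(rec) == 3 and rec[0] in incomplete:
            _, x, old = rec
            disk[x] = old                        # write(X, v); output(X)

    for t in sorted(incomplete):                 # step (3)
        log.append((t, "abort"))
    return disk


# The two-transaction example: both T1 and T2 are incomplete at the crash.
log = [("T1", "start"), ("T1", "A", 10), ("T2", "start"), ("T2", "A", 11)]
disk = {"A": 11, "B": 10}   # assumed disk contents at the time of the crash
undo_recover(log, disk)
print(disk)  # {'A': 10, 'B': 10}
```

Because the log is scanned backwards, the oldest logged value of A is restored last and therefore wins.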

What if system fails during recovery?

No problem! Undo is idempotent
- Can repeat (unfinished) undo, result remains the same

To discuss:

• Redo logging
• Undo/redo logging, why both?
• Checkpointing
• Media failures

Redo logging (deferred modification)

T1: Read(A,t); t ← t×2; Write(A,t);
    Read(B,t); t ← t×2; Write(B,t);
    Output(A); Output(B)

memory: A: 8 → 16, B: 8 → 16
DB:     A: 8 → 16 (output), B: 8
LOG:    <T1, start>
        <T1, A, 16>
        <T1, B, 16>
        <T1, commit>

Redo logging rules

(1) For every disk update, generate a redo log record (with new value)

(2) Before modifying DB element X on disk, all log records for the transaction that modified X (including COMMIT) must be flushed to disk

Recovery rules: Redo logging

• For every Ti with <Ti, commit> in log:
  – For all <Ti, X, v> in log:
      Write(X, v)
      Output(X)

This time, the last updates should remain in effect

Recovery with Redo Logging: (more precisely)

(1) Let S = set of transactions with <Ti, commit> in log

(2) For each <Ti, X, v> in log, in forward order (earliest → latest) do:
    - if Ti ∈ S then
        Write(X, v)
        Output(X)
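A sketch of redo recovery under the same illustrative encoding (tuples for log records, a dict for the disk); the name `redo_recover` and the record for T2 are hypothetical.

```python
def redo_recover(log, disk):
    """Redo-logging recovery. Records: (T, 'start'|'commit'|'abort') or (T, X, new)."""
    committed = {r[0] for r in log if r[1:] == ("commit",)}   # step (1): the set S
    for rec in log:                                           # step (2): earliest -> latest
        if len(rec) == 3 and rec[0] in committed:
            _, x, new = rec
            disk[x] = new                                     # Write(X, v); Output(X)
    return disk


log = [
    ("T1", "start"), ("T1", "A", 16), ("T1", "B", 16), ("T1", "commit"),
    ("T2", "start"), ("T2", "A", 99),          # T2 never committed (assumed example)
]
disk = {"A": 8, "B": 8}
redo_recover(log, disk)
print(disk)  # {'A': 16, 'B': 16}
```

Updates of uncommitted transactions are ignored: under the redo rules they can never have reached the disk.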

Recovery is very, very SLOW!

Redo log: First Record (1 year ago) ... ... ... Last Record, Crash

T1 wrote A,B; committed a year ago
--> STILL, need to redo after crash!!

Solution: Checkpointing (for redo-logging, simple version)

Periodically:
(1) Stop accepting new transactions
(2) Wait for all transactions to finish (commit/abort)
(3) Flush all log records to disk (log)
(4) Flush all buffers to disk (DB)
(5) Write & flush a <CKPT> record on disk (log)
(6) Resume transaction processing

Example: what to do at recovery?

Redo log (on disk):
... <T1,A,16> ... <T1,commit> ... Checkpoint ... <T2,B,17> ... <T2,commit> ... <T3,C,21> ... Crash

Results of transactions completed prior to the checkpoint are stored permanently on disk

-> Redo write(B, 17), output(B)
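With a simple (quiescent) checkpoint, recovery only needs the part of the log after the last checkpoint record. A sketch, where the string "CKPT" as a marker, the function name, and the starting values of B and C are assumptions:

```python
def redo_recover_from_checkpoint(log, disk):
    """Redo recovery that scans only the log suffix after the last 'CKPT' marker."""
    last_ckpt = max((i for i, r in enumerate(log) if r == "CKPT"), default=-1)
    tail = log[last_ckpt + 1:]
    committed = {r[0] for r in tail if r[1:] == ("commit",)}
    for rec in tail:                     # forward order, earliest first
        if len(rec) == 3 and rec[0] in committed:
            _, x, new = rec
            disk[x] = new
    return disk


log = [
    ("T1", "A", 16), ("T1", "commit"),
    "CKPT",
    ("T2", "B", 17), ("T2", "commit"),
    ("T3", "C", 21),                     # T3 did not commit before the crash
]
disk = {"A": 16, "B": 8, "C": 5}         # B = 8 and C = 5 are assumed old values
redo_recover_from_checkpoint(log, disk)
print(disk)  # {'A': 16, 'B': 17, 'C': 5}
```

T1 is skipped because its effects were flushed at the checkpoint, T2 is redone, and T3 is ignored because it never committed.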

Drawbacks:

• Undo logging:
  – System forced to update disk at the end of transactions (may increase disk I/O)
  – cannot redo recent updates to refresh a backup copy of the DB
• Redo logging:
  – need to keep all modified blocks in memory until commit (may lead to shortage of buffer pool)


Solution: Undo/redo logging!

Update record <Ti, X, Old-X-value, New-X-value>

Provides more flexibility to order actions

Rules for undo/redo logging

(1) Log record <Ti, X, old, new> flushed to disk before writing X to disk

(2) Flush the log at commit
  – Element X can be written to disk either before or after the commit of Ti

Undo/Redo Recovery Policy

• To recover from a crash,
  – redo updates by any committed transactions (in forward-order, earliest first)
  – undo updates by any incomplete transactions (in backward-order, latest first)
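The policy can be sketched with update records of the form (T, X, old, new). The encoding and the name `undo_redo_recover` are illustrative; treating transactions that have an abort record as already undone is an assumption carried over from the undo-logging recovery rules.

```python
def undo_redo_recover(log, disk):
    """Undo/redo recovery. Update records are (T, X, old, new)."""
    committed = {r[0] for r in log if r[1:] == ("commit",)}
    aborted = {r[0] for r in log if r[1:] == ("abort",)}   # assumed already undone

    for rec in log:                      # redo committed, earliest first
        if len(rec) == 4 and rec[0] in committed:
            disk[rec[1]] = rec[3]        # install the new value

    for rec in reversed(log):            # undo incomplete, latest first
        if len(rec) == 4 and rec[0] not in committed | aborted:
            disk[rec[1]] = rec[2]        # restore the old value
    return disk


log = [
    ("T1", "start"), ("T1", "A", 8, 16), ("T1", "commit"),
    ("T2", "start"), ("T2", "B", 8, 20),           # T2 incomplete at the crash
]
disk = {"A": 8, "B": 20}   # assumed: T1's A not yet output, T2's B already output
undo_redo_recover(log, disk)
print(disk)  # {'A': 16, 'B': 8}
```

Either situation on disk is handled, which is the flexibility gained from logging both old and new values.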

Problem with Simple Checkpointing

• Simple checkpointing stops the system from accepting new transactions
  ⇒ system effectively shuts down
• Solution: "nonquiescent" checkpointing
  – accepts new transactions during a checkpoint

Nonquiescent Checkpointing

• Slightly different for various logging policies
• Rules for undo/redo logging:
  – Write log record (and flush log) <START CKPT T1, …, Tk>, where T1, …, Tk are the active transactions
  – Flush to disk all dirty buffers which contain changed database elements (corresponding log records first!)
    => Effect of writes prior to the chkpt made permanent
  – Write & flush log record <END CKPT>

Non-quiescent checkpoint

LOG: ... <start ckpt T1, T2, …> ... <end ckpt> ...

[Figure labels: "for undo"; "dirty buffer pool pages flushed"]

Examples: what to do at recovery time?

LOG: ... <T1, X, a, a'> ... <Ckpt T1> ... <Ckpt end> ... <T1, Y, b, b'> ...

no T1 commit

Undo T1 (restore Y to b and X to a)

Example

LOG: ... <T1, a> ... <ckpt-s T1> ... <T1, b> ... <ckpt-end> ... <T1, c> ... <T1 cmt> ...

Redo T1: (redo update of elements to b and to c)

Recovery process:

• Backwards pass (end of log → latest checkpoint start)
  – construct set S of committed transactions
  – undo actions of transactions not in S
• Undo pending transactions
  – follow undo chains for transactions in (checkpoint active list) - S
• Forward pass (latest checkpoint start → end of log)
  – redo actions of transactions in set S

[Figure: log with the start-checkpoint record marked; backward pass runs from the end of the log to it, forward pass from it to the end]

Media failure (loss of non-volatile storage)

A: 16 on disk, then: Disk crash!

Solution: Make copies of data!

Solution a: Use mirrored or redundant disks

• Mirroring: copies on separate disks (D1, D2, D3)
  – Output(X) --> three outputs
  – Input(X) --> three inputs + vote

Example #2: Redundant writes, Single reads

• Keep N copies on separate disks
• Output(X) --> N outputs
• Input(X) --> Input one copy
  - if ok, done
  - else try another one

Assumes bad data can be detected
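A sketch of redundant writes with single reads. The notes only assume that bad data can be detected; using a CRC32 checksum stored next to each copy is one possible way to do that and is an assumption here, as are the function names.

```python
import zlib


def output_copies(disks, x, value: bytes):
    """Output(X): write the value plus a CRC32 checksum to every one of the N disks."""
    for d in disks:
        d[x] = (value, zlib.crc32(value))


def input_one_good_copy(disks, x):
    """Input(X): return the first copy whose checksum verifies; else try the next disk."""
    for d in disks:
        value, checksum = d.get(x, (b"", None))
        if checksum == zlib.crc32(value):
            return value
    raise IOError(f"no readable copy of {x!r}")


disks = [{}, {}, {}]                       # N = 3 dict-based stand-ins for disks
output_copies(disks, "A", b"16")
disks[0]["A"] = (b"61", disks[0]["A"][1])  # simulate corruption of the first copy
result = input_one_good_copy(disks, "A")
print(result)  # b'16'
```

Unlike mirroring with a vote, a read normally touches only one disk; the extra copies are consulted only when the checksum fails.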

Solution b: Archiving

backup database, active database, log

• If active database is lost,
  – restore active database from backup (archive)
  – bring up-to-date using redo entries in log
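The two recovery steps can be sketched as below, assuming redo records of the form (T, X, new) written since the backup was taken. The function name and the example values are hypothetical.

```python
def media_recover(backup, redo_log):
    """Rebuild the active database from an archive copy plus the redo log written since."""
    db = dict(backup)                          # restore from backup (archive)
    committed = {r[0] for r in redo_log if r[1:] == ("commit",)}
    for rec in redo_log:                       # bring up to date, earliest first
        if len(rec) == 3 and rec[0] in committed:
            db[rec[1]] = rec[2]
    return db


backup = {"A": 8, "B": 8}
redo_log = [("T1", "A", 16), ("T1", "commit"), ("T2", "B", 17)]  # T2 uncommitted
db = media_recover(backup, redo_log)
print(db)  # {'A': 16, 'B': 8}
```

The backup itself is left untouched, so the procedure can be repeated if recovery is interrupted.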

When can we discard the log?

[Figure: log on a time axis with three markers: db dump, last needed undo, checkpoint. The log before the db dump is not needed for media recovery; the log before the last needed undo is not needed for undo after system failure; the log before the checkpoint is not needed for redo after system failure.]

Summary

• Consistency of data
• Recovery from system failures
  – Logging (undo/redo/undo-redo), checkpoints
  – Recovery: using log to reconstruct the DB
• Recovery from media failures
  – reducing risk through redundancy
  – backups / archiving
• Another source of problems:
  – Concurrent transactions that access shared DB elements .... NEXT