database system principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 cs 245 notes 08 1 database...

29
1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity or correctness of data • Would like data to be “accurate” or “correct” at all times EMP Name White Green Gray Age 52 3421 1

Upload: others

Post on 04-Jun-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

1

CS 245 Notes 08 1

Database System Principles

Failure Recovery

Hector Garcia-Molina

CS 245 Notes 08 2

Integrity or correctness of data

• Would like data to be “accurate” or“correct” at all times

EMP Name

WhiteGreenGray

Age

523421

1

Page 2: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

2

CS 245 Notes 08 3

Integrity or consistency constraints

• Predicates data must satisfy• Examples:

- x is key of relation R- x → y holds in R- Domain(x) = {Red, Blue, Green}− α is valid index for attribute x of R- no employee should make more than

twice the average salary

CS 245 Notes 08 4

Definition:

• Consistent state: satisfies all constraints• Consistent DB: DB in consistent state

Page 3: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

3

CS 245 Notes 08 5

Constraints (as we use here) may not capture “full correctness”

Example 1 Transaction constraints• When salary is updated,

new salary > old salary• When account record is deleted,

balance = 0

CS 245 Notes 08 6

Note: could be “emulated” by simpleconstraints, e.g.,

account Acct # …. balance deleted?

Page 4: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

4

CS 245 Notes 08 7

Example 2 Database should reflectreal world

DBReality

Constraints (as we use here) may not capture “full correctness”

CS 245 Notes 08 8

in any case, continue with constraints...

Observation: DB cannot be consistent always!

Example: a1 + a2 +…. an = TOT (constraint)Deposit $100 in a2: a2 ← a2 + 100

TOT ← TOT + 100

Page 5: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

5

CS 245 Notes 08 9

a2

TOT

..50..

1000

..150

..1000

..150

..1100

Example: a1 + a2 +…. an = TOT (constraint)Deposit $100 in a2: a2 ← a2 + 100

TOT ← TOT + 100

CS 245 Notes 08 10

Transaction: collection of actions that preserve consistency

Consistent DB Consistent DB’T

Page 6: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

6

CS 245 Notes 08 11

Big assumption:

If T starts with consistent state +T executes in isolation

⇒ T leaves consistent state

CS 245 Notes 08 12

Correctness (informally)

• If we stop running transactions,DB left consistent

• Each transaction sees a consistent DB

Page 7: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

7

CS 245 Notes 08 13

How can constraints be violated?

• Transaction bug• DBMS bug• Hardware failure

e.g., disk crash alters balance of account

• Data sharinge.g.: T1: give 10% raise to programmers

T2: change programmers ⇒ systems analysts

CS 245 Notes 08 14

How can we prevent/fix violations?

• Chapter 17: due to failures only• Chapter 18: due to data sharing only• Chapter 19: due to failures and sharing

Page 8: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

8

CS 245 Notes 08 15

Will not consider:

• How to write correct transactions• How to write correct DBMS• Constraint checking & repair

That is, solutions studied here do not need to know constraints

CS 245 Notes 08 16

Chapter 8 Recovery

• First order of business:Failure Model

Page 9: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

9

CS 245 Notes 08 17

Events DesiredUndesired Expected

Unexpected

CS 245 Notes 08 18

Our failure model

processor

memory disk

CPU

M D

Page 10: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

10

CS 245 Notes 08 19

Desired events: see product manuals….

Undesired expected events:System crash

- memory lost- cpu halts, resets

Undesired Unexpected: Everything else!

that’s it!!

CS 245 Notes 08 20

Examples:• Disk data is lost• Memory lost without CPU halt• CPU implodes wiping out universe….

Undesired Unexpected: Everything else!

Page 11: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

11

CS 245 Notes 08 21

Is this model reasonable?

Approach: Add low level checks +redundancy to increaseprobability model holds

E.g., Replicate disk storage (stable store)Memory parityCPU checks

CS 245 Notes 08 22

Second order of business:

Storage hierarchy

Memory Disk

x x

Page 12: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

12

CS 245 Notes 08 23

Operations:

• Input (x): block with x → memory• Output (x): block with x → disk

• Read (x,t): do input(x) if necessaryt ← value of x in block

• Write (x,t): do input(x) if necessaryvalue of x in block ← t

CS 245 Notes 08 24

Key problem Unfinished transaction

Example Constraint: A=BT1: A ← A × 2

B ← B × 2

Page 13: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

13

CS 245 Notes 08 25

T1: Read (A,t); t ← t×2Write (A,t);Read (B,t); t ← t×2Write (B,t);Output (A);Output (B);

A: 8B: 8

A: 8B: 8

memory disk

1616

16

failure!

CS 245 Notes 08 26

• Need atomicity: execute all actions ofa transaction or noneat all

Page 14: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

14

CS 245 Notes 08 27

One solution: undo logging (immediatemodification)

due to: Hansel and Gretel, 782 AD

• Improved in 784 AD to durable undo logging

CS 245 Notes 08 28

T1: Read (A,t); t ← t×2 A=BWrite (A,t);Read (B,t); t ← t×2Write (B,t);Output (A);Output (B);

A:8B:8

A:8B:8

memory disk log

Undo logging (Immediate modification)

1616

<T1, start><T1, A, 8>

<T1, commit>16 <T1, B, 8>16

Page 15: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

15

CS 245 Notes 08 29

One “complication”

• Log is first written in memory• Not written to disk on every action

memoryDB

Log

A: 8 16B: 8 16Log:<T1,start><T1, A, 8><T1, B, 8>

A: 8B: 8

16BAD STATE

# 1

CS 245 Notes 08 30

One “complication”

• Log is first written in memory• Not written to disk on every action

memoryDB

Log

A: 8 16B: 8 16Log:<T1,start><T1, A, 8><T1, B, 8><T1, commit>

A: 8B: 8

16BAD STATE

# 2

<T1, B, 8><T1, commit>

...

Page 16: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

16

CS 245 Notes 08 31

Undo logging rules

(1) For every action generate undo logrecord (containing old value)

(2) Before x is modified on disk, logrecords pertaining to x must beon disk (write ahead logging: WAL)

(3) Before commit is flushed to log, allwrites of transaction must bereflected on disk

CS 245 Notes 08 32

Recovery rules: Undo logging

• For every Ti with <Ti, start> in log:- If <Ti,commit> or <Ti,abort>

in log, do nothing- Else For all <Ti, X, v> in log:

write (X, v)output (X )

Write <Ti, abort> to log

➽ IS THIS CORRECT??

Page 17: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

17

CS 245 Notes 08 33

Recovery rules: Undo logging

(1) Let S = set of transactions with<Ti, start> in log, but no<Ti, commit> (or <Ti, abort>) record in log

(2) For each <Ti, X, v> in log,

in reverse order (latest → earliest) do:

- if Ti ∈ S then - write (X, v)- output (X)

(3) For each Ti ∈ S do- write <Ti, abort> to log

CS 245 Notes 08 34

What if failure during recovery?No problem! ✏ Undo idempotent

Page 18: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

18

CS 245 Notes 08 35

To discuss:

• Redo logging• Undo/redo logging, why both?• Real world actions• Checkpoints• Media failures

CS 245 Notes 08 36

Redo logging (deferred modification)

T1: Read(A,t); t t×2; write (A,t);Read(B,t); t t×2; write (B,t);Output(A); Output(B)

A: 8B: 8

A: 8B: 8

memory DB LOG

1616

<T1, start><T1, A, 16><T1, B, 16>

<T1, commit>

output16

Page 19: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

19

CS 245 Notes 08 37

Redo logging rules

(1) For every action, generate redo logrecord (containing new value)

(2) Before X is modified on disk (DB),all log records for transaction thatmodified X (including commit) mustbe on disk

(3) Flush log at commit

CS 245 Notes 08 38

• For every Ti with <Ti, commit> in log:– For all <Ti, X, v> in log:

Write(X, v)Output(X)

Recovery rules: Redo logging

➽ IS THIS CORRECT??

Page 20: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

20

CS 245 Notes 08 39

(1) Let S = set of transactions with<Ti, commit> in log

(2) For each <Ti, X, v> in log, in forwardorder (earliest → latest) do:- if Ti ∈ S then Write(X, v)

Output(X) optional

Recovery rules: Redo logging

CS 245 Notes 08 40

Recovery is very, very SLOW !Redo log:

First T1 wrote A,B LastRecord Committed a year ago Record(1 year ago) --> STILL, Need to redo after crash!!

... ... ...

Crash

Page 21: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

21

CS 245 Notes 08 41

Solution: Checkpoint (simple version)

Periodically:(1) Do not accept new transactions(2) Wait until all transactions finish(3) Flush all log records to disk (log)(4) Flush all buffers to disk (DB) (do not discard buffers)

(5) Write “checkpoint” record on disk (log)(6) Resume transaction processing

CS 245 Notes 08 42

Example: what to do at recovery?

Redo log (disk):

<T1

,A,1

6>

<T1

,com

mit>

Chec

kpoi

nt

<T2

,B,1

7>

<T2

,com

mit>

<T3

,C,2

1>

Crash... ... ... ... ... ...

Page 22: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

22

CS 245 Notes 08 43

Key drawbacks:

• Undo logging: cannot bring backup DBcopies up to date

• Redo logging: need to keep all modified blocks in memory until commit

CS 245 Notes 08 44

Solution: undo/redo logging!

Update ⇒ <Ti, Xid, New X val, Old X val>page X

Page 23: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

23

CS 245 Notes 08 45

Rules

• Page X can be flushed before orafter Ti commit

• Log record flushed before corresponding updated page (WAL)

• Flush at commit (log only)

CS 245 Notes 08 46

Non-quiesce checkpoint

LOG

forundo dirty buffer

pool pagesflushed

Start-ckptactive TR:Ti,T2,...

endckpt

.........

...

Page 24: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

24

CS 245 Notes 08 47

Examples what to do at recovery time?

no T1 commit

LOG

T1,-a ... Ckpt

T1... Ckpt

end ... T1-b...

➽ Undo T1 (undo a,b)

CS 245 Notes 08 48

Example

LOG

... T1

a ... ... T1

b ... ... T1

c ... T1

cmt ...ckpt-end

ckpt-sT1

➽ Redo T1: (redo b,c)

Page 25: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

25

CS 245 Notes 08 49

Recovery process:• Backwards pass (end of log ➜ latest checkpoint start)

– construct set S of committed transactions– undo actions of transactions not in S

• Undo pending transactions– follow undo chains for transactions in

(checkpoint active list) - S

• Forward pass (latest checkpoint start ➜ end of log)

– redo actions of S transactions

backward pass

forward passstart

check-point

CS 245 Notes 08 50

Real world actions

E.g., dispense cash at ATMTi = a1 a2 …... aj …... an

$

Page 26: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

26

CS 245 Notes 08 51

Solution

(1) execute real-world actions after commit(2) try to make idempotent

CS 245 Notes 08 52

ATMGive$$(amt, Tid, time)

$

give(amt)

lastTid:time:

Page 27: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

27

CS 245 Notes 08 53

Media failure (loss of non-volatilestorage)

A: 16

Solution: Make copies of data!

CS 245 Notes 08 54

Example 1 Triple modular redundancy

• Keep 3 copies on separate disks• Output(X) --> three outputs• Input(X) --> three inputs + vote

X1 X2 X3

Page 28: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

28

CS 245 Notes 08 55

Example #2 Redundant writes,Single reads

• Keep N copies on separate disks• Output(X) --> N outputs• Input(X) --> Input one copy

- if ok, done- else try another one

➳ Assumes bad data can be detected

CS 245 Notes 08 56

Example #3: DB Dump + Log

backupdatabase

activedatabase

log

• If active database is lost,– restore active database from backup– bring up-to-date using redo entries in log

Page 29: Database System Principlesfolk.uio.no/inf212/handouts/uke18_1.pdf · 1 CS 245 Notes 08 1 Database System Principles Failure Recovery Hector Garcia-Molina CS 245 Notes 08 2 Integrity

29

CS 245 Notes 08 57

When can log be discarded?

check-point

dbdump

lastneededundo

not needed formedia recovery

not needed for undoafter system failure

not needed forredo after system failure

log

time

CS 245 Notes 08 58

Summary

• Consistency of data• One source of problems: failures

- Logging- Redundancy

• Another source of problems:Data Sharing..... next