1 Module 6 Log Manager COP 6730

Upload: candice-oneal

Post on 27-Dec-2015


TRANSCRIPT

Page 1: 1 Module 6 Log Manager COP 6730. 2 Log Manager Log knows everything. It is the temporal database –The online durable data are just their current versions

Module 6: Log Manager

COP 6730


Log Manager

• The log knows everything. It is the temporal database.

– The online durable data are just their current versions.

– The log has their complete histories.

• Uses of the log:

– Transaction recovery

– Auditing

– Performance analysis

– Accounting

• The log can easily become a performance problem, and it can get very large. Intelligent algorithms are needed.


Log Sequence Numbers (LSNs)

• A log file consists of a sequence of log records.

• Each log record has a unique identifier, or key, called its log sequence number (LSN)

• The LSN is composed of the record’s file number and the relative byte offset of the record within that file.

typedef struct
{
    long file; /* number of log file in log directory */
    long rba;  /* relative byte address of (first byte) record in file */
} LSN;


LSN Property

Property: If log record A for an object is created “after” log record B for that object, then

LSN (A) > LSN (B)

This monotonicity property is used by the write-ahead log (WAL) protocol.

Note: If two objects send their log records to different logs, then their LSNs are incomparable.
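The ordering behind the LSN property can be sketched as a comparison function: compare the file numbers first, then the byte offsets within the file. The helper name `lsn_compare` is an assumption made for illustration; the slides define only the `LSN` struct. The comparison is meaningful only for LSNs drawn from the same log.

```c
#include <assert.h>

/* LSN layout as given on the LSN slide. */
typedef struct {
    long file; /* number of log file in log directory */
    long rba;  /* relative byte address of the record's first byte */
} LSN;

/* Hypothetical helper (not from the slides): returns a negative value,
 * zero, or a positive value when a is before, equal to, or after b. */
static int lsn_compare(LSN a, LSN b)
{
    if (a.file != b.file)
        return (a.file < b.file) ? -1 : 1;
    if (a.rba != b.rba)
        return (a.rba < b.rba) ? -1 : 1;
    return 0;
}
```

A record written later in the same file, or anywhere in a later file, compares greater than an earlier one.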


Value Logging

Each log record contains the old and the new states of the object.

UNDO Program: set the object to the old state.

REDO Program: set the object to the new state.
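A minimal sketch of value logging for a single integer object follows. The record layout and the names `ValueLogRecord`, `do_update`, `undo`, and `redo` are assumptions for illustration, not definitions from the slides.

```c
#include <assert.h>

/* Hypothetical value-log record for one integer object. */
typedef struct {
    int *object;    /* address of the logged object */
    int  old_value; /* state before the update */
    int  new_value; /* state after the update */
} ValueLogRecord;

/* Apply the update and return the log record that describes it. */
static ValueLogRecord do_update(int *object, int new_value)
{
    ValueLogRecord rec = { object, *object, new_value };
    *object = new_value;
    return rec;
}

/* UNDO: set the object to the old state. */
static void undo(const ValueLogRecord *rec) { *rec->object = rec->old_value; }

/* REDO: set the object to the new state. */
static void redo(const ValueLogRecord *rec) { *rec->object = rec->new_value; }
```

Because both callbacks simply overwrite the object with a stored value, applying either one twice leaves the same result as applying it once.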


Logical Logging

• Value logging is often called physical logging because it records the physical addresses and values of objects.

• Logical (or operation) logging records the name of an UNDO-REDO function and its parameters.
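As a rough illustration of logical logging, the sketch below logs an operation name and one parameter instead of before and after values. The `"add"` operation and the names `LogicalLogRecord`, `logical_redo`, and `logical_undo` are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical logical log record: an operation name plus its parameter. */
typedef struct {
    const char *op; /* name of the logged operation, e.g. "add" */
    long arg;       /* parameter of the operation */
} LogicalLogRecord;

/* REDO re-executes the named operation. Returns 0, or -1 if the name is unknown. */
static int logical_redo(long *object, const LogicalLogRecord *rec)
{
    if (strcmp(rec->op, "add") == 0) { *object += rec->arg; return 0; }
    return -1;
}

/* UNDO executes the inverse of the named operation. */
static int logical_undo(long *object, const LogicalLogRecord *rec)
{
    if (strcmp(rec->op, "add") == 0) { *object -= rec->arg; return 0; }
    return -1;
}
```

Unlike restoring a stored value, replaying `"add"` twice changes the result, so in this sketch each logged operation must be redone or undone at most once.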


Log Manager: Overview

The log manager provides an interface to the log table, which is a sequence of log records.

create table log_table (
    lsn            LSN,
    prev_lsn       LSN,        /* for scanning the log backward */
    timestamp      TIMESTAMP,  /* for time domain addressing */
    resource_mgr   RMID,       /* this RM will handle the UNDO-REDO work */
    trid           TRID,       /* creator of this record */
    tran_prev_lsn  LSN,        /* links the transaction's records so UNDO need not scan the whole log backward */
    body           varchar,    /* UNDO-REDO information generated by the RM */
    primary key (lsn),
    foreign key (prev_lsn) references log_table (lsn),
    foreign key (tran_prev_lsn) references log_table (lsn)
) entry sequenced;             /* inserts go at end of file */
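The purpose of the tran_prev_lsn column can be sketched in C: starting from a transaction's newest record, following the chain visits only that transaction's records, newest first. This is a simplified in-memory model in which an LSN is just an array index; `NO_LSN` and `collect_transaction_records` are assumed names.

```c
#include <assert.h>

#define NO_LSN (-1L) /* assumed sentinel meaning "no previous record" */

/* Simplified in-memory log record: the LSN is modeled as an array index. */
typedef struct {
    long lsn;
    long prev_lsn;      /* previous record in the log */
    long trid;          /* transaction that wrote this record */
    long tran_prev_lsn; /* previous record of the same transaction */
} LogRecord;

/* Follow the tran_prev_lsn chain from a transaction's newest record.
 * Stores the visited LSNs (newest first) in out and returns the count. */
static int collect_transaction_records(const LogRecord *records, long last_lsn,
                                       long *out, int max_out)
{
    int n = 0;
    for (long lsn = last_lsn; lsn != NO_LSN && n < max_out;
         lsn = records[lsn].tran_prev_lsn)
        out[n++] = lsn;
    return n;
}
```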


Log Manager: Overview (Cont’d)

Example: find the records written by a RM.

select *

from log_table

where resource_mgr = :rmid

order by lsn descending;


File System and Archive System

• The log manager provides read and write access to the log table for all the other RMs, and for the TM.

– In a perfect world, the log manager simply writes log records, and no one ever reads them.

– In the event of failures, the log is used to return each logged object to its most recent consistent state.

• The log manager maps the log table onto a growing collection of sequential files.

– As the log table fills one file, another is allocated.

– Only recent records are kept online.

– Log records more than a few hours old are stored in less expensive tertiary storage managed by the archive system
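The mapping of the log table onto a sequence of files can be sketched as a small allocator that hands out LSNs and moves to a new file when the current one is full. The fixed per-file capacity and the names `LogTail` and `log_reserve` are assumptions for illustration. Archiving of old files is not modeled.

```c
#include <assert.h>

typedef struct {
    long file; /* number of log file in log directory */
    long rba;  /* relative byte address within the file */
} LSN;

/* Hypothetical state of the log tail; file_size is an assumed fixed capacity. */
typedef struct {
    long file_size; /* capacity of each log file in bytes */
    LSN  next;      /* where the next record will be written */
} LogTail;

/* Reserve space for a record of record_len bytes and return its LSN.
 * If the record does not fit in the current file, continue in a new file.
 * Assumes 0 < record_len <= file_size. */
static LSN log_reserve(LogTail *tail, long record_len)
{
    if (tail->next.rba + record_len > tail->file_size) {
        tail->next.file += 1; /* current file is full: allocate another */
        tail->next.rba = 0;
    }
    LSN lsn = tail->next;
    tail->next.rba += record_len;
    return lsn;
}
```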


Why Have a Log Manager?

Can’t we maintain the log table using SQL operations?

At restart, almost none of the system is functioning. The log manager must be able to find, read, and write the log without much help from the SQL system.

• The log manager must maintain and use a special catalog listing the physical log files.

• It must use low-level interfaces to read and write these catalogs and files.


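Normal Execution

[Figure: message flow among the Application, the RMs (normal functions and the UNDO, REDO, COMMIT callback functions), the TM, the Lock Manager, and the Log Manager.]

• Begin_Work ( ): the application starts a new transaction and receives a TRID.

• The application sends work requests to the RMs. The RMs send lock requests to the Lock Manager and log records to the Log Manager.

• Commit_Work ( ):

1. Want to Commit

2. Commit Phase 1?

3. YES to Phase 1

4. Write commit log record and ?

5. Commit Phase 2

6. Acknowledge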


Transaction Abort

[Figure: message flow among the Application, the RMs (normal functions and callback functions), the TM, and the Log Manager when the application calls Rollback_Work ( ).]

1. rollback transaction

2. Read transaction’s log records

3. UNDO (log record)

4. Aborted (TRID)

5. write abort records


DO-UNDO-REDO Protocol

• The DO-UNDO-REDO protocol is a programming style for RMs implementing transactional objects.

DO program: Old State -> DO -> New State + Log Record

UNDO program: New State + Log Record -> UNDO -> Old State

REDO program: Old State + Log Record -> REDO -> New State

• RMs have the following structure:

Normal Function: DO program

Callback Functions: UNDO & REDO programs


Restart

1. The TM regularly invokes checkpoints during normal processing: it informs each RM to checkpoint its state to persistent memory.

2. At restart, the transaction mgr. scans the log table forward from the most recent checkpoint to the end.

3. For each transaction that has not committed (e.g., T2), the TM calls the UNDO( ) callback of the RMs to undo it to the most recent persistent savepoint.
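The forward scan in steps 2 and 3 can be sketched as follows: every transaction that appears between the checkpoint and the end of the log without a commit record is a candidate for UNDO. The record types, the `MAX_TRID` bound, and the function `find_losers` are assumptions for illustration. The sketch ignores transactions that were active at the checkpoint but wrote nothing afterwards.

```c
#include <assert.h>

typedef enum { REC_UPDATE, REC_COMMIT, REC_CHECKPOINT } RecType;

typedef struct {
    RecType type;
    int trid; /* transaction that wrote the record (unused for checkpoints) */
} LogRecord;

#define MAX_TRID 16 /* assumed bound: transaction ids lie in [0, MAX_TRID) */

/* Scan forward from the checkpoint position to the end of the log.
 * Sets loser[t] = 1 for each transaction t seen without a commit record
 * and returns the number of such transactions. */
static int find_losers(const LogRecord *records, int checkpoint, int end,
                       int loser[MAX_TRID])
{
    int seen[MAX_TRID] = {0}, committed[MAX_TRID] = {0}, count = 0;
    for (int i = checkpoint; i < end; i++) {
        if (records[i].type == REC_CHECKPOINT)
            continue;
        seen[records[i].trid] = 1;
        if (records[i].type == REC_COMMIT)
            committed[records[i].trid] = 1;
    }
    for (int t = 0; t < MAX_TRID; t++) {
        loser[t] = seen[t] && !committed[t];
        count += loser[t];
    }
    return count;
}
```

The TM would then call the UNDO callback of the RMs for each transaction flagged in `loser`.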

[Figure: timeline from Checkpoint to Crash showing transactions T1, T2, and T3.]


2-Phase Commit: Commit

Phase I:

• Prepare: Invoke each RM asking for its vote.

• Decide: If all vote yes, durably write the transaction commit log record.

Note: The commit record write is what makes a transaction atomic and durable. If the system fails prior to that instant, the transaction will be undone at restart; otherwise, phase 2 will be carried forward by the restart logic.


2-Phase Commit: Commit (Cont’d)

Phase II:

• Commit: Invoke each RM telling it the commit decision.

Note: The RM can now release locks, deliver real messages, and perform other clean-up tasks.

• Complete: When all acknowledge the commit message, write a commit completion record to the log, indicating that phase 2 ended. When the completion message is durable, deallocate the live transaction state.

Note: The phase 2 completion record is used at restart to indicate that the RMs have all been informed about the transaction.
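The prepare, decide, commit, and complete steps can be sketched as a single coordinator function. Everything here is a simplified assumption: the `ResourceManager` callback interface is hypothetical, log writes are replaced by counters in `LogStub`, and the three small callbacks exist only to exercise the sketch.

```c
#include <assert.h>

typedef enum { VOTE_NO = 0, VOTE_YES = 1 } Vote;
typedef enum { TX_COMMITTED, TX_ABORTED } Outcome;

/* Hypothetical RM interface: prepare returns a vote, commit acknowledges. */
typedef struct {
    Vote (*prepare)(int trid);
    void (*commit)(int trid);
} ResourceManager;

/* Counters standing in for durable log writes in this sketch. */
typedef struct {
    int commit_records;
    int completion_records;
} LogStub;

/* Example callbacks used only to exercise the sketch. */
static int acks = 0;
static Vote vote_yes(int trid) { (void)trid; return VOTE_YES; }
static Vote vote_no(int trid)  { (void)trid; return VOTE_NO; }
static void ack_commit(int trid) { (void)trid; acks++; }

static Outcome two_phase_commit(int trid, const ResourceManager *rms, int n,
                                LogStub *stub)
{
    /* Phase 1, prepare: ask every RM for its vote. */
    for (int i = 0; i < n; i++)
        if (rms[i].prepare(trid) != VOTE_YES)
            return TX_ABORTED; /* the caller rolls the transaction back */

    /* Phase 1, decide: the durable commit record is the commit point. */
    stub->commit_records++;

    /* Phase 2, commit: tell every RM the decision. */
    for (int i = 0; i < n; i++)
        rms[i].commit(trid);

    /* Phase 2, complete: all RMs acknowledged, write the completion record. */
    stub->completion_records++;
    return TX_COMMITTED;
}
```

A single "no" vote ends the function before any commit record is written, which matches the rule that the commit record write is what makes the transaction durable.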


2-Phase Commit: Abort

• If any RM votes no during the prepare step, or if it does not respond at all, then the transaction cannot commit.

• The simplest thing to do in this case is to roll back the transaction by calling Abort_work ( ).


2-Phase Commit: Abort (Cont’d)

The logic for Abort_work ( ) is as follows:

Undo: Read the transaction’s log backwards, issuing UNDO of each record. The RM that wrote the record is invoked to undo the operation.

Broadcast: At each savepoint, invoke each RM telling it the transaction is at the savepoint.

Abort: Write the transaction abort record to the log (UNDO of begin_work( )).

Complete: Write a complete record to the log indicating that abort ended. Deallocate the live transaction state.
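The Undo and Abort steps can be sketched with value-log records held in memory. This is a simplified model, not the course's implementation: LSNs are array indexes, the log has an assumed fixed capacity, and the savepoint broadcast and the completion record are left out. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NO_LSN (-1)

typedef enum { REC_UPDATE, REC_ABORT } RecType;

/* Simplified value-log record; LSNs are array indexes in this sketch. */
typedef struct {
    RecType type;
    int trid;
    int tran_prev_lsn; /* previous record of the same transaction */
    int *object;       /* object changed by an update record */
    int old_value;     /* value to restore on UNDO */
} LogRecord;

typedef struct {
    LogRecord records[32]; /* assumed fixed capacity, no overflow check */
    int count;
} Log;

/* Read the transaction's records backwards, undo each update,
 * then append an abort record. Returns the LSN of the abort record. */
static int abort_work(Log *wal, int trid, int last_lsn)
{
    for (int lsn = last_lsn; lsn != NO_LSN;
         lsn = wal->records[lsn].tran_prev_lsn)
        if (wal->records[lsn].type == REC_UPDATE)
            *wal->records[lsn].object = wal->records[lsn].old_value;

    LogRecord abort_rec = { REC_ABORT, trid, last_lsn, NULL, 0 };
    wal->records[wal->count] = abort_rec;
    return wal->count++;
}
```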


Multiple Logs

• In systems with very high update rates, the bandwidth of the log can become a bottleneck.

– Such bottlenecks can be eliminated by creating multiple logs and by directing the log records of different objects to different logs.

• In some situations, a particular RM keeps its own log table for portability reasons.

• Distributed systems are likely to have one or more logs per network node.

– They maintain multiple logs for performance and for node autonomy.

– With a local log, each node can recover its local transactions without involving the other nodes.


Group Commit

Log Insert:

1. The program acquires the log lock.

2. It fixes the log page in the buffer pool.

3. It allocates space for the log record in the page, and fills in the record.

4. It unfixes the page in the buffer pool, and unlocks the semaphore.

5. The movement of data to durable storage is coordinated by an asynchronous process called the log flush daemon.


Group Commit (Cont’d)

Group Commit:

The log daemon wakes up once every t ms and does all the log writing that has accumulated in the buffer pool (batch processing log writes).

Advantage: I/O overhead is reduced.

Disadvantage: It makes transactions last longer and delays releasing locks.
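The batching effect can be sketched with a small single-threaded model in which inserts only touch an in-memory buffer and one daemon wake-up writes everything that has accumulated with a single I/O. The names `LogBuffer`, `log_insert`, and `flush_daemon_tick` are assumptions. Timers and concurrency are not modeled.

```c
#include <assert.h>

/* Hypothetical log buffer shared by inserters and the flush daemon. */
typedef struct {
    int pending;       /* records inserted but not yet durable */
    int durable;       /* records already written to durable storage */
    int io_operations; /* number of physical writes issued */
} LogBuffer;

/* Log insert: the record only reaches the in-memory buffer. */
static void log_insert(LogBuffer *buf) { buf->pending++; }

/* One wake-up of the flush daemon: write everything that has accumulated
 * with a single I/O. Returns the number of records made durable. */
static int flush_daemon_tick(LogBuffer *buf)
{
    int batch = buf->pending;
    if (batch == 0)
        return 0; /* nothing to write, so no I/O is issued */
    buf->io_operations++;
    buf->durable += batch;
    buf->pending = 0;
    return batch;
}
```

Five inserts followed by one tick cost one write instead of five, at the price that none of the five records is durable until the tick runs.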


The FIX Rule

While the semaphore is set, the page is said to be fixed, and releasing the page is called unfixing it.

Fix Rule:

1. Get the page semaphore in exclusive mode prior to altering the page.

2. Get the semaphore in shared or exclusive mode prior to reading the page.

3. Hold the semaphores until the page and log are again consistent, and the read or update is complete.
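The compatibility between shared and exclusive fixes can be sketched with a non-blocking, single-threaded model of a page semaphore. It is only an illustration of which requests may be granted together: a real implementation would block or queue the caller and would need atomic operations. The names `PageSemaphore`, `try_fix`, and `unfix` are assumptions.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { FIX_SHARED, FIX_EXCLUSIVE } FixMode;

/* Hypothetical page semaphore state (not thread-safe in this sketch). */
typedef struct {
    int  readers; /* number of shared holders */
    bool writer;  /* true while an exclusive holder exists */
} PageSemaphore;

/* Try to fix the page in the given mode. Returns true if granted. */
static bool try_fix(PageSemaphore *s, FixMode mode)
{
    if (s->writer)
        return false; /* an exclusive holder excludes everyone else */
    if (mode == FIX_EXCLUSIVE) {
        if (s->readers > 0)
            return false; /* readers exclude a writer */
        s->writer = true;
    } else {
        s->readers++;
    }
    return true;
}

/* Unfix the page: release a fix previously granted in the given mode. */
static void unfix(PageSemaphore *s, FixMode mode)
{
    if (mode == FIX_EXCLUSIVE)
        s->writer = false;
    else if (s->readers > 0)
        s->readers--;
}
```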


The FIX Rule: 2-Phase Locking

This is just two-phase locking at the page-semaphore level.

The Isolation Theorem tells us that all read and write actions on the page will be isolated.

Page updates are actually mini-transactions.

When the page is unfixed, the page should be consistent and the log record should allow UNDO or REDO of the page transformation.


Multi-Page Actions

• Some actions modify several pages at once.

Examples: Inserting a multi-page record. Splitting a B-tree node.

• These actions are structured as follows:

1. Fix all the relevant pages

2. Do all the modifications and generate many log records.

3. Unfix the pages.