
Page 1: Data & Database Administration Instructor: Engr. Sana Ziafat

Data & Database Administration

Instructor: Engr. Sana Ziafat

Page 2: Data & Database Administration Instructor: Engr. Sana Ziafat

Database Recovery and Backup

Databases are often damaged or lost because of system problems that may be caused by: human error, hardware failure, incorrect or invalid data, program errors, viruses, network failures, conflicting transactions, or natural disasters.

Mechanisms for restoring a database quickly and accurately after loss or damage are known as database recovery.

Page 3: Data & Database Administration Instructor: Engr. Sana Ziafat

Back-up Facilities

An automatic dump facility produces a backup copy of the entire database. Backups are taken periodically (e.g. nightly, weekly) and stored in a secure, off-site location.

Incremental back-up (see the next slide).

Cold backup: the database is shut down during the backup.

Hot backup: a selected portion of the database is shut down and backed up at a given time.

Back-up strategies are chosen based on demands.

Page 4: Data & Database Administration Instructor: Engr. Sana Ziafat

Backup Facilities (Contd.)

Backups of static or less frequently changing data may be taken less often.

Incremental backups, which record the changes made since the last full backup, may be taken on an interim basis. Incremental backups take less time.
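As a rough illustration of the incremental idea only (not any particular DBMS's backup utility), the Python sketch below copies just the files modified since the last full backup; the directory layout and the last_full_backup_time parameter are hypothetical.

    import os
    import shutil
    import time

    def incremental_backup(data_dir, backup_dir, last_full_backup_time):
        # Copy only the files changed since the last full backup.
        # (A real DBMS tracks changes at the page or block level, not per file.)
        os.makedirs(backup_dir, exist_ok=True)
        for name in os.listdir(data_dir):
            src = os.path.join(data_dir, name)
            if os.path.isfile(src) and os.path.getmtime(src) > last_full_backup_time:
                shutil.copy2(src, os.path.join(backup_dir, name))

    # Example: back up everything that changed in the last 24 hours (paths are hypothetical).
    # incremental_backup("/var/db/files", "/backups/incremental", time.time() - 24 * 3600)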

Page 5: Data & Database Administration Instructor: Engr. Sana Ziafat

Journalizing Facilities

A DBMS must provide journalizing facilities to keep an audit trail of transactions and database changes.

Transaction: a discrete unit of work that must be completely processed, or not processed at all, within a computer system (e.g., entering a customer record).

In the event of failure, a consistent database state can be re-established by using the information in the journals together with the most recent complete backup.

There are two types of journals or logs: the transaction log and the database change log.

Page 6: Data & Database Administration Instructor: Engr. Sana Ziafat

Journalizing Facilities

The audit trail of transactions and database updates is kept on disk or tape and must itself be backed up.

Transaction log: a record of essential data for each transaction processed against the database. (What data goes into the transaction log? See the next slide.)

Database change log: images of updated data. Before-image: a copy of the record before modification. After-image: a copy of the record after modification.

Together these produce an audit trail.

Page 7: Data & Database Administration Instructor: Engr. Sana Ziafat

Journalizing Facilities (Contd.)

Transaction log: the data typically recorded for a transaction includes the transaction code or ID, the action or type of transaction (e.g. insert), the time of the transaction, the terminal number or user ID, the input data values, the tables and records accessed, the records modified, and possibly the old and new field values.
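To make the list above concrete, here is a minimal Python sketch of what one transaction log record might carry. The field names are hypothetical; real DBMSs use their own, usually binary, log formats.

    from dataclasses import dataclass, field
    from datetime import datetime

    @dataclass
    class TransactionLogRecord:
        # Hypothetical fields mirroring the list above.
        transaction_id: str
        action: str                  # type of transaction, e.g. "INSERT" or "UPDATE"
        timestamp: datetime          # time of the transaction
        user_id: str                 # terminal number or user ID
        input_values: dict           # input data values
        tables_accessed: list
        records_modified: list
        old_values: dict = field(default_factory=dict)   # old field values, if kept
        new_values: dict = field(default_factory=dict)   # new field values, if kept

    record = TransactionLogRecord(
        transaction_id="T1001",
        action="UPDATE",
        timestamp=datetime.now(),
        user_id="teller_07",
        input_values={"amount": 200},
        tables_accessed=["account"],
        records_modified=["account:4711"],
        old_values={"balance": 1000},
        new_values={"balance": 800},
    )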

Page 8: Data & Database Administration Instructor: Engr. Sana Ziafat

Figure: Database audit trail. The DBMS records a copy of each transaction in the transaction log and a copy of the data affected by the transaction in the database change log; these logs, together with the current database and the database backup, capture the effect of transactions and of recovery actions.

Page 9: Data & Database Administration Instructor: Engr. Sana Ziafat

Checkpoint Facilities

A facility by which the DBMS periodically refuses to accept any new transactions.

All transactions in progress are completed and the journal files are brought up to date.

At this point the system is in a quiet state, and the database and transaction logs are synchronized.

The DBMS writes a special record (the checkpoint record) to the log file, which is like a snapshot of the state of the database.

The checkpoint record contains the information necessary to restart the system.

Page 10: Data & Database Administration Instructor: Engr. Sana Ziafat

Checkpoint Facilities (Contd.)

Any dirty blocks (memory pages containing changes not yet written to disk) are written from memory to disk storage, ensuring that all changes made prior to taking the checkpoint have been written to long-term storage.

The DBMS may perform checkpoints automatically or in response to requests from user application programs.

Checkpoints should be taken frequently (several times an hour).

In case of failure, it is often possible to resume processing from the most recent checkpoint, so only a few minutes of processing work must be repeated, compared with several hours for a complete restart of the day's processing.
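The checkpoint sequence described above can be sketched as follows. This is a hypothetical illustration, not a real DBMS internal API: it flushes dirty pages and then appends a checkpoint record with the information needed for a restart.

    import json
    import time

    def take_checkpoint(buffer_pool, log_path, active_transactions):
        # 1. Flush: write every dirty page (changed in memory but not yet on disk)
        #    out to long-term storage.  Here the write is only simulated.
        for page in buffer_pool:
            if page["dirty"]:
                page["dirty"] = False

        # 2. Append a checkpoint record containing the information needed to restart.
        checkpoint = {
            "type": "CHECKPOINT",
            "time": time.time(),
            "active_transactions": list(active_transactions),
        }
        with open(log_path, "a") as log:
            log.write(json.dumps(checkpoint) + "\n")

    # take_checkpoint([{"page_id": 1, "dirty": True}], "db.log", active_transactions=[])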

Page 11: Data & Database Administration Instructor: Engr. Sana Ziafat

Recovery Manager

The recovery manager is a module of the DBMS that restores the database to a correct condition when a failure occurs and then resumes processing user requests.

The type of restart used depends on the nature of the failure.

The recovery manager uses the logs as well as the backup copy to restore the database.

Page 12: Data & Database Administration Instructor: Engr. Sana Ziafat

Recovery and Restart Procedures

Disk mirroring: switch between identical copies of the database. This is the fastest recovery and is used when high availability of data is required.

Restore/rerun: reprocess the transactions against the backup copy. It is simple, but it is a last resort with two disadvantages:

1. Preserving the original sequence of the transactions.

2. The time needed to reprocess the transactions.

Page 13: Data & Database Administration Instructor: Engr. Sana Ziafat

Transaction integrity: commit or abort all of a transaction's changes (business transactions are a well-defined sequence of steps). Transaction boundaries are defined with BEGIN TRANSACTION and COMMIT. Transactions should follow four properties (the ACID properties on the next slide).
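As a small illustration of transaction boundaries, the sketch below uses Python's standard sqlite3 module (not any DBMS discussed in these slides): both updates of a transfer are committed together, or both are rolled back on failure.

    import sqlite3

    # isolation_level=None puts the connection in autocommit mode, so the explicit
    # BEGIN / COMMIT / ROLLBACK statements below mark the transaction boundary.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO account VALUES (1, 1000), (2, 500)")

    try:
        conn.execute("BEGIN")
        conn.execute("UPDATE account SET balance = balance - 200 WHERE id = 1")
        conn.execute("UPDATE account SET balance = balance + 200 WHERE id = 2")
        conn.execute("COMMIT")      # atomic and durable: both updates become permanent together
    except sqlite3.Error:
        conn.execute("ROLLBACK")    # abort: neither update is applied

    print(conn.execute("SELECT id, balance FROM account").fetchall())   # [(1, 800), (2, 700)]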

Page 14: Data & Database Administration Instructor: Engr. Sana Ziafat

Transaction ACID Properties

Atomic: the transaction cannot be subdivided.

Consistent: constraints don't change from before the transaction to after the transaction (example: an inventory level).

Isolated: database changes are not revealed to users until the transaction has completed; the on-hand inventory is known only after the inventory transaction is completed.

Durable: database changes are permanent.

Page 15: Data & Database Administration Instructor: Engr. Sana Ziafat

Recovery and Restart Procedures

Backward recovery (rollback): apply the before-images to reverse the changes that were made.

Forward recovery (roll forward): apply the most recent after-images; this is much faster and more accurate than restore/rerun.
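A minimal sketch of the two techniques, using a hypothetical in-memory change-log format rather than a real recovery manager: rollback applies before-images newest-first, and roll forward re-applies after-images to a copy restored from backup.

    def rollback(database, change_log):
        # Backward recovery: undo changes by applying before-images, newest first.
        for entry in reversed(change_log):
            database[entry["record_id"]] = entry["before_image"]

    def roll_forward(backup_copy, change_log):
        # Forward recovery: start from the backup and re-apply after-images in order.
        database = dict(backup_copy)
        for entry in change_log:
            database[entry["record_id"]] = entry["after_image"]
        return database

    # Hypothetical change log with a single entry for one account record.
    log = [{"record_id": "account:1",
            "before_image": {"balance": 1000},
            "after_image": {"balance": 800}}]

    current = {"account:1": {"balance": 800}}
    rollback(current, log)                                           # back to {"balance": 1000}
    restored = roll_forward({"account:1": {"balance": 1000}}, log)   # forward to {"balance": 800}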

Page 16: Data & Database Administration Instructor: Engr. Sana Ziafat

Figure: Basic recovery techniques. a) Rollback

Page 17: Data & Database Administration Instructor: Engr. Sana Ziafat

Figure: Basic recovery techniques (cont.). b) Rollforward

Page 18: Data & Database Administration Instructor: Engr. Sana Ziafat

Database Failure Responses (1)

Aborted transactions: a transaction terminates abnormally because of human error, hardware failure, loss of transmission, etc. Preferred recovery: rollback, done automatically by the DBMS.

Incorrect data: difficult to detect and needs human intervention (e.g. wrong grades or payments). Preferred recovery: rollback. Alternative 1: rerun the transactions, excluding the inaccurate data updates. Alternative 2: compensating transactions, with human intervention to correct the error.

Page 19: Data & Database Administration Instructor: Engr. Sana Ziafat

Database Failure Responses (2)

System failure (database intact): a system crash due to power loss or system software failure. Preferred recovery: switch to a duplicate database. Alternative 1: rollback. Alternative 2: restart from the most recent checkpoint.

Database destruction (the database is lost or destroyed). Preferred recovery: switch to a duplicate database. Alternative 1: roll forward to the state before the loss. Alternative 2: reprocess the transactions.

Page 20: Data & Database Administration Instructor: Engr. Sana Ziafat

What is Concurrency Control?

The process of managing simultaneous operations against a database so that data integrity is maintained and the operations do not interfere with each other in a multi-user environment.

Page 21: Data & Database Administration Instructor: Engr. Sana Ziafat

Background

Most DBMSs run in a multi-user environment, where several users are able to share the data contained in a database.

No data integrity problems occur when data is only being read, but problems can arise when several users are updating data.

When more than one transaction is being processed against a DB at the same time, the transactions are considered to be concurrent.

The actions that must be made to ensure data integrity are called concurrency control actions.

Page 22: Data & Database Administration Instructor: Engr. Sana Ziafat

Background (Contd.)

Remember that a CPU can process only one instruction at a time.

As new transactions are submitted while other processing is occurring against the database, the transactions are usually interleaved.

The CPU switches among the transactions so that some portion of each transaction is performed as the CPU addresses each transaction in turn.

Page 23: Data & Database Administration Instructor: Engr. Sana Ziafat

Problems without Concurrency Control

The lost update problem can occur when multiple users attempt to update the database at the same time. Example: two users with a joint account withdraw cash at the same time using ATMs at different locations. The interleaving below (time running from top to bottom) loses one of the updates; a code sketch of the same race follows.

User A: 1. Read account balance ($1000)
User B: 1. Read account balance ($1000)
User A: 2. Withdraw $200
User B: 2. Withdraw $300
User A: 3. Write account balance ($800)
User B: 3. Write account balance ($700) -- ERROR: User A's update is lost
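The sketch below reproduces the race in plain Python with two threads and a shared balance variable (a hypothetical stand-in for the account record): each thread does an unprotected read-modify-write, so one withdrawal is lost.

    import threading
    import time

    balance = 1000   # shared account balance

    def withdraw(amount):
        global balance
        local = balance            # 1. read the balance
        time.sleep(0.1)            # the other user reads the same value meanwhile
        balance = local - amount   # 3. write back, overwriting the other user's update

    a = threading.Thread(target=withdraw, args=(200,))
    b = threading.Thread(target=withdraw, args=(300,))
    a.start(); b.start(); a.join(); b.join()
    print(balance)   # 800 or 700 instead of the correct 500: one update was lost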

Page 24: Data & Database Administration Instructor: Engr. Sana Ziafat

Problems without Concurrency Control

Inconsistent read problems can occur when one user reads data that has been partially updated by another user.

Such a read will be incorrect; this is also called a dirty read or an unrepeatable read.

Page 25: Data & Database Administration Instructor: Engr. Sana Ziafat

Serializability

Concurrent transactions need to be processed in isolation so that they do not interfere with each other.

If each transaction were processed entirely before another one started, no interference would occur.

Procedures which emulate this are called serializable.

Serializable schedules process transactions in a way that gives the same results as if the transactions had been processed one after the other.

Schedules are designed so that transactions which will not interfere with each other can be run in parallel; for example, transactions that access data from different tables in a database will not conflict with each other.

Serializability can be achieved in different ways, but locking mechanisms are the most common.

Page 26: Data & Database Administration Instructor: Engr. Sana Ziafat

Locking

Locking implies that any data retrieved by a user for updating must be locked, or denied to other users, until the update is completed or aborted.

The locking mechanism enforces a sequential updating process that prevents erroneous updates.

It is the pessimistic approach to concurrency control.

Page 27: Data & Database Administration Instructor: Engr. Sana Ziafat

Concurrency Control through Locking

The same withdrawal example with locking (time running from top to bottom):

User A locks the account balance and reads it ($1000).
User B tries to read the account balance and is denied, because the record is locked.
User A withdraws $200, writes the account balance ($800), and unlocks it.
User B now locks the account balance, reads it ($800), withdraws $300, writes the account balance ($500), and unlocks it.

The final balance of $500 is correct; a code sketch of the same idea follows.
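Continuing the sketch from the lost update example, a single lock around the read-modify-write (a stand-in for the DBMS's exclusive record lock, not its actual implementation) serializes the two withdrawals.

    import threading
    import time

    balance = 1000
    balance_lock = threading.Lock()    # stand-in for an exclusive lock on the record

    def withdraw(amount):
        global balance
        with balance_lock:             # the second user is denied until the first unlocks
            local = balance
            time.sleep(0.1)
            balance = local - amount

    a = threading.Thread(target=withdraw, args=(200,))
    b = threading.Thread(target=withdraw, args=(300,))
    a.start(); b.start(); a.join(); b.join()
    print(balance)   # 500: the updates are applied one after the other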

Page 28: Data & Database Administration Instructor: Engr. Sana Ziafat

Locking Levels

There are different locking levels (granularity):

Database: the entire database is locked and becomes unavailable. This level can be used during backups.

Table: the entire table containing a requested record is locked. This level can be used for bulk updates that will update the entire table (e.g., a salary raise).

Block or page: the physical storage block (or page) containing the requested record is locked. This level is commonly implemented.

Record level: only the requested record (row) is locked; all other records within the table are available.

Field level: only the particular field (column) in the requested record is locked.

Page 29: Data & Database Administration Instructor: Engr. Sana Ziafat

Types of locks

Shared locks (S locks, or read locks) allow other transactions to read, but not update, a record or resource. Placing a shared lock prevents another user from placing an exclusive lock, but not a shared lock, on that record.

Exclusive locks (X locks, or write locks) prevent other transactions from reading, and therefore from updating, a record until it is unlocked. They should be placed when updating records. Placing an exclusive lock on a record prevents another user from placing any type of lock on that record.
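A small sketch of the compatibility rules just described, with a hypothetical can_grant helper: a requested lock is granted only if it is compatible with every lock already held on the record.

    # Which lock types can coexist on the same record?
    COMPATIBLE = {
        ("S", "S"): True,    # two shared (read) locks can coexist
        ("S", "X"): False,   # a held shared lock blocks a requested exclusive lock
        ("X", "S"): False,   # a held exclusive lock blocks shared locks
        ("X", "X"): False,   # ... and other exclusive locks
    }

    def can_grant(requested, held_locks):
        # Grant the requested lock only if it is compatible with every lock already held.
        return all(COMPATIBLE[(held, requested)] for held in held_locks)

    print(can_grant("S", ["S", "S"]))   # True: many readers at once
    print(can_grant("X", ["S"]))        # False: a writer must wait for the readers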

Page 30: Data & Database Administration Instructor: Engr. Sana Ziafat

Deadlocks

Locking may prevent erroneous updates but can lead to the deadlock problem.

Deadlock: two or more transactions each hold a lock on a resource that the other needs, and each waits for the other to release it.

Deadlock prevention: users must lock all required records at the beginning of the transaction.

Deadlock resolution: the DBMS detects the deadlock and breaks it by backing out one of the deadlocked transactions. Any changes made by that transaction up to the time of the deadlock are removed, and the transaction is restarted when the resources become available.
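The prevention rule above can be sketched with two Python threads and two stand-in locks (hypothetical names): each transaction acquires every lock it needs up front, always in the same order, so neither can hold one resource while waiting for the other.

    import threading

    lock_a = threading.Lock()   # stand-in locks for the two records both transactions need
    lock_b = threading.Lock()

    def transaction(name):
        # Deadlock prevention: acquire every required lock up front, always in the
        # same order, instead of locking one record, working, then requesting the other.
        with lock_a, lock_b:
            print(name, "holds both locks and can update both records")

    t1 = threading.Thread(target=transaction, args=("T1",))
    t2 = threading.Thread(target=transaction, args=("T2",))
    t1.start(); t2.start(); t1.join(); t2.join()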

Page 31: Data & Database Administration Instructor: Engr. Sana Ziafat

Versioning

Versioning is the optimistic approach to concurrency control.

Each transaction is restricted to a view of the database as of the time that transaction started, and when a transaction modifies a record, the DBMS creates a new record version instead of overwriting the old record.
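A minimal sketch of the versioning idea (not any particular DBMS's multiversion implementation): each update appends a new version stamped with a timestamp, and a reader sees only the versions created before its transaction started.

    import time

    # Each record maps to a list of (creation_time, value) versions.
    versions = {"account:1": [(0.0, {"balance": 1000})]}

    def read(record_id, txn_start_time):
        # A transaction sees only the newest version created before it started.
        visible = [value for created, value in versions[record_id] if created <= txn_start_time]
        return visible[-1]

    def update(record_id, new_value):
        # Append a new version instead of overwriting the old record.
        versions[record_id].append((time.time(), new_value))

    txn_start = time.time()
    update("account:1", {"balance": 800})     # another transaction writes a new version
    print(read("account:1", txn_start))       # this transaction still sees {'balance': 1000}
    print(read("account:1", time.time()))     # a later transaction sees {'balance': 800}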