design dbms

Relational Database Design

Upload: iiita

Post on 15-Jan-2015

Category:

Education


DESCRIPTION

Database Management Systems

TRANSCRIPT

Page 1: Design dbms

Relational Database Design

Page 2: Design dbms

©Silberschatz, Korth and Sudarshan, Database System Concepts (slide 7.2)

Relational Database Design

RDBMS design issues – Pitfalls and Normalization.

Overview of Normal Forms

Pitfalls in Relational Database Design

Functional Dependencies

Decomposition

Boyce-Codd Normal Form

Third Normal Form

Multivalued Dependencies and Fourth Normal Form

Overall Database Design Process

Page 3: Design dbms

RDBMS Design Issues

So far we have assumed that attributes are grouped to form a relation schema by using the common sense of the database designer or by mapping a schema defined by an ER model.

Should the goodness of a design be evaluated for tables with few attributes or for tables with a large number of attributes?

Can we combine two tables (schemas) without any problem?

We still need some formal measure of why one grouping of attributes into a relation schema may be better than another.

Page 4: Design dbms

The Banking Schema

branch = (branch_name, branch_city, assets)

customer = (customer_id, customer_name, customer_street, customer_city)

loan = (loan_number, amount)

account = (account_number, balance)

employee = (employee_id, employee_name, telephone_number, start_date)

dependent_name = (employee_id, dname)

account_branch = (account_number, branch_name)

loan_branch = (loan_number, branch_name)

borrower = (customer_id, loan_number)

depositor = (customer_id, account_number)

cust_banker = (customer_id, employee_id, type)

works_for = (worker_employee_id, manager_employee_id)

payment = (loan_number, payment_number, payment_date, payment_amount)

savings_account = (account_number, interest_rate)

checking_account = (account_number, overdraft_amount)

Page 5: Design dbms

Combine Schemas?

Suppose we combine borrower and loan to get

bor_loan = (customer_id, loan_number, amount )

Result is possible repetition of information (L-100 in example below)
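The repetition can be seen in a small sketch. The rows below are hypothetical sample data (the customer ids and amounts are not from the slides), and natural_join_bor_loan is an illustrative helper name. Loan L-100 has two borrowers, so its amount appears twice in the combined relation.

```python
# Hypothetical sample rows, made up for illustration only.
borrower = [("C-1", "L-100"), ("C-2", "L-100"), ("C-3", "L-200")]  # (customer_id, loan_number)
loan = [("L-100", 5000), ("L-200", 7500)]                          # (loan_number, amount)


def natural_join_bor_loan(borrower, loan):
    """Combine borrower and loan on loan_number into bor_loan rows."""
    amount_of = dict(loan)  # loan_number -> amount
    return [(cust, ln, amount_of[ln]) for cust, ln in borrower if ln in amount_of]


bor_loan = natural_join_bor_loan(borrower, loan)
for row in bor_loan:
    print(row)  # the amount of L-100 is printed once per borrower
```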

Page 6: Design dbms

A Combined Schema Without Repetition

Consider combining loan_branch and loan

loan_amt_br = (loan_number, amount, branch_name)

No repetition (as suggested by example below)

Page 7: Design dbms

What About Smaller Schemas?

Suppose we had started with bor_loan. How would we know to split it up (decompose it) into borrower and loan?

Write a rule: “if there were a schema (loan_number, amount), then loan_number would be a candidate key”.

Denote as a functional dependency:

loan_number → amount

In bor_loan, because loan_number is not a candidate key, the amount of a loan may have to be repeated. This indicates the need to decompose bor_loan.
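A functional dependency can be checked against a concrete relation instance, as in the sketch below. The helper name holds and the sample rows are assumptions made for illustration. Note that an instance can only refute a dependency; whether it holds in general is a statement about all legal instances of the schema.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are tuples of attribute names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Two rows that agree on lhs must also agree on rhs.
        if seen.setdefault(key, val) != val:
            return False
    return True


# Hypothetical bor_loan instance.
bor_loan = [
    {"customer_id": "C-1", "loan_number": "L-100", "amount": 5000},
    {"customer_id": "C-2", "loan_number": "L-100", "amount": 5000},
    {"customer_id": "C-3", "loan_number": "L-200", "amount": 7500},
]

fd_amount = holds(bor_loan, ("loan_number",), ("amount",))
fd_customer = holds(bor_loan, ("loan_number",), ("customer_id",))
print(fd_amount, fd_customer)
```

In this instance loan_number determines amount but not customer_id, which is why loan_number cannot be a candidate key of bor_loan.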

Not all decompositions are good. Suppose we decompose employee into

employee1 = (employee_id, employee_name)

employee2 = (employee_name, telephone_number, start_date)

The next slide shows how we lose information -- we cannot reconstruct the original employee relation -- and so, this is a lossy decomposition.

Page 8: Design dbms

A Lossy Decomposition
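A minimal sketch of the information loss, assuming two hypothetical employees who share the same name (the rows are invented for illustration). Projecting employee onto employee1 and employee2 and then joining on employee_name produces tuples that were never in the original relation.

```python
# Hypothetical employee rows: two different employees are both named "Kim".
employee = {
    ("E-1", "Kim", "555-0101", "2001-03-01"),
    ("E-2", "Kim", "555-0199", "2004-09-15"),
}

# employee1 = (employee_id, employee_name)
employee1 = {(eid, name) for eid, name, _, _ in employee}
# employee2 = (employee_name, telephone_number, start_date)
employee2 = {(name, tel, start) for _, name, tel, start in employee}

# Natural join of the two projections on employee_name.
rejoined = {
    (eid, name, tel, start)
    for eid, name in employee1
    for name2, tel, start in employee2
    if name == name2
}

spurious = rejoined - employee
print(len(employee), len(rejoined))
print(sorted(spurious))
```

The join returns more tuples than the original relation, yet we know less: it is no longer possible to tell which telephone number belongs to which employee.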

Page 9: Design dbms

Pitfalls in Relational Database Design

Relational database design requires that we find a “good” collection of relation schemas. A bad design may lead to:

Repetition of information.

Inability to represent certain information.

Design goals:

Avoid redundant data.

Ensure that relationships among attributes are represented.

Facilitate the checking of updates for violation of database integrity constraints.

Page 10: Design dbms

RDBMS Design Issues

So far we have assumed that attributes are grouped to form a relation schema by using the common sense of the database designer or by mapping a schema defined by an ER model.

We still need some formal measure of why one grouping of attributes into a relation schema may be better than another.

Unsatisfactory relation schemas that do not meet certain conditions – the normal form tests – are decomposed into smaller relation schemas that meet the tests and hence possess the desirable properties.

Thus, the normalization procedure provides database designers with:

A formal framework for analyzing relation schemas based on their keys and on the functional dependencies among their attributes.

A series of normal form tests that can be carried out on individual relation schemas so that the relational database can be normalized to any desired degree.

Page 11: Design dbms

First Normal Form

A relational database table that adheres to 1NF is one that meets a certain minimum set of criteria.

These criteria are basically concerned with ensuring that the table is a faithful representation of a relation and that it is free of repeating groups.

Some definitions of 1NF, most notably that of Edgar F. Codd, make reference to the concept of atomicity.

Codd states that the "values in the domains on which each relation is defined are required to be atomic with respect to the DBMS."

Codd defines an atomic value as one that "cannot be decomposed into smaller pieces by the DBMS (excluding certain special functions)."

That is, a field should not be divided into parts holding more than one kind of data, such that what one part means to the DBMS depends on another part of the same field.

Page 12: Design dbms

First Normal Form

A domain is atomic if its elements are considered to be indivisible units

Examples of non-atomic domains:

Set of names, composite attributes

Identification numbers like CS101 that can be broken up into parts

A relational schema R is in first normal form if the domains of all attributes of R are atomic

Non-atomic values complicate storage and encourage redundant (repeated) storage of data

Example: Set of accounts stored with each customer, and set of owners stored with each account

We assume all relations are in first normal form (and will revisit this later!)
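As a sketch of the difference, the set-valued attribute below (hypothetical data) is flattened into one row per (customer_id, account_number) pair, which matches the shape of the depositor schema from the banking example.

```python
# Not in 1NF: each customer carries a set of account numbers (hypothetical data).
customer_accounts = {
    "C-1": {"A-101", "A-215"},
    "C-2": {"A-215"},
}

# 1NF: one atomic (customer_id, account_number) pair per row.
depositor = sorted(
    (cust, acct)
    for cust, accts in customer_accounts.items()
    for acct in accts
)
print(depositor)
```

With the flattened form, the joint account A-215 is a pair of ordinary rows, and neither the customer nor the account needs to store a set of the other.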

Page 13: Design dbms

First Normal Form (Cont’d)

Atomicity is actually a property of how the elements of the domain are used.

Example: Strings would normally be considered indivisible

Suppose that students are given roll numbers which are strings of the form CS0012 or EE1127

If the first two characters are extracted to find the department, the domain of roll numbers is not atomic.

Doing so is a bad idea: it leads to encoding of information in the application program rather than in the database.
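The contrast can be sketched as follows. Both the function department_from_roll and the Student record are hypothetical names introduced for illustration; the second approach, storing the department as its own attribute, is one possible alternative and is not prescribed by the slides.

```python
from dataclasses import dataclass


# Discouraged: the application "knows" that the first two characters
# of a roll number encode the department.
def department_from_roll(roll_number: str) -> str:
    return roll_number[:2]


# Alternative sketch: the department is stored as its own attribute,
# and the roll number is treated as an opaque, atomic string.
@dataclass(frozen=True)
class Student:
    roll_number: str
    department: str


students = [Student("CS0012", "CS"), Student("EE1127", "EE")]
cs_students = [s.roll_number for s in students if s.department == "CS"]
print(cs_students)
```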

Page 14: Design dbms

Goal: Devise a Theory for the Following

Decide whether a particular relation R is in “good” form.

In the case that a relation R is not in “good” form, decompose it into a set of relations {R1, R2, ..., Rn} such that

each relation is in good form

the decomposition is a lossless-join decomposition

Our theory is based on:

functional dependencies

multivalued dependencies
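For a decomposition into two schemas, the standard textbook test for the lossless-join property is that the common attributes functionally determine all of R1 or all of R2. The sketch below applies that test with an attribute-closure helper. The function names are illustrative, and the dependency employee_id → (employee_name, telephone_number, start_date) is an assumption that the slides do not state explicitly.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the left side, we may add the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless_binary(r1, r2, fds):
    """Check whether (r1 ∩ r2) determines all of r1 or all of r2."""
    common_closure = closure(r1 & r2, fds)
    return r1 <= common_closure or r2 <= common_closure


loan_fds = [(frozenset({"loan_number"}), frozenset({"amount"}))]
borrower = {"customer_id", "loan_number"}
loan = {"loan_number", "amount"}
bor_loan_ok = is_lossless_binary(borrower, loan, loan_fds)

# Assumed FD: employee_id determines every other employee attribute.
employee_fds = [(
    frozenset({"employee_id"}),
    frozenset({"employee_name", "telephone_number", "start_date"}),
)]
employee1 = {"employee_id", "employee_name"}
employee2 = {"employee_name", "telephone_number", "start_date"}
employee_ok = is_lossless_binary(employee1, employee2, employee_fds)

print(bor_loan_ok, employee_ok)
```

Under these assumptions, splitting bor_loan into borrower and loan passes the test, while splitting employee on employee_name does not.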

Page 15: Design dbms

Overview of Normal Forms

1NF (First Normal Form)

To understand 2NF, 3NF, and BCNF, the concept of FDs (functional dependencies) is required.

To understand 4NF and 5NF, the concept of MVDs (multivalued dependencies) is required.

Page 16: Design dbms

Normalization

The basic objective of normalization is to reduce the various anomalies in the database.

Normalization can be looked upon as a process of analyzing the given relation schemas based on their FDs and primary keys to achieve the desirable properties of:

Minimizing redundancy.

Minimizing the insertion, deletion, and update anomalies.

Unsatisfactory relation schemas that do not meet certain conditions – the normal form tests – are decomposed into smaller relation schemas that meet the tests and hence possess the desirable properties.

Thus, the normalization procedure provides database designers with:

A formal framework for analyzing relation schemas based on their keys and on the functional dependencies among their attributes.

A series of normal form tests that can be carried out on individual relation schemas so that the relational database can be normalized to any desired degree.
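An update anomaly can be illustrated with the combined bor_loan schema from earlier. The rows are hypothetical. When the amount of a shared loan is changed in only one row, the relation ends up holding two conflicting amounts; in the decomposed design the amount is stored once, so a single update is enough.

```python
# Hypothetical bor_loan rows: loan L-100 is shared by two customers.
bor_loan = [
    {"customer_id": "C-1", "loan_number": "L-100", "amount": 5000},
    {"customer_id": "C-2", "loan_number": "L-100", "amount": 5000},
]

# Update anomaly: the amount is changed in only one of the two rows.
bor_loan[0]["amount"] = 4500
amounts_combined = {r["amount"] for r in bor_loan if r["loan_number"] == "L-100"}
print(amounts_combined)  # two conflicting amounts for one loan

# Decomposed design: the amount lives in loan only.
loan = {"L-100": 5000}                           # loan_number -> amount
borrower = [("C-1", "L-100"), ("C-2", "L-100")]  # (customer_id, loan_number)
loan["L-100"] = 4500
amounts_decomposed = {loan[ln] for _, ln in borrower}
print(amounts_decomposed)
```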

Page 17: Design dbms

Normalization…

The normal form of a relation refers to the highest normal form condition that it meets, and hence indicates the degree to which it has been normalized.

Normal forms, when considered in isolation from other factors, do not guarantee a good database design.

It is generally not sufficient to check separately that each relation schema in the database is, say, in BCNF or 3NF.

Rather, the process of normalization through decomposition must also confirm the existence of additional properties that the relation schemas, taken together, should possess:

The lossless join property.

The dependency preservation property, which ensures that each functional dependency is represented in some individual relation resulting from the decomposition.
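A simple way to look at dependency preservation is to check, for each functional dependency, whether all of its attributes fit inside one of the decomposed schemas. The sketch below does only that. It is a sufficient condition rather than a complete test, since a dependency that is not locally contained may still be implied by the dependencies that are. The schema (A, B, C), its dependencies, and the helper name locally_covered are hypothetical.

```python
def locally_covered(decomposition, fds):
    """For each FD (lhs, rhs), report whether lhs ∪ rhs fits in one schema."""
    return {
        (lhs, rhs): any((lhs | rhs) <= schema for schema in decomposition)
        for lhs, rhs in fds
    }


# Hypothetical schema R = (A, B, C) with A -> B and B -> C.
A_B = (frozenset("A"), frozenset("B"))
B_C = (frozenset("B"), frozenset("C"))
fds = [A_B, B_C]

good = [frozenset("AB"), frozenset("BC")]  # both FDs can be checked locally
bad = [frozenset("AB"), frozenset("AC")]   # B -> C spans two relations

good_report = locally_covered(good, fds)
bad_report = locally_covered(bad, fds)
print(all(good_report.values()), bad_report[B_C])
```

In the second decomposition, checking B → C after an update would require joining the two relations, which is what dependency preservation is meant to avoid.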

Page 18: Design dbms

RDBMS Design

RDBMS design involves checking the current design through normal form tests.

If the design is not in the desired normal form, then:

Decompose the relations (tables) into smaller ones.

Fulfill the properties of decomposition.

Properties of decomposition:

Functional dependency preservation

– Identify the FDs in the given relation schema.

– Apply Armstrong's axioms to find the set of all FDs (the closure).
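For reference, Armstrong's axioms can be stated as the three inference rules below, where α, β, and γ denote sets of attributes (the Greek-letter notation is a common textbook convention and is assumed here). Applying them repeatedly to a given set of FDs yields its closure.

```latex
\begin{align*}
\text{Reflexivity:}  &\quad \beta \subseteq \alpha \;\Rightarrow\; \alpha \to \beta \\
\text{Augmentation:} &\quad \alpha \to \beta \;\Rightarrow\; \gamma\alpha \to \gamma\beta \\
\text{Transitivity:} &\quad \alpha \to \beta \ \text{and}\ \beta \to \gamma \;\Rightarrow\; \alpha \to \gamma
\end{align*}
```

For example, given the hypothetical dependencies A → B and B → C, transitivity yields A → C, so A → C belongs to the closure even though it was not listed explicitly.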

Page 19: Design dbms

Overview

To understand 2NF and 3NF: FDs are required.

To understand BCNF: FDs and the closure of FDs are required.

To understand 4NF and 5NF: MVDs are required.

Page 20: Design dbms

First Normal Form

A domain is atomic if its elements are considered to be indivisible units

Examples of non-atomic domains:

Set of names, composite attributes

Identification numbers like CS101 that can be broken up into parts

A relational schema R is in first normal form if the domains of all attributes of R are atomic

Non-atomic values complicate storage and encourage redundant (repeated) storage of data