objectives
RDBMS Concepts/ Session 3 / 1 of 22
Objectives
In this lesson, you will learn to:
Describe data redundancy
Describe the first, second, and third normal forms
Describe the Boyce-Codd Normal Form (BCNF)
Appreciate the need for denormalization
Normalization
The logical design of the database, including the tables and the relationships between them, is the core of an optimized relational database.
A good logical database design can lay the foundation for optimal database and application performance. A poor logical database design can impair the performance of the entire system.
Normalizing a logical database design involves using formal methods to separate the data into multiple, related tables.
A greater number of narrow tables (with fewer columns) is characteristic of a normalized database. A few wide tables (with more columns) are characteristic of a non-normalized database.
Understanding Data Redundancy
Redundancy means repetition of data
Redundancy increases the time involved in updating, adding, and deleting data
It also increases the utilization of disk space and hence disk I/O
Understanding Data Redundancy (Contd.)
Redundancy can lead to the following problems:
Update anomalies: inserting, modifying, and deleting data may cause inconsistencies
Inconsistencies: errors are more likely to occur when facts are repeated
Unnecessary utilization of extra disk space
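An update anomaly can be illustrated with a minimal Python sketch. The rows, names, and the `branch_names` helper are hypothetical and not taken from the slides; the point is only that a fact stored in several rows can be changed in one row and left stale in another.

```python
# Hypothetical rows: the branch name is repeated for every employee in the branch.
employees = [
    {"employee_no": 1, "name": "Asha", "branch_code": "B1", "branch_name": "Central"},
    {"employee_no": 2, "name": "Ravi", "branch_code": "B1", "branch_name": "Central"},
    {"employee_no": 3, "name": "Mina", "branch_code": "B2", "branch_name": "North"},
]

# Rename branch B1, but update only one of the rows that store its name.
employees[0]["branch_name"] = "Downtown"


def branch_names(rows, branch_code):
    """Return the set of distinct names stored for one branch code."""
    return {r["branch_name"] for r in rows if r["branch_code"] == branch_code}


# More than one name for the same branch code signals an inconsistency.
inconsistent = len(branch_names(employees, "B1")) > 1
print(inconsistent)
```

Had the branch name been stored once, in a table of its own, the partial update could not have happened.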
Definition of Normalization
Normalization is a scientific method of breaking down complex table structures into simple table structures by using certain rules
It allows you to reduce redundancy in a table and eliminate the problems of inconsistency and disk space usage
Normalization results in the formation of tables that satisfy certain specified rules and represent certain normal forms
Normal Forms
The most important and widely used normal forms are:
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)
First Normal Form
A table is said to be in 1NF when each cell of the table contains precisely one value
Functional Dependency
The normalization theory is based on the fundamental notion of functional dependency
Given a relation R, attribute A is functionally dependent on attribute B if each value of B in R is associated with precisely one value of A
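A functional dependency can be tested against sample data with a short Python sketch. The `holds` function and the sample rows are hypothetical. The check only shows whether a given set of rows violates the dependency; a functional dependency is a rule about all valid data, so passing the check on a sample does not prove the rule. Here `dependent` is treated as functionally dependent on `determinant` when each determinant value is paired with exactly one dependent value.

```python
def holds(rows, determinant, dependent):
    """Return True if no two rows agree on `determinant` but differ on `dependent`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows.
rows = [
    {"employee_no": 1, "branch_code": "B1", "branch_name": "Central"},
    {"employee_no": 2, "branch_code": "B1", "branch_name": "Central"},
    {"employee_no": 3, "branch_code": "B2", "branch_name": "North"},
]

# Branch Name depends on Branch Code: each code has one name.
name_depends_on_code = holds(rows, ["branch_code"], ["branch_name"])
# Employee No does not depend on Branch Code: B1 has two employees.
employee_depends_on_code = holds(rows, ["branch_code"], ["employee_no"])
```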
Un-Normalised Data
Employee No
Employee Name
Branch Code
Branch Name
Branch Location
Certification ID 1…n
Certification Name 1…n
Certification done at
Marks obtained
Rule 1
Eliminate repeating groups: make a separate table for each set of repeated attributes and give each table a primary key.
FNF (First Normal Form)
Employee No, Employee Name, Branch Code, Branch Name, Branch Location
Employee No, Certification ID, Certification Name, Certification done at, Marks obtained
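The step from the unnormalized record to the two first-normal-form tables can be sketched in Python as follows. The sample values, the field names, and the `to_first_normal_form` function are hypothetical; only the column split follows the slides.

```python
# Hypothetical unnormalized record: the certifications form a repeating group.
unnormalized = [
    {
        "employee_no": 1,
        "employee_name": "Asha",
        "branch_code": "B1",
        "branch_name": "Central",
        "branch_location": "Pune",
        "certifications": [
            {"certification_id": "C1", "certification_name": "SQL Basics",
             "done_at": "Institute A", "marks": 82},
            {"certification_id": "C2", "certification_name": "Data Modeling",
             "done_at": "Institute B", "marks": 74},
        ],
    },
]


def to_first_normal_form(records):
    """Split each record into one employee row and one row per certification."""
    employees, employee_certifications = [], []
    for record in records:
        # Keep every single-valued attribute in the employee row.
        employees.append({k: v for k, v in record.items() if k != "certifications"})
        # Each repeated certification becomes its own row, keyed by employee_no.
        for cert in record["certifications"]:
            employee_certifications.append({"employee_no": record["employee_no"], **cert})
    return employees, employee_certifications


employees, employee_certifications = to_first_normal_form(unnormalized)
```

After the split, every cell holds a single value, and the certification rows are identified by the combination of Employee No and Certification ID.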
Second Normal Form (2NF)
A table is said to be in 2NF when it is in 1NF and every non-key attribute in the row is functionally dependent upon the whole key, and not just part of the key
To ensure that a table is in 2NF, you should:
Find and remove attributes that are functionally dependent on only a part of the key and not on the whole key, and place them in a different table
Group the remaining attributes
Rule 2
Eliminate redundant data: if an attribute depends on only part of a composite (multi-attribute) key, move it to a separate table.
The Certification Name appears redundantly. (It also depends on only a part of the composite key, the Certification ID.)
SNF (Second Normal Form)
Employee: Employee No, Employee Name, Branch Code, Branch Name, Branch Location
Certifications: Certification ID, Certification Name
Emp Certifications: Employee No, Certification ID, Certification done at, Marks obtained
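The move to second normal form amounts to projecting the certification rows onto two column sets. The sketch below uses a hypothetical `project` helper and made-up sample rows; it is one way to express the decomposition, not a method prescribed by the slides.

```python
def project(rows, columns):
    """Project rows onto `columns`, dropping duplicate rows but keeping order."""
    seen, result = set(), []
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result


# Hypothetical 1NF rows: the certification name repeats for every holder.
rows = [
    {"employee_no": 1, "certification_id": "C1", "certification_name": "SQL Basics",
     "done_at": "Institute A", "marks": 82},
    {"employee_no": 2, "certification_id": "C1", "certification_name": "SQL Basics",
     "done_at": "Institute B", "marks": 67},
    {"employee_no": 2, "certification_id": "C2", "certification_name": "Data Modeling",
     "done_at": "Institute B", "marks": 74},
]

# The name depends only on certification_id, so it moves to its own table.
certifications = project(rows, ["certification_id", "certification_name"])
# The remaining attributes depend on the whole key (employee_no, certification_id).
emp_certifications = project(rows, ["employee_no", "certification_id", "done_at", "marks"])
```

Each certification name is now stored once, however many employees hold that certification.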
Third Normal Form (3NF)
A relation is said to be in 3NF when it is in 2NF and every non-key attribute is functionally dependent only on the primary key
To ensure that a table is in 3NF, you should:
Find and remove non-key attributes that are functionally dependent on attributes that are not the primary key, and place them in a different table
Group the remaining attributes
Rule 3
Eliminate columns not dependent on the key: the Employee table satisfies the first and second normal forms. But the key is Employee No, and the Branch Name and Branch Location describe only a branch, not an employee.
TNF (Third Normal Form)
Employee: Employee No, Name, Branch Code
Branch: Branch Code, Branch Name, Location
Certification: Cert. ID, Cert. Name
Emp Certification: Emp No, Cert. ID, Cert. Done at, Marks obtained
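The four third-normal-form tables could be declared as follows, using Python's standard `sqlite3` module. This is a sketch under assumptions: the column types, the snake_case names, and the sample rows are not given in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE branch (
    branch_code TEXT PRIMARY KEY,
    branch_name TEXT NOT NULL,
    location    TEXT NOT NULL
);
CREATE TABLE employee (
    employee_no INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    branch_code TEXT NOT NULL REFERENCES branch(branch_code)
);
CREATE TABLE certification (
    cert_id   TEXT PRIMARY KEY,
    cert_name TEXT NOT NULL
);
CREATE TABLE emp_certification (
    employee_no INTEGER NOT NULL REFERENCES employee(employee_no),
    cert_id     TEXT NOT NULL REFERENCES certification(cert_id),
    done_at     TEXT,
    marks       INTEGER,
    PRIMARY KEY (employee_no, cert_id)
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO branch VALUES ('B1', 'Central', 'Pune')")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(1, "Asha", "B1"), (2, "Ravi", "B1")])

# The branch name is stored once, so a rename touches a single row.
conn.execute("UPDATE branch SET branch_name = 'Downtown' WHERE branch_code = 'B1'")

rows = conn.execute("""
    SELECT e.name, b.branch_name
    FROM employee AS e
    JOIN branch AS b ON b.branch_code = e.branch_code
    ORDER BY e.employee_no
""").fetchall()
```

Every employee of the branch sees the new name through the join, at the cost of reading two tables instead of one.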
Boyce-Codd Normal Form
The original definition of 3NF was inadequate in some situations
It was not satisfactory for tables:
that had multiple candidate keys
where the multiple candidate keys were composite
where the multiple candidate keys overlapped
Therefore, a new normal form, the Boyce-Codd Normal Form (BCNF), was introduced
A relation is in the Boyce-Codd Normal Form (BCNF) if and only if every determinant is a candidate key
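The BCNF condition can be checked mechanically once the functional dependencies are written down. The Python sketch below computes attribute closures and reports each non-trivial dependency whose determinant does not determine the whole relation, that is, whose determinant is not a superkey (the usual formal statement of the rule). The student/subject/teacher relation is a hypothetical textbook-style example with two overlapping composite candidate keys; it does not come from the slides.

```python
def closure(attributes, fds):
    """Return every attribute determined by `attributes` under `fds`.

    `fds` is a list of (determinant, dependent) pairs of frozensets.
    """
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Return the non-trivial FDs whose determinant is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not closure(lhs, fds) >= set(relation)
    ]


# Hypothetical relation: each teacher teaches one subject, and a student
# has one teacher per subject.
relation = {"student", "subject", "teacher"}
fds = [
    (frozenset({"student", "subject"}), frozenset({"teacher"})),
    (frozenset({"teacher"}), frozenset({"subject"})),
]

violations = bcnf_violations(relation, fds)
```

In this example the candidate keys are {student, subject} and {student, teacher}. The dependency of subject on teacher is reported because teacher alone is not a key, which is the kind of case the original 3NF definition did not handle well.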
Characteristics of a normalized database
Each table must have a key field.
All fields must contain atomic (single-valued) data.
There must be no repeating fields.
Each table must contain information about a single entity.
Each field in a table must depend on the key field.
All non-key fields must be mutually independent.
Understanding Denormalization
The end product of normalization is a set of related tables that comprise the database
However, in the interests of speed of response to critical queries, which demand information from more than one table, it is sometimes wiser to introduce a degree of redundancy in tables
The intentional introduction of redundancy in a table to improve performance is called denormalization
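A minimal Python sketch of denormalization, using the standard `sqlite3` module: the employee table carries a redundant copy of the branch name so that a frequent query can skip the join. The schema, the sample rows, and the `rename_branch` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE branch (
    branch_code TEXT PRIMARY KEY,
    branch_name TEXT NOT NULL
);
CREATE TABLE employee (
    employee_no INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    branch_code TEXT NOT NULL REFERENCES branch(branch_code),
    branch_name TEXT NOT NULL  -- redundant copy, kept to avoid a join
);
INSERT INTO branch VALUES ('B1', 'Central');
INSERT INTO employee VALUES (1, 'Asha', 'B1', 'Central');
""")

# The critical query reads a single table.
before = conn.execute("SELECT name, branch_name FROM employee").fetchall()


def rename_branch(conn, code, new_name):
    """Update both copies of the branch name in one transaction."""
    with conn:  # commits on success, rolls back on an exception
        conn.execute("UPDATE branch SET branch_name = ? WHERE branch_code = ?",
                     (new_name, code))
        conn.execute("UPDATE employee SET branch_name = ? WHERE branch_code = ?",
                     (new_name, code))


rename_branch(conn, "B1", "Downtown")
after = conn.execute("SELECT name, branch_name FROM employee").fetchall()
```

The read gets simpler, but every write to the branch name must now keep two copies in step, and the extra column takes additional space.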
Summary
In this lesson, you learned that:
Normalization is used to simplify table structures.
Normalization results in the formation of tables that satisfy certain specified constraints, and represent certain normal forms.
The normal forms are used to ensure that various types of anomalies and inconsistencies are not introduced in the database.
A table structure is always in a certain normal form.
Several normal forms have been identified.
Summary (Contd.)
The most important and widely used of these are:
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)
The intentional introduction of redundancy in a table in order to improve performance is called denormalization.
The decision to denormalize results in a trade-off between performance and data integrity.
Denormalization increases disk space utilization.