Normalization
Chapter 4
Purpose of Normalization
Normalization: a technique for producing a set of relations with desirable properties, given the data requirements of an enterprise.
The process of normalization is a formal method that identifies relations based on their primary or candidate keys and on the functional dependencies among their attributes.
Normalization provides a series of tests that can be applied to relations, so that a relational schema can be normalized to a specific form that prevents the possible occurrence of update anomalies.
First Normal Form Relation
We saw before that a relation should have the following properties:
A relation in a database has a unique name.
Each cell of the relation contains exactly one atomic value.
Each attribute has a distinct name within a relation.
The values of an attribute are all from the same domain.
The order of the attributes has no significance.
Each tuple is distinct; there are no duplicate tuples.
The order of tuples has no significance, theoretically.
In that case we say that the relation is in FIRST NORMAL FORM (1NF).
Normalization Process: 1NF
Unnormalized form: a table that contains repeating groups.
S#   PQ
     P#   QTY
S1   P1   300
     P2   200
     P3   400
     P4   200
     P5   100
     P6   100
S2   P1   300
     P2   400
S3   P2   200
S4   P2   200
     P4   300
     P5   400
First Normal Form: a relation in which the intersection of each row and column contains one and only one value.
S#   P#   QTY
S1   P1   300
S1   P2   200
S1   P3   400
S1   P4   200
S1   P5   100
S1   P6   100
S2   P1   300
S2   P2   400
S3   P2   200
S4   P2   200
S4   P4   300
S4   P5   400
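The step from the unnormalized table to 1NF can be sketched in a few lines of Python. This is only an illustration: the dictionary layout and the name `to_1nf` are assumptions made for the example, not part of the slides.

```python
# Unnormalized data: each supplier carries a repeating group of (P#, QTY) pairs.
# The dict-of-lists layout is an assumption chosen for this sketch.
unnormalized = {
    "S1": [("P1", 300), ("P2", 200), ("P3", 400), ("P4", 200), ("P5", 100), ("P6", 100)],
    "S2": [("P1", 300), ("P2", 400)],
    "S3": [("P2", 200)],
    "S4": [("P2", 200), ("P4", 300), ("P5", 400)],
}


def to_1nf(groups):
    """Flatten repeating groups into one (S#, P#, QTY) row per group member."""
    return [(s, p, qty) for s, parts in groups.items() for p, qty in parts]


rows_1nf = to_1nf(unnormalized)
print(len(rows_1nf))  # 12 rows, one atomic value per cell
```

Each supplier number is repeated on every row, which is exactly what makes every row-column intersection hold a single value.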
Sample Database: 1NF
Relation First
S#   Status   City     P#   QTY
S1   20       London   P1   300
S1   20       London   P2   200
S1   20       London   P3   400
S1   20       London   P4   200
S1   20       London   P5   100
S1   20       London   P6   100
S2   10       Paris    P1   300
S2   10       Paris    P2   400
S3   10       Paris    P2   200
S4   20       London   P2   200
S4   20       London   P4   300
S4   20       London   P5   400
Constraints
Primary key: (S#, P#)
A city has a specific status.
A supplier is located in one city.
Update Anomalies
Insertion anomalies: each time we insert a new part for a supplier, we have to repeat the supplier's status and city. We cannot insert a new supplier before that supplier supplies a part, because P# is part of the primary key.
Deletion anomaly: if we delete the fact that S3 supplies P2, we delete the only row for S3 and no longer know that S3 is located in Paris.
Modification anomalies: if S1 moves from London to Berlin, we have to modify 6 rows. If the status of London changes, we have to modify 9 rows in order to avoid inconsistency.
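The row counts quoted for these anomalies can be checked against the sample data. The sketch below rebuilds the relation First from two helper structures (`SUPPLIERS` and `SHIPMENTS`, names assumed for this example) and counts the affected rows.

```python
# Sample data from the relation First; the two helper structures are
# assumptions used only to build the rows compactly.
SUPPLIERS = {"S1": (20, "London"), "S2": (10, "Paris"),
             "S3": (10, "Paris"), "S4": (20, "London")}
SHIPMENTS = [("S1", "P1", 300), ("S1", "P2", 200), ("S1", "P3", 400),
             ("S1", "P4", 200), ("S1", "P5", 100), ("S1", "P6", 100),
             ("S2", "P1", 300), ("S2", "P2", 400), ("S3", "P2", 200),
             ("S4", "P2", 200), ("S4", "P4", 300), ("S4", "P5", 400)]

# Each row is (S#, Status, City, P#, QTY).
FIRST = [(s, *SUPPLIERS[s], p, q) for s, p, q in SHIPMENTS]

# Modification anomalies: how many rows must change?
rows_if_s1_moves = sum(1 for r in FIRST if r[0] == "S1")
rows_if_london_status_changes = sum(1 for r in FIRST if r[2] == "London")
print(rows_if_s1_moves, rows_if_london_status_changes)  # 6 9

# Deletion anomaly: removing the shipment (S3, P2) removes all knowledge of S3.
after_delete = [r for r in FIRST if (r[0], r[3]) != ("S3", "P2")]
s3_still_known = any(r[0] == "S3" for r in after_delete)
print(s3_still_known)  # False
```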
Functional Dependencies
Functional dependency: describes the relationship between attributes in a relation. If A and B are attributes of relation R, B is functionally dependent on A if each value of A in R is associated with exactly one value of B in R.
Notation: A → B
Functional dependency diagram: A → B (B is functionally dependent on A).
Determinant: the determinant of a functional dependency is the attribute or group of attributes at the starting point of the arrow (A in A → B).
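As a rough illustration of the definition, the following sketch tests whether A → B is contradicted by a given set of rows. The function name `holds` and the dict-based row layout are assumptions for the example. Note that a sample of rows can only refute a dependency; whether it truly holds is a statement about the enterprise's rules, not about one instance.

```python
SUPPLIERS = {"S1": (20, "London"), "S2": (10, "Paris"),
             "S3": (10, "Paris"), "S4": (20, "London")}
SHIPMENTS = [("S1", "P1", 300), ("S1", "P2", 200), ("S1", "P3", 400),
             ("S1", "P4", 200), ("S1", "P5", 100), ("S1", "P6", 100),
             ("S2", "P1", 300), ("S2", "P2", 400), ("S3", "P2", 200),
             ("S4", "P2", 200), ("S4", "P4", 300), ("S4", "P5", 400)]
FIRST = [{"S#": s, "Status": SUPPLIERS[s][0], "City": SUPPLIERS[s][1],
          "P#": p, "QTY": q} for s, p, q in SHIPMENTS]


def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


print(holds(FIRST, ("S#",), ("City",)))       # True
print(holds(FIRST, ("City",), ("Status",)))   # True
print(holds(FIRST, ("S#", "P#"), ("QTY",)))   # True
print(holds(FIRST, ("P#",), ("QTY",)))        # False: P2 appears with 200 and 400
```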
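Functional Dependencies in FIRST
[Functional dependency diagram over the attributes S#, P#, QTY, City and Status.]
The constraints on First give the following dependencies:
(S#, P#) → QTY
S# → City
City → Status (so S# → Status also holds, transitively)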
Lossless-join and Dependency Preservation Properties
Two important properties of decomposition:
Lossless-join property: enables us to find any instance of the original relation from corresponding instances in the smaller relations.
Dependency preservation property: enables us to enforce a constraint on the original relation by enforcing some constraint on each of the smaller relations.
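The lossless-join property can be tried out on the sample relation First. The sketch below projects First onto two attribute sets and joins the projections back; the helper names `project` and `natural_join` are assumptions for this example, and the second decomposition (splitting on City) is a deliberately bad choice used only to show spurious rows.

```python
SUPPLIERS = {"S1": (20, "London"), "S2": (10, "Paris"),
             "S3": (10, "Paris"), "S4": (20, "London")}
SHIPMENTS = [("S1", "P1", 300), ("S1", "P2", 200), ("S1", "P3", 400),
             ("S1", "P4", 200), ("S1", "P5", 100), ("S1", "P6", 100),
             ("S2", "P1", 300), ("S2", "P2", 400), ("S3", "P2", 200),
             ("S4", "P2", 200), ("S4", "P4", 300), ("S4", "P5", 400)]
ATTRS = ("S#", "Status", "City", "P#", "QTY")
FIRST = [{"S#": s, "Status": SUPPLIERS[s][0], "City": SUPPLIERS[s][1],
          "P#": p, "QTY": q} for s, p, q in SHIPMENTS]


def project(rows, attrs):
    """Projection with duplicate elimination; each tuple is a frozenset of (attr, value)."""
    return {frozenset((a, r[a]) for a in attrs) for r in rows}


def natural_join(left, right):
    """Join two projections on all attributes they have in common."""
    joined = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            if all(ld[a] == rd[a] for a in ld.keys() & rd.keys()):
                joined.add(frozenset({**ld, **rd}.items()))
    return joined


original = project(FIRST, ATTRS)

# Decomposition on S#: the join gives back exactly the original rows.
good = natural_join(project(FIRST, ("S#", "P#", "QTY")),
                    project(FIRST, ("S#", "Status", "City")))
print(good == original)  # True

# Decomposition on City: the join produces extra (spurious) rows.
bad = natural_join(project(FIRST, ("S#", "Status", "City")),
                   project(FIRST, ("City", "P#", "QTY")))
print(bad == original, len(original), len(bad))  # False 12 22
```

A join that returns more rows than the original has lost information: it is no longer possible to tell which supplier shipped which part.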
Full Functional Dependency
Full functional dependency: if A and B are attributes (or sets of attributes) of a relation, B is fully functionally dependent on A if B is functionally dependent on A, but not on any proper subset of A.
e.g. First.(S#, Status) → First.City is a functional dependency, but not a full one, because First.S# → First.City also holds.
In fact, full functional dependence is a more important concept than functional dependence.
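The example above can be reproduced with a small check that tries every proper subset of the determinant. The names `holds` and `fully_dependent` are assumptions for this sketch, and as before the test runs against one instance of First, so it can only show that the sample rows are consistent with a dependency.

```python
from itertools import combinations

SUPPLIERS = {"S1": (20, "London"), "S2": (10, "Paris"),
             "S3": (10, "Paris"), "S4": (20, "London")}
SHIPMENTS = [("S1", "P1", 300), ("S1", "P2", 200), ("S1", "P3", 400),
             ("S1", "P4", 200), ("S1", "P5", 100), ("S1", "P6", 100),
             ("S2", "P1", 300), ("S2", "P2", 400), ("S3", "P2", 200),
             ("S4", "P2", 200), ("S4", "P4", 300), ("S4", "P5", 400)]
FIRST = [{"S#": s, "Status": SUPPLIERS[s][0], "City": SUPPLIERS[s][1],
          "P#": p, "QTY": q} for s, p, q in SHIPMENTS]


def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def fully_dependent(rows, lhs, rhs):
    """lhs -> rhs holds, and no non-empty proper subset of lhs determines rhs."""
    if not holds(rows, lhs, rhs):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if holds(rows, subset, rhs):
                return False
    return True


print(fully_dependent(FIRST, ("S#", "Status"), ("City",)))  # False: S# alone suffices
print(fully_dependent(FIRST, ("S#",), ("City",)))           # True
print(fully_dependent(FIRST, ("S#", "P#"), ("QTY",)))       # True
```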
Normalization Process: 2NF
A relation is in second normal form if it is in first normal form and every non-primary-key attribute is fully functionally dependent on the primary key.
Relation SP
S#   P#   QTY
S1   P1   300
S1   P2   200
S1   P3   400
S1   P4   200
S1   P5   100
S1   P6   100
S2   P1   300
S2   P2   400
S3   P2   200
S4   P2   200
S4   P4   300
S4   P5   400
Relation Second
S#   Status   City
S1   20       London
S2   10       Paris
S3   10       Paris
S4   20       London
S5   30       Athens
The reduction consists of a suitable projection.
Supplier S5 is inserted in Second, although S5 does not supply any part yet.
If the primary key of a 1NF relation has only one attribute, the relation is also in 2NF.
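The 2NF test can also be phrased at the schema level, using the declared dependencies instead of sample rows. The sketch below assumes a single candidate key (the primary key, as in the definition above) and uses the standard attribute-closure computation; the names `closure` and `partial_dependencies` are chosen for this example.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, attrs, fds):
    """Map each proper subset of the key to the non-key attributes it determines."""
    non_key = set(attrs) - set(key)
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            dependent = closure(subset, fds) & non_key
            if dependent:
                found[subset] = dependent
    return found


# Dependencies taken from the constraints on First.
FIRST_ATTRS = {"S#", "P#", "QTY", "City", "Status"}
FIRST_FDS = [({"S#", "P#"}, {"QTY"}), ({"S#"}, {"City"}), ({"City"}, {"Status"})]

SP_ATTRS = {"S#", "P#", "QTY"}
SP_FDS = [({"S#", "P#"}, {"QTY"})]

SECOND_ATTRS = {"S#", "City", "Status"}
SECOND_FDS = [({"S#"}, {"City"}), ({"City"}, {"Status"})]

first_partial = partial_dependencies(("S#", "P#"), FIRST_ATTRS, FIRST_FDS)
sp_partial = partial_dependencies(("S#", "P#"), SP_ATTRS, SP_FDS)
second_partial = partial_dependencies(("S#",), SECOND_ATTRS, SECOND_FDS)

print(first_partial)   # City and Status depend on S# alone, so First is not in 2NF
print(sp_partial)      # {}
print(second_partial)  # {} (a single-attribute key has no proper subsets)
```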
Functional Dependencies in SP and Second
[Functional dependency diagrams for SP and Second.]
SP: (S#, P#) → QTY
Second: S# → City, S# → Status, City → Status
Elimination of non-fully functional dependencies
Normalization Process: 3NF
A relation in third normal form is a relation that is in first and second normal form, and in which no non-primary-key attribute is transitively dependent on the primary key.
Relation SC
S#   City
S1   London
S2   Paris
S3   Paris
S4   London
S5   Athens
Relation CS
City     Status
Athens   30
London   20
Paris    10
Functional Dependencies in SP, SC and CS
[Functional dependency diagrams for SP, SC and CS.]
SP: (S#, P#) → QTY
SC: S# → City
CS: City → Status
Elimination of transitive dependencies
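A simplified schema-level check for transitive dependencies is sketched below. It assumes a single candidate key and reports every dependency whose determinant is neither a superkey nor part of the key; the function names are assumptions for the example, not a standard API.

```python
def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def transitive_dependencies(key, attrs, fds):
    """List (determinant, attribute) pairs where a non-key attribute depends
    on something that is neither a superkey nor a subset of the key."""
    key, attrs = set(key), set(attrs)
    found = []
    for lhs, rhs in fds:
        if closure(lhs, fds) >= attrs or lhs <= key:
            continue  # superkey, or a partial dependency (a 2NF matter)
        for attr in sorted((rhs - lhs) - key):
            found.append((tuple(sorted(lhs)), attr))
    return found


second = transitive_dependencies(
    {"S#"}, {"S#", "City", "Status"},
    [({"S#"}, {"City"}), ({"City"}, {"Status"})])
sc = transitive_dependencies({"S#"}, {"S#", "City"}, [({"S#"}, {"City"})])
cs = transitive_dependencies({"City"}, {"City", "Status"}, [({"City"}, {"Status"})])

print(second)  # [(('City',), 'Status')]: Status depends on S# via City
print(sc, cs)  # [] []
```

Second fails the 3NF test because of City → Status, while its projections SC and CS both pass.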
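Inter-relational Dependency
Bad decomposition: Second is split into projections on (S#, City) and (S#, Status) instead of SC and CS.
[Functional dependency diagrams for the three relations.]
SP: (S#, P#) → QTY
(S#, City): S# → City
(S#, Status): S# → Status
The three relations are also in 3NF, but there is an inter-relational dependency: City → Status now spans two relations and can no longer be enforced within a single relation.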
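The problem with the bad decomposition can be made concrete. In the sketch below the two projections are stored as Python dictionaries, so S# → City and S# → Status are enforced automatically within each one; the variable names `sc` and `ss` and the helper `city_status_consistent` are assumptions for this example. An update that is perfectly legal inside one relation still breaks City → Status, and only a check across both relations notices.

```python
# Projections of Second on (S#, City) and (S#, Status).
# A dict allows one value per key, so each relation enforces its own FD.
sc = {"S1": "London", "S2": "Paris", "S3": "Paris", "S4": "London"}
ss = {"S1": 20, "S2": 10, "S3": 10, "S4": 20}  # "ss" is a hypothetical name


def city_status_consistent(sc, ss):
    """Check City -> Status by combining both relations on S#."""
    status_of_city = {}
    for supplier, city in sc.items():
        if supplier not in ss:
            continue
        status = ss[supplier]
        if status_of_city.setdefault(city, status) != status:
            return False
    return True


before = city_status_consistent(sc, ss)
ss["S4"] = 30  # legal within ss: S# -> Status still holds there
after = city_status_consistent(sc, ss)
print(before, after)  # True False: London now has status 20 and 30
```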
Normalization Process: BCNF
A relation is in Boyce-Codd normal form if and only if every determinant is a candidate key.
This definition does not refer to other normal forms. BCNF is stronger than 3NF.
If a relation is in third normal form, violation of BCNF is quite rare. It may only happen under the specific conditions that the relation:
contains two or more composite candidate keys
which overlap and share at least one attribute in common
this attribute is fully dependent on the primary key
Boyce-Codd Normal Form (BCNF)
Violation of BCNF may occur in a relation that contains two (or more) composite keys which overlap and share at least one attribute in common.
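The slides give no example relation for a BCNF violation, so the sketch below uses a hypothetical one that is not taken from this chapter: R(S#, SNAME, P#, QTY), where supplier names are assumed to be unique, giving the overlapping candidate keys (S#, P#) and (SNAME, P#). The check flags every non-trivial dependency whose determinant does not determine all attributes (a superkey test, which is the usual way to apply the definition); the function names are also assumptions.

```python
def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attrs, fds):
    """Return the determinants of non-trivial FDs that are not superkeys."""
    violations = []
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial dependency
        if not closure(lhs, fds) >= set(attrs):
            violations.append(tuple(sorted(lhs)))
    return violations


# Hypothetical relation with overlapping composite candidate keys.
R_ATTRS = {"S#", "SNAME", "P#", "QTY"}
R_FDS = [({"S#"}, {"SNAME"}), ({"SNAME"}, {"S#"}),
         ({"S#", "P#"}, {"QTY"}), ({"SNAME", "P#"}, {"QTY"})]

r_violations = bcnf_violations(R_ATTRS, R_FDS)
cs_violations = bcnf_violations({"City", "Status"}, [({"City"}, {"Status"})])

print(r_violations)   # [('S#',), ('SNAME',)]
print(cs_violations)  # []
```

In the hypothetical relation, S# and SNAME are determinants but neither is a candidate key on its own, so the relation is not in BCNF even though its keys (S#, P#) and (SNAME, P#) overlap only in P#.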
Multivalued Dependency
In R(A, B, C) the multivalued dependency R.A →→ R.B holds in R if and only if the set of B-values matching a given (A-value, C-value) pair depends only on the A-value and is independent of the C-value.
Multivalued dependencies always go together in pairs.
Notation: R.A →→ R.B | R.C
A functional dependency is a special case of a multivalued dependency.
Example: relation CTX (unnormalized)
Course    Teacher       Text
Physics   Prof. Green   Basic Mechanics
          Prof. Brown   Optics
          Prof. Black
Math      Prof. White   Modern Algebra
                        Geometry
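The definition can be tested on the CTX example. For each course, Course →→ Teacher (and with it Course →→ Text) holds exactly when the (Teacher, Text) pairs present are the full cross product of that course's teachers and texts. The sketch below builds the flat rows from the grouped table; `GROUPS` and `mvd_holds` are names assumed for the example.

```python
from itertools import product

# The unnormalized CTX table: per course, a set of teachers and a set of texts.
GROUPS = {
    "Physics": (["Prof. Green", "Prof. Brown", "Prof. Black"],
                ["Basic Mechanics", "Optics"]),
    "Math": (["Prof. White"], ["Modern Algebra", "Geometry"]),
}

# Flat rows (Course, Teacher, Text): every teacher paired with every text.
CTX = {(course, teacher, text)
       for course, (teachers, texts) in GROUPS.items()
       for teacher, text in product(teachers, texts)}


def mvd_holds(rows):
    """Check A ->> B | C on rows of the form (a, b, c)."""
    by_a = {}
    for a, b, c in rows:
        bs, cs, pairs = by_a.setdefault(a, (set(), set(), set()))
        bs.add(b)
        cs.add(c)
        pairs.add((b, c))
    # For each A-value, the (B, C) pairs must be the full cross product.
    return all(pairs == set(product(bs, cs)) for bs, cs, pairs in by_a.values())


print(len(CTX), mvd_holds(CTX))  # 8 True

# Dropping a single row breaks the dependency.
broken = CTX - {("Physics", "Prof. Black", "Optics")}
print(mvd_holds(broken))  # False
```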
Normalization Process: 4NF
Relation CTX (normalized)
Course    Teacher       Text
Physics   Prof. Green   Basic Mechanics
Physics   Prof. Green   Optics
Physics   Prof. Brown   Basic Mechanics
Physics   Prof. Brown   Optics
Physics   Prof. Black   Basic Mechanics
Physics   Prof. Black   Optics
Math      Prof. White   Modern Algebra
Math      Prof. White   Geometry
4NF projections of CTX
Course    Teacher
Physics   Prof. Green
Physics   Prof. Brown
Physics   Prof. Black
Math      Prof. White
Course    Text
Physics   Basic Mechanics
Physics   Optics
Math      Modern Algebra
Math      Geometry
CTX.Course →→ CTX.Teacher
CTX.Course →→ CTX.Text
Elimination of multivalued dependencies
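The decomposition into the two projections can be checked in a few lines. The sketch below projects CTX onto (Course, Teacher) and (Course, Text) and joins them back on Course; the variable names `ct`, `cx` and `rejoined` are assumptions for this example.

```python
CTX = {
    ("Physics", "Prof. Green", "Basic Mechanics"),
    ("Physics", "Prof. Green", "Optics"),
    ("Physics", "Prof. Brown", "Basic Mechanics"),
    ("Physics", "Prof. Brown", "Optics"),
    ("Physics", "Prof. Black", "Basic Mechanics"),
    ("Physics", "Prof. Black", "Optics"),
    ("Math", "Prof. White", "Modern Algebra"),
    ("Math", "Prof. White", "Geometry"),
}

# Projections on (Course, Teacher) and (Course, Text).
ct = {(course, teacher) for course, teacher, _ in CTX}
cx = {(course, text) for course, _, text in CTX}

# Natural join of the projections on Course.
rejoined = {(course, teacher, text)
            for course, teacher in ct
            for other_course, text in cx
            if course == other_course}

print(len(ct), len(cx))  # 4 4
print(rejoined == CTX)   # True: no information is lost
```

Because both multivalued dependencies hold in CTX, the join returns exactly the original rows. Adding a new Physics teacher now takes one row in the (Course, Teacher) projection instead of one row per Physics text in CTX.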