The Hong Kong University of Science and Technology
COMP231 Tutorial 5
Functional Dependencies, 3NF
HKUST 2 Database Management Systems
Review: Key
• Superkey: K → R
• Candidate key: K → R, and there is no K' ⊂ K such that K' → R (minimal)
• Primary key: the candidate key chosen to uniquely identify tuples in a relation
[Figure: nested sets, primary key inside candidate key inside superkey]
Review: The Closure of FD
• For a set of functional dependencies F, we can get the closure, F+, by applying Armstrong's Axioms.
• Armstrong's Axioms:
  • Reflexivity
    If Y ⊆ X, then X → Y
  • Augmentation
    If X → Y, then XZ → YZ
  • Transitivity
    If X → Y and Y → Z, then X → Z
• Derived rules:
  • Decomposition
    If X → YZ, then X → Y and X → Z
  • Union
    If X → Y and X → Z, then X → YZ
  • Pseudo-transitivity
    If X → Y and WY → Z, then WX → Z
Review: The Closure of Attributes
Definition:
X, Y are sets of attributes of a relation R:
X → Y is in F+ ⟺ Y ⊆ X+
Example:
Given R = (loan_no, amount, branch_name, customer_name)
• If loan_no → amount
  then loan_no+ = {loan_no, amount}
• If we also have loan_no → branch_name
  then loan_no+ = {loan_no, amount, branch_name}
• If we also have loan_no → customer_name
  then loan_no+ = {loan_no, amount, branch_name, customer_name}
Algorithm:
• X(0) := X
• Repeat
  X(i+1) := X(i) ∪ Z, where Z is the set of attributes such that there exists Y → Z in F and Y ⊆ X(i)
• Until X(i+1) = X(i)
• Return X(i+1)
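The closure algorithm above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `closure` and the representation of an FD as a `(lhs, rhs)` pair of attribute sets are choices made here, not part of the tutorial.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply Y -> Z whenever Y is already contained in the closure.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# The loan example from the slide, with all three FDs present.
loan_fds = [
    ({"loan_no"}, {"amount"}),
    ({"loan_no"}, {"branch_name"}),
    ({"loan_no"}, {"customer_name"}),
]
loan_closure = closure({"loan_no"}, loan_fds)
amount_closure = closure({"amount"}, loan_fds)
```

With these FDs, `loan_closure` contains all four attributes of R, while `amount_closure` contains only `amount`, because no FD has `amount` on its left-hand side.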
Review: Canonical Cover of FD
Definition:
A canonical cover for F is a set of dependencies Fc such that
• F and Fc are equivalent
• Fc contains no redundancy: no functional dependency in Fc contains an extraneous attribute
• The left-hand side of each functional dependency in Fc is unique
Review: Normalization
• Decomposition of a relation R with the following goals:
  • Lossless (necessary)
    Is any information lost?
  • Dependency preservation (desirable)
    (∪i Fi)+ = F+ ?
  • Good form
    1NF, 2NF, 3NF, BCNF
2NF:
R is in 2NF if and only if, for each FD X → {A} in F+, at least one of the following holds:
• A ∈ X (the FD is trivial), OR
• X is not a proper subset of a candidate key for R, OR
• A is a prime attribute
3NF:
R is in 3NF if and only if, for each FD X → {A} in F+, at least one of the following holds:
• A ∈ X (the FD is trivial), OR
• X is a superkey for R, OR
• A is a prime attribute for R
• A prime attribute is an attribute that is part of a candidate key
Exercise 1: The Closure of Attributes
R = (A, B, C, D, E)
F = {A → BC, CD → E, B → D, E → A}
Compute A+ and B+:
A+ := {A}
   := {A, B, C}          since A → BC and {A} ⊆ A+
   := {A, B, C, D}       since B → D and {B} ⊆ A+
   := {A, B, C, D, E}    since CD → E and {C, D} ⊆ A+
   ends because A+ stops changing
B+ := {B}
   := {B, D}             since B → D and {B} ⊆ B+
   ends because B+ stops changing
Exercise 2: Candidate Keys
R = (A, B, C, D, E)
F = {A → BC, CD → E, B → D, E → A}
List all candidate keys of R.
• We have A+ = {A, B, C, D, E} from Exercise 1,
  so A → ABCDE, and A is a candidate key of R.
• Since E → A,
  then E → ABCDE. (transitivity)
• Since CD → E,
  then CD → ABCDE. (transitivity)
• Since B → D,
  then BC → CD, then BC → ABCDE. (augmentation, transitivity)
• CD and BC are minimal: B+ = {B, D}, C+ = {C}, D+ = {D}, so no single attribute of either is a superkey.
So A, E, CD, BC are candidate keys of R.
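The result of Exercise 2 can be cross-checked by brute force. The sketch below is an assumption-laden illustration, not the tutorial's method: it enumerates attribute subsets by increasing size and keeps those whose closure is all of R, skipping supersets of keys already found. The names `closure` and `candidate_keys` are hypothetical, and the approach is exponential, so it only suits small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Enumerate minimal attribute sets whose closure is the whole relation."""
    all_attrs = set(attributes)
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = set(combo)
            # A superset of a key found earlier is a superkey but not minimal.
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == all_attrs:
                keys.append(candidate)
    return keys


# R = (A, B, C, D, E), F = {A -> BC, CD -> E, B -> D, E -> A}
fds = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]
keys = candidate_keys("ABCDE", fds)
key_names = sorted("".join(sorted(k)) for k in keys)
```

For this F, `key_names` is `['A', 'BC', 'CD', 'E']`, which agrees with the keys derived by hand.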
Exercise 3: Compute Canonical Cover
R = (A, B, C, D, E)
F = {AC → E, ACD → B, CE → D, B → E}
Find the canonical cover of F.
Algorithm:
Repeat
  Union: replace X1 → Y1 and X1 → Y2 with X1 → Y1Y2
  Find an extraneous attribute
  If an extraneous attribute is found in X → Y,
    delete it from X → Y
Until F does not change
Exercise 3: Compute Canonical Cover (cont)
R = (A, B, C, D, E)
F = {AC → E, ACD → B, CE → D, B → E}
Find the canonical cover of F.
First loop:
  Union
    Fc(1) = {AC → E, ACD → B, CE → D, B → E}
  Find an extraneous attribute
    Consider ACD → B:
    D is extraneous because AC → E and CE → D (so AC → D)
    Remove D from ACD → B
    Fc(1) = {AC → E, AC → B, CE → D, B → E}
Exercise 3: Compute Canonical Cover (cont)
R = (A, B, C, D, E)
Fc(1) = {AC → E, AC → B, CE → D, B → E}
Second loop:
  Union
    Fc(2) = {AC → BE, CE → D, B → E}
  Find an extraneous attribute
    Consider AC → BE:
    E is extraneous because B → E
    Remove E from AC → BE
    Fc(2) = {AC → B, CE → D, B → E}
Exercise 3: Compute Canonical Cover (cont)
R = (A, B, C, D, E)
Fc(2) = {AC → B, CE → D, B → E}
Third loop:
  Union
    Fc(3) = {AC → B, CE → D, B → E}
  Find an extraneous attribute
    No extraneous attributes found
Ends because Fc stops changing
Fc = {AC → B, CE → D, B → E}
Exercise 3: Compute Canonical Cover (cont)
• Removing extraneous attributes in a different order may result in a different Fc
• Example:
  R = (A, B, C, D)
  F = {A → C, BC → A, ABC → D}
• In ABC → D, A is extraneous or C is extraneous
• If we remove A first, we get Fc = {A → C, BC → AD}
• If we remove C first, we get Fc = {A → C, BC → A, AB → D}
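The union / extraneous-attribute loop of Exercise 3 can be sketched as follows. This is one possible implementation under stated assumptions: the helper names (`closure`, `merge`, `remove_one_extraneous`, `canonical_cover`) are hypothetical, and the sketch always tries attributes in alphabetical order, left-hand side before right-hand side. As the example above shows, a different order can yield a different but equally valid cover.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def merge(fds):
    """Union rule: combine FDs sharing a left-hand side; drop empty ones."""
    merged = {}
    for lhs, rhs in fds:
        merged.setdefault(frozenset(lhs), set()).update(rhs)
    return [(lhs, rhs) for lhs, rhs in merged.items() if rhs]


def remove_one_extraneous(fds):
    """Return a new FD list with one extraneous attribute removed, or None."""
    for i, (lhs, rhs) in enumerate(fds):
        # Left side: a is extraneous if (lhs - a) -> rhs already follows from fds.
        for a in sorted(lhs):
            reduced = lhs - {a}
            if reduced and rhs <= closure(reduced, fds):
                return fds[:i] + [(reduced, rhs)] + fds[i + 1:]
        # Right side: b is extraneous if lhs -> b still follows without it.
        for b in sorted(rhs):
            trimmed = fds[:i] + [(lhs, rhs - {b})] + fds[i + 1:]
            if b in closure(lhs, trimmed):
                return trimmed
    return None


def canonical_cover(fds):
    current = merge(fds)
    while True:
        reduced = remove_one_extraneous(current)
        if reduced is None:
            return current
        current = merge(reduced)


def show(fds):
    """Render FDs as sorted (lhs, rhs) string pairs for easy comparison."""
    return sorted(("".join(sorted(l)), "".join(sorted(r))) for l, r in fds)


cover_ex3 = show(canonical_cover([("AC", "E"), ("ACD", "B"), ("CE", "D"), ("B", "E")]))
cover_order = show(canonical_cover([("A", "C"), ("BC", "A"), ("ABC", "D")]))
```

On the Exercise 3 input this reproduces Fc = {AC → B, CE → D, B → E}. On the second example it removes A first (alphabetical order) and so returns {A → C, BC → AD}.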
Exercise 4: Normal forms
• R = (A, B, C, D, E)
• F = {A → BC, CD → E, B → D, E → A}
• Is R in 1NF?
  • Yes. Relational tables are always in 1NF.
• Is R in 2NF?
  • We found candidate keys: A, E, CD, BC.
    A → BC    B and C are prime attributes
    CD → E    E is a prime attribute
    B → D     D is a prime attribute
    E → A     A is a prime attribute
  So R is in 2NF
What is 2NF?
For each FD X → {A}, at least one of:
1. A ∈ X (the FD is trivial)
2. X is NOT a proper subset of a candidate key
3. A is a prime attribute
Exercise 4: Normal forms (cont)
• R = (A, B, C, D, E)
• F = {A → BC, CD → E, B → D, E → A}
• Is R in 3NF?
  • We found candidate keys: A, E, CD, BC.
    A → BC    A is a candidate key
    CD → E    CD is a candidate key
    B → D     D is a prime attribute
    E → A     E is a candidate key
  So R is in 3NF
What is 3NF?
For each FD X → {A}, at least one of:
1. A ∈ X (the FD is trivial)
2. X is a superkey (this is the condition that differs from 2NF)
3. A is a prime attribute
Exercise 4: Normal forms (cont)
• R = (A, B, C, D, E, F)
• F = {A → B, BC → D, C → E, B → F}
• Is R in 2NF?
  • Candidate key: AC.
    A → B     A is a proper subset of the candidate key AND B is not a prime attribute
    BC → D    BC is not a proper subset of the candidate key
    C → E     C is a proper subset of the candidate key AND E is not a prime attribute
    B → F     B is not a proper subset of the candidate key
  Either A → B or C → E makes R not in 2NF
What is 2NF?
For each FD X → {A}, at least one of:
1. A ∈ X (the FD is trivial)
2. X is NOT a proper subset of a candidate key
3. A is a prime attribute
Exercise 4: Normal forms (cont)
• R = (A, B, C, D, E, F)
• F = {A → B, BC → D, C → E, B → F}
• Is R in 3NF?
  • No. 3NF ⊆ 2NF ⊆ 1NF; R is not in 2NF, so R is not in 3NF either.
  • Candidate key: AC.
    A → B     A is not a superkey AND B is not a prime attribute
    BC → D    BC is not a superkey AND D is not a prime attribute
    C → E     C is not a superkey AND E is not a prime attribute
    B → F     B is not a superkey AND F is not a prime attribute
  Any one of these FDs makes R not in 3NF
What is 3NF?
For each FD X → {A}, at least one of:
1. A ∈ X (the FD is trivial)
2. X is a superkey
3. A is a prime attribute
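The 3NF test used in Exercise 4 can be sketched in Python. This is an illustration under assumptions: the function names are hypothetical, the candidate keys are passed in as given by the tutorial rather than recomputed, and the sketch checks only the FDs listed in F (split into single right-hand-side attributes), which is the usual shortcut for 3NF testing rather than a scan of all of F+.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def violations_3nf(attributes, fds, keys):
    """Return the (X, A) pairs from `fds` that fail all three 3NF conditions."""
    all_attrs = set(attributes)
    prime = set().union(*keys)          # attributes that appear in some candidate key
    found = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) == all_attrs
        for a in sorted(set(rhs) - set(lhs)):   # skipping A in X: trivial part
            if not is_superkey and a not in prime:
                found.append((lhs, a))
    return found


# First relation of Exercise 4: keys A, E, CD, BC (from Exercise 2).
first = violations_3nf(
    "ABCDE",
    [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")],
    ["A", "E", "CD", "BC"],
)

# Second relation of Exercise 4: single candidate key AC.
second = violations_3nf(
    "ABCDEF",
    [("A", "B"), ("BC", "D"), ("C", "E"), ("B", "F")],
    ["AC"],
)
```

Here `first` is empty, so the first relation passes the 3NF test, while `second` lists all four FDs, matching the slide's conclusion that any one of them breaks 3NF.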
Exercise 5: Decomposition
R = (A, B, C, D, E, F, G, H)
F = {AC → G, D → EG, BC → D, CG → BD, ACD → B, CE → AG}
• A decomposition of R:
  • Table1: (A, B, C, D)
  • Table2: (D, E, G)
  • Table3: (A, C, D, F, H)
• Is it lossless? Yes
  A decomposition of R into R1 and R2 is lossless if and only if the common attributes of R1 and R2 form a superkey of R1 or R2
  • (Table1 ∪ Table3) ∩ Table2 = D (candidate key of Table2)
  • Table1 ∩ Table3 = ACD (superkey of Table1, since ACD → B)
[Figure: decomposition tree. R splits into (Table1 ∪ Table3) and Table2; Table1 ∪ Table3 splits into Table1 and Table3]
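The two-step lossless check can be sketched with the attribute-closure routine: for each binary split, compute the closure of the common attributes and see whether it covers one of the two sides. The function name `lossless_binary` is hypothetical, and the sketch assumes the decomposition is applied as the two successive binary splits shown in the tree.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def lossless_binary(r1, r2, fds):
    """True if the common attributes determine all of r1 or all of r2."""
    common = set(r1) & set(r2)
    determined = closure(common, fds)
    return set(r1) <= determined or set(r2) <= determined


fds = [("AC", "G"), ("D", "EG"), ("BC", "D"),
       ("CG", "BD"), ("ACD", "B"), ("CE", "AG")]
table1, table2, table3 = "ABCD", "DEG", "ACDFH"

# Step 1: split R into (Table1 ∪ Table3) and Table2; common attribute is D.
step1 = lossless_binary(set(table1) | set(table3), table2, fds)
# Step 2: split (Table1 ∪ Table3) into Table1 and Table3; common attributes are ACD.
step2 = lossless_binary(table1, table3, fds)
```

Both `step1` and `step2` evaluate to `True`, in line with the answer on the slide.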
Exercise 5: Decomposition (cont)
R = (A, B, C, D, E, F, G, H)
F = {AC → G, D → EG, BC → D, CG → BD, ACD → B, CE → AG}
• A decomposition of R:
  • Table1: (A, B, C, D)
  • Table2: (D, E, G)
  • Table3: (A, C, D, F, H)
• Is it dependency preserving?
  • No (CG → BD is lost)
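Dependency preservation can be tested without computing F+ explicitly. The sketch below uses the standard iterative test: starting from the left-hand side X, repeatedly add whatever each table can derive from the attributes of X it already contains, and check whether the right-hand side is reached. The function name `preserved` is hypothetical and the code is an illustration, not the tutorial's own procedure.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def preserved(lhs, rhs, tables, fds):
    """Test whether lhs -> rhs follows from the FDs projected onto `tables`."""
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for table in tables:
            t = set(table)
            # Attributes of this table determined by what we already hold in it.
            gained = closure(result & t, fds) & t
            if not gained <= result:
                result |= gained
                changed = True
    return set(rhs) <= result


fds = [("AC", "G"), ("D", "EG"), ("BC", "D"),
       ("CG", "BD"), ("ACD", "B"), ("CE", "AG")]
tables = ["ABCD", "DEG", "ACDFH"]
lost = [(l, r) for l, r in fds if not preserved(l, r, tables, fds)]
```

Under this test `lost` contains CG → BD, the dependency named on the slide, and also CE → AG: no table holds both C and E, and neither C nor E alone determines anything else.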
Exercise 6: 3NF
R = (A, B, C, D, E, F, G, H)
F = {AC → G, D → EG, BC → D, CG → BD, ACD → B, CE → AG}
Decompose R into 3NF
• Algorithm:
  Compute Fc of F
  S := ∅
  For each FD X → Y in Fc:
    S := S ∪ {(X, Y)}
  If no scheme in S contains a candidate key for R:
    Choose any candidate key CN
    S := S ∪ {table with the attributes in CN}
Exercise 6: 3NF (cont)
R = (A, B, C, D, E, F, G, H)
F = {AC → G, D → EG, BC → D, CG → BD, ACD → B, CE → AG}
• Applying the canonical cover algorithm from Exercise 3 to F, we get
  Fc = {AC → B, D → EG, BC → D, CG → D, CE → A}
• For each FD, we generate a table:
  Table1: (A, B, C)
  Table2: (D, E, G)
  Table3: (B, C, D)
  Table4: (C, D, G)
  Table5: (A, C, E)
Exercise 6: 3NF (cont)
R = (A, B, C, D, E, F, G, H)
Fc = {AC → B, D → EG, BC → D, CG → D, CE → A}
• Does any table in S contain a candidate key for R?
  No
• ACFH is a candidate key:
  Generate Table6: (A, C, F, H)
• So a possible 3NF decomposition of R is:
  (A, B, C), (D, E, G), (B, C, D), (C, D, G), (A, C, E), (A, C, F, H)
• Lossless?
  Yes
• Dependency preserving?
  Yes
[Figure: decomposition tree of R; node labels ACBDEG, ACFH, CEBDG, ACE, BCDG, DGE, BCD, CDG]
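The synthesis steps of Exercise 6 can be sketched as below. Several things are assumptions made for illustration: the function name `synthesize_3nf` is hypothetical, the canonical cover Fc and the candidate key ACFH are taken as given from the tutorial rather than computed, and "contains a candidate key" is tested as "is a superkey of R", which is equivalent.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def synthesize_3nf(attributes, cover, key):
    """One schema per FD in `cover`; add `key` if no schema is a superkey of R."""
    all_attrs = set(attributes)
    schemas = [set(lhs) | set(rhs) for lhs, rhs in cover]
    # A schema contains a candidate key exactly when its closure is all of R.
    if not any(closure(s, cover) == all_attrs for s in schemas):
        schemas.append(set(key))
    return schemas


# Fc and the candidate key ACFH as stated in the tutorial.
cover = [("AC", "B"), ("D", "EG"), ("BC", "D"), ("CG", "D"), ("CE", "A")]
schemas = synthesize_3nf("ABCDEFGH", cover, "ACFH")
schema_names = ["".join(sorted(s)) for s in schemas]
```

The resulting `schema_names` lists the five tables generated from Fc followed by the key table ACFH, matching the decomposition given above. None of the first five schemas contains F or H, which is why the key table has to be added.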