introduction to schema refinement. normal forms types of normal form first normal form (1nf) second...

37
Introduction to Schema Refinement

Upload: hilary-roberts

Post on 24-Dec-2015

229 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Introduction to Schema Refinement

Page 2: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Types of Normal Form

• First Normal Form (1NF)• Second Normal Form (2NF)• Third Normal Form(3NF)• Boyce-Codd Normal Form ( BCNF)• Fourth Normal Form (4NF)• Fifth Nornal Form (5NF)

Page 3: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• First Normal Form (1NF)• A row of data cannot contain repeating group

of data • Ie atomic value

• Here the student Jeet is used twice in the table and subject PHY is repeated

• Another method is to divide the relation into 2

Cid name Subject101 Jeet PHY101 Jeet CHE102 Seet PHY103 Swet SOCIAL

Cid Cname101 Jeet102 Seet

Subid Cid Subject1 101 PHY2 101 CHE3 102 PHY4 103 SOCIAL

Page 4: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms

• Second normal form (2NF): • A relation that is in 1NF and every non-primary

key attribute is fully functionally dependent on the primary key.

• Does not permit partial dependency• No attribute is dependent on only to primary

key• Primary key consists of only a single attribute it

is automatically be in 2NF if it is in 1NF

Page 5: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Second normal form (2NF): • Consider the following relation, not in 2NF

• Here Cid & Order_id is PK• It is in 1NF• Not in 2NF, there are partial dependencies of columns on Primary key• Cname is only dependent on Cid• Order_name is dependent on order_id• There is no link between Cname & Sale_details• To reduce this table into 2NF, break the table into 3 different tables

Cid Cname Order_id Order_name Sale_details101 Adam 10 Order1 Sale1101 Adam 11 order2 sale2102 Alex 12 Order3 Sale3103 Sumo 13 Order4 Sale4

Cid Cname101 Adam102 Alex103 Sumo

Order_id Order_name10 Order111 order212 Order313 Order4

Cid Order_id Sale_details

101 10 Sale1101 11 sale2102 12 Sale3103 13 Sale4

Page 6: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Second normal form (2NF): • 1NF 2NF

• This relation is in 1NF, because the value of each domain are atomic.

• To convert into 2NF• First find the attribute that make primary key• No one attribute alone form a primary key• 2NF does not contain partial dependency• Divide the above relation into 3

Order Product Customer Address Qty Unit Prices1 p1 Anish bhopal 300 500s1 p2 Anish bhopal 100 600s2 p3 Ammu kollam 500 300s3 p5 Sumo kottayam 200 200s3 p1 Sumo kottayam 450 500

Page 7: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• R1 R2 R3

• Advantages• Insert : can insert any row in relation R1 without

using product attribute• Delete : can delete the S1 & P2 by deleting a row

from relation R3, without losing the information that in R1

• Update: address for a given customer is written once, it is not repeated many times in relation R1, so we can update the address only once

Order Customer Addresss1 Anish bhopals2 Ammu kollams3 Sumo kottayam

Order Product Qtys1 p1 300s1 p2 100s2 p3 500s3 p5 200s3 p1 450

Product Unit Pricep1 500p2 600p3 300p5 200

Page 8: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Third normal form (3NF)• 3NF is based on the concept of transitive dependency. • Transitive dependencies are not allowed in 3NF. • Transitive dependency means, if in a relation if XY and YZ

hold, then X Z is also a functional dependency that holds on R. • Here X, Y, Z are attributes of the table and also Y should not be

a candidate key or a subset of any key (prime attribute) of the table R.

• Ie all non-prime attribute of table must be dependent on primary key

• Example. • Student3

Page 9: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Third normal form (3NF)

• 3 FD’s here. that is• • Fd1 Stdidgrade• • Fd2 Stdid marks• • Fd3 Marks grade• • We can see that marks is not a prime attribute of student3.• Stdid grade is a transitive dependency because of Fd2 and Fd3.• • This is not allowed in 3NF.

Page 10: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Third normal form (3NF)• A relation R is said to be in 3NF, if R is in 2NF and also no non prime

attribute of R is transitively dependent on the key of R.

• The above relation schema student3 is in 2NF, since there are no partial dependencies on a key exists.

• But it is not in 3NF because of the transitive dependency stdid grade via ‘marks’.

• We can normalize student3 by decomposing it in to two 3NF relation schemas,

• Student3A and student3B as follows.• • Student3A (stdid, branch, sem, rn, name, marks)• Student3B (marks, grade)

Page 11: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Third normal form (3NF)• Student3A and student3B as follows.• • Student3A (stdid, branch, sem, rn, name, marks)• Student3B (marks, grade)

Page 12: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms

• Boyce Codd Normal form (BCNF)• It higher form of 3NF• This is because every relation in BCNF is als ion 3NF. • However a relation in 3NF may not be in BCNF.• A relation schema R is in BCNF if whenever a

non trivial functional dependency X A holds in R, then X is a superkey of R.

• The only difference between BCNF and 3NF is that the condition (b) of 3NF is absent from BCNF.

Page 13: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Boyce Codd Normal form (BCNF)

• Here we can see that the relation Lots1A is not in BCNF, but it is in 3NF.

• FD5 violates BCNF because area is not a superkey.• Fd1 and Fd2 satisfies BCNF because the LHS are super keys.• So remove the attribute (county name) and place it in another

relation.

Page 14: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Boyce Codd Normal form (BCNF)

Page 15: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Fourth Normal Form (4NF)• A relation is in 4NF if • It is in BCNF• And• It has no multivalued dependency

• In this relation • Each subject has a well-defined set of trainers• Eg: MCA subject has 3 trainers• Each subject has a well-defined set of textbooks• Eg: mca subject has 2 textbooks• The textbook that are used for a given subject are independent of

the trainers

Subject Trainer Textbooks

MCAAjiJinsonLisha

DBMSJAVA

Computer JK OSC++

Page 16: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Fourth Normal Form (4NF)• The table has been converted to a realtion by filling in

all of empty rows• The relation is in 1NF

• The primary key of this relation consists of all three attributes

• Since there is no determinants other than the primary key, this relation is actually in BCNF

Subject Trainer TextbooksMCA AJI DBMSMCA AJI JAVAMCA Jinson DBMSMCA Jinson JAVAMCA Lisha DBMSMCA Lisha JAVA

Computer JK OSComputer JK C++

Page 17: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Fourth Normal Form (4NF)• Suppose for teaching MCA a new trainer

comes, it is necessary to create 2 new tuples , one for each of the 2 text books.

• See that it is not necessary to include all faculty• Decomposition cannot be made on the basis of

functional dependencies, because there are no functional dependencies in the relation.

• So we introduce multi valued dependencies (MVDs) in the relation.

Page 18: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Multivalued dependencies and 4NF• Subject trainer• Subjecttextbook• Double arrows are used here. • Read as• “ subject multidetermines trainer”• or “trainer is multidependent on subject”)• we know that a subject does not have a single corresponding trainer, ie..• functional dependency subject trainer does not hold. • But each subject has a well defined set of corresponding trainers. • By well defined here means that for a given subject(MCA) and a given text

book (DBMS) the set of trainers (AJI,Jinson,Lisha) matching the pair (MCA,Computer) in the relation depends on the value of MCA alone.

• It makes no difference what particular value of text book we choose.• The second MVD can also be interpreted like this.

Page 19: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Definition of multi valued dependency• Let R be a table, and let A, B, C be arbitrary subsets

of the set of attributes of R.• Then we say that B is multidependent on A , AB.• If and only if the set of B values matching a given

( A value, C value pair) in R depends only on the A value and is independent of the C value.

• MVDs always go together in pairs. That is given the table R (A, B, C), the MVD

• AB holds if and only if A C also holds.

Page 20: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms

• Fourth normal form• This is based on multivalued functional

dependencies.• A relation schema R is in 4NF with respect to a

set of dependencies F (that includes FDs and• MVDs) if, for every non trivial multivalued

dependency• X Y, X is a super key of R.• Consider the table.

Subject Trainer TextbooksMCA AJI DBMSMCA AJI JAVAMCA Jinson DBMSMCA Jinson JAVAMCA Lisha DBMSMCA Lisha JAVA

Computer JK OSComputer JK C++

Page 21: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Fourth normal form• The table or relation CFX is not in fourth normal

form because The MVDs subject textbook and subjecttrainer are not satisfying any of the 2 conditions of fourth normal form.

• So decomposing it into tables

Subject TrainerMCA AJIMCA JinsonMCA Lisha

Computer JK

Subject TextbooksMCA DBMSMCA JAVA

Computer OSComputer C++

Page 22: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Lossless join decomposition• Consider the example EMP

• Suppose we decompose the EMP table into Emp_projects and Emp_dependents.

• Emp_projects Emp_dependents

ename Pname DnameSmith X JohnSmith Y AnnaSmith X Annasmith Y John

ename PnameSmith XSmith Y

ename DnameSmith JohnSmith Anna

Page 23: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms

• Lossless join decomposition• Suppose we again join these tables we can

see that we get the original EMP table. • So this decomposition of EMP table in to

Emp_projects and Emp_dependents is a lossless join decomposition because nothing is lost after a decomposition.

Page 24: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms

• Lossless join decomposition

• Suppose we decompose the supply table in to two that is R1 and R2.

• We get

Page 25: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Lossless join decomposition• If we again join these two tables R1 and R2 we will get

• the join of these tables will not give our original table supply. So this is a lossy join

• decomposition because after decomposing the Supply table we have lost some values.

Page 26: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Join dependencies and fifth normal form• In some cases there may be no lossless join

decomposition of a table R into 2 tables but there may be a lossless join decomposition into more than 2 tables.

• For example in the supply table

Page 27: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Join dependencies and fifth normal form

• If we decompose the supply table in to 3

Page 28: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms

• Join dependencies and fifth normal form• Here we can see that if we again join these

tables R1, R2, R3 we will get the original table. • We can see that by joining just R1 and R2 will

not get the supply table. • But by joining all these 3 tables we will get the

supply table

Page 29: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Join dependencies and fifth normal form• So we are moving to another type of dependency called Join dependency.• If a join dependency is present in a table we perform decomposition to fifth

normal form(5NF)• Here for the supply table the join dependency is specified by JD (R1, R2, R3)• This is because by joining R1 and R2 and R3 tables we will get the original

table ‘Supply’.• JD (R1, R2, R3) can also be written as• JD( (sname, partname), (sname, projname), (partname,projname) )• We can see that JD( R1, R2) is not valid for the supply table because on

joining R1 and R2 we will not get the Supply table.• Trivial join dependency• For a table R, a join dependency specified as JD(R1, R2, R3…) is trivial,

if any of these Ri’ s is the table R.

Page 30: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Fifth normal form• It is also called project join normal form.• A relation schema is in fifth normal form (5NF) ,

if for every nontrivial join dependency JD( R1, R2, R3…), every Ri is a superkey of R.

• example

Page 31: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Fifth normal form• The key of this table is (sname, partname,

projname) • We have seen that it has a join dependency• JD { (sname,partname),(sname,projname),

(partname,projname) }• Here the projections are (sname,partname),

(sname,projname) and (partname,projname).

Page 32: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Fifth normal form• We can say that this table supply is not in 5NF

because of this join dependency • Each of these projections do not form a

superkey of supply. • Superkey of supply is (sname,projname,partname).

• (sname,partname) is not a superkey. • (sname,projname) is not a super key.• (partname,projname) is not a superkey.

Page 33: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Fifth normal form• So we have to normalise this table supply in to

tables that satisfy 5NF. • We are decomposing the table supply by

considering the JD. • Take each of the projections in the JD and form

tables as

Page 34: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Fifth normal form

Page 35: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Fifth normal form• See that each of these R1, R2, R3 are in fifth

normal form because there are no non trivial join dependencies in each of these tables.

• • A join dependency is very difficult to detect in

practice. • So it is not normally applied in a database.

Page 36: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms• Fifth normal form

Page 37: Introduction to Schema Refinement. Normal Forms Types of Normal Form First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form(3NF) Boyce-Codd

Normal Forms