Normalization: Fundamentals of Database Systems

Upload: ashlee-bell

Post on 01-Jan-2016


Normalization

Fundamentals of Database Systems

Lilac Safadi Normalization


Database Design

Steps in building a database for an application:

1. Real-world domain

2. Conceptual model

3. DBMS data model

4. Create schema (DDL)

5. Modify data (DML)

How to produce a good relation schema?

1. Start with a set of relations

2. Define the functional dependencies for each relation to specify the PK

3. Transform the relations to normal form

Data Redundancy

STAFFBRANCH

StaffNo  FName  LName  position    Salary  BrnNo  City      Address
SL21     John   White  Manager     30000   B005   London    22 Deer Rd
SG37     Ann    Beech  Assistant   12000   B003   Glasgow   163 Main St
SG14     David  Ford   Supervisor  18000   B003   Glasgow   163 Main St
SA9      Mary   Howe   Assistant   9000    B007   Aberdeen  16 Argyll St
SG5      Susan  Brand  Manager     24000   B003   Glasgow   163 Main St
SL41     Julie  Lee    Assistant   9000    B005   London    22 Deer Rd

Relations that have redundant data may have update anomalies (insert, modify, delete). Here the branch details (B003, Glasgow, 163 Main St) are repeated for every member of staff at that branch.

STAFF

StaffNo  FName  LName  position    Salary  BrnNo
SL21     John   White  Manager     30000   B005
SG37     Ann    Beech  Assistant   12000   B003
SG14     David  Ford   Supervisor  18000   B003
SA9      Mary   Howe   Assistant   9000    B007
SG5      Susan  Brand  Manager     24000   B003
SL41     Julie  Lee    Assistant   9000    B005

BRANCH

BrnNo  City      Address
B005   London    22 Deer Rd
B003   Glasgow   163 Main St
B007   Aberdeen  16 Argyll St

Relation Decomposition

The normalization process involves decomposing a relation into smaller relations.

The decomposition is required to be reversible.

Decomposing along functional dependencies guarantees that the decomposition is reversible.

During normalization, two important properties are associated with decomposition:

Lossless-join

Dependency preservation

STAFF

StaffNo  FName  LName  position    Salary  City
SL21     John   White  Manager     30000   London
SG37     Ann    Beech  Assistant   12000   Glasgow
SG14     David  Ford   Supervisor  18000   Glasgow
SA9      Mary   Howe   Assistant   9000    London
SG5      Susan  Brand  Manager     24000   Glasgow
SL41     Julie  Lee    Assistant   9000    London

BRANCH

BrnNo  City     Address
B005   London   22 Deer Rd
B003   Glasgow  163 Main St
B007   London   16 Argyll St

Data Redundancy

STAFFBRANCH (STAFF joined with BRANCH on City)

StaffNo  FName  LName  position    Salary  BrnNo  City     Address
SL21     John   White  Manager     30000   B005   London   22 Deer Rd
SL21     John   White  Manager     30000   B007   London   16 Argyll St
SG37     Ann    Beech  Assistant   12000   B003   Glasgow  163 Main St
SG14     David  Ford   Supervisor  18000   B003   Glasgow  163 Main St
SA9      Mary   Howe   Assistant   9000    B005   London   22 Deer Rd
SA9      Mary   Howe   Assistant   9000    B007   London   16 Argyll St
SG5      Susan  Brand  Manager     24000   B003   Glasgow  163 Main St
SL41     Julie  Lee    Assistant   9000    B005   London   22 Deer Rd
SL41     Julie  Lee    Assistant   9000    B007   London   16 Argyll St

Because City does not identify a single branch, every London member of staff is paired with both London branches, so the join contains spurious tuples.
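The effect of joining STAFF and BRANCH on City can be reproduced with a small sketch. The `natural_join` helper and the dictionary layout are illustrative assumptions, not part of the slides; only StaffNo, BrnNo and City are kept to keep the example short.

```python
def natural_join(r, s):
    """Natural join of two lists of dicts on their shared attribute names."""
    out = []
    for a in r:
        for b in s:
            common = a.keys() & b.keys()
            if all(a[k] == b[k] for k in common):
                out.append({**a, **b})
    return out

# STAFF projected on (StaffNo, City), as in the decomposition that keeps City.
staff = [
    {"StaffNo": "SL21", "City": "London"},
    {"StaffNo": "SG37", "City": "Glasgow"},
    {"StaffNo": "SG14", "City": "Glasgow"},
    {"StaffNo": "SA9", "City": "London"},
    {"StaffNo": "SG5", "City": "Glasgow"},
    {"StaffNo": "SL41", "City": "London"},
]
# BRANCH with two branches in London.
branch = [
    {"BrnNo": "B005", "City": "London"},
    {"BrnNo": "B003", "City": "Glasgow"},
    {"BrnNo": "B007", "City": "London"},
]

joined = natural_join(staff, branch)
print(len(staff), len(joined))  # 6 9
```

Six staff tuples become nine after the join: each of the three London staff is matched with both London branches, so three of the result tuples are spurious.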

Functional Dependencies

A functional dependency describes the relationship between attributes in a relation.

If A and B are attributes of relation R, B is functionally dependent on A, denoted by A → B, if each value of A is associated with exactly one value of B. A value of B may be associated with several values of A.

A → B: A is the determinant and B is the dependent (B is functionally dependent on A)

• A functional dependency is identified between attributes in a relation and must hold at all times, not only for the current values (all-time functional dependency)

Functional Dependencies

A → B: whenever two tuples t and u agree on all attributes of A, they must also agree on attribute B.

Diagram: two tuples t and u over the columns A and B; if t and u agree on A, then they must agree on B.

Functional Dependencies

Example

StaffNo → position: position is functionally dependent on StaffNo (SL21 → Manager). This is a 1:1 or M:1 relationship between attributes in a relation.

position → StaffNo does NOT hold: StaffNo is NOT functionally dependent on position (Manager is associated with both SL21 and SG5). This is a 1:M relationship between attributes in a relation.
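The definition can be tested against sample data. The following is a minimal sketch, assuming rows are held as Python dictionaries; the function name `fd_holds` is hypothetical. A sample can only refute a dependency, since a real functional dependency must hold for all time.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in the given rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores the first dependent value seen for this determinant
        if seen.setdefault(key, val) != val:
            return False  # same determinant value, different dependent value
    return True

# StaffNo and position values taken from the STAFFBRANCH example.
staff = [
    {"StaffNo": "SL21", "position": "Manager"},
    {"StaffNo": "SG37", "position": "Assistant"},
    {"StaffNo": "SG14", "position": "Supervisor"},
    {"StaffNo": "SA9", "position": "Assistant"},
    {"StaffNo": "SG5", "position": "Manager"},
    {"StaffNo": "SL41", "position": "Assistant"},
]

print(fd_holds(staff, ["StaffNo"], ["position"]))  # True
print(fd_holds(staff, ["position"], ["StaffNo"]))  # False
```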

Trivial Functional Dependencies

A → B is trivial if B ⊆ A

StaffNo, SName → SName

StaffNo, SName → StaffNo

We are not interested in trivial functional dependencies because they provide no genuine integrity constraints on the values held by these attributes.

StaffBranch Example

Functional dependencies on the StaffBranch relation:

StaffNo → FName, LName, position, salary, BranchNo, Address, city

BranchNo → Address, city

Address, city → BranchNo

BranchNo, position → salary

Address, city, position → salary

Determinants:

StaffNo, BranchNo, (Address, city), (BranchNo, position), and (Address, city, position)

Identifying the PK

The purpose of functional dependencies is to specify the set of integrity constraints that must hold on a relation.

The determinant attribute(s) are a candidate key (CK) of the relation if:

• There is a 1:1 relationship between determinant and dependent

• No subset of the determinant attribute(s) is itself a determinant (nontrivial). If (A, B) → C, then NOT A → B, and NOT B → A

• All attributes that are not part of the CK are functionally dependent on the key: CK → all attributes of R

• The dependency holds for all time

The PK is the candidate key with the minimal set of functional dependencies.
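As an illustration, candidate keys can be searched for by brute force: a set of attributes is a candidate key if it determines every attribute of R and no proper subset does. The sketch below uses the StaffBranch dependencies; the helper names `closure` and `candidate_keys` are assumptions, and the exhaustive search is only practical for small relations.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes determined by `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys

attributes = {"StaffNo", "FName", "LName", "position", "salary",
              "BranchNo", "Address", "city"}
fds = [
    ({"StaffNo"}, attributes - {"StaffNo"}),
    ({"BranchNo"}, {"Address", "city"}),
    ({"Address", "city"}, {"BranchNo"}),
    ({"BranchNo", "position"}, {"salary"}),
    ({"Address", "city", "position"}, {"salary"}),
]

print(candidate_keys(attributes, fds))  # [{'StaffNo'}]
```

StaffNo never appears on a right-hand side, so every key must contain it, and StaffNo alone already determines all attributes.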

Closure

Closure X+: the set of functional dependencies that are implied by (can be inferred from) a given set of functional dependencies X.

Diagram: two tuples t and u over the columns A, B and C; if t and u agree on A, then they must agree on B, so surely they will also agree on C.

X: A → B, B → C

X+ includes: A → C

Closure Example

S: BranchNo → (Address, city)

S+ includes: BranchNo → Address, BranchNo → city (implied by S)

Inference Rules for Functional Dependencies

Armstrong's axioms (inference rules): a set of inference rules that specifies how new functional dependencies can be inferred from given ones.

Inference rules:

Reflexivity: If B ⊆ A, then A → B

Augmentation: If A → B, then A,C → B,C

Transitivity: If A → B and B → C, then A → C

Self-Determination: A → A

Decomposition: If A → B,C, then A → B and A → C

Union: If A → B and A → C, then A → B,C

Composition: If A → B and C → D, then A,C → B,D
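In practice the full closure of a set of FDs is rarely enumerated. A common shortcut, swapped in here for illustration, is the attribute closure: the set of attributes determined by a given set of attributes. A dependency A → B is implied by the given FDs exactly when B is contained in the attribute closure of A. The sketch below assumes FDs are stored as pairs of Python sets; the name `closure` is hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)  # reflexivity: attrs determine themselves
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # transitivity: if lhs is already determined, so is rhs
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

fds = [
    ({"StaffNo"}, {"BranchNo"}),
    ({"BranchNo"}, {"Address", "city"}),
]

# StaffNo -> Address is implied by the two FDs above.
print("Address" in closure({"StaffNo"}, fds))  # True
# BranchNo does not determine StaffNo.
print("StaffNo" in closure({"BranchNo"}, fds))  # False
```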

Minimal Sets of Functional Dependencies

• The complete set of functional dependencies for a relation can be very large

• We need to reduce the set to a manageable size: a small set from which all the other FDs can be regenerated by applying the inference rules repeatedly until they stop producing new FDs

Assume S1 and S2 are sets of dependencies.

If every FD in S1 can be inferred from S2 (S1 ⊆ S2+), then S2 is a cover for S1, or S1 is covered by S2.

If S2 is a cover for S1 and S1 is a cover for S2, then S1 is equivalent to S2.

Minimal Sets of Functional Dependencies

A set of functional dependencies X is minimal if it satisfies the following:

• Every dependency in X has a single attribute on its right-hand side

• We cannot replace any dependency A → B in X with C → B, where C ⊂ A, and still have a set of dependencies equivalent to X

• We cannot remove any dependency from X and still have a set of dependencies that is equivalent to X

Minimal Sets of Functional Dependencies

1. For each X → {A1, A2, ..., An}, create X → A1, X → A2, ..., X → An

2. If A, B → C is equivalent to B → C, then replace A, B → C with B → C

3. If X − {A → B} is equivalent to X, then remove A → B
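The three steps above can be sketched as follows. This is one possible implementation under stated assumptions, not a definitive one: FDs are pairs of sets, equivalence is tested with an attribute-closure helper, and the names `closure` and `minimal_cover` are hypothetical. A set of FDs can have more than one minimal cover; the result depends on the order in which attributes and FDs are examined.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def minimal_cover(fds):
    # Step 1: a single attribute on every right-hand side.
    g = [(frozenset(l), frozenset([a])) for l, r in fds for a in sorted(r)]
    # Step 2: drop left-hand-side attributes that are not needed.
    reduced = []
    for lhs, rhs in g:
        lhs = set(lhs)
        for a in sorted(lhs):
            smaller = lhs - {a}
            if smaller and rhs <= closure(smaller, g):
                lhs = smaller
        reduced.append((frozenset(lhs), rhs))
    # Step 3: drop dependencies that are implied by the remaining ones.
    result = list(dict.fromkeys(reduced))  # remove duplicates, keep order
    for fd in list(result):
        rest = [f for f in result if f != fd]
        if fd[1] <= closure(fd[0], rest):
            result = rest
    return result

fds = [({"A"}, {"B", "C"}), ({"B"}, {"C"}), ({"A", "B"}, {"C"})]
for lhs, rhs in minimal_cover(fds):
    print(sorted(lhs), "->", sorted(rhs))
# ['A'] -> ['B']
# ['B'] -> ['C']
```

Here A, B → C is first reduced to B → C (step 2), and A → C is then removed because it follows from A → B and B → C (step 3).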

The Purpose of Normalization

Normalization is a bottom-up approach to database design that begins by examining the relationships between attributes. It is performed as a series of tests on a relation to determine whether it satisfies or violates the requirements of a given normal form.

Purpose:

Guarantees no redundancy due to FDs

Guarantees no update anomalies

Normal Forms:

First Normal Form (1NF)

Second Normal Form (2NF)

Third Normal Form (3NF)

Boyce-Codd Normal Form (BCNF)

Fourth Normal Form (4NF)

Fifth Normal Form (5NF)

The Process of Normalization

Normalization is a technique for analyzing relations based on their CKs and FDs.

1NF → 2NF → 3NF → BCNF → 4NF → 5NF

A higher normal form is stronger in format and less vulnerable to update anomalies.

First Normal Form (1NF)

Unnormalized form (UNF): A relation that contains one or more repeating groups

First normal form (1NF): A relation in which the intersection of each row and column contains one and only one value

CLIENT_PROPERTY (unnormalized relation)

ClientNo  Name           PropertyNo
CR76      John Key       PG4, PG16
CR56      Aline Stewart  PG4, PG36, PG16

UNF → 1NF: Approach 1

Expand the key so that there is a separate tuple in the original relation for each value of the repeated attribute(s). The primary key becomes the combination of the original primary key and the repeated attribute.

CLIENT_PROPERTY (1NF relation)

ClientNo  Name           PropertyNo
CR76      John Key       PG4
CR76      John Key       PG16
CR56      Aline Stewart  PG4
CR56      Aline Stewart  PG36
CR56      Aline Stewart  PG16

Disadvantage: introduces redundancy in the relation

UNF → 1NF: Approach 2

If the maximum number of values is known for the attribute, replace the repeated attribute (PropertyNo) with a number of atomic attributes (PropertyNo1, PropertyNo2, PropertyNo3).

CLIENT_PROPERTY (1NF relation)

ClientNo  Name           PropertyNo1  PropertyNo2  PropertyNo3
CR76      John Key       PG4          PG16         NULL
CR56      Aline Stewart  PG4          PG36         PG16

Disadvantage: introduces NULL values in the relation

UNF → 1NF: Approach 3

Remove the attribute that violates 1NF and place it in a separate relation along with the primary key.

CLIENT (1NF relation)

ClientNo  Name
CR76      John Key
CR56      Aline Stewart

PROPERTY (1NF relation)

ClientNo  PropertyNo
CR76      PG4
CR76      PG16
CR56      PG4
CR56      PG36
CR56      PG16
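Approaches 1 and 3 are easy to express in code. The sketch below is illustrative only; it assumes the unnormalized relation is held as a list of dictionaries whose PropertyNo entry is a list (the repeating group).

```python
# Unnormalized CLIENT_PROPERTY: PropertyNo is a repeating group.
unf = [
    {"ClientNo": "CR76", "Name": "John Key", "PropertyNo": ["PG4", "PG16"]},
    {"ClientNo": "CR56", "Name": "Aline Stewart",
     "PropertyNo": ["PG4", "PG36", "PG16"]},
]

# Approach 1: one tuple per repeated value; key becomes (ClientNo, PropertyNo).
flat = [
    {"ClientNo": r["ClientNo"], "Name": r["Name"], "PropertyNo": p}
    for r in unf
    for p in r["PropertyNo"]
]

# Approach 3: move the repeating attribute to its own relation with the key.
client = [{"ClientNo": r["ClientNo"], "Name": r["Name"]} for r in unf]
prop = [
    {"ClientNo": r["ClientNo"], "PropertyNo": p}
    for r in unf
    for p in r["PropertyNo"]
]

print(len(flat), len(client), len(prop))  # 5 2 5
```

Approach 1 repeats the client name in every tuple (5 copies for 2 clients), while Approach 3 stores each name once.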

Full Functional Dependency

If A and B are attributes of a relation:

B is fully functionally dependent on A if B is functionally dependent on A, but not on any proper subset of A.

B is partially functionally dependent on A if some attributes can be removed from A and the dependency still holds.

StaffNo, SName → BranchNo: partial dependency

ClientNo, PropertyNo → RentDate: full dependency

Second Normal Form (2NF)

Second normal form (2NF): A 1NF relation in which every non-prime attribute is fully functionally dependent on the PK (considering nontrivial FDs only).

Applies to relations with composite primary keys, where partial dependencies can occur.

1NF relation:

CLIENT_RENTAL(ClientNo, PropertyNo, cName, pAddress, RentStart, RentFinish, Rent, OwnerNo, OName)

1NF → 2NF

1. Start with a 1NF relation

2. Find the FDs of the relation

3. Test the FDs whose determinant attributes are part of the PK

1NF → 2NF

CLIENT_RENTAL(ClientNo, PropertyNo, cName, pAddress, RentStart, RentFinish, Rent, OwnerNo, OName)

PK: (ClientNo, PropertyNo)

ClientNo, PropertyNo → RentStart, RentFinish: full dependency

ClientNo → cName: partial dependency

PropertyNo → pAddress, Rent, OwnerNo, OName: partial dependency

ClientNo, RentStart → PropertyNo, pAddress, RentFinish, Rent, OwnerNo, OName

PropertyNo, RentStart → ClientNo, cName, RentFinish

1NF → 2NF

4. Remove partial dependencies by placing the functionally dependent attributes in a new relation along with a copy of their determinants

CLIENT(ClientNo, cName): 2NF relation

RENTAL(ClientNo, PropertyNo, RentStart, RentFinish): 2NF relation

PROPERTY_OWNER(PropertyNo, pAddress, Rent, OwnerNo, OName): 2NF relation
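The test for partial dependencies can be sketched with an attribute-closure helper: for each proper subset of the PK, list the non-key attributes it determines on its own. The function names and the set-based representation are assumptions made for this example.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def partial_dependencies(pk, fds, attributes):
    """Map each proper subset of pk to the non-key attributes it determines."""
    found = {}
    for size in range(1, len(pk)):
        for subset in combinations(sorted(pk), size):
            dependents = (closure(subset, fds) & attributes) - set(pk)
            if dependents:
                found[subset] = dependents
    return found

attributes = {"ClientNo", "PropertyNo", "cName", "pAddress", "RentStart",
              "RentFinish", "Rent", "OwnerNo", "OName"}
pk = {"ClientNo", "PropertyNo"}
fds = [
    ({"ClientNo", "PropertyNo"}, {"RentStart", "RentFinish"}),
    ({"ClientNo"}, {"cName"}),
    ({"PropertyNo"}, {"pAddress", "Rent", "OwnerNo", "OName"}),
]

for subset, deps in partial_dependencies(pk, fds, attributes).items():
    print(subset, "->", sorted(deps))
# ('ClientNo',) -> ['cName']
# ('PropertyNo',) -> ['OName', 'OwnerNo', 'Rent', 'pAddress']
```

Each entry corresponds to one new relation in the decomposition: CLIENT for ClientNo and PROPERTY_OWNER for PropertyNo; the remaining attributes stay with the full key in RENTAL.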

Transitive Dependency

A, B, C are attributes of a relation such that, if A → B and B → C, then C is transitively dependent on A via B, provided A is NOT functionally dependent on B or C (nontrivial FD).

Example

StaffNo → BranchNo and BranchNo → Address, so StaffNo → Address (Address is transitively dependent on StaffNo via BranchNo)

Third Normal Form (3NF)

Third normal form (3NF): A 2NF relation in which NO non-prime attribute is transitively dependent on the PK

CLIENT(ClientNo, cName): 3NF relation

RENTAL(ClientNo, PropertyNo, RentStart, RentFinish): 3NF relation

PROPERTY_OWNER(PropertyNo, pAddress, Rent, OwnerNo, OName): 2NF relation

2NF → 3NF

1. Identify the PK in the 2NF relation

2. Identify the FDs in this relation

3. If transitive dependencies exist, place the transitively dependent attributes in a new relation along with a copy of their determinants

OWNER(OwnerNo, OName): 3NF relation

PROPERTY_FOR_RENT(PropertyNo, pAddress, rent, OwnerNo): 3NF relation
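Step 3 can be sketched for PROPERTY_OWNER as follows. The check looks for an FD whose determinant is reachable from the PK, is not itself a key, and determines a non-key attribute. The function name `transitive_dependencies` and the representation of FDs as set pairs are assumptions for illustration.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def transitive_dependencies(pk, fds):
    """FDs X -> Y where PK -> X, X is not a key, and Y holds non-key attributes."""
    pk = set(pk)
    key_closure = closure(pk, fds)
    found = []
    for lhs, rhs in fds:
        determined_by_key = lhs <= key_closure
        lhs_is_key = pk <= closure(lhs, fds)
        non_key_rhs = rhs - pk - lhs
        if determined_by_key and not lhs_is_key and not lhs <= pk and non_key_rhs:
            found.append((lhs, non_key_rhs))
    return found

pk = {"PropertyNo"}
fds = [
    ({"PropertyNo"}, {"pAddress", "Rent", "OwnerNo", "OName"}),
    ({"OwnerNo"}, {"OName"}),
]

print(transitive_dependencies(pk, fds))  # [({'OwnerNo'}, {'OName'})]
```

OName depends on PropertyNo only via OwnerNo, which is why OwnerNo and OName are moved into the OWNER relation.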

Review of Decompositions

1NF: CLIENT_RENTAL

2NF: CLIENT, RENTAL, PROPERTY_OWNER (decomposed from CLIENT_RENTAL)

3NF: CLIENT, RENTAL, OWNER, PROPERTY_FOR_RENT (OWNER and PROPERTY_FOR_RENT decomposed from PROPERTY_OWNER)

General Definition of 2NF & 3NF

Second normal form (2NF): A 1NF relation in which every non-primary-key attribute is fully functionally dependent on any CK

Third normal form (3NF): A 2NF relation in which NO non-primary-key attribute in a nontrivial FD is transitively dependent on any CK

Boyce-Codd Normal Form (BCNF)

Boyce-Codd normal form (BCNF): A 3NF relation in which every determinant in a nontrivial FD is a CK

Difference between 3NF and BCNF, for a nontrivial FD A → B:

• 3NF allows A NOT to be a CK, provided B is part of a CK

• BCNF insists that A is a CK

The potential to violate BCNF may occur in a relation that:

• Contains two (or more) composite CKs

• Has CKs that overlap (at least one attribute in common)

Boyce-Codd Normal Form (BCNF)

Diagram: a relation with attributes A, B, C, D that is in 3NF but not in BCNF

3NF → BCNF

CLIENT_INTERVIEW(ClientNo, Int_Date, Int_Time, StaffNo, RoomNo)

ClientNo, Int_Date → Int_Time, StaffNo, RoomNo

StaffNo, Int_Date, Int_Time → ClientNo

RoomNo, Int_Date, Int_Time → StaffNo, ClientNo

StaffNo, Int_Date → RoomNo

1. Examine the FDs of the relation

2. If a determinant is NOT a CK, decompose the relation into 2 relations

3NF → BCNF

3. Remove non-CK dependencies by placing the functionally dependent attributes in a new relation

INTERVIEW(ClientNo, Int_Date, Int_Time, StaffNo): BCNF relation

STAFF_ROOM(StaffNo, Int_Date, RoomNo): BCNF relation
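The BCNF test can be sketched with an attribute-closure helper: a nontrivial FD violates BCNF when the closure of its determinant is not the whole relation, i.e. the determinant is not a superkey. The function names below are assumptions; the FDs are those listed for CLIENT_INTERVIEW.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Nontrivial FDs whose determinant does not determine every attribute."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != attributes
    ]

attributes = {"ClientNo", "Int_Date", "Int_Time", "StaffNo", "RoomNo"}
fds = [
    ({"ClientNo", "Int_Date"}, {"Int_Time", "StaffNo", "RoomNo"}),
    ({"StaffNo", "Int_Date", "Int_Time"}, {"ClientNo"}),
    ({"RoomNo", "Int_Date", "Int_Time"}, {"StaffNo", "ClientNo"}),
    ({"StaffNo", "Int_Date"}, {"RoomNo"}),
]

for lhs, rhs in bcnf_violations(attributes, fds):
    print(sorted(lhs), "->", sorted(rhs))
# ['Int_Date', 'StaffNo'] -> ['RoomNo']
```

Only StaffNo, Int_Date → RoomNo is reported, which is the dependency that is moved into its own relation.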

Review Example

STAFF_PROPERTY_INSPECTION (unnormalized relation)

Pno   pAddress              iDate      iTime  comments          StaffNo  sName  CarReg
PG4   Lawrence St, Glasgow  18-Oct-00  10:00  Replace crockery  SG37     Ann    M23JGR
                            22-Apr-01  09:00  Good order        SG14     David  M53HDR
                            1-Oct-01   12:00  Damp rot          SG14     David  N72HFR
PG16  5 Novar Dr., Glasgow  22-Apr-01  13:00  Replace carpet    SG14     David  M53HDR
                            24-Oct-01  14:00  Good condition    SG37     Ann    N72HFR

UNF → 1NF

STAFF_PROPERTY_INSPECTION (1NF)

Pno   pAddress              iDate      iTime  comments          StaffNo  sName  CarReg
PG4   Lawrence St, Glasgow  18-Oct-00  10:00  Replace crockery  SG37     Ann    M23JGR
PG4   Lawrence St, Glasgow  22-Apr-01  09:00  Good order        SG14     David  M53HDR
PG4   Lawrence St, Glasgow  1-Oct-01   12:00  Damp rot          SG14     David  N72HFR
PG16  5 Novar Dr., Glasgow  22-Apr-01  13:00  Replace carpet    SG14     David  M53HDR
PG16  5 Novar Dr., Glasgow  24-Oct-01  14:00  Good condition    SG37     Ann    N72HFR

1NF → 2NF

STAFF_PROPERTY_INSPECTION(Pno, iDate, iTime, pAddress, comments, StaffNo, sName, CarReg)

Pno, iDate → iTime, comments, StaffNo, sName, CarReg

Pno → pAddress: partial dependency

StaffNo → sName

iDate, StaffNo → CarReg

iDate, iTime, CarReg → Pno, pAddress, comments, StaffNo, sName

iDate, iTime, StaffNo → Pno, pAddress, comments

1NF → 2NF

PROPERTY_INSPECTION(Pno, iDate, iTime, comments, StaffNo, sName, CarReg): 2NF

Pno, iDate → iTime, comments, StaffNo, sName, CarReg

StaffNo → sName: transitive dependency

iDate, StaffNo → CarReg

iDate, iTime, CarReg → Pno, comments, StaffNo, sName

iDate, iTime, StaffNo → Pno, comments

PROPERTY(Pno, pAddress): 2NF

2NF → 3NF

PROPERTY_INSPECTION(Pno, iDate, iTime, comments, StaffNo, CarReg): 3NF

STAFF(StaffNo, sName): 3NF

Resulting 3NF relations:

PROPERTY(Pno, pAddress)

STAFF(StaffNo, sName)

PROPERTY_INSPECT(Pno, iDate, iTime, comments, StaffNo, CarReg)

3NF → BCNF

PROPERTY_INSPECTION(Pno, iDate, iTime, comments, StaffNo, CarReg): 3NF

Pno, iDate → iTime, comments, StaffNo, CarReg

StaffNo, iDate → CarReg

CarReg, iDate, iTime → Pno, comments, StaffNo

StaffNo, iDate, iTime → Pno, comments

The determinant (StaffNo, iDate) is not a CK, so the relation is decomposed into:

STAFF_CAR(StaffNo, iDate, CarReg)

PROPERTY_INSPECT(Pno, iDate, iTime, comments, StaffNo)

Multi-Valued Dependency (MVD)

Represents a dependency between attributes A, B, C in a relation, such that for each value of A there is a set of values for B and a set of values for C. However, the sets of values for B and C are independent of each other.

Denoted by: A →→ B, A →→ C

Example

BranchNo →→ SName, BranchNo →→ OName

BRANCH_STAFF_OWNER

BranchNo  SName  OName
B003      Ann    Carol
B003      David  Carol
B003      Ann    Tina
B003      David  Tina

Trivial MVD

A →→ B is a trivial MVD if:

B ⊆ A

OR

A ∪ B = R

Fourth Normal Form (4NF)

Fourth normal form (4NF): A BCNF relation with NO nontrivial MVD

BRANCH_STAFF_OWNER (BCNF relation)

BranchNo  SName  OName
B003      Ann    Carol
B003      David  Carol
B003      Ann    Tina
B003      David  Tina

BCNF → 4NF

1. Start with a BCNF relation

2. Examine the MVDs of the relation

3. If a nontrivial MVD exists, remove the MVD by placing the attributes in a new relation along with a copy of their determinant

BRANCH_STAFF (4NF)

BranchNo  SName
B003      Ann
B003      David

BRANCH_OWNER (4NF)

BranchNo  OName
B003      Carol
B003      Tina
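Whether an MVD holds in a given set of rows can be checked by projecting and rejoining: A →→ B holds when the relation equals the join of its projection on (A, B) with its projection on (A, remaining attributes). The sketch below is illustrative; `project` and `mvd_holds` are hypothetical helper names and rows are assumed to be dictionaries.

```python
def project(rows, attrs):
    """Projection with duplicate elimination; each tuple keeps attribute names."""
    return {tuple((a, r[a]) for a in attrs) for r in rows}

def mvd_holds(rows, lhs, rhs):
    """Check lhs ->> rhs by comparing rows with the join of two projections."""
    attrs = list(rows[0].keys())
    rest = [a for a in attrs if a not in lhs and a not in rhs]
    left = project(rows, lhs + rhs)
    right = project(rows, lhs + rest)
    joined = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            if all(ld[a] == rd[a] for a in lhs):
                merged = {**ld, **rd}
                joined.add(tuple((a, merged[a]) for a in attrs))
    return joined == project(rows, attrs)

branch_staff_owner = [
    {"BranchNo": "B003", "SName": "Ann", "OName": "Carol"},
    {"BranchNo": "B003", "SName": "David", "OName": "Carol"},
    {"BranchNo": "B003", "SName": "Ann", "OName": "Tina"},
    {"BranchNo": "B003", "SName": "David", "OName": "Tina"},
]

print(mvd_holds(branch_staff_owner, ["BranchNo"], ["SName"]))       # True
# Dropping one combination breaks the independence of SName and OName.
print(mvd_holds(branch_staff_owner[:3], ["BranchNo"], ["SName"]))   # False
```

The same projections are exactly the BRANCH_STAFF and BRANCH_OWNER relations of the 4NF decomposition, so a True result also shows that the decomposition is lossless for this data.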

Lossless-Join Dependency

A property of decomposition which ensures that no spurious tuples are generated when relations are reunited through a natural join operation.

Objectives:

Preserve all the data in the original relation

Avoid creating additional spurious tuples

Join Dependency

A relation R with subsets of attributes A, B, ..., Z satisfies a join dependency if every legal value of R is equal to the join of its projections on A, B, ..., Z

Fifth Normal Form (5NF)

Fifth normal form (5NF): A relation that has no join dependency other than those implied by its candidate keys

PROPERTY_ITEM_SUPPLIER

PropertyNo  Description  SupplierNo
PG4         Bed          S1
PG4         Chair        S2
PG16        Bed          S2

4NF → 5NF

PROPERTY_ITEM

PropertyNo  Description
PG4         Bed
PG4         Chair
PG16        Bed

ITEM_SUPPLIER

Description  SupplierNo
Bed          S1
Chair        S2
Bed          S2

PROPERTY_SUPPLIER

PropertyNo  SupplierNo
PG4         S1
PG4         S2
PG16        S2

Original PROPERTY_ITEM_SUPPLIER

PropertyNo  Description  SupplierNo
PG4         Bed          S1
PG4         Bed          S2
PG4         Chair        S2
PG16        Bed          S2
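The join dependency in this example can be checked by projecting the relation onto the three attribute pairs and joining the projections back together. The sketch below is illustrative only: rows are plain tuples (PropertyNo, Description, SupplierNo) and the function name `satisfies_jd` is an assumption.

```python
def satisfies_jd(rows):
    """True if rows equal the join of their three binary projections."""
    rows = set(rows)
    property_item = {(p, d) for p, d, s in rows}
    item_supplier = {(d, s) for p, d, s in rows}
    property_supplier = {(p, s) for p, d, s in rows}
    rejoined = {
        (p, d, s)
        for p, d in property_item
        for d2, s in item_supplier
        if d == d2 and (p, s) in property_supplier
    }
    return rejoined == rows

# Four-tuple relation: equal to the join of its projections.
four_rows = [
    ("PG4", "Bed", "S1"),
    ("PG4", "Bed", "S2"),
    ("PG4", "Chair", "S2"),
    ("PG16", "Bed", "S2"),
]
# Three-tuple relation: the join adds the tuple (PG4, Bed, S2).
three_rows = [
    ("PG4", "Bed", "S1"),
    ("PG4", "Chair", "S2"),
    ("PG16", "Bed", "S2"),
]

print(satisfies_jd(four_rows))   # True
print(satisfies_jd(three_rows))  # False
```

Both relations have the same three projections, so joining them always yields the four-tuple relation; only that relation is consistent with the join dependency.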