BIS 360 – Lecture Eight Ch. 12: Database Design and Normalization

Upload: abigail-higgins

Post on 16-Jan-2016

Page 1: BIS 360 – Lecture Eight Ch. 12: Database Design and Normalization

BIS 360 – Lecture Eight

Ch. 12: Database Design and Normalization

Page 2

Objectives

Where are we?

ER Diagram Transformation

Why DB Normalization?

Database Anomalies

Normalization Theory

1NF - 3NF

Page 3

Where we are now

Project ID and Selection

Project Initiation & Planning

Analysis

Logical Design

Physical Design

Implementation

Maintenance

1. Database Design (Ch. 12)

2. Form and Report Design (Ch. 13)

3. Interface Design (Ch. 14)

Page 4

Why we design a database

[Diagram labels: E-R Model, DFD Model, Database, Applications, Information (Reports, Bills, Charts, ...)]

Page 5

DB Design – Where we start

Start from the Conceptual Data Model (ERD): examine the diagram for accuracy (entities, attributes, relationships, and cardinalities).

Transform this Conceptual Data Model into a Logical Data Model (i.e., a Relational Data Model): a set of flat tables (relations).

Page 6

ER Diagram Transformation

Step 1: For each regular entity type E, create a relation R that includes all the simple attributes of E. The unique identifier of E becomes the primary key of R.

Entity E: EMPLOYEE

Relation R: EMPLOYEE ( EmpID , EmpLName , EmpFName , Salary )

PK: EmpID
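As a minimal sketch of Step 1, the EMPLOYEE relation can be declared as a SQL table. The example uses Python's built-in sqlite3 module with an in-memory database; the column types and the sample rows are assumptions for illustration and do not come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Relation R for entity type EMPLOYEE; column types are assumed.
conn.execute("""
    CREATE TABLE employee (
        EmpID    INTEGER PRIMARY KEY,
        EmpLName TEXT,
        EmpFName TEXT,
        Salary   REAL
    )
""")
conn.execute("INSERT INTO employee VALUES (1, 'Doe', 'John', 50000)")

# The primary key rejects a second row with the same identifier.
try:
    conn.execute("INSERT INTO employee VALUES (1, 'Roe', 'Jane', 60000)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected)
```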

Page 7

ER Diagram Transformation

Step 2: For each weak entity type W with identifying entity type E, create a relation Rw with all the attributes of W, and create a relation Re for E as explained in Step 1. Then include the primary key of Re as a foreign key of Rw. The combination of the partial identifier of W and the unique identifier of E becomes the primary key of Rw.

Entity E: EMPLOYEE; weak entity W: DEPENDENT

Relation Re: EMPLOYEE ( EmpID , EmpLName , EmpFName , Salary )

PK: EmpID

Relation Rw: DEPENDENT ( DpdSSN , DpdLName , DpdFName , EmpID )

PK: EmpID + DpdSSN FK: EmpID
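The weak-entity rule in Step 2 might be sketched as follows, again with sqlite3. Column types and sample values are assumptions; the point is only that DEPENDENT carries EmpID both as a foreign key and as part of its composite primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE employee (
        EmpID    INTEGER PRIMARY KEY,
        EmpLName TEXT,
        EmpFName TEXT,
        Salary   REAL
    );
    -- Rw: the weak entity borrows EmpID from its identifying entity.
    CREATE TABLE dependent (
        DpdSSN   TEXT    NOT NULL,
        DpdLName TEXT,
        DpdFName TEXT,
        EmpID    INTEGER NOT NULL REFERENCES employee (EmpID),
        PRIMARY KEY (EmpID, DpdSSN)
    );
    INSERT INTO employee VALUES (1, 'Doe', 'John', 50000);
    INSERT INTO dependent VALUES ('111-11-1111', 'Doe', 'Ann', 1);
""")

def add_dependent(ssn, lname, fname, emp_id):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO dependent VALUES (?, ?, ?, ?)",
                     (ssn, lname, fname, emp_id))
        return True
    except sqlite3.IntegrityError:
        return False

orphan_accepted = add_dependent("222-22-2222", "Poe", "Sam", 99)    # no employee 99
duplicate_accepted = add_dependent("111-11-1111", "Doe", "Ann", 1)  # same (EmpID, DpdSSN)
print(orphan_accepted, duplicate_accepted)
```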

Page 8

ER Diagram Transformation

Step 3: When there is a 1:1 relationship between entity types S and T, create relations Rs and Rt as explained in Step 1. Include the primary key of Rs as a foreign key of Rt.

Entity S: EMPLOYEE; entity T: DEPARTMENT (an EMPLOYEE manages a DEPARTMENT; a DEPARTMENT is managed by an EMPLOYEE)

Relation Rs: EMPLOYEE ( EmpID , EmpLName , EmpFName , Salary )

PK: EmpID

Relation Rt: DEPARTMENT ( DeptID , DeptName , DeptPhone , EmpID )

PK: DeptID FK: EmpID
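One possible way to express the 1:1 "manages" relationship in SQL is shown below. The slide only asks for EmpID as a foreign key in DEPARTMENT; the UNIQUE constraint is an added assumption that stops one employee from managing two departments, and the column types and rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE employee (
        EmpID INTEGER PRIMARY KEY, EmpLName TEXT, EmpFName TEXT, Salary REAL
    );
    CREATE TABLE department (
        DeptID    INTEGER PRIMARY KEY,
        DeptName  TEXT,
        DeptPhone TEXT,
        -- UNIQUE is an assumption added here to keep the relationship 1:1.
        EmpID     INTEGER UNIQUE REFERENCES employee (EmpID)
    );
    INSERT INTO employee VALUES (1, 'Doe', 'John', 50000);
    INSERT INTO department VALUES (10, 'Marketing', '555-0110', 1);
""")

# Employee 1 already manages department 10, so a second assignment is refused.
try:
    conn.execute("INSERT INTO department VALUES (20, 'Finance', '555-0120', 1)")
    second_department_accepted = True
except sqlite3.IntegrityError:
    second_department_accepted = False

print(second_department_accepted)
```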

Page 9

ER Diagram Transformation

Step 4: When there is a 1:M relationship between entity types S and T, create relations Rs and Rt as explained in Step 1. If T is the entity type on the MANY side, include the primary key of Rs as a foreign key of Rt.

Entity S: DEPARTMENT; entity T: EMPLOYEE (an EMPLOYEE works for a DEPARTMENT; a DEPARTMENT hires EMPLOYEEs)

Relation Rs: DEPARTMENT ( DeptID , DeptName , DeptPhone )

PK: DeptID

Relation Rt: EMPLOYEE ( EmpID , EmpLName , EmpFName , Salary , DeptID )

PK: EmpID FK: DeptID
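A short sketch of Step 4: the foreign key sits on the MANY side (EMPLOYEE), so several employees can point at the same department. Types and sample rows are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE department (
        DeptID INTEGER PRIMARY KEY, DeptName TEXT, DeptPhone TEXT
    );
    CREATE TABLE employee (
        EmpID    INTEGER PRIMARY KEY,
        EmpLName TEXT,
        EmpFName TEXT,
        Salary   REAL,
        DeptID   INTEGER REFERENCES department (DeptID)  -- FK on the MANY side
    );
    INSERT INTO department VALUES (10, 'Marketing', '555-0110');
    INSERT INTO employee VALUES (1, 'Doe', 'John', 50000, 10);
    INSERT INTO employee VALUES (2, 'Roe', 'Jane', 60000, 10);
""")

# Count the employees attached to each department.
headcount = conn.execute("""
    SELECT d.DeptName, COUNT(e.EmpID)
    FROM department d
    LEFT JOIN employee e ON e.DeptID = d.DeptID
    GROUP BY d.DeptID
""").fetchall()
print(headcount)
```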

Page 10

ER Diagram Transformation

Step 5: When there is an M:N relationship P between entity types S and T and there is no property associated with this relationship P, create relations Rs and Rt as explained in Step 1. Also create a relation Rp to represent the relationship and include the primary keys of Rs and Rt as foreign keys of Rp. The combination of the primary keys of Rs and Rt becomes the primary key of Rp.

Entity S: WAREHOUSE; entity T: PRODUCT

Relation Rs: WAREHOUSE ( WhID , WhName )

PK: WhID

Relation Rt: PRODUCT ( ProdID , ProdName , ProdPrice )

PK: ProdID

Relation Rp: ( WhID , ProdID )

PK: WhID + ProdID FK: WhID, ProdID
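Step 5 could look like this in SQL. The slide does not name the relationship relation Rp, so the table name warehouse_product is hypothetical, as are the column types and rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE warehouse (WhID INTEGER PRIMARY KEY, WhName TEXT);
    CREATE TABLE product (ProdID INTEGER PRIMARY KEY, ProdName TEXT, ProdPrice REAL);
    -- Rp (hypothetical name): one row per warehouse/product pairing.
    CREATE TABLE warehouse_product (
        WhID   INTEGER NOT NULL REFERENCES warehouse (WhID),
        ProdID INTEGER NOT NULL REFERENCES product (ProdID),
        PRIMARY KEY (WhID, ProdID)
    );
    INSERT INTO warehouse VALUES (1, 'North'), (2, 'South');
    INSERT INTO product VALUES (100, 'Bolt', 0.25), (200, 'Nut', 0.10);
    INSERT INTO warehouse_product VALUES (1, 100), (1, 200), (2, 100);
""")

# Resolve the M:N relationship by joining through Rp.
stock = conn.execute("""
    SELECT w.WhName, p.ProdName
    FROM warehouse_product wp
    JOIN warehouse w ON w.WhID = wp.WhID
    JOIN product p ON p.ProdID = wp.ProdID
    ORDER BY w.WhName, p.ProdName
""").fetchall()
print(stock)
```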

Page 11

ER Diagram Transformation

Step 6: When there is an M:N relationship P between entity types S and T and there are some properties associated with this relationship P, create relations Rs, Rt, and Rp as explained in Step 5. All properties associated with the relationship P become the non-key attributes of Rp.

[ERD: STUDENT (SSN, Name, Phone, ZIP) Attends SCHOOL (SchID, Name, Type, ZIP); the Attends relationship has the properties Date and Degree]

STUDENT ( SSN , Name , Phone , Zip )

SCHOOL ( SchID , Name , Type , Zip )

STU_SCH ( SSN , SchID , Date , Degree )
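As a rough sketch of Step 6, the relationship properties Date and Degree become ordinary columns of STU_SCH. Column types and the sample student and school rows are assumptions; "Date" is quoted only to keep it clearly distinct from SQL date functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE student (SSN TEXT PRIMARY KEY, Name TEXT, Phone TEXT, Zip TEXT);
    CREATE TABLE school (SchID INTEGER PRIMARY KEY, Name TEXT, Type TEXT, Zip TEXT);
    CREATE TABLE stu_sch (
        SSN    TEXT    NOT NULL REFERENCES student (SSN),
        SchID  INTEGER NOT NULL REFERENCES school (SchID),
        "Date" TEXT,   -- property of the Attends relationship
        Degree TEXT,   -- property of the Attends relationship
        PRIMARY KEY (SSN, SchID)
    );
    INSERT INTO student VALUES ('111-11-1111', 'Ann', '555-0101', '49008');
    INSERT INTO school VALUES (1, 'State College', 'Public', '49008');
    INSERT INTO school VALUES (2, 'City University', 'Private', '60601');
    INSERT INTO stu_sch VALUES ('111-11-1111', 1, '2010-05-01', 'BBA');
    INSERT INTO stu_sch VALUES ('111-11-1111', 2, '2012-05-01', 'MBA');
""")

# List each degree together with the school where it was earned.
degrees = conn.execute("""
    SELECT st.Name, sc.Name, ss.Degree
    FROM stu_sch ss
    JOIN student st ON st.SSN = ss.SSN
    JOIN school sc ON sc.SchID = ss.SchID
    ORDER BY ss."Date"
""").fetchall()
print(degrees)
```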

Page 12

Complex ERD Transformation

Example of (M:N) Relationship

[ERD: CUSTOMER (CID, Name, Phone, ZIP, ...) and PRODUCT (ProdID, Desc., U_Price, Qty, ...) are linked by the M:N relationship Order, which has the properties Date and Units]

ORDER ( CID , ProdID , Date , Units )

The same customer may order the same product many times

Q: Are you happy with the above design?

Page 13

Complex ERD Transformation

Example of (M:N) Relationship

[ERD: CUSTOMER (CID, Name, Phone, ZIP, ...) and PRODUCT (ProdID, Desc., U_Price, Qty, ...) are linked by the M:N relationship Order, which has the properties Date and Units]

The same customer may order the same product many times

Create a unique primary key: OrderID

ORDER ( OrderID , CID , ProdID , Date , Units )
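The difference between the two ORDER designs can be illustrated with a small experiment. It assumes that the first design uses (CID, ProdID) as its primary key, which is what Step 6 would produce; the table names order_composite and order_surrogate, the helper place_order, and the sample orders are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- First design: assumed composite key (CID, ProdID).
    CREATE TABLE order_composite (
        CID INTEGER, ProdID INTEGER, "Date" TEXT, Units INTEGER,
        PRIMARY KEY (CID, ProdID)
    );
    -- Second design: a generated OrderID identifies each order.
    CREATE TABLE order_surrogate (
        OrderID INTEGER PRIMARY KEY,
        CID INTEGER, ProdID INTEGER, "Date" TEXT, Units INTEGER
    );
""")

def place_order(table, cid, prod_id, date, units):
    """Insert one order; return False if a key constraint rejects it."""
    sql = f'INSERT INTO {table} (CID, ProdID, "Date", Units) VALUES (?, ?, ?, ?)'
    try:
        conn.execute(sql, (cid, prod_id, date, units))
        return True
    except sqlite3.IntegrityError:
        return False

# Customer 1 orders product 7 twice, on two different dates.
composite_results = [
    place_order("order_composite", 1, 7, "2016-01-10", 2),
    place_order("order_composite", 1, 7, "2016-01-12", 5),
]
surrogate_results = [
    place_order("order_surrogate", 1, 7, "2016-01-10", 2),
    place_order("order_surrogate", 1, 7, "2016-01-12", 5),
]
print(composite_results, surrogate_results)
```

Under these assumptions the composite-key table refuses the repeat order, while the OrderID table stores both.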

Page 14

Why DB Normalization?

To avoid database processing errors (anomalies)

To verify the relations derived from the ER diagram: each derived relation should be in at least third normal form (3NF)

Page 15

Database Anomalies

Anomalies: data errors that occur during or after the processing of data

Three types of anomalies

• Insertion anomaly: difficulty in adding new data because of the poor design of a relation

• Deletion anomaly: unintentional loss of data when some other data are deleted

• Update anomaly: data become inconsistent after some data are updated

Page 16

Insertion Anomaly

EMPLOYEE-PROJECT

EmpID  Dept  Name  Project  Budget  Location
1      Mkt   John  A        100000  MI
1      Mkt   John  B        200000  IN
2      Eng   Jim   A        100000  MI
3      Acct  Joe   C        350000  IL

Insert new employees:

EmpID  Dept  Name  Project  Budget  Location
1      Mkt   John  A        100000  MI
1      Mkt   John  B        200000  IN
2      Eng   Jim   A        100000  MI
3      Acct  Joe   C        350000  IL
4      Fin   Jack  A        100000  MI
4      Fin   Jack  C        350000  IL
5      HR    June  null !!  null    null

Page 17

Insertion Anomaly

EMPLOYEE-PROJECT

EmpID    Dept  Name  Project  Budget  Location
1        Mkt   John  A        100000  MI
1        Mkt   John  B        200000  IN
2        Eng   Jim   A        100000  MI
3        Acct  Joe   C        350000  IL

Insert new projects:

EmpID    Dept  Name  Project  Budget  Location
1        Mkt   John  A        100000  MI
1        Mkt   John  B        200000  IN
2        Eng   Jim   A        100000  MI
3        Acct  Joe   C        350000  IL
1        Mkt   John  D        250000  MN
2        Eng   Jim   D        250000  MN
null !!  null  null  E        400000  WI
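The two insertion anomalies can be reproduced with a small sketch. It assumes that EMPLOYEE-PROJECT is keyed on (EmpID, Project), which is what the "null !!" markers in the tables suggest; the table name employee_project, the helper try_insert, and the column types are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee_project (
        EmpID    INTEGER NOT NULL,
        Dept     TEXT,
        Name     TEXT,
        Project  TEXT NOT NULL,
        Budget   INTEGER,
        Location TEXT,
        PRIMARY KEY (EmpID, Project)   -- assumed composite key
    )
""")

def try_insert(row):
    """Return True if the row is stored, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO employee_project VALUES (?, ?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

# An employee who is assigned to a project can be inserted.
assignment_ok = try_insert((4, "Fin", "Jack", "A", 100000, "MI"))
# A new employee with no project has no value for the key column Project.
employee_without_project_ok = try_insert((5, "HR", "June", None, None, None))
# A new project with no employee has no value for the key column EmpID.
project_without_employee_ok = try_insert((None, None, None, "E", 400000, "WI"))

print(assignment_ok, employee_without_project_ok, project_without_employee_ok)
```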

Page 18

Deletion Anomaly

EMPLOYEE-PROJECT

EmpID  Dept  Name  Project  Budget  Location
1      Mkt   John  A        100000  MI
1      Mkt   John  B        200000  IN
2      Eng   Jim   A        100000  MI
3      Acct  Joe   C        350000  IL

Delete employee #1:

EmpID  Dept  Name  Project  Budget  Location
2      Eng   Jim   A        100000  MI
3      Acct  Joe   C        350000  IL

Delete project A:

EmpID  Dept  Name  Project  Budget  Location
1      Mkt   John  B        200000  IN
3      Acct  Joe   C        350000  IL
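A few lines of Python make the unintended losses explicit. The rows are the EMPLOYEE-PROJECT rows from the slide; the helper names are illustrative.

```python
rows = [
    (1, "Mkt", "John", "A", 100000, "MI"),
    (1, "Mkt", "John", "B", 200000, "IN"),
    (2, "Eng", "Jim", "A", 100000, "MI"),
    (3, "Acct", "Joe", "C", 350000, "IL"),
]

def employees(table):
    """Set of EmpID values that still appear in the table."""
    return {row[0] for row in table}

def projects(table):
    """Set of Project values that still appear in the table."""
    return {row[3] for row in table}

# Deleting employee #1 also removes every trace of a project.
after_delete_employee_1 = [row for row in rows if row[0] != 1]
lost_projects = projects(rows) - projects(after_delete_employee_1)

# Deleting project A also removes every trace of an employee.
after_delete_project_a = [row for row in rows if row[3] != "A"]
lost_employees = employees(rows) - employees(after_delete_project_a)

print(lost_projects, lost_employees)
```

Project B disappears with employee #1, and employee #2 disappears with project A, even though neither deletion was meant to remove them.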

Page 19

Update Anomaly

EMPLOYEE-PROJECT

EmpID  Dept  Name  Project  Budget  Location
1      Mkt   John  A        100000  MI
1      Mkt   John  B        200000  IN
2      Eng   Jim   A        100000  MI
3      Acct  Joe   C        350000  IL

Update employee #1:

EmpID  Dept  Name  Project  Budget  Location
1      Fin   John  A        100000  MI
1      Fin   John  B        200000  IN
2      Eng   Jim   A        100000  MI
3      Acct  Joe   C        350000  IL
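Because employee #1's department is stored once per project, every copy has to change together. The sketch below shows what happens if only one copy is updated; the helper inconsistent_departments is illustrative.

```python
rows = [
    [1, "Mkt", "John", "A", 100000, "MI"],
    [1, "Mkt", "John", "B", 200000, "IN"],
    [2, "Eng", "Jim", "A", 100000, "MI"],
    [3, "Acct", "Joe", "C", 350000, "IL"],
]

# Careless update: only the first of employee #1's two rows is changed.
rows[0][1] = "Fin"

def inconsistent_departments(table):
    """Map each EmpID that is recorded with more than one Dept to those Depts."""
    depts = {}
    for emp_id, dept, *_ in table:
        depts.setdefault(emp_id, set()).add(dept)
    return {emp_id: names for emp_id, names in depts.items() if len(names) > 1}

conflicts = inconsistent_departments(rows)
print(conflicts)
```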

Page 20

Normalization Theory

Basic Concept - Functional Dependency

Functional Dependency (FD): A relationship between attributes of an entity. FD is the foundation of Normalization.

Notation: a → b

• the value of a uniquely determines the value of b

• a is a “determinant”

• a functionally determines b

• b is functionally dependent on a
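A functional dependency can be tested against sample data with a few lines of Python. This is only a sketch: fd_holds is a hypothetical helper, and sample rows can refute an FD but never prove that it holds for all possible data. The rows reuse the EMPLOYEE-PROJECT example from the anomaly slides.

```python
def fd_holds(rows, determinant, dependent):
    """True if equal determinant values always come with equal dependent values."""
    seen = {}
    for row in rows:
        key = tuple(row[attr] for attr in determinant)
        value = tuple(row[attr] for attr in dependent)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True

columns = ("EmpID", "Dept", "Name", "Project", "Budget", "Location")
data = [
    (1, "Mkt", "John", "A", 100000, "MI"),
    (1, "Mkt", "John", "B", 200000, "IN"),
    (2, "Eng", "Jim", "A", 100000, "MI"),
    (3, "Acct", "Joe", "C", 350000, "IL"),
]
rows = [dict(zip(columns, values)) for values in data]

emp_determines_dept = fd_holds(rows, ["EmpID"], ["Dept", "Name"])
project_determines_budget = fd_holds(rows, ["Project"], ["Budget", "Location"])
emp_determines_project = fd_holds(rows, ["EmpID"], ["Project"])
print(emp_determines_dept, project_determines_budget, emp_determines_project)
```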

Page 21

Normalization Theory

Functional Dependency - Examples

SSN → Name , Phone , DOB

EmpID → Name , Phone , DOB

Interpretation: either SSN or EmpID uniquely determines an employee's Name, Phone, and DOB, but not the reverse!

Both SSN and EmpID are candidate keys. You can choose one of them as the PK:

EMPLOYEE ( SSN , EmpID , Name , Phone , DOB ) with PK: SSN, or

EMPLOYEE ( SSN , EmpID , Name , Phone , DOB ) with PK: EmpID

Page 22

Normalization Theory

Functional Dependency - Examples

VIN → # of doors , Color , Type

Interpretation: VIN uniquely determines a vehicle’s # of doors, Color, and Type, but not the reverse!

VEHICLE ( VIN , # of doors , Color , Type )

VIN is the only candidate key and it is used as the PK

Page 23

Database Normalization

Where is the beef?

Reality is not that simple!

• All candidate keys, including the PK, are determinants

• But a determinant may not be a candidate key

Q: What is the difference between a candidate key and a determinant?

A: They are similar, but not the same. The difference is scope: a candidate key determines every other attribute in the relation, while a determinant may determine only some of them.

Page 24

Database Normalization

Where is the beef?

Call # → CourseID , Title , Classroom

CourseID → Title

• Call # is a candidate key and is used as the PK

• Call # is also a determinant of CourseID, Title, and Classroom

• CourseID is a determinant of Title

Q: Should we put these four attributes in the same table (relation)?

A: No !! We need database normalization
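The gap between a determinant and a candidate key can be checked mechanically with an attribute closure, a standard technique that the slides themselves do not mention. In this sketch the function closure is an illustrative helper, and the FD list encodes only the dependencies stated on this slide.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know the whole left side, we learn the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

ALL_ATTRS = {"Call #", "CourseID", "Title", "Classroom"}
FDS = [
    (("Call #",), ("CourseID", "Title", "Classroom")),
    (("CourseID",), ("Title",)),
]

call_closure = closure({"Call #"}, FDS)
course_closure = closure({"CourseID"}, FDS)

# A candidate key must determine every attribute of the relation.
call_is_key = call_closure == ALL_ATTRS
course_is_key = course_closure == ALL_ATTRS
print(call_is_key, course_is_key, sorted(course_closure))
```

CourseID determines Title but not the whole relation, so it is a determinant without being a candidate key.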

Page 25

Database Normalization

Basic Ideas

Unnormalized

1st NF (1NF)

2nd NF (2NF)

3rd NF (3NF)

Page 26

Database Normalization

Normal Form Definitions

• A relation is in its first normal form (1NF) if it does not contain repeating groups.

• A relation is in its second normal form (2NF) if it is in 1NF and every non-primary-key attribute is fully functionally dependent on the (whole) primary key.

• A relation is in its third normal form (3NF) if it is in 2NF and has no transitive dependency between non-key attributes.

Page 27

First Normal Form (1NF)

1NF: A relation is in its first normal form (1NF) if it does not contain repeating groups

Unnormalized: SID , Name , Sex , DOB , Phone , Phone , Major , Major , Major (Phone and Major are repeating groups)

Normalization

STUDENT ( SID , Name , Sex , DOB )

STU_PHONE ( SID , Phone )

STU_MAJOR ( SID , Major )
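Removing the repeating groups amounts to moving each repeated value into its own row. The sketch below uses made-up student records in which Phone and Major hold lists; only the three target relations come from the slide.

```python
# Hypothetical unnormalized records: Phone and Major repeat within one student.
unnormalized = [
    {"SID": 1, "Name": "Ann", "Sex": "F", "DOB": "1995-03-02",
     "Phone": ["555-0101", "555-0102"], "Major": ["BIS", "ACC", "FIN"]},
    {"SID": 2, "Name": "Bob", "Sex": "M", "DOB": "1994-11-20",
     "Phone": ["555-0103"], "Major": ["MKT"]},
]

# STUDENT ( SID , Name , Sex , DOB ): one row per student.
student = [(r["SID"], r["Name"], r["Sex"], r["DOB"]) for r in unnormalized]

# STU_PHONE ( SID , Phone ): one row per student/phone pair.
stu_phone = [(r["SID"], phone) for r in unnormalized for phone in r["Phone"]]

# STU_MAJOR ( SID , Major ): one row per student/major pair.
stu_major = [(r["SID"], major) for r in unnormalized for major in r["Major"]]

print(len(student), len(stu_phone), len(stu_major))
```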

Page 28

Second Normal Form (2NF)

2NF: A relation is in its second normal form (2NF) if it is in 1NF and every non-primary-key attribute is fully functionally dependent on the (whole) primary key

JOB_HIST ( EmpID , Name , Position , SDate , Dept )

EmpID  Name  Position       SDate     Dept
123    Jim   Line crew      01/01/96  Factory
123    Jim   Supervisor     01/01/99  Factory
211    John  Sales Rep      09/01/94  MKT
211    John  Sales Manager  01/01/98  MKT
235    Joe   Accountant     07/01/96  Acct

A combination of EmpID and SDate is the only candidate key and is used as the PK

(EmpID, SDate) → Name , Position , Dept

EmpID → Name , Dept

Q: Do we see any partial functional dependency?

Page 29

Second Normal Form (2NF)

JOB_HIST ( EmpID , Name , Position , SDate , Dept )

Normalization

JOB_HIST ( EmpID , SDate , Position )

EMPLOYEE ( EmpID , Name , Dept )
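The decomposition can be checked on the JOB_HIST sample rows from the 2NF slide: project the original relation onto the two new ones, then join them back and confirm that nothing was lost. Variable names are illustrative.

```python
# Original rows: (EmpID, Name, Position, SDate, Dept)
job_hist_original = [
    (123, "Jim", "Line crew", "01/01/96", "Factory"),
    (123, "Jim", "Supervisor", "01/01/99", "Factory"),
    (211, "John", "Sales Rep", "09/01/94", "MKT"),
    (211, "John", "Sales Manager", "01/01/98", "MKT"),
    (235, "Joe", "Accountant", "07/01/96", "Acct"),
]

# JOB_HIST ( EmpID , SDate , Position ): attributes that need the whole key.
job_hist = sorted({(emp, sdate, pos) for emp, _, pos, sdate, _ in job_hist_original})

# EMPLOYEE ( EmpID , Name , Dept ): attributes that depend on EmpID alone.
employee = sorted({(emp, name, dept) for emp, name, _, _, dept in job_hist_original})

# Join the two relations on EmpID to rebuild the original rows.
by_emp = {emp: (name, dept) for emp, name, dept in employee}
rejoined = {
    (emp, by_emp[emp][0], pos, sdate, by_emp[emp][1])
    for emp, sdate, pos in job_hist
}

print(len(employee), rejoined == set(job_hist_original))
```

Each employee's Name and Dept is now stored once instead of once per job, and the join still reproduces all five original rows.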

Page 30

Comments on 2NF Verification

You don’t need to worry about whether a relation is in 2NF if its PK includes only one attribute (Why?)

Because a partial functional dependency can occur only when the PK is a composite (compound) key

Page 31

Third Normal Form (3NF)

3NF: A relation is in its third normal form (3NF) if it is in 2NF and there is no transitive dependency between non-key attributes in the relation

Transitive dependency: If a → b and b → c, then there is a transitive dependency between a and c

EMPLOYEE ( EmpID , Name , Phone , Office , Street , City , State , Zip )

EmpID → Name , Phone , Office , Street , City , State , Zip

Phone → Office

Zip → City , State

Q: Do you see the transitive dependency?

Page 32

Third Normal Form (3NF)

EMPLOYEE ( EmpID , Name , Phone , Office , Street , City , State , Zip )

Normalization

EMPLOYEE ( EmpID , Name , Phone , Street , Zip )

PHONE ( Phone , Office )

ZIP ( Zip , City , State )
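The three 3NF relations might be declared and queried as follows. The sample rows, column types, and foreign-key declarations are assumptions added for illustration; the join shows that the original wide EMPLOYEE row can still be reassembled on demand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE phone (Phone TEXT PRIMARY KEY, Office TEXT);
    CREATE TABLE zip (Zip TEXT PRIMARY KEY, City TEXT, State TEXT);
    CREATE TABLE employee (
        EmpID  INTEGER PRIMARY KEY,
        Name   TEXT,
        Phone  TEXT REFERENCES phone (Phone),
        Street TEXT,
        Zip    TEXT REFERENCES zip (Zip)
    );
    -- Made-up sample data: two employees share a phone and a ZIP code.
    INSERT INTO phone VALUES ('555-0100', 'B-210');
    INSERT INTO zip VALUES ('49008', 'Kalamazoo', 'MI');
    INSERT INTO employee VALUES (1, 'John', '555-0100', '1 Main St', '49008');
    INSERT INTO employee VALUES (2, 'Jim', '555-0100', '2 Oak Ave', '49008');
""")

# Rebuild the pre-normalization EMPLOYEE rows with two joins.
wide_rows = conn.execute("""
    SELECT e.EmpID, e.Name, e.Phone, p.Office, e.Street, z.City, z.State, e.Zip
    FROM employee e
    JOIN phone p ON p.Phone = e.Phone
    JOIN zip z ON z.Zip = e.Zip
    ORDER BY e.EmpID
""").fetchall()

zip_row_count = conn.execute("SELECT COUNT(*) FROM zip").fetchone()[0]
print(wide_rows[0], zip_row_count)
```

City and State are stored once per ZIP code rather than once per employee, so a change to either needs to be made in only one row.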