NORMALIZATION M Taimoor Khan taimoorkhan@ciit-attock.edu.pk

Upload: allyson-miller

Post on 31-Dec-2015


Page 1: M Taimoor Khan taimoorkhan@ciit-attock.edu.pk. Course Objectives 1) Basic Concepts 2) Tools 3) Database architecture and design 4) Flow of data (DFDs)

NORMALIZATION

M Taimoor Khan

taimoorkhan@ciit-attock.edu.pk


Course Objectives

1) Basic Concepts

2) Tools

3) Database architecture and design

4) Flow of data (DFDs)

5) Mappings (ERDs)

6) Formulating queries (Relational algebra)

7) Implementing Schema

8) Built-in Functions

9) Extracting data

10) Working with Joins

11) Normalization

12) Improving performance

13) Advanced topics


Normalization

• Normalization
• Functional Dependency
• Armstrong’s Axioms
• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)
• Boyce-Codd Normal Form (BCNF)


Normalization

Normalization is the process of efficiently organizing data in a database.

There are two goals of the normalization process: eliminating redundant data (for example, storing the same data in more than one table) and ensuring data dependencies make sense (only storing related data in a table).

Both of these are worthy goals, as they reduce the amount of space a database consumes and ensure that data is logically stored.


Anomalies

Insertion anomaly: We cannot add a new record in courseRegistered unless the course exists in the course table.

Update anomaly: If we change the course description for IS380 from Database_Concepts to Advance_Database_Concepts, we have to make the change in more than one place or else the database will be inconsistent. In other words, in some places the course description will be Advance_Database_Concepts, and in any place where we forgot to make the change the description will still be Database_Concepts.

Deletion anomaly: If student Russell is deleted from the database, we also lose the information that we had on his currentGPA, courses passed, etc.
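
The update and deletion anomalies can be reproduced on a small in-memory table. The sketch below is only an illustration: the row layout, the second student "Jones" and the course "IS416" are hypothetical and do not come from the slides.

```python
# Hypothetical denormalized rows: the course description is repeated
# on every registration row (names and values are illustrative only).
registrations = [
    {"student": "Russell", "course": "IS380", "description": "Database_Concepts"},
    {"student": "Jones", "course": "IS380", "description": "Database_Concepts"},
    {"student": "Russell", "course": "IS416", "description": "Unix"},
]


def descriptions_for(rows, course):
    """Return the set of distinct descriptions stored for one course."""
    return {row["description"] for row in rows if row["course"] == course}


# Update anomaly: only the first IS380 row is changed, the second is forgotten,
# so the same course now carries two different descriptions.
registrations[0]["description"] = "Advance_Database_Concepts"
inconsistent = descriptions_for(registrations, "IS380")

# Deletion anomaly: removing Russell's rows also removes the only record
# that course IS416 exists at all.
remaining = [row for row in registrations if row["student"] != "Russell"]
courses_left = {row["course"] for row in remaining}
```

After the partial update, `inconsistent` holds both descriptions for IS380, which is exactly the inconsistency the slide describes.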


Functional dependency

A functional dependency occurs when one attribute in a relation uniquely determines another attribute. This can be written A -> B which would be the same as stating "B is functionally dependent upon A."


Example

For example, there is a relation of student with the following attributes. We will establish the functional dependencies among its attributes:

STD (stId, stName, stAdr, courseName, credits)

stId -> stName, stAdr, courseName, credits
courseName -> credits

Now in this example, if we know the stId we can tell the complete information about that student. Similarly, if we know the courseName, we can tell the credit hours for any particular subject.
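
A functional dependency can be tested against sample data: whenever two rows agree on the left-hand side, they must also agree on the right-hand side. The sketch below assumes a few made-up STD rows (the student names, addresses and credit values are hypothetical). Note that sample data can only refute a dependency; it cannot prove that the dependency holds for every possible row.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if every pair of rows that agrees on lhs also agrees on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it;
        # a later row with the same key but a different value breaks the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows for STD (stId, stName, stAdr, courseName, credits).
std_rows = [
    {"stId": 1, "stName": "Ali", "stAdr": "Attock", "courseName": "DB", "credits": 3},
    {"stId": 2, "stName": "Sara", "stAdr": "Lahore", "courseName": "DB", "credits": 3},
    {"stId": 3, "stName": "Omar", "stAdr": "Attock", "courseName": "OS", "credits": 4},
]

course_to_credits = fd_holds(std_rows, ["courseName"], ["credits"])
id_to_name = fd_holds(std_rows, ["stId"], ["stName"])
address_to_name = fd_holds(std_rows, ["stAdr"], ["stName"])
```

Here courseName -> credits and stId -> stName are consistent with the sample, while stAdr -> stName is refuted because two students share the same address.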


EMP (eId, eName, eAdr, eDept, prId, prCost)

eId -> eName, eAdr, eDept
eId, prId -> prCost


Armstrong’s Axioms

Reflexivity rule: If A is a set of attributes, and B is a set of attributes that is completely contained in A, then A implies B.

Augmentation rule: If A implies B, and C is a set of attributes, then AC implies BC.

Transitivity rule: If A implies B and B implies C, then A implies C.

The first three rules are Armstrong’s axioms proper; the remaining three rules can be derived from them.

Union rule: If A implies B and A implies C, then A implies BC.

Decomposition rule: If A implies BC, then A implies B and A implies C.

Pseudo-transitivity rule: If A implies B and CB implies D, then AC implies D.
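
In practice, these rules are usually applied through the attribute closure: the set of all attributes that a given set of attributes determines. The following is a minimal sketch of the standard closure computation, using the EMP dependencies from the earlier example. The function name `closure` and the representation of dependencies as pairs of Python sets are choices made here for illustration.

```python
def closure(attributes, fds):
    """Compute the closure of a set of attributes under functional dependencies.

    fds is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already determine all of lhs, we also determine rhs.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# EMP (eId, eName, eAdr, eDept, prId, prCost)
emp_attributes = {"eId", "eName", "eAdr", "eDept", "prId", "prCost"}
emp_fds = [
    ({"eId"}, {"eName", "eAdr", "eDept"}),
    ({"eId", "prId"}, {"prCost"}),
]

closure_of_eid = closure({"eId"}, emp_fds)
closure_of_pair = closure({"eId", "prId"}, emp_fds)
```

The closure of {eId, prId} covers every attribute of EMP, so under these dependencies that pair can serve as a key; eId alone cannot, because it does not determine prId or prCost.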


Normal Forms

Normalization is basically a process of efficiently organizing data in a database.

There are two goals of the normalization process: eliminate redundant data (for example, storing the same data in more than one table) and ensure data dependencies make sense (only storing related data in a table).

Both of these are worthy goals, as they reduce the amount of space a database consumes and ensure that data is logically stored.


First Normal Form

A relation is in first normal form if and only if every attribute is single-valued for each tuple.

This means that each attribute in each row, or each cell of the table, contains only one value.

The domain of each attribute is atomic and not multi-valued.

Multi-valued attributes create problems in SELECT and JOIN operations.
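
As an illustration of bringing data into first normal form, the sketch below flattens a multi-valued cell into one row per value. The student rows and the "phones" attribute are hypothetical and are not taken from the slides.

```python
# Hypothetical non-1NF rows: the "phones" cell holds several values at once.
unnormalized = [
    {"stId": 1, "stName": "Ali", "phones": ["111-1111", "222-2222"]},
    {"stId": 2, "stName": "Sara", "phones": ["333-3333"]},
]


def to_first_normal_form(rows, multivalued):
    """Emit one row per value of the multi-valued attribute."""
    flat = []
    for row in rows:
        for value in row[multivalued]:
            # Copy every other attribute unchanged, then store a single value.
            new_row = {k: v for k, v in row.items() if k != multivalued}
            new_row[multivalued] = value
            flat.append(new_row)
    return flat


flat_rows = to_first_normal_form(unnormalized, "phones")
```

Each resulting row holds exactly one phone value, so a selection or join on that attribute can compare plain values instead of searching inside a list.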


Example


Second Normal Form

A relation is in second normal form (2NF) if and only if it is in first normal form and all the nonkey attributes are fully functionally dependent on the key.
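
"Fully functionally dependent" means that no non-key attribute depends on only part of a composite key. The sketch below checks this for the EMP relation from the functional dependency example, assuming (eId, prId) is its only key. The helper `partial_dependencies` is hypothetical, and it only inspects the dependencies that are listed explicitly; it does not derive implied ones.

```python
def partial_dependencies(key, fds):
    """Return FDs whose left side is a proper subset of the key and whose
    right side contains at least one attribute outside the key."""
    key = set(key)
    violations = []
    for lhs, rhs in fds:
        non_key = set(rhs) - key
        if set(lhs) < key and non_key:
            violations.append((set(lhs), non_key))
    return violations


# EMP (eId, eName, eAdr, eDept, prId, prCost), key assumed to be (eId, prId).
emp_key = {"eId", "prId"}
emp_fds = [
    ({"eId"}, {"eName", "eAdr", "eDept"}),
    ({"eId", "prId"}, {"prCost"}),
]

emp_violations = partial_dependencies(emp_key, emp_fds)
```

The single violation reported is eId -> eName, eAdr, eDept: these attributes depend on only part of the key, so EMP as given is not in 2NF. Moving them into a separate employee table keyed by eId would remove the partial dependency.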


Example

This data is in first normal form

CustID  FirstName  LastName    Address           City       State  Zip
1       Bob        Smith       123 Main St.      Tucson     AZ     12345
2       John       Brown       555 2nd Ave.      St. Paul   MN     54355
3       Sandy      Jessop      4256 James St.    Chicago    IL     43555
4       Maria      Hernandez   4599 Columbia     Vancouver  BC     V5N 1M0
5       Gameil     Hintz       569 Summit St.    St. Paul   MN     54355
6       James      Richardson  12 Cameron Bay    Regina     SK     S4T 2V8
7       Shiela     Green       12 Michigan Ave.  Chicago    IL     43555
8       Ian        Sampson     56 Manitoba St.   Winnipeg   MB     M5W 9N7
9       Ed         Rodgers     15 Athol St.      Regina     SK     S4T 2V9


But let’s pay attention to the City, State, and Zip fields. There are 3 sets of rows with repeating data: one for Chicago, one for St. Paul and one for Regina.

The Chicago and St. Paul rows repeat the same city, state and Zip code; the two Regina rows repeat the city and state but have different postal codes.

City       State  Zip
Tucson     AZ     12345
St. Paul   MN     54355
Chicago    IL     43555
Vancouver  BC     V5N 1M0
St. Paul   MN     54355
Regina     SK     S4T 2V8
Chicago    IL     43555
Winnipeg   MB     M5W 9N7
Regina     SK     S4T 2V9


The CustID determines all the data in the row, but U.S. Zip codes determine the City and State (e.g., a given Zip code can only belong to one city and state, so storing Zip codes with a City and State is redundant).

This means that City and State are functionally dependent on the value in Zip code and not only on the primary key.


To be in 2NF, this repeating data must be in its own table.

So let’s create a Zip code table that maps Zip codes to their City and State.


Our Data in 2NF

Customer Table

CustID  FirstName  LastName    Address           Zip
1       Bob        Smith       123 Main St.      12345
2       John       Brown       555 2nd Ave.      54355
3       Sandy      Jessop      4256 James St.    43555
4       Maria      Hernandez   4599 Columbia     V5N 1M0
5       Gameil     Hintz       569 Summit St.    54355
6       James      Richardson  12 Cameron Bay    S4T 2V8
7       Shiela     Green       12 Michigan Ave.  43555
8       Ian        Sampson     56 Manitoba St.   M5W 9N7
9       Ed         Rodgers     15 Athol St.      S4T 2V9

Zip Code Table

Zip      City       State
12345    Tucson     AZ
54355    St. Paul   MN
43555    Chicago    IL
V5N 1M0  Vancouver  BC
S4T 2V8  Regina     SK
M5W 9N7  Winnipeg   MB
S4T 2V9  Regina     SK

• We see that we can actually save 2 rows in the Zip Code table by removing these redundancies: 9 customer records only need 7 Zip code records.

• Zip code becomes a foreign key in the customer table, linked to the primary key in the Zip code table.
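
The split into a customer table and a Zip code table can be sketched in a few lines. The example below uses the nine customer rows from the slides; the variable names and the tuple layout are assumptions made for illustration. It also joins the two tables back together to check that no information was lost.

```python
# Original rows: (CustID, FirstName, LastName, Address, City, State, Zip)
customers = [
    (1, "Bob", "Smith", "123 Main St.", "Tucson", "AZ", "12345"),
    (2, "John", "Brown", "555 2nd Ave.", "St. Paul", "MN", "54355"),
    (3, "Sandy", "Jessop", "4256 James St.", "Chicago", "IL", "43555"),
    (4, "Maria", "Hernandez", "4599 Columbia", "Vancouver", "BC", "V5N 1M0"),
    (5, "Gameil", "Hintz", "569 Summit St.", "St. Paul", "MN", "54355"),
    (6, "James", "Richardson", "12 Cameron Bay", "Regina", "SK", "S4T 2V8"),
    (7, "Shiela", "Green", "12 Michigan Ave.", "Chicago", "IL", "43555"),
    (8, "Ian", "Sampson", "56 Manitoba St.", "Winnipeg", "MB", "M5W 9N7"),
    (9, "Ed", "Rodgers", "15 Athol St.", "Regina", "SK", "S4T 2V9"),
]

# Zip code table: one entry per distinct Zip, mapping Zip -> (City, State).
zip_table = {}
for _, _, _, _, city, state, zip_code in customers:
    zip_table[zip_code] = (city, state)

# Customer table: City and State are dropped, Zip stays as the foreign key.
customer_table = [
    (cust_id, first, last, address, zip_code)
    for cust_id, first, last, address, _, _, zip_code in customers
]

# Joining on Zip rebuilds the original rows, so the split loses nothing.
rejoined = [
    (cust_id, first, last, address, *zip_table[zip_code], zip_code)
    for cust_id, first, last, address, zip_code in customer_table
]
```

The dictionary keeps one entry per Zip code, which is where the saving from 9 rows to 7 comes from.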


Advantages of 2NF

• Saves space in the database by reducing redundancies.

• If a customer calls, you can just ask them for their Zip code and you’ll know their city and state! (No more spelling mistakes.)

• If a City name changes, we only need to make one change to the database.
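
The "one change" advantage can be tried out with Python's built-in `sqlite3` module. This is a sketch only: the table and column names (`zip_code`, `customer`, `cust_id` and so on) and the new city spelling "Saint Paul" are hypothetical, and only three of the customer rows are loaded to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE zip_code (zip TEXT PRIMARY KEY, city TEXT, state TEXT)")
conn.execute(
    "CREATE TABLE customer ("
    "cust_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, "
    "address TEXT, zip TEXT REFERENCES zip_code(zip))"
)

conn.executemany(
    "INSERT INTO zip_code VALUES (?, ?, ?)",
    [("54355", "St. Paul", "MN"), ("43555", "Chicago", "IL")],
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?, ?)",
    [
        (2, "John", "Brown", "555 2nd Ave.", "54355"),
        (3, "Sandy", "Jessop", "4256 James St.", "43555"),
        (5, "Gameil", "Hintz", "569 Summit St.", "54355"),
    ],
)

# A single-row UPDATE on zip_code changes the city seen by every customer in that Zip.
updated = conn.execute(
    "UPDATE zip_code SET city = 'Saint Paul' WHERE zip = '54355'"
).rowcount

cities = conn.execute(
    "SELECT c.cust_id, z.city FROM customer AS c "
    "JOIN zip_code AS z ON c.zip = z.zip ORDER BY c.cust_id"
).fetchall()
```

Only one row is updated, yet both customers with Zip 54355 show the new city name when the tables are joined.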


Normalization

• Normalization
• Functional Dependency
• Armstrong’s Axioms
• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)
• Boyce-Codd Normal Form (BCNF)


Lab Activity-13

• Creating forms
  - Using form button
  - Create a split form
  - Create a multiple table form
  - Modify a form
• Creating reports
  - Using report button


Next Lecture

Normalization continued…