Information Resources Management

March 13, 2001

Agenda

Administrivia
Normalization
Homework #7
Mid-Term #2

Administrivia

Homework #4
Homework #5
Homework #6
Quiz 2
Mid-Term #1 Keys
Mid-Term Grades

Regrade Requests HW 5 & 6

Create Database
Enter queries as submitted
Submit to me
  Database (electronic)
  Graded homework (paper)
Reserve the right to change test data and re-execute the query

Normalization

Why & What
1st Normal Form
2nd Normal Form
3rd Normal Form
Boyce-Codd Normal Form
4th Normal Form

Normalization - Why

Eliminate anomalies
Avoid duplication
Increase flexibility and stability
Reduce maintenance

Normalization - What?!?

Analysis of functional dependencies between attributes
Building several smaller tables from larger ones
Decomposing relations with anomalies to produce smaller, well-structured relations
Reducing complexity & increasing stability

Normalization - What (2)

Series of Steps
Recipe for constructing a “good” physical model of a database from a logical model
Applied to all existing tables, including ones produced by earlier normalization steps

Example

Sales (Order#, Date, CustID, Name, Address, City, State, Zip, {Product#, ProductDesc, Price, QuantityOrdered}, Subtotal, Tax, S&H, Total)
  Primary key: Order#

What are the problems with using a single table for all order information?

Problems

Implementing Repeating Groups
Duplication of Data (customer name & address)
Unnecessary Data (subtotal, total, tax)
Others

Normalization is a process to eliminate these problems.

1st Normal Form

Eliminate Repeating Groups
1st Normal Form has no repeating groups

Create a definition with all other attributes, remove the repeat braces {}, and change the primary key to include the “key” for the repeating group.

Example

Sales (Order#, Date, CustID, Name, Address, City, State, Zip, Product#, ProductDesc, Price, QuantityOrdered, Subtotal, Tax, S&H, Total)
  Primary key: Order#, Product#

Why is this better?

1st NF Improvements

Implementation is possible
Querying is possible
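
The move from the unnormalized Sales record to 1st NF can be sketched in Python. This is only an illustration: the sample order, the field names (`order_no`, `lines`, and so on), and the helper `to_first_normal_form` are assumptions, not part of the lecture. Each entry of the repeating group becomes its own flat row, and the key grows to (Order#, Product#).

```python
# Hypothetical sample order; all values are made up for illustration.
order = {
    "order_no": 1001,
    "date": "2001-03-13",
    "cust_id": 7,
    # The repeating group {Product#, ProductDesc, Price, QuantityOrdered}
    "lines": [
        {"product_no": 1, "product_desc": "Table", "price": 200.0, "qty": 1},
        {"product_no": 2, "product_desc": "Chair", "price": 50.0, "qty": 4},
    ],
}


def to_first_normal_form(order):
    """Return one flat row per entry of the repeating group."""
    header = {k: v for k, v in order.items() if k != "lines"}
    # Copy the order-level attributes into every row (this is the
    # duplication that 2nd NF removes later).
    return [{**header, **line} for line in order["lines"]]


rows = to_first_normal_form(order)
# The composite key (Order#, Product#) identifies each flat row.
keys = [(r["order_no"], r["product_no"]) for r in rows]
```

Every row now has a fixed set of single-valued columns, which is what makes the table implementable and queryable in a relational database.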

2nd Normal Form

Remove all partial functional dependencies
2nd Normal Form has no partial functional dependencies and is in 1st Normal Form
Partial dependencies get their own tables; the original table gets a foreign key

Partial Functional Dependencies

An attribute is dependent on only part of the primary key
  The key must be a composite key
  A table with a single-attribute key is already 2nd NF
Functional dependencies can be specified explicitly but usually come from the E-R model, user specifications, and common sense

key → non-key attributes
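
As a rough illustration of what "key → non-key attributes" means on actual data, the sketch below tests whether a dependency is contradicted by a set of rows. The function `holds`, the column names, and the sample rows are all hypothetical. Note that sample data can only refute a dependency; whether it truly holds is a statement about the business rules, which is why the slide points to the E-R model and user specifications.

```python
def holds(rows, lhs, rhs):
    """Return True if no pair of rows contradicts the dependency lhs -> rhs."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # setdefault keeps the first right-hand value seen for this
        # left-hand value; a different value later is a contradiction.
        if seen.setdefault(left, right) != right:
            return False
    return True


# Made-up rows shaped like the 1st NF Sales table (a few columns only).
rows = [
    {"order_no": 1001, "product_no": 1, "product_desc": "Table", "qty": 1},
    {"order_no": 1001, "product_no": 2, "product_desc": "Chair", "qty": 4},
    {"order_no": 1002, "product_no": 2, "product_desc": "Chair", "qty": 2},
]

desc_depends_on_product = holds(rows, ["product_no"], ["product_desc"])
qty_depends_on_product = holds(rows, ["product_no"], ["qty"])
qty_depends_on_full_key = holds(rows, ["order_no", "product_no"], ["qty"])
```

In this sample the description depends on the product number alone, while the quantity needs the whole composite key.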

Example - Functional Dependencies

Order# → Date, CustID, Name, Address, City, State, Zip, Subtotal, Tax, S&H, Total
Order#, Product# → ProductDesc, Price, QuantityOrdered
CustID → Name, Address, City, State, Zip
Product# → ProductDesc, Price

Which are partial functional dependencies?
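
One way to approach the question mechanically, sketched in Python under the assumption that the 1st NF key is (Order#, Product#): a dependency is partial when its left-hand side is a proper subset of the key and it determines at least one non-key attribute. The variable names here are illustrative only.

```python
# The functional dependencies listed on the slide, as (lhs, rhs) pairs.
order_attrs = {"Date", "CustID", "Name", "Address", "City", "State", "Zip",
               "Subtotal", "Tax", "S&H", "Total"}
fds = [
    ({"Order#"}, order_attrs),
    ({"Order#", "Product#"}, {"ProductDesc", "Price", "QuantityOrdered"}),
    ({"CustID"}, {"Name", "Address", "City", "State", "Zip"}),
    ({"Product#"}, {"ProductDesc", "Price"}),
]

# Primary key of the 1st NF Sales table.
key = {"Order#", "Product#"}

# "lhs < key" is Python's proper-subset test; "rhs - key" keeps only
# the non-key attributes on the right-hand side.
partial = [lhs for lhs, rhs in fds if lhs < key and rhs - key]
```

CustID → Name, Address, ... is not flagged here because CustID is not part of the key; that dependency is handled at the 3rd NF step.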

Example

Sales (Order#, Date, CustID, Name, Address, City, State, Zip, Subtotal, Tax, S&H, Total)
  Primary key: Order#

OrderLine (Order#, Product#, ProductDesc, Price, QuantityOrdered)
  Primary key: Order#, Product#

Is this 2nd NF?

Example

Sales (Order#, Date, CustID, Name, Address, City, State, Zip, Subtotal, Tax, S&H, Total)
  Primary key: Order#

OrderLine (Order#, Product#, QuantityOrdered)
  Primary key: Order#, Product#

Product (Product#, ProductDesc, Price)
  Primary key: Product#

Is this 2nd NF? Why is this better than 1st NF?
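
The split of the order-line data into OrderLine and Product can be sketched as two projections, followed by a join to confirm that nothing was lost. The tuples below are made-up sample data and the variable names are assumptions.

```python
# (order_no, product_no, product_desc, price, qty): made-up 1st NF rows
first_nf = [
    (1001, 1, "Table", 200.0, 1),
    (1001, 2, "Chair", 50.0, 4),
    (1002, 2, "Chair", 50.0, 2),
]

# Projection onto OrderLine (Order#, Product#, QuantityOrdered)
order_line = {(o, p, q) for o, p, _, _, q in first_nf}

# Projection onto Product (Product#, ProductDesc, Price); the set
# removes the duplicate "Chair" description and price.
product = {(p, d, pr) for _, p, d, pr, _ in first_nf}

# Natural join on Product# rebuilds the original rows.
rejoined = {
    (o, p, d, pr, q)
    for o, p, q in order_line
    for p2, d, pr in product
    if p == p2
}
```

The chair's description and price are now stored once instead of once per order line, yet the join recovers every original row.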

2nd NF Improvements

Elimination of Duplicate Data
No Loss

3rd Normal Form

Eliminate transitive functional dependencies
3rd Normal Form has no transitive dependencies and is in 2nd Normal Form
Transitive dependencies get their own tables; the original table gets a foreign key

Transitive Functional Dependencies

Attribute is dependent on another, non-key attribute or attributes
Attribute is the result of a calculation

CustID → Name, Address, City, State, Zip

Example

Sales (Order#, Date, CustID, Subtotal, Tax, S&H, Total)
  Primary key: Order#

OrderLine (Order#, Product#, QuantityOrdered)
  Primary key: Order#, Product#

Product (Product#, ProductDesc, Price)
  Primary key: Product#

Customer (CustID, Name, Address, City, State, Zip)
  Primary key: CustID

Is this 3rd NF? Why is this better than 2nd NF?

Example

Sales (Order#, Date, CustID)
  Primary key: Order#

OrderLine (Order#, Product#, QuantityOrdered)
  Primary key: Order#, Product#

Product (Product#, ProductDesc, Price)
  Primary key: Product#

Customer (CustID, Name, Address, City, State, Zip)
  Primary key: CustID

Is this 3rd NF? Why is this better than 2nd NF?
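
A minimal sketch of this four-table schema, using Python's built-in `sqlite3` module with an in-memory database. The column names, column types, and sample rows are assumptions made for illustration. It shows why Subtotal no longer needs to be stored: it can be calculated from OrderLine and Product when needed. Tax and S&H are left out because the slides do not say how they are computed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    cust_id INTEGER PRIMARY KEY,
    name TEXT, address TEXT, city TEXT, state TEXT, zip TEXT
);
CREATE TABLE product (
    product_no INTEGER PRIMARY KEY,
    product_desc TEXT,
    price REAL
);
CREATE TABLE sales (
    order_no INTEGER PRIMARY KEY,
    order_date TEXT,
    cust_id INTEGER REFERENCES customer(cust_id)
);
CREATE TABLE order_line (
    order_no INTEGER REFERENCES sales(order_no),
    product_no INTEGER REFERENCES product(product_no),
    quantity_ordered INTEGER,
    PRIMARY KEY (order_no, product_no)
);
-- Made-up sample data
INSERT INTO customer VALUES (7, 'Pat Doe', '1 Main St', 'Pittsburgh', 'PA', '15213');
INSERT INTO product VALUES (1, 'Table', 200.0);
INSERT INTO product VALUES (2, 'Chair', 50.0);
INSERT INTO sales VALUES (1001, '2001-03-13', 7);
INSERT INTO order_line VALUES (1001, 1, 1);
INSERT INTO order_line VALUES (1001, 2, 4);
""")

# Subtotal is derived on demand instead of being stored in Sales.
subtotal = conn.execute(
    """
    SELECT SUM(p.price * ol.quantity_ordered)
    FROM order_line AS ol
    JOIN product AS p ON p.product_no = ol.product_no
    WHERE ol.order_no = ?
    """,
    (1001,),
).fetchone()[0]
```

Because the customer's name and address live only in `customer`, a change of address is a single-row update no matter how many orders the customer has placed.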

3rd NF Improvements

Elimination of Duplicate Data
No Loss
Data is Well-grouped

Beyond 3rd Normal Form

Assume we also want to track information about products, builders, and finishes

The following are the functional dependencies:
  Product, Finish → Builder
  Builder → Finish

Beyond 3rd Normal Form

ProdFinish (Product#, {Finish, Builder})
  Primary key: Product#

becomes

ProdFinish (Product#, Finish, Builder)
  Primary key: Product#, Finish

Is this 3rd NF?

What’s wrong with 3rd NF?

Product  Finish  Builder
1        Cherry  Stan
1        Oak     Sue
2        Oak     Vera
3        Pine    Marv
3        Cherry  Stan
5        Oak     Vera

Product, Finish → Builder
Builder → Finish

What’s Wrong with 3rd NF?

Product  Finish  Builder
1        Cherry  Stan
1        Oak     Sue
2        Oak     Vera
3        Pine    Marv
3        Cherry  Stan
5        Oak     Vera

What happens when:

1. Vera is replaced by Vern?

2. Vera is rehired to work with Oak?

3. Product #3 in pine is discontinued?

What’s Wrong with 3rd NF?

Problems

1. Multiple changes need to be made

2. Can’t assign a builder without a product

3. Lose information that Marv works in Pine
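
Problems 1 and 3 can be reproduced directly on the slide's table. The sketch below holds the six rows in a Python list; the variable names are illustrative only.

```python
# (product, finish, builder) rows exactly as shown on the slide
prod_finish = [
    (1, "Cherry", "Stan"),
    (1, "Oak", "Sue"),
    (2, "Oak", "Vera"),
    (3, "Pine", "Marv"),
    (3, "Cherry", "Stan"),
    (5, "Oak", "Vera"),
]

# Problem 1 (update anomaly): replacing Vera with Vern means changing
# every row that names her, not one fact in one place.
rows_to_change = [row for row in prod_finish if row[2] == "Vera"]

# Problem 3 (deletion anomaly): discontinuing product 3 in Pine removes
# the only row that records Marv at all.
after_delete = [row for row in prod_finish if (row[0], row[1]) != (3, "Pine")]
marv_still_known = any(row[2] == "Marv" for row in after_delete)
```

Problem 2 (the insertion anomaly) shows up as the inability to write a row for a builder without inventing a product number for the key.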

Problem & Solution

Problem: Builder → Finish, but Builder is not a key

Solution: Boyce-Codd Normal Form

Boyce-Codd Normal Form (BCNF)

Every determinant in a relation (the left-hand side of each FD) is a candidate key, and the relation is in 3rd NF

Make the determinant part of the key, make what depends on it a non-key attribute, and renormalize

Example

ProductFinish (Product#, Builder, Finish)
  Primary key: Product#, Builder

Is this BCNF?

Hint: Is it 3rd NF?

Example

ProductFinish (Product#, Builder)
  Primary key: Product#, Builder

Builder (Builder, Finish)
  Primary key: Builder

Is there anything wrong with this?

Example

ProductBuilder (Product#, Builder)
  Primary key: Product#, Builder

Builder (Builder, Finish)
  Primary key: Builder

Normalization often results in the need to rename tables so the table name matches the actual contents.
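
The BCNF rule ("every determinant is a candidate key") can be checked with an attribute-closure computation. The following is a simplified sketch, not a complete normalization tool: `closure` and `bcnf_violations` are hypothetical helper names, and for the decomposed tables the check only considers the listed dependencies whose left-hand side lies inside the table, rather than computing a full projection of the dependency set.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Return determinants of nontrivial FDs that are not superkeys."""
    return [
        lhs
        for lhs, rhs in fds
        if lhs <= relation                       # FD applies to this table
        and (rhs & relation) - lhs               # and is nontrivial in it
        and not relation <= closure(lhs, fds)    # but lhs is not a superkey
    ]


# The two dependencies from the slides.
fds = [
    ({"Product", "Finish"}, {"Builder"}),
    ({"Builder"}, {"Finish"}),
]

original = bcnf_violations({"Product", "Finish", "Builder"}, fds)
product_builder = bcnf_violations({"Product", "Builder"}, fds)
builder = bcnf_violations({"Builder", "Finish"}, fds)
```

For the original three-column table the check reports Builder as a determinant that is not a key; for ProductBuilder and Builder it reports nothing.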

Beyond BCNF

Normalization with separate repeating groups can result in other anomalies

CustService (State, {SalesPerson}, {Delivery})
  Primary key: State

Beyond BCNF

CustService (State, SalesPerson, Delivery)

State  Sales    Delivery
PA     George   UPS
PA     George   RPS
PA     Sue      UPS
PA     Sue      RPS
NJ     Mike     UPS
NJ     Mike     Truck
NJ     Valerie  UPS
NJ     Valerie  Truck

Is this BCNF?

Beyond BCNF

Everything is in the key, so it must be BCNF
Still problems with duplication
Multivalued Dependencies

Multivalued Dependency

At least three attributes (A, B, C)
A →→ B and A →→ C
B and C are independent of each other (they really shouldn’t be in the same table)

4th Normal Form

No multivalued dependencies and BCNF
Create separate tables for each separate functional dependency

Example

SalesForce (State, SalesPerson)

State  Sales
PA     George
PA     Sue
NJ     Mike
NJ     Valerie

Delivery (State, Delivery)

State  Delivery
PA     UPS
PA     RPS
NJ     UPS
NJ     Truck
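
That the two smaller tables carry the same information as CustService can be checked by projecting and rejoining. The sketch below uses the eight rows from the earlier slide; the variable names are illustrative.

```python
# (state, salesperson, delivery) rows of CustService from the slide
cust_service = {
    ("PA", "George", "UPS"), ("PA", "George", "RPS"),
    ("PA", "Sue", "UPS"), ("PA", "Sue", "RPS"),
    ("NJ", "Mike", "UPS"), ("NJ", "Mike", "Truck"),
    ("NJ", "Valerie", "UPS"), ("NJ", "Valerie", "Truck"),
}

# Project onto the two independent facts about a state.
sales_force = {(state, person) for state, person, _ in cust_service}
delivery = {(state, method) for state, _, method in cust_service}

# Joining on State pairs every salesperson with every delivery method
# for that state, which is exactly what the original table stored.
rejoined = {
    (state, person, method)
    for state, person in sales_force
    for state2, method in delivery
    if state == state2
}
```

Eight rows shrink to four plus four here, and the saving grows with each added salesperson or delivery method, since the single table needs one row per combination.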

Beyond 4th Normal Form

5th Normal Form
  Project-Join Normal Form
Domain Key Normal Form (DKNF)

User View
  Remove repeating groups
1st NF
  Remove partial functional dependencies
2nd NF
  Remove transitive functional dependencies
3rd NF
  Remove remaining functional dependency anomalies
BCNF
  Remove multivalued dependencies
4th NF

In-Class Exercises

Identify the current normal form
If not 4th NF, transform to 4th NF

Homework #7

Normalization
Database schema from HW #3
Earlier due date - post key?

Mid-Term #2

Next week, 3/20
Topics
  Converting an E-R Diagram to a physical database schema
  Normalizing that schema (3NF)
  SQL
  Identification of BCNF, 4NF problems