Bad DB Design


Upload: diem

Post on 06-Feb-2016



TRANSCRIPT

Page 1: Bad DB Design

Bad DB Design

Duplication of data
Updating
Deleting

Page 2: Bad DB Design

Redundant

Page 3: Bad DB Design

Deleting

Page 4: Bad DB Design

Update

Page 5: Bad DB Design

Normalization

Normalization is a design technique that is widely used as a guide in designing relational databases. It is essentially a two-step process: first put the data into tabular form by removing repeating groups, then remove duplicated data from the relational tables.

Normalization theory is based on the concept of normal forms. A relational table is said to be in a particular normal form if it satisfies a certain set of constraints. Five numbered normal forms are commonly defined. In this course, we will cover the first three.

Page 6: Bad DB Design

Normalization, cont.

The goal of normalization is to create a set of relational tables that are free of redundant data and that can be consistently and correctly modified. This means that all tables in a relational database should be in third normal form (3NF).

Page 7: Bad DB Design

Normalization

A relational table is in 3NF if and only if all non-key columns are:
  mutually independent, and
  fully dependent upon the primary key.

Mutual independence means that no non-key column is dependent upon any combination of the other columns.

The first two normal forms are intermediate steps to achieve the goal of having all tables in 3NF.

In order to better understand 2NF and the higher forms, it is necessary to understand the concept of functional dependencies.

Page 8: Bad DB Design

Functional Dependencies

The concept of functional dependencies is the basis for the first three normal forms. A column Y of the relational table R is said to be functionally dependent upon column X of R if and only if each value of X in R is associated with precisely one value of Y at any given time. X and Y may be composite. Saying that column Y is functionally dependent upon X is the same as saying that the values of column X identify the values of column Y. If column X is a primary key, then all columns in the relational table R must be functionally dependent upon X.

A shorthand notation for describing a functional dependency is:

  R.x -> R.y

which can be read as: in the relational table named R, column x functionally determines (identifies) column y.
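The definition above can also be stated formally. The following is one standard way to write it; the tuple names t_1 and t_2 are introduced here only for illustration, and t[X] denotes the values of tuple t in the columns X.

```latex
X \to Y \ \text{holds in } R
\iff
\forall\, t_1, t_2 \in R:\quad
t_1[X] = t_2[X] \implies t_1[Y] = t_2[Y]
```

In words: two rows that agree on X can never disagree on Y.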

Page 9: Bad DB Design

Functional Dependencies, example

Motivation: “normalization,” the process where we break a relation schema into two or more schemas.

Example: ABCD with FD’s AB -> C, C -> D, and D -> A.
  Decompose into ABC, AD. What FD’s hold in ABC?
  Not only AB -> C, but also C -> A (it follows from C -> D and D -> A by transitivity)!

Example: AB -> C and C -> B.
  A = street address, B = city, C = zip code.
  There are two keys, {A,B} and {A,C}.

Page 10: Bad DB Design

Example FD

Drinkers(name, addr, drinkLiked, manf, favDrink).

Reasonable FD’s to assert:
1. name -> addr
2. name -> favDrink
3. drinkLiked -> manf

Page 11: Bad DB Design

Example FD

name     addr        drinkLiked  manf    favDrink
Janeway  Voyager     Bud         A.B.    WickedAle
Janeway  Voyager     WickedAle   Pete’s  WickedAle
Spock    Enterprise  Bud         A.B.    Bud

The repeated values follow from the FD’s:
  addr repeats for Janeway because name -> addr.
  favDrink repeats for Janeway because name -> favDrink.
  manf repeats for Bud because drinkLiked -> manf.
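The three rows above can be checked against the asserted FD’s mechanically. The following is a minimal sketch, assuming the table is held as a list of dictionaries; the function name fd_holds is hypothetical, and the column names drinkLiked and favDrink follow the Drinkers schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds in the given rows."""
    seen = {}  # maps each lhs value to the rhs value first seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False  # same lhs value, different rhs value
    return True


# The three rows from the slide.
drinkers = [
    {"name": "Janeway", "addr": "Voyager", "drinkLiked": "Bud",
     "manf": "A.B.", "favDrink": "WickedAle"},
    {"name": "Janeway", "addr": "Voyager", "drinkLiked": "WickedAle",
     "manf": "Pete's", "favDrink": "WickedAle"},
    {"name": "Spock", "addr": "Enterprise", "drinkLiked": "Bud",
     "manf": "A.B.", "favDrink": "Bud"},
]

name_addr = fd_holds(drinkers, ["name"], ["addr"])        # asserted FD
drink_manf = fd_holds(drinkers, ["drinkLiked"], ["manf"])  # asserted FD
name_manf = fd_holds(drinkers, ["name"], ["manf"])         # not asserted
```

A check like this can only refute an FD. An FD is a statement about every legal state of the table, so passing on one sample of rows does not prove it.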

Page 12: Bad DB Design

FD’s With Multiple Attributes

No need for FD’s with more than one attribute on the right.
  But it is sometimes convenient to combine FD’s as a shorthand.
  Example: name -> addr and name -> favDrink become name -> addr favDrink

More than one attribute on the left may be essential.
  Example: Restaurant Drink -> price

Page 13: Bad DB Design

Example, Cont.

Consider relation Drinkers(name, addr, drinkLiked, manf, favDrink).

{name, drinkLiked} is a superkey because together these attributes determine all the other attributes.
  name -> addr favDrink
  drinkLiked -> manf

{name, drinkLiked} is a key because neither {name} nor {drinkLiked} is a superkey.
  name doesn’t -> manf; drinkLiked doesn’t -> addr.
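The superkey and key arguments above can be reproduced with the attribute-closure algorithm: start from a set of attributes and keep adding whatever the FD’s determine. This is a sketch under the assumption that the only FD’s are the three asserted for Drinkers; the function names are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD once its whole left side is already determined.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(attrs, schema, fds):
    """A superkey determines every attribute of the schema."""
    return set(schema) <= closure(attrs, fds)


def is_key(attrs, schema, fds):
    """A key is a superkey from which no attribute can be dropped."""
    if not is_superkey(attrs, schema, fds):
        return False
    return not any(
        is_superkey(set(attrs) - {a}, schema, fds) for a in attrs
    )


schema = ["name", "addr", "drinkLiked", "manf", "favDrink"]
fds = [
    (["name"], ["addr", "favDrink"]),
    (["drinkLiked"], ["manf"]),
]

pair_is_key = is_key(["name", "drinkLiked"], schema, fds)
name_alone = is_superkey(["name"], schema, fds)
```

Dropping one attribute at a time is enough for the minimality test, because any subset of a non-superkey is also not a superkey.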

Page 14: Bad DB Design

Basic Idea

To know what FD’s hold in a projection, we start with the given FD’s and find all FD’s that follow from the given ones.

Then, restrict to those FD’s that involve only attributes of the projected schema.
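The two steps above can be sketched with attribute closures: for every subset X of the projected schema, compute what X determines under the original FD’s, then keep only the attributes that survive the projection. The example reuses the ABCD relation with AB -> C, C -> D, D -> A; the function names are hypothetical, and the brute-force loop over subsets is exponential, so this is only suitable for small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def project_fds(sub_attrs, fds):
    """Return the nontrivial FDs X -> A that hold on the projected schema.

    Each FD is a pair (X, A) where X is a sorted tuple of attributes
    and A is a single attribute not in X.
    """
    sub = sorted(sub_attrs)
    found = set()
    for size in range(1, len(sub) + 1):
        for lhs in combinations(sub, size):
            determined = closure(lhs, fds)          # step 1: all that follows
            kept = (determined & set(sub)) - set(lhs)  # step 2: restrict
            for attr in kept:
                found.add((lhs, attr))
    return found


# ABCD with AB -> C, C -> D, D -> A, projected onto ABC.
fds = [("AB", "C"), ("C", "D"), ("D", "A")]
abc_fds = project_fds("ABC", fds)
```

The result contains C -> A even though D is not in the projection. It also contains BC -> A, which is redundant given C -> A; this sketch makes no attempt to reduce the output to a minimal basis.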

Page 15: Bad DB Design

Normalization

What is normalization? Basically, it's the process of efficiently organizing data in a database.

There are two goals of the normalization process:
  Eliminate redundant data (for example, storing the same data in more than one table), and
  Ensure data dependencies make sense (only storing related data in a table).

Both of these are worthy goals, as they reduce the amount of space a database consumes and ensure that data is logically stored.

Page 16: Bad DB Design

First Normal Form (1NF)

Eliminate duplicative columns from the same table. The values in each column of a table must be atomic. By atomic we mean that there are no sets of values within a column.

Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).

Page 17: Bad DB Design

Example

Title:      Beginning MySQL Database Design and Optimization
Author:     Chad Russell, Jon Stephens
Bio:        Chad Russell is a programmer and network administrator who owns his own Internet hosting company. ...
ISBN:       1590593324
Subject:    MySQL, Database Design
Pages:      512
Publisher:  Apress

Book (ISBN, Title, Pages)
Author (Author_ID, First_Name, Last_Name)
Subject (Subject_ID, Name)
Publisher (Publisher_ID, Name, Address, City, State, Zip)

The relationship between the Book table and the Author table is a many-to-many relationship, and so is the one between Book and Subject:
Book_Author (ISBN, Author_ID)
Book_Subject (ISBN, Subject_ID)

A one-to-many relationship exists between the Book table and the Publisher table (one publisher, many books):
Book (ISBN, Title, Pages, Publisher_ID)
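To make the many-to-many relationship concrete, here is a minimal sketch of part of the schema above using Python's built-in sqlite3 module with an in-memory database. The column types, the ID values 1 and 2, and the split of the author names into first and last name are assumptions made for illustration; the Subject tables are left out to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE Publisher (
    Publisher_ID INTEGER PRIMARY KEY,
    Name TEXT NOT NULL,
    Address TEXT, City TEXT, State TEXT, Zip TEXT
);
CREATE TABLE Book (
    ISBN TEXT PRIMARY KEY,
    Title TEXT NOT NULL,
    Pages INTEGER,
    Publisher_ID INTEGER REFERENCES Publisher(Publisher_ID)
);
CREATE TABLE Author (
    Author_ID INTEGER PRIMARY KEY,
    First_Name TEXT,
    Last_Name TEXT
);
-- One row per (book, author) pair: the many-to-many link table.
CREATE TABLE Book_Author (
    ISBN TEXT REFERENCES Book(ISBN),
    Author_ID INTEGER REFERENCES Author(Author_ID),
    PRIMARY KEY (ISBN, Author_ID)
);
""")

conn.execute("INSERT INTO Publisher (Publisher_ID, Name) VALUES (1, 'Apress')")
conn.execute(
    "INSERT INTO Book VALUES (?, ?, ?, ?)",
    ("1590593324", "Beginning MySQL Database Design and Optimization", 512, 1),
)
conn.executemany(
    "INSERT INTO Author VALUES (?, ?, ?)",
    [(1, "Chad", "Russell"), (2, "Jon", "Stephens")],
)
conn.executemany(
    "INSERT INTO Book_Author VALUES (?, ?)",
    [("1590593324", 1), ("1590593324", 2)],
)

# Recover the author list of the book by joining through the link table.
authors = [
    row[0]
    for row in conn.execute(
        """SELECT a.Last_Name
           FROM Book_Author AS ba
           JOIN Author AS a ON a.Author_ID = ba.Author_ID
           WHERE ba.ISBN = ?
           ORDER BY a.Author_ID""",
        ("1590593324",),
    )
]
```

Each author is now stored once, and the comma-separated Author cell of the original row becomes two rows in Book_Author.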

Page 18: Bad DB Design

Second Normal Form (2NF)

Where the First Normal Form deals with atomicity of data, the Second Normal Form (2NF) deals with relationships between composite key columns and non-key columns:
  Meet all the requirements of the first normal form.
  Any non-key column must depend on the entire primary key. In the case of a composite primary key, this means that a non-key column cannot depend on only part of the composite key.
  Move columns that depend on only part of the key into new tables, and create relationships between these new tables and their predecessors through the use of foreign keys.

A relation R is in 2NF if every non-key attribute A in R is fully functionally dependent on the primary key.

Page 19: Bad DB Design

Example 2NF

Student  Advisor   Adv-Room  Class#
052144   Mohammed  500       Cs424
052144   Mohammed  500       Cs424
052144   Mohammed  500       Cs424
621464   Sami      501       Cs416
621464   Sami      501       Cs416
521423   Ibrahiem  215       Cs491
457215   Khalid    312       Cs412

Registration

Student  Class#
052144   Cs424
052144   Cs424
052144   Cs424
621464   Cs416
621464   Cs416
521423   Cs491
457215   Cs412

Student

Student  Advisor   Adv-Room
052144   Mohammed  500
621464   Sami      501
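The reason for the split can be checked mechanically: a table violates 2NF when a non-key column is determined by only part of the composite key. The sketch below assumes the key of the original table is {Student, Class#} and that the FD’s are Student -> Advisor and Advisor -> Adv-Room; both are read off the example rather than stated on the slide, and the function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(schema, key, fds):
    """Map each non-key column to a proper part of the key that determines it."""
    non_key = set(schema) - set(key)
    found = {}
    # Proper, non-empty subsets of the key only.
    for size in range(1, len(key)):
        for part in combinations(sorted(key), size):
            for attr in sorted(closure(part, fds) & non_key):
                found.setdefault(attr, part)
    return found


# Assumed FDs for the example (not stated explicitly in the slides).
fds = [(["Student"], ["Advisor"]), (["Advisor"], ["Adv-Room"])]

before = partial_dependencies(
    ["Student", "Advisor", "Adv-Room", "Class#"], ["Student", "Class#"], fds
)
registration = partial_dependencies(
    ["Student", "Class#"], ["Student", "Class#"], fds
)
student = partial_dependencies(
    ["Student", "Advisor", "Adv-Room"], ["Student"], fds
)
```

Under these assumptions, Advisor and Adv-Room both depend on Student alone in the original table, while the Registration and Student tables have no partial dependencies left.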

Page 20: Bad DB Design

Third Normal Form (3NF)

Remove columns that are not dependent upon the primary key.

Third Normal Form (3NF) requires that all columns depend directly on the primary key.

Example: in Publisher, City and State depend on Zip rather than directly on Publisher_ID, so they belong in a separate Zip table.
  Publisher (Publisher_ID, Name, Address, City, State, Zip)
  Zip (Zip, City, State)

Page 21: Bad DB Design

Example 3NF

Student  Advisor   Adv-Room
052144   Mohammed  500
621464   Sami      501

In the last example, Adv-Room (the advisor's office number) is functionally dependent on the Advisor attribute. The solution is to move that attribute from the Student table to the Faculty table, as shown below:

Student

Student  Advisor
052144   Mohammed
621464   Sami

Faculty

Name      Room  Dep
Mohammed  500   CS
Sami      501   IT
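The move of Adv-Room into the Faculty table can be sketched as two projections of the original rows, followed by a join to confirm that nothing was lost. The rows are the two from the example; the variable names are hypothetical, and the Dep column is left out because it cannot be derived from the original Student table.

```python
# Original Student table: (Student, Advisor, Adv-Room).
student_rows = [
    ("052144", "Mohammed", "500"),
    ("621464", "Sami", "501"),
]

# Project onto (Student, Advisor) and onto (Advisor, Adv-Room).
# Sets remove the duplicates that a projection can produce.
student = sorted({(sid, adv) for sid, adv, _ in student_rows})
faculty = sorted({(adv, room) for _, adv, room in student_rows})

# Join the two tables back on the advisor name.
room_of = dict(faculty)
rejoined = sorted((sid, adv, room_of[adv]) for sid, adv in student)

# The decomposition is lossless if the join reproduces the original rows.
lossless = rejoined == sorted(student_rows)
```

Each advisor's room is now stored once in Faculty, so changing a room means updating a single row instead of every student row that mentions the advisor.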