Normalization (Codd, 1972) Practical Information For Real World Database Design

Upload: jodie-allen

Post on 31-Dec-2015


Page 1: Normalization (Codd, 1972) Practical Information For Real World Database Design

Normalization (Codd, 1972)

Practical Information

For Real World Database Design

Requirements for Relational DB

• Table format

• Supports relational algebra
  – Select, join, project, plus 5 other operations (union, intersection, difference, Cartesian product, divide) to define queries

• Supports mathematical, relational, and logical operators (And-Or-Not)

• Codd’s Twelve Rules (abstracted)
  – Null values can be present except in primary key
  – All data represented in tables
  – Must be able to update views
  – Access data by table name, primary key value, and field name
  – Data and programs should be independent
  – Should enforce integrity and validity constraints

Normalization Defined

• Normalization
  – Purpose is to avoid potential update problems called anomalies
  – Assigns attributes to entities using 1NF, 2NF, 3NF, 4NF, 5NF

• Denormalization
  – Moving back a level to gain better performance in the real-world database; in practice 3NF is most common, but minor changes may be needed to gain efficiency and speed

Why Normalization Is Important

• If not done, updates are less efficient (larger tables, possibly more than one update per data item change)

• If not done, indexing is more cumbersome – impractical to build large databases

• If not done, no simple strategies for creating views required by users

Design Rules

• Determine Business Rules
  – A company manages many different projects
  – Each project requires the services of many employees
  – Employees may be assigned to work on more than one project
  – Each employee has a job classification
  – Many employees have the same job classification

• Translate business rules to validity constraints and relationships
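As an illustration of translating the business rules on this slide into relationships and validity constraints, here is a minimal sketch using Python's built-in sqlite3 module. All table and column names (job_class, employee, project, assignment) and the sample rows are assumptions made for the example; the slides do not name them.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not taken from the slides.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE job_class (
    job_class_id INTEGER PRIMARY KEY,
    title        TEXT NOT NULL
);
-- Each employee has one job classification; many employees may share it.
CREATE TABLE employee (
    employee_id  INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    job_class_id INTEGER NOT NULL REFERENCES job_class(job_class_id)
);
CREATE TABLE project (
    project_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
-- Many-to-many: a project uses many employees, an employee works on many projects.
CREATE TABLE assignment (
    project_id  INTEGER NOT NULL REFERENCES project(project_id),
    employee_id INTEGER NOT NULL REFERENCES employee(employee_id),
    PRIMARY KEY (project_id, employee_id)
);
""")

with conn:
    conn.executemany("INSERT INTO job_class VALUES (?, ?)",
                     [(1, "Engineer"), (2, "Analyst")])
    conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                     [(10, "Ann", 1), (11, "Bob", 1), (12, "Cy", 2)])
    conn.executemany("INSERT INTO project VALUES (?, ?)",
                     [(100, "Alpha"), (101, "Beta")])
    conn.executemany("INSERT INTO assignment VALUES (?, ?)",
                     [(100, 10), (100, 11), (101, 10), (101, 12)])


def projects_for(employee_id):
    """Names of the projects an employee is assigned to."""
    rows = conn.execute(
        "SELECT p.name FROM assignment a "
        "JOIN project p ON p.project_id = a.project_id "
        "WHERE a.employee_id = ? ORDER BY p.name", (employee_id,))
    return [r[0] for r in rows]


def employees_on(project_id):
    """Names of the employees assigned to a project."""
    rows = conn.execute(
        "SELECT e.name FROM assignment a "
        "JOIN employee e ON e.employee_id = a.employee_id "
        "WHERE a.project_id = ? ORDER BY e.name", (project_id,))
    return [r[0] for r in rows]
```

The many-to-many rule between projects and employees becomes a separate assignment table with a concatenated key, while the many-to-one rule for job classifications becomes a foreign key on employee.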

Design Rules, Continued

• Analyze documents, interview key users, etc. to develop a field list

• Determine entities to be used (see next slide for definition)

• Determine relationships between entities

• Assign the attributes to the entities

• Identify primary and foreign keys

• Check for 1NF, 2NF, and 3NF

Definitions

• Entity – the subject to be modeled by the database file (table or relation)

• Primary Key – the field (or combination of fields) whose value uniquely identifies the entity entry (row, tuple, record); all other attributes are functionally dependent on it; can’t be null

• Foreign Key – the field (attribute, column) that relates the table to another table by holding a value of that table's primary key

• Functional Dependence – one attribute determines another; e.g. advisor name depends on advisor ID because each advisor ID is associated with exactly one advisor name
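Functional dependence can be tested against sample data. The sketch below is one possible check in plain Python; the function name holds and the sample student rows are assumptions for illustration. Note that sample data can only disprove a dependency: a dependency that happens to hold in the rows at hand still has to be confirmed against the business rules.

```python
def holds(rows, determinant, dependent):
    """Return True if `determinant -> dependent` holds in `rows`.

    `rows` is a list of dicts; `determinant` and `dependent` are lists of
    field names. The dependency fails as soon as one determinant value is
    seen with two different dependent values.
    """
    seen = {}
    for row in rows:
        key = tuple(row[field] for field in determinant)
        value = tuple(row[field] for field in dependent)
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data: two different advisors share the name "Lee".
students = [
    {"student_id": 1, "advisor_id": "A1", "advisor_name": "Lee"},
    {"student_id": 2, "advisor_id": "A1", "advisor_name": "Lee"},
    {"student_id": 3, "advisor_id": "A2", "advisor_name": "Lee"},
]

id_determines_name = holds(students, ["advisor_id"], ["advisor_name"])
name_determines_id = holds(students, ["advisor_name"], ["advisor_id"])
```

Here advisor ID determines advisor name, but not the other way round, which is why the ID rather than the name is the determinant.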

Definitions, Continued

• Views
  – Selected group of records (select)
  – Selected group of fields (project)
  – Selected group of records and fields from two or more tables (join)
  – A query
  – A report
  – A set of labels
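The three kinds of view listed above (select, project, join) can be sketched as SQL views. The example uses Python's built-in sqlite3 module; the faculty and course tables, their columns, and the view names are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical tables and data for the illustration.
CREATE TABLE faculty (faculty_id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
CREATE TABLE course  (catalog_no TEXT PRIMARY KEY, title TEXT, faculty_id INTEGER);
INSERT INTO faculty VALUES (1, 'Lee', 'MATH'), (2, 'Kim', 'CS');
INSERT INTO course  VALUES ('CS101', 'Intro', 2), ('MA201', 'Calculus', 1);

-- Select: a chosen group of records.
CREATE VIEW cs_faculty AS
    SELECT * FROM faculty WHERE dept = 'CS';

-- Project: a chosen group of fields.
CREATE VIEW faculty_names AS
    SELECT name FROM faculty;

-- Join: records and fields drawn from two tables.
CREATE VIEW course_list AS
    SELECT c.catalog_no, c.title, f.name
    FROM course c JOIN faculty f ON f.faculty_id = c.faculty_id;
""")

select_rows = conn.execute("SELECT * FROM cs_faculty").fetchall()
project_rows = conn.execute(
    "SELECT name FROM faculty_names ORDER BY name").fetchall()
join_rows = conn.execute(
    "SELECT * FROM course_list ORDER BY catalog_no").fetchall()
```

A report or a set of labels would typically be produced from the rows such a view returns.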

Definitions, Continued

• Determinant – an attribute that determines the value of another attribute; e.g. primary key

• Indexes
  – Tables that contain only record numbers, arranged in an order based on some field value

• Entity Integrity
  – Every table must have a field that uniquely identifies each record, and there must be a field value for every record

• Referential Integrity
  – If a record has a value in a foreign key field, it must match an existing value in the original table to which it is linked
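Entity and referential integrity are usually enforced by the database itself. The following is a minimal sketch with Python's built-in sqlite3 module; the department and student tables and the helper is_rejected are assumptions for the example. SQLite only checks foreign keys after PRAGMA foreign_keys = ON, and it needs an explicit NOT NULL on a text primary key to reject nulls.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
-- Hypothetical tables for the illustration.
CREATE TABLE department (
    dept_id TEXT PRIMARY KEY NOT NULL,
    name    TEXT NOT NULL
);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    dept_id    TEXT REFERENCES department(dept_id)
);
INSERT INTO department VALUES ('CS', 'Computer Science');
INSERT INTO student VALUES (1, 'CS');
""")


def is_rejected(sql):
    """Return True if SQLite refuses the statement with an integrity error."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Entity integrity: the key must be unique and must have a value.
duplicate_key = is_rejected("INSERT INTO department VALUES ('CS', 'Duplicate')")
missing_key = is_rejected("INSERT INTO department VALUES (NULL, 'No key')")

# Referential integrity: a foreign key value must match an existing key.
unmatched_fk = is_rejected("INSERT INTO student VALUES (2, 'XX')")

# A foreign key with no value at all is allowed by the rule as stated.
null_fk = is_rejected("INSERT INTO student VALUES (3, NULL)")
```

The last insert succeeds because the rule only constrains records that actually have a value in the foreign key field.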

1st Normal Form

• Table does not contain repeating groups
  – Each field holds a single value, and each record has at least one field (or combination of fields) that differentiates it from every other record in the file; e.g. a unique primary key

Examples:

• Faculty ID is primary key and the same faculty ID is associated with two or more courses
  – Solve by creating a course file

• Faculty ID is primary key and the same faculty ID is associated with two or more offices
  – Solve by redesigning database to include offices as a separate table
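The first example, moving a repeating group of courses out of the faculty record into a course file, might look like the sketch below. The record layout, field names, and the function to_first_normal_form are assumptions made for the illustration.

```python
# Unnormalized: each faculty record carries a repeating group of courses.
faculty_unnormalized = [
    {"faculty_id": 1, "name": "Lee", "courses": ["MA201", "MA305"]},
    {"faculty_id": 2, "name": "Kim", "courses": ["CS101"]},
]


def to_first_normal_form(records):
    """Split the repeating group into its own file, one row per course."""
    faculty = [
        {"faculty_id": r["faculty_id"], "name": r["name"]}
        for r in records
    ]
    course = [
        {"faculty_id": r["faculty_id"], "catalog_no": catalog_no}
        for r in records
        for catalog_no in r["courses"]
    ]
    return faculty, course


faculty, course = to_first_normal_form(faculty_unnormalized)
```

After the split every field holds a single value: faculty is keyed by Faculty ID, and each row of the course file is identified by Faculty ID and catalog number together.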

2nd Normal Form

• Table must be in 1st Normal Form

• No non-key attribute is dependent on only part of the concatenated key
  – Concatenated key: two or more fields taken together represent the primary key
  – In the course table, the concatenated key is Faculty ID and Catalog No; every other field in the table must be dependent on both Faculty ID and Catalog No
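A partial dependency of the kind 2nd Normal Form forbids can be looked for mechanically in sample data. The sketch below is one way to do it in plain Python; the function names, the fields title and section_room, and the sample rows are assumptions for the example, and a dependency found in sample data still needs confirming against the business rules.

```python
from itertools import combinations


def determines(rows, lhs, rhs):
    """True if the fields in `lhs` determine the single field `rhs` in `rows`."""
    seen = {}
    for row in rows:
        key = tuple(row[field] for field in lhs)
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True


def partial_dependencies(rows, key, non_key):
    """List (part_of_key, attribute) pairs where only part of the key is needed."""
    found = []
    for size in range(1, len(key)):          # proper subsets of the key only
        for part in combinations(key, size):
            for attribute in non_key:
                if determines(rows, part, attribute):
                    found.append((part, attribute))
    return found


# Hypothetical course table with concatenated key (faculty_id, catalog_no).
course = [
    {"faculty_id": 1, "catalog_no": "MA201", "title": "Calculus", "section_room": "B10"},
    {"faculty_id": 2, "catalog_no": "MA201", "title": "Calculus", "section_room": "B12"},
    {"faculty_id": 1, "catalog_no": "MA305", "title": "Algebra", "section_room": "B12"},
]

violations = partial_dependencies(
    course, ("faculty_id", "catalog_no"), ["title", "section_room"])
```

In this sample the title depends on the catalog number alone, so it belongs in a separate table keyed by catalog number; the room depends on both parts of the key and can stay.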

Anomalies Avoided By 2NF

• Only have one data item to change when update is made

• Avoids lost data when deletes are made
  – When a part number is deleted, could lose reference to invoice

• Avoids being unable to add data that lacks part of the key
  – How do you add a new course when there is no associated faculty ID?
  – A new office with no assigned faculty?

• Avoids inconsistent data
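The anomalies listed on this slide can be reproduced on a small table that is not in 2nd Normal Form. This is a sketch using Python's built-in sqlite3 module; the teaching table, its columns, and the sample rows are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table that is NOT in 2NF: title depends on catalog_no alone.
CREATE TABLE teaching (
    faculty_id INTEGER NOT NULL,
    catalog_no TEXT    NOT NULL,
    title      TEXT    NOT NULL,
    PRIMARY KEY (faculty_id, catalog_no)
);
INSERT INTO teaching VALUES (1, 'MA201', 'Calculus'), (2, 'MA201', 'Calculus');
""")

# Insert anomaly: a new course with no faculty ID cannot be stored at all.
try:
    with conn:
        conn.execute("INSERT INTO teaching VALUES (NULL, 'CS101', 'Intro')")
    insert_blocked = False
except sqlite3.IntegrityError:
    insert_blocked = True

# Update anomaly: renaming one course changes one row per faculty member.
with conn:
    rows_changed = conn.execute(
        "UPDATE teaching SET title = 'Calculus I' WHERE catalog_no = 'MA201'"
    ).rowcount

# Delete anomaly: removing the faculty assignments also erases the course.
with conn:
    conn.execute("DELETE FROM teaching WHERE faculty_id IN (1, 2)")
course_still_known = conn.execute(
    "SELECT COUNT(*) FROM teaching WHERE catalog_no = 'MA201'"
).fetchone()[0] > 0
```

With the title moved to its own table keyed by catalog number, the rename would touch a single row and the course would survive the deletion of its assignments.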

3rd Normal Form

• The only determinants of non-key attributes are candidate keys
  – Candidate keys in the student file are social security number and patron ID (both are unique)

• To put it another way, there are no transitive dependencies
  – If the student file contains Dept ID (foreign key) and department name, this is a transitive dependency: department name depends on Dept ID, which in turn depends on the student key
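The transitive dependency in the student example is removed by keeping only Dept ID in the student file and moving department name to a department table. The sketch below uses Python's built-in sqlite3 module; table names, column names, and sample rows are assumptions for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical 3NF design: department name lives only in department.
CREATE TABLE department (
    dept_id   TEXT PRIMARY KEY,
    dept_name TEXT NOT NULL
);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    dept_id    TEXT REFERENCES department(dept_id)
);
INSERT INTO department VALUES ('CS', 'Computing');
INSERT INTO student VALUES (1, 'Ann', 'CS'), (2, 'Bob', 'CS');
""")

# Renaming the department is a single-row update, however many students it has.
with conn:
    rows_changed = conn.execute(
        "UPDATE department SET dept_name = 'Computer Science' "
        "WHERE dept_id = 'CS'"
    ).rowcount

# The name is recovered for each student through a join on the foreign key.
student_departments = conn.execute(
    "SELECT s.name, d.dept_name FROM student s "
    "JOIN department d ON d.dept_id = s.dept_id ORDER BY s.name"
).fetchall()
```

Had department name been stored in the student file, the rename would have needed one update per student, with the risk of leaving some rows inconsistent.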

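4th and 5th Normal Form

• 4th Normal Form
  – There are no multivalued dependencies other than on a candidate key; is like Boyce-Codd normal form, but for multivalued rather than functional dependencies

• 5th Normal Form
  – Holds only theoretical interest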
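A multivalued dependency shows up when one table records two independent facts about the same key, for example the courses a faculty member teaches and the offices that faculty member uses. The sketch below, in plain Python, splits such a table in two and checks that joining the parts gives back exactly the original rows; the table contents and function names are assumptions made for the illustration.

```python
# Hypothetical table (faculty_id, catalog_no, office): courses and offices are
# independent of each other, so every combination must be stored.
faculty_facts = {
    (1, "MA201", "B10"), (1, "MA201", "B12"),
    (1, "MA305", "B10"), (1, "MA305", "B12"),
    (2, "CS101", "C3"),
}


def decompose(rows):
    """Split into one table per independent fact about the faculty member."""
    teaches = {(faculty_id, catalog_no) for faculty_id, catalog_no, _ in rows}
    offices = {(faculty_id, office) for faculty_id, _, office in rows}
    return teaches, offices


def rejoin(teaches, offices):
    """Natural join of the two tables on faculty_id."""
    return {
        (faculty_id, catalog_no, office)
        for faculty_id, catalog_no in teaches
        for other_id, office in offices
        if faculty_id == other_id
    }


teaches, offices = decompose(faculty_facts)
lossless = rejoin(teaches, offices) == faculty_facts
```

Because the join reproduces the original rows, nothing is lost by the split, and adding a course for a faculty member becomes one new row instead of one row per office.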