Project and Data Management Software 1
Project and Data Management Software
Data Analysis and Data Modelling
Normalisation
Normalisation
Normalisation provides an algorithm for reducing complex data structures into simple structures
Formalised by a set of rules (the normal forms) known as Codd’s laws
Tidying up the data so there is no data redundancy
Ensuring data is grouped logically
Why Use Normalisation?
Relations formed by the process make the data easier to understand and manipulate.
Provides a stable base for future database growth.
Simplifies relations and reduces anomalies.
Stages of Normalisation
There are 3 main stages:
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF) and Fourth Normal Form (4NF) also exist
First Normal Form – 1NF
For a relation to be in 1NF all its attributes must be atomic
Each attribute must contain a single value, not a repeating group of values
Every non-primary-key attribute must be functionally dependent on the primary key
Un-normalised data
Course Code
Course Desc
Employee Number
Name
Block
Room No
Date Joined Course
Allocated Hours
Un-normalised data
A list of fields needed for the system
E.g. Staff Development Course
All staff are released for two hours a week for staff development
Employees work at their own pace in a lab
A total of six attributes are recorded about each employee, including their normal office location (block and room), the date they joined the course and how many hours they are planned to work on it
First Normal Form (1NF)
An entity is in 1NF if, and only if, it has an identifying key and there are no repeating attributes or groups of attributes
To get to 1NF we must remove all repeating groups (data elements)
Our Example
COURSE: Course Code, Course Desc
EMP_ON_COURSE: Course Code, Employee Number, Name, Block, Room No, Date Joined Course, Allocated Hours
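The move to 1NF can be sketched in Python. This is only an illustration: the nested record layout, the field names and the sample values are all hypothetical, not part of the original example. It shows the repeating group of employees being split out into its own relation, with each row carrying the course key.

```python
# Hypothetical un-normalised record: one course holding a repeating group of employees.
unnormalised = [
    {
        "course_code": "C1",
        "course_desc": "Spreadsheets",
        "employees": [
            {"emp_no": 101, "name": "Ann", "block": "A", "room_no": 12,
             "date_joined": "2024-01-08", "allocated_hours": 20},
            {"emp_no": 102, "name": "Bob", "block": "B", "room_no": 30,
             "date_joined": "2024-01-15", "allocated_hours": 16},
        ],
    },
]


def to_1nf(records):
    """Split the repeating group into its own relation, carrying the course key."""
    course, emp_on_course = [], []
    for rec in records:
        course.append({"course_code": rec["course_code"],
                       "course_desc": rec["course_desc"]})
        for emp in rec["employees"]:
            # Each employee row repeats the course code so it can be linked back.
            emp_on_course.append({"course_code": rec["course_code"], **emp})
    return course, emp_on_course


course, emp_on_course = to_1nf(unnormalised)
```

After the split, every value in both relations is atomic, and EMP_ON_COURSE is identified by Course Code plus Employee Number.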
Second Normal Form (2NF)
An entity is in 2NF if, and only if, it is in 1NF and has no attributes which require only part of the key to identify them uniquely
To get to 2NF we remove part-key dependencies
All non-key data items must be dependent on the whole primary key
Our Example
Course is already in 2NF. Emp_On_Course is not, because:
Attribute      Depends On
Name           Employee No
Block          Employee No
Room No        Employee No
Date Joined    Employee No + Course Code
Hours          Employee No + Course Code
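One way to test a suspected dependency against sample data is sketched below. The helper `determines` and the sample rows are hypothetical. Note that sample data can only disprove a dependency; a dependency that holds in the sample still has to be confirmed against the business rules.

```python
def determines(rows, determinant, dependent):
    """Return True if each determinant value maps to one dependent value in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[attr] for attr in determinant)
        value = row[dependent]
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical EMP_ON_COURSE rows (1NF), with employee 101 on two courses.
rows = [
    {"course_code": "C1", "emp_no": 101, "name": "Ann", "date_joined": "2024-01-08"},
    {"course_code": "C2", "emp_no": 101, "name": "Ann", "date_joined": "2024-03-04"},
    {"course_code": "C1", "emp_no": 102, "name": "Bob", "date_joined": "2024-01-15"},
]

# Name is fixed by part of the key alone: a part-key dependency.
name_needs_only_emp_no = determines(rows, ["emp_no"], "name")
# Date Joined is not fixed by Employee No alone; it needs the whole key.
date_needs_only_emp_no = determines(rows, ["emp_no"], "date_joined")
date_needs_full_key = determines(rows, ["emp_no", "course_code"], "date_joined")
```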
So we...
Move the details that depend only on the employee into a separate table
If in any doubt, ask a question such as ‘Are these fields affected when they join a course?’
Attribute      Depends On
Name           Employee No
Block          Employee No
Room No        Employee No
Cont.
COURSE: Course Code, Course Desc
EMP_ON_COURSE: Course Code, Emp No, Date Joined Course, Allocated Hours
EMPLOYEE: Emp No, Name, Block, Room No
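The 2NF split amounts to projecting the 1NF relation onto two sets of attributes and dropping the duplicate rows. A minimal sketch follows, assuming hypothetical sample rows and a hypothetical helper named `project`.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicates while keeping first-seen order."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[attr] for attr in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out


# Hypothetical EMP_ON_COURSE rows in 1NF: employee 101 appears on two courses.
rows_1nf = [
    {"course_code": "C1", "emp_no": 101, "name": "Ann", "block": "A",
     "room_no": 12, "date_joined": "2024-01-08", "allocated_hours": 20},
    {"course_code": "C2", "emp_no": 101, "name": "Ann", "block": "A",
     "room_no": 12, "date_joined": "2024-03-04", "allocated_hours": 10},
    {"course_code": "C1", "emp_no": 102, "name": "Bob", "block": "B",
     "room_no": 30, "date_joined": "2024-01-15", "allocated_hours": 16},
]

# Attributes that depend only on Emp No move to EMPLOYEE.
employee = project(rows_1nf, ["emp_no", "name", "block", "room_no"])
# Attributes that need the whole key stay in EMP_ON_COURSE.
emp_on_course = project(
    rows_1nf, ["course_code", "emp_no", "date_joined", "allocated_hours"]
)
```

In this sample the employee details for 101 are stored once in EMPLOYEE instead of once per course.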
Problems
Block and Room Number are related, so if one is updated the other may have to change as well.
If a block name changes, every employee record for that block will have to be altered.
Third Normal Form (3NF)
An entity is in 3NF if, and only if, it is in 2NF and no non-key attribute depends on another non-key attribute.
To get to 3NF we must remove attributes that depend on other non-key attributes
It removes any mutual dependence between non-key attributes
Third Normal Form 3NF
In other words:
“The attributes in a relation in 3NF must depend on the key, the whole key and nothing but the key!”
How to do that: Dependency
Decide on the direction of the dependency between the attributes
If B determines A, then A is dependent on B
If A depends on B, create a new entity, keyed by B, with A as an attribute
Leave B in the original entity and mark it as a foreign key, but remove A from the original entity
Our Example: Dependency
If, given a value for A, there is only one possible value for B, then B is dependent on A
Given a value for Room No, there is only one value for Block. The same is not true vice versa.
Hence Block is dependent on Room No
Leave Room No in the original entity and mark it as a foreign key, but remove Block from the original entity
Our Example
Hence the EMPLOYEE (2NF) entity becomes
EMPLOYEE: Employee No, Name, Room No *
LOCATION: Room No, Block
* Room No is a foreign key in the Employee entity
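The resulting pair of entities can be tried out with Python's built-in `sqlite3` module. This is a sketch under stated assumptions: the table names, column names and sample rows are hypothetical. It shows that once Block lives only in LOCATION, renaming a block touches no employee rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE location (
    room_no INTEGER PRIMARY KEY,
    block   TEXT NOT NULL
);
CREATE TABLE employee (
    emp_no  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    room_no INTEGER NOT NULL REFERENCES location(room_no)  -- foreign key
);
""")

# Hypothetical sample data: two rooms in block A, one in block B.
conn.executemany("INSERT INTO location VALUES (?, ?)",
                 [(12, "A"), (14, "A"), (30, "B")])
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(101, "Ann", 12), (102, "Bob", 30), (103, "Cy", 14)])

# Renaming block A only changes LOCATION; EMPLOYEE is untouched.
conn.execute("UPDATE location SET block = 'East' WHERE block = 'A'")

# The original 2NF view of an employee is recovered with a join.
rows = conn.execute("""
    SELECT e.emp_no, e.name, e.room_no, l.block
    FROM employee AS e
    JOIN location AS l ON l.room_no = e.room_no
    ORDER BY e.emp_no
""").fetchall()
```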
Entity Relationship Modelling
[Diagram: the four entities Course, Emp_On_Course, Employee and Location]
Background - Keys
Primary key
Unique identifier
Can be made up of more than one attribute, and is then called a composite key
If there is no obvious choice, use a number
Foreign key
Does not belong to the entity
Used to relate entity to entity
A primary key in another table
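Both kinds of key can be seen at work in a small `sqlite3` sketch. The schema and sample values are hypothetical, and `try_insert` is a helper made up for this illustration. EMP_ON_COURSE has a composite primary key, and each half of that key is also a foreign key into another table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE course (course_code TEXT PRIMARY KEY, course_desc TEXT);
CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE emp_on_course (
    course_code     TEXT    NOT NULL REFERENCES course(course_code),
    emp_no          INTEGER NOT NULL REFERENCES employee(emp_no),
    date_joined     TEXT,
    allocated_hours INTEGER,
    PRIMARY KEY (course_code, emp_no)  -- composite key
);
""")
conn.execute("INSERT INTO course VALUES ('C1', 'Spreadsheets')")
conn.execute("INSERT INTO employee VALUES (101, 'Ann')")
conn.execute("INSERT INTO emp_on_course VALUES ('C1', 101, '2024-01-08', 20)")


def try_insert(row):
    """Attempt an insert and report either 'ok' or the constraint that failed."""
    try:
        conn.execute("INSERT INTO emp_on_course VALUES (?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)


# Same (course_code, emp_no) pair again: rejected by the composite primary key.
duplicate_key = try_insert(("C1", 101, "2024-02-01", 10))
# Employee 999 does not exist: rejected by the foreign key.
unknown_employee = try_insert(("C1", 999, "2024-02-01", 10))
```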