normalization in dbms: 1nf, 2nf, 3nf and bcnf in database 07... · 2021. 2. 15. · instances of an...
TRANSCRIPT
DATABASE II
Computer science department
Second stage
Second semester
2020-2021
Asst.lecturer Wafaa Mustafa
1
DBMS KEYS
• KEYS in DBMS is an attribute or set of attributes which
helps you to identify a row(tuple) in a relation(table).
• They allow you to find the relation between two tables.
• Keys help you uniquely identify a row in a table by
a combination of one or more columns in that table.
• Key is also helpful for finding unique record or row from the
table
• Database key is also helpful for finding unique record or row
from the table.
3
Types of Keys in Database Management System
• Super Key - A super key is a group of single or multiple keys
which identifies rows in a table.
• Primary Key - is a column or group of columns in a table that uniquely identify every row in that table.
• Candidate Key - is a set of attributes that uniquely identify tuples in a table. Candidate Key is a super key with no repeated attributes.
• Foreign Key - is a column that creates a relationship between two tables. The purpose of Foreign keys is to maintain data integrity and allow navigation between two different instances of an entity.
4
Emp-SSN Emp-Num Emp-name
9812345098 AB05 Shown
9876512345 AB06 Roslyn
199937890 AB07 James
5
SUPERKEY is a group of single or multiple keys which identifies
rows in a table.
A Super key may have additional attributes that are not needed
for unique identification.
Example:
6
PRIMARY KEY is a column or group of columns in a table
that uniquely identify every row in that table. The Primary Key
can't be a duplicate meaning the same value can't appear more
than once in the table. A table cannot have more than one
primary key.
Rules for defining Primary key:
•Two rows can't have the same primary key value
•It must for every row to have a primary key value.
•The primary key field cannot be null.
•The value in a primary key column can never be modified or
updated if any foreign key refers to that primary key.
7
Example: In the following table,( StudID) is a Primary Key.
• CANDIDATE KEY is a set of attributes that uniquely identify tuples in a table. Candidate Key is a super key with no repeated attributes. The Primary key should be selected from the candidate keys.
• Every table must have at least a single candidate key.
• A table can have multiple candidate keys but only a single primary key.
• Properties of Candidate key:
• It must contain unique values;
• Candidate key may have multiple attributes;
• Must not contain null values;
• It should contain minimum fields to ensure uniqueness;
• Uniquely identify each record in a table.
8
.
• Example: In the given table Stud ID, Roll No, and email are
candidate keys which help us to uniquely identify the student
record in the table
9
StudID Roll No First Name LastName Email
1 11 Tom Price [email protected]
2 12 Nick Wright [email protected]
3 13 Dana Natan [email protected]
candidate key
primary key
• FOREIGN KEY is a column that creates a relationship
between two tables. The purpose of Foreign keys is to maintain
data integrity and allow navigation between two different
instances of an entity. It acts as a cross-reference between two
tables as it references the primary key of another table.
10
DeptCode DeptName
001 Science
002 English
005 Computer
Example:
Teacher
ID
Fname Lname
B002 David Warner
B017 Sara Joseph
B009 Mike Brunton
• In this key in dbms example, we have two table, teach and
department in a school. However, there is no way to see which
search work in which department.
• In this table, adding the foreign key in Deptcode to the Teacher
name, we can create a relationship between the two tables.
11
Teacher ID DeptCode Fname Lname
B002 002 David Warner
B017 002 Sara Joseph
B009 001 Mike Brunton
This concept is also known as Referential Integrity.
12
RELATIONSHIP
13
Relationship
• Relationship The association among entities is called a
relationship. For example, an employee works at a
department, a student enrolls in a course. Here, Works at
and Enrolls are called relationships.
Asst lecturer .Wafaa Mustafa 14
One-to-one:
• In a one-to-one relationship, one record in a table is
associated with one and only one record in another table.
For example, in a school database, each student has
only one student ID, and each student ID is assigned to
only one person.
Asst lecturer .Wafaa Mustafa 15
Types of Relationship
Types of Relationship cont.
Asst lecturer .Wafaa Mustafa 16
One-to-many:
O In a one-to-many relationship, one record in a table
can be associated with one or more records in another
table. For example, each customer can have many sales
orders.
Types of Relationship cont.
Asst lecturer .Wafaa Mustafa 17
Many-to-one:
O More than one entities from entity set A can be
associated with at most one entity of entity set B,
however an entity from entity set B can be associated
with more than one entity from entity set A.
A University database let us consider two entity sets
Student and Course. Assume that a student can
register many courses that are offered and a course may be
registered by many students per semester. Then
the relationship between Student and Course will be
many-to-many.
Types of Relationship cont.
Asst lecturer .Wafaa Mustafa 18
O Many-to-many: One entity from A can be associated
with more than one entity from B and vice versa.
A typical example of a many-to many relationship is
one between students and classes. A student can register
for many classes, and a class can include many students.
Asst lecturer .Wafaa Mustafa 19
One customer may
place many orders,
but each order is
placed by a single
customer
One-to-many
relationship
Asst lecturer .Wafaa Mustafa 20
One order has many order
lines; each order line is
associated with a single
order
One-to-many relationship
Asst lecturer .Wafaa Mustafa 21
One product can be in
many order lines, each
order line refers to a
single product
One-to-many
relationship
Asst lecturer .Wafaa Mustafa 22
Product Supplier
Product
Stock Warehouse Purchase
Order
Purchase
Order Item
Co
nta
ins
Sto
cked
as
Ordered on Supplies
Holds
Asst lecturer .Wafaa Mustafa 23
Relationships established in special columns that provide links between
tables
NORMALIZATION IN
DBMS: 1NF, 2NF, 3NF
Normalization
• is a process of organizing the data in database to avoid
data redundancy, insertion anomaly, update anomaly &
deletion anomaly. Let’s discuss about anomalies first then
we will discuss normal forms with examples
25
• There are three types of anomalies that occur when the
database is not normalized. These are – Insertion, update
and deletion anomaly. Let’s take an example to
understand this.
• Example: Suppose a manufacturing company stores the
employee details in a table named employee that has four
attributes: emp_id for storing employee’s id, emp_name
for storing employee’s name, emp_address for storing
employee’s address and emp_dept for storing the
department details in which the employee works. At some
point of time the table looks like this:
26
emp_id emp_name emp_address emp_dept
101 Rick Delhi D001
101 Rick Delhi D002
123 Maggie Agra D890
166 Glenn Chennai D900
166 Glenn Chennai D004
27
The above table is not normalized. We will see the
problems that we face when a table is not normalized:
Update anomaly: In the above table we have two rows
for employee Rick as he belongs to two departments of
the company. If we want to update the address of Rick
then we have to update the same in two rows or the data
will become inconsistent. If somehow, the correct address
gets updated in one department but not in other then as
per the database, Rick would be having two different
addresses, which is not correct and would lead to
inconsistent data.
28
• Insert anomaly: Suppose a new employee joins the
company, who is under training and currently not assigned
to any department then we would not be able to insert the
data into the table if emp_dept field doesn’t allow nulls.
• Delete anomaly: Suppose, if at a point of time the
company closes the department D890 then deleting the
rows that are having emp_dept as D890 would also delete
the information of employee Maggie since she is assigned
only to this department.
29
• To overcome these anomalies we need to normalize the
data. In the next section we will discuss about
normalization.
• Normalization
• Here are the most commonly used normal forms:
• First normal form(1NF)
• Second normal form(2NF)
• Third normal form(3NF)
30
First normal form (1NF)
• As per the rule of first normal form,
an attribute (column) of a table cannot hold multiple values.
It should hold only atomic values.
31
• Example: Suppose a company wants to store the names
and contact details of its employees. It creates a table that
looks like this:
emp_id emp_name emp_address emp_mobile
101 Herschel New Delhi 8912312390
102 Jon Kanpur 8812121212 9900012222
103 Ron Chennai 7778881212
104 Lester Bangalore 9990000123 8123450987
32
• Two employees (Jon & Lester) are having two mobile
numbers so the company stored them in the same field as
you can see in the table below.
33
• This table is not in 1NF as the rule says
“each attribute of a table must have atomic (single) values”,
• the (emp_mobile) values for employees Jon & Lester
violates that rule
34
• To make the table complies with 1NF we should have the
data like this:
emp_id emp_name emp_address emp_mobile
101 Herschel New Delhi 8912312390
102 Jon Kanpur 8812121212
102 Jon Kanpur 9900012222
103 Ron Chennai 7778881212
104 Lester Bangalore 9990000123
104 Lester Bangalore 8123450987
35
Second normal form (2NF)
A table is said to be in 2NF if both the following
conditions hold:
• Table is in 1NF (First normal form)
• No non-prime attribute is dependent on the proper subset
of any candidate key of table.
• An attribute that is not part of any candidate key is known
as non-prime attribute.
36
• Example: Suppose a school wants to store the data of
teachers and the subjects they teach. They create a table
that looks like this: Since a teacher can teach more than
one subjects, the table can have multiple rows for a same
teacher.
teacher_id subject teacher_age
111 Math 38
111 Physics 38
222 Biology 38
333 Physics 40
333 Chemistry 40
Candidate Keys:
{teacher_id, subject}
Non prime attribute:
teacher_age
37
• The table is in 1 NF because each attribute has atomic
values
• However, it is not in 2NF because non-prime attribute
(teacher_age) is dependent on teacher_id alone.
• This violates the rule for 2NF as the rule says
• “NO non-prime attribute is dependent on the proper
subset of any candidate key of the table”.
38
• To make the table complies with 2NF we can break it in two
tables like this:
• teacher_subject table: teacher_details table:
teacher_id teacher_age
111 38
222 38
333 40
teacher_id subject
111 Maths
111 Physics
222 Biology
333 Physics
333 Chemistry
Now the tables comply with Second normal form (2NF).
39
Third Normal form (3NF)
A table design is said to be in 3NF if both the following conditions hold:
• Table must be in 2NF
- An attribute that is not part of any candidate key is known as non-prime attribute.
• A table is in 3NF if
• it is in 2NF and
• at least one of the following conditions hold:
• X→ Y
• X is a super key of table
• Y is a prime attribute of table
An attribute that is a part of one of the candidate keys is known as prime attribute.
40
• Example: Suppose a company wants to store the complete address of each employee, they create a table named employee_details that looks like this:
emp_id emp_name emp_zip emp_state emp_city emp_district
1001 John 282005 UP Agra Dayal Bagh
1002 Ajeet 222008 TN Chennai M-City
1006 Lora 282007 TN Chennai Urrapakkam
1101 Lilly 292008 UK Pauri Bhagwan
1201 Steve 222999 MP Gwalior Ratan
41
Super keys: {emp_id}, {emp_id, emp_name},
{emp_id, emp_name, emp_zip}…so on
Candidate Keys: {emp_id}
• Non-prime attributes: all attributes except emp_id are non-prime as they are not part of candidate key.
• Here, (emp_state, emp_city & emp_district )dependent on (emp_zip).
• And, (emp_zi)p is dependent on (emp_id) that makes
non-prime attributes (emp_state, emp_city & emp_district)
transitively dependent on super key (emp_id). This violates
the rule of 3NF.
• To make this table complies with 3NF we have to break the table into two tables to remove the transitive dependency:
42
• Employee table:
emp_id emp_name emp_zip
1001 John 282005
1002 Ajeet 222008
1006 Lora 282007
1101 Lilly 292008
1201 Steve 222999
43
• employee_zip table:
emp_zip emp_state emp_city emp_district
282005 UP Agra Dayal Bagh
222008 TN Chennai M-City
282007 TN Chennai Urrapakkam
292008 UK Pauri Bhagwan
222999 MP Gwalior Ratan
44
RELATIONAL ALGEBRA
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69