info 340 lecture 6 functional dependency, normalization
TRANSCRIPT
INFO 340
Lecture 6Functional Dependency,
Normalization
DeMorgan’s Theorem
• A AND B = A OR B
• A OR B = A AND B
• For the following questions, use DeMorgan’s theorem to write 2 alternative versions of the where clause. Use the relations Fig 3.3 (pg 80) in the book:
• From the staff table, which staff are male or have a birthdate after Jan 1, 1952 ?
• What are the clientno’s and maxrents where the preference type is a flat and the telephone # has a 0 as a first digit ?
• Which clients had viewed property before May 15, 2004 and preferred a house?
“Spreadsheet Syndrome”?
• When you use a spreadsheet program, you only really have one table.
• This leads to duplication of data.
Normalization
• Goal: Every non-key column is directly dependent on the key, the whole key, and nothing but the key!
• Goal: Reduce redundancies, less anomalies, and improve efficiency.
Functional Dependency & Normalization
• How normalization can be used when designing a relational database.
• The potential problems associated with redundant data in base relations.
• The concept of functional dependency, which describes the relationship between attributes.
• The characteristics of functional dependencies used in normalization.
Functional Dependency & Normalization
• How to identify the most commonly used normal forms, namely First Normal Form (1NF), Second Normal Form (2NF), and Third Normal Form (3NF).
Functional Dependency & Normalization
• Characteristics of a suitable set of relations include: – the minimal number of attributes necessary to
support the data requirements of the enterprise;– attributes with a close logical relationship are
found in the same relation;– minimal redundancy with each attribute
represented only once with the important exception of attributes that form all or part of foreign keys.
What happens if normalization hasn’t occurred?
• Data duplication• Multiple truths• Inability to separate entities• Difficulty to query
Functional Dependency
• A B if for each value of A there is associated exactly one value of B.
• Or think of it mathematically: F(A) = B, A B• Example: Name Date of Birth. That is for a
given person’s name, there is only one DOB that is associated with them. On the other hand, a date does not functionally determine a person. Many people can be born on a given day.
Full functional dependency
• Functional dependencies can work on sets, for example: (student_no, course_no) instructor is valid.
• A fully functional dependency is when you can not remove items from the first set (the A in AB) and maintain a functional dependency.
• The above example is not a full functional dependency, because course_no instructor is a functional dependency.
Transitive Dependency
• Important to recognize a transitive dependency because its existence in a relation can potentially cause update anomalies.
• Transitive dependency describes a condition where A, B, and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A via B (provided that A is not functionally dependent on B or C).
Transitive Dependency Example
• Consider functional dependencies in the StaffBranch relation
staffNo → sName, position, salary, branchNo, bAddress branchNo → bAddress
Transitive dependency, branchNo → bAddress exists on staffNo via branchNo.
Functional Dependency & Normalization
• Main characteristics of functional dependencies used in normalization:– There is a one-to-one relationship between the
attribute(s) on the left-hand side (determinant) and those on the right-hand side of a functional dependency.
– Holds for all time.– The determinant has the minimal number of
attributes necessary to maintain the dependency with the attribute(s) on the right hand-side.
Functional Dependency & Normalization
Normalization
• Formal technique for analyzing a relation based on its primary key and the functional dependencies between the attributes of that relation.
• Often executed as a series of steps. Each step corresponds to a specific normal form, which has known properties.
• As normalization proceeds, the relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
Functional Dependency & Normalization
1st Normalized Form
• A relation in which the intersection of each row and column contains one and only one value.
• Atomicity. Based upon you’re requirements, a column holds only one value.
Examples of 1NF
• Let’s look at what’s not 1NF.• Imagine a table with names and telephone
numbers. What if we need more than one telephone #?– Add the value to the existing field, maybe a string
value separated by comma. Breaks 1NF– Fig 13.10 in book
2nd Normal Form
• Based on the concept of full functional dependency.
• A relation that is in 1NF and every non-primary-key attribute is fully functionally dependent on the primary key.
2NF examplesStudent Class Location
John CSE 143 EEB 103
John EE 131 EEB 103
Susie INFO 340 MGH 238
Susie MATH 124 PAR 104
Susie EE 131 EEB 103
• While in 1NF form, it is not in 2NF form. Candidate Key {Student,Class} . Location is not fully functional dependent, since it is dependent only on Class.
Student Class
John CSE 143
John EE 131
Susie INFO 340
Susie MATH 124
Susie EE 131
Class Location
CSE 143 EEB 103
EE 131 EEB 103
INFO 340 MGH 238
MATH 124
PAR 104
3rd Normal Form
• Based on the concept of transitive dependency.
• A relation that is in 1NF and 2NF and in which no non-primary-key attribute is transitively dependent on the primary key.
3NF examplePublisherID Name Address City State ZIP
1 Apress 2560 Ninth Street, Station 219 Berkeley CA 94710
• Looks good, but notice that City and State are really dependent on ZIP, not Publisher_ID.
• A good way to find transitive functional dependencies is think to yourself. If I update this column, do I need to update others? In this case, updating the City column would require you to update the ZIP and possible the State column.
• This example, though, hints that one of the dangers of normalization, that you can sometimes go too far..
PublisherID Name Address ZIP
1 Apress 2560 Ninth Street, Station 219 94710
ZIP City State
94710 Berkeley CA
Homework
13.1, 13.3,13.10,13.11
Books
For every relation: 1)Identify all FD’s Including transitive2) Answer whether it isIn First, Second, or ThirdNormal Form
Lecture 6Class Exercise