Source: epl242/lectures/Normalization_Theory_1.pdf
NORMALIZATION FOR
RELATIONAL DATABASE
Normalization:
• Is a formal process developed to help designers define (choose) “good” relational schemas.
• Is a formal process to help designers choose between “bad” and “good” designs.
But: What is a “GOOD” design?
Design Guidelines
• Relations should have a simple meaning.
• No insertion, deletion, or modification anomalies.
• Avoid requiring NULLs in relation columns.
• Beware of JOINs that create spurious tuples.
Relations should have simple meaning
Figure 1: Good Design
Redundant Information in Tuples and Update Anomalies
• One goal of “good” design is to minimize the storage that the base relations occupy.
- Compare the storage needed for the two designs.
• Insertion Anomalies:
- Adding an employee who has not yet been assigned to a department.
- It is difficult to add a department that doesn’t have any employees yet.
Deletion Anomalies:
• We lose the information about a department by deleting its last employee.
Modification Anomalies:
• Updating might create an inconsistent database, for example when changing the manager of department 5.
• Therefore, design the base relation schemas so that no insertion, deletion, or modification anomalies occur in the relations.
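The three anomalies can be reproduced on a toy single-table design. The sketch below assumes a hypothetical EMP_DEPT relation that repeats department data in every employee row; the attribute names and the rows are illustrative and are not taken from the lecture figures.

```python
# Hypothetical EMP_DEPT rows: (ename, ssn, dnumber, dname, dmgr_ssn).
# Department data is repeated for every employee of that department.
emp_dept = [
    ("Smith",  "111", 5, "Research",       "333"),
    ("Wong",   "333", 5, "Research",       "333"),
    ("Zelaya", "999", 4, "Administration", "987"),
]

def departments(rows):
    """Return the set of department numbers still recorded in the relation."""
    return {dnumber for (_, _, dnumber, _, _) in rows}

def managers_of(rows, dnumber):
    """Return every manager SSN recorded for the given department."""
    return {mgr for (_, _, d, _, mgr) in rows if d == dnumber}

# Deletion anomaly: removing the last employee of department 4 also
# removes the only record that department 4 exists.
after_delete = [row for row in emp_dept if row[1] != "999"]

# Modification anomaly: changing the manager of department 5 in only one
# of its rows leaves two conflicting manager values.
after_update = list(emp_dept)
after_update[0] = ("Smith", "111", 5, "Research", "888")

# Insertion anomaly: a department without employees can only be stored by
# padding the employee attributes with None (NULL).
after_insert = emp_dept + [(None, None, 6, "Sales", "444")]

print(sorted(departments(emp_dept)))         # [4, 5]
print(sorted(departments(after_delete)))     # [5]
print(sorted(managers_of(after_update, 5)))  # ['333', '888']
```

Splitting the data into separate employee and department relations would store each department's manager exactly once, so none of the three situations above could arise.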
• Avoid Requiring NULLs in Relation Columns:
- NULLs arise when many of the attributes do not apply to all tuples in the relation.
- NULLs cause problems with aggregate operations such as COUNT or SUM.
- NULLs have multiple interpretations:
  The attribute does not apply to this tuple.
  The attribute value is “unknown”.
  The value is known but absent (not recorded).
Example: If only 10% of the employees have individual offices, don’t include an “office_number” attribute in the EMPLOYEE relation; rather, create a new relation.
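A minimal sketch of the aggregate problem and of the suggested fix, using Python's built-in sqlite3 module. The table names employee_a, employee_b and emp_office and the sample rows are assumptions made for this illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Design A: office_number lives in the employee table and is NULL for most rows.
conn.execute("CREATE TABLE employee_a (ssn TEXT PRIMARY KEY, office_number INTEGER)")
conn.executemany(
    "INSERT INTO employee_a VALUES (?, ?)",
    [("111", 101), ("222", None), ("333", None), ("444", 205)],
)

# COUNT(*) counts rows, while COUNT(column) skips NULLs, so the two disagree.
rows_a, offices_a = conn.execute(
    "SELECT COUNT(*), COUNT(office_number) FROM employee_a"
).fetchone()

# Design B: a separate relation holds only the employees who have an office.
conn.execute("CREATE TABLE employee_b (ssn TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE emp_office ("
    "ssn TEXT PRIMARY KEY REFERENCES employee_b(ssn), "
    "office_number INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO employee_b VALUES (?)", [("111",), ("222",), ("333",), ("444",)]
)
conn.executemany("INSERT INTO emp_office VALUES (?, ?)", [("111", 101), ("444", 205)])

(offices_b,) = conn.execute("SELECT COUNT(*) FROM emp_office").fetchone()

print(rows_a, offices_a, offices_b)  # prints: 4 2 2
conn.close()
```

In design B no column is ever NULL, so COUNT(*) on emp_office has a single, unambiguous meaning: the number of employees with an office.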
Be aware of joins that create spurious tuples
• Consider the following relation schemas, which are derived from the EMP_PROJ relation (itself a very bad schema).
Using those schemas, we cannot recover the information that was originally in the EMP_PROJ relation.
- If we decompose EMP_PROJ into EMP_PROJ1 and EMP_LOCS and then recombine them using NATURAL JOIN, we do not get the correct original information.
We should design relation schemas so that they can be joined with equality conditions on attributes that are either primary keys or foreign keys, in a way that guarantees that no spurious tuples are generated.
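The effect can be sketched with a small hand-written natural join. The attribute lists below are assumptions (the lecture figures are not reproduced here): EMP_PROJ(ssn, pnumber, hours, ename, pname, plocation), EMP_LOCS(ename, plocation) and EMP_PROJ1(ssn, pnumber, hours, pname, plocation); the rows are made up.

```python
# Rows are dicts; attribute names and values are assumptions for this sketch.
emp_proj = [
    {"ssn": "111", "pnumber": 1, "hours": 32.5, "ename": "Smith",
     "pname": "ProductX", "plocation": "Bellaire"},
    {"ssn": "222", "pnumber": 2, "hours": 10.0, "ename": "Wong",
     "pname": "ProductY", "plocation": "Bellaire"},
]

def project(rows, attrs):
    """Relational projection: keep only attrs and drop duplicate rows."""
    seen = []
    for row in rows:
        reduced = {a: row[a] for a in attrs}
        if reduced not in seen:
            seen.append(reduced)
    return seen

def natural_join(left, right):
    """Natural join: combine rows that agree on every shared attribute."""
    result = []
    for l_row in left:
        for r_row in right:
            shared = set(l_row) & set(r_row)
            if all(l_row[a] == r_row[a] for a in shared):
                result.append({**l_row, **r_row})
    return result

# Bad decomposition: the only shared attribute, plocation, is not a key.
emp_locs = project(emp_proj, ["ename", "plocation"])
emp_proj1 = project(emp_proj, ["ssn", "pnumber", "hours", "pname", "plocation"])
rejoined = natural_join(emp_proj1, emp_locs)
spurious = [row for row in rejoined if row not in emp_proj]

# Safer decomposition: the shared attribute ssn is the key of emp.
emp = project(emp_proj, ["ssn", "ename"])
safe = natural_join(emp_proj1, emp)

print(len(rejoined), len(spurious), len(safe))  # prints: 4 2 2
```

Joining on plocation pairs every employee name at a location with every project at that location, which produces two rows that never existed in EMP_PROJ. Joining on ssn returns exactly the original two rows.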
Normalization Theory
• Helps a designer define a relation schema without the previous anomalies.
• Provides formal concepts that may be used to define the “goodness” and “badness” of individual relation schemas.
• Relational normalization is a process for identifying “stable” attribute groupings with high interdependency and affinity.
• Normalization is based on the concept of dependencies among attributes.
- These dependencies are called “Functional Dependencies”.
- They are used to identify “stable” groupings.
• Normalization theory uses the term “normal form” to describe the extent to which attributes have been grouped into stable relations.
• Numerous normal forms have been proposed, each trying to achieve a more stable grouping of attributes.
Figure: Normal Forms
Functional Dependencies (FD)
• A functional dependency is a constraint between two sets of attributes from the database.
Definition:
• Given a relation R, attribute Y of R is functionally dependent on attribute X of R, denoted X → Y,
• if and only if each X-value in R has associated with it precisely one Y-value in R (at any one time). Attributes X and Y may be composite.
Example:
• Using the EMPLOYEE relation:
• An alternate definition of FD:
Given a relation R, attribute Y of R is functionally dependent on attribute X of R if and only if, whenever two tuples of R, t1 and t2, agree on their X-value, t1[X] = t2[X], they must necessarily agree on their Y-value, t1[Y] = t2[Y].
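The pairwise definition translates directly into a check over a relation instance. The sketch below is a minimal illustration; the function name satisfies_fd and the sample rows are hypothetical and not part of the lecture.

```python
from itertools import combinations

def satisfies_fd(rows, x_attrs, y_attrs):
    """Return True if no two rows agree on x_attrs but differ on y_attrs."""
    for t1, t2 in combinations(rows, 2):
        agree_x = all(t1[a] == t2[a] for a in x_attrs)
        agree_y = all(t1[a] == t2[a] for a in y_attrs)
        if agree_x and not agree_y:
            return False
    return True

# Made-up instance with EMP_PROJ-like attributes (an assumption).
rows = [
    {"ssn": "111", "pnumber": 1, "hours": 32.5, "ename": "Smith"},
    {"ssn": "111", "pnumber": 2, "hours": 7.5,  "ename": "Smith"},
    {"ssn": "222", "pnumber": 2, "hours": 10.0, "ename": "Wong"},
]

print(satisfies_fd(rows, ["ssn"], ["ename"]))             # True
print(satisfies_fd(rows, ["ssn"], ["hours"]))             # False
print(satisfies_fd(rows, ["ssn", "pnumber"], ["hours"]))  # True
```

A result of True only means that this particular instance does not violate the dependency. A single instance can refute an FD, but it cannot establish that the FD holds for every legal instance of the schema.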
Example:
– Relation EMP_PROJ of Figure 4 satisfies the FD
• A functional dependency is a property of the meaning, or semantics, of the attributes in a relation schema.
• We use our understanding of the semantics of the attributes of R (that is, how they relate to one another) to specify the FDs that should hold on all relation instances.
• Functional dependence is a semantic notion.
– Recognizing the FDs is part of the process of understanding what the data means.