design dbms
Database Management Systems
Relational Database Design
© Silberschatz, Korth and Sudarshan, Database System Concepts
RDBMS design issues – Pitfalls and Normalization.
Overview of Normal Forms
Pitfalls in Relational Database Design
Functional Dependencies
Decomposition
Boyce-Codd Normal Form
Third Normal Form
Multivalued Dependencies and Fourth Normal Form
Overall Database Design Process
RDBMS Design Issues
So far we have assumed that attributes are grouped into a relation schema either by the common sense of the database designer or by mapping a schema defined in the ER model.
Should the goodness of a design be evaluated for tables with few attributes or for tables with a large number of attributes?
Can we combine two tables (schemas) without any problem?
We still need a formal measure of why one grouping of attributes into a relation schema may be better than another.
The Banking Schema
branch = (branch_name, branch_city, assets)
customer = (customer_id, customer_name, customer_street, customer_city)
loan = (loan_number, amount)
account = (account_number, balance)
employee = (employee_id, employee_name, telephone_number, start_date)
dependent_name = (employee_id, dname)
account_branch = (account_number, branch_name)
loan_branch = (loan_number, branch_name)
borrower = (customer_id, loan_number)
depositor = (customer_id, account_number)
cust_banker = (customer_id, employee_id, type)
works_for = (worker_employee_id, manager_employee_id)
payment = (loan_number, payment_number, payment_date, payment_amount)
savings_account = (account_number, interest_rate)
checking_account = (account_number, overdraft_amount)
Combine Schemas?
Suppose we combine borrower and loan to get
bor_loan = (customer_id, loan_number, amount )
Result is possible repetition of information (L-100 in example below)
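The repetition can be reproduced with a minimal Python sketch. The rows are hypothetical sample data (only the loan number L-100 comes from the slide), and natural_join is an illustrative helper, not a library function.

```python
# Hypothetical sample rows; only loan L-100 is named in the slide.
borrower = [
    {"customer_id": "C-1", "loan_number": "L-100"},
    {"customer_id": "C-2", "loan_number": "L-100"},  # a second borrower on the same loan
    {"customer_id": "C-3", "loan_number": "L-200"},
]
loan = [
    {"loan_number": "L-100", "amount": 10000},
    {"loan_number": "L-200", "amount": 5000},
]


def natural_join(r, s):
    """Join two lists of dict rows on every attribute name they share."""
    joined = []
    for row_r in r:
        for row_s in s:
            common = row_r.keys() & row_s.keys()
            if all(row_r[a] == row_s[a] for a in common):
                joined.append({**row_r, **row_s})
    return joined


# bor_loan = (customer_id, loan_number, amount)
bor_loan = natural_join(borrower, loan)

# The amount of L-100 is now stored once per borrower instead of once per loan.
copies_of_l100_amount = sum(1 for row in bor_loan if row["loan_number"] == "L-100")
print(copies_of_l100_amount)
```

With these rows the amount of L-100 appears in two tuples of bor_loan, so an update to that amount would have to touch both.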
A Combined Schema Without Repetition
Consider combining loan_branch and loan
loan_amt_br = (loan_number, amount, branch_name)
No repetition (as suggested by example below)
What About Smaller Schemas?
Suppose we had started with bor_loan. How would we know to split up (decompose) it into borrower and loan?
Write a rule “if there were a schema (loan_number, amount), then loan_number would be a candidate key”
Denote as a functional dependency:
loan_number → amount
In bor_loan, because loan_number is not a candidate key, the amount of a loan may have to be repeated. This indicates the need to decompose bor_loan.
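A rough way to explore this on sample data is sketched below in Python. The helper names fd_holds and is_superkey and the rows are assumptions made for illustration. Note that checking one instance can only refute a functional dependency; it cannot prove that the dependency holds for the schema in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs is not violated by these rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for a key and returns it afterwards
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_superkey(rows, attrs):
    """attrs acts as a superkey of this instance if no two rows agree on attrs."""
    keys = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(keys)) == len(keys)


# Hypothetical instance of bor_loan = (customer_id, loan_number, amount)
bor_loan = [
    {"customer_id": "C-1", "loan_number": "L-100", "amount": 10000},
    {"customer_id": "C-2", "loan_number": "L-100", "amount": 10000},
    {"customer_id": "C-3", "loan_number": "L-200", "amount": 5000},
]

print(fd_holds(bor_loan, ["loan_number"], ["amount"]))      # dependency is not violated
print(is_superkey(bor_loan, ["loan_number"]))               # loan_number alone repeats
print(is_superkey(bor_loan, ["customer_id", "loan_number"]))
```

Here loan_number → amount holds while loan_number is not a key of bor_loan, which is exactly the situation that causes the amount to be repeated.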
Not all decompositions are good. Suppose we decompose employee into
employee1 = (employee_id, employee_name)
employee2 = (employee_name, telephone_number, start_date)
The next slide shows how we lose information -- we cannot reconstruct the original employee relation -- and so, this is a lossy decomposition.
A Lossy Decomposition
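The loss of information can be illustrated with a small Python sketch. The two employee rows are hypothetical (two employees who happen to share a name), and project and natural_join are illustrative helpers rather than library functions.

```python
# Hypothetical rows: two different employees with the same name.
employee = [
    {"employee_id": 1, "employee_name": "Kim",
     "telephone_number": "555-0101", "start_date": "2001-03-01"},
    {"employee_id": 2, "employee_name": "Kim",
     "telephone_number": "555-0202", "start_date": "2004-07-15"},
]


def project(rows, attrs):
    """Projection onto attrs with duplicate elimination."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result


def natural_join(r, s):
    """Join two lists of dict rows on every attribute name they share."""
    joined = []
    for row_r in r:
        for row_s in s:
            common = row_r.keys() & row_s.keys()
            if all(row_r[a] == row_s[a] for a in common):
                joined.append({**row_r, **row_s})
    return joined


employee1 = project(employee, ["employee_id", "employee_name"])
employee2 = project(employee, ["employee_name", "telephone_number", "start_date"])

rejoined = natural_join(employee1, employee2)
spurious = [row for row in rejoined if row not in employee]
print(len(employee), len(rejoined), len(spurious))
```

The join returns more tuples than the original relation. The extra tuples pair each employee_id with the other employee's telephone number and start date, so we can no longer tell which combination was real.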
Pitfalls in Relational Database Design
Relational database design requires that we find a “good” collection of relation schemas. A bad design may lead to:
Repetition of information.
Inability to represent certain information.
Design goals:
Avoid redundant data.
Ensure that relationships among attributes are represented.
Facilitate the checking of updates for violation of database integrity constraints.
RDBMS Design Issues
So far we have assumed that attributes are grouped into a relation schema either by the common sense of the database designer or by mapping a schema defined in the ER model.
We still need a formal measure of why one grouping of attributes into a relation schema may be better than another.
Unsatisfactory relation schemas that do not meet certain conditions – the normal form tests – are decomposed into smaller relation schemas that meet the tests and hence possess the desirable properties.
Thus, the normalization procedure provides database designers with:
A formal framework for analyzing relation schemas based on their keys and on the functional dependencies among their attributes.
A series of normal form tests that can be carried out on individual relation schemas so that the relational database can be normalized to any desired degree.
First Normal Form
A relational database table that adheres to 1NF is one that meets a certain minimum set of criteria.
These criteria are basically concerned with ensuring that the table is a faithful representation of a relation and that it is free of repeating groups.
Some definitions of 1NF, most notably that of Edgar F. Codd, make reference to the concept of atomicity.
Codd states that the "values in the domains on which each relation is defined are required to be atomic with respect to the DBMS."
Codd defines an atomic value as one that "cannot be decomposed into smaller pieces by the DBMS (excluding certain special functions)."
This means a field should not be divided into parts holding more than one kind of data, such that what one part means to the DBMS depends on another part of the same field.
First Normal Form
A domain is atomic if its elements are considered to be indivisible units.
Examples of non-atomic domains:
Set of names, composite attributes
Identification numbers like CS101 that can be broken up into parts
A relational schema R is in first normal form if the domains of all attributes of R are atomic
Non-atomic values complicate storage and encourage redundant (repeated) storage of data
Example: Set of accounts stored with each customer, and set of owners stored with each account
We assume all relations are in first normal form (and revisit this again!)
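As a rough illustration of the account example, the Python sketch below starts from a non-atomic representation (a set of accounts stored with each customer) and flattens it into atomic rows shaped like the depositor schema. The sample values and the helper names to_first_normal_form and owners_of are assumptions made for this sketch.

```python
# Non-1NF representation: each customer carries a *set* of account numbers.
# The identifiers below are hypothetical sample data.
customer_accounts = {
    "C-1": {"A-101", "A-102"},
    "C-2": {"A-102"},
}


def to_first_normal_form(nested):
    """Flatten {customer_id: set of accounts} into atomic
    (customer_id, account_number) rows, as in the depositor schema."""
    return sorted(
        (customer_id, account_number)
        for customer_id, accounts in nested.items()
        for account_number in accounts
    )


def owners_of(rows, account_number):
    """With atomic rows, the owners of an account are found by a plain selection."""
    return sorted(cid for cid, acc in rows if acc == account_number)


depositor = to_first_normal_form(customer_accounts)
print(depositor)
print(owners_of(depositor, "A-102"))
```

Once the values are atomic, the same rows answer both "which accounts does a customer own" and "who owns an account", so there is no need to store a second, redundant set of owners with each account.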
First Normal Form (Cont’d)
Atomicity is actually a property of how the elements of the domain are used.
Example: Strings would normally be considered indivisible
Suppose that students are given roll numbers which are strings of the form CS0012 or EE1127
If the first two characters are extracted to find the department, the domain of roll numbers is not atomic.
Doing so is a bad idea: it leads to encoding information in the application program rather than in the database.
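One possible alternative is sketched below in Python: the department and the serial number are kept as separate attributes, and the printed roll number is derived from them. The Student class, its field names, and the two-letter-plus-four-digit layout are assumptions inferred from the examples CS0012 and EE1127.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    # Each attribute is atomic: no code needs to slice it to find its meaning.
    dept_code: str  # e.g. "CS"
    serial: int     # e.g. 12

    @property
    def roll_number(self) -> str:
        """Derived display value; assumes a two-letter code and four digits."""
        return f"{self.dept_code}{self.serial:04d}"


def split_roll_number(roll: str) -> Student:
    """One-time migration helper for existing roll numbers (assumed layout)."""
    return Student(dept_code=roll[:2], serial=int(roll[2:]))


student = split_roll_number("CS0012")
print(student.dept_code, student.serial, student.roll_number)
```

The string slicing now lives in a single migration step instead of being repeated in every program that needs the department.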
Goal: Devise a Theory for the Following
Decide whether a particular relation R is in “good” form.
In the case that a relation R is not in “good” form, decompose it into a set of relations {R1, R2, ..., Rn} such that
each relation is in good form
the decomposition is a lossless-join decomposition
Our theory is based on:
functional dependencies
multivalued dependencies
Overview of Normal Forms
1NF (First Normal Form)
To understand 2NF, 3NF, and BCNF, the concept of FDs (functional dependencies) is required.
To understand 4NF and 5NF, the concept of MVDs (multivalued dependencies) is required.
Normalization
The basic objective of normalization is to reduce the various anomalies in the database.
Normalization can be viewed as a process of analyzing the given relation schemas, based on their FDs and primary keys, to achieve the desirable properties of:
Minimizing redundancy.
Minimizing the insertion, deletion, and update anomalies.
Unsatisfactory relation schemas that do not meet certain conditions – the normal form tests – are decomposed into smaller relation schemas that meet the tests and hence possess the desirable properties.
Thus, the normalization procedure provides database designers with:
A formal framework for analyzing relation schemas based on their keys and on the functional dependencies among their attributes.
A series of normal form tests that can be carried out on individual relation schemas so that the relational database can be normalized to any desired degree.
Normalization…
The normal form of a relation refers to the highest normal form condition that it meets, and hence indicates the degree to which it has been normalized.
Normal forms, when considered in isolation from other factors, do not guarantee a good database design.
It is generally not sufficient to check separately that each relation schema in the database is, say, in BCNF or 3NF.
Rather, the process of normalization through decomposition must also confirm additional properties that the relation schemas, taken together, should possess:
The lossless join property, which guarantees that joining the decomposed relations does not generate spurious tuples.
The dependency preservation property, which ensures that each functional dependency is represented in some individual relation resulting from the decomposition.
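For a decomposition into two schemas R1 and R2, a standard sufficient test for the lossless join property is that the common attributes R1 ∩ R2 functionally determine all of R1 or all of R2. A minimal Python sketch of that test follows. The function names are illustrative, and the dependency listed for employee (employee_id determines the other attributes) is an assumption, since the slides do not state it.

```python
def attribute_closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) sets."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def is_lossless_binary(r1, r2, fds):
    """Sufficient test: (R1 ∩ R2) -> R1 or (R1 ∩ R2) -> R2 follows from fds."""
    common = set(r1) & set(r2)
    closure = attribute_closure(common, fds)
    return set(r1) <= closure or set(r2) <= closure


# bor_loan split into borrower and loan, with loan_number -> amount
bor_loan_fds = [({"loan_number"}, {"amount"})]
borrower = {"customer_id", "loan_number"}
loan = {"loan_number", "amount"}
print(is_lossless_binary(borrower, loan, bor_loan_fds))

# employee split on employee_name; the FD below is an assumed one
employee_fds = [({"employee_id"},
                 {"employee_name", "telephone_number", "start_date"})]
employee1 = {"employee_id", "employee_name"}
employee2 = {"employee_name", "telephone_number", "start_date"}
print(is_lossless_binary(employee1, employee2, employee_fds))
```

The first decomposition passes because loan_number determines every attribute of loan. The second fails because employee_name determines nothing beyond itself under the assumed dependencies.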
RDBMS Design
RDBMS design involves checking the current design with the normal form tests.
If the design is not in the desired normal form, then:
Decompose the relations (tables) into smaller ones.
Ensure that the decomposition satisfies the required properties.
Properties of decomposition:
Functional dependency preservation
– Identify the FDs in the given relation schema
– Apply Armstrong's axioms to find the set of all FDs (the closure)
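In practice the full closure of a set of FDs is rarely enumerated; the closure of an attribute set is computed instead, which is equivalent to repeated application of Armstrong's axioms. The Python sketch below does this and uses it to find candidate keys by brute force. It is a sketch for small schemas only, and the function names are illustrative.

```python
from itertools import combinations


def attribute_closure(attrs, fds):
    """All attributes determined by attrs; fds is a list of (lhs, rhs) sets."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def candidate_keys(schema, fds):
    """Minimal attribute sets whose closure covers the schema (exponential search)."""
    schema = set(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if attribute_closure(candidate, fds) >= schema:
                keys.append(candidate)
    return keys


bor_loan = {"customer_id", "loan_number", "amount"}
fds = [({"loan_number"}, {"amount"})]

print(attribute_closure({"loan_number"}, fds))
print(candidate_keys(bor_loan, fds))
```

For bor_loan the only candidate key found is the pair (customer_id, loan_number), which is consistent with the earlier observation that loan_number alone is not a candidate key of that schema.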
Overview
To understand 2NF and 3NF, FDs are required.
To understand BCNF, FDs and the closure of FDs are required.
To understand 4NF and 5NF, MVDs are required.
First Normal Form
A domain is atomic if its elements are considered to be indivisible units.
Examples of non-atomic domains:
Set of names, composite attributes
Identification numbers like CS101 that can be broken up into parts
A relational schema R is in first normal form if the domains of all attributes of R are atomic
Non-atomic values complicate storage and encourage redundant (repeated) storage of data