principles of database design, conclusions mbaa 609 r. nakatsu
TRANSCRIPT
Principles ofDatabase Design,
Conclusions
MBAA 609
R. Nakatsu
Principles of Database DesignNormalization is a process that assigns
attributes (fields) to tables such that data redundancies are eliminated or reduced.
Four Rules of Thumb:1. Single Themes: Break up a large table into
separate themes, or subjects.2. Field Uniqueness: Each field in a table
should represent a unique type of information (e.g., break up complex fields, and eliminate repeating groups).
Principles of Normalization,Continued
3. Primary Keys: Each table must have a unique identifier, or primary key, that is made up of one or more fields in the table.
4. Field Independence: You must be able to make a change to the data in any field (other than a field in the primary key) without affecting the data in any other field.
Rule 4: Field IndependenceEmployee (Employee Number, Last Name,
First Name, Job Class, Hourly Rate)
In this example, Hourly Rate is dependent on Job Class.
Employee Number Last Name First Name Job Class Hourly Rate
11 Smith John Mechanic 20
12 Jones Susan Technician 18
13 McKay Bob Mechanic 20
14 Owens Paula Clerk 15
15 Chang Steve Mechanic 20
16 Sarandon Sarah Mechanic 20
Solution: Create Two TablesEmployee (Employee Number, Last Name,
First Name, Job Class ID)
Job Class ID is the link to the Job Class table.
Employee Number Last Name First Name Job Class ID
11 Smith John 2
12 Jones Susan 3
13 McKay Bob 2
14 Owens Paula 1
15 Chang Steve 2
16 Sarandon Sarah 2
Job Class (Job Class ID, Job Class,
Hourly Rate)
There are no more field dependencies!
Job Class ID Job Class Hourly Rate1 Clerk 152 Mechanic 203 Technician 18
Integrity of a Database
It is important that the data in a database is correct and consistent.
Data integrity problems can occur due to: incorrect data entry data redundancy poor database design
Ways to Maintain Data Integrity
Entity Integrity (primary key must be unique and must not be null)
Referential Integrity (see previous slide) Field types (e.g., text, numeric, date, yes/no) Input masks (e.g., ISBN number, dates) Field validation rules (What are some
examples?) Good design of input screens (user interfaces) Check digits
Check Digit Example
Add a check digit to validate a clerk’s data entry.
The check digit is determined by some mathematical algorithm.
Example:Multiply the first digit by 2; multiply the second
digit by 3; multiply the third by 4; add the results; divide by 10 and take the remainder.
127 yields (1*2 + 2*3 + 7*4) / 10 = 36 / 10 for a check digit of 6.
Concurrency Control
Concurrency control is the management of concurrent transaction execution.
Why is it important?
The simultaneous execution of transactions over a shared database may create several data-integrity and consistency problems.
Lost Updates: An Example
Note that the first transaction has not been permanently recorded when the second transaction is executed.
Time TransactionStep Stored Value1 T1 Read QOH 502 T2 Read QOH 503 T1 QOH = 50 + 204 T2 QOH = 50 - 305 T1 Write QOH 706 T2 Write QOH 20
Concurrency Controlwith Locking
A lock guarantees exclusive use of a data item to a current transaction.
In the previous example, transaction T2 will not have access to the data item that is currently used by transaction T1.
The lock is released when the transaction T1 is completed.
Data Warehouse
support OLAP (online analytical processing) support data mining (data in a data warehouse are analyzed to reveal hidden patterns and trends
in historical business activity)
stores data that have been extracted from the various operational, external, and other databases of an organization
Data warehouses...
Data Warehouse
Recap
Why do we need a database? What is a database? What are the main functions of a DBMS? Entity-Relationship Modeling Linking two tables: 1:1, 1:M, and M:N Four principles of database design Data Integrity Concurrency Control Data Warehouses