dr. chen, business database systems chapter 1 overview of database concepts jason c.h. chen, ph.d....

88
Dr. Chen, Business Database Systems Chapter 1 Overview of Database Concepts Jason C.H. Chen, Ph.D. Professor of MIS School of Business, Gonzaga University Spokane, WA 99258 USA [email protected]

Upload: georgina-tucker

Post on 26-Dec-2015

221 views

Category:

Documents


1 download

TRANSCRIPT

Dr. Chen, Business Database Systems

Chapter 1Overview of Database Concepts

Jason C.H. Chen, Ph.D.

Professor of MIS

School of Business, Gonzaga University

Spokane, WA 99258 USA

[email protected]

Dr. Chen, Business Database Systems 2

Objectives

• Define database terms• Identify the purpose of a database management

system (DBMS)• Explain database design using entity-relationship

models and normalization• Explain the purpose of a Structured Query

Language (SQL)• Understand how this textbook’s topics are

sequenced and how the two sample databases are used

Dr. Chen, Business Database Systems

What is Information ?

DATA

Information is refined data.

INFORMATION

Dr. Chen, Business Database Systems 4

Data vs. Information

• Users really want is– Information

i What users can learn from the datavhow to satisfy their best customersvhow to allocate their resources most efficiently,vhow to minimize losses

Dr. Chen, Business Database Systems 5

Objectives of the MIS

Deliver the right information

to the right people,

at the right time,

with the right form.

Ultimately, MIS should improve the workers’ productivity.

who has what information about whom and when, where, and how will all be decided in the process of building an information system.

Dr. Chen, Business Database Systems

A Relationship Between the Organization and the DataBase

OrganizationalIS Resources:1. Hardware2. Software3. Data4. Procedures5. People

PictureOr

Model

OldState

Events

NewState

Activities

DataBaseManagement

Systems D.B.

Transactions

(Add, modify, delete, …)

Correspond with the processing of the events

Information must be:1. Time: Timely, 2. Content: Accurate, relevant, and verifiable.3. Form: Presented in a useable form.

User

TM 0-6

Dr. Chen, Business Database Systems 7

Questions

• Is MS/Access or Oracle (SQL/Server or IBM DB2) a data base?

• Why don’t we just create a “huge” data file containing all fields (columns)?– Have you ever created a database (with many

applications) with only one “huge” data file containing all fields (columns)?

Dr. Chen, Business Database Systems 8

Traditional File Processing Approach

• Traditional File Processing – An application uses one specialized file. For example,

purchase order processing application uses data about supplier and products, while an order-taking application uses data about customer, products and orders.

• Problems:– Data dependence: data structures are tightly coupled with

applications. In other words, we prefer data independence.– Data redundancy: same data are repeatedly saved for

different applications. – Other problems

Dr. Chen, Business Database Systems 9

Types of Data Processing

• Two types of data processing– File-based (traditional) data processing

• applications developed by Java

– Data-based data processing• applications developed by Oracle or MS/Access)

Dr. Chen, Business Database Systems

Three file processing systems at a Furniture Company

Duplicate Data

What is the main problem in the

company’s processing systems?

Dr. Chen, Business Database Systems 11

Disadvantages of File Processing

• Program-Data Dependence– All programs maintain metadata for each file they use

• Data Redundancy (Duplication of data)– Different systems/programs have separate copies of the same

data

• Limited Data Sharing– No centralized control of data

• Lengthy Development Times– Programmers must design their own file formats

• Excessive Program Maintenance– 80% of of information systems budget

Dr. Chen, Business Database Systems

Database Management System

DBMS manages data resources like an operating system manages hardware resources

DBMSDBMS Databasecontainingcentralized

shared data

Application#1

Application#2

Application#3

What are the advantages of employing Data-based Data Processing?

Dr. Chen, Business Database Systems

Traditional File Processing

Product

Sales rep

Supplier

Data Files

Purchasing System

Sales order processing

system

Product

Applications

Name, item#, description...

Name, item#, description...

Dr. Chen, Business Database Systems

Database Management Systems

Product

Sales rep

Supplier

Database

DatabaseManagem ent

System s

PurchasingSystem

Salesorder processing

system

Dr. Chen, Business Database Systems 15

Database Terminology

• Database – an organized collection of logically related data files

• Database management system (DBMS) – software used to create and interact with the database

Dr. Chen, Business Database Systems 16

Database Components

• Character– Basic unit of data– Can be a letter, number, or special symbol

• Field– A group of related characters– Represents an attribute or characteristic of an entity– Corresponds to a column in the physical database

• Record– A collection of fields for one specific entity– Corresponds to a row in the physical database

• File– A group of records about the same type of entity

Dr. Chen, Business Database Systems 17

Relational Databases

• Data is organized in tables– Columns (fields) represent different data categories– Rows (records) contain actual data values

Dr. Chen, Business Database Systems 18

Database Example

Dr. Chen, Business Database Systems

Components Example

Dr. Chen, Business Database Systems 20

Relational Database Terms

• Entity: an object about which you want to store data• Table, column, row

– Flat file, attribute (or field), record• Relationships: links that show how different records are

related• Key Fields: establish relationships among records in

different tables• Five main types of key fields:

– primary keys– candidate keys– surrogate keys– foreign keys– composite keys

Dr. Chen, Business Database Systems 21

Primary Keys vs. Candidate Keys

• Primary key– Value must be unique for each record– Serves to identify the record– Present in every record– Can’t be NULL– Should be numeric

• Candidate key – Any field that could be used as the primary key– Should be a unique, unchanging numeric field

Dr. Chen, Business Database Systems 22

Surrogate Keys

• Surrogate key: created to be the record’s primary key identifier when no suitable primary key exists

• Surrogate key has no real relationship to the record to which it is assigned, other than to identify the record uniquely

• Developers configure the database to generate surrogate key values automatically

• In an Oracle database, you can automatically generate surrogate key values using a sequence

• Surrogate keys are always numerical fields, because the database generates surrogate key values automatically by incrementing the previous value by one

Dr. Chen, Business Database Systems 23

Foreign Keys• Foreign key: a field in a table that is a primary key in

another table• Foreign key creates a relationship between the two tables• Foreign key value must exist in the table where it is a

primary key

Dr. Chen, Business Database Systems 24

Composite Key

• Composite key: a unique key that you create by combining two or more fields

• Usually comprised of fields that are primary keys in other tables

ORDER_ID PRODUCT_ID ORDER_QUANTITY

100 1 2

100 2 2

200 2 2

200 1 1

Composite Key

Dr. Chen, Business Database Systems 25

Database Management System

• Data storage: manage the physical structure of the database

• Security: control user access and privileges• Multiuser access: manage concurrent data access• Backup: enable recovery options for database failures• Data access language: provide a language that allows

database access• Data integrity: enable constraints or checks on data• Data dictionary: maintain information about database

structure

Dr. Chen, Business Database Systems 26

Client/Server Database Management Systems

• Client/server database– Takes advantage of distributed processing and networked

computers by distributing processing across multiple computers

– DBMS server process runs on one workstation, and the database applications run on separate client workstations across the network

– Preferred for database applications that retrieve and manipulate small amounts of data from databases containing large numbers of records because they minimize network traffic and improve response times

• Organizations generally use a client/server database if the database will have more than 10 simultaneous users and if the database is mission critical

Dr. Chen, Business Database Systems

The Oracle11g Client/Server Database

• Oracle11g is the latest release of Oracle Corporation’s relational database management system

• All Oracle server- and client-side programs use Oracle Net, a utility that enables the network communication between the client and the server

Dr. Chen, Business Database Systems 28

Linking Internal Databases to the Web

DATABASE TRENDS

N

Dr. Chen, Business Database Systems 29

Web-Based Client/Server Database Architecture

Network

Web server

Database Server

Web Browser5. Retrieved data

4. Data query

7. Data-based Web page

8. Data-based Web page

1. Request for data-based Web page

6. Retrieved data

3. Data query

2. Request for data-based Web page

LegendCommunicationsBetween Web browserAnd Web server

CommunicationsBetween Web serverAnd database server

Dr. Chen, Business Database Systems 30

System Response TimeDepends on ...

• the speed of the network• the size of the database• the way the database is used …• a personal database running on a server

might handle 10 users making database transactions at the same time before becoming overloaded.

Dr. Chen, Business Database Systems 31

Handling server and client failures Processing transactions Handling high data volumes Providing security Servicing multiple simultaneous users

Advantages of Client/Server Database Management Systems

Dr. Chen, Business Database Systems

SQL, SQL*Plus, and PL/SQL

Language or Tool Description

SQL A command language for communication with theOracle 10/11Server from any tool or application. OracleSQL contains many extensions.

SQL*Plus An Oracle tool that recognizes and submits SQL andPL/SQL statements to the Server for execution andcontains its own command language.

PL/SQL An Oracle Procedural Language for writing applicationlogic and manipulating data outside the database.

SQL*Plus ServerBuffer

SQL and PL/SQLScripts

Terminal

Dr. Chen, Business Database Systems 33

Design Principles

• To avoid creating tables that contain redundant data, group related items that describe a single entity together in a common table

• Do not create tables that duplicate values many times in different rows

• When creating a database and inserting data values, you must specify the data type for each column

• Recall that primary key fields should use a number data type to avoid typographical, punctuation, and case variation errors

Dr. Chen, Business Database Systems 34

Database Design

• Systems Development Life Cycle (SDLC)• Entity-relationship model (E-R model)• Normalization

Dr. Chen, Business Database Systems

Systems Development Life Cycle

Systems ImplementationProduct:

Operational System

Systems ImplementationProduct:

Operational System

Systems Investigation(Definition) Product:

Feasibility Study

Systems Investigation(Definition) Product:

Feasibility Study

Systems AnalysisProduct:

Functional Requirements

Systems AnalysisProduct:

Functional Requirements

Systems DesignProduct:

System Specifications

Systems DesignProduct:

System Specifications

Systems MaintenanceProduct:

Improved System

Systems MaintenanceProduct:

Improved System

Understand theBusinessProblem orOpportunity

Develop anInformationSystemSolution

Implementthe InformationSystemSolution

Dr. Chen, Business Database Systems 36

Systems Development Life Cycle (SDLC)

• Systems investigation – understanding the problem• Systems analysis – understanding the solution• Systems design – creating the logical and physical

components• Systems implementation – placing completed system

into operation – including integration, testing and deployment

• Systems maintenance and review – evaluating the implemented system

Dr. Chen, Business Database Systems 37

Data Models

• A data model is a collection of concepts for describing data.

CUSTOMER ORDER

• Three database models– Hierarchical– Network– Relational

Dr. Chen, Business Database Systems

Entity-Relationship Model (E-R Model)

• Used to depict the relationship that exists among entities

• The following relationships can be included in an E-R model:– One-to-one– One-to-many– Many-to-many

Dr. Chen, Business Database Systems 39

E-R Model Notation Examples

Figure 1-3 E-R Model notation examples

Please note that the name of entity should be singular even it contains multiple instances

Dr. Chen, Business Database Systems

Figure 1-5 JustLee Books’ table structures after normalization

Dr. Chen, Business Database Systems 41

One-to-One Relationship

• Each occurrence of data in one entity is represented by only one occurrence of data in the other entity

• Example: Each order has just one invoice and each invoice is assigned to just one order

ORDER INVOICE

Dr. Chen, Business Database Systems 42

One-to-Many Relationship

• Each occurrence of data in one entity can be represented by many occurrences of the data in the other entity

• Example: A class has only one instructor, but each instructor can teach many classes

Class Instructor

Dr. Chen, Business Database Systems 43

Many-to-Many Relationship

• Data can have multiple occurrences in both entities

• Example: A student can take many classes, and each class is composed of many students

• Can not be included in the physical database

Class Student

Dr. Chen, Business Database Systems 44

JustLee Example E-R Model

Dr. Chen, Business Database Systems 45

Installing Personal Oracle11g

• I created a “Discussion Questions” on the Bb for you to share (or post question) your experience on installation of Oracle 11g. Please participate and it will be also your part of class performance.

•  • Please note that if your computer with 64-bit operating

system such as Windows 7, you can’t use the DVD to install Oracle since it is for 32-bit OS. Rather you need to download 64-bit Oracle 11g from the Oracle site (www.oracle.com) or type in keyword search of “Oracle 11g download”.

Dr. Chen, Business Database Systems

ISBN Title Publication Date Cost Retail Category Publishe

r Contact Author

8843172113 Database Implementation

04-JUN-03 31.40 55.95 Computer American Publishing

Davidson T.Peterson

8843172113 Database Implementation

04-JUN-03 31.40 55.95 Computer American Publishing

Davidson J.Austin

8843172113 Database Implementation

04-JUN-03 31.40 55.95 Computer American Publishing

Davidson J.Adams

1915762492 Handcranked Computers

21-JUN-05 21.80 25.00 Computer Amercian Publishing

Davidson W.White

1915762492 Handcranked Computers

21-JUN-05 21.80 25.00 Computer Amercian Publishing

Davidson L.White

6522489652 Meaningful Accounting

25-MAY-08 41.60 85.00 Accounting Articulate Publishing

Brown M.Kane

6522489652 Meaningful Accounting

25-MAY-08 41.60 85.00 Accounting Articulate Publishing

Brown S. Little

Q: Can we create/enter the data base on the following table immediately?

BOOKS table

Dr. Chen, Business Database Systems 47

Database Normalization

• Purposes– design a reliable and stable data bases– increase data integrity (reduce or control data

redundancy)

• Processes– determines required tables and columns for

each table– multistep process

Dr. Chen, Business Database Systems

Problem Solving for Modeling a Database Project

Business ProblemStudy and Analyze w/Team

???

IMPLEMENTATION

Dr. Chen, Business Database Systems

Problem Solving for Modeling a Database Project

Business ProblemStudy and Analyze w/Team

ER or other Model

Normalization(3NF)

User interview & Integrated Model

Normalization

IMPLEMENTATION

Dr. Chen, Business Database Systems 50

Well-Structured Relations

• A relation that contains minimal data redundancy and allows users to insert, delete, and update rows without causing data inconsistencies

• Goal is to avoid (minimize) anomalies– Insertion Anomaly – adding new rows forces user to

create duplicate data– Deletion Anomaly – deleting rows may cause a loss of

data that would be needed for other future rows– Modification Anomaly – changing data in a row forces

changes to other rows because of duplication

General rule of thumb: a table should not pertain to more than one entity type

Dr. Chen, Business Database Systems 51

Functional Dependencies and Keys• Functional Dependency: The value of one attribute (the

determinant) determines the value of another attribute.• Candidate Key

– A unique identifier. One of the candidate keys will become the primary key• E.g. perhaps there is both credit card number and SS# in

a table…in this case both are candidate keys– Each non-key field is functionally dependent on every

candidate key

Figure 5-9

Dr. Chen, Business Database Systems

Database Normalization (continued)

• Data redundancy – refers to having the same data in different places within a database

• Data anomalies – refers to data inconsistencies– Insertion anomaly– Deletion anomaly– Modification anomaly

Dr. Chen, Business Database Systems

Unnormalized Data

• Contains repeating groups in the Author column in the BOOKS table

Dr. Chen, Business Database Systems

Table with Repeating Group attributes

First normalform (1NF)

Second normalform(2NF)

Third normalform (3NF)

Boyce-Codd normalform (BC-NF)

Fourth normalForm (4NF)

Fifth normalform (5NF)

Remove Repeating Group Attributes

Remove remaining anomalies resulting from multiple candidate keys

Remove Multivalued Dependencies

Remove Remaining Anomalies

Figure: 5-22 Steps in normalization

Remove …

Remove …

Dr. Chen, Business Database Systems 55

First-Normal Form (1NF)

• Primary key (pk) is identified• Repeating groups are eliminated• Every attribute value is atomic (singled-

value)

Dr. Chen, Business Database Systems

First-Normal Form (1NF) (continued)

• ISBN and Author columns together create a composite primary key

Dr. Chen, Business Database Systems

ISBN Title Publication Date Cost Retail Category Publishe

r Contact Author

8843172113 Database Implementation

04-JUN-03 31.40 55.95 Computer American Publishing

Davidson T.Peterson

8843172113 Database Implementation

04-JUN-03 31.40 55.95 Computer American Publishing

Davidson J.Austin

8843172113 Database Implementation

04-JUN-03 31.40 55.95 Computer American Publishing

Davidson J.Adams

1915762492 Handcranked Computers

21-JUN-05 21.80 25.00 Computer Amercian Publishing

Davidson W.White

1915762492 Handcranked Computers

21-JUN-05 21.80 25.00 Computer Amercian Publishing

Davidson L.White

6522489652 Meaningful Accounting

25-MAY-08 41.60 85.00 Accounting Articulate Publishing

Brown M.Kane

6522489652 Meaningful Accounting

25-MAY-08 41.60 85.00 Accounting Articulate Publishing

Brown S. Little

1NF: ISBN and Author fields together create a composite primary key

Q: What fields are dependent on ISBN alone ?

Dr. Chen, Business Database Systems 58

Composite Primary Key

• More than one field (column) is required to uniquely identify a record (row).

• Can lead to partial dependency – a field is only dependent on a portion of the primary key

Dr. Chen, Business Database Systems 59

Second-Normal Form (2NF)

• 1NF and every non-key attribute is fully functionally dependent on the primary key.

• Partial dependency must be eliminated– Break the composite primary key into two

parts, each part representing a separate table– Every non-key attribute must be defined by the

entire key (either a single PK or a CK), not by only part of the key

Dr. Chen, Business Database Systems

Second-Normal Form (2NF) (continued)

• BOOKS table in 2NF

60

ISBN Title Publication Date Cost Retail Category Publisher Contact

8843172113 Database Implementation

04-JUN-03 31.40 55.95 Computer American Publishing

Davidson

1915762492 Handcranked Computers

21-JUN-05 21.80 25.00 Computer American Publishing

Davidson

6522489652 Meaningful Accounting

25-MAY-08 41.60 85.00 Accounting Articulate Publishing

Brown

ISBN Author

8843172113 T.Peterson

8843172113 J.Austin

8843172113 J.Adams

1915762492 W.White

1915762492 L.White

6522489652 M.Kane

6522489652 S. Little

Dr. Chen, Business Database Systems

ISBN Title Publication Date Cost Retail Category Publishe

r Contact Author

8843172113 Database Implementation

04-JUN-03 31.40 55.95 Computer American Publishing

Davidson T.Peterson

8843172113 Database Implementation

04-JUN-03 31.40 55.95 Computer American Publishing

Davidson J.Austin

8843172113 Database Implementation

04-JUN-03 31.40 55.95 Computer American Publishing

Davidson J.Adams

1915762492 Handcranked Computers

21-JUN-05 21.80 25.00 Computer Amercian Publishing

Davidson W.White

1915762492 Handcranked Computers

21-JUN-05 21.80 25.00 Computer Amercian Publishing

Davidson L.White

6522489652 Meaningful Accounting

25-MAY-08 41.60 85.00 Accounting Articulate Publishing

Brown M.Kane

6522489652 Meaningful Accounting

25-MAY-08 41.60 85.00 Accounting Articulate Publishing

Brown S. Little

1NF: ISBN and Author fields together create a composite primary key

Q: What fields are dependent on ISBN alone ?

Dr. Chen, Business Database Systems

ISBN TitlePublication

DateCost Retail Category Publisher AuthorContact

Dependency on entire primary key (ISBN & Author)

Dependency on partial primary key (ISBN)

ISBN TitlePublication

DateCost Retail Category Publisher Contact Author

8843172113 Database Implementation

04-JUN-03 31.40 55.95 Computer American Publishing

Davidson T.Peterson

8843172113 Database Implementation

04-JUN-03 31.40 55.95 Computer American Publishing

Davidson J.Austin

8843172113 Database Implementation

04-JUN-03 31.40 55.95 Computer American Publishing

Davidson J.Adams

1915762492 Handcranked Computers

21-JUN-05 21.80 25.00 Computer Amercian Publishing

Davidson W.White

1915762492 Handcranked Computers

21-JUN-05 21.80 25.00 Computer Amercian Publishing

Davidson L.White

6522489652 Meaningful Accounting

25-MAY-08 41.60 85.00 Accounting Articulate Publishing

Brown M.Kane

6522489652 Meaningful Accounting

25-MAY-08 41.60 85.00 Accounting Articulate Publishing

Brown S. Little

Fig: A Process from 1NF to 2NF

Dr. Chen, Business Database Systems

ISBN TitlePublication

DateCost Retail Category Publisher AuthorContact

Dependency on entire primary key (ISBN & Author)

Dependency on partial primary key (ISBN)

Fig: A Process from1NF to 2NF

ISBN & Author Title, Publication Date, Cost …

ISBN Title, Publication Date, Cost …

Therefore, NOT in 2nd Normal Form!!

Dr. Chen, Business Database Systems

ISBN TitlePublication

DateCost Retail Category Publisher AuthorContact

Dependency on entire primary key (ISBN & Author)

Dependency on partial primary key (ISBN)

Fig: A Process from 1NF to 2NF

ISBN TitlePublication

DateCost Retail Category Publisher Contact

ISBN Author

Dr. Chen, Business Database Systems

ISBN Title Publication Date Cost Retail Category Publisher Contact

8843172113 Database Implementation

04-JUN-03 31.40 55.95 Computer American Publishing

Davidson

1915762492 Handcranked Computers

21-JUN-05 21.80 25.00 Computer American Publishing

Davidson

6522489652 Meaningful Accounting

25-MAY-08 41.60 85.00 Accounting Articulate Publishing

Brown

Fig: A Process from 1NF to 2NF

ISBN Author

8843172113 T.Peterson

8843172113 J.Austin

8843172113 J.Adams

1915762492 W.White

1915762492 L.White

6522489652 M.Kane

6522489652 S. Little

ISBN Title Publication Date Cost Retail Category Publisher Contact Author

8843172113 Database Implementation

04-JUN-03 31.40 55.95 Computer American Publishing

Davidson T.Peterson

8843172113 Database Implementation

04-JUN-03 31.40 55.95 Computer American Publishing

Davidson J.Austin

8843172113 Database Implementation

04-JUN-03 31.40 55.95 Computer American Publishing

Davidson J.Adams

1915762492 Handcranked Computers

21-JUN-05 21.80 25.00 Computer Amercian Publishing

Davidson W.White

1915762492 Handcranked Computers

21-JUN-05 21.80 25.00 Computer Amercian Publishing

Davidson L.White

6522489652 Meaningful Accounting

25-MAY-08 41.60 85.00 Accounting Articulate Publishing

Brown M.Kane

6522489652 Meaningful Accounting

25-MAY-08 41.60 85.00 Accounting Articulate Publishing

Brown S. Little

Dr. Chen, Business Database Systems

Figure 1-5 JustLee Books’ table structures after normalization

Dr. Chen, Business Database Systems

Table with Repeating Group attributes

First normalform (1NF)

Second normalform(2NF)

Third normalform (3NF)

Boyce-Codd normalform (BC-NF)

Fourth normalForm (4NF)

Fifth normalform (5NF)

Remove Repeating Group Attributes

Remove remaining anomalies resulting from multiple candidate keys

Remove Multivalued Dependencies

Remove Remaining Anomalies

Figure: 5-22 Steps in normalization

Remove Partial Dependencies

Remove Transitive Dependencies

Dr. Chen, Business Database Systems 68

Third Normal Form (3NF)

• Transitive dependency must be eliminated– 2NF and transitive dependencies (functional

dependency between non-key attributes) must be eliminated

Dr. Chen, Business Database Systems

ISBN TitlePublication

DateCost Retail Category Publisher AuthorContact

Dependency on entire primary key (ISBN & Author)

Dependency on partial primary key (ISBN)

Fig: A Process from 1NF to 2NF

ISBN Author

ISBN TitlePublication

DateCost Retail Category Publisher Contact

Is it in the 3NF?

Dr. Chen, Business Database Systems

ISBN TitlePublication

DateCost Retail Category Publisher Contact

Fig: A Process from 2NF to 3NF

ISBN TitleISBN Publication Date …ISBN CategoryISBN PublisherISBN Contact and

BUTISBN Category Contact implies

Transitive dependency(not in 3rd NF)

ISBN Contact

Category ContactAll this is OK(2nd NF)

All non-key fields are dependent on pk (ISBN)

a non-key field is dep. on another non-key field

Dr. Chen, Business Database Systems

ISBN TitlePublication

DateCost Retail Category Publisher Contact

Fig: A Process from 2NF to 3NF

ISBN TitlePublication

DateCost Retail Category Publisher

Category Contact

Dr. Chen, Business Database Systems

Third-Normal Form (3NF)

• Publisher contact name has been removed

TABLE 1-5 The BOOKS Table in 3NF

Category Contact

Computer Davidson

Accounting Brown

ISBN Title Publication Date Cost Retail Category Publisher

8843172113 Database Implementation

04-JUN-03 31.40 55.95 Computer American Publishing

1915762492 Handcranked Computers

21-JUN-05 21.80 25.00 Computer American Publishing

6522489652 Meaningful Accounting

25-MAY-08 41.60 85.00 Accounting Articulate Publishing

We might further create a look up table for “Category” (see Table 1-6, p.12)

Dr. Chen, Business Database Systems

ISBN TitlePublication

DateCost Retail Category Publisher Contact

8843172113 Database Implementation

04-JUN-03 31.40 55.95 Computer American Publishing

Davidson

1915762492 Handcranked Computers

21-JUN-05 21.80 25.00 Computer American Publishing

Davidson

6522489652 Meaningful Accounting

25-MAY-08 41.60 85.00 Accounting Articulate Publishin

g

Brown

Fig: A Process from 2NF to 3NF

Category Contact

Computer Davidson

Accounting Brown

ISBN TitlePublication

DateCost Retail Category Publisher

8843172113 Database Implementation

04-JUN-03 31.40 55.95 Computer American Publishing

1915762492 Handcranked Computers

21-JUN-05 21.80 25.00 Computer American Publishing

6522489652 Meaningful Accounting

25-MAY-08 41.60 85.00 Accounting Articulate Publishing

Dr. Chen, Business Database Systems

ISBN TitlePublication

DateCost Retail Category Publisher AuthorContact

Dependency on entire primary key (ISBN & Author)

Dependency on partial primary key (ISBN)

ISBN Author

ISBN TitlePublication

DateCost Retail Category Publisher

Category Contact

ISBN TitlePublication

DateCost Retail Category Publisher Contact

Q: Finally, how many tables on the 3NF?

Processes from 1NF 2NF 3NF

(remove partial dependency)

(remove transitive dependency)

Dr. Chen, Business Database Systems

ISBN TitlePublication

DateCost Retail Category Publisher AuthorContact

ISBN Author

Category Contact

ISBN TitlePublication

DateCost Retail Category Publisher

Q: How many tables on the 3NF?

Answer: Three tables. What are they?

Note that You can’t do any implementation until you transform a table into 3NF (with more tables produced)

BOOKS

BOOK_AUTHOR

CATEGORY_CONTACT

Dr. Chen, Business Database Systems

Processes from 1NF 2NF 3NF

Category Contact

Computer Davidson

Accounting Brown

ISBN Title Publication Date Cost Retail Category Publisher

8843172113 Database Implementation

04-JUN-03 31.40 55.95 Computer American Publishing

1915762492 Handcranked Computers

21-JUN-05 21.80 25.00 Computer American Publishing

6522489652 Meaningful Accounting

25-MAY-08 41.60 85.00 Accounting Articulate Publishing

ISBN Author

8843172113 T.Peterson

8843172113 J.Austin

8843172113 J.Adams

1915762492 W.White

1915762492 L.White

6522489652 M.Kane

6522489652 S. Little

ISBN Title Publication Date Cost Retail Category Publisher Contact Author

8843172113 Database Implementation

04-JUN-03 31.40 55.95 Computer American Publishing

Davidson T.Peterson

8843172113 Database Implementation

04-JUN-03 31.40 55.95 Computer American Publishing

Davidson J.Austin

8843172113 Database Implementation

04-JUN-03 31.40 55.95 Computer American Publishing

Davidson J.Adams

1915762492 Handcranked Computers

21-JUN-05 21.80 25.00 Computer Amercian Publishing

Davidson W.White

1915762492 Handcranked Computers

21-JUN-05 21.80 25.00 Computer Amercian Publishing

Davidson L.White

6522489652 Meaningful Accounting

25-MAY-08 41.60 85.00 Accounting Articulate Publishing

Brown M.Kane

6522489652 Meaningful Accounting

25-MAY-08 41.60 85.00 Accounting Articulate Publishing

Brown S. Little

Category Code Category

10 Computer

50 Accounting

fk(foreign key)

We will learn all detailed processes in the main text (ch.4)

Dr. Chen, Business Database Systems

Table with Repeating Group attributes

First normalform (1NF)

Second normalform(2NF)

Third normalform (3NF)

Boyce-Codd normalform (BC-NF)

Fourth normalForm (4NF)

Fifth normalform (5NF)

Remove Repeating Group Attributes

Remove remaining anomalies resulting from multiple candidate keys

Remove Multivalued Dependencies

Remove Remaining Anomalies

Figure: 5-22 Steps in normalization

Remove Partial Dependencies

Remove Transitive Dependencies

Dr. Chen, Business Database Systems 78

Summary of Normalization Steps

• 1NF: eliminate repeating groups, identify the primary key

• 2NF: table is in 1NF, and partial dependencies are eliminated

• 3NF: table is in 2NF, and transitive dependencies are eliminated

Dr. Chen, Business Database Systems

Table with Repeating Group attributes

First normalform (1NF)

Second normalform(2NF)

Third normalform (3NF)

Boyce-Codd normalform (BC-NF)

Fourth normalForm (4NF)

Fifth normalform (5NF)

Remove Repeating Group Attributes

Remove remaining anomalies resulting from multiple candidate keys

Remove Multivalued Dependencies

Remove Remaining Anomalies

Figure: 5-22 Steps in normalization

Remove Partial Dependencies

Remove Transitive Dependencies

Dr. Chen, Business Database Systems 80

Relating Tables within the Database

• Once tables are normalized, make certain tables are linked

• Tables are linked through a common field• A common field is usually a primary key in

one table and a foreign key in the other table

Dr. Chen, Business Database Systems

CategoryContact

Category_Contact

Dr. Chen, Business Database Systems 82

Lookup Table

• Common reference for descriptive data tables referenced in a foreign key

Category Code Category

10 Computer

50 Accounting

fk(foreign key)

Dr. Chen, Business Database Systems 83

Structured Query Language (SQL)

• Data sublanguage • Used to:

– Create or modify tables– Add data to tables– Edit data in tables– Retrieve data from tables

• ANSI and ISO standards

Dr. Chen, Business Database Systems 84

Databases Used in this Textbook – JustLee Books Database

• Assumptions– No back orders or partial shipments– Only U.S. addresses– Completed orders are transferred to the annual

SALES table at the end of each month to enable faster processing on the ORDERS table

Dr. Chen, Business Database Systems 85

Topic Sequence

• The first half of the text will focus on creating a database

• The second half of the text will focus on querying or retrieving data from a database

Dr. Chen, Business Database Systems 86

Summary• A DBMS is used to create and maintain a database• A database is composed of a group of interrelated tables• A file is a group of related records; a file is also called a table in the

physical database• A record is a group of related fields regarding one specific entity; a

record is also called a row• A common field is used to join data contained in different tables• A primary key is used to uniquely identify each record• A foreign key is a common field that exists between two tables but

is also a primary key in one of the tables• A lookup table is a common term for a table referenced in a foreign

key• A Structured Query Language (SQL) is a data sublanguage that

navigates the data stored within a database’s tables

Dr. Chen, Business Database Systems 87

Summary (continued)

• A record is considered unnormalized if it contains repeating groups

• A record is in first-normal form (1NF) if no repeating groups exist and it has a primary key

• Second-normal form (2NF) is achieved if the record is in 1NF and has no partial dependencies

• After a record is in 2NF and all transitive dependencies have been removed, then it is in third-normal form (3NF), which is generally sufficient for most databases

Dr. Chen, Business Database Systems

Homework: Transform it into 3NF

SALES relation with simple data

Cust_ID Name Salesperson Region

8023 Anderson 101 South

9167 Bancroft 102 West

7924 Hobbs 101 South

6837 Tucker 103 East

8596 Eckersley 102 West

7018 Arnold 104 North

SALES