1.1 - relational data model.odp
TRANSCRIPT
© 2010 IBM Corporation
Information Management
Information Management Ecosystem PartnershipsIBM Canada Lab
Summer/Fall 2010
The Relational Data Model
2 © 2010 IBM Corporation
Information Management
Agenda
■ Data and Databases
■ Database Management Systems (DBMS)
■ Information Model & Data Model
■ Relational Data Model & Relational Databases
■ Normalization
3 © 2010 IBM Corporation
Information Management
What is Data?
■ Collection of facts or numbers
■ Can be quantitative or qualitative
■ Describes a variable or set of variables
■ Essentially data can be thought of as the result of observations based on things like:
– measurements– statistics
4 © 2010 IBM Corporation
Information Management
Data and Information
■ Data is simply facts
■ Why is data important?–By relating different pieces of data
we are able to extract valuable information by presenting it in meaningful context
■ For that we need to be able to:
● Store data → so it can be persisted● Structure data → so it is easier to manipulate● Organize data → in a meaningful way● Process data → to derive data value from it
5 © 2010 IBM Corporation
Information Management
Why Databases?
■ Data can be stored using multiple methods such as:–Text files–Comma delimited data files –Spreadsheets–Databases
■ Why database?–The way data is accessed –The way data is handled
6 © 2010 IBM Corporation
Information Management
Managing Data
■ Using a database provides:–a standard interface for accessing data –multiple users with simultaneous ability to insert, update and
delete data–changes to the data without risk of losing data and its
consistency–the capability to handle huge volumes of data and users–tools for data backup, restore and recovery–security–reduce redundancy–data independence
7 © 2010 IBM Corporation
Information Management
Database Management Systems
■ Database Management System (DBMS)– It is the software system that manages databases– Provides an interface of access to the databases– Provides data services to applications
• Efficient data querying and update mechanisms• Data integrity – guarantees data is always right even in case
of software and hardware errors• Others: backup, compression, security, replication, etc.
■ DB2 is a Database Management System
8 © 2010 IBM Corporation
Information Management
Information Model
■ Information Model –Abstract management of objects at a conceptual level–Independent of specific implementations and protocols –Hides all protocol and implementation details –Defines relationships between managed objects.
■ Multiple implementations of an information model exists–Data models
9 © 2010 IBM Corporation
Information Management
Data Model
■ A model is a representation of an object or concept of the real world
– 3D model: a graphical representation of an object– Scale model: a replica or copy of an object in a smaller size– Business model: describes how a company operates
■ Data Model– Define how data is to be represented and structured– It can be used to map how data from the real world is to be
represented in a software system
■ Characteristics of a Data Model– Lower level of abstraction– Intended for the software developer– Includes specific implementation and protocol details
10 © 2010 IBM Corporation
Information Management
Types of Data Models
■ Extended Relational
■ Entity-Relationship
■ Hierarchical
■ Network
■ Object-oriented
■ Object-relational
■ Relational
■ Semantic
■ Semi-structured (XML)
11 © 2010 IBM Corporation
Information Management
The Relational Data Model
■ Proposed by E.F. Codd in 1970.
■ It is mathematical model that describes data as a collection of Relations (sets) and the values of the data is defined by Domains.
■ Focuses on providing better data independence
■ Data are operated upon by means of a relational calculus or relational algebra
■ Advantages– Based on a formal theoretical model and proven in practice– Provides logical view of the data
■ It is implemented by most DBMS in the market, such as DB2.– There are called Relational Database Management
Systems
12 © 2010 IBM Corporation
Information Management
Components of the Relational Data Model■ The relational data model has its own unique terms used to
define its concepts.
Attribute
Relation
Tuple
(ID, int) (NAME, text) (EXT, int) (Active, boolean)
1 John S 54213 Y
2 Michael B 52137 Y
3 Jeremy W 50603 Y
4 Leah E 58963 N
■ Domain (or data type) defines the set of possible values that data can assume
■ Relation is composed by a heading and a body
– Heading: a set of attributes
– Body: a set of tuples
■ Attribute is composed by a name and a domain (type)
■ A tuple is a set of attribute values
Domain
13 © 2010 IBM Corporation
Information Management
Components of a Relational Database
■ Concepts from Relational Data Model can be mapped to their implementation found in a Relational Database
■ Relational databases store data using tables– A table consists of columns and rows– Each column has a specific data type – Each row features a certain value for each column
Column
Table
Row
ID NAME EXTENSION MANAGER
1 John S 54213 Y
2 Susan P 59867 N
4 Andrew J 55935 N
5 Michael B 52137 Y
6 Jeremy W 50603 Y
7 Leah E 58963 N
14 © 2010 IBM Corporation
Information Management
Tables
■ A Table is the counterpart of a Relation from the Relational Data Model
■ A table stores data in rows and columns– Rows are the same as Tuples– Columns are the same as Attributes
■ There can be multiple tables for different types of data to reduce redundant information
– Normalization (more on this later)
■ For example: –You want to store data about a company
• Data about branch offices will be stored in a table• Employee data for specific branches will be stored in its own
table• Product data will be stored in another table
15 © 2010 IBM Corporation
Information Management
Columns
■ Columns are also known as fields
■ Each field contains a specific type of information such as name, extension, position and so on
■ Columns must be designated a specific data type such as DATE, VARCHAR, INTEGERand so on
Specific Information
Data type: Char
16 © 2010 IBM Corporation
Information Management
Domains and Data Types
■ Data Type– This is counterpart of Domains in the Relational Model,
which defines the smallest unit of data that can be stored– Columns always have a data type
Manager Domain = (Y, N)
A domain is the set of all possible values for the specific attribute
Smallest unit of Data (atomic value)
Attribute/Colum
17 © 2010 IBM Corporation
Information Management
Components of a Relational Database
■ Primary Keys– Uniquely identifies each tuple (row) of the relation (table)– Relations must always have a primary key– Although it is recommended, tables in a relational database
are not required to have a Primary Key
■ Examples– Driver's license of a person– ISBN of a book– Serial number of a product
Primary Key
18 © 2010 IBM Corporation
Information Management
Components of a Relational Database
■ Foreign Keys– Attribute in one relation whose values match a primary key
of another relation – Defines the relationship between two tables
ID(PK)
NAMEDEPT_ID
(FK)
1 John S A
2 Michael B B
3 Jeremy W B
4 Leah E C
ID(PK)
DEPT_NAME
A Finance
B Design
C Business
D Development
19 © 2010 IBM Corporation
Information Management
Why Normalization?
■ Why do we need normalization?
■ Consider the following table:–Lists of task an employee is involved in:
■ Example operation: if John moves to a new city, all entries related to John must be updated
– redundancy– anomalies
ID Name Office City Extension Task
1 John S Toronto 54213 Planning
1 John S Toronto 54213 Marketing
1 John S Toronto 54213 Testing
2 Susan P New York 59867 Marketing
3 Jennifer L Chicago 59415 Planning
3 Jennifer L Chicago 59415 Testing
20 © 2010 IBM Corporation
Information Management
Normalization
■ No anomalies, no redundancy
■ No loss of information
Employee ID Task ID
1 1
1 2
1 3
2 2
3 1
3 3
ID Name Office City Extension
1 John S Toronto 54213
2 Susan P New York 59867
3 Jennifer L Chicago 59415
ID Task
1 Planning
2 Marketing
3 Testing
Task TableEmployee Table
Employee Tasks Table
© 2010 IBM Corporation
Information Management
Information Management Ecosystem PartnershipsIBM Canada Lab
Summer/Fall 2010Questions?
E-mail: [email protected]: “DB2 Academic Workshop”