the entity-relationship model (chapter 2)
DESCRIPTION
The Entity-Relationship Model (Chapter 2). 1. Main Phases of Database Design. 2. Overview of Database Design. Conceptual design : ( ER Model is used at this stage.) What are the entities and relationships in the enterprise? - PowerPoint PPT PresentationTRANSCRIPT
CSC 411/511: DBMS Design
1
Dr. Nan Wang CSC411_L2_ER Model1
The Entity-Relationship Model
(Chapter 2)
CSC411_L2_ER ModelDr. Nan Wang22
Main Phases of Database Design
CSC411_L2_ER ModelDr. Nan Wang3
Overview of Database Design
• Conceptual design: (ER Model is used at this stage.) – What are the entities and relationships in the enterprise?– What information about these entities and relationships
should we store in the database?– What are the integrity constraints or business rules that
hold? – A database `schema’ in the ER Model can be represented
pictorially (ER diagrams).– Can map an ER diagram into a relational schema.
CSC411_L2_ER ModelDr. Nan Wang4
ER Model Basics
• Entity-Relationship (ER) data model – A database can be modeled as a collection of entities and
relationship among entities– Widely used to develop an initial database design
• Entity: Real-world object distinguishable from other objects.
– An entity is described (in DB) using a set of attributes.
Employees
ssnname
lot
CSC411_L2_ER ModelDr. Nan Wang55
ER Model Basics (Contd.)
• Entity Set: A collection of similar entities. E.g., all employees. – All entities in an entity set have the same set of attributes.
– Each entity set has a key. • Key: a minimal set of attributes whose values uniquely identify
an entity in the set
– Each attribute has a domain.• A domain defines the possible values of each attributes
• E.g., attribute name might be the set of 20-character string
Employees
ssnname
lot
CSC411_L2_ER ModelDr. Nan Wang66
CSC411_L2_ER ModelDr. Nan Wang7
ER Model Basics (Contd.)
• Relationship: Association among two or more entities. – E.g., James works in Pharmacy department.– Descriptive attribute, since– Record the information about the relationship, rather than
about one of the participating entities
did
dname
budget
Departments
lot
sincename
Works_InEmployees
ssn
CSC411_L2_ER ModelDr. Nan Wang88
ER Model Basics (Contd.)
• Relationship Set: Collection of similar relationships.
• Same entity set could participate in different relationship sets
lot
name dnamebudgetdid
sincename dname
budgetdid
since
Manages
since
DepartmentsEmployees
ssn
Works_In
CSC411_L2_ER ModelDr. Nan Wang99
ER Model Basics (Contd.)
• Relationship Set: Collection of similar relationships.
• A relationship might involve two entities in the same entity set
– Two entities with different “roles” in same set.
subor-dinate
super-visor
Reports_To
lot
name
Employees
ssn
CSC411_L2_ER ModelDr. Nan Wang10
Key Constraints
• Consider Works_In: An employee can work in many departments; a dept can have many employees.
Many-to-Many1-to-1 1-to Many Many-to-1
did
dname
budget
Departments
lot
sincename
Works_InEmployees
ssn
CSC411_L2_ER ModelDr. Nan Wang11
Key Constraints
• In contrast, each dept has at most one manager, according to the key constraint on Manages.
• The constraint that each department has at most one manager is an example of a key constraint, and it implies that each Departments entity appear in at most one Manages relationship.
• Arrow states that given a Department entity, we can uniquely determine the manages relationship in which it appears.
Many-to-Many1-to-1 1-to Many Many-to-1
dname
budgetdid
since
lot
name
ssn
ManagesEmployees Departments
CSC411_L2_ER ModelDr. Nan Wang12
Key Constraints
• Each department has only one manager
• And each employee can manage only one department.
12
Many-to-Many1-to-1 1-to Many Many-to-1
dname
budgetdid
since
lot
name
ssn
ManagesEmployees Departments
CSC411_L2_ER ModelDr. Nan Wang13
Participation Constraints
• Does every department have a manager?– If so, this is a participation constraint: the participation of
Departments in Manages is said to be total (vs. partial).• If the participation of an entity set in a relationship set is total,
the two are connected by a thick line• Every Departments entity must appear in an instance of the
Manages relationship.
lot
name dnamebudgetdid
sincename dname
budgetdid
since
Manages
since
DepartmentsEmployees
ssn
Works_In
CSC411_L2_ER ModelDr. Nan Wang14
exercise
14
CSC411_L2_ER ModelDr. Nan Wang1515
CSC411_L2_ER ModelDr. Nan Wang1616
CSC411_L2_ER ModelDr. Nan Wang1717
CSC411_L2_ER ModelDr. Nan Wang1818
CSC411_L2_ER ModelDr. Nan Wang19
Homework
• Page 53, exercise 2.4
• Submit your homework (softcopy, PDF or DOC format) at [email protected] or hardcopy at class)
• Due date: Next Thursday (Sept, 5th)
19
CSC411_L2_ER ModelDr. Nan Wang20
Weak Entities
• Employees can purchase insurance to cover their dependents.
• Only dependents’ SSN is not needed• If an employee quits, any policy owned by the
employee is terminated and all relevant policy and dependent info will be deleted from the database.
20
lot
name
agepname
DependentsEmployees
ssn
Policy
cost
ssn
CSC411_L2_ER ModelDr. Nan Wang21
Weak Entities
• A weak entity can be identified uniquely only by considering the primary key of another (owner) entity.– (One of two or more candidate keys is designated as the primary key)– (Partial key: the set of attributes of a weak entity set that uniquely
identify a weak entity for a given owner entity)– Owner entity set and weak entity set must participate in a one-to-many
relationship set (one owner, many weak entities).– Weak entity set must have total participation in this identifying
relationship set.
lot
name
agepname
DependentsEmployees
ssn
Policy
cost
CSC411_L2_ER ModelDr. Nan Wang2222
ISA (`is a’) Hierarchies
• As in C++, or other PLs, attributes are inherited.• If we declare A ISA B, every A entity is also considered to
be a B entity.• Employees is specialized into subclasses• Hourly-Emp and Contract_EMps are generalized by
Employees
namessn
Employees
lot
Contract_Emps
hourly_wagesISA
Hourly_Emps
contractid
hours_worked
CSC411_L2_ER ModelDr. Nan Wang23
ISA (`is a’) Hierarchies (Contd.)
• Overlap constraints: Can Joe be an Hourly_Emps as well as a Contract_Emps entity? (Allowed/disallowed)
• Covering constraints:
– Does every Employees entity also have to be an Hourly_Emps or a Contract_Emps entity? (Yes/no).
Intuitively, No.
• Reasons for using ISA: – To add descriptive attributes specific to a subclass.
– To identify entities that participate in a relationship.
CSC411_L2_ER ModelDr. Nan Wang24
Aggregation
• Used to model a relationship between a collection of entities and relationships.
• Aggregation allows us to treat a relationship set as an entity set for purposes of participation in (other) relationships.
budgetdidpid
started_on
pbudgetdname
until
DepartmentsProjects Sponsors
Employees
Monitors
lotname
ssn
since
CSC411_L2_ER ModelDr. Nan Wang2525
Aggregation (Contd.)
• Aggregation vs. ternary relationship: why not make Sponsors a ternary relationship? – There are really two distinct
relationships (monitors and sponsors), each with attribute of its own.
• Monitors is a distinct relationship, with a descriptive attribute.
– Also, can say that each sponsorship is monitored by at most one employee.
– How to set key constraint in aggregation and ternary relationship?
budgetdidpid
started_on
pbudgetdname
until
DepartmentsProjects Sponsors
Employees
Monitors
lotname
ssn
since
CSC411_L2_ER ModelDr. Nan Wang26
Conceptual Design Using the ER Model
• Design choices:– Should a concept be modeled as an entity or an attribute?– Should a concept be modeled as an entity or a relationship?– Identifying relationships: Binary or ternary? Aggregation?
• Constraints in the ER Model:– A lot of data semantics can (and should) be captured.– But some constraints cannot be captured in ER diagrams.
CSC411_L2_ER ModelDr. Nan Wang27
Entity vs. Attribute
• Should address be an attribute of Employees or an entity (connected to Employees by a relationship)?
Employees
ssnname
address
CSC411_L2_ER ModelDr. Nan Wang28
Entity vs. Attribute
• Depends upon the use we want to make of address information, and the semantics of the data:
• If we have several addresses per employee, address must be an entity (since attributes cannot be set-valued).
• If the structure (city, street, etc.) is important, e.g., we want to retrieve employees in a given city, address must be modeled as an entity (since attribute values are atomic).
Employees
ssnname
address
CSC411_L2_ER ModelDr. Nan Wang29
Entity vs. Attribute
street
citya#
name
ssn
Lives_inEmployees addresses
•If we have several addresses per employee, address must be an entity (since attributes cannot be set-valued). •If the structure (city, street, etc.) is important, e.g., we want to retrieve employees in a given city, address must be modeled as an entity (since attribute values are atomic).
CSC411_L2_ER ModelDr. Nan Wang30
Entity vs. Attribute (Contd.)
name
Employees
ssn lot
Works_In4
from todname
budgetdid
Departments
diddname
budget
Departments
lot
sincename
Works_InEmployees
ssn
CSC411_L2_ER ModelDr. Nan Wang31
Entity vs. Attribute (Contd.)
• Works_In4 does not allow an employee to work in a department for two or more periods.
• Similar to the problem of wanting to record several addresses for an employee:
– We want to record several values of the descriptive attributes for each instance of this relationship.
– Accomplished by introducing new entity set, Duration.
dnamebudgetdid
name
Departments
ssn lot
Employees Works_In4
Durationfrom to
name
Employees
ssn lot
Works_In4
from todname
budgetdid
Departments
CSC411_L2_ER ModelDr. Nan Wang32
Entity vs. Relationship
• First ER diagram OK if a manager gets a separate discretionary budget for each dept.
• What if a manager gets a discretionary budget (sum) that covers all managed depts?
– Redundancy: dbudget stored for each dept managed by manager.
– Misleading: Suggests that dbudget associated with department-mgr combination.
Manages2
name dnamebudgetdid
Employees Departments
ssn lot
dbudgetsince
dnamebudgetdid
DepartmentsManages2
Employees
namessn lot
since
Managers dbudget
ISA
This fixes theproblem!
CSC411_L2_ER ModelDr. Nan Wang33
Binary vs. Ternary Relationships (1)
• The ER diagram shows– An employee can own several policies (dental, health)– Each policy can be owned by several employees– Each dependent can be covered by several policies
agepname
DependentsCovers
name
Employees
ssn lot
Policies
policyid cost
lot
name
agepname
DependentsEmployees
ssn
Policy
cost
CSC411_L2_ER ModelDr. Nan Wang34
Binary vs. Ternary Relationships (1)
• If each policy is owned by just 1 employee, and each dependent is tied to the covering policy, this ER diagram is inaccurate. Why?– Key constraint on Policies with respect to Covers implies that a policy can
cover only one dependent.
agepname
DependentsCovers
name
Employees
ssn lot
Policies
policyid cost
CSC411_L2_ER ModelDr. Nan Wang35
Binary vs. Ternary Relationships (2)
• Additional constraints in the following ER diagram– A policy cannot be owned jointly by two or more employees (key constraint)– Every policy must be owned by some employees (total participation
constraint)– Dependents is a week entity set
• (uniquely identified by pname and policyid)
Beneficiary
agepname
Dependents
policyid cost
Policies
Purchaser
name
Employees
ssn lot
Better design
CSC411_L2_ER ModelDr. Nan Wang36
Binary vs. Ternary Relationships (3)
• Previous example illustrated a case when two binary relationships were better than one ternary relationship.
• An example of ternary relationship – a ternary relation Contracts relates entity sets Parts,
Departments and Suppliers, and has descriptive attribute qty.
– No combination of binary relationships is an adequate substitute (without aggregation)
• S “can-supply” P, D “needs” P, and D “deals-with” S does not imply that D has agreed to buy P from S.
• How do we record qty?
CSC411_L2_ER ModelDr. Nan Wang37
Summary of Conceptual Design
• Conceptual design follows requirements analysis, – Yields a high-level description of data to be stored
• ER model popular for conceptual design– Constructs are expressive, close to the way people think
about their applications.• Basic constructs
– entities, relationships, and attributes (of entities and relationships).
• Some additional constructs – weak entities, ISA hierarchies, and aggregation.
• Note: There are many variations on ER model.
CSC411_L2_ER ModelDr. Nan Wang38
Summary of ER (Contd.)
• Several kinds of integrity constraints can be expressed in the ER model – key constraints, participation constraints, and
overlap/covering constraints for ISA hierarchies. – Some foreign key constraints are also implicit in the
definition of a relationship set.• Some constraints (notably, functional
dependencies) cannot be expressed in the ER model.
• Constraints play an important role in determining the best database design for an enterprise.
CSC411_L2_ER ModelDr. Nan Wang39
Summary of ER (Contd.)
• ER design is subjective. – There are often many ways to model a given scenario!
Analyzing alternatives can be tricky, especially for a large enterprise.
– Common choices include:• Entity vs. attribute, entity vs. relationship, binary or n-ary
relationship, whether or not to use ISA hierarchies, and whether or not to use aggregation.
• Ensuring good database design – resulting relational schema should be analyzed and refined
further.
– FD (functional dependency) information and normalization techniques are especially useful.
CSC411_L2_ER ModelDr. Nan Wang40
CSC411_L2_ER ModelDr. Nan Wang41
CSC411_L2_ER ModelDr. Nan Wang4242
CSC 411/511: DBMS Design
43
Dr. Nan Wang CSC411_L2_ER Model43
Questions?