cmsc424, spring 2005 cmsc424: database design instructor: amol deshpande [email protected]

54
CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande [email protected]

Post on 19-Dec-2015

228 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

CMSC424: Database Design

Instructor: Amol Deshpande

[email protected]

Page 2: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Data Modeling

• Goals:• Conceptual representation of the data• “Reality” meets “bits and bytes”• Must make sense, and be usable by other

people

• Today:• Entity-relationship Model• Relational Model

Page 3: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Motivation

• You’ve just been hired by Bank of America as their DBA for their online banking web site.

• You are asked to create a database that monitors:• customers• accounts• loans• branches• transactions, …

• Now what??!!!

Page 4: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

4

Database Design Steps

Three Levels of Modeling

info

Conceptual Data Model

Logical Data Model

Physical Data Model

Conceptual DB design

Logical DB design

Physical DB design

Entity-relationship Model Typically used for conceptual database design

Relational Model Typically used for logical database design

Page 5: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Entity-Relationship Model

• Two key concepts• Entities:

• An object that exists and is distinguishable from other objects

• Examples: Bob Smith, BofA, CMSC424

• Have attributes (people have names and addresses)• Form entity sets with other entities of the same type that

share the same properties• Set of all people, set of all classes

• Entity sets may overlap• Customers and Employees

Page 6: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Entity-Relationship Model

• Two key concepts• Relationships:

• Relate 2 or more entities • E.g. Bob Smith has account at College Park Branch

• Form relationship sets with other relationships of the same type that share the same properties

• Customers have accounts at Branches

• Can have attributes:• has account at may have an attribute start-date

• Can involve more than 2 entities• Employee works at Branch at Job

Page 7: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

7

ER Diagram: Starting Example

• Rectangles: entity sets• Diamonds: relationship sets• Ellipses: attributes

customer has

cust-street

cust-id

cust-name

cust-city

account

balance

numberaccess-date

Page 8: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Rest of the class

• Details of the ER Model• How to represent various types of

constraints/semantic information etc.

• Design issues

• A detailed example

Page 9: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Next: Relationship Cardinalities

• We may know:One customer can only open one accountOROne customer can open multiple accounts

• Representing this is important• Why ?

• Better manipulation of data• Can enforce such a constraint• Remember: If not represented in conceptual

model, the domain knowledge may be lost

Page 10: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Mapping Cardinalities

• Express the number of entities to which another entity can be associated via a relationship set

• Most useful in describing binary relationship sets

Page 11: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Mapping Cardinalities

• One-to-One

• One-to-Many

• Many-to-One

• Many-to-Many

customer has account

customer has account

customer has account

customer has account

Page 12: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Mapping Cardinalities

• Express the number of entities to which another entity can be associated via a relationship set

• Most useful in describing binary relationship sets

• N-ary relationships ?

Page 13: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Next: Types of Attributes

• Simple vs Composite• Single value per attribute ?

• Single-valued vs Multi-valued• E.g. Phone numbers are multi-valued

• Derived• If date-of-birth is present, age can be derived• Can help in avoiding redundancy, enforcing

constraints etc…

Page 14: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Types of Attributes

customer has

cust-street

cust-id

cust-name

cust-city

account

balance

numberaccess-date

Page 15: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Types of Attributes

customer

cust-street

cust-id

cust-name

cust-city

has account

balance

numberaccess-date

phone no.

date-of-birth

age

• multi-valued (double ellipse)• derived (dashed ellipse)

Page 16: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Types of Attributes

customer

cust-street

cust-id

cust-name

cust-city

has account

balance

numberaccess-date

phone no.

date-of-birth

age

month day year

Composite Attribute

Page 17: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Next: Keys

• Key = set of attributes identifying individual entities or relationships

Page 18: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

customer

cust-street

cust-id

cust-name

cust-city phone no.

age

date-of-birthPossible Keys:

{cust-id}

{cust-name, cust-city, cust-street}

{cust-id, age}

cust-name ?? Probably not.

Domain knowledge dependent !!

Entity Keys

Page 19: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Entity Keys

• Superkey• any attribute set that can distinguish entities

• Candidate key• a minimal superkey

• Can’t remove any attribute and preserve key-ness• {cust-id, age} not a superkey • {cust-name, cust-city, cust-street} is

• assuming cust-name is not unique

• Primary key• Candidate key chosen as the key by DBA• Underlined in the ER Diagram

Page 20: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Entity Keys• {cust-id} is a natural primary key

• Typically, SSN forms a good primary key

• Try to use a candidate key that rarely changes• e.g. something involving address

not a great ideacustomer

cust-street

cust-id

cust-name

cust-city phone no.

age

date-of-birth

Page 21: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Relationship Set Keys• What attributes are needed to represent a relationship

completely and uniquely ?• Union of primary keys of the entities involved, and relationship

attributes

• {cust-id, access-date, account number} describes a relationship completely

customer has

cust-id

account

numberaccess-date

Page 22: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Relationship Set Keys

• Is {cust-id, access-date, account number} a candidate key ?• No. Attribute access-date can be removed from this set without

losing key-ness

• In fact, union of primary keys of associated entities is always a superkey

customer has

cust-id

account

numberaccess-date

Page 23: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Relationship Set Keys

• Is {cust-id, account-number} a candidate key ?• Depends

customer has

cust-id

account

numberaccess-date

Page 24: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Relationship Set Keys

• Is {cust-id, account-number} a candidate key ?• Depends

customer has

cust-id

account

numberaccess-date

• If one-to-one relationship, either {cust-id} or {account-number} sufficient• Since a given customer can only have one account, she can only

participate in one relationship

• Ditto account

Page 25: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Relationship Set Keys

• Is {cust-id, account-number} a candidate key ?• Depends

customer has

cust-id

account

numberaccess-date

• If one-to-many relationship (as shown), {account-number} is a candidate key• A given customer can have many accounts, but at most one

account holder per account allowed

Page 26: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Relationship Set Keys

• General rule for binary relationships• one-to-one: primary key of either entity set• one-to-many: primary key of the entity set

on the many side• many-to-many: union of primary keys of

the associate entity sets

• n-ary relationships• More complicated rules

Page 27: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Next: Data Constraints

• Representing semantic data constraints• We already saw constraints on relationship

cardinalities

Page 28: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Participation Constraint

• Given an entity set E, and a relationship R it participates in:• If every entity in E participates in at least

one relationship in R, it is total participation• partial otherwise

Page 29: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

29

Participation Constraint

customer has

cust-street

cust-id

cust-name

cust-city

account

balance

numberaccess-date

Total participation

Page 30: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Cardinality Constraints

customer has

cust-id

account

numberaccess-date

0..* 1..1

How many relationships can an entity participate in ?

Minimum - 0Maximum – no limit

Minimum - 1 Maximum - 1

Page 31: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Next: Recursive Relationships

• Sometimes a relationship associates an entity set to itself

Page 32: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Recursive Relationships

Must be declared with roles

employee works-for

emp-street

emp-id

emp-name

emp-city

manager

worker

Page 33: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Next: Weak Entity Sets

• An entity set without enough attributes to have a primary key

• E.g. Transaction Entity• Attributes:

• transaction-number, transaction-date, transaction-amount, transaction-type

• transaction-number: may not be unique across accounts

Page 34: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Weak Entity Sets

• A weak entity set must be associated with an identifying or owner entity set

• Account is the owner entity set for Transaction

Page 35: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Weak Entity Sets

account

balance

number

Transactionhas

trans-type

trans-number

trans-date

trans-amt

Still need to be able to distinguish between different weak entities associated with the same strong entity

Page 36: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Weak Entity Sets

account

balance

number

Transactionhas

trans-type

trans-number

trans-date

trans-amt

Discriminator: A set of attributes that can be used for that

Page 37: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Weak Entity Sets

• Primary key:• Primary key of the associated strong entity

+ discriminator attribute set• For Transaction:

• {account-number, transaction-number}

Page 38: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Next: Specialization

• Consider entity person:• Attributes: name, street, city

• Further classification:• customer

• Additional attributes: customer-id, credit-rating• employee

• Additional attributes: employee-id, salary

• Note similarities to object-oriented programming

Page 39: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Specialization: Example

Page 40: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Finally: Aggregation• No relationships between relationships

• E.g.: Associate account officers with has account relationship set

customer has account

account officer

employee

?

Page 41: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Finally: Aggregation• Associate an account officer with each account ?

• What if different customers for the same account can have different account officers ?

customer has account

account officer

employee

?

Page 42: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Finally: Aggregation• Solution: Aggregation

customer has account

account officer

employee

Page 43: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

More…

• Read Chapter 2 for:• Specialization/Aggregation details

• Different types of specialization’s etc

• Generalization: opposite of specialization• Lower- and higher-level entities• Attribute inheritance• …

Page 44: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

An Example: Employees can have multiple phones

E/R Data ModelDesign Issue #1: Entity Sets vs. Attributes

Employee Employee PhoneUsesvs(a) (b)

To resolve, determine how phones are used1. Can many employees share a phone?

(If yes, then (b))

2. Can employees have multiple phones?

(if yes, then (b), or (a) with multivalued attributes)

3. Else

(a), perhaps with composite attributes

Employee

phone_no phone_loc no loc

no loc

phone

Page 45: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

An Example: How to model bank loans

E/R Data ModelDesign Issue #2: Entity Sets vs. Relationship Sets

vs

(a)

To resolve, determine how loans are issued1. Can there be more than one customer per loan?

• If yes, then (a). Otherwise, loan info must be replicated for each customer (wasteful, potential update anomalies)

2. Is loan a noun or a verb?• Both, but more of a noun to a bank. (hence (a) probably more appropriate)

Customer

(b)

Loans BranchCustomer LoanBorrows

lno amt ssn name ssn name

lno amt

bname bcity

Page 46: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

An Example: Works_At

E/R Data ModelDesign Issue #3: N-ary vs Binary Relationship Sets

Employee Works_at Branch

Dept

Employee WAE Branch

Dept

WABinary:

Ternary:

WAB

WAD

vs(Joe, Moody, Acct) Works_At

(Joe, w3) WAE

(Moody, w3) WAB

(Acct, w3) WAD

Choose n-arywhen possible!(Avoids redundancy,update anomalies)

Page 47: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Example Design

• We will model a university database• Main entities:

• Professor• Projects• Departments• Graduate students• etc…

Page 48: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

professorarea

name

SSN

rank

projectstart

sponsor

proj-number

budget

deptoffice

name

dept-no

homepage

gradage

name

SSN

degree

Page 49: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

professorarea

name

SSN

rank

projectstart

sponsor

proj-number

budget

deptoffice

name

dept-no

homepage

gradage

name

SSN

degree

Page 50: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

professorarea

name

SSN

rank

projectstart

sponsor

proj-number

budget

deptoffice

name

dept-no

homepage

gradage

name

SSN

degree

PI

Co-PI

RA

Major

ChairSupervises

Mentorad

vise

e

advi

sor

Appt

Time (%)

Page 51: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

professorarea

name

SSN

rank

projectstart

sponsor

proj-number

budget

deptoffice

name

dept-no

homepage

gradage

name

SSN

degree

PI

Co-PI

RA

Major

Chair ApptSupervises

Mentorad

vise

e

advi

sor

Time (%)

Page 52: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

professorarea

name

SSN

rank

projectstart

sponsor

proj-number

budget

deptoffice

name

dept-no

homepage

gradage

name

SSN

degree

PI

Co-PI

RA

Major

Chair ApptSupervises

Mentorad

vise

e

advi

sor

Time (%)

And so on…

Page 53: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Summary

• Entity-relationship Model• Intuitive diagram-based representation of domain

knowledge, data properties etc…• Two key concepts:

• Entities• Relationships

• We also looked at:• Relationship cardinalities• Keys• Participation Constraints• …

Page 54: CMSC424, Spring 2005 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

CMSC424, Spring 2005

Summary

• Details unimportant• Key idea: We can represent many data

properties and constraints conceptually using this

• Read Chapter 2• Assignment will require you to do this

anyway !