lecture #2 october 5 th, 2000 conceptual modeling administration: –hw1 available –details on...

61
Lecture #2 October 5 th , 2000 Conceptual Modeling • Administration: – HW1 available – Details on projects – Exam date – XML comment

Post on 19-Dec-2015

214 views

Category:

Documents


0 download

TRANSCRIPT

Lecture #2October 5th, 2000

Conceptual Modeling• Administration:

– HW1 available– Details on projects– Exam date– XML comment

Building an Application with a Database System

• Requirements modeling (conceptual, pictures)– Decide what entities should be part of the application and

how they should be linked.

• Schema design and implementation– Decide on a set of tables, attributes.– Define the tables in the database system.– Populate database (insert tuples).

• Write application programs using the DBMS– way easier now that the data management is taken care

of.

Database Design

• Why do we need it?– Agree on structure of the database before

deciding on a particular implementation.

• Consider issues such as:– What entities to model– How entities are related– What constraints exist in the domain– How to achieve good designs

Database Design Formalisms

• Object Definition Language (ODL):– Closer in spirit to object-oriented models

• Entity/Relationship model (E/R):– More relational in nature.

• Both can be translated (semi-automatically) to relational schemas (with varying amount of pain).

• ODL to OO-schema: direct transformation (C++ or Smalltalk based system).

Outline

• ODL (rather briefly)

• E/R diagrams

• Some high-level design principles

• Modeling constraints

• Introduction to the relational model

• From E/R & ODL to relations

Object Definition Language

• Is part of ODMG, which also gave us OQL. • Resembles C++ (and Smalltalk).• Basic design paradigm in ODL:

– Model objects and their properties.

• For abstraction purposes:– Group objects into classes.

• What qualifies as a good class?– Objects should have common properties.

ODL Class Declarations

Interface <name> {

attributes: <type> <name>;

relationships <range type> <name>;

methods

}

Method example:

float gpa(in: Student) raises (noGrades)

Arbitrary function can compute the value of gpa, based on a

student object given as input.

ODL Example

Product

Person

Company

category

name

price

namestockprice

name

address ssn

ODL DeclarationsInterface Product { attribute string name; attribute float price; attribute enum Categories {electronics, communications, sports …} category }

Interface Company { attribute string name; attribute float stockprice; }Interface Person { attribute integer ssn; attribute string name; attribute Struct Address {string street, string city} address; }

ODL Example Extended

Product

Person

Company

category

name

price

namestockprice

name

address ssn

buys

worksFor

madeBy

ODL Declarations, ExtendedInterface Product { attribute string name; attribute float price; attribute enum Categories {electronics, communications, sports …} category; relationship <Company> madeBy; }

Interface Person { attribute integer ssn; attribute string name; attribute Struct Address {string street, string city} address; relationship set <Product> buys; relationship set <Company> worksFor;}

ODL Example, Extended Again

Product

Person

Company

category

name

price

namestockprice

name

address ssn

buys

worksFor

madeBy

employs

makes

ODL Declarations, Extended Again

Interface Company { attribute string name; attribute float stockprice;

relationship set <Product> makes inverse Product::madeBy;

relationship set <Person> employs inverse Person::worksFor; }

Types in ODL

Basic types: Atomic types (e.g., string, integer, …) Interface types (e.g., Person, Product, Company)

Constructors:

Set: (1, 5, 6) Bag: (1, 1, 5, 6, 6 ) List: (1, 5, 6, 1, 6 ) Array: Integer[17] Struct: {string street, string city, integer zipcode}

Allowable Types in ODLFor attributes: start with atomic or struct, and apply a collection type.

OK: string, set of integer, bag of Address. Not OK: Product, set of set of integer.

For relationships: start with interface type and apply a collection type.

OK: Product, set of Product, list of Person.

Not OK: struct {pname Product, cname Company} set of bag of Product integer

Entity / Relationship Diagrams

Objects entitiesClasses entity sets

Attributes are like in ODL.

Relationships: like in ODL except

- not associated with classes (I.e., first class citizens)

- not necessarily binary

Product

address

buys

address name ssn

Person

buys

makes

employs

CompanyProduct

name category

stockprice

name

price

Multi-way Relationships

Purchase

How do we model a purchase relationship between buyers,products and stores?

Product

Person

Store

Roles in Relationships

Purchase

What if we need an entity set twice in one relationship?

Product

Person

Store

salesperson buyer

Roles in Relationships

Purchase

Product

Person

Store

salesperson buyer

Note the multiplicity of the relationships: we cannot express all possibilities

Attributes on Relationships

Purchase

Product

Person

Store

date

Design Principles

PurchaseProduct Person

What’s wrong?

President PersonCountry

Moral: be faithful!

What’s Wrong?

Purchase

Product

Store

date

personweather

Moral: don’t talk too much.

What’s Wrong?

Purchase

Product

Person

Store

dateDates

Moral: don’t complicate life more than it already is.

Do we really need 3-way relationships?

Purchase

Person

Store

Product

StoreOf

ProductOf

BuyerOf

Moral: Find a nice way to say things.

Modeling Subclasses

The world is not flat!

Some objects in a class may have properties not shared by other members:

Products

Software products

Educational products

So --- we define subclasses (in ODL and in E/R).

Subclasses in ODL

Interface SoftwareProduct: Product{

attribute Set<string> platform; attribute Set<integer> requiredMemory;

}

Interface EducationalProduct: Product{

attribute Struct Interval {integer begin, integer end} ageGroup; attribute string topic}

The two classes also inherit all the properties of Product.

Product

name category

price

isa isa

Educational ProductSoftware Product

Age Groupplatforms

Subclasses in E/R Diagrams

Multiple Inheritance

Product

Educational Product

Educ-softwareProduct

Software Product

ageGrouptopic

Platformsrequired memory

Educational-method

How do we resolve conflicts?

Product

Educational Product

Educ-softwareProduct

Software Product

ageGrouptopic

Platformsrequired memory

Educational-method

Rating(ATA)

Rating(ASA) Rating?

Product

name category

price

isa isa

Educational ProductSoftware Product

Age Groupplatforms

In ODL: Every object belongs to a single class

In E/R: An entity may be spread out in multiple sets.

Modeling ConstraintsExtracting constraints is what modeling is all about. But how dowe express them?

Examples:

Keys: social security number uniquely identifies a person.

Single-value constraints: a person can have only one father.

Referential integrity constraints: if you work for a company, it must exist in the database.

Domain constraints: peoples’ ages are between 0 and 150.

Why are these constraints useful in the implementation?

KeysA set of attributes that uniquely identify an object or entity:

Person: social security number name name + address name + address + age

Perfect keys are often hard to find, so organizations usuallyinvent something.

An object may have multiple keys:

employee number, social-security number

Keys in ODL

Interface Person (key ssn) { properties… }

Defining multiple keys:

(key ssn employeID (name address age))

Keys in E/R Diagrams

address name ssn

Person

Product

name category

price

No formal way to specify multiple keys in E/R diagrams

Single Value ConstraintsAn entity (or object) may have at most one value for a given attribute or relationship.

Person: name, social-security numberCompany: stock price

How do we do this in ODL?

In E/R, every attribute has at most one value. Arrows tell us about multiplicity of relations.

If we have a single-valued constraint, we can either: 1. Require that the value exist (see referential integrity shortly) 2. Allow null values.

Referential Integrity ConstraintsA relationship has one value and the value must exist.

Example: Product madeBy Company: company must exist.

How do we enforce referential integrity constraints? (otherwise, we get dangling pointers)

- forbid to delete a reference object, or - delete the objects that reference an object we’re deleting.

CompanyProduct makes

In E/R diagrams:

Weak Entity SetsEntity sets are weak when their key attributes come from otherclasses to which they are related.

This happens if:

- part-of hierarchies - splitting n-ary relations to binary.

UniversityTeam affiliation

numbersport name

The Relational Data ModelDatabase Model(ODL, E/R)

Relational Schema

Physicalstorage

ODL definitions

Diagrams (E/R)

Tables: row names: attributes rows: tuples

Complexfile organizationand index structures.

Terminology

Name Price Category Manufacturer

gizmo $19.99 gadgets GizmoWorks

Power gizmo $29.99 gadgets GizmoWorks

SingleTouch $149.99 photography Canon

MultiTouch $203.99 household Hitachi

tuples

Attribute names

What can’t you say in the relational model?

More Terminology

Every attribute has an atomic type.

Relation Schema: relation name + attribute names + attribute types

Relation instance: a set of tuples. Only one copy of any tuple!

Database Schema: a set of relation schemas.

Database instance: a relation instance for every relation in the schema.

More on Tuples

Formally, a mapping from attribute names to (correctly typed) values:

name gizmo price $19.99 category gadgets manufacturer GizmoWorks

Sometimes we refer to a tuple by itself: (note order of attributes)

(gizmo, $19.99, gadgets, GizmoWorks) or

Product (gizmo, $19.99, gadgets, GizmoWorks).

Updates

The database maintains a current database state.

Updates to the data:

1) add a tuple 2) delete a tuple 3) modify an attribute in a tuple

Updates to the data happen very frequently.

Updates to the schema: relatively rare. Rather painful. Why?

From ODL to Relational Schema

Start simple: a class definition has only single valued attributes

Interface product{ float price; string name; Enum {telephony, gadgets, books} category}

Class becomes a relation, and every attribute becomes a relation attribute:

Name Price Category

Gizmo $19.99 gadgets

Product

Adding Non atomic Attributes

Name Currency Amount Category

Gizmo US$ 19.99 gadgets

Power Gizmo US$ 29.99 gadgets

Price is a record: {string currency, float amount}

Product

Set Attributes

Name SSN Phone Number

Fred 123-321-99 (201) 555-1234

Fred 123-321-99 (206) 572-4312Joe 909-438-44 (908) 464-0028Joe 909-438-44 (212) 555-4000

One option: have a tuple for every value in the set:

Disadvantages?

Modeling Collection Types

How can we model bags?

Lists?

Fixed length arrays?

The problem becomes even more significant if a class has several attributes that are set types? Question: how bad is the redundancy for n set type attributes, each with possibly up to m values?

Questions:

Modeling RelationshipsInterface Product { attribute string name;

attribute float price; relationship <Company> madeBy; }Interface Company {

attribute string name; attribute float stock-price; attribute string address;

} How do we incorporate the relationship madeBy into the schema?

Option #1 Name Price made-by-name made-by-stock-price made-by-address

Gizmo $19.99 gizmoWorks 0.0001$ Montezuma

What’s wrong?

HintInterface Product { attribute string name;

attribute float price; relationship <Company> madeBy; }Interface Company {

attribute string name; attribute float stock-price; attribute string address; relationship set <Product> makes; }

Better Solution

Name Price made-by-name

Gizmo $19.99 gizmoWorks

Product relation: (assume: name is a key for company)

Company relation:

Name Stock Price Address

Gizmo $0.00001 Montezuma

Additional Issues

1. What if there is no key?

2. What if the relationship is multi-valued?

3. How do we represent a relationship and its inverse?

From E/R Diagrams to Relational Schema

Easier than ODL

- relationships are already independent entities

- only atomic types exist in the E/R model.

Entity sets relations

Relationships relations

Special care for weak entity sets.

address name ssn

Person

buys

makes

employs

CompanyProduct

name category

Stock price

name

price

Entity Sets to Relations

Product

name category

price

Product:

Name Category Price

gizmo gadgets $19.99

Relationships to Relations

makes CompanyProduct

name category

Stock price

name

Relation Makes (watch out for attribute name conflicts)

Product-name Product-Category Company-name Starting-year

gizmo gadgets gizmoWorks 1963

Start Year

Handling Weak Entity Sets

UniversityTeam affiliation

numbersport name

Relation Team:

Sport Number University-name

mud wrestling 15 Montezuma State U.

- need all the attributes that contribute to the key of Team - don’t need a separate relation for Affiliation.

Modeling Subclass Structure

Product

Educational Product

Educ-softwareProduct

Software Product

ageGrouptopic

Platformsrequired memory

Educational-method

Option #1: the ODL Approach

4 tables: each object can only belong to a single class

Product(name, price, category, manufacturer)

EducationalProduct( name, price, category, manufacturer, ageGroup, topic)

SoftwareProduct( name, price, category, manufacturer, platforms, requiredMemory)

EducationalSoftwareProduct( name, price, category, manufacturer, ageGroup, topic, platforms, requiredMemory)

Option #2: the E/R Approach

Product(name, price, category, manufacturer)

EducationalProduct( name, ageGroup, topic)

SoftwareProduct( name, platforms, requiredMemory)

No need for a relation EducationalSoftwareProduct

Unless, it has a specialized attribute: EducationalSoftwareProduct(name, educational-method)

Option #3: The Null Value Approach

Have one table:

Product ( name, price, manufacturer, age-group, topic, platforms, required-memory, educational-method)

Some values in the table will be NULL, meaning that the attribute not make sense for the specific product.

How many more meanings will NULL have??