
CM20145 Database Design

Dr Alwyn Barry, Dr Joanna Bryson

Lecture Plan

1. Basic Concepts

2. Data, Information & Knowledge

3. Data Models (The E-R Model)

4. The Relational Algebra

5. Introduction to SQL

6. Further SQL (Joins, RA Equivalences)

7. Database Design

8. Further DB Design – Normalisation

9. Architectures and Implementations

10. Integrity and Security

11. Ethics and Professional Conduct

12. Legal Issues

13. Transactions

14. Recovery

15. Concurrency Control

16. Storage and File Structure

17. Indexing and Hashing

18. Query Processing & Optimisation

January… Review Session

'Reading' (PROJECT) Week

Overview

Database Design and Software Engineering.

Example from real life.

Views and Users.

Redundancy and Decomposition.

Loosely based on A. Keller, 2002, USCU

What Is Relational Algebra For?

Fundamental Operators: Restriction, Projection (unary); Cartesian Product, Union, Difference (binary).

Useful Operators (not primitive): Join, Intersection, Division.

File Problems DBs Should Solve

Data redundancy & inconsistency: Duplication of information; Different formats.

Data isolation: Different formats; Different locations.

Limited access: Need new program for each query.

Cannot support "Business Rules": Consistency; Validity.

Atomicity: Failures leave data in inconsistent state; Two users cannot update the database at once.

Security: Needs bespoke security in each application.

Database Design

DB design usually covers formal methods for trying to ensure databases meet all these goals.

No formal system is perfect:

Formal methods take a long time, and may be computationally intractable.

Human error: mistakes in proofs.

Formally correct databases can still have errors (there will be examples).

Design is an art, but science can help.

E-R Model – System Design

[Figure: phases of database design. The problem domain feeds Requirements Collection & Analysis, which yields Data Requirements and Functional Requirements. Data side: Conceptual Design, Conceptual Schema, Logical Design (data model mapping), Logical Schema, Physical Design, Internal Schema. Functional side: Functional Analysis, High-level Transaction Spec., Application Program Design, Transaction Implementation, Application Programs. The stages up to the conceptual schema are DBMS independent; the later stages are DBMS dependent.]

From Elmasri & Navathe, 2003, pg 51

Software Development Iterates

http://www.extremeprogramming.org/

Waterfall Feigning Iteration

This is not agile SE.

People don’t want to know, but iteration happens.

The question is how well you cope.

Software Development Iterates

Models change as programmers understand the problem better.

Requirements change as users understand possibilities better.

More resources become available.

So:

Save and maintain all modelling and planning tools.

Interact with users frequently.

Learn rules to recognize common failures.

Where Do Models Come From?

Demand – What do the users want to see?

Data – Don’t throw anything away.

Design – Never store anything twice. Efficiency. Integrity. Clarity. Security.

Automating Trading Operations

Data from: trading cards, market prices, underlying values.

Demand from:

Knowing what is owned for determining risk.

Knowing what individuals did for determining pay.

Entities and attributes

Trades: Instrument, trade, price, 2 traders, 2 clearing firms.

Traders: Positions, trades.

Trading cards: Trades, trader, number, time.

Instruments: Daily values, volatilities, expiration dates.

(draw on board)
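The entity list above could be sketched as record types. This is only an illustrative Python sketch, not the model drawn in the lecture: every field name and type below is an assumption (including `quantity`, `buyer` and `seller`), and only the attribute groupings come from the slide. A trader's position is derived from trades rather than stored, which is one possible reading of "never store anything twice".

```python
from dataclasses import dataclass, field
from datetime import date, datetime


@dataclass
class Instrument:
    # Hypothetical fields for: daily values, volatilities, expiration dates.
    symbol: str
    expiration: date
    daily_values: dict = field(default_factory=dict)   # date -> value
    volatilities: dict = field(default_factory=dict)   # date -> volatility


@dataclass
class Trade:
    # A trade names an instrument, a price, two traders and two clearing firms.
    # The quantity field is an assumption; the slide does not list it.
    instrument: Instrument
    price: float
    quantity: int
    buyer: str
    seller: str
    buyer_clearing_firm: str
    seller_clearing_firm: str


@dataclass
class TradingCard:
    # A card records the trades one trader wrote down, with a number and a time.
    number: int
    trader: str
    time: datetime
    trades: list = field(default_factory=list)


def position(trader: str, trades: list) -> dict:
    """Derive a trader's net quantity per instrument symbol from the trades."""
    totals: dict = {}
    for t in trades:
        sym = t.instrument.symbol
        if t.buyer == trader:
            totals[sym] = totals.get(sym, 0) + t.quantity
        if t.seller == trader:
            totals[sym] = totals.get(sym, 0) - t.quantity
    return totals


gold = Instrument("GOLD", date(2003, 12, 19))
trades = [
    Trade(gold, 400.0, 5, "ann", "bob", "Firm1", "Firm2"),
    Trade(gold, 401.0, 2, "bob", "ann", "Firm2", "Firm1"),
]
print(position("ann", trades))
```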

New Requirement: Reconciling

Business process – dual entry:

Operations: primary concerns are risk, execution on trading floor, relations with individual traders.

Clearing: primary concerns are accounting, banks, law, relations with other clearing firms.

The same trading cards resulted in different trade quantities.

The same trade has different dates!

Models will change!

Design – Find out models are inefficient or clumsy to maintain.

Data – Discover new categories, salient values.

Demand – Users see new potential.

Views: Why they’re important.

The same data may be seen by different users in different ways.

Shorthand for frequent joins, formulas – may be more efficient.

Automate / enforce security – make access to tables and views depend on user’s function.

Keep reports logically independent from underlying representation – protect the users!
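A view as shorthand for a frequent join, and as a way to hide columns from a class of users, might look like the following. This is a minimal sketch using Python's standard `sqlite3` module; the `trader` and `trade` tables, their columns, and the `trader_position` view are all made up for illustration. SQLite has no GRANT statement, so the sketch shows only what the view exposes; in a server DBMS access rights would be granted on the view rather than on the base tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trader (trader_id INTEGER PRIMARY KEY, name TEXT, salary REAL);
    CREATE TABLE trade  (trade_id INTEGER PRIMARY KEY,
                         trader_id INTEGER REFERENCES trader(trader_id),
                         instrument TEXT, quantity INTEGER);
    INSERT INTO trader VALUES (1, 'Ann', 90000), (2, 'Bob', 80000);
    INSERT INTO trade  VALUES (10, 1, 'GOLD', 5), (11, 1, 'OIL', 3), (12, 2, 'GOLD', 2);

    -- The view packages a frequent join and does not expose the salary column.
    CREATE VIEW trader_position AS
        SELECT t.name AS trader, d.instrument, SUM(d.quantity) AS quantity
        FROM trader t JOIN trade d ON d.trader_id = t.trader_id
        GROUP BY t.name, d.instrument;
""")

# A report queries the view, not the underlying tables.
cursor = conn.execute(
    "SELECT trader, instrument, quantity FROM trader_position "
    "ORDER BY trader, instrument"
)
columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()
print(columns)
print(rows)
```

If the base tables are later restructured, only the view definition has to change; the report query stays the same.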

Why Limit / Protect Users?

Databases contain all data for a company.

Limits on access eliminate suspects for errors, crime.

Information overload.

Smart, authoritative users still need to find things quickly.

Some users really are naïve.

Programmers are users.

Pitfalls in Relational DB Design

A bad design may lead to: redundant information, difficulty in representing certain information, or difficulty in checking integrity constraints.

Design goals:

Avoid redundant data.

Ensure that relationships among attributes are represented.

Facilitate the checking of updates for violation of integrity constraints.

©Silberschatz, Korth and Sudarshan

Modifications & additions by S Bird, Melbourne

Example of Bad Design

Consider the relation schema:

Lending-schema = (branch-name, branch-city, assets, customer-name, loan-number, amount)

Redundant information: Data for branch-name, branch-city, assets are repeated for each loan that a branch makes. This wastes space and complicates updates, introducing the possibility of inconsistent assets values.

Difficulty representing certain information: Cannot store information about a branch if no loans exist. Null values can be used, but they are difficult to handle.
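The update problem can be made concrete with a few rows in the shape of Lending-schema. The sketch below is illustrative only: the sample rows and the helper `assets_by_branch` are invented, and attribute names use underscores instead of hyphens so they are valid Python comments and identifiers.

```python
# Each tuple follows Lending-schema:
# (branch_name, branch_city, assets, customer_name, loan_number, amount)
# The rows are made-up sample data.
lending = [
    ("Downtown", "Brooklyn", 9000000, "Jones", "L-17", 1000),
    ("Downtown", "Brooklyn", 9000000, "Williams", "L-23", 2000),
    ("Perryridge", "Horseneck", 1700000, "Hayes", "L-15", 1500),
]


def assets_by_branch(rows):
    """Collect every assets value recorded for each branch."""
    seen = {}
    for branch, _city, assets, *_rest in rows:
        seen.setdefault(branch, set()).add(assets)
    return seen


# Update anomaly: Downtown's assets are changed in only one of its two rows.
name, city, _old_assets, customer, loan, amount = lending[0]
lending[0] = (name, city, 9500000, customer, loan, amount)

# Any branch with more than one recorded assets value is now inconsistent.
inconsistent = {
    branch: values
    for branch, values in assets_by_branch(lending).items()
    if len(values) > 1
}
print(inconsistent)
```

Because the assets value is stored once per loan, nothing in the schema stops the two Downtown rows from disagreeing.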

Solution: Decomposition

Break up redundant tables into multiple tables; this operation is called decomposition. E.g. Lending-schema = (branch-name, branch-city, assets, customer-name, loan-number, amount) decomposes into:

Branch-schema = (branch-name, branch-city, assets)

Loan-info-schema = (customer-name, loan-number, branch-name, amount)
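As a rough illustration, the decomposition can be carried out on sample data by projecting onto each new schema and then joining the pieces back on the shared attribute branch-name. The helpers `project` and `natural_join` and the sample rows are assumptions written for this sketch, with relations held as Python sets of tuples.

```python
LENDING = ("branch_name", "branch_city", "assets",
           "customer_name", "loan_number", "amount")
BRANCH = ("branch_name", "branch_city", "assets")
LOAN_INFO = ("customer_name", "loan_number", "branch_name", "amount")


def project(rows, schema, attrs):
    """Projection: keep only attrs, dropping duplicate tuples."""
    idx = [schema.index(a) for a in attrs]
    return {tuple(row[i] for i in idx) for row in rows}


def natural_join(r1, s1, r2, s2):
    """Natural join of r1(s1) and r2(s2) on their common attributes."""
    common = [a for a in s1 if a in s2]
    extra_attrs = tuple(a for a in s2 if a not in s1)
    out = set()
    for t1 in r1:
        for t2 in r2:
            if all(t1[s1.index(a)] == t2[s2.index(a)] for a in common):
                out.add(t1 + tuple(t2[s2.index(a)] for a in extra_attrs))
    return out, s1 + extra_attrs


# Made-up sample rows in Lending-schema order.
lending = {
    ("Downtown", "Brooklyn", 9000000, "Jones", "L-17", 1000),
    ("Downtown", "Brooklyn", 9000000, "Williams", "L-23", 2000),
    ("Perryridge", "Horseneck", 1700000, "Hayes", "L-15", 1500),
}

branch = project(lending, LENDING, BRANCH)          # one row per branch
loan_info = project(lending, LENDING, LOAN_INFO)    # one row per loan

joined, joined_schema = natural_join(branch, BRANCH, loan_info, LOAN_INFO)
rejoined = project(joined, joined_schema, LENDING)  # put columns in LENDING order

print(len(branch), rejoined == lending)
```

Each branch's city and assets are now stored once, and for this sample the join of the two projections gives back exactly the original rows.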

Lossless-Join Decomposition

Want to ensure that the original data is recoverable.

1. All attributes of the original schema (R) must appear in the decomposition (R1, R2), i.e. $R = R_1 \cup R_2$.

2. Decomposition must be a lossless-join decomposition.

Definition: (R1, R2) is a lossless-join decomposition of R if, for all possible relations r(R), $r = \Pi_{R_1}(r) \bowtie \Pi_{R_2}(r)$.

Bad Decomposition Example

A Non Lossless-Join Decomposition: R = (A, B), R1 = (A), R2 = (B)

r:

A  B
α  1
α  2
β  1

$\Pi_A(r)$:

A
α
β

$\Pi_B(r)$:

B
1
2

$\Pi_A(r) \bowtie \Pi_B(r)$:

A  B
α  1
α  2
β  1
β  2

Thus, r is different to $\Pi_A(r) \bowtie \Pi_B(r)$.

So (R1, R2) = ((A), (B)) is not a lossless-join decomposition of R.
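The failure can be checked mechanically. The sketch below assumes the three-tuple relation r = {(α, 1), (α, 2), (β, 1)} over (A, B); since R1 = (A) and R2 = (B) share no attributes, their natural join is simply the Cartesian product of the two projections.

```python
from itertools import product

# r(A, B), with the A values taken to be α and β
r = {("α", 1), ("α", 2), ("β", 1)}

proj_a = {a for a, _ in r}   # projection of r onto A
proj_b = {b for _, b in r}   # projection of r onto B

# With no common attributes, the natural join is the Cartesian product.
rejoined = set(product(proj_a, proj_b))

# Tuples that appear after the join but were never in r.
spurious = rejoined - r
print(len(rejoined), spurious)
```

The join returns a superset of r; the extra tuple means the original relation cannot be told apart from one that really contained it, which is why the decomposition is lossy.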

Summary

Database design is an ongoing, iterative process.

Requirements come from data, user demands, design issues.

Change occurs: Corporations & technologies grow. Programmers & users learn.

Views / security.

Lossless-join decomposition.

Next: Science for improving design.

Reading & Exercises

Reading: Connolly & Begg Chapter 9, (13, 14); Silberschatz Chapter 7. Much of 7 will be in the next lecture!

Exercises: C&B: 9.2, 9.3/11, 9.9, 9.10; Silberschatz: 7.1, 7.2, 7.16. These need functional dependencies, which are covered next lecture.
