ITEC313 Database Programming Lecture 1: Database Design Methodology : Introduction

Upload: samson-dean

Post on 12-Jan-2016


ITEC313 Database Programming

Lecture 1: Database Design Methodology : Introduction


Learning Objectives

• Database Design Terminology
• Purpose of Database Design
• Phases of Database Design


Database and Database System

• A database is a shared collection of logically related data designed to meet the information needs of an organization.

• Components of a Database System
  – Database
  – Hardware
  – Software (DBMS)
  – Users


Database

• The data in the database will be expected to be both integrated and shared particularly on multi-user systems

• Integration - The database may be thought of as a unification of several otherwise distinct files, with any redundancy among these files eliminated

• Shared - individual pieces of data in the database may be shared among several different users


Hardware

The secondary storage volumes on which the database physically resides, together with the associated I/O devices, device controllers, etc.


DBMS

Examples of DBMS products: Oracle, Informix, Access, DB2, FoxPro, dBase, SQL Server, MySQL


Typical Functions of a DBMS

• Data storage, retrieval and update
• A user-accessible catalog
• Transaction support
• Concurrency control services
• Recovery services
• Authorization services
• Support for data communication
• Integrity services
• Services to promote data independence
• Utility services
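
To make the transaction support and integrity items in the list above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The account table, the amounts and the transfer function are hypothetical and are not part of the lecture; they only illustrate how a DBMS undoes a half-finished update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the CHECK clause is an integrity rule enforced by the DBMS.
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount between two accounts as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, 1, 2, 30)        # both updates are committed together
failed = transfer(conn, 1, 2, 500)   # would overdraw account 1, so the credit is undone too
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

The second call credits account 2 before the debit fails, yet the rollback leaves both balances as they were after the first transfer.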


Users

• Application Programmer - writes programs that use the database
• Database Designer - designs the conceptual and logical database
• Database Administrator (DBA)
• Data Administrator
• End user - interacts with the system from an on-line terminal by using query languages etc.


Data & Database Administration

• Data Administrator – a business manager responsible for controlling the overall corporate data resources

• Database Administrator (DBA) - a technical person responsible for development of the total system


Sample Applications

• Student records
• Banking
• Insurance
• Billing systems, e.g. electricity, phone
• ISPs
• Accounting systems
• Reservation systems, e.g. airline, hotel
• Medical records
• Stock control
• Personnel systems
• Product catalogues
• Telephone directories
• Train timetables
• Airline bookings
• Credit card details
• Customer histories
• Stock market prices
• Discussion boards
• Web indexes
• Library catalogues


Advantages

• Control of data redundancy
• Data consistency
• Multipurpose use of data
• Sharing of data
• Enforcement of standards
• Economy of scale
• Balance of conflicting user requirements
• Improved data accessibility and responsiveness
• Increased productivity
• Improved maintenance through data independence
• Increased concurrency
• Improved backup and recovery services


Disadvantages

Complexity

Size

Cost of DBMS

Additional hardware costs

Cost of conversion


Data Independence

• Software maintenance is a large part (50%) of information system budgets

• Reduce impact of changes by separating database description from applications

• Change database definition with minimal effect on applications that use the database


Three Schema Architecture


Database Architecture

External Level – concerned with the way users perceive the database

Conceptual Level – concerned with abstract representation of the database in its entirety

Internal Level – concerned with the way data is actually stored


Differences among Levels

• External
  – Course registration form
  – Instructor load assignments
• Conceptual
  – Tables: student, course, takes, …
• Internal
  – Files needed to store the tables
  – Extra files to improve performance
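
The three levels listed above can be sketched in a few lines of SQL run through Python's sqlite3 module. The table names student, course and takes come from the slide; the column names, the index name and the sample row are assumptions made only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual level: the tables named on the slide (columns are assumed)
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE course  (course_id TEXT PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE takes   (student_id INTEGER REFERENCES student,
                      course_id  TEXT REFERENCES course,
                      PRIMARY KEY (student_id, course_id));

-- Internal level: an extra structure that exists only to improve performance
CREATE INDEX idx_takes_course ON takes (course_id);

-- External level: roughly what a course registration form would show
CREATE VIEW registration_form AS
SELECT s.name AS student, c.title AS course
FROM takes t
JOIN student s ON s.student_id = t.student_id
JOIN course  c ON c.course_id  = t.course_id;

INSERT INTO student VALUES (1, 'Ada');
INSERT INTO course  VALUES ('ITEC313', 'Database Programming');
INSERT INTO takes   VALUES (1, 'ITEC313');
""")

# A user of the form sees only the view, not the tables or the index.
rows = conn.execute("SELECT student, course FROM registration_form").fetchall()
```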


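Architecture of a Database System

[Figure: Applications 1, 2 and 3 access the database through the DBMS. The DBMS presents three levels: external, conceptual and internal. Logical data independence is marked between the external and conceptual levels, and physical data independence between the conceptual and internal levels.]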


Data Independence

• Logical Data Independence – users and user programs are independent of logical structure of the database

• Physical Data Independence – the separation of structural information about the data from the programs that manipulate and use the data i.e. the immunity of application programs to changes in the storage structure and access strategy


Data Independence

• Different applications need different views of the same data, so a part of the database that an application is not interested in need not be included in its view. This feature is also important for controlling access to parts of the database

• The DBA must have the freedom to change the storage structure or access strategy in response to changing requirements, without having to modify the existing applications
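
Both kinds of independence described above can be sketched with sqlite3. Everything named here (the student, person and address tables, the student_directory view, the application_query function) is hypothetical; the point is only that the application's query text never changes while the DBA changes the storage and the table structure underneath it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, full_name TEXT, city TEXT);
INSERT INTO student VALUES (1, 'Ada Lovelace', 'London'), (2, 'Alan Turing', 'London');
CREATE VIEW student_directory AS SELECT student_id, full_name FROM student;
""")

def application_query(conn):
    """The application only ever sees the view, never the base tables."""
    return conn.execute(
        "SELECT full_name FROM student_directory ORDER BY student_id"
    ).fetchall()

before = application_query(conn)

# Physical change: the DBA adds an access path. No application code changes.
conn.execute("CREATE INDEX idx_student_city ON student (city)")
after_physical = application_query(conn)

# Logical change: the base table is split in two and the view is redefined,
# so the application's view of the data stays the same.
conn.executescript("""
DROP VIEW student_directory;
CREATE TABLE person  (student_id INTEGER PRIMARY KEY, full_name TEXT);
CREATE TABLE address (student_id INTEGER PRIMARY KEY, city TEXT);
INSERT INTO person  SELECT student_id, full_name FROM student;
INSERT INTO address SELECT student_id, city FROM student;
DROP TABLE student;
CREATE VIEW student_directory AS SELECT student_id, full_name FROM person;
""")
after_logical = application_query(conn)
```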


Client-Server Architecture

[Figure: three client-server arrangements]

a) Client, server, and database on the same computer
b) Multiple clients and one server on different computers
c) Multiple servers and databases on different computers


Database Development

• In the past, many software development projects were unsuccessful because:
  – Requirements were not properly collected or specified
  – There was no development methodology
• The stages in the DB development cycle have been identified. They are:
  – Clearly specified
  – Not sequential, but involve some repetition
  – Connected by feedback loops (even back to the requirements stage)


Db Development Life Cycle

• Database planning
• System definition
• Requirement collection and analysis
• Database design
• DBMS selection
• Application design
• Prototyping
• Implementation
• Data conversion and loading
• Testing
• Operational maintenance


Database Design

[Figure: Database Application Lifecycle. Boxes: Database Planning, Systems Definition, Requirements Analysis, Conceptual Design, Logical Design, Distributed DB Design, Physical Design, DBMS Selection, Application Design, Prototyping, Implementation, Data Loading, Testing, Maintenance. The figure marks part of the flow as optional.]


Database Application Lifecycle

Database Planning:

• Management activities that allow the stages of the database application to be realized as efficiently as possible

System Definition:

• The scope and boundaries of the application, including its major application areas and user groups

Requirements Analysis:

• Encompasses tasks that determine the needs or conditions to meet for a new or altered product, taking account of the possibly conflicting, vague and incomplete requirements of the various stakeholders


Database Application Lifecycle

Application Design:

• Design of the user interface and the application programs that use and process the database

Prototyping:

• Building a working model of a database application

Implementation:

• Physical realization of the database and application design


Database Application Lifecycle

Data Conversion and Loading:

• Transferring any existing data into the new database and converting any existing processes to run on the new database

Testing:

• Process of executing the application programs with the intent of finding errors

Operational Maintenance:

• Process of monitoring and maintaining the system following installation


Planning

Two stages:

Planning Factors
• The work to be done
• The resources to do it
• The cost

Planning Objectives
• Organisational units - consist of various departments
• Locations - list of operational locations
• Business functions - identify related business processes
• Entity types - something for which data is collected


System Definition

• Identify boundaries
  – Want to know at a very high level what the boundaries of the system are, e.g.
    • Current users
    • Current application areas
• Identify interfaces within the organization


Requirements Analysis

• Database design should reflect the information within the organisation
• Many ways of gathering information
  – interviewing
  – observing
  – examining documents
  – using questionnaires
  – using experience from the design of other systems
  – …


Requirements Analysis

• Critical information
  – Main application areas and user groups
  – Documentation used
  – Details of transactions needed
• A prioritized user requirement specification
• Amount gathered depends on size of organization and scope of application
• Documentation is VERY important
  – DFD, matrices etc.
• Identifying the required functionality for a database system is crucial:
  – systems with inadequate functionality will fail


Database Design

MAIN AIMS
• To represent data & relationships required by users and applications
• To provide a data model which supports transactions
• To specify a design that meets performance requirements


Database Design Approaches

• BOTTOM-UP: begins at the level of attributes and then adds entities as new relationships are seen. Normalization is an example of this.

• TOP-DOWN: starts with the development of a data model that contains a few high-level entities and then refines them in ever-increasing detail. Data modeling comes under this.


Phases of database Design

• Remember the main phases:
  – Conceptual Database Design
  – Logical Database Design
  – Distributed Database Design (optional)
  – Physical Database Design


Conceptual Database Design

• Create a conceptual data model
  – Use data modeling to understand
    • each user's perspective of the data
    • the data
    • use of data across applications
• Independent of any implementation details
  – DBMS or physical aspects are immaterial
• Based on user requirements specification
  – assists in understanding data
  – facilitates communication


Logical database design

• The data model created in the previous phase is refined
• At this point you know
  – which type of DBMS you will be implementing in, e.g. relational, object-oriented …
  – but not the actual DBMS
• Test the correctness of the data model through
  – Normalization
  – Validation against user transactions
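
As a rough illustration of the normalization check mentioned above, the sketch below tests a functional dependency against sample rows and then decomposes the table. The enrolment data and the violates_fd helper are hypothetical. Note that sample data can only refute a dependency; whether it really holds is a question about the business rules, not about the rows that happen to exist.

```python
from collections import defaultdict

def violates_fd(rows, determinant, dependent):
    """Return True if some determinant value maps to more than one dependent value."""
    seen = defaultdict(set)
    for row in rows:
        key = tuple(row[c] for c in determinant)
        seen[key].add(tuple(row[c] for c in dependent))
    return any(len(values) > 1 for values in seen.values())

# Hypothetical unnormalized enrolment data: course_title is repeated per student.
enrolment = [
    {"student_id": 1, "course_id": "ITEC313", "course_title": "Database Programming"},
    {"student_id": 2, "course_id": "ITEC313", "course_title": "Database Programming"},
    {"student_id": 2, "course_id": "ITEC101", "course_title": "Intro to IT"},
]

# course_id -> course_title is not contradicted by the data, yet course_id is only
# part of the key (student_id, course_id): a partial dependency, so not in 2NF.
fd_not_refuted = not violates_fd(enrolment, ["course_id"], ["course_title"])

# student_id alone does not determine course_id, so it cannot be the key by itself.
student_alone_is_key = not violates_fd(enrolment, ["student_id"], ["course_id"])

# Decompose: course facts move to their own relation, removing the repetition.
course = {(r["course_id"], r["course_title"]) for r in enrolment}
takes = {(r["student_id"], r["course_id"]) for r in enrolment}
```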


Database selection

A crucial stage in the database application lifecycle is choosing the DBMS.

The aim is to choose a system that
• allows expansion
• enables speedy retrieval
• gives easy application development etc.

All data should have been collected and documented before DBMS selection.

In practice, many organizations choose a DBMS purely on the basis of cost.


Database selection

Define terms of reference
• the scope of the study should be stated
• potential list of the products to be assessed
• the criteria to be used, timescales …

Identify products
• hardware
• compatibility with existing systems
• cost ..
• user support
• upgrades …

Produce shortlist of products
• Shortlist 2-3 products

Evaluate products
• Ask vendors
• Involve users

Recommend selection and produce report
• Give details of criteria used
• Compare/contrast alternatives


Physical Database Design

HOW to physically implement the logical data model
– derive tables & constraints
– identify storage structures and access methods
– design security features


Application Design

Design of software programs which will process the data

• Design transactions
  – data to be used by transactions
  – functions of the transactions
  – output of transactions
  – programs
• Design human interface
  – Various guidelines


Prototyping

• Used to check
  – developer’s understanding of what is required
  – interpretation of requirements

• Building a working model

• Inexpensive & quick to build


Implementation

• Database created using DDL
• Implement application programs using selected language
• Implement security & integrity controls
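
A minimal sketch of the first and last bullets above, assuming SQLite as the target DBMS and Python as the selected language. The department and employee tables, their columns and the rejected helper are hypothetical; the DDL declares the integrity controls and the DBMS then refuses rows that break them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    salary  INTEGER NOT NULL CHECK (salary > 0),
    dept_id INTEGER NOT NULL REFERENCES department (dept_id)
);
INSERT INTO department VALUES (10, 'Registry');
INSERT INTO employee VALUES (1, 'Samira', 3000, 10);
""")

def rejected(sql):
    """Run one statement and report whether an integrity control blocked it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

bad_department = rejected("INSERT INTO employee VALUES (2, 'Tom', 2500, 99)")  # no dept 99
bad_salary = rejected("INSERT INTO employee VALUES (3, 'Uma', 0, 10)")         # salary rule
good_row = rejected("INSERT INTO employee VALUES (4, 'Vik', 2800, 10)")        # accepted
employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

SQLite has no user accounts, so the security half of the last bullet (granting and revoking privileges) is not shown here; a server DBMS would handle that with its own authorization statements.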


Data Loading/Conversion

• Transfer any existing data
• Insert any new data
• Usually there is a facility within the DBMS to load data into a database
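
Many DBMSs ship a dedicated bulk loader; where one is not used, the transfer can be scripted. The sketch below is one possible way to do it with Python's csv and sqlite3 modules. The CSV content stands in for a hypothetical export from a legacy system and the student table is an assumption.

```python
import csv
import io
import sqlite3

# Hypothetical export from a legacy system; in practice this would be a file on disk.
legacy_export = io.StringIO(
    "student_id,name\n"
    "1,Ada\n"
    "2,Alan\n"
    "3,Grace\n"
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Conversion step: turn text fields into the types the new schema expects.
reader = csv.DictReader(legacy_export)
rows = [(int(r["student_id"]), r["name"].strip()) for r in reader]

# Loading step: one transaction, so a bad row leaves the table untouched.
with conn:
    conn.executemany("INSERT INTO student VALUES (?, ?)", rows)

loaded = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```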


Testing

• The process of executing the application programs with the intention of finding errors.
  – Use realistic data
  – Involve users
• There are various strategies that can be used:
  – White-box testing
  – Black-box testing


Maintenance

• Monitoring performance
  – Various tools are available

• Maintaining and Upgrading


Overview of Database Design

Purpose of Data Modeling
• Assist in understanding of the semantics of data
• Facilitate the communication about information requirements

Criteria for Optimal Data Models
• Structural validity
• Simplicity
• Expressibility
• Nonredundancy
• Shareability
• Extensibility
• Integrity
• Diagrammatic representation


Database Design Methodology

• A structured approach that uses procedures, techniques, tools and documentation aids to support and facilitate the process of design

[Figure labels: interaction with users; structured methodology; data-driven approach; structural and integrity considerations; data dictionary; validate; diagrams; DBDL; repeat]


Broad Goals of Database Development

• Develop a common vocabulary
• Define data meaning
• Ensure data quality
• Provide efficient implementation


Develop a Common Vocabulary

• Diverse groups of users
• Difficult to obtain acceptance of a common vocabulary
• Compromise to find least objectionable solution
• Unify organization by establishing a common vocabulary


Define Meaning of Data

• Business rules support organizational policies

• Restrictiveness of business rules
  – Too restrictive: reject valid business interactions
  – Too loose: allow erroneous business interactions
• Exceptions allow flexibility
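
One way to picture the restrictiveness trade-off above is to express a business rule as a CHECK constraint and see which orders it admits. The order_line table, the three candidate rules and the quantities are all hypothetical; which rule is "right" depends on the organization's policy.

```python
import sqlite3

def accepted(check_clause, quantity):
    """Create an order table with the given rule and try to insert one order line."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE order_line (quantity INTEGER CHECK ({check_clause}))")
    try:
        conn.execute("INSERT INTO order_line VALUES (?)", (quantity,))
        return True
    except sqlite3.IntegrityError:
        return False

too_restrictive = "quantity BETWEEN 1 AND 10"  # rejects a valid bulk order of 25
too_loose = "quantity IS NOT NULL"             # accepts an erroneous order of -5
balanced = "quantity > 0"                      # assumed policy: any positive quantity

results = {
    "restrictive rejects valid": not accepted(too_restrictive, 25),
    "loose accepts erroneous": accepted(too_loose, -5),
    "balanced accepts valid": accepted(balanced, 25),
    "balanced rejects erroneous": not accepted(balanced, -5),
}
```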


Data Quality

• Poor data quality leads to poor decision making
  – Difficult customer communication
  – Inventory shortages

• Cost-benefit tradeoff to achieve desired level of data quality

• Long-term effects of poor data quality


Data Quality Measures

• Completeness
• Lack of ambiguity
• Timeliness
• Correctness
• Consistency
• Reliability


Data Quality Measures

• Completeness:
  – database represents all important parts of an information system
• Lack of ambiguity:
  – each part of a database has only one meaning
• Timeliness:
  – business changes are posted to a database without excessive delays
• Correctness:
  – database contains values perceived by the user
• Consistency:
  – different parts of a database do not conflict
• Reliability:
  – failures or interference do not corrupt database

Importance of measure depends on the database, system, and organization.
Each measure can be quantified.
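
As a sketch of how two of these measures might be quantified, the functions below score completeness as the share of required values that are present and consistency as the share of records that satisfy a cross-field rule. These particular definitions, the customer records and the rule are assumptions chosen for illustration; the lecture does not prescribe formulas.

```python
def completeness(records, required_fields):
    """Fraction of required field values that are actually present."""
    total = len(records) * len(required_fields)
    if total == 0:
        return 1.0
    present = sum(
        1 for rec in records for f in required_fields
        if rec.get(f) not in (None, "")
    )
    return present / total

def consistency(records, rule):
    """Fraction of records that satisfy a cross-field rule."""
    if not records:
        return 1.0
    return sum(1 for rec in records if rule(rec)) / len(records)

# Hypothetical customer records with two missing e-mail addresses
# and one record whose fields contradict each other.
customers = [
    {"name": "Ada", "email": "ada@example.org", "orders": 2, "total_spent": 40},
    {"name": "Alan", "email": "", "orders": 0, "total_spent": 0},
    {"name": "Grace", "email": None, "orders": 0, "total_spent": 15},
    {"name": "Edsger", "email": "ewd@example.org", "orders": 1, "total_spent": 9},
]

completeness_score = completeness(customers, ["name", "email"])
# Assumed rule: a customer with no orders should not have spent anything.
consistency_score = consistency(
    customers, lambda r: r["orders"] > 0 or r["total_spent"] == 0
)
```

Here 6 of the 8 required values are present and 3 of the 4 records satisfy the rule, so both scores come out at 0.75.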


Efficient Implementation

• Supersedes other goals
• Optimization problem
  – Maximize performance
  – Subject to constraints of data quality, data meaning, and resource usage
• Difficult problem:
  – Number of choices
  – Relationships among choices
  – DBMS specific


Database Development Phases

Data requirements are the input; each phase produces the output shown.

• Conceptual Data Modeling: ERD
• Logical Database Design: tables
• Distributed Database Design (OPTIONAL): distribution schema
• Physical Database Design: internal schema, populated DB


Database Design

• Conceptual database design - the process of constructing a model of the information used in an organization, independent of all physical considerations

Step 1 Build local conceptual data model for each user view


Database Design

• Logical database design for the relational model - the process of constructing a model of the information used in an organization based on a specific data model, but independent of a particular DBMS and other physical considerations

Step 2 Build and validate local logical data model for each user view
Step 3 Build and validate global logical data model


Database Design

• Physical database design for relational databases - the process of producing a description of the implementation of the database on secondary storage.

Step 4 Translate global logical data model for target DBMS
Step 5 Design physical representation
Step 6 Design security mechanisms
Step 7 Monitor and tune the operational system


Phases of Database Design

Conceptual Database Design

• Process of constructing a model of the information used in an enterprise, independent of all physical considerations

Logical Database Design

• Process of constructing a model of the information used in an enterprise based on a specific data model, but independent of a particular DBMS or any other physical considerations

Distributed Database Design

• (Optional) Process of deciding about the placement of data across the sites of a computer network. Involves designing the network itself, as well as the distribution of DBMS software, DB applications and data

Physical Database Design

• Description of the implementation of the database on secondary storage. It describes the storage structures and access methods for efficient access.


Overview of Database Design

Conceptual
• Build local conceptual data model for each user view

Logical
• Build and validate local logical data model for each user view
• Build and validate global logical data model

Physical
• Translate global logical data model for target DBMS
• Design physical representation
• Design security mechanisms
• Monitor and tune operational system


Centralized Approach to Managing Multiple User Views

Pearson Education © 2009


View Integration Approach to Managing Multiple User Views


Conceptual Database Design

1. Build local conceptual data model for each user view

1.1 Identify entity types
1.2 Identify relationship types
1.3 Identify and associate attributes with entity or relationship types
1.4 Determine attribute domains
1.5 Determine candidate and primary key attributes
1.6 Specialize/generalize entity types
1.7 Draw Entity-Relationship diagram
1.8 Review local conceptual data model with user


Logical Database Design

2. Build and validate local logical data model

2.1 Map local conceptual data model to local logical data model
2.2 Derive relations from local logical data model
2.3 Validate model using normalization
2.4 Validate model against user transactions
2.5 Draw Entity-Relationship diagram
2.6 Define integrity constraints
2.7 Review local logical data model with user


Logical Database Design

3. Build and validate global logical data model

3.1 Merge local logical data models into global model
3.2 Validate global logical data model
3.3 Check for future growth
3.4 Draw final Entity-Relationship diagram
3.5 Review global logical data model with users


Physical Database Design

4. Translate global logical data model for target DBMS

4.1 Design base relations for target DBMS
4.2 Design enterprise constraints for target DBMS

5. Design physical representations

5.1 Analyze transactions
5.2 Choose file organizations


Physical Database design

5.3 Choose secondary indexes
5.4 Consider introduction of controlled redundancy

6. Design security mechanisms

6.1 Design user views
6.2 Design access rules

7. Monitor and tune operational system
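
Steps 5.1 and 5.3 can be tried out on a small scale: take one query that an important transaction runs, look at the access path the DBMS picks, add a secondary index, and look again. The sketch below assumes SQLite and uses its EXPLAIN QUERY PLAN statement; the student table, the city column and the index name are hypothetical, and the exact wording of the plan text differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO student (name, city) VALUES (?, ?)",
    [(f"student{i}", "London" if i % 2 else "Paris") for i in range(1000)],
)

def plan(conn, sql, params=()):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]

# Step 5.1: a query taken from a transaction that is worth analyzing.
query = "SELECT * FROM student WHERE city = ?"

plan_before = plan(conn, query, ("Paris",))   # full table scan expected
# Step 5.3: add a secondary index on the searched column.
conn.execute("CREATE INDEX idx_student_city ON student (city)")
plan_after = plan(conn, query, ("Paris",))    # search via the new index expected
```

Comparing the two plans is also a small instance of step 7: the same check can be repeated on the operational system when query patterns change.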


END OF LECTURE