data modeling and database design database systems: architecture and components

Post on 26-Dec-2015

Data Modeling and Database Design

Database Systems: Architecture and Components

Data

• Data: distinct pieces of information that you store for future reference

• Data can exist in a variety of forms:

1. as numbers or text on pieces of paper,

2. as bits and bytes stored in electronic memory,

3. as facts stored in a person's mind,

4. etc.

What is the difference between data and information?

Data vs. Information

• Data: raw facts (employee names, hours worked, etc.) that represent real-world things. They gain value when relationships are defined between them: rules and relationships are set up to organize data into valuable information.

• Information: a collection of facts organized in such a way that they have additional value beyond the facts themselves; information is value-added data.

• Turning data into information is a process: a set of logically related tasks performed to achieve a defined outcome.

• Knowledge is awareness and understanding of a set of information and of the ways it supports a specific task.

Data vs. Information

Valuable information: the result of data processing

Managing Data

Difficulties in Managing Data

The amount of data increases exponentially.

Data are scattered and collected by many individuals using various methods and devices.

Data come from many sources. Any problems here?

Data security, quality and integrity are critical. Why?

Difficulties in Managing Data (continued)

An ever-increasing amount of data needs to be considered in making organizational decisions.

The Data Deluge

Chapter 1 – Database Systems: Architecture and Components 8

Terminology

• Data

• Information

• Metadata

Data Management

1. Creation of data

2. Retrieval of data

3. Update or modification of data

4. Deletion of data

To do this, data must be accessed, and for ease of access the data must be organized.
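
The four data management operations above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the "customer" table, its columns, and the sample values are assumptions made for the example, not part of the slides.

```python
import sqlite3

# In-memory database; the "customer" table and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")

# 1. Creation of data
conn.execute("INSERT INTO customer (name, phone) VALUES (?, ?)", ("Ann Lee", "555-0100"))
conn.execute("INSERT INTO customer (name, phone) VALUES (?, ?)", ("Bob Ray", "555-0101"))

# 2. Retrieval of data
phone_before = conn.execute(
    "SELECT phone FROM customer WHERE name = ?", ("Ann Lee",)
).fetchone()[0]

# 3. Update or modification of data
conn.execute("UPDATE customer SET phone = ? WHERE name = ?", ("555-0199", "Ann Lee"))
phone_after = conn.execute(
    "SELECT phone FROM customer WHERE name = ?", ("Ann Lee",)
).fetchone()[0]

# 4. Deletion of data
conn.execute("DELETE FROM customer WHERE name = ?", ("Bob Ray",))
remaining = [row[0] for row in conn.execute("SELECT name FROM customer ORDER BY name")]
```

Each operation locates rows by a column value, which is only practical because the data are organized into a table first.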

What Is a Database?

• Databases in general are sets of data (information) arranged for easy access. A database does not have to be on a computer (is a Rolodex a database?).

• Databases are good for tracking and reporting on most things in business, e.g. invoices, inventory, customers.

• Examples: use a database to store and retrieve customer telephone numbers, customer information, order history, etc.

Database: An Organized Collection of Data

[Figure: three examples of an organized collection of data: an electronic spreadsheet, a filing cabinet, and a database in a computer]

History of Data Management

[Figure: timeline of data management technologies, 1950 to 2000, listing file systems, hierarchical DBMS, network DBMS, relational DBMS, and object-oriented DBMS]

The Hierarchy of Data: data is usually organized in a hierarchy

Limitations of File-Processing Systems

• Lack of Data Integrity: Data integrity (data values are correct, consistent, complete, and current) is often violated in isolated environments.

• Lack of Standards: Organizations find it hard to enforce standards for naming data items as well as for accessing, updating, and protecting data.

• Lack of Flexibility/Maintainability: File-processing systems are not amenable to structural changes in data and are therefore dependent upon a programmer who can either write or modify program code.

Limitations of File-Processing Systems (continued)

The limitations of file-processing systems are due to:

• Lack of Data Integration: Data are separated and isolated in a file-processing environment.

• Lack of Program-Data Independence: The structure of each file is embedded in the application programs.
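
A small sketch may make the lack of program-data independence concrete. The fixed-width payroll record below, its field widths, and the function name are all hypothetical; the point is only that the file layout lives inside the program.

```python
# Hypothetical fixed-width payroll record: name in columns 0-9, hours in columns 10-12.
NAME_WIDTH = 10
HOURS_WIDTH = 3

def parse_record(line: str) -> tuple:
    """Parse one record using a layout that is hard-coded in the program."""
    name = line[:NAME_WIDTH].strip()
    hours = int(line[NAME_WIDTH:NAME_WIDTH + HOURS_WIDTH])
    return name, hours

# A record written in the layout the program expects.
old_format = "Ann Lee".ljust(10) + "040"
parsed = parse_record(old_format)

# If the file owner widens the name field to 12 characters, the unchanged
# program reads the wrong columns: the hours come back as 0 instead of 40.
new_format = "Ann Lee".ljust(12) + "040"
misread = parse_record(new_format)
```

No error is raised in the second case, so every program that embeds the old layout has to be found and modified by a programmer.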

The Traditional Database Approach: flat files, each holding data of a specific kind

So, What Is Desirable?

• Integrated data

– Not data in isolation to be integrated by the application program/programmer

• Data independence

– Application program(s) immune to changes in storage structure and access strategy

– Independent user views of data

Relational Database vs Flat Database Files

• Flat-file database: puts all information in “one large table”. Acceptable for a small database. Consequences:

– Leads to redundant data.

– Potential for data corruption is high.

• Relational database: divides data into two or more tables and then relates the tables. For large databases, this is usually a faster, easier and more flexible way to use your data.
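
The contrast between the two designs can be sketched with sqlite3. The table names, columns, and rows below are invented for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Flat design (hypothetical): the customer's phone is repeated on every order row.
conn.execute("CREATE TABLE orders_flat (order_id INTEGER, customer TEXT, phone TEXT, item TEXT)")
conn.executemany(
    "INSERT INTO orders_flat VALUES (?, ?, ?, ?)",
    [(1, "Ann Lee", "555-0100", "desk"), (2, "Ann Lee", "555-0100", "lamp")],
)
# Updating only one copy leaves the redundant copies in disagreement.
conn.execute("UPDATE orders_flat SET phone = '555-0199' WHERE order_id = 1")
flat_phones = {
    row[0] for row in conn.execute("SELECT phone FROM orders_flat WHERE customer = 'Ann Lee'")
}

# Relational design: the phone is stored once, and orders refer to the customer.
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customer(id), item TEXT)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Ann Lee', '555-0100')")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [(1, 1, "desk"), (2, 1, "lamp")])
# One update is enough, because there is only one copy of the phone number.
conn.execute("UPDATE customer SET phone = '555-0199' WHERE id = 1")
joined = conn.execute(
    "SELECT o.item, c.phone FROM orders o JOIN customer c ON c.id = o.customer_id "
    "ORDER BY o.order_id"
).fetchall()
```

In the flat table the same customer ends up with two different phone numbers, while the join over the related tables always reports the single stored value.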

Database Approach: a pool of the same data shared by multiple applications

The database approach involves a combination of hardware and software

The Database Approach

• A database management system (DBMS) provides all users with access to all the data.

• DBMSs minimize the following problems:

– Data redundancy: The same data are stored in many places.

– Data isolation: Applications cannot access data associated with other applications.

– Data inconsistency: Various copies of the data do not agree.

Database Approach (continued)

• DBMSs maximize the following:

– Data security: Keeping the organization’s data safe from theft, modification, and/or destruction.

– Data integrity: Data must meet constraints (e.g., student grade point averages cannot be negative).

– Data independence: Applications and data are independent of one another. They are not linked to each other, so different applications are able to access the same data.
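
As a sketch of how a DBMS can enforce the integrity rule mentioned above (grade point averages cannot be negative), the example below declares the rule as a CHECK constraint in SQLite. The "student" table and its rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The constraint is declared once, in the database, not in each application.
conn.execute(
    "CREATE TABLE student ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, gpa REAL CHECK (gpa >= 0))"
)
conn.execute("INSERT INTO student (name, gpa) VALUES ('Ann Lee', 3.5)")

try:
    # This row violates the constraint, so the DBMS refuses to store it.
    conn.execute("INSERT INTO student (name, gpa) VALUES ('Bob Ray', -1.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

Because the rule lives with the data, every application that writes to the table is held to it.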

Database Management Systems

Advantages of the Database Approach

More on Advantages of the Database Approach

Disadvantages of the Database Approach

Data Modeling and Database Models

• Database design must reflect the enterprise’s business processes

• When building the database, consider:

– Content - What data should be collected?

– Access - What data should be given to which users?

– Logical structure - How will the data be organized to make sense to a particular user?

– Physical organization - Where will the data actually be located?

Data Models: the relational data model is the basis for any relational database, but it is not the only model

• A data model defines how records are related, which affects how users can access the data.

• Existing data models:

– Hierarchical models

– Network models

– Relational models: the most popular and most widely used nowadays

Hierarchical (Tree) Model: records arranged in a parent-child structure

• Early databases were based on a hierarchical system, much like the folder tree of your Windows file system:

Hierarchical Databases
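
A parent-child hierarchy can be sketched as a nested dictionary in which every record has exactly one parent. The company, departments, and names are made up, and find_path is a hypothetical helper, not part of any DBMS.

```python
# Hypothetical hierarchy: each child record hangs under exactly one parent.
tree = {
    "Company": {
        "Sales": {"Ann Lee": {}, "Bob Ray": {}},
        "Engineering": {"Cy Dow": {}},
    }
}

def find_path(node: dict, target: str, path=()):
    """Return the root-to-target path, or None if the target is absent."""
    for name, children in node.items():
        current = path + (name,)
        if name == target:
            return current
        found = find_path(children, target, current)
        if found:
            return found
    return None

path_to_cy = find_path(tree, "Cy Dow")
missing = find_path(tree, "Nobody")
```

Access always starts at the root and walks down the tree, much like opening nested folders to reach a file.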

More on Database Types: Network

• Database Models:

2. Network:

• A designer needs to set a predetermined structure: where to store what

• Implements an owner/member model

• One has to be very familiar with the database

Network Models

More on Database Types: Relational

• Database Models:

3. Relational

• Addresses limitations of other models

• Each table represents an entity of its own, related to other entities (other tables)

• Independent of the application that uses the data

Entity-Relationship (ER) Diagram: a graphical method to show the organization of data and the relationships between data

History of Data Management

In the 1970s, the Standards Planning and Requirements Committee (SPARC) of the American National Standards Institute (ANSI) proposed what came to be known as the ANSI/SPARC three-schema architecture: conceptual, internal and external schema.

[Figure: several external schemas (individual user views) sit above the conceptual schema (global view), which sits above the internal schema (storage view) and the stored database]

Figure 1.2 The ANSI/SPARC three-schema architecture

Three Perspectives of Metadata in a Database

Conceptual Schema

• Core of the architecture

• Represents the global view of the structure of the entire database for a community of users

• Captures data specification (metadata)

• Describes all data items and relationships between data together with integrity constraints

• Separates data from the program (or views from the physical storage structure)

• Technology independent

Internal Schema

• Describes the physical structure of the stored data (e.g., how the data is actually laid out on storage devices)

• Describes the mechanism used to implement access strategies (e.g., indexes, hashed addresses, etc.)

• Technology dependent

• Concerned with the efficiency of data storage and access mechanisms

External Schema

• Represents different user views, each describing portions of the database

• Technology independent

• Views are generated exclusively by logical references

Physical and Logical Data Independence

• Physical Data Independence

Definition: External views unaffected by changes to the internal structure

How?: Introduction of conceptual schema between the external views and the internal (physical) schema
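
Physical data independence can be sketched in SQLite by changing only the access strategy and checking that a user view returns the same rows. The "employee" table, the "sales_staff" view, and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employee (name, dept) VALUES (?, ?)",
    [("Ann Lee", "Sales"), ("Bob Ray", "Sales"), ("Cy Dow", "Engineering")],
)
# External view used by one group of users.
conn.execute("CREATE VIEW sales_staff AS SELECT name FROM employee WHERE dept = 'Sales'")

query = "SELECT name FROM sales_staff ORDER BY name"
before = conn.execute(query).fetchall()

# Internal-level change: add an index, i.e. a new access strategy.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")
after = conn.execute(query).fetchall()
```

The view and the query are untouched by the new index; only the way the DBMS may locate the rows has changed.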

Physical and Logical Data Independence (continued)

• Logical Data Independence

Definition: External views unaffected by design changes (growth or restructuring) in conceptual schema

How?: External views generated exclusively through logical reference to elements in the conceptual schema

Consequence: External views unaffected by changes to other external views
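
Logical data independence can be sketched in the same way: the base table grows and a second view appears, while an existing view keeps returning the same result. All table, view, and column names here are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.execute("INSERT INTO employee (name, dept) VALUES ('Ann Lee', 'Sales')")

# External view defined only through logical references to named columns.
conn.execute("CREATE VIEW staff_directory AS SELECT name, dept FROM employee")
before = conn.execute("SELECT * FROM staff_directory").fetchall()

# Growth in the conceptual schema: a new attribute is added.
conn.execute("ALTER TABLE employee ADD COLUMN salary REAL")
# A second external view is introduced for a different group of users.
conn.execute("CREATE VIEW payroll AS SELECT name, salary FROM employee")

after = conn.execute("SELECT * FROM staff_directory").fetchall()
```

The directory view is affected neither by the new column nor by the new payroll view, because it refers only to the columns it names.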

What is a Database System?

• A self-describing collection of integrated records

Self-describing

The structure of the database (metadata) is recorded within the database system – not in the application programs.

Integrated

The responsibility for 'integrating' data items as needed is assumed by the DBMS instead of the programmer.
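
The "self-describing" property can be observed in SQLite, which records table definitions in its own catalog. This is a sketch with a hypothetical "invoice" table; other DBMSs expose their metadata through different catalogs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")

# The structure is stored in the database itself, in the sqlite_master catalog.
tables = [
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(invoice)")]
```

An application can therefore discover the table and column definitions by asking the database, instead of carrying them in its own code.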

Characteristics of a Database System

Database

A single, integrated set of files

Database Management System (DBMS)

A collection of general-purpose software that facilitates the process of defining, constructing, and manipulating a database for various applications

[Figure: Application Programs 1 to 5 and a user-friendly database interrogation facility, serving users and management, all draw on one shared pool of data items (A to I) kept with minimal/controlled redundancy]

Figure 1.4 An early view of a database system*

*Adapted From Richard L. Nolan, "Computer Data Bases: The Future is Now," Harvard Business Review, (September-October, 1973)

An Early View of a Database System

What is a Database Management System (DBMS)?

A DBMS is a collection of general-purpose software that facilitates the processes of defining, constructing, and manipulating a database.

• The major components of a DBMS include one or more query languages; tools for generating reports; facilities for providing security, integrity, backup and recovery; a data manipulation language for accessing the database; and a data definition language used to define the structure of data.
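
Some of these components can be sketched with sqlite3: a data definition statement, data manipulation statements, a query, and a rollback as a small instance of recovery. The "account" table and the amounts are hypothetical, and SQLite is used only because it ships with Python; it does not cover every component listed above (for example, it has no user-level security commands).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition language (DDL): define the structure of the data.
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL)")

# Data manipulation language (DML): insert a row and make it permanent.
conn.execute("INSERT INTO account VALUES (1, 100.0)")
conn.commit()

# A faulty update is made inside a transaction...
conn.execute("UPDATE account SET balance = balance - 500 WHERE id = 1")
# ...and undone: the DBMS restores the last committed state.
conn.rollback()

# Query language: read the current value.
balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
```

After the rollback the query returns the committed balance, not the result of the abandoned update.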

[Figure: the DBMS software component comprises a query language (SQL), a data manipulation language (DML/SQL), security and recovery facilities (DCL/SQL), a report generator, access routines, and a data definition language (DDL/SQL). It is shown together with the database (contains data), the data dictionary (DBMS metadata), and computer-aided software engineering (CASE) tools with a data repository (data models, metadata)]

Figure 1.5 Components of a database system

Components of a Database System

Types of Database Systems

• Number of users

– Single-user: desktop database system

– Multi-user: workgroup database system, enterprise database system

• Scope

– Desktop database system

– Workgroup database system

– Enterprise database system

Some Popular DBMSs:

• Excel **

• Access

• SQL Server

• Oracle

• MySQL

More on Popular RDBMS

• MS Access (small businesses, low cost, low security, small number of concurrent users)

• MS SQL Server (mid-sized to large companies, good security, good integration with Windows platforms, large number of concurrent users, relatively low cost)

• Oracle (large businesses with high data storage and retrieval requirements, excellent security, scalability, performance, high cost)

• DB2 (IBM) (large businesses, usually deployed on mainframes and large-scale workstations/clusters, requires professional management, high cost, very reliable)

• Free RDBMSs (MySQL, Cloudscape, and others): limited user interface tools, require highly qualified personnel, suit small to large businesses. Free? No costs?

Popular Database Management Systems

System Development Life Cycle of a Database

[Figure: life-cycle phases in sequence: Strategy and Analysis, Design, Build and Document, Transition, Production]

A systematic approach to database development: it transforms business requirements into an operational database.

Strategy and Analysis: Analyze business requirements. Build models of the system. Transfer the business narrative into a graphical representation of needs and rules. Confirm and refine the model with the analysts and experts.

Design: Design a system based on the model developed in the strategy and analysis phase.

Build and Document: Build the prototype. Write and execute the commands to create tables and objects. Develop user documentation and manuals.

Quick Quiz

• What is the difference between data, metadata and information?

• What is the difference between sequential access and direct access? Give an example of each.

• What is data integrity and what is the significance of a lack of data integrity?

• What is the difference between a database and a database management system?

• What is the role of data models in database design? Which data model is the most flexible?

Quick Quiz

• What is the term used to describe a collection of records? ANSWER: File.

• What term is used to describe the degree of accuracy of data? ANSWER: Data integrity.

• Which data model uses a parent-child structure? ANSWER: Hierarchical.

• Which data model is the most flexible? ANSWER: Relational.

• What is the most common data modeling technique? ANSWER: Entity-Relationship (ER) diagrams.

• True or False: Oracle is the leading provider of database systems. ANSWER: True
