referans

23
©Silberschatz, Korth and Sudarsha 1.1 Chapter 1: Introduction Chapter 1: Introduction Databases & Database Management Systems (DBMS) Levels of Abstraction Data Models Database Languages Types of Users DBMS Function and Structure Read Chapter 1, including the historical notes on pages 28 and 29.

Upload: databaseguys

Post on 25-May-2015

1.042 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: Referans

©Silberschatz, Korth and Sudarshan1.1

Chapter 1: IntroductionChapter 1: Introduction

Databases & Database Management Systems (DBMS) Levels of Abstraction Data Models Database Languages Types of Users DBMS Function and Structure

Read Chapter 1, including the historical notes on pages 28 and 29.

Page 2: Referans

©Silberschatz, Korth and Sudarshan1.2

What is a Database?What is a Database?

According to the book: Collection of interrelated data

Set of programs to access the data

A DBMS contains information about a particular enterprise

DBMS provides an environment that is both convenient and efficient to use.

Another definition: A database is a collection of organized, interrelated

data, typically relating to a particular enterprise

A Database Management System (DBMS) is a set of programs for managing and accessing databases

Page 3: Referans

©Silberschatz, Korth and Sudarshan1.3

Some PopularSome PopularDatabase Management SystemsDatabase Management Systems

Commercial “off-the-shelf” (COTS): Oracle IBM DB2 (IBM) SQL Server (Microsoft) Sybase Informix (IBM) Access (Microsoft) Cache (Intersystems – nonrelational)

Open Source: MySQL PostgreSQL

Note: This is not a course on any particular DBMS!

Page 4: Referans

©Silberschatz, Korth and Sudarshan1.4

Some Database ApplicationsSome Database Applications

Databases touch all aspects of our lives: Banking – accounts, loans, customers Airlines - reservations, schedules Universities - registration, grades Sales - customers, products, purchases Manufacturing - production, inventory, orders, supply chain Human resources - employee records, salaries, tax deductions

Anywhere there is data, there could be a database.

Course context is an “enterprise” that has requirements for: Storage and management of 100’s of gigabytes or terabytes of

data Support for 100’s or more of concurrent users and

transactions Traditional supporting platform, e.g, Sun Enterprise server,

4GB RAM, 50TB of disk space

Page 5: Referans

©Silberschatz, Korth and Sudarshan1.5

Purpose of Database SystemPurpose of Database System

Prior to the availability of COTS DBMSs, database applications were built on top of file systems – coded from the ground up. Sometimes this approach is still advocated.

Drawbacks of this approach: Data redundancy and inconsistency Multiple files and formats A new program is required to carry out each new task Integrity constraints (e.g. account balance > 0) become

embedded throughout program code Plus others…

Page 6: Referans

©Silberschatz, Korth and Sudarshan1.6

Purpose of Database Systems (Cont.)Purpose of Database Systems (Cont.)

Database systems offer solutions for the above problems.

Database systems also support: Atomicity of updates

Concurrent access by multiple users

Data security, etc.

Recoding this functionality from scratch is not easy!

So when should we code from scratch, and when do we buy a DBMS??

Page 7: Referans

©Silberschatz, Korth and Sudarshan1.7

Levels of AbstractionLevels of Abstraction

Physical level: defines low-level details about how data item is stored on disk.

Logical level: describes data stored in a database, and the relationships among the data. Usually conveyed as a data model, e.g., an ER diagram.

View level: defines how information is presented to users. Views can also hide details of data types, and information (e.g., salary) for security purposes.

Page 8: Referans

©Silberschatz, Korth and Sudarshan1.8

View of Data, Cont.View of Data, Cont.

Physical data independence is the ability to modify the physical schema without changing the logical or view levels.

Physical data independence is important in any database or DBMS.

Page 9: Referans

©Silberschatz, Korth and Sudarshan1.9

Instances vs. SchemasInstances vs. Schemas

The difference between a database schema and a database instance is similar to the difference between a data type and a variable in a program.

A database schema defines the structure or design of a database Analogous to type information of a variable in a program

For example, a database might consists of information about a set of customers and accounts and the relationship between them

More precisely: A logical schema defines a database design at the logical level

A physical schema defines a database design at the physical level

An instance of a database is the combination of the database and its’ contents at one point in time Analogous to a variable and its’ value

Page 10: Referans

©Silberschatz, Korth and Sudarshan1.10

What is a Data Model?What is a Data Model?

The phrase “data model” is used in a couple of different ways.

Frequently used (use #1) to refer to an overall approach or philosophy for database design and development.

For those individuals, groups and corporations that subscribe to a specific data model, that model permeates all aspects of database design, development, implementation, etc.

Current data models: Relational model Entity-Relationship model Object-oriented model, Object-relational model Semi, and non-structured data models

Legacy models: Network Hierarchical

Page 11: Referans

©Silberschatz, Korth and Sudarshan1.11

What is a Data Model, Cont?What is a Data Model, Cont?

During the early phases of database design and development, a “data model” is frequently developed (use #2).

The purpose of developing the data model is to define: Data Relationships between data items Semantics of data items Constraints on data items

In other words, a data model defines the logical schema, i.e., the logical level of design of a database.

A data model is typically conveyed as one or more diagrams.

The type of diagrams used depends on the overall approach or philosophy (i.e., the data model, as defined in the first sense).

This early phase in database development is referred to as data modeling.

Page 12: Referans

©Silberschatz, Korth and Sudarshan1.12

Entity-Relationship ModelEntity-Relationship Model

Example of an entity-relationship diagram:

Widely used for database modeling.

An ER model is converted to tables in a relational database.

Page 13: Referans

©Silberschatz, Korth and Sudarshan1.13

Relational ModelRelational Model

Example of tabular data in the relational model:

From a data modeling perspective, which approach is preferable? The ER model, or the relational model?

Attributes

customer-name

customer-id

customer-street

customer-city

account-number

Johnson

Smith

Johnson

Jones

Smith

192-83-7465

019-28-3746

192-83-7465

321-12-3123

019-28-3746

Alma

North

Alma

Main

North

Palo Alto

Rye

Palo Alto

Harrison

Rye

A-101

A-215

A-201

A-217

A-201

Page 14: Referans

©Silberschatz, Korth and Sudarshan1.14

A Sample Relational DatabaseA Sample Relational Database

Regardless of the model, the end result is the same:

Page 15: Referans

©Silberschatz, Korth and Sudarshan1.15

Data Definition Language (DDL)Data Definition Language (DDL)

DDL is used for defining a (physical) database schema (see page 129 for a more complete example):

create table account ( account-number char(10), balance integer)

Given a DDL file, the DDL compiler generates a set of tables.

A description of those tables is stored in a data dictionary: Contains information from the database schema Frequently referred to as metadata (i.e., data about data)

Data storage and definition language (DSDL?): Language in which the storage structure and access methods

used by the database system are specified Usually an extension of the data definition language

Page 16: Referans

©Silberschatz, Korth and Sudarshan1.16

Data Manipulation Language (DML)Data Manipulation Language (DML)

DML is used for accessing and manipulating a database DML is also known as a query language

Two classes of DML: Procedural – user specifies how to get the required data

Non-procedural – user specifies what data is required without specifying how to get that data

SQL is the most widely used DML (query language): Usually referred to as a non-procedural query language

Page 17: Referans

©Silberschatz, Korth and Sudarshan1.17

SQL ExamplesSQL Examples

Find the name of the customer with customer-id 192-83-7465:

select customer.customer-namefrom customerwhere customer.customer-id = ‘192-83-7465’

Find the balances of all accounts held by the customer with customer-id 192-83-7465:

select account.balancefrom depositor, accountwhere depositor.customer-id = ‘192-83-7465’ and depositor.account-number =

account.account-number

Databases are typically accessed by: Users through a command line interface Users through a query or software editing tool Application programs that (generally) access them through

embedded SQL or an application program interface (e.g. ODBC/JDBC)

Page 18: Referans

©Silberschatz, Korth and Sudarshan1.18

Database UsersDatabase Users

Users are differentiated by the way they interact with the system

Naïve users invoke application programs that have been written previously, e.g. people accessing a database over the web, bank tellers, clerical staff, ATM users

Application programmers interact with the system by making DML calls through an API, e.g., ODBC or JDBC from within a computer program

Sophisticated users form requests in a database query language, typically submitted at the command-line, e.g., a database administrator

Specialized users write specialized database applications that do not fit into the traditional data processing framework, e.g., geo-spatial or CAD specialists

Page 19: Referans

©Silberschatz, Korth and Sudarshan1.19

Database Administrator (DBA)Database Administrator (DBA)

The DBA coordinates all the activities of the database system; has a good understanding of the enterprise’s information resources and needs.

DBA duties: Granting user authority to access the database Acting as liaison with users Installing and maintaining DBMS software Monitoring performance and performance tuning Backup and recovery

According to the book, the DBA is also responsible for: Logical and Physical schema definition and modification Access method definition Specifying integrity constraints Responding to changes in requirements

These latter tasks are typically performed by a software engineer specialized in database design, or perhaps a systems engineer

Page 20: Referans

©Silberschatz, Korth and Sudarshan1.20

Transaction ManagementTransaction Management

A transaction is a collection of operations that performs a single logical function in a database application

The backup and recovery components of a DBMS ensure that the database remains in a consistent (correct) state despite failures: system, power, network failures

operating system crashes

transaction failures.

The concurrency-control manager in a DBMS controls the interaction among the concurrent transactions, to ensure the consistency of the database.

Page 21: Referans

©Silberschatz, Korth and Sudarshan1.21

Storage ManagementStorage Management

The storage manager in a DBMS provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.

The storage manager is responsible to the following tasks: interaction with the file manager

efficient storing, retrieving and updating of data

Note that the DBMS may or may not make use of the facilities of the operating system’s file management facilities.

Page 22: Referans

©Silberschatz, Korth and Sudarshan1.22

Overall System Structure Overall System Structure

Query Optimizer

Page 23: Referans

©Silberschatz, Korth and Sudarshan1.23

Application ArchitecturesApplication Architectures

Architectures: Mainframe – client programs and DBMS reside on one

platform.

Two Tier – client programs and DBMS reside on different platforms; clients connect to DBMS via an API such as ODBC/JDBC.

Three Tier – client programs, application server (or other “middleware”), and DBMS; clients connect to DBMS indirectly through the application server (also via an API). Typically used in web-based applications.

N Tier – generalization of 2 and 3 tier architectures.