Distributed Database Systems Lecture No 1

Upload: zubeyir

Post on 10-Apr-2018


TRANSCRIPT

  • 8/8/2019 Distributed Database Systems Lecture No 1

    1/39


    NEED FOR A DISTRIBUTED DATABASE SYSTEM

    Organization size

    Organization branches

    Computerized information system at each site

    The same data at different locations

      - How to best utilize the resources?

      - How to maintain the integrity of data at different locations?


    OUTLINE

    Introduction

    What is a distributed DBMS

    Problems

    Current state-of-affairs

    Background

    Distributed DBMS Architecture

    Distributed Database Design (Briefly)

    Distributed Query Processing (Briefly)

    Distributed Transaction Management (Extensive)

    Mobile Database Systems (research paper based)

    Privacy, Trust, and Authentication (research paper based)

    Peer to Peer Systems (research paper based)


    OUTLINE

    Introduction

      - What is a distributed DBMS

      - Problems

      - Current state-of-affairs

    Background

    Distributed DBMS Architecture

    Distributed Database Design (Briefly)

    Distributed Query Processing (Briefly)

    Distributed Transaction Management (Extensive)

    Mobile Database Systems (research paper based)

    Privacy, Trust, and Authentication (research paper based)

    Peer to Peer Systems (research paper based)


    FILE SYSTEMS

    [Figure: programs 1, 2, and 3, each with its own data description (1, 2, 3), working on File 1, File 2, and File 3]


    DATABASE MANAGEMENT

    [Figure: Application programs 1, 2, and 3 (each with data semantics) reach the database through the DBMS, which handles description, manipulation, and control]


    INTEGRATE DATABASES AND COMMUNICATION

    [Figure: Database Technology (integration) and Computer Networks (distribution) are combined, through integration, into Distributed Database Systems]


    DISTRIBUTED COMPUTING

    A number of autonomous processing elements (not necessarily homogeneous) that are interconnected by a computer network and that cooperate in performing their assigned tasks.


    DISTRIBUTED COMPUTING

    Synonymous terms

      - distributed data processing

      - multiprocessors/multicomputers

      - satellite processing

      - backend processing

      - dedicated/special purpose computers

      - timeshared systems

      - functionally modular systems

      - Peer to Peer Systems


    WHAT IS DISTRIBUTED

    Processing logic

    Functions

    Data

    Control


    WHAT IS A DISTRIBUTED DATABASE SYSTEM?

    A distributed database (DDB) is a collection of multiple, logically interrelated databases distributed over a computer network.

    A distributed database management system (DDBMS) is the software that manages the DDB and provides an access mechanism that makes this distribution transparent to the users.

    Distributed database system (DDBS) = DB + Communication


    WHAT IS NOT A DDBS?

    A timesharing computer system

    A loosely or tightly coupled multiprocessor system

    A database system which resides at one of the nodes of a network of computers - this is a centralized database on a network node


    CENTRALIZED DBMS ON A NETWORK

    [Figure: Sites 1 to 5 connected by a communication network]


    DISTRIBUTED DBMS ENVIRONMENT

    [Figure: Sites 1 to 5 connected by a communication network]


    SHARED-MEMORY ARCHITECTURE

    Examples: symmetric multiprocessors (Sequent, Encore) and some mainframes (IBM 3090, Bull's DPS8)

    [Figure: processors P1 ... Pn with memory M and disk D]


    SHARED-NOTHING ARCHITECTURE

    Examples: Teradata's DBC, Tandem, Intel's Paragon, NCR's 3600 and 3700

    [Figure: processors P1 ... Pn, each with its own memory (M1 ... Mn) and disk (D1 ... Dn)]


    APPLICATIONS

    Manufacturing - especially multi-plant manufacturing

    Military command and control

    Electronic fund transfers and electronic trading

    Corporate MIS

    Airline reservations

    Hotel chains

    Any organization which has a decentralized organization structure


    DISTRIBUTED DBMS PROMISES

    Transparent management of distributed, fragmented, and replicated data

    Improved reliability/availability through distributed transactions

    Improved performance

    Easier and more economical system expansion


    TRANSPARENCY

    Transparency is the separation of the higher level semantics of a system from the lower level implementation issues.

    Fundamental issue is to provide data independence in the distributed environment

      - Network (distribution) transparency

      - Replication transparency

      - Fragmentation transparency

          horizontal fragmentation: selection

          vertical fragmentation: projection

          hybrid
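
The two fragmentation types above can be sketched in a few lines of Python. This is only an illustration, not part of the lecture: the sample EMP rows, the split on TITLE, and the helper names are assumptions. Horizontal fragments are selections (subsets of rows) and are rebuilt by union; vertical fragments are projections (subsets of columns that keep the key) and are rebuilt by a join on the key.

```python
# Hypothetical EMP rows; attribute names follow the EMP relation used in this lecture.
EMP = [
    {"ENO": "E1", "ENAME": "J. Doe", "TITLE": "Elect. Eng."},
    {"ENO": "E2", "ENAME": "M. Smith", "TITLE": "Syst. Anal."},
    {"ENO": "E3", "ENAME": "A. Lee", "TITLE": "Elect. Eng."},
]


def horizontal_fragment(rows, predicate):
    """Selection: keep whole rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def vertical_fragment(rows, attributes):
    """Projection: keep only the listed attributes of every row."""
    return [{a: row[a] for a in attributes} for row in rows]


def join_on_key(left, right, key):
    """Rebuild rows from two vertical fragments that share the key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


# Horizontal fragmentation: two disjoint selections whose union is EMP.
engineers = horizontal_fragment(EMP, lambda r: r["TITLE"] == "Elect. Eng.")
others = horizontal_fragment(EMP, lambda r: r["TITLE"] != "Elect. Eng.")

# Vertical fragmentation: two projections that both keep the key ENO.
names = vertical_fragment(EMP, ["ENO", "ENAME"])
titles = vertical_fragment(EMP, ["ENO", "TITLE"])

rebuilt_from_horizontal = sorted(engineers + others, key=lambda r: r["ENO"])
rebuilt_from_vertical = join_on_key(names, titles, "ENO")
```

Hybrid fragmentation would apply one kind of split to the fragments produced by the other.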


    TRANSPARENT ACCESS

    SELECT ENAME, SAL
    FROM EMP, ASG, PAY
    WHERE DUR > 12
    AND EMP.ENO = ASG.ENO
    AND PAY.TITLE = EMP.TITLE

    [Figure: sites Tokyo, Boston, Paris, Montreal, and New York connected by a communication network. Data labels shown in the figure: Paris projects, Paris employees, Paris assignments, Boston employees; Montreal projects, Paris projects, New York projects with budget > 200000, Montreal employees, Montreal assignments; Boston projects, Boston employees, Boston assignments; Boston projects, New York employees, New York projects, New York assignments.]
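
From the user's side, the query above is written as if all three relations were local. The sketch below runs the same query text with Python's sqlite3 module. It assumes a single in-memory SQLite database standing in for the distributed system, and the sample rows are made up; a real DDBMS would have to locate and combine the fragments behind the same query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMP (ENO TEXT PRIMARY KEY, ENAME TEXT, TITLE TEXT);
    CREATE TABLE PAY (TITLE TEXT PRIMARY KEY, SAL INTEGER);
    CREATE TABLE ASG (ENO TEXT, PNO TEXT, RESP TEXT, DUR INTEGER,
                      PRIMARY KEY (ENO, PNO));

    -- Hypothetical sample rows.
    INSERT INTO EMP VALUES ('E1', 'J. Doe', 'Elect. Eng.'),
                           ('E2', 'M. Smith', 'Syst. Anal.');
    INSERT INTO PAY VALUES ('Elect. Eng.', 40000), ('Syst. Anal.', 34000);
    INSERT INTO ASG VALUES ('E1', 'P1', 'Manager', 12),
                           ('E2', 'P1', 'Analyst', 24);
""")

# Same query as on the slide: names and salaries of employees
# assigned to a project for more than 12 months.
rows = conn.execute("""
    SELECT ENAME, SAL
    FROM EMP, ASG, PAY
    WHERE DUR > 12
      AND EMP.ENO = ASG.ENO
      AND PAY.TITLE = EMP.TITLE
""").fetchall()
print(rows)
```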


    OUTLINE

    Introduction

    Background

    Distributed DBMS Architecture

    Distributed Database Design (Briefly)

    Distributed Query Processing (Briefly)

    Distributed Transaction Management (Extensive)

    Building Distributed Database Systems (RAID)

    Mobile Database Systems

    Privacy, Trust, and Authentication

    Peer to Peer Systems


    DISTRIBUTED DATABASE - USER VIEW

    [Figure: Distributed Database]


    DISTRIBUTED DBMS - REALITY

    [Figure: user queries and user applications, each attached to DBMS software, with the DBMS software instances linked by a communication subsystem]


    POTENTIALLY IMPROVED PERFORMANCE

    Proximity of data to its points of use

      - Requires some support for fragmentation and replication

    Parallelism in execution

      - Inter-query parallelism

      - Intra-query parallelism
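
Intra-query parallelism can be illustrated by running one query's scan over several fragments at once and merging the partial results. The sketch below is an assumption-laden illustration: the fragment contents, the site names used as keys, and the `scan_fragment` and `parallel_select` helpers are hypothetical, and threads stand in for separate sites.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical horizontal fragments of ASG, one per site: (ENO, PNO, DUR).
FRAGMENTS = {
    "Paris": [("E1", "P1", 12), ("E2", "P1", 24)],
    "Boston": [("E3", "P2", 6), ("E4", "P3", 48)],
    "Montreal": [("E5", "P4", 18)],
}


def scan_fragment(rows, min_duration):
    """Local work at one site: select assignments longer than min_duration."""
    return [row for row in rows if row[2] > min_duration]


def parallel_select(fragments, min_duration):
    """Scan every fragment concurrently, then union the partial results."""
    with ThreadPoolExecutor(max_workers=len(fragments)) as pool:
        partials = pool.map(lambda rows: scan_fragment(rows, min_duration),
                            fragments.values())
        return sorted(row for partial in partials for row in partial)


result = parallel_select(FRAGMENTS, 12)
```

Inter-query parallelism would instead submit several independent queries to the pool at the same time.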


    SYSTEM EXPANSION

    Issue is database scaling

    Peer to Peer systems

    Communication overhead


    DISTRIBUTED DBMS ISSUES

    Distributed Database Design

      - how to distribute the database

      - replicated & non-replicated database distribution

      - a related problem in directory management

    Query Processing

      - convert user transactions to data manipulation instructions

      - optimization problem

      - min{cost = data transmission + local processing}

      - general formulation is NP-hard
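
The cost objective above, min{cost = data transmission + local processing}, can be made concrete with a toy plan chooser. Everything specific here is an assumption for illustration: the `Plan` fields, the per-byte and per-tuple weights, and the three candidate plans with their numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    bytes_shipped: int      # data moved between sites
    tuples_processed: int   # tuples handled by local processing


def plan_cost(plan, cost_per_byte=1.0, cost_per_tuple=0.1):
    """cost = data transmission + local processing (weights are assumptions)."""
    transmission = plan.bytes_shipped * cost_per_byte
    local = plan.tuples_processed * cost_per_tuple
    return transmission + local


def cheapest(plans):
    """Exhaustive search over the candidate plans."""
    return min(plans, key=plan_cost)


# Hypothetical candidates for joining EMP and ASG stored at different sites.
candidates = [
    Plan("ship EMP to the ASG site", bytes_shipped=4_000, tuples_processed=1_400),
    Plan("ship ASG to the EMP site", bytes_shipped=10_000, tuples_processed=1_400),
    Plan("ship both to a third site", bytes_shipped=14_000, tuples_processed=1_400),
]
best = cheapest(candidates)
```

Enumerating every plan only works for a handful of candidates. Because the general formulation is NP-hard, practical optimizers rely on heuristics or restricted search spaces.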


    DISTRIBUTED DBMS ISSUES

    Concurrency Control

      - Synchronization of concurrent accesses

      - Consistency and isolation of transactions' effects

      - Deadlock management

    Reliability

      - How to make the system resilient to failures

      - Atomicity and durability

    Privacy/Security

      - Keep database access private

      - Protect against malicious activities

    Trusted Collaborations (Emerging requirements)

      - Evaluate trust among users and database sites

      - Enforce policies for privacy

      - Enforce integrity
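
Deadlock management is listed above without a specific technique. One common approach, shown here as an assumption rather than as the lecture's method, is to look for a cycle in a wait-for graph, where an edge T1 -> T2 means transaction T1 is waiting for a lock held by T2. The transaction names and the `find_deadlock` helper are hypothetical.

```python
def find_deadlock(wait_for):
    """Return one cycle of transactions in the wait-for graph, or None."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    nodes = set(wait_for) | {t for ts in wait_for.values() for t in ts}
    colour = {t: WHITE for t in nodes}
    path = []

    def visit(txn):
        colour[txn] = GREY
        path.append(txn)
        for holder in wait_for.get(txn, ()):
            if colour[holder] == GREY:          # back edge closes a cycle
                return path[path.index(holder):]
            if colour[holder] == WHITE:
                cycle = visit(holder)
                if cycle:
                    return cycle
        path.pop()
        colour[txn] = BLACK
        return None

    for txn in sorted(nodes):
        if colour[txn] == WHITE:
            cycle = visit(txn)
            if cycle:
                return cycle
    return None


# T1 waits for T2, T2 for T3, T3 for T1: a deadlock.
deadlocked = find_deadlock({"T1": ["T2"], "T2": ["T3"], "T3": ["T1"]})
# T1 waits for T2, T2 for T3: no cycle.
clear = find_deadlock({"T1": ["T2"], "T2": ["T3"]})
```

In a distributed setting the waits span several sites, so the graph has to be assembled from per-site information before a check like this can run.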


    RELATIONSHIP BETWEEN ISSUES

    [Figure: relationships among Directory Management, Query Processing, Distribution Design, Concurrency Control, Deadlock Management, and Reliability]


    RELATED ISSUES

    Operating System Support

      - operating system with proper support for database operations

      - dichotomy between general purpose processing requirements and database processing requirements

    Open Systems and Interoperability

      - Distributed Multidatabase Systems

      - More probable scenario

      - Parallel issues

    Network Behavior


    OUTLINE

    Introduction

    Background

    Distributed DBMS Architecture

      - Introduction to Database Concepts

          Architecture, Schema, Views

      - Alternatives in Distributed Database Systems

      - Datalogical Architecture

      - Implementation Alternatives

      - Component Architecture

    Distributed Database Design (Briefly)

    Distributed Query Processing (Briefly)

    Distributed Transaction Management (Extensive)

    Building Distributed Database Systems (RAID)

    Mobile Database Systems

    Privacy, Trust, and Authentication

    Peer to Peer Systems


    ARCHITECTURE OF A DATABASE SYSTEM

    Background materials of database architecture

    Defines the structure of the system

      - components identified

      - functions of each component defined

      - interrelationships and interactions between components defined


    ANSI/SPARC ARCHITECTURE

    [Figure: users work with external views (external schema); below them are the conceptual view (conceptual schema) and the internal view (internal schema)]


    STANDARDIZATION

    Reference Model

      - A conceptual framework whose purpose is to divide standardization work into manageable pieces and to show at a general level how these pieces are related to one another.

    Approaches

      - Component-based

          Components of the system are defined together with the interrelationships between components.

          Good for design and implementation of the system.

      - Function-based

          Classes of users are identified together with the functionality that the system will provide for each class.

          The objectives of the system are clearly identified. But how do you achieve these objectives?

      - Data-based

          Identify the different types of describing data and specify the functional units that will realize and/or use data according to these views.


    CONCEPTUAL SCHEMA DEFINITION

    RELATION EMP [
      KEY = {ENO}
      ATTRIBUTES = {
        ENO : CHARACTER(9)
        ENAME : CHARACTER(15)
        TITLE : CHARACTER(10)
      }
    ]

    RELATION PAY [
      KEY = {TITLE}
      ATTRIBUTES = {
        TITLE : CHARACTER(10)
        SAL : NUMERIC(6)
      }
    ]


    CONCEPTUAL SCHEMA DEFINITION

    RELATION PROJ [
      KEY = {PNO}
      ATTRIBUTES = {
        PNO : CHARACTER(7)
        PNAME : CHARACTER(20)
        BUDGET : NUMERIC(7)
      }
    ]

    RELATION ASG [
      KEY = {ENO,PNO}
      ATTRIBUTES = {
        ENO : CHARACTER(9)
        PNO : CHARACTER(7)
        RESP : CHARACTER(10)
        DUR : NUMERIC(3)
      }
    ]
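
The four RELATION definitions (EMP, PAY, PROJ, ASG) can be approximated in SQL DDL. The sketch below uses Python's sqlite3 module, which is a choice made here and not something the lecture prescribes. The declared types are copied from the slides; SQLite accepts them but does not enforce the lengths.

```python
import sqlite3

# KEY = {...} becomes PRIMARY KEY; ATTRIBUTES become columns.
DDL = """
CREATE TABLE EMP (
    ENO   CHARACTER(9) PRIMARY KEY,
    ENAME CHARACTER(15),
    TITLE CHARACTER(10)
);
CREATE TABLE PAY (
    TITLE CHARACTER(10) PRIMARY KEY,
    SAL   NUMERIC(6)
);
CREATE TABLE PROJ (
    PNO    CHARACTER(7) PRIMARY KEY,
    PNAME  CHARACTER(20),
    BUDGET NUMERIC(7)
);
CREATE TABLE ASG (
    ENO  CHARACTER(9),
    PNO  CHARACTER(7),
    RESP CHARACTER(10),
    DUR  NUMERIC(3),
    PRIMARY KEY (ENO, PNO)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)

# List the tables that were created, in name order.
tables = [name for (name,) in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
```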


    INTERNAL SCHEMA DEFINITION

    RELATION EMP [
      KEY = {ENO}
      ATTRIBUTES = {
        ENO : CHARACTER(9)
        ENAME : CHARACTER(15)
        TITLE : CHARACTER(10)
      }
    ]

    INTERNAL_REL EMPL [
      INDEX ON E# CALL EMINX
      FIELD = {
        HEADER : BYTE(1)
        E# : BYTE(9)
        ENAME : BYTE(15)
        TIT : BYTE(10)
      }
    ]
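
The internal schema stores each EMP tuple as a fixed-length record: a 1-byte header followed by 9, 15, and 10 bytes for E#, ENAME, and TIT. The sketch below packs such a record with Python's struct module. The ASCII encoding, the NUL padding, the header value of 1, and the helper names are assumptions; the slides do not say how the fields are encoded or what the header holds.

```python
import struct

# HEADER: BYTE(1), E#: BYTE(9), ENAME: BYTE(15), TIT: BYTE(10)
EMPL_RECORD = struct.Struct("=B9s15s10s")


def pack_empl(eno, ename, title, header=1):
    """Encode one EMP tuple as a fixed-length EMPL record."""
    return EMPL_RECORD.pack(header, eno.encode("ascii"),
                            ename.encode("ascii"), title.encode("ascii"))


def unpack_empl(record):
    """Decode an EMPL record back into (ENO, ENAME, TITLE)."""
    _header, eno, ename, title = EMPL_RECORD.unpack(record)
    return tuple(f.rstrip(b"\x00").decode("ascii") for f in (eno, ename, title))


record = pack_empl("E1", "J. Doe", "Elect. Eng")
```

Every record has the same length (35 bytes here), which is what lets an index such as the one on E# locate a record by position.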


    EXTERNAL VIEW DEFINITION EXAMPLE 1

    Create a BUDGET view from the PROJ relation

    CREATE VIEW BUDGET(PNAME, BUD)
    AS SELECT PNAME, BUDGET
    FROM PROJ


    EXTERNAL VIEW DEFINITION EXAMPLE 2

    Create a Payroll view from relations EMP and PAY

    CREATE VIEW PAYROLL (ENO, ENAME, SAL)
    AS SELECT EMP.ENO, EMP.ENAME, PAY.SAL
    FROM EMP, PAY
    WHERE EMP.TITLE = PAY.TITLE
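
Both external views can be tried out with Python's sqlite3 module. This is a sketch under assumptions: SQLite stands in for the DBMS, the sample rows are made up, and the column list after the view name needs SQLite 3.9 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMP (ENO TEXT PRIMARY KEY, ENAME TEXT, TITLE TEXT);
    CREATE TABLE PAY (TITLE TEXT PRIMARY KEY, SAL INTEGER);
    CREATE TABLE PROJ (PNO TEXT PRIMARY KEY, PNAME TEXT, BUDGET INTEGER);

    -- Hypothetical sample rows.
    INSERT INTO EMP VALUES ('E1', 'J. Doe', 'Elect. Eng.');
    INSERT INTO PAY VALUES ('Elect. Eng.', 40000);
    INSERT INTO PROJ VALUES ('P1', 'Instrumentation', 150000);

    -- Example 1: rename BUDGET to BUD and hide PNO.
    CREATE VIEW BUDGET (PNAME, BUD)
        AS SELECT PNAME, BUDGET FROM PROJ;

    -- Example 2: join EMP and PAY on TITLE.
    CREATE VIEW PAYROLL (ENO, ENAME, SAL)
        AS SELECT EMP.ENO, EMP.ENAME, PAY.SAL
           FROM EMP, PAY
           WHERE EMP.TITLE = PAY.TITLE;
""")

budget_rows = conn.execute("SELECT PNAME, BUD FROM BUDGET").fetchall()
payroll_rows = conn.execute("SELECT ENO, ENAME, SAL FROM PAYROLL").fetchall()
```

Users of the views see only the listed columns; the underlying relations and the join stay hidden, which is the role of the external schema.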