Database Design – Lecture 16 Distributed Databases

Upload: dulcie-sullivan

Post on 28-Dec-2015


Database Design – Lecture 16

Distributed Databases

Lecture Objectives

- Distributed Processing and Distributed Databases
- Distributed Database Management System (DDBMS)
- Distributed Database Design

Distributed Processing

Shares the database’s logical processing among two or more physically independent sites that are connected through a network.

Note: the data resides at only one site and is shared by the other sites (“centralized”).

Distributed Databases

Stores a logically related database over two or more physically independent sites. The sites are connected by a computer network.

Note: the database is composed of several parts known as database fragments. These fragments are located at several different sites.

Distributed Processing and Distributed Databases

- In a distributed database environment, users do not need to know the name or location of each database fragment in order to access the database; the distribution is transparent to the user.
- Distributed processing does not require a distributed database, but a distributed database requires distributed processing.
- Both distributed processing and distributed databases require a network to connect all components.


DDBMS Advantages

- Data are located near or at the “greatest demand” site, which improves performance
- Improved reliability through data replication
- Growth facilitation
- Reduced operating costs

DDBMS Disadvantages

- Complexity
- Cost
- Database design is more complex

Distributed Database Management System (DDBMS)

Governs the storage and processing of a single logically related database over interconnected computer systems in which both data and processing functions are distributed among several sites.

A DDBMS must have at least the following functions to be classified as distributed:

- Application Interface
- Validation
- Transformation
- Query Optimization
- Mapping
- I/O Interface
- Formatting
- Security
- Backup & Recovery
- DB Administration
- Concurrency Control
- Transaction Management

It must also have at least the following components:

- Computer Workstations (sites or nodes)
- Network Hardware & Software
- Communications Media

Each function in more detail:

- Application Interface: allows interaction with the end user or application programs and with other DBMSs within the distributed database
- Validation: analyzes data requests
- Transformation: determines which data request components are distributed and which ones are local
- Query Optimization: finds the best access strategy
- Mapping: determines the data location of local and remote fragments
- I/O Interface: reads or writes data from or to permanent local storage
- Formatting: prepares the data for presentation to the end user or an application program
- Security: provides data privacy at both local and remote databases
- Backup and Recovery: ensures the availability and recoverability of the database in case of a failure
- DB Administration: allows the Database Administrator to maintain the databases
- Concurrency Control: manages simultaneous data access and ensures data consistency across database fragments in the DDBMS
- Transaction Management: ensures that the data move from one consistent state to another by synchronizing transactions

A DDBMS must have at least the following components:

- Computer Workstations (sites or nodes): form the network system
- Network Hardware and Software: components that reside in each workstation and allow all sites to interact and exchange data
- Communications Media: carry data from one workstation to another
- Transaction Processor (TP): software component found in each computer that requests data; it receives and processes the application’s data requests (remote and local)
- Data Processor (DP): software component residing on each computer that stores and retrieves data located at the site
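The split of work between the TP and the DP can be sketched in Python. This is a toy model under stated assumptions, not a real DDBMS interface: the class names, the `location` lookup table, and the site names are all invented for illustration.

```python
class DataProcessor:
    """Stores and retrieves the data located at one site (hypothetical model)."""

    def __init__(self, site, rows):
        self.site = site
        self.rows = rows  # data held locally at this site

    def read(self, key):
        return self.rows.get(key)


class TransactionProcessor:
    """Receives an application's data requests and forwards each to the right DP."""

    def __init__(self, local_site, processors, location):
        self.local_site = local_site
        self.processors = processors  # site name -> DataProcessor
        self.location = location      # key -> site name (assumed lookup table)

    def request(self, key):
        site = self.location[key]
        kind = "local" if site == self.local_site else "remote"
        return kind, self.processors[site].read(key)


processors = {
    "site_a": DataProcessor("site_a", {"C1": "Alice"}),
    "site_b": DataProcessor("site_b", {"C2": "Bob"}),
}
tp = TransactionProcessor("site_a", processors, {"C1": "site_a", "C2": "site_b"})
print(tp.request("C1"))  # ('local', 'Alice')
print(tp.request("C2"))  # ('remote', 'Bob')
```

The application only calls `tp.request`; whether the data came from the local or a remote DP is decided inside the TP.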

[Figure: Distributed Database Environment]


Distributed Database Design

The design process for a relational database structure does not change: start with a top-down approach. However, the following also need to be considered:

- How to partition the database into fragments
- Which fragments to replicate
- Where to locate those fragments and replicas

More frequently used fragments should be stored locally. Fragments used by all users should be stored centrally.

Data Fragmentation:

- Allows a single object to be broken into two or more segments or fragments
- Each fragment can be stored at any site on the network
- Data fragmentation information is stored in the distributed data catalog (DDC), from which it is accessed by the TP to process user requests

Types of Data Fragmentation:

- Horizontal
- Vertical
- Mixed

Horizontal fragmentation:

- The division of a relation into subsets of tuples (rows)
- Each fragment is stored at a different node, and each fragment has unique rows
- Each tuple has the same attributes (columns), but the rows are fragmented
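A minimal Python sketch of horizontal fragmentation, assuming a hypothetical CUSTOMER relation split by a `state` attribute. The rows, column names, and function name are invented for illustration.

```python
from collections import defaultdict

# Hypothetical CUSTOMER relation; the rows and column names are invented.
customers = [
    {"cus_num": 10, "name": "Sinex", "state": "TN"},
    {"cus_num": 11, "name": "Martin", "state": "FL"},
    {"cus_num": 12, "name": "Mydex", "state": "GA"},
    {"cus_num": 13, "name": "BTBC", "state": "FL"},
]


def horizontal_fragments(rows, attribute):
    """Group whole rows by the value of one attribute."""
    fragments = defaultdict(list)
    for row in rows:
        fragments[row[attribute]].append(row)
    return dict(fragments)


fragments = horizontal_fragments(customers, "state")

# Rebuilding the relation is a union of the fragments.
rebuilt = [row for rows in fragments.values() for row in rows]
```

Every fragment keeps all columns, and each row lands in exactly one fragment, so the union of the fragments gives back the original rows.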

[Figure: example of horizontal fragmentation, showing the original structure and the resulting fragmented structure, split by state]

Vertical fragmentation:

- The division of a relation into subsets of attributes (columns)
- Each subset is stored at a different node, and each fragment has unique columns, with the exception of the key column, which is common to all fragments
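A minimal Python sketch of vertical fragmentation, assuming a hypothetical CUSTOMER relation with key `cus_num`. The attribute names and the two fragment names are invented for illustration.

```python
# Hypothetical CUSTOMER rows; the attribute names are assumptions.
customers = [
    {"cus_num": 10, "name": "Sinex", "address": "12 Main St", "balance": 1200.0},
    {"cus_num": 11, "name": "Martin", "address": "321 Sunset", "balance": 0.0},
]


def vertical_fragment(rows, key, attributes):
    """Project each row onto the key plus the chosen attributes."""
    return [{name: row[name] for name in (key, *attributes)} for row in rows]


def rejoin(left, right, key):
    """Rebuild full rows by matching the two fragments on the shared key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


service = vertical_fragment(customers, "cus_num", ["name", "address"])
collection = vertical_fragment(customers, "cus_num", ["balance"])
```

Because both fragments carry the key column, `rejoin(service, collection, "cus_num")` reproduces the original rows.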

Transaction issues arise here because the same record may need to be inserted into two tables (part of the record into one table and the other part into another). If only one insert succeeds, the data end up inconsistent.
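The all-or-nothing requirement can be illustrated with a toy Python sketch. It undoes the inserts that already succeeded when a later one fails. This is a simplification chosen for the example: a real DDBMS would rely on its transaction manager (typically a commit protocol such as two-phase commit), and the `FragmentStore` class and its `fail` flag are hypothetical.

```python
class FragmentStore:
    """Toy stand-in for one site's fragment; fail=True simulates a failed insert."""

    def __init__(self, fail=False):
        self.rows = {}
        self.fail = fail

    def insert(self, key, values):
        if self.fail:
            raise RuntimeError("insert failed at this site")
        self.rows[key] = values

    def delete(self, key):
        self.rows.pop(key, None)


def insert_record(key, parts):
    """Insert every (store, values) part, or undo the ones already done."""
    done = []
    try:
        for store, values in parts:
            store.insert(key, values)
            done.append(store)
    except RuntimeError:
        for store in done:
            store.delete(key)  # compensate so no partial record remains
        return False
    return True


names = FragmentStore()
balances = FragmentStore(fail=True)  # this site rejects the insert
ok = insert_record(10, [(names, {"name": "Sinex"}), (balances, {"balance": 1200.0})])
```

Here the second insert fails, so the first one is removed again and neither fragment holds a partial record.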

[Figure: example of vertical fragmentation, showing the original structure and the fragmented structure, split by location]

Mixed fragmentation:

- A combination of horizontal and vertical strategies
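A minimal Python sketch of mixed fragmentation, assuming the rows are first split horizontally by `state` and each horizontal fragment is then split vertically into two column groups. The data, the group names, and the function name are invented for illustration.

```python
# Hypothetical CUSTOMER rows; the attribute names are assumptions.
customers = [
    {"cus_num": 10, "name": "Sinex", "state": "TN", "balance": 1200.0},
    {"cus_num": 11, "name": "Martin", "state": "FL", "balance": 0.0},
    {"cus_num": 12, "name": "BTBC", "state": "FL", "balance": 340.5},
]


def mixed_fragments(rows, split_on, key, column_groups):
    """Return {(split value, group name): rows projected onto key + group columns}."""
    fragments = {}
    for row in rows:
        for group, columns in column_groups.items():
            projected = {name: row[name] for name in (key, *columns)}
            fragments.setdefault((row[split_on], group), []).append(projected)
    return fragments


fragments = mixed_fragments(
    customers,
    split_on="state",
    key="cus_num",
    column_groups={"service": ["name"], "collection": ["balance"]},
)
```

Each resulting fragment holds a subset of the rows and a subset of the columns, plus the key.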

[Figure: example of mixed fragmentation]

Data Replication

- Storage of data copies at multiple sites served by a computer network
- Fragment copies can be stored at several sites to serve specific information requirements
- Can enhance data availability and response time
- Can help to reduce communication and total query costs

Replication Scenarios

- Fully replicated database: stores multiple copies of each database fragment at multiple sites. Can be impractical due to the amount of overhead.
- Partially replicated database: stores multiple copies of some database fragments at multiple sites. Most DDBMSs are able to handle the partially replicated database well.
- Unreplicated database: stores each database fragment at a single site. There are no duplicate database fragments.
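The three scenarios can be told apart from a placement map. The sketch below assumes a hypothetical `placement` dictionary mapping each fragment name to the set of sites that hold a copy; the function name and the fragment and site names are invented.

```python
def replication_scenario(placement):
    """Classify a placement map: fragment name -> set of sites holding a copy."""
    if not placement:
        raise ValueError("placement must contain at least one fragment")
    replicated = [len(sites) > 1 for sites in placement.values()]
    if all(replicated):
        return "fully replicated"      # every fragment has multiple copies
    if any(replicated):
        return "partially replicated"  # only some fragments have multiple copies
    return "unreplicated"              # every fragment lives at a single site


print(replication_scenario({"f1": {"a", "b"}, "f2": {"a", "b"}}))  # fully replicated
print(replication_scenario({"f1": {"a", "b"}, "f2": {"c"}}))       # partially replicated
print(replication_scenario({"f1": {"a"}, "f2": {"b"}}))            # unreplicated
```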

Data Allocation

Deciding where to locate data. Allocation strategies:

- Centralized data allocation: the entire database is stored at one site
- Partitioned data allocation: the database is divided into several disjoint parts (fragments) and stored at several sites
- Replicated data allocation: copies of one or more database fragments are stored at several sites

Data distribution over a computer network is achieved through data partition, data replication, or a combination of both.
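One simple way to make a partitioned allocation decision is to place each fragment at the site that uses it most. The Python sketch below shows that heuristic only; it is an assumption made for illustration, not a method prescribed by the lecture, and the access counts and site names are invented.

```python
def allocate_by_demand(access_counts):
    """Place each fragment at the site with the most accesses to it.

    access_counts maps fragment name -> {site name: number of accesses}.
    """
    return {
        fragment: max(counts, key=counts.get)
        for fragment, counts in access_counts.items()
    }


# Hypothetical access statistics per fragment and site.
access_counts = {
    "customer_fl": {"miami": 120, "atlanta": 15},
    "customer_ga": {"miami": 8, "atlanta": 95},
}
allocation = allocate_by_demand(access_counts)
print(allocation)  # {'customer_fl': 'miami', 'customer_ga': 'atlanta'}
```

A real allocation decision would also weigh factors such as storage and communication costs, which this sketch ignores.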

How is a distributed database managed?

Distributed Data Catalog (DDC):

- Contains the description of the entire database as seen by the DBA
- Translates user requests into sub-queries (remote requests) that will be processed by different DPs
- The DDC is distributed and replicated at the network nodes (the locations of the database fragments)
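The catalog lookup can be sketched in Python as a dictionary from fragment name to its site and the rows it holds. The catalog layout, the `state` predicate, and all names below are assumptions for illustration; a real DDC stores far more than this.

```python
# Hypothetical catalog: fragment name -> site and the state whose rows it holds.
catalog = {
    "customer_fl": {"site": "miami", "state": "FL"},
    "customer_ga": {"site": "atlanta", "state": "GA"},
    "customer_tn": {"site": "nashville", "state": "TN"},
}


def plan_subqueries(catalog, states):
    """Return one (site, fragment) sub-request per fragment that can hold matching rows."""
    return sorted(
        (entry["site"], fragment)
        for fragment, entry in catalog.items()
        if entry["state"] in states
    )


plan = plan_subqueries(catalog, {"FL", "TN"})
print(plan)  # [('miami', 'customer_fl'), ('nashville', 'customer_tn')]
```

A request for customers in FL and TN is turned into two sub-requests, one per site, and the fragment in Atlanta is never contacted.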

Examples of Distributed Databases

Banking:

- Account data distributed at each local branch
- Loan data distributed at each local branch
- Corporate data at head office (summarized branch information)

Insurance:

- Policy data with each branch
- Corporate data at head office

Retail:

- Inventory data distributed at each local store
- Employee scheduling data at each store
- Corporate data at head office (summarized store information)
- Payroll data at head office

Utilities:

- Utility monitoring data at each location (e.g. nuclear station monitoring of air, water, etc. at each location)
- Corporate data at head office

Distributed Database vs Client/Server

- Client/Server is an architecture that models a computerized solution based on the distribution of functions between servers and clients. A client requests specific services from a server, and a server provides the requested services to clients.
- Distributed processing could be one aspect of a client/server architecture, with the data “centralized”.
- The DDBMS distributes data to different locations and could be used in a client/server architecture.

Distributed Database Design Steps:

1. Always start with a centralized view design
2. Consider horizontal fragmentation of a centralized database
3. Consider vertical fragmentation of a horizontally fragmented database
4. Re-consider the PK for all fragments of the database
5. Define data replication rules (scenarios)
6. Complete the design