
Page 1: Chapter 23 Distributed DBMSs - Advanced Concepts (Transparencies)

© Pearson Education Limited 1995, 2005

Page 2: Chapter 23 - Objectives

– Distributed transaction management.
– Distributed concurrency control.
– Distributed deadlock detection.
– Distributed recovery control.
– Distributed integrity control.
– X/OPEN DTP standard.
– Distributed query optimization.
– Oracle's DDBMS functionality.

Page 3: Distributed Transaction Management

Distributed transaction accesses data stored at more than one location.

Divided into a number of sub-transactions, one for each site that has to be accessed, represented by an agent.

Indivisibility of distributed transaction is still fundamental to transaction concept.

DDBMS must also ensure indivisibility of each sub-transaction.

Page 4: Distributed Transaction Management

Thus, DDBMS must ensure:
– synchronization of subtransactions with other local transactions executing concurrently at a site;
– synchronization of subtransactions with global transactions running simultaneously at same or different sites.

Global transaction manager (transaction coordinator) at each site, to coordinate global and local transactions initiated at that site.

Page 5: Coordination of Distributed Transaction

Page 6: Distributed Locking

Look at four schemes:
– Centralized Locking.
– Primary Copy 2PL.
– Distributed 2PL.
– Majority Locking.

Page 7: Centralized Locking

Single site maintains all locking information: one lock manager for the whole of the DDBMS.

Local transaction managers involved in a global transaction request and release locks from the lock manager. Alternatively, the transaction coordinator can make all locking requests on behalf of local transaction managers.

Advantage: easy to implement. Disadvantages: bottlenecks and lower reliability.

Page 8: Primary Copy 2PL

Lock managers distributed to a number of sites. Each lock manager responsible for managing locks for a set of data items.

For a replicated data item, one copy is chosen as primary copy; the others are slave copies.

Only need to write-lock primary copy of data item that is to be updated. Once primary copy has been updated, change can be propagated to slaves.

Page 9: Primary Copy 2PL

Disadvantages - deadlock handling is more complex; still a degree of centralization in system.

Advantages - lower communication costs and better performance than centralized 2PL.

Page 10: Distributed 2PL

Lock managers distributed to every site. Each lock manager responsible for locks for data at that site.

If data not replicated, equivalent to primary copy 2PL.

Otherwise, implements a Read-One-Write-All (ROWA) replica control protocol.

Page 11: Distributed 2PL

Using ROWA protocol:
– Any copy of replicated item can be used for read.
– All copies must be write-locked before item can be updated.

Disadvantages: deadlock handling more complex; communication costs higher than primary copy 2PL.

Page 12: Majority Locking

Extension of distributed 2PL.

To read or write a data item replicated at n sites, transaction sends a lock request to more than half of the n sites where the item is stored.

Transaction cannot proceed until majority of locks obtained.

Overly strong in case of read locks.
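A minimal sketch of the majority rule, assuming a synchronous send_lock_request(site, item, mode) callback that returns True when a site grants the lock; timeouts, retries, and releasing locks on failure are omitted.

def acquire_majority_lock(item, mode, sites, send_lock_request):
    # Succeed once more than half of the n sites holding a copy grant the lock.
    needed = len(sites) // 2 + 1                  # strict majority of n copies
    granted = 0
    for site in sites:
        if send_lock_request(site, item, mode):   # True if this site grants
            granted += 1
            if granted >= needed:
                return True                       # transaction may proceed
    return False                                  # majority not obtained: block or retry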

Page 13: Distributed Timestamping

Objective is to order transactions globally so older transactions (smaller timestamps) get priority in event of conflict.

In distributed environment, need to generate unique timestamps both locally and globally.

System clock or incremental event counter at each site is unsuitable on its own, as two sites could generate the same value.

Concatenate local timestamp with a unique site identifier: <local timestamp, site identifier>.

Page 14: Distributed Timestamping

Site identifier placed in least significant position to ensure events ordered according to their occurrence as opposed to their location.

To prevent a busy site generating larger timestamps than slower sites:
– Each site includes its timestamp in inter-site messages.
– A site compares its timestamp with the timestamp in the message and, if its timestamp is smaller, sets it to some value greater than the message timestamp.
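A minimal sketch of this scheme (a Lamport-style clock); the class and method names are illustrative assumptions, not from the text.

class SiteClock:
    def __init__(self, site_id):
        self.site_id = site_id        # unique site identifier (least significant part)
        self.counter = 0              # local event counter

    def next_timestamp(self):
        # Generate a globally unique timestamp for a local event.
        self.counter += 1
        return (self.counter, self.site_id)   # tuples compare counter first

    def on_message(self, sender_counter):
        # Synchronization rule above: jump past a larger incoming counter.
        if self.counter < sender_counter:
            self.counter = sender_counter + 1

Because the site identifier occupies the least significant position, timestamps order events by occurrence first and by location only to break ties.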

Page 15: Distributed Deadlock

More complicated if lock management is not centralized.

Local Wait-for-Graph (LWFG) may not show existence of deadlock.

May need to create GWFG, union of all LWFGs.

Look at three schemes:
– Centralized Deadlock Detection.
– Hierarchical Deadlock Detection.
– Distributed Deadlock Detection.

Page 16: Example - Distributed Deadlock

T1 initiated at site S1 and creating agent at S2,

T2 initiated at site S2 and creating agent at S3,

T3 initiated at site S3 and creating agent at S1.

Time    S1                     S2                     S3
t1      read_lock(T1, x1)      write_lock(T2, y2)     read_lock(T3, z3)
t2      write_lock(T1, y1)     write_lock(T2, z2)
t3      write_lock(T3, x1)     write_lock(T1, y2)     write_lock(T2, z3)
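The resulting global wait-for graph can be built and checked mechanically. A small sketch, with edges read off the schedule above (T3 waits for T1 at S1, T1 for T2 at S2, T2 for T3 at S3); the graph representation and helper are illustrative.

# Ti -> Tj means Ti waits for a lock held by Tj
gwfg = {
    "T1": {"T2"},   # at S2: T1 wants y2, write-locked by T2
    "T2": {"T3"},   # at S3: T2 wants z3, read-locked by T3
    "T3": {"T1"},   # at S1: T3 wants x1, read-locked by T1
}

def has_cycle(graph):
    # Depth-first search for a back edge in the wait-for graph.
    visiting, done = set(), set()
    def dfs(node):
        visiting.add(node)
        for succ in graph.get(node, ()):
            if succ in visiting:                  # back edge: deadlock
                return True
            if succ not in done and dfs(succ):
                return True
        visiting.discard(node)
        done.add(node)
        return False
    return any(dfs(n) for n in graph if n not in done)

print(has_cycle(gwfg))   # True: T1 -> T2 -> T3 -> T1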

Page 17: Example - Distributed Deadlock

Page 18: Centralized Deadlock Detection

Single site appointed deadlock detection coordinator (DDC).

DDC has responsibility for constructing and maintaining GWFG.

If one or more cycles exist, DDC must break each cycle by selecting transactions to be rolled back and restarted.

Page 19: Hierarchical Deadlock Detection

Sites are organized into a hierarchy.

Each site sends its LWFG to the detection site above it in the hierarchy.

Reduces dependence on centralized detection site.

Page 20: Hierarchical Deadlock Detection

Page 21: Distributed Deadlock Detection

Most well-known method developed by Obermarck (1982).

An external node, T_ext, is added to the LWFG to indicate an agent at a remote site.

If a LWFG contains a cycle that does not involve T_ext, then site and DDBMS are in deadlock.

Page 22: Distributed Deadlock Detection

Global deadlock may exist if LWFG contains a cycle involving T_ext.

To determine if there is deadlock, the graphs have to be merged.

Potentially more robust than other methods.

Page 23: Distributed Deadlock Detection

Page 24: Distributed Deadlock Detection

S1: T_ext → T3 → T1 → T_ext

S2: T_ext → T1 → T2 → T_ext

S3: T_ext → T2 → T3 → T_ext

Transmit LWFG for S1 to the site for which transaction T1 is waiting, site S2.

LWFG at S2 is extended and becomes:

S2: T_ext → T3 → T1 → T2 → T_ext

Page 25: Distributed Deadlock Detection

Still contains potential deadlock, so transmit this WFG to S3:

S3: T_ext → T3 → T1 → T2 → T3 → T_ext

GWFG contains cycle T3 → T1 → T2 → T3 not involving T_ext, so deadlock exists.
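One round of the forwarding rule can be sketched as follows; the graph encoding and DFS helper are illustrative simplifications of Obermarck's published algorithm.

EXT = "T_ext"

def path_through_ext(lwfg):
    # Return a path T_ext -> ... -> T_ext if one exists (potential global deadlock).
    def dfs(node, path):
        if node == EXT and path:
            return path
        for succ in lwfg.get(node, ()):
            if succ not in path:
                found = dfs(succ, path + [succ])
                if found:
                    return found
        return None
    return dfs(EXT, [])

s1 = {EXT: {"T3"}, "T3": {"T1"}, "T1": {EXT}}
print(path_through_ext(s1))   # ['T3', 'T1', 'T_ext']: potential global deadlock,
                              # so S1 forwards its LWFG to S2, where T1's agent waits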

Page 26: Distributed Recovery Control

Four types of failure particular to distributed systems:
– Loss of a message.
– Failure of a communication link.
– Failure of a site.
– Network partitioning.

Assume the first two are handled transparently by the data communications (DC) component.

Page 27: Distributed Recovery Control

DDBMS is highly dependent on the ability of all sites to communicate reliably with one another.

Communication failures can result in network becoming split into two or more partitions.

May be difficult to distinguish whether communication link or site has failed.

Page 28: Partitioning of a network

Page 29: Two-Phase Commit (2PC)

Two phases: a voting phase and a decision phase.

Coordinator asks all participants whether they are prepared to commit transaction.
– If one participant votes abort, or fails to respond within a timeout period, coordinator instructs all participants to abort transaction.
– If all vote commit, coordinator instructs all participants to commit.

All participants must adopt global decision.

Page 30: Two-Phase Commit (2PC)

If participant votes abort, free to abort transaction immediately.

If participant votes commit, must wait for coordinator to broadcast global-commit or global-abort message.

Protocol assumes each site has its own local log and can rollback or commit transaction reliably.

If participant fails to vote, abort is assumed. If participant gets no vote instruction from coordinator, it can abort.
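A minimal sketch of the two phases, assuming synchronous vote(p) and decide(p, decision) callbacks; logging, acknowledgements, and the termination/recovery protocols are omitted.

def two_phase_commit(participants, vote, decide):
    # Phase 1 (voting): vote(p) returns 'commit', 'abort', or None on timeout.
    decision = "commit"
    for p in participants:
        if vote(p) != "commit":       # abort vote, or timeout counts as abort
            decision = "abort"
            break
    # Phase 2 (decision): every participant must adopt the global decision.
    for p in participants:
        decide(p, decision)           # GLOBAL-COMMIT or GLOBAL-ABORT
    return decision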

Page 31: 2PC Protocol for Participant Voting Commit

Page 32: 2PC Protocol for Participant Voting Abort

Page 33: 2PC Termination Protocols

Invoked whenever a coordinator or participant fails to receive an expected message and times out.

Coordinator:

Timeout in WAITING state
– Globally abort transaction.

Timeout in DECIDED state
– Send global decision again to sites that have not acknowledged.

Page 34: 2PC Termination Protocols (Participant)

Simplest termination protocol is to leave participant blocked until communication with the coordinator is re-established. Alternatively:

Timeout in INITIAL state
– Unilaterally abort transaction.

Timeout in the PREPARED state
– Without more information, participant blocked.
– Could get decision from another participant.

Page 35: State Transition Diagram for 2PC

(a) coordinator; (b) participant

Page 36: 2PC Recovery Protocols

Action to be taken by operational site in event of failure. Depends on what stage coordinator or participant had reached.

Coordinator Failure:

Failure in INITIAL state
– Recovery starts commit procedure.

Failure in WAITING state
– Recovery restarts commit procedure.

Page 37: 2PC Recovery Protocols (Coordinator Failure)

Failure in DECIDED state
– On restart, if coordinator has received all acknowledgements, it can complete successfully. Otherwise, has to initiate termination protocol discussed above.

Page 38: 2PC Recovery Protocols (Participant Failure)

Objective to ensure that participant on restart performs same action as all other participants and that this restart can be performed independently.

Failure in INITIAL state
– Unilaterally abort transaction.

Failure in PREPARED state
– Recovery via termination protocol above.

Failure in ABORTED/COMMITTED states
– On restart, no further action is necessary.

Page 39: 2PC Topologies

Page 40: Three-Phase Commit (3PC)

2PC is not a non-blocking protocol. For example, a process that times out after voting commit, but before receiving the global instruction, is blocked if it can communicate only with sites that do not know the global decision.

Probability of blocking occurring in practice is sufficiently rare that most existing systems use 2PC.

Page 41: Three-Phase Commit (3PC)

Alternative non-blocking protocol, called three-phase commit (3PC) protocol.

Non-blocking for site failures, except in event of failure of all sites.

Communication failures can result in different sites reaching different decisions, thereby violating atomicity of global transactions.

3PC removes uncertainty period for participants who have voted commit and await global decision.

Page 42: Three-Phase Commit (3PC)

Introduces third phase, called pre-commit, between voting and global decision.

On receiving all votes from participants, coordinator sends global pre-commit message.

A participant who receives a global pre-commit knows that all other participants have voted commit and that, in time, it will itself definitely commit.
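A minimal sketch of the coordinator side with the extra round, under the same illustrative assumptions as the 2PC sketch earlier (synchronous callbacks, no logging, no elections).

def three_phase_commit(participants, vote, send):
    # Phase 1 (voting)
    if any(vote(p) != "commit" for p in participants):
        for p in participants:
            send(p, "global-abort")
        return "abort"
    # Phase 2 (pre-commit): participants now know everyone voted commit
    for p in participants:
        send(p, "pre-commit")         # acknowledgements awaited in practice
    # Phase 3 (global decision)
    for p in participants:
        send(p, "global-commit")
    return "commit"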

Page 43: State Transition Diagram for 3PC

(a) coordinator; (b) participant

Page 44: 3PC Protocol for Participant Voting Commit

Page 45: 3PC Termination Protocols (Coordinator)

Timeout in WAITING state
– Same as 2PC. Globally abort transaction.

Timeout in PRE-COMMITTED state
– Write commit record to log and send GLOBAL-COMMIT message.

Timeout in DECIDED state
– Same as 2PC. Send global decision again to sites that have not acknowledged.

Page 46: 3PC Termination Protocols (Participant)

Timeout in INITIAL state
– Same as 2PC. Unilaterally abort transaction.

Timeout in the PREPARED state
– Follow election protocol to elect new coordinator.

Timeout in the PRE-COMMITTED state
– Follow election protocol to elect new coordinator.

Page 47: 3PC Recovery Protocols (Coordinator Failure)

Failure in INITIAL state
– Recovery starts commit procedure.

Failure in WAITING state
– Contact other sites to determine fate of transaction.

Failure in PRE-COMMITTED state
– Contact other sites to determine fate of transaction.

Failure in DECIDED state
– If all acknowledgements in, complete transaction; otherwise initiate termination protocol above.

Page 48: 3PC Recovery Protocols (Participant Failure)

Failure in INITIAL state
– Unilaterally abort transaction.

Failure in PREPARED state
– Contact other sites to determine fate of transaction.

Failure in PRE-COMMITTED state
– Contact other sites to determine fate of transaction.

Failure in ABORTED/COMMITTED states
– On restart, no further action is necessary.

Page 49: 3PC Termination Protocol After New Coordinator

Newly elected coordinator will send STATE-REQ message to all participants involved in election to determine how best to continue.

1. If some participant has aborted, then abort.
2. If some participant has committed, then commit.
3. If all participants are uncertain, then abort.
4. If some participant is in PRE-COMMIT, then commit. To prevent blocking, send PRE-COMMIT and, after acknowledgements, send GLOBAL-COMMIT.
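The four rules reduce to a simple decision function; the state strings and list encoding here are illustrative assumptions.

def terminate(states):
    # states: participant states gathered by the new coordinator via STATE-REQ
    if "aborted" in states:
        return "abort"            # rule 1
    if "committed" in states:
        return "commit"           # rule 2
    if all(s == "uncertain" for s in states):
        return "abort"            # rule 3
    if "pre-committed" in states:
        return "commit"           # rule 4: PRE-COMMIT first, then GLOBAL-COMMIT
    return "abort"                # conservative default

print(terminate(["uncertain", "pre-committed"]))   # commit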

Page 50: Network Partitioning

If data is not replicated, can allow transaction to proceed if it does not require any data from site outside partition in which it is initiated.

Otherwise, transaction must wait until sites it needs access to are available.

If data is replicated, procedure is much more complicated.

Page 51: Identifying Updates

Page 52: Identifying Updates

Successfully completed update operations by users in different partitions can be difficult to observe.

In P1, a transaction has withdrawn £10 from an account and, in P2, two transactions have each withdrawn £5 from the same account.

At start, both partitions have £100 in balx, and on completion both have £90 in balx.

On recovery, not sufficient to check value in balx and assume consistency if values same.

Page 53: Maintaining Integrity

Page 54: Maintaining Integrity

Successfully completed update operations by users in different partitions can violate constraints.

Have constraint that account cannot go below £0.

In P1, £60 has been withdrawn from the account and, in P2, £50 has been withdrawn.

At start, both partitions have £100 in balx; on completion one has £40 in balx and the other has £50.

Importantly, neither has violated constraint. On recovery, balx is –£10, and constraint violated.

Page 55: Network Partitioning

Processing in partitioned network involves trade-off between availability and correctness.

Correctness easiest to provide if no processing of replicated data allowed during partitioning.

Availability maximized if no restrictions placed on processing of replicated data.

In general, not possible to design non-blocking commit protocol for arbitrarily partitioned networks.

Page 56: X/OPEN DTP Model

Open Group is vendor-neutral consortium whose mission is to cause creation of viable, global information infrastructure.

Formed by the merger of X/Open and the Open Software Foundation.

X/Open established DTP Working Group with objective of specifying and fostering appropriate APIs for TP.

Group concentrated on elements of TP system that provided the ACID properties.

Page 57: X/OPEN DTP Model

X/Open DTP standard that emerged specified three interacting components:

– an application,
– a transaction manager (TM),
– a resource manager (RM).

Page 58: X/OPEN DTP Model

Any subsystem that implements transactional data can be an RM, such as a DBMS, a transactional file system, or a session manager.

TM responsible for defining scope of transaction, and for assigning unique ID to it.

Application calls TM to start transaction, calls RMs to manipulate data, and calls TM to terminate transaction.

TM communicates with RMs to coordinate transaction, and TMs to coordinate distributed transactions.

Page 59: X/OPEN DTP Model - Interfaces

Application may use TX interface to communicate with a TM.

TX provides calls that define transaction scope, and whether to commit/abort transaction.

TM communicates transactional information with RMs through XA interface.

Finally, application can communicate directly with RMs through a native API, such as SQL or ISAM.
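The calling pattern can be modeled roughly as follows. This is an illustrative Python stand-in for the C-based TX verbs (tx_begin, tx_commit, tx_rollback); the tm/rm objects and their methods are assumptions for the sketch, not the actual X/Open signatures.

def transfer(tm, rm, amount):
    tm.tx_begin()                          # TX: define transaction scope
    try:
        # native API (here SQL) straight to the resource manager
        rm.execute("UPDATE acct SET bal = bal - ? WHERE id = 1", (amount,))
        rm.execute("UPDATE acct SET bal = bal + ? WHERE id = 2", (amount,))
        tm.tx_commit()                     # TM coordinates RMs via XA (2PC)
    except Exception:
        tm.tx_rollback()                   # TX: abort the transaction
        raise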

Page 60: X/OPEN DTP Model Interfaces

Page 61: X/OPEN Interfaces in Distributed Environment

Page 62: Distributed Query Optimization

Page 63: Distributed Query Optimization

Query decomposition: takes query expressed on global relations and performs partial optimization using centralized QO techniques. Output is some form of relational algebra tree (RAT) based on global relations.

Data localization: takes into account how data has been distributed. Replace global relations at leaves of RAT with their reconstruction algorithms.

Page 64: Distributed Query Optimization

Global optimization: uses statistical information to find a near-optimal execution plan. Output is execution strategy based on fragments with communication primitives added.

Local optimization: Each local DBMS performs its own local optimization using centralized QO techniques.

Page 65: Data Localization

In QP, represent query as R.A.T. and, using transformation rules, restructure tree into equivalent form that improves processing.

In DQP, need to consider data distribution.

Replace global relations at leaves of tree with their reconstruction algorithms – RA operations that reconstruct global relations from fragments:
– For horizontal fragmentation, reconstruction algorithm is Union;
– For vertical fragmentation, it is Join.

Page 66: Data Localization

Then use reduction techniques to generate simpler and optimized query.

Consider reduction techniques for following types of fragmentation:
– Primary horizontal fragmentation.
– Vertical fragmentation.
– Derived fragmentation.

Page 67: Reduction for Primary Horizontal Fragmentation

If selection predicate contradicts definition of fragment, this produces empty intermediate relation and operations can be eliminated.

For join, commute join with union. Then examine each individual join to determine whether there are any useless joins that can be eliminated from result.

A useless join exists if fragment predicates do not overlap.

Page 68: Example 23.2 Reduction for PHF

SELECT *

FROM Branch b, PropertyForRent p

WHERE b.branchNo = p.branchNo AND p.type = ‘Flat’;

P1: σbranchNo=‘B003’ ∧ type=‘House’(PropertyForRent)

P2: σbranchNo=‘B003’ ∧ type=‘Flat’(PropertyForRent)

P3: σbranchNo≠‘B003’(PropertyForRent)

B1: σbranchNo=‘B003’(Branch)

B2: σbranchNo≠‘B003’(Branch)
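The pruning step can be illustrated mechanically: a fragment whose defining predicate contradicts the query predicate (type = 'Flat') yields an empty intermediate relation. Fragment predicates are modeled here as simple attribute/constant maps, an assumption made for this sketch.

fragments = {
    "P1": {"branchNo": "B003", "type": "House"},
    "P2": {"branchNo": "B003", "type": "Flat"},
    "P3": {"branchNo": ("!=", "B003")},            # no condition on type
}
query = {"type": "Flat"}

def compatible(frag_pred, query_pred):
    # False if the fragment fixes an attribute to a different constant.
    for attr, value in query_pred.items():
        fixed = frag_pred.get(attr)
        if isinstance(fixed, str) and fixed != value:
            return False                           # contradiction: prune fragment
    return True

keep = [name for name, pred in fragments.items() if compatible(pred, query)]
print(keep)   # ['P2', 'P3']: P1 (type = 'House') is eliminated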

Page 69: Example 23.2 Reduction for PHF

Page 70: Example 23.2 Reduction for PHF

Page 71: Example 23.2 Reduction for PHF

Page 72: Reduction for Vertical Fragmentation

Reduction for vertical fragmentation involves removing those vertical fragments that have no attributes in common with projection attributes, except the key of the relation.

Page 73: Example 23.3 Reduction for Vertical Fragmentation

SELECT fName, lName

FROM Staff;

S1: ΠstaffNo, position, sex, DOB, salary(Staff)

S2: ΠstaffNo, fName, lName, branchNo(Staff)
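The reduction itself is a small set test: a vertical fragment is useful only if it shares a non-key attribute with the projection list. A sketch under that reading of the rule above:

fragments = {
    "S1": {"staffNo", "position", "sex", "DOB", "salary"},
    "S2": {"staffNo", "fName", "lName", "branchNo"},
}
key = {"staffNo"}
projection = {"fName", "lName"}

useful = [name for name, attrs in fragments.items()
          if (attrs & projection) - key]           # non-key overlap required
print(useful)   # ['S2']: S1 is removed from the plan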

Page 74: Example 23.3 Reduction for Vertical Fragmentation

Page 75: Reduction for Derived Fragmentation

Use transformation rule that allows join and union to be commuted.

Uses the knowledge that the fragmentation of one relation is based on the other, so that, after commuting, some of the partial joins are redundant and can be eliminated.

Page 76: Example 23.4 Reduction for Derived Fragmentation

SELECT *

FROM Branch b, Client c

WHERE b.branchNo = c.branchNo AND

b.branchNo = ‘B003’;

B1 = σbranchNo=‘B003’(Branch)

B2 = σbranchNo≠‘B003’(Branch)

Ci = Client ⋉branchNo Bi    i = 1, 2

Page 77: Example 23.4 Reduction for Derived Fragmentation

Page 78: Global Optimization

Objective of this layer is to take the reduced query plan from the data localization layer and find a near-optimal execution strategy.

In distributed environment, speed of network has to be considered when comparing strategies.

If topology is known to be a WAN, could ignore all costs other than network costs.

LAN typically much faster than WAN, but still slower than disk access.

Page 79: Global Optimization

Cost model could be based on total cost (time), as in centralized DBMS, or response time. Latter uses parallelism inherent in DDBMS.

Page 80: Global Optimization – R*

R* uses a cost model based on total cost and static query optimization.

Like centralized System R optimizer, algorithm is based on an exhaustive search of all join orderings, join methods (nested loop or sort-merge join), and various access paths for each relation.

When Join is required involving relations at different sites, R* selects the sites to perform Join and method of transferring data between sites.

Page 81: Global Optimization – R*

For a Join of R and S with R at site 1 and S at site 2, there are three candidate sites:
– site 1, where R is located;
– site 2, where S is located;
– some other site (e.g., site of relation T, which is to be joined with join of R and S).

Page 82: Global Optimization – R*

In R*, there are two methods for transferring data:
1. Ship whole relation.
2. Fetch tuples as needed.

First method incurs a larger data transfer but fewer messages than the second.

R* considers only the following methods:
1. Nested loop, ship whole outer relation to site of inner.
2. Sort-merge, ship whole inner relation to site of outer.
3. Nested loop, fetch tuples of inner relation as needed for each tuple of outer relation.
4. Sort-merge, fetch tuples of inner relation as needed for each tuple of outer relation.
5. Ship both relations to third site.

Page 83: Global Optimization – SDD-1

Based on an earlier method known as “hill climbing”, a greedy algorithm that starts with an initial feasible solution which is then iteratively improved.

Modified to make use of Semijoin to reduce cardinality of join operands.

Like R*, SDD-1 optimizer minimizes total cost, although unlike R* it ignores local processing costs and concentrates on communication message size.

Like R*, optimization is static, performed prior to query execution.

Page 84: Global Optimization – SDD-1

Based on concept of “beneficial Semijoins”.

Communication cost of Semijoin is simply cost of transferring join attribute of first operand to site of second operand.

“Benefit” of Semijoin is taken as cost of transferring irrelevant tuples of first operand, which Semijoin avoids.
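The beneficial-Semijoin test is then a direct comparison; the sizes below are invented for illustration.

def is_beneficial(join_attr_size, irrelevant_tuple_size):
    cost = join_attr_size            # ship join-attribute values to the other site
    benefit = irrelevant_tuple_size  # tuples the Semijoin stops us shipping
    return benefit > cost

# e.g. shipping 2 KB of join values to avoid moving 50 KB of
# tuples that could never match: the Semijoin is beneficial.
print(is_beneficial(2_000, 50_000))   # True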

Page 85: Global Optimization – SDD-1

Phase 1 – Initialization: Perform all local reductions using Selection and Projection. Execute Semijoins within same site to reduce sizes of relations. Generate set of all beneficial Semijoins across sites (Semijoin is beneficial if its cost is less than its benefit).

Phase 2 – Selection of beneficial Semijoins: Iteratively select most beneficial Semijoin from set generated and add it to execution strategy. After each iteration, update database statistics to reflect incorporation of the Semijoin and update the set with new beneficial Semijoins.

Page 86: Global Optimization – SDD-1

Phase 3 – Assembly site selection: Select, among all sites, site to which transmission of all relations incurs a minimum cost. Choose site containing largest amount of data after reduction phase so that sum of the amount of data transferred from other sites will be minimum.

Phase 4 – Postoptimization: Discard useless Semijoins; e.g. if R resides in assembly site and R is due to be reduced by Semijoin, but is not used to reduce other relations after Semijoin, then since R need not be moved to another site during assembly phase, Semijoin on R is useless and can be discarded.

Page 87: Oracle's DDBMS Functionality

Oracle does not support the type of fragmentation discussed previously, although DBA can distribute data to achieve a similar effect.

Thus, fragmentation transparency is not supported although location transparency is.

Discuss:
– connectivity;
– global database names and database links;
– transactions;
– referential integrity;
– heterogeneous distributed databases;
– distributed QO.

Page 88: Connectivity – Oracle Net Services

Oracle Net Services supports communication between clients and servers.

Enables both client-server and server-server communication across any network, supporting both distributed processing and distributed DBMS capability.

Also responsible for translating any differences in character sets or data representation that may exist at operating system level.

Page 89: Global Database Names

Unique name given to each distributed database.

Formed by prefixing the database's network domain name with the local database name.

Domain name follows standard Internet conventions, with levels separated by dots ordered from leaf to root, left to right.

Page 90: Database Links

Used to build distributed databases.

Defines a communication path from one Oracle database to another (possibly non-Oracle) database.

Acts as a type of remote login to remote database.

CREATE PUBLIC DATABASE LINK RENTALS.GLASGOW.NORTH.COM;

SELECT * FROM Staff@RENTALS.GLASGOW.NORTH.COM;
UPDATE Staff@RENTALS.GLASGOW.NORTH.COM SET salary = salary*1.05;

Page 91: Types of Transactions

Remote SQL statements: Remote query selects data from one or more remote tables, all of which reside at same remote node. Remote update modifies data in one or more tables, all of which are located at same remote node.

Distributed SQL statements: Distributed query retrieves data from two or more nodes. Distributed update modifies data on two or more nodes.

Remote transactions: Contains one or more remote statements, all of which reference a single remote node.

Page 92: Types of Transactions

Distributed transactions: Includes one or more statements that, individually or as a group, update data on two or more distinct nodes of a distributed database. Oracle ensures integrity of distributed transactions using 2PC.

Page 93: Referential Integrity

Oracle does not permit declarative referential integrity constraints to be defined across databases.

However, parent-child table relationships across databases can be maintained using triggers.

Page 94: Heterogeneous Distributed Databases

Here one of the local DBMSs is not Oracle.

Oracle Heterogeneous Services and a non-Oracle system-specific agent can hide distribution and heterogeneity.

Can be accessed through:
– transparent gateways;
– generic connectivity.

Page 95: Transparent Gateways

Page 96: Generic Connectivity

Page 97: Oracle Distributed Query Optimization

A distributed query is decomposed by the local Oracle DBMS into a number of remote queries, which are sent to remote DBMS for execution.

Remote DBMSs execute queries and send results back to local node.

Local node then performs any necessary postprocessing and returns results to user.

Only necessary data from remote tables are extracted, thereby reducing amount of data that needs to be transferred.
