phases of distributed query processing

PHASES OF DISTRIBUTED QUERY

PROCESSING -IPRESENTATION MEMBERS

-ANU ISSAC

-CRYSTAL CUTHINHO

-LEON D’SOUZA

-NEVIL D’SOUZA

-ANDREA FURTADO

• SQL , Parsing , The SQL query determines what data is to be found, but does not define the method by which the data manager searches the database.

• Indexing, hashing, Views , ...

OBJECTIVES OF QUERY PROCESSING• First and main objective of query Processing, "To convert high level language(SQL) to low level

language(Relational Algebra)".

• In distributed systems, the query written on single machine but are actually executed on different local database. as the local database provides the data to execute the query.

• A query can be processed using various techniques, such that the technique should be efficient to execute the query in distributed environment.

• To minimize the overall cost of execution.(I/O cost + CPU cost +Communication cost).

• to minimize the time required to execute the query.(I/O time + CPU time +Communication time).

• This phase brings query into normalized form in order of easier processing of query.

• its specifically focus on the where clause of query.

2. ANALYSIS

• Lexical & Syntactical analysis

• Verification of relations & attributes

• Conflict between operations

• Checks if result is possible

2. ANALYSISSEMANTIC ANALYSIS :

1) Connection Graph:

aka Query Graph

2. ANALYSISSEMANTIC ANALYSIS :

2) Join Graph:

-Subgraph of connection graph

-Only join operations considered

3. SIMPLIFICATION

• Detects redundant predicates

• Transforms queries – makes them simple & efficient

• Not at the cost of semantic correctness

• Checks for factors responsible for redundancy

QUERY RESTRUCTURING

1. Rewrites Query into equivalent Relational Algebra

2. Makes use of a Query Tree or Operator Tree

• Leaf node for every relation in the query

• Non-leaf node for intermediate relation we can generate

• Root node for the result of a query

3.Sequence of operation is from Leaf towards the Root

4.Transformation Rules are applied .

EXAMPLE• Q) Find the names of employees other than Raj who worked on CAD/CAM project for either one or two years.

• Query:

SELECT Ename

FROM PROJ, ASG, EMP

WHERE ASG.ENO=EMP.ENO

AND ASG.PNO=PROJ.PNO

AND ENAME ≠ “Raj”

AND PROJ.PNAME=“CAD/CAM”

AND (DUR=12 OR DUR=24);

ΠENAME (P)

σDUR=12 OR DUR=24(S)

σPNAME=“CAD/CAM”(S)

σENAME≠“RAJ”(S)

Pno(J) ENO

PROJASGEMP

RESTRUCTURING

FRAGMENTATION:

• Forming relational algebraic queries to be used on fragmented relations.• Generation of fragmented query is done by replacing the global relations with

fragmented relation in the query tree of distributed query.• The generic tree still has some scope for reconstruction and simplification.• Generic tree is used to generate a simpler and optimized query by using

reduction technique.• Type of fragmentation determines the reduction technique to be used.

REDUCTION FOR HORIZONTAL FRAGMENTATION:• Reduction of generic tree is done using either selection operation

or join operation.• Selection operation: Produces an intermediate relation that

remains empty if there is contradiction between selection predicate and definition of the fragment.

• Join operation: Detects useless join operation by commuting joins with union operation.

EXAMPLE• Consider the schemas: EMP(ENO, ENAME, TITLE) , ASG(ENO, PNO, RESP, DUR)• Consider the following query and fragmentation: Query: SELECT *

FROM EMP, ASG WHERE EMP.ENO=ASG.ENO • Horizontal fragmentation: ∗ EMP1 = σENO ≤ “ E3”(EMP) ASG1 = σENO ≤ ” E3”(ASG)

EMP2 = σ “ E3””<ENO<” E6”(EMP) ASG2 = σENO> ” E3”(ASG) –

EMP3 = σENO> ” E6”(EMP)

USING SELECTION OPERATION

• Consider the query: SELECT * FROM EMP WHERE ENO=”E5”• Here the leaf node that corresponds to Employee relation in the generic tree

can be replaced by reconstruction rule.• The selection predicate has a contradiction with the definition of EMP1 and

EMP3 fragment and produces empty result.

Department(deptno,dname,location)

• DEPT1= σ deptno<=10(Department)

• DEPT2= σ deptno>10(Department)

• Assume that the fragmention of employee relation is derived from department

• EMPi=Employee deptno DEPTi i=1,2

Select * from Employee,Department where depno>10 and Employee.deptno=Department.deptno.

Employee.deptno=Department.deptno

deptno>10

U DEPT2

EMP1 EMP2

Employee.deptno=Department.deptno

DEPT2 DEPT1

phases of distributed query processing

Data & Analytics

distributed query processing for mobile...

detecting distributed scans using high-performance query

distributed query deep dive conor cunningham

compliant geo-distributed query processing

distributed spatiotemporal trajectory query processing in...

1 6. distributed query optimization chapter 9 optimization...

query health: distributed population queries

6 . distributed query optimization

lessons&in&building&a distributed&query&planner&

efficient query processing in distributed search engines

distributed database...

perfopticon: visual query analysis for distributed databases

query processing in distributed database

adaptive query processing in distributed...

distributed dbms - unit 6 - query processing

1 tdd: topics in distributed databases distributed query...

distributed query-sub-query presented by noam pettel 29/5/05

query processing chapter 21 in textbook. overview what is...

distributed query processing

méthodes d'optimisation pour le traitement de requêtes...