
16-17 October 2003 Grids and Applied Language Theory: Declarative Grid Service Orchestration with OGSA-DQP (A A A Fernandes)


Declarative Grid Service Orchestration with OGSA-DQP

Alvaro A A Fernandes

Department of Computer Science

University of Manchester

Service-Based Distributed Query Processing on the Grid


places, people, funding, projects

Manchester

• M Nedim Alpdemir
• Anastasios Gounaris
• Norman W Paton
• Alvaro A A Fernandes
• Rizos Sakellariou

Newcastle upon Tyne

• Arijit Mukherjee
• Jim Smith
• Paul Watson


motivation

• Pull by applications:
  – overwhelming amounts of semantically complex data in
  – very diverse, structurally dissimilar, autonomous and geographically dispersed data sources
  – requiring computationally demanding analysis.

• Push from context and infrastructure:
  – Web service impetus combined with
  – Grid abstractions and protocols that enable
  – not just dynamic resource discovery but also
  – dynamic resource allocation and use.


context

1. High-level data access and integration services are needed if applications that have data with complex structure and complex semantics are to benefit from the Grid.

2. Standards for data access are emerging, and middleware products that are reference implementations of such standards are already available.

3. Distributed query processing technology is one approach to delivering (1.) given the availability of (2.).

4. Declarative service orchestration falls out.


OGSA-DQP: approach

• OGSA-DQP uses a middleware approach.

• It can be seen as a mediator over OGSA-DAI wrappers.

• It promises bottom-lines regarding:
  – efficiency: “leave to it to schedule in parallel”;
  – effectiveness: “leave to it to orchestrate your services”;
  – usability: “use it as a Grid data service”.

[Figure: a query is submitted to OGSA-DQP, which sits above two OGSA-DAI wrappers, each in front of a DBMS and its data, and the results are returned.]


OGSA-DQP: example

• Given two DBMSs and one analysis tool (e.g., a WS):
  – proteinTerm, from the GO Gene Ontology running as a remote MySQL DB,
  – protein, from a GIMS Genome Warehouse running as a remote ODMG-compliant DB,
  – Blast (sequence alignment scoring);

• We can obtain alignment scores for a sequence against proteins of a certain kind:

    select p.proteinId, Blast(p.sequence)
    from protein p, proteinTerm t
    where t.termId = 'GO:0005942' and
          p.proteinId = t.proteinId

• Then, OGSA-DQP acts as an enactor of a declarative orchestration of services on the Grid:

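[Figure: query plan. An index_scan of proteinTerm (termId = GO:0005942) and a table_scan of protein each feed a reduce; the two reduced streams meet in hash_join(proteinId); the join output goes to op_call(Blast) and a final reduce. Three exchange operators connect the plan partitions, which are labelled 1, 2, 3,4 and 5.]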
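The plan above can be pictured as a pipeline of iterators. The following is a minimal single-process sketch, not OGSA-DQP's implementation: the operator names follow the plan, but the in-memory tables, the prebuilt index and `fake_blast` (a stand-in that just returns the sequence length) are assumptions made for illustration, and the exchange operators are left out.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


def table_scan(table: List[Row]) -> Iterator[Row]:
    """Yield every row of a collection."""
    yield from table


def index_scan(index: Dict[str, List[Row]], key: str) -> Iterator[Row]:
    """Yield only the rows stored under `key` in a prebuilt index."""
    yield from index.get(key, [])


def reduce(rows: Iterable[Row], columns: List[str]) -> Iterator[Row]:
    """Project each row onto the named columns (the plan's `reduce`)."""
    for row in rows:
        yield {c: row[c] for c in columns}


def hash_join(build: Iterable[Row], probe: Iterable[Row], key: str) -> Iterator[Row]:
    """Equi-join on `key`: hash the build input, then stream the probe input."""
    buckets: Dict[object, List[Row]] = {}
    for row in build:
        buckets.setdefault(row[key], []).append(row)
    for row in probe:
        for match in buckets.get(row[key], []):
            yield {**match, **row}


def op_call(rows: Iterable[Row], fn: Callable, arg: str, out: str) -> Iterator[Row]:
    """Call an external operation once per row and attach its result."""
    for row in rows:
        yield {**row, out: fn(row[arg])}


def fake_blast(sequence: str) -> int:
    """Hypothetical stand-in for the Blast service; NOT a real alignment score."""
    return len(sequence)


# Toy data, invented for this sketch.
protein = [
    {"proteinId": "P1", "sequence": "MKV"},
    {"proteinId": "P2", "sequence": "MKVLA"},
    {"proteinId": "P3", "sequence": "MK"},
]
protein_term_index = {
    "GO:0005942": [
        {"proteinId": "P1", "termId": "GO:0005942"},
        {"proteinId": "P3", "termId": "GO:0005942"},
    ],
    "GO:0000001": [{"proteinId": "P2", "termId": "GO:0000001"}],
}


def run_plan() -> List[Row]:
    terms = reduce(index_scan(protein_term_index, "GO:0005942"), ["proteinId"])
    proteins = reduce(table_scan(protein), ["proteinId", "sequence"])
    joined = hash_join(terms, proteins, "proteinId")
    scored = op_call(joined, fake_blast, "sequence", "blast")
    return list(reduce(scored, ["proteinId", "blast"]))


result = run_plan()
```

Because every operator is a generator, rows are pulled through the plan one at a time; only the build side of the join is materialised.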


OGSA-DQP: extends/depends on

extends

• Leonidas Fegaras’s λ-DB system and OPTGEN optimiser generator. [1997-2000]

• Polar: a parallel query processing engine. [1998-2001]

• Polar*: an MPICH-G distributed extension of Polar. [2002]

depends on

• OGSA/OGSI/GT3 Grid Services (GSs).

• OGSA-DAI Grid Data Services (GDSs).

• Leonidas Fegaras and David Maier’s work on a formal semantics for OQL. [TODS 25(4),2000]


OGSA-DQP: manages/provides

provides

• Grid Distributed Query Services (GDQSs) that:
  – interact with clients;
  – find and retrieve service descriptions;
  – parse, compile, partition and schedule the query execution over a union of distributed data sources.

• The query plan is an orchestration of GQESs.

manages

• Grid Query Evaluation Services (GQESs) that:
  – implement the physical query algebra;
  – implement the query execution model and semantics;
  – run a partition of a query execution plan generated by a GDQS;
  – interact with other GQESs/GDSs/WSs but not with clients.
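One way to picture evaluators passing data to one another is an exchange operator that decouples a producing sub-plan from a consuming one. The sketch below is an assumption-laden illustration, not the OGSA-DQP mechanism: it uses a bounded in-process queue and a thread where the real system would use Grid services on different hosts. The names `exchange`, `buffer_size` and `_END` are hypothetical, and there is no error handling.

```python
import queue
import threading
from typing import Iterable, Iterator

_END = object()  # sentinel marking the end of the producer's stream


def exchange(rows: Iterable[dict], buffer_size: int = 2) -> Iterator[dict]:
    """Run the producer in its own thread and stream its rows to the consumer."""
    channel: "queue.Queue[object]" = queue.Queue(maxsize=buffer_size)

    def produce() -> None:
        for row in rows:
            channel.put(row)  # blocks when the buffer is full (back-pressure)
        channel.put(_END)

    threading.Thread(target=produce, daemon=True).start()
    while True:
        item = channel.get()
        if item is _END:
            return
        yield item


# The upstream generator stands in for a sub-plan run by another evaluator.
upstream = ({"proteinId": f"P{i}"} for i in range(1, 6))
received = [row["proteinId"] for row in exchange(upstream)]
```

The bounded buffer means a fast producer waits for a slow consumer instead of filling memory, which is the property that lets partitions on either side run at their own pace.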


OGSA-DQP: a brief tour (1)

• It builds upon GDSs which build upon GSs.

• A GDS is a leaf in a query execution plan up from which data ultimately flows.

• Data resources are, thereby, virtualised.

• Since they are GSs, they can be dynamically created by dynamically discovered factories and then disposed of.

• A GDQS is a GDS capable of integration and distributed retrieval and analysis of data.

• To perform a request, a GDQS spawns as many GQESs on as many hosts as its partitioning and scheduling policies recommend for that request.


OGSA-DQP: a brief tour (2)

• To obtain an execution plan, a GDQS:
  – Interacts with registries to fetch information about the data and computational services deemed of interest by the requestor;
  – Interacts with GDSs and (in future) Index Services to acquire relevant metadata;
  – Compiles, optimises, partitions and schedules the query execution.
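The partitioning and scheduling steps can be sketched as cutting a plan tree at its exchange operators and assigning each resulting partition to a host. This is a simplified illustration under stated assumptions: the `Op` class, the placement of the exchange operators in `plan`, the host names and the round-robin policy are all hypothetical, and none of this is OGSA-DQP's actual optimiser or scheduler.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Op:
    """A node of a query plan tree (hypothetical representation)."""
    name: str
    children: List["Op"] = field(default_factory=list)


def partition(root: Op) -> List[List[str]]:
    """Cut the plan at every `exchange`; partition 0 contains the root."""
    partitions: List[List[str]] = []

    def visit(op: Op, current: List[str]) -> None:
        if op.name == "exchange":
            # Everything below an exchange starts a new partition.
            for child in op.children:
                fresh: List[str] = []
                partitions.append(fresh)
                visit(child, fresh)
        else:
            current.append(op.name)
            for child in op.children:
                visit(child, current)

    top: List[str] = []
    partitions.append(top)
    visit(root, top)
    return partitions


def schedule(partitions: List[List[str]], hosts: List[str]) -> Dict[int, str]:
    """Placeholder policy: assign partitions to hosts round-robin."""
    return {i: hosts[i % len(hosts)] for i in range(len(partitions))}


# Assumed shape of the example plan; exchange placement is illustrative only.
plan = Op("reduce", [Op("op_call(Blast)", [Op("exchange", [
    Op("hash_join", [
        Op("exchange", [Op("reduce", [Op("index_scan(proteinTerm)")])]),
        Op("exchange", [Op("reduce", [Op("table_scan(protein)")])]),
    ]),
])])])

parts = partition(plan)
placement = schedule(parts, ["N1", "N2", "N3"])
```

A real scheduler would weigh data location, host load and operator cost rather than rotate through hosts, but the output has the same shape: a set of sub-plans, each bound to a host.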


OGSA-DQP: a brief tour (3)

• Given a distributed query plan, a GDQS:
  – Interacts with GDS factories to create the leaf services in the plan;
  – Interacts with WSs that front-end analysis capabilities;
  – Commands the creation of GQESs as stipulated by the partitioning and scheduling decided on by the compiler;
  – Coordinates the GQESs into executing the plan.
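The create-then-coordinate handshake can be sketched with plain objects. This is a toy model only: the classes, method names and string sub-plans below are hypothetical stand-ins for Grid services and their operations, and the evaluators run one after another here rather than concurrently on separate hosts.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class GQES:
    """Hypothetical evaluator: runs one sub-plan and reports what it ran."""
    host: str

    def perform(self, subplan: str) -> str:
        return f"{self.host} ran {subplan}"


@dataclass
class GQESFactory:
    """Hypothetical per-host factory that creates evaluator instances."""
    host: str
    created: int = 0

    def create_service(self) -> GQES:
        self.created += 1
        return GQES(self.host)


@dataclass
class GDQS:
    """Hypothetical coordinator holding one factory per known host."""
    factories: Dict[str, GQESFactory]

    def perform(self, scheduled: List[Tuple[str, str]]) -> List[str]:
        # First create one evaluator per scheduled sub-plan...
        evaluators = [
            (self.factories[host].create_service(), subplan)
            for host, subplan in scheduled
        ]
        # ...then hand each evaluator its sub-plan and collect the reports.
        return [gqes.perform(subplan) for gqes, subplan in evaluators]


factories = {h: GQESFactory(h) for h in ("N1", "N2", "N3")}
gdqs = GDQS(factories)
report = gdqs.perform([
    ("N1", "scan(proteinTerm)"),
    ("N2", "scan(protein)"),
    ("N3", "hash_join"),
    ("N1", "op_call(Blast)"),
])
```

Creating all evaluators before any of them starts work means that every sub-plan has a peer to send to or receive from once execution begins.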


what is going on behind the scenes (1)

[Figure: interaction diagram between a Client, a Registry, a GDQS Factory, a GDQS, GDS instances, and GQES instances 1 to n. The services carry the labels GS, GDS, GDT and GDQ. The labelled calls are registerService, findServiceData, createService, importSchema, findServiceData(DBSchema), perform(query), perform(gqes_query) and perform(querySubPlan).]


what is going on behind the scenes (2)

[Figure: deployment diagram over nodes N0 to N4. A Client calls perform(Query) on the GDQS (step 1); the GDQS calls createService on GQES factories (step 2) and sends each created GQES a perform(QuerySubplan) request (step 3); results are returned (step 4). The sub-plans shown are: sequential_scan followed by reduce(proteinID, sequence); sequential_scan(term = 8372) followed by reduce(proteinID); hash_join(p.proteinID = t.proteinID); and, twice, operation_call blast(p.sequence) followed by reduce(p.proteinID, blast), which uses the BLAST Web Services.]


the Khalaf-Leymann taxonomy for web services aggregation

[Figure: taxonomy diagram. Aggregation is divided into unconstrained and constrained forms; the kinds shown are agreements, grouping, recursive, wiring, choreography and service domains.]


OGSA-DQP: various kinds of service aggregation

• There is interface inheritance from GSs and GDSs.

• The execution plan can be seen as encapsulating a wiring of GQESs,

• but constrained, and constructed on-the-fly, as in an orchestration.

• As in service domains, there is competition among GQESs for a role to play in the orchestration.

• As in agreements, the orchestration is opportunistic, responsive to the prevailing resource levels, and short-lived.


summary

• OGSA-DQP is a service-based distributed query processor for the Grid that is:
  – Exposed as a service;
  – Implemented as an orchestration of services.

• OGSA-DQP is an enactor of declarative Grid service orchestrations that:
  – Improves on Grid portals when only retrieval and analysis are involved;
  – Fills the gap left by the lack of a service orchestration framework in the OGSA.


where to find out more: papers

1. M N Alpdemir, A Mukherjee, A Gounaris, A A A Fernandes, N W Paton, P Watson, J Smith. An Experience Report on Designing and Building OGSA-DQP: A Service Based Distributed Query Processor for the Grid. GGF9 Workshop on Designing and Building Grid Services, 2003.

2. M N Alpdemir, A Mukherjee, A Gounaris, N W Paton, P Watson, A A A Fernandes, J Smith. Service-Based Distributed Querying on the Grid. 1st Int. Conf. on Service Oriented Computing, 2003. LNCS, to appear.

3. M N Alpdemir, A Mukherjee, A Gounaris, N W Paton, P Watson, A A A Fernandes, J Smith. OGSA-DQP: A Service-Based Distributed Query Processor for the Grid. 2nd UK e-Science All Hands Meeting, 2003.

4. J Smith, A Gounaris, P Watson, N W Paton, A A A Fernandes, R Sakellariou. Distributed Query Processing on the Grid. GRID 2002, LNCS 2536.

(papers available from http://www.cs.man.ac.uk/~alvaro/publications.html )


where to find out more: software

• OGSA-DQP: Grid middleware to query distributed data sources
  www.ogsadai.org.uk/dqp

• OGSA-DAI: Grid middleware to interface with data(bases)
  www.ogsadai.org.uk/

• Globus Toolkit: Open-source implementation of OGSA/OGSI
  www.globustoolkit.org/