16-17 October 2003 Grids and Applied Language Theory: Declarative Grid Service Orchestration with OGSA-DQP (A A A Fernandes)
Declarative Grid Service Orchestration with OGSA-DQP
Alvaro A A Fernandes
Department of Computer Science
University of Manchester
Service-Based Distributed Query Processing on the Grid
places, people, funding, projects
Manchester
M Nedim Alpdemir
Anastasios Gounaris
Norman W Paton
Alvaro A A Fernandes
Rizos Sakellariou
Newcastle upon Tyne
Arijit Mukherjee
Jim Smith
Paul Watson
motivation
• Pull by applications:
– overwhelming amounts of semantically complex data in
– very diverse, structurally dissimilar, autonomous, and geographically dispersed data sources
– requiring computationally demanding analysis.
• Push from context and infrastructure:
– Web service impetus combined with
– Grid abstractions and protocols that enable
– not just dynamic resource discovery but also
– dynamic resource allocation and use.
context
1. High-level data access and integration services are needed if applications that have data with complex structure and complex semantics are to benefit from the Grid.
2. Standards for data access are emerging, and middleware products that are reference implementations of such standards are already available.
3. Distributed query processing technology is one approach to delivering (1.) given the availability of (2.).
4. Declarative service orchestration falls out.
OGSA-DQP: approach
• OGSA-DQP uses a middleware approach.
• It can be seen as a mediator over OGSA-DAI wrappers.
• It promises bottom-lines regarding:
– efficiency: “leave to it to schedule in parallel”;
– effectiveness: “leave to it to orchestrate your services”;
– usability: “use it as a Grid data service”.
[Figure: OGSA-DQP receives a query and returns results, sitting above two OGSA-DAI wrappers, each in front of a DBMS and its data.]
OGSA-DQP: example
• Given two DBMSs and one analysis tool (e.g., a WS):
– proteinTerm, held in a GO (Gene Ontology) database running as a remote MySQL DB,
– protein, held in a GIMS Genome Warehouse running as a remote ODMG-compliant DB,
– Blast (sequence alignment scoring);
• We can obtain alignment scores for a sequence against proteins of a certain kind:
    select p.proteinId, Blast(p.sequence)
    from protein p, proteinTerm t
    where t.termId = 'GO:0005942' and
          p.proteinId = t.proteinId
• Then, OGSA-DQP acts as an enactor of a declarative orchestration of services on the Grid:
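[Figure: query plan for the example. Operators shown: index_scan with termId=GO:0005942 on proteinTerm, table_scan(protein), reduce (three instances), hash_join(proteinId), op_call(Blast), and exchange (three instances). The plan carries the numeric labels 1, 2, 3,4 and 5.]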
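To make the operator names in the plan concrete, here is a minimal single-process sketch of the example query in Python. It is an illustration under stated assumptions, not OGSA-DQP code: the rows are invented, `fake_blast` is a hypothetical stand-in that returns the sequence length rather than an alignment score, and the exchange operators are omitted because nothing is distributed here.

```python
from collections import defaultdict

# Invented rows standing in for the two remote databases (assumption).
PROTEIN = [
    {"proteinId": "P1", "sequence": "MKTAYIAK", "name": "alpha"},
    {"proteinId": "P2", "sequence": "MKVLA", "name": "beta"},
    {"proteinId": "P3", "sequence": "GGSGG", "name": "gamma"},
]
PROTEIN_TERM = [
    {"proteinId": "P1", "termId": "GO:0005942"},
    {"proteinId": "P2", "termId": "GO:0000001"},
    {"proteinId": "P3", "termId": "GO:0005942"},
]


def table_scan(table):
    """Yield every row of a table."""
    yield from table


def index_scan(table, column, value):
    """Yield rows whose column equals value (a real index would avoid the full pass)."""
    return (row for row in table if row[column] == value)


def reduce_(rows, columns):
    """Project each row onto the named columns."""
    for row in rows:
        yield {c: row[c] for c in columns}


def hash_join(build_rows, probe_rows, key):
    """Equi-join: build a hash table on one input, then probe it with the other."""
    buckets = defaultdict(list)
    for row in build_rows:
        buckets[row[key]].append(row)
    for row in probe_rows:
        for match in buckets.get(row[key], []):
            yield {**match, **row}


def op_call(rows, func, arg_column, out_column):
    """Call an external operation once per row and attach its result."""
    for row in rows:
        yield {**row, out_column: func(row[arg_column])}


def fake_blast(sequence):
    """Hypothetical placeholder: NOT an alignment score, just the sequence length."""
    return len(sequence)


def run_example_query():
    terms = reduce_(index_scan(PROTEIN_TERM, "termId", "GO:0005942"), ["proteinId"])
    proteins = reduce_(table_scan(PROTEIN), ["proteinId", "sequence"])
    joined = hash_join(terms, proteins, "proteinId")
    scored = op_call(joined, fake_blast, "sequence", "blast")
    return list(reduce_(scored, ["proteinId", "blast"]))
```

Calling `run_example_query()` returns one row per protein annotated with the chosen term. In a distributed setting each exchange would mark a point where rows cross from one evaluation service to another.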
OGSA-DQP: extends/depends on
extends
• Leonidas Fegaras’s λ-DB system and OPTGEN optimiser generator. [1997-2000]
• Polar: a parallel query processing engine. [1998-2001]
• Polar*: an MPICH-G distributed extension of Polar. [2002]
depends on
• OGSA/OGSI/GT3 Grid Services (GSs).
• OGSA-DAI Grid Data Services (GDSs).
• Leonidas Fegaras and David Maier’s work on a formal semantics for OQL. [TODS 25(4),2000]
OGSA-DQP: manages/provides
provides
• Grid Distributed Query Services (GDQSs) that:
– interact with clients;
– find and retrieve service descriptions;
– parse, compile, partition and schedule the query execution over a union of distributed data sources.
• The query plan is an orchestration of GQESs.
manages
• Grid Query Evaluation Services (GQESs) that:
– implement the physical query algebra;
– implement the query execution model and semantics;
– run a partition of a query execution plan generated by a GDQS;
– interact with other GQESs/GDSs/WSs but not with clients.
OGSA-DQP: a brief tour (1)
• It builds upon GDSs which build upon GSs.
• A GDS is a leaf in a query execution plan up from which data ultimately flows.
• Data resources are, thereby, virtualised.
• Since they are GSs, they can be dynamically created by dynamically discovered factories and then disposed of.
• A GDQS is a GDS capable of integration and distributed retrieval and analysis of data.
• To perform a request, a GDQS spawns as many GQESs on as many hosts as its partitioning and scheduling policies recommend for that request.
OGSA-DQP: a brief tour (2)
• To obtain an execution plan, a GDQS:
– Interacts with registries to fetch information about the data and computational services deemed of interest by the requestor;
– Interacts with GDSs and (in future) Index Services to acquire relevant metadata;
– Compiles, optimises, partitions and schedules the query execution.
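The partitioning and scheduling step can be sketched as follows. This is a hedged illustration, not the OGSA-DQP compiler: it assumes a plan is cut into partitions at every exchange operator, the plan shape built by `example_plan` is only one plausible reading of the earlier example, and the round-robin `schedule` is a placeholder policy. The names `Op`, `partition`, `schedule` and `example_plan` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    """A node of a query plan: an operator name plus its input operators."""
    name: str
    inputs: list = field(default_factory=list)


def partition(root):
    """Split a plan into partitions, cutting at every 'exchange' operator.

    Returns a list of operator-name lists, one per partition, root partition first.
    """
    partitions = []

    def walk(op, current):
        if op.name == "exchange":
            # Everything below an exchange belongs to a different partition.
            for child in op.inputs:
                fresh = []
                partitions.append(fresh)
                walk(child, fresh)
        else:
            current.append(op.name)
            for child in op.inputs:
                walk(child, current)

    top = []
    partitions.append(top)
    walk(root, top)
    return partitions


def schedule(partitions, hosts):
    """Assign partition indexes to hosts round-robin (placeholder policy)."""
    return {i: hosts[i % len(hosts)] for i in range(len(partitions))}


def example_plan():
    """One assumed shape for the example plan, with exchanges between stages."""
    term_side = Op("exchange", [Op("reduce", [Op("index_scan(proteinTerm)")])])
    protein_side = Op("exchange", [Op("reduce", [Op("table_scan(protein)")])])
    join = Op("hash_join(proteinId)", [term_side, protein_side])
    return Op("reduce", [Op("op_call(Blast)", [Op("exchange", [join])])])
```

For instance, `schedule(partition(example_plan()), ["N1", "N2"])` maps the four resulting partitions alternately onto the two hosts. A real scheduler would weigh data location and resource availability instead.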
OGSA-DQP: a brief tour (3)
• Given a distributed query plan, a GDQS:
– Interacts with GDS factories to create the leaf services in the plan;
– Interacts with WSs that front-end analysis capabilities;
– Commands the creation of GQESs as stipulated by the partitioning and scheduling decided on by the compiler;
– Coordinates the GQESs into executing the plan.
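The create-then-coordinate pattern described above can be sketched with plain Python objects. This is a toy model under loud assumptions: the classes `GQES`, `GQESFactory` and `GDQS` and their method signatures are hypothetical and in-process, whereas the real services are Grid services invoked remotely. Only the operation names createService and perform are taken from the slides.

```python
class GQES:
    """Toy evaluation service: runs (here, merely reports) the sub-plan it is given."""

    def __init__(self, host):
        self.host = host

    def perform(self, subplan):
        return f"{self.host} ran {subplan}"


class GQESFactory:
    """Toy factory living on one host; creates evaluation services there."""

    def __init__(self, host):
        self.host = host

    def create_service(self):
        return GQES(self.host)


class GDQS:
    """Toy coordinator: creates one GQES per scheduled sub-plan, then runs them."""

    def __init__(self, factories):
        self.factories = factories  # host name -> GQESFactory

    def perform(self, schedule):
        # Phase 1: ask the factory on each chosen host to create a service.
        services = [
            (self.factories[host].create_service(), subplan)
            for host, subplan in schedule
        ]
        # Phase 2: hand each service its sub-plan and collect what it returns.
        return [service.perform(subplan) for service, subplan in services]
```

A usage example: build `GDQS({"N1": GQESFactory("N1"), "N2": GQESFactory("N2")})` and call `perform([("N1", "scan"), ("N2", "join")])`. Separating creation from execution mirrors the slide's distinction between commanding the creation of GQESs and coordinating them.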
what is going on behind the scenes (1)
[Figure: interaction diagram. Participants: Client, Registry, GDQS Factory, GDQS, GDS instances, and GQES 1 to GQES n. Port type labels: GS, GDS, GDT, GDQ. Operations shown: registerService, findServiceData, createService, importSchema, findServiceData(DBSchema), perform(query), perform(gqes_query), perform(querySubPlan).]
what is going on behind the scenes (2)
[Figure: enactment of the example query across nodes N0 to N4, with a GQES factory on several nodes. Step 1: the Client sends perform(Query) to the GDQS. Step 2: createService is called on the GQES factories. Step 3: perform(QuerySubplan) is sent to each GQES. Step 4: results flow back. Sub-plans shown: sequential_scan then reduce(proteinID, sequence); sequential_scan(term = 8372) then reduce(proteinID); hash_join(p.proteinID = t.proteinID); and, on two GQESs, operation_call blast(p.sequence) then reduce(p.proteinID, blast), calling Web Services (BLAST).]
the Khalaf-Leymann taxonomy for web services aggregation
[Figure: taxonomy tree of web services aggregation. Labels as extracted: unconstrained, constrained; aggregation; agreements; grouping recursive wiring; choreography service domains.]
OGSA-DQP: various kinds of service aggregation
• There is interface inheritance from GSs and GDSs.
• The execution plan can be seen as encapsulating a wiring of GQESs,
• but one that is constrained and constructed on the fly, as in an orchestration.
• As in service domains, there is competition of GQESs for a role to play in the orchestration.
• As in agreements, the orchestration is opportunistic, responsive to the prevailing resource levels, and short-lived.
summary
• OGSA-DQP is a service-based distributed query processor for the Grid that is:
– Exposed as a service;
– Implemented as an orchestration of services.
• OGSA-DQP is an enactor of declarative Grid service orchestrations that:
– Improves on Grid portals when only retrieval and analysis is involved;
– Fills the gap left by the lack of a service orchestration framework in the OGSA.
where to find out more: papers
1. M N Alpdemir, A Mukherjee, A Gounaris, A A A Fernandes, N W Paton, P Watson, J Smith. An Experience Report on Designing and Building OGSA-DQP: A Service Based Distributed Query Processor for the Grid. GGF9 Workshop on Designing and Building Grid Services, 2003.
2. M N Alpdemir, A Mukherjee, A Gounaris, N W Paton, P Watson, A A A Fernandes, J Smith. Service-Based Distributed Querying on the Grid. 1st Int. Conf. on Service Oriented Computing, 2003. LNCS, to appear.
3. M N Alpdemir, A Mukherjee, A Gounaris, N W Paton, P Watson, A A A Fernandes, J Smith. OGSA-DQP: A Service-Based Distributed Query Processor for the Grid. 2nd UK e-Science All Hands Meeting, 2003.
4. J Smith, A Gounaris, P Watson, N W Paton, A A A Fernandes, R Sakellariou. Distributed Query Processing on the Grid. GRID 2002, LNCS 2536.
(papers available from http://www.cs.man.ac.uk/~alvaro/publications.html )
where to find out more: software
OGSA-DQP
Grid middleware to query distributed data sources
www.ogsadai.org.uk/dqp
OGSA-DAI
Grid middleware to interface with data(bases)
www.ogsadai.org.uk/
Globus Toolkit
Open-source implementation of OGSA/OGSI
www.globustoolkit.org/