BiodiversityWorld GRID Workshop, NeSC, Edinburgh – 30 June and 1 July 2005

Andrew Jones: Interop. in changing infrastructure


Design Decisions

Interoperability in a changing architecture

Andrew Jones


BiodiversityWorld requirements (1)

• Biodiversity Problem Solving Environment:
  • Heterogeneous, diverse resources
  • Facilitating integration of both legacy and newly developed resources
  • Flexible workflows
• Main challenges centre around metadata, interoperability, resource discovery, etc.
• High-performance computing secondary (though relevant)


BiodiversityWorld requirements (2)

• Distinctive features:
  • a biodiversity informatics GRID
  • interoperability with heterogeneous data, complex in structure
  • resilience to infrastructure change & interoperation with other GRIDs
  • interactive collaboration a secondary concern
• Assumptions about resources:
  • A resource worked either:
    • Essentially in ‘batch’ mode, or
    • Supporting a sequence of operations on a single resource, but involving exchange of minimal data
  • Reasonable to treat each resource (including databases) as a service offering its own, defined set of operations


BiodiversityWorld architectural overview

[Architecture diagram. Components shown: User interface; Presentation; Workflow enactment engine; Metadata repository; BGI API; BiodiversityWorld-GRID Interface (BGI); The GRID; Wrapped resources; Native BiodiversityWorld Resources.]


The BGI concept

• Standardised invocation mechanism

• Wrappers notionally divided into Grid-facing and resource-facing parts

[Class diagram. <<interface>> BgiWrapperInterface, with Bgi Implementation_1, Bgi Implementation_2, . . . ; <<abstract>> BdwAbstractWrapper, with Concrete Wrapper_1, Concrete Wrapper_2, . . . ; the association between the two sides is labelled 1 at each end.]
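A minimal Java sketch of how the Grid-facing and resource-facing parts might fit together. Only the names BgiWrapperInterface and BdwAbstractWrapper come from the slide; the method signatures, LocalBgiImplementation and SpeciesNameWrapper are hypothetical, and a real BGI implementation would sit behind RMI, Web services or similar rather than a direct call.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Grid-facing part. The name is from the slide; the method signature is an assumption.
interface BgiWrapperInterface {
    Object invokeOperation(String operation, Map<String, Object> args);
}

// Resource-facing part. The name is from the slide; both methods are assumptions.
abstract class BdwAbstractWrapper {
    abstract List<String> operations();
    abstract Object perform(String operation, Map<String, Object> args);
}

// Hypothetical BGI implementation that calls the wrapper in-process.
// An RMI or Web-service version would implement the same interface.
class LocalBgiImplementation implements BgiWrapperInterface {
    private final BdwAbstractWrapper wrapper;

    LocalBgiImplementation(BdwAbstractWrapper wrapper) {
        this.wrapper = wrapper;
    }

    @Override
    public Object invokeOperation(String operation, Map<String, Object> args) {
        // Each resource offers its own defined set of operations; reject anything else.
        if (!wrapper.operations().contains(operation)) {
            throw new IllegalArgumentException("Unknown operation: " + operation);
        }
        return wrapper.perform(operation, args);
    }
}

// Hypothetical concrete wrapper around a trivial "resource".
class SpeciesNameWrapper extends BdwAbstractWrapper {
    @Override
    List<String> operations() {
        return List.of("normalise");
    }

    @Override
    Object perform(String operation, Map<String, Object> args) {
        return String.valueOf(args.get("name")).trim().toLowerCase(Locale.ROOT);
    }
}
```

Replacing LocalBgiImplementation with another implementation of the same interface leaves SpeciesNameWrapper untouched, which is the kind of insulation the BGI is meant to give.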


Why we protected ourselves from ‘the Grid’(!)

• Rapidly evolving standards
• Previous experience in GRAB
  • Globus 2 approach needed ‘canned queries’, temporary files, etc … unnatural for distributed request/response model
• BiodiversityWorld
  • Globus and other software still evolving
  • Globus 3: Grid Services; Globus 4: WSRF; …
• Trade-off: abstraction layer (BGI); invocation mechanism
  • Insulates from change
  • Performance penalty
    • Assume computationally intensive applications lie in a single BDW resource
  • Proprietary invocation mechanism hinders interoperation with other Grid/Web services


Implementations of BGI

• RMI

• GT3 Grid Services (incomplete)

• Web services

• GT4/WSRF/Grid-Service-as-portal


Benefits & limitations

• Too many standards, so we defined a new one!!
• Interoperability with other projects restricted
  • Could wrap non-BDW resources, or
  • Implement alternative Grid-facing “glue” replacing invocation mechanism with some other standard
• Restrictions on highly interactive applications
  • BGI OK for coarse-grained interaction; not for dynamic interaction with potentially large data volumes
• Transmission and storage of intermediate results: method not specified
  • Can pass URI instead of data, but no specifications restricting what this might refer to
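The choice between passing data and passing a URI can be made explicit in a small payload type. This is a sketch only: DataRef, its methods and the size threshold are all hypothetical and are not part of the BGI as described here.

```java
import java.net.URI;
import java.util.function.Function;

// Hypothetical payload type: carries either the data itself or a URI pointing at it.
final class DataRef {
    private final byte[] inline;  // non-null when passed by value
    private final URI location;   // non-null when passed by reference

    private DataRef(byte[] inline, URI location) {
        this.inline = inline;
        this.location = location;
    }

    static DataRef byValue(byte[] data) {
        return new DataRef(data.clone(), null);
    }

    static DataRef byReference(URI location) {
        return new DataRef(null, location);
    }

    boolean isReference() {
        return location != null;
    }

    URI location() {
        return location;
    }

    byte[] inline() {
        return inline == null ? null : inline.clone();
    }

    // Send small results inline; hand large ones to a store and pass the URI instead.
    // The store function is an assumption: anything that accepts bytes and returns a URI.
    static DataRef choose(byte[] data, int maxInlineBytes, Function<byte[], URI> store) {
        return data.length <= maxInlineBytes
                ? byValue(data)
                : byReference(store.apply(data));
    }
}
```

A type like this does not settle what a URI may refer to; it only makes the by-value or by-reference decision visible to the receiving resource.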


Transmission/storage of data

• Desirable to have uniform mechanisms for transmission and storage of data for:
  • Efficient operation of workflows
  • Re-use; composition of workflows
  • Supporting more flexible experimentation


Are workflows sufficient for flexible experimentation?

• Creating a workflow:
  • Workflows clearly good for capturing complex tasks
  • Good for ‘tweaking’ tasks
  • But is this how users think?
  • If not, we should provide an environment that supports a more exploratory approach too, e.g.
    • User tries out some small subtasks
    • (S)he joins results together
    • Builds larger workflows from fragments
• This requires recording of interactions, so re-usable workflows can be composed
  • Storage of intermediate data sets
  • Provenance metadata (extending MDR)
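One way to picture the recording of interactions is a log in which each step names the intermediate data set it consumed and the one it produced. The sketch below is an assumption throughout (Interaction, InteractionLog and the string references are illustrative, and nothing here reflects the actual MDR schema); it shows how the chain of steps behind a result could be recovered and later re-used as a workflow fragment.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// One recorded user interaction; all field names are illustrative.
final class Interaction {
    final String resource;
    final String operation;
    final String inputRef;   // reference to the data set consumed, or null
    final String outputRef;  // reference to the intermediate data set produced

    Interaction(String resource, String operation, String inputRef, String outputRef) {
        this.resource = resource;
        this.operation = operation;
        this.inputRef = inputRef;
        this.outputRef = outputRef;
    }
}

class InteractionLog {
    private final List<Interaction> steps = new ArrayList<>();

    void record(Interaction step) {
        steps.add(step);
    }

    // Walk back from a result to the steps that produced it, oldest first.
    List<Interaction> lineage(String outputRef) {
        LinkedList<Interaction> chain = new LinkedList<>();
        String wanted = outputRef;
        // The size check stops the walk if the recorded references form a cycle.
        while (wanted != null && chain.size() < steps.size()) {
            Interaction producer = null;
            for (Interaction s : steps) {
                if (wanted.equals(s.outputRef)) {
                    producer = s;
                    break;
                }
            }
            if (producer == null) {
                break;
            }
            chain.addFirst(producer);
            wanted = producer.inputRef;
        }
        return chain;
    }
}
```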


How to achieve dynamic interaction?

• Some possibilities for future development
• Remote direct manipulation (and other remote interactions?)
  • BGI not well suited to fine-grained interaction with resources
  • Some resources may not be accessible except as stand-alone
  • May need (less portable) ‘by-pass’ mechanisms, e.g.
    • New BGI protocol
    • Using existing techniques, such as VNC
• Local direct manipulation, etc.
  • Achievable via component-based ‘plug-in’ approaches (e.g. using JavaBeans), but component interface must be defined
  • Requires data to be present locally; bandwidth concerns
  • Some bandwidth problems can be addressed by combining local specialised client component & remote server component (e.g. passing vectors, not bitmaps)
    • BGI may or may not be fast enough in this case


How to achieve data transmission/intermediate result storage?

• Low level
  • E.g. orchestrate facilities such as GridFTP, GRAM, …
• Higher-level
  • E.g. Inferno, SRB


Additional considerations

• Again, have problem of committing to other, evolving standards

• Need at least a thin API layer to protect resources from change

• And don’t want to break existing BDW system
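A thin API layer of this kind could be as small as a two-method interface that resources program against, with GridFTP, SRB or some other mechanism hidden behind it. The sketch below is illustrative only: IntermediateStore, InMemoryStore and the "mem:" URI scheme are invented for the example and do not correspond to any existing BDW component.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical thin layer: resources see only put/get, never the transport behind it.
interface IntermediateStore {
    URI put(byte[] data);
    byte[] get(URI ref);
}

// Stand-in implementation kept in memory; a GridFTP- or SRB-backed version
// would implement the same interface without changes to the calling resources.
class InMemoryStore implements IntermediateStore {
    private final Map<URI, byte[]> blobs = new HashMap<>();
    private int next = 0;

    @Override
    public URI put(byte[] data) {
        URI ref = URI.create("mem:" + next++);  // made-up scheme for the sketch
        blobs.put(ref, data.clone());
        return ref;
    }

    @Override
    public byte[] get(URI ref) {
        byte[] data = blobs.get(ref);
        if (data == null) {
            throw new NoSuchElementException(ref.toString());
        }
        return data.clone();
    }
}
```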


More direct database exploitation with OGSA-DAI

• BioDA project is investigating relevance & suitability of OGSA-DAI in relation to bioinformatics projects

• 2 main possibilities within BDW:
  1. Augment BGI so that queries can be included in workflows and sent directly to OGSA-DAI-enabled databases.
    • Distributed query processing facilities could assist in planning execution & distribution of data-orientated parts of a workflow. (For the current status of OGSA-DQP see Section 4.)
    • Very major revision to BDW protocols; also,
    • many resources of interest are simply not exposed as databases.
  2. Provide facilities within individual wrappers that benefit from OGSA-DAI.

• Current exemplar (under development) takes approach (2) …


BDW OGSA-DAI initial exemplar

[Sequence diagram. Components shown: OGSA-DAI Client; OGSA-DAI R5 GDS; BDWQueryActivity; Wrapper Module (several Wrappers); Web DBs; Format file (xsl). Other labels: deliverFromURL(url); XSLTransform; pull data. Numbered steps:
1. BGI() / BGIinvokeOperation
2. Create GDS and query
3. Invoke wrapper
4. Query (Web DBs)
5. Download URL
6. url
7. XSL transform to BDW format
8. getOutPut()]
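Step 7 (XSL transform to BDW format) can be illustrated with the JDK's standard javax.xml.transform API in place of OGSA-DAI's own XSLTransform activity. The source and target element names below are invented for the example; the actual BDW format and the exemplar's format file are not given in these slides.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XslToBdwSketch {
    // Hypothetical stylesheet: maps made-up <rows>/<row>/<sp> input
    // onto made-up <records>/<species> output.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/rows'>"
        + "<records><xsl:apply-templates select='row'/></records>"
        + "</xsl:template>"
        + "<xsl:template match='row'>"
        + "<species><xsl:value-of select='sp'/></species>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Apply the stylesheet to the XML returned by a wrapped database.
    static String transform(String sourceXml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(sourceXml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSL transform failed", e);
        }
    }
}
```

Keeping the mapping in a stylesheet means that supporting another Web database is mostly a matter of supplying a new format file, not new wrapper code.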


BDW OGSA-DAI exemplar extension

[Diagram. Components shown: OGSA-DAI Client; OGSA-DAI R5 GDS; several XSLTransform activities; mergeOutput; deliverToURL / GFTP. Numbered steps shown:
1. BGI([ ]) / BGIInvokeOperation([ ])
7. XSL transform to BDW format
8. integrate output
9. To WF unit]
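The extension sends one request to several wrappers and integrates their outputs before passing the result to the workflow unit. A minimal sketch of that fan-out and merge, using java.util.concurrent in place of the OGSA-DAI activities; the wrappers are stand-in callables and each record is a plain string, both assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class MergeOutputSketch {
    // Run every wrapper query concurrently and merge the results in wrapper order.
    static List<String> queryAll(List<Callable<List<String>>> wrappers) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, wrappers.size()));
        try {
            List<String> merged = new ArrayList<>();
            // invokeAll returns futures in the same order as the submitted tasks.
            for (Future<List<String>> result : pool.invokeAll(wrappers)) {
                merged.addAll(result.get());
            }
            return merged;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while querying wrappers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A wrapper query failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```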


Conclusions

• BDW interoperation layer designed to meet requirements we were given
  • Suitable for high-level interactions
  • Not so good for dynamic interaction with resources (need for this now generally recognised)
  • Doesn’t specify how data is to be moved around
• Applicable to other domains meeting similar criteria
• Interesting possibilities for extension
• But we have achieved a sustainable architecture; this is an important feature to retain in future systems


Some discussion points
(Arising from Jaspreet’s and Andrew’s talks)

1. Balance of requirements for different kinds of GRIDs (performance, resource discovery, sustainability, …): how does this affect decisions about architectures, protocols, …?
2. How can BDW protocols best be enhanced in future projects?
3. How can we best achieve interoperability between grids from different projects (including BDW)?
4. How can we make it easier for 3rd parties to
  • Introduce their resources to an existing BgiWrapperService?
  • Develop their own additional BgiWrapperServices?