The Globus Toolkit™: and its application to GriPhyN Carl Kesselman Director of the Center for Grid Technologies Information Sciences Institute University of Southern California

Post on 18-Dec-2015


The Globus Toolkit™: and its application to GriPhyN

Carl Kesselman

Director of the Center for Grid Technologies

Information Sciences Institute

University of Southern California


April 18, 2023 EO Grid Workshop 2

Outline

Overview of the Globus Toolkit

Application of Globus to the virtual data problem (GriPhyN)

Open Grid Services Architecture


Partial Acknowledgements

Open Grid Services Architecture design
- Karl Czajkowski @ USC/ISI
- Ian Foster, Steve Tuecke @ ANL
- Jeff Nick, Steve Graham, Jeff Frey @ IBM

Grid services collaborators at ANL
- Kate Keahey, Gregor von Laszewski
- Thomas Sandholm, Jarek Gawor, John Bresnahan

Globus Toolkit R&D also involves many fine scientists & engineers at ANL, USC/ISI, and elsewhere (see www.globus.org)

Strong links with many EU, UK, US Grid projects

Support from DOE, NASA, NSF, Microsoft


The Grid Problem

Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations


Grid Computing Concept

New applications enabled by the coordinated use of geographically distributed resources
- E.g., distributed collaboration, data access and analysis, distributed computing

Persistent infrastructure for Grid computing
- E.g., certificate authorities and policies, protocols for resource discovery/access

Original motivation, and support, from high-end science and engineering; but has wide-ranging applicability


Grids: Why Now?

Moore’s law ⇒ highly functional end-systems

Ubiquitous Internet ⇒ universal connectivity

Network exponentials produce dramatic changes in geometry and geography
- 9-month doubling: double Moore’s law!
- 1986-2001: x340,000; 2001-2010: x4000?

New modes of working and problem solving emphasize teamwork, computation

New business models and technologies facilitate outsourcing
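
The network growth factors quoted on this slide can be sanity-checked with compound doubling. The following is a minimal sketch, assuming a constant doubling period; the function name is illustrative and the 18-month figure for Moore's law is an assumption (the slide only says the network rate is "double Moore's law").

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """Multiplicative growth over `years`, given a fixed doubling period."""
    return 2 ** (years * 12 / doubling_months)


# 2001-2010 at a 9-month doubling: 9 years is 12 doublings.
network = growth_factor(9, 9)

# Same span at an assumed 18-month doubling for Moore's law: 6 doublings.
compute = growth_factor(9, 18)

print(network, compute)  # 4096.0 64.0
```

The result of 4096 is close to the slide's "x4000?". The 1986-2001 figure of x340,000 corresponds to roughly 18.4 doublings in 180 months, that is, a doubling period of slightly under 10 months.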


The Grid World: Current Status

Dozens of major Grid projects in scientific & technical computing/research & education
- Deployment, application, technology

Considerable consensus on key concepts and technologies
- Open source Globus Toolkit™ a de facto standard for major protocols & services
- Far from complete or perfect, but out there, evolving rapidly, and large tool/user base

Global Grid Forum a significant force

Industrial interest emerging rapidly


Layered Grid Architecture (By Analogy to Internet Architecture)

Grid layers, top to bottom:

Application

Collective: “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services

Resource: “Sharing single resources”: negotiating access, controlling use

Connectivity: “Talking to things”: communication (Internet protocols) & security

Fabric: “Controlling things locally”: access to, & control of, resources

Internet Protocol Architecture layers, top to bottom: Application, Transport, Internet, Link


Globus Toolkit

Globus Toolkit is the source of many of the protocols described in “Grid architecture”

Adopted by almost all major Grid projects worldwide as a source of infrastructure

Open source, open architecture framework encourages community development

Active R&D program continues to move technology forward

Developers at ANL, USC/ISI, NCSA, LBNL, and other institutions

www.globus.org


Globus Toolkit Components Include …

Core protocols and services
- Grid Security Infrastructure
- Grid Resource Access & Management
- MDS information & monitoring
- GridFTP data access & transfer

Other services
- Community Authorization Service
- DUROC co-allocation service

Other Data Grid technologies
- Replica catalog, replica management service


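Globus Toolkit Structure

[Figure: three service stacks, each built on GSI. GRAM and MDS front a compute resource through job managers; GridFTP and MDS front a data resource; “???” fronts another service or application. Labels on the figure: service naming, reliable invocation, soft state management, notification.]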


The Globus Toolkit in One Slide

Grid protocols (GSI, GRAM, …) enable resource sharing within virtual orgs; toolkit provides reference implementation (= Globus Toolkit services)

Protocols (and APIs) enable other tools and services for membership, discovery, data mgmt, workflow, …

[Figure: a user authenticates and creates a proxy credential through GSI (Grid Security Infrastructure). User process #1, holding the proxy, makes a reliable remote invocation through GRAM (Grid Resource Allocation & Management) to a gatekeeper (factory). The gatekeeper creates user process #2 with proxy #2 and registers it with a reporter (registry + discovery). Through MDS-2 (Meta Directory Service), the reporter performs soft state registration and answers enquiries with a GIIS: Grid Information Index Server (discovery). Other GSI-authenticated remote service requests go to other services (e.g. GridFTP).]


GriPhyN Project Goals

Amplify science productivity through the Grid
- Provide powerful abstractions for scientists: datasets and transformations, not files and programs
- Using a grid is harder than using a workstation. GriPhyN seeks to reverse this situation!

These goals challenge the boundaries of computer science in knowledge representation and distributed computing.

Apply these advances to major experiments
- Not just developing solutions, but proving them through deployment


GriPhyN Approach

Virtual Data
- Tracking the derivation of experiment data with high fidelity
- Transparency with respect to location and materialization

Automated grid request planning
- Advanced, policy driven scheduling

Achieve this at peta-scale magnitude

We present here a vision that is still 3 years away, but the foundation is starting to come together


Virtual Data

Track all data assets

Accurately record how they were derived

Encapsulate the transformations that produce new data objects

Interact with the grid in terms of requests for data derivations


Request Automation

Request Planning and Execution

High performance
- Grid resources are used in efficient ways for high throughput and/or fast response

Based on policy
- Policy specifies how resources should be used and how workloads should be treated

Fault tolerant
- It’s a grid – so failures are normal

Transparent to the user
- Make the grid like a workstation


GriPhyN Challenge Problem: CMS Event Reconstruction

[Figure: workflow spanning a Caltech workstation (master Condor job running at Caltech), the Wisconsin (WI) Condor pool (secondary Condor job), the NCSA Linux cluster, and NCSA UniTree (a GridFTP-enabled FTP server). Steps as numbered on the slide:]

2) Launch secondary job on WI pool; input files via Globus GASS

3) 100 Monte Carlo jobs on Wisconsin Condor pool

4) 100 data files transferred via GridFTP, ~1 GB each

5) Secondary reports complete to master

6) Master starts reconstruction jobs via Globus jobmanager on cluster

7) GridFTP fetches data from UniTree

8) Processed Objectivity database stored to UniTree

9) Reconstruction job reports complete to master

Work of: Scott Koranda, Miron Livny, Vladimir Litvin, & others


Why is this useful?

Easier to FIND the data
- A disciplined approach to tracking massive amounts of data

Can PRODUCE and analyze data more easily
- Automate details of data production

Can VALIDATE scientific results accurately

Can SHARE data more easily

Can produce and analyze MORE data FASTER
- Leverage huge storage and computing resources


Why is this hard?

Data derivation tracking
- Diversity of transformations
- Achieving fidelity of reproduction
- Many modes of data storage

Automated request planning
- Multiple levels of resource sharing and allocation policy
- Faults are the norm in large grids
- Resources are constantly in flux
- An OS the size of the planet!

Peta-scale performance level


The Virtual Data Model

Data suppliers publish data to the Grid

Users request raw or derived data from the Grid, without needing to know
- Where data is located
- Whether data is stored or computed on demand

Users and applications can easily determine
- What it will cost to obtain data
- Quality of derived data

Virtual Data Grid serves requests efficiently, subject to global and local policy constraints


GriPhyN: Virtual Data Tracking Complex Dependencies

Dependency graph is:
- Files: 8 < (1,3,4,5,7), 7 < 6, (3,4,5,6) < 2
- Programs: 8 < psearch, 7 < summarize, (3,4,5) < reformat, 6 < conv, (1,2) < simulate

[Figure: derivation graph. “simulate -t 10 …” produces file1 and file2; “reformat -f fz …” produces files 3, 4, 5; “conv -I esd -o aod” produces file6; “summarize -t 10 …” produces file7; “psearch -t 10 …” produces file8, the requested file.]


Re-creating Virtual Data

To re-create file 8: Step 1
- simulate > file1, file2

[Figure: derivation graph from simulate through psearch to the requested file8, repeated]


Re-creating Virtual Data

To re-create file 8: Step 2
- files 3, 4, 5, 6 derived from file 2
- reformat > file3, file4, file5
- conv > file 6

[Figure: derivation graph from simulate through psearch to the requested file8, repeated]


Re-creating Virtual Data

To re-create file 8: Step 3
- File 7 depends on file 6
- summarize > file 7

[Figure: derivation graph from simulate through psearch to the requested file8, repeated]


Re-creating Virtual Data

To re-create file 8: Final step
- File 8 depends on files 1, 3, 4, 5, 7
- psearch < file1, file3, file4, file5, file 7 > file 8

[Figure: derivation graph from simulate through psearch to the requested file8, repeated]
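
The four re-creation steps above amount to a depth-first walk of the dependency graph. The following is a minimal sketch, assuming the graph is held as a plain dictionary and that one invocation of a program produces all of its outputs; the `DERIVATIONS` table and the `plan` function are illustrative names, not part of GriPhyN.

```python
# Derivation records transcribed from the slide's dependency graph:
# output file -> (program that produces it, input files it needs).
DERIVATIONS = {
    "file1": ("simulate", []),
    "file2": ("simulate", []),
    "file3": ("reformat", ["file2"]),
    "file4": ("reformat", ["file2"]),
    "file5": ("reformat", ["file2"]),
    "file6": ("conv", ["file2"]),
    "file7": ("summarize", ["file6"]),
    "file8": ("psearch", ["file1", "file3", "file4", "file5", "file7"]),
}


def plan(target, materialized=frozenset()):
    """Return program invocations, in a valid order, that re-create `target`.

    Files in `materialized` are assumed to exist already and are not
    re-derived. Each program is listed at most once (assumption: a single
    run yields all of its outputs).
    """
    steps, done = [], set(materialized)

    def visit(name):
        if name in done:
            return
        program, inputs = DERIVATIONS[name]
        for dep in inputs:          # derive every input first
            visit(dep)
        done.add(name)
        if program not in steps:
            steps.append(program)

    visit(target)
    return steps


print(plan("file8"))
# ['simulate', 'reformat', 'conv', 'summarize', 'psearch']
print(plan("file7", materialized={"file6"}))
# ['summarize']
```

The second call shows the virtual-data idea in miniature: when an intermediate product is already materialized, only the remaining derivations are planned.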


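GriPhyN/PPDG Data Grid Architecture

[Figure: the Application passes an abstract DAG to the Planner, which passes a concrete DAG to the Executor (DAGMAN, Kangaroo). The Executor drives the Compute Resource (GRAM) and the Storage Resource (GridFTP; GRAM; SRM). Supporting services and the technologies labeled next to them: Catalog Services (MCAT; GriPhyN catalogs), Info Services (MDS), Policy/Security (GSI, CAS), Monitoring (MDS), Repl. Mgmt. (GDMP), Reliable Transfer Service. A legend marks which components are Globus.]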


(evolving) View of Data Grid Stack

[Figure: stack of data grid components: Data Transport (GridFTP), Storage Element, Storage Element Manager, Local Repl Catalog (Flat or Hierarchical), Reliable File Transfer, Replica Location Service, Reliable Replication, Publish-Subscribe Service (GDMP)]


Initial GriPhyN Virtual Data Implementation

Production DAG of Simulated CMS Data:

[Figure: Simulate Physics, then Simulate CMS Detector Response, then Copy flat-file to OODBMS, then Simulate Digitization of Electronic Signals]

Architecture of the System:

[Figure: the Virtual Data Language is processed by the VDL Interpreter (VDLI), which uses a Virtual Data Catalog (PostgreSQL) and local file storage. Job Submission Sites (ANL, SC, …) run a Condor-G Agent, a Globus Client, and a GridFTP Server. Over GSI-secured links across the Grid testbed they reach Job Execution Sites at U of Chicago, U of Wisconsin, and U of Florida, each with a GridFTP Client, Globus GRAM, and a Condor Pool.]


Virtual Data Catalog Conceptual Data Structure

TRANSFORMATION
/bin/physapp1
version 1.2.3b(2)
created on 12 Oct 1998
owned by physbld.orca

DERIVATION
^ paramlist
^ transformation

FILE
LFN=filename1
PFN1=/store1/1234987
PFN2=/store9/2437218
PFN3=/store4/8373636
^derivation

FILE
LFN=filename2
PFN1=/store1/1234987
PFN2=/store9/2437218
^derivation

PARAMETER LIST
PARAMETER i filename1
PARAMETER O filename2
PARAMETER E PTYPE=muon
PARAMETER p -g
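
The conceptual structure above can be mirrored with a few record types. This is a minimal sketch, assuming Python dataclasses as the representation; the class and field names are illustrative and do not reflect the actual catalog schema. The parameter tags (i, O, E, p) are kept exactly as they appear on the slide, and attaching the derivation only to filename2 is a simplification.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Transformation:
    path: str
    version: str
    created: str
    owner: str


@dataclass
class Derivation:
    transformation: Transformation       # the "^ transformation" pointer
    params: List[Tuple[str, str]]        # the "^ paramlist" pointer: (tag, value)


@dataclass
class FileEntry:
    lfn: str                             # logical file name
    pfns: List[str]                      # physical replicas
    derivation: Optional[Derivation] = None


physapp = Transformation("/bin/physapp1", "1.2.3b(2)", "12 Oct 1998", "physbld.orca")
deriv = Derivation(physapp, [("i", "filename1"), ("O", "filename2"),
                             ("E", "PTYPE=muon"), ("p", "-g")])
file2 = FileEntry("filename2", ["/store1/1234987", "/store9/2437218"], deriv)


def provenance(entry: FileEntry) -> Optional[str]:
    """Describe how a file was derived, or None if no derivation is recorded."""
    if entry.derivation is None:
        return None
    t = entry.derivation.transformation
    tagged = ", ".join(f"{tag}={value}" for tag, value in entry.derivation.params)
    return f"{t.path} {t.version} [{tagged}]"


print(provenance(file2))
```

Following the pointers from a file to its derivation and on to the transformation is what lets a request for filename2 be answered either from a replica or by re-running the recorded program.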


Planner Decision Making

Planner considers:
- Policy (fairly static, from CAS/SAS)
- Grid resource status: state, load
- Job (user/group) resource consumption history
- Job profiles (resources over time) from Prophesy

[Figure: inputs to the planner: policy, status, job usage info from accounting records, and job profile records produced by Prophesy (predictor) from job profiling data]


Executor Example: Condor DAGMan

Directed Acyclic Graph Manager

Specify the dependencies between Condor jobs using DAG data structure

Manage dependencies automatically
- (e.g., “Don’t run job ‘B’ until job ‘A’ has completed successfully.”)

Each job is a “node” in DAG

Any number of parent or children nodes

No loops

[Figure: example DAG with Job A at the top, Job B and Job C in the middle, and Job D at the bottom]

Slide courtesy Miron Livny, U. Wisconsin


Executor Example: Condor DAGMan (Cont.)

DAGMan acts as a “meta-scheduler”
- holds & submits jobs to the Condor queue at the appropriate times based on DAG dependencies

If a job fails, DAGMan continues until it can no longer make progress and then creates a “rescue” file with the current state of the DAG
- When failed job is ready to be re-run, the rescue file is used to restore the prior state of the DAG

[Figure: DAGMan feeding jobs from the DAG (A, B, C, D) into the Condor job queue]

Slide courtesy Miron Livny, U. Wisconsin
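
The meta-scheduler and rescue behavior described above can be sketched in a few lines. This is an illustration only, not DAGMan's implementation or its rescue file format: `run_dag`, the `run_job` callback (a stand-in for submitting to the Condor queue), and the diamond-shaped example DAG are all assumptions.

```python
def run_dag(parents, run_job, done=frozenset()):
    """Run jobs in dependency order, continuing past failures where possible.

    parents: job -> set of jobs that must succeed first.
    run_job: callable returning True on success (hypothetical stand-in
             for submitting a job to the Condor queue).
    done:    jobs already completed, e.g. restored from a rescue record.
    Returns (completed, failed); `completed` is the state a rescue
    record would need to preserve.
    """
    completed, failed = set(done), set()
    progress = True
    while progress:                         # stop when nothing more can run
        progress = False
        for job in sorted(parents):
            if job in completed or job in failed:
                continue
            if parents[job] <= completed:   # all parents succeeded
                if run_job(job):
                    completed.add(job)
                else:
                    failed.add(job)
                progress = True
    return completed, failed


# Assumed example: A precedes B and C, which both precede D.
DIAMOND = {"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B", "C"}}

# First attempt: B fails. C still runs, but D can never start.
completed, failed = run_dag(DIAMOND, lambda job: job != "B")
print(sorted(completed), sorted(failed))    # ['A', 'C'] ['B']

# Rescue run: resume from the saved state; only B and D execute.
completed, failed = run_dag(DIAMOND, lambda job: True, done=completed)
print(sorted(completed), sorted(failed))    # ['A', 'B', 'C', 'D'] []
```

Keeping only the set of completed jobs is enough to resume, because readiness of every remaining job is recomputed from the DAG itself.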


DAG Usage

Abstract DAG
- Represents user requests
- Simplest case: request for one or more data products
- Complex case: request execution of a chained set of applications
- No file or execution locations need be present

Concrete DAG
- Specifies any application invocations needed to derive data
- Specifies locations of all invocations (to the site level)
- Includes explicit job steps to move data
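
The difference between the two DAG forms can be illustrated by a toy planning step that binds logical names to sites and inserts explicit data movement. This is a minimal sketch under strong assumptions: a single execution site for every job, a replica table given as a dictionary, and site names ("siteA", "siteB") that are purely hypothetical. The transformation and file names are borrowed from the CMS pipeline elsewhere in this talk.

```python
def concretize(abstract_jobs, replica_sites, exec_site):
    """Turn an abstract DAG (in execution order) into concrete steps.

    abstract_jobs: list of (transformation, inputs, outputs) using logical
                   file names only; no locations are present.
    replica_sites: logical file name -> site currently holding a copy.
    exec_site:     site chosen for every job (a real planner would choose
                   per job from policy and resource status).
    """
    location = dict(replica_sites)
    steps = []
    for transformation, inputs, outputs in abstract_jobs:
        for lfn in inputs:
            if location[lfn] != exec_site:
                # Explicit job step to move data to where it is needed.
                steps.append(("transfer", lfn, location[lfn], exec_site))
                location[lfn] = exec_site
        steps.append(("run", transformation, exec_site))
        for lfn in outputs:
            location[lfn] = exec_site
    return steps


abstract = [
    ("cmsim.exe", ["ntpl_file"], ["fz_file"]),
    ("writeHits", ["fz_file"], ["hits_db"]),
]
for step in concretize(abstract, {"ntpl_file": "siteA"}, "siteB"):
    print(step)
# ('transfer', 'ntpl_file', 'siteA', 'siteB')
# ('run', 'cmsim.exe', 'siteB')
# ('run', 'writeHits', 'siteB')
```

No transfer is planned for fz_file because the first job already produced it at the execution site.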


CMS Pipeline in VDL

Pipeline stages: pythia_input, pythia.exe, cmsim_input, cmsim.exe, writeHits, writeDigis

begin v /usr/local/demo/scripts/cmkin_input.csh
  file i ntpl_file_path
  file i template_file
  file i num_events
  stdout cmkin_param_file
end

begin v /usr/local/demo/binaries/kine_make_ntpl_pyt_cms121.exe
  pre cms_env_var
  stdin cmkin_param_file
  stdout cmkin_log
  file o ntpl_file
end

begin v /usr/local/demo/scripts/cmsim_input.csh
  file i ntpl_file
  file i fz_file_path
  file i hbook_file_path
  file i num_trigs
  stdout cmsim_param_file
end

begin v /usr/local/demo/binaries/cms121.exe
  condor copy_to_spool=false
  condor getenv=true
  stdin cmsim_param_file
  stdout cmsim_log
  file o fz_file
  file o hbook_file
end

begin v /usr/local/demo/binaries/writeHits.sh
  condor getenv=true
  pre orca_hits
  file i fz_file
  file i detinput
  file i condor_writeHits_log
  file i oo_fd_boot
  file i datasetname
  stdout writeHits_log
  file o hits_db
end

begin v /usr/local/demo/binaries/writeDigis.sh
  pre orca_digis
  file i hits_db
  file i oo_fd_boot
  file i carf_input_dataset_name
  file i carf_output_dataset_name
  file i carf_input_owner
  file i carf_output_owner
  file i condor_writeDigis_log
  stdout writeDigis_log
  file o digis_db
end


GriPhyN CMS SC2001 Demo

Bandwidth Greedy Grid-enabled Object Collection Analysis for Particle Physics

[Figure: a “Tag” database of ~140,000 small objects and two full event databases, one of ~100,000 large objects and one of ~40,000 large objects. Each full event database is connected by a “Request” arrow and a “Parallel tuned GSI FTP” transfer.]

http://pcbunn.cacr.caltech.edu/Tier2/Tier2_Overall_JJB.htm

Work of: Koen Holtman, J.J. Bunn, H. Newman, & others


SDSS Galaxy Cluster Finding


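Cluster-finding Data Pipeline

[Figure: pipeline graph with stages numbered 1 to 5. Node labels: tsObj, field, brg, core, cluster, catalog. The tsObj, field, and brg nodes (stages 1 and 2) appear once per input field and feed core (stage 3), followed by cluster and catalog.]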


Cluster-finding Grid

Work of: Yong Zhao, James Annis, & others


GriPhyN-LIGO SC2001 Demo

Desired result: single channel time series

[Figure: architecture diagram. Components shown: HTTP frontend with CGI interface (xml), MyProxy server, Planner, Monitoring, Replica Catalog, Replica Selection, Transformation Catalog, Executor (CondorG/DAGMan) running a G-DAG (DAGMan), Logs, GridCVS, a Compute Resource behind GRAM, GridFTP servers at UWM and on the SC floor, and LDAS at UWM and at Caltech behind GridFTP and GRAM/LDAS; Frame data. Legend: in integration, prototype exclusive, in design, Globus component.]

Work of: Ewa Deelman, Gaurang Mehta, Scott Koranda, & others


Globus Toolkit: Evaluation (+)

Good technical solutions for key problems, e.g.
- Authentication and authorization
- Resource discovery and monitoring
- Reliable remote service invocation
- High-performance remote data access

This & good engineering is enabling progress
- Good quality reference implementation, multi-language support, interfaces to many systems, large user base, industrial support
- Growing community code base built on tools


Globus Toolkit: Evaluation (-)

Protocol deficiencies, e.g.
- Heterogeneous basis: HTTP, LDAP, FTP
- No standard means of invocation, notification, error propagation, authorization, termination, …

Significant missing functionality, e.g.
- Databases, sensors, instruments, workflow, …
- Virtualization of end systems (hosting envs.)

Little work on total system properties, e.g.
- Dependability, end-to-end QoS, …
- Reasoning about system properties


Globus Toolkit Structure

Lots of good mechanisms, but (with the exception of GSI) not that easily incorporated into other systems

[Figure: three service stacks, each built on GSI. GRAM and MDS front a compute resource through job managers; GridFTP and MDS front a data resource; “???” fronts another service or application. Labels on the figure: service naming, reliable invocation, soft state management, notification.]


Open Grid Services Architecture

Service orientation to virtualize resources

Define fundamental Grid service behaviors
- Core set required, others optional

A unifying framework for interoperability & establishment of total system properties

Integration with Web services and hosting environment technologies

Leverage tremendous commercial base

Standard IDL accelerates community code

Delivery via open source Globus Toolkit 3.0

Leverage GT experience, code, mindshare


“Web Services”

Increasingly popular standards-based framework for accessing network applications
- W3C standardization; Microsoft, IBM, Sun, others

WSDL: Web Services Description Language
- Interface Definition Language for Web services

SOAP: Simple Object Access Protocol
- XML-based RPC protocol; common WSDL target

WS-Inspection
- Conventions for locating service descriptions

UDDI: Universal Desc., Discovery, & Integration
- Directory for Web services


Web Services Example: Database Service

WSDL definition for “DBaccess” porttype defines operations and bindings, e.g.:
- Query(QueryLanguage, Query, Result)
- SOAP protocol

Client C, Java, Python, etc., APIs can then be generated

[Figure: a service exposing the DBaccess interface]


Transient Service Instances

“Web services” address discovery & invocation of persistent services
- Interface to persistent state of entire enterprise

In Grids, must also support transient service instances, created/destroyed dynamically
- Interfaces to the states of distributed activities
- E.g. workflow, video conf., dist. data analysis

Significant implications for how services are managed, named, discovered, and used
- In fact, much of our work is concerned with the management of service instances


The Grid Service = Interfaces + Service Data

[Figure: a Grid service exposes the GridService interface plus other interfaces, holds several service data elements, and wraps an implementation that runs in a hosting environment/runtime (“C”, J2EE, .NET, …).]

Labels on the figure:
- Service data access, explicit destruction, soft-state lifetime
- Notification, authorization, service creation, service registry, manageability, concurrency
- Reliable invocation, authentication


Open Grid Services Architecture: Fundamental Structure

1) WSDL conventions and extensions for describing and structuring services
- Useful independent of “Grid” computing

2) Standard WSDL interfaces & behaviors for core service activities
- portTypes and operations => protocols


WSDL Conventions & Extensions

portType (standard WSDL)
- Define an interface: a set of related operations

serviceType (extensibility element)
- List of port types: enables aggregation

serviceImplementation (extensibility element)
- Represents actual code

service (standard WSDL)
- instanceOf extension: map descr.->instance

compatibilityAssertion (extensibility element)
- portType, serviceType, serviceImplementation


Structure of a Grid Service

[Figure: two levels. Service description: standard WSDL PortTypes are grouped into serviceTypes, which are paired with serviceImplementations; compatibilityAssertions (cA) mark elements as equivalent (“=”). Service instantiation: standard WSDL service elements, each linked to its description by instanceOf.]


Standard Interfaces & Behaviors: Four Interrelated Concepts

Naming and bindings
- Every service instance has a unique name, from which can discover supported bindings

Information model
- Service data associated with Grid service instances, operations for accessing this info

Lifecycle
- Service instances created by factories
- Destroyed explicitly or via soft state

Notification
- Interfaces for registering interest and delivering notifications


OGSA Interfaces and Operations Defined to Date

GridService (required)
- FindServiceData
- Destroy
- SetTerminationTime

NotificationSource
- SubscribeToNotificationTopic
- UnsubscribeToNotificationTopic

NotificationSink
- DeliverNotification

Factory
- CreateService

PrimaryKey
- FindByPrimaryKey
- DestroyByPrimaryKey

Registry
- RegisterService
- UnregisterService

HandleMap
- FindByHandle

Authentication, reliability are binding properties

Manageability, concurrency, etc., to be defined


Service Data

A Grid service instance maintains a set of service data elements
- XML fragments encapsulated in standard <name, type, TTL-info> containers
- Includes basic introspection information, interface-specific data, and application data

FindServiceData operation (GridService interface) queries this information
- Extensible query language support

See also notification interfaces
- Allows notification of service existence and changes in service data
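
The <name, type, TTL-info> container and the FindServiceData query can be illustrated with a small in-memory model. This is a sketch only: the class names, the `publish` helper, and query-by-name as the sole "query language" are assumptions for illustration, and a plain string stands in for the XML fragment.

```python
import time
from dataclasses import dataclass


@dataclass
class ServiceDataElement:
    name: str
    type: str
    value: str          # stands in for the XML fragment
    good_until: float   # absolute expiry time derived from the TTL info


class GridServiceInstance:
    def __init__(self):
        self._elements = []

    def publish(self, name, type_, value, ttl, now=None):
        """Add a service data element that stays valid for `ttl` seconds."""
        now = time.time() if now is None else now
        self._elements.append(ServiceDataElement(name, type_, value, now + ttl))

    def find_service_data(self, name, now=None):
        """Return the values of unexpired elements with the given name."""
        now = time.time() if now is None else now
        return [e.value for e in self._elements
                if e.name == name and e.good_until > now]


svc = GridServiceInstance()
svc.publish("currentLoad", "xsd:float", "0.42", ttl=30, now=0)
svc.publish("queryLanguages", "xsd:string", "SQL", ttl=3600, now=0)

print(svc.find_service_data("currentLoad", now=10))     # ['0.42']
print(svc.find_service_data("currentLoad", now=60))     # []  (TTL elapsed)
print(svc.find_service_data("queryLanguages", now=60))  # ['SQL']
```

Passing `now` explicitly keeps the example deterministic; the TTL makes stale values such as a load reading drop out of query results without any explicit cleanup call.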


Grid Service Example: Database Service

A DBaccess Grid service will support at least two portTypes
- GridService
- DBaccess

Each has service data
- GridService: basic introspection information, lifetime, …
- DBaccess: database type, query languages supported, current load, …

[Figure: a service with GridService and DBaccess interfaces; service data: name, lifetime, etc., and DB info]


Lifetime Management

GS instances created by factory or manually; destroyed explicitly or via soft state
- Negotiation of initial lifetime with a factory (=service supporting Factory interface)

GridService interface supports
- Destroy operation for explicit destruction
- SetTerminationTime operation for keepalive

Soft state lifetime management avoids
- Explicit client teardown of complex state
- Resource “leaks” in hosting environments
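The soft-state idea above can be sketched with a toy hosting environment. The class and its `reap` method are hypothetical, and time is passed in as a plain number rather than read from a clock. The lifetimes 10 and 1000 are borrowed from the data mining example later in the talk.

```python
class HostingEnvironment:
    """Holds service instances and reclaims those whose lifetime has lapsed."""

    def __init__(self):
        self.instances = {}  # instance name -> termination time

    def create(self, name, now, lifetime):
        """Create an instance with an initial lifetime."""
        self.instances[name] = now + lifetime

    def set_termination_time(self, name, new_time):
        """Keepalive: the client moves the termination time into the future."""
        if name not in self.instances:
            raise KeyError(f"{name} no longer exists")
        self.instances[name] = new_time

    def destroy(self, name):
        """Explicit destruction."""
        self.instances.pop(name, None)

    def reap(self, now):
        """Soft-state destruction: drop every instance past its termination time."""
        expired = [n for n, t in self.instances.items() if t <= now]
        for n in expired:
            del self.instances[n]
        return expired


env = HostingEnvironment()
env.create("miner", now=0, lifetime=10)
env.create("database", now=0, lifetime=1000)
env.set_termination_time("miner", 20)   # one keepalive from the client
first = env.reap(now=15)                # nothing has expired yet
second = env.reap(now=25)               # the miner lapsed without a further keepalive
```

If the client crashes or simply stops sending keepalives, the hosting environment reclaims the instance on its own, which is how the scheme avoids explicit teardown and resource leaks.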


Factory

Factory interface’s CreateService operation creates a new Grid service instance
- Reliable creation (once-and-only-once)

CreateService operation can be extended to accept service-specific creation parameters

Returns a Grid Service Handle (GSH)
- A globally unique URL
- Uniquely identifies the instance for all time
- Based on name of a home handleMap service
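A minimal sketch of a factory that hands out handles. Several things here are assumptions rather than OGSA behavior: the client-supplied `request_id` used to make retried requests idempotent is just one way to approximate once-and-only-once creation, the handle format (home handleMap URL plus a random UUID) is invented, and `example.org` is a placeholder.

```python
import uuid


class Factory:
    def __init__(self, home_handle_map_url):
        # Hypothetical URL of the home handleMap service; the slide only says
        # the GSH is "based on" that name.
        self.home = home_handle_map_url.rstrip("/")
        self._by_request = {}  # request id -> GSH already issued
        self.instances = {}    # GSH -> service-specific creation parameters

    def create_service(self, request_id, **creation_params):
        """Return the GSH for this request, creating the instance only once."""
        if request_id in self._by_request:
            # A retried request gets the same handle, not a second instance.
            return self._by_request[request_id]
        gsh = f"{self.home}/{uuid.uuid4()}"
        self._by_request[request_id] = gsh
        self.instances[gsh] = creation_params
        return gsh


factory = Factory("http://example.org/handlemap/")
first = factory.create_service("req-1", db_type="relational")
retry = factory.create_service("req-1", db_type="relational")
other = factory.create_service("req-2")
```

Here `first` and `retry` are the same handle, while `other` names a distinct instance.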


Transient Database Services

[Figure: a DBaccessFactory service (GridService portType; service data: factory info, instance name, etc.), a Registry service (GridService portType; service data: registry info, instance name, etc.), and two DBaccess service instances (GridService and DBaccess portTypes; service data: DB info, name, lifetime, etc.)]

Client requests shown in the figure:
- “What services can you create?”
- “What database services exist?”
- “Create a database service”
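The question “What database services exist?” can be sketched as a lookup against a registry. The storage layout, the `find` method, and the example handles are assumptions; only RegisterService and UnregisterService are named in the talk.

```python
class Registry:
    def __init__(self):
        self._entries = {}  # GSH -> set of portType names the service supports

    def register_service(self, gsh, port_types):
        self._entries[gsh] = set(port_types)

    def unregister_service(self, gsh):
        self._entries.pop(gsh, None)

    def find(self, port_type):
        """Hypothetical query: handles of all services supporting a portType."""
        return sorted(g for g, p in self._entries.items() if port_type in p)


registry = Registry()
registry.register_service("http://example.org/db/1", ["GridService", "DBaccess"])
registry.register_service("http://example.org/db/2", ["GridService", "DBaccess"])
registry.register_service("http://example.org/dbfactory", ["GridService", "Factory"])

databases = registry.find("DBaccess")   # "What database services exist?"
factories = registry.find("Factory")    # where to send "Create a database service"
```

Because database instances are transient, a real registry would also need to drop entries for instances that have expired; that is left out of this sketch.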


Example: Data Mining for Bioinformatics

[Figure: User Application, Community Registry, Mining Factory, Database Factory, Compute Service Provider, Storage Service Provider, and Database Services for BioDB 1 … BioDB n]

User Application: “I want to create a personal database containing data on e.coli metabolism”


Example: Data Mining for Bioinformatics

[Figure: same service diagram (User Application, Community Registry, Mining Factory, Database Factory, Compute and Storage Service Providers, Database Services for BioDB 1 … BioDB n)]

Request shown: “Find me a data mining service, and somewhere to store data”


Example: Data Mining for Bioinformatics

[Figure: same service diagram]

Response shown: GSHs for Mining and Database factories


Example: Data Mining for Bioinformatics

[Figure: same service diagram]

Requests shown:
- “Create a data mining service with initial lifetime 10”
- “Create a database with initial lifetime 1000”


Example: Data Mining for Bioinformatics

[Figure: same service diagram, now also showing a Miner instance and a Database instance]

Requests shown:
- “Create a data mining service with initial lifetime 10”
- “Create a database with initial lifetime 1000”


Example: Data Mining for Bioinformatics

[Figure: same service diagram with the Miner and Database instances; messages shown: Query, Query]


Example: Data Mining for Bioinformatics

[Figure: same service diagram with the Miner and Database instances; messages shown: Query, Query, Keepalive, Keepalive]


Example: Data Mining for Bioinformatics

[Figure: same service diagram with the Miner and Database instances; messages shown: Keepalive, Keepalive, Results, Results]


Example: Data Mining for Bioinformatics

[Figure: same service diagram with the Miner and Database instances; message shown: Keepalive]


Example: Data Mining for Bioinformatics

[Figure: same service diagram with only the Database instance remaining; message shown: Keepalive]


Notification Interfaces

NotificationSource for client subscription
- One or more notification generators
> Generates notification message of a specific type
> Typed interest statements: e.g., filters, topics, …
> Supports messaging services, 3rd party filter services, …
- Soft state subscription to a generator

NotificationSink for asynchronous delivery of notification messages

A wide variety of uses are possible
- E.g., dynamic discovery/registry services, monitoring, application error notification, …
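A minimal sketch of a source with soft-state subscriptions and a sink that collects what it receives. The `publish` method, the `expires` parameter, the `CollectingSink` class, and the use of plain numbers for time are assumptions for illustration. Delivery is synchronous here, whereas the slide describes asynchronous delivery.

```python
class CollectingSink:
    """A sink that simply records every notification delivered to it."""

    def __init__(self):
        self.received = []

    def deliver_notification(self, topic, message):
        self.received.append((topic, message))


class NotificationSource:
    def __init__(self):
        self._subs = []  # list of (topic, sink, expiry time)

    def subscribe_to_notification_topic(self, topic, sink, expires):
        self._subs.append((topic, sink, expires))

    def unsubscribe_to_notification_topic(self, topic, sink):
        self._subs = [s for s in self._subs
                      if not (s[0] == topic and s[1] is sink)]

    def publish(self, topic, message, now):
        """Hypothetical helper: notify live subscribers of one topic."""
        # Soft state: lapsed subscriptions are dropped rather than notified.
        self._subs = [s for s in self._subs if s[2] > now]
        for sub_topic, sink, _ in self._subs:
            if sub_topic == topic:
                sink.deliver_notification(topic, message)


source = NotificationSource()
short_lived, long_lived = CollectingSink(), CollectingSink()
source.subscribe_to_notification_topic("membrane-proteins", short_lived, expires=10)
source.subscribe_to_notification_topic("membrane-proteins", long_lived, expires=100)
source.publish("membrane-proteins", "new data", now=5)
source.publish("membrane-proteins", "more data", now=50)
```

The short-lived subscriber receives only the first message because its subscription lapsed without renewal; a keepalive would correspond to subscribing again with a later expiry.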


Notification Example

Notifications can be associated with any (authorized) service data elements

[Figure: a DBaccess service (GridService, DBaccess, and NotificationSource interfaces; service data: DB info, name, lifetime, etc.) and a subscriber service (GridService and NotificationSink interfaces); labels: Subscribers]


Notification Example

Notifications can be associated with any (authorized) service data elements

[Figure: same NotificationSource/NotificationSink diagram; message shown: “Notify me of new data about membrane proteins”]


Notification Example

Notifications can be associated with any (authorized) service data elements

[Figure: same NotificationSource/NotificationSink diagram; message shown: Keepalive]


Notification Example

Notifications can be associated with any (authorized) service data elements

[Figure: same NotificationSource/NotificationSink diagram; message shown: New data]


Open Grid Services Architecture: Summary

Service orientation to virtualize resources
- Everything is a service

From Web services
- Standard interface definition mechanisms: multiple protocol bindings, local/remote transparency

From Grids
- Service semantics, reliability and security models
- Lifecycle management, discovery, other services

Multiple “hosting environments”
- C, J2EE, .NET, …


Recap: The Grid Service

[Figure: a Grid service with service data elements, the GridService interface plus other interfaces, an implementation, and a hosting environment/runtime (“C”, J2EE, .NET, …)]

Figure labels:
- Service data access, explicit destruction, soft-state lifetime
- Notification, authorization, service creation, service registry, manageability, concurrency
- Reliable invocation, authentication


OGSA and the Globus Toolkit

Technically, OGSA enables
- Refactoring of protocols (GRAM, MDS-2, etc.) while preserving all GT concepts/features!
- Integration with hosting environments: simplifying components, distribution, etc.
- Greatly expanded standard service set

Pragmatically, we are proceeding as follows
- Develop open source OGSA implementation
> Globus Toolkit 3.0; supports Globus Toolkit 2.0 APIs
- Partnerships for service development
- Also expect commercial value-adds


GT3: An Open Source OGSA-Compliant Globus Toolkit

GT3 Core
- Implements Grid service interfaces & behaviors
- Reference implementation of evolving standard
- Java first, C soon, C#?

GT3 Base Services
- Evolution of current Globus Toolkit capabilities
- Backward compatible

Many other Grid services

[Figure: diagram with GT3 Core, GT3 Base Services, GT3 Data Services, and Other Grid Services]


Hmm, Isn’t This Just Another Object Model?

Well, yes, in a sense
- Strong encapsulation
- We (can) profit greatly from experiences of previous object-based systems

But
- Focus on encapsulation not inheritance
- Does not require OO implementations
- Value lies in specific behaviors: lifetime, notification, authorization, …
- Document-centric not type-centric


Grids and OGSA: Research Challenges

Grids pose profound problems, e.g.
- Management of virtual organizations
- Delivery of multiple qualities of service
- Autonomic management of infrastructure
- Software and system evolution

OGSA provides foundation for tackling these problems in a rigorous fashion?
- Structured establishment/maintenance of global properties
- Reasoning about total system properties


Summary

The Grid problem: Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations

Globus Toolkit: a source of protocol and API definitions, and reference implementations
- And many projects applying Grid concepts (& Globus technologies) to important problems

Open Grid Services Architecture represents (we hope!) next step in evolution

An enabling framework for investigations of Internet-scale computing systems


For More Information

The Globus Project™
- www.globus.org

Grid architecture
- www.globus.org/research/papers/anatomy.pdf

Open Grid Services Architecture
- www.globus.org/ogsa