The Globus Toolkit™ and Its Application to GriPhyN
Carl Kesselman
Director of the Center for Grid Technologies
Information Sciences Institute
University of Southern California
April 18, 2023 EO Grid Workshop 2
Outline
Overview of the Globus Toolkit
Application of Globus to the virtual data problem (GriPhyN)
Open Grid Services Architecture
Partial Acknowledgements
Open Grid Services Architecture design
- Karl Czajkowski @ USC/ISI
- Ian Foster, Steve Tuecke @ ANL
- Jeff Nick, Steve Graham, Jeff Frey @ IBM
Grid services collaborators at ANL
- Kate Keahey, Gregor von Laszewski
- Thomas Sandholm, Jarek Gawor, John Bresnahan
Globus Toolkit R&D also involves many fine scientists & engineers at ANL, USC/ISI, and elsewhere (see www.globus.org)
Strong links with many EU, UK, US Grid projects
Support from DOE, NASA, NSF, Microsoft
The Grid Problem
Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
Grid Computing Concept
New applications enabled by the coordinated use of geographically distributed resources
- E.g., distributed collaboration, data access and analysis, distributed computing
Persistent infrastructure for Grid computing
- E.g., certificate authorities and policies, protocols for resource discovery/access
Original motivation, and support, from high-end science and engineering; but has wide-ranging applicability
Grids: Why Now?
Moore’s law ⇒ highly functional end-systems
Ubiquitous Internet ⇒ universal connectivity
Network exponentials produce dramatic changes in geometry and geography
- 9-month doubling: double Moore’s law!
- 1986-2001: x340,000; 2001-2010: x4000?
New modes of working and problem solving emphasize teamwork, computation
New business models and technologies facilitate outsourcing
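The growth figures on the "Grids: Why Now?" slide can be checked with a little arithmetic. The sketch below assumes plain exponential growth with a fixed doubling period. The 18-month Moore's law period is a commonly quoted figure and is an assumption here, since the slide only says "double Moore's law". The function names are illustrative.

```python
import math

def growth_factor(years: float, doubling_months: float) -> float:
    """Growth multiple after `years` if capacity doubles every `doubling_months`."""
    return 2.0 ** (years * 12.0 / doubling_months)

def implied_doubling_months(years: float, factor: float) -> float:
    """Doubling period (in months) that yields `factor` growth over `years`."""
    return years * 12.0 / math.log2(factor)

# 2001-2010 (9 years) with the slide's 9-month network doubling: 2**12.
network_2001_2010 = growth_factor(9, 9)
# Same span with an assumed 18-month Moore's law doubling: 2**6.
moore_2001_2010 = growth_factor(9, 18)
# Doubling period implied by the slide's x340,000 over 1986-2001 (15 years).
implied_1986_2001 = implied_doubling_months(15, 340_000)

print(network_2001_2010, moore_2001_2010, round(implied_1986_2001, 1))
```

Under these assumptions a 9-month doubling over nine years gives a factor of 4096, which is consistent with the slide's "x4000?" estimate.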
The Grid World: Current Status
Dozens of major Grid projects in scientific & technical computing/research & education
- Deployment, application, technology
Considerable consensus on key concepts and technologies
- Open source Globus Toolkit™ a de facto standard for major protocols & services
- Far from complete or perfect, but out there, evolving rapidly, and large tool/user base
Global Grid Forum a significant force
Industrial interest emerging rapidly
Layered Grid Architecture (By Analogy to Internet Architecture)
Grid layers, top to bottom:
- Application
- Collective: “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
- Resource: “Sharing single resources”: negotiating access, controlling use
- Connectivity: “Talking to things”: communication (Internet protocols) & security
- Fabric: “Controlling things locally”: access to, & control of, resources
Internet Protocol Architecture layers, top to bottom: Application, Transport, Internet, Link
Globus Toolkit
Globus Toolkit is the source of many of the protocols described in “Grid architecture”
Adopted by almost all major Grid projects worldwide as a source of infrastructure
Open source, open architecture framework encourages community development
Active R&D program continues to move technology forward
Developers at ANL, USC/ISI, NCSA, LBNL, and other institutions
www.globus.org
Globus Toolkit Components Include …
Core protocols and services
– Grid Security Infrastructure
– Grid Resource Allocation & Management
– MDS information & monitoring
– GridFTP data access & transfer
Other services
– Community Authorization Service
– DUROC co-allocation service
Other Data Grid technologies
– Replica catalog, replica management service
Globus Toolkit Structure
[Diagram labels: GRAM and MDS over GSI, in front of a Compute Resource (job managers); GridFTP and MDS over GSI, in front of a Data Resource; ??? over GSI, in front of an Other Service or Application. Annotations: service naming, reliable invocation, soft state management, notification.]
The Globus Toolkit in One Slide
Grid protocols (GSI, GRAM, …) enable resource sharing within virtual orgs; toolkit provides reference implementation (= Globus Toolkit services)
Protocols (and APIs) enable other tools and services for membership, discovery, data mgmt, workflow, …
[Diagram labels:]
- GSI (Grid Security Infrastructure): User authenticates & creates proxy credential; user process #1 holds the Proxy
- GRAM (Grid Resource Allocation & Management): reliable remote invocation of the Gatekeeper (factory); create process (user process #2 with Proxy #2); register with the Reporter (registry + discovery)
- MDS-2 (Meta Directory Service): soft state registration; enquiry; GIIS: Grid Information Index Server (discovery)
- Other GSI-authenticated remote service requests; other service (e.g. GridFTP)
GriPhyN Project Goals
Amplify science productivity through the Grid
- Provide powerful abstractions for scientists: datasets and transformations, not files and programs
- Using a grid is harder than using a workstation. GriPhyN seeks to reverse this situation!
These goals challenge the boundaries of computer science in knowledge representation and distributed computing.
Apply these advances to major experiments
- Not just developing solutions, but proving them through deployment
GriPhyN Approach
Virtual Data
- Tracking the derivation of experiment data with high fidelity
- Transparency with respect to location and materialization
Automated grid request planning
- Advanced, policy driven scheduling
Achieve this at peta-scale magnitude
We present here a vision that is still 3 years away, but the foundation is starting to come together
Virtual Data
Track all data assets
Accurately record how they were derived
Encapsulate the transformations that produce new data objects
Interact with the grid in terms of requests for data derivations
Request Automation
Request Planning and Execution
High performance
- Grid resources are used in efficient ways for high throughput and/or fast response
Based on policy
- Policy specifies how resources should be used and how workloads should be treated
Fault tolerant
- It’s a grid – so failures are normal
Transparent to the user
- Make the grid like a workstation
GriPhyN Challenge Problem: CMS Event Reconstruction
[Diagram sites: Caltech workstation (master Condor job running at Caltech); secondary Condor job on WI pool; NCSA Linux cluster; NCSA UniTree - GridFTP-enabled FTP server]
2) Launch secondary job on WI pool; input files via Globus GASS
3) 100 Monte Carlo jobs on Wisconsin Condor pool
4) 100 data files transferred via GridFTP, ~ 1 GB each
5) Secondary reports complete to master
6) Master starts reconstruction jobs via Globus jobmanager on cluster
7) GridFTP fetches data from UniTree
8) Processed Objectivity database stored to UniTree
9) Reconstruction job reports complete to master
Work of: Scott Koranda, Miron Livny, Vladimir Litvin, & others
Why is this useful?
Easier to FIND the data
- A disciplined approach to tracking massive amounts of data
Can PRODUCE and analyze data more easily
- Automate details of data production
Can VALIDATE scientific results accurately
Can SHARE data more easily
Can produce and analyze MORE data FASTER
- Leverage huge storage and computing resources
Why is this hard?
Data derivation tracking
- Diversity of transformations
- Achieving fidelity of reproduction
- Many modes of data storage
Automated request planning
- Multiple levels of resource sharing and allocation policy
- Faults are the norm in large grids
- Resources are constantly in flux
- An OS the size of the planet!
Peta-Scale performance level
The Virtual Data Model
Data suppliers publish data to the Grid
Users request raw or derived data from Grid, without needing to know
- Where data is located
- Whether data is stored or computed on demand
Users and applications can easily determine
- What it will cost to obtain data
- Quality of derived data
Virtual Data Grid serves requests efficiently, subject to global and local policy constraints
GriPhyN: Virtual Data
Tracking Complex Dependencies
Dependency graph is:
- Files: 8 < (1,3,4,5,7), 7 < 6, (3,4,5,6) < 2
- Programs: 8 < psearch, 7 < summarize, (3,4,5) < reformat, 6 < conv, (1,2) < simulate
[Diagram: simulate -t 10 … produces file1 and file2; reformat -f fz … produces file3, file4, file5; conv -I esd -o aod produces file6; summarize -t 10 … produces file7; psearch -t 10 … produces file8, the requested file]
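The dependency graph on the "Tracking Complex Dependencies" slide is enough to work out which programs must run, and in what order, to re-create a requested file. The following is a minimal sketch using Python's standard `graphlib` module (Python 3.9+). The dictionary layout and the function name `programs_to_run` are illustrative, not part of GriPhyN. It also assumes that every output of a program shares the same inputs, which holds for the graph on the slide.

```python
from graphlib import TopologicalSorter

# Derived file -> (program that produces it, files that program reads).
# Taken from the slide's dependency graph; simulate has no file inputs.
DERIVATIONS = {
    "file1": ("simulate", []),
    "file2": ("simulate", []),
    "file3": ("reformat", ["file2"]),
    "file4": ("reformat", ["file2"]),
    "file5": ("reformat", ["file2"]),
    "file6": ("conv", ["file2"]),
    "file7": ("summarize", ["file6"]),
    "file8": ("psearch", ["file1", "file3", "file4", "file5", "file7"]),
}

def programs_to_run(target: str) -> list:
    """Return the programs needed to re-create `target`, in a valid run order."""
    # Walk backwards from the target to collect every file it depends on.
    needed = {}
    stack = [target]
    while stack:
        name = stack.pop()
        if name not in needed:
            needed[name] = DERIVATIONS[name][1]
            stack.extend(needed[name])
    # Order files so that inputs always precede the files derived from them.
    ordered_files = TopologicalSorter(needed).static_order()
    # Each program runs once, at the point its first output is needed.
    steps = []
    for name in ordered_files:
        program = DERIVATIONS[name][0]
        if program not in steps:
            steps.append(program)
    return steps

print(programs_to_run("file7"))  # ['simulate', 'conv', 'summarize']
```

Calling `programs_to_run("file8")` yields all five programs, starting with `simulate` and ending with `psearch`; the relative order of `reformat` and `conv` is not fixed because neither depends on the other.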
Re-creating Virtual Data
To re-create file 8:
Step 1
- simulate > file1, file2
Step 2
- files 3, 4, 5, 6 derived from file 2
- reformat > file3, file4, file5
- conv > file 6
Step 3
- File 7 depends on file 6
- summarize > file 7
Final step
- File 8 depends on files 1, 3, 4, 5, 7
- psearch < file1, file3, file4, file5, file 7 > file 8
GriPhyN/PPDG Data Grid Architecture
[Diagram: Application, DAG (abstract), Planner, DAG (concrete), Executor (DAGMAN, Kangaroo), Compute Resource (GRAM), Storage Resource (GridFTP; GRAM; SRM)]
Supporting services:
- Catalog Services (MCAT; GriPhyN catalogs)
- Info Services (MDS)
- Policy/Security (GSI, CAS)
- Monitoring (MDS)
- Repl. Mgmt. (GDMP)
- Reliable Transfer Service
Legend: Globus
(evolving) View of Data Grid Stack
Data Transport (GridFTP)
Storage Element
Local Repl Catalog (Flat or Hierarchical)
Reliable File Transfer
Replica Location Service
Publish-Subscribe Service (GDMP)
Storage Element Manager
Reliable Replication
Initial GriPhyN Virtual Data Implementation
Architecture of the System:
- Job Submission Sites (ANL, SC, …): Virtual Data Language, VDL Interpreter (VDLI), Virtual Data Catalog (PostgreSQL), Local File Storage, Condor-G Agent, Globus Client, GridFTP Server
- Grid testbed Job Execution Sites (U of Chicago, U of Wisconsin, U of Florida): each with GridFTP Client, Globus GRAM, Condor Pool
- Connections labeled GSI
Production DAG of Simulated CMS Data:
- Simulate Physics
- Simulate CMS Detector Response
- Copy flat-file to OODBMS
- Simulate Digitization of Electronic Signals
Virtual Data Catalog
Conceptual Data Structure
TRANSFORMATION
  /bin/physapp1
  version 1.2.3b(2)
  created on 12 Oct 1998
  owned by physbld.orca
DERIVATION
  ^ paramlist
  ^ transformation
FILE
  LFN=filename1
  PFN1=/store1/1234987
  PFN2=/store9/2437218
  PFN3=/store4/8373636
  ^derivation
FILE
  LFN=filename2
  PFN1=/store1/1234987
  PFN2=/store9/2437218
  ^derivation
PARAMETER LIST
  PARAMETER i filename1
  PARAMETER O filename2
  PARAMETER E PTYPE=muon
  PARAMETER p -g
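One way to read the conceptual data structure on the Virtual Data Catalog slide is as a set of linked records: a file points to the derivation that produced it, and a derivation points to a transformation plus a parameter list. The sketch below models that with Python dataclasses. The class and field names are illustrative and are not the catalog's actual schema, and the parameter kinds (`i`, `O`, `E`, `p`) are copied from the slide without interpreting them.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Transformation:
    executable: str
    version: str
    created: str
    owner: str

@dataclass
class Derivation:
    transformation: Transformation
    # (kind, value) pairs copied from the slide's PARAMETER LIST.
    parameters: List[Tuple[str, str]]

@dataclass
class CatalogFile:
    lfn: str                                        # logical file name
    pfns: List[str] = field(default_factory=list)   # physical replicas
    derivation: Optional[Derivation] = None         # how the file was made

    def is_materialized(self) -> bool:
        """True when at least one physical copy is recorded."""
        return bool(self.pfns)

physapp = Transformation("/bin/physapp1", "1.2.3b(2)", "12 Oct 1998", "physbld.orca")
deriv = Derivation(physapp, [("i", "filename1"), ("O", "filename2"),
                             ("E", "PTYPE=muon"), ("p", "-g")])
file2 = CatalogFile("filename2", ["/store1/1234987", "/store9/2437218"], deriv)
# Hypothetical case: the same logical file with no replicas recorded.
virtual_file2 = CatalogFile("filename2", derivation=deriv)
```

A file with a derivation but no physical replicas is the "virtual" case: it can still be served by re-running the recorded transformation.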
Planner Decision Making
Planner considers:
- Policy (fairly static, from CAS/SAS)
- Grid resource status: state, load
- Job (user/group) resource consumption history
- Job profiles (resources over time) from Prophesy
[Diagram labels: planner; policy; status; job usage info; accounting records; job profile records; Prophesy (predictor); job profiling data]
Executor Example: Condor DAGMan
Directed Acyclic Graph Manager
Specify the dependencies between Condor jobs using DAG data structure
Manage dependencies automatically
- (e.g., “Don’t run job “B” until job “A” has completed successfully.”)
Each job is a “node” in DAG
Any number of parent or children nodes
No loops
[Diagram: Job A at the top; Job B and Job C below it; Job D at the bottom]
Slide courtesy Miron Livny, U. Wisconsin
Executor Example: Condor DAGMan (Cont.)
DAGMan acts as a “meta-scheduler”
- holds & submits jobs to the Condor queue at the appropriate times based on DAG dependencies
If a job fails, DAGMan continues until it can no longer make progress and then creates a “rescue” file with the current state of the DAG
- When failed job is ready to be re-run, the rescue file is used to restore the prior state of the DAG
[Diagram: DAGMan feeding jobs A, B, C, D into the Condor Job Queue]
Slide courtesy Miron Livny, U. Wisconsin
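The two DAGMan slides describe a simple control loop: run a job only when all of its parents have succeeded, keep going after a failure until nothing else can run, and record enough state to resume later. The following is a minimal sketch of that behavior, not Condor's implementation. The function `run_dag` is hypothetical, and the "rescue" state here is just the set of completed jobs rather than DAGMan's actual rescue file format.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

def run_dag(parents: Dict[str, List[str]],
            run_job: Callable[[str], bool],
            already_done: Iterable[str] = ()) -> Tuple[Set[str], Set[str]]:
    """Run every job whose parents have all succeeded; return (done, failed)."""
    done = set(already_done)   # restored from a previous run, if any
    failed: Set[str] = set()
    progress = True
    while progress:
        progress = False
        for job, deps in parents.items():
            if job in done or job in failed:
                continue
            if all(dep in done for dep in deps):
                if run_job(job):
                    done.add(job)
                else:
                    failed.add(job)
                progress = True
    return done, failed

# The four-job DAG from the slide: A first, then B and C, then D.
DIAMOND = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}

# First attempt: job B fails, so D can never start.
done, failed = run_dag(DIAMOND, lambda job: job != "B")
# Re-run from the saved state: A and C are skipped, B and D now run.
rescued, still_failed = run_dag(DIAMOND, lambda job: True, already_done=done)
```

After the first call `done` holds A and C and `failed` holds B; passing `done` back in plays the role of the rescue file.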
DAG Usage
Abstract DAG
- Represents user requests
- Simplest case: request for one or more data products
- Complex case: request execution of a chained set of applications
- No file or execution locations need be present
Concrete DAG
- Specifies any application invocations needed to derive data
- Specifies locations of all invocations (to the site level)
- Includes explicit job steps to move data
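The difference between the abstract and concrete DAGs on the "DAG Usage" slide can be illustrated with a small planner. The sketch below takes location-free jobs, assigns each to a site, and inserts explicit transfer steps wherever an input lives elsewhere. Everything here is hypothetical: the function `concretize`, the job dictionaries, and the site names are illustrative, the jobs are assumed to arrive already in dependency order, and each file is tracked at a single location for simplicity.

```python
from typing import Callable, Dict, List, Tuple

def concretize(abstract_jobs: List[dict],
               replicas: Dict[str, str],
               choose_site: Callable[[dict], str]) -> List[Tuple]:
    """Turn location-free jobs into site-level run and transfer steps."""
    location = dict(replicas)          # file name -> site currently holding it
    concrete = []
    for job in abstract_jobs:
        site = choose_site(job)
        # Add an explicit data movement step for each input held elsewhere.
        for name in job["inputs"]:
            if location[name] != site:
                concrete.append(("transfer", name, location[name], site))
                location[name] = site
        concrete.append(("run", job["name"], site))
        # Outputs are produced at the site where the job ran.
        for name in job["outputs"]:
            location[name] = site
    return concrete

abstract = [
    {"name": "conv", "inputs": ["file2"], "outputs": ["file6"]},
    {"name": "summarize", "inputs": ["file6"], "outputs": ["file7"]},
]
sites = {"conv": "siteA", "summarize": "siteB"}
plan = concretize(abstract, {"file2": "siteB"}, lambda job: sites[job["name"]])
for step in plan:
    print(step)
```

With this input the plan contains two transfers (file2 to siteA, then file6 to siteB) interleaved with the two run steps. A real planner would choose sites from policy and resource status rather than a fixed table.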
CMS Pipeline in VDL
Pipeline stages: pythia_input, pythia.exe, cmsim_input, cmsim.exe, writeHits, writeDigis

begin v /usr/local/demo/scripts/cmkin_input.csh
  file i ntpl_file_path
  file i template_file
  file i num_events
  stdout cmkin_param_file
end

begin v /usr/local/demo/binaries/kine_make_ntpl_pyt_cms121.exe
  pre cms_env_var
  stdin cmkin_param_file
  stdout cmkin_log
  file o ntpl_file
end

begin v /usr/local/demo/scripts/cmsim_input.csh
  file i ntpl_file
  file i fz_file_path
  file i hbook_file_path
  file i num_trigs
  stdout cmsim_param_file
end

begin v /usr/local/demo/binaries/cms121.exe
  condor copy_to_spool=false
  condor getenv=true
  stdin cmsim_param_file
  stdout cmsim_log
  file o fz_file
  file o hbook_file
end

begin v /usr/local/demo/binaries/writeHits.sh
  condor getenv=true
  pre orca_hits
  file i fz_file
  file i detinput
  file i condor_writeHits_log
  file i oo_fd_boot
  file i datasetname
  stdout writeHits_log
  file o hits_db
end

begin v /usr/local/demo/binaries/writeDigis.sh
  pre orca_digis
  file i hits_db
  file i oo_fd_boot
  file i carf_input_dataset_name
  file i carf_output_dataset_name
  file i carf_input_owner
  file i carf_output_owner
  file i condor_writeDigis_log
  stdout writeDigis_log
  file o digis_db
end
GriPhyN CMS SC2001 Demo
Bandwidth Greedy Grid-enabled Object Collection Analysis for Particle Physics
[Diagram labels: “Tag” database of ~140,000 small objects; Request (x2); Parallel tuned GSI FTP (x2); Full Event Database of ~100,000 large objects; Full Event Database of ~40,000 large objects]
http://pcbunn.cacr.caltech.edu/Tier2/Tier2_Overall_JJB.htm
Work of: Koen Holtman, J.J. Bunn, H. Newman, & others
SDSS Galaxy Cluster Finding
Cluster-finding Data Pipeline
[Diagram labels: data products tsObj, field, brg, core, cluster, catalog; numbered stages 1 to 5; the tsObj/field/brg group appears four times and core appears twice]
Cluster-finding Grid
Work of: Yong Zhao, James Annis, & others
GriPhyN-LIGO SC2001 Demo
Desired Result: Single channel time series
[Diagram labels: HTTP frontend; Cgi interface; xml; MyProxy server; Planner; Monitoring; Replica Catalog; Replica Selection; Transformation Catalog; Executor (CondorG/DAGMan); G-DAG (DAGMan); GridCVS; Logs; SC floor: Compute Resource with GridFTP and GRAM; LDAS at UWM with GridFTP and GRAM/LDAS; LDAS at Caltech with GridFTP and GRAM/LDAS; UWM GridFTP (x2); Frame]
Legend: In integration; Prototype exclusive; In design; Globus component
Work of: Ewa Deelman, Gaurang Mehta, Scott Koranda, & others
Globus Toolkit: Evaluation (+)
Good technical solutions for key problems, e.g.
- Authentication and authorization
- Resource discovery and monitoring
- Reliable remote service invocation
- High-performance remote data access
This & good engineering is enabling progress
- Good quality reference implementation, multi-language support, interfaces to many systems, large user base, industrial support
- Growing community code base built on tools
Globus Toolkit: Evaluation (-)
Protocol deficiencies, e.g.
- Heterogeneous basis: HTTP, LDAP, FTP
- No standard means of invocation, notification, error propagation, authorization, termination, …
Significant missing functionality, e.g.
- Databases, sensors, instruments, workflow, …
- Virtualization of end systems (hosting envs.)
Little work on total system properties, e.g.
- Dependability, end-to-end QoS, …
- Reasoning about system properties
Globus Toolkit Structure
Lots of good mechanisms, but (with the exception of GSI) not that easily incorporated into other systems
Open Grid Services Architecture
Service orientation to virtualize resources
Define fundamental Grid service behaviors
- Core set required, others optional
A unifying framework for interoperability & establishment of total system properties
Integration with Web services and hosting environment technologies
- Leverage tremendous commercial base
- Standard IDL accelerates community code
Delivery via open source Globus Toolkit 3.0
- Leverage GT experience, code, mindshare
“Web Services”
Increasingly popular standards-based framework for accessing network applications
- W3C standardization; Microsoft, IBM, Sun, others
WSDL: Web Services Description Language
- Interface Definition Language for Web services
SOAP: Simple Object Access Protocol
- XML-based RPC protocol; common WSDL target
WS-Inspection
- Conventions for locating service descriptions
UDDI: Universal Desc., Discovery, & Integration
- Directory for Web services
Web Services Example: Database Service
WSDL definition for “DBaccess” porttype defines operations and bindings, e.g.:
- Query(QueryLanguage, Query, Result)
- SOAP protocol
Client C, Java, Python, etc., APIs can then be generated
[Diagram: DBaccess]
Transient Service Instances
“Web services” address discovery & invocation of persistent services
- Interface to persistent state of entire enterprise
In Grids, must also support transient service instances, created/destroyed dynamically
- Interfaces to the states of distributed activities
- E.g. workflow, video conf., dist. data analysis
Significant implications for how services are managed, named, discovered, and used
- In fact, much of our work is concerned with the management of service instances
The Grid Service = Interfaces + Service Data
[Diagram: an Implementation exposing the GridService interface and other interfaces, with several service data elements attached, inside a hosting environment/runtime (“C”, J2EE, .NET, …)]
- GridService: service data access, explicit destruction, soft-state lifetime
- … other interfaces …: notification, authorization, service creation, service registry, manageability, concurrency
- Reliable invocation, authentication
Open Grid Services Architecture: Fundamental Structure
1) WSDL conventions and extensions for describing and structuring services
- Useful independent of “Grid” computing
2) Standard WSDL interfaces & behaviors for core service activities
- portTypes and operations => protocols
WSDL Conventions & Extensions
portType (standard WSDL)
- Define an interface: a set of related operations
serviceType (extensibility element)
- List of port types: enables aggregation
serviceImplementation (extensibility element)
- Represents actual code
service (standard WSDL)
- instanceOf extension: map descr.->instance
compatibilityAssertion (extensibility element)
- portType, serviceType, serviceImplementation
Structure of a Grid Service
[Diagram, Service Description: PortTypes (standard WSDL), serviceTypes, and serviceImplementations, linked by compatibilityAssertions (cA)]
[Diagram, Service Instantiation: service elements (standard WSDL), each linked to the description by instanceOf]
Standard Interfaces & Behaviors: Four Interrelated Concepts
Naming and bindings
- Every service instance has a unique name, from which supported bindings can be discovered
Information model
- Service data associated with Grid service instances, operations for accessing this info
Lifecycle
- Service instances created by factories
- Destroyed explicitly or via soft state
Notification
- Interfaces for registering interest and delivering notifications
OGSA Interfaces and Operations Defined to Date
GridService (Required)
- FindServiceData
- Destroy
- SetTerminationTime
NotificationSource
- SubscribeToNotificationTopic
- UnsubscribeToNotificationTopic
NotificationSink
- DeliverNotification
Factory
- CreateService
PrimaryKey
- FindByPrimaryKey
- DestroyByPrimaryKey
Registry
- RegisterService
- UnregisterService
HandleMap
- FindByHandle
Authentication, reliability are binding properties
Manageability, concurrency, etc., to be defined
Service Data
A Grid service instance maintains a set of service data elements
- XML fragments encapsulated in standard <name, type, TTL-info> containers
- Includes basic introspection information, interface-specific data, and application data
FindServiceData operation (GridService interface) queries this information
- Extensible query language support
See also notification interfaces
- Allows notification of service existence and changes in service data
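The service data model on the "Service Data" slide can be sketched as a small in-memory store: each element is a <name, type, TTL-info> container around an XML fragment, and a query returns the elements that match and have not expired. The code below is an illustration only. The class names, the `publish` helper, the element names, and the use of prefix matching in place of the extensible query language are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class ServiceDataElement:
    name: str
    type_name: str
    expires_at: float   # TTL-info, stored as an absolute expiry time
    xml: str            # the encapsulated XML fragment

class GridServiceInstance:
    def __init__(self) -> None:
        self._elements: Dict[str, ServiceDataElement] = {}

    def publish(self, name: str, type_name: str, xml: str,
                ttl: float, now: float) -> None:
        """Add or refresh a service data element valid for `ttl` time units."""
        self._elements[name] = ServiceDataElement(name, type_name, now + ttl, xml)

    def find_service_data(self, name_prefix: str, now: float) -> List[ServiceDataElement]:
        """Return unexpired elements whose name starts with `name_prefix`."""
        return [e for e in self._elements.values()
                if e.name.startswith(name_prefix) and e.expires_at > now]

db = GridServiceInstance()
db.publish("gs:lifetime", "xsd:dateTime", "<lifetime>1000</lifetime>", ttl=60, now=0)
db.publish("db:currentLoad", "xsd:float", "<load>0.4</load>", ttl=5, now=0)
fresh = db.find_service_data("db:", now=1)
stale = db.find_service_data("db:", now=10)
```

Passing `now` explicitly keeps the example deterministic; a deployed service would read the clock itself.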
Grid Service Example: Database Service
A DBaccess Grid service will support at least two portTypes
- GridService
- DBaccess
Each has service data
- GridService: basic introspection information, lifetime, …
- DBaccess: database type, query languages supported, current load, …
[Diagram: GridService (name, lifetime, etc.) and DBaccess (DB info)]
Lifetime Management
GS instances created by factory or manually; destroyed explicitly or via soft state
- Negotiation of initial lifetime with a factory (=service supporting Factory interface)
GridService interface supports
- Destroy operation for explicit destruction
- SetTerminationTime operation for keepalive
Soft state lifetime management avoids
- Explicit client teardown of complex state
- Resource “leaks” in hosting environments
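Soft state lifetime management, as described on the "Lifetime Management" slide, means an instance disappears on its own unless a client keeps extending its termination time. The following is a minimal sketch under stated assumptions: the classes `ServiceInstance` and `HostingEnvironment` and the `sweep` method are hypothetical, and time is a plain number with unspecified units. The initial lifetimes 10 and 1000 are borrowed from the data mining example later in the talk.

```python
from typing import Dict, List

class ServiceInstance:
    def __init__(self, name: str, termination_time: float) -> None:
        self.name = name
        self.termination_time = termination_time

    def set_termination_time(self, new_time: float) -> None:
        """Keepalive: the client pushes the termination time forward."""
        self.termination_time = new_time

    def destroy(self) -> None:
        """Explicit destruction: expire immediately."""
        self.termination_time = float("-inf")

class HostingEnvironment:
    def __init__(self) -> None:
        self.instances: Dict[str, ServiceInstance] = {}

    def create(self, name: str, now: float, initial_lifetime: float) -> ServiceInstance:
        instance = ServiceInstance(name, now + initial_lifetime)
        self.instances[name] = instance
        return instance

    def sweep(self, now: float) -> List[str]:
        """Reclaim every instance whose termination time has passed."""
        expired = [n for n, i in self.instances.items() if i.termination_time <= now]
        for name in expired:
            del self.instances[name]
        return expired

env = HostingEnvironment()
miner = env.create("miner", now=0, initial_lifetime=10)
database = env.create("database", now=0, initial_lifetime=1000)
miner.set_termination_time(25)      # keepalive sent before time 10
first_sweep = env.sweep(now=20)     # nothing has expired yet
second_sweep = env.sweep(now=30)    # miner lapses without further keepalives
database.destroy()
third_sweep = env.sweep(now=31)     # explicit destruction takes effect
```

No client ever has to tear the miner down: once keepalives stop, the hosting environment reclaims it on the next sweep.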
Factory
Factory interface’s CreateService operation creates a new Grid service instance
- Reliable creation (once-and-only-once)
CreateService operation can be extended to accept service-specific creation parameters
Returns a Grid Service Handle (GSH)
- A globally unique URL
- Uniquely identifies the instance for all time
- Based on name of a home handleMap service
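The relationship between a factory, the handles it returns, and the home handleMap service can be sketched as follows. This is an illustration under several assumptions: the class names, the URL layout, and the host `handlemap.example.org` are hypothetical, once-and-only-once delivery is not modeled, and `find_by_handle` returns the in-memory record directly rather than any reference format the real interface may define.

```python
import itertools
from typing import Dict

class HandleMap:
    """Home handleMap service: issues handles and resolves them later."""
    def __init__(self, base_url: str) -> None:
        self.base_url = base_url.rstrip("/")
        self._instances: Dict[str, dict] = {}
        self._counter = itertools.count(1)   # never reused, so handles stay unique

    def new_handle(self, instance: dict) -> str:
        gsh = f"{self.base_url}/instances/{next(self._counter)}"
        self._instances[gsh] = instance
        return gsh

    def find_by_handle(self, gsh: str) -> dict:
        return self._instances[gsh]

class DatabaseFactory:
    def __init__(self, handle_map: HandleMap) -> None:
        self.handle_map = handle_map

    def create_service(self, initial_lifetime: float, **creation_params) -> str:
        """Create an instance record and return its Grid Service Handle."""
        instance = {"lifetime": initial_lifetime, "params": creation_params}
        return self.handle_map.new_handle(instance)

handle_map = HandleMap("http://handlemap.example.org")
factory = DatabaseFactory(handle_map)
gsh_a = factory.create_service(1000, topic="e.coli metabolism")
gsh_b = factory.create_service(10)
```

Because each handle is built from the handleMap's own URL, any holder of a GSH knows which service to ask when it needs to resolve it.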
Transient Database Services
[Diagram: a Registry service (GridService and Registry interfaces; registry info; instance name, etc.), a DBaccessFactory service (GridService and Factory interfaces; factory info; instance name, etc.), and two DBaccess service instances (GridService and DBaccess interfaces; DB info; name, lifetime, etc.)]
Client requests shown:
- “What services can you create?”
- “What database services exist?”
- “Create a database service”
Example: Data Mining for Bioinformatics
[Diagram: User Application; Community Registry; Compute Service Provider with a Mining Factory; Storage Service Provider with a Database Factory; Database Services BioDB 1 … BioDB n]
1) “I want to create a personal database containing data on e.coli metabolism”
2) “Find me a data mining service, and somewhere to store data”
3) GSHs for Mining and Database factories
4) “Create a data mining service with initial lifetime 10”; “Create a database with initial lifetime 1000”
5) The Miner and Database instances appear
6) Query; Query
7) Query; Query; Keepalive; Keepalive
8) Keepalive; Keepalive; Results; Results
9) Miner and Database both present; one Keepalive remains
10) Miner is gone; Database remains, with Keepalive
Notification Interfaces
NotificationSource for client subscription
- One or more notification generators
> Generates notification message of a specific type
> Typed interest statements: E.g., Filters, topics, …
> Supports messaging services, 3rd party filter services, …
- Soft state subscription to a generator
NotificationSink for asynchronous delivery of notification messages
A wide variety of uses are possible
- E.g. Dynamic discovery/registry services, monitoring, application error notification, …
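The source and sink roles on the "Notification Interfaces" slide, together with soft state subscriptions, can be sketched in a few lines. This is an illustrative model only: the Python method names loosely mirror the operation names listed earlier in the talk, topic strings stand in for typed interest statements, and subscriptions expire at an absolute time unless renewed.

```python
from typing import Dict, List, Tuple

class NotificationSink:
    """Receives notification messages asynchronously (here: just records them)."""
    def __init__(self) -> None:
        self.received: List[Tuple[str, str]] = []

    def deliver_notification(self, topic: str, message: str) -> None:
        self.received.append((topic, message))

class NotificationSource:
    def __init__(self) -> None:
        # topic -> list of (sink, expiry time) pairs
        self._subs: Dict[str, List[Tuple[NotificationSink, float]]] = {}

    def subscribe(self, topic: str, sink: NotificationSink, expires_at: float) -> None:
        """Soft state subscription: lapses at `expires_at` unless renewed."""
        self._subs.setdefault(topic, []).append((sink, expires_at))

    def notify(self, topic: str, message: str, now: float) -> int:
        """Deliver to every live subscriber; return how many were notified."""
        live = [(s, t) for s, t in self._subs.get(topic, []) if t > now]
        self._subs[topic] = live   # drop expired subscriptions
        for sink, _ in live:
            sink.deliver_notification(topic, message)
        return len(live)

source = NotificationSource()
sink = NotificationSink()
source.subscribe("membrane proteins", sink, expires_at=10)
delivered_early = source.notify("membrane proteins", "New data", now=5)
delivered_late = source.notify("membrane proteins", "More new data", now=15)
```

The second notification reaches nobody because the subscription was never renewed, which is the same keepalive pattern used for service lifetimes.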
Notification Example
Notifications can be associated with any (authorized) service data elements
[Diagram: a DBaccess service (GridService, DBaccess, and NotificationSource interfaces; DB info; name, lifetime, etc.; subscribers) and a second service (GridService and NotificationSink interfaces; DB info; name, lifetime, etc.)]
1) “Notify me of new data about membrane proteins”
2) Keepalive
3) New data
Open Grid Services Architecture: Summary
Service orientation to virtualize resources
- Everything is a service
From Web services
- Standard interface definition mechanisms: multiple protocol bindings, local/remote transparency
From Grids
- Service semantics, reliability and security models
- Lifecycle management, discovery, other services
Multiple “hosting environments”
- C, J2EE, .NET, …
Recap: The Grid Service
[Diagram: an Implementation exposing the GridService interface and other interfaces, with several service data elements attached, inside a hosting environment/runtime (“C”, J2EE, .NET, …)]
- GridService: service data access, explicit destruction, soft-state lifetime
- … other interfaces …: notification, authorization, service creation, service registry, manageability, concurrency
- Reliable invocation, authentication
OGSA and the Globus Toolkit
Technically, OGSA enables
- Refactoring of protocols (GRAM, MDS-2, etc.), while preserving all GT concepts/features!
- Integration with hosting environments: simplifying components, distribution, etc.
- Greatly expanded standard service set
Pragmatically, we are proceeding as follows
- Develop open source OGSA implementation
> Globus Toolkit 3.0; supports Globus Toolkit 2.0 APIs
- Partnerships for service development
- Also expect commercial value-adds
GT3: An Open Source OGSA-Compliant Globus Toolkit
GT3 Core
- Implements Grid service interfaces & behaviors
- Reference impln of evolving standard
- Java first, C soon, C#?
GT3 Base Services
- Evolution of current Globus Toolkit capabilities
- Backward compatible
Many other Grid services
[Diagram labels: GT3 Core; GT3 Base Services; GT3 Data Services; Other Grid Services]
Hmm, Isn’t This Just Another Object Model?
Well, yes, in a sense
- Strong encapsulation
- We (can) profit greatly from experiences of previous object-based systems
But
- Focus on encapsulation not inheritance
- Does not require OO implementations
- Value lies in specific behaviors: lifetime, notification, authorization, …
- Document-centric not type-centric
Grids and OGSA: Research Challenges
Grids pose profound problems, e.g.
- Management of virtual organizations
- Delivery of multiple qualities of service
- Autonomic management of infrastructure
- Software and system evolution
OGSA provides foundation for tackling these problems in a rigorous fashion?
- Structured establishment/maintenance of global properties
- Reasoning about total system properties
Summary
The Grid problem: Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
Globus Toolkit a source of protocol and API definitions, and reference implementations
- And many projects applying Grid concepts (& Globus technologies) to important problems
Open Grid Services Architecture represents (we hope!) next step in evolution
An enabling framework for investigations of Internet-scale computing systems
For More Information
The Globus Project™
- www.globus.org
Grid architecture
- www.globus.org/research/papers/anatomy.pdf
Open Grid Services Architecture
- www.globus.org/ogsa