Post on 22-Dec-2015

High-Performance Grid Computing and Research Networking
Presented by Selim Kalayci
Instructor: S. Masoud Sadjadi
http://www.cs.fiu.edu/~sadjadi/Teaching/
sadjadi At cs Dot fiu Dot edu
Grid Computing

Acknowledgements
The content of many of the slides in these lecture notes has been adopted from online resources prepared previously by the people listed below. Many thanks!
- Henri Casanova: Principles of High Performance Computing, http://navet.ics.hawaii.edu/~casanova, [email protected]
- Ian Foster: Presentations & Tutorials from www.globus.org

Agenda
- Grid Computing
- Grid Middleware - Globus
- Security in Globus
- Data Management
- Execution Management
- Monitoring
- Metaschedulers - Gridway

Multiple Computers
- Adding CPUs to a single computer becomes very expensive
- How about multiple computers together?
  - Linux clusters (60% of the Top-500 list)
  - Blue Gene: 30K computers

Beyond the machine room?
- Need more capacity than is available at (most) single sites
  - Everyone would like a 10K-node 100GHz cluster
  - Very expensive (cooling, power)
  - More economical to have multiple sites
- Need to locate available resources now
- Data/instruments are inherently distributed
[Figure labels: Machine Room, Campus, Nation]

Grid Computing
A dynamic multi-institutional network of computers that come together to share resources for the purpose of coordinated problem solving.
Achieved through:
1. Open general-purpose protocols
2. Standard interfaces
[Figure labels: resource, application, institutional boundary]
Layers in Grid

A Grid Checklist
A Grid:
- coordinates resources that are not subject to centralized control …
- … using standard, open, general-purpose protocols and interfaces …
- … to deliver nontrivial qualities of service.
Virtual Organizations
- A group of individuals or institutions, defined by sharing rules, that share the resources of the "Grid" for a common goal.
- Examples: application service providers, storage service providers, databases, crisis management teams, consultants.

How is a grid different?
- Grids focus on site autonomy
- Grids involve heterogeneity
- Grids involve more resources than just computers and networks
- Grids focus on the user

Grid Infrastructure
- Distributed management
  - Of physical resources
  - Of software services
  - Of communities and their policies
- Unified treatment
  - Build on Web services framework
  - Use WS-RF, WS-Notification (or WS-Transfer/Man) to represent/access state
  - Common management abstractions & interfaces

Globus is Open Source Grid Infrastructure
- Implement key Web services standards
  - State, notification, security, …
- Software for Grid infrastructure
  - Service-enable new & existing resources
  - E.g., GRAM on computer, GridFTP on storage system, custom application services
  - Uniform abstractions & mechanisms
- Tools to build applications that exploit Grid infrastructure
  - Registries, security, data management, …
- Enabler of a rich tool & service ecosystem

GLOBUS TOOLKIT 4 - GT4
- Open source toolkit developed by the Globus Alliance that allows us to build Grid applications.
- Organized as a collection of loosely coupled components.
- Consists of services, programming libraries, and development tools.

GT Domain Areas
- Core runtime
  - Infrastructure for building new services
- Security
  - Apply uniform policy across distinct systems
- Execution management
  - Provision, deploy, & manage services
- Data management
  - Discover, transfer, & access large data
- Monitoring
  - Discover & monitor dynamic services
GT4 Components

WSRF & WS-Notification
- Naming and bindings (basis for virtualization)
  - Every resource can be uniquely referenced, and has one or more associated services for interacting with it
- Lifecycle (basis for fault-resilient state management)
  - Resources created by services following the factory pattern
  - Resources destroyed immediately or scheduled
- Information model (basis for monitoring, discovery)
  - Resource properties associated with resources
  - Operations for querying and setting this info
  - Asynchronous notification of changes to properties
- Service groups (basis for registries, collective services)
  - Group membership rules & membership management
- Base Fault type
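
The WSRF ideas on this slide (uniquely named resources, factory-based creation, scheduled destruction, resource properties, and change notification) can be illustrated with a small in-memory sketch. This is not the GT4 or WSRF API: every class and method name below (`Resource`, `ResourceFactory`, `subscribe`, `sweep`) is hypothetical and chosen only to make the concepts concrete.

```python
import itertools
from typing import Any, Callable, Dict, List, Optional


class Resource:
    """Toy stand-in for a stateful resource (hypothetical, not the WSRF API)."""

    def __init__(self, key: str, termination_time: Optional[float]) -> None:
        self.key = key                            # unique reference (naming)
        self.termination_time = termination_time  # None: no scheduled destruction
        self._properties: Dict[str, Any] = {}
        self._subscribers: List[Callable[[str, str, Any], None]] = []

    def get_property(self, name: str) -> Any:
        return self._properties[name]

    def set_property(self, name: str, value: Any) -> None:
        self._properties[name] = value
        # Tell every subscriber about the change (notification).
        for callback in self._subscribers:
            callback(self.key, name, value)

    def subscribe(self, callback: Callable[[str, str, Any], None]) -> None:
        self._subscribers.append(callback)


class ResourceFactory:
    """Creates resources, and destroys them immediately or on schedule."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._resources: Dict[str, Resource] = {}

    def create(self, lifetime: Optional[float] = None, now: float = 0.0) -> Resource:
        key = f"resource-{next(self._ids)}"
        termination = None if lifetime is None else now + lifetime
        resource = Resource(key, termination)
        self._resources[key] = resource
        return resource

    def lookup(self, key: str) -> Optional[Resource]:
        return self._resources.get(key)

    def destroy(self, key: str) -> None:
        """Immediate destruction."""
        self._resources.pop(key, None)

    def sweep(self, now: float) -> List[str]:
        """Scheduled destruction: remove resources whose termination time passed."""
        expired = [
            key
            for key, resource in self._resources.items()
            if resource.termination_time is not None
            and resource.termination_time <= now
        ]
        for key in expired:
            del self._resources[key]
        return expired
```

A client would call `create`, subscribe to property changes, and rely on `sweep` to clean up resources it forgot about, which is the fault-resilience argument behind scheduled destruction.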

Security Services
- Form the underlying communication medium for all the services
- Secure authentication and authorization
- Single sign-on
  - Users need not explicitly authenticate themselves every time a service is requested
- Uniform credentials
- Ex: GSI (Grid Security Infrastructure)

Grid Security Infrastructure (GSI)
- Use GSI as a standard mechanism for bridging disparate security mechanisms
  - Doesn't solve the trust problem, but now things talk the same protocol and understand each other's identity credentials
  - Basic support for delegation, policy distribution
- Translate from other mechanisms to/from GSI as needed
- Convert from GSI identity to local identity for authorization

Grid Security Infrastructure (GSI)
- Based on standard PKI technologies
  - CAs allow one-way, light-weight trust relationships (not just site-to-site)
  - SSL protocol or WS-Security for authentication, message protection
  - X.509 certificates for asserting identity
    - For users, services, hosts, etc.
- Proxy certificates
  - GSI extension to X.509 certificates for delegation, single sign-on

Gridmap file
- A gridmap file at each site maps the grid id of a user to a local id
- The grid id of the user is the subject of his/her grid user certificate
- The local id is site-specific; multiple grid ids can be mapped to a single local id
  - Usually a local id exists for each VO participating in that grid effort
- The local ids are then used to implement site-specific policies
  - Priorities, etc.

Gridmap file entry
- The gridmap file is maintained by the site administrator
- Each entry maps a Grid DN (the distinguished name of the user; subject name) to local user names

# Distinguished Name    Local username
"/DC=org/DC=doegrids/OU=People/CN=Laukik Chitnis 712960" ivdgl
"/DC=org/DC=doegrids/OU=People/CN=Richard Cavanaugh 710220" grid3
"/DC=org/DC=doegrids/OU=People/CN=JangUk In 712961" ivdgl
"/DC=org/DC=doegrids/OU=People/CN=Jorge Rodriguez 690211" osg
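
To make the mapping concrete, here is a minimal sketch of how entries in this format could be parsed. It assumes each entry is a double-quoted DN followed by whitespace and the local name, that lines starting with `#` are comments, and that several local names may be given as a comma-separated list; the function name `parse_gridmap` is hypothetical and not part of any Globus tool.

```python
import shlex
from typing import Dict, List

SAMPLE = '''
# Distinguished Name    Local username
"/DC=org/DC=doegrids/OU=People/CN=Laukik Chitnis 712960" ivdgl
"/DC=org/DC=doegrids/OU=People/CN=Richard Cavanaugh 710220" grid3
"/DC=org/DC=doegrids/OU=People/CN=JangUk In 712961" ivdgl
"/DC=org/DC=doegrids/OU=People/CN=Jorge Rodriguez 690211" osg
'''


def parse_gridmap(text: str) -> Dict[str, List[str]]:
    """Map each quoted distinguished name to its list of local user names."""
    mapping: Dict[str, List[str]] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # shlex honours the double quotes around the DN, which contains spaces.
        fields = shlex.split(line)
        if len(fields) != 2:
            raise ValueError(f"malformed gridmap entry: {raw_line!r}")
        dn, local_names = fields
        mapping[dn] = local_names.split(",")
    return mapping


if __name__ == "__main__":
    for dn, names in parse_gridmap(SAMPLE).items():
        print(dn, "->", names)
```

Note that two different DNs map to the same local id `ivdgl`, matching the point that one local account usually serves a whole VO.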

How to create and use an Identity (1)
Run the command below to request a personal grid identity certificate:
% grid-cert-request
This will create the following files in $HOME/.globus:
- usercert_request.pem (request to sign certificate)
- userkey.pem (private key, encrypted)
- usercert.pem (public certificate, once signed by the CA)

How to create and use an Identity (2)
After you have created the request, mail it to the local certificate authority:
% cat $HOME/.globus/usercert_request.pem | mail [email protected] (or [email protected])
The CA will then mail back a signed certificate, which you should save as $HOME/.globus/usercert.pem
(It can take up to a day for the CA to process the request.)

Commands to log in / log out
- grid-proxy-init
  - This "logs you into" the Globus system.
- grid-proxy-info
  - Use this to see your status.
- grid-proxy-destroy
  - Use this to log out.
A proxy is like a temporary ticket to use the Grid; its default lifetime in the above case is 12 hours.
Once this is done, you should be able to run "grid jobs":
% globus-job-run site-name command
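
Because a proxy expires, scripts that submit jobs often check the remaining lifetime first. The sketch below assumes that `grid-proxy-info -timeleft` prints the remaining lifetime in seconds and exits with an error when no proxy exists; the helper names (`run_command`, `proxy_seconds_left`, `needs_new_proxy`) are hypothetical, and the command runner is injectable so the logic can be exercised without a Globus installation.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], str]


def run_command(argv: Sequence[str]) -> str:
    """Run a command and return its standard output as text."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


def proxy_seconds_left(run: Runner = run_command) -> int:
    """Return the proxy lifetime left in seconds, or 0 if there is no usable proxy."""
    try:
        output = run(["grid-proxy-info", "-timeleft"])
        return max(int(output.strip()), 0)
    except (subprocess.CalledProcessError, ValueError):
        # Assumption: a failing command or unparsable output means "no valid proxy".
        return 0


def needs_new_proxy(min_seconds: int, run: Runner = run_command) -> bool:
    """True when the user should run grid-proxy-init before submitting work."""
    return proxy_seconds_left(run) < min_seconds
```

For example, `needs_new_proxy(3600)` would report whether less than an hour of the 12-hour default remains.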

GT4 Data Management
- Stage/move large data to/from nodes
  - GridFTP, Reliable File Transfer (RFT)
  - Alone, and integrated with GRAM
- Locate data of interest
  - Replica Location Service (RLS)
- Replicate data for performance/reliability
  - Distributed Replication Service (DRS)
- Provide access to diverse data sources
  - File systems, parallel file systems, hierarchical storage: GridFTP
  - Databases: OGSA-DAI

GridFTP
What is GridFTP?
- A secure, robust, fast, efficient, standards-based, widely accepted data transfer protocol
- A protocol
  - Multiple independent implementations can interoperate
    - This works. Both the Condor Project at UWis and Fermi Lab have home-grown servers that work with ours.
    - Lots of people have developed clients independent of the Globus Project.
- We also supply a reference implementation:
  - Server
  - Client tools (globus-url-copy)
  - Development libraries

globus-url-copy
- GridFTP-compliant client from the Globus team
- Copies files from one URL to another URL
  - One URL is usually a gsiftp:// URL
  - The other URL is usually a file:/ URL
- To move a file from a remote GridFTP-enabled server to the local machine:
% globus-url-copy gsiftp://gcb.fiu.edu/tmp/jt file:/home/skala001/jt
- To put a file onto the server, reverse the URLs:
% globus-url-copy file:/home/skala001/jt gsiftp://gcb.fiu.edu/tmp/jt
- Monitor performance using the -vb flag:
% globus-url-copy -vb gsiftp://gcb.fiu.edu/tmp/jt file:/home/skala001/jt
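
When these transfers are scripted, it helps to build the argument list in one place. The following sketch only assembles the command lines shown on this slide; the helper names (`gsiftp_url`, `file_url`, `copy_command`) are hypothetical, and nothing is executed.

```python
import shlex
from typing import List


def gsiftp_url(host: str, path: str) -> str:
    """Build a gsiftp:// URL for an absolute path on a GridFTP server."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute")
    return f"gsiftp://{host}{path}"


def file_url(path: str) -> str:
    """Build a file:/ URL for an absolute local path, as written on the slide."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute")
    return f"file:{path}"


def copy_command(source_url: str, dest_url: str, show_performance: bool = False) -> List[str]:
    """Return the globus-url-copy argument list; -vb is added on request."""
    argv = ["globus-url-copy"]
    if show_performance:
        argv.append("-vb")
    argv += [source_url, dest_url]
    return argv


if __name__ == "__main__":
    remote = gsiftp_url("gcb.fiu.edu", "/tmp/jt")
    local = file_url("/home/skala001/jt")
    print(shlex.join(copy_command(remote, local)))        # download
    print(shlex.join(copy_command(local, remote)))        # upload: URLs reversed
    print(shlex.join(copy_command(remote, local, True)))  # download with -vb
```

Returning a list rather than a string lets the caller pass it directly to `subprocess.run` without shell quoting concerns.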

Reliable File Transfer - RFT
- WSRF-compliant, fault-tolerant, high-performance data transfer service
  - Soft state
  - Notifications/query
- Reliability on top of the high performance provided by GridFTP
  - Fire and forget
  - Integrated automatic failure recovery
    - Network-level failures
    - System-level failures, etc.
- Essentially a data transfer scheduler with FIFO as its queue policy
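
The "FIFO scheduler with automatic failure recovery" idea can be sketched in a few lines. This is a toy model and not how RFT is implemented: the function name `run_transfers`, the retry limit, and the use of `OSError` to signal a failed transfer are all assumptions made for illustration, and the `copy` callable stands in for a real GridFTP transfer.

```python
from collections import deque
from typing import Callable, Deque, Dict, Iterable, Tuple

Transfer = Tuple[str, str]  # (source URL, destination URL)


def run_transfers(
    requests: Iterable[Transfer],
    copy: Callable[[str, str], None],
    max_attempts: int = 3,
) -> Dict[Transfer, str]:
    """Process transfers in FIFO order, retrying each one on failure."""
    queue: Deque[Transfer] = deque(requests)  # first in, first out
    status: Dict[Transfer, str] = {}
    while queue:
        transfer = queue.popleft()
        for attempt in range(1, max_attempts + 1):
            try:
                copy(*transfer)
            except OSError:
                continue  # failure recovery: try the same transfer again
            status[transfer] = f"Done after {attempt} attempt(s)"
            break
        else:
            # Every attempt failed; record it and move on to the next request.
            status[transfer] = "Failed"
    return status
```

The caller submits the whole list once and inspects the returned status later, which mirrors the fire-and-forget usage described above.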

RFT
[Figure: RFT architecture. Labels: RFT Client, RFT Service, SOAP Messages, Notifications (Optional); two GridFTP Servers, each with Protocol Interpreter, Master DSI, Slave DSI, IPC Receiver, IPC Link, and Data Channels.]

Execution Management
- Common WS interface to schedulers
  - Unix, Condor, LSF, PBS, SGE, …
- More generally: interface for process execution management
  - Lay down execution environment
  - Stage data
  - Monitor & manage lifecycle
  - Kill it, clean up
- A basis for application-driven provisioning

Grid Job Management Goals
Provide a service to securely:
- Create an environment for a job
- Stage files to/from the environment
- Cause execution of job process(es)
  - Via various local resource managers
- Monitor execution
- Signal important state changes to the client
- Enable client access to output files
  - Streaming access during execution

GRAM
- GRAM: Globus Resource Allocation and Management
- GRAM is a Globus Toolkit component
  - For Grid job management
- GRAM is a unifying remote interface to resource managers
  - Yet preserves local site security/control
- GRAM is for stateful job control
  - Reliable operation
  - Asynchronous monitoring and control
  - Remote credential management
  - File staging via RFT and GridFTP

GT4 WS GRAM Architecture
[Figure: GT4 WS GRAM architecture. Labels: Client; Job functions; Delegate; Service host(s) and compute element(s); GT4 Java Container hosting GRAM services, Delegation, and RFT File Transfer; sudo; GRAM adapter; Local job control; Local scheduler; User job; Compute element; SEG; Job events; Transfer request; GridFTP; FTP control; FTP data; Remote storage element(s).]
Delegated credential can be:
- Made available to the application
- Used to authenticate with RFT
- Used to authenticate with GridFTP

A Simple Example
Command example:
% globusrun-ws -submit -c /bin/date
Submitting job...Done.
Job ID: uuid:002a6ab8-6036-11d9-bae6-0002a5ad41e5
Termination time: 01/07/2005 22:55 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
A successful submission will create a new ManagedJob resource with its own unique EPR for messaging.
Use the -o option to create the EPR file:
% globusrun-ws -submit -o job.epr -c /bin/date

A Simple Example (2)
To see the output, use the -s (stream) option:
% globusrun-ws -submit -s -c /bin/date
Termination time: 06/14/2007 18:07 GMT
Current job state: Active
Current job state: CleanUp-Hold
Wed Jun 13 14:07:54 EDT 2007
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Cleaning up any delegated credentials...Done.
If you want to send the output to a file, use the -so option:
% globusrun-ws -submit -s -so job.out -c /bin/date
…
% cat job.out
Wed Jun 13 14:07:54 EDT 2007

A Simple Example (3)
Submitting your job to different schedulers:
- Fork
% globusrun-ws -submit -Ft Fork -s -c /bin/date
(Actually, the default is Fork, so you can omit it in this case.)
- SGE
% globusrun-ws -submit -Ft SGE -s -c /bin/hostname

Batch Job Submissions
% globusrun-ws -submit -batch -o job_epr -c /bin/sleep 50
Submitting job...Done.
Job ID: uuid:f9544174-60c5-11d9-97e3-0002a5ad41e5
Termination time: 01/08/2005 16:05 GMT

% globusrun-ws -status -j job_epr
Current job state: Active

% globusrun-ws -status -j job_epr
Current job state: Done

% globusrun-ws -kill -j job_epr
Requesting original job description...Done.
Destroying job...Done.
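
A script that polls a batch job needs to pull the state out of the text that `globusrun-ws -status` prints. The sketch below assumes the output format is exactly the `Current job state: <state>` lines shown in these slides; the function names `job_states` and `last_job_state` are hypothetical.

```python
import re
from typing import List, Optional

# Matches lines such as "Current job state: CleanUp-Hold".
_STATE_LINE = re.compile(r"^Current job state: (\S+)[ \t]*$", re.MULTILINE)


def job_states(output: str) -> List[str]:
    """Return every job state reported in the output, in order of appearance."""
    return _STATE_LINE.findall(output)


def last_job_state(output: str) -> Optional[str]:
    """Return the most recent job state, or None if no state line is present."""
    states = job_states(output)
    return states[-1] if states else None


if __name__ == "__main__":
    sample = (
        "Submitting job...Done.\n"
        "Current job state: Active\n"
        "Current job state: CleanUp\n"
        "Current job state: Done\n"
        "Destroying job...Done.\n"
    )
    print(job_states(sample))
    print(last_job_state(sample))
```

A polling loop would call the status command repeatedly and stop once `last_job_state` reports `Done`.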

Complete Factory Contact
- Override default EPR
  - Select a different host/service
- Use "contact" shorthand for convenience
  - Relies on proprietary knowledge of EPR format!
Command example:
% globusrun-ws -submit -F gcb.fiu.edu -c /bin/date

Read RSL from File
Command:
% globusrun-ws -submit -f touch.xml
Contents of touch.xml file:
<job>
    <executable>/bin/touch</executable>
    <argument>touched_it</argument>
</job>
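
Job description files like touch.xml can also be generated programmatically. This is a minimal sketch using only the `job`, `executable`, and `argument` elements shown above; the function name `build_job_description` is hypothetical, and the full job description schema has many more elements that are not covered here.

```python
import xml.etree.ElementTree as ET
from typing import Sequence


def build_job_description(executable: str, arguments: Sequence[str] = ()) -> str:
    """Return a <job> document with one <executable> and zero or more <argument>s."""
    job = ET.Element("job")
    ET.SubElement(job, "executable").text = executable
    for argument in arguments:
        ET.SubElement(job, "argument").text = argument
    # encoding="unicode" yields a str without an XML declaration.
    return ET.tostring(job, encoding="unicode")


if __name__ == "__main__":
    print(build_job_description("/bin/touch", ["touched_it"]))
```

Writing the returned string to touch.xml gives a file that can be passed to the -f option as in the command above.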

Resource Specification Language (RSL)
- RSL is the language used by clients to submit a job.
- All job submission requests are described in RSL, including the executable file and arguments.
- You can specify the type and capabilities of the resources to execute your job.
- You can also coordinate stage-in and stage-out operations through RSL.

Common/useful options
- globusrun-ws -J
  - Perform delegation as necessary for job
- globusrun-ws -S
  - Perform delegation as necessary for job's file staging
- globusrun-ws -s
  - Stream stdout/err during job execution to the terminal
- globusrun-ws -self
  - Useful for testing, when you have started the service using your credentials instead of host credentials

Staging job
<job>
    <executable>/bin/echo</executable>
    <directory>/tmp</directory>
    <argument>Hello</argument>
    <stdout>job.out</stdout>
    <stderr>job.err</stderr>
    <fileStageOut>
        <transfer>
            <sourceUrl>file:///tmp/job.out</sourceUrl>
            <destinationUrl>gsiftp://host.domain:2811/tmp/stage.out</destinationUrl>
        </transfer>
    </fileStageOut>
</job>

RSL Variable
- Enables late binding of values
  - Values resolved by GRAM service
- System-specific variables
  - ${GLOBUS_USER_HOME}
  - ${GLOBUS_LOCATION}
  - ${GLOBUS_SCRATCH_DIR}
    - Alternative directory that is shared with the compute node
    - Typically provides more space than the user's HOME dir

RSL Variable Example
<job>
    <executable>/bin/echo</executable>
    <argument>HOME is ${GLOBUS_USER_HOME}</argument>
    <argument>SCRATCH = ${GLOBUS_SCRATCH_DIR}</argument>
    <argument>GL is ${GLOBUS_LOCATION}</argument>
    <stdout>${GLOBUS_USER_HOME}/echo.stdout</stdout>
    <stderr>${GLOBUS_USER_HOME}/echo.stderr</stderr>
</job>
!!!/tmp/rslExample
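
Late binding means the client sends the `${...}` placeholders unchanged and the service fills them in on the execution host. The effect can be imitated locally with Python's `string.Template`, which happens to use the same `${NAME}` syntax. This only illustrates the substitution step and is not how GRAM resolves variables internally; the function name and the directory values are made up.

```python
from string import Template
from typing import Mapping


def resolve_rsl_variables(text: str, values: Mapping[str, str]) -> str:
    """Replace ${NAME} placeholders; raises KeyError if a variable is undefined."""
    return Template(text).substitute(values)


if __name__ == "__main__":
    # Hypothetical values that a service might bind on one particular host.
    site_values = {
        "GLOBUS_USER_HOME": "/home/alice",
        "GLOBUS_SCRATCH_DIR": "/scratch/alice",
        "GLOBUS_LOCATION": "/usr/local/globus",
    }
    line = "<stdout>${GLOBUS_USER_HOME}/echo.stdout</stdout>"
    print(resolve_rsl_variables(line, site_values))
```

Because the values are bound per host, the same job description can run unchanged at sites with different directory layouts.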

GRAM Commands
- Run a job using:
% globus-job-run localhost /bin/date
- Submit to Fork:
% globus-job-run localhost/jobmanager-fork /bin/date
- Submit a batch job using:
% globus-job-submit localhost /bin/sleep 50
- globus-job-status
- globus-job-get-output
- globus-job-cancel

Running a Script in GRAM
Add this script to the file "job":
#! /bin/csh -f
echo "Hello World from "; $GLOBUS_LOCATION/bin/globus-hostname
echo arg 1 = $1
echo arg 2 = $2
echo -n "sum is "
echo "$1+$2" | /usr/bin/bc -l
Change the permissions for "job":
% chmod +x job
Run the job:
% globus-job-run localhost ./job 5 6
You should get:
Hello World from
gcb.fiu.edu
arg 1 = 5
arg 2 = 6
sum is 11
!!!/tmp/job

What is MDS4?
- Grid-level monitoring system used most often for resource selection and error notification
  - Aid user/agent to identify host(s) on which to run an application
  - Make sure that they are up and running correctly
- Uses standard interfaces to provide publishing of data, discovery, and data access, including subscription/notification
  - WS-ResourceProperties, WS-BaseNotification, WS-ServiceGroup
- Functions as an hourglass to provide a common interface to lower-level monitoring tools

MDS4 Components
- Information providers
  - Monitoring is a part of every WSRF service
  - Non-WS services can also be used
- Higher-level services
  - Index Service: a way to aggregate data
  - Trigger Service: a way to be notified of changes
  - Both built on a common aggregator framework
- Clients
  - WebMDS
- All of the tools are schema-agnostic, but interoperability needs a well-understood common language

Information Providers
- GT4 information providers collect information from some system and make it accessible as WSRF resource properties
- Growing number of information providers
  - Ganglia, CluMon, Nagios
  - SGE, LSF, OpenPBS, PBSPro, Torque
- Many opportunities to build additional ones
  - E.g., network monitoring, storage systems, various sensors
![Page 57: 1 High-Performance Grid Computing and Research Networking Presented by Selim Kalayci Instructor: S. Masoud Sadjadi sadjadi/Teaching](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d805503460f94a64128/html5/thumbnails/57.jpg)
57
Information Providers
Data sources for the higher-level services
Some are built into services
  Any WSRF-compliant service publishes some data automatically
  WS-RF gives us standard Query/Subscribe/Notify interfaces
  GT4 services: ServiceMetaDataInfo element includes start time, version, and service type name
  Most of them also publish additional useful information as resource properties
![Page 58: 1 High-Performance Grid Computing and Research Networking Presented by Selim Kalayci Instructor: S. Masoud Sadjadi sadjadi/Teaching](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d805503460f94a64128/html5/thumbnails/58.jpg)
58
Information Providers: GT4 Services
Reliable File Transfer Service (RFT)
  Service status data, number of active transfers, transfer status, information about the resource running the service
Community Authorization Service (CAS)
  Identifies the VO served by the service instance
Replica Location Service (RLS)
  Note: not a WS
  Location of replicas on physical storage systems (based on user registrations) for later queries
![Page 59: 1 High-Performance Grid Computing and Research Networking Presented by Selim Kalayci Instructor: S. Masoud Sadjadi sadjadi/Teaching](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d805503460f94a64128/html5/thumbnails/59.jpg)
59
Information Providers (2)
Other sources of data
  Any executables
  Other (non-WS) services
  Interface to another archive or data store
  File scraping
Just need to produce a valid XML document
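To illustrate the "any executable that produces a valid XML document" idea, here is a minimal sketch in Python. The element names (`HostReport`, `Name`, `Load1Min`, `FreeMemoryMB`) are hypothetical and do not follow GLUE or any MDS4 schema; how such a script would be registered with MDS4 is not shown here.

```python
import socket
import xml.etree.ElementTree as ET


def build_host_report(hostname, load_1min, free_mb):
    """Return a well-formed XML string describing one host.

    The element names are made up for illustration; a real provider
    would follow whatever schema its consumers agree on.
    """
    root = ET.Element("HostReport")
    ET.SubElement(root, "Name").text = hostname
    ET.SubElement(root, "Load1Min").text = f"{load_1min:.2f}"
    ET.SubElement(root, "FreeMemoryMB").text = str(free_mb)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # The load and memory figures are placeholder values, not measurements.
    print(build_host_report(socket.gethostname(), 0.42, 2048))
```

Building the document with an XML library rather than string concatenation is what keeps the output well-formed when values contain characters such as `<` or `&`.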
![Page 60: 1 High-Performance Grid Computing and Research Networking Presented by Selim Kalayci Instructor: S. Masoud Sadjadi sadjadi/Teaching](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d805503460f94a64128/html5/thumbnails/60.jpg)
60
Information Providers: Cluster and Queue Data
Interfaces to Hawkeye, Ganglia, CluMon, Nagios
  Basic host data (name, ID), processor information, memory size, OS name and version, file system data, processor load data
  Some Condor/cluster-specific data
  This can also be done for sub-clusters, not just at the host level
Interfaces to PBS, Torque, LSF
  Queue information, number of CPUs available and free, job count information, some memory statistics, and host info for the head node of the cluster
![Page 61: 1 High-Performance Grid Computing and Research Networking Presented by Selim Kalayci Instructor: S. Masoud Sadjadi sadjadi/Teaching](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d805503460f94a64128/html5/thumbnails/61.jpg)
61
Higher-Level Services
Index Service
  Caching registry
Trigger Service
  Warn on error conditions
All of these have common needs, and are built on a common framework
![Page 62: 1 High-Performance Grid Computing and Research Networking Presented by Selim Kalayci Instructor: S. Masoud Sadjadi sadjadi/Teaching](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d805503460f94a64128/html5/thumbnails/62.jpg)
62
MDS4 Index Service
Index Service is both registry and cache
  Datatype and data provider info, like a registry (UDDI)
  Last value of data, like a cache
Subscribes to information providers
In-memory storage is the default approach
  DB backing store currently being discussed to allow for very large indexes
Can be set up for a site or set of sites, a specific set of project data, or for user-specific data only
Can be a multi-rooted hierarchy
  No *global* index
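The "both registry and cache" description can be sketched as a small in-memory structure. This is only an illustration of the idea, not the MDS4 implementation; the class name `IndexSketch` and its methods are hypothetical.

```python
import time


class IndexSketch:
    """Toy in-memory index: a registry of providers plus a last-value cache."""

    def __init__(self):
        self._registry = {}  # provider name -> datatype description
        self._cache = {}     # provider name -> (timestamp, last value)

    def register(self, provider, datatype):
        """Registry role: remember who provides what kind of data."""
        self._registry[provider] = datatype

    def notify(self, provider, value, timestamp=None):
        """Cache role: keep only the most recent value per provider."""
        if provider not in self._registry:
            raise KeyError(f"unregistered provider: {provider}")
        when = time.time() if timestamp is None else timestamp
        self._cache[provider] = (when, value)

    def query(self, provider):
        """Return the last value received, or None if nothing has arrived."""
        entry = self._cache.get(provider)
        return None if entry is None else entry[1]

    def providers(self):
        """Return a copy of the registry contents."""
        return dict(self._registry)


if __name__ == "__main__":
    index = IndexSketch()
    index.register("cluster-a", "queue information")
    index.notify("cluster-a", {"free_cpus": 12})
    index.notify("cluster-a", {"free_cpus": 9})
    print(index.query("cluster-a"))
```

Because only the last value is kept, a client query is answered from memory without contacting the provider, at the cost of possibly stale data.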
![Page 63: 1 High-Performance Grid Computing and Research Networking Presented by Selim Kalayci Instructor: S. Masoud Sadjadi sadjadi/Teaching](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d805503460f94a64128/html5/thumbnails/63.jpg)
63
Container-wide Index
Each GT4 container has a local index
Collects information about services in that container
Each service registers to the container index when correctly configured
![Page 64: 1 High-Performance Grid Computing and Research Networking Presented by Selim Kalayci Instructor: S. Masoud Sadjadi sadjadi/Teaching](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d805503460f94a64128/html5/thumbnails/64.jpg)
64
VO-wide indexes
Local indexes can be registered to VO-wide indexes
Config file at resource container or at VO index – contains URL for resource or VO index
![Page 65: 1 High-Performance Grid Computing and Research Networking Presented by Selim Kalayci Instructor: S. Masoud Sadjadi sadjadi/Teaching](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d805503460f94a64128/html5/thumbnails/65.jpg)
65
MDS4 Trigger Service
Subscribe to a set of resource properties
Evaluate that data against a set of pre-configured conditions (triggers)
When a condition matches, an action occurs
  Email is sent to a pre-defined address
  Website updated
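The subscribe, evaluate, act cycle described on this slide can be sketched as follows. This is a hypothetical illustration rather than the Trigger Service's actual configuration or API: the `Trigger` class, the `evaluate` function, and the property name `FreeCPUs` used in the example are assumptions, and the "email" action is a stand-in that only appends to a list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Trigger:
    """A named condition over resource properties, plus an action to run."""
    name: str
    condition: Callable[[Dict[str, float]], bool]
    action: Callable[[str, Dict[str, float]], None]


def evaluate(triggers: List[Trigger], properties: Dict[str, float]) -> List[str]:
    """Run every trigger whose condition matches; return the names that fired."""
    fired = []
    for trigger in triggers:
        if trigger.condition(properties):
            trigger.action(trigger.name, properties)
            fired.append(trigger.name)
    return fired


if __name__ == "__main__":
    outbox = []  # stand-in for sending email to a pre-defined address

    def queue_email(name, props):
        outbox.append(f"[{name}] properties={props}")

    triggers = [
        Trigger("no-free-cpus", lambda p: p.get("FreeCPUs", 0) == 0, queue_email),
    ]
    print(evaluate(triggers, {"FreeCPUs": 0}))
    print(outbox)
```

Keeping conditions and actions as separate callables mirrors the slide's split between pre-configured conditions and the action taken when one matches.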
![Page 66: 1 High-Performance Grid Computing and Research Networking Presented by Selim Kalayci Instructor: S. Masoud Sadjadi sadjadi/Teaching](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d805503460f94a64128/html5/thumbnails/66.jpg)
66
Information models
Each information source publishes information in XML according to some schema.
Sometimes the author of the information source or the grid resource defines that schema.
Some collaborative efforts define common schemas, for example GLUE for compute information
Schema typically written in XSD, but this is not required
![Page 67: 1 High-Performance Grid Computing and Research Networking Presented by Selim Kalayci Instructor: S. Masoud Sadjadi sadjadi/Teaching](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d805503460f94a64128/html5/thumbnails/67.jpg)
67
GLUE schema
Grid Laboratory Uniform Environment
Schema developed by DataTAG for EU/USA interoperability
Modelled in UML
Implementations
  XML version for MDS
    Information collected from various cluster monitoring systems
  Also: LDAP and SQL versions (used by older versions of MDS and other monitoring systems)
![Page 68: 1 High-Performance Grid Computing and Research Networking Presented by Selim Kalayci Instructor: S. Masoud Sadjadi sadjadi/Teaching](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d805503460f94a64128/html5/thumbnails/68.jpg)
68
MDS user interfaces
General purpose UIs
  Web browser based interface - WebMDS
  Command line tools
Specialized clients
  Brokers
![Page 69: 1 High-Performance Grid Computing and Research Networking Presented by Selim Kalayci Instructor: S. Masoud Sadjadi sadjadi/Teaching](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d805503460f94a64128/html5/thumbnails/69.jpg)
69
WebMDS
Web-based interface to display monitoring information
Easily extensible for new data using XSLT
![Page 70: 1 High-Performance Grid Computing and Research Networking Presented by Selim Kalayci Instructor: S. Masoud Sadjadi sadjadi/Teaching](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d805503460f94a64128/html5/thumbnails/70.jpg)
70
MDS4 - Command Line
XPath queries to query the Index Service
To see all data collected in the Index Service:
  wsrf-query -s https://gcb.fiu.edu:8443/wsrf/services/DefaultIndexService
To see the number of free nodes (the FreeCPUs attribute):
  wsrf-query -s https://gcb.fiu.edu:8443/wsrf/services/DefaultIndexService "number(//*/glue:GLUECE//glue:ComputingElement/glue:State/@glue:FreeCPUs)"
To see how many jobs are currently running:
  wsrf-query -s https://gcb.fiu.edu:8443/wsrf/services/DefaultIndexService "number(//*[local-name()='GLUECE']//glue:ComputingElement//glue:State/@glue:TotalJobs)"
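To show what these XPath expressions select, the following sketch runs an equivalent lookup in Python against a small, made-up document. The sample XML only mirrors the element and attribute names that appear in the queries above; it is not real Index Service output, and the namespace URI bound to `glue` is an assumption. Python's `xml.etree.ElementTree` supports only a subset of XPath (no `number()` function), so the attribute is read and converted explicitly.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the "glue" prefix; check the real service output.
GLUE_NS = "http://mds.globus.org/glue/ce/1.1"

# Made-up sample document, shaped only after the names used in the queries.
SAMPLE = f"""
<IndexData xmlns:glue="{GLUE_NS}">
  <glue:GLUECE>
    <glue:ComputingElement glue:Name="default">
      <glue:State glue:FreeCPUs="6" glue:TotalJobs="3"/>
    </glue:ComputingElement>
  </glue:GLUECE>
</IndexData>
""".strip()


def state_attribute(xml_text, attribute):
    """Return an integer attribute of the first glue:State element, or None."""
    root = ET.fromstring(xml_text)
    namespaces = {"glue": GLUE_NS}
    state = root.find(
        ".//glue:GLUECE//glue:ComputingElement/glue:State", namespaces
    )
    if state is None:
        return None
    # Namespaced attributes are keyed as "{uri}localname" in ElementTree.
    raw = state.get(f"{{{GLUE_NS}}}{attribute}")
    return None if raw is None else int(raw)


if __name__ == "__main__":
    print("FreeCPUs:", state_attribute(SAMPLE, "FreeCPUs"))
    print("TotalJobs:", state_attribute(SAMPLE, "TotalJobs"))
```

Like XPath's `number()` applied to a node set, this reads only the first matching `State` element; a site with several computing elements would need to iterate with `findall` and sum the values.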
![Page 71: 1 High-Performance Grid Computing and Research Networking Presented by Selim Kalayci Instructor: S. Masoud Sadjadi sadjadi/Teaching](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d805503460f94a64128/html5/thumbnails/71.jpg)
71
Configuring GRAM to use a cluster monitoring system
GRAM extracts and publishes cluster information from either Ganglia or Hawkeye
$GLOBUS_LOCATION/etc/globus_wsrf_mds_usefulrp/gluerp.xml
The <defaultProvider> tag specifies whether to use Ganglia, Hawkeye, or neither.
Uncomment the appropriate example supplied in the config file
![Page 72: 1 High-Performance Grid Computing and Research Networking Presented by Selim Kalayci Instructor: S. Masoud Sadjadi sadjadi/Teaching](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d805503460f94a64128/html5/thumbnails/72.jpg)
72
Agenda Grid Computing Grid Middleware - Globus Security in Globus Data Management Execution Management Monitoring Metaschedulers - Gridway
![Page 73: 1 High-Performance Grid Computing and Research Networking Presented by Selim Kalayci Instructor: S. Masoud Sadjadi sadjadi/Teaching](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d805503460f94a64128/html5/thumbnails/73.jpg)
73
Grid Meta-Scheduler
Local schedulers are not a good fit for a Grid environment
Meta-scheduler(s) should interact with lower-level schedulers for scheduling decisions
Resources (computational, data, network, etc.) and jobs are other entities the meta-scheduler should be aware of and interact with
Meta-scheduler uses existing Grid services
![Page 74: 1 High-Performance Grid Computing and Research Networking Presented by Selim Kalayci Instructor: S. Masoud Sadjadi sadjadi/Teaching](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d805503460f94a64128/html5/thumbnails/74.jpg)
74
GridWay
Lightweight metascheduler on top of GT 2.4 – 4.x
Properties:
  Support for the GGF DRMAA standard API for job submission and management
  Support for JSDL
  Simple scheduling mechanisms but extensible
  Interoperability between different grid infrastructures and middlewares (Globus, EGEE, UNICORE…)
  Allows job dependencies (workflow)
  Supports job migration/adaptive execution (Grid- and application-initiated)
![Page 75: 1 High-Performance Grid Computing and Research Networking Presented by Selim Kalayci Instructor: S. Masoud Sadjadi sadjadi/Teaching](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d805503460f94a64128/html5/thumbnails/75.jpg)
75
GridWay Architecture
[Figure: GridWay architecture. Labels: DRMAA Library and CLI (job control operations); GridWay Core with Request Manager, Dispatch Manager, Information Manager, Execution Manager, Transfer Manager, Scheduler, Performance Monitor, Job pool, Host pool (matchmaking, execution and migration); GRAM, RFT, MDS; Resource (execution of jobs on LRM)]
![Page 76: 1 High-Performance Grid Computing and Research Networking Presented by Selim Kalayci Instructor: S. Masoud Sadjadi sadjadi/Teaching](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d805503460f94a64128/html5/thumbnails/76.jpg)
76
GridWay Modules
Request Manager
  Interfaces with client commands
Dispatch Manager
  Performs job scheduling
Information Manager
  Resource monitoring and data gathering
Execution Manager
  Executes job stages
Performance Monitor
  Evaluates the job performance
![Page 77: 1 High-Performance Grid Computing and Research Networking Presented by Selim Kalayci Instructor: S. Masoud Sadjadi sadjadi/Teaching](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d805503460f94a64128/html5/thumbnails/77.jpg)
77
Scheduling Strategy
Dispatch Manager wakes up at every scheduling interval
Uses Resource Selector to select the host(s) to submit the job to
Resource Selector interfaces with Grid Information Services, such as MDS
Resource Selector returns a candidate list of hosts for the job by using a policy script
You can implement your own policy script, so it is extensible
Dispatch Manager then submits the job to the Execution Manager
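One scheduling interval of the strategy above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not GridWay code: the function names, the dictionary fields (`free_cpus`, `cpus`), and the ranking rule are all hypothetical, and the pluggable `policy` argument stands in for the policy script.

```python
def rank_by_free_cpus(job, hosts):
    """Hypothetical policy: hosts with enough free CPUs, most free first."""
    eligible = [h for h in hosts if h["free_cpus"] >= job["cpus"]]
    return sorted(eligible, key=lambda h: h["free_cpus"], reverse=True)


def dispatch_once(pending_jobs, hosts, policy, submit):
    """Simulate one wake-up of the dispatcher.

    For each pending job, ask the policy for a candidate host list and
    hand the job to `submit` with the top candidate. Jobs with no
    candidate stay pending for the next interval.
    """
    still_pending = []
    for job in pending_jobs:
        candidates = policy(job, hosts)
        if candidates:
            chosen = candidates[0]
            submit(job, chosen)
            # Track the CPUs just handed out so later jobs see fewer free.
            chosen["free_cpus"] -= job["cpus"]
        else:
            still_pending.append(job)
    return still_pending


if __name__ == "__main__":
    hosts = [{"name": "a", "free_cpus": 2}, {"name": "b", "free_cpus": 8}]
    jobs = [{"id": 1, "cpus": 4}, {"id": 2, "cpus": 16}]
    left = dispatch_once(
        jobs, hosts, rank_by_free_cpus,
        lambda job, host: print(f"job {job['id']} -> host {host['name']}"),
    )
    print("still pending:", [job["id"] for job in left])
```

Passing the policy in as a function is the point of the sketch: swapping `rank_by_free_cpus` for another ranking changes host selection without touching the dispatch loop.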
![Page 78: 1 High-Performance Grid Computing and Research Networking Presented by Selim Kalayci Instructor: S. Masoud Sadjadi sadjadi/Teaching](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d805503460f94a64128/html5/thumbnails/78.jpg)
78
GridWay Commands
gwd - start the daemon
gwhost - information about resources
gwps - information about jobs
gwuser - information about users
gwsubmit - submits job
gwkill - cancels a job