WS-JDML: A Web Service Interface for Job Submission
and Monitoring
Stephen McGough
William Lee
London e-Science Centre
Department of Computing, Imperial College London
What Services do we need to make the Grid Work?
• One of the key services required is job submission
  – The ability to transparently submit a job to a resource (potentially through a DRM) where it will run
• Many DRM systems exist (Condor, Globus, SGE, etc.)
  – Each has its own way to define a job (language)
  – Each has its own submission mechanism (command line, API, service)
The Problem
• Submitting jobs requires
  – Knowledge of the job definition procedure
  – The ability to interface with the appropriate DRM
• The Solution
  – One common job description language that can be used with all resources (e.g. RSL)
  – A generic submission system for jobs
• Using community-based standards that are in common use
Generic Job Submission
[Diagram labels: Web Services, JDML, WS-JDML]
Web Service
• We are using a plain “Vanilla” Web Service
  – Don’t rely on any proposed WS standards
  – Don’t need anything more than core standards for this simple service
• Developed in Java
• Our work has been deployed into the J2EE enterprise platform
  – This enables
    • Scalability
    • Fault tolerance
Job Description Markup Language (JDML)
• Originally developed from Condor ClassAds
• Developed for the European DataGrid project
• Used within the Imperial College ICENI project
• This work is now feeding into the Global Grid Forum Job Submission Description Language standardisation work
• JDML will morph to become JSDL
JDML (2)
• JDML documents are written in sections
  – What job to run
  – The environment to run the job in
  – Where to get files from
  – Where to send files to at the end
• JDML is strongly typed
• Consists of name/value pairs
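As a rough illustration of the structure described on this slide, the Python sketch below models a job description as sections of typed name/value pairs. The section names, the term names and the `check_types` helper are all hypothetical; they are not taken from the JDML schema, which is XML based.

```python
# Hypothetical model of a sectioned, strongly typed job description.
# Section and term names are illustrative only, not real JDML terms.
JOB = {
    "job": {"executable": "/bin/echo", "arguments": ["hello"]},
    "environment": {"cpu_count": 1, "variables": {"LANG": "C"}},
    "stage_in": {"files": ["input.dat"]},
    "stage_out": {"files": ["output.dat"]},
}

# Expected Python type for each (section, name) pair.
SCHEMA = {
    ("job", "executable"): str,
    ("job", "arguments"): list,
    ("environment", "cpu_count"): int,
    ("environment", "variables"): dict,
    ("stage_in", "files"): list,
    ("stage_out", "files"): list,
}


def check_types(job, schema):
    """Return the (section, name) pairs whose value has the wrong type."""
    errors = []
    for section, terms in job.items():
        for name, value in terms.items():
            expected = schema.get((section, name))
            if expected is not None and not isinstance(value, expected):
                errors.append((section, name))
    return errors
```

Calling `check_types(JOB, SCHEMA)` on the well-formed description above yields an empty list; a description with `cpu_count` given as a string would be reported.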
JDML (3)
• Can have DRM-specific sections
  – It must be safe to ignore this section and the job still work correctly
  – Seen as a set of hints to the DRM
• File transfer is defined for multiple protocols
  – GridFTP, HTTP, copy, etc.
  – Each file may have multiple of these definitions
• DRM can select the appropriate ones to use
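The protocol selection described above might look like the following sketch, in which a DRM adapter picks the first transfer definition it can handle. The `select_transfer` function, the dictionary layout and the example URLs are assumptions made for illustration, not part of JDML.

```python
def select_transfer(definitions, supported):
    """Pick the first transfer definition whose protocol the DRM supports.

    Returns None when no definition is usable.
    """
    for definition in definitions:
        if definition["protocol"] in supported:
            return definition
    return None


# One file with several alternative transfer definitions (hypothetical URLs).
input_file = [
    {"protocol": "gridftp", "url": "gsiftp://example.org/data/input.dat"},
    {"protocol": "http", "url": "http://example.org/data/input.dat"},
]

# A DRM that cannot use GridFTP falls back to the HTTP definition.
chosen = select_transfer(input_file, supported={"http", "copy"})
```

Listing the definitions in order of preference lets the job author express a preferred protocol while still leaving the final choice to the DRM.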
WS-JDML Architecture
Job Submission Port Type
• Takes a JDML document describing the job to run
• Validates the JDML so that an immediate response can be given
• Validates user credentials, passed as part of the SOAP header, using WS-Security
• Job is then placed into a queue before being processed into a DRM-specific version and deployed locally
Job Submission Port Type (2)
• Various results
  – Unrecognised Job Term
    • The JDML contains some term that the service doesn’t understand
  – Invalid Job Term
    • The JDML has a term which has the wrong type or an invalid value
  – Successful Submission
    • URI to identify the job instance is returned
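The three submission outcomes can be sketched as follows. This is a minimal illustration, not the WS-JDML implementation: the term table, the exception classes, the `submit` function and the use of a `urn:uuid:` identifier are all assumptions, and credential checking is left out.

```python
import uuid

# Hypothetical table of terms the service understands, with expected types.
KNOWN_TERMS = {"executable": str, "arguments": list}


class UnrecognisedJobTerm(Exception):
    """The description contains a term the service does not understand."""


class InvalidJobTerm(Exception):
    """A known term has the wrong type or an invalid value."""


def submit(job, queue):
    """Validate a job description, enqueue it and return a job URI."""
    for name, value in job.items():
        if name not in KNOWN_TERMS:
            raise UnrecognisedJobTerm(name)
        if not isinstance(value, KNOWN_TERMS[name]):
            raise InvalidJobTerm(name)
    # Validation passed: the caller gets an identifier immediately,
    # while translation and deployment happen later from the queue.
    job_uri = "urn:uuid:" + str(uuid.uuid4())
    queue.append((job_uri, job))
    return job_uri
```

Validating before enqueueing is what allows the immediate response: a malformed description is rejected at once instead of failing later inside the DRM.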
Job Monitoring Port Type
• This port provides a means to observe the current status of a job and manipulate the output transfer mechanism
• Requires the URI representing a job provided from job submission
• Current job status is returned
  – pending, scheduled, running, suspended, done, exit
  – Not all DRMs support all states
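A client that polls the monitoring port could be sketched as below. The `get_status` callable stands in for the actual web service call and is hypothetical, as is `wait_for_job`; treating `done` and `exit` as the terminal states is also an assumption, since the slides list the states without defining them.

```python
import time
from enum import Enum


class JobStatus(Enum):
    PENDING = "pending"
    SCHEDULED = "scheduled"
    RUNNING = "running"
    SUSPENDED = "suspended"
    DONE = "done"
    EXIT = "exit"


# Assumption: a job in one of these states will not change state again.
TERMINAL = {JobStatus.DONE, JobStatus.EXIT}


def wait_for_job(get_status, job_uri, interval=0.0, max_polls=100):
    """Poll get_status(job_uri) until a terminal state or max_polls is hit."""
    status = None
    for _ in range(max_polls):
        status = get_status(job_uri)
        if status in TERMINAL:
            break
        time.sleep(interval)
    return status
```

Bounding the loop with `max_polls` keeps a client from waiting forever on a DRM that never reports a terminal state.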
Job Monitoring Port Type (2): File Transfers
• Port provides the ability to
  – Get portions of the files specified in the JDML transferred
  – Override the transfer methods given in the JDML
  – Indicate that files should be transferred back as attachments to the SOAP document
• Allows easy monitoring of the job progress
Deployment
• DRM-specific translators have been obtained from existing code within the ICENI project
  – These include Shell, SGE, Globus and Condor
• Web Service architecture has been deployed on the Java J2EE 1.4 platform
  – This provides a number of support features for the services
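To give a feel for what a DRM-specific translator does, the sketch below renders a generic job dictionary as a minimal Condor submit description. It is not the ICENI translator: the `to_condor_submit` function and the dictionary keys are hypothetical, and only the submit commands `universe`, `executable`, `arguments`, `output`, `error` and `queue` are taken from Condor itself.

```python
def to_condor_submit(job):
    """Render a minimal Condor submit description from a job dictionary."""
    lines = [
        "universe = vanilla",
        f"executable = {job['executable']}",
    ]
    if job.get("arguments"):
        # Simplification: arguments containing spaces would need quoting.
        lines.append("arguments = " + " ".join(job["arguments"]))
    lines += [
        f"output = {job.get('stdout', 'job.out')}",
        f"error = {job.get('stderr', 'job.err')}",
        "queue",
    ]
    return "\n".join(lines) + "\n"


text = to_condor_submit({"executable": "/bin/echo", "arguments": ["hello", "world"]})
```

A translator for another DRM would consume the same dictionary and emit that system's own format, which is the point of keeping the job description generic.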
Further Work
• Job State Transition
  – The ability to represent the status of a job running within a resource
• Notification
  – Currently, monitoring a job requires polling the monitoring port
    • It would be better to send notifications to a sink service through, say, WS-Notification
• Job Term Semantics
  – Definition of job terms using natural language
  – No formal model makes JDML transformation error prone
  – Develop an ontology for job submission terms
What do you use to build your service?
• Widely Implemented Standard Specification (1pt)
  – <Demonstrable multiple implementations, e.g. SOAP, WSDL>
• Implemented draft specification (2pt)
  – <Specification in standards body and supported by most/many companies. One/few implementations exist (e.g., WS-Security, BPEL)>
• Implemented draft specification (3pt)
  – <Specification in standards body but alternatives exist. Industry is divided. One/few implementations exist (e.g., transactions, coordination, notification, etc.)>
• Implemented proposal (4pt)
  – <An implementation of an idea, a proposal but not submitted to standards body yet (e.g., WS-Addressing, WS-Trust, etc.)>
• Non-implemented proposal (5pt)
  – <An idea that exists as a white paper, but no code and no specification details>
• Concept (6pt)
  – <An idea that exists only as PowerPoint slides!!>
• TOTAL: SOAP, WSDL, WS-Security = 3
Service Dependencies
• What else does your service depend on (i.e. external dependencies)?
  – RDBMS / J2EE EJBs
  – Logging (Java Logging)
  – Message Queue (JMS)
• What does your implementation depend on?
  – Java
  – J2EE 1.4 compliant
AAA & Security
• What authentication mechanism do you use?
  – WS-Security
• What authorisation mechanism do you use?
  – Flexible composition of authorisation plugins
• What accounting mechanism do you use?
  – Java logging
• Does service interaction need to be encrypted?
• If these are not used now, will they be in the future?
Exploiting the Service Architecture
• What features from your ‘plumbing’ do you use in your service?
  – Event notification
  – Meta-data
Service Activity
• Multiple interaction or single user?
  – Multiple
• Throughput (1 per day or 100 per second?)
• Typical data volume moved in
• Typical data volume moved out
Service Failure
• Required Reliability
  – Failure semantics?
    • Positive ack (might need WS-ReliableMessaging)
• Required Persistence
  – Job entered into the queue is always persisted
• Required Availability
  – One of many or unique requirement
Required Service Management
• Remote access to:
  – Usage statistics
  – Job progress
  – Job diagnostic and repair interfaces
Acknowledgements
• Director: Professor John Darlington
• Research Staff:
  – Anthony Mayer, Nathalie Furmento
  – Stephen McGough, James Stanton
  – Yong Xie, William Lee
  – Marko Krznaric, Murtaza Gulamali
  – Asif Saleem, Laurie Young, Gary Kong
• Contact:
  – http://www.lesc.ic.ac.uk/
  – e-mail: [email protected]