Enabling Grids for E-sciencE

KIAM¹ in GT4 Evaluation Activity and Grid Research

Pavel Berezovskiy

Dmitry Semyachkin

ARDA Meeting October 12, 2005 CERN

¹ Keldysh Institute for Applied Mathematics, Russian Academy of Sciences

KIAM RAS

Contents
• Evaluation focus
• Evaluation team
• GT4 general characteristics
  – Applicability
  – Architecture
  – Structure
  – Functionality
• GT4 vs. gLite
• Risk assessment
• GT4 WS Core
  – Architecture
  – Containers
  – Performance
  – Service creation
• GT4 WS GRAM
  – Architecture
  – GT4 WS GRAM vs. GT3 GRAM
  – Performance
• Prototyping the GT4 GRAM backend for gLite CE
• KIAM in Grid research

I. GT4 Evaluation Activity

Evaluation focus

• Focus:
  – evaluating the advantages of GT4 and comparing GT4 services with the basic gLite services.
• Main targets:
  – quality of the WS and WSRF implementation;
  – metrics of basic services;
  – more detailed investigation of core services (WS GRAM, RFT, etc.);
  – comparison of GT4 services with the corresponding core services in gLite.

Russian evaluation team

• Keldysh Institute for Applied Mathematics, Russian Academy of Sciences (WS Core, Execution Management):
  – P. Berezovskiy
  – E. Huhlaev
  – V. Kovalenko
  – D. Semyachkin
• Joint Institute for Nuclear Research, Dubna (Data Management):
  – V. Galaktionov
  – N. Kutovskiy
  – V. Pose
• Skobeltsyn Institute of Nuclear Physics, Moscow State University (Monitoring and Discovery, Security):
  – A. Demichev
  – A. Krukov
  – L. Shamardin

General characteristics

• Applicability: GT4 may be used
  – as independent Grid middleware;
  – together with other middleware packages or separate services built on the same basis, since GT4 relies on widely accepted protocols of the WS stack and is therefore highly interoperable.
• Architecture: GT4 implements the WS architecture, enriching it with
  – additional techniques for accessing, in a consistent and interoperable manner, data that must persist between service invocations;
  – simplified development of stateful applications that operate in multi-user, multi-session mode.

General characteristics

• Structure: GT4 consists of
  – WS Core: the runtime environment that hosts services;
  – a number of ready-to-use services;
  – development tools (tools for creating client programs and services, APIs for accessing GT4 services, user-level utilities for grid operation).
• Functionality: GT4 services cover four areas
  – Security
  – Data Management
  – Execution Management
  – Information Services

GT4 vs. gLite

• In each main area gLite has substantial services that are absent in GT4:
  – Security
      Virtual organization support (VOMS in gLite)
  – Execution Management
      Resource virtualization (Workload Management System)
      Support for the internal organization of computational resources: CE, reliability (Condor-C), logging and bookkeeping (L&B), job monitoring
  – Data Management
      Support for the internal organization of storage resources: SE, SRM, file catalogue (FireMan)
• On the other hand, gLite incorporates some components from GT, though from a previous version (GT2):
  – Grid Security Infrastructure
  – GridFTP
  – Gatekeeper

Risk assessment

• GT4 was released recently, and some of its components are still under development
• GT4 includes three WS Core implementations (C, Java, Python), but at present all GT4 services are developed for the Java WS Core only, which performs worse than the C container
• GT4 doesn't implement the whole stack of WS specifications (Transfer, Messaging, Quality of Service, Business Logic)
• The WS stack is not yet mature enough to be fully stable
• There are alternatives to the WSRF approach to managing state, for example WS-GAF

GT4 WS Core architecture

• The WS Core architecture separates the common request-processing logic from the implementation code of services. The common part is extensible and consists of:
  – a suite of handlers (security, dispatch)
  – other mechanisms (Lifetime Management, Notification Producer/Consumer, etc.)
• GT4 is based on the following WS specifications:
  – WS architecture (XML, SOAP, WSDL)
  – WS-Security, etc.
  – WS-Addressing, WSRF, and WS-Notification, which are used to define, name and interact with stateful resources
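The separation between common request processing and service code can be illustrated with a toy handler chain. This is only a sketch under stated assumptions: the `Container` class, the handler signature and the request fields are hypothetical and are not the GT4 WS Core API.

```python
from typing import Callable, Dict, List

Request = Dict[str, str]
Handler = Callable[[Request], Request]


def security_handler(request: Request) -> Request:
    # Hypothetical check: reject requests that carry no caller identity.
    if not request.get("caller"):
        raise PermissionError("unauthenticated request")
    return request


class Container:
    """Toy container: common handlers run first, then the target service."""

    def __init__(self, handlers: List[Handler]) -> None:
        self.handlers = handlers
        self.services: Dict[str, Callable[[Request], str]] = {}

    def deploy(self, name: str, service: Callable[[Request], str]) -> None:
        self.services[name] = service

    def invoke(self, request: Request) -> str:
        # Common part: every request passes through the same handlers.
        for handler in self.handlers:
            request = handler(request)
        # Dispatch step: route the request to the service it names.
        return self.services[request["service"]](request)


def echo_service(request: Request) -> str:
    # Service code contains no security or dispatch logic of its own.
    return "echo: " + request["body"]


container = Container([security_handler])
container.deploy("Echo", echo_service)
reply = container.invoke({"caller": "alice", "service": "Echo", "body": "hi"})
```

Adding another handler to the list extends the common part without touching any service.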

GT4 WS Core containers

• GT4 has three types of WS containers, which differ in performance, supported service implementation languages, security support, etc.:
  – Java WS Core
  – C WS Core
  – Python WS Core
• Other WSRF implementations exist:
  – WSRF::Lite
  – WSRF.NET
• Importantly, all GT4 ready-to-use services are implemented only for the Java WS Core

WSRF implementations comparison

• Security
  – TLS/SSL transport-level security protocol, Secure Conversation, Secure Message
  – WSRF::Lite supports only transport-level security
• Persistence
  – WS-Resources are persisted in memory (by default)
  – WSRF.NET uses a database (by default)
• Lifetime Management
  – resource creation and destruction
  – GT4 Java doesn't define a specific create() operation
• WS-Notification
  – only WSRF.NET implements all WSN specifications
• Authorization
  – GT4 Java, pyGridWare and WSRF.NET define an authorization callout

GT4 container performance

Globus Team results
• The table compares five WSRF containers. The values are the average time, in milliseconds, for a client to invoke WSRF services.

Operation   GT4 Java, ms    GT4 C, ms     pyGridWare, ms   WSRF::Lite, ms   WSRF.NET, ms
GetRP       182.66/181.96   15.77/14.77   139.65/140.50    N/A              82.4/81.39
SetRP       182.47/182.04   15.88/14.99   140.74/142.21    N/A              81.84/82.48
CreateR     188.21/188.46   15.95/14.98   133.41/132.26    N/A              96.88/96.22
DestroyR    182.47/182.03   17.10/15.76   137.12/136.12    N/A              86.42/86.89
Notify      221.28/219.51   N/A           152.10/244.93    N/A              100.01/101.57

• The GT4 Java container performs worse than the script-based pyGridWare, and GT4 C outperforms GT4 Java by up to 10 times.
• From the data in the table it is possible to estimate a lower bound on the Java container's maximum throughput:
  – simple operation (GetRP): 6 invocations per second;
  – 4.5 notifications per second.
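The lower bound on throughput follows from the measured latency: a single client issuing calls back to back completes at most 1000 / latency_ms invocations per second. The sketch below applies that arithmetic to the first value in each GT4 Java cell of the table; the function name is an assumption made for illustration.

```python
def sequential_throughput(latency_ms: float) -> float:
    """Invocations per second for one client issuing calls back to back."""
    return 1000.0 / latency_ms


# First value of each GT4 Java cell in the table above.
get_rp_rate = sequential_throughput(182.66)   # GetRP
notify_rate = sequential_throughput(221.28)   # Notify

# Ratio of GT4 Java to GT4 C latency for GetRP.
c_vs_java = 182.66 / 15.77

print(f"GetRP:  {get_rp_rate:.2f} invocations/s")
print(f"Notify: {notify_rate:.2f} notifications/s")
print(f"GT4 C is about {c_vs_java:.1f}x faster than GT4 Java on GetRP")
```

The results are close to the rounded figures quoted on the slide and confirm a gap of roughly an order of magnitude between the C and Java containers.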

GT4 container performance

KIAM results
• For the same scenario (non-WSRF GetRP) with a dummy service we obtained:
  – 84.97 ms per invocation;
  – approximately 12 invocations per second.
• More accurate data are available from the GT3 evaluation (scenario with multiple concurrent invocations):
  – maximum throughput was estimated at 1.3 services per second;
  – 10–15 notifications per second.
• We have estimated the overhead of the GT4 Java container (i.e. the part common to any service):
  – 2.60 ms (container without security);
  – 55.97 ms (container with security, X.509 signing).
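A per-invocation figure of this kind is typically obtained by timing many repeated calls and dividing by the count. The harness below is a minimal sketch of that idea, not the tool used for the measurements above; `dummy_service` is a local stand-in, whereas a real test would call a deployed service over the network.

```python
import time
from typing import Callable


def mean_latency_ms(call: Callable[[], object], repetitions: int = 100) -> float:
    """Average wall-clock time of one call, in milliseconds."""
    start = time.perf_counter()
    for _ in range(repetitions):
        call()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / repetitions


def dummy_service() -> int:
    # Stand-in for a remote invocation; does no real work.
    return 42


latency = mean_latency_ms(dummy_service)

# Difference between the two container overheads reported on the slide:
# the extra cost attributable to security (X.509 signing).
security_cost_ms = 55.97 - 2.60
```

With the slide's numbers, security accounts for about 53 ms of the 55.97 ms overhead.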

Service creation

• There are three methods of creating a Web service:
  – Using the java2wsdl and wsdl2java tools, starting from the Java interface of the service
      universal and the most popular way to design services that do not use WSRF
      doesn't support WS-Resource
  – GT4's approach (The GT4 Programmer's Tutorial by Borja Sotomayor)
      general method for all types of services
      requires knowledge of SOAP, WSDL, WSDD, etc.
  – Matthew Smith's and Bernd Freisleben's approach
      uses a special library and other additional tools that make service creation easier
• We propose that service creation should start from the service interface rather than from the WSDL description, because WSDL is a machine-oriented language that is very difficult for humans to work with. The third method is therefore more suitable, but at the moment it supports only services and clients written in Java
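The interface-first idea can be shown in miniature: the developer writes only an interface, and a machine-readable description is derived from it by introspection. The sketch below is a Python analogy using the standard `inspect` module; `CounterService` and `describe` are hypothetical names, and the output is a plain dictionary rather than WSDL.

```python
import inspect
from abc import ABC, abstractmethod
from typing import Dict, List


class CounterService(ABC):
    """Hypothetical service interface written by the developer."""

    @abstractmethod
    def add(self, amount: int) -> int: ...

    @abstractmethod
    def value(self) -> int: ...


def describe(interface: type) -> Dict[str, List[str]]:
    """Derive a simple operation -> parameter-names description."""
    operations: Dict[str, List[str]] = {}
    for name, member in inspect.getmembers(interface, inspect.isfunction):
        if name.startswith("_"):
            continue  # skip private and special methods
        params = [p for p in inspect.signature(member).parameters if p != "self"]
        operations[name] = params
    return operations


description = describe(CounterService)
```

The human-readable interface remains the single source of truth, and the generated description never has to be edited by hand.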

GT4 WS GRAM Architecture

• GRAM is intended for secure submission, monitoring and control of jobs on remote computing resources, with coordinated file staging
• WS GRAM is implemented as a suite of web services consistent with the WSRF model (running in the GT4 Java container):
  – ManagedJobFactory
  – ManagedJob
• The WS GRAM suite also uses other GT4 components:
  – Delegation service
  – Reliable File Transfer (RFT)
  – Workspace Management Service (optional)
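The factory/resource split behind ManagedJobFactory and ManagedJob can be sketched as follows. Only the two class names come from the slide; the methods, states and the in-memory registry are assumptions made for illustration and do not reproduce the WS GRAM interfaces.

```python
import itertools
from typing import Dict


class ManagedJob:
    """Stateful resource: one instance per submitted job."""

    def __init__(self, job_id: str, executable: str) -> None:
        self.job_id = job_id
        self.executable = executable
        self.state = "Pending"      # hypothetical state name

    def destroy(self) -> None:
        self.state = "Destroyed"    # hypothetical state name


class ManagedJobFactory:
    """Creates job resources and hands back a key that identifies them."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self.jobs: Dict[str, ManagedJob] = {}

    def create_managed_job(self, executable: str) -> str:
        job_id = f"job-{next(self._ids)}"
        self.jobs[job_id] = ManagedJob(job_id, executable)
        return job_id  # stands in for a WS-Addressing endpoint reference

    def lookup(self, job_id: str) -> ManagedJob:
        return self.jobs[job_id]


factory = ManagedJobFactory()
key = factory.create_managed_job("/bin/hostname")
job = factory.lookup(key)
```

The client talks to the factory once to create the job and afterwards addresses the individual job resource through the returned key.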

GT4 WS GRAM vs. GT3 GRAM

• GT4 developers abandoned the GT3 scheme of a personal Java container for each user, which was highly secure but not scalable and hard to manage. The GT4 solution is more scalable, although less secure. As a result, GT4 performance is significantly higher than GT3 performance.
• The Delegation service replaces the implicit delegation of previous GRAM solutions with explicit service operations, by which the client delegates credentials for use by the RFT service or for transfer to submitted jobs.
• Integration with RFT replaces the legacy GASS data transfer protocol and provides a much more robust file staging manager than the ad hoc solution of previous GRAM versions.
• GT4 WS GRAM uses sudo to run the scheduler adapter and to write delegated credentials into the user account context. This mechanism replaces the root-privileged gatekeeper of pre-WS GRAM, so that the entire Java container does not have to run as root.
• A new service, the Workspace Service (WSS), allows a Grid client to dynamically create and manage a workspace (Unix account) on a remote site.

GT4 WS GRAM performance

• KIAM results:
  – response time: 1.75 ± 0.1 s (sequential job submissions)
  – max throughput: 79 jobs/min (concurrent job submissions)
• Globus team results:
  – max throughput: 74 jobs/min (without delegation and staging)
  – maximum concurrent job submissions: 8,000 long-running jobs, achieved with no failures
  – maximum concurrent job submissions to the Condor scheduler: 32,000 jobs (due to a system limit)
  – maximum sequential job submissions: 500,000+ jobs submitted over 23 days without a container crash
• GT3 GRAM:
  – max throughput: 3.8 jobs/min
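To put these figures side by side, the short calculation below derives two ratios from the numbers above. It is plain arithmetic on the slide's values; the variable names are chosen here for illustration.

```python
kiam_gt4_jobs_per_min = 79.0      # KIAM, concurrent submissions
gt3_jobs_per_min = 3.8            # GT3 GRAM max throughput
response_time_s = 1.75            # KIAM, sequential submissions

# How many times higher the GT4 throughput is compared with GT3.
speedup_over_gt3 = kiam_gt4_jobs_per_min / gt3_jobs_per_min

# Throughput a single client reaches when it submits strictly one job at a time.
sequential_jobs_per_min = 60.0 / response_time_s

print(f"GT4 vs GT3 throughput: about {speedup_over_gt3:.1f}x")
print(f"Sequential submission: about {sequential_jobs_per_min:.1f} jobs/min")
```

Concurrent submission thus roughly doubles the rate achievable by a single sequential client, and GT4 WS GRAM is about twenty times faster than GT3 GRAM on this metric.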

GT4 GRAM as a backend for gLite CE

• We propose to improve the current gLite CE using GT4 GRAM, the Workspace Service (WSS) and a modified Condor (with WSS support), to investigate the interoperability of these components, and to prototype a GT4 GRAM backend for a future gLite CE.
• This activity is divided into two steps:
  – Submitting jobs from Condor-C to GT4 GRAM directly, without involving WSS. The aim is to check whether a job can be submitted via Condor-C to GT4 GRAM so that it is executed in PBS under another user account.
  – Submitting jobs from Condor-C to GT4 GRAM using WSS. The aim is to prototype the presented scheme using dynamic account allocation (WSS).

First step of prototyping

• Step 1. The job is submitted from the submission host (RB) via Condor-C to the execution host (CE). The job executes under another user (the user DN is mapped to a local UID according to the grid-mapfile on the execution host).

[Diagram: RB (condor_submit, Condor-C client) and CE (Condor-C server, GAHP, GT4 GRAM, PBS); DN -> UID mapping on the CE.]

• Result. The configuration works (with some issues and restrictions):
  – some technical issues (solved with the Globus Team's assistance)
  – a GridFTP server has to be running on both the execution and submission hosts
  – the configuration works only when the execution and submission hosts are different hosts; otherwise there are problems with file staging
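The DN-to-account mapping used in Step 1 can be illustrated with a small grid-mapfile reader. A grid-mapfile line holds a quoted certificate DN followed by a local account name (or a comma-separated list of names); the account is what the operating system resolves to a UID. The sketch below assumes that simple layout, and the sample entries are invented.

```python
import shlex
from typing import Dict


def parse_grid_mapfile(text: str) -> Dict[str, str]:
    """Map each DN to the first local account listed for it."""
    mapping: Dict[str, str] = {}
    for line in text.splitlines():
        # shlex handles the quoted DN and drops '#' comments.
        fields = shlex.split(line, comments=True)
        if len(fields) < 2:
            continue  # blank line, comment, or malformed entry
        dn, accounts = fields[0], fields[1]
        mapping[dn] = accounts.split(",")[0]
    return mapping


# Hypothetical entries, for illustration only.
SAMPLE = '''
# DN                                      local account(s)
"/C=RU/O=Example Grid/CN=Ivan Petrov"     ipetrov
"/C=RU/O=Example Grid/CN=Anna Sidorova"   grid001,grid002
'''

accounts = parse_grid_mapfile(SAMPLE)
```

A job arriving with the first DN would run under the local account `ipetrov`.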

Second step of prototyping

• Step 2. The job is submitted from the submission host (RB) via Condor-C to the execution host (CE). The Condor server obtains a dynamic account from the Workspace Service using the user's DN. Condor then submits the job to GT4 GRAM with the acquired account (as in Step 1). For security reasons, GRAM should check that the obtained account is valid before putting the job into PBS.

[Diagram: RB (condor_submit, Condor-C client) and CE (Condor-C server, GAHP, GT4 GRAM, PBS, WSS, LCMAPS). Arrow labels: get_account (DN), check_account, account, UID, DN, UID.]
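The intended Step 2 flow (obtain an account by DN, then have GRAM verify it before queueing the job) can be sketched with a toy in-memory workspace service. The names `get_account` and `check_account` are taken from the diagram labels; everything else, including the account pool and the queue, is a hypothetical stand-in and not the WSS or LCMAPS interface.

```python
from typing import Dict, List, Tuple


class WorkspaceService:
    """Toy stand-in for WSS: leases pool accounts to user DNs."""

    def __init__(self, pool: List[str]) -> None:
        self._free = list(pool)
        self._leases: Dict[str, str] = {}

    def get_account(self, dn: str) -> str:
        # Reuse an existing lease, otherwise take the next free account.
        if dn not in self._leases:
            if not self._free:
                raise RuntimeError("no free accounts in the pool")
            self._leases[dn] = self._free.pop(0)
        return self._leases[dn]

    def check_account(self, dn: str, account: str) -> bool:
        return self._leases.get(dn) == account


def submit(wss: WorkspaceService, dn: str, account: str,
           queue: List[Tuple[str, str]]) -> None:
    """GRAM-side check: queue the job only if the account really belongs to the DN."""
    if not wss.check_account(dn, account):
        raise PermissionError(f"account {account!r} is not leased to {dn!r}")
    queue.append((account, dn))


wss = WorkspaceService(["grid001", "grid002"])
dn = "/C=RU/O=Example Grid/CN=Ivan Petrov"   # invented DN
account = wss.get_account(dn)                # done by the Condor server
pbs_queue: List[Tuple[str, str]] = []
submit(wss, dn, account, pbs_queue)          # done by GRAM
```

The check on the GRAM side means that a client cannot simply claim an arbitrary account name when submitting.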

Second step of prototyping

• This configuration does not work properly, because Condor-C with WSS support has not been released yet. There are also some technical issues with the WSS configuration that remain to be solved.
• In our opinion the presented configuration looks good but needs to be improved in the following ways:
  – Condor with WSS support needs to be implemented.
  – Communication between WSS and LCMAPS should be clearer and better documented.
  – LCMAPS (or another WSS backend) must not affect other services (GridFTP, etc.).

Future work

• Once the full setup is functional, we need to test the components for:
  – performance
  – throughput
  – scalability
• Investigate the interoperability of the components

II. Grid Research

History of KIAM team

– 1998: the Grid team started its activity
– 1999: a Grid testbed combining the institute's resources was created
– 2001: MetaDispatcher was developed
– 2004: KIAM joined the EGEE project
– 2004: the Grid portal GRIDCLUB.RU was opened
– 2004: the first version of GridDispatcher was released

Direction of research

• The main direction of our research is resource management.
• In this area, several implementations of Broker Scheduling services have appeared during the last year.
• Existing systems have several disadvantages:
  – they do not guarantee that a job is executed within a determined time;
  – they do not allow the user to control the execution time of each of his jobs.
• We offer a resource management model, based on scheduling, that is free of these weaknesses.
• Simple methods (direct resource allocation) are not effective for resource management, so scheduling methods are required.

Requirements

In our model the scheduling system has to satisfy the following requirements:

– “Fair share” of resources between Grid users, in order to prevent resource monopolization.
– The resource owner is able to control the amount of resources devoted to the Grid (shared resources with a quota mechanism).
– The user is able to control the execution time of his job (for example, by paying a higher price).
– The execution of a job must be guaranteed: a job must not stay in the queue indefinitely.
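Two of these requirements, priority by price and guaranteed execution, are commonly combined through priority aging: a job's effective priority grows while it waits, so even a cheap job eventually reaches the head of the queue. The sketch below shows that generic technique; it is not the GridDispatcher algorithm, it does not cover fair share or quotas, and the `Job` fields and `aging_rate` parameter are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    name: str
    price: float     # what the user pays; higher means more urgent
    submitted: int   # submission time, in scheduler ticks


def effective_priority(job: Job, now: int, aging_rate: float = 1.0) -> float:
    """Price plus a bonus that grows with waiting time."""
    return job.price + aging_rate * (now - job.submitted)


def pick_next(queue: List[Job], now: int) -> Job:
    """Choose the queued job with the highest effective priority."""
    return max(queue, key=lambda job: effective_priority(job, now))


# Both jobs submitted together: the better-paid one goes first.
same_time = [Job("cheap", 1.0, 0), Job("paid", 5.0, 0)]
first = pick_next(same_time, now=0)

# The cheap job has waited 10 ticks: aging lets it overtake a new paid job.
aged = [Job("cheap", 1.0, 0), Job("paid", 5.0, 10)]
second = pick_next(aged, now=10)
```

Because the waiting bonus is unbounded, no job can be postponed forever by a stream of better-paid arrivals.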

Scheduling methods

Different forms of Grid require different scheduling methods.

We consider the following forms:
– By resource organization
    Clusters
    Individual machines
– By manner of using resources
    Alienable resources: resources are used only for Grid purposes
    Shared resources: resources are shared between the owner and Grid users
– By type of jobs
    Simple jobs: one job per processor
    Multiprocessor jobs
    Serialized jobs: a small job with a large amount of input data. The data may be divided into several blocks, and every block can be processed separately.
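A serialized job, as defined above, amounts to splitting the input into blocks, processing each block independently and combining the partial results. The sketch below illustrates this on a local thread pool; the block size, the worker count and the `process_block` function are placeholders, and on a Grid each block would go to a separate resource.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def split_into_blocks(data: List[int], block_size: int) -> List[List[int]]:
    """Cut the input into consecutive blocks of at most block_size items."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def process_block(block: List[int]) -> int:
    # Placeholder for the small job that is run on every block.
    return sum(x * x for x in block)


data = list(range(10))
blocks = split_into_blocks(data, block_size=4)

# Each block is processed independently; results are combined at the end.
with ThreadPoolExecutor(max_workers=3) as pool:
    partial_results = list(pool.map(process_block, blocks))

total = sum(partial_results)
```

Since the blocks do not depend on each other, the scheduler is free to place them on any available machines.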

Current results

The described scheduling model is implemented in our developments:

• Pilot project MetaDispatcher (2001)
  – Centralized dispatching without planning
• First version of GridDispatcher (2004)
  – Global scheduling based on a prioritized job queue
• Current work on the second version of GridDispatcher
  – Multi-processor jobs
  – Serialized jobs

GridDispatcher v1.0

• Resource organization
  – Clusters
• Manner of using resources
  – Shared resources
• Type of jobs
  – Simple jobs (one job per processor)
• Features
  – Advanced scheduling algorithm based on local schedules and a prioritized global job queue
  – One-step forecast
  – Improved, planning-oriented Information Service based on an RDBMS
  – No resource reservation

GridDispatcher v2.0

• Resource organization
  – Clusters
• Manner of using resources
  – Shared resources
• Type of jobs
  – Multi-processor jobs
• Features
  – Advanced scheduling method based on the Backfill algorithm
  – Multiple-step forecast
  – Resource reservation
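For readers unfamiliar with backfilling, the sketch below shows one common textbook variant (often called EASY backfill): the first job that does not fit receives a reservation, and later jobs may start immediately only if they do not delay that reservation. It is a generic illustration under simplifying assumptions (exact runtimes known, a single cluster, head job no larger than the cluster) and says nothing about how GridDispatcher v2.0 implements its method.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Job:
    name: str
    procs: int     # processors required
    runtime: int   # assumed exact runtime, in ticks


def backfill(now: int, total_procs: int,
             running: List[Tuple[int, int]], queue: List[Job]) -> List[str]:
    """Return the names of queued jobs to start at time `now`.

    `running` holds (end_time, procs) for jobs already executing.
    """
    running = list(running)
    free = total_procs - sum(procs for _, procs in running)
    pending = list(queue)
    started: List[str] = []

    # Phase 1: start jobs in queue order while they fit.
    while pending and pending[0].procs <= free:
        job = pending.pop(0)
        free -= job.procs
        running.append((now + job.runtime, job.procs))
        started.append(job.name)
    if not pending:
        return started

    # Phase 2: reserve the earliest start ("shadow time") for the head job.
    head = pending[0]
    available, shadow = free, now
    for end_time, procs in sorted(running):
        available += procs
        shadow = end_time
        if available >= head.procs:
            break
    extra = available - head.procs  # processors the head job will not need

    # Phase 3: backfill later jobs that cannot delay the reservation.
    for job in pending[1:]:
        if job.procs > free:
            continue
        if now + job.runtime <= shadow:
            pass                    # finishes before the head job starts
        elif job.procs <= extra:
            extra -= job.procs      # runs on processors the head job leaves idle
        else:
            continue
        free -= job.procs
        started.append(job.name)
    return started


# 8 processors, 6 busy until t=10; job A (4 procs) must wait, B can slip in.
plan = backfill(now=0, total_procs=8, running=[(10, 6)],
                queue=[Job("A", 4, 5), Job("B", 2, 5), Job("C", 2, 20)])
```

In this example job B is started ahead of job A because it finishes before A's reserved start time, while C is held back because B has taken the free processors.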

Future plans

• GridDispatcher improvements:
  – Resource reservation mechanism
  – No job type restriction
      Additional support for multi-processor jobs
• Designing scheduling methods for other relevant Grid forms:
  – Grid consisting of clusters, resources shared with their owners, multi-processor jobs
  – Individual machines, shared resources, serialized jobs