Page 1: Grid developments and  middleware components Mike Mineter EGEE Training team mjm@nesc.ac.uk

EGEE is a project funded by the European Union under contract IST-2003-508833

Grid developments and middleware components

Mike Mineter, EGEE Training team

mjm@nesc.ac.uk

http://egee-intranet.web.cern.ch

Grid developments and middleware components, 19 July 04 - 2

Acknowledgements

This presentation for the GGF Summer School, 2004, was prepared by the NeSC Edinburgh training team. It includes slides and information from many sources:

• Roberto Barbera (slides on middleware are based on presentations given in Edinburgh, April 2004)

• Malcolm Atkinson and Ian Bird (Sites in LCG-2/EGEE-0 at GGF-11)

• Other colleagues in EGEE (project overview slides)

• The European DataGrid training team

• Authors of the LCG-2 User Guide v. 2.0: Antonio Delgado Peris, Patricia Méndez Lorenzo, Flavia Donno, Andrea Sciabà, Simone Campana, Roberto Santinelli https://edms.cern.ch/file/454439//LCG-2-UserGuide.html

Outline

• Grid developments from an EGEE perspective:
    • Creating e-Infrastructure
    • Building on and with other Grid projects
    • Towards service-orientation
    • Establishing a “production Grid”

• Overview of the middleware of the current EGEE-0 system
    • Major components
    • Lifecycle of a job

• Summary

Towards a European e-Infrastructure

• To underpin European science and technology in the service of society

• To link with and build on
    • National, regional and international initiatives
    • Emerging technologies (e.g. fibre optic networks)

• To foster international cooperation both in the creation and the use of the e-infrastructure

[Figure labels: Network infrastructure (GÉANT); Operations, Support and training; Collaboration; Pan-European Grid]

In 2 years EGEE will:

• Establish production quality sustained Grid services
    • 3000 users from at least 5 disciplines
    • over 20,000 CPUs, 50 sites
    • over 5 Petabytes (10^15) storage

• Demonstrate a viable general process to bring other scientific communities on board

• Spend 32 Million Euros - started April 2004
    • 70 institutions in 27 countries

• Propose a second phase in mid 2005 to take over EGEE in early 2006

[Figure labels: Initial; New]

EGEE will:

• Establish production quality sustained Grid services
    • 3000 users from at least 5 disciplines
    • over 20,000 CPUs

• Demonstrate a viable general process to bring other scientific communities on board

• Spend 32 Million Euros over 2 years starting April 2004
    • 70 institutions in 28 countries

• Propose a second phase in mid 2005 to take over EGEE in early 2006

[Figure label: Initial domains]

EGEE view of history

[Figure: timeline of Grid projects from 2001 to 2004 and beyond. Labels: Globus, MyProxy, Condor, ...; VDT; DataTAG; EDG; AliEn, CrossGrid, ...; SRM; LCG; EGEE; NextGrid, GridCC, …; Future e-Infrastructure; Used in USA, EU]

Service orientation: building EGEE-1

• “gLite” - the new EGEE middleware (under test)

• Service oriented - components that are:
    • Loosely coupled (by messages – examples tomorrow)
    • Accessible across network; modular and self-contained; clean modes of failure
    • So can change implementation without changing interfaces
    • Can be developed in anticipation of new uses

• … and are based on standards. Opens EGEE to:
    • New middleware (plethora of tools now available)
    • Heterogeneous resources (storage, computation…)
    • Interact with other Grids (international, regional and national)
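The point about changing an implementation without changing its interface can be illustrated with a small sketch. Everything here is hypothetical (the `FileStore` interface, both store classes and `copy_file` are invented for illustration and are not part of gLite); it only shows that a caller written against an interface keeps working when the implementation behind it is swapped.

```python
import zlib
from typing import Protocol


class FileStore(Protocol):
    """Hypothetical service interface: callers rely only on these two methods."""

    def put(self, name: str, data: bytes) -> None: ...

    def get(self, name: str) -> bytes: ...


class PlainStore:
    """One implementation: keeps the bytes unchanged in a dictionary."""

    def __init__(self) -> None:
        self._files: dict[str, bytes] = {}

    def put(self, name: str, data: bytes) -> None:
        self._files[name] = data

    def get(self, name: str) -> bytes:
        return self._files[name]


class CompressedStore:
    """A second implementation: same interface, different internals."""

    def __init__(self) -> None:
        self._files: dict[str, bytes] = {}

    def put(self, name: str, data: bytes) -> None:
        self._files[name] = zlib.compress(data)

    def get(self, name: str) -> bytes:
        return zlib.decompress(self._files[name])


def copy_file(src: FileStore, dst: FileStore, name: str) -> None:
    """Client code that knows only the interface, not the implementation."""
    dst.put(name, src.get(name))
```

Here `copy_file` works unchanged whichever store is passed in, which is the property the slide attributes to service-oriented components.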

LCG and EGEE

• LCG: Large Hadron Collider Computing Grid

• LCG infrastructure running LCG-2 is “EGEE-0”

• In parallel producing new web-service-oriented middleware (“gLite”)

• Will replace LCG-2 as production facility in 2005

• New major releases each year

[Figure labels: LCG-1, LCG-2, EGEE-1, EGEE-2; Globus 2 based; Web services based]

Sites in LCG-2/EGEE-0 : June 4 2004

http://goc.grid-support.ac.uk/gppmonWorld/gppmon_maps/CERN_lxn1188.html

• 22 Countries

• 58 Sites (45 Europe, 2 US, 5 Canada, 5 Asia, 1 HP)

• Coming: New Zealand, China, other HP (Brazil, Singapore)

• 3800 CPUs

Operations Infrastructure

• A lot more than middleware!!

• 40% of EGEE budget

• 10 ROCs (Regional Operations Centres): coordinate deployment & operation; tasks include:
    • First point of contact for all new sites, new users, and user support
    • Issue certificates
    • Negotiate policies with resource providers

• 5 CICs (Core Infrastructure Centres): tasks include provision of VO services, core Grid services (RBs, UIs, database services, BDIIs)

EGEE: adding a VO

EGEE has a formal procedure for adding selected new user communities:

• Negotiation with one of the Regional Operations Centres

• Seek balance between the resources contributed by a VO and those that they consume.

• Resource allocation will be made at the VO level.

• Many resources need to be available to multiple VOs: shared use of resources is fundamental to a Grid

Story so far: themes illustrated by EGEE

• e-Infrastructure
    • Integrating networks, grids and emerging technologies
    • Based on standards
    • Underpinning research, industry, … the “knowledge economy”

• International, collaborative effort

• Moving to a Service Orientated Architecture

• Focus: Production grids for multiple VOs
    • Demands massive effort in organisation and administration:
        • Operations
        • Support
        • Training

User-view of EGEE: a multi-VO Grid

[Figure: User Interface nodes connected to Grid services]

Middleware components

[Figure: middleware components and the flows between them. Components: “User interface”; Resource Broker; Information Service; Replica Catalogue; Logging & Book-keeping; Computing Element; Storage Element. Flow labels: Job Submit Event; Job Query; Job Status; Input “sandbox”; Input “sandbox” + Broker Info; Output “sandbox”; DataSets info; Author. & Authen.; Publish; SE & CE info]

Workload Management System (WMS)

• Distributed scheduling
    • multiple UIs where you submit your job
    • multiple RBs from where the job is sent to a CE
    • multiple CEs where the job can be put in a queuing system

• Distributed resource management
    • multiple information systems that monitor the state of the grid
    • Information from SE, CE, sites

Authentication, Authorisation

• Authentication
    • User obtains certificate from CA
    • Connects to UI by ssh
    • Downloads certificate
    • Invokes Proxy server
    • Single logon – to UI - then Secure Socket Layer with proxy identifies user to other nodes

• Authorisation - currently
    • User joins Virtual Organisation
    • VO negotiates access to Grid nodes and resources (CE, SE)
    • Authorisation tested by CE, SE: gridmapfile maps user to local account

[Figure labels: UI; CA; VO mgr; Personal; VO database; VO service; Gridmapfiles on CE, SE nodes; SSL (proxy)]
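The authorisation step, in which a gridmapfile maps a user to a local account, can be sketched as a simple lookup. This is a minimal illustration, assuming the commonly used one-entry-per-line layout of a quoted certificate subject followed by a local account name; the sample subjects, account names and function names are hypothetical.

```python
import shlex
from typing import Optional

# Hypothetical entries, for illustration only.
SAMPLE_GRIDMAP = '''
# certificate subject                              local account
"/C=UK/O=eScience/OU=Edinburgh/CN=example user" gridusr01
"/C=CH/O=CERN/CN=another user" gridusr02
'''


def parse_gridmap(text: str) -> dict[str, str]:
    """Build a subject -> local account table from gridmapfile-style text."""
    mapping: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # ignore blank lines and comments
        parts = shlex.split(line)  # honours the quotes around the subject
        if len(parts) != 2:
            continue  # skip malformed lines rather than guess
        subject, account = parts
        mapping[subject] = account
    return mapping


def authorise(subject: str, mapping: dict[str, str]) -> Optional[str]:
    """Return the local account for a subject, or None if it is not listed."""
    return mapping.get(subject)
```

A CE or SE would refuse a request whose subject yields `None`; VO-wide entries and pooled accounts are not modelled in this sketch.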

User Interface node

• The user’s interface to the Grid

• Command-line interface to
    • Proxy server
    • Job operations
        • To submit a job
        • Monitor its status
        • Retrieve output
    • Data operations
        • Upload file to SE
        • Create replica
        • Discover replicas
    • Other grid services

• Also C++ and Java APIs

• To run a job user creates a JDL (Job Description Language) file

[Figure labels: UI; JDL]

“Compute element” in LCG-2

A CE is a grid batch queue with a “grid gate” front-end:

[Figure: compute element. Labels: Job request; Grid gate node (Globus gatekeeper, gridmapfile, I.S., Logging); Info system; Logging; Local resource management system: Condor / PBS / LSF master; Homogeneous set of worker nodes]

Storage elements and files

• Storage elements hold files: write once, read many

• Replica files can be held on different SE: “close” to CE; share load on SE

• Replica Catalogue - what replicas exist for a file?

• Replica Location Service - where are they?
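The two questions on this slide (what replicas exist for a file, and where they are) can be sketched with two lookup tables. This is a simplified illustration under stated assumptions: the catalogue contents, host names, the `CLOSE_SE` table and both functions are hypothetical, and the real Replica Catalogue and Replica Location Service are separate services with richer interfaces.

```python
from typing import Optional

# Hypothetical catalogue: logical file name -> storage elements holding a replica.
REPLICAS = {
    "lfn:run42.dat": ["se.site-a.example", "se.site-b.example"],
}

# Hypothetical "closeness" table: computing element -> storage elements near it.
CLOSE_SE = {
    "ce.site-b.example": ["se.site-b.example"],
}


def list_replicas(lfn: str) -> list[str]:
    """Which storage elements hold a replica of this logical file?"""
    return list(REPLICAS.get(lfn, []))


def pick_replica(lfn: str, ce: str) -> Optional[str]:
    """Prefer a replica on an SE close to the CE, else fall back to any replica."""
    replicas = list_replicas(lfn)
    for se in replicas:
        if se in CLOSE_SE.get(ce, []):
            return se
    return replicas[0] if replicas else None
```

Choosing a close replica is what lets replicas on different SEs share load, as the bullets above describe.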

[Figure: storage element. Labels: File transfer requests; Globus gatekeeper; GridFTP; gridmapfile; Local Info; Event Logging; Info system; Logging; Disk arrays or tapes]

Resource Broker nodes

• Run the Workload Management System
    • To accept job submissions
    • Dispatch jobs to appropriate Compute Element (CE)
    • Allow users
        • To get information about their status
        • To retrieve their output

• A configuration file on each UI node determines which RB node(s) will be used

• When a user submits a job, JDL options are to:
    • Specify CE
    • Allow RB to choose CE (using optional tags to define requirements)
    • Specify SE (then RB finds “nearest” appropriate CE, after interrogating Replica Location Service)

[Figure: job submission diagram. Components: UI; RB node (Network Server, Workload Manager, Job Controller - CondorG); Replica Location Server; Information Service; Computing Element; Storage Element. Flow labels: CE characteristics & status; SE characteristics & status]

[Figure: job submission diagram (UI, RB node, Replica Location Server, Information Service, Computing Element, Storage Element). Job status: submitted]

UI: allows users to access the functionalities of the WMS (via command line, GUI, C++ and Java APIs)

[Figure: job submission diagram (UI, RB node, Replica Location Server, Information Service, Computing Element, Storage Element). Job status: submitted]

edg-job-submit myjob.jdl

myjob.jdl:

    JobType = "Normal";
    Executable = "$(CMS)/exe/sum.exe";
    InputSandbox = {"/home/user/WP1testC", "/home/file*", "/home/user/DATA/*"};
    OutputSandbox = {"sim.err", "test.out", "sim.log"};
    Requirements = other.GlueHostOperatingSystemName == "linux" &&
                   other.GlueHostOperatingSystemRelease == "Red Hat 7.3" &&
                   other.GlueCEPolicyMaxCPUTime > 10000;
    Rank = other.GlueCEStateFreeCPUs;

Job Description Language (JDL) to specify job characteristics and requirements
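A JDL file of the kind shown on this slide is a list of `Attribute = value;` lines, so it can be generated from a script. The sketch below is a hypothetical helper, not part of the EDG or LCG tools: `Expr`, `jdl_value` and `render_jdl` are invented names, and quote characters inside strings are not escaped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    """A raw JDL expression (e.g. Requirements or Rank), written without quotes."""

    text: str


def jdl_value(value) -> str:
    """Render one Python value as JDL text."""
    if isinstance(value, Expr):
        return value.text
    if isinstance(value, str):
        return f'"{value}"'  # plain strings are double-quoted
    if isinstance(value, (list, tuple)):
        return "{" + ", ".join(jdl_value(v) for v in value) + "}"
    return str(value)  # numbers and anything else


def render_jdl(attributes: dict) -> str:
    """Render a dict of attributes as one 'Name = value;' line each."""
    return "\n".join(f"{key} = {jdl_value(val)};" for key, val in attributes.items())


job = {
    "JobType": "Normal",
    "Executable": "$(CMS)/exe/sum.exe",
    "OutputSandbox": ["sim.err", "test.out", "sim.log"],
    "Rank": Expr("other.GlueCEStateFreeCPUs"),
}
print(render_jdl(job))
```

The attribute names and values in `job` are taken from the slide; the resulting text could be saved as `myjob.jdl` and passed to `edg-job-submit`.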

[Figure: job submission diagram (UI, RB node, Replica Location Server, Information Service, Computing Element, Storage Element). Labels: RB storage; Input Sandbox files; Job. Job status: submitted, waiting]

NS: network daemon responsible for accepting incoming requests

Job submission

[Figure: job submission diagram (UI, RB node, Replica Location Server, Information Service, Computing Element, Storage Element). Labels: RB storage; Job. Job status: submitted, waiting]

WM: responsible for taking the appropriate actions to satisfy the request

Job submission

[Figure: job submission diagram (UI, RB node, Replica Location Server, Information Service, Computing Element, Storage Element). Labels: RB storage; Match-Maker/Broker. Job status: submitted, waiting]

Where must this job be executed?

Job submission

[Figure: job submission diagram (UI, RB node, Replica Location Server, Information Service, Computing Element, Storage Element). Labels: RB storage; Match-Maker/Broker. Job status: submitted, waiting]

Matchmaker: responsible for finding the “best” CE to which to submit a job
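The matchmaking idea can be sketched as "filter by Requirements, then order by Rank", mirroring the two JDL attributes of the same names. This is an illustrative sketch only: the `ComputingElement` record, its sample values and the `match` function are hypothetical, and the real matchmaker also considers data location and the current state of the Grid.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class ComputingElement:
    """Hypothetical summary of what a CE publishes about itself."""

    name: str
    os_name: str
    max_cpu_time: int
    free_cpus: int


def match(
    ces: Iterable[ComputingElement],
    requirements: Callable[[ComputingElement], bool],
    rank: Callable[[ComputingElement], float],
) -> Optional[ComputingElement]:
    """Keep the CEs that satisfy the requirements, then return the best-ranked one."""
    candidates = [ce for ce in ces if requirements(ce)]
    if not candidates:
        return None  # no CE can run this job
    return max(candidates, key=rank)


# Hypothetical published information for three CEs.
CES = [
    ComputingElement("ce-a", "linux", 20000, 4),
    ComputingElement("ce-b", "linux", 50000, 12),
    ComputingElement("ce-c", "linux", 5000, 40),
]

best = match(
    CES,
    requirements=lambda ce: ce.os_name == "linux" and ce.max_cpu_time > 10000,
    rank=lambda ce: ce.free_cpus,
)
```

With these sample values `ce-c` is excluded by the CPU-time requirement despite having the most free CPUs, and `ce-b` is chosen over `ce-a` on rank.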

Job submission

[Figure: job submission diagram (UI, RB node, Replica Location Server, Information Service, Computing Element, Storage Element). Labels: RB storage; Match-Maker/Broker. Job status: submitted, waiting]

Where are (which SEs) the needed data?

What is the status of the Grid?

Job submission

[Figure: job submission diagram (UI, RB node, Replica Location Server, Information Service, Computing Element, Storage Element). Labels: RB storage; Match-Maker/Broker; CE choice. Job status: submitted, waiting]

Job submission

[Figure: job submission diagram (UI, RB node, Replica Location Server, Information Service, Computing Element, Storage Element). Labels: RB storage; JobAdapter. Job status: submitted, waiting]

JA: responsible for the final “touches” to the job before performing submission (e.g. creation of wrapper script, etc.)

Job submission

[Figure: job submission diagram (UI, RB node, Replica Location Server, Information Service, Computing Element, Storage Element). Labels: RB storage; Job. Job status: submitted, waiting, ready]

JC: responsible for the actual job management operations (done via CondorG)

Job submission

[Figure: job submission diagram (UI, RB node, Replica Location Server, Information Service, Computing Element, Storage Element). Labels: RB storage; Job; Input Sandbox files. Job status: submitted, waiting, ready, scheduled]

Job submission

[Figure: job submission diagram (UI, RB node, Replica Location Server, Information Service, Computing Element, Storage Element). Labels: RB storage; Input Sandbox; Job; “Grid enabled” data transfers/accesses. Job status: submitted, waiting, ready, scheduled, running]

Job submission

[Figure: job submission diagram (UI, RB node, Replica Location Server, Information Service, Computing Element, Storage Element). Labels: RB storage; Output Sandbox files. Job status: submitted, waiting, ready, scheduled, running, done]

Job submission

[Figure: job submission diagram (UI, RB node, Replica Location Server, Information Service, Computing Element, Storage Element). Labels: RB storage; Output Sandbox. Job status: submitted, waiting, ready, scheduled, running, done]

edg-job-get-output <dg-job-id>

Job submission

[Figure: job submission diagram (UI, RB node, Replica Location Server, Information Service, Computing Element, Storage Element). Labels: RB storage; Output Sandbox files. Job status: submitted, waiting, ready, scheduled, running, done, cleared]
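The sequence of job states built up over the preceding slides can be written down as a small state machine. This is a sketch of the successful path only, using the state names from the slides; failure and cancellation paths are not shown in this presentation and are not modelled here, and the function names are hypothetical.

```python
from typing import Optional

# States in the order they appear on the job submission slides.
JOB_STATES = ["submitted", "waiting", "ready", "scheduled", "running", "done", "cleared"]


def next_state(state: str) -> Optional[str]:
    """Return the state that follows `state`, or None after the final state."""
    index = JOB_STATES.index(state)  # raises ValueError for an unknown state
    if index + 1 < len(JOB_STATES):
        return JOB_STATES[index + 1]
    return None


def advance(state: str, new_state: str) -> str:
    """Accept a transition only if it is the next step on the successful path."""
    if next_state(state) != new_state:
        raise ValueError(f"illegal transition: {state} -> {new_state}")
    return new_state
```

Per the slides, the job reaches "done" before `edg-job-get-output` is run and "cleared" afterwards.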

Job monitoring

[Figure: job monitoring diagram. Components: UI; RB node; Network Server; Workload Manager; Job Controller - CondorG; Log Monitor; Logging & Bookkeeping; Computing Element. Labels: Log of job events; Job status]

LM: parses CondorG log file (where CondorG logs info about jobs) and notifies LB

LB: receives and stores job events; processes corresponding job status

edg-job-status <dg-job-id>
edg-job-get-logging-info <dg-job-id>
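The division of labour between LM and LB can be sketched as a parser feeding an event store. This is a toy model under loud assumptions: the one-event-per-line log format (`<job-id> <event>`) is invented for illustration and is not the real CondorG log format, and the class and method names are hypothetical.

```python
import re
from collections import defaultdict
from typing import Iterable, Optional

# Hypothetical log line format, e.g. "job-001 running"; real CondorG logs differ.
LOG_LINE = re.compile(r"^(?P<job_id>\S+)\s+(?P<event>\S+)$")


class LoggingAndBookkeeping:
    """Toy LB: stores job events and derives the current status from them."""

    def __init__(self) -> None:
        self._events: dict[str, list[str]] = defaultdict(list)

    def record(self, job_id: str, event: str) -> None:
        self._events[job_id].append(event)

    def status(self, job_id: str) -> Optional[str]:
        """Current status is taken to be the most recent event (cf. edg-job-status)."""
        events = self._events.get(job_id)
        return events[-1] if events else None

    def logging_info(self, job_id: str) -> list[str]:
        """Full event history for a job (cf. edg-job-get-logging-info)."""
        return list(self._events.get(job_id, []))


def monitor_log(lines: Iterable[str], lb: LoggingAndBookkeeping) -> None:
    """Toy LM: parse each log line and notify the LB of the event it describes."""
    for line in lines:
        found = LOG_LINE.match(line.strip())
        if found:  # lines that do not parse are ignored
            lb.record(found["job_id"], found["event"])
```

Keeping the full event list and deriving the status from it is one simple way to serve both a status query and a logging-info query from the same store.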

Summary… 1

• EGEE is creating a production-quality Grid as a step towards an emerging Europe-wide e-Infrastructure
    • Secure, reliable, sustainable
    • Wide spectrum of VOs
    • Integrating with national, regional, international grids and networks

• EGEE is reengineering middleware, with Service Orientation

• The LCG is providing a service now

• EGEE-0 components, job submission and life-cycle have been described…

Summary -2: EGEE components

[Figure: middleware components and the flows between them. Components: “User interface”; Resource Broker; Information Service; Replica Catalogue; Logging & Book-keeping; Computing Element; Storage Element. Flow labels: Job Submit Event; Job Query; Job Status; Input “sandbox”; Input “sandbox” + Broker Info; Output “sandbox”; DataSets info; Author. & Authen.; Publish; SE & CE info]