INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org EGEE Middleware The Resource Broker EGEE project members

Upload: antony-scott

Post on 18-Dec-2015


TRANSCRIPT

Page 1: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members

INFSO-RI-508833

Enabling Grids for E-sciencE

www.eu-egee.org

EGEE Middleware
The Resource Broker
EGEE project members

Page 2: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members


Contents

• Short review of concepts
• Requirements of the applications communities
• Overview of the main grid services
• A closer look

Page 3: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members


Current production middleware

[Figure: architecture diagram of the current production middleware. Components: “User interface”; Resource Broker; Logging & Book-keeping; Information Service; Computing Element; Storage Element; LCG File Catalogue (LFC). Labelled flows: Job Submit Event; Job Query; Job Status; Input “sandbox”; Input “sandbox” + Broker Info; Output “sandbox”; Publish; SE & CE info; DataSets info; Authorization & Authentication.]

Page 4: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members


Building on basic tools and Information Service

Example JDL file

Executable = "gridTest";
StdError = "stderr.log";
StdOutput = "stdout.log";
InputSandbox = {"/home/joda/test/gridTest"};
OutputSandbox = {"stderr.log", "stdout.log"};

• Submit the job to the grid via the “resource broker”:

  edg-job-submit my.jdl
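As a rough illustration of how a JDL file like the one above could be produced programmatically, the sketch below renders a Python dictionary as JDL text. The helper name `to_jdl` is hypothetical and not part of any EGEE tool; it only handles quoted strings and lists of strings, which is all the example on this slide needs.

```python
def to_jdl(attributes):
    """Render a dict as JDL text (hypothetical helper, not an EGEE API).

    Strings are double-quoted; lists and tuples become {...} lists.
    """
    def render(value):
        if isinstance(value, (list, tuple)):
            return "{" + ", ".join(render(item) for item in value) + "}"
        return '"' + str(value) + '"'

    return "\n".join(f"{name} = {render(value)};"
                     for name, value in attributes.items())


# Same attributes as the example JDL file on this slide.
job = {
    "Executable": "gridTest",
    "StdError": "stderr.log",
    "StdOutput": "stdout.log",
    "InputSandbox": ["/home/joda/test/gridTest"],
    "OutputSandbox": ["stderr.log", "stdout.log"],
}

jdl_text = to_jdl(job)
print(jdl_text)
```

The printed text could be saved as my.jdl and passed to the submit command shown above. Attributes whose values are expressions rather than strings (such as Requirements) would need separate handling.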

Page 5: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members


User Interface node

• The user’s interface to the Grid

• Command-line interface to
  – Proxy server
  – Job operations
      Submit a job
      Monitor its status
      Retrieve output
  – Data operations
      Upload file to SE
      Create replica
      Discover replicas
  – Other grid services

• Also C++ and Java APIs

• To run a job, the user creates a JDL (Job Description Language) file

[Figure labels: UI; JDL]

Page 6: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members


Building on basic tools and Information Service

Example JDL file

Executable = "gridTest";
StdError = "stderr.log";
StdOutput = "stdout.log";
InputSandbox = {"/home/joda/test/gridTest"};
OutputSandbox = {"stderr.log", "stdout.log"};
InputData = "lfn:/grid/VOname/mydir/testbed0.00019";
DataAccessProtocol = "gridftp";
Requirements = other.Architecture == "INTEL" && other.OpSys == "LINUX" && other.FreeCpus >= 4;
Rank = other.GlueHostBenchmarkSF00;

• Submit the job to the grid via the “resource broker (RB)”:

  edg-job-submit my.jdl

  Returns a “job-id” used to monitor the job and retrieve its output

Page 7: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members


Building on basic tools and Information Service

Example JDL file

Executable = "gridTest";
StdError = "stderr.log";
StdOutput = "stdout.log";
InputSandbox = {"/home/joda/test/gridTest"};
OutputSandbox = {"stderr.log", "stdout.log"};
InputData = "lfn:/grid/VOname/mydir/testbed0-00019";
DataAccessProtocol = "gridftp";
Requirements = other.Architecture == "INTEL" && other.OpSys == "LINUX" && other.FreeCpus >= 4;
Rank = other.GlueHostBenchmarkSF00;

• Submit the job to the grid via the “resource broker”:

  edg-job-submit my.jdl

  Returns a “job-id” used to monitor the job and retrieve its output

lfn: logical file name

The RB uses the catalogue to find replica locations
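The lookup described above can be pictured as a mapping from a logical file name to the physical replicas registered for it. The sketch below is only a toy model: the catalogue contents, the `srm://` replica URLs, the host names and both function names are invented for illustration and do not reflect the real LFC interface.

```python
from urllib.parse import urlparse

# Hypothetical catalogue contents: logical file name -> replica URLs.
CATALOGUE = {
    "lfn:/grid/VOname/mydir/testbed0-00019": [
        "srm://se1.example.org/data/VOname/file19",
        "srm://se2.example.org/storage/VOname/file19",
    ],
}


def replica_locations(lfn, catalogue=CATALOGUE):
    """Return the replica URLs registered for a logical file name."""
    return list(catalogue.get(lfn, []))


def storage_elements(lfn, catalogue=CATALOGUE):
    """Return the sorted host names of the SEs holding a replica."""
    return sorted({urlparse(url).hostname
                   for url in replica_locations(lfn, catalogue)})


print(storage_elements("lfn:/grid/VOname/mydir/testbed0-00019"))
```

Knowing which Storage Elements hold the input data is what lets a broker take data location into account when it chooses where to run the job.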

Page 8: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members


Building on basic tools and Information Service

Example JDL file

Executable = "gridTest";
StdError = "stderr.log";
StdOutput = "stdout.log";
InputSandbox = {"/home/joda/test/gridTest"};
OutputSandbox = {"stderr.log", "stdout.log"};
InputData = "lfn:testbed0-00019";
DataAccessProtocol = "gridftp";
Requirements = other.Architecture == "INTEL" && other.OpSys == "LINUX" && other.FreeCpus >= 4;
Rank = other.GlueHostBenchmarkSF00;

• Submit the job to the grid via the “resource broker”:

  edg-job-submit my.jdl

  Returns a “job-id” used to monitor the job and retrieve its output

Requirements and Rank are evaluated using the BDII Information System

Page 9: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members

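Job submission

[Figure: job submission diagram. Components: UI; RB node; Network Server; Workload Manager; Job Controller (CondorG); LFC; Information Service; Computing Element; Storage Element. Labelled flows: CE characteristics & status; SE characteristics & status.]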

Page 10: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members

[Figure: job submission diagram, as on Page 9.]

Job Status

UI: allows users to access the functionalities of the WMS (via command line, GUI, C++ and Java APIs)

WMS: Workload Management System

Page 11: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members

[Figure: job submission diagram, as on Page 9, with a Replica Location Server shown in place of the LFC.]

Job Status: submitted

edg-job-submit myjob.jdl

myjob.jdl:

JobType = "Normal";
Executable = "$(CMS)/exe/sum.exe";
InputSandbox = {"/home/user/WP1testC", "/home/file*", "/home/user/DATA/*"};
OutputSandbox = {"sim.err", "test.out", "sim.log"};
Requirements = other.GlueHostOperatingSystemName == "linux" && other.GlueHostOperatingSystemRelease == "Red Hat 7.3" && other.GlueCEPolicyMaxCPUTime > 10000;
Rank = other.GlueCEStateFreeCPUs;

The Job Description Language (JDL) is used to specify job characteristics and requirements
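To make the `Name = value;` structure of a JDL file concrete, here is a minimal parsing sketch. It is not the parser used by the middleware: it assumes no semicolon appears inside a quoted string, keeps each value as raw text, and collapses runs of whitespace. The function name `parse_jdl` is hypothetical.

```python
import re

# One statement: attribute name, '=', value text up to the next ';'.
# Assumption: no ';' occurs inside a quoted value.
STATEMENT = re.compile(r'(\w+)\s*=\s*(.*?)\s*;', re.DOTALL)


def parse_jdl(text):
    """Return {attribute name: raw value text} for simple JDL statements."""
    return {name: " ".join(value.split())
            for name, value in STATEMENT.findall(text)}


example = '''
JobType = "Normal";
Executable = "$(CMS)/exe/sum.exe";
OutputSandbox = {"sim.err", "test.out", "sim.log"};
Requirements = other.GlueHostOperatingSystemName == "linux" &&
               other.GlueCEPolicyMaxCPUTime > 10000;
Rank = other.GlueCEStateFreeCPUs;
'''

attrs = parse_jdl(example)
for name, value in attrs.items():
    print(name, "->", value)
```

Note that the `==` inside Requirements is not mistaken for an assignment, because each statement is consumed up to its terminating semicolon before the next attribute name is searched for.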

Page 12: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members

[Figure: job submission diagram, as on Page 9, now also showing RB storage. Labels: Job; Input Sandbox files.]

Job Status: submitted, waiting

NS: network daemon responsible for accepting incoming requests

Page 13: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members

[Figure: job submission diagram, as on Page 9, with RB storage. Label: Job.]

Job Status: submitted, waiting

WM: responsible for taking the appropriate actions to satisfy the request

Page 14: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members

Job submission

[Figure: job submission diagram, as on Page 9, with RB storage and a Match-Maker/Broker.]

Job Status: submitted, waiting

Where must this job be executed?

Page 15: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members

Job submission

[Figure: job submission diagram, as on Page 9, with RB storage and a Match-Maker/Broker.]

Job Status: submitted, waiting

Matchmaker: responsible for finding the “best” CE to which to submit a job
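One way to read “best” is: discard every CE that fails the job's Requirements, then order the remainder by Rank, highest first. The sketch below illustrates that reading only. The CE records and their values are invented, the attribute names are borrowed from the JDL on Page 11, and the real matchmaker evaluates JDL expressions rather than Python functions.

```python
def matchmake(ces, requirements, rank):
    """Return ids of CEs that satisfy `requirements`, best rank first."""
    eligible = [ce for ce in ces if requirements(ce)]
    return [ce["id"] for ce in sorted(eligible, key=rank, reverse=True)]


# Hypothetical CE records, as an Information Service might report them.
ces = [
    {"id": "ce-a", "GlueHostOperatingSystemName": "linux",
     "GlueCEPolicyMaxCPUTime": 20000, "GlueCEStateFreeCPUs": 4},
    {"id": "ce-b", "GlueHostOperatingSystemName": "linux",
     "GlueCEPolicyMaxCPUTime": 5000, "GlueCEStateFreeCPUs": 50},
    {"id": "ce-c", "GlueHostOperatingSystemName": "linux",
     "GlueCEPolicyMaxCPUTime": 30000, "GlueCEStateFreeCPUs": 12},
]


def requirements(ce):
    # Mirrors part of the Requirements expression from Page 11.
    return (ce["GlueHostOperatingSystemName"] == "linux"
            and ce["GlueCEPolicyMaxCPUTime"] > 10000)


def rank(ce):
    # Mirrors: Rank = other.GlueCEStateFreeCPUs;
    return ce["GlueCEStateFreeCPUs"]


ranked = matchmake(ces, requirements, rank)
print(ranked)
```

Here ce-b has the most free CPUs but is filtered out by the CPU-time requirement, so it never reaches the ranking step.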

Page 16: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members

Job submission

[Figure: job submission diagram, as on Page 9, with RB storage and a Match-Maker/Broker.]

Job Status: submitted, waiting

Where (on which SEs) is the needed data?

What is the status of the Grid?
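The two questions above can be combined in many ways; the sketch below shows one plausible combination, purely as an illustration. It assumes each CE record lists the Storage Elements it considers "close" (the field name `close_ses`, the host names and the function name are all hypothetical) and keeps only the CEs that are close to an SE holding the input data.

```python
def ces_near_data(ces, data_ses):
    """Keep CEs that list at least one SE holding the input data as close."""
    wanted = set(data_ses)
    return [ce["id"] for ce in ces if wanted & set(ce["close_ses"])]


# Hypothetical answer to "What is the status of the Grid?"
grid_status = [
    {"id": "ce-a", "close_ses": ["se1.example.org"]},
    {"id": "ce-b", "close_ses": ["se3.example.org"]},
    {"id": "ce-c", "close_ses": ["se2.example.org", "se3.example.org"]},
]

# Hypothetical answer to "Where (on which SEs) is the needed data?"
data_ses = ["se1.example.org", "se2.example.org"]

near = ces_near_data(grid_status, data_ses)
print(near)
```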

Page 17: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members

Job submission

[Figure: job submission diagram, as on Page 9, with RB storage and a Match-Maker/Broker. Label: CE choice.]

Job Status: submitted, waiting

Page 18: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members

Job submission

[Figure: job submission diagram, as on Page 9, with RB storage and a Job Adapter.]

Job Status: submitted, waiting

JA: responsible for the final “touches” to the job before performing submission (e.g. creation of the wrapper script)

Page 19: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members

Job submission

[Figure: job submission diagram, as on Page 9, with RB storage. Label: Job.]

Job Status: submitted, waiting, ready

JC: responsible for the actual job management operations (done via CondorG)

Page 20: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members

Job submission

[Figure: job submission diagram, as on Page 9, with RB storage. Labels: Job; Input Sandbox files.]

Job Status: submitted, waiting, ready, scheduled

Page 21: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members

[Figure: job submission diagram, as on Page 9, with RB storage. Labels: Job; Input Sandbox; “Grid enabled” data transfers/accesses.]

Job Status: submitted, waiting, ready, scheduled, running

Page 22: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members

[Figure: job submission diagram, as on Page 9, with RB storage. Label: Output Sandbox files.]

Job Status: submitted, waiting, ready, scheduled, running, done

Page 23: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members

[Figure: job submission diagram, as on Page 9, with RB storage. Label: Output Sandbox.]

Job Status: submitted, waiting, ready, scheduled, running, done

edg-job-get-output <dg-job-id>

Page 24: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members

[Figure: job submission diagram, as on Page 9, with RB storage. Label: Output Sandbox files.]

Job Status: submitted, waiting, ready, scheduled, running, done, cleared

Page 25: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members

Job monitoring

[Figure: job monitoring diagram. Components: UI; Logging & Bookkeeping; RB node; Network Server; Workload Manager; Job Controller (CondorG); Log Monitor; Computing Element. Labels: Log of job events; Job status.]

LM: parses the CondorG log file (where CondorG logs information about jobs) and notifies the LB

LB: receives and stores job events; derives the corresponding job status

edg-job-status <dg-job-id>
edg-job-get-logging-info <dg-job-id>
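The LB's role described above (store events, derive a status from them) can be sketched as a small event store. The event names and the `Bookkeeping` class below are hypothetical and chosen only to line up with the job states used in these slides; the real Logging & Bookkeeping service has its own event types and interface.

```python
from collections import defaultdict

# Hypothetical event names, each mapped to the job state it implies.
EVENT_TO_STATE = {
    "registered": "SUBMITTED",
    "accepted": "WAIT",
    "matched": "READY",
    "queued": "SCHEDULED",
    "started": "RUNNING",
    "finished": "DONE",
    "output_retrieved": "CLEARED",
    "aborted": "ABORT",
}


class Bookkeeping:
    """Toy stand-in for an LB server: stores events, derives job status."""

    def __init__(self):
        self._events = defaultdict(list)

    def log_event(self, job_id, event):
        if event not in EVENT_TO_STATE:
            raise ValueError(f"unknown event: {event}")
        self._events[job_id].append(event)

    def status(self, job_id):
        """State implied by the most recent event, or None if unknown job."""
        events = self._events.get(job_id)
        return EVENT_TO_STATE[events[-1]] if events else None

    def logging_info(self, job_id):
        """Full event history of a job, oldest first."""
        return list(self._events.get(job_id, []))


lb = Bookkeeping()
for event in ("registered", "accepted", "matched"):
    lb.log_event("job-1", event)
print(lb.status("job-1"), lb.logging_info("job-1"))
```

In this picture, `status` plays the part of edg-job-status and `logging_info` the part of edg-job-get-logging-info, with a component like the Log Monitor calling `log_event`.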

Page 26: INFSO-RI-508833 Enabling Grids for E-sciencE  EGEE Middleware The Resource Broker EGEE project members


Possible job states

Flag        Meaning
SUBMITTED   submission logged in the LB
WAIT        job matchmaking for resources
READY       job being sent to executing CE
SCHEDULED   job scheduled in the CE queue manager
RUNNING     job executing on a WN of the selected CE queue
DONE        job terminated without grid errors
CLEARED     job output retrieved
ABORT       job aborted by middleware, check reason
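The states in the table can be read as a simple progression, which the sketch below encodes. The ordering follows the sequence shown on the preceding slides; the rule that ABORT is reachable from any state before DONE is an assumption made for this illustration, not something the slides state.

```python
# Normal progression of a job, as built up on the preceding slides.
ORDER = ["SUBMITTED", "WAIT", "READY", "SCHEDULED",
         "RUNNING", "DONE", "CLEARED"]
TERMINAL = ("CLEARED", "ABORT")


def next_states(state):
    """States a job may move to from `state`.

    Assumption: ABORT is possible from every state before DONE.
    """
    if state in TERMINAL:
        return []
    following = [ORDER[ORDER.index(state) + 1]]
    if state != "DONE":
        following.append("ABORT")
    return following


def is_valid_path(states):
    """True if each state in the sequence may follow the previous one."""
    return all(later in next_states(earlier)
               for earlier, later in zip(states, states[1:]))


print(next_states("RUNNING"))
print(is_valid_path(["SUBMITTED", "WAIT", "ABORT"]))
```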