Page 1: HEPCAL, PPDG CS11 & the GAE workshop

LCG

HEPCAL, PPDG CS11 & the GAE workshop

Ruth Pordes

Fermilab

presenting (as usual) the work of many others.

HEPCALs

Documenting Use Cases.

A forum for coming to a common understanding and generating/checking Grid middleware requirements across 4 LHC experiments.

Chair of the committees is Federico, and Jeff Templon is the chief editor.

HEPCAL - Summer 02

HEPCAL Prime - HEPCAL updated - Spring 03.

HEPCAL-II - Analysis Use cases - Phase 1 June ‘03; Phase 2 Nov ‘03

Page 2: HEPCAL, PPDG CS11 & the GAE workshop

23 June 2003 R. Pordes, GAE workshop 2

LCG HEPCAL - its usefulness -

• Discussion and comments following the release stimulated Test Case implementations for EDG.

- Useful in identifying holes; thinking through details of end to end functionality.

- Helped to solidify how to move forward to a joint “GLUE” testing project.

• Joint response from US and EU Grid Middleware projects helped

- understanding of boundaries between VDT and EDG components

- ability to move to a common underlying infrastructure.

- better appreciation of components in LCG, EDG and VDT.

• Good reference for glossary and definitions

Willingness to have regular updates to this document will contribute to its usefulness -> Hepcal-Prime

Page 3: HEPCAL, PPDG CS11 & the GAE workshop

LCG Hepcal aims to give input/guidance to Software in the “Grid Domain”

[Diagram labels: Algorithms, Framework Services API, Application external Services API; Framework Domain, Grid Domain]

Page 4: HEPCAL, PPDG CS11 & the GAE workshop

LCG HEPCAL-Prime - its relevance

• Gives agreed upon definitions and scope of many Concepts. These may be wrong - but there is plenty of text to critique, an active mail list for discussions, and a recognised forum for consensus and decision. E.g.

- “catalogues and datasets. A catalogue is a collection of data that is updateable and transactional. A dataset is a read-only collection of data. A special case of the dataset is the Virtual Dataset”.

- Long discussion of datasets etc.

- We expect the Grid to assign a unique job identifier to each Job. Classify all Jobs into 2 categories: “Organized” or “Chaotic”.

• Some significant areas of Requirements and Use are only superficially addressed, e.g.

- System Wide issues - Architecture, Requirements, Operations

- Security - VO, Authorization mechanisms

- Treatment of failures and faults

- Long transactions and persistent state

• Are the fundamental assumptions and scope correct or agreed to?

- Mostly FILEs

- LDN and GUID

- All events part of a tree

- The concept that a “user” is often an “Agent” or a “Role”-based capability came late, and there are gaps due to this.

http://cern.ch/fca/HEPCAL-prime.doc
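The catalogue/dataset distinction and the job classification quoted on this slide can be illustrated with a small sketch. It is not taken from HEPCAL-prime: the class names `Catalogue`, `Dataset`, `Job` and `JobCategory`, the all-or-nothing `update` method, and the use of a UUID as a stand-in for the Grid-assigned job identifier are all assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum


class JobCategory(Enum):
    ORGANIZED = "Organized"
    CHAOTIC = "Chaotic"


@dataclass(frozen=True)
class Dataset:
    """Read-only collection of data: contents are fixed at creation."""
    name: str
    items: tuple


class Catalogue:
    """Updateable collection; each update is applied all-or-nothing."""

    def __init__(self):
        self._entries = {}

    def update(self, changes):
        # Stage changes on a copy so a failure leaves the catalogue untouched.
        staged = dict(self._entries)
        for key, value in changes.items():
            if value is None:
                raise ValueError(f"invalid entry for {key!r}")
            staged[key] = value
        self._entries = staged  # commit

    def snapshot(self, name):
        """Freeze the current contents into a read-only Dataset."""
        return Dataset(name, tuple(sorted(self._entries.items())))


@dataclass
class Job:
    category: JobCategory
    # Hypothetical stand-in for the identifier the Grid would assign.
    job_id: str = field(default_factory=lambda: uuid.uuid4().hex)
```

In this sketch a failed `update` changes nothing, while a `Dataset` produced by `snapshot` cannot be modified afterwards; that is one simple reading of "updateable and transactional" versus "read-only".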

Page 5: HEPCAL, PPDG CS11 & the GAE workshop

LCG HEPCAL-prime has added first Performance Requirements

Page 6: HEPCAL, PPDG CS11 & the GAE workshop

LCG HEPCAL-II scope and status

• Goal is to provide Use Cases describing Analysis such that Requirements can be synthesized and a Software Architecture and Design started.

• First phase is “over”, with the document to be delivered to the SC2 at the end of this month. It is not clear that this is sufficient for the new RTAG.

• Really only a first pass at bringing the thinking of people on the committees forward, to approach the differences and similarities between Analysis and Production Processing.

• At the moment there seem to be a couple of concepts that people agree are different:

- May not know the Input Data that is needed until the job is run. (Job executions are preceded by Queries to define the input data.)

- User Interaction may be required and will have a wide range of “response” needs.

- System concepts like planning, prioritization, VO management not included.
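The first difference listed above, that the input data are only known once a query has been evaluated, can be sketched as follows. The catalogue contents, the logical file names and the functions `resolve_inputs` and `run_analysis` are hypothetical and only illustrate the "query first, then run" ordering; they are not part of HEPCAL-II.

```python
from typing import Callable, Dict, List

# Hypothetical metadata catalogue: logical file name -> attributes.
CATALOGUE: Dict[str, Dict[str, object]] = {
    "lfn:/demo/run1.root": {"trigger": "dimuon", "year": 2003},
    "lfn:/demo/run2.root": {"trigger": "jet", "year": 2003},
    "lfn:/demo/run3.root": {"trigger": "dimuon", "year": 2002},
}


def resolve_inputs(predicate: Callable[[Dict[str, object]], bool]) -> List[str]:
    """Query step: select the files whose attributes match the predicate."""
    return sorted(lfn for lfn, attrs in CATALOGUE.items() if predicate(attrs))


def run_analysis(predicate: Callable[[Dict[str, object]], bool],
                 process: Callable[[str], int]) -> int:
    """The input list does not exist until the query has been evaluated."""
    inputs = resolve_inputs(predicate)
    return sum(process(lfn) for lfn in inputs)


# Example: count the files selected by a dimuon query.
n_files = run_analysis(lambda a: a["trigger"] == "dimuon", lambda lfn: 1)
```

A production job, by contrast, would typically be handed its input list up front, so the query step is what distinguishes the analysis case in this sketch.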

Page 7: HEPCAL, PPDG CS11 & the GAE workshop

LCG

Still simple models of end to end Analysis steps

Page 8: HEPCAL, PPDG CS11 & the GAE workshop

LCG

• Performance Requirements: [ This section needs considerable reworking, still looking for brilliant ideas. ] It is expected to have about 10-15 physics analysis groups in each experiment with probably 10-20 active people in each extracting the data from the earlier scenarios...

For the later stages ..the produced data may not necessarily be registered on the Grid. In addition, it is expected to have about 30(?) people per subdetector in each experiment (total of 300-500? per experiment) accessing the data for detector studies and/or calibration purposes. So a total of 400-600 people in each experiment is expected to do the extraction of (possibly private) results. This number is representative; depending on the stage of the experiment the profile might be quite different.
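As a rough cross-check of the head count quoted above, the ranges can be combined with simple interval arithmetic. This sketch assumes the subdetector total is 300-500 people, that the ranges are independent, and that nobody is counted in both populations; the helper names are illustrative only.

```python
def mul_ranges(a, b):
    """Multiply two (low, high) ranges of positive numbers."""
    return (a[0] * b[0], a[1] * b[1])


def add_ranges(a, b):
    """Add two (low, high) ranges."""
    return (a[0] + b[0], a[1] + b[1])


groups = (10, 15)            # physics analysis groups per experiment
people_per_group = (10, 20)  # active people per group
detector_people = (300, 500) # assumed reading of the subdetector total

physics_people = mul_ranges(groups, people_per_group)  # (100, 300)
total_people = add_ranges(physics_people, detector_people)  # (400, 800)
```

Under these assumptions the bounds are 400-800 people per experiment; the quoted 400-600 lies inside that range, towards its lower end.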

• Is there a common data handling layer that is external to the application and has middleware and/or external to middleware components? Still no assumption on this. Is it time to make a decision? Query handlers as an LCG common project? Collaborating with PPDG?

Page 9: HEPCAL, PPDG CS11 & the GAE workshop

LCG The Arrow of “increasing interactivity”

[Figure: activities arranged along a response-time axis with regions A to D. Labels: Sporadic tuning and optimisation; Continuous tuning and optimisation; Event display; Histos, plotting, browsing; Production, reconstruction, …; Interactive clients; Interactive servers; Irrelevant; Impractical; Useless.]

The horizontal axis can be divided into general regions based largely on human time-scales:

< 1 sec: Instantaneous. User's attention is continually focused upon the job.

< 1 min: Fast. Time spent waiting for a response or results is short enough that the user will not start another task in the interim.

< 1 hour: Slow. User will likely devote attention to another task while waiting for response/results, but will return to task in same working day.

> 1 day: Glacial. User will likely release and forget. Will return to task after an extended period or only upon notification that task has completed.
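The regions above map directly onto a small classification function. This is only a sketch: the function name is made up, and because the slide does not label the interval between one hour and one day, the sketch returns "Unclassified" there rather than guessing.

```python
def interactivity_region(seconds: float) -> str:
    """Map a response time in seconds to the human time-scale regions."""
    if seconds < 0:
        raise ValueError("response time cannot be negative")
    if seconds < 1:
        return "Instantaneous"
    if seconds < 60:
        return "Fast"
    if seconds < 3600:
        return "Slow"
    if seconds > 86400:
        return "Glacial"
    # Between 1 hour and 1 day the slide gives no label (assumption).
    return "Unclassified"
```

For example, a histogram refresh that takes 0.2 s would count as "Instantaneous", while a job that returns after two days would be "Glacial".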

Page 10: HEPCAL, PPDG CS11 & the GAE workshop

LCG 1.1.1 Persistent interactive environment

For each analysis session, the user should be able to assign a name (in the user’s private namespace) to which he/she can subsequently refer in order to

• get additional information about analysis status, estimated time to completion,…

• find and retrieve partial results of his/her analysis

• re-establish complete analysis environment at later stage

• ….
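The requirement above amounts to a per-user registry of named sessions. A minimal sketch, assuming an in-memory store keyed by (user, session name); the classes `AnalysisSession` and `SessionRegistry` and their fields are hypothetical and say nothing about how a real Grid service would persist this state.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class AnalysisSession:
    status: str = "running"
    partial_results: List[str] = field(default_factory=list)
    environment: Dict[str, str] = field(default_factory=dict)


class SessionRegistry:
    """Names are private per user: two users may reuse the same name."""

    def __init__(self) -> None:
        self._sessions: Dict[Tuple[str, str], AnalysisSession] = {}

    def create(self, user: str, name: str,
               environment: Dict[str, str]) -> AnalysisSession:
        key = (user, name)
        if key in self._sessions:
            raise KeyError(f"{user} already has a session named {name!r}")
        session = AnalysisSession(environment=dict(environment))
        self._sessions[key] = session
        return session

    def lookup(self, user: str, name: str) -> AnalysisSession:
        """Refer back to a session to read status, results or environment."""
        return self._sessions[(user, name)]
```

A user would call `create` once, then use `lookup` with the same name to check status, fetch partial results, or recover the stored environment at a later stage.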

Page 11: HEPCAL, PPDG CS11 & the GAE workshop

LCG PPDG CS-11

Page 12: HEPCAL, PPDG CS11 & the GAE workshop

LCG PPDG CS-11

“Interactive” Physics Analysis on a Grid

• Cross Experiment Working Group to discuss common requirements and interfaces.

• Forum to share information about the many parallel implementations and prototyping efforts needed to gain understanding

• Extract the common requirements that such applications make on the grid, to influence grid middleware to contain the necessary hooks

• Evaluate existing interfaces and services; propose extensions and describe new interfaces as needed

• Particularly strong participation has come from analysis tool makers in the US: JAS, Caigee, ROOT.

Page 13: HEPCAL, PPDG CS11 & the GAE workshop

LCG PPDG Analysis Tools Work

Not yet focused on a common development effort. Still a “working group” for PPDG Year 3. Expect it to be a focus of Years 4 & 5.

People in PPDG are encouraging us to make it a strong focus (development -> production effort) sooner? However, PPDG must avoid landing in today’s situation for Replica Management systems, i.e. 6 different implementations IN PRODUCTION.

Page 14: HEPCAL, PPDG CS11 & the GAE workshop

LCG ..CS-11 service names to date..

• Submit Abstract Job

• Submit Concrete Job

• Control Concrete Job

• Status of Concrete Job

- (Status is an exposed interface to every service)

• Concrete Job Capabilities.

• Sub-Job Management / Partition Job

• Estimate Performance

• Move Data

• Copy Data

• Query DataSet Catalog

• Manage Dataset Catalog

• Manage Data Replication

• Access Metadata Catalog

• Discover Resource

• Reserve Resource

• Matchmaker

• Manage Storage

• Login/Logout

• Install Software
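The note that "Status is an exposed interface to every service" suggests a common base interface shared by all of the services named above. The sketch below is an assumption about how that could look, not the CS-11 design: `GridService`, its `invoke` method, and the request fields are invented for illustration, and only two of the listed service names are shown.

```python
from abc import ABC, abstractmethod
from typing import Dict


class GridService(ABC):
    """Hypothetical base class: every service exposes status()."""

    name: str = "unnamed"

    def __init__(self) -> None:
        self._state = "idle"

    def status(self) -> str:
        return f"{self.name}: {self._state}"

    @abstractmethod
    def invoke(self, request: Dict[str, str]) -> Dict[str, str]:
        """Service-specific operation; the request fields are illustrative."""


class SubmitConcreteJob(GridService):
    name = "Submit Concrete Job"

    def invoke(self, request: Dict[str, str]) -> Dict[str, str]:
        self._state = "accepted"
        return {"job": request["executable"], "state": self._state}


class CopyData(GridService):
    name = "Copy Data"

    def invoke(self, request: Dict[str, str]) -> Dict[str, str]:
        self._state = "copied"
        return {"source": request["source"],
                "destination": request["destination"]}
```

Putting `status()` on the base class means a client can poll any service in the same way, whatever its specific operation is.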