David Adams
ATLAS
ATLAS Distributed Analysis and proposal for ATLAS-LHCb system
David Adams, BNL
March 22, 2004
ATLAS-LHCb-GANGA Meeting
ATLAS dist analysis ATLAS_LHCb-GANGA March 22, 2004 2
Contents
Definitions
Architecture
AJDL
• Application
• Task
• Dataset
• Job
High-level services
• Analysis service
• Job management service
• Catalog services
Implementation strategy
Effort providers
• ARDA
• Role of GANGA
Connection to LHCb
More information
Definitions
Analysis (not necessarily distributed)
• Supports the manipulation and extraction of summary data (e.g. histograms) from any type of event data
  - AOD, ESD, …
• Supports user-level production of event data
  - e.g. MC generation, simulation and reconstruction
Distributed analysis
• Extends the extraction and production support to include distributed users, data and processing
• Natural extension of non-distributed analysis
• Easily invoked from any ATLAS analysis environment
  - including Python, ROOT, command line
  - easily ported to any future environment (e.g. JAS)
Architecture
[Figure: layered architecture diagram. Labels recovered from the figure, grouped by layer:
• Client tools: ROOT cmd line client, GANGA cmd line client, GANGA GUI, GANGA Task Management, GANGA Job Submission, GANGA Job Management
• High-level service interfaces (AJDL)
• High-level services: DIAL Analysis Service, GANGA Analysis Service, ATPROD Analysis Service, DIRAC Analysis Service, ARDA Analysis Service, Catalog services, Dataset Splitter, Dataset Merger, Job Management
• Middleware service interfaces
• Middleware services: CE, WMS, File Catalog, etc.]
AJDL
Acronym: Analysis Job Definition Language
Used to define interfaces for high-level services
Components include:
• Application: executable to process data
• Task: user configuration of application
• Dataset: describes input and output data
• Job: activity to perform on (or off) the grid
  - Typical: app, task and input dataset → output dataset
The following diagram shows typical component interactions
[Figure: typical AJDL component interactions.
Elements: Analysis Framework (ADA/DIAL user interface), Analysis Service, Application (e.g. ROOT, athena; exe, pkgs), Task (scripts, code), Dataset 1, Dataset 2, Job 1, Job 2, Result (one per job).
Labeled steps: 1. locate; 2. select; 3. create or select; 4. select; 5. submit(app,tsk,ds); 6. splitDataset; 7. create; 9. create; 10. gather.]
AJDL (cont)
Components must be extensible
• Use subtypes
  - E.g. HistogramDataset, EventDataset, AtlasEventDataset
• Generic interface
  - For use by (shared) generic high-level services
• Experiment-specific interface
  - For application and users
Nature of components
• Persistent representation of data (e.g. XML)
• Classes to interpret this data (C++, Python, Java, …)
  - Language bindings or re-implementations
• Service or resource (as in WSRF)
Application
Application specifies executable used to process data
Two entry points
• Extract and build task
• Process input dataset to produce output dataset
  - Application + Task = Dataset transformation
Carries enough information to
• Locate entry points
  - Or carry the corresponding scripts
• Enable installation of all required software
  - E.g. list of packages for use with package management system
  - Might be subtypes for different package management systems
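The idea that application plus task forms a dataset transformation can be sketched in Python as follows. This is a toy model under stated assumptions: the `Application` fields, the `make_transformation` helper, the package name and the "scale" application are all hypothetical, and a dataset is reduced to a plain list of numbers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Application:
    name: str
    packages: List[str]  # software to install first (names are made up)
    build: Callable[[Dict[str, str]], dict]          # entry point 1: build task
    run: Callable[[dict, List[float]], List[float]]  # entry point 2: process data


def make_transformation(app: Application, task_files: Dict[str, str]):
    """Application + Task = Dataset transformation."""
    built = app.build(task_files)  # built once, reused for every dataset
    return lambda dataset: app.run(built, dataset)


def build_scale(task_files):
    # Toy "build": parse a one-line configuration file such as "factor=2".
    return {"factor": float(task_files["config.txt"].split("=")[1])}


def run_scale(built, dataset):
    return [built["factor"] * value for value in dataset]


scale_app = Application("scale", ["toy-pkg-1.0"], build_scale, run_scale)
transform = make_transformation(scale_app, {"config.txt": "factor=2"})
output = transform([1.0, 2.5])
```

Keeping the build step separate from the run step means the task is built once and can then be applied to many sub-datasets.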
Task
Task carries the user configuration for an application
• E.g. runtime configuration or code for shared library
• Nature of the task specified by the corresponding application
• At present the task is a collection of embedded text files
Task plus application (transformation) should specify the content of input and output datasets
• Enable users and processing system to
  - Verify transformation is suitable for given input dataset
  - Avoid staging unneeded parts of input dataset
  - Predict the content of output dataset
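A minimal Python sketch of the three checks listed above, assuming a transformation simply declares the content keys it reads and writes. The class names, the `reads`/`writes` fields and the content labels are hypothetical; the slides do not say how content is declared.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List


@dataclass
class Task:
    files: Dict[str, str]  # the task as a collection of embedded text files


@dataclass
class Transformation:
    task: Task
    reads: FrozenSet[str]   # content required from the input dataset
    writes: FrozenSet[str]  # content produced in the output dataset

    def is_suitable_for(self, dataset_content: Iterable[str]) -> bool:
        """Verify that the input dataset holds everything that is read."""
        return self.reads <= set(dataset_content)

    def parts_to_stage(self, dataset_content: Iterable[str]) -> List[str]:
        """Only the parts actually read need to be staged."""
        return sorted(self.reads & set(dataset_content))

    def predicted_output(self) -> List[str]:
        """Content of the output dataset, known before running."""
        return sorted(self.writes)


task = Task({"jobOptions.txt": "electron_pt_min = 20"})
transformation = Transformation(
    task,
    reads=frozenset({"electrons"}),
    writes=frozenset({"electron_pt_histogram"}),
)
available = ["electrons", "muons", "jets"]
```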
Dataset
Provides data view
Generic properties for use in high-level services:
• Location of data (files, DB, …)
  - So data can be staged
• Content
  - E.g. for ATLAS events: event IDs and type-keys (e.g. good electrons) for each event
  - EventDataset is an important generic subtype
• Constituents for compound dataset
  - Natural boundaries for dataset splitting
Subtypes provide interface for users and applications to access the data
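How constituents give natural split boundaries can be shown with a short Python sketch. The `Dataset` fields and the `split` helper are assumptions for illustration; a real splitter would presumably also consider file sizes and site locality.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dataset:
    dataset_id: str
    files: List[str] = field(default_factory=list)               # location of data
    constituents: List["Dataset"] = field(default_factory=list)  # for compound datasets

    def all_files(self) -> List[str]:
        """Every file that must be reachable to stage this dataset."""
        found = list(self.files)
        for constituent in self.constituents:
            found.extend(constituent.all_files())
        return found


def split(dataset: Dataset) -> List[Dataset]:
    """Split along constituent boundaries; a simple dataset stays whole."""
    return list(dataset.constituents) if dataset.constituents else [dataset]


run1 = Dataset("run-1", files=["run1.a.root", "run1.b.root"])
run2 = Dataset("run-2", files=["run2.a.root"])
compound = Dataset("all-runs", constituents=[run1, run2])
sub_datasets = split(compound)
```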
Job
Interface enables users (and high-level services) to monitor and manage jobs on the grid
Generic properties
• State: running, succeeded, failed, paused, …
• Input parameters (e.g. application, task and dataset)
• Result (e.g. output dataset) after completion
Management
• Pause/resume
• Kill
• Update status
• Job management service to implement these
High-level services
High-level services use AJDL components
• Middleware does not
Typically high-level services are generic
• Only use generic properties of AJDL components
• Same service for different applications and datasets
• Different experiments or realms can share services
  - E.g. LHCb and ATLAS
Examples
• Analysis (transformation) service
• Job management
• Catalogs
Analysis service
Transformation service might be a better name
Provides means to create a concrete dataset
Interface functions
• Request dataset
  - Input is application, task and dataset
  - Output is job ID
  - Associated job carries ID for output dataset
• Fetch job description
  - Input is job ID
  - Output is job
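The two interface functions can be sketched as an in-memory Python class. The method names `request_dataset` and `fetch_job`, the ID formats and the `Job` fields are assumptions chosen to mirror the bullets above; no real service API is implied, and nothing is actually executed.

```python
import itertools
from dataclasses import dataclass
from typing import Dict


@dataclass
class Job:
    job_id: str
    application: str
    task: str
    input_dataset_id: str
    output_dataset_id: str  # the job carries the ID for the output dataset


class AnalysisService:
    """Toy in-memory stand-in for an analysis (transformation) service."""

    def __init__(self):
        self._jobs: Dict[str, Job] = {}
        self._counter = itertools.count(1)

    def request_dataset(self, application: str, task: str, dataset_id: str) -> str:
        """Input is application, task and dataset; output is a job ID."""
        job_id = f"job-{next(self._counter)}"
        output_id = f"{dataset_id}.out.{job_id}"  # made-up naming scheme
        self._jobs[job_id] = Job(job_id, application, task, dataset_id, output_id)
        return job_id

    def fetch_job(self, job_id: str) -> Job:
        """Input is a job ID; output is the job description."""
        return self._jobs[job_id]


service = AnalysisService()
job_id = service.request_dataset("athena", "my-task", "ds-1")
job = service.fetch_job(job_id)
```

Returning a job ID instead of the dataset itself lets the client return later and ask for the result.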
Analysis service (cont)
Example scenario for processing a high-level job
• Input is application, task, dataset and job configuration
• Map input virtual dataset to concrete representation
• Split into sub-datasets
• Create sub-job for each sub-dataset
• Stage files for each sub-job
• Locate and possibly install application
• Build (e.g. compile) task
• Run sub-jobs
• Gather and merge results to create output dataset
• Register output dataset (including replica)
• Job provides connection to output dataset and detailed job provenance
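The split, run, merge and register steps of this scenario can be sketched in Python with a toy histogramming job. Everything here is an illustrative assumption: the dataset is a plain list of numbers, the "application" fills a histogram with a `Counter`, sub-jobs run sequentially in-process, and the catalog is a dictionary. Staging, installation and task building are omitted.

```python
from collections import Counter


def split(events, n_sub):
    """Split the concrete dataset into at most n_sub sub-datasets."""
    size = max(1, -(-len(events) // n_sub))  # ceiling division, at least 1
    return [events[i:i + size] for i in range(0, len(events), size)]


def run_sub_job(sub_dataset, bin_width):
    """Toy application: histogram the values into bins of fixed width."""
    return Counter(int(value // bin_width) for value in sub_dataset)


def process(events, bin_width, n_sub, catalog, output_id):
    sub_datasets = split(events, n_sub)                       # split
    partial = [run_sub_job(s, bin_width) for s in sub_datasets]  # run sub-jobs
    merged = sum(partial, Counter())                          # gather and merge
    catalog[output_id] = merged                               # register output
    provenance = {"sub_jobs": len(sub_datasets), "bin_width": bin_width}
    return merged, provenance


catalog = {}
histogram, provenance = process([1.0, 2.5, 3.0, 7.2, 8.8], 5, 2, catalog, "hist-1")
```

Merging works here because histogram counts add; other result types would need their own merge rule.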
Job management service
Provide means to manage jobs
• Analysis service creating the job provides this
• May also want this functionality elsewhere
Accessed from job interface to implement management functions
• Might create job service (OGSI)
• Or job is a resource (WSRF)
Catalog services
Repositories
• Store AJDL components indexed by ID
Selection (metadata) catalogs
• Help user to select input data, task, …
VDC: Virtual Dataset Catalog
• Prescriptions for creating datasets
  - Application, task, input dataset
DRC: Dataset Replica Catalog
• Mapping between virtual and concrete datasets
Job catalog
• Detailed provenance for concrete datasets
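How the repository, VDC and DRC could work together is sketched below in Python. The class and method names and the `resolve` helper are hypothetical, and dictionaries stand in for whatever database backs the real catalogs.

```python
class Repository:
    """Stores components indexed by ID."""

    def __init__(self):
        self._items = {}

    def put(self, item_id, item):
        self._items[item_id] = item

    def get(self, item_id):
        return self._items[item_id]


class VirtualDatasetCatalog:
    """Prescription for creating a dataset: application, task, input dataset."""

    def __init__(self):
        self._prescriptions = {}

    def register(self, dataset_id, application_id, task_id, input_dataset_id):
        self._prescriptions[dataset_id] = (application_id, task_id, input_dataset_id)

    def prescription(self, dataset_id):
        return self._prescriptions[dataset_id]


class DatasetReplicaCatalog:
    """Maps a virtual dataset to its concrete replicas."""

    def __init__(self):
        self._replicas = {}

    def add_replica(self, virtual_id, concrete_id):
        self._replicas.setdefault(virtual_id, []).append(concrete_id)

    def replicas(self, virtual_id):
        return list(self._replicas.get(virtual_id, []))


def resolve(virtual_id, vdc, drc):
    """Reuse an existing replica if there is one, else return the prescription."""
    existing = drc.replicas(virtual_id)
    if existing:
        return ("use", existing[0])
    return ("create", vdc.prescription(virtual_id))


vdc = VirtualDatasetCatalog()
drc = DatasetReplicaCatalog()
vdc.register("hist-v1", "app-1", "task-1", "events-v1")
before = resolve("hist-v1", vdc, drc)
drc.add_replica("hist-v1", "hist-v1@site-a")
after = resolve("hist-v1", vdc, drc)
```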
Implementation strategy
Define AJDL
• Components, nature, interfaces
Implement catalogs
• Tables in AMI
• Programmatic interface
  - (C++ with Python binding)
Analysis services
• Start with existing services or analogs
  - DIAL, ATCOM, Capone, GANGA, …
• Different implementations for different strategies
• At least one using ARDA middleware
Implementation strategy (cont)
User interface
• Programmatic interface to high-level services and AJDL components
  - C++, Python and eventually Java bindings
• GANGA will provide Python binding and use it to deliver a GUI
  - Extensible design: client tools plug into Python bus
Middleware
• Whatever works to begin
• ARDA services will be used in that context
  - Like to see better integration with other middleware efforts
Implementation strategy (cont)
Web service infrastructure
• Short term: use independent persistent services
• Mid-term: follow ARDA strategy
  - GAS: grid access service
• Long term: follow standards such as WSRF
  - Dataset and job become resources?
Releases
• Deliver working prototype in May
  - Robust enough for average physicist
• Regular releases adding functionality, improving performance and incorporating new middleware
Effort providers
Look to the following for effort:
• GANGA for user interface and more
• DIAL for interactive analysis service
• ARDA integration team for ARDA analysis service
• ARDA/EGEE and US grid projects for middleware
• POOL for datasets and metadata?
• SEAL for Python-C++ integration
  - Later Java as well?
• ATLAS physics and computing groups for ATLAS-specific pieces
  - ATLAS applications and datasets
  - System testing and evaluation
ARDA
ARDA begins April 1
Two areas in LCG:
• Middleware development (1st report delivered)
• Integration team
ATLAS ARDA prototype
• Collaboration in context of integration team
• Deliver at least one analysis service based on ARDA middleware
• We would also like to collaborate on AJDL and other high-level services
Role of GANGA
Look to GANGA to provide
• Python binding (or implementation) for AJDL
• Client tools
  - Job submission
  - Job monitoring and management
  - Task management
    > Including JOE
• Comprehensive graphical analysis environment
  - Including the above client tools
• LCG analysis service?
• Help with system integration and testing
• And more…
Connection to LHCb
To be determined
• This meeting?
My ideal is that ATLAS and LHCb share a system
• Along lines of the architecture described here
• Most GANGA effort directed toward delivering generic high-level services and client tools
Implications
• Most of the effort expended by GANGA developers is directly usable by both experiments
• Easy for others outside GANGA to contribute pieces
• Use by two experiments validates the idea of generic tools and services
More information
ADA home page:
• http://www.usatlas.bnl.gov/ADA
• This page has links to other projects