
Page 1

ATLAS Data Challenges

US ATLAS Physics & Computing

ANL, October 30th 2001

Gilbert Poulard, CERN EP-ATC

Page 2

Outline

ATLAS Data Challenges & “LHC Computing Grid Project”

Goals
Scenarios
Organization

Page 3

From CERN Computing Review

CERN Computing Review (December 1999 - February 2001) recommendations:

organize the computing for the LHC era
LHC Computing Grid project
• Phase 1: development & prototyping (2001-2004)
• Phase 2: installation of the 1st production system (2005-2007)
Software & Computing Committee (SC2)

The project was accepted by the CERN Council (20 September)

The experiments are asked to validate their computing model by iterating on a set of Data Challenges of increasing complexity

However, DCs were already in our plans

Page 4

LHC Computing Grid Project, Phase 1:

Prototype construction
• develop Grid middleware
• acquire experience with high-speed wide-area networks
• develop a model for distributed analysis
• adapt LHC applications
• deploy a prototype (CERN + Tier1 + Tier2)

Software
• complete the development of the 1st version of the physics applications and enable them for the distributed grid model
• develop & support common libraries, tools & frameworks
– including simulation, analysis, data management, ...

In parallel the LHC collaborations must develop and deploy the first version of their core software

Page 5

ATLAS Data Challenges: Goals

Understand and validate:
our computing model, our data model and our software
our technology choices

How? By iterating on a set of DCs of increasing complexity:
start with data which look like real data
run the filtering and reconstruction chain
store the output data in our database
run the analysis
produce physics results

To study: performance issues, database technologies, analysis scenarios, ...

To identify: weaknesses, bottlenecks, etc. (but also strong points)

Page 6

ATLAS Data Challenges

But: today we don't have 'real data'

We need to produce 'simulated data' first, so:

• physics event generation
• simulation
• pile-up
• detector response
• plus reconstruction and analysis

will be part of the first Data Challenges

We also need to "satisfy" the ATLAS communities: HLT, physics groups, ...

Page 7

ATLAS Data Challenges: DC0

DC0 (November-December 2001): a 'continuity' test through the software chain

the aim is primarily to check the state of readiness for DC1
we plan ~100k Z+jet events, or similar
to validate the software:
• issues to be checked include
– G3 simulation running with the 'latest' version of the geometry
– reconstruction running

Re-analyze part of the Physics TDR data, "reading from & writing to Objectivity"
this would test the "Objectivity database infrastructure"
complementary to the "continuity test"

Page 8

ATLAS Data Challenges: DC1

DC1 (February-July 2002): reconstruction & analysis on a large scale

learn about the data model and I/O performance; identify bottlenecks ...

use of the Grid as and when possible and appropriate

data management:
use (evaluate) more than one database technology (Objectivity and ROOT I/O)
learn about distributed analysis

should involve CERN & outside-CERN sites
site planning is going on; an incomplete list already includes sites from Canada, France, Italy, Japan, UK, US, Russia
scale: 10^7 events in 10-20 days, O(1000) PCs

data needed by HLT & physics groups (others?)
study performance of Athena and algorithms for use in HLT
simulation & pile-up will play an important role

checking of Geant4 versus Geant3

Page 9

ATLAS Data Challenges: DC1

DC1 will have two distinct phases:
first, production of events for the HLT TDR, where the primary concern is delivery of events to the HLT community
second, testing of software (G4, databases, detector description, etc.) with delivery of events for physics studies
the software will change between these two phases

Simulation & pile-up will be of great importance
strategy to be defined (I/O rate, number of "event" servers?)

As we want to do it 'world-wide', we will 'port' our software to the Grid environment and use the Grid middleware as much as possible (an ATLAS kit is to be prepared)

Page 10

ATLAS Data Challenges: DC2

DC2 (Spring-Autumn 2003): the scope will depend on what has and has not been achieved in DC0 & DC1
At this stage the goals include:

use of the 'TestBed' which will be built in the context of Phase 1 of the "LHC Computing Grid Project"
• scale: a sample of 10^8 events
• system complexity ~50% of the 2006-2007 system

extensive use of the Grid middleware
Geant4 should play a major role
physics samples could (should) have 'hidden' new physics
calibration and alignment procedures should be tested

Maybe to be synchronized with "Grid" developments

Page 11

DC scenario

Production chain:
Event generation
Detector simulation
Pile-up
Detector response
Reconstruction
Analysis

These steps should be as independent as possible (see the sketch below)
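To make the independence concrete, here is a minimal, purely illustrative Python sketch (not ATLAS code): each stage is a stand-alone function that reads its input from a file and writes its output to a file, so any stage can be rerun, replaced or distributed on its own. Only the stage names come from the chain above; all file names, formats and fields are invented placeholders.

```python
# Illustrative sketch only: each production step reads a file and writes a file,
# so the steps stay independent and can run separately or on different sites.
import json

def generate_events(n_events, out_path):
    # Stand-in for an event generator (e.g. Pythia); writes dummy event records.
    events = [{"event_id": i, "particles": []} for i in range(n_events)]
    json.dump(events, open(out_path, "w"))

def simulate(in_path, out_path):
    # Stand-in for detector simulation (Geant3/Geant4): adds "hits" per event.
    events = json.load(open(in_path))
    for ev in events:
        ev["hits"] = []          # detector hits would be produced here
    json.dump(events, open(out_path, "w"))

def add_pileup_and_digitize(in_path, out_path):
    # Stand-in for pile-up mixing and detector response: adds "digits" per event.
    events = json.load(open(in_path))
    for ev in events:
        ev["digits"] = []        # digitized detector response would go here
    json.dump(events, open(out_path, "w"))

def reconstruct(in_path, out_path):
    # Stand-in for reconstruction (e.g. in Athena): produces summary quantities.
    events = json.load(open(in_path))
    summaries = [{"event_id": ev["event_id"], "tracks": 0} for ev in events]
    json.dump(summaries, open(out_path, "w"))

if __name__ == "__main__":
    # Run the chain end to end; any single step could also be rerun on its own.
    generate_events(100, "gen.json")
    simulate("gen.json", "sim.json")
    add_pileup_and_digitize("sim.json", "digi.json")
    reconstruct("digi.json", "reco.json")
```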

Page 12

Production stream (program) | Input | Output | Framework
Event generation (Pythia, others) | none | Ntuple/FZ, OO-db | Atlsim/Genz, Athena
Detector simulation (Geant3/Dice, Geant4) | Ntuple/FZ, OO-db | FZ, OO-db | Atlsim, FADS/Goofy
Pile-up & detector response (Atlsim/Dice) | FZ, OO-db | FZ, OO-db | Atlsim, Athena
Data conversion | FZ | OO-db | Athena
Reconstruction | OO-db | OO-db, "Ntuple" | Athena, "Atrecon?"
Analysis | "Ntuple" | - | Paw/Root, Anaphe/Jas

"OO-db" is used for "OO database"; it could be Objectivity, ROOT I/O, ...
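The table above can also be read as a declarative description of the chain. The snippet below is a purely illustrative sketch (not an ATLAS tool): it encodes each production step with the input and output formats listed in the table, plus a small helper that a driver or bookkeeping script could use to check that one step's output format matches the next step's input.

```python
# Illustrative encoding of the production-stream table; step names, programs and
# formats are taken from the table above, the structure itself is only a sketch.
PRODUCTION_STEPS = [
    {"step": "event generation", "programs": ["Pythia", "others"],
     "input": None, "output": ["Ntuple/FZ", "OO-db"],
     "framework": ["Atlsim/Genz", "Athena"]},
    {"step": "detector simulation", "programs": ["Geant3/Dice", "Geant4"],
     "input": ["Ntuple/FZ", "OO-db"], "output": ["FZ", "OO-db"],
     "framework": ["Atlsim", "FADS/Goofy"]},
    {"step": "pile-up & detector response", "programs": ["Atlsim/Dice"],
     "input": ["FZ", "OO-db"], "output": ["FZ", "OO-db"],
     "framework": ["Atlsim", "Athena"]},
    {"step": "data conversion", "programs": [],
     "input": ["FZ"], "output": ["OO-db"], "framework": ["Athena"]},
    {"step": "reconstruction", "programs": [],
     "input": ["OO-db"], "output": ["OO-db", "Ntuple"],
     "framework": ["Athena", "Atrecon?"]},
    {"step": "analysis", "programs": [],
     "input": ["Ntuple"], "output": [],
     "framework": ["Paw/Root", "Anaphe/Jas"]},
]

def output_feeds_input(producer, consumer):
    # True if at least one output format of 'producer' is accepted by 'consumer'.
    return bool(set(producer["output"]) & set(consumer["input"] or []))
```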

Page 13

DC1

[Diagram: DC1 software chain. Event generators (Pythia, Isajet, Herwig, MyGeneratorModule) produce HepMC events (Obj., Root); ATLFAST (OO) produces Ntuple output (Obj., Root); GENZ/ZEBRA output feeds G3/DICE simulation (RD event? OO-DB?) and, together with G4 output (Obj.), ATHENA reconstruction, which produces combined Ntuples (Obj., Root) and Ntuple-like output.]

Missing:
-- filter, trigger
-- detector description
-- HepMC in Root
-- digitisation
-- ATLFAST output in Root (TObjects)
-- pile-up
-- link MC truth - ATLFAST
-- reconstruction output in Obj., Root
-- EDM (e.g. G3/DICE, G4 input to ATHENA)

Page 14

Detector Simulation: Geant3 and Geant4

For HLT & physics studies we will use Geant3
continuity with past studies
ATLAS validation of Geant4 is proceeding well but is not yet complete
detector simulation in Atlsim (ZEBRA output)

Some production with Geant4 too
goals to be defined with the G4 and physics groups
it is important to get experience with 'large production' as part of G4 validation
it is important to use the same geometry input
in the early stage we could decide to use only part of the detector
it would also be good to use the same sample of generated events
for detector simulation we propose to use the FADS/Goofy framework
output will be 'hits collections' in OO-db

Detector response (& pile-up) has to be worked on in the new framework

Page 15

Reconstruction

Reconstruction:
we want to use the 'new reconstruction' code run in the Athena framework
input should be from OO-db
output in OO-db:

ESD (event summary data)
AOD (analysis object data)
TAG (event tag)
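As a purely illustrative sketch (not the ATLAS event data model), the three output tiers can be pictured as progressively smaller views of the same event. Only the tier names ESD, AOD and TAG come from the slide; every field below is an invented placeholder.

```python
# Toy picture of the three reconstruction output tiers; field names are invented.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ESD:                     # event summary data: detailed reconstruction output
    event_id: int
    clusters: List[dict] = field(default_factory=list)
    tracks:   List[dict] = field(default_factory=list)

@dataclass
class AOD:                     # analysis object data: reduced physics objects
    event_id: int
    electrons: List[dict] = field(default_factory=list)
    jets:      List[dict] = field(default_factory=list)

@dataclass
class TAG:                     # event tag: a few quantities used to select events
    event_id: int
    n_jets: int = 0
    missing_et: float = 0.0

def make_tag(esd: ESD) -> TAG:
    # In a real chain the ESD -> AOD -> TAG reduction happens during reconstruction.
    return TAG(event_id=esd.event_id, n_jets=0, missing_et=0.0)
```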

Page 16

Analysis

We are just starting to work on this, but analysis tools evaluation should be part of the DC

It will be a good test of the Event Data Model
Performance issues should be evaluated

Analysis scenario:
it is important to know the number of analysis groups, the number of physicists per group, and the number of people who want to access the data at the same time
it is of 'first' importance to 'design' the analysis environment
• to measure the response time
• to identify the bottlenecks
for that, users' input is needed

Page 17

Data management

It is a key issue
Evaluation of more than one technology is part of DC1
Infrastructure has to be put in place:

for Objectivity & ROOT I/O
• software, hardware, tools to manage the data
– creation, replication, distribution
discussed in the database workshop

Tools are needed to run the production:
"bookkeeping", "cataloguing", ...
run number, random-number allocation, ... (a toy sketch is given below)
a working group is now set up

Job submission
Close collaboration with ATLAS Grid (validation of Release 1)
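As a purely illustrative sketch of the bookkeeping and run/random-number allocation mentioned above (none of these class or file names are real ATLAS tools), a minimal catalogue could record, for each production job, the run number and random seed it was assigned and the output files it produced, so that no run number or seed is ever handed out twice.

```python
# Toy production bookkeeping: allocate unique run numbers and random seeds,
# and record what each job produced. Purely illustrative, not an ATLAS tool.
import json

class Bookkeeper:
    def __init__(self, first_run=1, first_seed=1):
        self.next_run = first_run
        self.next_seed = first_seed
        self.jobs = []                      # catalogue of all allocated jobs

    def allocate_job(self, site, n_events):
        # Hand out a run number and a random seed exactly once.
        job = {"run": self.next_run, "seed": self.next_seed,
               "site": site, "n_events": n_events, "outputs": []}
        self.next_run += 1
        self.next_seed += 1
        self.jobs.append(job)
        return job

    def register_output(self, run, filename):
        # Attach an output file to the catalogue entry of the job that made it.
        for job in self.jobs:
            if job["run"] == run:
                job["outputs"].append(filename)

    def save(self, path):
        json.dump(self.jobs, open(path, "w"), indent=2)

if __name__ == "__main__":
    bk = Bookkeeper()
    job = bk.allocate_job(site="CERN", n_events=1000)
    bk.register_output(job["run"], "dc1.simul.run%06d.zebra" % job["run"])
    bk.save("catalogue.json")
```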

Page 18

DC1-HLT - CPU

Process | Number of events | Time per event (SI95 sec) | Total time (SI95 sec) | Total time (SI95 hours)
simulation | 10^7 | 3000 | 3 x 10^10 | 10^7
reconstruction | 10^7 | 640 | 6.4 x 10^9 | 2 x 10^6
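A quick consistency check of the totals in the table (illustrative only; SI95 is the benchmark unit used in the slide):

```python
# simulation: 1e7 events x 3000 SI95-sec = 3e10 SI95-sec ~ 8.3e6 SI95-hours (~10^7)
# reconstruction: 1e7 events x 640 SI95-sec = 6.4e9 SI95-sec ~ 1.8e6 SI95-hours (~2e6)
for name, n_events, sec_per_event in [("simulation", 1e7, 3000),
                                      ("reconstruction", 1e7, 640)]:
    total_sec = n_events * sec_per_event
    print(name, total_sec, total_sec / 3600.0)
```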

Page 19

DC1-HLT - data

Process | Number of events | Event size (MB) | Total size (GB) | Total size (TB)
simulation | 10^7 | 2 | 20000 | 20
reconstruction | 10^7 | 0.5 | 5000 | 5

Page 20

DC1-HLT data with pile-up

Luminosity | Number of events | Event size (MB) | Total size (GB) | Total size (TB)
2 x 10^33 | 1.5 x 10^6 | (1) 2.6, (2) 4.7 | 4000, 7000 | 4, 7
10^34 | 1.5 x 10^6 | (1) 6.5, (2) 17.5 | 10000, 26000 | 10, 26

In addition to ‘simulated’ data, assuming ‘filtering’ after simulation (~14% of the events kept).

- (1) keeping only ‘digits’

- (2) keeping ‘digits’ and ‘hits’
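A quick illustrative check of how these pile-up numbers fit together; the ~14% filter fraction and the event sizes are taken from the table and note above:

```python
# ~14% of the 1e7 simulated events kept after filtering -> ~1.4e6, i.e. ~1.5e6 events.
n_kept = 0.14 * 1e7
# Total size then follows from events x event size, e.g. at L = 2e33 keeping digits only:
size_tb = 1.5e6 * 2.6 / 1e6          # MB -> TB, ~3.9 TB, consistent with ~4 TB in the table
print(n_kept, size_tb)
```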

Page 21

Ramp-up scenario @ CERN

[Chart: CPU ramp-up at CERN; number of CPUs (scale 0-400) versus week in 2002 (weeks 7, 11, 16, 20, 24, 25, 26).]

Page 22

What next

This week:

Have an updated list of goals & requirements prepared with:
HLT and physics communities
simulation, reconstruction and database communities
people working on 'infrastructure' activities
• bookkeeping, cataloguing, ...

Have a list of tasks
some physics oriented
but also things like testing code, running production, ...
with 'established' responsibilities and priorities

And working groups in place
You can join

Page 23

What next

In parallel:
Define the ATLAS validation plan for:
EU-DataGrid Release 1
ATLAS software for DC0 and DC1

Understand the involvement of Tier centers
Ensure that we have the necessary resources @ CERN and outside CERN
We already have some input and a 'table' is being prepared

“And turn the key”