The ATLAS Computing Model: Status, Plans and Future Possibilities. Shawn McKee, University of Michigan...
TRANSCRIPT
The ATLAS Computing Model: Status, Plans and Future Possibilities
Shawn McKee
University of Michigan
CCP 2006, Gyeongju, Korea
August 29th, 2006
Overview
The ATLAS collaboration has only a year before it must manage large amounts of "real" data for its globally distributed membership.
ATLAS physicists need the software and physical infrastructure required to:
- Calibrate and align detector subsystems to produce well understood data
- Realistically simulate the ATLAS detector and its underlying physics
- Provide access to ATLAS data globally
- Define, manage, search and analyze datasets of interest
I will cover current status, plans and some of the relevant research in this area, and indicate how it might benefit ATLAS in augmenting and extending its infrastructure.
The ATLAS Computing Model
The Computing Model is fairly well evolved and documented in the Computing TDR (C-TDR): http://doc.cern.ch//archive/electronic/cern/preprints/lhcc/public/lhcc-2005-022.pdf
There are many areas with significant questions/issues to be resolved:
- Calibration and alignment strategy is still evolving
- Physics data access patterns may be exercised (SC4: since June), but we are unlikely to know the real patterns until 2007/2008!
- There are still uncertainties in the event sizes and reconstruction time
- How best to integrate ongoing "infrastructure" improvements from research efforts into our operating model?
A lesson from the previous round of experiments at CERN (LEP, 1989-2000): reviews in 1988 underestimated the computing requirements by an order of magnitude!
ATLAS Computing Model Overview
We have a hierarchical model (EF-T0-T1-T2) with specific roles and responsibilities:
- Data will be processed in stages: RAW -> ESD -> AOD -> TAG
- Data "production" is well-defined and scheduled
- Roles and responsibilities are assigned within the hierarchy
Users will send jobs to the data and extract relevant data, typically NTuples or similar.
The goal is a production and analysis system with seamless access to all ATLAS grid resources.
All resources need to be managed effectively to ensure ATLAS goals are met and resource providers' policies are enforced; grid middleware must provide this.
ATLAS Facilities and Roles
Event Filter Farm at CERN
- Assembles data (at CERN) into a stream to the Tier 0 Center

Tier 0 Center at CERN
- Data archiving: raw data to mass storage at CERN and to Tier 1 centers
- Production: fast production of Event Summary Data (ESD) and Analysis Object Data (AOD)
- Distribution: ESD, AOD to Tier 1 centers and mass storage at CERN

Tier 1 Centers distributed worldwide (10 centers)
- Data stewardship: re-reconstruction of the raw data they archive, producing new ESD, AOD
- Coordinated access to full ESD and AOD (all AOD, 20-100% of ESD depending upon site)

Tier 2 Centers distributed worldwide (approximately 30 centers)
- Monte Carlo simulation, producing ESD, AOD; ESD, AOD sent to Tier 1 centers
- On-demand user physics analysis of shared datasets

Tier 3 Centers distributed worldwide
- Physics analysis

A CERN Analysis Facility
- Analysis
- Enhanced access to ESD and RAW/calibration data on demand
Computing Model: event data flow from EF
Events are written in "ByteStream" format by the Event Filter farm in 2 GB files
- ~1000 events/file (nominal size is 1.6 MB/event)
- 200 Hz trigger rate (independent of luminosity)

Currently 4+ streams are foreseen:
- Express stream with the "most interesting" events
- Calibration events (including some physics streams, such as inclusive leptons)
- "Trouble maker" events (for debugging)
- Full (undivided) event stream

One 2-GB file every 5 seconds will be available from the Event Filter.
Data will be transferred to the Tier-0 input buffer at 320 MB/s (average).
The Tier-0 input buffer will have to hold raw data waiting for processing, and also cope with possible backlogs
- ~125 TB will be sufficient to hold 5 days of raw data on disk (a quick arithmetic check follows below)
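These numbers hang together; a minimal back-of-envelope check (my arithmetic, not from the slide; note that at exactly 1.6 MB/event a 2 GB file holds ~1250 events and appears every ~6 s, so the slide's "~1000 events, one per 5 s" are round figures):

```python
# Back-of-envelope check of the Event Filter -> Tier-0 figures quoted above.
event_size_mb = 1.6         # nominal RAW event size
trigger_rate_hz = 200       # EF output rate, independent of luminosity

rate_mb_s = event_size_mb * trigger_rate_hz        # 320 MB/s average into Tier-0
events_per_file = 2000 / event_size_mb             # ~1250 events per 2 GB file
file_interval_s = 2000 / rate_mb_s                 # one 2 GB file every ~6.3 s
five_day_backlog_tb = rate_mb_s * 86400 * 5 / 1e6  # ~138 TB of RAW in 5 days

print(f"{rate_mb_s:.0f} MB/s; one 2 GB file per {file_interval_s:.1f} s; "
      f"{five_day_backlog_tb:.0f} TB over 5 days")
```

The 5-day result (~138 TB) lands near the quoted ~125 TB once binary units or realistic duty factors are folded in.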
ATLAS Data Processing
Tier-0:
- Prompt first-pass processing on express/calibration & physics streams
- Within 24-48 hours, process full physics streams with reasonable calibrations
- Implies large data movement from T0 -> T1s, and some T0 <-> T2 (calibration)

Tier-1:
- Reprocess 1-2 months after arrival with better calibrations
- Reprocess all local RAW at year end with improved calibration and software
- Implies large data movement from T1 <-> T1 and T1 -> T2
ATLAS partial & "average" T1 Data Flow (2008)

[Diagram (slide from D. Barberis): data flows for an "average" Tier-1 in 2008, among the Tier-0, the Tier-1's tape, disk buffer, disk storage and CPU farm, the other Tier-1s, and each associated Tier-2. The arrow endpoints are only partially recoverable from this transcript; the labelled flows are:
- RAW: 1.6 GB/file, 0.02 Hz, 1.7K files/day, 32 MB/s, 2.7 TB/day
- ESD2: 0.5 GB/file, 0.02 Hz, 1.7K files/day, 10 MB/s, 0.8 TB/day
- AOD2: 10 MB/file, 0.2 Hz, 17K files/day, 2 MB/s, 0.16 TB/day
- AODm2: 500 MB/file, 0.004 Hz, 0.34K files/day, 2 MB/s, 0.16 TB/day
- AODm2 (exchanged with the other Tier-1s): 500 MB/file, 0.036 Hz, 3.1K files/day, 18 MB/s, 1.44 TB/day
- ESD1: 0.5 GB/file, 0.02 Hz, 1.7K files/day, 10 MB/s, 0.8 TB/day
- AODm1 (x2) and AODm2: 500 MB/file, 0.04 Hz, 3.4K files/day, 20 MB/s, 1.6 TB/day each
- Combined RAW + ESD2 + AODm2 from the Tier-0: 0.044 Hz, 3.74K files/day, 44 MB/s, 3.66 TB/day
- RAW + ESD (2x) + AODm (10x) through the CPU farm: 1 Hz, 85K files/day, 720 MB/s]

Plus simulation and analysis data flow.
There are a significant number of flows to be managed and optimized.
ATLAS Event Data Model
RAW: "ByteStream" format, ~1.6 MB/event

ESD (Event Summary Data): full output of reconstruction in object (POOL/ROOT) format: tracks (+ their hits), calo clusters, calo cells, combined reconstruction objects, etc.
- Nominal size 500 kB/event, currently 2.5 times larger: contents and technology under revision

AOD (Analysis Object Data): summary of event reconstruction with "physics" (POOL/ROOT) objects: electrons, muons, jets, etc.
- Nominal size 100 kB/event, currently 70% of that: contents and technology under revision

TAG: database used to quickly select events in AOD and/or ESD files
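To put the per-event sizes in context, a small illustrative calculation (my own, assuming the 200 Hz trigger rate quoted earlier and a canonical 1e7 s of data taking per year; the 1e7 s figure is an assumption, not a number from this talk):

```python
# Illustrative yearly volumes implied by the nominal per-event sizes above.
# Assumes 200 Hz (from the event-flow slide) and 1e7 s/year of data taking
# (a standard planning figure, not stated in this transcript).
sizes_mb = {"RAW": 1.6, "ESD": 0.5, "AOD": 0.1}   # nominal MB/event
events_per_year = 200 * 1e7

for name, mb in sizes_mb.items():
    print(f"{name}: {mb * events_per_year / 1e9:.1f} PB/year")
# -> RAW: 3.2 PB/year, ESD: 1.0 PB/year, AOD: 0.2 PB/year (nominal sizes)
```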
ATLAS Data Streaming
The ATLAS Computing TDR had 4 streams from the event filter: primary physics, calibration, express, problem events
- The calibration stream has split at least once since!

Discussions are focused upon optimisation of data access.
At AOD, we envisage ~10 streams.
TAGs are useful for event selection and dataset definition.
We are now planning ESD and RAW streaming
- Straw-man streaming schemes (trigger based) are being agreed
- Will explore the access improvements in large-scale exercises
- Also looking at overlaps, bookkeeping, etc.
HEP Data Analysis
Raw data: hits, pulse heights
Reconstructed data (ESD): tracks, clusters...
Analysis Objects (AOD): physics objects, summarized, organized by physics topic
Ntuples, histograms, statistical data
Production Data Processing

[Diagram: the production processing chain. The data acquisition system (Level 3 trigger) writes raw data and trigger tags; reconstruction, steered by calibration data, run conditions and the trigger system, produces Event Summary Data (ESD) and event tags. In parallel, physics models feed event generation (Monte Carlo truth data), detector simulation produces MC raw data, and reconstruction of these yields MC event summary data and MC event tags. Coordination is required at the collaboration and group levels.]
Physics Analysis

[Diagram: the analysis chain. Event tags drive event selection; analysis processing runs over raw data and ESD with calibration data, producing analysis objects (physics objects and statistical objects). The work is layered by tier: Tier 0/1 handle collaboration-wide processing, Tier 2 serves the analysis groups, and Tiers 3/4 serve individual physicists, who iterate physics analysis over the physics and statistical objects.]
ATLAS Resource Requirements for 2008
Recent (July 2006) updates have reduced the expected contributions:

                  CPU (MSI2k)   Tape (PB)   Disk (PB)
  Tier-0              3.7          2.1         0.2
  CERN AF             2.1          0.3         1.0
  Sum of Tier-1s     16.7          6.0         7.6
  Sum of Tier-2s     18.9          0.0         6.1
  Total              41.4          8.4        14.9

(Computing TDR)
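As a quick self-consistency check (mine, not part of the slide), the Total row is just the column sums:

```python
# Verify that the Total row of the table above equals the column sums.
rows = {                          # (CPU MSI2k, Tape PB, Disk PB)
    "Tier-0":         (3.7, 2.1, 0.2),
    "CERN AF":        (2.1, 0.3, 1.0),
    "Sum of Tier-1s": (16.7, 6.0, 7.6),
    "Sum of Tier-2s": (18.9, 0.0, 6.1),
}
totals = [round(sum(col), 1) for col in zip(*rows.values())]
print(totals)   # -> [41.4, 8.4, 14.9], matching the quoted Total row
```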
ATLAS Grid Infrastructure
ATLAS plans to use grid technology to meet its resource needs and to manage those resources.
Three grids: LCG, NorduGrid, OSG
- Significant resources, but different middleware
- Teams working on solutions are typically associated with one grid and its middleware
In principle all ATLAS resources are available to all ATLAS users
- Works out to O(1) CPU per user
- There is interest by ATLAS users in using their local systems with priority
- Not only a central system; flexibility concerning middleware
Plan "A" is "the Grid"... there is no plan "B"
ATLAS Virtual Organization
Until recently the Grid has been a "free for all":
- No CPU or storage accounting (new, in a prototyping/testing phase)
- No or limited priorities (roles mapped to a small number of accounts: atlas01-04)
- No storage space reservation

Last year ATLAS saw competition for resources between "official" Rome productions and "unofficial", but organized, productions (B-physics, flavour tagging...).

The latest release of the VOMS (Virtual Organisation Management Service) middleware package allows the definition of user groups and roles within the ATLAS Virtual Organisation, and is used by all ATLAS grid flavors!

Relative priorities are easy to enforce IF all jobs go through the same system. For a distributed submission system, it is up to the resource providers to:
- agree the policies of each site with ATLAS
- publish and enforce the agreed policies
(A schematic illustration of mapping groups/roles to shares follows below.)
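Purely as an illustration (not ATLAS or VOMS code), here is a minimal sketch of how a submission system might map VOMS FQANs (the group/role strings carried in a VOMS proxy) onto relative shares; the FQAN syntax follows real VOMS conventions, but the group names and share values are invented:

```python
# Hypothetical mapping from VOMS groups/roles (FQANs) to relative shares.
shares = {
    "/atlas/Role=production": 70,   # official, scheduled production
    "/atlas/phys-beauty":     15,   # an organized group activity (invented)
    "/atlas":                 15,   # all other ATLAS VO members
}

def share_for(fqans):
    """VOMS proxies carry FQANs in priority order; use the first match."""
    for fqan in fqans:
        if fqan in shares:
            return shares[fqan]
    return 0

print(share_for(["/atlas/phys-beauty", "/atlas"]))   # -> 15
```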
Calibrating and Aligning ATLAS
Calibrating and aligning detector subsystems is a critical process; without well-understood detectors we will have no meaningful physics data.
The default option for offline prompt calibrations is processing at Tier-0 or at the CERN Analysis Facility; however, the TDR states that:
- "Tier-2 centres will provide analysis facilities, and some will provide the capacity to produce calibrations based on processing raw data."
- "Tier-2 facilities may take a range of significant roles in ATLAS such as providing calibration constants, simulation and analysis."
- "Some Tier-2s may take significant role in calibration following the local detector interests and involvements."
ATLAS will have some subsystems utilizing Tier-2 centers as calibration and alignment sites.
- Must ensure we can support the data flow without disrupting other planned flows
- The real-time aspect is critical: the system must account for "deadlines"
Proposed ATLAS Muon Calibration System
(quoted bandwidths are for a 10 kHz muon rate)

[Diagram: inside each L2PU, worker threads fill a memory queue with calibration data; a dequeue thread ships it over the control network (TCP/IP, UDP, etc.) at ~500 kB/s to a Local Server. The Local Servers (x25) feed Gatherers (x~20), which forward a merged ~10 MB/s stream via the Calibration Server to the calibration farm disk. Numbered steps 1-6 in the original figure mark the stages of this chain.]
ATLAS Simulations
Within ATLAS the Tier-2 centers will be responsible for the bulk of the simulation effort.
Current planning assumes ATLAS will simulate approximately 20% of the real data volume.
- This number is dictated by resources; ATLAS may need to find a way to increase this fraction.
The event generator framework interfaces multiple packages, including the Genser distribution provided by LCG-AA.
Simulation with Geant4 since early 2004:
- automatic geometry build from GeoModel
- >25M events fully simulated since mid-2004
- only a handful of crashes!
Digitization tested and tuned with Test Beam.
ATLAS Analysis Computing Model
The ATLAS analysis model is broken into two components:
- Scheduled central production of augmented AOD, tuples & TAG collections from ESD; derived files moved to other T1s and to T2s
- Chaotic user analysis of augmented AOD streams, tuples, new selections etc., and individual user simulation and CPU-bound tasks matching the official MC production; modest to large(?) job traffic between T2s (and T1s, T3s)
Distributed Analysis
At this point the emphasis is on a batch model to implement the ATLAS Computing Model
- Interactive solutions are difficult to realize on top of the current middleware layer
We expect ATLAS users to send large batches of short jobs to optimize their turnaround; the issues are:
- Scalability
- Data access
- Job priorities (analysis in parallel with production)
Distributed analysis effectiveness depends strongly upon the hardware and software infrastructure.
Analysis is divided into "group" and "on demand" types.
ATLAS Group Analysis
Group analysis is characterised by access to full ESD and perhaps RAW data
- This is resource intensive and must be a scheduled activity
- Can back-navigate from AOD to ESD at the same site
- Can harvest small samples of ESD (and some RAW) to be sent to Tier 2s
- Must be agreed by physics and detector groups

Group analysis will produce:
- Deep copies of subsets
- Dataset definitions
- TAG selections

Big Trains
- Most efficient access if analyses are blocked into a "big train"
- The idea has been around for a while, and is already used in e.g. heavy ions
- Each wagon (group) has a wagon master (production manager), who must ensure it will not derail the train
- The train must run often enough (every ~2 weeks?)
ATLAS On-demand Analysis
Restricted Tier 2s and CAF
- Could specialize some Tier 2s for some groups; ALL Tier 2s are for ATLAS-wide usage

Role- and group-based quotas are essential
- Quotas to be determined per group, not per user

Data selection
- Over small samples with Tier-2 file-based TAG and the AMI dataset selector
- TAG queries over larger samples by batch job to the database TAG at Tier-1s/large Tier-2s

What data?
- Group-derived EventViews
- Root Trees
- Subsets of ESD and RAW, pre-selected or selected via a Big Train run by the working group

Each user needs 14.5 kSI2k (about 12 current boxes).
2.1 TB is 'associated' with each user on average.
ATLAS Data Management
Based on datasets (a toy sketch of the dataset idea follows below)
- The PoolFileCatalog API is used to hide grid differences
- On LCG, the LFC acts as the local replica catalog
- Aims to provide uniform access to data on all grids

FTS is used to transfer data between the sites
- To date FTS has tried to manage data flow by restricting allowed endpoints ("channel" definitions)
- Interesting possibilities exist to incorporate network-related research advances to improve performance, efficiency and reliability

Data management is a central aspect of Distributed Analysis
- PANDA is closely integrated with DDM and operational
- The LCG instance was closely coupled with SC3; right now we run a smaller instance for test purposes
- The final production version will be based on new middleware for SC4 (FPS)
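The dataset abstraction is the key design choice: sites subscribe to datasets, and the system resolves them to files and schedules the transfers. A toy schematic of that idea (mine, not DQ2/DDM code; the dataset and file names are invented):

```python
# Toy schematic of dataset-level data management: a subscription asks that
# a dataset be complete at a site; the system derives the missing files.
catalog = {"csc11.zmumu.ESD": ["f1.root", "f2.root", "f3.root"]}  # dataset -> files
replicas = {"BNL": {"f1.root"}}                                   # site -> files held

def missing_files(dataset, site):
    """Files still to be transferred so 'site' holds the complete dataset."""
    return [f for f in catalog[dataset] if f not in replicas.get(site, set())]

print(missing_files("csc11.zmumu.ESD", "BNL"))   # -> ['f2.root', 'f3.root']
```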
Distributed Data Management

Accessing distributed data on the Grid is not a simple task (see below!)
- Several DBs are needed centrally to hold dataset information
- "Local" catalogues hold information on local data storage
The new DDM system (shown at right in the original slide) is under test this summer.
It will be used for all ATLAS data from October on (LCG Service Challenge 3).
ATLAS plans for using FTS
[Diagram: the Tier-0 and each Tier-1 run an FTS server and a VO box; Tier-2s hang off their Tier-1; an LFC is local within each 'cloud'; all SEs are SRM.]
Tier-0 FTS server:
- Channel from Tier-0 to all Tier-1s: used to move "Tier-0" data (raw and 1st-pass reconstruction data)
- Channel from Tier-1s to Tier-0/CAF: to move e.g. AOD (the CAF also acts as a "Tier-2" for analysis)

Tier-1 FTS server:
- Channel from all other Tier-1s to this Tier-1 (pulling data): used for DQ2 dataset subscriptions (e.g. reprocessing, or massive "organized" movement when doing Distributed Production)
- Channel to and from this Tier-1 and all its associated Tier-2s
- Association defined by ATLAS management (along with LCG)

"Star" channel for all remaining traffic [new: low-traffic]
(A schematic of this channel-selection policy is sketched below.)
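To make the routing policy above concrete, here is a minimal schematic in Python (my own sketch, not ATLAS middleware; the site names and helper function are hypothetical):

```python
# Schematic of the FTS-server/channel selection described above.
# tier: site -> 0, 1 or 2; t1_of: each Tier-2 -> its associated Tier-1.
def fts_server(src, dst, tier, t1_of):
    if tier[src] == 0 and tier[dst] == 1:
        return "Tier-0 FTS"        # raw + 1st-pass data, T0 -> T1s
    if tier[src] == 1 and tier[dst] == 0:
        return "Tier-0 FTS"        # e.g. AOD back to Tier-0/CAF
    if tier[src] == 1 and tier[dst] == 1:
        return f"{dst} FTS"        # destination Tier-1 pulls (DQ2 subscriptions)
    if tier[src] == 2 and t1_of.get(src) == dst:
        return f"{dst} FTS"        # Tier-2 -> its Tier-1
    if tier[dst] == 2 and t1_of.get(dst) == src:
        return f"{src} FTS"        # Tier-1 -> its Tier-2s
    return "star channel"          # all remaining, low-traffic transfers

tier = {"CERN": 0, "BNL": 1, "FZK": 1, "UMich": 2}   # illustrative sites
t1_of = {"UMich": "BNL"}
print(fts_server("CERN", "BNL", tier, t1_of))    # -> Tier-0 FTS
print(fts_server("FZK", "BNL", tier, t1_of))     # -> BNL FTS
print(fts_server("BNL", "UMich", tier, t1_of))   # -> BNL FTS
```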
ATLAS and Related Research
Up to now I have focused on the ATLAS computing model.
Implicit in this model, and central to its success, are:
- High-performance, ubiquitous and robust networks
- Grid middleware to securely find, prioritize and manage resources
Without either of these capabilities the model risks melting down or failing to deliver the required capabilities.
Efforts to date have (necessarily) focused on building the most basic capabilities and demonstrating they can work.
To be truly effective will require updating and extending this model to include the best results of ongoing networking and resource-management research projects.
A quick overview of some selected (US) projects follows...
The UltraLight Project

UltraLight is:
- A four-year $2M NSF ITR funded by MPS (2005-8)
- Application-driven network R&D
- A collaboration of BNL, Buffalo, Caltech, CERN, Florida, FIU, FNAL, Internet2, Michigan, MIT, SLAC, Vanderbilt
- Significant international participation: Brazil, Japan, Korea amongst many others

Goal: enable the network as a managed resource.
Meta-goal: enable physics analysis and discoveries which could not otherwise be achieved.
ATLAS and UltraLight Disk-to-Disk Research
- ATLAS MDT sub-systems need very fast calibration turn-around (< 24 hours)
- Initial estimates plan for as much as 0.5 TB/day of high-pT muon data for calibration
- UltraLight could enable us to quickly transport (~1/4 hour) the needed events to Tier-2 sites for calibration (see the arithmetic sketch below)
- Michigan is an ATLAS Muon Alignment and Calibration Center, a Tier-2 and an UltraLight site
- Muon calibration work has presented an opportunity to couple research efforts into production
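For scale, moving the full daily calibration sample in a quarter hour implies a multi-Gb/s flow; a one-line check (my arithmetic, not from the slide):

```python
# Bandwidth implied by moving ~0.5 TB of muon calibration data in ~15 minutes.
volume_bytes = 0.5e12
window_s = 15 * 60
print(f"~{volume_bytes * 8 / window_s / 1e9:.1f} Gb/s sustained")  # ~4.4 Gb/s
```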
Networking at KNU (Korea)

- Uses the 10 Gbps GLORIAD link from Korea to the US, called BIG-GLORIAD, also part of UltraLight
- Trying to saturate this BIG-GLORIAD link with servers and cluster storage connected at 10 Gbps
- Korea is planning to be a Tier-1 site for the LHC experiments

[Diagram: the BIG-GLORIAD link between Korea and the U.S.]
VINCI: Virtual Intelligent Networks for Computing Infrastructures

- A network Global Scheduler implemented as a set of collaborating agents running on distributed MonALISA services
- Each agent uses policy-based priority queues and negotiates for an end-to-end connection using a set of cost functions
- A lease mechanism is implemented for each offer an agent makes to its peers
- Periodic lease renewal is used for all agents; this results in a flexible response to task completion, as well as to application failure or network errors
- If network errors are detected, supervising agents cause all segments along a path to be released. An alternative path may then be set up rapidly enough to avoid a TCP timeout, allowing the transfer to continue uninterrupted. (A schematic of the lease pattern follows below.)
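The lease mechanism is what makes cleanup automatic; a minimal schematic (mine, not VINCI/MonALISA code):

```python
import time

# Minimal sketch of an offer lease: it stays valid only while the holder
# keeps renewing it, so a crashed application or failed segment frees its
# network resources without any explicit teardown message.
class Lease:
    def __init__(self, duration_s):
        self.duration_s = duration_s
        self.renew()

    def renew(self):
        self.expires_at = time.time() + self.duration_s

    def valid(self):
        return time.time() < self.expires_at

segment = Lease(duration_s=30.0)   # holder renews periodically while active
# If renewals stop (task finished, application failure, network error),
# valid() becomes False and the supervising agent releases the segment.
```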
Lambda Station

A network path forwarding service to interface production facilities with advanced research networks:
- Goal is selective forwarding on a per-flow basis
- Alternate network paths for high-impact data movement
- Dynamic path modification, with graceful cutover & fallback
- Current implementation is based on policy-based routing & DSCP marking (see the sketch below)

Lambda Station interacts with:
- Host applications & systems
- LAN infrastructure
- Site border infrastructure
- Advanced technology WANs
- Remote Lambda Stations

(D. Petravick, P. DeMar)
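For flavor, DSCP marking is something an end host can do per socket; a minimal sketch assuming a Linux host (illustrative only: Lambda Station itself drives the router configuration, and the codepoint and endpoint below are invented for the example):

```python
import socket

# Tag one flow with a DSCP codepoint so policy-based routing at the site
# border can steer it onto an alternate (research-network) path.
DSCP_EF = 46                 # "Expedited Forwarding"; real value is site policy
tos = DSCP_EF << 2           # DSCP sits in the upper 6 bits of the TOS byte

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos)
s.connect(("transfer.example.org", 2811))   # hypothetical transfer endpoint
```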
TeraPaths (LAN QoS Integration)
[Diagram: TeraPaths architecture. At each site, web services (user manager, scheduler, site monitor, router manager, ...) sit above hardware drivers; QoS requests arrive via a web page, APIs or the command line; the sites' web services coordinate with each other and with WAN monitoring and WAN web services across the WAN between Site A and Site B.]

The TeraPaths project investigates the integration and use of LAN QoS and MPLS/GMPLS-based differentiated network services in the ATLAS data-intensive distributed computing environment, in order to manage the network as a critical resource.

TeraPaths includes:
- BNL
- Michigan
- ESnet (OSCARS)
- FNAL (Lambda Station)
- SLAC (DWMI)
Integrating Research into Production
As you can see, there are many efforts, even just within the US, to help integrate a managed network into our infrastructure.
There are also many similar efforts in computing, storage, grid middleware and applications (EGEE, OSG, LCG, ...).
The challenge will be to harvest these efforts and integrate them into a robust system for LHC physicists.
I will close with an "example" vision of what could result from such integration...
An Example: UltraLight/ATLAS Application (2008)

Node1> fts -vvv -in mercury.ultralight.org:/data01/big/zmumu05687.root -out venus.ultralight.org:/mstore/events/data -prio 3 -deadline +2:50 -xsum
FTS: Initiating file transfer setup...
FTS: Remote host responds ready
FTS: Contacting path discovery service
PDS: Path discovery in progress...
PDS: Path RTT 128.4 ms, best effort path bottleneck is 10 GE
PDS: Path options found:
PDS:   Lightpath option exists end-to-end
PDS:   Virtual pipe option exists (partial)
PDS:   High-performance protocol capable end-systems exist
FTS: Requested 1.2 TB file transfer within 2 hours 50 minutes, priority 3
FTS: Remote host confirms available space for DN=[email protected]
FTS: End-host agent contacted...parameters transferred
EHA: Priority 3 request allowed for [email protected]
EHA: request scheduling details
EHA: Lightpath prior scheduling (higher/same priority) precludes use
EHA: Virtual pipe sizeable to 3 Gbps available for 1 hour starting in 52.4 minutes
EHA: request monitoring prediction along path
EHA: FAST-UL transfer expected to deliver 1.2 Gbps (+0.8/-0.4) averaged over next 2 hours 50 minutes
ATLAS FTS 2008 Example (cont.)

EHA: Virtual pipe (partial) expected to deliver 3 Gbps (+0/-0.3) during reservation; variance from unprotected section < 0.3 Gbps 95% CL
EHA: Recommendation: begin transfer using FAST-UL using network identifier #5A-3C1. Connection will migrate to MPLS/QoS tunnel in 52.3 minutes. Estimated completion in 1 hour 22.78 minutes.
FTS: Initiating transfer between mercury.ultralight.org and venus.ultralight.org using #5A-3C1
EHA: Transfer initiated...tracking at URL: fts://localhost/FTS/AE13FF132-FAFE39A-44-5A-3C1
EHA: Reservation placed for MPLS/QoS connection along partial path: 3 Gbps beginning in 52.2 minutes; duration 60 minutes
EHA: Reservation confirmed, rescode #9FA-39AF2E, note: unprotected network section included.
<...lots of status messages...>
FTS: Transfer proceeding, average 1.1 Gbps, 431.3 GB transferred
EHA: Connecting to reservation: tunnel complete, traffic marking initiated
EHA: Virtual pipe active: current rate 2.98 Gbps, estimated completion in 34.35 minutes
FTS: Transfer complete, signaling EHA on #5A-3C1
EHA: Transfer complete received...hold for xsum confirmation
FTS: Remote checksum processing initiated...
FTS: Checksum verified, closing connection
EHA: Connection #5A-3C1 completed...closing virtual pipe with 12.3 minutes remaining on reservation
EHA: Resources freed. Transfer details uploading to monitoring node
EHA: Request successfully completed, transferred 1.2 TB in 1 hour 41.3 minutes (transfer 1 hour 34.4 minutes)
Conclusions
ATLAS is quickly approaching "real" data, and our computing model has been successfully validated (as far as we have been able to take it).
Some major uncertainties exist, especially around "user analysis" and the resource implications it may have.
There are lots of R&D programs active in many areas of special importance to ATLAS (and the LHC) which could significantly strengthen the core model.
The challenge will be to select, integrate, prototype and test the R&D developments in time to have a meaningful impact upon the ATLAS (or LHC) program.

Questions?