Offline Week
HLT Plans
Thorsten Kollegger
CERN | 26.03.2013
HLT Plans | Offline Week | 26.03.2013 | Thorsten Kollegger 2
HLT Status
A few words on the HLT operational status…
More details in the HLT presentation during the
Workshop on Run 1 Conclusions and Run 2 Outlook:
https://indico.cern.ch/conferenceDisplay.py?confId=240668
Run 1 Conclusions
Operation
• Run participation: > 97%
• EOR percentage: ≲ 5%
- dominated by hardware issues, can we get better?
Performance
• Full online TPC tracking for maximum
TPC read-out rate demonstrated
• Efficiency comparable to offline tracking
• Limited by “online calibration”
Tracking Performance
HLT GPU tracking – speed comparison
• It’s fast…
Tracking Performance
HLT GPU tracking – physics performance
• Similar performance as offline…
• … if we have the same calibration (need online calibration)
LS1 – HLT Farm Availability
Two use cases
• HLT development
• Offline usage
[Diagram: AliEn, CAF On Demand, QA, CernVM/FS, Cloud Agent, Cloud Gateway, HLT, Public Cloud, Private CERN Cloud]
LS1 – HLT Farm Availability
From Technical Coordination
• No power starting week 22 for two months
• No cooling weeks 22-40
Implications
• Testing and development capabilities for HLT limited
• Offline usage of farm not feasible/worthwhile
Development of tools continues with a view to Run 2 downtimes
and general use
HLT Offline Plans LS1
?
HLT Offline Plans LS1
Group responsible for HLT offline integration left HLT project
• Can maintain current software,
but no development/restructuring
• Need to concentrate on core efforts,
e.g. HLT Event Display only on best-effort basis
• Have been in this mode since ≈ summer 2012
What can still be done in/with offline?
To be finished…
There are many HLT-related developments
waiting to be finished/back-ported to offline
• Fast (TPC) transformations (calibration, > factor 10 speed-up)
• HLT tracks as offline reconstruction TPC seeds
• HLT tracking performance evaluation
A more general remark:
Need to converge as-much-as-possible towards common code,
very good experience working with TPC group
Event Display
Event Display for nice pictures…
… not only, but also very useful
for fast feed-back for shift-crew,
e.g. spotting displaced vertex runs “by eye”
HLT can support this only on a
best-effort basis:
• Can we converge towards
a common event display?
Start of Run (SOR)
Strong push from TC/RC to reduce SOR time to 10 s
HLT engage times currently around 100 s
• 25 s – Pendolino prepare calibration/HCDB
• <5 s – Start of HLT processes
• 60 s – Initialization of processes (CDB access, map preparation)
• <5 s – Subscriptions (Configured->Running+Subscribed+....)
• <5 s – Network connections (Running... -> Connected)
Can AliROOT go from “off” to first event reco in 2 s?
(incl. building geometry, updating calibrations…)
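As a quick check of the numbers above, the engage-time budget can be added up in a few lines. This is only a sketch: the step names are shortened, each “<5 s” entry is counted at its upper bound of 5 s (an assumption), and the dictionary and function names are made up for illustration.

```python
# Engage-time budget from the slide, in seconds.
# Assumption: every "<5 s" entry is counted at its upper bound of 5 s.
ENGAGE_STEPS_S = {
    "pendolino_prepare_hcdb": 25,
    "start_processes": 5,
    "initialize_processes": 60,
    "subscriptions": 5,
    "network_connections": 5,
}

SOR_TARGET_S = 10  # SOR time requested by TC/RC


def total_engage_time(steps):
    """Sum of all step durations in seconds."""
    return sum(steps.values())


def share_of_total(steps):
    """Fraction of the total engage time spent in each step."""
    total = total_engage_time(steps)
    return {name: seconds / total for name, seconds in steps.items()}


if __name__ == "__main__":
    total = total_engage_time(ENGAGE_STEPS_S)
    print(f"total engage time: {total} s (target: {SOR_TARGET_S} s)")
    for name, share in share_of_total(ENGAGE_STEPS_S).items():
        print(f"{name:>24}: {share:.0%}")
```

With these inputs, process initialization accounts for 60% of the total, and the Pendolino step alone (25 s) is already longer than the 10 s target.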
LS1 – HLT Farm Replacement
Full replacement of computing farm beginning of 2014
Switch from H-RORC to C-RORC
• 12 DDLs/board vs 2 DDLs/board
reduced number of FEP nodes
• Test-Setup in CR2 working
Tracking/Compute node hardware to be decided
• move tracker from CUDA to OpenCL to gain
vendor independence
Run 2 Outlook
All detectors available in HLT
• Use requires development (not HLT core)
• If we want to do this, major software development effort by detectors needed
HLT “core” focused on improvement of online software
• Data transport and process control/monitoring framework
• HLT specific detector software
Run 2 Outlook
Full TPC reconstruction available
• TPC cluster finding for data reduction
• TPC seeding
• Online calibration of TPC, replacement of CPass0?
• Additional data reduction?
Towards online calibration
• porting of DAs to HLT started, massive changes required (TPC, MUON)
• “CPass0” feasibility study with TPC
Need to know NOW computing requirements after LS1
Online Calibration
There are two running modes in the HLT
• Guaranteed analysis of every event
-> main chain
• “best-effort” event delivery
-> monitoring chain
The choice of chain defines requirements on your code!
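A toy model may help to see why the chain matters for the code. The sketch below is not the HLT framework: it assumes one event arrives per tick and that a single worker needs a fixed number of ticks per event. In "guaranteed" mode, events that cannot be taken immediately pile up in a backlog that still has to be processed; in "best-effort" mode they are simply not delivered. The function name and the tick model are hypothetical.

```python
def simulate(n_events, cost_ticks, guaranteed):
    """Toy model of one worker fed with one event per tick.

    cost_ticks: ticks the worker needs per event (assumed constant).
    guaranteed: True  -> events wait in a backlog (main-chain-like),
                False -> events are skipped while busy (monitoring-like).
    Returns (started, skipped, backlog) after n_events ticks.
    """
    started = 0     # events the worker has begun processing
    skipped = 0     # events never delivered (best-effort only)
    backlog = 0     # events still waiting (guaranteed only)
    busy_until = 0  # tick at which the worker is free again

    for tick in range(n_events):
        if tick >= busy_until:
            # Worker is free: it takes one event. In guaranteed mode this
            # is the oldest waiting event and the new arrival takes its
            # place in the queue, so the backlog size does not change.
            started += 1
            busy_until = tick + cost_ticks
        elif guaranteed:
            backlog += 1
        else:
            skipped += 1
    return started, skipped, backlog


if __name__ == "__main__":
    print("guaranteed :", simulate(10, 3, guaranteed=True))
    print("best effort:", simulate(10, 3, guaranteed=False))
```

In this model, code that is slower than the event rate builds an ever-growing backlog in the guaranteed case, while in the best-effort case it only sees a fraction of the events.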
Online Calibration Requirements
Stability
• Can your code run on >100M events? (memory consumption?)
• How well can you handle detector errors?
In offline you are to a large extent shielded
already by online (DAQ/HLT)
• How stable is your code?
Online code changes require verification,
only possible without beam
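On the memory question, one common pattern is to keep only running summaries instead of per-event data, so that memory use does not grow with the number of events. The sketch below uses Welford's online algorithm for mean and variance as a generic illustration; it is not taken from any HLT or AliROOT code, and the class name is hypothetical.

```python
class RunningStats:
    """Mean and variance of a stream in constant memory (Welford's algorithm)."""

    def __init__(self):
        self.n = 0       # number of values seen so far
        self.mean = 0.0  # running mean
        self.m2 = 0.0    # running sum of squared deviations from the mean

    def add(self, x):
        """Update the summaries with one new value; nothing else is stored."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance (0.0 until at least two values were added)."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


if __name__ == "__main__":
    stats = RunningStats()
    for value in (1.0, 2.0, 3.0, 4.0):
        stats.add(value)
    print(stats.n, stats.mean, stats.variance)
```

The object holds three numbers regardless of whether it has seen four values or more than 100M.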
Online Calibration Requirements
Processing
• What are the computing resources you need to process 1000 Hz of minimum bias Pb+Pb collisions?
• Does your code allow for parallelization of event processing? How do you handle merging of process output?
• Input data format: ROOT adds huge overhead…
An example why this matters: Your code takes 1 second to process 1 event
• No parallel code → 1 Hz
• Parallel code → 1000 Hz
However: 1000 cores → 100 additional nodes → >500k€
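The arithmetic behind this example can be written out as a short sketch. The slide only gives the totals (1000 cores, 100 nodes, >500k€); the 10 cores per node and 5 k€ per node used below are the values implied by those totals, not numbers stated in the talk, and the function names are hypothetical.

```python
import math


def cores_needed(event_rate_hz, seconds_per_event):
    """Cores required to keep up, assuming one event per core at a time."""
    return math.ceil(event_rate_hz * seconds_per_event)


def nodes_needed(cores, cores_per_node):
    """Whole nodes required to provide the given number of cores."""
    return math.ceil(cores / cores_per_node)


if __name__ == "__main__":
    EVENT_RATE_HZ = 1000       # minimum bias Pb+Pb rate from the slide
    SECONDS_PER_EVENT = 1.0    # processing time per event from the slide
    CORES_PER_NODE = 10        # assumption: implied by 1000 cores on 100 nodes
    COST_PER_NODE_EUR = 5000   # assumption: lower bound implied by >500k€

    cores = cores_needed(EVENT_RATE_HZ, SECONDS_PER_EVENT)
    nodes = nodes_needed(cores, CORES_PER_NODE)
    print(f"{cores} cores, {nodes} nodes, "
          f"at least {nodes * COST_PER_NODE_EUR} EUR")
```

The same functions show how quickly the requirement shrinks with faster code: at 0.1 s per event, the same rate needs 100 cores, i.e. 10 nodes under the same assumptions.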
Online Calibration Requirements
Need to know NOW computing requirements after LS1
This defines the infrastructure, there are no fast changes possible
Reminder:
• All detectors available in HLT
Some limited computing power available for local processing
• Resources available for TPC tracking (and limited calibration)
Everything beyond requires additional resources…
Run 3 Outlook
Developments towards
the upgrade…
O2
Online-Offline Facility
[Diagram: AliEn, HLT, DAQ, Reconstruction, Calibration, Re-reconstruction, Online Raw Data Store]
Run 3 Outlook
… to make this a reality
Backup
Tracking Performance
HLT uses GPUs for TPC tracking
• Unique at LHC, other experiments following now with R&D
• Factor 4 in total tracking time: factor 3 fewer nodes in system