A Level-2 trigger algorithm for the identification of muons in the ATLAS Muon Spectrometer
Alessandro Di Mattia
on behalf of the ATLAS TDAQ group
Computing in High Energy Physics, Interlaken, September 26-30, 2004
Outline:
• The ATLAS trigger
• μFast algorithm
• Relevant physics performance
• Implementation in the Online framework
• Latency of the algorithm
• Conclusions
LHC: proton-proton collisions at E_CM = 14 TeV, starting 2007
L = 10³⁴ cm⁻² s⁻¹: 23 collisions per bunch crossing at 25 ns intervals
1 year at L = 10³⁴ cm⁻² s⁻¹: ∫L dt ≈ 100 fb⁻¹
The LHC challenge to ATLAS Trigger/DAQ
Interaction rate ~ 10⁹ Hz; offline computing can handle O(10² Hz).
Cross sections of physics processes vary over many orders of magnitude:
• Inelastic: 10⁹ Hz
• W → lν: 10² Hz
• tt̄ production: 10 Hz
• Higgs (100 GeV): 0.1 Hz
• Higgs (600 GeV): 10⁻² Hz
ATLAS has O(10⁸) read-out channels → average event size ~ 1.5 MByte
The ATLAS Trigger
Level-1 (hardware trigger): output rate 75 kHz, target processing time 2.5 μs
High Level Triggers (HLT, software trigger):
Level-2: output rate ~ 2 kHz, target processing time ~ 10 ms
Event Filter: output rate ~ 200 Hz, target processing time ~ 2 s
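The rate reduction asked of each trigger level follows directly from the numbers quoted on these slides. The short sketch below only redoes that arithmetic; the variable and function names are illustrative, and the Level-1 factor is taken relative to the quoted interaction rate.

```python
# Rates and event size as quoted on the slides; names are illustrative.
INTERACTION_RATE_HZ = 1e9
EVENT_SIZE_MBYTE = 1.5
LEVELS = [("Level-1", 75e3), ("Level-2", 2e3), ("Event Filter", 200.0)]


def rejection_factors(input_rate_hz, levels):
    """Return (name, output rate, reduction factor w.r.t. the previous stage)."""
    result = []
    rate_in = input_rate_hz
    for name, rate_out in levels:
        result.append((name, rate_out, rate_in / rate_out))
        rate_in = rate_out
    return result


factors = rejection_factors(INTERACTION_RATE_HZ, LEVELS)
# Bandwidth to storage = final accepted rate times the average event size.
storage_mbyte_per_s = LEVELS[-1][1] * EVENT_SIZE_MBYTE

for name, rate, factor in factors:
    print(f"{name}: output {rate:g} Hz, reduction x{factor:.1f}")
print(f"Storage bandwidth: {storage_mbyte_per_s:g} MByte/s")
```

With these inputs Level-2 has to reduce its input rate by a factor of 37.5 and the Event Filter by a further factor of 10.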
Standalone muon reconstruction at Level-2
Task of the Level-2 muon trigger:
• Confirm the Level-1 trigger with a more precise pt estimate within a “Region of Interest” (RoI).
• Contribute to the global Level-2 decision.
To perform the muon reconstruction, RoI data are gathered together and processed in three steps:
1) “Global Pattern Recognition” involving trigger chambers and positions of MDT tubes (no use of drift time);
2) “Track fit” involving drift time measurements, performed for each MDT chamber;
3) Fast “pt estimate” via a Look-Up Table (LUT), with no use of time-consuming fit methods.
Result: η, φ, direction of flight into the spectrometer, and pt at the interaction vertex.
[Figure: approximated muon trajectory after L1 emulation]
One hit from each trigger station is required to start the pattern recognition on MDT data.
Global pattern recognition: seeded by the trigger chamber data
Use the L1 simulation code to select the RPC trigger pattern.
Valid coincidence in the Low-Pt CMA
Muon Roads and “contiguity algorithm”
Define “μ-roads” around this trajectory in each chamber;
Collect hit tubes within the roads using the residual of each hit tube.
Apply a contiguity algorithm to further remove background hits inside the roads.
Efficiency (muon hits) = 96%; background hits ~ 3%
[Figures: Low pt (~ 6 GeV); High pt (~ 20 GeV)]
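The slides do not spell out the road selection or the contiguity criterion, so the following is only a minimal sketch under stated assumptions: the road is a straight line with a fixed half-width, the residual is the distance of the tube wire from the road centre, and "contiguity" is taken to mean keeping the largest group of hits in neighbouring tubes. The `TubeHit` fields, the function names, and the numbers in the example are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TubeHit:
    layer: int    # tube layer inside the chamber
    tube: int     # tube number along the layer
    z_mm: float   # wire position along the chamber
    r_mm: float   # wire distance from the beam axis


def road_centre(r_mm, slope, intercept_mm):
    """Position of the straight-line road centre at radius r (assumed road model)."""
    return slope * r_mm + intercept_mm


def hits_in_road(hits, slope, intercept_mm, half_width_mm):
    """Keep the hits whose residual to the road centre is within the half-width."""
    return [h for h in hits
            if abs(h.z_mm - road_centre(h.r_mm, slope, intercept_mm)) <= half_width_mm]


def largest_contiguous_cluster(hits, max_gap=1):
    """Assumed contiguity rule: keep the largest chain of hits in neighbouring tubes."""
    best, current = [], []
    for hit in sorted(hits, key=lambda h: h.tube):
        if current and hit.tube - current[-1].tube > max_gap:
            current = []          # gap too large: start a new cluster
        current.append(hit)
        if len(current) > len(best):
            best = list(current)
    return best


# Three muon-like hits in neighbouring tubes, one isolated hit inside the road,
# and one hit far outside the road (illustrative values only).
hits = [
    TubeHit(0, 10, 300.0, 5000.0),
    TubeHit(1, 11, 330.0, 5026.0),
    TubeHit(2, 10, 300.0, 5052.0),
    TubeHit(1, 14, 420.0, 5026.0),
    TubeHit(0, 30, 900.0, 5000.0),
]
in_road = hits_in_road(hits, slope=0.0, intercept_mm=315.0, half_width_mm=120.0)
cluster = largest_contiguous_cluster(in_road)
print(len(in_road), [h.tube for h in cluster])
```

The road cut removes the distant hit, and the contiguity step then drops the isolated hit that survived inside the road.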
Track Fit
Use drift time measurements to fit the best straight line crossing all points.
Compute the track bending using the sagitta method: three points required.
For a given chamber the sagitta is:
s ~ 150 μm for muon pt = 20 GeV
s ~ 500 μm for muon pt = 6 GeV
These are small effects with respect to s_m.
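As an illustration of the three-point sagitta method, the sketch below computes the signed distance of the middle point from the chord joining the inner and outer points. The coordinate convention (z, r) and the function name are assumptions made for the example, not taken from the slides.

```python
import math


def sagitta(inner, middle, outer):
    """Signed distance of `middle` from the straight line joining `inner` and `outer`.

    Each point is a (z, r) pair in the bending plane; the result has the same
    length unit as the inputs.
    """
    (z1, r1), (z2, r2), (z3, r3) = inner, middle, outer
    chord = math.hypot(z3 - z1, r3 - r1)
    if chord == 0.0:
        raise ValueError("inner and outer points coincide")
    # The cross product of (outer - inner) and (middle - inner) gives twice the
    # triangle area; dividing by the chord length yields the triangle height.
    cross = (z3 - z1) * (r2 - r1) - (r3 - r1) * (z2 - z1)
    return cross / chord


# Illustrative points (mm): the middle point is displaced by 3 mm from the chord.
s = sagitta((0.0, 5000.0), (3.0, 7500.0), (0.0, 10000.0))
print(abs(s))
```

A straight (infinite-momentum) track gives zero sagitta, and the sign distinguishes the two bending directions.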
pt estimate
Use the linear relation between 1/s and pt to estimate pt.
Prepare Look-Up Tables (LUT) as a set of relations between values of s and pt for different (η, φ) regions (s = f(η, φ, pt)).
30 x 60 (η, φ) tables for each detector octant.
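A look-up-table estimate along these lines could be organised as below. This is a sketch only: the linear form pt = slope/|s| + offset is one way to express "linear relation between 1/s and pt", and the class name, the (η, φ) ranges, and the coefficient values are hypothetical placeholders rather than the calibrated tables.

```python
N_ETA_BINS, N_PHI_BINS = 30, 60   # table granularity per octant, as quoted on the slide


class PtLookupTable:
    """One (slope, offset) pair per (eta, phi) bin; pt = slope / |s| + offset."""

    def __init__(self, eta_range, phi_range, default=(0.0, 0.0)):
        self.eta_range = eta_range
        self.phi_range = phi_range
        # Allocated once at construction; lookups never allocate.
        self.coeff = [[default] * N_PHI_BINS for _ in range(N_ETA_BINS)]

    @staticmethod
    def _bin(value, bounds, n_bins):
        low, high = bounds
        if not low <= value < high:
            raise ValueError(f"{value} outside table range [{low}, {high})")
        return int((value - low) / (high - low) * n_bins)

    def _indices(self, eta, phi):
        return (self._bin(eta, self.eta_range, N_ETA_BINS),
                self._bin(phi, self.phi_range, N_PHI_BINS))

    def set_coefficients(self, eta, phi, slope, offset):
        i, j = self._indices(eta, phi)
        self.coeff[i][j] = (slope, offset)

    def estimate_pt(self, eta, phi, sagitta):
        i, j = self._indices(eta, phi)
        slope, offset = self.coeff[i][j]
        return slope / abs(sagitta) + offset


# Hypothetical coefficients: slope in GeV*cm, sagitta in cm.
lut = PtLookupTable(eta_range=(-1.0, 1.0), phi_range=(0.0, 0.8), default=(100.0, 0.0))
print(lut.estimate_pt(eta=0.1, phi=0.1, sagitta=5.0))
```

The estimate is a table lookup plus one division, which is what makes it cheap compared with an iterative track fit.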
Performance including background simulation for the high-luminosity environment
Resolution comparable with that of the ATLAS reconstruction program (within a factor of about 2).
Track finding efficiency of about 97% for muons.
Trigger rates (barrel)

Low pt (6 GeV)
Source        L1 rate (kHz)   L2 rate (kHz)
K/π decays    7.9             3.1
b decays      1.7             1.0
c decays      1.0             0.5
Fake L1       1.0             negligible
Total         10.6            4.6

High pt (20 GeV)
Source        L1 rate (kHz)   L2 rate (kHz)
K/π decays    1.1             0.06
b decays      0.8             0.09
c decays      0.4             0.04
W decays      0.06            0.05
Fake L1       negligible      negligible
Total         2.4             0.24
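The Level-2 rate reduction implied by the quoted totals can be checked with a few lines; the dictionary layout and names are illustrative, and only the total rates from the tables are used.

```python
# Total barrel rates in kHz as quoted: (L1 rate, L2 rate).
TOTAL_RATES_KHZ = {
    "low pt (6 GeV)": (10.6, 4.6),
    "high pt (20 GeV)": (2.4, 0.24),
}


def reduction_factor(l1_rate, l2_rate):
    """Factor by which Level-2 reduces the Level-1 accepted rate."""
    return l1_rate / l2_rate


reductions = {name: reduction_factor(*rates) for name, rates in TOTAL_RATES_KHZ.items()}
for name, factor in reductions.items():
    print(f"{name}: L1/L2 = {factor:.1f}")
```

The reduction is about 2.3 at the low-pt threshold and about 10 at the high-pt threshold.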
HLT Event Selection Software
[Package diagram of the HLT Selection Software (HLTSSW). Labels: Steering, Monitoring Service, MetaData Service, ROB DataCollector, DataManager, HLT Algorithms, Event Data Model, Processing Task, L2PU Application, Event Filter; imported offline packages: Event Data Model, Reconstruction Algorithms, StoreGate, Athena/Gaudi. Groupings: HLT Data Flow Software, HLT Core Software, HLT Algorithms, Offline Core Software, Offline Reconstruction. Legend: interface, dependency, package.]
HLT Selection Software:
• Framework: ATHENA/GAUDI
• Reuse of offline components
• Common to Level-2 and EF
• Offline algorithms used in EF
Bytestream model
Standardization of data access forces the data to be modelled according to detector regions
… but … the bytestream should be optimized for fast access to the detector data.
RPC bytestream: the detector regions cannot easily be mapped onto the readout structure, because the latter is geared towards the trigger needs. An ad hoc solution is used:
PAD -> Coincidence Matrix -> Fired CMA channel
Data are strictly limited to what is needed: no overhead is introduced in the data decoding.
MDT bytestream: readout structure mapped onto the MDT chambers.
CSM -> AMT hit (AMT data word)
Data access according to chambers is not efficient: optimization needed.
[Diagrams: RPC container: PAD -> CM (up to 8) -> fired channels; MDT container: CSM (= MDT chamber) -> MdtAmtHit]
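The two container layouts can be pictured with a small data-structure sketch. The class and field names below are hypothetical stand-ins, not the actual RDO classes; only the nesting (PAD -> Coincidence Matrix -> fired channel, CSM -> AMT hit) and the limit of 8 matrices per PAD come from the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List

MAX_CM_PER_PAD = 8  # "up to 8" coincidence matrices per PAD


@dataclass
class CoincidenceMatrix:
    cm_id: int
    fired_channels: List[int] = field(default_factory=list)


@dataclass
class Pad:
    pad_id: int
    matrices: List[CoincidenceMatrix] = field(default_factory=list)

    def add_matrix(self, matrix):
        if len(self.matrices) >= MAX_CM_PER_PAD:
            raise ValueError("a PAD holds at most 8 coincidence matrices")
        self.matrices.append(matrix)


@dataclass
class Csm:
    chamber_id: int                                     # one CSM reads one MDT chamber
    amt_hits: List[int] = field(default_factory=list)   # raw AMT data words, unordered


# Containers keyed by identifier, so one lookup returns all data of a PAD or chamber.
rpc_container: Dict[int, Pad] = {}
mdt_container: Dict[int, Csm] = {}

pad = Pad(pad_id=3)
pad.add_matrix(CoincidenceMatrix(cm_id=0, fired_channels=[4, 5, 17]))
rpc_container[pad.pad_id] = pad
mdt_container[42] = Csm(chamber_id=42, amt_hits=[0x1A2B, 0x1A2C])
print(len(rpc_container[3].matrices), len(mdt_container[42].amt_hits))
```

Keying the RPC data by PAD matches the trigger readout, while keying the MDT data by CSM matches the chamber, which is why the two subdetectors end up with different container granularity.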
Standard MDT data access scheme: use LVL1 Muon RoI info
[Figure: LVL1 RoI and the MDT chambers accessed; 7 MDT chambers to be accessed]
This tail is critical for the MDT converter timing
Optimized MDT data access scheme: use Muon Roads
[Figure: approximated muon trajectory after L1 emulation, with road widths of < 50 cm, ~ 5 cm, and < 40 cm, and the MDT chambers accessed]
3 MDT chambers to be accessed; up to 6 in case the roads overlap two chambers.
Only three MDT chambers are accessed in most cases.
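The gain from road-based access can be sketched as an interval-overlap selection: a chamber is read out only if the road in its station crosses it. The geometry below is invented for illustration (chamber names, z extents, and road centres are hypothetical); only the road widths are of the order quoted on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chamber:
    name: str
    station: str      # "inner", "middle" or "outer"
    z_min_cm: float
    z_max_cm: float


def chambers_for_roads(chambers, roads):
    """Select chambers overlapped by the road of their station.

    `roads` maps a station name to (road centre, road width), both in cm.
    """
    selected = []
    for chamber in chambers:
        if chamber.station not in roads:
            continue
        centre, width = roads[chamber.station]
        low, high = centre - width / 2.0, centre + width / 2.0
        if chamber.z_max_cm > low and chamber.z_min_cm < high:
            selected.append(chamber)
    return selected


# Seven chambers inside a hypothetical RoI.
roi_chambers = [
    Chamber("I1", "inner", 0.0, 100.0), Chamber("I2", "inner", 100.0, 200.0),
    Chamber("M1", "middle", 0.0, 150.0), Chamber("M2", "middle", 150.0, 300.0),
    Chamber("M3", "middle", 300.0, 450.0),
    Chamber("O1", "outer", 0.0, 250.0), Chamber("O2", "outer", 250.0, 500.0),
]
roads = {"inner": (50.0, 40.0), "middle": (200.0, 5.0), "outer": (300.0, 50.0)}
accessed = chambers_for_roads(roi_chambers, roads)
print([c.name for c in accessed])
```

In this example three of the seven RoI chambers are accessed; a road centred on a chamber boundary would select both neighbours, which is the overlap case mentioned above.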
Further optimization
A considerable fraction of the data access time is taken by the “data preparation”.
Data preparation is for:
– associating space points to detector hits;
– resolving ambiguities in some special detector regions (RPC data only);
– providing refined info to the reconstruction: t0 subtraction (to MDT drift time), calibration of the space-time relationship of MDT tubes.
To optimize this process, the data preparation is performed inside the algorithm using a standalone detector description that provides:
1) description of the readout;
2) description of the detector geometry;
3) offline versus online map.
Advantages:
• prepare only the data needed for reconstruction;
• use code optimized for speed:
– detector geometry organized according to readout hierarchy;
– minimal use of STL, no memory allocation on demand;
• minimize the dependencies on the offline code, easing the integration.
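The MDT part of the data preparation (t0 subtraction followed by a space-time relation) can be illustrated as follows. All constants here are assumptions chosen for the example: the TDC bin width, the maximum drift time, the tube radius, and above all the linear r(t) relation, which stands in for the real calibrated relation.

```python
# Assumed constants, for illustration only.
NS_PER_TDC_COUNT = 0.78125   # assumed TDC bin width in ns
MAX_DRIFT_TIME_NS = 700.0    # assumed maximum drift time
TUBE_RADIUS_MM = 14.6        # assumed inner tube radius


def drift_time_ns(tdc_counts, t0_ns):
    """Convert a raw TDC value to a drift time by subtracting the tube t0."""
    return tdc_counts * NS_PER_TDC_COUNT - t0_ns


def drift_radius_mm(time_ns):
    """Placeholder space-time relation: linear in time, clamped to the tube."""
    clamped = min(max(time_ns, 0.0), MAX_DRIFT_TIME_NS)
    return clamped / MAX_DRIFT_TIME_NS * TUBE_RADIUS_MM


t_drift = drift_time_ns(tdc_counts=640, t0_ns=150.0)
radius = drift_radius_mm(t_drift)
print(t_drift, radius)
```

Doing this only for the tubes selected by the roads is what keeps the preparation cost proportional to the data actually used.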
μFast sequence diagram
[Diagram: μFast execution inside the framework infrastructure, organized in sequences. Steps shown: RoI reconstruction (PAD Id) -> RPC data access (IDC for RPC: PAD -> CM, up to 8 -> fired channels) -> Level-1 emulation -> RPC pattern recognition (trigger pattern, muon roads) -> MDT data access (IDC for MDT: CSM -> AMT hits; prepared digits) -> MDT pattern recognition -> Feature extraction (muon features) -> Monitoring (filling histograms for monitoring)]
μFast and total latency time
• Optimized code run on a Xeon @ 2.4 GHz.
– Signal: single muon, pt = 100 GeV
– Cavern background: high luminosity x 2
• The total latency plot shows timings made on the same event sample before (first implementation) and after optimizing the MDT data access. Optimized version:
– total data access time ~ 800 μs;
– data access takes the same CPU time as μFast.
• μFast takes ~ 10% of the Level-2 latency.
• Cavern background does not increase the processing time.
[Plots: μFast latency; total latency, first implementation and optimized version]
Conclusions
• μFast is suitable for performing the muon trigger selection in ATLAS L2. Barrel results:
– μFast reconstructs muon tracks in the Muon Spectrometer and measures pt at the interaction vertex with a resolution of 5.5% at 6 GeV and 4% at 20 GeV;
– μFast reduces the LVL1 trigger rate from 10.6 kHz to 4.6 kHz (6 GeV), and from 2.4 kHz to 0.24 kHz (20 GeV).
• The algorithm is fully implemented in the Online framework.
• Algorithm and data access times match the L2 trigger latency: the code is now ready for a further optimization phase, devoted more to standardizing the software components.
Backup transparencies
Requirements for implementation
• L2 latency time set to 10 ms;
• Thread safety;
• Data access in a restricted geometrical region (RoI seeding);
• Hide aspects of data access behind offline StoreGate interfaces;
• Use the RDO (Raw Data Object) as the atomic data component:
– translate the bytestream raw data into RDOs;
– conversion mechanism integrated into the data access.
• Standardize the data access for every subdetector:
– general region lookup to implement the RoI mechanism;
– common interfaces for detector-specific code, e.g. RDO converters;
– force a common structure for the RDOs, as far as possible: fit it into detector modules.
• ROB (ReadOut Buffer) access and data preparation/conversion on demand.
[Slide labels: Software design; Trigger architecture]
RPC bytestream
The Level-1 RoI is the intersection of a CMA processing the RPC eta view with a CMA processing the RPC phi view inside one PAD.
The RPC bytestream reflects the organization of the trigger logic:
– ROD -> Rx -> PAD -> Coincidence Matrix (CMA) -> CMA channel;
– 1 ROD = 2 Sector Logics = 2 Rx; the RPC detectors are read out by 64 Sector Logics;
– up to 7 PADs in an Rx; up to 8 CMAs in a PAD (4 per view);
– CMA channels = 32 or 64, depending on the CMA side (pivot/confirm).
One CMA forms coincidences between RPC planes in a 3-dimensional area.
[Figure: CMA coverage of the confirm plane (high pt), the pivot plane, and the confirm plane (low pt). Only odd-numbered CMAs are shown; CMAs overlap in the confirm planes, but not in the pivot plane.]
There is no way to fit the RPC bytestream into RPC detector modules!
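The size of the RPC readout tree follows from the multiplicities listed for the hierarchy. The sketch below only multiplies those numbers out; since the slide gives "up to" limits, the PAD and CMA counts are upper bounds, and the constant names are illustrative.

```python
# Multiplicities quoted for the RPC readout hierarchy.
SECTOR_LOGICS = 64
SECTOR_LOGICS_PER_ROD = 2   # 1 ROD = 2 Sector Logics = 2 Rx
MAX_PADS_PER_RX = 7         # one Rx per Sector Logic
MAX_CMAS_PER_PAD = 8        # 4 per view (eta, phi)

n_rods = SECTOR_LOGICS // SECTOR_LOGICS_PER_ROD
n_rx = SECTOR_LOGICS
max_pads = n_rx * MAX_PADS_PER_RX
max_cmas = max_pads * MAX_CMAS_PER_PAD
max_cmas_per_view = max_cmas // 2

print(f"RODs: {n_rods}, Rx: {n_rx}, PADs <= {max_pads}, CMAs <= {max_cmas}")
```

So an RoI request, which maps onto a single PAD, touches at most 8 of the up to 3584 coincidence matrices.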
RPC RDO definition
• Different types are needed:
– “bare” RDOs, the persistent representation of the bytestream; they contain raw data from the Level-1 and are used by μFast to run the Level-1 emulation on one RoI;
– “prepared” RDOs (or RIOs, Reconstruction Input Objects), obtained from the RDOs with some manipulation of the data to resolve the overlap regions and to associate space positions with the hits. Used by the offline reconstruction.
BARE: a convenient way of organizing RDOs in the IDC is according to PAD. Data requests are simplified thanks to the close correspondence between PAD and RoI.
PAD -> Coincidence Matrix -> Fired CMA channel
Data are strictly limited to what is needed: no overhead is introduced in the data decoding.
PREPARED: stored in StoreGate in a hierarchical structure as defined by offline identifiers, up to the RPC chamber modules.
[Diagram: PAD -> CM (up to 8) -> fired channels]
MDT bytestream
MDT bytestream organization: ROD -> Chamber System Module (CSM) -> TDC -> TDC channel
– 1 ROD = 1 trigger tower (φ x η x r = 1 x 2 x 3);
– 1 CSM reads 1 MDT chamber; one CSM can have up to 18 TDCs;
– 1 AMT (ATLAS Muon TDC) can have up to 24 channels (= “tubes”).
Fitting the MDT bytestream into MDT detector modules is trivial.
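Because the MDT readout tree maps one-to-one onto chambers, a tube can be addressed by a (TDC, channel) pair within its CSM. The flat-index scheme below is a hypothetical illustration of such an address, not the actual cabling map; only the limits of 18 TDCs and 24 channels come from the slide.

```python
MAX_TDC_PER_CSM = 18
CHANNELS_PER_TDC = 24
MAX_CHANNELS_PER_CSM = MAX_TDC_PER_CSM * CHANNELS_PER_TDC


def flat_channel(tdc, channel):
    """Map (TDC, channel) inside one CSM to a single index (assumed scheme)."""
    if not 0 <= tdc < MAX_TDC_PER_CSM:
        raise ValueError(f"TDC id {tdc} out of range")
    if not 0 <= channel < CHANNELS_PER_TDC:
        raise ValueError(f"channel {channel} out of range")
    return tdc * CHANNELS_PER_TDC + channel


def split_channel(index):
    """Inverse of flat_channel: recover (TDC, channel) from the flat index."""
    if not 0 <= index < MAX_CHANNELS_PER_CSM:
        raise ValueError(f"index {index} out of range")
    return divmod(index, CHANNELS_PER_TDC)


print(MAX_CHANNELS_PER_CSM, flat_channel(17, 23), split_channel(50))
```

With these limits a single CSM serves at most 432 tubes.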
MDT RDO definition
• Different types are needed:
– “bare” RDOs, the persistent representation of the bytestream; they contain MDT raw data and are used by μFast to confirm the Level-1 RoI;
– “prepared” RDOs, which contain refined info (drift time, calibrated time, radius, error).
BARE: a convenient way of organizing RDOs in the IDC is according to CSM, because a CSM can be closely matched both to a detector element and to the trigger tower read-out. No ordering is foreseen for AMT data words.
CSM -> AMT hit (AMT data word)
Data access according to chambers is not efficient: optimization needed.
PREPARED: stored in StoreGate with the same structure as the RDOs, but containing a list of offline MDT digits.
[Diagram: CSM (= MDT chamber) -> MdtAmtHit]
Optimization of MDT data access
• The standard implementation of MDT data access was not efficient:
– ~ 7 chambers required per RoI;
– … but typically only 3 chambers have muon hits;
– direct impact on the timing performance because:
  – MDT occupancy is dominated by cavern background;
  – MDT converter time scales linearly with the chamber occupancy.
• A more efficient access scheme has been implemented using:
– Muon Roads: a refinement of the RoI region, available after the L1 emulation. The widths of the Muon Roads are smaller than the chamber size.
– An optimized way of accessing the detector elements: it selects detector elements according to the station (Innermost, Middle, Outermost), the sector, and the track path.
Bytestream dataflow
[Diagram labels: Simulation, ROD emulation, RoI B. emulation, RoI, Bytestream, RDO Converter, RDO, RIO Converter, RIO, μFast, EF and offline rec.]
μFast uses a dedicated detector description code to reconstruct RDOs:
– standalone implementation to ease the integration in HLTSSW;
– detector geometry organized according to readout hierarchy;
– minimal use of STL containers.
[Diagram labels: L2 Detector Description (Readout Cabling, Detector Geometry, Online vs Offline map), Offline Detector Description; arrows marked “use”]
μFast implementation
• The processing tasks are implemented by “process” classes:
– they act on “C style” data structures; no use of the offline EDM;
– process versioning is implemented through inheritance.
• The “sequence” classes manage the execution of processes and publish the data structure to the processes and to the other sequences:
– they provide interfaces to framework components: MessageSvc, TimerSvc, etc.
• Minimal use of STL containers. No memory allocation on demand.
[Class diagram:
ProcessBase: pure virtual implementation
ProcessStd: concrete implementation of the data structure I/O and printouts; attributes name: string, type: integer, data&: struct<TYPE>; methods run(), printout()
ProcessTYP: concrete implementation of the task type
Sequence: attributes name: string, type: integer, data: struct<TYPE>; methods giveData(), start(); a Sequence runs ProcessStd objects]
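The process/sequence pattern described above can be sketched compactly. The real code is not shown on the slides, so everything below is an illustration: the class names follow the diagram labels, while the two concrete processes (`RoadFinder`, `PtEstimator`), the `add` helper, and the dictionary used as shared data structure are hypothetical.

```python
from abc import ABC, abstractmethod


class ProcessBase(ABC):
    """Pure interface: every process can run and print itself."""

    @abstractmethod
    def run(self):
        ...

    @abstractmethod
    def printout(self):
        ...


class ProcessStd(ProcessBase):
    """Common part: holds a reference to the sequence data and handles printouts."""

    def __init__(self, name, type_id, data):
        self.name, self.type_id = name, type_id
        self.data = data  # shared with the owning sequence, not copied

    def printout(self):
        return f"{self.name} (type {self.type_id}): {self.data}"


class RoadFinder(ProcessStd):      # hypothetical task-type process
    def run(self):
        self.data["roads"] = len(self.data["trigger_hits"])
        return True


class PtEstimator(ProcessStd):     # hypothetical task-type process
    def run(self):
        if "roads" not in self.data:
            return False           # an earlier process did not produce its output
        self.data["pt"] = 20.0     # placeholder value
        return True


class Sequence:
    """Owns the data structure and runs its processes in order."""

    def __init__(self, name, type_id, data):
        self.name, self.type_id, self.data = name, type_id, data
        self.processes = []

    def give_data(self):
        return self.data

    def add(self, process_class, name):
        self.processes.append(process_class(name, self.type_id, self.give_data()))

    def start(self):
        # Stop at the first failing process.
        return all(process.run() for process in self.processes)


sequence = Sequence("muon", 0, {"trigger_hits": [1, 2, 3]})
sequence.add(RoadFinder, "roads")
sequence.add(PtEstimator, "pt")
ok = sequence.start()
print(ok, sequence.give_data())
```

Keeping the data in one structure owned by the sequence means a new version of a process can be swapped in through inheritance without touching the other steps.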