the rtes project – btev, and beyond michael j. haney 1 shikha ahuja 2, ted bapty 2, harry cheung...
TRANSCRIPT
The RTES Project – BTeV, and Beyond
Michael J. Haney1
Shikha Ahuja2, Ted Bapty2, Harry Cheung3, Zbigniew Kalbarczyk4, Akhilesh Khanna4, Jim Kowalkowski3, Derek Messie5, Daniel Mossé6, Sandeep Neema2, Steve Nordstrom2, Jae Oh5, Paul Sheldon7, Shweta Shetty2, Dmitri Volper5,
Long Wang4, Di Yao2
1High Energy Physics, University of Illinois, 1110 W. Green Street, Urbana, IL 61801 USA2Institute for Software Integrated Systems, Vanderbilt University, Nashville, TN 37235 USA
3Fermi National Accelerator Laboratory, Batavia, IL 60510 USA4Electrical and Computer Science, University of Illinois, Urbana, IL 61801 USA
5Electrical Engineering and Computer Science, Syracuse University, Syracuse, NY 13244 USA6Computer Science, University of Pittsburgh, Pittsburgh, PA 15250 USA
7Physics and Astronomy Department, Vanderbilt University, Nashville, TN 37235 USA
M. Haney; RT 2005 The RTES Project - BTeV, and Beyond
Outline
• Real Time Embedded System Project– BTeV => RTES
• Prototypes– SuperComputing 2003– Demo System 2004
• Beyond BTeV
M. Haney; RT 2005 The RTES Project - BTeV, and Beyond
BTeV - High Energy Physics
• Input: 500 GB/s (2.5 MHz)• Level 1 processing: 190s
– rate of 396 ns– 528 “8 GHz” G5 CPUs
• (factor of 50 event reduction)
– high performance interconnects
• Level 2/3 processing: 5+135 ms • (factor of 10+2 event reduction)
– 1536 “12 GHz” CPUs commodity networking
• Output: 200 MB/s (4 kHz) = 1-2 Petabytes/year
M. Haney; RT 2005 The RTES Project - BTeV, and Beyond
BTeV’s Need• “Given the very complex nature of this system
where thousands of events are simultaneously and asynchronously cooking, issues of data integrity, robustness, and monitoring are critically important and have the capacity to cripple a design if not dealt with at the outset… BTeV [needs to] supply the necessary level of “self-awareness” in the trigger system.”
– [June 2000 Project Review]
M. Haney; RT 2005 The RTES Project - BTeV, and Beyond
thus, RTES• The Real Time Embedded System Group
– University of Illinois– University of Pittsburgh– University of Syracuse– Vanderbilt University (PI)– Fermilab
• Physicists and Computer Scientists/Electrical Engineers with expertise in– High performance, real-time system software and hardware,– Reliability and fault tolerance,
– System specification, generation, and modeling tools.
• NSF ITR grant ACI-0121658
M. Haney; RT 2005 The RTES Project - BTeV, and Beyond
Analysis
Local Oper.Manager
LocalFaultMgr
TrigAlgo.
ARMOR/RTOS
TrigAlgo.Trig
Algo.Trig
Algo.
Logical C
ontrol N
etwork
L1/DSP
Local Oper.Manager
LocalFaultMgr
TrigAlgo.
ARMOR/RTOS
TrigAlgo.Trig
Algo.Trig
Algo.
Log
ical
Dat
a N
et
Local OperManager
LocalFaultMgr
TrigAlgo.
ARMOR/Linux
TrigAlgo.Trig
Algo.Trig
Algo.
Log
ical
Dat
a N
et
Logical C
ontrol Netw
ork
Local OperManager
LocalFaultMgr
TrigAlgo.
ARMOR/Linux
TrigAlgo.Trig
Algo.Trig
Algo.
L2,3/CISC/RISC
Region Operations Mgr
RegionFault Mgr
Runtime
Designand
AnalysisAlgorithm Fault Behavior
Resource
Syn
thes
is
PerformanceSimulation
DiagnosabilityAnalysis
ReliabilityAnalysis
SystemModels
Soft Real-Time Hard
ExperimentControl
Interface
Synthesis
Fee
dbac
k
Modeling
Logical C
ontrol Netw
ork
Global Operations
Manager
Global Fault Manager
Reconfig Behavior
The RTES Solution• Model Integrated Computing
– Graphical representation of complex system,with modeling (simulation) resources
• ARMORs– To protect Linux processes
• And sub processors
• VLAs– To monitor/mitigate
at every level• embedded, supervisory Linux,
Linux trigger farm, etc.
M. Haney; RT 2005 The RTES Project - BTeV, and Beyond
Modeling Environment: GME*Fault handlingProcess dataflowHW Configuration
* GME is an Open-Source, Meta-configurable, multi-aspect graphical modeling tool
M. Haney; RT 2005 The RTES Project - BTeV, and Beyond
ARMOR: Adaptive Reconfigurable Mobile Objects of Reliability
Heartbeat ARMORDetects and recovers FTM failures
Fault Tolerant ManagerHighest ranking manager in the system
DaemonsDetect ARMOR crash and hang failures
ARMOR processesProvide a hierarchy of error detection and recovery.ARMORS are protected through checkpointingand internal self-checking.
Execution ARMOROversees application process(e.g. the various Trigger Supervisor/Monitors)
Daemon
Fault TolerantManager (FTM)
Daemon
HeartbeatARMOR
Daemon
ExecARMOR
AppProcess
network
M. Haney; RT 2005 The RTES Project - BTeV, and Beyond
Very Lightweight Agents
• Minimal footprint
• Platform independence– Employable everywhere in
the system!
• Monitors hardware and software
• Handles fault detection& communications with higher level entities
PhysicsApplication
HardwareOS Kernel
(Linux)
VLA
L2/L3 Manager Nodes(Linux)
PhysicsApplication
Level 2/3 Farm Nodes(Linux)
NetworkAPI
M. Haney; RT 2005 The RTES Project - BTeV, and Beyond
RTES view of the BTeV L1 Trigger
DAQ Switch
L2/3 L2/3 L2/3
ITCH
DSP
FPGA
DSP
FPGA
DSP
FPGA
DSP
FPGA
BufferManager
Concentrator (4:1)
Concentrator(80:1)
Level 1 Switch
FIFO
GL1
GlobalMTSM
GlobalL2/3SM
FarmletManager
RegionalMTSM
Slow/RunControl
L1Buf
GlobalGL1SM
FPGAFPGAManager
RegionalMTSM
GL1Manager
Muon Detector
FPGA FPGAL1Buf
L1Buf
60+ (or 600+) farmlets
GlobalPTSM
GL1SM – Global L1 TriggerSupervisor/Monitor
MTSM – Muon TriggerSupervisor/Monitor
PTSM – Pixel TriggerSupervisor/Monitor
L2/3SM – Level2/Level3 TriggerSupervisor/Monitor
M. Haney; RT 2005 The RTES Project - BTeV, and Beyond
SC2003 Prototype
GatewayGateway
PC - Windows OSDATA
DSP - BIOS
PhysicsApplication
PhysicsApplication
Very Light Monitor Agent
Very Light Monitor Agent
TCP/IP
PC - Linux OS
EPICS Graphical Display System
EPICS Graphical Display System
TCP/IP
COMMANDS
ARMOR Microkernel
RecoveryPolicy
MsgParser
Local Manager ARMOR
DSPInterface
DaemonDaemon
ARMOR Microkernel
RecoveryPolicy
MsgParser
Local Manager ARMOR
EPICSInterface
DaemonDaemon
M. Haney; RT 2005 The RTES Project - BTeV, and Beyond
EPICS GUI
1 2
3
4 6
5
9
7
8
10
M. Haney; RT 2005 The RTES Project - BTeV, and Beyond
Independent Review
• Following SuperComputing 2003, a software review was conducted– GME needs to coherently address
multiple, differing domains• System modeling, messaging, fault mitigation,
Run Control function, GUI, other
– ARMORs need to be easily customized• Via GME
– Overall packaging and version control - vital
M. Haney; RT 2005 The RTES Project - BTeV, and Beyond
Domain-specific languages
• GME models, metamodels, and interpreters for– system description, messaging, state machine (run control,
ARMOR), GUI
• Each language generates appropriate artifacts– C++, Python, Matlab M-files, Elvin config files, etc.
M. Haney; RT 2005 The RTES Project - BTeV, and Beyond
Versioning/Build System
Run TreeRun TreeSystem
Executables
Build TreeBuild Tree
UDM TranslatorsUDM Translators
CanonicalXML models
Domain ModelsDomain Models
MetamodelsMetamodels
Language Specification
Domain Artifacts
ArtifactsCompiler/Linker
Translator
SourceFiles
Models
LanguageSpecification
ObjectSourceArtifacts
OUT
IN
OUT
IN
IN
M. Haney; RT 2005 The RTES Project - BTeV, and Beyond
Demo System 2004 - L2/3 Trigger
FTM
Global Mgr Heartbeat/Source node
Regional Mgr 1
Worker 1.1
HB
Exec ARMOR
Exec ARMOR
Filter 1 Filter 2 Event Builder
Worker 1.2
Regional Mgr 2
Exec ARMOR
Worker 2.1
Elvin Router
GUI
Region 1
Elvin msg
ARMOR msg Exec ARMOR
Event Source
M. Haney; RT 2005 The RTES Project - BTeV, and Beyond
Demo System 2004
Iron
Gangliapublic
private
laptop
MatlabElvin
laptop
MatlabElvin
Boulder
Elvin
Global
RC, ARMOR
Regional
RC, ARMOR
Worker
RC, VLA, ARMORFilterApp
Worker
RC, VLA, ARMORFilterApp
…
Regional
RC, ARMOR
Worker
RC, VLA, ARMORFilterApp
Worker
RC, VLA, ARMORFilterApp
…
Regional
RC, ARMOR
Worker
RC, VLA, ARMORFilterApp
…
DataSource
file reader
M. Haney; RT 2005 The RTES Project - BTeV, and Beyond
Matlab GUI
M. Haney; RT 2005 The RTES Project - BTeV, and Beyond
Beyond BTeV - CMS
• GME modeling for XDAQ– System descriptions, state machines,
messaging…– Work in progress
• Fault tolerance for HLT– ARMORs and VLAs
• Being discussed
• Balancing CMS needs and RTES goals• Adding value, without requiring changes
M. Haney; RT 2005 The RTES Project - BTeV, and Beyond
Beyond BTeV - LQCD• Lattice Gauge Theory Computation
– farm at Fermilab
• Single-point sensitivities– Single process fault can compromise
entire farm computation– Checkpointed; can be restarted, but…
• ARMORs and VLAs– Batch/autonomous protection
• No operator
– Dynamic mix of protection requirements• Not a (quasi)static L2/3 Trigger
M. Haney; RT 2005 The RTES Project - BTeV, and Beyond
Beyond BTeV - Grid, Other
• Grid Projects– Load balancing and networks studies
• Nodes-in-farm => farms-in-grid• Resource driven, deadline driven, other
– Extension of studies done for BTeV/partitioning
• Other - Dark Energy Survey (astro camera)– “Simple” system (few nodes)– Not real-time hard (can reacquire image)
• But it will be a good case-study for the “cost” of incorporating RTES (GME, ARMORs, VLAs)
M. Haney; RT 2005 The RTES Project - BTeV, and Beyond
Conclusions
• The RTES project developed two prototypes (L1, and L2/3) for BTeV– Demonstrated at conferences
• RTES is now applying its design-time modeling and runtime middleware to severalhigh performance heterogeneousembedded application environments
M. Haney; RT 2005 The RTES Project - BTeV, and Beyond