Post on 22-Jan-2016
Preshower Front-End Boards News
LHCb group / LPC Clermont
1. News
2. TRIG-PGA Bit Flip Behaviour
3. Bit Flip Simulations
4. Conclusion
• 2 first final prototype FE boards received from Hitachi
• Fully tested @ LPC test bench = FUNCTIONAL
• Production test processes ready
• FEPGAs
• data processing path
• offsets
• Analogue and DAQ parts tested by analogue injection with the AWG and DAQ through the SEQPGA
Power-up failure of the SEQ ?, ‘random’. Problematic with multiple boards
‘sine’ injection with AWG
Stable offsets over 55 h
1. PS FEB Validation : Data Path
•Functioning of the delay chips & synchronisation: ~OK, ongoing studies
•Connectivity on the trigger path: OK
FEPGA-TRIGPGA inputs/outputs with memory boards
•Trigger algorithm: OK
see also Combined Tests in Building 156
Manufacturer JTAG Tests for Production
•Basic tests on wires between PGAs, spot soldering defects
•Improved JTAG tests including most of the board I/Os with dedicated back-plane data conversion, access to individual bits @ 40 MHz, TTL levels internal buses, PGA pins + link drivers and connectors
Limitation: no JTAG access to analogue parts and ADCs
1. PS FEB Validation : Trigger Path
1. TRIG-PGA ISSUE
We are in touch with ACTEL France
No obvious problem from PGA code, implantation
Internal to APA450, noise handling issues suggested ?
An issue was spotted with the APA450 TRIG-PGA
Some input patterns lead to erroneous output
Failure diagnosis: output values missing, then repeated
TRIG-PGA enters a ‘Blocked’ status
This failure can be correlated to the input Bit Flip Rate (BFR)
•Meanwhile … TRIG behaviour extensively studied @ LPC Clermont test bench …
2. Bit Flip Generation
              Total Available Bits   Used with RAMs   Used with Memory Boards
PS            64                     64               0
SPD           64                     64               64
ECAL          26                     10*              10*
Neighbours    34                     34               0
Total         188 (100 %)            178 (95 %)       74 (39 %)
Table 1: Bit usage for the TRIG bit flip tests. The last row also gives the fraction of the total 188 bits varied during the tests. (*) The BCID counters of the ECAL bits have a particular status.
Setup                                       Bit Flip Rate (%)   Maximal Run Duration (cycles)
RAMs: PS & SPD random                       13-47               2.5·10^6
Memory Boards: SPD random                   4-21                1.6·10^8
Memory Boards: SPD alternated & full flip   21-39               4.5·10^5
Table 2: Bit Flip Rate generation strategies. For the RAM setup, all data except the BCIDs were drawn randomly at each cycle. In the MB runs, in order to investigate the problematic high-BFR region in detail, we used a combination of some bits being fully flipped while others alternated with random patterns.
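The upper ends of the BFR ranges in Table 2 can be cross-checked with a short calculation. This is a minimal sketch, assuming BFR is the number of flipped bits per cycle divided by the 188 input bits, that a bit redrawn at random changes value half of the time, and that a fully flipped bit changes every cycle. The function name and arguments are illustrative, and the special status of the ECAL BCID counters is ignored.

```python
TOTAL_BITS = 188  # TRIG-PGA input bits (Table 1)


def expected_bfr(n_random: int, n_full_flip: int = 0,
                 total_bits: int = TOTAL_BITS) -> float:
    """Mean bit flip rate in percent for a mixed input pattern.

    Assumption: a bit redrawn uniformly at random each cycle flips with
    probability 1/2, and a fully flipped bit flips at every cycle.
    """
    if n_random < 0 or n_full_flip < 0 or n_random + n_full_flip > total_bits:
        raise ValueError("bit counts must be non-negative and fit in total_bits")
    return 100.0 * (0.5 * n_random + n_full_flip) / total_bits


ram_random = expected_bfr(n_random=178)               # about 47.3 %
mb_random = expected_bfr(n_random=74)                 # about 19.7 %
mb_full = expected_bfr(n_random=0, n_full_flip=74)    # about 39.4 %
```

Under these assumptions the RAM and full-flip values land on the 47 % and 39 % maxima of Table 2, while the random memory-board value comes out slightly below the quoted 21 %.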
40 s LHCb @ 40 MHz = 2 days of injection
8 words of 8 bits
Mask n/8
Monte-Carlo strategy to study BFR impact
2 internal BCID counters [0; 255]
Average BFR = 2 %
BFR = flips / 188
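As an illustration of this definition, the sketch below counts the bits that change between consecutive input words and normalises by the 188 input bits. It assumes each input word is packed into a Python integer; the helper names and the packing of the two BCID counters into the low 16 bits are hypothetical choices made for the example.

```python
TOTAL_BITS = 188  # TRIG-PGA input bits


def bit_flips(prev_word: int, curr_word: int) -> int:
    """Number of bits that differ between two consecutive input words."""
    return bin(prev_word ^ curr_word).count("1")


def bit_flip_rate(words, total_bits: int = TOTAL_BITS) -> float:
    """Mean BFR in percent over a sequence of input words (integers)."""
    if len(words) < 2:
        raise ValueError("need at least two words to observe a flip")
    flips = sum(bit_flips(a, b) for a, b in zip(words, words[1:]))
    return 100.0 * flips / (total_bits * (len(words) - 1))


# Two free-running 8-bit counters in [0; 255], all other bits held at 0.
# Hypothetical packing: counter 1 in bits 0-7, counter 2 in bits 8-15.
counter_words = [(c % 256) | ((c % 256) << 8) for c in range(257)]
bcid_bfr = bit_flip_rate(counter_words)
```

Over a full period each 8-bit counter flips 510 bits in 256 cycles, so the two counters alone give a BFR of roughly 2.1 %, in line with the average of about 2 % quoted for the BCID counters.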
2. TRIG-PGA Failure Distribution
Figure 1: TRIG-PGA failure in the RAM configuration for a mean BFR of ~47 %. The figure shows the distribution of cumulative flips in PS and SPD bits before failure occurs. The open circles stand for board P#8-01 and the filled dots for board P#8-02. Error bars are statistical.
Statistical spread:
•Fluctuations in the BFR over time due to BCIDs, pattern randomization protocol
•Inherent. Even for the same inputs, a spread of a few %
Failure occurs, on average, once a certain number of bit flips has accumulated
2. TRIG-PGA Failure as a Function of BFR
•There is a threshold BFR, f_r,0, below which no failure occurs.
•Failure is not driven by a dynamical process.
•Below the threshold, the TRIG-PGA can re-generate.
[Figure: postponed failure sequence. Labels: threshold f_r,0, ‘Blocked’, ‘Relax’, additional cycles before failure; curves for relax BFR = 5 % and relax BFR = 16 %]
2. TRIG-PGA Failure as a Function of T, Clock f
Figure 4: TRIG-PGA failure in the MB configuration for various temperatures and operating clock frequencies. The left plot shows the failure cycle number for different temperatures. The usual temperature in the lab was 30 ºC. The right plot shows the failure cycle number for various operating frequencies. The nominal frequency is 40 MHz. Below 25 MHz no failure occurs up to the maximal BFR of 39 %.
Failure depends on Temperature, Clock frequency
No failure @ -55 ºC
No failure below 25 MHz
2. TRIG-PGA Recovery
•Once blocked, the TRIG-PGA can recover if the BFR is below threshold
•Full recovery is long … ~ 10^3 cycles
[Figure: recovery sequence. Labels: 1st load, 2nd load, ‘Relax’, ‘Blocked’ (twice), ‘Exhausted’ plateau, full recovery]
Recovery depends little on the load and relax BFR values (except for the threshold)
Suggests a time-driven mechanism
2. TRIG-PGA Failure Cost Model
•Failure cycle follows a simple ‘Cost’ model as a function of BFR
•Assume the cost is stationary
Fully random BFR: in perfect agreement with data
~ 20 % bias when the BFR varies in time; BOOT ?
[Figure: cross-check sequence. Cost / cycle (%) modelled from data, compared with the cross-check model; ‘Blocked’ state marked]
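The slides do not give the functional form of the cost, so the following is only a toy sketch of how such a model could be evaluated. It assumes a cost that accumulates per cycle above the threshold BFR, relaxes at a constant rate below it (in the spirit of the time-driven recovery), and triggers the ‘Blocked’ state once a limit is reached. The linear shape and every number in `linear_cost` are hypothetical placeholders, not values fitted to the test-bench data.

```python
from typing import Callable, Iterable, Optional


def linear_cost(bfr: float, threshold: float = 10.0,
                slope: float = 0.01, relax: float = 0.1) -> float:
    """Hypothetical cost in % per cycle as a function of BFR in %.

    Above `threshold` the cost grows linearly with the excess BFR;
    at or below it the accumulated cost relaxes at a constant rate.
    """
    if bfr > threshold:
        return slope * (bfr - threshold)
    return -relax


def cycles_to_failure(bfr_sequence: Iterable[float],
                      cost_per_cycle: Callable[[float], float] = linear_cost,
                      cost_limit: float = 100.0) -> Optional[int]:
    """Cycle number at which the accumulated cost reaches `cost_limit`.

    The cost is stationary: it depends only on the BFR of the current
    cycle. Returns None if the limit is never reached.
    """
    cost = 0.0
    for cycle, bfr in enumerate(bfr_sequence, start=1):
        # Accumulated cost cannot relax below zero (full recovery).
        cost = max(0.0, cost + cost_per_cycle(bfr))
        if cost >= cost_limit:
            return cycle
    return None


constant_load = cycles_to_failure([30.0] * 1000)
below_threshold = cycles_to_failure([5.0] * 10_000)
load_relax_load = cycles_to_failure([30.0] * 400 + [5.0] * 2000 + [30.0] * 1000)
```

With these placeholder parameters a constant 30 % load fails after about 500 cycles, a 5 % load never fails, and a load interrupted by a long relax phase fails later than an uninterrupted one, which is the qualitative behaviour described for the postponed failure and recovery sequences.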
3. TRIG-PGA Bit Flip Expected in LHCb
Figure 9: Bit Flip Rate distribution for pp events at a luminosity of 5·10^32 cm^-2·s^-1 and for the two ‘hottest’ cells (82, 91) of the PS/SPD system. The SPD threshold was set as low as 0.3 MIP.
                 Mean Bit Flip Rate (%)   Contribution to BFR (%)
BCIDs            2.1                      31
PS bits          0.9                      14
SPD bits         2.0                      29
ECAL addresses   1.7                      25
Total            6.7                      100
Table 3: Mean Bit Flip Rate for cells (82, 91) at a luminosity of 5·10^32 cm^-2·s^-1. The fractions at which the various bits contribute to the BFR are also indicated.
•Bit flips simulated from DC 06 minimum bias pp events (10^3 events / cell)
•SPD threshold set to 0.3 MIP = very low
•Select the 2 ‘hottest’ boards (close to the beam line)
L = 5·10^32 cm^-2·s^-1
Average BFR well below failure threshold
But … distribution extends well above (~ 3 % above)
[Figure labels: Threshold, BCID counters]
3. TRIG-PGA Failure Probability @ LHCb
Figure 10: Cost probabilities and TRIG-PGA failure as simulated. The left plot shows the maximal cost distribution within runs of 65 k events. The right plot is the failure probability within 1 year of LHC operation, assuming various cost thresholds.
•Minimum bias pattern injection looped @ 40 MHz over 1 day: no TRIG-PGA failure
But always the same pattern … 40 s LHC = 2 days of tests
Estimate failure from simulation / Cost model
116 min @ 40 MHz simulated in Lyon batch farm
1st: bunches of 65 k cycles. No failure predicted
2nd: extrapolate to 1 yr of LHCb ‘non-stop’
Cost Below 10 %
Failure expected @ cost ~ 80 %
SHOULD NOT HAPPEN in standard conditions
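The slides do not detail how the extrapolation to one year was done. The sketch below only shows the bookkeeping behind the quoted figures, plus one generic way to scale a per-run failure probability to many runs. It assumes that "65 k" means 2^16 cycles and that runs fail independently with the same probability; both are assumptions made for illustration, and the function names are hypothetical.

```python
import math

CLOCK_HZ = 40e6      # nominal clock, 40 MHz
RUN_CYCLES = 65_536  # assumption: "65 k" cycles per run means 2**16


def cycles(seconds: float) -> float:
    """Number of clock cycles in a given running time at 40 MHz."""
    return seconds * CLOCK_HZ


def n_runs(seconds: float) -> float:
    """Number of 65 k-cycle runs contained in a given running time."""
    return cycles(seconds) / RUN_CYCLES


def failure_probability(p_run: float, runs: float) -> float:
    """Probability of at least one failure in `runs` independent runs.

    Evaluates 1 - (1 - p_run)**runs with log1p/expm1 so that very small
    per-run probabilities do not underflow.
    """
    if not 0.0 <= p_run <= 1.0:
        raise ValueError("p_run must lie in [0, 1]")
    if p_run == 1.0:
        return 1.0
    return -math.expm1(runs * math.log1p(-p_run))


simulated_cycles = cycles(116 * 60)        # 116 min of simulated running
simulated_runs = n_runs(116 * 60)          # roughly 4.2 million runs
year_runs = n_runs(365 * 24 * 3600)        # one year 'non-stop'
```

The 116 simulated minutes correspond to 2.784·10^11 cycles, a factor of about 4500 short of one year of continuous running, which is why the result has to be extrapolated rather than read off the simulation directly.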
4. CONCLUSION
THE PS FEBs ARE WORKING PROPERLY and fit our needs
PS FEB BOARDS EXTENSIVELY TESTED @ LPC & ongoing in Building 156
•Power-up issues @ SEQ ?
•TRIG-PGA ‘pathologic behaviour’
No problem expected in standard LHCb conditions
Actel France @ LPC Monday, December 4th
Production Should Follow