Advanced Topics on FPGA Applications, Screen B
Wu, Jinyuan
Fermilab
IEEE NSS 2007 Refresher Course
Supplemental Materials
Oct, 2007
Oct. 2007, Wu Jinyuan, Fermilab
IEEE NSS Refresher Course, Supplemental Materials
Outline
Digital Design with FPGAs (This 45 min. Course)
• Logic Element in a Nutshell
• Variations of the Registered Adders
• Tricks of Using RAM
• RAM based histograms
• Topics on Multipliers
• Curved Track Fitter
Advanced Topics on FPGA Applications (Included as Supplemental Materials)
• Doublet Finding, Hash Sorter
• Triplet Finding, Tiny Triplet Finder (TTF)
• Options of Sequence Control, Recursive Structure, etc.
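Doublet Matching
[Figure: three detector planes drawn against x, y, z axes, with hit coordinates (x1a, y1a), (x1b, y1b) on plane 1, (x2a, y2a), (x2b, y2b) on plane 2 and (x3a, y3a), (x3b, y3b) on plane 3. Matching conditions: 2*y1 = y2 and 3*y1 = y3.]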
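Example of Evaluating the Key Number
Matching condition: 3*y1 = y3
Key number from a y1 hit: K = 3*y1/8
Key number from a y3 hit: K = y3/8
[Figure: y1 passes through a *3 block to produce K; y3 produces K directly.]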
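The key-number idea above can be sketched in software. This is only an illustration, assuming integer hit coordinates and an exact match condition; the function name `match_doublets` and the parameter `bin_width` are hypothetical and not taken from the slides.

```python
from collections import defaultdict

def match_doublets(y1_hits, y3_hits, bin_width=8):
    """Pair hits satisfying 3*y1 == y3 by comparing only hits with the same key K."""
    # Group the y3 hits by their key number K = y3 / 8 (integer division).
    bins = defaultdict(list)
    for y3 in y3_hits:
        bins[y3 // bin_width].append(y3)

    pairs = []
    for y1 in y1_hits:
        k = (3 * y1) // bin_width          # key number K = 3*y1 / 8
        for y3 in bins.get(k, []):         # only hits sharing K are compared
            if 3 * y1 == y3:
                pairs.append((y1, y3))
    return pairs

pairs = match_doublets([10, 21, 40], [30, 63, 100, 120])
```

Each y1 hit is compared only with the y3 hits that share its key, instead of with every y3 hit. With a finite matching tolerance, neighboring key values would also have to be checked.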
Link List Structure of Hash Sorter
[Figure: block diagram with data input DIN, key number K, Index RAM, Pointer RAM, DATA RAM and data output DOUT.]
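A minimal software model of a link-list hash sorter built from the three RAMs named in the diagram may help. How the RAMs are wired is an assumption here: the Index RAM is taken to hold the most recent address for each key K, and the Pointer RAM to link each entry to the previous entry with the same K. The class and method names are hypothetical.

```python
class HashSorter:
    """Software model: three arrays stand in for the Index, Pointer and DATA RAMs."""

    def __init__(self, num_keys, capacity):
        self.index_ram = [-1] * num_keys     # last address stored for each key K
        self.pointer_ram = [-1] * capacity   # link to the previous entry with the same K
        self.data_ram = [None] * capacity    # the data items themselves
        self.count = 0                       # next free address

    def store(self, key, data):
        addr = self.count
        self.data_ram[addr] = data
        self.pointer_ram[addr] = self.index_ram[key]  # chain to the older entry
        self.index_ram[key] = addr                    # this entry is now the head
        self.count += 1

    def fetch(self, key):
        """Walk the chain for one key, most recent entry first."""
        out = []
        addr = self.index_ram[key]
        while addr != -1:
            out.append(self.data_ram[addr])
            addr = self.pointer_ram[addr]
        return out

sorter = HashSorter(num_keys=16, capacity=8)
for key, data in [(3, "a"), (5, "b"), (3, "c")]:
    sorter.store(key, data)
group3 = sorter.fetch(3)
```

Storing takes one write per RAM regardless of how many items share a key, and fetching a group touches only the items in that group.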
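Histogram with Fast Reset
[Figure: schematic of a RAM-based histogram. Inputs K and DV pass through D/Q registers to a RAM with D, WA, WE, RA and Q ports and a +1 incrementer. A second RAM with the same ports feeds an == comparator against RC, which selects between the stored count and 0. Control inputs: CE and RESET.]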
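One plausible reading of the schematic, offered as an assumption rather than as the circuit's confirmed behavior: the second RAM stores, for each bin, the value of a reset counter RC at the time of the last write, and a bin whose stored tag differs from the current RC is treated as 0. A reset then costs a single increment of RC instead of a sweep over all bins. The names below are hypothetical.

```python
class FastResetHistogram:
    def __init__(self, num_bins):
        self.counts = [0] * num_bins   # histogram RAM
        self.tags = [0] * num_bins     # second RAM: RC value at the last write of each bin
        self.rc = 0                    # current reset count (RC)

    def reset(self):
        # One operation, no bin is touched. A hardware RC has finite width and
        # would eventually wrap around; this model ignores that.
        self.rc += 1

    def read(self, k):
        # The == comparator: a stale tag means the bin was not written since the reset.
        return self.counts[k] if self.tags[k] == self.rc else 0

    def fill(self, k):
        self.counts[k] = self.read(k) + 1   # the +1 path, starting from 0 if stale
        self.tags[k] = self.rc

hist = FastResetHistogram(num_bins=8)
for _ in range(3):
    hist.fill(2)
before_reset = hist.read(2)
hist.reset()
after_reset = hist.read(2)
hist.fill(2)
after_refill = hist.read(2)
```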
An Example of Track Recognition: Hits
An Example of Track Recognition: Doublets
Hits are paired together as doublets.
Ghost doublets may exist.
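An Example of Track Recognition: Histogram
[Equation garbled in extraction; surviving fragments: sin(...), r, R, 25 cm, 50 cm and c0.]
[Figure: 2-D histogram with c0 on one axis; the other axis label survives only as a subscript 0.]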
Two track parameters can be calculated for each doublet.
A 2-D histogram is booked.
Doublets from the same track are entered into the same bin, since they have the same track parameters.
Sometimes they fall into a cluster of neighboring bins.
This is a “ghost”.
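The booking step can be sketched as follows. Because the slide's own parameter formula did not survive extraction, this sketch swaps in a different parameterization: tracks are assumed to be circles through the origin, and the two parameters are the curvature 1/R and the direction `phi_c` from the origin to the circle center. These parameters, the bin widths and all function names are assumptions for illustration only.

```python
import math
from collections import Counter
from itertools import combinations

def doublet_params(p1, p2):
    """Return (curvature, phi_c) of the circle through the origin, p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    # A circle through the origin with center (a, b) obeys x^2 + y^2 = 2*(a*x + b*y).
    det = 2.0 * (x1 * y2 - x2 * y1)
    if abs(det) < 1e-12:
        return None                      # hits collinear with the origin
    s1 = x1 * x1 + y1 * y1
    s2 = x2 * x2 + y2 * y2
    a = (s1 * y2 - s2 * y1) / det
    b = (x1 * s2 - x2 * s1) / det
    return 1.0 / math.hypot(a, b), math.atan2(b, a)

def book_histogram(layers, curv_bin=0.002, phi_bin=0.05):
    """Enter every doublet formed from hits on two different layers into a 2-D histogram."""
    hist = Counter()
    for layer_a, layer_b in combinations(layers, 2):
        for p1 in layer_a:
            for p2 in layer_b:
                params = doublet_params(p1, p2)
                if params is None:
                    continue
                curv, phi_c = params
                hist[(round(curv / curv_bin), round(phi_c / phi_bin))] += 1
    return hist

def hit_on_track(r, radius):
    """Hit at distance r from the origin on a circle of given radius centered at (0, radius)."""
    y = r * r / (2.0 * radius)
    return (math.sqrt(r * r - y * y), y)

# One track of radius 100 crossing three layers at r = 20, 40 and 60.
layers = [[hit_on_track(r, 100.0)] for r in (20.0, 40.0, 60.0)]
hist = book_histogram(layers)
```

The three doublets of this single track land in one bin. With several tracks per event, doublets formed from hits of different tracks scatter over other bins and appear as ghosts.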
An Example of Track Recognition: Tracks
All doublets from a track are contained in a cluster.
Simulation Results
An event with 200 tracks
It still works at 1000 tracks/event
Example: Finding “Soft Jets”
A simulated event with 200 tracks. Flat distributions. Min. R = 55 cm.
16 soft tracks are added. They are grouped in 2 small initial-angle regions, i.e., 2 “soft jets”.
[Figure: two panels with the captions below; axis labels survive only as subscript 0.]
Can you see the “soft jets”?
Can you see the “soft jets” now?
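[Figure: three stations of detector planes drawn against x, y, z axes, measuring u and v coordinates, with hits u1a, u1b, v1a, v1b on station 1, u2a, u2b, v2a, v2b on station 2 and u3a, u3b, v3a, v3b on station 3.]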
Triplet Finding
• Three data items must satisfy the condition: xA + xC = 2 xB.
• A total of n^3 combinations must be checked (e.g. 5x5x5 = 125).
• Three layers of loops are needed if the process is implemented in software.
• Large silicon resources, O(N^2), may be needed without careful planning.
[Figure: hits on Plane A, Plane B and Plane C.]
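The three layers of loops mentioned above look like this in software. The second function is a swapped-in alternative, not the slide's method: it computes the required xB from each (xA, xC) pair and looks it up in a set. Integer coordinates without duplicates and an exact match are assumed, and the function names are hypothetical.

```python
def find_triplets_brute(plane_a, plane_b, plane_c):
    """Check all combinations with three nested loops; also count the checks."""
    checked = 0
    triplets = []
    for xa in plane_a:
        for xb in plane_b:
            for xc in plane_c:
                checked += 1
                if xa + xc == 2 * xb:
                    triplets.append((xa, xb, xc))
    return triplets, checked

def find_triplets_lookup(plane_a, plane_b, plane_c):
    """Two loops plus a set lookup for the xB that satisfies xA + xC = 2*xB."""
    b_set = set(plane_b)
    return [(xa, (xa + xc) // 2, xc)
            for xa in plane_a
            for xc in plane_c
            if (xa + xc) % 2 == 0 and (xa + xc) // 2 in b_set]

plane_a = [1, 4, 7, 10, 13]
plane_b = [2, 5, 8, 11, 14]
plane_c = [3, 6, 9, 12, 15]
triplets, checked = find_triplets_brute(plane_a, plane_b, plane_c)
triplets_fast = find_triplets_lookup(plane_a, plane_b, plane_c)
```

With five hits per plane the brute-force version performs 5x5x5 = 125 checks, while the lookup version performs 25.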
Block Diagram, Step 1
Block Diagram, Step 2
Circular Tracks from Collision Point on Cylindrical Detectors
For a given hit on layer 3, a coincidence between a layer 2 hit and a layer 1 hit that satisfies the coincidence map signifies a valid circular track.
A track segment has 2 free parameters, i.e., a triplet. The coincidence map is invariant under rotation.
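[Figure: two plots of the coincidence map, one with axes running from 0 to 100 and one with axes running from 0 to 128. The axis labels of the second plot survive only as "1-3)+64" and "2-3)+64", apparently the layer 1 and layer 2 hit positions relative to the layer 3 hit, offset by 64.]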
Logarithmic Shifter
[Figure: cascaded shifter stages controlled by S1, S2 and S4.]
• # of bits: N
• Shift distance: L
• # of stages: log2(L)
• Total LE usage: N*log2(L)
A shift of the bit pattern by X bits is done in one clock cycle rather than X cycles.
The logarithmic shifter is also known as a “barrel shifter”, but the term “logarithmic” better reflects the nature of the implementation, its resource usage and its propagation delay.
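The staged structure described above can be modeled bit-for-bit in software. This is a behavioral sketch only: each loop pass stands for one stage of N two-input multiplexers, and the function name and default widths are assumptions.

```python
def log_shift_left(pattern, shift, n_bits=16, max_shift=8):
    """Shift an n_bits-wide pattern left by `shift` using log2(max_shift) stages."""
    assert 0 <= shift < max_shift
    mask = (1 << n_bits) - 1
    stage_distance = 1                   # stages shift by 1, 2, 4, ... (S1, S2, S4)
    while stage_distance < max_shift:
        # One stage: either pass the pattern through or shift it by a fixed distance,
        # selected by one bit of the shift amount.
        if shift & stage_distance:
            pattern = (pattern << stage_distance) & mask
        stage_distance <<= 1
    return pattern

shifted = log_shift_left(0b1011, 5)      # uses the S1 and S4 stages
```

For a maximum shift distance of 8 there are three stages, which matches the N*log2(L) estimate for logic element usage.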
Logic Cell Usage
Both the 64-bit and 128-bit TTF designs fit a $100 FPGA comfortably.
A simple 64-bit Hough transform design is shown for scale.
A $1200 FPGA is shown for scale.
[Figure: bar chart of logic cell usage for TTF64, TTF128 and Hough Trans. 64, shown against the capacities of the EP1C12 ($118) and the EP2A40 ($1200).]
Complex Triplet Finding Problems
[Figure: detector planes labeled u1, v1, u2, v2, u3, v3, u4, v4, y5 and x5.]
FPGA Process Sequencing Options
Option                               | Program Type          | Program Length (CLK cycles) | Reprogram | Resource Usage
Finite State Machine (FSM)           | Fixed Wired           | 10                          | Hard      | Small
Enclosed Loop Micro-Sequencer (ELMS) | Memory Stored Program | 10-1000                     | Easy      | Small
Microprocessor (MP)                  | Memory Stored Program | >1000                       | Easy      | Large
The Between Counter
[Figure: a counter with SLOAD, D[], SCLR and Q[] ports, an == comparator, inputs A[], B[] and T, and labels N and M-1. Example count sequence: 0,1,2,3,4,5,6,7,8,9,A; then 5,6,7,8,9,A repeated several times; then 5,6,7,8,9,A,B,C,D,E,F… The Between Counter output addresses a ROM holding instr0 to instrD at PC0 to PCD, and the ROM outputs the Control Signals.]
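The count sequence in the figure can be reproduced with a small model. The behavior assumed here is inferred from the sequence alone: the counter counts up, and whenever its value equals B while a loop condition holds, it loads A instead of incrementing. The parameter `repeat` is a hypothetical stand-in for that condition (the T input), counting how many more times the range is replayed.

```python
def between_counter(a, b, repeat, cycles):
    """Count up from 0; when Q == b and repeats remain, load a instead of b + 1."""
    q = 0
    seq = []
    for _ in range(cycles):
        seq.append(q)
        if q == b and repeat > 0:
            q = a            # synchronous load: jump back to the start of the range
            repeat -= 1
        else:
            q += 1           # normal counting
    return seq

# Run through 0..A once, replay 5..A twice, then continue past A.
seq = between_counter(a=0x5, b=0xA, repeat=2, cycles=25)
```

Used as a program counter for the ROM, such a counter replays the instructions between addresses A and B without a separate branch instruction.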
ELMS – Detailed Block Diagram
[Figure: block diagram with these labeled elements: ROM (128 x 36 bits), PC with +1 and Reset, CondJMP, Loop & Return Registers + Stack (128 words) holding CNT, endA and bckA, Compare, Push/Pop, and the signals JMP, JMPIF, RTN, desA, LoopBack, DEC, LastPass and User Control Signals. Annotations: "0x04" and "RUN at 04 cnt EndA BckA".]
LoopBack = DEC = (PC==endA) && (CNT!=0)
LastPass = (PC==endA) && (CNT==1)
Example program:
        FOR BckA1 EndA1 #n
        LD R2, #addr_a
        LD R3, #addr_X
        LD R7, #0
BckA1   LD R4, (R2)
        INC R2
        LD R5, (R3)
        INC R3
        MUL R6, R4, R5
EndA1   ADD R7, R7, R6
        LD R8, R7
The Stack supports nested loops, up to 128 layers.
What’s Good About ELMS: FOR Loops at Machine Code Level
Example computation: $Y = \sum_{i=0}^{n} a_i X_i$
The looping sequence is known in this example before entering the loop, but a regular microprocessor treats the sequence as unknown. ELMS supports FOR loops with a pre-defined number of iterations at machine code level. Execution time is saved, and the micro-complexities associated with conditional branches (branch penalty, pipeline bubbles, etc.) are avoided.
Microprocessor:
        LD R1, #n
        LD R2, #addr_a
        LD R3, #addr_X
        LD R7, #0
BckA1   LD R4, (R2)
        INC R2
        LD R5, (R3)
        INC R3
        MUL R6, R4, R5
EndA1   ADD R7, R7, R6
        DEC R1
        BRNZ BckA1
The ELMS:
        FOR BckA1 EndA1 #n
        LD R2, #addr_a
        LD R3, #addr_X
        LD R7, #0
BckA1   LD R4, (R2)
        INC R2
        LD R5, (R3)
        INC R3
        MUL R6, R4, R5
EndA1   ADD R7, R7, R6
[Figure labels: "Conditional Branch" and "25%".]
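One way to read the 25% label is as the share of loop-body instructions that the microprocessor version spends on loop control (DEC and BRNZ). The count below checks that reading. It assumes one instruction per clock cycle and ignores branch penalties, so it is a lower bound on the saving rather than a measured figure.

```python
# Loop bodies copied from the two listings above.
MICROPROCESSOR_LOOP = [
    "LD R4, (R2)", "INC R2", "LD R5, (R3)", "INC R3",
    "MUL R6, R4, R5", "ADD R7, R7, R6", "DEC R1", "BRNZ BckA1",
]
ELMS_LOOP = [
    "LD R4, (R2)", "INC R2", "LD R5, (R3)", "INC R3",
    "MUL R6, R4, R5", "ADD R7, R7, R6",
]

def loop_overhead(mp_body, elms_body):
    """Fraction of the microprocessor loop body that the ELMS FOR loop removes."""
    return (len(mp_body) - len(elms_body)) / len(mp_body)

def total_instructions(setup, body, n):
    """Instructions executed for n iterations, assuming one instruction per cycle."""
    return setup + n * len(body)

overhead = loop_overhead(MICROPROCESSOR_LOOP, ELMS_LOOP)
mp_total = total_instructions(4, MICROPROCESSOR_LOOP, 100)
elms_total = total_instructions(4, ELMS_LOOP, 100)
```

Both listings have four setup instructions before the loop, so for long loops the ratio of total instruction counts approaches the per-iteration ratio of 6 to 8.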
The Problem: 3-Phase 60 Hz AC
Rectification noise from power supplies using 3-phase 60 Hz AC is picked up by the input cable lying in the accelerator tunnel.
[Figure: Time Domain and Frequency Domain panels. The spectrum plots amplitude against frequency (Hz) from 0 to 3600 Hz in steps of 360 Hz. Label: ADC, 21 ?s/sample, with the unit prefix lost in extraction.]
Filtering Results
Noise above 360 Hz, the dominant portion, is filtered out by both filter functions.
The CIC sum is a lot smoother than the sliding sum, but small signals are still buried under ripples at 60 and 180 Hz.
[Figure: traces labeled Sliding Sum, CIC Sum and Signals.]
Recursive Implementation of CIC Sum
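The non-recursive implementation needs 248 memory fetches, 248 multiplications and 248 additions, and more operations for longer sum lengths.
[Figure: non-recursive structure in which x[n] is weighted by *h1, *h2, ..., *h[K] and summed into y[n].]
The CIC sum constructed as a sliding sum of sliding sums needs 2 memory fetches, 0 multiplications and 4 add/sub operations for any sum length.
[Figure: two cascaded accumulators. The first adds x[n] and subtracts x[n-K] to form s[n]; the second adds s[n] and subtracts s[n-K] to form y[n].]
The re-formulated CIC sum uses the raw data buffer rather than a separate buffer.
[Figure: re-formulated CIC sum. An accumulator adds x[n], subtracts 2x[n-K] and adds x[n-2K] to form u[n]; a second accumulator adds u[n] to form y[n].]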
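The equivalence of the re-formulated recursive form and the direct sliding sum of sliding sums can be checked numerically. The recurrences are read off the diagram labels; zero initial conditions and zero-padded input before n = 0 are assumptions of this sketch, as are the function names.

```python
def cic_direct(x, k):
    """Sliding sum of sliding sums, recomputed from scratch for every output sample."""
    def xs(i):
        return x[i] if i >= 0 else 0     # samples before the start count as 0
    return [sum(xs(n - j - i) for j in range(k) for i in range(k))
            for n in range(len(x))]

def cic_recursive(x, k):
    """Re-formulated CIC sum: u[n] = u[n-1] + x[n] - 2*x[n-K] + x[n-2K], y[n] = y[n-1] + u[n]."""
    def xs(i):
        return x[i] if i >= 0 else 0
    u = 0
    y = 0
    out = []
    for n in range(len(x)):
        # Two fetches from the raw data buffer: x[n-K] and x[n-2K].
        u += xs(n) - 2 * xs(n - k) + xs(n - 2 * k)
        y += u
        out.append(y)
    return out

samples = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3]
direct = cic_direct(samples, 4)
recursive = cic_recursive(samples, 4)
```

The recursive version does the same four add/sub operations per sample for any K, while the direct version grows with the sum length. In integer arithmetic the two agree exactly; with floating-point data the recursive form can accumulate rounding error.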
Exponential Sequence Generator
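[Figure: a register with D, Q and SET ports, updated as follows.]
if (CO==1) {Q = Q - Q/32;}
[Figure: plot of the generated sequence, with a vertical axis from 0 to 70000 and a horizontal axis from 0 to 160.]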
This is also an example of a recursive structure. It is an IIR structure, but it is stable. Decaying exponential components are used to stabilize other recursive structures.
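The update rule above is easy to simulate. The starting value of 65535 and the assumption that CO is asserted on every step are illustrative choices, not values given on the slide; the division by 32 is modeled as a 5-bit right shift, as it would be in hardware.

```python
def exponential_sequence(q0=65535, steps=160, shift=5):
    """Generate Q = Q - Q/32 repeatedly, with the division done as a right shift."""
    q = q0
    seq = []
    for _ in range(steps):
        seq.append(q)
        q = q - (q >> shift)     # each step keeps about 31/32 of the previous value
    return seq

seq = exponential_sequence()
```

Because integer division truncates, the sequence stops decreasing once Q falls below 32, so the register never reaches zero by this rule alone.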