fpga interconnect planning
DESCRIPTION
dfdfTRANSCRIPT
FPGA Interconnect PlanningFPGA Interconnect PlanningFPGA Interconnect Planning
Amit Singh
Xilinx Inc.San Jose, CA
Malgorzata Marek-Sadowska
University of California, Santa BarbaraSanta Barbara, [email protected]
OutlineOutline
IntroductionPrevious WorkArchitecture Model
• Architecture & Design Rent’s exponentClustering
• Spatial regularity
Fanout distribution• Area minimization• Area-delay minimization• Performance
Conclusions
Introduction
Clustered FPGAs• System-on-Chip (SOC)
Performance, Power, Area• Matching design and architecture complexities• Circuit packing/clustering
Fanout distribution (Zarkesh-Ha et al. 2000)• Rent’s rule
Segment length planning in FPGAs• Area reduction• Area-Delay product minimization
Previous Work
Rent’s rule• Landman, Russo (1974), Donath (1979)
Interconnect distribution model• Davis et al. (1998), Zarkesh-Ha et al. (2000)• Local, semi-global, global wiring requirement
Homogeneous, heterogeneous systems
FPGA segment length distribution• Betz et al. (1999)• Type of routing switches• Impact – segment length distribution on area-delay
productBest area-delay product: Mix of length 4 and 8 segments
Clustered FPGAs
Clusters of LEs, connection boxes and switch-boxes
Regular 2-D mesh array
Example: Xilinx Virtex series, Altera APEX & Stratix
Cluster of LEsCluster of LEs
A Logic ElementA Logic Element
Routing Model
Cluster
WSe
gmen
t Le
ngth
Cluster
S-Box
C-Box
Cluster
Routing Model
Buffered routing switches
Buffer chain delay
Pass-transistor chain delay
How much interconnect?How much interconnect?
~80% of FPGA area = interconnects
Routing resource utilization (RRU) is low • 100% logic utilization
Depopulating logic clusters• Regularity
Interconnect complexity guided fanout distribution-Rent’s rule
• Average fanoutSegment lengths (shorter segments or longer segments?)Switch type (tri-state buffers or pass-transistor?)
Rent’s Rule
(simple) 0 ≤ p ≤ 1 (complex)
Typical values: 0.5 ≤ p ≤ 0.75
Measure for the complexity
of the interconnection topology.
Rent’s rule: Landman and Russo in 1971.
Average number of terminals and blocks per module in a
partitioned design:
T = t B p
p = Rent exponent
t ≅ average # term./block
11
100010
10
100
100
T
B
averageRent’s rule
Rent’s Rule
Definitions
• Pd – Rent’s parameter for Design
• Pa – Rent’s parameter for Architecture R
outi
ng
reso
urc
e u
tiliz
atio
n
Rent’s parameter
Pa = 0.64
Logic Clustering
• Separation : Sum of all terminals of nets incident to LE
• Degree : Number of nets incident to LE
Examplet
CY
Xxz
y
)]1()(2[)( α+•= xnwXG k
aPnkIO )1( +≤
2dseparationc =
•Net weight: r
xw 2)( =
Clustering: Seed selection
degree = 4; separation = 18, c = 1.25
AB
degree = 4; separation = 8, c = 0.5
B
Nets absorbed = 4
A
Nets absorbed = 1
Rent’s Rule: Depopulation
Case 1: Pd <= Pa
• Achieve spatial uniformity.
Case 2: Pd > Pa
• Need more routing resources.
• Solution – Depopulate clusters
A
A
A
Regularity
Better clustering for increased routability
Routing succeeded with a channel width factor of 12Routing succeeded with a channel width factor of 21
Avg. fan-out 2.7
Ours:Avg. fan-out
3.7
Rent’s Rule: Depopulation
Cluster Pin Utilization
01020304050607080
0 5 10 15 20 25 30
# Cluster Pins utilized
# Cl
uste
rs
Ours T-VPack
Segment length Planning
Typical segment distribution
What is the best mix of segments for:i) Area minimization ii) Area-delay minimization?
Fanout Distribution
Inter-cluster wire requirement
Equivalent Rent’s parameter of system
Netlist profile• Low fanout nets – smaller length segments• Global nets – Longer segments (long lines)
Global segments – buffered
Shorter segments – not buffered(pass transistor switches)
Nn
i
Nieq
ikk ∏=
=1
(N
Npp
n
iii
eq
∑== 1
Fanout distribution
Array of clusters
0
50
100
150
200
250
300
350
1 2 3 4 5 6 7 8 9 10
Pins per net
net
sActual Predictedm
mmkNmNetpp ))1(()(
11 −− −−=
( ) 1,)1(1
)1(12
1
−Ω−+−
+−= −
−
Maxp
Max
pMax
avg FopFoFoFo ( ) ∑
= +=Ω
MaxFo
n
p
Max nnnFop
12 )1(
,
Fanout distribution: Area-Minimization
Good Placement
Example
Fanout distribution: Area-Minimization
0.00E+00
2.00E+06
4.00E+06
6.00E+06
8.00E+06
1.00E+07
1 2 3 4 5 6
Circuit
Area
(Tra
nsis
tors
)
Ours Length 1 Xilinx-like
Fanout distribution: Area-Minimization
0.00E+00
5.00E-08
1.00E-07
1.50E-07
2.00E-07
2.50E-07
3.00E-07
1 2 3 4 5 6
Circuit
Crit
ical
Pat
h (n
s)
Ours Length 1 Xilinx-like
Fanout distribution:Area-Delay Product
Performance!
Average fanout: Critical path model
Fanout distribution:Area-Delay Product
Timing-driven Placer and Router
0
0.2
0.4
0.6
0.8
1
1.2
alu4
apex
2ap
ex4
bigke
yde
sdif
feq dsip
ex5p
misex3
s298 se
qtse
ng
Circuits
Area
-del
ay p
rodu
ct
Ours Xilinx-like
Fanout distribution:Performance
Using a Timing-driven Placer and RouterNormalized Critical Path
0
0.2
0.4
0.6
0.8
1
1.2
1.4
alu4
apex
2ap
ex4
bigke
y
des
diffeq ds
ipex
5pmise
x3s2
98 seq
tseng Avg.
Circuits
Nor
mal
ized
del
ay
Ours Xilinx-like
Conclusions & Future Work
FPGA Clustering for Regularity• Rent’s rule
Fanout distribution based segment length planning• Area minimization
Reduced wire-length• Area-Delay minimization
Reduction of 29% over state-of-art
Fixed FPGA Architecture• 20% better area-delay product
Future work: Metal layer assignment• Applying technique to a pipelined FPGA