OPTIMIZATION OF POWER REDUCTION
IN FPGA INTERCONNECT BY
CHARGE RECYCLING
Deepa Soman, HyunSuk Nam, Rekha Srinivasaraghavan, Shashank Sivakumar
Agenda
Day 1: Intro, Power Consumption Techniques, Power Reduction Techniques, Discussions
Day 2: Power Reduction Techniques (Contd.), Charge Recycling, Our Project, Discussions
Motivation
Power is the Achilles' heel of FPGAs
• Logic flexibility & re-programmability require longer wires
• Power is 7-14X higher than in ASICs
Introduction
Power Consumption
Dynamic Power - power consumed while the inputs are active
Static power - power consumed even when there is no circuit activity !!!
$P_{dynamic} = C \cdot V_{dd}^{2} \cdot f$
$P_{sub} = V_{dd} \cdot I_{leakage}$
$I_{DS} = I_{0} \cdot e^{-q V_{th} / (kT)}$
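The two power components can be put into numbers with a minimal sketch. It assumes the usual textbook forms, dynamic power C·Vdd²·f and static power Vdd·I_leakage, with a simplified exponential subthreshold model; the function names and the example values are illustrative assumptions, not taken from the slides.

```python
import math

BOLTZMANN_K = 1.380649e-23    # J/K
ELECTRON_Q = 1.602176634e-19  # C


def dynamic_power(c_farads, vdd_volts, freq_hz):
    """Dynamic power: P = C * Vdd^2 * f."""
    return c_farads * vdd_volts ** 2 * freq_hz


def subthreshold_current(i0_amps, vth_volts, temp_kelvin):
    """Simplified leakage model (assumption): I = I0 * exp(-q*Vth / (k*T))."""
    return i0_amps * math.exp(-ELECTRON_Q * vth_volts / (BOLTZMANN_K * temp_kelvin))


def static_power(vdd_volts, i_leak_amps):
    """Static power: P = Vdd * I_leakage."""
    return vdd_volts * i_leak_amps


# Hypothetical example values: 1 pF switched at 100 MHz from a 1.0 V supply.
p_dyn = dynamic_power(1e-12, 1.0, 100e6)
# Leakage grows with temperature in this model.
i_cool = subthreshold_current(1e-6, 0.3, 300.0)
i_hot = subthreshold_current(1e-6, 0.3, 350.0)
```

In this model the quadratic dependence on Vdd is what makes voltage scaling attractive, and the exponential dependence on temperature is what makes leakage sensitive to hotspots.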
Voltage Scaling
Selecting core voltage based on performance requirements
How to Choose? – From Timing Analysis
Types: 1) Static Voltage Scaling 2) Dynamic Voltage Scaling
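Choosing the core voltage from timing analysis can be sketched as picking the lowest supply whose critical-path delay still fits the clock period. The function name and the delay figures below are hypothetical, used only to illustrate the selection rule.

```python
def choose_core_voltage(delay_ns_by_voltage, clock_period_ns):
    """Return the lowest voltage whose critical-path delay meets the clock period."""
    feasible = [v for v, d in delay_ns_by_voltage.items() if d <= clock_period_ns]
    if not feasible:
        raise ValueError("no supply voltage meets timing")
    return min(feasible)


# Hypothetical timing-analysis results: critical-path delay at each core voltage.
delays = {1.0: 8.0, 0.8: 11.5}

relaxed = choose_core_voltage(delays, clock_period_ns=12.0)  # 0.8 V is enough
tight = choose_core_voltage(delays, clock_period_ns=10.0)    # needs 1.0 V
```

With static scaling this choice is made once per design; with dynamic scaling the same decision is revisited at run time.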
1. Static Voltage Scaling
• A single selected core voltage only
• Realized using an on-chip low-dropout regulator (LDO)
• Voltage controlled by the configuration bitstream
• 0.8 V: minimum dynamic and leakage power
• 1.0 V: overall highest performance
[1]"A FPGA Prototype Design Emphasis on Low Power Technique" Xu, Jian
[Figure: LDO with 1.0 V and 0.8 V core-voltage options]
2. Dynamic Voltage Scaling
• Provides different voltage levels
• Realized using a voltage controlling unit
• Can be a level shifter or a DC-DC converter
DVS implementation
• LDMC (Logic Delay Measurement Circuit), delay error
"Dynamic Voltage Scaling for Commercial FPGAs", C.T. Chow, L.S.M. Tsui, P.H.W.
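A closed-loop DVS controller can be sketched as follows: a delay measurement is compared against a target, and the supply is stepped up on a delay error or stepped down when there is slack. The step size, voltage limits, guard band, and the toy delay model are all assumptions for illustration; they are not taken from the cited work.

```python
def dvs_step(vdd, measured_delay_ns, target_delay_ns,
             step=0.05, v_min=0.8, v_max=1.2, guard_ns=0.5):
    """One controller iteration: raise Vdd on a delay error, lower it on slack."""
    if measured_delay_ns > target_delay_ns:
        vdd += step      # delay error: logic is too slow
    elif measured_delay_ns < target_delay_ns - guard_ns:
        vdd -= step      # comfortable slack: save power
    return round(min(v_max, max(v_min, vdd)), 2)


def toy_delay_ns(vdd):
    """Hypothetical delay model: delay is inversely proportional to Vdd."""
    return 8.0 / vdd


def settle(vdd, target_delay_ns, iterations=20):
    """Run the control loop until it has had time to settle."""
    for _ in range(iterations):
        vdd = dvs_step(vdd, toy_delay_ns(vdd), target_delay_ns)
    return vdd


final_vdd = settle(1.2, target_delay_ns=9.0)
```

The guard band keeps the loop from oscillating between two voltage levels once the measured delay is close to the target.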
Dual Supply Voltage (Vdd)
Separate voltage supplies for configuration SRAM and other elements
• Purpose: to support sleep mode
• Shut down most logic except SRAM using an LDO
"A Dual-VDD Low Power FPGA Architecture", A. Gayasen, K. Lee, N. Vijaykrishnan, M. Kandemir, M.J. Irwin, and T. Tuan
Performance
Power reduction: static voltage scaling nearly 53%; dynamic voltage scaling up to 54%; dual Vdd 14%
Merits:
• SVS - simple hardware
• DVS - self-adaptive
• Dual Vdd - eliminates speed penalty
Demerits:
• SVS - voltage is fixed
• DVS - design complexity
• Dual Vdd - area overhead
[1] "A FPGA Prototype Design Emphasis on Low Power Technique", Xu, Jian
[2] "A 90-nm Low-Power FPGA for Battery-Powered Applications", Tuan, Das, Steve, Sean
Frequency Scaling
$P_{dynamic} = C \cdot V_{dd}^{2} \cdot f$, where $f$ is the frequency of switching
Dynamic Clock Management Implementations
(a) Simple dynamic clock management circuit
(b) Using feedback, a PLL circuit can reduce skew; lock time
(c) Dynamic clock division
Merits:
• Can subsequently reduce voltage
Demerits:
• Increased latency
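Dynamic clock division can be illustrated with a small behavioral model of an even divide-by-N counter. Because dynamic power is proportional to switching frequency, the fraction of rising edges that survive the divider approximates the relative dynamic power of the clocked logic. The function name and the counter structure are assumptions for illustration, not the circuit from the cited paper.

```python
def divided_rising_edges(input_rising_edges, divisor):
    """Count output rising edges of an even divide-by-`divisor` clock divider."""
    if divisor < 2 or divisor % 2:
        raise ValueError("divisor must be an even integer >= 2")
    half = divisor // 2
    level, count, rising = 0, 0, 0
    for _ in range(input_rising_edges):
        count += 1
        if count == half:      # toggle the output every half period
            count = 0
            level ^= 1
            rising += level    # counts only 0 -> 1 transitions
    return rising


edges_in = 1000
edges_out = divided_rising_edges(edges_in, divisor=4)
relative_dynamic_power = edges_out / edges_in  # tracks f_out / f_in
```

The cost is latency: work that took one input cycle now waits for a divided cycle.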
Benefits of Frequency Scaling
As frequency decreases, power consumption also decreases
"Dynamic Clock Management for Low Power Applications in FPGAs", Lan, Zilic
Clock Gating
• Controls the clock flow
• Purpose: to temporarily disable blocks
• Can be realized in hardware using clock enable signals
• Minimizes power dissipation in clock circuits/network
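The effect of a clock enable can be sketched by counting how many clock pulses actually reach a block. The gated clock is modeled as the clock ANDed with the enable; the enable trace is a hypothetical example.

```python
def clock_pulses(enable_trace):
    """Return (ungated, gated) clock pulse counts for a per-cycle enable trace."""
    ungated = len(enable_trace)                      # every cycle reaches the block
    gated = sum(1 for en in enable_trace if en)      # clk AND enable
    return ungated, gated


# Hypothetical enable trace: the block is needed in 3 of 8 cycles.
trace = [1, 1, 0, 0, 0, 1, 0, 0]
ungated, gated = clock_pulses(trace)
clock_toggles_saved = 1 - gated / ungated
```

The saving applies to the clock activity of the gated block; the gating logic in the clock path is also where added skew comes from.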
Clock Gating - Performance
• industry-a, b, c, d are DSP circuits; the remaining circuits were collected from customers and are of unknown function
• Power reductions of over 20% are observed for the DSP circuits
Clock Power Reduction for Virtex-5 FPGAs
Eliminates unnecessary toggling on outputs, gates of FFs and clock signals
Demerits: clock skew
"Clock Power Reduction for Virtex-5 FPGAs", Wang, Gupta, Anderson
Software Techniques
• System Level:
  • Algorithm Modification
• CAD Tools:
  • Logic Partitioning
  • Mapping
  • Clustering
  • Placement & Routing
Low Power FFT Implementation Architecture
• Matrix multiplication -> 1D array has lower power dissipation than 2D array
• Module disabling: clock gating to disable modules, e.g. twiddle factor calculation
• Dynamic memory activation
• Multiple time multiplexed Pipeline uP Parallel Processing
• Algorithm: Block Matrix Multiplication
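Block matrix multiplication, the algorithm named above, can be sketched in software as follows. This is a plain reference version for square matrices, assuming a tile size chosen by the caller; it illustrates the blocking idea only and is not the hardware architecture from the cited work.

```python
def block_matmul(a, b, block):
    """Multiply square matrices a and b tile by tile, with tiles of size `block`."""
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for ii in range(0, n, block):
        for jj in range(0, n, block):
            for kk in range(0, n, block):
                # Multiply one tile of a by one tile of b and accumulate into c.
                for i in range(ii, min(ii + block, n)):
                    for k in range(kk, min(kk + block, n)):
                        a_ik = a[i][k]
                        for j in range(jj, min(jj + block, n)):
                            c[i][j] += a_ik * b[k][j]
    return c


product = block_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]], block=1)
```

Working on one tile at a time keeps the active data set small, which is what allows memories and modules that are not needed for the current tile to be left idle.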
FFT implementation Results
17% to 26% power reduction
"High throughput energy efficient multi-FFT architecture on FPGAs", Chen, Park, Prasanna
Energy Reduction Contributions of CAD Stages
Clustering contributes the major share!
"On the interaction between power aware FPGA CAD algorithms", Julien, Steven
Results: Power Aware clustering
"Netlength Based Routability Driven Power Aware Clustering", Akoglu, Easwaran
Temperature Aware Routing
leakage current increases exponentially with temperature
Switching capacitance
Algorithm
Discourage the routing algorithm from forming connections that cross hotspot regions
Cost Function Modification:
Power savings range between 30% and 63%
"A Temperature-Aware Placement and Routing targeting 3D FPGAs", Kostas, Soudris
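The idea of steering routes away from hotspots can be sketched with a shortest-path search whose node cost grows with temperature above a threshold. The cost formula, threshold, weight, and the 3x3 temperature grid are hypothetical; the actual cost function of the cited work is not reproduced here.

```python
import heapq


def node_cost(temp, base=1.0, t_hot=80.0, weight=0.5):
    """Hypothetical cost: base cost plus a penalty for temperature above t_hot."""
    return base + weight * max(0.0, temp - t_hot)


def route(temps, src, dst, weight=0.5):
    """Dijkstra search over a grid; returns the list of cells from src to dst."""
    rows, cols = len(temps), len(temps[0])
    best = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == dst:
            break
        if cost > best[node]:
            continue  # stale heap entry
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_cost = cost + node_cost(temps[nr][nc], weight=weight)
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    prev[(nr, nc)] = node
                    heapq.heappush(heap, (new_cost, (nr, nc)))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical temperature map with a hotspot in the centre cell.
temps = [[50, 50, 50],
         [50, 100, 50],
         [50, 50, 50]]
aware_path = route(temps, (1, 0), (1, 2))             # detours around the hotspot
plain_path = route(temps, (1, 0), (1, 2), weight=0.0)  # goes straight through
```

The detour is longer in wirelength, so the weight trades routing length against the heat added to already hot regions.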
Power-Aware FPGA Design Flow
Step 1: RTL
• Power Based Architectural (High level modelling)
• Voltage scaling, Dual Vdd
• Freq Scaling, Clock gating
Step 2: CAD Tools
• Power Aware Packing or Clustering
• Power Aware Placement
• Power Aware Routing
Main/Baseline Paper
Problem addressed
• Power consumption in FPGAs is dominated by interconnect (62%)
Proposed idea
• Charge recycling for power reduction in FPGA interconnect
Charge Recycling in FPGAs
How? "Unused routing resources" as reservoirs
• Reduces charge drawn from Vdd
• 25% reduction in energy
[Figure: panels 1-7; wire labels: Unused/Reservoir (two wires), Unused w/o friends]
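Why a reservoir reduces the charge drawn from Vdd can be seen in an idealized charge-sharing model. It assumes lossless sharing: on a falling transition the load wire first shares its charge with the reservoir, and on the next rising transition the load is pre-charged from the reservoir before the supply finishes the job. The function name and the equal-capacitance example are assumptions; the model is not the circuit analysis from the baseline paper.

```python
def cr_energy_per_rise(c_load, c_res, vdd, cycles=50):
    """Supply energy per rising transition once the reservoir voltage has settled."""
    v_res = 0.0
    energy = c_load * vdd * vdd
    for _ in range(cycles):
        # Falling transition: load (at Vdd) shares charge with the reservoir,
        # then the load is discharged to ground.
        v_res = (c_load * vdd + c_res * v_res) / (c_load + c_res)
        # Rising transition: load (at 0 V) shares charge with the reservoir.
        v_shared = c_res * v_res / (c_load + c_res)
        v_res = v_shared
        # The supply only has to lift the load from v_shared up to Vdd.
        energy = c_load * vdd * (vdd - v_shared)
    return energy


c, vdd = 1e-13, 1.0                       # hypothetical wire capacitance and supply
baseline = c * vdd * vdd                  # energy per rise without recycling
recycled = cr_energy_per_rise(c, c, vdd)  # equal load and reservoir capacitance
ideal_saving = 1 - recycled / baseline
```

For equal capacitances this ideal model settles at a saving of one third per rising transition. The 25% figure on the slide is the reported result and is not derived from this model.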
CR-Capable FPGA Interconnect
Analysis: four components
• SRAM Cell: produces signals CR and TS, which control a switch (normal, CR, tri-state)
• Delay Line: transition between VIN and DLOUT
• CR Circuit: performs the charge sharing between the load and reservoir
• Input Stage
Experiments/Methodology
VPR 6.0
• Baseline: island style, unidirectional, Wilton (K=6, N=4)
• Router: PathFinder with cost function modification
• Post-routing CR mode
• VPR place/route tool helps in finding the % increase in area
Post-Routing
• Mixed Integer Linear Program
• Tries to maximize the number of nodes put into CR mode
• Constraint: critical delay of the circuit
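The shape of this optimization can be sketched on a toy instance. The sketch below uses exhaustive search as a stand-in for a MILP solver, and assumes that putting a node into CR mode adds a fixed delay penalty and that each path has a slack it must not exceed. The node names, penalties, paths, and slacks are all hypothetical.

```python
from itertools import combinations


def max_cr_nodes(node_penalty, paths, slack):
    """Largest set of nodes in CR mode such that no path exceeds its slack."""
    nodes = sorted(node_penalty)
    for size in range(len(nodes), -1, -1):          # try the largest sets first
        for chosen in combinations(nodes, size):
            chosen_set = set(chosen)
            feasible = all(
                sum(node_penalty[n] for n in path if n in chosen_set) <= slack[i]
                for i, path in enumerate(paths)
            )
            if feasible:
                return chosen_set
    return set()


# Hypothetical instance: 1.0 ns penalty per CR node, three routed paths.
penalty = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 1.0}
paths = [["a", "b"], ["c"], ["b", "d"]]
slack = [1.0, 0.0, 2.0]   # path with zero slack is the critical path
cr_set = max_cr_nodes(penalty, paths, slack)
```

Exhaustive search is only practical for a handful of nodes; a MILP formulation expresses the same objective and constraints with one binary variable per node.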
Results Continued…
Number of min-width transistors as the area metric
Reductions in power savings are not directly proportional to the reduction in CR-capable switches (area)
What do we propose that is new?
• Not all unused wires become friends
• Unused wires connected to constant voltage
• "URekha": unused wires tri-stated, "further power savings!!"
• ~6% savings