Skewed Flip-Flop Transformation for Minimizing Leakage in Sequential Circuits
Jun Seomun, Jaehyun Kim, Youngsoo Shin
Dept. of Electrical Engineering, KAIST, KOREA
Leakage Power in Technology Scaling
[Figure: dynamic power and leakage power, Power (W) on an axis from 0 to 250, versus technology (0.25µ, 0.18µ, 0.13µ, 0.10µ, 0.07µ). Source: Intel Corporation, 2002]
Overview of Mixed Vt Technique
Mixed Vt CMOS
– Low Vt: fast but high leakage
– High Vt: low leakage but slow
Value of mixed Vt is limited
– It considers only the combinational portion of circuits
[Figure: circuit that is initially all low Vt; high Vt gates can be assigned to some non-critical paths. Labels: low Vt, high Vt, critical path]
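The slide does not spell out how high Vt gates are picked, so the following is only a minimal greedy sketch of the idea: start with every gate at low Vt and move a gate to high Vt whenever all paths through it still meet the clock period. The delay values, the path-list circuit model and the names `assign_mixed_vt` and `path_delay` are assumptions made for illustration.

```python
# Hypothetical gate delays in arbitrary units; not taken from the slides.
LOW_VT_DELAY = 1.0
HIGH_VT_DELAY = 1.5


def path_delay(path, high_vt):
    """Sum gate delays along one path; gates in `high_vt` are slower."""
    return sum(HIGH_VT_DELAY if gate in high_vt else LOW_VT_DELAY for gate in path)


def assign_mixed_vt(paths, clock_period):
    """Greedy sketch: begin all low Vt, keep a high Vt swap only if timing holds."""
    high_vt = set()
    gates = sorted({gate for path in paths for gate in path})
    for gate in gates:
        trial = high_vt | {gate}
        if all(path_delay(path, trial) <= clock_period for path in paths):
            high_vt = trial
    return high_vt


# Toy circuit: the first path is critical (4 gates, period 4.0), the second has slack.
paths = [["g1", "g2", "g3", "g4"], ["g5", "g6"]]
result = assign_mixed_vt(paths, clock_period=4.0)
print(sorted(result))
```

In this toy circuit only the gates on the non-critical path end up at high Vt, which mirrors the picture on the slide. Flip-flops do not appear in the model at all, which is the limitation the slide points out.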
Motivation
Leakage of sequential elements
– Sequential elements take a large proportion in many controllers
[Figure: proportion (0% to 100%) of flip-flops versus combinational logic (Comb.) for s298, s344, s349, s382, s400, s444, s526, s641, s713, s838, s9234; the chart is shown twice, once labeled Mixed Vt]
Why Not High Vt Flip-Flop?
Large effects on the slack
– The delay overhead of high Vt flip-flops is larger than that of the other high Vt combinational gates
– A flip-flop typically affects more than one of the timing paths in a circuit
[Figure: delay of high Vt gate minus delay of low Vt gate (axis 0 to 25) for F/F, INV, NAND2, NOR2, NAND3, NAND4]
[Figure: ratio of the average number of fanout timing paths on F/Fs to the average number of fanout timing paths on comb. gates (axis 0 to 4) for s298, s344, s349, s400, s444, s526, s641, s713, s838, s9234]
Skewed Flip-Flops
Mixed Lgate flip-flop
– Larger Lgate transistor
  Smaller delay overhead than high Vt transistor
  Footprint of gate remains almost the same
– Selective assignment of larger Lgate in flip-flop
  Smaller delay overhead than assignment to the entire flip-flop
  Leakage reduction can reach up to the same amount as when all gates have larger Lgate
Unequal leakage depending on the values of D and Q
– Four kinds of SFFs
  Characterized to minimize leakage corresponding to four states (D & Q)
  SF00, SF01, SF10 and SF11
[Figure: delay [ps] and leakage [nA] versus gate length (nm), 45 to 50. Annotation: Delay: 32%, Leakage: 72%; cf. high Vt inverter, Delay: 81%, Leakage: 92%]
Skewed Flip-Flops
Design of an SFF (in case of SF00)
– Assume CK = 0 in idle state (clock gating)
[Figure: flip-flop schematic with D, CK, Q and internal clk signals, annotated with 0/1 node values for the idle state D = 0, Q = 0, CK = 0; the transistors given larger Lgate are marked]
Skewed Flip-Flops
Skewed flip-flops
[Figure: schematics of the four skewed flip-flops SF00, SF01, SF10 and SF11]
Leakage Characteristic of SFFs
45-nm PTM, 4 nm biasing
[Figure: leakage current [nA] (axis 0 to 1200) for D/Q = 0/0, 0/1, 1/0, 1/1, comparing the original flip-flop (Orig.) with (a) SF00, (b) SF01, (c) SF10, (d) SF11]
Timing Characteristic of SFFs
45-nm PTM, 4 nm biasing
[Figure: rising Tsu, falling Tsu, rising Tc-q and falling Tc-q, delay [ps] (axis 0 to 40), comparing Orig. with (a) SF00, (b) SF01, (c) SF10, (d) SF11]
[Figure: voltage [V] (0 to 0.9) versus time for D and clk around the rising edge of CK, Orig. versus SF00, with Tsu, Tsu', T1 and T1' marked: (a) rising Tsu, (b) falling Tsu]
SFF Transformation
Utilize SFFs while maintaining timing constraints
– Input: netlist & idle state probabilities of flip-flops
– Output: new netlist with skewed flip-flops
[Flow chart boxes: Netlist & idle state probabilities; Skewed flip-flop transformation under timing constraints (Initial SFF assignment; Flip-flop transformation: Find critical path, Find candidate, Substitute); Mixed Vt assignment on combinational subcircuits]
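The slides list idle state probabilities as an input and an "Initial SFF assignment" step, but do not say how the two are combined. One plausible reading, sketched below, is to pick for each flip-flop the SFF type with the lowest expected idle leakage. The leakage numbers are placeholders (not read from the leakage charts), and `expected_leakage` and `initial_assignment` are hypothetical names.

```python
STATES = ["00", "01", "10", "11"]          # idle state written as D then Q
SFF_TYPES = ["SF00", "SF01", "SF10", "SF11"]

# Placeholder leakage in nA: each SFF is assumed to leak little in the state it
# is skewed for and about as much as the original flip-flop otherwise.
LEAKAGE = {
    sff: {state: 300.0 if sff == "SF" + state else 1000.0 for state in STATES}
    for sff in SFF_TYPES
}


def expected_leakage(sff, idle_prob):
    """Probability-weighted leakage of one SFF type over the four idle states."""
    return sum(idle_prob[state] * LEAKAGE[sff][state] for state in STATES)


def initial_assignment(idle_probs):
    """Choose, per flip-flop, the SFF type with minimum expected leakage."""
    return {
        name: min(SFF_TYPES, key=lambda sff: expected_leakage(sff, prob))
        for name, prob in idle_probs.items()
    }


idle_probs = {
    "ff1": {"00": 0.7, "01": 0.1, "10": 0.1, "11": 0.1},
    "ff2": {"00": 0.1, "01": 0.1, "10": 0.0, "11": 0.8},
}
assignment = initial_assignment(idle_probs)
print(assignment)
```

This ignores timing entirely. In the flow on the slide, timing violations introduced by the initial assignment are handled afterwards by the flip-flop transformation step.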
Half Skewed Flip-Flops (HSFs)
For a smoother transition
– HSF0: unchanged setup time delay
– HSF1: unchanged clock-to-q delay
[Figure: schematics of (a) HSF0 and (b) HSF1]
SFF Transformation Algorithm
Select a flip-flop to be transformed
– Find critical path
– Find candidate
  Both ends of the most critical path
  Larger timing improvement
Substitute
– (1) Most effective SFFs in terms of delay given position and phase of transition
– (2) If (1) fails, try HSFs
– (3) If (2) fails, use the original flip-flops
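The three-level fallback in the Substitute step can be sketched as follows. This is a simplification under stated assumptions: each cell is reduced to two hypothetical numbers (extra setup time and extra clock-to-q delay relative to the original flip-flop), and a cell "fits" when both are covered by the available slack. The delay values and the names `Cell`, `fits` and `substitute` are illustrative; the only property taken from the slides is that HSF0 leaves setup time unchanged and HSF1 leaves clock-to-q delay unchanged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    name: str
    extra_setup: float     # added setup time vs. original flip-flop (ps, hypothetical)
    extra_clk_to_q: float  # added clock-to-q delay vs. original (ps, hypothetical)


ORIG = Cell("Orig", 0.0, 0.0)
HSFS = [Cell("HSF0", 0.0, 3.0),   # unchanged setup time
        Cell("HSF1", 3.0, 0.0)]   # unchanged clock-to-q delay


def fits(cell, setup_slack, clk_to_q_slack):
    """True if the cell's extra delays are covered by the available slack."""
    return cell.extra_setup <= setup_slack and cell.extra_clk_to_q <= clk_to_q_slack


def substitute(preferred_sffs, setup_slack, clk_to_q_slack):
    """Try SFFs first, then HSFs, and fall back to the original flip-flop."""
    for cell in list(preferred_sffs) + HSFS:
        if fits(cell, setup_slack, clk_to_q_slack):
            return cell
    return ORIG


sf00 = Cell("SF00", 3.0, 3.0)
print(substitute([sf00], setup_slack=5.0, clk_to_q_slack=5.0).name)
print(substitute([sf00], setup_slack=0.0, clk_to_q_slack=5.0).name)
print(substitute([sf00], setup_slack=0.0, clk_to_q_slack=0.0).name)
```

The real step also accounts for the position of the flip-flop on the critical path and for rising versus falling transitions, which this sketch leaves out.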
Experimental Results
For ISCAS benchmark circuits (45-nm PTM library)

Benchmark                    Mixed Vt only                       SFX + Mixed Vt
Name    # Gates  # FFs   Comb. (uA)  SE (uA)  Total (uA)   Comb. (x)  SE (x)  Total (x)
s298        130     14           30       13          43        0.97    0.44       0.81
s344        144     15           31       15          46        0.99    0.54       0.86
s349        142     15           31       15          46        1.00    0.54       0.86
s382        185     21           38       19          57        1.06    0.38       0.84
s400        198     21           38       19          57        1.12    0.36       0.87
s444        199     21           49       19          68        1.12    0.36       0.91
s526        258     21           41       19          60        0.99    0.55       0.85
s641        206     19           30       18          48        0.99    0.45       0.79
s713        206     19           34       18          52        1.00    0.45       0.81
s838        416     32           70       30         100        1.03    0.37       0.83
s5378      1534    163          244      155         399        1.07    0.42       0.82
s9234      1457    135          280      121         401        1.03    0.36       0.83
Avg.                                                            1.04    0.44       0.84
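As a quick arithmetic check on the table, the short script below averages the Total (x) column and converts it to a percentage saving. It also recomputes the s298 total as a leakage-weighted mix of its Comb. and SE ratios. That weighting is an assumption about how the columns relate, and because the table entries are rounded it will not reproduce every row exactly.

```python
# Total (x) column of the table, one entry per benchmark (s298 ... s9234).
total_ratio = [0.81, 0.86, 0.86, 0.84, 0.87, 0.91,
               0.85, 0.79, 0.81, 0.83, 0.82, 0.83]

average = sum(total_ratio) / len(total_ratio)
saving_percent = round((1 - average) * 100)

# s298 row: weight the Comb. and SE ratios by their mixed-Vt-only leakage (uA).
comb_ua, se_ua = 30, 13
comb_x, se_x = 0.97, 0.44
s298_total_x = (comb_ua * comb_x + se_ua * se_x) / (comb_ua + se_ua)

print(round(average, 2), saving_percent, round(s298_total_x, 2))
```

The average of 0.84 corresponds to the 16% saving relative to mixed Vt alone.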
Comparison of Mixed Vt Flip-Flop
[Figure: "Mixed Vt FFs + Mixed Vt comb." versus "SFX + Mixed Vt comb." (axis 0.6 to 1.0) for s298, s344, s349, s382, s400, s444, s526, s641, s713, s838, s9234]
[Figure: ratio of the average number of fanout timing paths of F/Fs to the average number of fanout timing paths of comb. gates (axis 0 to 4) for s298, s344, s349, s400, s444, s526, s641, s713, s838, s9234]
Conclusion
Proposed Skewed Flip-Flops
– The set of mixed Lgate flip-flops
– Skewed characteristics in terms of leakage and delay
A heuristic algorithm that substitutes SFFs
– An average leakage saving of 16% is achieved, compared to the use of mixed Vt alone