a high-speed variation-tolerant sub-threshold interconnect...
TRANSCRIPT
1
A High-Speed Variation-TolerantSub-Threshold Interconnect Technique
Using Capacitive Boosting
Jonggab Kil*, Jie Gu, and Chris H. Kim
*Intel Corporation
University of MinnesotaDepartment of Electrical and Computer Engineering
[email protected], [email protected]/~chriskim/
2
Presentation Agenda• Global Interconnect in Sub-threshold
• Proposed Sub-threshold Interconnect Technique
• Application on Sub-threshold Clocking
• Performance and Power Results
• Test Chip Measurements
• Conclusions
3
• Main Benefit– Super-linear power savings
– Minimum energy solution for low-performance designs
Subthreshold Operation
ddleakddtotalVIVCfP ⋅+⋅⋅⋅= 2α
Vgs(V)
I ds(A
/µm
)
operating region
1.E-10
1.E-09
1.E-08
1.E-07
1.E-06
1.E-05
1.E-04
1.E-03
0.0 0.2 0.4 0.6 0.8 1.00.E+00
1.E-04
2.E-04
3.E-04
4.E-04
5.E-04log scale
linear scale
Vgs Ids
• Limitations– PVT variation– Interconnect delay– Lack of a systematic design
methodology
4
Interconnect Scaling Problem
0%
20%
40%
60%
80%
100%
Nor
mal
ized
del
ay
Technology node (nm) 90 65 45 32
Logic delay Global interconnect delay
50%56%
64%73%
with repeaters
0%
20%
40%
60%
80%
100%
Nor
mal
ized
del
ay
Technology node (nm) 90 65 45 32
Logic delay Global interconnect delay
50%
66%
81%91%
without repeaters
• Global interconnect delay dictates system performance• Multiple clock cycles required for global signals to
propagate across die
5
Interconnect Delay in Sub-threshold
We need interconnect technique for sub-threshold logic
0
0.2
0.4
0.6
0.8
1
1.2
1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2
Sub-thresholdSuper-threshold
Nor
mal
ized
del
ay
Supply Voltage (V)
Global interconnect delay Logic delay
Globalinterconnect delay
VDD
Logic delay
VDD
0.18µm, 20°C
• Global interconnect problem worsens in sub-threshold– Device resistance dominates over wire resistance– Wire capacitance is constant while MOS capacitance reduces at
lower voltages
6
Proposed Sub-threshold Interconnect Technique
Operating region is shifted
from sub-threshold to super-threshold
Charge pump circuitry to boost
the gate voltage of drivers
100X+ increase in drive current at the expense of leakage
current
7
Circuit Operation: Active Mode
1
2
IN OUT
*Output holders not shown
0V
0.4V
0V
0.4V 0V
0.4V
0.4V0.8V
-0.4V0V
-0.4V
0.8V
-0.4V
0.8V
NMOS boosted
8
Circuit Operation: Active Mode
0V
0.4V
0V
0.4V
0V
0.4V
-0.4V
0.8V
0.4V0.8V
9
Circuit Operation: Standby Mode
• Half cycle start-up time• No start-up required for inactive periods up to 200 cycles
(@ 0.4V, 4MHz, 20°C)• Level holders can be used for restoring output level
10
Boosting Capacitance in Sub-threshold
-Vt
Cgate
DepletionModerate
Strong inversion
Sub-thresholdOperating
region
Vgs
Operates in weak inversion region
Accumulation
11
Boosting Efficiency
12
Operating Waveforms
0.0 0.1 0.2 0.3 0.4Time (µs)
0.0
0.4
0.2
This work ConventionalInput
0.0 0.1 0.2 0.3 0.4
0.0
0.4
0.8
-0.4
0.0
0.4
0.8
-0.4
VBOOST_N
VBOOST_P
VPRESET_P
VPRESET_N
Time (µs) Time (µs)
Volta
ge (V
)
0.18µm, 20°C
VIN_BARVIN_BAR
0.0 0.1 0.2 0.3 0.4
Volta
ge (V
)
Volta
ge (V
)
• 2.6X delay improvement for 1pF load
13
Delay and Power Consumption
0
50
100
150
200
0.0 1.5 3.0 4.5
Fall delay (This work)
Rise delay (This work)
Rise delay (Conv.)
Fall delay (Conv.)
• 2X+ delay improvement for different interconnect loads• Comparable power consumption once the frequencies
are high enough that active and short circuit power dominate over leakage current
0.0
0.5
1.0
1.5
2.0
2.5
3.0
0 4 8 12
14
Sensitivity to Variation
Voltage (V) Temperature(°C)200.38
This work Delay variation
Conv. Delay variation
4060
800.39
0.400.41
0.42
0.1
0.2
Temperature(°C)200.38
This work Delay variation
Conv. Delay variation
4060
800.39
0.400.41
0.42
0.1
0.2
Voltage (V)
0.05
0.15
0.05
0.15
• Delay variation reduced by 2X under supply (0.4V±5%) and temperature (20~80°C) variations
• The driver transistors are no longer in the sub-threshold region
15
Sub-threshold Clocking
......
......
Four clock buffer stages with each buffer driving four clock buffers and the long interconnects
D. Harris, TVLSI 2001
Clock signal paths are symmetrically routed across the chip for
minimum skew
16
Clock Skew Comparison
Proposed driverAverage delay: 52ns
Standard deviation: 5.2ns
Conventional driverAverage delay: 251ns
Standard deviation: 30.7ns
• 8.3X reduction in 3σ clock skew and 22% reduction in σ/µ using the proposed driver
17
Test Chip Photograph
• The number of transistors per driver increased from 4 to 18 leading to 30% area overhead
• 40% of the driver area is occupied by boosting capacitors
18
Test Chip Schematic
Level-down converter : NMOS transistor is
employed to pull down node A1
Level-up converter PMOS transistors (P7
and P8) used to reduce the contention
Boo
stin
gci
rcui
t
19
Performance Measurements
Rise (fall) delay improved by 2.6X (2.9X)
Supply voltage : core(0.4V), level converter (1.0V), IO buffer (1.8V)
20
Performance Measurements
0.4 0.45 0.5
Proposed boosting technique is efficient in reducing the PVT impact at lower supply voltages
21
Variability and Power Measurements
• Temperature sensitivity reduced from 1.6X to 0.7X• Proposed driver operated with 41% less power dissipation
at 4MHz or 49% higher performance at 5.1µW
22
Conclusions• Global interconnects in sub-threshold region
suffer from increased delay and large sensitivity to PVT variations
• Proposed capacitive boosting technique – Effective for high activity long line drivers– 70% boosting efficiency at 0.4V
• Measurements from a 0.18µm test chip show– 1.7-2.9X delay improvement– 2.3X reduced delay variability at 0.4V– 41% power saving at 4MHz– 49% performance improvement at 5.1µW