vlsi design challenges and eda in the forthcoming decade...1 vlsi design challenges and eda in the...
TRANSCRIPT
-
1
VLSI design challenges and EDA in the forthcoming decade
Prof. Takayasu SakuraiCenter for Collaborative Research, and
Institute of Industrial Science,University of Tokyo
E-mail:[email protected]
1
DA show ‘01
-
T.Sakurai
Technology trend
Des
ign
rule
, gat
e le
ngth
Gat
e ox
ide
thic
knes
s [n
m]
Year1998 2002 2006 2010 2014
Design rule
Gate length
Gate oxide thickness
DRAM bit per chip
0.1
1
10
100
1000
DR
AM
bit
per c
hip
[Gbi
t]1000
100
10
1
-
T.Sakurai
Limit of Miniturization
Conventional I-V curve at 0.04µm (Even down to 0.014µm)
0.04µm MOSFET
0.0 0.4 0.8 1.2 1.6 2.00.00
0.21
0.42
0.63 Vg = 2.0 V
Vg = 1.6 V
Vg = 1.2 V
Vg = 0.8 V
Gate Length = 40 nm
Dra
in C
urre
nt[m
A/µ
m]
Drain Voltage [V]
0.04µm
M. Ono, M. Saito, T. Yoshitomi, C. Fiegna, T. Ohguro, and H. Iwai, "Sub-50nm gate Length N-MOSFETs with 10 nm Phosphorus Source and Drain Junctions", IEDM Technical Digest, pp. 119 -122, 1993.H. Kawaura, T. Sakamoto, Y. Ochiai, J. Fujita, and T. Baba, "Fabrication and Characterization of 14-nm-Gate-Length EJ-MOSFETs", Extended Abstracts of SSDM, pp.572-573, 1997.
-
T.Sakurai
Scaling Law
Drain Source
Gate
0.2micron
Drain SourceGate
0.2micronSize 1/2
Size x1/2Voltage x1/2Electric Field x1Speed x3Cost x1/4
Power density x1.6RC delay/Tr. delay x3.2Current density x1.6Voltage noise x3.2Design complexity x4
Favorable effects Unfavorable effects
-
T.Sakurai
Three crises in VLSI designs
Power crisis
Interconnection crisis
Complexity crisis
-
T.Sakurai
Ever Increasing VLSI Power
959085800.01
0.1
1
10
100
Year
Pow
er (W
)x4 / 3years
Processors published in ISSCC
-
T.Sakurai
VDD, power and current trend
Year
Volta
ge [V
]
Pow
er p
er c
hip
[W]
V DD
curr
ent [
A]
1998 2002 2006 2010 20140
0.5
1
1.5
2
2.5
0 0
200 500
Current
Power
Voltage
International Technology Roadmap for Semiconductors 1999 update sponsored by the Semiconductor Industry Association in cooperation with European Electronic Component Association (EECA) , Electronic Industries Association of Japan (EIAJ), Korea Semiconductor Industry Association (KSIA), and Taiwan Semiconductor Industry Association (TSIA)
-
T.Sakurai
Ultra Low-Voltage Operation
inverter (T=300K)
2nand(T=300K)
inverter (T=77K)
Vin (V)
Vout
(V)
50mV
25mV
100mV360mV
140mV
J.Burr&J.Shott,"A 200mV Self-Testing Encoder/Decoder using Stanford Ultra-Low-Power CMOS",ISSCC94, pp.84-85.
((((Stanford Univ.)Stanford Univ.)Stanford Univ.)Stanford Univ.)
-
T.Sakurai
Power & Delay Dependence on VDD & VTH
Power : P = pt •fCLK •CL •VDD + I0 •10 •VDD 2V thS
(αααα=1.3)
k • CL • VDD(VDD - Vth)
ααααDelay = k•Q
I=
12
34
-0.400.40.8
00.2
0.4
0.6
0.8
1x 10-4
Vth (V)
VDD(V)
Pow
er (W
)
A
B
12
34
-0.400.40.8
0
1
2
3
4
5x 10
-10
Del
ay (s
)
Vth (V)
VDD(V)
A B
-
T.Sakurai
Controlling VDD and VTH for low powerLow power →→→→ Low VDD →→→→ Low speed →→→→ Low VTH →→→→ High leakage →→→→ VDD-VTH control
Active Stand-byMultiple VTH Dual-VTH MTCMOSVariable VTH VTH hopping VTCMOSMultiple VDD Dual-VDD Boosted gate MOSVariable VDD VDD hopping
*) MTCMOS: Multi-Threshold CMOS*) VTCMOS: Variable Threshold CMOS• Multiple : spatial assignment• Variable : temporal assignment
Software-hardware cooperationTechnology-circuit cooperation
-
T.Sakurai
Controlling VDD and VTH for low power
Active Stand-byMultiple VTH Dual-VTH MTCMOSVariable VTH VTH hopping VTCMOSMultiple VDD Dual-VDD Boosted gate MOSVariable VDD VDD hopping
• MTCMOS: Multi-Threshold CMOS
-
T.Sakurai
Multi-Threshold CMOS Circuit
CMOS logic
Low-VTHD Q
Hi-VTH Hi-VTHHi-VTH
Low-VTH
St'bySt'by
MTCMOS logic MTCMOS latch
In active mode, low-VTH (~0.2V) achieve high speed. In standby mode when St'by signal is high, high-VTH(~0.6V) MOSFET in series to normal logic circuits cut off leakage current. Doesn’t work under 0.8V.
-
T.Sakurai
Controlling VDD and VTH for low power
Active Stand-byMultiple VTH Dual-VTH MTCMOSVariable VTH VTH hopping VTCMOSMultiple VDD Dual-VDD Boosted gate MOSVariable VDD VDD hopping
-
T.Sakurai
Subthreshold current
NMOS L = 0.3µµµµm
VGS3V
VBS
D
BG
ID
0 1 2
0.4V
VBS=0V
-1V-2V
-3V
-410
-210
-610
-810
-1010
Gate Source Voltage VGS [V]
Dra
in C
urre
nt I D
[A]
VTH
)2.0~15.0(2
110
0
≈−≈−−=γγ
φγ
BST
BSFTTH
VVVVV
)decade/V.~s(I s
VV
D
THGS
1010
−
∝
W = 10µµµµm
-
T.Sakurai
Standby Power Reduction (SPR) CircuitISSCC'95 pp.318-31
VDD (2V)
VSS (0V)
VPBB (-2V)
VNWELL (2 or 4V)VNBB (4V)
Level ShifterVoltage Switch
V1 CW
M3 M4M5
CW
M2M1
V2V3
V4 VPWELL (0 or -2V)
St'by
St'by
are added to ensure reliability
• In standby mode and in IDDQ test, substrate bias is applied to increase VTH, which reduces leakage.
• In active mode, substrate bias is not applied to lower VTH, which ensures high speed.
-
T.Sakurai
Previous circuit schemes
[1] S.Mutoh et al. IEEE, JSSC, 1995. [2] T.Kuroda et al. IEEE, JSSC, 1996.
MTCMOS [1] VTCMOS [2]• Tunneling leakagecannot be cut-off.
• Area penalty increaseswhen VDD < 1V.
• Junction leakage increases due toband-to-band tunneling.
• Tunneling leakage isnot suppressed.
High Vth(Stand-by)
Low Vth (Active)VDD
Vnn
Vpp
GND
VDD
GNDV
GNDHigh Vth
Low Vth
(~0.6V)
(~0.2V) (~0.2V)
(~0.6V)
-
T.Sakurai
Transistors go leaky
1999 2002 2005 2008 201110-13
10-1210-1110-1010-910-810-710-610-5
Requirement(Stand-by Power = 1mW)
(Vbs= -1V)
JunctionLeakage
0.1V
0.2V
0.3V
0.4V
Vth = 0.4V
SubthresholdLeakage
Gate TunnelingLeakage
(180nm) (50nm)(70nm)(100nm)(130nm)
I off
(A/u
m)
Year (Technology Node)
-
T.Sakurai
Controlling VDD and VTH for low power
Active Stand-byMultiple VTH Dual-VTH MTCMOSVariable VTH VTCMOS VTCMOSMultiple VDD Dual-VDD Boosted gate MOSVariable VDD VDD hopping
-
T.Sakurai
Boosted-Gate MOS (BGMOS)
Leak cut-off Switch (LS)- high VTH- thick TOX
VDD
Virtual VSS
CMOS circuits- low VTH- ultra thin TOX
T.Inukai, M.Takamiya, K.Nose, H.Kawaguchi, T.Hiramoto and T. Sakurai, "Boosted Gate MOS (BGMOS): Device/Circuit Cooperation Scheme to Achieve Leakage-Free Giga-Scale Integration," CICC'00, p.409, May 2000.
0VVDDVBOOST
Device / circuit collaborative approach
-
T.Sakurai
Leak switch optimization
1 2 3 4 5 6 7 8
0
1
2
3
Vth = 0.5V (const.)Tech. Node: 50nm
Tox(nm)
Dra
in C
ondu
ctan
ce (m
S/µ µµµm
)1 2 3 4
Vboost (V)
50
100
150
200
L g(n
m)
-
T.Sakurai
GSI’s in deep-submicron era
High voltageHigh VTHThick TOX
Low voltage Low VTHThin TOX
Logic CircuitsF/F
SRAM (leak cut-off by high VTH)
Leak cut-offPower Switch
Analog(low VDDReducesS/N ratio)
I/O (for compatibility)
VDD
VBOOST
VSS
-
T.Sakurai
Power switch gate width in BGMOS
Degrade circuit speed unpredictably
VDD
Logic
low-VTH
Power switchhigh-VTH
StbyVirtual ground
VDD
R CX
VDD
Kao, DAC'97, pp.409-414.
-
T.Sakurai
Controlling VDD and VTH for low power
Active Stand-byMultiple VTH Dual-VTH MTCMOSVariable VTH VTH hopping VTCMOSMultiple VDD Dual-VDD Boosted gate MOSVariable VDD VDD hopping
-
T.Sakurai
Dual-VTH concept
Low-VTH circuit(High leakage)
High-VTH circuit(Low leakage)
Critical paths
Non-critical paths
-
T.Sakurai
Calculated result of power reduction(Theoretical, VDD=0.5V)
0 0.1 0.2 0.3 0.40
0.5
1
VDD=0.5VVTH(low)=-0.1V∆∆∆∆VTH=±±±±0.04V
VTH(high)-VTH(low)
Nor
mal
ized
pow
er
td
P(td)
-
T.Sakurai
Synopsis simulation resultSimple processor example
0 0.1 0.2 0.3 0.40
0.5
1
VTH(high)-VTH(low)
Nor
mal
ized
pow
er
VDD=0.5V
VTH(low)=0.1V
VTH(low)=0V
VTH(low)=-0.1V
-
T.Sakurai
Controlling VDD and VTH for low power
Active Stand-byMultiple VTH Dual-VTH MTCMOSVariable VTH VTH hopping VTCMOSMultiple VDD Dual-VDD Boosted gate MOSVariable VDD VDD hopping
-
T.Sakurai
Clustered Voltage Scaling for Multiple VDD’s
Lower VDD portion is shown as shaded
CVS StructureConventional Design
Critical Path
Level-Shifting F/F
Critical Path
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
M.Takahashi et al., “A 60mW MPEG4 Video Codec Using Clustered Voltage Scaling with Variable Supply-Voltage Scheme,” ISSCC, pp.36-37, Feb.1998.
Once VL is applied to a logic gate, VL is applied to subsequent logic gates until F/F’s to eliminate DC current paths. F/F’s restore VH.
-
T.Sakurai
Path-delay Distribution in Dual-VS
MEF (1527 cells)
MCB (1366 cells)
VLD (3812 cells)
DMA (1493 cells)
DCT (5466 cells)
RISC (5645 cells)
before
after
t
MEC (2912 cells)
p(t)
IDCT (6227 cells)
VLC (3462 cells)
-
T.Sakurai
Clustered Voltage Scaling Technique
M.Takahashi et al., “A 60mW MPEG4 Video Codec Using Clustered Voltage Scaling with Variable Supply-Voltage Scheme,” ISSCC, pp.36-37, Feb.1998.
Pow
er d
issi
patio
n (m
W)
10090
80
70
60
50
40
30
20
10
03.3V
Conventional2.5VVS &
VTCMOS
2.5V & 1.75VDual-VS & VTCMOS
Logic
F/FClock
Memory
-43%
-43%
-43%
-43%
-30%
-37%
-51%
±0%
Measured
Measured
-
T.Sakurai
Controlling VDD and VTH for low power
Active Stand-byMultiple VTH Dual-VTH MTCMOSVariable VTH VTH hopping VTCMOSMultiple VDD Dual-VDD Boosted gate MOSVariable VDD VDD hopping
-
T.Sakurai
Energy consumption isproportional tothe square of VDD.
Energy consumption isproportional tothe square of VDD.
VDD should be loweredto the minimum levelwhich ensuresthe real-time operation.
VDD should be loweredto the minimum levelwhich ensuresthe real-time operation.
Normalized workload0.0 0.2 0.4 0.6 0.8 1.0
Nor
mal
ized
pow
er
0.0
0.2
0.4
0.6
0.8
1.0
Variable VddFixed Vdd
If you don’t need to hussle,VDD should be as low as possible
-
T.Sakurai
Application slicing and software feedback loop in Voltage Hopping
S.Lee and T.Sakurai, “Run-time Power Control Scheme Using Software Feedback Loop for Low-Power Real-time Applications,”ASPDAC'00, A5.2, pp.381~pp.386, Jan. 2000.S.Lee and T.Sakurai, “Run-time Voltage Hopping for Low-power Real-time Systems,” DAC'00, June 2000.
3
1 2 3 N...4
TR3
TR1TR2
TR4
TSF = Time constraint of sync frame
1 2TC2TC3 TTAR3
fVAR = f1 = fCLK TTD
TTD
fVAR = f2 = fCLK/2
fVAR = f3 = fCLK/3
fVAR = f4 = fCLK/4
TTD
TTD
TL3,f1 = TTD + TW3TL3,f2 = TTD + TW3 x 2
TL3,f4 = TTD + TW3 x 4TL3,f3 = TW3 x 3
Clock frequency for the previous timeslot was f3.
Processor coreProcessor core
Voltagefrequencycontroller
Voltagefrequencycontroller
Clock & VDDClock & VDDControl info
-
T.Sakurai
Run-time Voltage Hoppingreduces power to less than 1/10
MPEG-4 video encoding
Transition Delay TTD (ms)0.0 0.2 0.4 0.6 0.8 1.0
Nor
mal
ized
Pow
er P
/PFI
X
0.00
0.02
0.04
0.06
0.08
0.10
0.12
0.14
0.16
RVH: 2 levels (f,f/2)RVH : 3 levels (f,f/2,f/3)RVH : 4 levels (f,f/2,f/3,f/4)RVH : infinite levelspost-simulation analysis
SH-4
Clock
VDDSH-4 Modified
Clock
VDD
-
T.Sakurai
Measured power characteristics
Total power = 0.8 x 0.08 + 0.16 x 0.86 + 0.07 x 0.06 = 0.2WTotal power = 0.8 x 0.08 + 0.16 x 0.86 + 0.07 x 0.06 = 0.2W
VDD hopping can cut down power consumption to 1/4VDD hopping can cut down power consumption to 1/4
0.8W
0
0.2
0.4
0.6
0.8
1
Supply voltage: VDD [V]
Pow
er: P
[W]
0 1 2
ƒ=100MHz
ƒ=200MHz
0.16W
Downto 1/5
Time for sleep: 6% 0.07W
Time for VDDmin: 86%
Time for VDDmax: 8%
-
T.Sakurai
Power Conscious OS & Application Slicing
0 10 3020 40 50 60
Proposed scheduling: cooperation of OS and applications (power consumption=0.24)
0 10 3020 40 50 60
Speed control within application slices (power consumption=0.47)
0 10 3020 40 50 60
Conventional rate-monotonic scheduling (power consumption=1)
0 10 3020 40 50 60
Speed control with power-conscious OS (power consumption=0.85)
Y.S.Shin, H.Kawaguchi, T.Sakurai, "Cooperative Voltage Scaling (CVS) between OS and Applications for Low-Power Real-Time Systems," CICC'01, pp.553-556, May 2001.
-
T.Sakurai
Power switches
Supply voltage to SH-4
(B) SH-4 embedded board
(A) System
Hardware
-
T.Sakurai
Controlling VDD and VTH for low power
Active Stand-byMultiple VTH Dual-VTH MTCMOSVariable VTH VTH hopping VTCMOSMultiple VDD Dual-VDD Boosted gate MOSVariable VDD VDD hopping
K. Nose, M.Hirabayashi, H.Kawaguchi, S.Lee and T.Sakurai, “VTH-hopping Scheme for 82% Power Saving in Low-voltage Processors,” to be published, CICC 2001.
-
T.Sakurai
VTH-hopping
-0.1 0 0.1 0.20
1
2
4
6
8
1
0
2
Dynamic power
Delay
Leakage power
Nor
mal
ized
del
ay :
t d(t d
(VTH
=0)=
1)
Nor
mal
ized
pow
er :
P (P
(VTH
=0)=
1)
Threshold voltage : VTH [V]
VDD=0.5V0.35µµµµm process
Fixed VTHDynamic VTH control
Normalized workloadN
orm
aliz
ed P
ower
0 0.5 10
0.5
1
VTHlow=0VVDD=0.5V0.35µµµµm process
-
T.Sakurai
Schematic of VTH-hopping
VTHcontroller
Freq. control
CONT
fCLK or fCLK/2
VBSP2
Target processor
VTH-hopping(VTHlow&VTHhigh)
Power control block
VBSP1
VBSN2VBSN1
VTHlow_Enable
VTHhigh_Enable
VDD
GND
VBSN
VBSP
-
T.Sakurai
Microphotograph of RISC processor
0.6µµµµm process
Overhead of VTH-hopping= 14%
RISC core = 2.1mm x 2.0mm
VBS selector = 0.2mm x 0.6mm
-
T.Sakurai
Power comparison
Fixed VTHFixed freq.
Dual VTH0
0.5
1
Leakage powerDynamic power
VDD=0.5VVTHlow=0V94% f/2 operation
Nor
mal
ized
pow
er
VTH-hopping VTH-hopping+ Dual VTH
-
T.Sakurai
Principle for Power Reduction
● Lowering switching probability (αααα)● Low transition coding● Gated clock , Conditional F/F● Power estimation / optimization CAD
● Lowering load capacitance (CL)● Embedded memory● Low capacitance circuit● Low-power cell library (gate sizing)
● Lowering supply voltage (VS • VDD=VDD2)● Variable-VDD, Variable-VTH, Multi-VDD, Multi-VTH , DC/DC ● Low voltage memory● Low swing clock / bus
● Lowering operating frequency● Better algorithm
leak.juncleakgate10IfVVCP s/V0CLKDDSL TH ++•+••••α≈−
-
T.Sakurai
Bus Shuffling
● Bus shuffling● Virtually no overhead● Pattern information is necessary: applicable to special-
purpose systems● Layout modification or compiler backend support
● Problem definition● Given a set of patterns or statistics of patterns● Find order of bus lines so that power consumption due
to area and coupling capacitances is minimized
Y. Shin and T. Sakurai, “Coupling-driven bus design for low-power application-specific systems,”Proc. Design Automation Conf. (DAC), pp.750-753, June 2001.
-
T.Sakurai
Bus Shuffling
● Example
0001111001110001001100000100111110000111
0001111001110001001100000100111110000111
0001111001110001001100000100111110000111
0001111001110001001100000100111110000111
0001111001110001001100000100111110000111
0001111001110001001100000100111110000111
0000111111100001001000011100111010011010
time
b7 ……. b0
shielding lines line with highest transition prob.
line with highest positive switchingcorrelation with b6
line with highest positive switchingcorrelation with b6 or b0
partial order of this cluster: b0b6b5
-
T.Sakurai
Experiments● Result of heuristic
● 7 data address sets● % power saving compared to un-shuffled
buses
0
10
20
30
40
50
60
70
80
90
100
wavelet linear laplace compress lowpass fft ac3
% s
avin
g
eta=1eta=2eta=3eta=4eta=5eta=infinity
ηηηη=coupling cap.
area cap.
-
T.Sakurai
Software Level Low-Power WorkLow-power SW
Applications Operating system (OS)
Source-level design Compilation
-LP appl design-Power-aware appl
-Instruction scheduling-Register allocation-Memory mapping
Shutdown Slowdown
-Time-out-Predictive-Policy opt
-Reactive system-Real-time system
Techniques with architectural changes
-Cache design-Memory hierarchy design-Code compression-Application-specific processor design-Encoding
-
T.Sakurai
Less wasteful design wins low-power race
Example of MPEG2 decodingProcessor (software)~~~~25W
DSP~~~~4W
Dedicated sytem LSI (SW/HW)~~~~0.7W
Low
-pow
er
Hig
h fle
xibi
lity
-
T.Sakurai
Summary – low-powerFor reducing standby power, insert a power switch in series to logic circuit (BGMOS).→→→→ Choice of power switch gate width
For reducing active power, dual-VTH scheme and software control of VDD and VTH are promising.→→→→ Tools to support system-level low-power design with S/H co-design capability
Future giga-scale integration will use multiple VDD, VTH and Tox.→→→→ Tools to support new tech.
-
T.Sakurai
Drain induced barrier lowering (DIBL)
● Drain-Source間電圧が下がるとバンドギャップが増加し,間電圧が下がるとバンドギャップが増加し,間電圧が下がるとバンドギャップが増加し,間電圧が下がるとバンドギャップが増加し,リーク電流が減少するリーク電流が減少するリーク電流が減少するリーク電流が減少する
qVDDqVDD
Low VDDHigh VDD
Source Drain
-
T.Sakurai
Stack effect
● Stack effect の最大の利点の最大の利点の最大の利点の最大の利点 = VINT
-
T.Sakurai
Scaling of DIBL factor
0.1030.0760.0570.042DIBL factor : λλλλ
11.510.59.58stack effect factor
0.91.21.51.8VDD
0.060.090.130.18generation [µµµµm]
S=80mV/decade
-
T.Sakurai
Analysis of DIBL effect
● DIBLは閾値を変えるだけと仮定するは閾値を変えるだけと仮定するは閾値を変えるだけと仮定するは閾値を変えるだけと仮定する (ON時も時も時も時もDIBLを受けるを受けるを受けるを受ける)● VTH=0.15VDDH, その他のパラメータは前のスライドと同じその他のパラメータは前のスライドと同じその他のパラメータは前のスライドと同じその他のパラメータは前のスライドと同じ● 効果が世代にほとんどよらず約効果が世代にほとんどよらず約効果が世代にほとんどよらず約効果が世代にほとんどよらず約12%に減少に減少に減少に減少
VDDH ↔↔↔↔ VDDL
‘0’
0.06 0.09 0.13 0.180
1
2
Design rule [µµµµm]
0
1
0.40.2 Pr
atio
= P
(VD
DL)
/P(V
DD
H)
Volta
ge [V
]
VDDH
VDDL
Pratio
-
T.Sakurai
Three crises in VLSI designs
Power crisis
Interconnection crisis
Complexity crisis
-
T.Sakurai
Interconnect determines cost & perf.
0
20
40
60
80
100
Pow
er [%
]
MOSFET
2000 ‘05 ‘10 ‘15
Num
ber o
f int
erco
nnec
t la
yers
Year
(ITRS’99)
0
20
40
60
80
100
Del
ay [%
]
1mm Cu
0
20
40
60
80
100
Proc
ess
step
s [%
]# of layers
RC delayw/o buffers
Interconnect
MOSFETGatedelay
2000 ‘05 ‘10 ‘15Year
2000 ‘05 ‘10 ‘15Year
2000 ‘05 ‘10 ‘15Year
11
10
9
8
7
6
Interconnect
-
T.Sakurai
DSM interconnect design issuesbecome major design closure obstacles
Larger currentIR drop (static and dynamic)Reliability (electro-migration)
Smaller geometry / Denser patternRC delayCrosstalk noiseDelay fluctuationSignal Integrity
Higher speedInductanceEMI
-
T.Sakurai
Interconnect parameters trend
0
1
2
3
4
1996 2000 2004 2008 2012Year
εεεε r
ρρρρ[ΩΩΩΩ • cm]
Aspect ratio
Width [x0.1μμμμm]
Al Cu
Semiconductor Industry Association roadmaphttp://notes.sematech.org/1997pub.htm
-
T.Sakurai
RC delay and gate delay
1996 2000 2004 2008 201210-12
10-11
10-10
10-9
10-8
Year
Clock period
Gate delay
3mm
1mm
100µm
50µm
Del
ay (s
ec)
-
T.Sakurai
Buffer insertion
a) Without buffers b) With buffers
RINTCINT
( )INTINTOPTOPT
MOSINTINTINTOPT
INTINTOPT
INT
INTOPT
INTINTINTINT
TINTINTTTTINTINT
CCppChkgatesofCap
CRCRpppDelay
stagesofnumberOptimizedCRCR
ppk
kDelay
inverterbufferofsizeOptimizedCRRCh
hDelay
BufferedhCk
Rk
Ch
RhCh
Rpk
Ck
RpkDelay
MOSFETminimumofresistanceeffectiveGateRMOSFETminimumofecapacitancGateC
CRCRCRCRt
73.0/.
4.22
:0
:0
:
::
)(693.0377.0
210
00221
002
1
0
0
00
00
21
0
0
05
===
ττ≈+=
=→=∂
∂
=→=∂
∂
+++≈
+++≈
RTCT
-
T.Sakurai
Power and delay optimization
1 1.2 1.4 1.6 1.8 2 2.2 2.4 2.6 2.8 30.20.40.60.81
1.21.41.61.8
D/D
OPT
, h/h
OPT
, k/k
OPT
Capacitance increase / CINT
h: size of repeater
k: # of stages
D: total delay
CINT: interconnect
capacitance
a) Without repeaters b) With repeaters
RINTCINTRTCT
-
T.Sakurai
Buffer insertion with junction cap.
( )
)CCwhen(C04.1)0Cwhen(C73.0
CPPC
CCC
PPCChk)increasepower(p
00JINT
0JINT
INTJ2
1INT
0
0J0
2
10J0OPTOPT
====
η=+
=+=∆
CJ
( )
( )
)CCwhen(L8.2
)0Cwhen(L4.2
CR,/1pppL2
CCRCRCC
CpppL2Delay
00J0MOS0INT
0J0MOS0INT
000MOSJ0MOS0INTJ221
0J000INT0INT0J0
0221OPT
=ττ≈
=ττ≈
=τηττη+=
+
++=
b) With buffersa) Without buffers
RINTCINTRTCT
CJ L:length k stagesSize: h
Length: l
0
0JJ C
C1+=η
Total power is 15% smaller when CJ is negligible.
Delay is 15% smaller when CJ is negligible.
Delay is a geometric mean of interconnect delay and gate delay and decreases when gate is faster.
-
T.Sakurai
Buffered interconnect delay
1998 2002 2006 2010 201410-11
10-10
10-9
10-8
Gate delay
1um x 1um chip long
Clock period (global)
Clock period (local)
1um x 1um chip long buffered
Year
Del
ay (s
ec)
• In Emotion Engine design for PS2, about 10k buffers are inserted, which is increasing.
-
T.Sakurai
RC delay of global interconnections
10-10
10-9
10-8
10-7
10-6
Year
Del
ay (s
ec)
6µm x 2µm cross-section interconnectGlobal interconnect
Chip18mm x 18mm
30mm x 30mm
1998
Clock period
Minimum cross-section interconnect
2002 2006 2010 2014
-
T.Sakurai
DSM interconnect design issues
Larger currentIR drop (static and dynamic)Reliability (electro-migration)
Smaller geometry / Denser patternRC delayCrosstalk noiseDelay fluctuationSignal Integrity
Higher speedInductanceEMI
-
T.Sakurai
Capacitive Coupling Noise
1996 2000 2004 2008 2012Year
0
1
2
3
4
C20
C12C12
C12/C20
peak couple noise / signal voltage
ratio
-
T.Sakurai
Coupling noise in RC bus
0 1 20
1
Time : t / RC
Volta
ge :
V / E
1
Vp
Vp ≈ Cc/C1+ 2Cc/C
0 1 2Time : t / RC
0
1Vo
ltage
: V
/ E1
Vp
V1
V2
V1
V2
Cc / C = 1Cc / C = 1
E10 t=0
RC
CRCc
V1
V2
RC
CRCc
V1
V2Cc
E10 t=0
E10 t=0
RC
CRCc
V1
V2
RC
CRCc
V1
V2Cc
E10 t=0
Vp = 1+ 4Cc/C -1√√√√1+ 4Cc/C +1√√√√Vp ≈
2Cc/C2+ 3Cc/C
(Cc/C < 2)(Bus)
(Three lines)H.Kawaguchi and T.Sakurai, "Delay and Noise Formulas for Capacitively Coupled Distributed RC Lines," ASPDAC, Digest of Tech. Papers, pp.35-43, Feb. 1998.
-
T.Sakurai
Coupling among Interconnections
0 0.5 1 1.50
0.5
1
Nor
mal
ized
vol
tage
time
w/o buffers
CCCC
C CC/C=3.1
+105%-58%
)C/C()(..
elogRCt
C
pd
=≤+≈
++−+=
ηηη
ηηη
2370631
68126
1221 2
0.2µµµµ 0.2µµµµ0.3µµµµ0.5µµµµ
H.Kawaguchi and T.Sakurai, "Delay and Noise Formulas for Capacitively Coupled Distributed RC Lines," 1998 ASPDAC, Digest of Tech. Papers, pp.35-43, Feb. 1998.
-
T.Sakurai
Coupling among Interconnections
0 0.5 1 1.50
0.5
1
time
Nor
mal
ized
vol
tage Normal buffer insertion
Zigzag buffer insertion
WN/WP=25µµµµ/38µµµµ0.13mm processLINT=3mmK=8 (# of buf.)
+32%-28%
+9%
0
0.5
1
Nor
mal
ized
vol
tage
time0 0.5 1 1.5
In1 In2 In3
In1
In2
In3
In1
In2
In3
+4%
-
T.Sakurai
DSM interconnect design issues
Larger currentIR drop (static and dynamic)Reliability (electro-migration)
Smaller geometry / Denser patternRC delayCrosstalk noiseDelay fluctuationSignal Integrity
Higher speedInductanceEMI
-
T.Sakurai
Interconnect Cross-Section and Noise
Unscaled / anti-scaled• Clock• Long bus• Power supply
Scaled interconnect• Signal
1V 20W →→→→20A current5% noise →→→→0.05V noise →→→→~0.02V / 20A→→→→~10µm thick CuThick layer interconnect, area pad, package are co-designed.
-
T.Sakurai
DSM interconnect design issues
Larger currentIR drop (static and dynamic)Reliability (electro-migration)
Smaller geometry / Denser patternRC delayCrosstalk noiseDelay fluctuationSignal Integrity
Higher speedInductanceEMI
-
T.Sakurai
Inductance• Now RC effects surmounts LC
effects because R > |jωωωωL|.
• In the future, both of R and ωωωωLincrease (but for signal lines R > |jωωωωL|).
• Exception in low-R lines
• Inductive effects should be considered in wide clock lines in a fast processor, power supply lines and wide-bit bus lines changed at the same time.
• Clock lines are placed on power plane to reduce inductive effects.
[1] D.A.Priore, "Inductance on Silicon for Sub-micron CMOS VLSI," Symp. on VLSI Circuits, 1993.
W / H
L : S
elf-i
nduc
tanc
e (n
H/ c
m)
L = 2 ln 6H0.8W + T
100
10
10.001 0.01 0.1 0.5
0.1
0.01
0.001T/H=
-
T.Sakurai
Inductive Effects
10-2
10-1
100
101
102
1996 2000 2004 2008 2012Year
ω ωωωL
/ R
Min. width (scaled)
W=1μμμμmmmm
W=10μμμμmmmm
W=100μμμμmmmm
-
T.Sakurai
Inductive Effects in Clock Lines
Board design practice is imported in LSI.
P.J.Restle & A Deutsch, “Designing the Best Clock Distribution Network,” VLSI circuits symp., pp.2-3, May 1998.
-
T.Sakurai
Three crises in VLSI designs
Power crisis
Interconnection crisis
Complexity crisis
-
T.Sakurai
VLSI Design in 2010
Designing a map of 10m wide roadsfor a world atlas
-
T.Sakurai
Overcome complexity crisis
MPU Core
Cache
ROMRAM
MPEG CoreUSB Core
ProprietaryLogic
IP (A inc.)IP (B univ.)IP (C inst.)IP (D semi.)
IP ; CPU, DSP, memories, analog, I/O, logic..HW/FW/SW
• Re-use and sharing of design• Design in higher abstraction
-
T.Sakurai
System LSI for Games
●●●● Clock freq. 300MHz
● ● ● ● 10M transistors
● ● ● ● Graphics synthesizer integrate
40M tr. With embedded DRAM
●●●● Memory bandwidth 3.2GB/s
●●●● Floating operation 6.2GFLOPS/sec
●●●● 3D CG 6.6M polygon/sec
●●●● MPEG2 decode
-
T.Sakurai
Issues in System-on-Chip
• Un-distributed IP’s (i.e. CPU, DSP of a certain company)
• Low yield due to larger die size
• Huge initial investment for masks & development
• IP testability, upfront IP test cost
• Process-dependent memory IP’s
• Difficulty in high precision analog IP’s due to noise
• Process incompatibility with non-Si materials and/or
MEMS
-
T.Sakurai
Technologies integrated on a chip
‘98
LogicSRAM
Flash memoryEmbedded DRAM
CMOS RFFPGAMEMS
FeRAMChemical sensors
Electro-opticalElectro-biological
Year‘00 ‘02 ‘04 ‘06 ‘08 ‘10 ‘12
+0+1~2+4+4~5+3~5+2+2~10+4~5+2~6+5~8+?
ITRS’99RF : Radio FrequencyFPGA : Field Programmable Gate ArrayMEMS : Micro Electro Mechanical SystemsFeRAM : Ferroelectric RAM
-
T.Sakurai
DRAM embedding
DRAM Processor System LSI
Two orders of magnitude improvement in bandwidth and power
K.Sawada, T.Sakurai, et al, "A 72K CMOS Channelless Gate Array with Embedded 1Mbit Dynamic RAM," in Proc. CICC'88, pp.20.3.1-20.3.4, May 1988.
BUT EXPENSIVE!
-
T.Sakurai
Micro-machined mechanical switch
G.Weinberger, “The New Millennium: Wireless Technologies for a Truly Mobile Society,” ISSCC, pp.20-24, Feb. 2000.
-
T.Sakurai
Silicon MEMS microphone
M.Pinto, “Atoms to Applets: Building Systems ICs in the 21st Century,” ISSCC, pp.26-30, Feb. 2000.
10umWill soon exceed the performance of the best commercial microphones, yet be inexpensive and potentially integrated with on-chip electronics.
-
T.Sakurai
System-in-Package
-
T.Sakurai
System on a Chip
SoC vs. SiP
Chip
MPU Core
Cache
ROMLogic
AnalogUSB Core
DRAM• Smaller area• Shorter interconnect• Optimized process for
each die (Analog, DRAM, MEMS…)
• Good electrical isolation• Through-chip via
• Heat dissipation is an issue
Heat spreader/Heat pipe
System in a PackageMPU Core
DRAM
Logic
AnalogCache
USB ROM
-
T.Sakurai
Superconnect example based on three-dimensional assembly
PURE LOGIC
Heat Sink
ANALOG
RF/ANALOG
DRAM
K.Ohsawa, H.Odaira, M.Ohsawa, S.Hirade, T.Iijima, S.G.Pierce, “3-D Assembly Interposer Technology for Next-Generation Integrated Systems,” ISSCC Digest of Tech. Papers, pp.272-273, Feb.2001.
-
T.Sakurai
System-in-Packageの課題の課題の課題の課題
• Special design tools for placement & route for co-
design of LSI’s and assembly
• High-density reliable substrate and metallization
technology
• low-cost, available known good die
(reworkablility and module testing)
-
T.Sakurai
Super-connect technologyD
esig
n ru
le (µ
m)
0.01
0.1
1
10
100
Past Present Future
Tr. gate
Tr. gate
Tr. gate
InterconnectUpper layerLower layer
Upper layer
Middle layer
Lower layer
Super-connect
PackagePackagePackage
Technology Vacuum
M. Kimura,"Superconnect : 21st Century LSI Production and Design Method", Nikkei Microdevices, no.180, pp.62-79, June 2000.
-
T.Sakurai
Super-connect
Bandwidth
0.1
1
10
100
Design rule Power@1GB/s
Area
102
103
104
105
[mm] [mW] [GB/sec]
[mm2/bit]
Super-connect
Off-chip On-chip
1
10
100
1000
Cost/line
[AU]
10
1000[day]
100
Turn-aroundtime
1
10
100
1000
1
10
100
1000
-
T.Sakurai
0.5V LSIでの逆温度特性での逆温度特性での逆温度特性での逆温度特性
Photograph of 32bit FA0.3µµµµm CMOS
0000 0.50.50.50.5 1111 1.51.51.51.5 22220000
1111
2222
3333
4444
Norm
aliz
ed
Norm
aliz
ed
Norm
aliz
ed
Norm
aliz
ed t ttt
pd
pdpd
pd
VDD [V]
90ºC
50ºC
20ºC Measured
K.Kanda, K.Nose, H.Kawaguchi, and T.Sakurai,"Design Impact of Positive Temperature Dependence of Drain Current in Sub 1V CMOS VLSI's",CICC99, pp.563-566, May 1999.
-
T.Sakurai
Assembly & Packaging
“There is an increased awareness in the industry that assembly and packaging is becoming a differentiator in product development.”
International Technology Roadmap for Semiconductors, ITRS’99 p.213
-
T.Sakurai
LSI in 2014Year Unit 1999 2014 Factor
Design rule µm 0.18 0.035 0.2Tr. Density /cm2 6.2M 390M 30Chip size mm2 340 900 2.6Tr. Count per chip (µP) 21M 3.6G 170DRAM capacity 1G 1T 1000Local clock on a chip Hz 1.2G 17G 14Global clock on a chip Hz 1.2G 3.7G 3.1Power W 90 183 2.0Supply voltage V 1.5 0.37 0.2Current A 60 494.6 8Interconnection levels 6 10 1.7Mask count 22 28 1.3Cost / tr. (packaged) µcents 1735 22 0.01Chip to board clock Hz 500M 1.5G 3.0# of package pins 810 2700 3.3Package cost cents/pin 1.61 0.75 0.5
International Technology Roadmap for Semiconductors 1998 update sponsored by the Semiconductor Industry Association in cooperation with European Electronic Component Association (EECA) , Electronic Industries Association of Japan (EIAJ), Korea Semiconductor Industry Association (KSIA), and Taiwan Semiconductor Industry Association (TSIA) , International Technology Roadmap for Semiconductors: 1999 edition. Austin, TX:International SEMATECH, 1999.
-
T.Sakurai
Retinal Prosthesis
Prosthesis Prosthesis -- DualDual IntraocularIntraocular UnitsUnits
Courtesy: Prof. Wentai Liu (North Carolina Univ.)http://www.ece.ncsu.edu/erl/faculty/wtl_data/retina.html
-
T.Sakurai
Summary – Interconnect & SiPNew possibilities with buffered interconnects may open up new tools opportunity.
Silicon-in-Package needs new tools that support co-design of VLSI’s, package and assembly.
New design closure issues
-
T.Sakurai
Super-expoenentially increasing design challenges
1K
1Billion
Num
ber o
f Tra
nsis
tors
Functionality + Testability
Functionality + Testability + Wire delay
Functionality + Testability + Wire delay+ Power management
Functionality + Testability + Wire delay+ Power management
+ Embedded software
Functionality + Testability + Wire delay+ Power management
+ Embedded software + Signal integrityFunctionality + Testability + Wire delay+ Power management + Embedded software + Signal integrity + RF
Functionality + Testability + Wire delay+ Power management + Embedded software + Signal integrity + RF +Hybrid chips
Functionality + Testability + Wire delay+ Power management + Embedded software + Signal integrity + RF +Hybrid chips + PackagingFunctionality + Testability + Wire delay+ Power management + Embedded software + Signal integrity + RF +Hybrid chips + Packaging + Management of physical limits
ITRS’99