JST International Symposium on Dependable VLSI Systems 2013
Three-Dimensional VLSI System with Self-Restoration Function
Mitsumasa Koyanagi, Tohoku University
Hiroaki Kobayashi, Tohoku University
Takafumi Aoki, Tohoku University
Toshinori Sueyoshi, Kumamoto University
Tadashi Kamada, DENSO Corporation
Makoto Motoyoshi, Tohoku-MicroTec
Setting Target for High Performance Image Processing System to Guarantee Safety in Automobile Driving Assist
Number of cameras: 2
Measuring distance: 0~100m
Base line length: 12cm
Focal length: 6.5cm
Pixel resolution: SXGA
Window size: 48 pixels x 31 lines
Reconfiguration points: 30,000
Matching accuracy: 1/20 pixel
Z-axis resolution (measuring distance accuracy): 80cm at 50m ahead, 3m at 100m ahead
Calculating ability: ~1TFLOPS
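The two Z-axis resolution figures can be cross-checked against each other. The sketch below is not from the slides: it assumes the standard pinhole stereo relation Z = f·B/d, which gives a depth uncertainty of roughly Z²·δd/(f·B), and the function names are hypothetical. Because the pixel pitch is not stated, it derives the disparity error implied by the 50m figure and then predicts the resolution at 100m.

```python
def depth_resolution(z_m, focal_m, baseline_m, disparity_err_m):
    """Depth uncertainty dZ ~= Z^2 * dd / (f * B) for a pinhole stereo pair."""
    return z_m ** 2 * disparity_err_m / (focal_m * baseline_m)


def implied_disparity_error(z_m, dz_m, focal_m, baseline_m):
    """Invert the relation above: disparity error that yields dz_m at range z_m."""
    return dz_m * focal_m * baseline_m / z_m ** 2


FOCAL_M = 0.065     # focal length from the slide: 6.5cm
BASELINE_M = 0.12   # base line length from the slide: 12cm

# Disparity error on the sensor implied by "80cm at 50m ahead".
dd = implied_disparity_error(50.0, 0.80, FOCAL_M, BASELINE_M)

# Resolution predicted at 100m with the same disparity error.
dz_100 = depth_resolution(100.0, FOCAL_M, BASELINE_M, dd)

print(f"implied disparity error: {dd * 1e6:.2f} um")
print(f"predicted resolution at 100m: {dz_100:.2f} m")
```

Under this model, doubling the distance quadruples the uncertainty, so 80cm at 50m scales to 3.2m at 100m, which is consistent with the roughly 3m quoted above.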
3D-VLSI
Visual Information Processing Unit with 3D Stacked Structure
Application of 3D stacked graphics processor to Obstacle Detection Using Stereo Vision
Relation between Requirements by Application and Various Technologies
Target : ISO 26262 ASIL=C
Application of 3D VLSI to Image Processing VLSI for Automobile
Performance
Requirement
100FIT
Random Hardware Failures
Management of
Design Process, etc.
Hazard-causing
Systematic Failure
Dangerous Failure
ASIL=C
Failure Rate Allocation to Components Considering System Structure
Dependability
Requirement
80FIT
Test Coverage for Single-Point Failures: 97%
Test Coverage for Combined Failures: 80%
1TFLOPS / 5W
Performance=
Marketability・Safety
Requirements by Application
Device Scaling
3D Integration
Limitation in Increasing Test Coverage Increase of Dynamic Failures
Fault Detection by Multi-Modular Redundant Random Test
Self-Repair
Redundancy
Demand
Utilization
Realization Necessity
Cons. Demand
3D VLSI
Failure
Architecture of 3D-DVLSI System
Block diagram of a processor core
Configuration of 3D stacked multicore processor
Cross-sectional structure of 3D-VLSI
Design for 3D Stacked Processor Depending on Granularity
2D Design vs. 3D Design
[Figure: Normalized Delay vs. Cache Size (1MB, 2MB, 4MB, 8MB), 2D vs. 3D]
[Figure: Normalized Energy vs. Cache Size (1MB, 2MB, 4MB, 8MB), 2D vs. 3D]
Highly Dependable 3D-Stacked Multicore Processor Using SVP for Image Processing
Processing core (x4): GPP, Vector Core, 3D TAP, I/O
Processing Cores (can be used as Sys-SVP)
System-level SVP
System management
Allocating processing task
Testing processing cores
Replacing faulty cores
Memory
CPU
Sys-SVP maintenance
Migrating Sys-SVP
JTAG
UART RS232
System SVP+3D Processor Hardware SVP
System-Level SVP
Computing Core 1
Computing Core m
Redundant Core 1
Redundant Core (n-m-1)
Hardware-Level SVP (FPGA)
System Configuration, Test System-Level SVP
Task Allocation, Online Self-Test and Repair Control
Test: BIST controlled through TAP (JTAG) I/F
Repair: Replacement by another Redundant Core
Vertical Common Bus
Health Info., Logging
Vertically Stacked and Electrically Connected by Through-Silicon Vias (TSVs) using 3D Integration Technology.
Repair by replacing the failed core with a redundant core
Dependability Maintained by SVP
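The test-and-replace loop described on this slide can be sketched as simple bookkeeping. This is a minimal illustration, not the authors' implementation: the `CorePool` class, its method names, and the `self_test` callback (standing in for BIST driven through the TAP interface) are all hypothetical.

```python
class CorePool:
    """Bookkeeping a system-level supervisor might keep for core replacement."""

    def __init__(self, computing, redundant):
        self.active = list(computing)    # cores currently given tasks
        self.spares = list(redundant)    # healthy cores held in reserve
        self.retired = []                # cores that failed self-test
        self.log = []                    # health info / logging

    def test_and_repair(self, self_test):
        """Run self_test(core) on every active core and swap out failures."""
        for slot, core in enumerate(self.active):
            if core is None or self_test(core):
                continue                 # empty slot or healthy core
            self.retired.append(core)
            if not self.spares:
                self.log.append(f"{core}: failed, no spare left")
                self.active[slot] = None
                continue
            replacement = self.spares.pop(0)
            self.active[slot] = replacement
            self.log.append(f"{core}: failed, replaced by {replacement}")
        return self.active


pool = CorePool(["core1", "core2", "core3"], ["spare1"])
pool.test_and_repair(lambda core: core != "core2")   # pretend core2 fails BIST
print(pool.active)   # ['core1', 'spare1', 'core3']
print(pool.log)
```

The failed core keeps its slot index, so task allocation can continue to address slots rather than physical cores.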
RC : Recovery Controller
Plasma Plasma
Plasma
Spare
Selector + Voter + Detector
Memory (ECC protected)
Memory controller
UART
Memory controller
UART
Memory controller
UART RC ICAP
FrameECC
Memory
ICAP : Internal Configuration Access Port
RM
RM : Recovery Module
Implemented on:Xilinx Virtex-6 XC6VLX240T
Dependable HW-SVP
Triplicating processor core and peripheral modules
Implementing RM, RC and Spare
RM and RC control recovery sequence
Spare is used for hard-error avoidance
Readback and Overwrite Reconfiguration (Scrubbing)
ICAP: Internal Configuration Access Port
Frame: Minimum unit of reconfiguration (1 frame = 2,592 bits on Virtex-6)
Readback and error detection (FPGA frame, ICAP, Frame ECC):
(1) Read back configuration data
(2) Create syndrome
Reconfiguration to correct the error, when an error is detected:
(3) Repair readback data
(4) Overwrite the same frame
Apply this sequence to all frames
Soft-Error Recovery
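The four scrubbing steps can be sketched in software as follows. This is only an illustration under stated assumptions: frames are modelled as short bit lists protected by a textbook single-error-correcting Hamming code, which is not the actual Virtex-6 Frame ECC layout, and `encode_frame`, `syndrome`, and `scrub` are hypothetical names. Only single-bit errors per frame are handled.

```python
def syndrome(frame):
    """XOR of the 1-indexed positions of all set bits; 0 means no error seen."""
    s = 0
    for pos, bit in enumerate(frame):
        if bit:
            s ^= pos
    return s


def encode_frame(data_bits):
    """Build a Hamming-coded frame; index 0 is unused padding."""
    frame = [0]
    pending = list(data_bits)
    while pending:
        pos = len(frame)
        is_parity_slot = (pos & (pos - 1)) == 0   # powers of two hold parity
        frame.append(0 if is_parity_slot else pending.pop(0))
    s = syndrome(frame)
    k = 0
    while (1 << k) < len(frame):
        frame[1 << k] = (s >> k) & 1              # make the total syndrome zero
        k += 1
    return frame


def scrub(config_memory):
    """Apply readback / syndrome / repair / overwrite to every frame."""
    repaired = []
    for index in range(len(config_memory)):
        readback = list(config_memory[index])     # (1) read back the frame
        s = syndrome(readback)                    # (2) create syndrome
        if s == 0:
            continue
        if s >= len(readback):
            raise ValueError(f"frame {index}: uncorrectable error")
        readback[s] ^= 1                          # (3) repair readback data
        config_memory[index] = readback           # (4) overwrite the same frame
        repaired.append(index)
    return repaired


golden = [encode_frame([(v >> b) & 1 for b in range(16)]) for v in (3, 500, 40000)]
memory = [list(f) for f in golden]
memory[1][7] ^= 1            # inject a soft error into frame 1
print(scrub(memory))         # [1]
print(memory == golden)      # True
```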
Relocate PRB and separate a broken module
FPGA: ICAP, Module_0, Module_1, Module_2, Spare, Selector, Voter
Hard error
Implementing a copy of Module on Spare to reconstruct TMR configuration
* This is realized by making the internal configuration of the PR regions uniform (reported in Dec. 2011)
Readback
Reconfiguration
PRB relocation *
Hard-Error Recovery
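The idea of rebuilding the TMR configuration on a spare can be sketched behaviourally. Everything here is an assumption made for illustration: modules are plain Python callables, a module is treated as hard-failed after a hypothetical threshold of consecutive mismatches (a soft error would be cleared by scrubbing before that), and `TmrWithSpare` is not a name from the slides.

```python
from collections import Counter


def vote(outputs):
    """Return (majority value, indices of modules that disagree with it)."""
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise RuntimeError("no majority among module outputs")
    return value, [i for i, out in enumerate(outputs) if out != value]


class TmrWithSpare:
    def __init__(self, make_module, spares=1):
        self.make_module = make_module           # builds a fresh module copy
        self.modules = [make_module() for _ in range(3)]
        self.spares = spares
        self.strikes = [0, 0, 0]                 # consecutive mismatches

    def step(self, x, hard_error_threshold=2):
        value, bad = vote([m(x) for m in self.modules])
        for i in range(3):
            self.strikes[i] = self.strikes[i] + 1 if i in bad else 0
            if self.strikes[i] >= hard_error_threshold and self.spares > 0:
                self.modules[i] = self.make_module()   # copy module onto spare
                self.spares -= 1
                self.strikes[i] = 0
        return value


tmr = TmrWithSpare(lambda: (lambda x: x + 1))
tmr.modules[1] = lambda x: -1      # inject a persistent (hard) fault
print(tmr.step(3))                 # 4, the voter masks the fault
print(tmr.step(5))                 # 6, and the spare now replaces module 1
print(tmr.spares)                  # 0
```

The voter keeps the output correct while the fault is present; the spare restores full triplication so that a later second fault can still be masked.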
Block Diagram of Processor Core for 3D-Processor Chip
Interface for Debugger Terminal
External Memory Bus
Interface to Evaluate Stacked Memory (Soft-Error etc.)
Quadruple TSV Bus (I/O)
Peripheral Bus
System Bus
DBG
H-UDI
INTC
SCI
PB Bridge
CPG
TMU
General Purpose
Processor
DMAC
On-Line Self-Test
Controller
Stacked Shared
Memory
Vertical Bus
Bridge
WDT
Memory Controller
Internal Memory
Access I/F
TSV Bus
TSV Bus
IEEE1149.1 TAP
Controller
Block Diagram of 3D Microprocessor
System Bus
PB Bridge
Processor Core
On-Line Self-Test
Controller
Stacked Shared
Memory
Vertical Bus
Bridge
Memory Controller
System Bus
PB Bridge
Processor Core
On-Line Self-Test
Controller
Stacked Shared
Memory
Vertical Bus
Bridge
Memory Controller
disabled
System Bus
PB Bridge
Processor Core
On-Line Self-Test
Controller
Stacked Shared
Memory
Vertical Bus
Bridge
Memory Controller
disabled
Tier 3
Tier 2
Tier 1
Tier 0
External Memory
Vertical Bus using TSVs
System Bus
PB Bridge
Processor Core
On-Line Self-Test
Controller
Stacked Shared
Memory
Vertical Bus
Bridge
Memory Controller
disabled
3D Test Access Port (3D TAP)
Test Bus
System Bus Memory Bus
3D DfT Architecture
2nd Tier
1st Tier
Functional Design
• 4 stacked dies, core-based
• Inter-connect: TSVs
• Extra-connect: Pins
Existing Design-for-Test
• Core: Internal scan, TDC, LBIST, MBIST; IEEE 1149.1 wrappers, TAPC
• Stack product: IEEE Std 1149.1
3D-DfT Architecture - Test wrapper per die
• Based on IEEE 1149.1
• Two entry/exit points per die:
- Pre-bond: extra probe pads
- Post-bond: extra TSVs
Pseudo-Random Pattern Generator
MISR / Comparator
Die 1 (Bottom)
Pseudo-Random Pattern Generator
MISR / Comparator
Pseudo-Random Pattern Generator
MISR / Comparator
Sys. SVP
(Top)
Sys. SVP
Die n
TDI TDO TDI TDO
HW SVP
New 3D VLSI DfT Architecture for Online Self-Test of Dies Based on IEEE 1149.1.
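The per-die pairing of a pseudo-random pattern generator with a MISR and comparator can be sketched in software. This is an illustrative model only: the 16-bit width, the feedback polynomial, the seed, and the example circuit are assumptions, not parameters from the slides.

```python
MASK = 0xFFFF
POLY = 0xB400   # Galois taps for x^16 + x^14 + x^13 + x^11 + 1 (assumed)


def lfsr_step(state):
    """Advance a 16-bit Galois LFSR by one step."""
    lsb = state & 1
    state >>= 1
    if lsb:
        state ^= POLY
    return state


def run_bist(circuit, n_patterns=256, seed=0xACE1):
    """Drive the circuit with pseudo-random patterns and compact its responses."""
    pattern, signature = seed, 0
    for _ in range(n_patterns):
        response = circuit(pattern) & MASK
        signature = lfsr_step(signature) ^ response   # MISR update
        pattern = lfsr_step(pattern)                  # next test pattern
    return signature


def self_test(circuit, golden_signature, **kwargs):
    """Comparator: pass only if the signature matches the golden one."""
    return run_bist(circuit, **kwargs) == golden_signature


def good_adder(x):
    """Hypothetical circuit under test: adds the two bytes of a 16-bit word."""
    return (x & 0xFF) + (x >> 8)


golden = run_bist(good_adder)            # recorded from a known-good die
print(self_test(good_adder, golden))     # True
```

Only the final signature has to be compared against the stored golden value, so very little test data needs to cross the test access port during on-line self-test.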
TSV redundancy schemes (A: TSV area cost compared to single TSV; S: switch width for a signal)
No redundancy:
Single TSV: A: +0%, S: -
Multiple TSV:
Double TSV: A: +100%, S: -
Quadruple TSV: A: +300%, S: -
Repairable TSV:
4 signals : 6 TSVs: A: +50%, S: 3
8 signals : 10 TSVs: A: +25%, S: 3
16 signals : 20 TSVs: A: +25%, S: 5
[Figure: Stacking Yield (in case of 1,000 TSVs) vs. TSV Failure Rate (1.0E-06 to 1.0E-01) for single TSV, double TSV, quadruple TSV, and repairable TSV (4 signals : 6 TSVs, 8 signals : 10 TSVs, 16 signals : 20 TSVs)]
[Figure: Stacking Yield or Repairability vs. TSV Failure Rate (1.0E-06 to 1.0E-01); panels: (a) 1,000 vertical signals, (b) 10,000 vertical signals, (c) 100,000 vertical signals, (d) 1,000,000 vertical signals]
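Yield curves of this kind can be approximated with a simple probability model. The sketch below is an assumption, not the authors' model: it treats TSV failures as independent with probability p, assumes a multiple-TSV signal fails only when all of its parallel TSVs fail, and assumes a repairable group survives any combination of failures up to its number of spare TSVs. The function names are hypothetical.

```python
from math import comb


def yield_multiple(p, n_signals, copies):
    """Each signal uses `copies` parallel TSVs and fails only if all of them fail."""
    return (1.0 - p ** copies) ** n_signals


def yield_repairable(p, n_signals, group_signals, group_tsvs):
    """Signals are grouped; a group survives if at most `spares` TSVs fail."""
    spares = group_tsvs - group_signals
    group_ok = sum(
        comb(group_tsvs, k) * p ** k * (1.0 - p) ** (group_tsvs - k)
        for k in range(spares + 1)
    )
    n_groups = n_signals // group_signals
    return group_ok ** n_groups


p, n = 1e-3, 1000   # example TSV failure rate and number of vertical signals
results = [
    ("single", yield_multiple(p, n, 1)),
    ("double", yield_multiple(p, n, 2)),
    ("quadruple", yield_multiple(p, n, 4)),
    ("4:6", yield_repairable(p, n, 4, 6)),
    ("8:10", yield_repairable(p, n, 8, 10)),
    ("16:20", yield_repairable(p, n, 16, 20)),
]
for name, y in results:
    print(f"{name:>10}: {y:.6f}")
```

Under these assumptions a single TSV per signal gives a yield of about 0.37 at p = 1e-3 for 1,000 signals, while every redundant scheme stays close to 1; the schemes differ mainly in the area cost listed earlier.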
TEST
Tier4
Tier3
Tier2
Tier1
Process 1 push
pop
TEST
Process 1
Process 2 push
pop
TEST
Process 2
Process 3 push
pop
TEST
Process 3
push
pop Process 1
TEST
1 frame period = e.g. 16.7 ms
A Sequence of Task Allocation with On-line Self-Test
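One way to read this sequence is as a rotating schedule: in each frame period one tier runs the self-test while the others run the processes, and before a tier is tested its process is pushed (checkpointed) and popped (restored) on the tier that has just finished testing. The sketch below encodes that reading; the rotation order and the function name are assumptions, not details confirmed by the slide.

```python
def rotate_self_test(num_tiers, processes, num_frames):
    """Return per-frame assignments; entry i is what tier i+1 does in that frame."""
    if len(processes) != num_tiers - 1:
        raise ValueError("need exactly one tier free for testing")
    assignment = list(processes) + ["TEST"]      # the top tier is tested first
    schedule = [tuple(assignment)]
    for _ in range(num_frames - 1):
        tested = assignment.index("TEST")        # tier that just finished testing
        nxt = (tested - 1) % num_tiers           # tier to be tested next
        # push (checkpoint) the process on `nxt`, pop (restore) it on `tested`
        assignment[tested] = assignment[nxt]
        assignment[nxt] = "TEST"
        schedule.append(tuple(assignment))
    return schedule


for frame, row in enumerate(rotate_self_test(4, ["P1", "P2", "P3"], 5)):
    print(frame, row)
```

With four tiers and three processes, every tier is tested once every four frame periods and all three processes run in every frame, at the cost of one migration per frame.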
PLL TSV array for
Vertical System Bus Bridge
TSV array for 3D Shared Memory
General Purpose Processor Core
Vertical System Bus Bridge
3D Shared Memory
Controller
Local System Bus
Memory Controller
Peripherals
TAP
Input / Output Buffer, Power, Ground
90nm CMOS, 5mm x 5mm
1,920 TSVs/tier
TSV pitch >= 30um
@200MHz
420 TSVs for 3D Stacked Shared Memory
540 TSVs for Vertical System Bus
960 TSVs for Input / Output Buffer, Power, Ground
5.0 mm x 5.0 mm
Photograph of Processor Core Chip Fabricated by CMOS 90nm Technology
SEM Cross-Sectional View of 3D Stacked Processor
Processor 2
Processor 1
Multi-level Metallization
IN : pri_reset_n
Expected Device ID value
Measured Device ID value
IN : dp_clk
IN : dp_tdi_l
OUT : dp_tdo_l
IN : dp_tms
IN : dp_trst_n
Measured Output Waveform from 3D Stacked Processor
The Device ID portion was read out correctly. Device ID (32'b00010100010001010000000000000001). Note: the bit order in the waveform is reversed.
Timing chart
Input/output waveforms (driven at 5MHz)
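The Device ID reported on this slide can be split into the standard IEEE 1149.1 identification-code fields. The sketch below uses only the 32-bit value given above; the helper names are hypothetical. It also shows why the waveform appears reversed: the register is shifted out on TDO least significant bit first.

```python
def decode_idcode(bits_msb_first):
    """Split a 32-bit IEEE 1149.1 device identification code into its fields."""
    if len(bits_msb_first) != 32 or set(bits_msb_first) - {"0", "1"}:
        raise ValueError("expected a 32-character binary string")
    value = int(bits_msb_first, 2)
    return {
        "value": value,
        "version": (value >> 28) & 0xF,          # bits 31:28
        "part_number": (value >> 12) & 0xFFFF,   # bits 27:12
        "manufacturer": (value >> 1) & 0x7FF,    # bits 11:1
        "marker": value & 1,                     # bit 0, fixed at 1 for IDCODE
    }


def tdo_shift_order(bits_msb_first):
    """Bits in the order they appear on TDO over time (LSB first)."""
    return bits_msb_first[::-1]


idcode = "00010100010001010000000000000001"
fields = decode_idcode(idcode)
print(hex(fields["value"]))            # 0x14450001
print(hex(fields["part_number"]))      # 0x4450
print(tdo_shift_order(idcode)[:4])     # 1000
```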
Summary
We proposed a new dependable architecture for 3D-VLSI in which a system supervisor processor (Sys-SVP) controls the self-test and self-repair circuits, and a hardware system supervisor processor (HW-SVP) controls the Sys-SVP to maintain dependability.
We established a self-repair scheme that combines soft-error recovery by dynamic reconfiguration with hard-error avoidance by partial reconfiguration to guarantee the dependability of the HW-SVP.
We employed a checkpoint-and-restart scheme that migrates all functions of a failed processor layer to another processor layer to maintain dependability, and proposed a new algorithm that performs self-test and migration dynamically to maintain performance in the 3D stacked processor.
We introduced a new 3D DfT (Design for Test) architecture into the 3D stacked processor, with on-line self-test driven by the Sys-SVP through a 3D TAP (test access port) based on IEEE 1149.1, to maintain dependability.
We introduced redundancy methods to guarantee the reliability of the TSVs (through-silicon vias) in 3D-VLSI.
We have successfully fabricated a three-layer stacked multicore processor using a back-via type 3D integration technology.