Is System Level Test Practical At The Wafer Level ON ATE For 3D-IC Processors?
Gregory Smith, Teradyne
System Level Test
• Typically the last test insertion before shipment
• Main purpose of the test is to "boot" the processor and run self-test diagnostics
• Sometimes system level test is a permanent part of the test flow
• Often SLT is used on devices until functional test coverage is high enough to eliminate the insertion
• Current commercial solutions support 6 sites in parallel
• Test times can be minutes long
• Yields are usually very high (>>98%)
Photo: Chroma ATE
System Level Test Details
• Tester applies power to device
• Device loads bootstrap loader from flash memory into DRAM and executes loader
• Loader loads OS kernel image from flash into DRAM, and then launches the OS
• OS automatically starts diagnostic routines as a startup program
• Diagnostic routines test all cores, graphics, interfaces and power modes
• Test results are written to diagnostic port (UART)
• Tester looks for "Test Passed" message to bin the part
Loader size: ~0.5 to 2MB. OS kernel size: ~5 to 20MB.
[Diagram: Device Under Test connected to a Test Controller (PC) through the Debug Port, plus DRAM, Flash Memory (with bootstrap loader and kernel image), Audio (usually looped back), Display (usually to an HDMI receiver) and Power Management partner devices]
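The last two steps of the sequence above (results written to the UART, tester looking for "Test Passed") can be sketched in a few lines. This is only an illustration: the function name `bin_part`, the line limit standing in for a timeout, and the sample log lines are all made up here, and a real test cell would read from the serial port rather than a list.

```python
from typing import Iterable

PASS_MARKER = "Test Passed"  # the message named on the slide


def bin_part(uart_lines: Iterable[str], max_lines: int = 10_000) -> str:
    """Return "PASS" if the pass marker appears in the UART log, else "FAIL".

    max_lines is a hypothetical stand-in for the wall-clock timeout a real
    tester would apply to a device that never finishes booting.
    """
    for count, line in enumerate(uart_lines, start=1):
        if PASS_MARKER in line:
            return "PASS"
        if count >= max_lines:
            break
    return "FAIL"


# Invented log lines, for illustration only.
log = [
    "loader: copying kernel image to DRAM",
    "kernel: starting diagnostics",
    "diag: cores, graphics, interfaces, power modes done",
    "Test Passed",
]
print(bin_part(log))  # PASS
```

A device that hangs during boot never prints the marker, so it falls through to "FAIL" once the limit is reached.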
Mobile Processor Production Process Changes
[Flow diagrams, three panels. Each panel shows the production steps and what is scrapped when a test step finds a bad part.]
Mobile Processor w/ ext Memory
Steps labeled: FAB, Wafer Test, Assemble, Pkg Test, Burn-in, Post BI Test, System Level Test
Scrap labeled: Bad Die; Die + Pkg; Die + Pkg; Die + Pkg
Mobile Processor w/ Package On Package Memory
Steps labeled: FAB, Wafer Test, MAP die, Assemble, Pkg Test, Burn-in, Post BI Test, Add POP Memory, System Level Test
Scrap labeled: Bad Die; Die, Pkg, Mem
Mobile Processor w/ Wide IO Memory
Steps labeled: FAB, Wafer Test, Add Memory Cube, Assemble, Pkg Test, Burn-in, Post BI Test, System Level Test
Scrap labeled: Bad Die; Die, Pkg, Mem; Die, Pkg, Mem
What if … SLT at Wafer Level Test
[Figure: SLT at Package Test. Labels: Passing Die, Memory Dice, 3D IC Assembly, Initial Package Test, Burn-in, Post BI Test, and a DUT surrounded by Host, Audio, Flash and PMIC. Image credit: Qualcomm]
SLT Coverage at Wafer Test
[Figure: SLT coverage moved to wafer test. Labels: Protocol Aware ATE providing Host Emulation, Audio Tests, DRAM Emulation, Display Interface, Power Mode Tests and Flash Emulation to the DUT; Wafer level Burn-in; Bad Devices Identified at Wafer Level; Passing Die and Memory Dice into 3D IC Assembly; Final Package Test; Low DPM; Yield Loss only due to Assembly defects. Image credit: Qualcomm]
SLT on ATE – Challenges / Strategies
• Technical
– Challenge: Need to emulate device interfaces in real time (including memory)
– Strategy: Protocol Aware capable ATE solutions for many interfaces
– Strategy: WideIO interfaces unsolved
• Interfacing
– Challenge: Need to support full at-speed performance on a probe card
– Strategy: Leverage probe technology in use for RF ICs, GPUs and MPUs
• Commercial
– Challenge: SLT is cheap! Couple of thousand bucks for a motherboard, plus handler and power supplies. How can an expensive ATE compete?
– Strategy: ROI needs to account for reduced scrap costs, faster failure analysis and improved time to market
3D-IC Could Radically Increase Scrap Cost
Columns: (A) single die package, conventional SLT; (B) single die package, WL-SLT; (C) processor with POP memory, conventional SLT; (D) processor with POP memory, WL-SLT.

Item                    A         B         C         D
Die cost                $2.000    $2.000    $2.000    $2.000
WS test cost            $0.100    $0.300    $0.100    $0.300
WS yield                90%       88%       90%       88%
WS scrap cost           $0.210    $0.276    $0.210    $0.276
Memory cube cost        -         -         $10.000   $10.000
Package cost            $0.500    $0.500    $1.000    $1.000
Assembly cost           $0.100    $0.100    $0.500    $0.500
FT test cost            $0.100    $0.100    $0.100    $0.100
FT yield                98%       99.5%     98%       99.5%
FT scrap cost           $0.060    $0.016    $0.278    $0.071
SLT test cost           $0.100    -         $0.100    -
SLT yield               99.5%     -         99.5%     -
SLT scrap cost          $0.016    -         $0.070    -
Total COGS              $3.186    $3.292    $14.358   $14.247
Total COT               $0.300    $0.400    $0.300    $0.400
Total scrap cost        $0.286    $0.292    $0.558    $0.347
Total COT + scrap       $0.586    $0.692    $0.858    $0.747
Change from baseline    -         -18.2%    -         13.0%
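The slide does not state how the scrap-cost rows are derived. One plausible reading, sketched below, is that a part failing a test insertion carries everything spent on it up to that point, including scrap already charged at earlier insertions. The `Step` class, `roll_up` and `pop_flow` are names invented for this sketch, and the model itself is an assumption; it is applied here to the two POP-memory columns using the inputs from the table.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Step:
    name: str
    cost: float                         # dollars added at this step
    test_yield: Optional[float] = None  # set only for test insertions


def roll_up(steps: List[Step]) -> Tuple[float, float, float]:
    """Return (total COGS, total cost of test, total scrap cost)."""
    accumulated = test_cost = scrap = 0.0
    for step in steps:
        accumulated += step.cost
        if step.test_yield is not None:
            test_cost += step.cost
            # Assumed model: a failing part takes all spend so far with it.
            loss = accumulated * (1.0 - step.test_yield)
            scrap += loss
            accumulated += loss
    return accumulated, test_cost, scrap


def pop_flow(ws_cost: float, ws_yield: float, ft_yield: float,
             with_slt: bool) -> List[Step]:
    """Processor with POP memory, using the inputs from the table."""
    steps = [
        Step("die", 2.0),
        Step("wafer sort", ws_cost, ws_yield),
        Step("memory cube", 10.0),
        Step("package", 1.0),
        Step("assembly", 0.5),
        Step("final test", 0.1, ft_yield),
    ]
    if with_slt:
        steps.append(Step("SLT", 0.1, 0.995))
    return steps


conventional = roll_up(pop_flow(0.1, 0.90, 0.98, with_slt=True))
wafer_level = roll_up(pop_flow(0.3, 0.88, 0.995, with_slt=False))

for label, (cogs, cot, scrap) in [("conventional SLT", conventional),
                                  ("WL-SLT", wafer_level)]:
    print(f"{label}: COGS {cogs:.3f}, COT {cot:.3f}, scrap {scrap:.3f}")
```

Under this assumption the totals land within about a cent of the slide's $14.358 and $14.247. The small gap in the conventional column suggests the original spreadsheet rounds or accumulates slightly differently at the SLT step.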
SLT on ATE Prevents Idle Test Cells
[Figure: two charts titled "ATE and SLT Capacity", each plotting # of Test Cells (0 to 30) and Weekly Volume (K) (0 to 300) against Weeks from Production Release (0 to 60). First chart series: Device Volume, ATE Capacity, SLT Capacity, with "Idle SLT Setups" marked. Second chart series: Device Volume, ATE with SLT Capacity.]
• Eliminates excess SLT capacity as SLT test time is reduced
• Increases velocity of improvements to structural tests from SLT with better FA
• Reduces device bring-up time for initial sample test
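The link between test time and idle capacity is simple arithmetic, sketched below. The formula and the function name `cells_needed` are assumptions of this sketch, as are the 180 s and 60 s test times and the 168-hour week. The 6 sites come from the earlier slide and 250K units per week is just a value within the chart's axis range.

```python
import math


def cells_needed(weekly_units: int, test_seconds: float, sites: int,
                 hours_per_week: float = 168.0,
                 utilization: float = 1.0) -> int:
    """Test cells required to keep up with a weekly volume.

    Assumes every site runs back to back for the whole week, scaled by
    a utilization factor. Handler index time is ignored.
    """
    units_per_cell = sites * hours_per_week * 3600.0 * utilization / test_seconds
    return math.ceil(weekly_units / units_per_cell)


# Illustrative numbers only: 250K units/week on 6-site cells.
print(cells_needed(250_000, test_seconds=180, sites=6))  # 13
print(cells_needed(250_000, test_seconds=60, sites=6))   # 5
```

Cells bought for the longer test time sit idle once the test time drops, which is the excess capacity the first bullet refers to.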
SLT on ATE is Enabled by Protocol Aware ATE
[Block diagram: Host Computer and a Digital Card. The "stored response" digital path contains Timing, DSSC, Logic, Patgen and Pin Electronics. Alongside it sit FPGA Based Protocol Engines with Transaction Memory. Both connect to the DUT.]
• PA architecture integrated into digital instrument
– Select Protocol Aware or standard digital on any pin
– Used together with scan, BIST, etc.
• "Real Time Intelligence" to communicate with DUT
• FPGA architecture allows flexibility and low latency
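To make the contrast with "stored response" testing concrete: a stored pattern expects fixed data on fixed cycles, while a protocol-aware engine answers whatever transaction the DUT issues, in whatever order it issues them. The toy model below shows only that idea. The `EmulatedMemory` class and the transaction tuple format are invented for this sketch and say nothing about how the FPGA engines are actually built.

```python
from typing import Dict, List, Optional, Tuple

# (operation, address, data); data is None for reads. Hypothetical format.
Transaction = Tuple[str, int, Optional[int]]


class EmulatedMemory:
    """Toy responder: replies to requests as they arrive."""

    def __init__(self, image: Dict[int, int]) -> None:
        self.cells = dict(image)

    def handle(self, txn: Transaction) -> Optional[int]:
        op, address, data = txn
        if op == "write":
            if data is None:
                raise ValueError("write needs data")
            self.cells[address] = data
            return None
        if op == "read":
            return self.cells.get(address, 0)
        raise ValueError(f"unknown operation: {op}")


memory = EmulatedMemory({0x0: 0xEA, 0x4: 0x10})
requests: List[Transaction] = [
    ("read", 0x4, None),
    ("write", 0x8, 0x55),
    ("read", 0x8, None),
]
responses = [memory.handle(txn) for txn in requests]
print(responses)  # [16, None, 85]
```

The third response depends on the write that preceded it, which a fixed stored-response vector could only handle if the DUT's request order and timing were known in advance.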
ATE Memory Emulation: State of the Art - 2012
Device interface requirement | Current SLT implementation | Current ATE capability | Potential ATE capability
Flash Memory
Emulate EMMC protocol | Yes | Supported protocol | Supported protocol
Image supported | ~20MB | >20MW | >20MW
LP-DDR
Provide LP-DDR3 I/F | Uses DRAM device | DDR emulation | DDR emulation
Interface speeds | 400 to 1600Mbps now, up to 4.1 by 2014 | To 1067Mbps | Faster, but emulation at 4.1 is tough
Read latency | <10 cycles | >>10 cycles | >>10 cycles
Memory size | ~20MB (kernel) | 64KW | Double?
Wide IO Memory (current SLT implementation for all rows: perform test after memory assy)
Provide Wide IO pin count | | Too expensive, need ~$100/pin | Solvable, sacrifice features
Provide very low load C (<1pF) | | Loads > 20pF | Custom buffer on probe card?
Reliably contact >500 microbumps | | Microbumps too close | Continued R&D in probe cards
Emulate Wide IO protocol | | PA limited to 1/2 board (128 pins) | Solvable
DDR Memory Emulation Challenge
• Physical distance from DUT to memory, plus buffering, adds latency
• Use of internal FPGA memory (max 10Mb) limits emulation size
• Possible to use external memory, but with large increase in latency for additional DDR controller in FPGA
• Crossing time domain from device DDR timing to internal digital instrument timing requires retiming with PLLs. This makes rate changes tough to follow
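The size limit can be checked against the image sizes quoted earlier in the deck (loader ~0.5 to 2MB, OS kernel ~5 to 20MB). The sketch below reads "10Mb" as 10 megabits and uses decimal megabytes; both readings are assumptions, and `fits_in_fpga` is a name invented here.

```python
FPGA_MEMORY_BITS = 10_000_000  # "max 10Mb", read as 10 megabits (assumption)
MB = 1_000_000                 # decimal megabyte (assumption)


def fits_in_fpga(image_bytes: int, capacity_bits: int = FPGA_MEMORY_BITS) -> bool:
    """True if an image of the given size fits in the FPGA's internal memory."""
    return image_bytes * 8 <= capacity_bits


images = {
    "loader, low end (0.5MB)": MB // 2,
    "loader, high end (2MB)": 2 * MB,
    "kernel, low end (5MB)": 5 * MB,
    "kernel, high end (20MB)": 20 * MB,
}
for label, size in images.items():
    print(f"{label}: {'fits' if fits_in_fpga(size) else 'does not fit'}")
```

On these assumptions the internal memory holds about 1.25MB, so only the smallest loader fits and even the smallest kernel is roughly four times too large.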
Is it practical to replace SLT with tests at an ATE insertion?
• For LP-DDR POP and PIP designs, yes…
– Memory controller testability may require enhancement
  • Tolerate longer read latency, constant DDR rates
– OS kernel requires a radical size reduction to utilize emulated DRAM
  • Current loader and OS kernel include large numbers of unused services
  • A test-specific "Tiny Loader" and "Tiny Kernel" could be developed
  • Bonus: A "Tiny Kernel" will load and execute more quickly
– Test coverage would equal SLT, but would not use the same OS as the target application
• For WideIO memory designs
– ATE per pin digital prices need to come down by 5x
– A solution to low capacitance drive must be developed
– A solution to probe high density microbumps must be developed
– Until that happens, emulating WideIO memory for system level testing on wafer is not yet practical.
Conclusion
[Figure repeated from "SLT Coverage at Wafer Test": Protocol Aware ATE emulating host, DRAM and flash at wafer level, followed by 3D IC Assembly and Final Package Test. Image credit: Qualcomm]
• Increasing test coverage at wafer test, including SLT coverage, is possible for LP-DDR designs, and has been validated to reduce scrap, improve final yield and accelerate time to volume
• Focused efforts from ATE suppliers and device manufacturers are needed to
– Improve ATE capability
– Improve device testability
– Develop "Tiny Loader" and "Tiny Kernel"
• Providing WideIO memory emulation on ATE will require significant R&D to solve electrical and mechanical challenges before it can be used in production.
• Yield loss after assembly of 3D ICs could limit adoption in more cost sensitive markets