
Shaahin Hessabi

Department of Computer Engineering

Sharif University of Technology

SoC Design, Lecture 14: SoC Testing


Outline

Hessabi © Sharif University of Technology. SoC: SoC Testing

Introduction to Testing

Importance of SoC Testing

Challenges of SoC Testing

Basics of SoC Testing

IEEE 1500 Embedded Core Test Standard

NoC Testing

Case Studies

Moore’s Law and Test Challenges


Importance of Testing


Moore’s Law results from decreasing feature size (dimensions) from 10s of μm to 10s of nm for transistors and interconnecting wires

Operating frequencies have increased from 100 kHz to several GHz

Decreasing feature size increases probability of defects during manufacturing process.

A single faulty transistor or wire results in a faulty IC.

Testing required to guarantee fault-free products

“Test may account for more than 70% of the total manufacturing cost - test cost does not directly scale with transistor count, die size, device pin count, or process technology.” [ITRS’03]

“Test data volume and testing time in 2010 will be 30X that for today’s chips.” [ITRS’05]


Failure Rate Example


Assume a system with 500 components.

Each component has a failure rate $\lambda_i = 1000$ FITs (failures in time; 1 FIT = 1 failure per $10^9$ hours).

System failure rate: $\lambda = (1000/10^9) \times 500 = 5 \times 10^{-4}$ failures per hour

MTBF $= 1/\lambda = 2000$ hours

Suppose availability of the system must be 99.999% each year.

Repair time allocated for system repair each year should be less than: $t = T \times (1 - \text{system availability}) = 1 \times 365 \times 24 \times 3600 \times (1 - 0.99999) \approx 315$ seconds $\approx 5$ minutes

Only fault tolerance with built-in self-repair (BISR) capability meets system availability requirement.
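
The arithmetic of this example can be reproduced with a short sketch. It assumes that one FIT means one failure per 10^9 hours (consistent with the 10^9 divisor used above) and that the components fail independently, so their rates add. The function names are illustrative only.

```python
HOURS_PER_FIT_UNIT = 1e9  # assumption: 1 FIT = 1 failure per 10^9 hours


def system_failure_rate(n_components, fits_per_component):
    """Failures per hour when independent component failure rates add up."""
    return n_components * fits_per_component / HOURS_PER_FIT_UNIT


def mtbf_hours(failure_rate_per_hour):
    """Mean time between failures is the reciprocal of the failure rate."""
    return 1.0 / failure_rate_per_hour


def allowed_downtime_seconds(availability, period_seconds=365 * 24 * 3600):
    """Repair-time budget over one period (default: one year)."""
    return period_seconds * (1.0 - availability)


lam = system_failure_rate(500, 1000)
print(f"failure rate = {lam:.1e} per hour")
print(f"MTBF = {mtbf_hours(lam):.0f} hours")
print(f"yearly repair budget = {allowed_downtime_seconds(0.99999):.0f} s")
```

With an MTBF of about 2000 hours the system fails several times a year, while the yearly repair budget is only about five minutes, which is why the slide concludes that manual repair cannot meet the requirement.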


Testing Principle



Manufacturing Test


Testing is one of the most expensive parts of chips.

Logic verification accounts for > 50% of design effort for many chips.

Debug time after fabrication has enormous opportunity cost

Shipping defective parts can sink a company

Example: Intel FDIV bug

Logic error not caught until > 1M units shipped

Recall cost $450M (!!!)

A speck of dust on a wafer is sufficient to kill chip

Yield of any chip is < 100%.

Must test chips after manufacturing, before delivery to customers, to ship only good parts.

Manufacturing testers are very expensive.

Minimize time on tester.

Careful selection of test vectors.


Stuck-At Faults


How does a chip fail?

Usually failures are shorts between two conductors or opens in a conductor.

This can cause very complicated behavior.

A simpler model: Stuck-At

Assume all failures cause nodes to be “stuck-at” 0 or 1, i.e. shorted to GND or VDD.

Not quite true, but works well in practice
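
A minimal sketch of the stuck-at model: a node is forced to a constant, and a test vector detects the fault if the faulty output differs from the fault-free output. The circuit y = (a AND b) OR c and the node names are hypothetical, chosen only for illustration.

```python
def simulate(a, b, c, fault=None):
    """Evaluate y = (a AND b) OR c.

    fault is None (fault-free) or a tuple (node_name, stuck_value),
    where node_name is one of "a", "b", "c", "n1", "y".
    """
    def node(name, value):
        # A faulty node ignores its driven value and stays at the stuck value.
        if fault is not None and fault[0] == name:
            return fault[1]
        return value

    a, b, c = node("a", a), node("b", b), node("c", c)
    n1 = node("n1", a & b)
    return node("y", n1 | c)


def detects(vector, fault):
    """A vector detects a fault if good and faulty outputs differ."""
    return simulate(*vector) != simulate(*vector, fault=fault)


print(detects((1, 1, 0), ("a", 0)))  # input a stuck-at-0, observed at y
print(detects((0, 0, 1), ("a", 0)))  # c = 1 masks the fault at the output
```

The second call shows why both controllability and observability matter: the fault must be activated (a driven to 1) and its effect must propagate to an output (c held at 0).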


Observability & Controllability


Observability: ease of observing a node by watching external output pins of the chip

Controllability: ease of forcing a node to 0 or 1 by driving input pins of the chip

Combinational logic is usually easy to observe and control

Finite state machines can be very difficult, requiring many cycles to enter the desired state.

Especially if the state transition diagram is not known to the test engineer.


Test Pattern Generation


Manufacturing test ideally would check every node in the circuit to prove it is not stuck.

Apply the smallest sequence of test vectors necessary to prove each node is not stuck.

Good observability and controllability reduce the number of test vectors required for manufacturing test.

Reduces the cost of testing.

Motivates design-for-test.
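
As an illustration of choosing a small set of vectors that proves every node is not stuck, the sketch below fault-simulates all input vectors of a hypothetical circuit y = (a AND b) OR c and then picks vectors greedily. This is exhaustive simulation plus a greedy set-cover heuristic, not a production ATPG algorithm, and the greedy choice is not guaranteed to be minimal in general.

```python
from itertools import product

NODES = ("a", "b", "c", "n1", "y")
# Every node stuck-at-0 and stuck-at-1.
FAULTS = [(node, value) for node in NODES for value in (0, 1)]


def simulate(vector, fault=None):
    """Evaluate y = (a AND b) OR c with an optional stuck-at fault."""
    def node(name, value):
        return fault[1] if fault and fault[0] == name else value

    a, b, c = (node(n, v) for n, v in zip("abc", vector))
    n1 = node("n1", a & b)
    return node("y", n1 | c)


def detected_faults(vector):
    """Set of faults whose effect is visible at y for this vector."""
    good = simulate(vector)
    return {f for f in FAULTS if simulate(vector, f) != good}


def greedy_test_set():
    """Repeatedly pick the vector that detects the most remaining faults."""
    remaining = set(FAULTS)
    vectors = list(product((0, 1), repeat=3))
    chosen = []
    while remaining:
        best = max(vectors, key=lambda v: len(detected_faults(v) & remaining))
        newly = detected_faults(best) & remaining
        if not newly:
            break  # whatever is left is undetectable by any vector
        chosen.append(best)
        remaining -= newly
    return chosen, remaining


tests, undetected = greedy_test_set()
coverage = 100 * (len(FAULTS) - len(undetected)) / len(FAULTS)
print(f"{len(tests)} of 8 vectors give {coverage:.0f}% stuck-at coverage")
```

For this small circuit a subset of the eight possible vectors already covers all ten stuck-at faults, which is the effect the slide describes: fewer vectors, less tester time.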


Test Generation Process



Design for Test (DFT)


Design the chip to increase observability and controllability

If each register could be observed and controlled, test problem reduces to testing combinational logic between registers.

Better yet, logic blocks could enter test mode where they generate test patterns and report the results automatically.


Structured DFT: Scan Design


Circuit is designed using pre-specified design rules.

Test structure (hardware) is added to the verified design:

Add a test control (TC) primary input.

Replace flip-flops by scan flip-flops (SFF) and connect to form one or more shift registers in the test mode.

Make input/output of each scan shift register controllable/observable from PI/PO.

Use combinational ATPG to obtain tests for all testable faults in the combinational logic.

Add shift register tests and convert ATPG tests into scan sequences for use in manufacturing test.


Tests for Full-Scan Circuits


Test generation for combinational logic only.

Separate the test vectors and response data, based on PI, PO and state (F) variables:

$t_i = (t_i^I, t_i^F)$, $r_i = (r_i^O, r_i^F)$, $i = 1, 2, \ldots, n$

Test application:

1. Scan in $t_i^F$ by setting the circuit in test mode

2. Apply $t_i^I$

3. Observe $r_i^O$

4. Set the circuit in functional mode and capture the response $r_i^F$ into the scan register

5. Scan out $r_i^F$ while scanning in $t_{i+1}^F$ by setting the circuit in test mode

6. $i \leftarrow i + 1$. Go to 2

Sequence length $= (n_{comb} + 1)\, n_{sff} + n_{comb}$ clock periods

$n_{comb}$ = number of combinational vectors, $n_{sff}$ = number of scan flip-flops

Scan register must be tested prior to application of scan test sequences. Add $n_{sff} + 4$ to the sequence length obtained above.
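
The sequence-length formula above is easy to evaluate numerically. The sketch below is a direct transcription; the vector and flip-flop counts in the example call are hypothetical and not taken from the lecture.

```python
def scan_test_length(n_comb, n_sff, include_register_test=True):
    """Clock periods needed to apply a full-scan test set.

    n_comb: number of combinational test vectors
    n_sff:  number of scan flip-flops in the (single) scan chain
    """
    # One scan load per vector, one final scan-out, one capture per vector.
    length = (n_comb + 1) * n_sff + n_comb
    if include_register_test:
        # Extra cycles for testing the scan register itself.
        length += n_sff + 4
    return length


# Hypothetical design: 500 vectors, 2000 scan flip-flops in one chain.
print(scan_test_length(500, 2000, include_register_test=False))
print(scan_test_length(500, 2000))
```

The product term dominates, so test time grows with both the vector count and the chain length; splitting the flip-flops over several parallel chains shortens each scan load.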


Scan Design overheads


1. I/O pins: One pin necessary.

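2. Additional area for latches/FFs (area overhead)

Gate overhead $= [4\, n_{sff} / (n_g + 10\, n_{ff})] \times 100\%$, where $n_g$ = number of combinational gates, $n_{ff}$ = number of flip-flops, $n_{sff}$ = number of scan flip-flops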

Example: ng = 100k gates, nff = 2k flip-flops, overhead = 6.7%.

More accurate estimate must consider scan wiring and layout area.

3. Additional time required to latch the next state into the registers (speed overhead)

Multiplexer delay added in combinational path; approx. two gate-delays.

Flip-flop output loading due to one additional fan-out; approx. 5-6%.

4. Additional time required to scan in/out test vectors and responses (testing overhead)

5. Clock generation and distribution is more difficult.
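
The gate-overhead estimate in item 2 can be checked with a few lines. The sketch assumes full scan when the number of scan flip-flops is not given (every flip-flop is replaced by a scan flip-flop), which is what the slide's example implies.

```python
def scan_gate_overhead_percent(n_gates, n_ff, n_sff=None):
    """Gate overhead of scan design, in percent.

    n_gates: combinational gates, n_ff: flip-flops,
    n_sff: scan flip-flops (assumed equal to n_ff for full scan).
    """
    if n_sff is None:
        n_sff = n_ff  # full-scan assumption
    return 4 * n_sff / (n_gates + 10 * n_ff) * 100


# The slide's example: 100k gates and 2k flip-flops.
print(f"{scan_gate_overhead_percent(100_000, 2_000):.1f}%")
```

As the slide notes, this counts gates only; scan wiring and layout area would add to the figure.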


Built-In Self-Testing (BIST)


Specify test as one of the system functions: self-test.

Useful for field test and diagnosis (less expensive than local automatic test equipment).

Logic built-in self-test (BIST): incorporates a test pattern generator (TPG) and an output response analyzer (ORA) internal to the design.

Chip can test itself.

Can be used at all levels of testing: SoC, PCB, system, field operation.

Combine with scan approach at the design stage

Crucial for safety-critical and mission-critical applications
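
A minimal sketch of the TPG/ORA idea, assuming an LFSR as the pattern generator and an LFSR-based signature register as the response analyzer. The polynomial x^4 + x^3 + 1, the circuit under test, and the injected fault are all hypothetical choices for illustration.

```python
def lfsr_step(state):
    """One shift of a 4-bit LFSR (assumed polynomial x^4 + x^3 + 1)."""
    feedback = ((state >> 3) ^ (state >> 2)) & 1
    return ((state << 1) | feedback) & 0xF


def lfsr_patterns(seed=0b0001, count=15):
    """Test pattern generator: successive LFSR states."""
    state, patterns = seed, []
    for _ in range(count):
        patterns.append(state)
        state = lfsr_step(state)
    return patterns


def good_circuit(x):
    """Hypothetical circuit under test: 4-bit binary-to-Gray converter."""
    return (x ^ (x >> 1)) & 0xF


def faulty_circuit(x):
    """Same circuit with output bit 0 stuck-at-0."""
    return good_circuit(x) & 0b1110


def signature(circuit, patterns):
    """Output response analyzer: compact all responses into 4 bits."""
    sig = 0
    for p in patterns:
        sig = lfsr_step(sig) ^ circuit(p)
    return sig


patterns = lfsr_patterns()
golden = signature(good_circuit, patterns)
print("fault detected:", signature(faulty_circuit, patterns) != golden)
```

Only the final signature is compared with the golden value, so very little response data has to be stored or sent off-chip. The price of compaction is aliasing: a faulty response stream can occasionally produce the golden signature.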


BIST Architecture


Note: BIST cannot test the following wires and transistors:

From PI pins to input MUX

From POs to output pins

High-speed ATE or boundary scan required for these paths.


Bed-of-Nails Tester Concept



Motivation for Boundary Scan Standard


Bed-of-nails printed circuit board tester is gone.

We put components on both sides of the PCB and replaced DIPs with flat packs to reduce inductance: nails would hit components.

Reduced spacing between PCB wires: nails would short the wires.

PCB tester must be replaced with a built-in test delivery system; JTAG does that.

Need a standard system test port and bus.

Integrate components from different vendors.

One chip has test hardware for other chips.

Test bus identical for various components


Boundary Scan


Scan design applied to I/O buffers of chip.

Used for testing interconnect on PCB.

Provides access to internal DFT capabilities.

IEEE standard 4-wire (TCK, TMS, TDI, TDO) Test Access Port (TAP)

[Figures: boundary-scan cell (BSC) and operation modes; bidirectional buffer with BSCs]


Boundary Scan Architecture


1. Instruction sent (serially) through TDI into instruction register.

2. Selected test circuitry configured to respond to the instruction.

3. Test pattern shifted into selected data register and applied to logic to be tested

4. Test response captured into some data register

5. Captured response shifted out; new test pattern shifted in simultaneously

6. Steps 3-5 repeated until all test patterns are applied.
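
Step 5 relies on the fact that a serial register delivers its old contents at TDO while new data enters at TDI. The sketch below models only that shift behavior; the TAP controller state machine and instruction decoding are deliberately left out, and the bit values are made up.

```python
def shift_register(chain, tdi_bits):
    """Shift bits into a scan chain, one bit per clock.

    chain:    current register contents, index 0 is the cell nearest TDI
    tdi_bits: bits presented at TDI, first bit first
    Returns (new chain contents, bits seen at TDO in order of arrival).
    """
    chain = list(chain)
    tdo_bits = []
    for bit in tdi_bits:
        tdo_bits.append(chain[-1])    # cell nearest TDO leaves first
        chain = [bit] + chain[:-1]    # everything moves one cell toward TDO
    return chain, tdo_bits


captured_response = [1, 0, 1, 1]      # hypothetical captured values
next_pattern = [1, 0, 0, 0]           # hypothetical next test pattern
chain, tdo = shift_register(captured_response, next_pattern)
print("shifted out:", tdo)
print("register now holds:", chain)
```

After as many clocks as there are cells, the whole response has been shifted out and the register holds the next pattern, with the first bit shifted in ending up in the cell nearest TDO.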


SoC Testing


SoC testing is a composite test comprised of individual tests for each core, user-defined logic (UDL) tests, and interconnect tests.

To avoid cumbersome format translation for IP cores, SoC and core development working groups such as virtual socket interface alliance (VSIA) have been formed to propose standards.

IEEE 1500 standard has been announced to facilitate SoC testing.

IEEE 1500 specifies interface standard which allows cores to fit quickly into virtual sockets on SoC.


Testing Responsibilities of Participants


Core creators (VC providers): testable core design; provide access and isolation mechanisms; generate test benches and fault coverage.

EDA vendors: ATPG tools; DFT (testability insertion) tools for scan and BIST; fault grading tools; testability integration (boundary scan, BIST) tools.

Core integrators (VC integrators): provide access to individual cores; isolate cores during testing; test interconnect between cores; test SoC as a whole; test scheduling and integration.

Core-based system manufacturers (fabs): apply SoC-level test; debug and diagnosis.


SoC Test Problems/Requirements


Mixing technologies: logic, processor, memory, analog. Need various DFT/BIST/other techniques.

Embedded memories could make up 80% of an SoC design. Need for advanced memory test techniques such as built-in self-diagnosis (BISD) and built-in self-repair (BISR) of memory defects.

The 10% of an SoC design that contains analog circuits could contribute 90% of the total test cost during manufacturing test.

Deeply embedded cores (not always directly accessible from chip I/Os): need a test access mechanism (TAM). TAMs impact test time and test cost.

Chips and bare dies vs. embedded cores:

Chips and bare dies | Embedded cores
manufactured and tested | not manufactured
access available | access mechanism needed
replaceable elements | not replaceable


SoC Test Problems/Requirements (cont’d)


Generally, core users cannot access core netlists and insert DFT circuits. Core users rely on test patterns supplied by core vendors that guarantee a specific fault coverage.

Patterns must be applied to cores in a given order, using a specific clocking strategy. Make sure that undesirable test patterns and clock skews are not introduced into test streams.

Hierarchical core reuse: need hierarchical test management.

Different core providers and SoC test developers: need a standard for test integration.

Analog and mixed-signal core testing must be dealt with. Failure mechanisms and test requirements are less well known than for digital cores.

IP protection/test reuse: need a core test standard and documentation.

Higher-performance core pins than SoC pins: need on-chip, at-speed testing.


Analog Test Issues


Analog signals are tested within specified bands/limits.

Analog circuit performance is sensitive to process variations (parameter variation, correlation, mismatch, noise, …). This prevents the abstraction of a standard analog fault model if the manufacturing process changes.

Lack of analog DFT methodologies.

Specification-driven testing of analog circuits, manual test generation, and lack of robust EDA tools, resulting in long test development times.

The yield vs. defect level trade-off in analog testing is nondeterministic, because most analog faults do not result in catastrophic failure.

Analog test results are affected by noise and measurement accuracy.

Also, pass/fail results do not provide any diagnosis capability and are inadequate, preventing the use of analog BIST methods.

Further complicated because any intrusion of a DFT circuit into an analog circuit affects the circuit’s performance. Thus, it is difficult to develop analog BIST and DFT methods.


Test-Wrapper for a Core


Core-vendor supplied tests must be applied to embedded cores.

Test-wrapper: logic added around a core to provide test access to embedded core.

Test-wrapper provides:

For each core input terminal:

A normal mode: core terminal driven by host chip

An external test mode: wrapper element observes core input terminal for interconnect test

An internal test mode: wrapper element controls state of core input terminal for testing the logic inside the core

For each core output terminal:

A normal mode: host chip driven by core terminal

An external test mode: host chip is driven by wrapper element for interconnect test

An internal test mode: wrapper element observes core outputs for core test
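
The three modes can be summarized as a small behavioral model. This is a sketch, not an IEEE 1500 wrapper cell: the function names are illustrative, and where the slide does not say what the other side sees while a value is being observed, the code assumes the functional value passes through (marked in the comments).

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"
    EXTERNAL_TEST = "external test"   # interconnect test
    INTERNAL_TEST = "internal test"   # core test


def input_cell(mode, host_value, wrapper_value):
    """Wrapper element on a core input terminal.

    Returns (value applied to the core input, value captured by the
    wrapper element, or None if nothing is captured).
    """
    if mode is Mode.NORMAL:
        return host_value, None
    if mode is Mode.EXTERNAL_TEST:
        # Wrapper observes what the interconnect delivers.
        # Assumption: the core input still sees the host value.
        return host_value, host_value
    return wrapper_value, None        # wrapper controls the core input


def output_cell(mode, core_value, wrapper_value):
    """Wrapper element on a core output terminal.

    Returns (value driven to the host chip, value captured by the
    wrapper element, or None if nothing is captured).
    """
    if mode is Mode.NORMAL:
        return core_value, None
    if mode is Mode.EXTERNAL_TEST:
        return wrapper_value, None    # wrapper drives the interconnect
    # Wrapper observes the core response.
    # Assumption: the host still sees the core value.
    return core_value, core_value


print(input_cell(Mode.INTERNAL_TEST, host_value=0, wrapper_value=1))
print(output_cell(Mode.EXTERNAL_TEST, core_value=0, wrapper_value=1))
```

Input and output cells mirror each other: in internal test the input cells control and the output cells observe, while in external test the roles are swapped so that the wiring between cores is exercised.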


DFT Architecture for SoC


Required structural elements:

1. Test pattern source and sink

2. TAM

3. Core test wrapper


DFT Components


Test source: Provides test vectors via on-chip LFSR, counter, ROM, or off-chip ATE.

Test sink: Provides output verification using on-chip signature analyzer, or off-chip ATE.

Test access mechanism (TAM): User-defined test data communication structure; carries test signals from source to module, and module to sink;

tests module interconnects via test-wrappers;

TAM may contain bus, boundary-scan and analog test bus components.

Test controller: Boundary-scan test access port (TAP); receives control signals from outside;

serially loads test instructions in test-wrappers.


More SoC Test Problems/Requirements


Once the TAM and test translation mechanism (test wrapper) are determined, the major challenge for the system integrator is test scheduling:

the order in which the various core tests and the tests for user-designed interface logic are applied.

Test scheduling must consider several conflicting factors: SoC test time minimization,

Resource conflicts due to sharing of TAMs and on-chip BIST engines,

Precedence constraints among tests,

Power constraints.

External ATE inefficiency: need “on-chip ATE”.

Test power must be considered: need lower-power design or test scheduling.

Testable design automation: need new testable design tools and flow.

Embedded Core Test Standard -- 1500

System Overview of IEEE 1500 Standard

Core with the IEEE 1500 Wrapper

TAM Architectures for Parallel Access

Comparison between 1149.1 and 1500



A System Overview of IEEE 1500 Standard


Most important feature: provision of a “wrapper” on the boundary (I/O terminals) of each core to standardize the test interface of the core.

WSP (Wrapper Serial Port): set of I/O terminals of the wrapper for serial operations.

WSP = WSI + WSO + WSC: WSI (Wrapper Serial Input), WSO (Wrapper Serial Output), and several WSC (Wrapper Serial Control) terminals.

WIR (Wrapper Instruction Register):

stores the instruction to be executed in the corresponding core,

controls operations in the wrapper, including accessing the WBR (Wrapper Boundary Register), WBY (Wrapper Bypass Register), or other user-defined function registers.

WBR consists of WBCs (Wrapper Boundary Cells).

Architecture of an SoC with N cores, each wrapped by an IEEE 1500 wrapper:


A Core with the IEEE 1500 Wrapper


WSP supports a serial test mode similar to the boundary-scan (BS) architecture, but without a TAP controller.

Serial control signals of 1500 are applied directly to the cores, and hence provide more test flexibility.

E.g., supports delay testing, which requires a sequence of test patterns to be applied consecutively to a core.

Optional parallel test mode with a user-defined, parallel TAM.

Transports test signals in parallel, reducing test time.


TAM Architectures for Parallel Access


Different architectures can be implemented in TAM for providing parallel access to control and test signals (input and output) via wrapper parallel port (WPP):

a) Multiplexed access:

cores time-share the test control and data ports

b) Daisy-chained access:

output of one core is connected to the input of the next core

c) Direct access to each core (distribution architecture)


Pins in 1500 Standard


A chip with 1500-wrapped cores may use the same 4 mandatory pins as in the IEEE 1149.1 standard.

An on-chip test controller with the capability of the TAP controller in the boundary-scan standard can be used to generate the WSC for each core.

It can also be used to deal with the testing of hierarchical cores in a complex system.

Such a controller is neither required nor suggested by the 1500 standard.


Comparison between 1149.1 and 1500


Feature | 1149.1 | 1500
Purpose | Board-level | Core-based
Parallel mode | No | Yes
Extra data/control I/Os | Mandatory: TDI, TDO, TMS, TCK. Optional: TRST | Mandatory: WSI, WSO, 6 WSC. Optional: TransferDR, WPP, AUXCKn(s)
FSM | Yes | No
Transfer mode | No | Yes
Latency between operations | Yes | No
Mandatory instructions | EXTEST, BYPASS, SAMPLE, PRELOAD | WS_EXTEST, WS_BYPASS, one Wx_INTEST, WS_PRELOAD (cond. required)

NoC Testing

Special Features of NoC Testing

NoC vs. SoC Design and Test

Testing of Routers

Network Interface Testing

SoC Examples

Network-on-Chip Processor (Cell Processor)

SoC Testing for PNX8550 System Chip

NoC Testing for High-End TV System



Special Features of NoC Testing


Testing an NoC-based system = testing embedded cores + on-chip network.

Greatest difference between NoC and SoC testing: test access mechanism design.

The on-chip network of an NoC can be reused as a TAM for test packet delivery.

Theoretically, no dedicated TAM interconnects are needed.

Test time can be reduced by network reuse even under power constraints, with minimized pin count and area overhead.

Generally, more cores can be tested in parallel than TAM-based SoC testing, due to large NoC channel bandwidth.

Key point: how to utilize on-chip network as a TAM without compromising fault coverage or test time.

Research on NoC testing is still immature compared to industrial needs, and further research and development are needed.


SoC Design and Test


Design

Communication infrastructure is becoming the new bottleneck: wire delay, signal integrity, power dissipation, area vs. speed.

New interconnection schemes needed.

Test

Test of SoC is well understood: TAM, wrapper, test scheduling, IEEE 1500.

Test needs dedicated hardware

Hardware for mission-mode communication cannot be reused for testing.


NoC Design and Test


Design

High performance: high bandwidth, low signal delay

Reasonable overhead

Suitable for large number of cores

Network design is versatile

Possible next-generation SoC paradigm

Test

Test of NoC has not received much attention: core testing, router and interconnection testing, test wrapper design, test scheduling.

No need for dedicated TAMs

Network can be reused for testing


Testing of Routers


Routers are used to implement functions of flow control, routing, switching and buffering of packets.

Router testing can be treated as sequential circuit testing.

Due to NoC’s regularity, test pattern broadcasting can be applied to reduce test time. Since all routers are identical, all can be tested in parallel by test pattern broadcasting.


Testing Routers


Testing a router consists of testing the control logic (routing, arbitration, and flow control modules) and FIFO buffers.

Control logic can be tested by typical sequential circuit testing methods such as scan testing.

A smart way to test FIFO is to configure the first register of FIFO as part of a scan chain, and others can be tested through this scan chain.

Since all routers are identical, all can be tested in parallel by test pattern broadcasting.

Comparator is implemented by XOR gates.
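
A sketch of test pattern broadcasting with XOR comparison. The same pattern goes to every router, and each response is XORed with that of a reference router; any nonzero result flags a mismatch. The router behavior (a 4-bit rotate), the injected stuck-at fault, and the choice of router 0 as the reference are all assumptions made for illustration.

```python
def router_response(pattern, stuck_bit=None):
    """Hypothetical router logic: rotate a 4-bit value left by one.

    stuck_bit: index of an output bit stuck-at-0, or None if fault-free.
    """
    response = ((pattern << 1) | (pattern >> 3)) & 0xF
    if stuck_bit is not None:
        response &= ~(1 << stuck_bit) & 0xF
    return response


def broadcast_and_compare(patterns, routers):
    """Broadcast each pattern to all routers and XOR-compare responses.

    routers: one entry per router, giving its stuck_bit (None = good).
    Returns the indices of routers whose response ever differs from
    router 0, which serves as the reference here.
    """
    mismatched = set()
    for pattern in patterns:
        responses = [router_response(pattern, sb) for sb in routers]
        for index, response in enumerate(responses[1:], start=1):
            if responses[0] ^ response:   # XOR comparator: nonzero = mismatch
                mismatched.add(index)
    return mismatched


# Four identical routers; router 2 has output bit 2 stuck-at-0.
print(broadcast_and_compare(range(16), [None, None, 2, None]))
```

Comparing against a single reference is the simplest arrangement but cannot tell which side is wrong if the reference itself is faulty; a practical scheme would need additional comparisons or a diagnosis step to resolve that.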


Router Test Wrapper Design and Test


IEEE-1500 compliant test wrapper is designed to support test pattern broadcasting and test response evaluation.

All SC1 chains of these routers share the same set of test patterns.

All Din[0] (Din-R0[0], …, Din-Rn[0]) data inputs share the same test patterns.

Wrapper also supports test response comparison for scan chains/data outputs.

Diagnosis control block activates diagnosis.

Small HW overhead and small number of test patterns due to test broadcasting.

Small test application time using multiple, balanced scan chain and test broadcasting.

The method is scalable.


Network Interface Testing


Network interface (NI) is used to: receive data bits from its corresponding IP core (router),

packetize (depacketize) the bits,

perform clock domain conversions between the router and the core.

The NI might be the most difficult component to test in an on-chip network, because clock domain conversion introduces non-deterministic device behavior. This undermines conventional stored-response testing.

Testing an NoC-based system by separating core testing from on-chip network testing is inadequate. Interactions between cores and the on-chip network must be tested using extensive functional testing.

Interactions between on-chip network components (routers, interconnects, and NIs) must be thoroughly tested by functional testing as well.


SoC Example: PNX8550 System Chip


PNX8550: a chip based on the Nexperia digital video platform by Philips.

Fabricated using a 0.13 µm process with 6 metal layers, 1.2 V supply; die size is 100 mm².

Entire chip contains 62 logic cores (5 hard, 57 soft), 212 memory cores, and 94 clock domains.

Five hard cores: one MIPS CPU, two VLIW TriMedia CPUs, a custom analog block (PLLs and DLLs), and a D-to-A converter.

Soft cores: MPEG decoder, UART, PCI 2.2 bus interface, …

All 62 logic cores are partitioned into 13 chiplets.

Chiplet: a group of cores placed together and connected to a specific set of TAM wires.


PNX8550 Structure and Test Methods


Two device control and status (DCS) networks enable each processor to control/observe on-chip modules.

A bridge is used to allow both DCS networks to communicate.

CPUs and many modules have access to external memory via a high-speed memory access network.

PNX8550 allows test reuse through test wrappers (TestShell), and test access mechanism (TestRail).

Test methods:

random logic: full scan test with 99% stuck-at fault coverage,

small embedded memories: scan test,

large memories: BIST.


PNX8550 Test Strategies


There are 140 TAM wires (i.e., 280 chip pins) for the entire chip.

Design issues: how to assign these TAM wires to different cores and how to design the wrapper for each core.

Requirement: each channel must provide 28M of test data volume and test application time must be minimized.

Philips developed a tool called TR-ARCHITECT to deal with these core-based testing requirements.

TR-ARCHITECT requires two different kinds of inputs: an SoC data file and a list of user options.

SoC data file: SoC parameters such as number of cores in the SoC, number of test patterns and number of scan chains in each core.

User options: test choices such as number of SoC test pins, type of modules (hard or soft), TAM type (test bus/test rail), architecture type (daisy chain, distribution, or hybrid), test schedule type (serial or parallel for daisy chain), and external bypass per module (yes/no).


TAM Wires Distribution and Test Architecture


Distribution of the 140 TAM wires to the 13 chiplets was done manually, because TR-ARCHITECT became available only halfway through the PNX8550 design process.

Assignment of TAM wires for a chiplet ranges from 2 to 21.

Next step is to design the test architecture inside each chiplet.

Distribution test architecture is used for all except two chiplets: UMDCS and UTDCS.

For these two chiplets (hybrid test architecture), some wires are shared by two or more cores using daisy chain; some cores are connected by distribution architecture.


Test Architecture Design for Each Chiplet


Test architecture design is trivial if the chiplet under consideration has only one core.

The test wrapper of the core can be designed based on the TAM wires assigned and the core parameters.

For a chiplet containing multiple cores and using distribution test architecture, TR-ARCHITECT determines the number of TAM wires assigned to each core and designs the test wrapper for the core.

For both chiplets with hybrid test architecture, TR-ARCHITECT determines:

number of TAM-wire groups,

width assigned to each group,

assignment of cores to each group,

designs the test wrapper for each core.


NoC Testing for High-End TV Companion Chip by Philips


A high-end TV system with two chips: main chip (PNX8550, discussed above),

companion chip (implementing more advanced technologies that will not be released to competitors) [Steenhof 2006].


Main TV Chip and Companion Chip


Main TV chip (PNX8550) controls the entire system and interacts with users, TV sources, TV display, peripherals, and configuration of the companion chip.

Companion chip contains nine IP blocks for enhancing video quality.

Main and companion chips have their own dedicated interconnect structures and are connected using a high-speed external link (HSEL).

Advantages of partitioning a complex system into main and companion chips: reducing development risk (because of implementing smaller and less complex chips),

managing different innovation rates in different market segments,

encapsulating different functionality.


Companion Chip: NoC Implementation


On-chip network contains routers (R), interconnects, and network interface (NI). Each NI contains one kernel (K), one shell (S), and several ports.

Mainly, it is a 2x2 mesh NoC.

Numbers of master (M) and slave (S) ports are indicated in each NI.

Ports are connected to IPs of processors, DSPs, or memory arrays.

New HSEL is used to attach another companion chip (e.g., FPGA).


Test Methods for Philips’ AEthereal NoC


AEthereal NoC is configured at run time with the required task diagram.

NoC structure offers great flexibility and reuse potential for the companion chip, at the price of: increased area (4%),

larger power consumption (12%),

and larger latency (10%).

Test methods for the AEthereal NoC architecture can be found in [Vermeulen 2003].

On-chip network can be treated as a core for testing;

Knowledge about the on-chip network can be used to enhance the standard core-based test approach to get better results.

For example, all identical blocks (e.g., all routers) can be tested by test broadcasting, while test responses can be compared to each other and any mismatch will be sent off-chip.


Test Methods for Philips’ AEthereal NoC (cont’d)


Timing test is extremely important because:

1) Long wires in the NoC may cause crosstalk errors. These can be dealt with by [Grecu 2006]:

C. Grecu, P. Pande, A. Ivanov, and R. Saleh, “BIST for network-on-chip interconnect infrastructures,” in Proc. IEEE VLSI Test Symp., pp. 30–35, April 2006.

2) Clock boundaries between cores are in the NIs, and timing errors can occur there. This is still waiting for a good solution.

Once the on-chip network is fully tested, it can be used to transfer data for core testing.

No TAM wires are required for testing, and NoC is fully reused for core testing.

NoC structure also supports parallel testing if channel capacity can support parallel data transportation with a specific power budget.


Concluding Remarks on SoC/NoC Testing


Modular test techniques for digital, mixed-signal, and hierarchical SoCs must be developed further to keep pace with technology advances.

Test data bandwidth needs for analog cores are very different from digital cores, and unified top-level testing of mixed-signal SoCs remains a major challenge.

Research is also needed to develop wrapper design techniques and test planning methods for multi-frequency core testing. 1500 standard doesn’t address wrapper design for at-speed testing of such cores.

Key point: how to utilize on-chip network as a TAM without compromising fault coverage or test time.

Research on NoC testing is still immature compared to industrial needs, and further research and development are needed. Currently there is limited support for various network topologies, routing strategies, …

Wrapper design techniques for SoC testing can be adopted by NoC-based systems.


Moore’s Law and Test Challenges


Moore’s law: the number of transistors integrated per square inch will double approximately every 18 months.

To keep up with Moore’s law: die size increases, feature size decreases, gate delay decreases, interconnect delay increases.

To reduce interconnect delay, interconnects are made wider and taller. This causes crosstalk noise between adjacent lines due to capacitive and inductive coupling (called the signal integrity problem). This is very difficult to test.

Power integrity: clock frequency increases and supply voltage decreases; the power supply voltage can drop by L(di/dt). This is very difficult to test.

Process variation: precise control of the Si process is becoming more difficult. Example: it is hard to control the effective channel length of a transistor. Power and delay exhibit large variability. This is hard to detect.
