Post on 29-Mar-2015

EE 587: SoC Design & Test

Partha Pande
School of EECS
Washington State University
pande@eecs.wsu.edu

SoC Physical Design Issues

Interconnect Architectures and Signal Integrity

Design Challenges

1. Non-scalable global wire delay

2. Moving signals across a large die within one clock cycle is not possible.

3. Current interconnection architecture: buses are inherently non-scalable.

4. Transmission of digital signals along wires is not reliable.

Bus – non scalability

Clock cycle depends on the parasitics and the bus length

Multiple bus segments

• More than one design iteration

• Converges to a network

Bus Architectures

Split Bus Architecture

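\[
E = 0.5\,V^{2}\,sw \Big[ C_{BUS1} \sum_{i \in BUS1} \sum_{j \in BUS1,\, j \neq i} xfer(M_i, M_j)
  + C_{BUS2} \sum_{i \in BUS2} \sum_{j \in BUS2,\, j \neq i} xfer(M_i, M_j)
  + (C_{BUS1} + C_{BUS2}) \sum_{i \in BUS1} \sum_{j \in BUS2} \big( xfer(M_i, M_j) + xfer(M_j, M_i) \big) \Big]
\]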

Achievable Clock Cycle in a Bus segment

Minimize Power Consumption

Modification of interconnect architectures

• Incorporate parallelism (ITRS 2003 & ISSCC 2004)
• Decoupling of communication and processing
• Modular architecture
• Minimize use of global wires
• Locality in communication

SoC Micro architecture Trend

• 50-100K gates per block: no global wire delay problem.
• Block-based hierarchical design style that uses block sizes of 50-100K gates.
• Single synchronous clock regions will span only a small fraction of the chip area.
• Different self-synchronous IPs communicate via network-oriented protocols.
• Structured network wiring leads to deterministic electrical parameters, which reduces latency and increases bandwidth.
• Failures due to the inherently unreliable physical medium can be addressed by introducing error correction mechanisms.

New design paradigm

• New designs: very large number of functional blocks

• Moving bits around efficiently

• Develop on-chip infrastructure to solve future inter-block communication bottlenecks

Development of infrastructure IPs

• SoC = (ΣFIP + ΣI²P)

Silicon Backplane

MIPS SoC-it

The network-on-chip paradigm

Driven by

• Increased levels of integration
• Complexity of large SoCs
  – New designs counting 100s of IP blocks
• Need for platform-based design methodologies
• DSM constraints (power, delay, time-to-market, etc.)
• Decoupling of functionality from communication
• Dedicated infrastructure for data transport

[Figure: bus-based SoC compared with an NoC infrastructure. Bus-based labels: high-bandwidth memory interface, high-performance ARM processor, high-bandwidth ARM processor, DMA bus master, bridge, UART, PIO, keypad, timer, AHB, APB. NoC labels: switch, link.]

NoC Features

Some Common Architectures

(a) Mesh, (b) Folded-Torus (FT) and (c) Butterfly Fat Tree (BFT)

[Figure: panels (a), (b) and (c); legend: functional IP, switch]

Data Transmission

• Packet-based communication
• Low memory requirement
• Packet switching
• Wormhole routing
• Packets are broken down into flow control units, or flits, which are then routed in a pipelined fashion
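The flit breakdown described above can be sketched as follows. This is a minimal illustration, not the course's implementation: the `Flit` class, the header/body/tail labels, the 4-byte flit size and the destination-only header are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Flit:
    kind: str       # "header", "body" or "tail" (assumed labels)
    payload: bytes


def packetize(dest: int, data: bytes, flit_bytes: int = 4) -> List[Flit]:
    """Split one packet into a header flit followed by fixed-size data flits."""
    # Assumption: the header flit carries only the destination address.
    flits = [Flit("header", dest.to_bytes(flit_bytes, "big"))]
    chunks = [data[i:i + flit_bytes] for i in range(0, len(data), flit_bytes)]
    for n, chunk in enumerate(chunks):
        # The last data flit is marked as the tail so a switch can release the path.
        kind = "tail" if n == len(chunks) - 1 else "body"
        flits.append(Flit(kind, chunk))
    return flits
```

In wormhole routing only the header flit needs a routing decision; the remaining flits follow it through the switches in a pipelined fashion, which is why a switch needs to buffer only a few flits rather than a whole packet.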

Connecting Different IP Blocks Using Tree Architecture

Communication Pipelining

• Need to constrain the delay of each stage within 15 FO4

Signal Integrity

According to ITRS, signal integrity will become a major issue in future technologies

Causes of this inherent unreliability:
• Shrinking geometries and layout dimensions
• Reduction in the charge used for storing bits
• Increased probability of transient events such as:
  – Crosstalk
  – Ground bounce
  – Alpha particle hits

Micro network Protocol Stack

On Chip Signal Transmission

• Future global wires will function as lossy transmission lines
• Reduced-swing signaling
• Noise due to crosstalk, electromagnetic interference, and other factors will have increased impact.
• It will not be possible to abstract the physical layer of on-chip networks as a fully reliable, fixed-delay channel
• At the micro network stack layers atop the physical layer, noise is a source of local transient malfunctions.

Coding Schemes

• Low-Power Coding
  – Reducing self-transition activity
• Crosstalk Avoidance Coding
  – Reducing coupling with adjacent lines
• Error Control Coding
  – SEC, SECDED

Low Power Coding

• Reduction of self-transition activity
• Bus-Invert Code
  – Data is inverted and an invert bit is sent to the decoder if the current data word differs from the previous data word in more than half the number of bits
  – Effectiveness decreases with increasing bus width
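A minimal sketch of the bus-invert rule stated above. Two assumptions are made for the example: the comparison is against the value last driven onto the bus lines (the usual formulation of bus-invert coding), and the bus starts in the all-zero state. The function names are illustrative.

```python
def bus_invert_encode(words, width):
    """Return a list of (bus_value, invert_bit) pairs for a sequence of data words."""
    mask = (1 << width) - 1
    prev_bus = 0                      # assumed initial state of the bus lines
    encoded = []
    for w in words:
        toggles = bin((w ^ prev_bus) & mask).count("1")
        if 2 * toggles > width:       # more than half of the lines would switch
            bus, inv = (~w) & mask, 1
        else:
            bus, inv = w & mask, 0
        encoded.append((bus, inv))
        prev_bus = bus
    return encoded


def bus_invert_decode(encoded, width):
    """Undo the inversion wherever the invert bit is set."""
    mask = (1 << width) - 1
    return [((~bus) & mask) if inv else bus for bus, inv in encoded]
```

With this rule at most half of the data lines toggle per transfer, at the cost of one extra wire for the invert bit.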

Error Control Coding

• Linear block codes
  – In an (n, k) linear block code, a data block k bits long is mapped onto an n-bit code word
• Forward Error Correction or Automatic Repeat Request
• Redundant wires
• Possibility of voltage reduction
• Energy efficiency is an important criterion
• Codec overhead
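As a small illustration of an (n, k) linear block code used for single error correction (SEC), the sketch below uses the classic (7, 4) Hamming code in systematic form. The code choice and the bit ordering are assumptions for the example; the slides do not prescribe them.

```python
# Columns of the parity-check matrix H, one per codeword position
# [d1, d2, d3, d4, p1, p2, p3]. All seven columns are non-zero and distinct.
H_COLUMNS = [(1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1),
             (1, 0, 0), (0, 1, 0), (0, 0, 1)]


def hamming74_encode(data):
    """Map 4 data bits onto a 7-bit code word (k = 4, n = 7)."""
    d1, d2, d3, d4 = data
    return [d1, d2, d3, d4, d1 ^ d2 ^ d4, d1 ^ d3 ^ d4, d2 ^ d3 ^ d4]


def hamming74_decode(received):
    """Correct up to one flipped bit and return the 4 data bits."""
    r = list(received)
    syndrome = tuple(
        sum(col[row] & bit for col, bit in zip(H_COLUMNS, r)) % 2
        for row in range(3)
    )
    if syndrome != (0, 0, 0):
        # A single error produces the H column of the flipped position.
        r[H_COLUMNS.index(syndrome)] ^= 1
    return r[:4]
```

Forward error correction of this kind fixes the error at the receiver; an ARQ scheme would instead only detect it and request retransmission.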

Worst Case Crosstalk

Transition from 101 to 010 pattern or vice versa

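Due to the Miller capacitance, the worst-case capacitance between adjacent wires becomes (1+4λ)CL

[Figure: victim wire between aggressor wire 1 and aggressor wire 2; the victim switches from 1 to 0 while both aggressors switch from 0 to 1; victim and aggressor rise times are marked]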
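The worst-case figure can be reproduced with a simple three-wire model. This is a sketch under stated assumptions: λ is taken to be the ratio of the coupling capacitance to the bulk capacitance CL (the slides do not define it explicitly), and each neighbour contributes λ·CL times the difference between its transition and the victim's transition.

```python
def effective_capacitance(victim, left, right, lam, c_l=1.0):
    """Effective load seen by a switching victim wire.

    Each argument victim/left/right is a transition: +1 rising, -1 falling,
    0 quiet. lam is the assumed coupling-to-bulk capacitance ratio.
    """
    if victim == 0:
        raise ValueError("the victim wire must be switching")
    coupling = abs(victim - left) + abs(victim - right)
    return c_l * (1 + lam * coupling)
```

For the 101 to 010 pattern the victim falls while both neighbours rise, so `effective_capacitance(-1, +1, +1, lam)` gives (1+4λ)CL; if the neighbours switch in the same direction as the victim the coupling term vanishes and the load is just CL.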

Joint Crosstalk Avoidance and Single Error Correction Codes

Reduce crosstalk as well as correct errors due to other transient events

• Duplicate Add Parity (DAP)
• Dual Rail Code (DR)
• Boundary Shift Code (BSC)
• Modified Dual Rail Code (MDR)

Worst case crosstalk capacitance is reduced to (1+2λ)CL

Duplicate-Add-Parity Code

• Each bit is duplicated
• A parity bit from one copy is computed
• Same as Dual Rail Code
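A minimal sketch of a DAP encoder and decoder following the description above. The wire ordering (the two copies of each bit placed next to each other, parity last) and the decoder rule (keep the copy whose recomputed parity matches the received parity bit) are assumptions for the example.

```python
def dap_encode(bits):
    """k data bits -> 2k+1 wires: each bit duplicated, then one parity bit."""
    parity = 0
    for b in bits:
        parity ^= b
    word = []
    for b in bits:
        word += [b, b]            # adjacent copies always switch together
    return word + [parity]


def dap_decode(word):
    """Recover the data bits in the presence of at most one bit error."""
    copy_a = word[0:-1:2]         # first copy of every bit
    copy_b = word[1:-1:2]         # second copy of every bit
    recomputed = 0
    for b in copy_a:
        recomputed ^= b
    # If copy A is consistent with the received parity it is error-free;
    # otherwise the single error hit copy A or the parity bit, so use copy B.
    return list(copy_a if recomputed == word[-1] else copy_b)
```

Because every data bit has an identical neighbour on one side, opposing transitions can occur on at most one side of a wire, which is where the reduction of the worst case from (1+4λ)CL to (1+2λ)CL comes from.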

Crosstalk Avoidance Double Error Correction Code (CADEC)

The 32-bit flit is Hamming coded and then an overall parity is calculated

All bits apart from the overall parity are duplicated

The 32 bit original flit becomes 77 bits

Minimum Hamming distance is 7

Worst case crosstalk capacitance is reduced to (1+2λ)CL

[Figure: CADEC encoder. The 32-bit input is (38,32) Hamming encoded; the 38 code bits go through DAP duplication to form bits 0 to 75, and the overall parity is bit 76, giving a 77-bit output.]
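The 32, 38 and 77 bit counts follow from simple arithmetic, sketched below. The helper names are illustrative; the only fact used beyond the slides is the standard Hamming bound that r check bits can protect k data bits when 2^r >= k + r + 1.

```python
def hamming_parity_bits(k):
    """Smallest number of check bits r with 2**r >= k + r + 1."""
    r = 1
    while 2 ** r < k + r + 1:
        r += 1
    return r


def cadec_length(k):
    """Wires needed by CADEC for a k-bit flit."""
    n_hamming = k + hamming_parity_bits(k)   # 32 -> 38 for the (38,32) code
    return 2 * n_hamming + 1                 # duplicate all Hamming bits, add overall parity


print(hamming_parity_bits(32), cadec_length(32))
```

The minimum distance of 7 is consistent with this construction: two Hamming code words differ in at least 3 positions, duplication turns that into 6, and for an odd number of differing positions the overall parity bit differs as well.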

Energy Savings with Joint Codes

Due to increased error resilience lower noise margins can be tolerated and hence operating voltage can be reduced

Coding adds overhead in terms of extra wires and codec

Voltage Swing Reduction for CADEC

[Plot: voltage V (0.4 to 1) versus word error rate (10^-20 to 10^-10) for ED, DAP and CADEC]

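The probability of word error for DAP, where ε is the bit error probability:

\[ P_{DAP} \approx \frac{3k(k+1)}{2}\,\varepsilon^{2} \]

[Equation: word error probability P_CADEC, an expression in the code length n; the remaining terms are not recoverable from the transcript]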

Energy Savings with CADEC

2010

Communication Pipelining

Inter- and Intra-switch stages

Pipelined Data Transfer

[Figure: pipelined data transfer. Labels: inter-switch link (three instances), encoder and decoder (two instances each), intra-switch pipelined stages (two instances)]

Latency Characteristics

[Plot: average message latency (cycles, 0 to 2000) versus injection load (0.1 to 1) for uncoded and coded links]

• The codes should be optimized

• Encoding and decoding can be merged with existing stages

• No latency penalty

Adaptive Supply Voltage Links

• Dynamic Voltage Scaling (DVS)
  – DVS schemes dynamically adjust the processor clock frequency and supply voltage to just meet the instantaneous performance requirement, making the system energy aware.
• Communication architectures display a wide variance in their utilization, depending on the communication patterns of applications
• An adaptive supply voltage link adapts its frequency and supply voltage in accordance with the instantaneous traffic bandwidth.
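A minimal sketch of how such a link could pick its operating point. Everything concrete here is hypothetical: the table of (frequency, voltage) levels, the function names and the policy of choosing the slowest level that still meets demand. The power estimate uses the usual proportionality of dynamic power to f·V².

```python
# Hypothetical operating points: (frequency in MHz, supply voltage in V), slowest first.
LEVELS = [(250, 0.6), (500, 0.8), (750, 1.0), (1000, 1.2)]


def select_level(demand_bits_per_s, link_width_bits, levels=LEVELS):
    """Pick the slowest level whose bandwidth covers the instantaneous demand."""
    for freq_mhz, vdd in levels:
        if freq_mhz * 1e6 * link_width_bits >= demand_bits_per_s:
            return freq_mhz, vdd
    return levels[-1]                 # saturated: run at the fastest level


def relative_dynamic_power(freq_mhz, vdd, levels=LEVELS):
    """Dynamic power relative to the fastest level, assuming P is proportional to f * V**2."""
    f_max, v_max = levels[-1]
    return (freq_mhz * vdd ** 2) / (f_max * v_max ** 2)
```

For example, a 32-bit link carrying 10 Gbit/s would settle on the 500 MHz, 0.8 V point in this table, at roughly 22% of the dynamic power of the fastest level.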

Repeater Insertion & Coding

• Repeater insertion reduces interconnect wire delay
• Increases power dissipation due to large drivers
• CACs reduce coupling capacitance
• Joint repeater insertion and CAC is a promising solution to reduce power in global wires

Repeater Insertion & Coding

Reference: A Low-Power Bus Design Using Joint Repeater Insertion and Coding

130 nm

Repeater Insertion & Coding

45 nm

Reliability

• Crosstalk, electromigration, material ageing, ...
• Transient failures
  – Error control coding
  – Crosstalk avoidance coding
  – Power, area trade-off
• Permanent failures
  – Spare switches and links
  – Overall routing complexity
  – Effect on system performance
