networks -on-chip and noc synthesissi2.epfl.ch/~amaru/dtis/slides_presentations/dt9.pdf · networks...

56
NETWORKS-ON-CHIP AND NOC SYNTHESIS Federico ANGIOLINI LSI EPFL 2014 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 13-Oct-14 1

Upload: doanngoc

Post on 26-Apr-2018

218 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

NETWORKS-ON-CHIPAND NOC SYNTHESIS

Federico ANGIOLINILSI EPFL2014

DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS13-Oct-14 1

Page 2: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

CORE-BASED DESIGNFederico ANGIOLINILSI EPFL2014

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 2

Page 3: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

A WIDENING PRODUCTIVITY GAP

ITRS

DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS13-Oct-14 3

Page 4: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

TWO TRENDS: CMPS AND MPSOCS

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 4

Chip MultiProcessor(CMP)

MultiProcessorSystem-on-Chip

(MPSoC)

Intel, Tilera, STM Qualcomm, Samsung, TI, Apple

Blurred boundaries…

Page 5: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

CMPS VS. MPSOCS

CMP MPSoC

Fully homogeneous Very heterogeneous

Fully programmable cores Mostly fixed-function cores

Tile-based Subsystem-based

Regular chip layout Heavily customized layout

Emphasis on ease of programming

Emphasis on power/performance/cost

For: general processing, programmable accelerators

For: embedded applications, optimized accelerators

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 5

Page 6: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

MPSOC SYSTEMS BECOME VERY COMPLEX

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 6

TI OMAP 4460, 2011

Page 7: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

INTEGRATION ISSUE ARISES

❖Many cores

❖If MPSoC, heterogeneous cores➢Different pinouts➢Different clocking (+DVFS)➢Different physical sizes➢Different power/temperature budgets➢Different communication needs

❖How to integrate, verify?

❖System interconnect

DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS13-Oct-14 7

Page 8: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

ON-CHIP INTERCONNECTSFederico ANGIOLINILSI EPFL2014

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 8

Page 9: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

TRADITIONAL ANSWER: BUSES

❖ Shared bus topology

❖ Aimed at simple, cost-effective integration

Bus

Slave 1

Master 1

Slave 3

Master 2

Slave 2

Master 3

❖ Typical example: ARM AMBA AHB➢ Arbitration among multiple masters➢ Single outstanding transaction allowed➢ If wait states are needed, everybody waits➢ Slow!

DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS13-Oct-14 9

Page 10: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

BUS EVOLUTION OCCURRED

Bus

Bus

Bus

Bus

Bus

Bus

Bus

Cros

sbar

Bus++ v2.0

DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS13-Oct-14 10

Protocolevolution

Topologyevolution

Page 11: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

STILL OUTSTANDING PROBLEMS

❖Performance scalability

❖ Ease of design and verification➢ ~2 years time to market (89% late!)

❖Physical design issues➢ Routing congestion➢ Wire propagation delay

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 11

Numetrics Management Systems

Page 12: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

AND OVERALL, THE DESIGN LOOKS LIKE…

Chip for mobile multimedia apps (under NDA)

DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS13-Oct-14 12

Page 13: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

THE RISE OF NETWORKS-ON-CHIP (NOCS)

❖ Packet-based communication

❖ NIs packetize transactions

❖ Switches route transactions across

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 13

Benini-De Micheli, 2002

Dally, 2001

Page 14: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

NOC OPERATION EXAMPLE

CPU

I/O

Net

wor

k In

terf

ace

switch

Net

wor

k In

terf

ace1. CPU request (AHB, OCP, ... pinout)

2. Packetization and transmission

3. Routing

4. Receipt and unpacketization (AHB, OCP, ... pinout)

5. Device response (if needed)

6. Packetization and transmission

7. Routing

8. Receipt and unpacketization

switch

switch

DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS13-Oct-14 14

Page 15: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

NOC HERITAGE: A LOGICAL STEP

❖ On-chip version of large area networks➢ WAN, MAN, LAN, PAN, …

❖ Next logical step in interconnect evolution

❖ Scalable, robust, efficient, well-understood

❖ Different trade-offs compared to large area networks➢ Wire parallelism is cheap vs. cables➢ Latency, power, area must be orders of magnitude lower➢ Buffers must be extremely small

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 15

Page 16: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

NOC ADOPTION TRENDS

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 16

Sonics Whitepaper, 2012

Page 17: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

HOW TO DESIGN NOCSFederico ANGIOLINILSI EPFL2014

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 17

Page 18: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

NOCS HAVE MANY DESIGN AXES

❖ How to connect things: topology

❖ How to inject and eject messages: network interface architecture (incl. packetizationpolicy)

❖ How to pass packets along: router architecture (incl. switching policy, flow control policy)

❖ Where to send packets: routing policy

DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS13-Oct-14 18

Page 19: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

NoC

NETWORK INTERFACE ARCHITECTURE (XPIPES)

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 19

ni_request

ni_response

ni_receive

ni_resendMas

ter

Slav

e

Initiator NI Target NI

Hea

der R

egPa

yloa

d Re

gH

eade

r Reg

Payl

oad

Reg

Hea

der R

egPa

yloa

d Re

gH

eade

r Reg

Payl

oad

Reg

Pending Trans Reg

FSM

Routing LUT

Pending Trans Reg

FSM

Routing LUT

FSM

FSM

Buffer Buffer

BufferBuffer

OCP

AHB

AXI

OCP

AHB

AXI

OCP

AHB

AXI

OCP

AHB

AXI

Source routing

Parametric link width

Protocol interoperab.

Page 20: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

PACKET TRANSMISSION

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 20

Header

Header

Header

Payload

Payload 1Payload 2Payload 3Payload 4…

(1) Read request

(2) Single write request or single read response

(3) Burst write request or burst read response

Initi

ator

Targ

et

Page 21: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

DATABYTE

ENABLES

DATA RESP

ATTRIBU

TES

BURSTLENGTH

BURST

SEQU

ENC

E

BURST

PRECISE

BYTEEN

ABLES

CO

MM

AN

D

ADDRESS SENDER ROUTE

BURST

INC

REMEN

T

ID

SEND

ER WID

TH

ROUTE

TYPEID SENDER

SEND

ER WID

TH

request header

request payload

response header

response payload

INTEROPERABLE PACKET FORMAT

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 21

Page 22: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

ROUTER ARCHITECTURE (XPIPES)

❖ Input and/or output buffering

❖ Wormhole switching

❖ Supports multiple flow control policiesDESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS13-Oct-14 22

Crossbar

AllocatorArbiter

RoutingFlow Control

Page 23: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

DEADLOCKS❖ Routing level

❖ Message level

Initiator Target

Request

Response

DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS13-Oct-14 23

Page 24: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

APPLICATION-SPECIFICNOC DESIGN

Federico ANGIOLINILSI EPFL2014

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 24

Page 25: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

CMP VS. MPSOC INTERCONNECTS

CMP MPSoC

Regular interconnect (mesh, torus) Custom topology

Unpredictable workload Mostly predictable workloads

Synthetic workloads may be representative

All usage scenarios must be checked

Emphasis on routing, bisection bandwidth

Emphasis on latency, simulated performance

Homogeneous NoC architecture Locally optimized architecture

Requires cache coherence Requires core interoperability

Regular, short wiring Irregular, long wiring

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 25

Page 26: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

MANY CHALLENGES TO TACKLE

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 26

Sonics Whitepaper, 2012

Page 27: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

…WHICH IS EXPENSIVE

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 27

Sonics Whitepaper, 2012

MPSoC architects spend ~28% of time defining the NoC

Company A: € 76 M (profitable)Company B: € 56 M (maybe bankrupt)

3 months

Page 28: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

STANDARD DESIGN FLOW

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 28

DesignRequirements

HardwareDescription

SynthesisPlacement

Routing

ProcessorsMemoriesI/ONoC Inter-connect

■ Designer picks NoC with expertise + guesswork■ NoC is iterated on multiple times based on simulation and

synthesis feedback■ Electronic Design Automation (EDA) Tools exist but most

difficult work is manual

Page 29: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

AN AUTOMATED DESIGN FLOW

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 29

NoC RTL

NoC

Models

Comm

Specs

IP Core

RTL

Back-End

Flow

Floorplanwith NoC

ArchSpecs

NoC

Library

Chip

Topology

Generation,

Floorplanning

Existing know-how, IP and tools

NoC extensions

Floorplan(optional)

Simulation

Design PointSelection

Page 30: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

REFERENCE ALGORITHM

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 30

Partitioning (core-to-switch

assignment)

Partitioning (core-to-switch

assignment)

Design point generation

Partitioning (core-to-switch

assignment)

Switch-to-switch connectivity

establishment

Post-processing (trimming,

simulation, …)

go to next design point

Dump output

1

2

3

4

5

Murali, DAC 2004Murali, ASPDAC 2005

Bertozzi, IEEE TPDS 2005

Other seminal works:Pinto, ICCD 2003

Srinivasan, ICCD 2004

Page 31: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

1 - DESIGN POINT GENERATION

❖ Identify NoC parameter permutations➢ NoC flit width➢ Switch count in each V/f island➢ Potentially any others: buffering, VCs…

❖ Sweep grain impacts runtime, exploration thoroughness

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 31

Page 32: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

EXAMPLE DESIGN POINTS

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 32

Core 2

Core 3

Core 4

Core 5

Core 6

Core 7

Core 1

Core 8

Core 18

Core 9

Core 17

Core 10

Core 16

Core 15

Core 14

Core 13

Core 12

Core 11

Core 2

Core 3

Core 4

Core 5

Core 6

Core 7

Core 1

Core 8

Core 18

Core 9

Core 17

Core 10

Core 16

Core 15

Core 14

Core 13

Core 12

Core 11

Core 2

Core 3

Core 4

Core 5

Core 6

Core 7

Core 1

Core 8

Core 18

Core 9

Core 17

Core 10

Core 16

Core 15

Core 14

Core 13

Core 12

Core 11

Core 2

Core 3

Core 4

Core 5

Core 6

Core 7

Core 1

Core 8

Core 18

Core 9

Core 17

Core 10

Core 16

Core 15

Core 14

Core 13

Core 12

Core 11

Page 33: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

Core to switch assignment such that communication constraints can be met

2 - DETERMINE SWITCHES IN V ISLANDS❖ Operating frequency bounds maximum switch

radix, thus max radix different in each VI

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 33

f = 600 MHz → Switch = 9x9

f = 850 MHz → Switch = 3x3 f = 700 MHz →

Switch = 7x7

MEM

ARM DSP

DMA

RAM

MEM DSP COPROC ACCEL

IMPROC

FIFO

VI 0

VI 1

VI 2

MEM

ARM DSP

DMA

RAM

MEM DSP COPROC ACCEL

IMPROC

FIFO

VI 0

VI 1

VI 2

Page 34: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

EXAMPLE TOPOLOGY AT END OF STEP 2

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 34

Core 2 Core 3 Core 4 Core 5 Core 6 Core 7

Core 1 Core 8

Core 18 Core 9

Core 17 Core 10

Core 16 Core 15 Core 14 Core 13 Core 12 Core 11

S1

S8 S7S6

S3S2

S5

S4

Page 35: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

3 – FLOW PATH FINDING❖ Establish physical links, consider➢ Cost function: marginal power (to be minimized)➢ Constraints: max switch radix, minimum bandwidth, max latency

(power/latency auto-tuning)

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 35

Many paths available !

MEM

ARM DSP

DMA

RAM

MEM DSP COPROC ACCEL

IMPROC

FIFO

VI 0

VI 1

VI 2

Page 36: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

3 – DEADLOCK AVOIDANCE❖ Routing deadlocks removed during synthesis❖ Methods guarantee availability of paths❖ No special hardware/virtual channels needed

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 36

MEM

ARM DSP

DMA

RAM

MEM DSP COPROC ACCEL

IMPROC

FIFO

VI 0

VI 1

VI 2

Page 37: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

MEM

ARM DSP

DMA

RAM

MEM DSP COPROC ACCEL

IMPROC

FIFO

VI 0

VI 1

VI 2

VI 4

3 x 3

9 x 9

7 x 7

Frequency converters modeled and cost included in path computation

3 – DVFS-AWARE FLOW ROUTING❖ Inter-island routing options:➢ Directly from source VI to destination VI, or across intermediate islands

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 37

Page 38: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

EXAMPLE TOPOLOGY AT END OF STEP 3

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 38

Core 2 Core 3 Core 4 Core 5 Core 6 Core 7

Core 1 Core 8

Core 18 Core 9

Core 17 Core 10

Core 16 Core 15 Core 14 Core 13 Core 12 Core 11

S1

S8 S7S6

S3S2

S5

S4

Page 39: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

4, 5 – DESIGN POINT FINALIZATION❖ Topology is now defined

❖Optional optimizations and checks:➢ Trim 1x1 switches➢ Can trim flit width where excessive for requirements➢ Can refine buffer sizing to optimize

performance/area/power➢ Can simulate for performance verification and iterate

❖Dump topology to disk13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 39

Page 40: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

EXAMPLE TOPOLOGY AT END OF STEP 5

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 40

Core 2 Core 3 Core 4 Core 5 Core 6 Core 7

Core 1 Core 8

Core 18 Core 9

Core 17 Core 10

Core 16 Core 15 Core 14 Core 13 Core 12 Core 11

S1

S8 S7

S3S2

S5

S4

Page 41: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

NOC FLOORPLANNINGFederico ANGIOLINILSI EPFL2014

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 41

Page 42: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

ROUTING CONGESTION IS AN ISSUE

DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS17-Oct-14 42

Arteris Whitepaper

Page 43: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

WIRING DELAY IS AN ISSUE

DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS17-Oct-14 43

ITRS 2004

Page 44: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

TRAVERSABLE WIRE LENGTH IS GOING DOWN

❖ In 40 nm, at same frequency, wires must be 20-25% shorter than in 65 nm

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 44

Page 45: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

THEREFORE, FLOORPLAN INPUT IS ESSENTIAL

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 45

Page 46: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

ACCOUNTING FOR FLOORPLAN

❖ Optional input

❖ Not to be reshuffled➢ Too many constraints (thermal, pins, analog macros…)

❖ Insert NoC in the floorplan➢ Figure out ideal NoC component positions➢ Identify links still too long, pipeline them➢ Estimate power according to library model

❖ Designer can always tweak to taste

❖ Output floorplan can be exported towards P&R tools

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 46

Page 47: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

OPTIMIZED FLOORPLAN EFFECT❖ Collaboration with Teklatech

❖ Different 40nm reference floorplans

❖ NoCs inserted by iNoCs with optimizations

❖ 15-20% savings in demanding use cases

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 47

Page 48: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

IMPACT ON SYNTHESIS ALGORITHM

❖ Floorplan is updated simultaneously with topology synthesis (wires accounted in exploration)

❖ Two floorplanning steps:

➢ doInitialFloorplanning(): places blocks in ideal positions. Can be iterated and refined during topology building.

➢ doFloorplanning(): moves blocks from ideal positions to suitable locations (empty areas). Can be executed once as final pass.

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 48

Page 49: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

EXAMPLE: INPUT FLOORPLAN

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 49

Page 50: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

OUTPUT FLOORPLAN (NO OVERLAPS)

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 50

Page 51: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

OUTPUT FLOORPLAN (OVERLAPS)

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 51

Page 52: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

WIRE LENGTH ESTIMATION

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 52

Page 53: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

LINK PIPELINING

❖ Wire segmentation by topology design➢ Put more switches, or place them closer

❖ Wire segmentation by pipeline insertion➢ Flops/relay stations to break links

❖ Take advantage of pre-existing converters➢ If e.g. frequency converter is instantiated along link, space it

evenly to help with segmenting

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 53

Sender Receiver

VALID

STALL

VALID

STALL

VALID

STALL

Page 54: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

EXAMPLE: LINK LENGTH DISTRIBUTION

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 54

Original benchmarkmax f = 400 MHz

Overclocked benchmarkmax f = 800 MHz

Page 55: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

CONCLUSIONSFederico ANGIOLINILSI EPFL2014

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 55

Page 56: NETWORKS -ON-CHIP AND NOC SYNTHESISsi2.epfl.ch/~amaru/DTIS/Slides_presentations/DT9.pdf · networks -on-chip and noc synthesis federico angiolini lsi epfl. 2014. 13-oct-14 design

CONCLUSIONS

❖ IP-based design becoming prevalent➢ CMPs, MPSoCs

❖ System interconnect is evolving towards NoCs➢ Scalable performance, better physical design properties

❖ EDA tooling needed to help with NoC design➢ Devise best interconnect from high-level specifications➢ Accounting for floorplan is important

13-Oct-14 DESIGN TECHNOLOGIES FOR INTEGRATED SYSTEMS 56