Arteris network on chip: The growing cost of wires
DESCRIPTION
Arteris NoC SoC interconnect presentation given by Jonah Probell at the ARM Technology Conference, 9-11 Nov 2010. Explains how traditional AXI fabrics require huge numbers of wires, which leads to routing congestion, and how network-on-chip interconnects address routing congestion by using fewer wires. Explains the basics of NoC packetization and serialization.
TRANSCRIPT
![Page 1: Arteris network on chip: The growing cost of wires](https://reader035.vdocument.in/reader035/viewer/2022062703/555155f0b4c905e1708b4911/html5/thumbnails/1.jpg)
November 9-11, 2010
The Santa Clara Convention Center
www.armtechcon.com
![Page 2: Arteris network on chip: The growing cost of wires](https://reader035.vdocument.in/reader035/viewer/2022062703/555155f0b4c905e1708b4911/html5/thumbnails/2.jpg)
P&R congestion is the focus of EDA
"...upstream tools need to be clairvoyant deep into the layout."
"The worst crises are when you're deep into the layout and realize that my floorplan's no good. So how do you avoid that? Well what's needed are clairvoyant tools. That is a chain of steps where each step already knows a little bit about the changes downstream."
"The synthesizer can, this year, avoid congestion; and congestion is really the killer of schedules."
- Aart de Geus, Synopsys Synposium 2010
![Page 3: Arteris network on chip: The growing cost of wires](https://reader035.vdocument.in/reader035/viewer/2022062703/555155f0b4c905e1708b4911/html5/thumbnails/3.jpg)
Interconnects logically
The interconnect transports AXI transactions between masters and slaves. The means of transportation are not defined by the AXI spec.
[Diagram: five masters and five slaves attached to a central interconnect block, with AXI interfaces on both sides]
![Page 4: Arteris network on chip: The growing cost of wires](https://reader035.vdocument.in/reader035/viewer/2022062703/555155f0b4c905e1708b4911/html5/thumbnails/4.jpg)
Interconnect physically
The interconnect lives in the hallways between IP cores. The width of the links affects the compactness of the die.
![Page 5: Arteris network on chip: The growing cost of wires](https://reader035.vdocument.in/reader035/viewer/2022062703/555155f0b4c905e1708b4911/html5/thumbnails/5.jpg)
1. Growing interface complexity

| Data width | 32 | 64 | 128 |
|---|---|---|---|
| AHB signals | 113 | 177 | 305 |
| AXI signals | 204 | 272 | 408 |

[Figure: signal groups and wires per interface. AHB: address (32), write data (data width), read data (data width), control (a few). AXI: write address (32), write data (data width), read data (data width), read address (32), write response (a few), plus a few control wires on each address and data channel.]
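The signal counts in the table above grow linearly with data width, which can be checked with a short sketch. The formulas below are only a fit to the three published data points, not something stated on the slide; the function names are hypothetical, and reading the AXI `w / 8` term as one write-strobe bit per data byte is an assumption.

```python
# Signal counts taken from the slide's table, keyed by data width in bits.
AHB_SIGNALS = {32: 113, 64: 177, 128: 305}
AXI_SIGNALS = {32: 204, 64: 272, 128: 408}


def ahb_signal_fit(data_width: int) -> int:
    """Fitted model: write data + read data + a constant of 49 other wires."""
    return 2 * data_width + 49


def axi_signal_fit(data_width: int) -> int:
    """Fitted model: write data + read data + one bit per data byte
    (assumed to be write strobes) + a constant of 136 other wires."""
    return 2 * data_width + data_width // 8 + 136


for width in (32, 64, 128):
    print(width, ahb_signal_fit(width), axi_signal_fit(width))
```

Under this fit, everything except the data buses is a fixed cost, and that fixed cost is considerably larger for AXI than for AHB.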
![Page 6: Arteris network on chip: The growing cost of wires](https://reader035.vdocument.in/reader035/viewer/2022062703/555155f0b4c905e1708b4911/html5/thumbnails/6.jpg)
2. More interfaces each year
[Chart: number of cores and connections; vertical axis 0 to 70, horizontal axis 1 to 10]
![Page 7: Arteris network on chip: The growing cost of wires](https://reader035.vdocument.in/reader035/viewer/2022062703/555155f0b4c905e1708b4911/html5/thumbnails/7.jpg)
3. Relative wire cost growing
Transistor sizes shrink faster than wire widths.
286 CPU (1982): 69 mm²
Atom N450 (2010): 66 mm²
Chips are, on average, the same size as ever.
![Page 8: Arteris network on chip: The growing cost of wires](https://reader035.vdocument.in/reader035/viewer/2022062703/555155f0b4c905e1708b4911/html5/thumbnails/8.jpg)
The growing cost of wires
1. More wires per interface
2. More interfaces to connect
3. Relative wire cost growing
![Page 9: Arteris network on chip: The growing cost of wires](https://reader035.vdocument.in/reader035/viewer/2022062703/555155f0b4c905e1708b4911/html5/thumbnails/9.jpg)
Packetizing AXI to transport transactions
[Diagram, master side: the Read Address, Write Address and Write Data channels (request from master) are packetized into a request packet; the response packet is depacketized into the Read Data and Write Response channels (response to master).]
![Page 10: Arteris network on chip: The growing cost of wires](https://reader035.vdocument.in/reader035/viewer/2022062703/555155f0b4c905e1708b4911/html5/thumbnails/10.jpg)
Packetizing AXI to transport transactions
[Diagram, slave side: the request packet is depacketized into the Read Address, Write Address and Write Data channels (request to slave); the Read Data and Write Response channels are packetized into a response packet (response to master).]
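The packetize and depacketize steps can be sketched for a single write burst. This is a minimal illustration only: the header layout (opcode, transaction ID, beat count, address), the field widths and all names are hypothetical, and none of it reflects the actual Arteris packet format.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical header field widths, in bits.
ADDR_BITS, LEN_BITS, ID_BITS = 32, 8, 4
OP_WRITE = 1  # hypothetical opcode for a write request


@dataclass(frozen=True)
class AxiWrite:
    """One write burst as carried on the write address and write data channels."""
    txn_id: int
    addr: int
    beats: Tuple[int, ...]


def packetize(req: AxiWrite) -> List[int]:
    """Fold the address-channel fields into one header word, then append the data."""
    if not 0 <= req.addr < (1 << ADDR_BITS):
        raise ValueError("address out of range")
    if not 0 <= req.txn_id < (1 << ID_BITS):
        raise ValueError("transaction id out of range")
    if not 0 < len(req.beats) < (1 << LEN_BITS):
        raise ValueError("unsupported burst length")
    header = OP_WRITE
    header = (header << ID_BITS) | req.txn_id
    header = (header << LEN_BITS) | len(req.beats)
    header = (header << ADDR_BITS) | req.addr
    return [header, *req.beats]


def depacketize(packet: List[int]) -> AxiWrite:
    """Recover the channel fields from the header word and the data words."""
    header, payload = packet[0], packet[1:]
    addr = header & ((1 << ADDR_BITS) - 1)
    n_beats = (header >> ADDR_BITS) & ((1 << LEN_BITS) - 1)
    txn_id = (header >> (ADDR_BITS + LEN_BITS)) & ((1 << ID_BITS) - 1)
    opcode = header >> (ADDR_BITS + LEN_BITS + ID_BITS)
    if opcode != OP_WRITE or len(payload) != n_beats:
        raise ValueError("malformed write packet")
    return AxiWrite(txn_id, addr, tuple(payload))


request = AxiWrite(txn_id=3, addr=0x1000, beats=(0xAA, 0xBB))
packet = packetize(request)
restored = depacketize(packet)
```

Once the address and control fields live in a header word, the link only has to carry a stream of words, which is what makes the later serialization step simple.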
![Page 11: Arteris network on chip: The growing cost of wires](https://reader035.vdocument.in/reader035/viewer/2022062703/555155f0b4c905e1708b4911/html5/thumbnails/11.jpg)
Serializing
With a packetized protocol, serializing data simply requires a register and a mux.
Serializing packets is much easier than serializing the AXI interface protocol.
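A behavioral sketch of the register-and-mux idea: a wide packet word is held in a register while a select counter steers one link-width slice per cycle onto the link. The function names and the least-significant-slice-first ordering are assumptions made for illustration.

```python
from typing import Iterable, Iterator


def serialize(word: int, word_width: int, link_width: int) -> Iterator[int]:
    """Yield one link-width slice of the word per cycle, least significant first."""
    if word_width % link_width:
        raise ValueError("word width must be a multiple of link width")
    register = word                      # holding register for the wide word
    mask = (1 << link_width) - 1
    for select in range(word_width // link_width):
        # The shift and mask play the role of the mux driven by 'select'.
        yield (register >> (select * link_width)) & mask


def deserialize(slices: Iterable[int], link_width: int) -> int:
    """Reassemble the wide word at the receiving end of the link."""
    word = 0
    for position, piece in enumerate(slices):
        word |= piece << (position * link_width)
    return word


pieces = list(serialize(0xDEADBEEF, word_width=32, link_width=8))
rebuilt = deserialize(pieces, link_width=8)
```

Here a 32-bit word crosses an 8-bit link in four cycles, trading throughput for a quarter of the wires.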
![Page 12: Arteris network on chip: The growing cost of wires](https://reader035.vdocument.in/reader035/viewer/2022062703/555155f0b4c905e1708b4911/html5/thumbnails/12.jpg)
Throughput and wires
[Diagram: four packet layouts, (a) to (d), each drawn as a sequence of header and data words on the link]
Link width = data width + header width: header penalty = 0
Link width = header width: header penalty = 1 cycle per transaction
Link width < data width: header penalty > 1 cycle per transaction
Link width = data width: header penalty = 1 cycle per transaction
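The relation between link width and header penalty can be approximated with a simple cycle model. This is a sketch under stated assumptions, not the slide's own model: the header rides alongside the first data word when the link is wide enough for both, otherwise it takes `ceil(header width / link width)` cycles of its own. The example uses a hypothetical 32-bit header with 32-bit data; with a header narrower than the link, the "link width < data width" case would not exceed one penalty cycle in this model.

```python
def ceil_div(numerator: int, denominator: int) -> int:
    """Integer ceiling division."""
    return -(-numerator // denominator)


def header_penalty(link_width: int, data_width: int, header_width: int) -> int:
    """Extra cycles per transaction spent sending the header."""
    if link_width >= data_width + header_width:
        return 0  # header and first data word travel side by side
    return ceil_div(header_width, link_width)


def cycles_per_transaction(link_width: int, data_width: int,
                           header_width: int, beats: int) -> int:
    """Header penalty plus the cycles needed to move every data beat."""
    cycles_per_beat = ceil_div(data_width, link_width)  # 1 if the link is wide enough
    return header_penalty(link_width, data_width, header_width) + beats * cycles_per_beat


# Hypothetical widths: 32-bit data, 32-bit header, 4-beat burst.
for link in (64, 32, 16):
    print(link, header_penalty(link, 32, 32), cycles_per_transaction(link, 32, 32, 4))
```

The penalty is paid once per transaction, so longer bursts amortize it better than single-beat accesses.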
![Page 13: Arteris network on chip: The growing cost of wires](https://reader035.vdocument.in/reader035/viewer/2022062703/555155f0b4c905e1708b4911/html5/thumbnails/13.jpg)
Selection of link width
[Diagram: floorplan with blocks labeled L2, DDR and peripherals]
Place cores with high communication throughput and low latency requirements near each other. Use zero header penalty links between such cores.
Use narrow links for long paths to low-throughput peripherals. This minimizes the number of long wires for P&R.
![Page 14: Arteris network on chip: The growing cost of wires](https://reader035.vdocument.in/reader035/viewer/2022062703/555155f0b4c905e1708b4911/html5/thumbnails/14.jpg)
Experimental packetized link width

| Data width | 32 | 64 | 128 |
|---|---|---|---|
| AHB signals | 113 | 177 | 305 |
| AXI signals | 204 | 272 | 408 |
| Packets with 0 penalty cycles | 146 | 218 | 362 |
| Packets with 1 penalty cycle | 84 | 156 | 300 |

Results obtained with the Arteris FlexNoC packetized interconnect generator.

[Chart: wire count (0 to 450) at each data width for AXI, AHB, Arteris w/o header latency and Arteris w/ header latency]
Wire savings: 32 bit = 59 %, 64 bit = 43 %, 128 bit = 26 %
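The quoted wire savings can be reproduced from the table, assuming they compare the packetized link with one penalty cycle against AXI. That pairing is not stated explicitly on the slide, but it matches all three percentages. The names below are hypothetical.

```python
# Wire counts from the table, keyed by data width in bits.
AXI_WIRES = {32: 204, 64: 272, 128: 408}
PACKET_1_PENALTY_WIRES = {32: 84, 64: 156, 128: 300}


def wire_savings_percent(baseline: int, reduced: int) -> int:
    """Percentage of baseline wires removed, rounded to the nearest integer."""
    return round(100 * (baseline - reduced) / baseline)


savings = {
    width: wire_savings_percent(AXI_WIRES[width], PACKET_1_PENALTY_WIRES[width])
    for width in AXI_WIRES
}
print(savings)
```

The absolute saving shrinks only slightly, from 120 to 108 wires, as the data width grows; the percentage falls mainly because the data buses themselves come to dominate the wire count.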
![Page 15: Arteris network on chip: The growing cost of wires](https://reader035.vdocument.in/reader035/viewer/2022062703/555155f0b4c905e1708b4911/html5/thumbnails/15.jpg)
Experimental place & route results
[Figure: two place & route results, labeled Standard and NoC]
![Page 16: Arteris network on chip: The growing cost of wires](https://reader035.vdocument.in/reader035/viewer/2022062703/555155f0b4c905e1708b4911/html5/thumbnails/16.jpg)
Summary
Routing congestion is the problem of the decade for chip implementation.
AXI is expensive in wires.
Packetizing and serializing transaction data effectively reduces routing congestion.
![Page 17: Arteris network on chip: The growing cost of wires](https://reader035.vdocument.in/reader035/viewer/2022062703/555155f0b4c905e1708b4911/html5/thumbnails/17.jpg)
Clairvoyant IP
[Diagram: design flow stages RTL, physical synthesis, P&R]
serializing interconnect → fewer wires
physical synthesis → shorter wires