Buses
A. Jantsch / Z. Lu / I. Sander
Dally: Chapter 22, 18
Some History…
April 20, 2023 SoC Architecture 3
ENIAC: 60 years ago…
ENIAC, short for Electronic Numerical Integrator and Computer, was the first large-scale, electronic, digital computer capable of being reprogrammed to solve a full range of computing problems, although earlier computers had been built with some of these properties.
Source: http://www.embedded.com
ENIAC: 60 years ago…
ENIAC was designed and built to calculate artillery firing tables for the U.S. Army's Ballistics Research Laboratory.
The first problems run on the ENIAC, however, were related to the design of the hydrogen bomb.
Source: http://www.embedded.com
ENIAC: 60 years ago…
ENIAC contained 17,468 vacuum tubes, 7,200 crystal diodes, 1,500 relays, 70,000 resistors, 10,000 capacitors and around 5 million hand-soldered joints.
It weighed 27 tons, measured roughly 2.4 m by 0.9 m by 30 m, and took up 167 m².
Source: http://www.embedded.com
ENIAC: 60 years ago…
ENIAC consumed 150 kW of power, about the same as 2000 Pentium 4 chips.
That's 500 million times the power used by some of TI's MSP430 processors when operating at 1 MHz (200 times the ENIAC's clock rate of 5 kHz).
It took up to 29 milliseconds to do a division.
The Pentium 4 is a million times faster and twice as precise.
Source: http://www.embedded.com
ENIAC: 60 years ago…
The machine cost $500k in 1946 dollars, equivalent to $5,134,000 today
Now some variants of Microchip's PIC10 go for $0.39
Source: http://www.embedded.com
Classic Bus
Introduction
Buses are the simplest and most widely used interconnection networks
A number of modules are connected via a single shared channel
[Figure: a micro-controller, a digital signal processor, an input/output device and a memory connected to a shared bus]
Bus Properties
Serialization
Only one component can send a message at any given time
There is a total order of messages
[Figure: a micro-controller, a digital signal processor, an input/output device and a memory on a shared bus; Modules 1 to 4 on a shared bus with two messages sent in order (1, 2)]
Bus Properties
Broadcast
A module can send a message to several other components at no extra cost
[Figure: Modules 1 to 4 connected to a shared bus]
Bus Hardware
Principle for hardware to access the bus:
Bus transmit: ET active
Bus receive: ER active
[Figure: Modules 1 to 3 on a shared bus, each with a register (Reg) and the enable signals ER and ET]
Bus Transmitter Interfaces
[Figure: three transmitter circuits driving the bus from data input T under enable ET: tri-state driver, open-drain driver, dotted emitter driver]
Cycles, Messages and Transactions
Buses operate in units of cycles, messages and transactions.
Message: Logical unit of information (a read message contains an address and control signals for read)
Cycles: A message requires a number of cycles to be sent from sender to receiver over the bus
Transaction: A sequence of messages that together complete one operation (a memory read requires a memory read message and a reply with the requested data)
Synchronous Bus
Includes a clock in the control lines
A fixed protocol for communication that is relative to the clock
Advantage: involves very little logic and can run very fast
Disadvantages:
Every device on the bus must run at the same clock rate
To avoid clock skew, a fast bus cannot be long
[Timing diagram: CLK, READ, ADR, DATA]
Asynchronous Bus
It is not clocked
It can accommodate a wide range of devices
It can be lengthened without worrying about clock skew
It requires a handshaking protocol
[Timing diagram: READ, ADR, DATA, ACK]
1. Master puts address on bus and asserts READ when address is stable
2. Memory puts data on bus and asserts ACK when data is stable
3. Master deasserts READ when data is read
4. Memory deasserts ACK
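The four handshake steps above can be traced in software. The following is a minimal sketch, not a hardware model: the dictionary-based `bus`, the function name `async_read` and the example memory contents are assumptions made for illustration.

```python
def async_read(memory, address):
    """Trace the four-step handshake for one asynchronous read."""
    bus = {"READ": 0, "ACK": 0, "ADR": None, "DATA": None}
    trace = []

    # 1. Master puts the address on the bus, then asserts READ.
    bus["ADR"] = address
    bus["READ"] = 1
    trace.append(("master asserts READ", dict(bus)))

    # 2. Memory puts the data on the bus, then asserts ACK.
    bus["DATA"] = memory[bus["ADR"]]
    bus["ACK"] = 1
    trace.append(("memory asserts ACK", dict(bus)))

    # 3. Master reads the data and deasserts READ.
    value = bus["DATA"]
    bus["READ"] = 0
    trace.append(("master deasserts READ", dict(bus)))

    # 4. Memory deasserts ACK; the bus is ready for the next transfer.
    bus["ACK"] = 0
    trace.append(("memory deasserts ACK", dict(bus)))
    return value, trace


value, trace = async_read({0x10: 0xAB}, 0x10)
for step, state in trace:
    print(step, state)
```

No clock appears anywhere in the sketch: each step waits only for the previous signal change, which is why devices of very different speeds can share such a bus.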
Bus Arbitration
Bus Arbitration
Since only one bus master can use the bus at a given time, bus arbitration is needed
An arbiter collects the requests of all bus masters and gives only one module the right to access the bus (bus grant)
[Figure: Modules 1 to 3 on a shared bus, each connected to the arbiter by a request (Req) line and a grant (Grant) line]
Importance of Arbiters
Arbiters are used not only in bus systems, but wherever several devices request a shared resource
In a network-on-chip, for instance, arbitration is needed if two or more packets want to enter the same channel
Arbiter Interfaces
This arbiter interface can be used to give a bus grant for a fixed number of cycles
(a): 1 cycle
(b): 4 cycles
Arbiter Interfaces
This arbiter allows for variable-length grants
The grant is held as long as the “hold” line (controlled by the client) is asserted
In cycle 2 requester 0 gets the bus for 3 cycles
In cycle 5 requester 1 gets the bus for 2 cycles
In cycle 7 requester 1 gets the bus for one cycle
Fairness
Fairness is a key property of an arbiter
Some definitions:
Weak fairness: Every request is eventually served
Strong fairness: Requests will be served equally often
Weighted “strong” fairness: The number of times requester i is served is proportional to its weight wi
FIFO fairness: Requests are served in the order the requests have been made
Local Fairness vs. Global Fairness
Even if an arbiter is locally fair, a system built from several such arbiters may not be fair.
Though each arbiter Ai allocates 50% of its bandwidth to each of its two inputs, r0 only gets 12.5% of the total bandwidth, while r3 gets 50%.
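The 12.5% and 50% figures can be reproduced with a few lines of arithmetic. This sketch assumes a cascade that the text does not spell out: A0 arbitrates r0 against r1, and every later arbiter arbitrates the previous arbiter's output against one new requester, with all requesters asking constantly. The function name `cascade_shares` is made up for the example.

```python
from fractions import Fraction


def cascade_shares(num_requesters):
    """Bandwidth share of r0..r(n-1) through a chain of fair 2-input arbiters."""
    shares = [Fraction(1)]  # a lone requester would own the whole bandwidth
    for _ in range(num_requesters - 1):
        # Each stage halves every upstream share and gives the new input 50%.
        shares = [s / 2 for s in shares] + [Fraction(1, 2)]
    return shares


shares = cascade_shares(4)
for i, share in enumerate(shares):
    print(f"r{i}: {float(share):.1%}")
```

Under these assumptions r0 and r1 each end up with 1/8 of the bandwidth, r2 with 1/4 and r3 with 1/2, even though every individual arbiter splits its output evenly.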
Fixed-Priority Arbiter
A fixed-priority arbiter can be constructed as an iterative circuit
Each cell receives a request input ri and a carry input ci and generates a grant output gi and a carry output c(i+1)
The resulting arbiter is not fair, since a continuously asserted request r0 means that none of the other requests will ever be served!
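The iterative structure can be sketched as a loop over the cells. The cell equations used here (grant = request AND carry, next carry = carry AND NOT request, first carry = 1) are the usual ones for this circuit but are an assumption, since the slide shows them only in the figure.

```python
def fixed_priority_arbiter(requests):
    """Iterative fixed-priority arbiter; requests[0] has the highest priority."""
    grants = []
    carry = True  # first carry input is 1: no higher-priority request seen yet
    for request in requests:
        grants.append(request and carry)  # gi = ri AND ci
        carry = carry and not request     # c(i+1) = ci AND NOT ri
    return grants


print(fixed_priority_arbiter([False, True, True, False]))
print(fixed_priority_arbiter([True, True, True, True]))
```

The second call illustrates the unfairness: as long as the first request stays asserted, the carry is killed in the first cell and no other requester can receive a grant.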
Fair Arbiters
A fair arbiter can be generated by changing the priority from cycle to cycle
Depending on the priority generation, different arbitration schemes and degrees of fairness can be achieved
Only one input pi has the value 1. All other inputs pj have the value 0.
Fair Arbiters
Oblivious Arbiters
If pi is generated without knowledge of ri and gi, the result is an oblivious (unconscious) arbiter
Examples are:
Randomly generated pi
Rotating priorities (by shift register)
Weak fairness, but not strong fairness
Oblivious Arbiters
Oblivious arbiters provide weak fairness but not strong fairness
Example: if r0 and r1 are constantly asserted, request r1 wins the arbitration only when p1 is true; in all other cases r0 gets the grant
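The r0/r1 example can be checked with a small simulation of a rotating-priority arbiter. This is a sketch under assumptions: four inputs, the one-hot priority input represented as an index, and the priority advancing by one position every cycle regardless of who was granted. The function names are illustrative.

```python
def variable_priority_arbiter(requests, priority):
    """Grant the first asserted request found when scanning from `priority`."""
    n = len(requests)
    for offset in range(n):
        i = (priority + offset) % n
        if requests[i]:
            return i
    return None  # nobody is requesting


def rotating_priority_grants(requests, cycles):
    """Oblivious arbiter: the priority rotates without looking at requests or grants."""
    n = len(requests)
    return [variable_priority_arbiter(requests, cycle % n) for cycle in range(cycles)]


grants = rotating_priority_grants([True, True, False, False], 8)
print(grants)
print("r0 served:", grants.count(0), "r1 served:", grants.count(1))
```

With only r0 and r1 asserted, r1 is granted in the one cycle out of four where the priority points at it, while r0 collects the other three, so every request is eventually served but not equally often.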
Round-Robin Arbiter
A round-robin arbiter achieves strong fairness
A request that was just served gets the lowest priority
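A round-robin arbiter differs from the oblivious schemes only in how the next priority is chosen. The sketch below is a behavioural model with assumed names; it represents the priority as an index and moves it to the input after the one just served.

```python
def round_robin_grants(request_trace):
    """Round-robin arbiter: the requester just served gets the lowest priority."""
    priority = 0
    grants = []
    for requests in request_trace:
        n = len(requests)
        winner = None
        for offset in range(n):
            i = (priority + offset) % n
            if requests[i]:
                winner = i
                break
        grants.append(winner)
        if winner is not None:
            # The input after the winner becomes highest priority,
            # which makes the winner itself lowest priority.
            priority = (winner + 1) % n
    return grants


grants = round_robin_grants([[True, True, False, False]] * 8)
print(grants)
```

With r0 and r1 constantly asserted, the grants now alternate between the two requesters, in contrast to the 3:1 split a rotating-priority arbiter would produce for the same inputs.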
Weighted Round-Robin Arbiter
A weighted round-robin arbiter gives some requesters a larger number of grants than others in a controlled fashion
If three devices have the weights 1, 2 and 3, they get 1/6, 1/3 and 1/2 of the grants
The preset line is activated periodically, every N cycles (here N = 6), to load each counter with its weight
If some modules do not issue any requests during that interval, the shared resource may remain idle until the next preset cycle
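The counter-and-preset mechanism can be modelled as follows. This is a sketch under assumptions: a request is masked once its counter reaches zero, the unmasked requests go to a round-robin arbiter, and the preset interval N equals the sum of the weights. The function name and the request traces are illustrative.

```python
def weighted_round_robin(weights, request_trace):
    """Weighted round-robin: each requester may win `weight` grants per preset interval."""
    n = len(weights)
    period = sum(weights)      # preset interval N
    counters = list(weights)
    priority = 0
    grants = []
    for cycle, requests in enumerate(request_trace):
        if cycle % period == 0:
            counters = list(weights)  # preset: reload every counter with its weight
        winner = None
        for offset in range(n):
            i = (priority + offset) % n
            if requests[i] and counters[i] > 0:  # exhausted requesters are masked
                winner = i
                break
        grants.append(winner)
        if winner is not None:
            counters[winner] -= 1
            priority = (winner + 1) % n
    return grants


busy = weighted_round_robin([1, 2, 3], [[True, True, True]] * 12)
print([busy.count(i) for i in range(3)])

# Requester 2 stays silent: the bus idles once the other counters are used up.
print(weighted_round_robin([1, 2, 3], [[True, True, False]] * 6))
```

The second trace shows the drawback mentioned above: after requesters 0 and 1 have used their budget, the remaining cycles of the interval go unused until the next preset.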
Matrix Arbiter
A matrix arbiter implements a least recently served priority scheme by maintaining a triangular array of state bits wij for all i < j
If wij is true, then request i takes priority over request j
Each state bit is set on a column grant and reset on a row grant, so a grant gi gives requester i the lowest priority in the next cycle
Only the upper triangular portion needs to be maintained
The matrix arbiter has to be properly initialized
The matrix arbiter is well suited for a small number of inputs, since it is fast, easy to implement and provides strong fairness!
(Exercise Dally 18-3)
[Figure: state-bit cell of the matrix arbiter with grant inputs gi and gj and state bit wij]
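A behavioural sketch of the matrix arbiter follows. For readability it stores the full matrix instead of only the upper triangle (the lower half is always the complement), and it assumes an initial state in which lower-numbered inputs beat higher-numbered ones. The class and method names are made up for this example.

```python
class MatrixArbiter:
    """Least-recently-served arbiter; w[i][j] True means request i beats request j."""

    def __init__(self, n):
        self.n = n
        # Assumed initial state: a consistent order where input 0 is highest priority.
        self.w = [[i < j for j in range(n)] for i in range(n)]

    def arbitrate(self, requests):
        winner = None
        for i in range(self.n):
            if not requests[i]:
                continue
            # Request i wins unless another asserted request j has priority over it.
            beaten = any(requests[j] and self.w[j][i]
                         for j in range(self.n) if j != i)
            if not beaten:
                winner = i
                break
        if winner is not None:
            for j in range(self.n):
                if j != winner:
                    self.w[winner][j] = False  # reset row: winner beats nobody
                    self.w[j][winner] = True   # set column: everybody beats winner
        return winner


arbiter = MatrixArbiter(3)
print([arbiter.arbitrate([True, True, True]) for _ in range(6)])
```

Because a grant clears the winner's row and sets its column, the winner drops to the bottom of the order while the relative order of all other inputs is preserved, which is what makes the scheme least recently served.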
Grant-Hold Circuit
Allows for uninterrupted access to a resource for several cycles
Extends the duration of a grant
As long as hold is asserted, further arbitration is disabled
Queuing Arbiter
A queuing arbiter provides FIFO fairness
It assigns each request a time stamp when it is asserted
The request with the earliest time stamp receives the grant
Cost is determined by size of the time stamp
wi = log2 (Δt / ta)
Δt = 2nTmax
wi … number of bits for time stamp
Δt … time stamp range
ta … arrival interval
n … number of inputs to arbiter
Tmax … maximum service time
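As a worked example of the cost formula, the sketch below computes the time stamp width for concrete numbers. It reads 2nTmax as the product 2 · n · Tmax and rounds up to whole bits; both are assumptions of this example, as are the function name and the sample values.

```python
import math


def timestamp_bits(n, t_max, t_a):
    """Bits per time stamp: wi = log2(dt / ta), with dt = 2 * n * Tmax (assumed reading)."""
    delta_t = 2 * n * t_max  # time stamp range
    return math.ceil(math.log2(delta_t / t_a))


# 8 inputs, at most 16 cycles of service per grant, one arrival slot per cycle.
print(timestamp_bits(n=8, t_max=16, t_a=1))
```

For these sample values the range is 256 cycles, so each pending request would need an 8-bit time stamp plus a comparator tree to find the earliest one, which is where the cost of FIFO fairness comes from.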
Bus Bridge
Bus Bridges
Bus bridges are used to separate high-performance devices from low-performance devices
All communication between the high-performance bus and the low-performance devices goes via the bridge
AHB to ISA Bus Bridge
AHB Basic Transfer
AHB and ISA Timing
Bridge Implementation
Bus Protocols
Low Performance Bus Protocol
Without a special bus protocol the bus is not efficiently used
In the example module 2 requests the bus in cycle 2, but must wait until cycle 6 to receive the grant
Bus Pipelining
A memory access consists of several cycles (including arbitration)
Since the bus is not used in all cycles, pipelining can be used to increase the performance
[Figure: pipeline stages of a bus access, labelled Arb request, Arbiter, Arb grant and Bus]
Write access: AR ARB AG RQ ACK
Read access: AR ARB AG RQ P RPLY
Only one transaction can receive the grant during a given cycle, and only one can use the bus during a given cycle
Bus Pipelining
Pipelining leads to an efficient use of the bus
Stalls are inserted since only one instance can use the bus
Sometimes (cycle 12) two transactions can overlap
However, this cannot be done in cycle 5 (2. Write), since otherwise RPLY and ACK would overlap in cycle 6!
[Figure: pipeline diagram over cycles 1 to 15 for six transactions (1. Read, 2. Write, 3. Write, 4. Read, 5. Read, 6. Read) showing the stages AR, ARB, AG, RQ, P, RPLY and ACK, the inserted stalls, and a bus-busy row]
Split-Transaction Bus
In a split-transaction bus, a transaction is split into two transactions:
“request” transaction
“reply” transaction
Both transactions have to compete for the bus by arbitration
Split-Transaction Buses
[Figure: split-transaction pipeline diagram over cycles 1 to 15; the six requests (1. Read, 2. Write, 3. Write, 4. Read, 5. Read, 6. Read) and their six replies each arbitrate separately (AR, ARB, AG, then RQ or RPLY), with inserted stalls and a bus-busy row]
Split-Transaction Buses
The advantages of the split-transaction bus are evident if requests have a variable delay.
[Figure: three transactions (RQ A / RP A, RQ B / RP B, RQ C / RP C) over cycles 1 to 14, scheduled first on a pipelined bus and then on a split-transaction bus]
Burst Messages
There is a considerable amount of overhead in a bus transaction:
Arbitration
Addressing
Acknowledgement
[Figure: four single-word requests, each consisting of an arbitration (ARB) followed by Cmd, Adr and Data]
Efficiency = Transmitted Words / Message Size = 1/3
Burst Messages
The overhead can be reduced if messages are sent as blocks (bursts)
[Figure: four single-word requests (ARB, then Cmd Adr Data each) compared with one burst request (ARB, then Cmd Adr Data Data Data Data)]
Efficiency = Transmitted Words / Message Size = 2/3
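The two efficiency figures follow from counting words per message. The sketch below assumes, as the figures suggest, that each message carries two overhead words (Cmd and Adr) and that arbitration cycles are not counted; the function name is illustrative.

```python
from fractions import Fraction


def bus_efficiency(data_words, overhead_words=2):
    """Fraction of the message words that carry data (Cmd and Adr are overhead)."""
    return Fraction(data_words, data_words + overhead_words)


print(bus_efficiency(1))   # single request: Cmd Adr Data
print(bus_efficiency(4))   # burst request: Cmd Adr Data Data Data Data
print(bus_efficiency(16))  # a longer burst, for comparison
```

The efficiency approaches 1 as the burst grows, but the gain per additional word shrinks quickly, while the time other bus masters have to wait grows linearly with the burst length.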
Burst Messages
The longer the burst, the better the efficiency
BUT: Other bus masters have to wait, which may be unacceptable in many systems (Real-Time)
Possible solution:
Maximum length for a burst
Interrupt of long messages
Restart or Resume
[Figure: arbitration timeline (Gnt A, Gnt B, Res A) in which burst message A is interrupted by the shorter message B and then resumed]
Modern SoC buses
Embedded buses
Current systems-on-chip are advanced enough to need a hierarchy of buses
A new set of bus standards has been defined for use in SoCs, e.g.
ARM AMBA
Altera Avalon
OCP (Open Core Protocol)
These buses allow for higher performance than traditional tri-state buses
Comparison: Multiplexer Bus and Tri-State Bus
Multiplexer Bus
Bus masters can send their request, including address and data (for write), at the same time
Arbiter selects a bus master
Mux avoids collision!
Tri-State Bus
Only one bus master can output address or data (otherwise collision)
A Bus Grant is needed to output address or data
[Figure: multiplexer bus, where BM1 and BM2 each send Req, Adr, Data to a MUX controlled by the arbiter; tri-state bus, where BM1 and BM2 driving Adr1 and Adr2 at the same time collide]
AMBA Specification
The AMBA specification defines an on-chip communications standard for designing high-performance embedded micro-controllers
Three buses are defined:
Advanced High-Performance Bus (AHB)
Advanced System Bus (ASB)
Advanced Peripheral Bus (APB)
A test methodology is included within AMBA which provides an infrastructure for modular macrocell test and diagnostic access
System based on an AMBA Bus
An AMBA system typically contains a high speed bus (ASB or AHB) for CPU, fast memory and DMA and a bus for peripherals (APB), which is connected via a bridge to the high-speed bus
AMBA Buses
AMBA AHB (new standard): High Performance; Pipelined Operation; Multiple Bus Masters; Burst Transfers; Split Transactions
AMBA ASB (older standard): High Performance; Pipelined Operation; Multiple Bus Masters
AMBA APB: Low Power; Latched Address and Control; Simple Interface; Suitable for many peripherals
AMBA AHB System
AHB Master: A bus master is able to initiate read and write operations by providing address and control information. Only one bus master can use the bus at a time
AHB Slave: A bus slave responds to a read or write operation within a given address-space range. The bus slave signals back to the active bus master the success, failure or waiting of the data transfer
AMBA AHB System
AHB Arbiter: The bus arbiter ensures that only one bus master at a time is allowed to initiate data transfers. Even though the arbitration protocol is fixed, any arbitration algorithm, such as highest priority or fair access, can be implemented depending on the application requirements
An AHB includes only one arbiter
AHB Decoder: The AHB decoder is used to decode the address of each transfer and provides a select signal for the slave that is involved in the transfer
A single centralized decoder is required in all AHB implementations
AMBA AHB Bus Interconnection
AMBA AHB Bus Interconnection
AHB Protocol is based on a central multiplexer interconnection scheme
All bus masters send their request in form of address and control signals
The arbiter chooses one master. The address and control signals are routed to all slaves
The decoder selects the signals from the slave that is involved in the transfer with the bus master
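The multiplexer scheme described above can be sketched behaviourally. Everything concrete here is hypothetical: the address map, the master and slave names, and the functions `decode` and `ahb_transfer` are invented for illustration and are not taken from the AMBA specification.

```python
# Hypothetical address map: (base address, size, slave name).
ADDRESS_MAP = [
    (0x0000_0000, 0x1000_0000, "memory"),
    (0x4000_0000, 0x0001_0000, "apb_bridge"),
]


def decode(address):
    """Central decoder: name of the slave whose address range contains `address`."""
    for base, size, slave in ADDRESS_MAP:
        if base <= address < base + size:
            return slave
    return None  # unmapped address


def ahb_transfer(requests, granted_master, slaves):
    """Route the granted master's address to all slaves; return the selected reply."""
    address = requests[granted_master]           # master-side multiplexer
    responses = {name: respond(address)          # address is broadcast to all slaves
                 for name, respond in slaves.items()}
    return responses.get(decode(address))        # slave-side multiplexer


slaves = {
    "memory": lambda address: ("memory", address),
    "apb_bridge": lambda address: ("apb_bridge", address),
}
requests = {"cpu": 0x4000_0010, "dma": 0x0000_0100}
print(ahb_transfer(requests, "cpu", slaves))
```

Both masters present their address at the same time without conflict; the arbiter's choice (here passed in as `granted_master`) only steers the multiplexer, which is the difference from a tri-state bus.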
AMBA: ARM’s Advanced Microcontroller Bus Architecture
AMBA 1995:
APB (Advanced Peripheral Bus)
ASB (Advanced System Bus): Multiple masters; Pipelined operations
AMBA 2 1999:
AHB (Advanced High Performance Bus): Multiple masters; Pipelined operations; Burst transactions; Split transactions, multiple outstanding transactions; Single cycle master hand-over; Exclusive bus control; Single centralized decoder
AMBA 3 - 2004
Multiple parallel connections
Pipelined bursts
Only 2-stage network
Central n x m switch matrix
Multi Layer AMBA Bus
Multi Layer AMBA Bus
Local Slaves
Multiple slaves on one slave port
Multi Layer AMBA Bus
Multiple masters on one layer
Multi Layer AMBA Bus
Separate AHB Subsystems
Multi Layer AMBA Bus Example
AXI: The new AMBA bus protocol
The objectives of the latest generation AMBA interface are to:
be suitable for high-bandwidth and low-latency designs
enable high-frequency operation without using complex bridges
meet the interface requirements of a wide range of components
be suitable for memory controllers with high initial access latency
provide flexibility in the implementation of interconnect architectures
be backward-compatible with existing AHB and APB interfaces.
AXI: The new AMBA bus protocol
The key features of the AXI protocol are:
separate address/control and data phases
support for unaligned data transfers using byte strobes
burst-based transactions with only start address issued
separate read and write data channels to enable low-cost Direct Memory Access (DMA)
ability to issue multiple outstanding addresses
out-of-order transaction completion
easy addition of register stages to provide timing closure
AXI Channels
AW: Address Write Channel
W: Write Data Channel
B: Write Acknowledgement Channel
AR: Address Read Channel
R: Read Data Channel
AXI Ordering Model
AWID: The ID tag for the write address group of signals.
WID: The write ID tag for a write transaction. Must match the AWID.
BID: The ID tag for the write response. Must match the AWID and WID.
ARID: The ID tag for the read address group of signals.
RID: The read ID tag for a read transaction. Must match the ARID.
The interconnect appends the master ID to AWID, ARID and WID
Ordering Rules
Transactions from different masters have no ordering restrictions. They can complete in any order.
Transactions from the same master, but with different ID values, have no ordering restrictions. They can complete in any order.
The data for a sequence of write transactions with the same AWID value must complete in the same order that the master issued the addresses in.
The data for a sequence of read transactions with the same ARID value must be returned in order:
When reads with the same ARID are from the same slave, the slave must ensure that the read data returns in the same order that the addresses are received.
When reads with the same ARID are from different slaves, the interconnect must ensure that the read data returns in the same order that the master issued the addresses in.
There are no ordering restrictions between read and write transactions with the same AWID and ARID. If a master requires an ordering restriction, it must ensure that the first transaction is fully completed before the second transaction is issued.
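The rules above can be sketched as a small checker: completions may interleave freely across masters and across IDs, but must keep issue order within each (master, ID) pair. This is a hedged illustration, not part of the AXI specification. The function name, the tuple layout and the `tag` labels are hypothetical, and the check is meant to be run separately for reads and for writes, since the two have no mutual ordering.

```python
from collections import defaultdict


def respects_id_ordering(issued, completed):
    """Return True if completions keep issue order within each (master, ID) pair.

    issued, completed: lists of (master, txn_id, tag) tuples, where tag is a
    hypothetical unique label used only to tell transactions apart.
    """
    def per_stream(transactions):
        streams = defaultdict(list)
        for master, txn_id, tag in transactions:
            streams[(master, txn_id)].append(tag)
        return streams

    return per_stream(issued) == per_stream(completed)


issued = [("M0", 1, "a"), ("M0", 2, "b"), ("M0", 1, "c"), ("M1", 1, "d")]

# Different IDs and different masters overtake each other: allowed.
reordered_ok = [("M0", 2, "b"), ("M1", 1, "d"), ("M0", 1, "a"), ("M0", 1, "c")]

# "c" overtakes "a" although both come from M0 with ID 1: not allowed.
reordered_bad = [("M0", 1, "c"), ("M0", 1, "a"), ("M0", 2, "b"), ("M1", 1, "d")]
```

Here `respects_id_ordering(issued, reordered_ok)` accepts the first completion order and rejects the second.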
![Page 71: Buses A. Jantsch / Z. Lu / I. Sander Dally: Chapter 22, 18](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f395503460f94c55e6b/html5/thumbnails/71.jpg)
AMBA 3 - 2004
AXI - Advanced eXtensible Interface
Abstract interface protocol
Multiple parallel transactions
Multiple outstanding transactions
Transactions may complete out of order
IDs to group transactions for ordering control
Master/slave and read/write transaction based protocol
AMBA 4 – 20??:
More flexible and abstract protocol
Support for QoS
More information can be found on http://www.arm.com
![Page 72: Buses A. Jantsch / Z. Lu / I. Sander Dally: Chapter 22, 18](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f395503460f94c55e6b/html5/thumbnails/72.jpg)
Altera Avalon Bus
Features
Up to 128-bit wide data
Synchronous operation
Open standard
Specification specifies communication between
Master and switch fabric
Slave and switch fabric
Third-party vendors can develop their own Avalon devices
![Page 73: Buses A. Jantsch / Z. Lu / I. Sander Dally: Chapter 22, 18](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f395503460f94c55e6b/html5/thumbnails/73.jpg)
Avalon Bus – Transfer Modes
The Avalon Specification allows (among others) the following transfer modes:
Wait-states: Fixed or variable (slave only)
Pipeline: Fixed or variable latency
Burst
Tristate (devices with a shared read/write channel)
Reference: Avalon Interface Specification, Avalon Switch Fabric
![Page 74: Buses A. Jantsch / Z. Lu / I. Sander Dally: Chapter 22, 18](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f395503460f94c55e6b/html5/thumbnails/74.jpg)
Avalon Switch Fabric
A master port only has to wait for access to a slave port if another master tries to access the same slave
Multi-master access is resolved by weighted round-robin arbitration
The designer can define share values, which define how often a master is allowed to access a slave (relative to other masters)
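The weighted round-robin scheme can be illustrated with a short simulation for a single slave. This is a simplified sketch under stated assumptions, not the actual Avalon arbiter: the function name, the `shares` and `requests` dictionaries, and the one-grant-per-cycle model are all hypothetical. Each master may take up to its number of shares in consecutive grants before the arbiter moves to the next master, and a master without pending requests is skipped.

```python
def weighted_round_robin(shares, requests, cycles):
    """Grant one slave to competing masters for a number of cycles.

    shares:   dict master -> maximum consecutive grants per round
    requests: dict master -> number of pending transfers
    Returns the list of masters granted, one entry per used cycle.
    """
    pending = {master: requests.get(master, 0) for master in shares}
    order = list(shares)
    grants = []
    while len(grants) < cycles and any(pending[m] > 0 for m in order):
        for master in order:
            for _ in range(shares[master]):
                if pending[master] == 0 or len(grants) == cycles:
                    break
                pending[master] -= 1
                grants.append(master)
    return grants


# Both masters compete: "cpu" gets three grants for every grant to "dma".
contended = weighted_round_robin({"cpu": 3, "dma": 1}, {"cpu": 6, "dma": 6}, 8)

# Only "dma" requests the slave, so it never has to wait.
uncontended = weighted_round_robin({"cpu": 3, "dma": 1}, {"dma": 2}, 5)
```

In the contended case the grant sequence follows the 3:1 ratio of the shares. In the uncontended case the idle master consumes no cycles, which matches the observation that a master only waits when another master targets the same slave.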
![Page 75: Buses A. Jantsch / Z. Lu / I. Sander Dally: Chapter 22, 18](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f395503460f94c55e6b/html5/thumbnails/75.jpg)
Avalon Bus and SOPC Builder
The Avalon Bus is generated automatically when a new Nios II core with peripherals is created in SOPC Builder
Changes in the design of the architecture lead to a new structure of the Avalon Switch Fabric
The user does not see the bus structure or the internal structure of the Avalon Switch Fabric
![Page 76: Buses A. Jantsch / Z. Lu / I. Sander Dally: Chapter 22, 18](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f395503460f94c55e6b/html5/thumbnails/76.jpg)
Summary
![Page 77: Buses A. Jantsch / Z. Lu / I. Sander Dally: Chapter 22, 18](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f395503460f94c55e6b/html5/thumbnails/77.jpg)
Summary and Outlook
A bus is an excellent communication medium to connect several devices
Since the bus is a shared communication medium, it is a bottleneck in the system
Many different arbitration techniques exist, which lead to different behaviors of the system
![Page 78: Buses A. Jantsch / Z. Lu / I. Sander Dally: Chapter 22, 18](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f395503460f94c55e6b/html5/thumbnails/78.jpg)
Summary and Outlook
Techniques like split-transaction and bridges can increase the performance of a bus, but there is a limit
Networks-on-Chip architectures aim to offer communication capabilities that are more general and flexible than buses
Modern buses evolve and have more and more network-like capabilities!