Dynamic Traffic Distribution among Hierarchy Levels in
Hierarchical Networks-on-Chip
Ran Manevich, Israel Cidon, and Avinoam Kolodny
QNoC Research Group
Electrical Engineering Department
Technion – Israel Institute of Technology
Haifa, Israel
NOCS 2013
Hierarchical un-clustered NoCs
Hierarchical Rings: S. Bourduas and Z. Zilic, “Latency reduction of global traffic in wormhole-routed meshes using hierarchical rings for global routing,” ASAP 2007.
PyraMesh: R. Manevich, I. Cidon, and A. Kolodny, “Handling global traffic in future CMP NoCs,” SLIP 2012.
Routing in hierarchical NoCs
Phase 1: Ascent to the highest level (LMAX).
Phase 2: Travel on LMAX towards the destination.
Phase 3: Descent from LMAX and reach the destination.
Traffic distribution among hierarchy levels
The highest level LMAX defines the distribution of traffic among hierarchy levels.
Figure panels: LMAX = 3, LMAX = 2, LMAX = 1
Packet distribution policy
The highest level LMAX is defined by the hop distance (D) a packet would travel at the bottom level.
DThi – Distance threshold of level i. If D > DThi, the packet is directed to level i+1.
Example: DThi = 6, 12, 20

LMAX   Bottom Mesh Travel Distance (D)
4      D > 20
3      12 < D ≤ 20
2      6 < D ≤ 12
1      D ≤ 6
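The threshold policy above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `highest_level` is hypothetical, and the thresholds used in the usage note are the example values DThi = 6, 12, 20.

```python
from bisect import bisect_left


def highest_level(d, thresholds):
    """Return LMAX for a packet whose bottom-mesh hop distance is d.

    thresholds[i - 1] holds DThi, the distance threshold of level i,
    in ascending order. A packet is directed to level i+1 when d > DThi.
    """
    # bisect_left counts the thresholds that are strictly smaller than d,
    # which is the number of levels the packet climbs above level 1.
    return 1 + bisect_left(thresholds, d)
```

With thresholds `[6, 12, 20]`, a distance of 6 stays at level 1, a distance of 13 goes to level 3, and a distance of 21 goes to level 4, matching the table.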
How to distribute traffic among hierarchy levels?
Shortest path?
Shortest path – light load
8x8 PyraMesh, 3D illustration
Average latency (hierarchical) < Average latency (flat)
Shortest path – heavy load
8x8 PyraMesh, 3D illustration
Congestion!!!
Average latency (hierarchical) >> Average latency (flat)
Shortest path, but not for all?
The upper levels are sparse!
Shortest path only for distant packets – heavy load
Average latency (hierarchical) < Average latency (flat)
Shortest path only for distant packets – light load
Average latency in “shortest path for all” is lower!
Dynamic Traffic Distribution!
Traffic distribution – static vs. dynamic
Static Traffic Distribution (STrD): DThi = const. The traffic distribution remains constant.
Dynamic Traffic Distribution (DTrD): DThi = DThi(t). The traffic distribution is adapted to the traffic conditions.
Dynamic traffic distribution – Two modes
DThi, Hop-Distance – Minimizes hop distance. Optimal for light loads and high traffic locality, but causes congestion at the upper levels under heavy loads.
DThi, Load-Balance – Balances the traffic distribution among hierarchy levels. Optimal for heavy loads and low traffic locality, but not optimal under light loads.
At light traffic loads: use the Hop-Distance thresholds.
Under heavy loads: use the Load-Balance thresholds.
Example - 16x16 and 32x32 NoCs
Topology   DThi, Hop-Distance   DThi, Load-Balance
16x16      [5, 8]               [11, 19]
32x32      [4, 10, 50]          [23, 42, 61]
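To see how the two threshold sets treat the same packet, the example values can be plugged into a small lookup. This is a hedged sketch: the dictionary layout, the mode names `hop_distance` and `load_balance`, and the function name are hypothetical, and the lists are assumed to hold DTh1, DTh2, ... in order. Only the numbers come from the table.

```python
from bisect import bisect_left

# Example thresholds copied from the table: one DThi per level below the top.
THRESHOLDS = {
    ("16x16", "hop_distance"): [5, 8],
    ("16x16", "load_balance"): [11, 19],
    ("32x32", "hop_distance"): [4, 10, 50],
    ("32x32", "load_balance"): [23, 42, 61],
}


def highest_level_for_mode(d, topology, mode):
    """Return LMAX for hop distance d under the given topology and mode."""
    thresholds = THRESHOLDS[(topology, mode)]
    # A packet climbs one level for every threshold it strictly exceeds.
    return 1 + bisect_left(thresholds, d)
```

For a 32x32 NoC and a packet with D = 20, `highest_level_for_mode(20, "32x32", "hop_distance")` gives level 3, while `highest_level_for_mode(20, "32x32", "load_balance")` keeps the packet at level 1, which is how the Load-Balance mode relieves the sparse upper levels.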
Traffic Locality Model - Bandwidth Version of Rent’s Rule
B = k G^R
B – Cluster external bandwidth.
k – Average bandwidth per module.
G – Number of modules in a cluster.
R – Rent’s exponent, 0 < R < 1.
Example: G = 16, B = ∑ …
Greenfield et al., “Implications of Rent’s Rule for NoC Design and Its Fault-Tolerance,” NOCS 2007.
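As a quick numeric check of the rule B = k G^R, the following sketch evaluates it directly. The function name and the sample values of k and R are illustrative assumptions. Only G = 16 appears on the slide.

```python
def cluster_external_bandwidth(k, g, r):
    """Bandwidth version of Rent's rule: B = k * G**R.

    k: average bandwidth per module
    g: number of modules in the cluster
    r: Rent's exponent, required to satisfy 0 < r < 1
    """
    if not 0.0 < r < 1.0:
        raise ValueError("Rent's exponent must satisfy 0 < R < 1")
    return k * g ** r
```

For instance, with k = 1, G = 16 and an assumed R = 0.5, the cluster's external bandwidth is 4 units, far below the 16 units its modules generate in total. A larger R means less traffic locality and more traffic leaving the cluster.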
Feedback
Average buffer occupancy at the bottleneck level among the upper levels:
Feedback = max_{2 ≤ Level ≤ NL} Average(Buffers Occupancy_Level)
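One way to read the feedback definition is as the maximum, over levels 2 to NL, of the per-level average buffer occupancy. The sketch below follows that reading. The function name, the dictionary input format and the occupancy units are assumptions, not details from the slides.

```python
from statistics import mean


def feedback(occupancy_by_level):
    """Return the average buffer occupancy of the bottleneck upper level.

    occupancy_by_level maps a level number (1 = bottom mesh) to a list of
    buffer occupancies sampled from the routers of that level.
    """
    # Only the upper levels (2 and above) take part in the feedback.
    upper_levels = {
        level: samples
        for level, samples in occupancy_by_level.items()
        if level >= 2
    }
    # The bottleneck level is the one with the highest average occupancy.
    return max(mean(samples) for samples in upper_levels.values())
```

Calling `feedback({1: [4, 4], 2: [1, 3], 3: [0, 1]})` returns 2, because level 2 is the most occupied upper level and the bottom mesh is ignored.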
Feedback vs. injection rate
32x32, 4 Levels PyraMesh; Rentian traffic with R = 0.8
DTrD control scheme
Switch between distribution modes using 2 feedback thresholds.
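A two-threshold switch of this kind is commonly built as a hysteresis controller, sketched below. The slides state only that two feedback thresholds are used. The direction of each transition, the mode names and the threshold values in the usage note are assumptions made for illustration.

```python
HOP_DISTANCE = "hop_distance"
LOAD_BALANCE = "load_balance"


def next_mode(current_mode, feedback, low_threshold, high_threshold):
    """Return the distribution mode to use after observing the feedback.

    Assumed behaviour: leave Hop-Distance mode when the feedback rises above
    high_threshold, and return to it only when the feedback drops below
    low_threshold. The gap between the thresholds avoids rapid toggling.
    """
    if current_mode == HOP_DISTANCE and feedback > high_threshold:
        return LOAD_BALANCE
    if current_mode == LOAD_BALANCE and feedback < low_threshold:
        return HOP_DISTANCE
    return current_mode
```

With hypothetical thresholds of 0.2 and 0.6, a feedback of 0.4 leaves either mode unchanged, a feedback of 0.9 moves the network from Hop-Distance to Load-Balance, and a feedback of 0.1 moves it back.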
System architecture and implementation costs
Logic:
  Feedback logic: <10K NAND gates.
  Control logic: <1K gates.
  Routing logic: comparable to previous schemes.
Wires:
  Feedback links of 4 wires to <10% of the routers.
  1 broadcast control bit to all bottom mesh routers.
Communication:
  1 mode bit in head flits.
Simulation set-up
HNOCS – NoC simulation framework for OMNET++
http://hnocs.eew.technion.ac.il/
Yaniv Ben-Itzhak et al., NOCS 2011
Virtual channels per input port   2
Input buffer size [flits]         4
Packet size [flits]               8
Simulation clock period           2 ns
Hierarchical NoC sizes            16x16, 32x32
Traffic patterns                  Rentian (R = 0.6, 0.7, 0.8)
Average latency vs. injection rate at Rent’s exponents 0.6 to 0.8
Dynamic Simulation – 32x32 NoC
Conclusions
Static traffic distribution (STrD) in hierarchical NoCs can optimize performance under either light or heavy traffic loads, but not both at the same time.
Dynamic traffic distribution (DTrD) provides optimal performance under both light and heavy loads.
DTrD is lightweight, effective and feasible in future systems with many thousands of modules.
DTrD is useful and desirable in any un-clustered hierarchical NoC.
Thank You!