Rotary Router : An Efficient Architecture for CMP Interconnection Networks Pablo Abad, Valentín Puente, Pablo Prieto, and Jose Angel Gregorio University of Cantabria, Spain ISCA’07 Presented By Tina Miriam John

Post on 21-Dec-2015




Outline

Introduction

The Rotary Router

Avoiding Anomalies

Performance Evaluation

Implementation Practicality

Conclusion


Introduction

CMP (chip multiprocessor): the most effective way to deal with increasing design complexity.

Network requirements: lower latency, higher bandwidth, and low power consumption and area.

Existing low cost router architectures cause Head of Line (HOL) blocking.

Centralized internal storage not feasible in CMP framework.

With deterministic routing algorithms, real traffic patterns do not use network resources in a balanced way.

The small packets typical of CMP networks make increases in bandwidth less effective.


Figures: General Router Structure; Rotary Router sketch

Minimizes effects of small packets and takes advantage of them.

No appreciable HOL blocking.

Uses topology dependent adaptive routing.


General Router Structure

Two independent rings : packets circulate either clockwise or anti-clockwise.

Each ring built with a group of Dual-port FIFO Buffers (DFB).

Packets circulate using DFBs of the ring, until they reach a profitable output port.

No centralized arbitration is employed; arbitration is done independently at each router output port, regardless of the number of input ports.
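The circulation described above can be sketched in a few lines of Python. This is a simplified model, not the authors' implementation: the names `Packet`, `circulate` and `port_free` are hypothetical, each ring position stands for one DFB next to one output port, and an output port is treated as simply free or busy in the cycle considered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    name: str
    profitable_ports: frozenset  # output ports that bring the packet closer to its destination


def circulate(packet, entry_port, num_ports, port_free, clockwise=True):
    """Move a packet around one ring until it meets a profitable, free output port.

    Each output port decides on its own (port_free[i]); there is no central arbiter.
    Returns (exit_port, hops), or (None, num_ports) after one full turn without an exit.
    """
    step = 1 if clockwise else -1
    position = entry_port
    for hops in range(1, num_ports + 1):
        position = (position + step) % num_ports  # advance to the next DFB
        if position in packet.profitable_ports and port_free[position]:
            return position, hops
    return None, num_ports


# A packet that may leave through port 1 or port 2 enters at port 0.
# Port 1 is busy, so the packet keeps turning and leaves at port 2
# instead of waiting behind the busy port.
pkt = Packet("p0", frozenset({1, 2}))
print(circulate(pkt, 0, 4, [True, False, True, True]))  # (2, 2)
```

The point of the sketch is the last case: a busy output port does not stall the packet, it only makes the packet take one more hop on the ring.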


Router Building Blocks

Input Stage

Made of a FIFO buffer and a demux.

Computes the profitable output ports for each entering packet.

Selects the ring direction for packet movement so as to minimize delay.

Delay depends on the number of DFBs traversed and the time spent at each DFB.

Output Stage

Responsible for getting packets out of the rings and sending them to a neighbor router.

Made of two buffers and a mux.

Applies the flow control mechanism between contiguous routers.
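The direction choice made by the Input stage can be illustrated with a small Python sketch. The cost model is an assumption made here for illustration only: each DFB traversed costs one cycle plus one cycle per packet already waiting in it. The function names and the occupancy lists are hypothetical and are not taken from the paper.

```python
def estimate_delay(entry_port, target_port, occupancy, clockwise):
    """Estimated cycles to reach target_port on one ring.

    occupancy[i] is the number of packets queued in the DFB at ring position i.
    Assumed cost per DFB traversed: 1 cycle plus its current occupancy.
    """
    num_ports = len(occupancy)
    if not (0 <= entry_port < num_ports and 0 <= target_port < num_ports):
        raise ValueError("port index out of range")
    step = 1 if clockwise else -1
    position, delay = entry_port, 0
    while position != target_port:
        position = (position + step) % num_ports
        delay += 1 + occupancy[position]
    return delay


def choose_direction(entry_port, profitable_ports, cw_occupancy, acw_occupancy):
    """Return (delay, direction, port) with the smallest estimated delay."""
    candidates = []
    for port in profitable_ports:
        candidates.append(
            (estimate_delay(entry_port, port, cw_occupancy, True), "clockwise", port))
        candidates.append(
            (estimate_delay(entry_port, port, acw_occupancy, False), "anticlockwise", port))
    return min(candidates)  # ties are broken arbitrarily (tuple ordering)


idle = [0, 0, 0, 0]
# With empty rings the shorter way round wins.
print(choose_direction(0, {1}, idle, idle))          # (1, 'clockwise', 1)
# A congested DFB on the clockwise ring makes the longer way round cheaper.
print(choose_direction(0, {1}, [0, 5, 0, 0], idle))  # (3, 'anticlockwise', 1)
```

Both terms named on the slide appear in the estimate: the number of DFBs traversed (one unit per hop) and the time spent at each DFB (approximated here by its occupancy).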


Router Building Blocks

Buffering Segment Stage

Made up of two DFBs connecting every two router ports.

Each DFB has two pairs of R/W ports.

One pair builds a ring in which the packets turn.

The other pair connects the buffer to the Input and Output stages.

Decodes the routing information generated by the Input stage and placed in the packet header.


Flow Control and Routing Algorithm

Virtual Cut Through: controls the advance of packets among routers.

Bubble flow control: regulates packet injection into the rings.

Occupation based flow control: manages the advance of packets in the rings inside the router.
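As a reminder of what Virtual Cut Through implies between routers, here is a minimal Python sketch: a packet may advance only if the receiving buffer can hold the whole packet. The class name `DownstreamBuffer` and the flit-based sizes are assumptions for illustration; the slides do not give buffer sizes.

```python
class DownstreamBuffer:
    """Input buffer of the neighbor router, sized in flits (hypothetical model)."""

    def __init__(self, capacity_flits):
        self.capacity_flits = capacity_flits
        self.used_flits = 0

    def free_flits(self):
        return self.capacity_flits - self.used_flits

    def accept(self, packet_flits):
        """Virtual cut-through admission: accept only if the whole packet fits."""
        if self.free_flits() < packet_flits:
            return False  # packet stays where it is
        self.used_flits += packet_flits
        return True


buf = DownstreamBuffer(capacity_flits=8)
print(buf.accept(5))  # True: 8 free flits, packet needs 5
print(buf.accept(5))  # False: only 3 flits left, the packet does not advance
```

Because a packet is always stored whole, it never stays spread over several routers when it stops, which is what lets the bubble condition be expressed in packet-sized holes.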


Avoiding Anomalies

Deadlock and Livelock

Bubble flow control prevents input ports from exhausting buffering space in the internal rings of the router.

Packets always move between routers because of guaranteed hole in any ring.

Delays appearance of congested situations and removes HOL blocking effect.

Starvation

Injection traffic needs three holes to enter a router; in-transit traffic requires only two.

In-transit traffic starvation is reduced by balancing buffer occupation among input ports.

This is done by modifying the flow control, increasing the required number of holes to inject a packet into the ring.
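The hole thresholds above can be written as a short admission check. This Python sketch assumes that a hole is one free packet-sized slot and that every DFB of a ring has the same capacity; the names `HOLES_REQUIRED`, `free_holes` and `may_enter_ring` are hypothetical.

```python
# Thresholds taken from the slide: in-transit traffic needs two holes,
# injection traffic needs three.
HOLES_REQUIRED = {"in_transit": 2, "injection": 3}


def free_holes(ring_occupancy, dfb_capacity):
    """Count free packet-sized slots over all DFBs of one ring."""
    return sum(dfb_capacity - used for used in ring_occupancy)


def may_enter_ring(traffic_class, ring_occupancy, dfb_capacity):
    """Bubble-style check: enter only if enough holes remain in the ring."""
    return free_holes(ring_occupancy, dfb_capacity) >= HOLES_REQUIRED[traffic_class]


ring = [2, 2, 1, 1]  # packets stored in each of the four DFBs, capacity 2 each
print(may_enter_ring("in_transit", ring, dfb_capacity=2))  # True: 2 holes left
print(may_enter_ring("injection", ring, dfb_capacity=2))   # False: 3 holes needed
```

With a stricter threshold for injection, a nearly full ring still admits in-transit packets, so newly injected traffic cannot take the last free slots.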


Performance Evaluation

Synthetic Workloads

Maximum Normalized Throughput: (a) 4x4 torus (b) 8x8 torus


Performance Evaluation

Synthetic Workloads

(a) Random Traffic (b) Transpose Matrix Traffic


Performance Evaluation

Real Workloads

(a) Normalized Execution Time (b) Main Simulation Parameters


Implementation Practicality

Delay and Area

(a) Structure of DFB (b) Atomic modules of DFB


Implementation Viability

Power

(a) Power consumption for 8x8 torus network (b) Mobility of packets


Conclusion

A novel router architecture targeting CMP systems.

Utilizes a decentralized and scalable structure based on rings.

Eliminates HOL blocking, improves performance and provides a deadlock avoidance mechanism.

Reasonable costs in terms of area and power consumption.



Thanks!!!