interconnection network topology design trade-offs

Interconnection Network Topology Design Trade-offs

2

Organizational Structure

Processors datapath + control logic control logic determined by examining register transfers in the

datapath Networks

links switches network interfaces

3

Link Design/Engineering Space

Cable of one or more wires/fibers with connectors at the ends attached to switches or interfaces

Short: - single logicalvalue at a time

Long: - stream of logicalvalues at a time

Narrow: - control, data and timingmultiplexed on wire

Wide: - control, data and timingon separate wires

Synchronous:- source & dest on sameclock

Asynchronous:- source encodes clock insignal

4

Example: Cray MPPs

T3D: Short, Wide, Synchronous (300 MB/s) 24 bits

16 data, 4 control, 4 reverse direction flow control single 150 MHz clock (including processor) flit = phit = 16 bits two control bits identify flit type (idle and framing)

no-info, routing tag, packet, end-of-packet

T3E: long, wide, asynchronous (500 MB/s) 14 bits, 375 MHz, LVDS flit = 5 phits = 70 bits

64 bits data + 6 control switches operate at 75 MHz framed into 1-word and 8-word read/write request packets

5

Switches

Cross-bar

InputBuffer

Control

OutputPorts

Input Receiver Transmiter

Ports

Routing, Scheduling

OutputBuffer

6

Switch Components

Output ports transmitter (typically drives clock and data)

Input ports synchronizer aligns data signal with local clock domain essentially FIFO buffer

Crossbar connects each input to any output degree limited by area or pinout

Buffering Control logic

complexity depends on routing logic and scheduling algorithm determine output port for each incoming packet arbitrate among inputs directed at same output

7

Interconnection Topologies

Topology

[Regular] [Irregular]

[Static] [Dynamic]

[One-Dimensional]

[Three-Dimensional]

[Two-Dimensional]

[Hypercube] [....][Single-Stage]

[Crossbar][Multistage]

[One-Sided]

[Two-Sided]

[....]

8

Static Connection Topologies

Mesh and Torus Illiac IV, MPP, DAP, CM-2, Paragon k-dimensional mesh N=nk, d=2k, D=k(n-1) wraparound variation - Illiac IV Torus n x n binary torus, d = 4, D = 2 n/2

Hypercubes iPSC, nCube, CM-2 N = 2n, d = n, D = n poor scalability, difficulty in packaging higher-dimensional

hypercubes

9

Dynamic Interconnection Networks

Bus-based networks Crossbar networks Single Stage Networks

Shuffle-exchange N input and N output

Crossbar Recirculating networks

Multi-stage Networks more than one stage of switching elements switching box: straight, exchange, upper broadcast, lower

broadcast network topology and control structure

10

Dynamic Interconnection Networks

Two-sided MIN connecting an arbitrary input to an arbitrary output blocking, rearrangeable, nonblocking networks blocking networks

Data manipulator, Omega, Flip, n-cube, Baseline rearrangeable networks

Benes network nonblocking networks

Clos, Crossbar

11

Interconnection Topologies

Logical Properties: distance, degree

Physcial properties length, width

Fully connected network diameter = 1 degree = N cost?

bus => O(N), but BW is O(1) - actually worse crossbar => O(N2) for BW O(N)

VLSI technology determines switch degree

12

Linear Arrays and Rings

Linear Array Diameter? N-1 Average Distance? 2/3N Bisection bandwidth? 1 Route A -> B given by relative address R = B-A Space O(N)

Torus? Or Ring Examples: FDDI, SCI, FiberChannel Arbitrated Loop, KSR1

Linear Array

Torus

Torus arranged to use short wires

13

Multidimensional Meshes and Tori

d-dimensional array N = kd-1 X ...X kO nodes

described by d-vector of coordinates (id-1, ..., iO)

d-dimensional k-ary mesh: N = kd

k = dN described by d-vector of radix k coordinate

d-dimensional k-ary torus (or k-ary d-cube)?

2D Grid 3D Cube

14

Properties

Routing relative distance: R = (b d-1 - a d-1, ... , b0 - a0 )

traverse ri = b i - a i hops in each dimension dimension-order routing

Average Distance Wire Length? d x 2k/3 for mesh dk/2 for cube

Degree? Bisection bandwidth? Partitioning?

k d-1 bidirectional links Physical layout?

2D in O(N) space Short wires higher dimension?

15

Real World 2D mesh

1824 node Paragon: 16 x 114 array a single cabinet: 16 X 4 array

16

Embeddings in two dimensions

Embed multiple logical dimension in one physical dimension using long wires

6 x 3 x 2

17

Trees

Diameter and ave distance logarithmic k-ary tree, height d = logk N address specified d-vector of radix k coordinates describing path down from root

Fixed degree Route up to common ancestor and down

R = B xor A let i be position of most significant 1 in R, route up i+1 levels down in direction given by low i+1 bits of B

H-tree space is O(N) with O(N) long wires Bisection BW?

18

Fat-Trees

Fatter links (really more of them) as you go up, so bisection BW scales with N

Fat Tree

19

Butterflies

Tree with lots of roots! N log N switches (actually N/2 x logN) Exactly one route from any source to any dest R = A xor B, at level i use ‘straight’ edge if ri=0, otherwise cross edge Bisection N/2 vs N (d-1)/d (d-dimensional mesh) vs 1 (tree)

0

1

2

3

4

16 node butterfly

0 1 0 1

0 1 0 1

0 1

building block

20

Benes network and Fat Tree

Back-to-back butterfly can route all permutations off line

What if you just pick a random mid point?

16-node Benes Network (Unidirectional)

16-node 2-ary Fat-Tree (Bidirectional)

21

Hypercubes

Also called binary n-cubes. # of nodes = N = 2n. O(logN) Hops Good bisection BW Complexity

Out degree is n = logN

0-D 1-D 2-D 3-D 4-D 5-D !

22

Relationship BttrFlies to Hypercubes

Wiring is isomorphic Except that Butterfly always takes log n steps

23

Toplology Summary

All have some “bad permutations” many popular permutations are very bad for meshs (transpose) randomness in wiring or routing makes it hard to find a bad one!

Topology Degree Diameter Ave Dist Bisection D (D ave) @ P=1024

1D Array 2 N-1 N / 3 1 huge

1D Ring 2 N/2 N/4 2

2D Mesh 4 2 (N1/2 - 1) 2/3 N1/2 N1/2 63 (21)

2D Torus 4 N1/2 1/2 N1/2 2N1/2 32 (16)

k-ary n-cube 2n n(k-1) n(k-1)/2 2kn-1 27 (13.5) @n=3

Hypercube n=log N n n/2 N/2 10 (5)

24

Wire Efficient Communication Networks for Multicomputers

What makes a network efficient? Efficient use of the limiting resources Limiting Factors

switches and pins were only considered the limiting factors Wires are limiting factors because of power and delay as well as

density At the board level as well as at the chip level, the system

interconnection is limited by wire density Most of the power dissipated in the networks is CV2f power to used to

drive wires. Most of the delay is propagation delay over wires or RC delay in

driving wires

25

In the 3D world

For n nodes, bisection area is O(n2/3 )

For large n, bisection bandwidth is limited to O(n2/3 ) Bill Dally, IEEE TPDS, [Dal90a] For fixed bisection bandwidth, low-dimensional k-ary d-cubes are

better (otherwise higher is better) i.e., a few short fat wires are better than many long thin wires What about many long fat wires?

262-ary 3-fly

BI = N/2 : high bisection widthBWI = Nw/2din = dout = kd = 2k : low degreeD = d+1 : low diameter

The Design Objective of the Network

To minimize latency and maximize throughput Latency T(,L) :the average time required to deliver a

message Each node injects messages with average length L into the

network at an average rate of bits per cycle. Three independent variables: topology, routing, and flow control

TopologyIndirect Networks (k-ary d-flys: radix k and dimension d) No of processing nodes: N = kd

27

Wire Efficient Topology

Indirect Networks high bisection width, low degree, low diameter, long wires,

symmetry the bisection width B = N/2 does not reflect the actual maximum

wire density for this class of networks: vertical partition (N wires) more accurately reflects the wiring problems

wire area O(N2) : plane mapping - expensive N = kd. As one varies k and d with the number of processing

nodes, N, and BW fixed. the degree and diameter are directly controlled. the channel width remains fixed at w = BW/B=2BW/N. B is independent of the choice of k and d. disadvantage: it prevents the designer from trading off the bandwidth

of a channel against the diameter of the network.

28

Direct Networks (k-ary d-cubes) BD = 2N/k

BWD = 2Nw/k

din = dout = d d = 2d D = dk/2

For small d a low and controllable bisection width (N=kd) low degree high diameter short wires (d 3) wiring complexity O(N)

BI = N/2 : high bisection width

BWI = Nw/2

din = dout = k

d = 2k : low degree

D = d+1 : low diameter

Wire Efficient Topology

29

How Many Dimensions?

d = 2 or d = 3 Short wires, easy to build Many hops, low bisection bandwidth Requires traffic locality

d 4 Harder to build, more wires, longer average length Fewer hops, better bisection bandwidth Can handle non-local traffic

k-ary d-cubes provide a consistent framework for comparison N = kd

scale dimension (d) or nodes per dimension (k) assume cut-through

30

Traditional Scaling: Unloaded Latency(N)

Assumes equal channel width independent of node count or dimension dominated by average distance

0

20

40

60

80

100

120

140

0 2000 4000 6000 8000 10000

Machine Size (N)

Ave

Lat

ency

T(m

=40)

d=2

d=3

d=4

k=2

m/w

0

50

100

150

200

250

0 2000 4000 6000 8000 10000

Machine Size (N)

Ave L

ate

ncy T

(m=140)

Unit routing delay ( = 1) w = 1

31

Real Machines

Wide links, smaller routing delay Tremendous variation

32

Average Distance

but, equal channel width is not equal cost! Higher dimension => more channels

0

10

20

30

40

50

60

70

80

90

100

0 5 10 15 20 25

Dimension

Ave D

ista

nce

256

1024

16384

1048576

ave dist = d (k-1)/2

interconnection network topology design trade-offs

Documents