[ieee comput. soc. press 11th international parallel processing symposium - genva, switzerland (1-5...

Broadcasting and Multicasting in Cut-through Routed Networks*

Johanne Cohen†
LIP - CNRS
Ecole Normale Superieure de Lyon, France
j [email protected]

Jean-Claude Konig
Departement d’informatique
Université d’Evry, France
[email protected]. fr

Abstract

This paper addresses the one-to-all broadcasting problem and the one-to-many broadcasting problem, usually simply called broadcasting and multicasting, respectively. In this paper, we study these problems under both the line model and the cut-through model. The former assumes long distance calls between non-neighboring processors. The latter completes the line model by taking into account the use of a routing function. It is known that one can find time optimal broadcast and multicast protocols in the line model in polynomial time. We present a new time optimal broadcasting and multicasting algorithm in the line model. This algorithm efficiently uses the bandwidth of the network. Moreover, it also applies to the cut-through model as soon as the routing function generates shortest paths only.

1 Introduction

Given any point-to-point interconnection network modeled as a graph $G = (V, E)$, broadcasting is the information dissemination problem which consists in sending the same piece of information from a unique source to all the other nodes of the network. Such a communication scheme typically appears in many applications on distributed memory parallel computers, and is one of the kernels of the collective communication routines of the MPI library (Message Passing Interface [8]). Broadcasting is actually a particular case of multicasting, in which a source node has a unique message to send to a subset of nodes. Again, multicasting has many applications for the control of parallel computers, like barrier synchronization or cache coherence. Another major example of the use of a multicasting procedure is when a given user of a parallel machine can only make use of a subset of processors (a group of processors in the MPI terminology) and not of the entire machine. In this case, broadcasting a message in the user domain at the application level is actually a multicast at the system level.

*All authors are supported by the research programs PRS of the CNRS. †Additional support by the DRET of the DGA. ‡Additional support by the research program ANM of the CNRS.

Pierre Fraigniaud‡
LIP - CNRS
Ecole Normale Superieure de Lyon, France
pfraign@lip.ens-lyon.fr

Andre Raspaud
LaBRI - CNRS
Universite de Bordeaux 1, France
raspaud@labri.u-bordeaux.fr

The broadcasting problem was intensively studied under the store-and-forward routing mode, in which packets that proceed along a path in the network must be entirely stored on each intermediate node before being transmitted to the next one. The reader can consult the surveys [6, 9, 10].

In most modern distributed memory parallel computers, the store-and-forward routing mode has been replaced by various types of virtual cut-through routing [13], including direct-connect on several of Intel's machines, circuit-switching, and wormhole routing [15]. In the circuit-switched mode, when a node $x$ sends a message to a non-neighboring node $y$, a path is created between $x$ and $y$ to directly connect these two nodes. The message from $x$ is then transmitted along this path. Intermediate processors do not receive the message that goes through them because nodes generally consist of several components: the processing unit, its local memory, and a router which is in charge of transmitting messages.

A router sends a message to the local memory of its node only when the message is precisely destined to its node. Wormhole routing differs from circuit-switched routing in the way messages are transmitted along the path from the source to the destination. In wormhole routing, a message is decomposed into small units called flits. The first flit is used to determine the route followed by the message at each intermediate node, and the remaining flits follow in a pipeline fashion (the last flit releases the intermediate connections). Wormhole routing does not require a whole path to be reserved between the source and the destination: it makes use of a number of links proportional to the length of the message.

1063-7133/97 $10.00 © 1997 IEEE

In this paper, we are interested in the communication complexity of broadcasting and multicasting in virtual cut-through routed networks. In the following, we will make use, in particular, of the so-called line model [3], which supposes that (1) a call involves exactly two nodes (these two nodes might be at distance greater than one), (2) a node can participate in at most one call at a time, (3) each call takes a unit of time (7&(L) = l), and (4) any two paths corresponding to two simultaneous calls must be edge-disjoint. As opposed to the telephone model, which allows neighbor-to-neighbor communications only, the line model allows long distance calls between non-neighboring processors, as is possible in any virtual cut-through implementation.

However, the line model suffers from a major drawback. Indeed, in virtual cut-through routing, the paths followed by messages are, in general, determined by a routing function. In a network modeled by a graph $G = (V, E)$, a routing function $R$ is a collection of local functions

$$R = \{ R_x : V \to E,\ x \in V \}$$

such that any message destined to node $y$ arriving or originated in $x$ is transmitted from $x$ along the edge (or channel) $R_x(y)$. (To be well defined, $R_x$ must satisfy that $R_x(y)$ is an incident edge of $x$ for any $y \in V$, that is $R_x : V \to E_x$, where $E_x = \{ (x, y) \in E,\ y \in V \}$.) Two typical examples of routing functions are e-cube routing in hypercubes (messages are routed in dimension order), and XY-routing on meshes (messages are first routed horizontally and then vertically). To take into account the natural use of a routing function, we will consider the following simple additional hypothesis to the line model: (5) paths followed by messages are constructed by application of a routing function $R = \{ R_x : V \to E_x,\ x \in V \}$ at each intermediate node. In this paper, the line model plus hypothesis 5 is called the cut-through model.
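As an illustration of hypothesis 5, the following minimal sketch builds a path by applying a local routing function at each intermediate node. It assumes XY-routing on a 2D mesh whose nodes are (column, row) pairs; the names `xy_route` and `build_path` are hypothetical and chosen only for this example.

```python
from typing import List, Tuple

Node = Tuple[int, int]  # (column, row) coordinates in a 2D mesh (assumed encoding)


def xy_route(x: Node, y: Node) -> Tuple[Node, Node]:
    """Local routing function R_x(y): the edge leaving x toward y.

    Messages move horizontally first, then vertically (XY-routing).
    """
    if x == y:
        raise ValueError("no outgoing edge: the message is already at its destination")
    (cx, rx), (cy, ry) = x, y
    if cx != cy:  # correct the column first
        nxt = (cx + (1 if cy > cx else -1), rx)
    else:  # then correct the row
        nxt = (cx, rx + (1 if ry > rx else -1))
    return (x, nxt)


def build_path(src: Node, dst: Node) -> List[Node]:
    """Apply the routing function at each intermediate node (hypothesis 5)."""
    path, cur = [src], src
    while cur != dst:
        _, cur = xy_route(cur, dst)
        path.append(cur)
    return path
```

For instance, `build_path((0, 0), (2, 1))` visits `(1, 0)` and `(2, 0)` before reaching `(2, 1)`, which is a shortest path in the mesh.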

In the line model, most of the known results concern particular network architectures such as trees [3, 14], cycles [12], meshes or tori [4, 7]. A major result which applies to any topology is due to Farley [3], who has shown that, under the line model, for any network $G$ of $n$ nodes, the broadcasting time from any node of $G$ is $\lceil \log_2 n \rceil$. This result will be referred to as Farley's Theorem in the following. Note that this bound is optimal since hypotheses 1 and 2 imply that the number of informed nodes can at most double at each round.
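The doubling argument behind this bound can be checked with a few lines of integer arithmetic. This is only a sketch; the function names `min_rounds` and `informed_upper_bound` are hypothetical.

```python
from typing import List


def min_rounds(n: int) -> int:
    """Smallest t such that 2**t >= n, i.e. ceil(log2(n)), using integers only."""
    if n < 1:
        raise ValueError("a network has at least one node")
    return (n - 1).bit_length()


def informed_upper_bound(rounds: int) -> List[int]:
    """Maximum number of informed nodes after 0, 1, ..., rounds rounds.

    Each informed node takes part in at most one call per round
    (hypotheses 1 and 2), so the count can at most double.
    """
    informed = 1
    history = [informed]
    for _ in range(rounds):
        informed *= 2
        history.append(informed)
    return history
```

With 8 nodes, `min_rounds(8)` gives 3, and with 9 nodes a fourth round becomes necessary.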

Farley's theorem can be easily extended to the multicast problem: for any network $G = (V, E)$, for any source node $u \in V$, and for any destination set $D \subseteq V$, $u \in D$, the multicasting time of $u$ in $D$ is $\lceil \log_2 |D| \rceil$. As for broadcasting, multicasting has been intensively studied in the past under both store-and-forward and cut-through-like models (see the references in [5]). In most cases, the network is fixed (generally a mesh) and the influence of the multicasting algorithm on the traffic was measured by simulation. Note also that many papers dealing with multicasting make use of an additional hypothesis, called the path-based hypothesis [5], which specifies that a node is able to "read" any message traversing its router. In this paper, we will consider the line model applied to arbitrary network topologies.

Farley's theorem does not close the discussion about the line model (for instance, papers [4, 7, 12] optimize other criteria related to the topology of the network). Indeed, in practice, it is usually required that the traffic generated by a broadcast or a multicast does not interfere with other possible traffic. For instance, any broadcast initiated by a specific user of a multi-user parallel machine inside its own processor domain must not slow down communications of other users, especially when their respective domains are disjoint. More generally, multicasts are often initiated by system processes which must not reduce the ability of any user processes to exchange messages at their maximum rate. In other words, one requires the number of resources used to multicast or to broadcast a message to be minimal, or at least as small as possible.

The $\lceil \log_2 n \rceil$-time protocol of Farley's theorem is based on transmitting the message along the edges of a spanning tree, and therefore does not minimize the resources used to perform a broadcast. For instance, the routes followed by the several copies of the broadcast message can be quite long compared to the shortest paths between their sources and destinations. The length of the paths is not a major issue for cut-through routing. However, the simultaneous use of many long paths implies that many links will be reserved, at the expense of other possible communications. Our main goal in this paper is to figure out whether it is possible to derive optimal $\lceil \log_2 n \rceil$-time broadcast protocols achieving better use of the available resources of the network. Among the several possible parameters which measure the efficiency of a multicast or a broadcast protocol, we will consider the number of links. That is, we will be interested in minimizing the total number of communication links which are used, either at each round of the protocol, or globally during the whole protocol. The fewer links are used, the more furtive the protocol is.

2 Line model versus cut-through model

Before starting our study, let us point out the difference between the line model and the cut-through model, to make these two models clear. Let us consider the network of Figure 1. It is a double star $S = (V, E)$ with two centers $u$ and $v$, and $n - 2$ rays. First, let us show how to broadcast in $\lceil \log_2 n \rceil$ rounds from any source of $S$ in the line model (Farley's theorem ensures that such a broadcast protocol exists). Nodes at the extremities of the rays are labeled from 1 to $n - 2$, node $u$ is labeled 0, and node $v$ is labeled $n - 1$. Note that nodes $u$ and $v$ play the same role. There are only two cases to be considered: either the source of the broadcast is a center (say node 0), or it is one of the extremities of the $n - 2$ rays (say node 1). In both cases, one can always proceed so that both nodes 0 and 1 are aware of the broadcast message after one round. Now, if one considers the subgraph of the double star obtained by removing all the edges between $v$ and the other nodes but $u$, we obtain a usual star of center $u$. On this star, the $\lceil \log_2 n \rceil - 1$ next rounds are given by: at round $i$, $i > 1$, if the node labeled $k$ is aware of the message, then it informs the node labeled $k + 2^{i-1}$ (if such a node exists). There is no edge contention, and, after round $\lceil \log_2 n \rceil$, all nodes are aware of the message.
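The schedule described above can be sketched as follows, assuming the source is the center labeled 0, so that the first round is the call from 0 to 1. The names `star_broadcast_schedule`, `edges_of_call`, and `is_contention_free` are hypothetical helpers introduced for this example.

```python
from typing import FrozenSet, List, Tuple

Call = Tuple[int, int]  # (caller label, callee label)


def star_broadcast_schedule(n: int) -> List[List[Call]]:
    """Rounds of calls on the star centered at node 0, source at the center.

    At round i, every informed node k calls node k + 2**(i-1) when it exists.
    """
    informed = {0}
    rounds: List[List[Call]] = []
    i = 1
    while len(informed) < n:
        step = 2 ** (i - 1)
        calls = [(k, k + step) for k in sorted(informed) if k + step < n]
        rounds.append(calls)
        informed |= {t for _, t in calls}
        i += 1
    return rounds


def edges_of_call(k: int, t: int) -> List[FrozenSet[int]]:
    """Edges of the path from k to t in the star centered at node 0."""
    return [frozenset((0, w)) for w in (k, t) if w != 0]


def is_contention_free(calls: List[Call]) -> bool:
    """True when no edge is used by two calls of the same round (hypothesis 4)."""
    used = [e for k, t in calls for e in edges_of_call(k, t)]
    return len(used) == len(set(used))
```

For $n = 8$ this yields three rounds, the last one consisting of the four calls from nodes 0, 1, 2, 3 to nodes 4, 5, 6, 7, and each round passes the contention check.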


Figure 1. A double star of 8 vertices.

Let us now consider the routing function R that routes the messages on the double star as follows:

$$
\begin{cases}
\forall w \in V,\ w \neq u,\ w \neq v,\ \forall x \in V, & R_w(x) = u; \\
\forall x \in V, & R_u(x) = v; \\
\forall x \in V, & R_v(x) = x.
\end{cases}
$$

Assume $n = 8$. Under the cut-through model with this routing function, it is not possible to broadcast in 3 rounds in the double star of 8 vertices, whatever the source. Indeed, to broadcast in 3 rounds, the last round would consist of 4 calls performed at the same time. Every call whose caller is not $v$ and whose destination is not $u$ is routed through the arc from $u$ to $v$. Therefore, at least two of these calls would use the arc from $u$ to $v$, which is impossible. (Recall that both the line model and the cut-through model require edge-disjoint paths.)
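This impossibility can be confirmed by exhaustive search. The sketch below assumes the labeling used earlier ($u = 0$, $v = n - 1$), treats each link as a pair of directed arcs, and enumerates every way four informed nodes could call the four remaining nodes in the last round. The function names are hypothetical.

```python
from itertools import combinations, permutations
from typing import List, Tuple

Arc = Tuple[int, int]


def route(cur: int, dst: int, u: int, v: int) -> int:
    """Next node chosen by the routing function R of the double star."""
    if cur == v:
        return dst  # v delivers directly
    if cur == u:
        return v    # u forwards everything to v
    return u        # every ray extremity forwards everything to u


def path_arcs(src: int, dst: int, u: int, v: int) -> List[Arc]:
    """Arcs followed by a message from src to dst under R."""
    arcs, cur = [], src
    while cur != dst:
        nxt = route(cur, dst, u, v)
        arcs.append((cur, nxt))
        cur = nxt
    return arcs


def contention_free(calls: List[Tuple[int, int]], u: int, v: int) -> bool:
    used = [a for s, t in calls for a in path_arcs(s, t, u, v)]
    return len(used) == len(set(used))


def feasible_last_rounds(n: int = 8) -> int:
    """Number of contention-free rounds made of n/2 simultaneous calls."""
    u, v = 0, n - 1
    nodes = range(n)
    count = 0
    for callers in combinations(nodes, n // 2):
        callees = [w for w in nodes if w not in callers]
        for perm in permutations(callees):
            if contention_free(list(zip(callers, perm)), u, v):
                count += 1
    return count
```

Under these assumptions, `feasible_last_rounds(8)` finds no admissible last round, in agreement with the argument above.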

This example shows that the cut-through model is much more restrictive than the line model. In particular, it is not true that, for any network $G$, and for any routing function $R$ on $G$, there exists a broadcast protocol from any node of $G$ that performs in $\lceil \log_2 n \rceil$ rounds. However, one can state the following result:

Property 1 (cut-through version of Farley's theorem). Under the cut-through model, for any network $G$ of $n$ nodes, there exists a routing function $R$ such that the broadcasting time from any node of $G$ is $\lceil \log_2 n \rceil$.

Proof. There exist many proofs of Farley's theorem and similar results (see [3, 11, 14]). All of them use an arbitrary spanning tree of the considered network, and all the calls are performed along the edges of this spanning tree. Any spanning tree induces a routing function $R$ since there is a unique path between any two nodes in a tree. Therefore, all paths used by the broadcast protocols derived in [3, 11, 14] can be generated by a routing function. □

The proof of Property 1 produces a routing function which follows the edges of a spanning tree of the network. Such a routing function induces a lot of contention and hot-spots when it is used for other communication problems. Indeed, the traffic is not balanced, and the root of the tree is clearly overloaded. Moreover, though the use of shortest paths is not necessarily important in networks using circuit-switched or wormhole routing, such a routing function may route messages exchanged between neighbors along a path of length twice the diameter of the network! In the next section, we will show that it is possible to do much better.

3 Using the minimum number of links

In this section, we will focus on the total number of links that are used by a broadcasting or a multicasting algorithm at each round, or during the whole protocol. As we saw, this parameter can be considered as a measure of the furtiveness of the protocol. Note that by total number of links used during the whole protocol, we mean the sum, over all the rounds, of the number of links used at each round.

At each round of a broadcasting algorithm, nodes aware of the message are "matched" with other nodes that have not received the message yet. Let us formalize this fact:

Definition 1 Let $U$ be a subset of vertices of a graph $G = (V, E)$. A pseudo-matching in $U$ is a set $P$ of $\lfloor |U|/2 \rfloor$ pairwise edge-disjoint paths in $G$ such that every vertex of $U$ (but one if $|U|$ is odd) is an extremity of a path in $P$.
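Definition 1 can be turned into a small checker. The sketch below assumes that the graph is given as an adjacency dictionary and that each path is a list of vertices; `is_pseudo_matching` is a hypothetical name.

```python
from typing import Dict, Hashable, Iterable, List


def is_pseudo_matching(adj: Dict[Hashable, Iterable[Hashable]],
                       U: Iterable[Hashable],
                       paths: List[List[Hashable]]) -> bool:
    """Check Definition 1 for a candidate set of paths in the graph adj."""
    U = set(U)
    # There must be exactly floor(|U| / 2) paths.
    if len(paths) != len(U) // 2:
        return False
    # The extremities must be distinct vertices of U, so that every
    # vertex of U (but one if |U| is odd) is an extremity of some path.
    ends = [p[0] for p in paths] + [p[-1] for p in paths]
    if len(set(ends)) != len(ends) or not set(ends) <= U:
        return False
    # The paths must follow edges of the graph and be pairwise edge-disjoint.
    seen = set()
    for p in paths:
        for a, b in zip(p, p[1:]):
            if b not in adj[a]:
                return False
            e = frozenset((a, b))
            if e in seen:
                return False
            seen.add(e)
    return True
```

On the path graph 0-1-2-3 with $U = \{0, 1, 2, 3\}$, the paths 0-1 and 2-3 form a pseudo-matching, whereas 0-1-2 and 1-2-3 do not because they share the edge $\{1, 2\}$.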

The following result shows, in particular, that a pseudo-matching in $U$ exists for any choice of $U \subseteq V$. We denote by $d(x, y)$ the distance between two nodes $x$ and $y$ of a graph $G$.

Lemma 1 Let $U$ be a subset of vertices of a graph $G = (V, E)$. One can group the vertices of $U$ in pairs $(x_1, y_1), (x_2, y_2), \ldots, (x_k, y_k)$ where $k = \lfloor |U|/2 \rfloor$, $x_i \in U$, $y_i \in U$, $x_i \neq x_j$ and $y_i \neq y_j$ for $i \neq j$, and $x_i \neq y_j$ for all $i, j \in \{1, \ldots, k\}$, such that any shortest path between $x_i$ and $y_i$ is edge-disjoint with any shortest path between $x_j$ and $y_j$, $i \neq j$, and such that $\sum_{i=1}^{k} d(x_i, y_i)$ is minimum among all the possible choices of the pairs $(x_i, y_i)$, $i = 1, \ldots, k$.

Proof. Let $m = |U|$, and let us consider the complete graph $K_m$ of $m$ vertices where each vertex of $K_m$ is identified with a vertex of $U$. We add weights on the edges of $K_m$: the edge $\{x, y\}$ has weight $d(x, y)$, the distance between $x$ and $y$ in $G$. Consider a perfect matching of minimum weight in $K_m$ (if $m$ is odd, a minimum-weight matching covering all vertices but one). This matching induces a set $P$ of $\lfloor m/2 \rfloor$ shortest paths in $G$ such that every vertex of $U$ (but one if $|U|$ is odd) is an extremity of a path in $P$. (In case of multiple shortest paths between two matched vertices, choose one arbitrarily.)

We claim that the paths of $P$ are pairwise edge-disjoint. Indeed, assume that two paths $P_{\{x,y\}}$ and $P_{\{x',y'\}}$ of $P$ are not edge-disjoint. It means that there exist two vertices $z$ and $z'$ such that $P_{\{x,y\}} = P_{\{x,z\}} \cup P_{\{z,z'\}} \cup P_{\{z',y\}}$, and $P_{\{x',y'\}} = P_{\{x',z\}} \cup P_{\{z,z'\}} \cup P_{\{z',y'\}}$ or $P_{\{x',y'\}} = P_{\{x',z'\}} \cup P_{\{z',z\}} \cup P_{\{z,y'\}}$, where $P_{\{z,z'\}}$ and $P_{\{z',z\}}$ have at least one edge in common. Assume without loss of generality that $P_{\{x',y'\}} = P_{\{x',z\}} \cup P_{\{z,z'\}} \cup P_{\{z',y'\}}$. It implies that the two matched pairs $\{x, y\}$ and $\{x', y'\}$ can be replaced by the two other pairs $\{x, x'\}$ and $\{y, y'\}$. The former pairs have a weight of $d(x, z) + d(z', y) + d(x', z) + d(z', y') + 2d(z, z')$, whereas the latter have a weight of $d(x, x') + d(y, y')$, which is less than or equal to $d(x, z) + d(z', y) + d(x', z) + d(z', y')$. Since $d(z, z') \neq 0$, we obtain a contradiction with the fact that the original matching is of minimum weight, and therefore the paths of $P$ are pairwise edge-disjoint. □

We can define the weight of a pseudo-matching $P$ in $U$ as the sum of the lengths of all the paths in $P$. We are interested in minimizing the weights of the several pseudo-matchings generated at any round of a multicasting or a broadcasting algorithm. A pseudo-matching of small weight requires low use of the bandwidth to perform exchanges between the extremities of its paths. The following theorem shows that this minimization is possible in polynomial time. It strongly improves Farley's theorem.

Theorem 1 For any network $G = (V, E)$ of $n$ nodes, and any node $u$ of $G$, one can compute in polynomial time a multicast protocol from $u$ to any set $D$ in $G$, $u \in D$, which performs in $\lceil \log_2 |D| \rceil$ rounds in the line model, and such that: (i) all the calls are performed along shortest paths; (ii) at any of the $\lceil \log_2 |D| \rceil$ rounds, the weight of the corresponding pseudo-matching is minimum.

Proof. The $\lceil \log_2 |D| \rceil$ pseudo-matchings are constructed backward. Start with $U_1 = D$ and, by Lemma 1, compute a pseudo-matching $P_1$ in $U_1$ of minimum weight, and containing shortest paths only. Then choose one of the two extremities of each path in $P_1$. This choice is arbitrary, except that the source $u$ must be selected. Keep also the unique isolated unmatched vertex in $U_1$ if it exists. All these nodes form a set $U_2$. Then compute a pseudo-matching $P_2$ in $U_2$ of minimum weight and containing shortest paths only. Again, this is possible by Lemma 1. Then, we extract a set $U_3$ from $U_2$ as $U_2$ was extracted from $U_1$, and repeat the process until a set $U_{i+1}$ is reduced to $\{u\}$. Clearly $i$ satisfies $i = \lceil \log_2 |D| \rceil$, since $|U_{j+1}| = \lceil |U_j| / 2 \rceil$. The protocol performs the calls of $P_i$ at the first round and those of $P_1$ at the last round. This protocol can be computed in polynomial time because one can compute a perfect matching of minimum weight in the complete graph in polynomial time [2], and thus we can apply the construction of Lemma 1 in polynomial time. □
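The backward construction can be sketched on small instances as follows. This is not the polynomial procedure of the proof: the minimum-weight matching [2] is replaced here by an exhaustive search that is exponential in $|U|$, and breadth-first search stands in for the choice of shortest paths. The graph is assumed connected, given as an adjacency dictionary with sortable vertex labels, and all function names are hypothetical.

```python
from collections import deque
from functools import lru_cache


def shortest_path(adj, src, dst):
    """One shortest path from src to dst (breadth-first search)."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        if x == dst:
            break
        for y in adj[x]:
            if y not in parent:
                parent[y] = x
                queue.append(y)
    path = [dst]
    while path[-1] != src:
        path.append(parent[path[-1]])
    return path[::-1]


def min_weight_pairing(adj, U):
    """Pairs of U minimizing the sum of distances (exhaustive search).

    Stand-in for a polynomial minimum-weight matching algorithm;
    only suitable for small examples.
    """
    dist = {(x, y): len(shortest_path(adj, x, y)) - 1 for x in U for y in U}

    @lru_cache(maxsize=None)
    def best(rest, skip_allowed):
        if not rest:
            return 0, ()
        first, others = rest[0], rest[1:]
        options = []
        if skip_allowed:  # |rest| is odd: one vertex may stay unmatched
            options.append(best(others, False))
        for j, partner in enumerate(others):
            w, pairs = best(others[:j] + others[j + 1:], skip_allowed)
            options.append((w + dist[first, partner], pairs + ((first, partner),)))
        return min(options, key=lambda o: o[0])

    return list(best(tuple(U), len(U) % 2 == 1)[1])


def multicast_protocol(adj, source, D):
    """Rounds of the protocol, first round first; each call is a path
    from the caller to the callee."""
    current = sorted(D)  # U_1 = D
    matchings = []
    while len(current) > 1:
        pairs = min_weight_pairing(adj, tuple(current))
        paired = {w for p in pairs for w in p}
        keep = [w for w in current if w not in paired]  # unmatched vertex, if any
        calls = []
        for a, b in pairs:
            # Arbitrary choice of the kept extremity, except for the source.
            caller, callee = (b, a) if b == source else (a, b)
            keep.append(caller)
            calls.append(shortest_path(adj, caller, callee))
        matchings.append(calls)
        current = sorted(keep)  # next set U_{j+1}
    return matchings[::-1]  # constructed backward, performed forward
```

On the path graph with vertices 0 to 5, source 0, and $D = V$, this sketch returns three rounds, and replaying them shows that every caller is already informed when it places its call.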

Now, this result also holds in the cut-through model because, in the proof of Lemma 1, the way the shortest paths are selected does not matter. In particular, they can have been constructed by any shortest path routing function. Therefore, we get the following result, which strongly improves Property 1:

Theorem 2 For any network $G = (V, E)$ of $n$ nodes, any node $u$ of $G$, and any shortest path routing function on $G$, one can compute in polynomial time a multicast protocol from $u$ to any set $D$ in $G$, $u \in D$, which performs in $\lceil \log_2 |D| \rceil$ rounds in the cut-through model, and such that, at any of the $\lceil \log_2 |D| \rceil$ rounds, the weight of the corresponding pseudo-matching is minimum.

Theorem 2 says that, whatever the network and the routing function on this network, if this routing function routes messages along shortest paths, it is possible to broadcast and multicast optimally in logarithmic time. Most of the routing functions used in the usual topologies (meshes, multi-dimensional tori, ...) route on shortest paths (XY-routing, e-cube routing, ...). If wormhole routing is used, it is interesting to point out that there is no contradiction between the search for a deadlock-free routing function and the search for a routing function that ensures optimal broadcasting. Indeed, most of the classical deadlock-free routing functions generate shortest paths only.

Theorems 1 and 2 both say that one can easily minimize the annoyance of each round of any multicasting algorithm, and therefore one can easily ensure that the additional traffic due to this process will not perturb other user traffic too much. Unfortunately, they do not say that the sum of the lengths of all the paths used during the whole multicast is minimum. The next result shows that minimizing this parameter is NP-complete (one can find the proof in [1]).

Theorem 3 The following problem is NP-complete:

MINIMUM-TOTAL-PATH-LENGTH (MTPL)
Instance: A graph $G = (V, E)$, a vertex $u$ of $G$, a subset $D \subseteq V$, $u \in D$, and an integer $k$.
Question: Does there exist a multicast protocol from $u$ to $D$ in $G$ performing in $\lceil \log_2 |D| \rceil$ rounds in the line model, and such that the sum of all the lengths of all the communication paths generated by this protocol is at most $k$?

Theorem 3 implies that it is difficult to optimize the total edge-load of a multicasting or a broadcasting algorithm. However, one can approximate this value up to a logarithmic factor. More precisely, for any network $G = (V, E)$, any node $u \in V$, and any set $D \subseteq V$, $u \in D$, let us denote by $S(G, D, u)$ the minimum, taken over all the multicast protocols from $u$ to $D$ in $G$ performing in $\lceil \log_2 |D| \rceil$ rounds, of the sum of all the lengths of all the paths generated by the protocol. Theorem 3 shows that deciding whether $S(G, D, u) \leq k$ is an NP-complete problem. However, we have, in the case of the broadcasting problem:

Theorem 4 Let $G = (V, E)$ be any graph, and let $u \in V$. There exists a broadcast protocol from $u$ in $G$ which performs in $\lceil \log_2 n \rceil$ rounds in the line model, and such that the total sum of the lengths of all the paths generated by the algorithm is at most $\lceil \log_2 n \rceil \, S(G, V, u)$.

The proof of this theorem can be found in [1].

4 Conclusion

In addition to the number of links, we have also studied other parameters like the number of transmitters (a transmitter is a router which is explicitly used to forward a message), the load of the transmitters (that is, the maximum number of paths which traverse a router simultaneously), and the maximum path length. Minimizing the number of transmitters potentially decreases the number of nodes which will be disturbed by the multicast protocol. Similarly, minimizing the maximum number of communication paths that a given transmitter handles simultaneously avoids overloading intermediate routers. The length of the paths allows us to estimate the possible degradation of the time complexity of the protocol when the switching time of the routers cannot be neglected as a message proceeds along a long path. Unfortunately, we have shown in [1] that optimizing any of these parameters (globally or at any round) is NP-complete.

The situation is not hopeless, however, because the reader must keep in mind that the main parameter to optimize in cut-through routed networks is the use of the bandwidth. The first property that a communication protocol must satisfy in this model is to be free of link contention. From these points of view, we have provided important results. It is possible to multicast or broadcast in an optimal number of rounds such that all calls are performed on shortest paths, and the sum of the path lengths at each round is minimum. Even if minimizing the use of the bandwidth (that is, the total number of links) for the whole protocol is NP-complete, we have described a protocol which reaches the lower bound up to a logarithmic factor.

References

[1] J. Cohen, P. Fraigniaud, J.-C. Konig, and A. Raspaud. Broadcasting and multicasting in cut-through routed networks. Technical Report RR-96-36, LIP, ENS-Lyon, France, 1996.

[2] J. Edmonds. Maximum matching and a polyhedron with 0,1-vertices. Journal of Research of the National Bureau of Standards-B, 69B:125-130, 1965.

[3] A. Farley. Minimum-time line broadcast networks. Networks, 10:59-70, 1980.

[4] R. Feldmann, J. Hromkovic, S. Madhavapeddy, B. Monien, and P. Mysliwietz. Optimal algorithms for dissemination of information in generalized communication modes. Technical Report 115, Fachbereich Mathematik-Informatik, Universität-Gesamthochschule Paderborn, Germany, 1993.

[5] E. Fleury and P. Fraigniaud. Strategies for Multicasting in Meshes. Research Report 94-07 (submitted to the Journal of Parallel and Distributed Computing), Laboratoire de l'Informatique du Parallelisme, ENS-Lyon, http://www.ens-lyon.fr/LIP/, 1994.

[6] P. Fraigniaud and E. Lazard. Methods and Problems of Communication in Usual Networks. Discrete Applied Mathematics, 53:79-133, 1994.

[7] P. Fraigniaud and J. Peters. Structured Communication in torus networks. In 28th Annual Hawaii International Conference on System Sciences, pages 584-593. IEEE, 1995.

[8] W. Gropp, E. Lusk, and A. Skjellum. Using MPI: Portable Parallel Programming with the Message Passing Interface. The MIT Press, 1994. ISBN 0-262-57104-8.

[9] S. Hedetniemi, S. Hedetniemi, and A. Liestman. A survey of gossiping and broadcasting in communication networks. Networks, 18:319-349, 1986.

[10] J. Hromkovic, R. Klasing, B. Monien, and R. Peine. Dissemination of information in interconnection networks (broadcasting and gossiping). In D.-Z. Du and D. F. Hsu, editors, Combinatorial Network Theory, pages 125-212. Kluwer Academic, 1995.

[11] J. Hromkovic, R. Klasing, W. Unger, and H. Wagener. Optimal algorithm for broadcast and gossip in the edge-disjoint path mode. In 4th Scandinavian Workshop on Algorithm Theory (SWAT'94), volume 824 of LNCS, pages 219-230. Springer-Verlag, 1994.

[12] J. Kane and J. Peters. Line broadcasting in cycles. Technical Report CMPT TR 94-11, School of Computing Science, Simon Fraser University, Burnaby, Canada, 1994.

[13] P. Kermani and L. Kleinrock. Virtual cut-through: a new computer communication switching technique. Computer Networks, 3:267-286, 1979.

[14] C. Laforest. Broadcast and gossip in line-communication mode. Technical Report 1005, Laboratoire de Recherche en Informatique, Univ. Paris Sud, 91405 Orsay, France, 1995.

[15] L. Ni and P. McKinley. A survey of wormhole routing techniques in direct networks. Computer, 26(2):62-76, Feb. 1993.
