
Exploring Efficient and Scalable Multicast Routing in Future Data Center Networks

Dan Li, Jiangwei Yu, Junbiao Yu, Jianping Wu

Tsinghua University

Presented by DENG Xiang

Outline

I Introduction and background
II Build an efficient multicast tree
III Make multicast routing scalable
IV Evaluation
V Conclusion

Data Centers

the core of cloud services
online cloud applications
back-end infrastructural computations
servers and switches
popularity of group communication

Introduction and background

Multicast

save network traffic
improve application throughput

Internet-oriented Multicast is successful.

When Multicast meets data center networks...

Problem A:

Data center topologies usually expose high link density, and traditional technologies can result in severe link waste.

Problem B:

Low-end commodity switches are widely used in most data center designs for economic and scalability reasons.

Data Center Network Architecture

BCube
Portland
VL2 (similar to Portland)

Build an efficient Multicast tree

BCube
constructed recursively: BCube(n,0), BCube(n,1), ..., BCube(n,k)
each server has k+1 ports
each switch has n ports
number of servers: n^(k+1)

Portland
three levels and n pods
aggregation level and edge level: n/2 switches with n ports each, in every pod
core level: (n/2)^2 switches with n ports
number of servers: n^3/4
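The size formulas above can be checked for the two topologies used later in the evaluation. This is a minimal sketch; the function names are chosen here for illustration and do not come from the slides.

```python
def bcube_servers(n: int, k: int) -> int:
    """Number of servers in BCube(n, k): n^(k+1)."""
    return n ** (k + 1)


def portland_servers(n: int) -> int:
    """Number of servers in a three-level Portland built from n-port switches: n^3/4."""
    return n ** 3 // 4


def portland_core_switches(n: int) -> int:
    """Number of core-level switches: (n/2)^2."""
    return (n // 2) ** 2


print(bcube_servers(8, 3))          # 4096
print(portland_servers(48))         # 27648
print(portland_core_switches(48))   # 576
```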

Consistent themes among them:
low-end switches are used to keep cost down
high link density exists
the data center structure is built in a hierarchical and regular way

To save network traffic, how should an efficient Multicast tree be built?

traditional receiver-driven Multicast routing protocols, originally designed for the Internet, such as PIM

approximation algorithms for the Steiner tree problem
Steiner tree problem: build a Multicast tree with the lowest cost covering the given nodes

source-driven tree building algorithm

the proposed algorithm

group spanning graph
each hop is a stage
stage 0 includes the sender only
stage d is composed of receivers
d is the diameter of the data center topology
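The staging idea can be illustrated with a breadth-first layering from the sender. This is a simplified sketch: it assumes a stage is just the hop count from the sender and does not prune the graph down to nodes relevant to a given group. The function name and the toy topology are made up for illustration.

```python
from collections import deque


def stages_from_sender(adjacency, sender):
    """Group nodes by hop count from the sender (stage 0 holds the sender only)."""
    hop = {sender: 0}
    queue = deque([sender])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in hop:
                hop[nxt] = hop[node] + 1
                queue.append(nxt)
    stages = {}
    for node, h in hop.items():
        stages.setdefault(h, set()).add(node)
    return stages


# Toy topology: sender s, two switches, three servers.
toy = {"s": ["w1", "w2"], "w1": ["a", "b"], "w2": ["b", "c"]}
stages = stages_from_sender(toy, "s")
for stage in sorted(stages):
    print(stage, sorted(stages[stage]))
```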

Build the Multicast tree by source-to-receiver expansion over the group spanning graph, with the tree node set chosen at each stage strictly covering the downstream receivers.

definition of cover:
A covers B if and only if, for each node in B, there exists a directed path to it from a node in A.
A strictly covers B when A covers B and no proper subset of A covers B.
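The two definitions translate directly into a reachability check. The sketch below assumes the graph is given as a directed adjacency dictionary; the function names and the toy graph are illustrative only. Because covering is monotone (adding nodes to A never loses coverage), it is enough to test the subsets obtained by dropping one node of A.

```python
def reachable(adjacency, start):
    """All nodes that can be reached from start along directed edges."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def covers(adjacency, a, b):
    """A covers B: every node in B has a directed path from some node in A."""
    reach = set()
    for node in a:
        reach |= reachable(adjacency, node)
    return set(b) <= reach


def strictly_covers(adjacency, a, b):
    """A strictly covers B: A covers B and no proper subset of A does."""
    a = set(a)
    if not covers(adjacency, a, b):
        return False
    return all(not covers(adjacency, a - {x}, b) for x in a)


graph = {"w1": ["a", "b"], "w2": ["b", "c"], "w3": ["c"]}
print(strictly_covers(graph, {"w1", "w2"}, {"a", "b", "c"}))        # True
print(strictly_covers(graph, {"w1", "w2", "w3"}, {"a", "b", "c"}))  # False
```

In the second call w2 is redundant, since w1 and w3 together already reach a, b and c.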

algorithm details in BCube:

a) Select the set of servers (call it E) in stage 2 that are covered by the sender s and a single switch in stage 1 (call it W).

b) Each of |E| of the BCube(n,k-1)s has a server p in E as its source, with the receivers in stage 2*(k+1) covered by p as its receiver set.

c) The other BCube(n,k-1) has s as its source, with the receivers in stage 2*k that are covered by s but not by W as its receiver set.

algorithm details in Portland:

a) From the first stage to the stage of core-level switches, any single path can be chosen, because any single core-level switch can cover the downstream receivers.

b) From the stage of core-level switches to the final stage of receivers, the paths are fixed due to the interconnection rule in PortLand.

A packet forwarding mechanism that supports massive Multicast groups is necessary:

in-packet Bloom Filter
With in-packet Bloom Filter alone, bandwidth waste is significant for large groups.

in-switch forwarding table
With in-switch forwarding tables alone, very large memory space is needed.

Make Multicast routing scalable

The bandwidth waste of in-packet Bloom Filter comes from:
the Bloom Filter field in the packet, which costs network bandwidth.
false-positive forwarding by the Bloom Filter, which causes traffic leakage.
switches that receive packets by false-positive forwarding may forward them further to other switches, incurring not only additional traffic leakage but also possible loops.

We define the Bandwidth Overhead Ratio r to describe in-packet Bloom Filter:

p -- the packet length (including the Bloom Filter field)
f -- the length of the in-packet Bloom Filter field
t -- the number of links in the Multicast tree
c -- the number of actual links covered by Bloom Filter based forwarding
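The formula for r is not reproduced in these notes. The sketch below assumes one reading that is consistent with the four definitions: r is the extra traffic divided by the payload traffic, i.e. r = (c*p - t*(p - f)) / (t*(p - f)). Treat the formula and the function name as assumptions, not as the authors' exact expression.

```python
def bandwidth_overhead_ratio(p: float, f: float, t: int, c: int) -> float:
    """Assumed form: extra traffic over payload traffic.

    p: packet length in bytes, including the Bloom Filter field
    f: length of the in-packet Bloom Filter field in bytes
    t: number of links in the Multicast tree
    c: number of links actually covered by Bloom Filter based forwarding
    """
    payload_traffic = t * (p - f)   # what an ideal tree would carry
    actual_traffic = c * p          # what is really sent, filter field included
    return (actual_traffic - payload_traffic) / payload_traffic


# 1500-byte packets, 32-byte filter, 100-link tree, no false positives (c == t).
print(bandwidth_overhead_ratio(1500, 32, 100, 100))
# Same tree, but false positives make the packet cross 120 links.
print(bandwidth_overhead_ratio(1500, 32, 100, 120))
```

Under this reading the overhead has two visible parts: the filter field itself (present even when c equals t) and the leaked traffic on the c - t extra links.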

With a packet size of 1500 bytes, the relation among r, f and group size:

[Figure: two panels, BCube(8,3) and Portland with 48-port switches]

In-packet Bloom Filter does not accommodate large groups, so a combination routing scheme is proposed.

a) In-packet Bloom Filters are used for small groups to save routing space in switches, while routing entries are installed into switches for large groups to alleviate bandwidth overhead.

b) Intermediate switches/servers receiving the Multicast packet check a special TAG in the packet to determine whether to forward the packet via the in-packet Bloom Filter or by looking up the in-switch forwarding table.
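The combination scheme can be sketched as a size threshold at the sender plus a TAG check at each hop. Everything named here is hypothetical: the TAG values, the `Packet` fields, and the threshold are placeholders, and an exact set stands in for the Bloom Filter so that the example has no false positives.

```python
from dataclasses import dataclass

TAG_BLOOM = 0   # hypothetical TAG value: forward via in-packet Bloom Filter
TAG_TABLE = 1   # hypothetical TAG value: forward via in-switch table


@dataclass
class Packet:
    group: str
    tag: int
    filter_members: frozenset = frozenset()  # stand-in for the Bloom Filter field


def choose_tag(group_size: int, threshold: int) -> int:
    """Small groups use the in-packet filter, large groups use switch entries."""
    return TAG_BLOOM if group_size <= threshold else TAG_TABLE


def next_hops(packet: Packet, neighbors, forwarding_table):
    """Pick next hops according to the TAG carried in the packet."""
    if packet.tag == TAG_BLOOM:
        return [n for n in neighbors if n in packet.filter_members]
    return list(forwarding_table.get(packet.group, []))


table = {"g_large": ["w2", "w3"]}
small = Packet("g_small", choose_tag(5, threshold=100), frozenset({"w1"}))
large = Packet("g_large", choose_tag(5000, threshold=100))
print(next_hops(small, ["w1", "w2", "w3"], table))  # ['w1']
print(next_hops(large, ["w1", "w2", "w3"], table))  # ['w2', 'w3']
```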

two ways of in-packet Bloom Filter encoding:

node-based encoding
elements are the tree nodes, including switches and servers; this is the one chosen.

link-based encoding
elements are the directed physical links
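Node-based encoding amounts to inserting the identifiers of the tree nodes into a Bloom Filter carried in the packet. The sketch below is a generic Bloom Filter, not the authors' implementation: the hash construction (SHA-256 with an index prefix), the number of hash functions, and the node names are assumptions. The 256-bit size matches a 32-byte field.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom Filter over string identifiers."""

    def __init__(self, size_bits: int = 256, num_hashes: int = 4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array stored as a Python integer

    def _positions(self, item: str):
        # Derive num_hashes bit positions by hashing the item with an index prefix.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: str) -> bool:
        # True for every added item; may also be True for others (false positive).
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


tree_nodes = ["switch-0-1", "switch-1-3", "server-0012", "server-0345"]
bf = BloomFilter()
for node in tree_nodes:
    bf.add(node)
print(all(node in bf for node in tree_nodes))  # True
```

A node that was never added can still test as present; that is the false-positive forwarding discussed in these slides.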

False-positive forwarding caused by in-packet Bloom Filter may result in loops.

the solution: a node forwards the packet only to those neighboring nodes (within the Bloom Filter) whose distances to the source are larger than its own.
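The loop-avoidance rule can be sketched as a filter on candidate next hops. The sketch assumes each node already knows the distance from the source to itself and to its neighbors (how that distance is obtained is not covered here); the function and variable names are illustrative.

```python
def forward_targets(node, neighbors, in_filter, dist_to_source):
    """Neighbors that match the filter and lie strictly farther from the source."""
    own = dist_to_source[node]
    return [n for n in neighbors if in_filter(n) and dist_to_source[n] > own]


dist = {"s": 0, "w1": 1, "w2": 1, "a": 2, "b": 2}
# Suppose the filter matches every node here; w2 is a false positive.
matches_all = lambda n: True

# Server a will not send the packet back up to w1 or to the false positive w2.
print(forward_targets("a", ["w1", "w2"], matches_all, dist))     # []
# Switch w1 forwards only away from the source.
print(forward_targets("w1", ["s", "a", "b"], matches_all, dist))  # ['a', 'b']
```

Since the distance to the source strictly increases at every hop, a packet can never return to a node it has already visited.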

Evaluation

evaluation of source-driven tree building algorithm:

BCube(8,3) and 48-port-switch Portland; 1 Gbps link speed; 200 random-sized groups
metrics: number of links in the tree, computation time

[Figure panels: BCube, Portland]

evaluation of combination forwarding scheme with 32-byte Bloom Filter:

Conclusion

Efficient and Scalable Multicast Routing in Future Data Center Networks

an efficient Multicast tree building algorithm
a combination forwarding scheme for scalable Multicast routing
