A Fault-Tolerant Network Architecture for Modular Datacenter



    International Journal of Software Engineering and Its Applications

    Vol. 6, No. 2, April, 2012


restriction on scalability of MDCN. So, intra-container networks can adopt complex topologies that might be considered unsuitable for traditional DCNs.

In this work, we present SCautz, a novel hierarchical intra-container network structure for a shipping-container Kautz network. SCautz comprises a base physical Kautz topology, built by interconnecting servers' NIC ports, and a small number of redundant COTS (commodity off-the-shelf) switches. The SCautz base topology adopts the server-centric approach: servers take charge of routing traffic and work with switches to bypass failed servers, achieving graceful performance degradation.

The basic idea of SCautz is driven by the demand that the MDC's service-free mode places on the fault-tolerance of MDCN, and its design is inspired by the scale-out principle in datacenter construction. Results from theoretical analysis and simulations show that SCautz is more viable for MDCN for the following reasons:


First, SCautz's base topology can offer network capacity as high as BCube's [5] for

    one-to-x (e.g., one-to-one, one-to-all) and all-to-all traffic.

Second, we propose a fault-tolerant routing algorithm called SCRouting+, which leverages switches and peer servers connected to the same switch to bypass failed servers. SCautz can thus maintain throughput for one-to-x traffic, and let network performance degrade smoothly for all-to-all traffic, much more slowly than the MDC's computation and storage capacity do.

Third, the extra cost of the redundant switches is very low: theoretical analysis shows that a typical SCautz-based container with 1280 servers needs only 160 switches.

The rest of the paper is organized as follows. Section 2 discusses related work. Section 3 presents the SCautz architecture. Section 4 describes routing in SCautz. Section 5 evaluates SCautz through simulations, and Section 6 concludes the paper.

    2. Related Works

As MDCs become popular, modular datacenter networks (MDCN) have attracted growing interest from cloud providers, hardware vendors and academia. To overcome the drawbacks of traditional networks in supporting cloud data-intensive computing, many novel datacenter networks (DCN) have been proposed.

VL2 [6] and PortLand [7] organize switches into sophisticated Clos and fat-tree structures respectively, in which any two servers can communicate with each other at the maximum rate of their network-interface cards (NICs). Since the routing intelligence is placed on switches, VL2 and PortLand are switch-centric DCNs, while DCell [8], BCube and CamCube [9] are server-centric DCNs, with routing intelligence placed on servers. DCell proposes a new recursive structure for high scalability, BCube leverages low-end COTS switches to implement an intra-container network based on the Hypercube topology [10], and CamCube designs a direct-connect 3D torus topology, which has been adopted by the Content Addressable Network (CAN) overlay [11]. Because its servers are always equipped with multiple NICs, a server-centric DCN is more effective than a switch-centric one in supporting data-intensive applications and dealing with failures. Moreover, since DCell has a network performance bottleneck at its lower hierarchy and CamCube mainly studies the flexibility of its routing API for cloud applications, BCube better offers high uniform network capacity and achieves graceful performance degradation.


In a server-centric MDCN, failures of servers and switches both decrease the overall performance of containers. For example, BCube's incomplete structure makes its throughput for one-to-x traffic patterns drop noticeably, and its ABT (Aggregate Bottleneck Throughput) for all-to-all traffic degrades faster than computation and storage do. Furthermore, switch failures decrease BCube's performance even more significantly: its ABT shrinks by more than 50% in the presence of 20% switch failures [5].

SCautz proposes a novel hierarchical network structure, modeled on the undirected Kautz graph, to avoid these problems. The Kautz graph achieves a near-optimal tradeoff between node degree and diameter, and has better bisection width and bottleneck degree. However, it was considered unsuitable for mega datacenters because it is hard to deploy incrementally without violating the original structure. For MDCN, the number of servers in a container is fixed and the interior network does not change during its whole lifecycle, so this restriction no longer applies. Through simulations and comparisons, we show that SCautz is more viable for MDCN.

    3. SCautz Architecture

SCautz comprises two types of components: servers with multiple NICs and COTS switches. Servers interconnect their NICs to form a physical Kautz topology as SCautz's base network structure, denoted UK(d,k). Switches use their low-speed (1 Gbps) ports to connect a specific number of servers, and reserve their high-speed (10 Gbps) ports for inter-container networks.

    3.1. Preliminaries

To define the base undirected Kautz topology of SCautz, we first introduce the directed Kautz graph. Let Z_{d+1} = {0, 1, ..., d} be an alphabet of d+1 letters, and let the identifier space of Kautz be the set of strings x_1 x_2 ... x_k of length k over Z_{d+1} whose consecutive letters are different.


    Figure 1. The Kautz Graph and its Undirected Structure

Definition 1 (Kautz graph [13]). The vertices and edges of the Kautz graph K(d,k) are V(K(d,k)) and E(K(d,k)):

V(K(d,k)) = { x_1 x_2 ... x_k : x_i in Z_{d+1}, x_i != x_{i+1}, 1 <= i < k },

E(K(d,k)) = { (x_1 x_2 ... x_k, x_2 ... x_k y) : y in Z_{d+1}, y != x_k }.


The Kautz graph K(d,k) is d-regular, its diameter is k, and it has d^k + d^(k-1) vertices and d^(k+1) + d^k edges. SCautz's base undirected Kautz structure UK(d,k) is obtained by omitting the directions of the edges while keeping the loops between vertex pairs of the form (abab...), e.g., (01,10), (21,12) and (02,20). UK(d,k) is therefore 2d-regular, unlike the general undirected Kautz graph. Figure 1 shows K(2,2) and UK(2,2).
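The vertex and edge counts above are easy to check by brute-force enumeration. The following sketch (the function names are ours, for illustration) builds K(d,k) directly from Definition 1 and verifies the counts used later in the paper:

```python
from itertools import product

def kautz_vertices(d, k):
    """Identifiers of K(d,k): length-k strings over {0..d} with no equal consecutive letters."""
    return [s for s in product(range(d + 1), repeat=k)
            if all(s[i] != s[i + 1] for i in range(k - 1))]

def kautz_edges(d, k):
    """Directed edges (x1..xk) -> (x2..xk y) for every y != xk."""
    return [(v, v[1:] + (y,))
            for v in kautz_vertices(d, k)
            for y in range(d + 1) if y != v[-1]]

# K(d,k) has d^k + d^(k-1) vertices and d^(k+1) + d^k edges.
assert len(kautz_vertices(2, 2)) == 2**2 + 2**1   # K(2,2) of Figure 1: 6 vertices
assert len(kautz_edges(2, 2)) == 2**3 + 2**2      # 12 directed edges
assert len(kautz_vertices(4, 5)) == 4**5 + 4**4   # base of SCautz(4,5,3): 1280 servers
```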

    3.2. SCautz Structure

The complete structure of SCautz with redundant switches is denoted SCautz(d,k,t), defined as follows.

Definition 2. Let SCautz(d,k,t) be the complete SCautz network with the base topology UK(d,k) and a redundant switch structure. Its components are the nodes (servers), the switches of two types (S_left and S_right), the clusters of two types (C_left and C_right), and the links, which comprise the links directly connecting servers and the links connecting servers to switches.

The nodes of SCautz(d,k,t) are defined exactly as in Kautz(d,k), where t is the length of a switch's identifier. Owing to the different rules for organizing servers, the switches are divided into two categories, S_left and S_right: servers whose rightmost (resp. leftmost) substrings of length t are identical to a certain switch's identifier connect to the corresponding S_right (resp. S_left). So t determines the number of servers in one cluster and the total number of switches. The n servers connected to the same switch form a cluster; hence all clusters fall into two categories as well, denoted C_right and C_left, and each server is a member of one C_right and one C_left simultaneously. For example, the switch S_right = 10 connects with four servers (1010, 2010, 0210 and 1210), building the cluster C_right = {10}; the switch S_left = 02 also connects with four servers (0201, 0202, 0210 and 0212), building the cluster C_left = {02}; and server 0210 is a member of both clusters {10} and {02}, as shown in Figure 2. In the rest of the paper we do not distinguish S from C and write the identifier for both, e.g., 10 or 21. The links in SCautz include those building UK(d,k) and those connecting switches with their servers. All links in SCautz are undirected and the physical cables are full-duplex, so each link can be defined in two equivalent ways.
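The two cluster rules can be illustrated concretely. The sketch below (helper names are ours; it assumes, as in the example above, that one switch type collects servers sharing the rightmost t letters and the other those sharing the leftmost t letters) reproduces the two clusters of server 0210 in SCautz(2,4,2):

```python
from itertools import product

def kautz_ids(d, k):
    """Kautz identifiers as strings: length k, base d+1, no equal consecutive letters."""
    return ["".join(map(str, s)) for s in product(range(d + 1), repeat=k)
            if all(s[i] != s[i + 1] for i in range(k - 1))]

def suffix_cluster(switch_id, servers):
    """Servers whose rightmost len(switch_id) letters equal the switch identifier."""
    return sorted(s for s in servers if s.endswith(switch_id))

def prefix_cluster(switch_id, servers):
    """Servers whose leftmost len(switch_id) letters equal the switch identifier."""
    return sorted(s for s in servers if s.startswith(switch_id))

servers = kautz_ids(2, 4)             # SCautz(2,4,2): d=2, k=4, t=2
print(suffix_cluster("10", servers))  # ['0210', '1010', '1210', '2010']
print(prefix_cluster("02", servers))  # ['0201', '0202', '0210', '0212']
# Server 0210 appears in both clusters, as in Figure 2.
```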



    Figure 2. The Cluster Structures of Two Types in SCautz(2,4,2)

If clusters are treated as virtual nodes and the duplicate links between pairs of clusters are temporarily ignored, we can easily obtain the following Theorem 1 and prove it according to Definition 3. SCautz(2,4,2) is shown in Figure 3, including the two full higher-level logical structures and the corresponding partial physical structures of servers. Note that the arrows on the links in Figure 3 are only used to better exhibit SCautz's logical cluster structures.


Figure 3. SCautz(2,4,2)'s Two Full Higher-level Logical Structures and Partial Physical Structures


THEOREM 1. All the C_right (resp. C_left) clusters form a logical Kautz structure.

In SCautz, the right-neighbors and left-neighbors of a server X are the servers reached by one L-shift and one R-shift operation, respectively. The right-neighbor clusters and left-neighbor clusters of C_right and C_left are given in Definition 4. C_right(X) (resp. C_left(X)) denotes the cluster that server X belongs to via its rightmost (resp. leftmost) substring of length t, while the peer servers of X are the other servers in the same cluster C_right(X) (resp. C_left(X)).

Definition 4. For any server X, the neighbor-clusters of C_right(X) and C_left(X) are the clusters containing the right-neighbors and left-neighbors of X, respectively.

Therefore, a server, as a member of a C_right cluster, has d right-neighbor clusters and one left-neighbor cluster, while as a member of a C_left cluster it has d left-neighbor clusters and one right-neighbor cluster; server 1210 is an example. Combining the hybrid structure of SCautz(d,k,t) with the above definitions, we obtain the following key properties of any server, its clusters and their neighbors.

Property 1. Each server in a C_right cluster has d right-neighbor servers, and these are evenly distributed across d different right-neighbor clusters. Moreover, a C_right cluster has d right-neighbor clusters, and the servers in it connect to m servers in each right-neighbor cluster.

Property 2. Each server in a C_right cluster has d left-neighbor servers, and these d servers are in the same C_left cluster. Moreover, a C_right cluster has d left-neighbor clusters, and all servers whose relevant substrings are identical connect to all the servers in one left-neighbor cluster.


    Therefore, we obtain the following lemmas.

Lemma 1. If t = k-1, then all the servers in a C_right cluster connect with only one server in each of its right-neighbor clusters.

Lemma 2. If t <= k-2, then the servers in one cluster connect with multiple distinct servers in each right-neighbor cluster.


    Figure 4. The Cluster Interconnection Structures in SCautz(2,4,3) and

    SCautz(2,4,2)

Therefore, if t <= k-2, there are node-disjoint paths and edge-disjoint paths from a C_right cluster to each of its d right-neighbor clusters. Take SCautz(2,4,3) and SCautz(2,4,2) as examples. For SCautz(2,4,3), shown in Figure 4, there are two servers in cluster 010 and their neighbor-servers are distributed in two neighbor-clusters, 101 and 102. According to Lemma 1, however, the two servers connect to only one server in each neighbor-cluster; so if server 0101 fails, all the links between clusters 010 and 101 are broken. For SCautz(2,4,2), by contrast, Lemma 2 gives two node-disjoint paths and four edge-disjoint paths, so it is more reliable than SCautz(2,4,3). Thus, we will always let t <= k-2 in this paper.

Lemma 3. There are node-disjoint paths and edge-disjoint paths from a C_left cluster to each of its d left-neighbor clusters.

It is easy to see that the logical Kautz structures of C_right(X) and C_left(X) are isomorphic, so the corresponding properties of C_left(X) can also be derived and are not listed here.

SCautz is server-centric and its routing intelligence is implemented on servers. Considering the limits on the number of servers' Ethernet NIC slots and COTS switches' low-speed ports, we pick SCautz(4,5,3) as a typical structure for MDCN. SCautz(4,5,3)


supports 1280 servers using only 160 COTS switches. Each server needs to be equipped with 10 Ethernet ports, of which 8 are used for constructing the Kautz topology and 2 for connecting to one COTS switch of each type. Multi-port (dual-port, quad-port) Ethernet NICs are now COTS components, and a COTS switch typically has tens (e.g., 24) of 1 GigE ports and several (e.g., 4) 10 GigE ports. SCautz uses the switches' 1 GigE ports to communicate with the servers in the same cluster, and reserves the high-speed 10 GigE ports for the inter-container network. Thus, SCautz is a practical approach for the intra-container network of MDCN.
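These counts follow directly from the identifier arithmetic of Section 3.1; the short check below is our own:

```python
def kautz_count(d, k):
    """Number of Kautz identifiers of length k over a (d+1)-letter alphabet."""
    return (d + 1) * d ** (k - 1)        # equals d**k + d**(k-1)

d, k, t = 4, 5, 3                        # SCautz(4,5,3)
servers = kautz_count(d, k)              # servers in the base UK(4,5)
switches = 2 * kautz_count(d, t)         # one left-type and one right-type switch per cluster id
cluster_size = servers // kautz_count(d, t)

assert servers == 1280
assert switches == 160
assert cluster_size == 16                # fits within a 24-port GigE switch
```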

    4. Routing in SCautz

Based on SCautz's hierarchical structure, we propose a suite of routing algorithms that effectively utilize the redundant resources. In this section, we first introduce the regular routing methods in a fault-free UK(d,k); then we analyze their disadvantages in dealing with node faults; finally, we present a fault-tolerant routing algorithm, SCRouting+, that achieves graceful performance degradation.

4.1. Routing in UK(d,k)

UK(d,k) is a complete undirected Kautz structure. For the directed Kautz graph, Fiol proposed a shortest-path routing algorithm from a source X to a destination Y using the L-shift operation defined in Definition 5. Find the largest suffix of X that coincides with a prefix of Y; this substring is denoted the R-string. Then repeatedly take as the next hop the identifier whose suffix coinciding with a prefix of Y is one letter longer than the previous hop's, until the destination Y is reached; the resulting path is the R-path. In the same way, the L-path can be computed using R-shift operations, with R-shift defined below as well.

Definition 5. Let L-shift and R-shift denote the shift operations on X = x_1 x_2 ... x_k:

L(X, y) = x_2 ... x_k y (y != x_k),   R(X, y) = y x_1 ... x_{k-1} (y != x_1).

Combining Fiol's [12] and Pradhan's [13] ideas, we design a routing algorithm for UK(d,k), called SCRouting. Let |R-string| and |L-string| denote the lengths of the R-string and L-string. SCRouting first compares |R-string| and |L-string|: if |R-string| > |L-string|, the R-path is picked by performing L-shifts; otherwise, the L-path is picked to route packets.
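Fiol's L-shift routing can be stated compactly in code. The sketch below is our illustration (`overlap` computes the R-string length) and builds the R-path by appending the remaining letters of Y:

```python
def overlap(x, y):
    """Length of the largest suffix of x that is also a prefix of y (the R-string)."""
    for m in range(min(len(x), len(y)), 0, -1):
        if x[-m:] == y[:m]:
            return m
    return 0

def r_path(x, y):
    """Shortest R-path from x to y by successive L-shifts, appending letters of y."""
    m = overlap(x, y)
    path = [x]
    for letter in y[m:]:
        x = x[1:] + letter           # one L-shift per remaining letter of y
        path.append(x)
    return path

print(r_path("12", "01"))        # ['12', '20', '01'] in K(2,2), as in Figure 5
print(r_path("0212", "1201"))    # ['0212', '2120', '1201'] in UK(2,4)
```

The path length is k minus the R-string length, which is why a long suffix/prefix overlap gives a short route.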

4.2. Routing in SCautz(d,k,t)

In UK(d,k), there are either d parallel R-paths or d parallel L-paths between any pair of servers. Generally, the Kautz graph uses one R-path (or L-path) for data transmission. If the path breaks down, it is discarded and replaced by another one of the remaining d-1 R-paths (or L-paths). The reason not to find a sub-path bypassing the failed links or nodes is that such a sub-path may need up to k hops. For example, if node 20 fails, the path 12->20->01 is no longer valid, and a new path 12->21->10->01 from 12 to 01 is computed, as shown in Figure 5. In this way, although the destination is still reachable, the capacity shrinks. For one-to-one traffic, the spare paths are always longer than the primitive one, so the delay of single-path routing increases; and since only d-1 parallel paths remain, the throughput of multi-path routing decreases by 1/d. For one-to-x traffic, even one failure of a link or server will make


all the paths through it invalid, so network capacity and reliability degrade severely.

To remedy these deficiencies, we propose a fault-tolerant routing algorithm, SCRouting+, based on SCautz's hybrid structure. It can handle faults in the paths generated by SCRouting, in both the base topology and the complete SCautz structure. SCRouting+ uses a surviving peer server in the same cluster as the unreachable one to bypass the failed link or server: for an R-path it utilizes the peer server in the C_right cluster, while for an L-path it utilizes the one in the C_left cluster.

    Figure 5. Fault-tolerant Routing in Kautz

    Figure 6. SCRouting+ fault-tolerant Routing in SCautz

Let the i-th right-neighbor (resp. left-neighbor) of a server X denote the server obtained from X by i L-shift (resp. R-shift) operations. For example, with i = 2, the second right-neighbor of X



is X's right-neighbor's right-neighbor. Lemma 4 is then obtained and easily proved.

Lemma 4. For servers in the logical C_right structure, peer servers lie in the same cluster. If the m rightmost letters of two peers are identical while their m+1 rightmost letters differ, then the two peers reach the same set of servers after k-m L-shift operations. The symmetric statement holds for peers in the logical C_left structure under R-shifts.

According to Lemma 4, if a server detects that the next hop is unreachable, SCRouting+

picks an idle peer server of the failed hop, chosen among those whose suffix (or prefix) of length m coincides with the failed hop's while that of length m+1 does not, and routes to it through a shared switch. SCRouting+ thus bypasses the failed hop and reaches the hop after it. Moreover, the new fault-tolerant path is only one hop longer than the original one, with no impact on the other parallel paths. For example, when server 2120 is down, the sub-path 0212->2120->1201 of a certain path becomes invalid. SCRouting+ constructs the sub-path 0212->1012->0120->1201 to bypass 2120, as shown in Figure 6, instead of computing an entirely new end-to-end path as the regular method does.
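The detour in this example can be traced programmatically. The sketch below is our own reconstruction of the search (names are illustrative): the source hands the packet through its shared switch to a cluster peer, which reaches a substitute for the failed server in one link, and the substitute forwards to the hop after the failure:

```python
from itertools import product

def kautz_ids(d, k):
    """Kautz identifiers as strings: length k, base d+1, no equal consecutive letters."""
    return ["".join(map(str, s)) for s in product(range(d + 1), repeat=k)
            if all(s[i] != s[i + 1] for i in range(k - 1))]

def right_neighbors(x, d):
    """Right-neighbors of x: one L-shift with any valid new last letter."""
    return [x[1:] + str(y) for y in range(d + 1) if str(y) != x[-1]]

def bypass(src, failed, target, servers, t, d):
    """SCRouting+-style detour: src reaches cluster peer p via its shared switch;
    p reaches a substitute mid for the failed hop in one link; mid forwards to
    target, the hop after the failure."""
    peers = [s for s in servers if s != src and s[-t:] == src[-t:]]   # via src's switch
    for p in peers:
        for mid in right_neighbors(p, d):
            if mid != failed and target in right_neighbors(mid, d):
                return [src, p, mid, target]
    return None

servers = kautz_ids(2, 4)                                  # SCautz(2,4,2)
print(bypass("0212", "2120", "1201", servers, t=2, d=2))   # ['0212', '1012', '0120', '1201']
```

The returned path matches the paper's Figure 6 example and is exactly one hop longer than the original 0212->2120->1201.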

    5. Simulations

In this section, we conduct simulations to evaluate the fault-tolerance behavior of SCautz and SCRouting+. First, we analyze the performance of SCautz's base topology in handling various traffic patterns and compare the results with several representative BCube structures. Then we test the performance decline of SCautz and BCube as failures occur and increase.

In these simulations, we use SCautz(4,5,3) as a typical intra-container network of MDCN, whose base Kautz topology is UK(4,5) with t=3. There are 1280 servers equipped with 5 dual-port NICs each, and 160 COTS switches with 24 1 GigE ports and 4 10 GigE ports. For comparison, we pick two full BCube structures (BCube(32,1) and BCube(4,4)) and one partial BCube (BCube(8,3)) [5], in which the partial BCube(8,3) uses 2 complete BCube(8,2) with full layer-4 switches. There are thus 1024 servers in all three BCubes, but with 64, 1280 and 704 switches in BCube(32,1), BCube(4,4) and BCube(8,3) respectively.

5.1. Performance of UK(4,5)

We assume the bandwidth of each server's NIC port is 1 Gbps and intermediate servers relay traffic without delay. We summarize the key results in Table 1.

Table 1. Key Simulation Results of UK(4,5) and BCube

              UK(4,5)    BCube(32,1)   BCube(4,4)   BCube(8,3)
  ave_path    4.38       1.94          3.75         3.51
  1-to-1      4          2             5            4
  1-to-all    4          2             5            4
  ABT         1168.95    1057.03       1365.33      1170.29


From the simulations and comparisons, we see that UK(4,5) offers throughput for one-to-x traffic and for all-to-all traffic as high as BCube(8,3) does. But UK(4,5)'s ABT and per-server throughput are a little lower than BCube(4,4)'s because of its longer average path length, which directly affects the ABT. In our work, when computing path length for BCube, we treat the switches as dumb crossbars, as [5] does but unlike [14, 15], so the two hops traversing a switch are counted as one. In addition, BCube(4,4) needs an order of magnitude more switches. The results illustrate that UK(4,5) alone is able to accelerate various traffic patterns as effectively as BCube when a container is fault-free.

5.2. Fault-tolerance Evaluation

Since either a link or a server failure makes one hop in the path unreachable, we assume all faults are caused by servers or switches, and server failures also reduce computation and storage capacity in our simulations.

As shown in Figure 7, when one server fails, the per-server throughput of BCube(32,1), BCube(4,4) and BCube(8,3) drops by 50%, 20% and 25% respectively for one-to-x traffic. Using the switches, the SCRouting+ algorithm can bypass the failed server with one extra hop and keep the original path valid, so SCautz retains the throughput of a fault-free network.

In Figure 8, when 10% and 20% of servers fail, the overall computation capacity drops by 10% and 20% correspondingly, while BCube's ABT drops by 15.3% and 25.23%, represented by the polyline labeled BCube(8,3). In contrast, SCautz loses only 6.91% and 13.74% of its throughput respectively, much less than the decrease in computation and storage. In addition, BCube's ABT shrinks by more than 50% when 20% of switches fail, with no such impact on SCautz.

Figure 7. Throughput Degradation for one-to-one Traffic


    Figure 8. ABT Degradation for all-to-all Traffic

5.3. Fault-tolerance Analysis

The above simulations show that SCautz is able to leverage its redundant switches to maintain per-server throughput for one-to-x traffic and to roughly halve the ABT decrease compared with BCube, which evidently improves SCautz's reliability. Switch faults have little impact on SCautz but make BCube's ABT drop sharply. This is because switches in SCautz are mainly used to tolerate the accumulating faults, while switches in BCube sit between every pair of servers and participate in forwarding every network packet.

An effective operating scheme for a SCautz-based container facing frequent and increasing failures follows directly: first let the fault-free container's SCautz base topology function on its own, and then leverage the switches to tolerate faults as they appear. Thus, SCautz retains the merits of its original base structure and achieves graceful performance degradation.

6. Conclusion

The MDC's distinctive service-free model places stricter demands on the fault-tolerance of the datacenter network. Following the scale-out design principle, we propose SCautz, a novel hierarchical intra-container network structure for MDC. SCautz comprises a base physical Kautz topology and hundreds of redundant COTS switches. Its base topology, UK(d,k), effectively accelerates one-to-x traffic and offers high network throughput for all-to-all traffic, performing as well as BCube. Moreover, each switch of the two types, together with a specific number of servers, forms a cluster, and the clusters build two logical Kautz structures at a higher level. Thus, SCautz retains its throughput for one-to-x traffic in the presence of failures and degrades more gracefully, roughly halving the ABT decrease compared with BCube.

In this paper, we have shown through theoretical analysis and simulation that SCautz meets the strict requirements of MDCN. In future work, we will study how to design an inter-container network that interconnects SCautz-based containers to build mega-datacenters. Moreover, we need to design novel load-balanced


routing algorithms to process the bursty network flows of data-intensive applications [16, 17], so that map-reduce-like applications do not miss the strict deadlines for fetching intermediate results from worker nodes [18].

    Acknowledgements

This work is supported in part by the National Basic Research Program of China (973) under Grant No. 2011CB302600, the National Natural Science Foundation of China (NSFC) under Grant No. 60903205, the Foundation for the Author of National Excellent Doctoral Dissertations of PR China (FANEDD) under Grant No. 200953, and the Research Fund for the Doctoral Program of Higher Education (RFDP) under Grant No. 20094307110008.

    References

[1] J. R. Hamilton, "An Architecture for Modular Data Centers", Proceedings of the Biennial Conference on Innovative Data Systems Research (CIDR), (2007) January 7-10, Asilomar, California, USA.

[2] K. V. Vishwanath, A. Greenberg and D. A. Reed, "Modular data centers: how to design them?", Proceedings of LSAP, (2009) June 10, Munich, Germany.

[3] A. B. Letaifa, A. Haji, M. Jebalia and S. Tabbane, "State of the Art and Research Challenges of new services architecture technologies: Virtualization, SOA and Cloud Computing", International Journal of Grid and Distributed Computing (IJGDC), 3, 68 (2010).

[4] P. Chakraborty, D. Bhattacharyya, N. Y. Sattarova and S. Bedaj, "Green computing: Practice of Efficient and Eco-Friendly Computing Resources", International Journal of Grid and Distributed Computing (IJGDC), 2, 33 (2009).

[5] C. Guo, G. Lu, et al., "BCube: a high performance, server-centric network architecture for modular datacenters", Proceedings of the ACM SIGCOMM Conference on Data Communication (SIGCOMM '09), (2009) August 17-21, Barcelona, Spain.

[6] A. Greenberg and J. R. Hamilton, "VL2: a scalable and flexible data center network", Proceedings of the ACM SIGCOMM Conference on Data Communication (SIGCOMM '09), (2009) August 17-21, Barcelona, Spain.

[7] R. N. Mysore, A. Pamboris, et al., "PortLand: a scalable fault-tolerant layer 2 data center network fabric", Proceedings of the ACM SIGCOMM Conference on Data Communication (SIGCOMM '09), (2009) August 17-21, Barcelona, Spain.

[8] C. Guo, H. Wu, et al., "DCell: a scalable and fault-tolerant network structure for data centers", Proceedings of the ACM SIGCOMM Conference on Data Communication (SIGCOMM '08), (2008) August 17-22, Seattle, Washington, USA.

[9] H. Abu-Libdeh, P. Costa, et al., "Symbiotic routing in future data centers", Proceedings of the ACM SIGCOMM Conference (SIGCOMM '10), (2010) August 30-September 3, New Delhi, India.

[10] H. Sim, J.-C. Oh and H.-O. Lee, "Multiple Reduced Hypercube (MRH): A New Interconnection Network Reducing Both Diameter and Edge of Hypercube", International Journal of Grid and Distributed Computing (IJGDC), 3, 19 (2010).

[11] M. O. Balitanas and T. Kim, "Using Incentives for Heterogeneous peer-to-peer Network", International Journal of Advanced Science and Technology (IJAST), 14, 23 (2010).

[12] M. A. Fiol and A. S. Llado, "The partial line digraph technique in the design of large interconnection networks", IEEE Transactions on Computers, 41, 848 (1992).

[13] D. K. Pradhan and S. M. Reddy, "A fault-tolerant communication architecture for distributed systems", IEEE Transactions on Computers, 32, 863 (1982).

[14] G. Praveen and P. Vijayrajan, "Analysis of Performance in the Virtual Machines Environment", International Journal of Advanced Science and Technology (IJAST), 32, 53 (2011).

[15] H. Wu, G. Lu, D. Li, et al., "MDCube: a high performance network structure for modular data center interconnection", Proceedings of CoNEXT '09, (2009) December 1-4, Rome, Italy.

[16] M. Al-Fares, S. Radhakrishnan, B. Raghavan, N. Huang and A. Vahdat, "Hedera: Dynamic Flow Scheduling for Data Center Networks", Proceedings of the 7th USENIX Conference on Networked Systems Design and Implementation (NSDI '10), (2010).


[17] C. Raiciu, S. Barre, A. Greenhalgh, D. Wischik and M. Handley, "Improving datacenter performance and robustness with multipath TCP", Proceedings of the ACM SIGCOMM Conference (SIGCOMM '11), (2011) August 15-19, Toronto, Ontario, Canada.

[18] C. Wilson and H. Ballani, "Better never than late: Meeting deadlines in datacenter networks", Proceedings of the ACM SIGCOMM Conference (SIGCOMM '11), (2011) August 15-19, Toronto, Ontario, Canada.

    Authors

Feng Huang

He received the B.Sc. degree (with honors) in computer science from the College of Computer, National University of Defense Technology (NUDT), Changsha, China, in 2001. He is now a Ph.D. student at the National Lab for Parallel and Distributed Processing, NUDT. His research interests include cloud computing, datacenter networks, grid computing, virtual machine technology and data-intensive applications.