08/05/2012

Towards a Next Generation Data Center Architecture:

Scalability and Commoditization

Albert Greenberg, Parantap Lahiri, David A. Maltz, Parveen Patel, Sudipta Sengupta

Microsoft Research, Redmond, WA, USA

Abstract

• Applications hosted in today’s data centers suffer from

 – internal fragmentation of resources

 – rigidity

 – bandwidth constraints imposed by the architecture of the network connecting the data center’s servers

• Conventional architectures statically map web services to Ethernet VLANs, each constrained in size to a few hundred servers owing to control plane overheads

 – The IP routers used to span traffic across VLANs and the load balancers used to spray requests within a VLAN across servers are realized via expensive customized hardware and proprietary software


Abstract (cont.)

• The conventional architecture concentrates traffic in a few pieces of hardware

 – which must be frequently upgraded and replaced to keep pace with demand

• The result is an approach that scales up instead of scaling out.

• Commodity switching hardware is now becoming available with programmable control interfaces and with very high port speeds at very low port cost

Introduction

• Data centers comprise both server and networking infrastructure

 – The server portion of the infrastructure is now far down the road of commoditization

 – The network portion of the data center infrastructure presents the next frontier for commoditization.


Monsoon Architecture

• Traffic Engineering on Layer 2 Mesh

 – Valiant Load Balancing (VLB) on a mesh of commodity Ethernet switches to realize a hot-spot-free core fabric.

• Scale to Huge Layer 2 Domain

 – Creates a single layer 2 domain that forwards frames using flat addresses but still scales up to 100,000 servers.

• Ubiquitous Load Spreading

 – MAC Rotation.

CHALLENGES AND REQUIREMENTS

• Multiple applications run inside a single data center

• Each application is associated with one or more publicly visible and routable IP addresses to which clients in the Internet send their requests and from which they receive replies

• Inside the data center, requests are spread among a pool of frontend servers that process the requests

 – this spreading is typically performed by a specialized load balancer

• the IP address to which requests are sent is called a virtual IP address (VIP)

• IP addresses of the servers over which the requests are spread are known as direct IP addresses (DIPs)


Data-Center Standard Architecture

Data-Center Standard Architecture: Problems

• Fragmentation of resources

• Poor server to server connectivity

• Proprietary hardware that scales up, not out


Data-Center Architecture: Requirements

• Placement anywhere

 – The architecture should allow any server anywhere in the data center to be a part of the pool of servers behind any VIP, so that server pools can be dynamically shrunk or expanded

• Server to server bandwidth

• Commodity hardware that scales out

• Support 100,000 servers

Monsoon Architecture


Monsoon Architecture (cont.)

• Reduce network cost as much as possible

• To eliminate fragmentation of servers

 – hierarchical addresses

• servers used to expand an application need to be located in the same network hierarchy as the application

 – huge flat address space

• one pool of free servers can be shared by all applications

• the principle of least disturbance

 – By implementing Monsoon’s features underneath IP, existing systems can continue to operate largely unmodified

• Ethernet the clear winner!

Monsoon Architecture (cont.)

• Layer 3 - Equal Cost MultiPath (ECMP)

 – spread the requests received from the Internet equally over all access routers

• Requests in the layer 2 domain

 – The access routers use consistent hashing to spread the requests going to each application’s public VIP equally over the set of servers acting as load balancers for that application

• Load balancers

 – spread the requests using an application-specific load distribution function over the pool of servers, identified by their DIPs, that implement the application functionality
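The consistent-hashing step, in which access routers spread requests for a VIP over the servers acting as load balancers, might be sketched as follows. This is a minimal illustration, not Monsoon's implementation: the class name `ConsistentHashRing`, the use of SHA-256, the number of virtual points per node, and the string form of the flow key are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable 64-bit integer (SHA-256 is an illustrative choice)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class ConsistentHashRing:
    """Hypothetical hash ring mapping flow keys onto load-balancer servers."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas      # virtual points per node (assumed value)
        self._ring = []               # sorted list of (point, node) pairs
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        # Place several points per node so load spreads roughly evenly.
        for i in range(self.replicas):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        # Dropping a node only affects keys that used to land on its points.
        self._ring = [(p, n) for p, n in self._ring if n != node]

    def lookup(self, flow_key: str) -> str:
        # First ring point at or after the key's hash, wrapping around the ring.
        idx = bisect.bisect_left(self._ring, (_hash(flow_key), ""))
        return self._ring[idx % len(self._ring)][1]
```

The property that matters here is that removing one load balancer remaps only the flows that were assigned to it, so flows on the surviving load balancers are not disturbed.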


Monsoon Load Balancing

• Because requests can be spread across servers, the load on each server can be kept at a level where forwarding can be done in software

 – As the offered load begins to overwhelm the existing load balancers, additional servers can be provisioned as load balancers to dilute the load

• This costs substantially less than the equivalent hardware load balancer

 – Using commodity servers as load balancers makes them fully programmable

• their algorithms can be tuned to particular data center applications rather than making do with the algorithms vendors provide in firmware
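To illustrate what a programmable, application-specific distribution function could look like on a commodity server, here is a small sketch. The class `SoftwareLoadBalancer`, the `least_connections` policy, and the DIP values are hypothetical; the slides do not say which policies Monsoon load balancers use.

```python
from typing import Callable, Dict, Iterable

# A distribution function takes the open-connection count per DIP and
# returns the DIP that should receive the next request.
Chooser = Callable[[Dict[str, int]], str]


def least_connections(open_conns: Dict[str, int]) -> str:
    """Example policy: pick the DIP with the fewest open connections."""
    # Iterating in sorted order makes ties resolve deterministically.
    return min(sorted(open_conns), key=lambda dip: open_conns[dip])


class SoftwareLoadBalancer:
    """Hypothetical software load balancer with a swappable policy."""

    def __init__(self, dips: Iterable[str], choose: Chooser = least_connections):
        self.open_conns = {dip: 0 for dip in dips}
        self.choose = choose

    def assign(self) -> str:
        dip = self.choose(self.open_conns)
        self.open_conns[dip] += 1
        return dip

    def release(self, dip: str) -> None:
        self.open_conns[dip] -= 1
```

Because the policy is an ordinary function, an application could substitute its own without touching the rest of the load balancer, which is the kind of tuning that firmware-based devices make difficult.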

Available Components

• MAC-in-MAC tunneling

• 16K MAC forwarding entries per switch

• Node degree

 – two types of switches

• top-of-rack (TOR) switch that aggregates the 20 1-Gbps links coming from the 20 servers in each rack onto 2 10-Gbps uplinks

• “core” switch with 144 ports of 10-Gbps
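The switch figures above imply some simple arithmetic, worked through below. The rack and port numbers come from the slides; the last quantity is only a lower bound computed under the assumption that every core port terminates a TOR uplink, which ignores the ports needed for switch-to-switch links in the real topology.

```python
import math

SERVERS = 100_000        # target scale stated in the slides
SERVERS_PER_RACK = 20    # one 1-Gbps link per server into the TOR switch
SERVER_LINK_GBPS = 1
TOR_UPLINKS = 2          # uplinks per TOR switch
UPLINK_GBPS = 10
CORE_PORTS = 144         # 10-Gbps ports on each "core" switch

# One TOR switch per rack of 20 servers.
tor_switches = math.ceil(SERVERS / SERVERS_PER_RACK)

# Bandwidth entering a TOR from its servers versus leaving on its uplinks.
rack_down_gbps = SERVERS_PER_RACK * SERVER_LINK_GBPS
rack_up_gbps = TOR_UPLINKS * UPLINK_GBPS
oversubscription = rack_down_gbps / rack_up_gbps

# Total 10-Gbps uplinks that the core fabric has to terminate.
total_uplinks = tor_switches * TOR_UPLINKS

# Assumption-laden lower bound: core switches needed if all 144 ports
# on each one were used only for TOR uplinks.
min_core_switches = math.ceil(total_uplinks / CORE_PORTS)

print(tor_switches, oversubscription, total_uplinks, min_core_switches)
```

With these inputs a rack's 20 Gbps of server bandwidth matches its 20 Gbps of uplink capacity, so the TOR layer itself is not oversubscribed.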


Server-to-Server Forwarding

• The ability to forward frames between servers in the data center is the most basic aspect of Monsoon

 – Forwarding scalability

• Monsoon must connect 100,000 servers using a single layer 2 network built of switches that can store only 16,000 MAC forwarding entries each

 – the sending server encapsulates its frames in a MAC header addressed to the destination’s TOR switch

» switches need only store forwarding entries for other switches and their own directly connected servers
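A rough check of why encapsulation matters for table size is sketched below. The 16,000-entry limit and the 100,000-server target are from the slides, and the switch count of about 5,000 is the approximate figure the deck gives for the control plane. The function names are hypothetical, and the flat-network case is a worst-case assumption in which a switch ends up learning every server MAC.

```python
MAC_TABLE_CAPACITY = 16_000   # forwarding entries per switch, from the slides
SERVERS = 100_000
SERVERS_PER_RACK = 20
SWITCHES = 5_000              # approximate switch count given in the slides


def flat_layer2_entries(servers: int) -> int:
    """Worst case without encapsulation: one entry per server MAC."""
    return servers


def encapsulated_tor_entries(switches: int, servers_per_rack: int) -> int:
    """With encapsulation: every other switch plus the TOR's own servers."""
    return (switches - 1) + servers_per_rack


flat = flat_layer2_entries(SERVERS)
encap = encapsulated_tor_entries(SWITCHES, SERVERS_PER_RACK)

print(f"flat: {flat} entries, fits: {flat <= MAC_TABLE_CAPACITY}")
print(f"encapsulated: {encap} entries, fits: {encap <= MAC_TABLE_CAPACITY}")
```

Under these assumptions the flat design overflows the table several times over, while the encapsulated design uses roughly a third of it.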

Server-to-Server Forwarding (cont.)

• Traffic engineering

 – Monsoon must support any traffic matrix

• Valiant Load Balancing (VLB)

 – requires that every frame sent across the network first “bounce” off a randomly chosen intermediate switch before being forwarded to its destination

 – Load spreading

• In Monsoon, any IP address can be mapped to a pool of servers, each server identified by its individual MAC address

• A health service keeps track of active servers in every pool with a common IP address

• When a server has a frame to send to an IP address, the frame is sent to an active server in the pool.

 – To avoid packet reordering, the MAC address is chosen on a per-flow basis, e.g., using consistent hashing on IP 5-tuple
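The per-flow choice of destination server and VLB intermediate switch could be sketched as below. Note two simplifications: the code uses plain modulo hashing of the 5-tuple rather than the consistent hashing mentioned above, and it derives the intermediate switch from a hash instead of a true random draw, so that the example is deterministic. All names (`FiveTuple`, `pick_server_mac`, `pick_intermediate`) and the salting scheme are hypothetical.

```python
import hashlib
from typing import NamedTuple, Sequence


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    protocol: int
    src_port: int
    dst_port: int


def flow_hash(flow: FiveTuple, salt: str) -> int:
    """Stable hash of a flow; the salt decorrelates the two choices below."""
    fields = "|".join(str(field) for field in flow)
    data = f"{salt}|{fields}".encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def pick_server_mac(flow: FiveTuple, active_macs: Sequence[str]) -> str:
    """Choose one active server in the pool for every packet of this flow."""
    ordered = sorted(active_macs)
    return ordered[flow_hash(flow, "server") % len(ordered)]


def pick_intermediate(flow: FiveTuple, switch_macs: Sequence[str]) -> str:
    """Choose the switch this flow bounces off (hash stands in for random)."""
    ordered = sorted(switch_macs)
    return ordered[flow_hash(flow, "vlb") % len(ordered)]
```

Because both choices depend only on the 5-tuple, every packet of a flow takes the same path and reaches the same server, which is what prevents reordering. With modulo hashing, however, a change in the pool reshuffles many flows, which is one reason to prefer consistent hashing in practice.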


Server-to-Server Forwarding (cont.)

• Obtaining Path Information From Directory

 – When the application running on a server presents its network stack with a packet to send to an IP address, the server needs two pieces of information before it can create and send a layer 2 frame

• it must have the list of MAC addresses for the servers responsible for handling that IP address, and the MAC address of the top-of-rack switch where each of those servers is connected

• It also needs a list of switch MAC addresses from which it will randomly pick a switch to “bounce” the frame

Server-to-Server Forwarding (cont.)


Server-to-Server Forwarding (cont.): Obtaining Path Information From Directory

• When the encapsulator receives a packet from the IP network stack, it computes a flow id for the packet and examines its cache of active flows for a matching entry

 – If there is no entry, it queues the packet and sends a request to the Monsoon Agent to look up the remote IP using the directory service

 – The directory service returns the MAC addresses to which the IP address resolves and the set of VLB intermediate switches to use

• the encapsulator chooses a destination MAC address (and its corresponding top-of-rack switch address) and a VLB intermediate node for the flow and caches this mapping

• The server randomly chooses an intermediate node for every IP flow, thus spreading its load among all VLB intermediate nodes without causing TCP packet reordering
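The lookup-and-cache behavior described above might look roughly like the following sketch. It is simplified on purpose: the directory lookup is a synchronous callback rather than a queued request to the Monsoon Agent, cache entries never expire, and every type and method name here (`Destination`, `DirectoryReply`, `Path`, `Encapsulator.path_for`) is hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class Destination:
    server_mac: str   # MAC of a server handling the IP address
    tor_mac: str      # MAC of the TOR switch that server hangs off


@dataclass(frozen=True)
class DirectoryReply:
    destinations: Tuple[Destination, ...]
    intermediates: Tuple[str, ...]   # MACs of usable VLB intermediate switches


@dataclass(frozen=True)
class Path:
    intermediate_mac: str
    tor_mac: str
    server_mac: str


class Encapsulator:
    """Hypothetical per-server component that picks and caches a path per flow."""

    def __init__(self, lookup: Callable[[str], DirectoryReply],
                 rng: Optional[random.Random] = None):
        self._lookup = lookup
        self._rng = rng or random.Random()
        self._flows: Dict[tuple, Path] = {}   # cache of active flows

    def path_for(self, flow_id: tuple, dst_ip: str) -> Path:
        path = self._flows.get(flow_id)
        if path is None:
            # Cache miss: resolve the IP, then fix both choices for this flow.
            reply = self._lookup(dst_ip)
            dest = self._rng.choice(reply.destinations)
            mid = self._rng.choice(reply.intermediates)
            path = Path(mid, dest.tor_mac, dest.server_mac)
            self._flows[flow_id] = path
        return path
```

Caching the random choice per flow is what lets the intermediate node differ between flows while staying fixed within one, so load spreads without reordering TCP packets.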

Server-to-Server Forwarding (cont.): Encapsulation of Payload and Forwarding


Monsoon Architecture: External Connections

• It would be ideal if Access Routers had the ability to spread packets across a set of MAC addresses

 – The routers today do not support the Monsoon load spreading primitive or the encapsulation Monsoon uses for VLB

• An Ingress Server is therefore paired with each Access Router

 – the Access Router sends all the external traffic through the Ingress Server, which implements the Monsoon functionality and acts as a gateway to the data center

Switch Topology


Monsoon Control Plane

• The Monsoon control plane has two main responsibilities:

 – (1) maintaining the forwarding tables in the switches; and

 – (2) operating a directory service that tracks the port at which every server is connected to the network, as well as the server’s IP and MAC addresses.

Monsoon Control Plane (cont.)

• Maintaining Forwarding Tables:

 – Monsoon requires that every switch have a forwarding table with an entry for every other switch

• Any technique that computes routes among the ≈ 5K switches in the data center could be used

• we program the top-of-rack (TOR) switches to track the IP and MAC addresses of the servers directly connected to them, and announce this information in a Link-State Advertisement (LSA).
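One way to picture how TOR announcements could feed the directory service is sketched below. The slides do not specify the LSA contents or the directory's data model, so `ServerRecord`, `Directory.on_announcement`, and `Directory.resolve` are hypothetical, and the rule that a new announcement replaces the previous one from the same TOR is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class ServerRecord:
    ip: str
    mac: str
    port: int   # TOR port the server is attached to


class Directory:
    """Hypothetical directory built from per-TOR server announcements."""

    def __init__(self) -> None:
        self._by_tor: Dict[str, Tuple[ServerRecord, ...]] = {}

    def on_announcement(self, tor_mac: str, servers: Iterable[ServerRecord]) -> None:
        # Assumption: a fresh announcement supersedes the TOR's earlier one.
        self._by_tor[tor_mac] = tuple(servers)

    def resolve(self, ip: str) -> List[Tuple[str, str]]:
        """Return (server MAC, TOR MAC) pairs for every server holding this IP."""
        return sorted(
            (rec.mac, tor_mac)
            for tor_mac, records in self._by_tor.items()
            for rec in records
            if rec.ip == ip
        )
```

Returning both the server MAC and its TOR MAC matches what a sending server needs in order to build the encapsulated frame, and allowing several servers per IP reflects the pool-based load spreading described earlier.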
