
Solution Brief: Distributed Private Cloud Use Case

© Plexxi, Inc. 2015

Executive Summary

Plexxi is a networking company; for the past four years we have been developing the network elements for the third platform evolution. We believe the third platform network is built using photonically switched fabrics and high-performance software-based controllers (i.e. SDN) that understand workload needs and algorithmically maximize the inherent power, performance, and latency advantages of the photonic domain. We are delivering the hardware and software products necessary to deliver a simply better network. A Plexxi network allocates its capacity and functionality directly from the user application and tenant hosting workloads. The result is a system with transformative scale, performance, and efficiency advantages compared to conventional enterprise network architectures. The following is a snapshot of the advantages realized by the Google networking team when deploying a controller (i.e. SDN) architecture:

The same advantages are realized in the Plexxi architecture, and we provide some additional benefits as well:


• Simplicity: Single-tier photonic network
• Uniform Low Latency: Direct connectivity
• Reduced Cabling: Simplified network element deployment and insertion
• Reduced OPEX: Use of a Controller
• Reduced OPEX: Photonics drive lower power and cooling costs
• Deployment Flexibility: Photonics reduce physical site deployment constraints

Third Platform Era Networks

It is rare for technology shifts to fundamentally restructure the whole of the IT landscape. The very nature of IT—serving as the underpinnings for many businesses—does not lend itself to massive upheaval very often. Such changes do happen, but perhaps only once in a generation. We are in the midst of one of those shifts now, one that will leave the whole of IT looking very different by the time the transformation is complete.

Drivers of Change

In the IT industry generally (and the networking industry more specifically), change is being driven by the applications that run on top of IT infrastructure. The applications themselves are changing in response to two fundamental shifts: the proliferation of mobile platforms, which changes how applications are consumed, and the rise of data as the primary application currency.

These two shifts require applications to be increasingly efficient while simultaneously processing vastly larger amounts of data. The result is a step-function change in application architectures. It is this transition to horizontally scaled applications that is driving change throughout IT. When the applications themselves become structurally different, the infrastructure on which they rely must evolve as well. Ultimately, the whole of IT will have to craft infrastructure designed explicitly for these high-performance scale-out application architectures. The result will be an IT landscape that looks dramatically different from today in terms of how infrastructure is built, sold, and deployed.

The First Platform Era

The first IT platform was the mainframe. As a platform, mainframes included the four major IT building blocks: compute, storage, networking, and the applications. These were delivered via a monolithic, tightly integrated system. The First Platform Era was all about these mainframes and the capabilities they unlocked. From an IT perspective, this era was system-oriented.

The Second Platform Era

If the First Platform Era was about introducing IT to companies, the Second Platform Era was about extending the reach of IT to the world.

The introduction of the Internet did two things: it expanded the number of users geometrically, and it allowed those users to be geographically diverse. This put demands on applications that had been designed for small numbers of co-located users. The result was a fundamental shift in applications towards the tiered architectures still prevalent across the vast majority of applications today.

The applications, in turn, placed performance and scaling requirements on the underlying IT infrastructure (compute, storage, networking). To meet these demands, IT infrastructure had to evolve very quickly. Innovating as an entire system limits progress to the pace of the slowest mover. The only way for infrastructure to keep pace was to decouple. The result was a disaggregated IT infrastructure landscape where the focus shifted from the system in its entirety to each of the constituent components. From an IT perspective, this era was component-oriented.

An Era in Transition

If major transformations are generational and Platform 2 hit its stride as the Internet grew, then the next major IT transformation is likely already in progress, even if the undercurrents are only barely visible. So what do we observe that indicates change?

Post-integration sales models are emerging. Companies like VCE and Oracle (with Exadata) are taking the components that were disaggregated during the Platform 2 Era, assembling them into complete solutions, and selling them directly to infrastructure buyers. They have grown their businesses dramatically, proof that consumption models are shifting from collections of components to more complete solutions.

“These new applications are more than applications, they are distributed systems. Just as it became commonplace for developers to build multithreaded applications for single machines, it’s now becoming commonplace for developers to build distributed systems for data centers.” - Benjamin Hindman (Twitter)

Platform 2 solutions are not sufficient to support next-generation, scale-out applications. Technology leaders like Google and Facebook have resorted to building their own infrastructure and then layering orchestration systems on top to meet the aggressive performance and scaling requirements that their platforms demand. But the problem is not limited to the largest web-scale properties. Common practice in companies looking to take advantage of Big Data applications is to build out separate infrastructure to keep Big Data workloads isolated from residual application workloads. Both examples point to infrastructure being stretched beyond its useful boundaries.

The Third Platform Era

In the Third Platform Era, infrastructure is orchestrated around the applications, which are themselves oriented towards the data on which they depend. Put simply, the Third Platform Era is application- and data-oriented.

“From an operator’s perspective it would span all of the machines in a data center (or cloud) and aggregate them into one giant pool of resources on which applications would be run. You would no longer configure specific machines for specific applications; all applications would be capable of running on any available resources from any machine, even if there are other applications already running on those machines. From a developer’s perspective, the data center operating system would act as an intermediary between applications and machines, providing common primitives to facilitate and simplify building distributed applications.” - Benjamin Hindman (Twitter)

Datafication of IT

The technology world—not just IT—is moving through a massive datafication. That is to say, technology trends like the Internet of Things are leading to an explosion in the data available to perform useful functions. The key to exploiting this data for business and personal benefit is accessing it on demand, filtering it, and ultimately using it to deliver value.

The nature of this data is that it is massively distributed, not just across servers but also across geographic areas. The applications that process this data must also be distributed, which leads to a collection of Third Platform applications that are scaled horizontally so that they can handle the data and processing loads required to thrive in this model.


Architectural Implications of Horizontally Scaled Applications

If the data and applications are both distributed, there is an inherent architectural dependence on the network that allows these resources to work in concert to deliver application workloads. That interconnect—the network—must undergo its own transformation to meet these Third Platform needs.

Specifically, the ideal Third Platform network is:

• Scalable - In a dynamic application environment, it’s about more than scale. The ability to scale gracefully to handle distributed applications is critical. Accordingly, the network must be scaled out and easily expanded to match demand.

• Agile - Application agility is meaningless if the network cannot keep pace. And keeping pace means removing complexity, simplifying operations, and embracing automation to provide a dynamic and responsive infrastructure.

• Integrated - Infrastructure must work in an orchestrated fashion to deliver application experience. This means that compute, storage, networking, applications, and all the surrounding systems must be capable of frictionless coordination.

• Resilient - Distributed systems only function if the interconnect is reliable. This means the network must not only be fault tolerant but also resilient.

• Secure - With data at the center, security is more important than ever—not just for the infrastructure but also for the applications and data.


Third Platform Technology Pillars

The Third Platform must be scalable, agile, integrated, resilient, and secure. To deliver these requirements, networking solutions must include three primary technology layers:

• Dynamic physical transport - Something has to move bits. While the world might be software-defined, it still runs on physical wires. Any solution that meets the Third Platform requirements has a physical component. That physical component must be high capacity, designed for east-west traffic loads, and tunable.

• Intelligent control - With every data plane comes a control plane, and the Third Platform control plane must be capable of utilizing a high-capacity, dynamic physical transport layer. Intelligence plays a role in determining how traffic will transit the network. It cannot be constrained by Second Platform limitations like SPF and ECMP.

• Integration - The Third Platform leverages a set of infrastructure that is decoupled but integrated. Decoupling allows for individual elements to be replaced as needed (a carryover from the Second Platform era), and integration allows those elements to work in concert to deliver application workloads.

Plexxi and the Third Platform Era

Major transitions in any technology space occur when existing solutions begin to fray as they are stretched beyond their capabilities. Even in today’s Second Platform Era, infrastructure is showing signs of stress. Networks today are fragile, unwieldy to manage, and expensive to own and operate. From a switching perspective, these symptoms are on full display across multiple domains:

• Inside the datacenter - Datacenters today lack the agility and flexibility required to deploy new types of applications quickly and efficiently. Where compute and storage might take minutes, bringing the network along oftentimes takes days or weeks.

• Scale-out applications - Clustered applications like Big Data are heavy in east-west traffic, sensitive to data locality, and vulnerable to congestion. It says something about today’s networks when deploying new applications is more easily done by also deploying a new network.


• Distributed cloud - Containing resources in a small number of racks is simply not practical. As resources extend across rows, rooms, campuses, and continents, they still need to be logically adjacent. Building out connective networks is time-consuming, expensive, and needlessly complex, all of which add time and cost to the act of spinning up new services.

Solving these problems while simultaneously preparing for the Third Platform Era means providing a mix of required Second Platform features, embracing crossover technologies that have a role in both eras, and developing new capabilities specifically for the Third Platform.

At Plexxi, we are committed to solving the Second Platform Era problems in a way that allows for a graceful and cost-effective transition to the Third Platform Era. We have built solutions designed explicitly to address immediate pain points while paving the road to a new application era where data is king and infrastructure is an integral player in unlocking it.

A Network Abstraction

The notion of affinities is not new. For years, data center architects and application architects have understood that in order to maintain peak application performance, the optimal interconnection of interdependent application resources (such as servers, VMs, databases, network services, storage, etc.) is paramount. Many architects implement affinity concepts manually by co-locating instances of a workload on the same physical switch or adjacent switches. As workloads become larger, are deployed across larger infrastructure sets, require increasingly diverse types of resources, and become more portable via virtualization solutions, the task of discerning and manually implementing affinities ranges from difficult to nearly impossible.

The Plexxi solution streamlines discovery of workload resources – gathering critical resource information from existing repositories – and organizing those resources into relevant Affinity Groups. Affinity is merely a Plexxi term for an abstraction group. The process of grouping can be achieved administratively by using a sophisticated GUI or a scriptable shell, programmatically, or via direct connections to automation and orchestration systems and existing repositories. Once workload resources have been grouped, specific policies can be expressed that represent the needs of those workloads for precise interconnection mapping within the data center.

In order to implement the desired workload requirements expressed by group-based policies, we need a network that is designed in a fundamentally different manner than existing hierarchical networks. Today’s monoculture network is designed around the principle that most random is most optimized. It is true that the governing principle of network design has been that to build wider, one must build taller, but the challenge of legacy network design is its lack of control and its waste. Until the advent of SDN, there was no easy method to correlate the needs of applications to the network. The emergence of agile infrastructure that supports east-west traffic, multiple forwarding topologies, and path diversity via controller automation is the future.

Today’s networks, built as hierarchies, resemble the single-core processor of the 1990s. Access, aggregation, and core layers, or leaf/spine topologies, are designed to aggregate, or funnel, traffic patterns to a continually expanding core layer. Yet for years, data traffic has been shown to be “self-similar” (see http://ccr.sigcomm.org/archive/1995/jan95/ccr-9501-leland.pdf). Self-similarity is the idea that something looks the same regardless of scale. When this is put in terms of long-range dependence, auto-correlation decays and the Hurst parameter (H, a measure of burstiness developed by Harold Hurst) increases as traffic increases.

In other words, the more you aggregate flows through multi-tier spine layers, and increasingly through the use of network overlays, the result is an intensification of burstiness (i.e. the H parameter). The obvious assumption is to expect a Poisson distribution, but that is not the result. Processes with long-range dependence are characterized by an autocorrelation function that decays hyperbolically as size increases. Said another way, as the number of Ethernet users increases, the resulting aggregate traffic becomes burstier instead of smoother. The supposition that an increasing number of users or VMs results in increased randomness (or an even distribution) of traffic is actually false.
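
For readers who want to test this on their own traffic traces, a minimal sketch (in Python, assuming NumPy; this is not Plexxi code) of estimating the Hurst parameter H via classic rescaled-range (R/S) analysis might look like the following, where H near 0.5 suggests Poisson-like traffic and H approaching 1 indicates strong self-similarity and burstiness:

    import numpy as np

    def hurst_rs(series, window_sizes=(16, 32, 64, 128, 256)):
        # Rescaled-range (R/S) analysis: R/S grows roughly as c * n**H,
        # so H is the slope of log(R/S) versus log(n).
        rs_values = []
        for n in window_sizes:
            chunks = [series[i:i + n] for i in range(0, len(series) - n + 1, n)]
            rs_per_chunk = []
            for chunk in chunks:
                dev = chunk - chunk.mean()      # deviations from the window mean
                z = np.cumsum(dev)              # cumulative deviation series
                r = z.max() - z.min()           # range of cumulative deviations
                s = chunk.std()                 # window standard deviation
                if s > 0:
                    rs_per_chunk.append(r / s)  # rescaled range for this window
            rs_values.append(np.mean(rs_per_chunk))
        h, _ = np.polyfit(np.log(window_sizes), np.log(rs_values), 1)
        return h

    # Example with a synthetic, bursty trace of per-interval byte counts
    trace = np.random.lognormal(mean=10, sigma=1, size=4096)
    print("Estimated H:", round(hurst_rs(trace), 2))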

Plexxi Product Introduction

The Plexxi product set is designed to harness the cost and density curves of merchant silicon switching and optical components. The hardware is an expression of the software, which is the software-defined aspect of the Plexxi product set.


Plexxi switches are designed to harness a linear scale-out model. A Plexxi switch can be divided into two portions. The front-facing electrical switching domain uses a Broadcom Trident II with traditional SFP and QSFP ports and speaks IP and Ethernet. The back half is the photonic interconnect, which traditional networks would consider the fabric module or backplane portion of a core switch.

In a legacy Ethernet switch, the backplane and fabric modules are closed to the user, but in a Plexxi switch our fabric becomes extensible, open, and programmable by the user. We have taken the capacity of the spine or core tier of a traditional network design and incorporated that capacity into the photonic portion of a Plexxi switch. This photonic portion forms the Plexxi fabric. If a user does nothing but connect Plexxi switches together, the fabric will automatically load balance traffic across itself. The photonic fabric is not a ROADM, and it is not designed around service provider traffic patterns. The Plexxi fabric is a dense photonic mesh, technically a 5-degree chordal ring, designed for data center traffic patterns, which differ from service provider network traffic patterns.

Plexxi uses forwarding table entry insertion (MAC table CAM entries) based on the destination end point, balancing MAC-based flows over a plethora of predefined paths within the fabric that the Plexxi switch has been programmed to use. In contrast to other fabric-based solutions, the Plexxi switch does not need encapsulation to perform load balancing. In a Plexxi fabric, the same MAC, on the same VLAN (and/or the same VxLAN ID), can be advertised by multiple Plexxi switches. Having more than one switch connect to the same host or end point does not cause flapping; rather, it increases the entropy and available paths across the fabric. The switches support a “sliding window” timeout method that caters for MAC movement as well as MAC load balancing between devices.
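
As a toy model of this behavior (an assumption-laden sketch, not Plexxi's internals), MAC-based flow placement over precomputed fabric paths can be pictured as a stable hash over the (MAC, VLAN) pair, which is why no encapsulation is required:

    import hashlib

    # Hypothetical precomputed fabric paths toward the switch advertising
    # the destination MAC; each path is a list of switch hops.
    PATHS_TO_DEST = {
        "switch-D": [
            ["A", "D"],           # direct optical hop
            ["A", "B", "D"],      # one intermediate hop
            ["A", "C", "D"],
        ],
    }

    def pick_path(dst_mac, dst_switch, vlan):
        # Pin a (MAC, VLAN) flow to one precomputed path via a stable hash.
        paths = PATHS_TO_DEST[dst_switch]
        digest = hashlib.md5(f"{dst_mac}/{vlan}".encode()).hexdigest()
        return paths[int(digest, 16) % len(paths)]

    print(pick_path("00:1b:44:11:3a:b7", "switch-D", vlan=100))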

The Plexxi fabric has multiple methods of forwarding data at Layer 2 (L2), the data link layer, and Layer 3 (L3), the network layer. The networking industry has long believed that forwarding traffic at L2 provides the greatest flexibility, but the price of that flexibility has been the constraint of mass flooding. The Plexxi fabric is designed to extend L2 boundaries and provide the highest bandwidth utilization, coupled with a highly configurable and aware forwarding plane.

Convergence versus Pre-Calculated

Legacy networks operate in a state of reaction: if a component such as an interface, blade, switch, or router fails, the protocol governing the forwarding domain recalculates the topology after the failure. Even with advanced fabric protocols and fast CPUs, reaction times are always at the mercy of the state in which the network now finds itself. A simple loss of link can fluctuate throughout the entire L2 domain, causing milliseconds, even seconds, of outage. The Plexxi solution works by knowing exactly how the overall fabric is linked together, rather than relying on distributed hop-by-hop calculations.

Faster Failure Handling

Having a centralized computation engine (i.e. Plexxi Control) calculating many forwarding topologies provides for fast failure handling. Failures, whether link, node, or otherwise, are handled much faster, the network as a system converges rapidly to an optimum state, and the behavior is predictable. A mere link outage no longer

ripples through the entire network forcing state recalculation; rather, the interface or interfaces that have failed result in an isolated inactive path, where the root of the path is the destination switch. This is only achievable because the switches and computation engines are designed to understand what effect the outage will cause.

If the path spans multiple hops, each switch hop in a particular path will insert the same MAC entry based on the receiving interface and VLAN, which dictates a given path. Using MAC insertion enables the switch to quickly redirect flows onto alternate paths if a path becomes congested or if a link for a path is no longer available. Unlike protocols such as TRILL, where the original header is hidden away and encapsulated, the Plexxi switches utilize multiple active or backup paths based on the condition of the network. If an upstream interface becomes unavailable, the switch flushes the MAC from the CAM table, forcing a reroute; of course, all of the other switches within the Plexxi fabric are aware of the network condition and will “know” that the path is inoperable. A backup path may still be made available via the same switch, where any of the other remaining wavelengths are used, an advantage of having global visibility and pre-computed topologies.

The default fabric calculation is currently based on two primary forms: a set of uniform forwarding topologies (i.e. PSATs) and a series of bandwidth-based topologies (i.e. FSATs), which will be explored in detail later in this paper. Bandwidth-based topologies come in static, user-defined, dynamic, or congestion-mitigation forms. The Plexxi switches exchange bi-directional peering information on each of the interfaces that form the fabric; this peering information provides Plexxi Control with an exact picture of how the overall fabric is formed and provides the switches with a method of link continuity. Knowing switch locations, individual link bandwidth, the number of available links, and how the peering is formed, the computation engine (i.e. the Controller) can produce a series of forwarding topologies that the switches can use to load balance data across the fabric.

Unified View of the Network Fabric

Plexxi’s global awareness comes from prior knowledge of the entire fabric, where the mesh of interwoven photonic interfaces exchanges peer information, creating an overall map of the entire fabric. Once Plexxi Control has received the peering information, the Controller’s computation engine can compute the best forwarding topologies for the entire fabric and calculate preemptive link failure scenarios. If a failure happens and a given path becomes invalid, alternate preexisting forwarding domains (i.e. topologies) and standby paths are available and waiting to take the distributed load. Upon path failure no further calculations are needed; only an alternate path is selected. Once the Controller has computed the topologies, this forwarding information is imparted to each and every Plexxi switch, via the distributed controller architecture, so that all switches are in lockstep.
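
The difference between convergence and pre-calculation can be sketched in a few lines (illustrative data structures only, assuming ranked per-destination paths; not Plexxi's implementation): because backup topologies already exist in the switch, failure handling is a lookup, not a recalculation:

    # Controller-precomputed, ranked paths per destination; each path is a
    # list of fabric links.
    PRECOMPUTED_PATHS = {
        "dest-switch": [
            [("A", "B"), ("B", "D")],   # primary path
            [("A", "C"), ("C", "D")],   # precomputed backup
            [("A", "D")],               # second backup (direct wavelength)
        ],
    }
    failed_links = set()

    def active_path(dest):
        # Select the first precomputed path whose links are all still up.
        for path in PRECOMPUTED_PATHS[dest]:
            if not any(link in failed_links for link in path):
                return path
        raise RuntimeError("no surviving precomputed path")

    print(active_path("dest-switch"))   # primary
    failed_links.add(("B", "D"))        # link B-D fails
    print(active_path("dest-switch"))   # instant fallback, no recomputation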

Solution Brief Distributed Private Cloud Use Case

Page 11 of 22© Plexxi, Inc. 2015

The Beauty of Simplicity

Within the Plexxi fabric the default paths all have the same priority, and the steps necessary to implement the load-balance-aware fabric are as simple as a single button inside Plexxi Control called a “Fit” or “fitting function.” Automating and extending network re-calculation can be performed using Plexxi’s RESTful APIs or the embedded scripting capabilities provided through a flexible Jython shell. To optimize the default forwarding topology, an initial Fit is required. Further Fits are discretionary, based on the user’s need to further optimize the forwarding topologies in light of statistical analysis or new workload and workflow requirements.
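
Triggering a Fit programmatically might look like the following hedged sketch; the brief confirms only that RESTful APIs and a Jython shell exist, so the host, endpoint path, payload, and credentials below are hypothetical placeholders:

    import requests

    CONTROLLER = "https://plexxi-control.example.com"   # hypothetical host

    def run_fit(session, comment):
        # Hypothetical endpoint: POST a new fitting job to the controller.
        resp = session.post(f"{CONTROLLER}/api/fits",
                            json={"comment": comment},
                            timeout=30)
        resp.raise_for_status()
        return resp.json()

    with requests.Session() as s:
        s.auth = ("admin", "secret")                    # placeholder credentials
        print(run_fit(s, "re-optimize after onboarding new workloads"))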

Further prioritization of the existing topologies is as simple as creating a “domain” based affinity, where a configurable identifier such as a VLAN tag can be used to prioritize data and to differentiate traffic on the egress queues (strict de-queuing utilizing the silicon hardware queues). The domain affinities are an extension of our initial host-based affinities and help further differentiate computation, path usage, and prioritization within the fabric.

The Power of Plexxi Control

The Plexxi system delivers on the promise of software-defined networking by decoupling control and forwarding functions and providing centralized network intelligence. Plexxi Control is global network orchestration and control software that provides centralized discovery, configuration, and control functions based on application affinity needs. It performs three primary functions: workload modeling, algorithmic network fitting, and global network control.


Plexxi Control builds an understanding of the relationships between compute, storage, and network resources across the entire data center or cloud. It then defines policies that optimally fit the network to each workload and automatically orchestrates the appropriate topology across the Plexxi Switch network. Plexxi Control is not actively involved in the data path and cannot impair network performance or availability should it fail.

Plexxi Control provides real-time fabric utilization statistics as well as control and visibility into the fabric. For users extending a Plexxi fabric between data centers, this provides the ability to perform fast traffic engineering from the Controller – rather than at the individual switch/port level. For fiber-sparse deployments, the control channel can be placed in-band via a VLAN.

Distributed Private Cloud Interconnect

The challenge in architecting, building, and managing datacenters is one of balance. There are forces competing to both push together and pull apart datacenter resources. Finding an equilibrium point that is technologically sustainable, operationally viable, and business-friendly is challenging. The result is typically a set of compromises that outweigh the advantages of any particular approach.

Given the strong forces working to push resources logically together and the equally strong forces keeping them physically separate, how does anyone find a perfect balance? The simple answer is that they don’t. While networking teams try to strike some balance, they are ultimately forced to choose between the two.

Keeping resources together is about control. Companies that favor control typically lean towards heavy traffic engineering solutions, opting to pay in terms of operational overhead as they manage more complex environments. Separating resources is about distance. Those who favor separation tend towards transport solutions, trading off the cost of operational complexity for less control over datacenter workloads.

Plexxi’s datacenter fabric is designed explicitly to mitigate the cost and complexity of striking a balance between control and distance. Whether resources are at opposite sides of a large datacenter or opposite sides of a metropolitan area, Plexxi’s fabric provides a high capacity, high-speed interconnect to provide simpler, more dynamic control over application workloads.

Resource Collaboration

How can multiple elements work together towards a common goal if they are completely separate? The answer is that they cannot; and as IT moves increasingly towards virtualized and distributed applications, the interdependence between resources only grows. The performance advantages of distributed architectures are only meaningful when communication between servers is uninhibited. If the network that makes connectivity possible slows down, the efficiency of the distributed architecture decreases. Datacenter architects must solve simultaneously for distributed compute and storage demands as well as the interconnect capacity required between them.

Resource Availability

Building a datacenter is an exercise in matching resource capacity to demand. Individual applications, tenants, and geographies all place localized demands on resources. If the aggregate demand is sufficient but the resources exist in separate resource pools, the end result is a perpetual state of mismatch. There is always too much or too little workload capacity. The former means you have overbuilt. The latter leaves the network engineer on the edge of capacity exhaustion, always looking for a maintenance window to add capacity.

The complexity of managing the disparate technologies required to logically pool physically separate resources can be prohibitive. Even the most skilled specialists have to invest time in creating a properly engineered fabric between sites that accounts for queuing, prioritization, load balancing, and so on. The number of protocols and technologies required is high, and the volume of devices over which they must be applied can be huge. The result is a level of complexity that makes the network more expensive to manage and more difficult to change.

Physical Separation

At the same time that forces are pulling things together, there are equally strong oppositional forces exerting outward pressure on datacenter resources. For many companies, the datacenter represents a mission critical element of their infrastructure. For companies whose existence depends on the presence of the resources within the datacenter (be they data, servers, or applications), it is untenably risky to rely on a single physical site. This exerts an outward force on resources, as companies must create multiple physical sites, typically separated by enough distance that a disaster would not meaningfully impact all sites.

As resources are added to a datacenter, they are typically installed in racks in relatively close proximity to each other. When racks are empty, there is no reason to unnecessarily create physical separation between resources working in concert. Over time, adjacent rack space is filled through the natural expansion of compute, storage, and networking capacity. As equipment expands, available rack space is depleted, and new racks and rows are populated. Eventually, the device sprawl can occupy entire data centers.

Imagine now that a cluster of servers occupies a rack in one corner of the datacenter. If that cluster is to be expanded, where does the next server go? If the nearby racks are already built out, that resource must be installed some physical distance away from the resources with which it must coordinate.

Connecting Data Centers as a Distributed System

It is easy to say that Plexxi’s advanced and flexible Software Defined Networking (SDN) software simplifies and enables transport services, but what does that actually mean? Over the past decade the industry has evolved wide area connectivity from fixed leased-line circuits to agile Multiprotocol Label Switching (MPLS), Virtual Private LAN Service (VPLS), and now Ethernet Virtual Private Network (eVPN) infrastructures. A virtualized L3-MPLS infrastructure enables operators across the globe to collapse virtual private circuits into a single environment while offering service level agreements in the form of traffic engineering; something previously offered only through leased-line circuits and Quality of Service (QOS). With the dawn of VPLS, eVPN, and Generalized Multiprotocol Label Switching (GMPLS), providers of global transports have been confronted with a wealth of protocol choices enabling them to provide a policy-led label-switched environment. However, the tradeoff is that day-to-day management and design of such infrastructures is not easy. There is a high labor and operational cost to manually crafting MPLS traffic engineering, and the end result is an abstract infrastructure that is unaware of the data services it delivers.


With the introduction of controller architectures (i.e. SDN) and the promise of global visibility, providers of transport networks have been interested to see how controllers can provide a higher level of visibility, control, and abstraction. This leads to the question: does adding a controller to an existing infrastructure really provide abstraction? Today, most controllers rely on OpenFlow to communicate with and orchestrate the infrastructure. The promise of OpenFlow was to provide a controlled way of delivering user-defined specific topologies (not too dissimilar to MPLS): a holistic way of looking at ingress flows and creating user-defined topologies that best serve the user. OpenFlow does offer the ability to pass all unknown connections to an abstract software layer, which in theory makes the network more programmable and flexible. However, there is a critical flaw: OpenFlow lacks the performance characteristics to scale to large environments and is missing a higher abstraction layer. For these reasons, SDN controllers focused only on OpenFlow will fail to meet real-world deployment needs.

Plexxi takes a different approach. A solution comprised of Plexxi Connect, Plexxi Control, and Plexxi Switches provides the application-driven topologies and automation promised by SDN, coupled with the configurable, policy-based infrastructure that protocols such as MPLS and VPLS provide.

Methods of Forwarding

The data and traffic forwarding characteristics of the Plexxi model are critical to the flexibility and application awareness of the solution. The default fabric forwarding characteristics are based on three primary methods: a set of bandwidth-based residual topologies, a series of dynamically computed policy-driven topologies (also known as Affinities), and a method of manually defining and computing user-defined topologies. In data center interconnect environments all three of these methods are important, but before we delve deeper into the user-defined topologies, let’s first review the Partially Specified Affinities (PSATs) and Fully Specified Affinities (FSATs), which are computed dynamically.

Partially Specified Affinities (PSATs)

Partially Specified Affinities are the result of a computed, loop-free series of topologies, which the switches use to load balance traffic destined for a given end point. The end point is the Plexxi switch that is the sole attachment point, or a LAG member, for the host MAC address the frame is destined for. At the root of the algorithm, bandwidth is calculated in increments of individual flows based on a unidirectional bandwidth constraint between switch pairs. The computation engine can easily calculate the bandwidth requirements. Below is a review of a simple, manually constructed series of bandwidth constraints between just three switches.


Take the above example: switches A and B have been provided two load-balanced paths so that each switch can distribute traffic, via a MAC hash, to obtain a maximum capacity of 20 Gbps. This example is simplistic, but it shows some level of path usage satisfying both point-to-point and inter-hop flows. The Plexxi computational algorithm starts with the shortest path, so in the example above bandwidth between switches A and B, B and A, C and A, and C and B will be direct. Only when the bandwidth requirement exceeds the shortest path will additional paths be used.

As real networks are expected to be much larger than this example, the bandwidth requirements between switches are expected to be higher. With many more interconnecting interfaces, or optical waves, the number of paths will be much larger, increasing the number of diverse paths and providing an engineered form of entropy. As long as there are links and capacity to satisfy a particular bandwidth constraint, simply increasing the bandwidth requirement between switches will also increase the number of paths.

The following is a simplistic example of an incremental, flow-based path fill. In the original schematic, each switch-pair requirement was filled in increments of 100 Mbps: the computation engine satisfies each requirement, increment by increment, preventing individual path starvation while spreading the requested bandwidth over the available physical and logical paths. The per-pair requirements were:

Switch pair    Required bandwidth
A to B          4.9 Gbps
B to A         14 Gbps
A to C         15 Gbps
C to A         11 Gbps
B to C         10 Gbps
C to B          9 Gbps
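
A simplified sketch of this increment-based fill (the real computation engine also weighs path length, failure scenarios, and policy; the round-robin order here is an assumption) shows how spreading 100 Mbps grants across a pair's candidate paths prevents any single path from starving:

    INCREMENT = 0.1  # Gbps

    def fill(demand_gbps, paths, capacity):
        # Spread one switch-pair demand over its candidate paths in
        # 100 Mbps increments. `capacity` holds remaining Gbps per path
        # and is shared across demands (mutated in place).
        allocation = {p: 0.0 for p in paths}
        remaining, i = demand_gbps, 0
        while remaining > 1e-9:
            p = paths[i % len(paths)]             # round-robin over paths
            if capacity[p] >= INCREMENT:
                grant = min(INCREMENT, remaining)
                capacity[p] -= grant
                allocation[p] += grant
                remaining -= grant
            elif all(capacity[q] < INCREMENT for q in paths):
                break                             # constraint not satisfiable
            i += 1
        return allocation

    capacity = {"A-B": 10.0, "A-C-B": 10.0}       # Gbps remaining per path
    print(fill(4.9, ["A-B", "A-C-B"], capacity))  # ~2.5 / 2.4 Gbps split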


Why does knowing globally how the network is formed help with utilization and safeguard against outages?

Having a centralized computation engine calculating the entire series of forwarding topologies means that, unlike with traditional protocols, the network does not need to stand by ready to calculate around a failure condition. In the case of Plexxi, if an interface, transceiver, or actual switch fails, resulting in an isolated inactive path or path segment, the switches automatically initialize backup paths. Primary topologies, as well as topologies resulting from potential failures, are all computed at the same time. These multiple topologies are stored in the switches, and upon a failure the switches dynamically forward traffic over the next best topology. This is achievable because the switches and computation engines are designed to understand the effects failures will cause. A simple link outage, or any failure, no longer has a ripple effect through the entire network.

Fully Specified Affinities (FSATs)

Fully Specified Affinities are a controller-to-switch method of simplifying, defining, and applying network-wide policies. However, FSATs are much more than just that. In legacy networks, traffic profiling, prioritization, and path manipulation are manually applied by users on a per-hop basis. In the Plexxi system, the Controller and its algorithms take all this data into account while dynamically deducting policy overhead from the residual calculation and wrapping it neatly into an abstract policy notation.

What is an abstract policy notation?

Take for example Network Attached Storage (NAS) replication spread across the network. With a traditional network, you configure both the adjacent and intermediate switches that interconnect the storage devices with the correct DiffServ Code Point (DSCP) queuing and algorithm. Administrators could even apply policy-based routing to force the traffic to take a different path than the existing routed or L2 paths. This is a tedious, manual, static affair, and as an administrator you must guess the amount of residual data that will also be forwarded across the same paths as those you manually constructed.

Plexxi may use similar silicon ternary content-addressable memory (TCAM) functionality to steer traffic, but the policy created to differentiate the NAS traffic not only defines the quality of service (QOS) queues and the method of de-queuing, it also takes into account the effect the storage replication will have on the underlying residual traffic. This method results in separate forwarding topologies that best match the traffic profile, applied dynamically to ALL switches in the data path.

The way the policies are implemented is also strikingly different. The emphasis is no longer on the switches where the policies are applied, but rather on the host markings (MAC/IP) or on packet identifiers (VLAN/VxLAN/DSCP etc.). By removing the direct mapping of policies to switches and dynamically applying policies (or forwarding characteristics) instead, the policies are abstracted from the hardware and can move as the device or application moves, based on the location of the connected devices. In addition, northbound application integrations can be globalized without specific knowledge of the physical network environment. It also means the implementation of policies, forwarding characteristics, and topologies is centralized, eliminating the need for administrators to manually connect to and configure each individual device.

A simple Plexxi FSAT policy makes no reference to the actual underlying infrastructure; it expresses only the constraints the hosts place on the network.
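
As a purely hypothetical illustration of what such a notation could look like (the field names below are invented for this sketch and are not Plexxi's actual FSAT schema), note that the policy names endpoint groups and constraints, never switches or ports:

    # Illustrative only: an abstract, host-oriented policy for the NAS
    # replication example above.
    nas_replication_policy = {
        "name": "nas-replication",
        "match": {                          # traffic identified by host and
            "src_group": "nas-primary",     # packet markings, not by switch
            "dst_group": "nas-replica",
            "vlan": 300,
        },
        "constraints": {
            "min_bandwidth_gbps": 10,       # carve a dedicated topology
            "max_latency_ms": 2,
            "isolate_from_residual": True,  # account for impact on other traffic
        },
    }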


User Defined Topologies (UDTs)

User Defined Topologies couple FSAT computational awareness with user-definable topologies and a wealth of selection criteria. Alongside the controller's computation of the forwarding plane, users have the ability to define either unicast or multicast forwarding topologies. The definition of the manually created “paths” consists of next-hop interfaces and next-hop switches. Like FSATs, UDTs undergo full path validation, and the impact on residual data is taken into account. Unlike FSATs and PSATs, with a UDT an administrator can create, amend, and delete both the topologies and the UDT insertion (or catchment) criteria without incurring topology computational changes.

Having decoupled the topology from the catchment (or insertion) policy mechanism, users can explicitly define any particular route across the fabric and separately apply the traffic they wish to forward. Plexxi fabric paths can be intra-datacenter, between local switches, and/or inter-datacenter across a datacenter interconnect (DCI). The selection criteria can be duplicated and even mapped to multiple paths. Mapping the same policy to two or more paths provides a backup mechanism, while alternate traffic can be placed on the secondary topology so that its bandwidth is not wasted. Even if no backup path has been created, traffic will automatically fall back to an FSAT- or PSAT-computed path. The policies are TCAM-driven granular rules (SRC/DST MAC/IP, VLAN, IP, port, and protocol), presented to the user at the controller level to provide a global automation integration point. Why not just represent these as IP, port, or VLAN rules at the switch level?

Grouping silicon functionality at the controller eliminates the need to manually configure every switch individually. But more importantly, representing this collection of rules as a whole provides internal computation and external integration with a single viewpoint. This level of abstraction provides external applications a single point of entry and the ability to program and control traffic across a series of pre-defined forwarding topologies.
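
The decoupling of topology from catchment can be pictured with a short sketch (hypothetical structures, not Plexxi's data model): either side can be edited without recomputing the other, and traffic falls back to computed paths when every listed UDT path is down:

    # Manually defined paths: ordered (switch, interface) hops.
    udt_paths = {
        "dci-primary": [("nyc-sw1", "eth49"), ("lon-sw3", "eth50")],
        "dci-backup":  [("nyc-sw2", "eth49"), ("par-sw1", "eth12"),
                        ("lon-sw4", "eth50")],
    }

    # TCAM-style catchment rules, each mapped to an ordered list of paths.
    catchment_rules = [
        {"match": {"vlan": 42, "ip_proto": "tcp", "dst_port": 2049},
         "paths": ["dci-primary", "dci-backup"]},
    ]

    def steer(packet, live_paths):
        for rule in catchment_rules:
            if all(packet.get(k) == v for k, v in rule["match"].items()):
                for name in rule["paths"]:
                    if name in live_paths:          # first surviving UDT path
                        return udt_paths[name]
        return "fall back to FSAT/PSAT computed topology"

    pkt = {"vlan": 42, "ip_proto": "tcp", "dst_port": 2049}
    print(steer(pkt, live_paths={"dci-backup"}))    # primary down, backup used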


Distributed Private Cloud Use Case

As you can see from the prior discussion, Plexxi’s fabric provides users with a wealth of unique functionality. This includes application- and data-defined control, granular path configuration capabilities, and abstract policy notation, all delivered with easy-to-use configuration tools and presented within a single management interface.

Plexxi’s Datacenter Transport Fabric

The Plexxi datacenter transport fabric consists of two or more datacenters (or pools of resources in the case of a single physical datacenter) interconnected using Plexxi’s fabric (called LightRail). The resulting architecture delivers multi-path Layer 1 and Layer 2 transport connectivity to all resources in any of the connected datacenters. By combining photonic switching and intelligent control, Plexxi provides a single network domain across physically separated datacenter resources, extending Layer 2 up to tens of thousands of miles when combined with optical transport equipment.

What Does a Real-World Deployment Look Like?

Let us start with Google’s example of an SDN WAN. In 2012, Google began releasing technical information regarding its deployment of an SDN-driven WAN that interconnects its major data centers. The following is a description of Google’s SD-WAN deployment concept: “On this WAN fabric we built a centralized traffic engineering (TE) service. The service collects real-time utilization metrics and topology data from the underlying network and bandwidth demand from applications/services. With this data, it computes the path assignments for traffic flows and then programs the paths into the switches…”
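
The control loop in that quotation can be reduced to a small sketch (illustrative stubs only, neither Google's nor Plexxi's code): gather topology and demand, compute path assignments globally, then program the switches:

    def collect_topology():
        # In a real system this is measured from the network.
        return {("a", "b"): 10, ("b", "c"): 10, ("a", "c"): 10}  # link: Gbps

    def collect_demands():
        # Bandwidth asks gathered from applications/services.
        return {("a", "c"): 12}

    def compute_paths(links, demands):
        # Trivial placeholder: split each demand over the direct link and a
        # two-hop detour. A real TE engine solves a global optimization.
        return {flow: [["a", "c"], ["a", "b", "c"]] for flow in demands}

    def program_switches(assignments):
        for flow, paths in assignments.items():
            print(f"flow {flow}: install paths {paths}")

    program_switches(compute_paths(collect_topology(), collect_demands()))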

Google realized the following benefits from deploying an SDN powered WAN solution:

• Unified view of the network fabric: With SDN we get a unified view of the network, simplifying configuration, management, and provisioning.

• High utilization: Centralized traffic engineering provides a global view of the supply and demand of network resources. Managing end-to-end paths with this global view results in high utilization of the links.

• Faster failure handling: Failures, whether link, node, or otherwise, are handled much faster. Furthermore, the systems converge more rapidly to the target optimum, and the behavior is predictable.

• Faster time to market/deployment: With SDN, better and more rigorous testing is done ahead of rollout, accelerating deployment. Development is also expedited, as only the features needed are developed.

• Hitless upgrades: The decoupling of the control plane from the forwarding/data plane enables hitless software upgrades without packet loss or capacity degradation.

• High-fidelity test environment: The entire backbone is emulated in software, which helps not only in testing and verification but also in running “what-if” scenarios.

• Elastic compute: The compute capability of network devices is no longer a limiting factor, as control and management reside on external servers/controllers. Large-scale computation, path optimization in this case, is done using the latest generation of servers.


The following network and high-level architecture diagrams show the core elements of Google’s SD-WAN deployment.

The following diagram takes the high-level architecture from the Google example above and maps Plexxi’s solution components onto it.


The following is an actual customer deployment example of Plexxi’s Distributed Cloud use case. For confidentiality reasons, the network design has been altered to provide a level of anonymity. The design objective was to build a global Controller-based network; the following diagram shows only the Asia-Pacific portion of it. The underlying network is a collection of leased lines and dark fiber deployed as wires. The Plexxi fabric is extended across this network, and our algorithms and Controller recognize the variations in distance, bandwidth, and latency. We can then calculate all of the network paths and control flows from a single Controller instance.

The way to think about this network is as evolved carrier Ethernet. The network fabric is really a system with discrete links. The Controller sees all of these links as a resource pool. Plexxi’s dynamic fitting engine and user defined topologies allow the customer granular path control over the network. Plexxi Control acts as an abstraction point that translates application workloads for our fitting algorithms, which are then expressed to the underlying switching infrastructure for implementation. This is the network as a system that can be correlated to compute and storage.


Less Equipment

In a typical legacy deployment, users would connect datacenter resources using either an MPLS WAN gateway or a wide-area optical transport solution. In either case, the result is a separate network to interconnect resources. The Plexxi datacenter transport fabric collapses this interconnect network, allowing architects to build a single Layer-2 network that spans multiple physical sites. By converging multiple networks into a single fabric, Plexxi reduces the total number of switches and routers by more than 40%.

Fewer Management Points

Plexxi’s converged datacenter transport fabric is managed through a single pane of glass (called Plexxi Control). The SDN controller provides a global view into the network, allowing users to provision, monitor, and troubleshoot from a single point. This simplifies administration and reduces operational costs. Even compared to legacy multi-site datacenter architectures, Plexxi’s datacenter transport fabric reduces the number of administrative touch points by up to 95% (1 vs. 20 in a small, 2-site deployment, for example).

Implicit Balancing, Explicit Control

By default, Plexxi Control balances traffic across all available bandwidth in the datacenter transport fabric. There are no complex protocols or sophisticated QOS policies to configure. This shortens the time to deploy new applications and services, and it eliminates the chance of human error. While traffic is implicitly balanced, Plexxi still provides explicit application control. Through an open policy abstraction (called Affinities), users can define application relationships, guarantee application SLAs (like bandwidth and latency), and isolate workloads for multi-tenancy or compliance.

Faster Time to Deploy

With fewer protocols to manage and a single management point from which to administer changes, a Plexxi datacenter transport fabric drastically reduces the time it takes to deploy a new end-to-end service. Deploying a typical end-to-end service on a legacy solution might require VLAN provisioning, queuing and classification configuration, MPLS LSP setup, and MPLS services–across multiple devices in each datacenter and in the interconnect network. The Plexxi equivalent consists of automated VLAN provisioning, the application of a policy abstraction, and an algorithmic calculation by the controller. By removing steps entirely and automating those that remain, Plexxi cuts the time it takes to turn up a new service by more than 50x, making it possible to do in minutes what used to take hours.

Lower Complexity

Ultimately, the Plexxi datacenter transport solution increases application control while reducing overall system complexity. Fewer devices, fewer cables, fewer management points, and fewer protocols translate into fewer dollars and fewer headaches. While there is no generally accepted measurement for complexity, there are reasonable proxies for assessing its impact. For each technology required in a design (MPLS, QOS, and so on), the complexity can be approximated by the difficulty of performing that task times the number of devices over which it must be executed.



Complexity = Difficulty × Number of devices

Difficulty does not have a scalar value, but it can be approximated by assessing the relative skill level required to design, provision, and manage a technology. For instance, VLAN provisioning might require only CCNA-level skills, whereas MPLS-TE requires CCIE-level skills. Because Plexxi replaces technology functions like MPLS and QOS with automatic behavior and a single management point, a Plexxi solution is more than 35x less complex than its legacy counterparts.
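
A worked toy example of this proxy metric (all difficulty scores and device counts below are hypothetical, chosen only to illustrate the arithmetic behind a ~35x figure, not measured data):

    legacy = {                      # technology: (difficulty, devices touched)
        "VLAN provisioning": (1, 20),
        "QOS/queuing":       (3, 20),
        "MPLS-TE/LSPs":      (5, 12),
    }
    plexxi = {
        "Affinity policy":      (1, 1),   # applied once, at the controller
        "Automated VLAN check": (1, 3),   # hypothetical residual touch points
    }

    def complexity(design):
        return sum(d * n for d, n in design.values())

    print("legacy:", complexity(legacy))            # 20 + 60 + 60 = 140
    print("plexxi:", complexity(plexxi))            # 1 + 3 = 4
    print("ratio :", complexity(legacy) / complexity(plexxi))  # 35.0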

Superior Utilization

The result of Plexxi’s photonic switching and intelligent control is a datacenter interconnect with far superior link utilization. Where a typical fiber connection might run between 30% and 40% utilization, a Plexxi datacenter transport fabric can drive interconnect links up to 99% utilization. When links are fully utilized, scaling bandwidth up is as simple as adding more links (through either additional switches or interconnect connections). This ensures that Plexxi’s datacenter transport fabric is both functional and future-proof.

www.plexxi.com
100 Innovative Way, Suite 3322, Nashua, NH 03062
+1.888.630.PLEX (7539)
[email protected]