
  • 8/10/2019 SDN Controllers Exp.comparison


    Advanced Study of SDN/OpenFlow controllers

    **** List of authors is hidden ****

    ABSTRACT

This paper presents an independent comprehensive analysis of the efficiency indexes of popular open source SDN controllers (NOX, POX, Beacon, Floodlight, MuL, Maestro, Ryu). The analysed indexes include performance, scalability, reliability, and security. For testing purposes we developed a new framework called hcprobe. The test bed and the methodology we used are discussed in detail so that everyone can reproduce our experiments.

    Keywords

SDN, OpenFlow, Controller evaluation, Throughput, Latency, Haskell

    1. INTRODUCTION

An SDN/OpenFlow controller for networks is like an operating system for computers. But the current state of the SDN/OpenFlow controller market may be compared to early operating systems for mainframes, which existed 40-50 years ago. Operating systems were control software that provided libraries of support code to assist programs in runtime operations, just as today's controllers are special control software which interacts with switching devices and provides a programmatic interface for user-written network management applications.

Early operating systems were extremely diverse, with each vendor or customer producing one or more operating systems specific to their particular mainframe computers. Every operating system could have a radically different model of commands and operating procedures.

The question of operating system efficiency was of vital importance and, accordingly, was studied a lot. However, it was difficult to reproduce efficiency evaluation experiments applied to operating systems because of a lack of methodology descriptions, a mixture of running applications that created the workload in the experiments, unique hardware configurations, etc. The same is true for today's SDN/OpenFlow controllers.

At present, there are more than 30 different OpenFlow controllers created by different vendors/universities/research groups, written in different languages, and using different runtime multi-threading techniques. Thus, the issue of efficiency evaluation has emerged.

The latest evaluation of SDN/OpenFlow controllers, published in 2012 [1], covered a limited number of controllers and focused on controller performance only. During the last year new controllers appeared, and the old ones were updated. This paper is intended as an impartial comparative analysis of the effectiveness of the most popular and widely used controllers. In our research we treat the term controller effectiveness as a list of indexes, namely: performance, scalability, reliability, and security. Later on we will introduce the meaning of every index from this list.

For testing controllers we used the well-known tool cbench [2]. However, we were not satisfied with the capabilities of cbench, so we developed our own framework for SDN/OpenFlow controller testing, which we called hcprobe [3]. We developed our own solution because the capabilities of cbench, the traditional benchmark used for controller evaluation, cover only a small range of possible test cases. It fits rough performance testing perfectly, but it does not provide fine-tuning of test parameters. We use hcprobe to create a set of advanced testing scenarios, which allow us to get a deeper insight into controller reliability and security issues: the number of failures/faults during long-term testing, uptime for a given workload profile, and the processing of malformed OpenFlow messages. With cbench and hcprobe we performed an experimental evaluation of seven freely available OpenFlow controllers: NOX [4], POX [5], Beacon [6], Floodlight [7], MuL [8], Maestro [9], and Ryu [10].

    2. HCPROBE

Similar to conventional benchmarks, hcprobe emulates any number of OpenFlow switches and hosts connected to a controller. Using hcprobe, one can test and analyse several indexes of controller operation in a flexible manner. Hcprobe allows users to specify patterns for generating OpenFlow messages (including malformed ones), set the traffic profile, etc. It is written in Haskell and allows users to easily create their own scenarios for controller testing.

The key features of hcprobe are:

• correct OpenFlow and network stack packet generation;

• implementation in a high-level language, which makes it easier to extend;

• an API for custom test design;

• an embedded domain-specific language (eDSL) for creating tests.

Figure 1: The hcprobe architecture

The architecture of hcprobe consists of the following modules (see Figure 1):

• Network datagram: a library which provides a typeclass-based interface for easy creation of packets for network protocols: Ethernet frames, ARP packets, IP packets, UDP packets, and TCP packets. It allows setting a custom value for any field of the header and generating a custom payload.

• OpenFlow: a library for parsing and creating OpenFlow messages.

• Configuration: a library that provides routines for configuring the test parameters using command line options or a configuration file.

• FakeSwitch: a basic fake switch, which implements the initial handshake, manages the interaction with the controller, and provides a callback/queue-based interface for writing tests.

• eDSL: provides an easier interface for creating tests.

Using these modules, it is possible to create specialized tests for SDN/OpenFlow controllers, including functional, security and performance tests. The most convenient way to create tests is to use the embedded domain-specific language.
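To give a concrete feel for what the OpenFlow and FakeSwitch modules must handle, the sketch below shows the message framing used in the initial handshake. The 8-byte header layout (version, type, length, xid, all big-endian) and the type codes come from the OpenFlow 1.0 specification [14]; the function names are ours, not hcprobe's, and the sketch is Python rather than Haskell for brevity.

```python
import struct

OFP_VERSION = 0x01            # OpenFlow 1.0
OFPT_HELLO = 0                # type codes from the OF 1.0 specification
OFPT_FEATURES_REQUEST = 5
OFP_HEADER = struct.Struct("!BBHI")  # version, type, length, xid (big-endian)

def ofp_message(msg_type, payload=b"", xid=0, version=OFP_VERSION):
    """Build an OpenFlow message: 8-byte header followed by the payload.
    The length field covers the whole message, header included."""
    return OFP_HEADER.pack(version, msg_type,
                           OFP_HEADER.size + len(payload), xid) + payload

def parse_one(data):
    """Split one message off the front of a byte stream, as a controller
    parsing its incoming queue would."""
    version, msg_type, length, xid = OFP_HEADER.unpack_from(data)
    return (version, msg_type, length, xid), data[length:]

# the handshake starts with a HELLO from each side:
hello = ofp_message(OFPT_HELLO, xid=1)
(version, msg_type, length, xid), rest = parse_one(hello)
```

A fake switch then answers the controller's FEATURES_REQUEST with a features reply, after which test traffic such as PacketIn messages can flow.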

The hcprobe domain-specific language provides the following abilities:

• create OpenFlow switches;

• write programs describing the switch logic and run programs for different switches simultaneously;

• create various types of OpenFlow messages with custom field values, and also generate custom payloads for PacketIn messages;

• aggregate statistics on the message exchange between the OpenFlow controller and the switches.

For a basic example, which illustrates the above features, see Figure 2.

Figure 2: The hcprobe test example (incorrect PacketIn OF message)

So hcprobe provides a framework for creating various test cases to study the behaviour of OpenFlow controllers processing different types of messages. One can generate all types of switch-to-controller OpenFlow messages and replay different communication scenarios, even the most sophisticated ones, which cannot be easily reproduced using hardware and software OpenFlow switches. The tool could be useful for developers performing functionality, regression and safety testing of controllers.

    3. EXPERIMENTAL SETUP

This section explains our testing methodology and the technical details of the experimental evaluation.

    3.1 Test bed

The test bed consists of two servers connected with a 10Gb link. Each server has two Intel Xeon E5645 processors with 6 cores (12 threads) at 2.4GHz and 48 GB RAM. Both servers run Ubuntu 12.04.1 LTS (GNU/Linux 3.2.0-23-generic x86_64) with default network settings. One of the servers is used to launch the controllers. We bind all controllers to the cores of the same processor in order to decrease the influence of communication between the processors. The second server is used for traffic generation according to a certain test scenario. We use a two-server setup instead of a single-server configuration where both a controller and a traffic generator run on the same server. Our previous measurements have shown that the throughput of internal traffic is rather unpredictable and unstable, with throughput surging or dropping by an order of magnitude. Also, using two servers we can guarantee that traffic generation doesn't affect the controller operation.

    3.2 Controllers

Based on the study of available materials on twenty-four SDN/OpenFlow controllers, we chose the following seven open source controllers:

NOX [4] is a multi-threaded C++-based controller written on top of the Boost library.

POX [5] is a single-threaded Python-based controller. It is widely used for fast prototyping of network applications in research.

Beacon [6] is a multi-threaded Java-based controller that relies on the OSGi and Spring frameworks.

Floodlight [7] is a multi-threaded Java-based controller that uses the Netty framework.

MuL [8] is a multi-threaded C-based controller written on top of libevent and glib.

Maestro [9] is a multi-threaded Java-based controller.

Ryu [10] is a Python-based controller that uses the gevent wrapper of libevent.

Initially we also included the Trema [12], FlowER [11] and Nettle [13] controllers in our study. But after several experiments we understood that they were not ready for evaluation under heavy workloads.

Each tested controller runs an L2 learning switching application provided by the controller. There are several reasons for this choice. It is quite simple and at the same time representative. It uses all of the controller's internal mechanisms, and also shows how effective the chosen programming language is by implementing a single hash lookup.
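The essence of such an application, one hash lookup per packet, can be sketched as follows. This is a simplified model for illustration, not any tested controller's actual code:

```python
def make_learning_switch():
    """A minimal L2 learning switch: one dict lookup per packet."""
    mac_table = {}  # source MAC address -> ingress port

    def on_packet(src_mac, dst_mac, in_port):
        mac_table[src_mac] = in_port            # learn where the source lives
        return mac_table.get(dst_mac, "FLOOD")  # forward if known, else flood
    return on_packet

switch = make_learning_switch()
switch("aa:01", "bb:02", 1)   # dst unknown yet, so the packet is flooded
switch("bb:02", "aa:01", 2)   # "aa:01" was learned on port 1, so forward there
```

Because the per-packet work is dominated by this hash lookup, the application's throughput largely reflects the controller's own message-processing machinery and language runtime.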

We relied on the latest available sources of all controllers as of April 2013. We ran all controllers with the recommended settings for performance and latency testing, where available.

    4. METHODOLOGY

Our testing methodology includes performance and scalability measurements as well as advanced functional analysis such as reliability and security. The meaning of each term is explained below.

    4.1 Performance/Scalability

The performance of an SDN/OpenFlow controller is defined by two characteristics: throughput and latency. The goal is to obtain the maximum throughput (number of outstanding packets, flows/sec) and the minimum latency (response time, ms) for each controller. The changes in these parameters when adding more switches and hosts to the network, or more CPU cores to the server where the controller runs, show the scalability of the controller.

We analyse the correlation between a controller's throughput and the following parameters:

1. the number of switches connected to the controller (1, 4, 8, 16, 64, 256), with a fixed number of hosts per switch (10^5);

2. the number of unique hosts per switch (10^3, 10^4, 10^5, 10^6, 10^7), with a fixed number of switches (32);

3. the number of available CPU cores (1-12).

The latency is measured with one switch, which sends requests, each time waiting for the reply from the controller before sending the next request.
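The serial request-reply scheme above can be sketched in a few lines. This is an illustrative model of the measurement loop, not cbench's actual implementation; `send_and_wait` stands in for one full flow-setup round trip:

```python
import time

def measure_latency(send_and_wait, n=1000):
    """Serial latency test: a single switch keeps exactly one request
    outstanding, so elapsed time divided by n is the mean round trip."""
    start = time.perf_counter()
    for _ in range(n):
        send_and_wait()   # blocks until the controller's reply arrives
    elapsed = time.perf_counter() - start
    return elapsed / n    # mean seconds per flow setup

# usage with a stub standing in for a real controller round trip:
mean = measure_latency(lambda: None, n=100)
```

Keeping only one request in flight is what distinguishes this latency mode from the throughput mode, where many requests are outstanding at once.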

All experiments are performed with cbench. Each cbench run consists of 10 test loops with a duration of 10 seconds each. We run each experiment three times and take the average as the result.

The scripts realizing this methodology can be found in the repository: https://github.com/ARCCN/ctltest.

    4.2 Reliability

By reliability we understand the ability of the controller to work for a long time under an average workload without accidentally closing connections with switches or dropping OpenFlow messages from the switches.

To evaluate reliability, we measure the number of failures during long-term testing under a given workload profile.

The traffic profile is generated using hcprobe. We use the flow setup rate recorded on the Stanford campus in 2011 as a sample workload profile (Figure 3). This profile stands for a typical workload in a campus or office network during a 24-hour period: the highest request rate is seen in the middle of the day, and by night the rate decreases.

In our test case we use five switches sending OpenFlow PacketIn messages with the rate varying from 2000 to 18000 requests per second. We run the test for 24 hours and record the number of errors during the test. By error we mean either a session termination or a failure to receive a reply from the controller.
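A day/night profile of this shape can be generated with a simple diurnal function. The sketch below is one plausible way to vary the rate between the 2000 and 18000 requests/sec bounds used above, with the peak at midday; the actual Stanford trace has a more irregular shape, so this is an assumption for illustration only:

```python
import math

R_MIN, R_MAX = 2000, 18000   # requests/sec bounds from the test above

def rate_at(hour):
    """Diurnal flow-setup rate: peak at 12:00, trough at midnight.
    The cosine term is 1 at noon and -1 at midnight."""
    phase = math.cos(2 * math.pi * (hour - 12) / 24)
    return R_MIN + (R_MAX - R_MIN) * (phase + 1) / 2

# sample the profile once per hour over the 24-hour test:
profile = [rate_at(h) for h in range(24)]
```

A traffic generator would then hold each hourly rate for that hour of the test, counting session terminations and missing replies as errors.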

    4.3 Security

To identify security issues, we have also examined how the controllers handle malformed OpenFlow messages.

In our test cases we send OpenFlow messages with incorrect or inconsistent fields, which could lead to a controller failure.


    Figure 3: The sample workload profile

    4.3.1 Malformed OpenFlow header.

Incorrect message length. PacketIn messages with a modified length field in the OpenFlow header have been generated. According to the protocol specification [14], this field stands for the whole message length (including the header). Its value was set as follows: 1) the length of the OpenFlow header; 2) the length of the OpenFlow header plus a, where a is smaller than the header of the encapsulated OpenFlow message; 3) a number larger than the length of the whole message. This test case examines how the controller parses the queue of incoming messages.
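Crafting such messages only requires writing an inconsistent value into the header's length field. The sketch below shows cases 1 and 3 for a PacketIn (type code 10 in OpenFlow 1.0 [14]); the helper name is ours, and the payload is a stand-in rather than a full PacketIn body:

```python
import struct

OFPT_PACKET_IN = 10
HDR = struct.Struct("!BBHI")  # version, type, length, xid (big-endian)

def packet_in(payload, claimed_length=None, xid=0):
    """Build a PacketIn whose header may lie about the total message length."""
    real_length = HDR.size + len(payload)
    length = real_length if claimed_length is None else claimed_length
    return HDR.pack(0x01, OFPT_PACKET_IN, length, xid) + payload

body = b"\x00" * 10                 # stand-in for the PacketIn body
ok = packet_in(body)                # honest length field: 8 + 10 = 18
short = packet_in(body, 8)          # case 1: length of the header only
oversized = packet_in(body, 9999)   # case 3: larger than the whole message
```

A parser that trusts the length field will either truncate the queue (case 1, leaving the body to be misread as the next header) or block waiting for bytes that never arrive (case 3), which is exactly what this test probes.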

Invalid OpenFlow version. All controllers under test support OpenFlow protocol v1.0. In this test case we start the session by announcing OpenFlow v1.0 during the handshake between the switch and the controller, then set an invalid OpenFlow protocol version in PacketIn messages.

Incorrect OpenFlow message type. We examine how the controller processes OpenFlow messages in which the value of the type field of the general OpenFlow header does not correspond to the actual type of the encapsulated OpenFlow message. As an example we send messages with OFPT_PACKET_IN in the type field, but containing PortStatus messages.

    4.3.2 Malformed OpenFlow message.

PacketIn message: incorrect protocol type in the encapsulated Ethernet frame. This test case shows how the controller's application processes an Ethernet frame with a malformed header. We put the code of the ARP protocol (0x0806) in the Ethernet header, though the frame contains an IP packet.

Port status message: malformed name field in the port description. The name field in the port description structure is an ASCII string, so a simple way to corrupt this field is not to put \0 at the end of the string.
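The safe way to read such a field is to bound the read by the field length and stop at the first NUL if one exists. The sketch below assumes the 16-byte name field of the OpenFlow 1.0 port description [14]; the function name is ours:

```python
OFP_MAX_PORT_NAME_LEN = 16   # name field size in the OF 1.0 port description

def parse_port_name(field):
    """Read at most the field length, stopping at NUL. This stays safe even
    when the sender omits the terminator, as in the test case above."""
    field = field[:OFP_MAX_PORT_NAME_LEN]
    nul = field.find(b"\x00")
    return (field if nul < 0 else field[:nul]).decode("ascii", "replace")

name = parse_port_name(b"eth0" + b"\x00" * 12)   # well-formed field
evil = parse_port_name(b"A" * 32)                # no terminator at all
```

A parser that instead scans for \0 without a bound would read past the field, which is the failure mode this test case is designed to expose.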

    5. RESULTS

In this section we present the results of the experimental study of the controllers performed with cbench (performance/scalability tests) and hcprobe (reliability and security tests).

    5.1 Performance/Scalability

    5.1.1 Throughput

CPU core scalability (see Figure 4). The Python controllers (POX and Ryu) do not support multi-threading, so they show no scalability across CPU cores. Maestro's scalability is limited to 8 cores, as the controller does not run with more than 8 threads. The performance of Floodlight increases steadily in line with the increase in the number of cores, differing slightly on 8-12 cores. Beacon shows the best scalability, achieving a throughput of nearly 7 million flows per second.

The difference in throughput scalability between multi-threaded controllers depends on two major factors: the first is the algorithm for distributing incoming messages between threads, and the second is the mechanism or the libraries used for network interaction.

NOX, MuL and Beacon pin each switch connection to a certain thread; this strategy performs well when all switches send approximately equal numbers of requests per second. Maestro distributes incoming packets using a round-robin algorithm, so this approach is expected to show better results with an unbalanced load. Floodlight relies on the Netty library, which also associates each new connection with a certain thread.
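The two dispatch strategies can be contrasted in a few lines. This is a schematic model of the policies, not code from any of the controllers:

```python
from itertools import count

def pin_by_connection(num_threads):
    """NOX/MuL/Beacon style: every message from a given switch connection
    always goes to the same worker thread."""
    return lambda switch_id: hash(switch_id) % num_threads

def round_robin(num_threads):
    """Maestro style: spread messages evenly across workers regardless
    of which switch they came from."""
    counter = count()
    return lambda _switch_id: next(counter) % num_threads

pinned = pin_by_connection(4)
rr = round_robin(4)
```

Pinning keeps per-connection state thread-local and cache-friendly but lets one busy switch saturate a single worker; round-robin balances an unbalanced load at the cost of moving a connection's messages between threads.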

Figure 4: The average throughput achieved with different numbers of threads (with 32 switches, 10^5 hosts per switch)

Correlation with the number of connected switches (see Figure 5). The performance of POX and Ryu does not depend on the number of switches, as they do not support multi-threading. For multi-threaded controllers, adding more switches leads to better utilization of available CPU cores, so their throughput increases until the number of connected switches becomes larger than the number of cores. Due to its round-robin packet distribution algorithm, Maestro shows better results on a small number of switches, but the drawback of this approach is that its performance decreases when a large number of switches (256) is connected.

Figure 5: The average throughput achieved with different numbers of switches (with 8 threads, 10^5 hosts per switch)

Correlation with the number of connected hosts (see Figure 6). The number of hosts in the network has little influence on the performance of most of the controllers under test. Beacon's throughput decreases from 7 to 4 million flows per second with 10^7 hosts, and the performance of Maestro goes down significantly when more than 10^6 hosts are connected. This is caused by specific details of the Learning Switch application implementation, namely, the implementation of its lookup table.

Figure 6: The average throughput achieved with different numbers of hosts (with 8 threads, 32 switches)

    5.1.2 Latency

The average response time of the controllers demonstrates insignificant correlation with the number of connected hosts. For the average response time with one connected switch and 10^5 hosts, see Table 1. The smallest latency has been demonstrated by the MuL and Beacon controllers, while the largest latency is typical of the Python-based controllers: POX and Ryu.

NOX         91.531
POX        323.443
Floodlight  75.484
Beacon      57.205
MuL         50.082
Maestro    129.321
Ryu        200.021

Table 1: The minimum response time (10^-6 secs/flow)

The results of the performance/scalability testing show that the Beacon controller achieves the best throughput (about 7 million flows/sec) and has the potential for further throughput scalability if the number of available CPU cores is increased. The scalability of the other multi-threaded controllers is limited to 4-8 CPU cores. The Python-based controllers POX and Ryu are more suitable for fast prototyping than for enterprise deployment.

    5.2 Reliability

The experiments have shown that most of the controllers successfully cope with the test load, although two controllers, MuL and Maestro, start to drop PacketIns after several minutes of work: MuL dropped 660,271,177 messages and closed 214 connections, and Maestro dropped 463,012,511 messages without closing the connections. For MuL, the failures are caused by problems with the Learning Switch module, which cannot add new entries to its table; this leads to packet loss and the closing of connections with switches.

    5.3 Security

    5.3.1 Malformed OpenFlow header

Incorrect message length (1). Most of the controllers do not crash on receiving this type of malformed message, but close the connection with the switch that sends them. The two controllers that crash upon receiving messages with an incorrect value of the length field are Maestro and NOX. Maestro crashes only when these messages are sent without delay; if we set a delay of 0.01 second between two messages, the controller replies with PacketOuts. Ryu is the only controller whose work is not affected by the malformed messages: it ignores them without closing the connection.

Invalid OpenFlow version (2). POX, MuL and Ryu close the connection with the switch upon receiving a message with an invalid protocol version. The other controllers process messages regardless of the invalid protocol version and reply with PacketOuts with the version set to 0x01.

Incorrect OpenFlow message type (3). MuL and Ryu ignore these messages. MuL sends another FeaturesRequest to the switch and, after receiving the reply, closes the connection with the switch. The other controllers parse them as PacketIn messages and reply with PacketOuts.

    5.3.2 Malformed OpenFlow message

PacketIn message (4). MuL closes the connection with the switch after sending an additional FeaturesRequest and receiving the reply. The other controllers try to parse the encapsulated packets as ARP and reply with a PacketOut message. POX is the only controller which detects the invalid values of the ARP header fields.

Port status message (5). All controllers process messages with malformed ASCII strings correctly, reading the number of characters equal to the field length.

The summary of the security test results is shown in Table 2, where a red cell means the controller crashed, an orange cell means the controller closed the connection, and a green cell means the controller successfully passed the test. A yellow cell means the controller processed the message without crashing or closing the connection, but the error was not detected, which is a possible security vulnerability.

Table 2: Security test results (tests 1-5 for NOX, POX, Floodlight, Beacon, MuL, Maestro, and Ryu, colour-coded as described above)

Thus, the experiments demonstrate that sending malformed OpenFlow messages to a controller can cause the termination of a TCP session with the switch, or even the controller's failure, resulting in the failure of a network segment or even of the whole network. The expected behaviour for an OpenFlow controller on receiving a malformed message is to ignore it and remove it from the incoming queue without affecting other messages. If the controller processes a malformed message without indicating an error, this could cause incorrect network functioning. So it is important to verify not only the OpenFlow headers, but also the correctness of the encapsulated network packets.

    6. CONCLUSION

We present an impartial experimental analysis of the currently available open source OpenFlow controllers: NOX, POX, Floodlight, Beacon, MuL, Maestro, and Ryu.

    The key results of the work:

• The list of the controllers under test has been extended as compared to previous studies.

• We present a detailed description of the experiment setup and methodology.

• Controller performance indexes have been updated using advanced hardware with a 12-CPU-thread configuration (previous research used a maximum of 8 threads). The maximum throughput is demonstrated by Beacon, with 7 million flows per second.

• Reliability analysis showed that most of the controllers are capable of coping with an average workload during long-term testing. Only the MuL and Maestro controllers are not ready for 24/7 work.

• Security analysis highlighted a number of possible security vulnerabilities in the tested controllers which can make them targets for malware attacks.

We also present hcprobe, a framework which fits the demands of fine-grained functional and security testing of controllers.

7. REFERENCES

[1] Amin Tootoonchian, Sergey Gorbunov, Martin Casado, Rob Sherwood. On Controller Performance in Software-Defined Networks. In Proc. of HotICE, April 2012.
[2] Cbench. http://docs.projectfloodlight.org/display/floodlightcontroller/Cbench
[3] Hcprobe. https://github.com/ARCCN/hcprobe/
[4] N. Gude, T. Koponen, J. Pettit, B. Pfaff, M. Casado, N. McKeown, and S. Shenker. NOX: towards an operating system for networks. SIGCOMM Computer Communication Review 38, 3 (2008), 105-110.
[5] POX. http://www.noxrepo.org/pox/about-pox/
[6] Beacon. https://openflow.stanford.edu/display/Beacon
[7] Floodlight. http://Floodlight.openflowhub.org/
[8] MuL. http://sourceforge.net/p/mul/wiki/Home/
[9] Zheng Cai. Maestro: Achieving Scalability and Coordination in Centralized Network Control Plane. Ph.D. Thesis, Rice University, 2011.
[10] Ryu. http://osrg.github.com/ryu/
[11] FlowER. https://github.com/travelping/flower
[12] Trema. http://trema.github.com/trema/
[13] Nettle. http://haskell.cs.yale.edu/nettle/
[14] Open Networking Foundation. OpenFlow Switch Specification, Version 1.0.0.
