


© 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.


MORPH: An Adaptive Framework for Efficient and Byzantine Fault-Tolerant SDN Control Plane

Ermin Sakic, Student Member, IEEE, Nemanja Ðerić, Student Member, IEEE, Wolfgang Kellerer, Senior Member, IEEE

Abstract—Current approaches to tackling the single point of failure in SDN entail a distributed operation of SDN controller instances. Their state synchronization process relies on the assumption of correct decision-making in the controllers. Successful introduction of SDN in critical infrastructure networks also requires catering to the issue of unavailable, unreliable (e.g. buggy) and malicious controller failures. We propose MORPH, a framework tolerant to unavailability and Byzantine failures, that distinguishes and localizes faulty controller instances and appropriately reconfigures the control plane. Our controller-switch connection assignment leverages the awareness of the source of failure to optimize the number of active controllers and minimize the controller and switch reconfiguration delays. The proposed re-assignment executes dynamically after each successful failure identification. We require 2FM + FA + 1 controllers to tolerate FM malicious and FA availability-induced failures. After a successful detection of FM malicious controllers, MORPH reconfigures the control plane to require a single controller message to forward the system state. Next, we outline and present a solution to the practical correctness issues related to the statefulness of distributed SDN controller applications, previously ignored in the literature. We base our performance analysis on a resource-aware routing application, deployed in an emulated testbed comprising up to 16 controllers and up to 34 switches, so as to tolerate up to 5 unique Byzantine and an additional 5 availability-induced controller failures (a total of 10 unique controller failures). We quantify and highlight the dynamic decrease in the packet and CPU load and in the response time after each successful failure detection.

Keywords - Byzantine fault tolerance, SDN, distributed control plane, reliability, availability, empirical study

I. INTRODUCTION

In a Software Defined Network (SDN), switches and routers rely on a centralized SDN controller to provide the necessary configurations for path finding and resource configuration. In a single-controller SDN, the controller represents the single point of failure. In case of a disabled or compromised controller, the SDN control plane becomes inoperable. The user faces the possibility of losing control over the network equipment and hence risks the unavailability of the data plane for new services. Indeed, the availability of the network control

E. Sakic is with the Department of Electrical Engineering and Information Technology, University of Technology Munich, Germany; and Corporate Technology, Siemens AG, Munich, Germany, E-Mail: (ermin.sakic@{tum.de, siemens.com}).

N. Ðerić is with the Department of Electrical Engineering and Information Technology, University of Technology Munich, Germany, E-Mail: ([email protected]).

W. Kellerer is with the Department of Electrical Engineering and Information Technology, University of Technology Munich, Germany, E-Mail: ([email protected]).

and management functions is of paramount importance in industrial networks, where critical network failures may lead to disruption of applications and potentially to safety and regulatory issues [1]–[3]. Hence, for the purpose of resilience, a multitude of strategies for deploying parallel SDN controller instances has been proposed in the recent literature [4]–[8].

In state-of-the-art distributed SDN controller implementations [7], [8], a resilient control plane is established by a controller-switch role-assignment procedure. In these architectures, for each switch, exactly one controller is assigned the role PRIMARY, and multiple backup controllers are assigned the role SECONDARY. Hence, in the case of a critical failure in the PRIMARY controller, a SECONDARY controller can take over the failed PRIMARY controller's switches. When handling external requests, such as switch events or client requests, only a single controller is declared the PRIMARY controller for a node. A good example is the addition of a new flow service in the network. Here, a PRIMARY controller may need to handle the request in the following manner:

1) If the PRIMARY controller receives the client request, the task (e.g., path finding) is executed locally.

2) If a SECONDARY controller receives the client request, the request is proxied by the SECONDARY to the PRIMARY controller, which then proceeds with request handling. If the PRIMARY controller fails, a new PRIMARY that handles the client request is re-elected [9], [10].

However, an occurrence of a Byzantine failure of the PRIMARY controller (i.e. because of a corrupted [1], manipulated [11], or buggy [12] internal state) may lead to incorrect computation decisions in the controller logic. With Byzantine failures, there is incomplete information on whether the controller has truly failed. Thus, the controller may appear as both failed and correct to the failure-detection systems, presenting different symptoms to different observers (i.e. switches). Additionally, a malicious adversary may interfere with or modify the controller logic for the purpose of taking control of the network or re-routing the network traffic [11].

On the other hand, availability-related controller failures lead to the controllers becoming inactive. In contrast to Byzantine failures, such controllers do not actively perform malicious actions, nor do they try to hide their real status. Following an availability failure, they stop accepting client requests and eventually time out. Availability-related failures are detected using failure detectors and not by means of a semantic message comparison. Existing state-of-the-art works [13], [14], however, do not distinguish Byzantine from availability-related failure sources, such as the failure of the underlying hardware, hypervisor or data plane links that interconnect the controllers. They consider the availability-related failures a subset of Byzantine failures and thus tend to overprovision the required number of controller instances.

The approach presented henceforth leverages a successful discovery of controller failures, and of the cause of those failures, in order to optimally adapt the number of deployed controllers as well as the controller-switch assignments during runtime. Our approach realizes a low-overhead control plane operation while protecting against the mentioned types of faults at all times.

A. Our Contribution

We handle Byzantine faults by modeling the controller instances of the distributed SDN control plane as a set of Replicated State Machines (RSMs). MORPH proposes a dynamic re-association of the controller-switch connections after detecting a fault injected by a malicious adversary or an availability-related failure of an SDN controller instance. Depending on the network operator's preference, instead of contacting just the PRIMARY controllers, the edge switches may also forward the client requests to both PRIMARY and SECONDARY controllers (i.e. using an OpenFlow packet-in message [15]) in order to request additional responses, computed by the isolated controller instances. The switch then collects the different responses and evaluates them for inconsistencies. Since the controllers are modeled as RSMs, each correct controller is expected to compute the exact same response for any given input request. The SECONDARY controllers may forward their configuration messages either proactively, or reactively, when inconsistencies in the configuration messages are detected in the switch. Finally, each switch deduces the correct decision locally, based on the minimum number of required consistent (matching) controller configuration messages.
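The switch-side deduction described above can be sketched in a few lines. This is a simplified model with illustrative names (the paper's actual switch logic is the S-COMPARATOR component), not the authors' implementation:

```python
from collections import Counter

def evaluate_responses(responses, f_m):
    """Sketch of the switch-side deduction: accept a configuration once
    FM + 1 matching controller responses arrive; otherwise signal that
    additional (SECONDARY) responses are needed. Names are illustrative."""
    config, votes = Counter(responses).most_common(1)[0]
    if votes >= f_m + 1:
        return ("accept", config)
    return ("request-secondary-responses", None)

# FM = 1: two PRIMARY responses disagree, so the switch reactively
# asks the SECONDARY controllers for additional computations.
print(evaluate_responses(["cfg-A", "cfg-B"], f_m=1))
# After a SECONDARY response arrives, FM + 1 = 2 matches accept cfg-A.
print(evaluate_responses(["cfg-A", "cfg-B", "cfg-A"], f_m=1))
```

Because each correct controller is a deterministic RSM replica, counting exact matches suffices; no semantic interpretation of the configurations is required at the switch.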

In MORPH, we introduce three architectural components: i) REASSIGNER, which detects the controller failures, differentiates the types of failures and reassigns the controller-switch connections on a detected failure; ii) S-COMPARATOR, the switch component that compares the controller responses for the purpose of detecting inconsistent configurations; iii) C-COMPARATOR, the controller agent which compares controller-to-controller messages in order to identify inconsistent state synchronization messages. The comparators additionally apply the controller-switch assignment lists provided by the REASSIGNER. The REASSIGNER recomputes the controller-switch lists with the objective of minimizing the number of controllers considered in the deduction of a new configuration, thus reducing the worst-case waiting period required to confirm any new switch configuration.

We introduce the dynamic controller-switch reassignment procedure that optimally assigns the controllers to switches and makes strict guarantees for the QoS constraints, such as the maximum controller-to-switch and controller-to-controller delays and the controller processing capacity. MORPH solves the controller-switch assignment problem during online operation using an Integer Linear Program (ILP) formulation. As a

consequence, our switch agent applies the internal reconfigurations after receiving a maximum of FM + 1 consistent/matching controller messages, thus minimizing the response time compared to [14] and [13], which require a minimum of 3F + 1 and F + 1 consistent messages, respectively, to forward the switch state at any point in time. In MORPH, FM denotes the maximum number of tolerated Byzantine failures, whereas FA represents the maximum number of tolerated unavailability-induced failures. In the common state-of-the-art literature, this differentiation is not made; hence, there F denotes the number of tolerated combined Byzantine and availability-induced failures. The switches in MORPH require a minimum of one controller message to apply the internal reconfigurations after FM malicious controllers are detected in the system.
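The quorum sizes discussed above can be contrasted in a short sketch. The helper names are ours, and the figures for [14] and [13] follow the reading given in this paper:

```python
def morph_quorum(f_m, detected_malicious=0):
    """Matching controller messages a MORPH switch needs before applying
    a reconfiguration: FM + 1 initially, dropping to a single message
    once all FM malicious controllers are detected and excluded."""
    remaining = max(f_m - detected_malicious, 0)
    return remaining + 1

def scheme14_quorum(f):
    """[14], per this paper's reading: 3F + 1 consistent messages."""
    return 3 * f + 1

def scheme13_quorum(f):
    """[13], per this paper's reading: F + 1 matching messages."""
    return f + 1

# MORPH with FM = 2: the quorum shrinks as malicious replicas are localized.
print([morph_quorum(2, d) for d in range(3)])  # [3, 2, 1]
print(scheme14_quorum(2), scheme13_quorum(2))  # 7 3
```

The shrinking MORPH quorum is what drives the response-time and load improvements reported in the evaluation: fewer messages must be awaited after each successful failure localization.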

In our evaluation, we conclude that the dynamic reassignment of controller-switch connections results in: i) a considerable improvement of the observed best- and worst-case response times in controller-to-switch and controller-to-controller communication; ii) a decrease in the generated packet load; and iii) a decrease in the experienced CPU load.

The performance analysis of our implementation comprises a varied set of maximum tolerated controller failures. Measurements were executed on two well-known large- and medium-sized service provider and data-center topologies. The emulated testbed comprised distributed MORPH processes and Open vSwitch instances and is thus a realistic approximation of the expected actual performance.

We structure the paper as follows. Important terminology is introduced in Sec. II. The state-of-the-art literature is presented in Sec. III. Sec. IV highlights the practical correctness issues related to the relevant state-of-the-art BFT approaches [13], [14]. In Sec. V we introduce the MORPH architecture. Sec. VI-A presents the algorithms implemented by MORPH. Sec. VI-B details the ILP formulation for the dynamic controller-switch assignment procedure. Sec. VII outlines our evaluation methodology. We present the evaluation results in Sec. VIII. Finally, Sec. IX concludes this paper.

II. TERMINOLOGY

To clarify the issues with the existing approaches, we introduce the following important concepts:

• A malicious controller is an active controller in the possession of a malicious adversary. It is able to compute correct and incorrect responses to the client requests and propose the according configurations to the switches. It may disguise itself as a correct controller for an arbitrary period of time. In our model, unreliable controllers that compute an incorrect result as a consequence of a buggy internal state are also considered "malicious". Until suspected, all malicious controllers are considered correct.

• A correct controller is an active controller instance which is not malicious and which actively participates in the cluster of MORPH controllers.

• An unavailable controller is a disabled controller which is not active as a result of a hardware/software failure. It is unknown whether an unavailable controller was correct or malicious at some point in time before its failure.


• An incorrect controller is a controller that is either malicious or unavailable.

• State-independent (SIA) SDN applications: We refer to SDN applications that do not require inter-controller state synchronization as state-independent applications (SIA), e.g. shortest path routing without resource guarantees, controller-supported MAC-learning, load balancing, etc.

• State-dependent (SDA) SDN applications: For reservation-based applications, state synchronization between the controllers is necessary to generate correct new decisions for requests which rely on decisions made in the past (i.e. the causality property). We refer to these types of applications as state-dependent applications (SDA), e.g. reservation-based path finding and resource management [16], flow scheduling, optimal and reservation-aware load balancing [17], etc.
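As a small illustration, the controller classes defined above map onto a simple type. The type and helper names are ours, not the paper's:

```python
from enum import Enum

class ControllerState(Enum):
    """Controller classification per the terminology above."""
    CORRECT = "correct"          # active and not malicious
    MALICIOUS = "malicious"      # active; may compute incorrect responses
    UNAVAILABLE = "unavailable"  # disabled by a hardware/software failure

def is_incorrect(state: ControllerState) -> bool:
    """An incorrect controller is either malicious or unavailable."""
    return state in (ControllerState.MALICIOUS, ControllerState.UNAVAILABLE)

print(is_incorrect(ControllerState.MALICIOUS))    # True
print(is_incorrect(ControllerState.CORRECT))      # False
```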

III. RELATED WORK

In this section, we present the related work in the context of Byzantine Fault Tolerance (BFT) research in SDN.

The prominent open-source SDN platforms, such as OpenDaylight [7] and ONOS [8], support no means of identifying Byzantine failures in the controller cluster. However, they do implement measures to support controller fail-over in the face of availability-induced failures. More specifically, ONOS and OpenDaylight allow for replication of the controller state in a strong and eventually consistent manner [18], [19], so as to provide for availability of the data-store knowledge in the SECONDARY controllers after a failure of a PRIMARY. Their strongly consistent data-store is based on the RAFT consensus algorithm [9], [20], [21]. RAFT is, however, susceptible to Byzantine failures, as only a single leader is elected at a time by design. The RAFT leader processes and broadcasts any new data-store updates to the followers before an update may be considered committed. This introduces multiple attack vectors, e.g.: i) the adversary may generate malicious state updates in each follower if the leader controller is in the possession of a malicious adversary; ii) inconsistencies in the state view [17] of master and followers arise if the adversary takes over a RAFT follower; iii) incorrectness issues, e.g., a continuous non-convergence of controller cluster leadership [22].

The definition of the problem of malicious SDN controller instances, including an initial solution draft, was proposed in [14]. The authors discuss the requirement of a minimum assignment of 3F+1 controllers to each switch at all times, so as to tolerate a total number of F Byzantine failures. 3F+1 matching messages are required during the agreement phase and 2F+1 during the execution phase in order to reach the majority and consensus on the correct response after the comparison of the computed configuration outputs [23].

BFT-SMaRt [4] proposes a strongly consistent framework for supporting Byzantine- and availability-induced failures. The authors abstract away the notion of a "failure" and consider controllers failed if either the controller process crashes and never recovers, or the controller keeps infinitely crashing and recovering. Their follow-up work [24] applies BFT-SMaRt in order to improve the data-store replication performance in a strongly consistent SDN. They evaluate the workload generated by real SDN applications as they interact with the data-store.

Neither work considers the issue of malicious controller-to-controller synchronization, where a controller may initiate a malicious state synchronization procedure and thus commit malicious database changes in the correct controllers. Furthermore, they do not cover the aspects of an adaptive controller-switch connection reassignment procedure, nor do they distinguish and leverage the difference between Byzantine and availability-related controller failures.

A recent proposal for a BFT-enabled SDN [13] advocates the usage of a total of 2F+1 instead of 3F+1 controller instances. The authors propose the collection of F+1 PRIMARY controller configuration messages at each switch (generated by its PRIMARY controllers), and a delayed request to the remaining F SECONDARY instances if an inconsistency is detected in the switch. The configuration message is flagged as correct only after a minimum of F+1 matching configurations have been received in the switch. The authors consider neither the source of failure nor the effect of their design on the controller state synchronization process.

IV. ISSUES WITH THE CURRENT DESIGNS

Apart from the inefficiency issues related to the unnecessary control plane packet load during normal (non-faulty) operation in [14] and [13], we have identified multiple correctness-related issues not addressed by these works:

Inadequate constellation of controllers: The 2F+1 mode of configuration leads to incorrect decision-making when correct instances become unavailable (due to a controller-to-switch link, controller-to-controller link or controller instance failure caused by an availability-related issue) prior to the identification of F of the remaining F+1 active instances as malicious. Namely, the switch may incorrectly interpret the majority of Byzantine messages as correct. Depending on the design choice, after receiving F+1 inconsistent (non-matching) controller replies, the switch may decide not to accept any future configuration sent by the remaining active controllers or, worse, accept the proposed malicious configuration (the majority configuration) as the valid one. Either design results in unavailability and/or incorrectness of the system after one correct replica has failed or is unreachable because of unavailability preceding the activation of the malicious controller replicas. Thus, both the number and the order and type of failures matter for correct failure localization. Our approach requires an initial deployment and switch-controller assignment of 2FM + FA + 1 controllers to tolerate FM malicious adversary- and FA availability-induced failures (assuming the total request processing capacity constraint holds). During runtime, we dynamically adapt the remaining number of assigned controllers according to the number and type of detected controller failures.
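A toy numeric check of the failure-ordering argument above (illustrative code; F = 2 in a 2F+1 deployment of 5 controllers):

```python
from collections import Counter

def quorum_decision(replies, f):
    """Return the configuration with >= F+1 matching replies, else None."""
    config, votes = Counter(replies).most_common(1)[0]
    return config if votes >= f + 1 else None

F = 2  # tolerated failures in a 2F+1 = 5 controller deployment

# One correct replica crashes (availability failure) before the two
# malicious replicas diverge: only 2 correct + 2 malicious replies remain,
# so the F+1 = 3 matching-message quorum is unreachable.
print(quorum_decision(["good", "good", "bad", "bad"], F))  # None

# With all 3 correct replicas reachable, the quorum succeeds as intended.
print(quorum_decision(["good", "good", "good", "bad", "bad"], F))  # good
```

A single availability-induced failure thus stalls (or, under a majority-taking design, corrupts) the 2F+1 scheme, whereas budgeting availability failures separately via FA keeps the quorum reachable.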

Insecure controller-to-controller channel: The existing approaches do not address the issue of securing the controller-to-controller communication. For SDA applications, letting a malicious controller compute a correct operation and distribute the Byzantine state changes (e.g. reservations) to the remaining controllers may result in committing those malicious state changes into the data-stores of the correct controller instances.


The issue is exacerbated when reservation values (i.e. the bandwidth and buffer reservations, flow table occupancies, etc.) are exchanged between controller instances for the purpose of enabling resilience for mission-critical communication services. Namely, the global controller decisions potentially stretch across switches under the control of different administrative controller clusters, with each active controller requiring a consistent global state to achieve optimal decision-making.

[Fig. 1 (sequence diagram): data plane clients issue requests R1 and R2; the PRIMARY controllers compute C1 and later the conflicting C2/C2'; the switch detects the inconsistency at T3 and requests a backup computation, and the SECONDARY, which never received the C1 state update, computes a diverging C2'' at T4.]

Fig. 1. State-dependent applications: Issue of making incorrect decisions or non-convergence on the new switch configuration when fetching SECONDARY controller responses.

Distinguishing false positives for SDA: Requesting the computation of F SECONDARY controller configuration responses, in addition to the collected F+1 PRIMARY responses, without reasoning about the state awareness may lead to an incorrect identification of malicious controllers.

A problem scenario is depicted in Fig. 1. According to [13], given an input request R1, PRIMARY controllers compute a consistent switch configuration C1 at time T1. The switch accepts the configuration C1 only if F+1 PRIMARY replicas have computed a consistent result. As this is the case for R1, the switch commits the new configuration C1 at time T2. Similarly, the PRIMARY controllers update their controller state according to the configuration C1. Let the client request R2 arrive at the PRIMARY controllers at time T2. The PRIMARY controllers handle the request and forward the computed correct and Byzantine configurations C2 and C2', respectively, to the switches. The configuration is inconsistent, hence the switches decide to request a recomputation of the result for request R2 at the SECONDARY controllers at time T3. Since the SECONDARY controllers have never observed the updated view of the internal data-store as a result of the changes related to C1, they compute a configuration C2'' at T4 that differs from the correct configuration C2 computed at the correct PRIMARY controllers at T2. Thus, the switches are unable to distinguish correct from incorrect controllers at T5, as F+1 consistent results may not be deducible. More importantly, the switches may mistake correct for malicious controllers (false positives). This is realistic in the scenario where the majority of PRIMARY controllers assigned to the affected switch are malicious.
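The causality problem in this timeline can be reproduced in a few lines. This is a toy model with illustrative names: as the RSM model requires, a controller's response is a deterministic function of the request and its local state:

```python
def compute(request, state):
    """Deterministic RSM step: the configuration depends on both the
    request and the controller's current state (e.g. prior reservations)."""
    return f"cfg({request},reservations={sorted(state)})"

primary_state, secondary_state = set(), set()

# T1/T2: the PRIMARYs compute and commit C1 for R1, updating only
# their own state; the SECONDARYs are never synchronized.
c1 = compute("R1", primary_state)
primary_state.add("C1")

# T3/T4: the PRIMARYs answer R2 from the updated state, while a correct
# SECONDARY, lacking the C1 state change, computes a diverging C2''.
c2 = compute("R2", primary_state)
c2_backup = compute("R2", secondary_state)
print(c2 == c2_backup)  # False: the correct SECONDARY looks "inconsistent"
```

The SECONDARY's reply diverges despite the replica being correct, which is exactly the false-positive risk described above.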

V. MORPH SYSTEM MODEL

We next introduce our system model and architecture in more detail and explain the novel system elements. As our starting point, we consider a typical SDN architecture where the data plane (i.e. networking elements/switches) and control plane are separated [15]. We extend it with additional functional elements to enable a BFT and unavailability-tolerant system operation. The control plane communication between the switches and controllers (S2C, C2S) and between controllers (C2C) is realized via an in-band control channel [25]. Furthermore, all the communication between different elements (i.e. REASSIGNER, C-COMPARATOR and S-COMPARATOR) is assumed to be signed [26]. Thus i) message forging is assumed impossible and ii) message integrity is ensured in the data plane during normal operation. In order to prevent a faulty replica from impersonating a correct replica, correct replicas can authenticate each message using MAC authentication [27], [28].

A. Design Goals

The proposed architecture is designed to accomplish two major objectives: i) data plane protection from incorrect controllers and ii) re-adaptation to the optimal network configuration w.r.t. controller-to-switch connection assignments, based on the present number of correct and incorrect controllers.

i) Data plane protection: Even if there are up to FM malicious controllers and FA failed controllers in the network, the data plane networking elements must never be incorrectly reconfigured or tampered with by the incorrect controllers. MORPH realizes this protection by leveraging replicated computation of control plane decisions and their transmission from multiple controllers to the target switches. Hence, each switch is connected to multiple controllers at the same time, and only accepts and proceeds to commit the reconfiguration requests if a sufficient number of matching messages for reaching the correct consensus was received. If we consider a scenario where FM malicious controllers send their reconfiguration requests to the networking elements, it becomes clear that we need at least FM + 1 correct reconfiguration requests in order to enable the controllers to reach a correct consensus and for the switches to distinguish a correct reconfiguration request from an inconsistent message set. For the detection of an unavailability-induced controller failure, we deploy a failure detector [29], [30] that reliably identifies the fail-silent controllers [31]. Hence, in the case of a failure of FA controllers, we only require one additional backup instance (i.e. FA + 1) to achieve a correct control plane protection. Therefore, in order to protect the data plane from FM malicious controllers and FA failed controllers, each switch has to be assigned to at least |C| = 2FM + FA + 1 controllers. Granted, a switch will accept the new reconfiguration request already after receiving FM + 1 matching messages from the controllers.
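The sizing argument above can be checked arithmetically. A minimal sketch, with a helper name of our choosing:

```python
def required_controllers(f_m, f_a):
    """Per the derivation above: FM + 1 correct matching messages must
    outnumber FM conflicting malicious ones, plus FA spare replicas to
    cover crashed controllers, so |C| = 2*FM + FA + 1 per switch."""
    return 2 * f_m + f_a + 1

# The paper's evaluation scenario: FM = 5, FA = 5 gives 16 controllers,
# matching the "up to 16 controllers" testbed stated in the abstract.
print(required_controllers(5, 5))  # 16
```

Separating FA from FM is what lets MORPH undercut the 3F + 1 provisioning of purely Byzantine schemes: an availability failure costs one spare replica instead of three.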

Additionally, individual switch failures may cause packet loss and thus an unsuccessful delivery of controller messages. If a faulty switch on the control path starts dropping controllers' messages, the next-in-line switches and dislocated controllers eventually start suspecting failed connections to the unreachable controllers. The switches eventually raise an alarm at the REASSIGNER. The REASSIGNER then accordingly marks the unreachable controllers as "unavailable". Only a limited number of "unavailable" controllers is tolerated by our design (governed by the FA parameter). Thus, both the failures coming from unavailable controllers and those from unavailable switches that forward the control packets to and from unreachable controllers are accounted for by the same parameter FA.

ii) Re-adaptation to the optimal network configuration: After the detection of a failed or malicious controller, the tamper-proof entity REASSIGNER recomputes the controller-to-switch connection assignments in order to achieve the optimal network configuration given the current number of correct and incorrect controllers. The recomputed assignments are distributed to the controllers and switches in order to reduce the reconfiguration delay and messaging overhead in the control plane. The definition of the optimal configuration of a network and the reassignment logic is elaborated in Section VI-B.

B. System Elements in MORPH

In order to establish and evaluate the mentioned design goals, the following elements are considered and introduced in the MORPH system architecture depicted in Fig. 2.

Northbound (NBI) and data plane clients: The SDN application clients requesting a particular controller service. NBI clients are aware of the controllers but not of their roles w.r.t. switch assignment. Thus, NBI clients report their requests to all available controllers. The data plane clients feed their requests to a well-defined controller group address. The data plane client requests are then handled by the neighboring edge switch, encapsulated as a packet-in message (e.g. using OpenFlow [15]) and delivered only to the PRIMARY and SECONDARY controllers of that particular switch.

S-COMPARATOR: A switch mechanism that collects and compares the configuration outputs generated at each active PRIMARY or SECONDARY controller instance associated with the switch hosting the S-COMPARATOR instance. On discovery of inconsistencies, the S-COMPARATOR reports the conflicting configurations to the REASSIGNER entity. We assume a tamper-proof operation of the S-COMPARATOR (e.g., realized using Intel SGX enclaves [32]). The S-COMPARATOR cannot be implemented with current OpenFlow logic, but requires one additional software agent responsible for the comparison of arriving control messages. This requirement is, however, not unique to MORPH and holds for any BFT-enabled system where switches take over the comparison logic, including [13] and [14]. Advances in programmable switching hardware and languages, such as P4 [33], could enable more flexible implementations of the S-COMPARATOR. A running instance of the S-COMPARATOR is assumed in each active switch in a MORPH deployment (refer to Fig. 2).

C-COMPARATOR: A controller mechanism that collects and compares the controller configuration messages generated at each active controller instance. In a controller hosting SDA applications, it withholds the propagation of switch configuration messages as long as the majority of controller updates remains inconsistent. After collecting a majority of consistent messages, it updates the underlying controller's internal state configuration to reflect the configuration update (e.g., it updates the status of resource reservations). A running instance of the C-COMPARATOR is assumed in each active controller in a MORPH deployment (ref. Fig. 2).

REASSIGNER: The REASSIGNER is in charge of detecting the incorrect controllers. It is furthermore able to find an optimal controller-to-switch assignment that excludes the incorrect controllers, and to report the updated controller-switch assignment lists to the controllers and switches. The REASSIGNER is triggered every time a controller is suspected by an S-COMPARATOR. The trigger is executed independently of the cause of failure (i.e. the discovery of a Byzantine or unavailable controller). However, the identified cause of failure considerably affects the selection of controllers considered in the assignment. The proposed controller-switch assignment solver is aware of the free capacities of the controllers, as well as of the worst-case delays between the controllers and switches and in between the controllers, and is able to consider the related worst-case bounds when executing an optimal assignment. We detail the assignment procedure in Sec. VI-B. We cannot trust a single REASSIGNER instance to be correct. To make the REASSIGNER resistant against malicious and availability failures, similarly to the controllers, it must be implemented as a group of replicas that also carry out a BFT agreement protocol after each successful reassignment. For brevity, in the rest of the paper, we assume a tamper-proof operation of the REASSIGNER (e.g., realized using Intel SGX enclaves [32]) and focus our attention on solving the Byzantine faults in the context of SDN controllers only.

Fig. 2. MORPH architecture comprising: i) the SDN controllers that host the C-COMPARATOR; ii) the switches that host the S-COMPARATOR; iii) the northbound (NBI) and data plane clients; and iv) the REASSIGNER element. The REASSIGNER is in charge of dynamically recomputing the controller-switch assignment after a controller failure is reported by an S-COMPARATOR. (The figure depicts the client-controller, controller-controller, PRIMARY controller-switch, SECONDARY controller-switch, REASSIGNER-C-COMPARATOR and REASSIGNER-S-COMPARATOR channels.)


C. Application Types

We distinguish two different types of SDN applications: state-independent (SIA) and state-dependent applications (SDA). SIA refers to all SDN applications which process a client's requests independently of the actual network state. Hence, given a repeated client input, a correct SIA application always generates a constant, semantically equal response. One representative of SIA is hop-based shortest path routing where, assuming no topology changes, the shortest path between the hosts remains the same regardless of the current link and node utilization. Contrary to SIA, SDA refers to all applications which base their decision-making on the current network state. Hence, if we consider typical load-balancing routing, two identical requests (e.g. path requests) could generate completely different responses (i.e. different routes) from the controller, so as to optimize for the total current resource utilization given the instantaneous network state.

D. Necessity of the Controller State Synchronization

Consensus across SDA instances: In a resource-based SDA, such as a routing application where the cost of each link depends on the current network state (e.g. link utilization), a problem might arise if two scattered clients (e.g. client A and client B) request the same path from distributed controllers at approximately the same time. Due to the propagation and processing delays, the controllers in the proximity of client A and the controllers in the proximity of client B could process the corresponding requests in a non-deterministic order. Since the network state is updated after processing each request, even correct controllers could produce inconsistent responses to the switches. MORPH solves this problem by delaying the correct controllers' configuration responses to the switches until a sufficient number of controller messages generated by the PRIMARY and SECONDARY set, required to reach a common consensus for the client request, is collected. In order to achieve the consensus: i) all PRIMARY and SECONDARY controllers first compute a response immediately after receiving the client request; ii) they next store it in an internal database; and iii) they exchange their decisions using the any-to-any C2C channels. Finally, consensus is reached for a controller's C-COMPARATOR component when it receives at least ⌊(ReqP + ReqS)/2⌋ + 1 identical responses for a given request, where ReqP is the current number of required PRIMARY controllers and ReqS is the current number of required SECONDARY controller assignments per switch.
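The SDA consensus test above can be sketched as follows. This is a hedged illustration of the threshold check only (the helper name and response encoding are hypothetical, not the C-COMPARATOR's actual API):

```python
from collections import Counter


def sda_consensus(responses, req_p, req_s):
    """responses: mapping controller_id -> computed configuration.
    Returns the agreed configuration, or None while consensus is pending."""
    needed = (req_p + req_s) // 2 + 1  # floor((ReqP + ReqS) / 2) + 1
    if not responses:
        return None
    cfg, votes = Counter(responses.values()).most_common(1)[0]
    return cfg if votes >= needed else None
```

With ReqP = 3 and ReqS = 3, the threshold is 4: three identical responses are not yet sufficient, a fourth unlocks the state update.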

Proxying mechanism in SDA and SIA instances: Furthermore, both in the SDA and SIA routing applications, the establishment of certain long paths is not achievable without the C2C state synchronization. For example, a client could initiate a routing request via its neighbouring edge switch SA to the set of SA's PRIMARY and SECONDARY controllers. However, if the determined path contains a switch which is not assigned to the same set of PRIMARY and SECONDARY controllers as SA, the configuration of the flow rules by the computing controller would not be allowed. Therefore, only when an actual controller, which is assigned the role of PRIMARY or SECONDARY controller for the target switch, compares and validates the new configuration on the C2C channel does it decide to dispatch a response to the affected switches. The proxying mechanism could alternatively be omitted by enforcing the edge switch to forward the requests for SDN applications that operate globally on the data plane to all available controllers. However, this design intuitively does not scale well with the number of controllers, as the processing power of controllers and switches is limited [34].

VI. ENABLING BFT OPERATION OF A DISTRIBUTED SDN CONTROL PLANE

Next we detail the system flow and the algorithms behind MORPH. In the second part, we formulate the objectives and constraints for the Integer Linear Program (ILP) that dynamically reassigns the controller-switch connections during runtime, as part of the REASSIGNER logic.

A. Algorithm

The attached algorithms describe the operation of MORPH in more detail. A simplified visual representation and system workflow involving the introduced algorithms is provided in Fig. 3. We distinguish the following eight steps:

1) Given a list of controllers and switches, as well as the list of clients and their worst-case capacity requirements, the REASSIGNER executes the initial assignment of control plane connections. It considers the unidirectional delay as well as the controller capacity constraint. As described in Sec. VI-B, the REASSIGNER minimizes the total number of active controllers when assigning switches to controllers. This process is embodied in Lines 1-2 of Alg. 1.

2) The controller-switch assignment lists are distributed to the SDN controllers and switches. Switches are thus assigned their PRIMARY and SECONDARY controllers. From now on, any received configuration message initiated by a remote controller is queued for evaluation in the switch if and only if the configuration message was initiated by a controller that belongs to either the PRIMARY or SECONDARY controller set of that switch. Otherwise, the configuration message is rejected and dropped.

3) An end-client sends off their request to the SDN controller, e.g. a request for a computation of a QoS-constrained path, a load-balancing request etc. Northbound clients send their requests directly to all controllers, while data plane clients stay unaware of the location of the controllers, so as to simplify the client-side / user logic [35]. The data plane clients feed their requests directly into the network. The next-hop edge switch then intercepts and proxies the request to its assigned PRIMARY and SECONDARY controllers.

4) The PRIMARY and SECONDARY controllers of the switch which initiated the client request compute the corresponding configuration response and decide to apply the computed response configuration in the affected switches. We enable the selection of a flexible trade-off between the switch configuration time overhead (response time) and the generated control plane load. In general, the controller instances assigned to a switch are either of the PRIMARY or the SECONDARY role. Depending on the point in time


Fig. 3. A simplified visual representation of the system workflow and the distributed algorithm execution as defined in Sec. VI-A. The portrayed workflow depicts the steps of: (1-2) controller-switch assignment; (3-5) client request dissemination and handling in the PRIMARY and SECONDARY C-COMPARATOR instances; (6-7, 8-10) dissemination and handling of PRIMARY and SECONDARY controller responses in the affected S-COMPARATOR instances, respectively; (11-12, 13-14) dissemination and handling of Byzantine and/or availability-induced failures in the REASSIGNER, respectively.

the SECONDARY controller instances decide to forward the computed configuration responses to their switches, the overhead in terms of response time and control plane load may vary. Therefore, we define the following two models:

• NON-SELECTIVE: Propagation of computed configurations from SECONDARY controllers to switches initiates immediately after receiving the client request. In the case of SDA, new configurations are sent to switches only after reaching the majority consensus on the new configuration message, as explained later in the text.

• SELECTIVE: This model imposes an additional step of buffering the intermediate result (either locally computed in the case of SIA, or the majority result in the case of SDA applications) and forwarding the result to the switch on-demand, whenever the switch requests additional configuration messages from its SECONDARY controllers.

In the case of the SELECTIVE model, only the controllers which are the PRIMARY controllers of the to-be-configured switches forward their request directly to the switch. In the case of the NON-SELECTIVE model, the SECONDARY controllers also proactively forward their configuration messages to the switches. This differentiation is depicted in Lines 9-16 of the SIA-specific Alg. 2 and Lines 11-18 of the SDA-specific Alg. 3.

State-independent applications (SIA): In the case of SIA, the controllers forward the computed configuration messages to the target switches if they are assigned either the PRIMARY or SECONDARY role for the target switch. If the controller that computes the configuration is assigned neither role for the target switch, the configuration result is forwarded to the switch only after a consensus is achieved on the actual PRIMARY/SECONDARY controllers of the switch. For SIA, consensus is achieved after collecting ReqP identical messages on the PRIMARY/SECONDARY instances. ReqP denotes the currently required number of assigned PRIMARY controllers per switch.

State-dependent applications (SDA): In the case of SDA, all correct controllers must first reach consensus on their common internal state update. To this end, they deduce the majority configuration message (Lines 8-9 of Alg. 3). The majority configuration response is necessary in order to exclude the possibility of false positive detection, where correct controllers could be identified as faulty instances following a temporary controller state de-synchronization during runtime, as discussed in Sec. IV. After determining the majority response, the controllers send the configuration message to the switches (Lines 10-16 of Alg. 3). For SDA, the total number of required matching messages to apply a controller-state update is ⌊(ReqP + ReqS)/2⌋ + 1, where ReqS denotes the currently required number of assigned SECONDARY controllers per switch. The total number of exchanged messages is the same as in SIA; however, reaching a consensus before dispatching the configuration messages increases the overall reconfiguration time, as shown in Section VIII.

The controllers that serve the client request synchronize their configuration messages with the PRIMARY and SECONDARY controllers of the affected switch. Thus, the PRIMARY and SECONDARY controllers of the affected switch apply new switch configurations only after a sufficient number of matching messages was generated at the remote controller cluster members that have executed the computation. Alternatively, the switches may distribute the client requests to ALL SDN controllers and thus omit the round trip resulting from the corner case described above. This, however, incurs an additional message load in every situation that is not the described corner case.
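The dispatch rule distinguishing the two models can be summarized in a short sketch; the function and role/mode labels below are our own shorthand, not MORPH's actual interfaces:

```python
def dispatch_action(role: str, mode: str) -> str:
    """role: 'PRIMARY' or 'SECONDARY'; mode: 'SELECTIVE' or 'NON-SELECTIVE'.
    Returns what a controller does with a configuration it computed for a switch."""
    if role == "PRIMARY":
        return "send-now"          # PRIMARY controllers always push immediately
    if mode == "NON-SELECTIVE":
        return "send-now"          # SECONDARYs push proactively in this model
    return "buffer"                # SELECTIVE: store and serve on switch request
```

The trade-off is visible directly: NON-SELECTIVE minimizes response time at the cost of control plane load, while SELECTIVE defers the SECONDARY messages until the switch actually needs them.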


5) The S-COMPARATORs collect the configuration requests and decide, after ReqP consistent (equal) PRIMARY messages, to apply the configuration locally. In the SELECTIVE scenario, where inconsistent messages among the ReqP PRIMARY messages are detected, the switches contact the SECONDARY replicas to fetch additional ReqS responses, as depicted in Line 10 of Alg. 4. After receiving ReqP consistent (equal) messages (Lines 12-15 of Alg. 4), the switches apply the new configuration (Line 15). Thus, the overhead of the collection of ReqP consistent messages dictates the worst case for applying new switch configurations.
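The switch-side decision in this step can be sketched as follows. This is our own rendering of the logic (helper names hypothetical), not the actual S-COMPARATOR implementation:

```python
from collections import Counter


def evaluate_responses(responses, req_p, req_s):
    """responses: list of (controller_id, config) pairs received so far.
    Returns the S-COMPARATOR action for the current message set."""
    if not responses:
        return ("wait", None)
    top_cfg, votes = Counter(cfg for _, cfg in responses).most_common(1)[0]
    if votes >= req_p:
        return ("apply", top_cfg)          # ReqP consistent messages: commit
    if len(responses) >= req_p + req_s:
        return ("drop", None)              # no majority even with SECONDARYs
    if len(responses) >= req_p:
        return ("fetch-secondary", None)   # inconsistency: ask the SECONDARYs
    return ("wait", None)
```

With ReqP = 2 and ReqS = 1, two matching messages commit immediately; two conflicting ones trigger a fetch from the SECONDARY replicas; and three pairwise-conflicting ones are dropped.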

6) After the discovery of an inconsistency among the controller responses (Lines 9-10 of Alg. 4) or a failure of an assigned controller (Lines 24-25), the S-COMPARATORs wait until ReqP + ReqS messages are collected before addressing the REASSIGNER with the controller IDs and the conflicting messages (Line 20 of Alg. 4). In the case when a controller instance has failed, the duration of the time the switch waits before contacting the REASSIGNER corresponds to the worst-case failure detection period (dictated by the underlying failure detector, e.g. the φ-accrual [29] detector). If a switch suspects that a controller assigned to it has failed, it notifies the REASSIGNER as per Line 25 of Alg. 4 and Line 13 of Alg. 1.

7) The REASSIGNER compares the inputs and deduces the majority of correct responses received for the conflicting client request. If a Byzantine failure of a controller is suspected, both ReqP and ReqS are decremented by one for each malicious controller. If a controller is marked as unavailable (independent of the source of failure and the correctness of the controller), only the required number of SECONDARY controllers per switch, ReqS, is decremented by one. Lines 4-17 of Alg. 1 contain this differentiation.
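The quorum bookkeeping of this step mirrors Alg. 1 and can be sketched as follows (our own Python rendering, under the assumption that each call handles exactly one identified failure):

```python
def on_failure_detected(req_p: int, req_s: int, byzantine: bool):
    """Update the per-switch quorum requirements after one identified failure."""
    if byzantine:
        return req_p - 1, req_s - 1  # Byzantine controller: both quorums shrink
    return req_p, req_s - 1          # unavailable controller: only ReqS shrinks


f_m, f_a = 2, 1
req_p, req_s = f_m + 1, f_m + f_a  # initial values, as in Alg. 1
req_p, req_s = on_failure_detected(req_p, req_s, byzantine=True)
req_p, req_s = on_failure_detected(req_p, req_s, byzantine=False)
```

After one Byzantine and one availability failure are identified, the example ends with ReqP = 2 and ReqS = 1; once all FM + FA failures are found, ReqP = 1 and ReqS = 0, i.e. a single controller message suffices.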

8) Based on the ReqP and ReqS values updated during runtime, the REASSIGNER computes the new optimal assignment and configures the switches and controllers with the new controller-switch assignment lists. Steps 3-7 repeat until all FM malicious and FA unavailable controllers are eventually identified.

By lowering the maximum number of required PRIMARY and SECONDARY controller assignments per switch, the REASSIGNER minimizes the total control plane overhead in terms of packet exchange on both the controller-to-controller and controller-to-switch channels, as well as the time required to confirm new controller and switch state configurations.

B. REASSIGNER Logic (Integer Linear Program)

The REASSIGNER assigns the controller roles to switches on a per-switch basis. It takes a list of controllers and switches, their unidirectional delays and delay requirements, as well as the list of clients and the client request arrival rates as input.

For brevity, we henceforth define a single objective for the assignment problem. Namely, the REASSIGNER minimizes the number of active controllers, so as to lower the total overhead of controller-to-controller communication, while taking into consideration the maximum delay bound and the available controller capacities when assigning controllers to switches.

Algorithm 1 REASSIGNER: Controller-switch assignment

Notation:
  S     Set of available SDN switches
  C     Set of available SDN controllers
  B     Set of detected (blacklisted) SDN controllers
  Ap    Set of PRIMARY controller-switch assignments
  As    Set of SECONDARY controller-switch assignments
  ReqP  No. of required PRIMARY controllers per switch
  ReqS  No. of required SECONDARY controllers per switch
  FM    Maximum number of tolerated malicious controller failures
  FA    Maximum number of tolerated unavailability controller failures

Initial variables: ReqP = FM + 1, ReqS = FA + FM

 1: procedure CONTROLLER-SWITCH ASSIGNMENT
 2:   (Ap, As) := Assign-Controllers(S, C)
 3:
 4:   upon event suspect-byzantine<Cm> do
 5:     C ← C \ Cm
 6:     B ← B ∪ Cm
 7:     ReqP = ReqP − 1
 8:     ReqS = ReqS − 1
 9:     (Ap, As) := Assign-Controllers(S, C, ReqP, ReqS)
10:     signal(S, new-assignment<Ap, As>)
11:     signal(C, new-assignment<Ap, As, B>)
12:
13:   upon event controller-failed<Cf> do
14:     C ← C \ Cf
15:     B ← B ∪ Cf
16:     ReqS = ReqS − 1
17:     (Ap, As) := Assign-Controllers(S, C, ReqP, ReqS)
18:     signal(S, new-assignment<Ap, As>)
19:     signal(C, new-assignment<Ap, As, B>)

The mentioned objective additionally allows for implicit load-balancing of individual controller-switch connections, as long as all controllers are instantiated with an equal initial capacity.

Let $A^P_{C_i,S_j}$ and $A^S_{C_i,S_j}$ denote the active assignment of a controller instance $C_i$ to the switch $S_j$ as a PRIMARY or SECONDARY controller, respectively. Then $U_{C_i}$ denotes the active participation of the controller $C_i$ in the system:

$$U_{C_i} = \begin{cases} 1 & \text{when } \sum_{S_j \in S} A^P_{C_i,S_j} + \sum_{S_j \in S} A^S_{C_i,S_j} > 0 \\ 0 & \text{otherwise} \end{cases}, \quad \forall C_i \in C \qquad (1)$$

The objective function can then be formalized as:

$$\min \sum_{C_i \in C} U_{C_i} \qquad (2)$$

In the following, we define the constraints of the corresponding ILP.


Algorithm 2 C-COMPARATOR: Handling of state-independent application (SIA) requests in the SDN controller

Notation:
  P        Client request (e.g. flow request) initiated at switch SC
  C        The set of available SDN controllers
  Ci       The local controller instance
  Cr       The remote controller instance
  RP,Si    Configuration response intended for switch Si
  Ap       Set of PRIMARY controller-switch assignments
  As       Set of SECONDARY controller-switch assignments
  ASi,Ci   Connection assignment variable for the controller-switch pair (Si, Ci)

 1: procedure HANDLE CLIENT REQUEST
 2:   upon event packet-in<SC, P> do
 3:     RP := handle-packet-in(P)
 4:     signal(C, response-sync<Ci, RP>)
 5:
 6:   upon event response-sync<Cr, RP> do
 7:     if is-sia-consensus-reached(RP) or Ci == Cr then
 8:       for all Si ∈ affected-switches(RP) do
 9:         if mode == non-selective then
10:           if ASi,Ci ∈ Ap ∪ As then
11:             signal(Si, response<Ci, RP,Si>)
12:         else if mode == selective then
13:           if ASi,Ci ∈ Ap then
14:             signal(Si, response<Ci, RP,Si>)
15:           else if ASi,Ci ∈ As then
16:             store-secondary-response(RP,Si)
17:
18:   upon event response-request<Si, P> do
19:     signal(Si, response<Ci, RP,Si>)

Minimum Assignment Constraint: In order to tolerate FM Byzantine and FA unavailability failures, at time t = 0 the REASSIGNER initially assigns ReqP(0) = FM + 1 PRIMARY controllers and ReqS(0) = FM + FA SECONDARY controllers per switch. The values ReqP and ReqS are time-variant and are adapted on every discovered Byzantine/availability failure or successful controller recovery:

$$\sum_{C_i \in C} A^P_{C_i,S_j} \geq Req_P(t), \quad \forall S_j \in S$$
$$\sum_{C_i \in C} A^S_{C_i,S_j} \geq Req_S(t), \quad \forall S_j \in S \qquad (3)$$

Unique Assignment Constraint: Each controller Ci may be assigned either the PRIMARY or the SECONDARY role, at most once per switch Sj:

$$A^P_{C_i,S_j} + A^S_{C_i,S_j} \leq 1, \quad \forall C_i \in C, S_j \in S \qquad (4)$$

Controller Capacity Constraint: Let $P_{C_i}$ denote the total available capacity of controller $C_i$. Let $L_{CL_k}$ and $L_{S_j}$ denote the load stemming from the northbound client $CL_k$ and the data plane edge switch $S_j$, respectively. The sum of the loads generated by the assigned switches at controller $C_i$ may

Algorithm 3 C-COMPARATOR: Handling of state-dependent application (SDA) requests in the SDN controller

Notation:
  P           Client request (e.g. flow request) initiated at switch SC
  C           The set of available SDN controllers
  Ci          The local controller instance
  Cr          The remote controller instance
  Rmaj_P,Si   Majority configuration for switch Si
  RP          Buffer containing controller responses for request P
  Ap          Set of PRIMARY controller-switch assignments
  As          Set of SECONDARY controller-switch assignments
  ASi,Ci      Connection assignment variable for the controller-switch pair (Si, Ci)

 1: procedure HANDLE CLIENT REQUEST
 2:   upon event packet-in<SC, P> do
 3:     RP := handle-packet-in(P)
 4:     signal(C, response-sync<Ci, RP>)
 5:
 6:   upon event response-sync<Cr, RP> do
 7:     RP ← RP ∪ <Cr, RP>
 8:     if is-sda-consensus-reached(RP) then
 9:       Rmaj_P := majority-response(RP)
10:       for all Si ∈ affected-switches(Rmaj_P) do
11:         if mode == non-selective then
12:           if ASi,Ci ∈ Ap ∪ As then
13:             signal(Si, response<Ci, Rmaj_P,Si>)
14:         else if mode == selective then
15:           if ASi,Ci ∈ Ap then
16:             signal(Si, response<Ci, Rmaj_P,Si>)
17:           else if ASi,Ci ∈ As then
18:             store-secondary-response(Rmaj_P,Si)
19:
20:   upon event response-request<Si, P> do
21:     signal(Si, response<Ci, Rmaj_P,Si>)

not exceed the difference of the total available controller capacity and the sum of the constant loads stemming from the northbound clients (ref. Sec. V):

$$\sum_{S_j \in S} A^P_{C_i,S_j} \cdot L_{S_j} + \sum_{S_j \in S} A^S_{C_i,S_j} \cdot L_{S_j} \leq P_{C_i} - \sum_{CL_k \in CL} L_{CL_k}, \quad \forall C_i \in C \qquad (5)$$

Delay Bound Constraint: We assume well-defined worst-case upper bounds for the unidirectional delays in controller-to-switch and controller-to-controller communication. Let $d_{C_i,S_j}$ and $d_{C_i,C_j}$ denote the guaranteed worst-case experienced delay for the controller-switch pair $(C_i, S_j)$ and the controller-controller pair $(C_i, C_j)$, respectively. Given the maximum tolerable global upper-bound delays $D_{C,S}$ and $D_{C,C}$, we define the related constraints as:

$$A^P_{C_i,S_j} \cdot d_{C_i,S_j} \leq D_{C,S}, \quad \forall C_i \in C, S_j \in S$$
$$A^S_{C_i,S_j} \cdot d_{C_i,S_j} \leq D_{C,S}, \quad \forall C_i \in C, S_j \in S$$
$$d_{C_i,C_j} \leq D_{C,C}, \quad \forall C_i, C_j \in C, C_i \neq C_j \qquad (6)$$
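For intuition, objective (2) under constraints (3), (4) and the controller-switch delay part of (6) can be solved by brute force on a toy instance. The topology, delays and requirements below are made up for illustration; a real REASSIGNER would use an ILP solver and also enforce the capacity constraint (5):

```python
from itertools import product

controllers = ["C1", "C2", "C3"]
switches = ["S1", "S2"]
# Hypothetical worst-case controller-to-switch delays (ms).
delay = {("C1", "S1"): 3, ("C1", "S2"): 9, ("C2", "S1"): 4,
         ("C2", "S2"): 5, ("C3", "S1"): 8, ("C3", "S2"): 2}
D_CS = 10            # maximum tolerable controller-switch delay
ReqP, ReqS = 1, 1    # required PRIMARY/SECONDARY controllers per switch


def feasible(assign):
    # assign: {(c, s): role} with role in {"P", "S"}. Constraint (4) holds
    # by construction, since each pair carries at most one role.
    for s in switches:
        prim = [c for c in controllers if assign.get((c, s)) == "P"]
        sec = [c for c in controllers if assign.get((c, s)) == "S"]
        if len(prim) < ReqP or len(sec) < ReqS:  # constraint (3)
            return False
    return all(delay[pair] <= D_CS for pair in assign)  # delay bound (6)


def active(assign):
    # U_Ci = 1 iff controller Ci holds any role on any switch.
    return {c for (c, _s) in assign}


best = None
roles = [None, "P", "S"]
for combo in product(roles, repeat=len(controllers) * len(switches)):
    assign = {pair: r for pair, r in zip(product(controllers, switches), combo) if r}
    if feasible(assign) and (best is None or len(active(assign)) < len(active(best))):
        best = assign

n_active = len(active(best))  # minimized number of active controllers
```

On this toy instance two active controllers suffice (e.g. C1 and C2 covering both switches), since each switch needs one PRIMARY and one SECONDARY from distinct controllers.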


Algorithm 4 S-COMPARATOR: Processing of controller configuration messages in the switch

Notation:
  ReqP   No. of required PRIMARY controllers per switch
  ReqS   No. of required SECONDARY controllers per switch
  RP     The set of received responses for client request P
  A_p^Si Set of PRIMARY controllers assigned to switch Si
  A_s^Si Set of SECONDARY controllers assigned to Si
  Z      The set of designated shufflers assigned to switch Si

 1: procedure COMPARE AND APPLY CONTROLLER CONFIGURATIONS
 2:   upon event receive-response<Cj, RP> do
 3:     RP ← RP ∪ <Cj, RP>
 4:     if Cj ∈ A_p^Si then
 5:       if |RP| == ReqP then
 6:         Rmajority := majority-responses(RP)
 7:         if |Rmajority| ≥ ReqP then
 8:           apply(Rmajority)
 9:         else
10:           signal(A_s^Si, response-request<Si, P>)
11:     else if Cj ∈ A_p^Si ∪ A_s^Si then
12:       if ReqP < |RP| ≤ ReqP + ReqS then
13:         Rmajority := majority-responses(RP)
14:         if |Rmajority| ≥ ReqP then
15:           apply(Rmajority)
16:           for all <Ci, Ri> ∈ RP do
17:             if Ri ∉ Rmajority then
18:               Cincnst ← Cincnst ∪ Ci
19:           if Cincnst ≠ ∅ then
20:             signal(Z, suspect-byzant<Cincnst>)
21:         else
22:           drop-response(RP)
23:
24:   upon event controller-failed<Cj> do
25:     signal(Z, controller-failed<Cj>)

C. Communication and Computation Complexity Analysis

TABLE I
NOTATION USED IN SPECIFICATION OF THE COMMUNICATION OVERHEAD

Notation | Property
CLI2S    | Communication channels between clients and switches
S2C      | Upstream communication channels between switches and controllers
C2C      | Controller-to-controller communication channels
C2S      | Downstream communication channels between controllers and switches
m        | Total number of data plane clients in the network
µcli     | The average request rate of data plane clients
|Scli|   | The average number of switches affected by a data plane client's requests
|C|      | The total number of controllers, |C| ≥ 2FM + FA + 1
fM,A     | The current number of failed controllers
E        | Equal to 1/0 if MORPH is in NON-SELECTIVE/SELECTIVE mode

Henceforth we analyze the communication overhead imposed by the MORPH framework when handling data plane clients only (NBI clients are excluded for brevity). Multiple communication channels are present in the system:

• CLI2S: Every request generated by a data plane client is first forwarded to the corresponding edge switch. Therefore, if the average request generation rate of the m data plane clients is µcli, then mµcli is the total request generation rate on the client-to-switch channels (i.e., CLI2S).

• S2C: Upon receiving the request, the edge switch forwards it to its PRIMARY and SECONDARY controllers; hence, the total message rate on the upstream communication channels between switches and controllers is (ReqP + ReqS)mµcli. In the best case, all faulty controllers were detected, hence the total number of PRIMARY and SECONDARY controllers is reduced to ReqP = 1 and ReqS = 0, while in the worst case none of the faulty controllers has failed yet, i.e., the initial values still hold: ReqP = FM + 1 and ReqS = FM + FA.

• C2C: For every request received from the switches, its PRIMARY and SECONDARY controllers calculate the corresponding response (i.e., a forwarding path) and forward it to all of the other available (non-faulty) controllers in the network (i.e., |C| − fM,A − 1 of them). The total dynamic rate on the controller-to-controller channel is thus (|C| − fM,A − 1)(ReqP + ReqS)mµcli. In the best case, all failed controllers were detected, thus fM,A = FM + FA; in the worst case, none of the controllers has failed, i.e., fM,A = 0.

• C2S: After the consensus is reached for the corresponding request, in the NON-SELECTIVE mode all PRIMARY and SECONDARY controllers issue the reconfiguration responses to all affected switches (i.e., the variable E = 1), while in the SELECTIVE mode, only the PRIMARY controllers issue the responses (i.e., E = 0). If we consider that on average |Scli| switches are reconfigured per client request, the total number of responses issued by controllers towards switches is |Scli|(ReqP + E·ReqS)mµcli. In the best case, all faulty controllers were detected (i.e., ReqP = 1) and the working mode is SELECTIVE (E = 0); thus only |Scli|mµcli messages are needed. In the worst case, i.e. the initial case, ReqP = FM + 1, E = 1 and ReqS = FM + FA.

The summary of the communication overhead is presented in Table II, with the corresponding notation summarized in Table I. Table II also presents the required number of messages in the case with no fault-tolerance, i.e., when the network is handled by a single SDN controller. Furthermore, we also present the communication overhead specific to the design presented in the related work [13]. It can be observed that MORPH requires the same or a lower number of messages on the CLI2S, S2C and C2S channels, depending on the current network state. Furthermore, as mentioned in Sec. IV, controller-to-controller communication was not discussed in [13]; thus SDA are not handled correctly there, in contrast to MORPH.
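The per-channel rates above can be computed directly from the formulas; a hedged sketch (symbols as in Table I, the function name is ours) that also reproduces the best- and worst-case columns of Table II:

```python
def channel_rates(m, mu, req_p, req_s, n_ctl, f_ma, n_scli, selective):
    """Per-channel message rates (symbols as in Table I)."""
    e = 0 if selective else 1
    return {
        "CLI2S": m * mu,
        "S2C": (req_p + req_s) * m * mu,
        "C2C": (n_ctl - f_ma - 1) * (req_p + req_s) * m * mu,
        "C2S": n_scli * (req_p + e * req_s) * m * mu,
    }


f_m, f_a, m, mu, n_scli = 2, 1, 10, 1.0, 3   # hypothetical scenario
n_ctl = 2 * f_m + f_a + 1                    # |C| = 2*FM + FA + 1
# Worst case: no failures detected yet, NON-SELECTIVE mode.
worst = channel_rates(m, mu, f_m + 1, f_m + f_a, n_ctl, 0, n_scli, False)
# Best case: all FM + FA faulty controllers identified, SELECTIVE mode.
best = channel_rates(m, mu, 1, 0, n_ctl, f_m + f_a, n_scli, True)
```

Plugging in ReqP = FM + 1 and ReqS = FM + FA reproduces the worst-case S2C rate (2FM + FA + 1)mµcli, while the best case collapses to mµcli, as in Table II.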

VII. EVALUATION METHODOLOGY

In this section we present our evaluation methodology, including the target SDN application, the evaluated topologies and the overall scenario. Furthermore, we elaborate our emulation approach in more detail.


TABLE II
EXCHANGED NUMBER OF MESSAGES ON EACH COMMUNICATION CHANNEL

Channel | MORPH (general)                    | MORPH best-case          | MORPH worst-case             | No fault-tolerance | [13]
CLI2S   | mµcli                              | mµcli                    | mµcli                        | mµcli              | mµcli
S2C     | (ReqP + ReqS)mµcli                 | mµcli                    | (2FM + FA + 1)mµcli          | mµcli              | (2(FM + FA) + 1)mµcli
C2C     | (|C| − fM,A − 1)(ReqP + ReqS)mµcli | (|C| − FM − FA − 1)mµcli | (|C| − 1)(2FM + FA + 1)mµcli | −                  | −
C2S     | |Scli|(ReqP + E·ReqS)mµcli         | |Scli|mµcli              | |Scli|(2FM + FA + 1)mµcli    | |Scli|mµcli        | |Scli|(2(FM + FA) + 1)mµcli

A. Application model

For our SDN application we have implemented a centralized resource-based path finding application based on Dijkstra's algorithm [36]. We deploy the routing application once per controller instance, so as to cater to the SPOF issue.

In the case of the SIA realization, the weight of each link is set to a constant link delay value corresponding to the propagation delay. Hence, the application is stateless and computes the same shortest path between two arbitrary hosts as long as no topology changes occur. Thus, as soon as our SDN controller application receives a client request, it computes the new path, signals the path configuration (i.e. OpenFlow FlowMod updates) to the affected switches, and notifies the other controllers for state synchronization purposes.

For the SDA case, we consider load-balanced routing where the cost of each link depends on the current link load. A correct distribution of switch configurations thus necessitates a consensus across the set of correct SDN controllers, in order to avoid overloading particular links (ref. Sec. V-D).

B. Scenarios

To validate our claims in a realistic environment, we have emulated the Internet2 Network Infrastructure Topology1 as well as a standard fat-tree data-center topology, depicted in Fig. 4a and Fig. 4b, respectively. The evaluated Internet2 and fat-tree topologies comprise 34 and 20 switches, respectively. We configured MORPH to support up to FM = 5 Byzantine and FA = 5 availability-related controller failures. Thus, to allow for Byzantine fault-tolerant operation, 2FM + FA + 1 = 16 controller processes were deployed. We varied FM and FA individually from 1 to 5, so as to evaluate the overhead of added robustness against either type of failure. The overhead of our design scales with the number of tolerated failures FM and FA and is independent of the number of deployed switches. Each SDN controller implements the logic to execute the routing task as well as the C-COMPARATOR component, which compares the controller-to-controller synchronization inputs and notifies the switches of new path configurations. An S-COMPARATOR agent is hosted inside each switch instance and listens for remote connections. All communication channels (ref. Fig. 2) are realized using TCP with Nagle's algorithm enabled [37].

A realistic, in-band control plane channel was realized at all times. To realize the switching plane, a number of interconnected Open vSwitch2 v2.8.2 virtual switches were

1Internet2 Advanced Networking - https://www.internet2.edu/products-services/advanced-networking/

2Open vSwitch - https://www.openvswitch.org/

(a) Internet2 topology [38]

(b) Fat-tree topology [39]

Fig. 4. Exemplary network topologies and controller placements used in the evaluation of MORPH. Elements highlighted in green and blue represent the switches and clients, respectively. Red elements are the controller instances placed as per [38], [39]. Each switch instance in the Internet2 topology is allocated a client, while the fat-tree topology hosts clients at the leaf switches.

instantiated and isolated in individual Docker containers. To reflect the delays incurred by the length of the optical links in the geographically scattered Internet2 topology, we assume a signal propagation speed of 2 · 10^5 km/s in the optical fiber links. We then derive the link distances from the publicly available geographical Internet2 data3 and inject the propagation delays using Linux's Traffic Control (tc) tool. The links of the fat-tree topology only possess the inherent processing and queuing delays. The arrival rates of the incoming service embedding requests were modeled using a negative exponential distribution [39]. In the fat-tree topology, each leaf switch was connected to 2 client instances, bringing the total number of clients up to 16. The Internet2 topology deployed one client per switch.
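The delay derivation can be sketched as follows. This is our own illustration: the coordinates, interface name and helper names are hypothetical, and only the final `tc`/`netem` command reflects the tool actually named above:

```python
# Sketch (assumption): deriving per-link propagation delays from great-circle
# distances and an assumed 2*10^5 km/s signal speed in optical fiber, then
# emitting the corresponding tc command for injection via netem.
import math

SIGNAL_SPEED_KM_S = 2e5  # approx. speed of light in optical fiber

def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine distance between two (lat, lon) points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def propagation_delay_ms(distance_km):
    return distance_km / SIGNAL_SPEED_KM_S * 1e3

# e.g., a 1000 km fiber link adds 5 ms of one-way propagation delay:
delay = propagation_delay_ms(1000.0)
print(f"tc qdisc add dev eth0 root netem delay {delay:.1f}ms")
```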

Controller placement: In the Internet2 topology, we leverage a controller placement that allows for high robustness against controller failures. We consider the exemplary placements introduced in [38]. Since the accompanying controller placement framework4 was unable to solve the placement problem for a very high number of controllers (i.e., up to 16 in our case), we have instead solved the problem by placing up to 5

3Internet2 topological data (provided by POCO project) - https://github.com/lsinfo3/poco/tree/master/topologies

4Pareto Optimal Controller Placement (POCO) GitHub - https://github.com/lsinfo3/poco


controllers multiple times for the same topology, targeting different optimization objectives (incl. response time, maximized coverage in the case of failures, load balancing, etc.). We then ranked the unique nodes based on their placement preference and selected the 16 highest-ranked nodes to host the MORPH controller processes. We note that the optimality of controller placement decisions is orthogonal to the issue solved in this paper and was thus not considered crucial for our evaluation. The resulting controller placement is depicted in Fig. 4a. The SDN controller replicas in the data-center topology were deployed on the leaf nodes, similar to the controller placement presented in [39].
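The ranking step can be sketched as follows. This is a minimal sketch under our own assumption that a node's preference is the number of placements selecting it; node names are hypothetical:

```python
# Sketch (assumption): rank candidate nodes across several small placements
# computed for different objectives, then pick the k most frequently chosen.
from collections import Counter

def rank_nodes(placements, k=16):
    """placements: list of node sets, one per optimization objective/run."""
    votes = Counter(node for placement in placements for node in placement)
    # Most frequently selected nodes first; ties broken by node id.
    return [n for n, _ in sorted(votes.items(), key=lambda x: (-x[1], x[0]))][:k]

placements = [{"a", "b", "c"}, {"b", "c", "d"}, {"c", "d", "e"}]
print(rank_nodes(placements, k=2))  # ['c', 'b']
```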

TABLE III
PARAMETERS USED IN EVALUATION OF MORPH IMPLEMENTATION

Parameter | Intensity           | Unit | Meaning
FM        | [1, 2, 3, 4, 5]     | N/A  | Max. no. of Byzantine failures
FA        | [1, 2, 3, 4, 5]     | N/A  | Max. no. of availability failures
|C|       | [4, 7, 10, 13, 16]  | N/A  | No. of deployed controllers
1/λFM     | [5, 10, 15, 20]     | [s]  | Mean Byzantine failure inter-arrival time
1/λFA     | [5, 10, 15, 20]     | [s]  | Mean availability failure inter-arrival time
S         | [0, 1]              | N/A  | SIA/SDA operation
E         | [0, 1]              | N/A  | NON-SELECTIVE/SELECTIVE propagation
T         | internet2, fat-tree | N/A  | Topology type

To evaluate the effect of the number, ratio and order of the Byzantine and availability-type failures, we deploy a specialized fault injection component. The fault injection component generates faults in any of the available SDN controller processes using an out-of-band management channel. The targeted active controllers are faulted with uniform probability. The ratio of failure injections is manipulated by specifying the maximum number of each type of failure (ref. Table III). The ordering of the different failure-type injections is governed by the parametrization of mean arrival times as specified in Table III. The failure inter-arrival times follow a negative exponential distribution (similar to [19], [40]).
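A minimal sketch of such a fault-injection schedule, under the assumptions stated above (exponentially distributed inter-arrival times, uniformly chosen targets); all names are illustrative:

```python
# Sketch (assumption): build an injection plan of (time, target) pairs with
# negative-exponentially distributed inter-arrival times (mean from Table III).
import random

def schedule_injections(mean_interval_s, active_controllers, max_faults, seed=42):
    """Return up to max_faults (time_s, target) injection events."""
    rng = random.Random(seed)
    t, plan = 0.0, []
    for _ in range(max_faults):
        t += rng.expovariate(1.0 / mean_interval_s)  # mean = mean_interval_s
        plan.append((t, rng.choice(active_controllers)))  # uniform target
    return plan

plan = schedule_injections(10.0, ["c1", "c2", "c3"], 5)
for when, target in plan:
    print(f"t={when:6.2f}s fault controller {target}")
```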

The Docker- and OVS-based topology emulator, the REASSIGNER, as well as up to 16 C-COMPARATOR and 34 S-COMPARATOR processes were deployed on a single commodity PC with a recent multi-core AMD Ryzen 1600 CPU and 32 GB of DDR4 RAM. We used the Gurobi Optimizer5 to solve the ILP formulated in Sec. VI-B.

VIII. RESULTS

A. Overhead minimization by successful failure detection

Fig. 5 depicts the observed controller-to-controller (C2C) state reconfiguration and switch-reconfiguration (C2S) delays before, during and following the failure injection process. Similarly, the packet load and total system CPU utilization are depicted for the same measurement. Fig. 5 a) depicts the convergence time to the globally consistent state in the replicated controller database. The Best-Case considers the time required to replicate and commit a state update in at least two controllers, while the Worst-Case considers the outcome where every controller converges to the globally consistent state. We observe that the initial failure injections consistently

5Gurobi Optimizer - http://www.gurobi.com/products/gurobi-optimizer

decrease the expected waiting time to achieve consensus and converge the state updates for the worst case. The detection of later injections (Injections 5 and 6) has a lesser effect. For those detections, the added C2C delay outweighs the benefit of the lower number of controller confirmations required to update the controller state on each instance. This is a consequence of the displacement of the remaining active controllers in the Internet2 topology and hence the added controller-to-controller delay.

Fig. 5 b) depicts a drastic decrease in the response time required to handle client requests, i.e., to reconfigure each switch on the identified path with new flow rules. We observe that dynamically decreasing the minimum number of controllers needed to reach consensus on the correct configuration update consistently lowers the expected time to reconfigure both the control and data plane. Thus, the overall throughput of the SDN control plane is increased, as additional service requests may be served in the same amount of time.

Similarly, the exclusion of incorrect controllers from the message comparison procedure consistently lowers the experienced control plane packet and CPU load over the observation period. The exclusion of controller instances specifically benefits the minimization of the number of C2C synchronization messages, as can be observed from the change in the arrival slope in Fig. 5 c). The total CPU load is similarly decreased with each successfully identified controller fault, as both controllers and switches must process a lower total number of controller messages with each controller exclusion. We have observed identical trends for the fat-tree topology.

Note that with a static controller-to-switch assignment, the depicted downward slopes would be non-existent, i.e., a comparable design [13], [14] with a static cluster configuration would not lead to a noticeable decrease in either CPU load or the switch/controller response times, even after the discovery of faulty controller instances. This is in part because the mentioned comparable designs tend to ignore controller messages after detecting a Byzantine fault, instead of fully excluding the controllers from the actual cluster configuration.

Fig. 6 depicts the benefits of distinguishing the type of failure when reducing the number of active controllers used in the consensus procedure. According to Alg. 1, two fewer message exchanges are required to identify the correct message after detection of a malicious controller (FM), while a single message less is required after discovery of an unavailable controller (FA). Successful detection of Byzantine controllers thus leads to a lower average C2S and C2C reconfiguration time, compared to the discovery of the same number of failures of a different type (or a combination of different faults). The detection of an availability failure (FA) provides less information about whether the failed controller is benign or malicious. In that case, the REASSIGNER is more conservative when computing new controller-switch assignments.
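The message-count reduction described above can be summarized in a one-line helper. This is our own sketch of the quantities discussed around Alg. 1, not code from the paper:

```python
# Sketch (assumption): each identified malicious controller removes 2 from the
# number of matching controller messages required, each identified unavailable
# controller removes 1, starting from 2*F_M + F_A + 1.

def required_messages(F_M, F_A, found_malicious=0, found_unavailable=0):
    return 2 * (F_M - found_malicious) + (F_A - found_unavailable) + 1

# The Fig. 5 sample run: F_M = F_A = 5, with 4 Byzantine and 2 availability
# failures eventually identified.
print(required_messages(5, 5))        # 16 (before any detection)
print(required_messages(5, 5, 4, 2))  # 6
```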

B. Impact of SDN application statefulness

Fig. 7 depicts the consequence of the complexity of statefulness of the distributed SDN control plane on the experienced C2C controller- and C2S switch-reconfiguration delays (Fig.


[Fig. 5 plots — (a) Internet2: controller cluster convergence time (C2C commit delay, best- and worst-case, with detected failures 1–6 marked); (b) Internet2: switch reconfiguration delay (experienced C2S reconfiguration delay [ms], with the same failure markers); (c) Internet2: observed number of received packets over time (C2S and C2C packet load); (d) Internet2: observed system CPU load over time.]

Fig. 5. Successful detection of injected Byzantine and availability failures leads to an instant decrease in: a) the experienced controller-to-controller (C2C) delay, both for the best- and worst-case state convergence time; b) the switch reconfiguration delay; c) the measured C2C and controller-to-switch (C2S) packet arrivals (measured at the ingress of a correct controller and switch, respectively); and d) the total measured system CPU load. The depicted measurements are taken in the Internet2 topology, based on 13 controllers and 34 switches. The particular sample shown here depicts the case where 4 Byzantine and 2 availability failures are identified in the depicted order (red and blue vertical dashed lines, respectively).

[Fig. 6 plots — (a) Internet2: controller cluster convergence time (CDF of the C2C commit delay for (FM, FA) ∈ {(4, 0), (0, 4), (2, 2), (3, 1), (1, 3)}); (b) Internet2: switch reconfiguration delay (CDF of the C2S reconfiguration delay for the same (FM, FA) combinations).]

Fig. 6. The type of failure considerably affects the experienced controller-to-controller delay (depicted in Fig. 6 (a)). Indeed, the larger the number of detected Byzantine controller failures, the lower the instantaneous total number of PRIMARY controller messages required to process and commit a new controller state update. Interestingly, the trend is observable for the switch configuration delay as well (depicted in Fig. 6 (b)), but not with the same intensity. The depicted figures result from measurements taken in the Internet2 topology comprising 13 controllers and 34 switches. The observation period includes the time preceding, during and following the successful failure injections.


[Fig. 7 plots — delays for max(FM + FA) ∈ {2, 4, 6, 10}, each with S = 0 and S = 1: (a) Internet2: controller cluster convergence time (worst-case C2C commit delay [ms]); (b) Internet2: switch reconfiguration delay [ms]; (c) Fat-Tree: switch reconfiguration delay [ms].]

Fig. 7. Depending on an SDN application’s requirement for access to the consistent controller data-store, controllers impose a varying controller-to-controller synchronization load. Specifically, in the case of resource-reservation applications, knowledge about the reserved and free resources is a requirement for efficient client request processing (i.e., slicing, path embedding and load-balancing algorithms that rely on resource reservations). S = 1 denotes the case of a stateful controller application, while S = 0 denotes applications which do not demand causality in controller updates.

[Fig. 8 plot — switch reconfiguration delay [ms] for max(FM + FA) ∈ {4, 6, 8, 10}, each with E = 1 (SELECTIVE) and E = 0 (NON-SELECTIVE).]

Fig. 8. Difference in the observed switch reconfiguration delay for NON-SELECTIVE and SELECTIVE propagation of SECONDARY controller configurations in the [7..16]-controller topology that tolerates max(FM + FA) = [4..10] controller failures. Deployment of a large control plane induces a benefit in the average configuration time with the NON-SELECTIVE model. In a topology comprising a smaller number of controllers, the additional system workload related to the propagation of configurations on every SECONDARY controller negatively affects the overall performance.

7 a) and Fig. 7 b), respectively). Stateful SDA applications necessitate consensus, which imposes an additional waiting time in the C2C synchronization. With consensus, the majority of controller configurations must match before deciding on the configuration message to be delivered to the switch. After consensus is reached, the controllers’ data-stores converge to a consistent state in each correct controller replica. The difference in waiting time is not reflected in the data plane (switch) as much as in the C2C communication, since the switch experiences a constant waiting time for the minimum number of required replicas ReqP, independent of the time it takes to reach consensus in the C2C communication.
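The switch-side behavior can be illustrated with a small matching check. This is our own sketch in the spirit of the S-COMPARATOR, not the component's actual implementation:

```python
# Sketch (assumption): the switch applies a configuration once Req_P identical
# controller messages have arrived, regardless of how long the C2C consensus
# round itself took.
from collections import Counter

def decide(messages, req_p):
    """messages: configuration payloads received so far; returns the winning
    payload once req_p identical copies exist, else None (keep waiting)."""
    if not messages:
        return None
    payload, count = Counter(messages).most_common(1)[0]
    return payload if count >= req_p else None

print(decide(["cfgA", "cfgA", "cfgB"], 2))  # cfgA
print(decide(["cfgA", "cfgB"], 2))          # None
```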

However, as depicted in both Fig. 7 and Fig. 8, the overhead in the controller state and switch reconfiguration time is intensified by the maximum number of tolerated failures. The more failures the system is designed to tolerate, the higher the number of required comparisons across the PRIMARY messages in order to forward the system state. We conclude that the experienced controller and switch reconfiguration times scale with the number of tolerated controller failures.

C. Impact of proactive propagation of switch configurations

Fig. 8 depicts the measurements taken in the fat-tree topology comprising 7 to 16 SDN controller instances and 20 switches. A trend reversal in the switch reconfiguration delay can be observed, where for small-sized controller clusters (i.e., 4..10 controllers) the usage of the SELECTIVE mode (E = 1) results in a lower configuration delay overall. For large-scale control planes (13..16 controllers), the trend is reversed and the NON-SELECTIVE mode (E = 0) becomes faster. We assume that the trend reversal is linked to the fact that in the average case, where no failures are detected, the SELECTIVE mode is more efficient than the NON-SELECTIVE one, since


[Fig. 9 plot — reassignment period [ms] versus the total number of identified failures FM + FA ∈ [0, 10].]

Fig. 9. Decrease in the total execution time of the ILP solver for the controller-to-switch connection reassignment procedure. The controller-switch assignment (re-)computations consider 0–10 failure detections and were executed in an emulated 34-switch, 16-controller Internet2 network topology. The time required for the REASSIGNER to reconfigure the control plane scales inversely with the number of successfully identified controller faults. As the REASSIGNER tasks may execute asynchronously, without negatively disturbing the system correctness or control plane availability, we consider the deployment of MORPH in critical operational networks to be practically feasible.

no additional packet overhead / CPU load is generated during the message processing, both in the control and data plane. In the large-scale control plane, controller placement plays a dominant role, so that the decrease in the controller-switch distance outweighs the additional load related to the higher number of controller instances. For client requests where some controllers send malicious configurations to the switches, or fail silently and thus do not deliver any new configuration messages, the switch is able to deduce the correct majority only after an additional round-trip to collect further configuration responses from the SECONDARY controllers. The NON-SELECTIVE model is intuitively faster in its worst case, as it propagates every controller response directly to the switch after the computation of the configuration response, independent of the controller's role as PRIMARY or SECONDARY. Thus, no additional switch-controller round-trip delays are experienced when the messages received at the switch are inconsistent. Similarly, the switch is capable of collecting the ReqP consistent messages faster on average, since SECONDARY controllers may potentially deliver new configurations faster than the assigned PRIMARY instances.

D. Controller-switch reassignment time

The ILP solver is executed once after the initial system deployment (to compute the initial switch-controller assignments) and once after each controller failure detection. Fig. 9 depicts a constant decrease in the execution time of the ILP as new faults are injected and failures are observed at the REASSIGNER. Indeed, the exclusion of incorrect controllers from the parameter space decreases the total running time of the ILP solver. Nevertheless, even in a large-scale formulation that assumes the assignment of 16 controller replicas to 34

switches, the REASSIGNER requires only up to ∼540 ms to compute the optimal assignment w.r.t. delay and bandwidth constraints. Hence, MORPH supports real-time switch reassignment and is thus deployable in online operation as well. For larger-scale networks we do not foresee any limitations in the reassignment procedure related to the viability of our solution. Namely, when an inconsistency in the PRIMARY configuration messages is detected in the S-COMPARATOR, the switch initially ignores the inconsistent messages and continues to autonomously contact the SECONDARY replicas to request additional confirmation messages. Thus, the correctness of the system is never endangered and the REASSIGNER may proceed with its reassignment optimization procedure asynchronously in the background. MORPH is hence suitable for operation in critical networks where minimal downtime is a necessary prerequisite (e.g., in industrial control networks).

IX. CONCLUSION

MORPH allows for distinguishing malicious/unreliable from unavailable controllers during runtime, as well as for their dynamic exclusion from the system configuration. In general, enabling Byzantine fault tolerance results in a relatively high system footprint in the network phase where no Byzantine or availability-related failures are identified. However, in critical infrastructure networks, such as industrial networks, this overhead may be unavoidable.

Following a Byzantine or availability-induced SDN controller failure, MORPH allows for an autonomous adaptation of the system configuration and minimization of the distributed control plane overhead. By experimental validation, we have shown that MORPH achieves a performance improvement in both average- and worst-case system response time by minimizing the number of required/considered controllers when deducing new configurations. Thus, the worst-case waiting periods required to confirm new controller state updates and switch configurations are substantially reduced over time. Furthermore, by excluding faulty controllers, the average packet and CPU loads incurred by the generation and transfer of controller messages to the switches are reduced. We have shown that the ILP formulation for QoS-constrained reassignment of controller-to-switch relationships may execute online, for both medium- and large-scale control planes comprising up to 16 controller instances. The time required to re-adapt the system configuration scales with the number of successfully detected faulted controllers. Apart from the minimal additional CPU load related to the dynamic reassignment of controller connections, the average- and worst-case computational and communication overheads are lower than those of comparable BFT SDN designs.

ACKNOWLEDGMENT

This work has received funding from the EU’s Horizon 2020 research and innovation programme under grant agreement number 780315 (SEMIOTICS) and the CELTIC EUREKA project SENDATE-PLANETS (Project ID C2015/3-1), and is partly funded by the BMBF (Project ID 16KIS0473). We are grateful to Hans-Peter Huth, Dr. Johannes Riedl, Dr. Andreas Zirkler, and the reviewers for their useful feedback and comments.


REFERENCES

[1] S. Sommer et al., “RACE: A Centralized Platform Computer Based Architecture for Automotive Applications,” 2013.

[2] T. Mahmoodi et al., “VirtuWind: virtual and programmable industrial network prototype deployed in operational wind park,” Transactions on Emerging Telecommunications Technologies, vol. 27, no. 9, 2016.

[3] Y. Huang et al., “Real-time detection of false data injection in smart grid networks: an adaptive CUSUM method and analysis,” IEEE Systems Journal, vol. 10, no. 2, 2016.

[4] A. Bessani et al., “State machine replication for the masses with BFT-SMaRt,” in Dependable Systems and Networks (DSN), 2014 44th Annual IEEE/IFIP International Conference on. IEEE, 2014.

[5] A. Tootoonchian and Y. Ganjali, “HyperFlow: A distributed control plane for OpenFlow,” in Proceedings of the 2010 Internet Network Management Conference on Research on Enterprise Networking, 2010.

[6] N. Katta et al., “Ravana: Controller Fault-Tolerance in Software-Defined Networking,” in Proceedings of the 1st ACM SIGCOMM Symposium on Software Defined Networking Research. ACM, 2015.

[7] J. Medved et al., “OpenDaylight: Towards a model-driven SDN controller architecture,” in A World of Wireless, Mobile and Multimedia Networks (WoWMoM), 2014 IEEE 15th International Symposium on, 2014.

[8] P. Berde et al., “ONOS: towards an open, distributed SDN OS,” in Proceedings of the Third Workshop on Hot Topics in Software Defined Networking. ACM, 2014.

[9] D. Ongaro et al., “In Search of an Understandable Consensus Algorithm,” in USENIX Annual Technical Conference, 2014.

[10] L. Lamport et al., “Paxos made simple,” ACM SIGACT News, vol. 32, no. 4, 2001.

[11] Q. Yan et al., “Software-defined networking (SDN) and distributed denial of service (DDoS) attacks in cloud computing environments: A survey, some research issues, and challenges,” IEEE Communications Surveys & Tutorials, vol. 18, no. 1, 2016.

[12] P. Vizarreta et al., “Characterization of failure dynamics in SDN controllers,” in Resilient Networks Design and Modeling (RNDM), 2017 9th International Workshop on. IEEE, 2017.

[13] P. M. Mohan et al., “Primary-Backup Controller Mapping for Byzantine Fault Tolerance in Software Defined Networks,” in IEEE Globecom, 2017.

[14] H. Li et al., “Byzantine-resilient secure software-defined networks with multiple controllers in cloud,” IEEE Transactions on Cloud Computing, vol. 2, no. 4, 2014.

[15] N. McKeown et al., “OpenFlow: enabling innovation in campus networks,” ACM SIGCOMM Computer Communication Review, vol. 38, no. 2, 2008.

[16] J. W. Guck et al., “Function split between delay-constrained routing and resource allocation for centrally managed QoS in industrial networks,” IEEE Transactions on Industrial Informatics, vol. 12, no. 6, 2016.

[17] D. Levin et al., “Logically centralized?: State distribution trade-offs in software defined networks,” in Proceedings of the First Workshop on Hot Topics in Software Defined Networks. ACM, 2012.

[18] E. Sakic et al., “Towards adaptive state consistency in distributed SDN control plane,” in IEEE International Conference on Communications, 2017.

[19] ——, “Response Time and Availability Study of RAFT Consensus in Distributed SDN Control Plane,” IEEE Transactions on Network and Service Management, 2017.

[20] H. Howard et al., “Raft Refloated: Do we have Consensus?” ACM SIGOPS Operating Systems Review, vol. 49, no. 1, 2015.

[21] D. Suh et al., “On performance of OpenDaylight clustering,” in NetSoft Conference and Workshops (NetSoft), 2016 IEEE. IEEE, 2016.

[22] Y. Zhang et al., “When Raft Meets SDN: How to Elect a Leader and Reach Consensus in an Unruly Network,” in Proceedings of the First Asia-Pacific Workshop on Networking. ACM, 2017.

[23] J. Yin et al., “Separating agreement from execution for byzantine fault tolerant services,” ACM SIGOPS Operating Systems Review, vol. 37, no. 5, 2003.

[24] F. Botelho et al., “Design and implementation of a consistent data store for a distributed SDN control plane,” in Dependable Computing Conference (EDCC), 2016 12th European. IEEE, 2016.

[25] L. Schiff et al., “In-band synchronization for distributed SDN control planes,” ACM SIGCOMM Computer Communication Review, vol. 46, no. 1, 2016.

[26] N. Paladi and C. Gehrmann, “TruSDN: Bootstrapping Trust in Cloud Network Infrastructure,” in International Conference on Security and Privacy in Communication Systems. Springer, 2016.

[27] M. Castro, B. Liskov et al., “Practical Byzantine Fault Tolerance,” in OSDI, vol. 99, 1999.

[28] M. Eischer and T. Distler, “Scalable Byzantine Fault Tolerance on Heterogeneous Servers,” in Dependable Computing Conference (EDCC), 2017 13th European. IEEE, 2017.

[29] N. Hayashibara et al., “The Phi Accrual Failure Detector,” in Reliable Distributed Systems, 2004. Proceedings of the 23rd IEEE International Symposium on. IEEE, 2004.

[30] ——, “Failure detectors for large-scale distributed systems,” in Reliable Distributed Systems, 2002. Proceedings. 21st IEEE Symposium on. IEEE, 2002.

[31] C. Cachin et al., Introduction to Reliable and Secure Distributed Programming. Springer Science & Business Media, 2011.

[32] V. Costan et al., “Intel SGX Explained.” IACR Cryptology ePrint Archive, vol. 2016, 2016.

[33] P. Bosshart et al., “P4: Programming protocol-independent packet processors,” ACM SIGCOMM Computer Communication Review, vol. 44, no. 3, 2014.

[34] M. Kuzniar et al., “What you need to know about SDN flow tables,” in International Conference on Passive and Active Network Measurement. Springer, 2015.

[35] F. Sardis et al., “Can QoS be dynamically manipulated using end-device initialization?” in Communications Workshops (ICC), 2016 IEEE International Conference on. IEEE, 2016.

[36] E. W. Dijkstra, “A note on two problems in connexion with graphs,” Numerische Mathematik, vol. 1, no. 1, 1959.

[37] G. Minshall et al., “Application performance pitfalls and TCP’s Nagle algorithm,” ACM SIGMETRICS Performance Evaluation Review, vol. 27, no. 4, 2000.

[38] D. Hock et al., “POCO-framework for Pareto-optimal resilient controller placement in SDN-based core networks,” in Network Operations and Management Symposium (NOMS), 2014 IEEE. IEEE, 2014.

[39] X. Huang et al., “Dynamic Switch-Controller Association and Control Devolution for SDN Systems,” arXiv preprint arXiv:1702.03065, 2017.

[40] G. Nencioni et al., “Availability modelling of software-defined backbone networks,” in Dependable Systems and Networks Workshop, 2016 46th Annual IEEE/IFIP International Conference on. IEEE, 2016.

Ermin Sakic (S′17) received his B.Sc. and M.Sc. degrees in electrical engineering and information technology from the Technical University of Munich in 2012 and 2014, respectively. He is currently with Siemens AG as a Research Scientist in the Corporate Technology research unit. Since 2016, he has been pursuing the Ph.D. degree with the Department of Electrical and Computer Engineering at TUM. His research interests include reliable and scalable Software Defined Networks, distributed systems, and efficient network and service management.

Nemanja Ðeric (S′18) received his B.Sc. degree in electrical and computer engineering in 2014 and his M.Sc. degree in communication engineering from the Technical University of Munich in 2016. He then joined the Chair of Communication Networks at the Technical University of Munich, where he is currently pursuing the Ph.D. degree and working as a Research and Teaching Associate. His current research interests are Software Defined Networking (SDN) and Network Virtualization (NV).

Wolfgang Kellerer (M′96 – SM′11) is a Full Professor with the Technical University of Munich (TUM), heading the Chair of Communication Networks at the Department of Electrical and Computer Engineering. Before that, he was for over ten years with NTT DOCOMO’s European Research Laboratories. He received his Dipl.-Ing. degree (Master) and his Dr.-Ing. degree (Ph.D.) from TUM in 1995 and 2002, respectively. His research has resulted in over 200 publications and 35 granted patents. He currently serves as an associate editor for IEEE Transactions on Network and Service Management and on the Editorial Board of the IEEE Communications Surveys and Tutorials. He is a member of ACM and the VDE ITG.