softcell: scalable and flexible cellular core network ...lierranli/.../softcell-2013.pdfsecond,...

12
SoftCell: Scalable and Flexible Cellular Core Network Architecture Xin Jin , Li Erran Li ? , Laurent Vanbever , and Jennifer Rexford Princeton University , Bell Labs ? ABSTRACT Existing cellular networks suffer from inflexible and expen- sive equipment, and complex control-plane protocols. To address these challenges, we present SoftCell, a scalable ar- chitecture that supports fine-grained policies for mobile de- vices in cellular core networks, using commodity switches and servers. A controller realizes high-level service polices that direct traffic through sequences of middleboxes based on subscriber attributes and applications. To enable small forwarding tables in core switches, SoftCell aggregates traf- fic along multiple dimensions—the service policy, the base station, and the mobile device—at different switches in the network. Since most traffic originates from mobile devices, SoftCell performs fine-grain packet classification at the ac- cess switches at the base stations, where software switches can easily handle the state and bandwidth requirements. Soft- Cell guarantees that packets in the same connection traverse the same sequence of middleboxes in both directions, even in the presence of mobility, without requiring expensive packet classification at the high-bandwidth gateway edge switches. We demonstrate that SoftCell improves the scalability and flexibility of cellular core networks through analysis of LTE workloads, micro-benchmarks on our prototype controller, and large-scale simulations. 1. INTRODUCTION With the rapid proliferation of cellular devices (e.g., smart phones, tablets, and smart meters and other M2M (machine-to-machine) devices [1]), cellular data traffic has exploded in recent years. This trend is likely to con- tinue. For instance, Cisco predicts a 18 times increase of cellular data traffic between 2011 and 2016 [2]. New cellular technologies, like Long Term Evolution (LTE), helped cellular providers to keep up with the traffic growth by increasing their radio access capacity. How- ever, they now face the challenge of keeping up with the increasing demand in their core networks, which carry the User Equipment (UE) traffic between the Base Sta- tion (BS) and the Internet, as shown in Figure 1. Unlike traditional IP networks, cellular providers rely extensively on customized policies based on a wide va- riety of subscriber attributes (e.g., the cell-phone model Figure 1: LTE network architecture or M2M device type, the operating-system version, the billing plan, options for parental controls, whether the total traffic exceeds a usage cap, whether a user is roam- ing), as well as applications types (e.g., transcoding for video traffic, caching for Web traffic, and exemption from usage caps for applications that pay the carrier on the user’s behalf) [3]. For example, the carrier may di- rect traffic for older phones through an echo-cancellation gateway, video traffic through a transcoder during times of congestion, M2M fleet tracking traffic through a low latency path, and all traffic through a firewall. To route traffic and perform fine-grained packet pro- cessing, cellular providers rely on specialized and pro- prietary devices, namely: Serving Gateways (S-GWs) and Packet data network Gateways (P-GWs). S-GWs are mainly used as mobility anchors to provide seam- less mobility. P-GWs sit at the boundary of the cellu- lar network and the Internet. P-GWs centralize most network functions like content filtering, traffic optimiza- tion, firewalls, and lawful intercept [4]. The base sta- tions, S-GWs, and P-GWs communicate using GPRS Tunneling Protocol (GTP). Centralizing nearly all data-plane functionalities in the P-GWs makes cellular core networks remarkably in- efficient, complex and inflexible [5, 3]. Indeed, cellular core networks must forward all traffic through the P- GW—including device-to-device traffic and local con- tent distribution network services. This increases the network delay and congestion. Since P-GWs are not

Upload: others

Post on 02-Oct-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: SoftCell: Scalable and Flexible Cellular Core Network ...lierranli/.../SoftCell-2013.pdfSecond, nearly all tra c in cellular networks goes to and from the Internet, whereas data centers

SoftCell: Scalable and FlexibleCellular Core Network Architecture

Xin Jin†, Li Erran Li?, Laurent Vanbever†, and Jennifer Rexford†

Princeton University†, Bell Labs?

ABSTRACTExisting cellular networks suffer from inflexible and expen-sive equipment, and complex control-plane protocols. Toaddress these challenges, we present SoftCell, a scalable ar-chitecture that supports fine-grained policies for mobile de-vices in cellular core networks, using commodity switchesand servers. A controller realizes high-level service policesthat direct traffic through sequences of middleboxes basedon subscriber attributes and applications. To enable smallforwarding tables in core switches, SoftCell aggregates traf-fic along multiple dimensions—the service policy, the basestation, and the mobile device—at different switches in thenetwork. Since most traffic originates from mobile devices,SoftCell performs fine-grain packet classification at the ac-cess switches at the base stations, where software switchescan easily handle the state and bandwidth requirements. Soft-Cell guarantees that packets in the same connection traversethe same sequence of middleboxes in both directions, even inthe presence of mobility, without requiring expensive packetclassification at the high-bandwidth gateway edge switches.We demonstrate that SoftCell improves the scalability andflexibility of cellular core networks through analysis of LTEworkloads, micro-benchmarks on our prototype controller,and large-scale simulations.

1. INTRODUCTIONWith the rapid proliferation of cellular devices (e.g.,

smart phones, tablets, and smart meters and other M2M(machine-to-machine) devices [1]), cellular data traffichas exploded in recent years. This trend is likely to con-tinue. For instance, Cisco predicts a 18 times increaseof cellular data traffic between 2011 and 2016 [2]. Newcellular technologies, like Long Term Evolution (LTE),helped cellular providers to keep up with the trafficgrowth by increasing their radio access capacity. How-ever, they now face the challenge of keeping up with theincreasing demand in their core networks, which carrythe User Equipment (UE) traffic between the Base Sta-tion (BS) and the Internet, as shown in Figure 1.

Unlike traditional IP networks, cellular providers relyextensively on customized policies based on a wide va-riety of subscriber attributes (e.g., the cell-phone model

Figure 1: LTE network architecture

or M2M device type, the operating-system version, thebilling plan, options for parental controls, whether thetotal traffic exceeds a usage cap, whether a user is roam-ing), as well as applications types (e.g., transcoding forvideo traffic, caching for Web traffic, and exemptionfrom usage caps for applications that pay the carrier onthe user’s behalf) [3]. For example, the carrier may di-rect traffic for older phones through an echo-cancellationgateway, video traffic through a transcoder during timesof congestion, M2M fleet tracking traffic through a lowlatency path, and all traffic through a firewall.

To route traffic and perform fine-grained packet pro-cessing, cellular providers rely on specialized and pro-prietary devices, namely: Serving Gateways (S-GWs)and Packet data network Gateways (P-GWs). S-GWsare mainly used as mobility anchors to provide seam-less mobility. P-GWs sit at the boundary of the cellu-lar network and the Internet. P-GWs centralize mostnetwork functions like content filtering, traffic optimiza-tion, firewalls, and lawful intercept [4]. The base sta-tions, S-GWs, and P-GWs communicate using GPRSTunneling Protocol (GTP).

Centralizing nearly all data-plane functionalities inthe P-GWs makes cellular core networks remarkably in-efficient, complex and inflexible [5, 3]. Indeed, cellularcore networks must forward all traffic through the P-GW—including device-to-device traffic and local con-tent distribution network services. This increases thenetwork delay and congestion. Since P-GWs are not

Page 2: SoftCell: Scalable and Flexible Cellular Core Network ...lierranli/.../SoftCell-2013.pdfSecond, nearly all tra c in cellular networks goes to and from the Internet, whereas data centers

modular, carriers often end up paying for functionali-ties they do not need. In contrast, when a functionalityis not available, carriers have no choice but to replacethe P-GWs, even if they are sufficient for most pur-poses. Finally, carriers cannot “mix and match” capa-bilities from different vendors (e.g., use a firewall fromone vendor, and a transcoder from another), or “scaleup” the resources devoted to a specific function [3, 6].

Rather than perform all these functions at the Inter-net boundary, we argue that cellular providers shouldadopt a network design more akin to modern data cen-ters. The network should consist of a fabric of simplecore switches, with most functionality moved to low-bandwidth access switches (at the base stations) anda distributed set of middleboxes that the carrier canexpand as needed to meet the demands. These middle-boxes could be dedicated appliances, virtual machinesrunning on commodity devices [6], or packet-processingrules installed directly in the switches [7, 8]. A logically-centralized controller can then route traffic through theappropriate middleboxes, via efficient network paths, torealize a high-level service policy (e.g., directing a UE’svideo traffic through a transcoder and a firewall).

However, implementing such design in cellular net-works introduce unique scalability challenges. First, thefine-grained policies used in cellular networks are de-fined along many dimensions, leading to large numberof packet classifiers. Yet, commodity switches can onlystore a few thousand to tens of thousands of rules [9].Second, nearly all traffic in cellular networks goes toand from the Internet, whereas data centers [10, 11, 12]are dominated by traffic between servers. This placesheavier bandwidth and state requirements on the cel-lular gateway switches. Third, device mobility is fre-quent and unplanned, requiring additional state to en-sure seamless connectivity and direct all packets in aconnection through the same sequence of middleboxes.

To address these challenges, we present SoftCell, ascalable architecture for supporting fine-grained policiesfor mobile devices in cellular core networks. SoftCellemploys the following two novel techniques:

Muti-dimensional aggregation: SoftCell significantlyreduces the size of switch tables by aggregating entriesalong multiple dimensions, combining the benefits oftraditional location-based routing and tag-based rout-ing. The SoftCell controller aggregates the forwardingentries dynamically, using an online algorithm.

Smart access edge, dumb gateway edge: SoftCellobviates the need to perform packet classification at theInternet gateway edge. Instead, SoftCell performs allpacket classification at the access edge, using softwareswitches along with local software controllers. To avoidreclassifying the return traffic at the Internet edge, Soft-Cell embeds policy identifiers directly in the IP packet

Figure 2: SoftCell network architecture

headers, enabling the Internet edge to only perform ba-sic packet forwarding.

We built a SoftCell controller on top of Floodlight [13].We then evaluated it using: i) real traces from a largeLTE deployment; ii) micro-benchmarks on our proto-type; and iii) large-scale simulation experiments. Ourexperiments and analysis show that SoftCell can easilyhandle the workload of a large LTE network and sup-port thousands of service-policy clauses with just a fewthousand TCAM entries in the core switches.

2. SOFTCELL ARCHITECTURESoftCell’s goal is to support numerous fine-grained

policies, in a scalable manner. In this section, we intro-duce the components in a SoftCell network, the specifi-cation of service policies, and the key design decisions.

2.1 SoftCell Core Network ComponentsA SoftCell network interconnects unmodified UEs (via

base stations) and the Internet (via gateway switches),as shown in Figure 2. SoftCell does not require the spe-cialized network elements (e.g., S-GWs and P-GWs) orpoint-to-point tunneling (e.g., user-level GTP tunnels)used in today’s LTE networks.

Controller: The controller realizes high-level servicepolicies by computing and installing switch-level rulesthat direct traffic through middleboxes. Service policiesare specified on subscriber attributes and applicationtypes. To compute these paths, the controller has ac-cess to the (mostly static) attributes (e.g., billing plan,phone model, and usage cap) of each UE.

Access switches: Each base station has an accessswitch that performs fine-grained packet classificationon traffic from UEs. Access switches can be softwareswitches (e.g., Open vSwitch [14]) that run on com-modity servers. The server also runs a local agent (LAin Figure 2) that caches packet classifiers for attached

2

Page 3: SoftCell: Scalable and Flexible Cellular Core Network ...lierranli/.../SoftCell-2013.pdfSecond, nearly all tra c in cellular networks goes to and from the Internet, whereas data centers

UEs, to minimize interaction with the central controller.

Core switches: The rest of the network consists ofcore switches, including a few gateway switches con-nected to the Internet. We assume that they can per-form arbitrary wildcard matching on the IP addressesand TCP/UDP port numbers (as in today’s merchantsilicon [15]), or can cache flat rules after processing wild-card rules locally in software (as in DevoFlow [16]).SoftCell gateway switches are much cheaper than P-GWs; they just perform packet forwarding, and relegatesophisticated packet processing to middleboxes.

Middleboxes: SoftCell supports commodity middle-boxes implemented as dedicated appliances, virtual ma-chines, or packet-processing rules on switches. Eachmiddlebox function (e.g., transcoder) may be availableat multiple locations. SoftCell supports stateful mid-dleboxes that require all packets in both directions of aconnection to traverse the same middlebox instance.

The radio access networks consist of base stationsthat connect to unmodified UEs using existing proto-cols. Just as today, a UE retains a single IP address asit moves between base stations in the same cellular corenetwork. SoftCell uses a different, location-dependentIP address for routing within the core network and theInternet (§ 3). Access switches perform address transla-tion, transparent to UEs (§ 4). SoftCell does not requireany change to the radio hardware at the base station,or common functions such as scheduling, radio resourcemanagement, and paging. Similarly, SoftCell does notrequire changes to commodity middleboxes, or any sup-port from the rest of the Internet.

2.2 Flexible, High-Level Service PoliciesThe SoftCell controller directs traffic over network

and middlebox paths, based on the service policy. Ser-vice policies are defined at a high level of abstraction,based on subscriber attributes and applications. Thecontroller handles low-level details like ephemeral net-work identifiers, the locations of middleboxes and switches,and application identification. A service policy is com-posed of multiple clauses that specify which traffic (spec-ified by a predicate) should be handled in what way(specified by an action):

Predicates: A predicate is a boolean expression onsubscriber attributes and application types. Subscriberattributes consist of device type, billing plan, devicecapabilities, provider, etc. Application types includeweb browsing, real-time streaming video, VoIP, etc.

Service action: An action consists of a set of middle-boxes, along with quality-of-service (QoS) and access-control specifications. Specifying the set of middleboxesas an order allows the carrier to impose constraints (e.g.,firewall before transcoder). The action does not indicate

Prio Predicates Service Actions

1 provider = B Firewall

2 provider != A Drop

3 app = video ∧ plan = Silver [Firewall, Transcoder]

4 app = VoIP [Firewall, Echo-Cancel]

5 device type=M2M fleet [HighPriority, Firewall]

Table 1: Example service policy for carrier A

a specific instance of each middlebox, allowing the con-troller to select middlebox instances and network pathsthat minimize latency and load.

Priority: The priority is used to disambiguate overlap-ping predicates. The network handles traffic using thehighest-priority clause with a matching predicate.

Table 1 shows an example service policy for carrierA. In this example, carrier A has a roaming agree-ment with carrier B which enables B’s subscribers touse A’s network as fallback. To avoid abuse, the firstclause directs all traffic of B’s subscribers through a fire-wall. The second clause disallows traffic of subscribersfrom all other carriers. The remaining clauses specifythe handling of A’s own subscribers, with all traffic go-ing through a firewall. The third clause indicates thatthe video traffic to subscribers on the “silver” billingplan must go through a transcoder after the firewall.The fourth clause specifies that VoIP traffic must gothrough a echo cancellation box (improving voice qual-ity) after a firewall. The fifth clause requires M2M fleet-tracking traffic to be forwarded with high priority to en-sure low latency. In this paper, we focus on middleboxservice policies, since they require more sophisticatedtraffic steering rather than simple local processing todrop packets or mark the type-of-service bits.

2.3 SoftCell Design Challenges and SolutionsBefore going into the details of SoftCell, we first give

an overview of the design challenges and solutions.

Challenge 1: support service routing with smallswitch tables. Fine-grained service policies in largenetworks can easily lead to an explosion in the data-plane state needed to direct traffic through the rightmiddleboxes. For instance, suppose a service policy has1000 clauses (very reasonable for a future network con-sists of billions of M2M devices used in all kinds of set-tings such as tele-health, asset tracking, and buildingsecurity) in a network with 1000 base stations. Sup-pose for each policy clause, we must instantiate a policypath (that traverses the right middlebox instances) be-tween each base station and the gateway switch. Thisresults in 1 million paths. Existing service routing tech-niques [17] are not able to install such a huge amountof paths with limited switch flow tables.

Solution: multi-dimensional aggregation. Soft-Cell tackles this challenge by leveraging aggregation. As

3

Page 4: SoftCell: Scalable and Flexible Cellular Core Network ...lierranli/.../SoftCell-2013.pdfSecond, nearly all tra c in cellular networks goes to and from the Internet, whereas data centers

simply aggregating on destination IP addresses does notprovide enough flexibility and scalability to implementfine-grained service policies, SoftCell introduces multi-dimensional aggregation, which combines the benefitsof traditional location-based routing and tag-based rout-ing to scale to large networks with large service policies.

Challenge 2: support fine-grained packet classifi-cation in asymmetric topology. To enforce servicepolicies, we must first classify packets at the networkedge to decide which policy clause to apply. This is rel-atively easy in data centers as most traffic is “east west”(between servers in the data center). Packets are typ-ically classified by software switches in hypervisors orfirst-hop hardware switches [10, 11, 12], and the load isequally distributed over all classification switches at theedge. However, in cellular core networks most traffic is“north south” (between the Internet and UEs). The ac-cess edge connecting to UEs, consists of thousands ofbase stations. An access switch at a base station han-dles traffic from up to 1000 UEs attached to the basestation. The total traffic volume is around 20 Mbps to1 Gbps [18, 19]. The gateway edge facing the Internetconsists of just a few gateway switches that direct traf-fic to those thousands of base stations, and the trafficvolume can be several Tbps. Thus, classifying packetsat the gateway edge at line rate is very difficult.

Solution: smart access edge, dumb gatewayedge. SoftCell exploits the unique property of cellularnetworks that traffic is initiated by UEs, and classifiespackets only at the access edge. Using a novel state-embedding technique, SoftCell “piggybacks” the classi-fication results in the source IP address and port num-ber, so the gateway switches can easily forward returntraffic by examining the destination address and port.

Challenge 3: scalable handling of network dy-namics. Cellular networks operate under considerablechurn due to UE mobility. Whereas data-center oper-ators can plan VM migration in advance, cellular car-riers have little control over UE mobility (beyond lim-ited load balancing between nearby base stations). Ex-isting research on mobility management shows how tominimize packet loss during handover, but does not ad-dress service policies. Stateful middleboxes rely on pol-icy consistency, where all packets of a connection musttraverse the same middlebox instance. Besides policyconsistency, centralized control introduces a new failuremode—controller failure—that must be addressed.

Solution: smart local agent at base stations.Each access switch acts as a mobility anchor for at-tached UEs. Upon a UE handoff, ongoing flows con-tinue to reach the old access switch via the old path.This ensures policy consistency, while leveraging thevast number of access switches to achieve scalability.New flows traverse the new access switch, and new pol-

icy paths, to minimize path stretch. To ensure fastrecovery from controller failures, SoftCell runs multi-ple controller replicas and maintains a consistent, dis-tributed store of the control state. Most state changesslowly, lowering the overhead of maintaining strong con-sistency. The most dynamic state is UE location. Abackup controller can fetch up-to-date UE location datadirectly from the local agent at each base station.

3. SCALABLE SERVICE ROUTING WITHMULTI-DIMENSIONAL AGGREGATION

To implement service policies, SoftCell routes traf-fic over sequences of middleboxes that are distributedthroughout the network. While more flexible than acentralized P-GW, SoftCell’s service routing requiresfine-grained forwarding rules. SoftCell overcomes thisscalability challenge by aggregating forwarding rules alongmultiple dimensions. In this section, we present theSoftCell aggregation strategy, and an online algorithmfor computing the forwarding entries.

3.1 Multi-Dimensional AggregationIn traditional IP networks, routers forward traffic based

solely on the destination prefix, and operators align IPprefixes with the topology to enable aggregation of con-tiguous prefixes. However, destination-based forward-ing is not flexible enough for cellular networks, whereforwarding can depend on subscriber attributes or ap-plication types. To enable more flexible forwarding, aclassical solution is to forward traffic using VLAN tagsor MPLS labels. However tag-based forwarding scalespoorly as it enforces“flat-routing”and removes the abil-ity to aggregate contiguous entries, even if the destina-tion is the same. While label stacking and label swap-ping can reduce the number of tags, carrying multipleMPLS labels in the packet header incurs large overhead,and existing middleboxes do not necessarily understand,or even preserve, MPLS labels.

SoftCell combines the benefits of location-based rout-ing and tag-based routing, by leveraging the ability ofcommodity switches to selectively match on differentpacket fields. In particular, SoftCell forwarding rulesaggregate on three dimensions: (i) policy, (ii) location,and (iii) UE.

Aggregation by policy (policy tag): Subscriber at-tributes are not easily translated or aggregated withnetwork addresses. Consider for instance the third clausein Table 1. As UEs with a “silver plan” can have a vari-ety of IP addresses, this policy requires a rule for eachflow in the worst case. We could conceivably assign“silver plan” UEs IP addresses under the same subnet,allowing us to assign one rule that matches on the IPprefix. However, we cannot do this for every attribute,not to mention that many service policies are defined

4

Page 5: SoftCell: Scalable and Flexible Cellular Core Network ...lierranli/.../SoftCell-2013.pdfSecond, nearly all tra c in cellular networks goes to and from the Internet, whereas data centers

CS1

CS2 CS3

Transcoder1 Transcoder2

Switch   Match     Ac+on  

CS1  10.0.0.0/16   Forward  to  CS2  

10.1.0.0/16   Forward  to  CS3  

10.0.0.0/16 10.1.0.0/16

CS1

CS2

Transcoder1

Transcoder2 Firewall

CS1 AS1

AS2

10.0.0.0/24

10.0.1.0/24

UE 1 10.0.0.7

(a) Location-based routing (c) Flexible policy (b) Mobility

AS1 AS2

10.0.0.0/16

AS1

10.1.0.0/16

AS2

Switch   Match     Ac+on  

CS1   tag1   Forward  to  Firewall  

CS2  tag1,  10.0.0.0/15   Forward  to  Transcoder1  

tag1,  10.2.0.0/15   Forward  to  CS3  

CS3   tag1,  10.2.0.0/15   Forward  to  Transcoder2  

Switch   Match     Ac+on  

CS1  

10.0.0.7   Forward  to  AS2  

10.0.0.0/16   Forward  to  AS1  

10.1.0.0/16   Forward  to  AS2  

10.2.0.0/16

AS3

10.3.0.0/16

AS4

CS3

Handoff

Figure 3: Examples of multi-dimensional aggregation rules for traffic arriving from the Internet

on combinations of attributes. To minimize the rulesin core switches, we use a policy tag to aggregate flowson the same policy path. This tag is associated at theaccess switch, allowing core switches to forward packetsbased on policy tags.

Aggregation by location (hierarchical IP address):In many core switches, traffic destined to the same basestation traverses the same outgoing link, even if thepackets go through different middleboxes. By includinglocation information in the UE addresses, we can ag-gregate traffic by IP prefix. Furthermore, cellular corenetworks have a natural hierarchical structure. There-fore, we assign each base station an IP prefix, calledbase station ID, and IDs of nearby base stations can befurther aggregated into larger blocks. We can aggregateeven more by combining policy tags and IP addresses.Suppose two policy paths going to two base stationsshare a long path segment before branching. If assignedthe same policy tag, a single rule matching on the tagcan forward packets along the shared segment until thebranching point, where traffic is divided based on thebase station prefix.

Aggregation by UE (UE ID): Some middleboxes(like intrusion detection systems) need a way to identifygroups of flows associated with the same UE. Clearly,this is impossible if all flows for the same base stationshare the same address. Packets therefore need an UEidentifier (UE ID) that differs from other UEs at thesame base station. Having an UE ID in each packetalso enables optimizations for handling mobility, by in-stalling switch rules that forward in-progress flows tothe UE at its new location. Together, the base sta-tion prefix and the local UE ID form a hierarchicallocation-dependent address (LocIP) for the UE. ThisLocIP is transparent to the UE and is used for routingin the core network and the Internet, but not the ra-

dio access network. The UE is allocated a permanentIP address via DHCP when it first attaches to the net-work. This permanent IP address doesn’t change, whileLocIP changes when the UE moves between base sta-tions. Access switches perform the translation betweenthe permanent IP address and LocIP.

We can furthermore maximize the aggregation of thedata-plane state by selectively matching on multiplefield. We describe three examples.

Location-based routing: In Figure 3(a), core switchCS1 matches on the base-station prefix to forward traf-fic to CS2 and CS3. CS2 and CS3 then decide whetherto direct traffic to a transcoder based on the policy tag.Notice that CS1 does not need to base its forwardingdecision on the tag.

UE mobility: In Figure 3(b), CS1 forwards trafficto base stations according to the destination IP prefix.When UE1 moves from access switch AS1 to AS2, weinstall a high-priority rule at CS1 to match on both thebase station prefix and the UE ID. This ensures thatongoing flows reach UE1 at AS2 over a direct path.More advanced mobility handling that ensures policyconsistency is discussed later in § 5.1.

Flexible policy: Figure 3(c) illustrates how to im-plement the third clause in Table 1 with “tag1.” CS1forward “tag1” packets to the Firewall1. Suppose we as-sign AS1 and AS2 traffic to Transcoder1, and AS3 andAS4 traffic to Transcoder2. CS2 matches on both thetag and the prefix (more precisely, the aggregated pre-fix of two base stations) to forward AS1 and AS2 trafficto Transcoder1, and AS3 and AS4 traffic to CS3. CS3finally forwards AS3 and AS4 traffic to Transcoder2.

3.2 Rule Minimization in Core SwitchesWe now present a greedy online algorithm that per-

1Traffic from middleboxes is identified based on the inport.

5

Page 6: SoftCell: Scalable and Flexible Cellular Core Network ...lierranli/.../SoftCell-2013.pdfSecond, nearly all tra c in cellular networks goes to and from the Internet, whereas data centers

forms multi-dimensional aggregation given a stream ofpolicy paths as input (see Algorithm 1). The algo-rithm operates in an online fashion because policy pathscan be dynamically installed or removed due to policychanges or middlebox load balancing. We only brieflyexplain the algorithm here and defer the reader to [20]for a more detailed explanation.

Basic multi-dimensional aggregation algorithm:Intuitively, the algorithm performs multi-dimensionalaggregation of a policy path p in two stages: aggregat-ing first by policy, then by location. A policy path pis a sequence (sw0, . . . , swk) of adjacent switches suchthat sw0 represents an access switch connected to a basestation and swk a gateway switch connected to the In-ternet. Given p, the algorithm first computes candTag,the set of tags used on any switch belonging to p, exceptsw0

2. For each tag t ∈ candTag, the algorithm com-putes the total number of forwarding rules that mustbe created if that tag is reused to set-up p (line 1–6).

To do so, the algorithm iterates over each consecutivepairs of switches (swi, swi+1) in p and checks whetherthe forwarding rule associated with t on swi can bereused as such to forward traffic using p, i.e. whether itsnext-hop is swi+1 (line 4). If the next-hop differs, the al-gorithm (function extraRules(path, swi)) accounts forextra forwarding rules matching both t and the loca-tion address unless that extra rule can be aggregatedwith another existing rule (line 5). If the next hop isa middlebox, we need rules for traffic to and from themiddlebox. To ensure correctness, the algorithm aggre-gates two rules if and only if their location prefixes arecontiguous. At the end of the loop, the algorithm re-turns the tag that minimizes the number of new rulesor returns a new tag if needed (lines 7–10). Finally, thealgorithm installs the forwarding rules to switches usingconsistent updates techniques [21] and aggregating theentries where it can (line 11–16).

Dealing with loops: In some cases, policy paths con-tain one or more loops. As an illustration, considera policy path enforcing outbound video traffic to gothrough a firewall before a video transcoder in Fig-ure 2. A loop that enters the same switch twice butthrough different links can easily be differentiated basedon the input ports. However, a loop that enters thesame switch twice from the same link is more difficultto handle. In such a case, the algorithm uses additionaltags to disambiguate the forwarding decisions (omittedin the algorithm for space constraints). More specifi-cally, the algorithm breaks the loop into two segmentsand uses different tags for each segment. At the switchthat connects these two segments, the algorithm installsa rule to “swap” these two tags. This approach can be

2Otherwise, it would be impossible to distinguish amongdifferent policy paths originated from the same sw0.

Algorithm 1 Install a new policy path

Input:– path: the policy path to install– prefix: the IP prefix of the origin base station– candTag: the set of candidate tags for the base station

Output: switch rules and a tag for this policy path

Step 1: Choose a tag to minimize new rules1: for t in candTag do2: newRule[t] = 0 . new rules needed if tag t is used3: for (swi, swi+1) in path do4: if swi.getNextHop(t, prefix)! = swi+1 then5: if !swi.canAggregate(t, prefix, swi+1) then6: newRule[t]+ = extraRules(path, swi)

7: if candTag ! = ∅ then8: tag∗ = arg min

t{newRule[t]}

9: else10: tag∗ = new tag

Step 2: Install the path with the prefix and tag11: for (swi, swi+1) in path do12: if swi.getNextHop(tag∗, prefix)! = swi+1 then13: if swi.canAggregate(tag∗, prefix, swi+1) then14: swi.aggregateRule(tag∗, prefix, swi+1)15: else16: swi.installRule(tag∗, prefix, swi+1)

generalized to support nested loops.

Discussion of offline algorithm: Our online algo-rithm is optimal if each policy path is processed one at atime. For extremely constrained environments, we cancouple the online algorithm with an offline algorithmthat would regularly recompute the optimal forwardingentries. We leave a description of the offline algorithmto separate work as Algorithm 1 already supports ordersof magnitude more policy paths on commodity switchesthan what is required today by operators.

4. SCALABLE PACKET CLASSIFICATIONWITH ASYMMETRIC EDGE DESIGN

Each packet entering the network must be associatedwith the appropriate policy tag and location-dependentIP address. This imposes overhead in both the dataplane (to apply packet-classification rules) and the con-trol plane (to fetch the rules). SoftCell places key func-tionality at the low-bandwidth access edge, to limit thedata-plane overhead on the gateway switches and thecontrol-plane overhead on the controller.

4.1 Packet Classification at the Access EdgeEach time a packet arrives at a base station, the ac-

cess switch needs to translate the permanent IP addressto the location-dependent IP address assigned to theUE, and attach the appropriate policy tag. To realizethis, SoftCell installs one rule for each microflow at theaccess switch. A base station has at most 1000 UEs with(say) ten flows each, resulting in 10,000 microflows—easily supported in a software switch [22]. To avoid clas-

6

Page 7: SoftCell: Scalable and Flexible Cellular Core Network ...lierranli/.../SoftCell-2013.pdfSecond, nearly all tra c in cellular networks goes to and from the Internet, whereas data centers

Public  Prefix   UE  ID  Base  Sta5on  ID  

Policy  Tag   Flow  ID  

IP:

Port:

Figure 4: Embedding location and policy infor-mation in source IP address and source portnumber. Thus the information can be implic-itly piggybacked in return traffic.

sifying packets again at the gateway switches (which ag-gregates traffic for thousands of base stations), SoftCellembeds the packet-classification result in the packet.

Embedding state in packet headers: Rather thanencapsulating packets, as is commonly done in data-center networks, we embed the policy tag, base stationID, and UE ID in the packet header. This ensures thatthe return traffic from the Internet carries these fields.For example, we could encode the state as part of theUE’s IP address (e.g., in IPv6), or a combination of theUE’s IP address and TCP/UDP port number (e.g., inIPv4) as shown in Figure 4. The access switch rewritesthe source IP address to the location-dependent IP ad-dress (i.e., the carrier’s public prefix, as well as the basestation and UE IDs), and embeds the policy tag as partof the source port. UEs do not have many active flows,leaving plenty of room for carrying the policy tag in theport-number field. With this embedding mechanism,our three identifiers are implicitly “piggybacked” in re-turn traffic arriving from the Internet. The gatewayswitch can simply forward incoming packets based onthe destination IP address and port number.

Dealing with security and privacy issues: Directlyapplying this approach may raise some security and pri-vacy challenges. Malicious Internet hosts may spoofpolicy tags and congest network links or middleboxes,though these attacks can be blocked using conventionalfirewalls. In addition, changing a UE’s local IP addresseach time it moves to a new base station would makeit easier for Internet servers to infer the user’s location.To address these concerns, SoftCell can perform net-work address translation (NAT) as packets arrive fromthe Internet. Specifically, we require the NAT functionto pick a different IP address and/or port number forevery flow, whether or not the UE moves. In addition,these public IP address and port pairs cannot be cor-related with the UE’s location (or with the decision tochange locations). In practice, NATs are already exten-sively deployed today, as cellular providers are short ofpublic IP addresses for each UE [23]. As such, SoftCellgives the same level of security and privacy protectionas today’s cellular networks. Due to the lack of space,we do not provide any formal proof.

4.2 Local Control Agent at the Access EdgeSending the first packet of every microflow from the

access switch to the controller would introduce highoverhead. Instead, SoftCell introduces a local softwareagent running at each base station to scale the controlplane. Note that the gateway switches don’t performfine-grained packet classification and thus do not needlocal agents, as the policy tags are piggybacked in thepacket headers.

The local agent caches a list of packet classifiers foreach UE at the behest of the controller. The packetclassifiers are a UE-specific instantiation of the servicepolicy that matches on header fields and identifies theappropriate policy tag, if a policy path already exists.When the UE arrives at the base station, the controllercomputes the packet classifiers based on the service pol-icy, the UE’s subscriber attributes, and the current pol-icy tags. When the UE starts a new flow, the local agentconsults these classifiers to determine the right policytag for these packets, and installs a microflow rule inthe access switch, similar to the DevoFlow “clone” func-tion [16]. The local agent only contacts the controllerif no policy tag exists for this flow—that is, if a packetis the first one at this base station, across all UEs, thatneeds a particular policy path.

For example, suppose UE7 arrives at base station 1with prefix 10.0.0.0/16. The local agent first assigns aUE ID 10 to the UE. Now UE7 is associated with thelocation-dependent address 10.0.0.10. The local agentcontacts the controller to fetch a list of packet classi-fiers for this UE. Suppose the list includes two packetclassifiers:

1. match:dst_port=80, action:tag=22. match:dst_port=22, action:send-to-controller

When a packet with destination port 80 from UE7 ar-rives, the access switch does not find any existing mi-croflow rule, and directs the packet to the local agent.The local agent determines that the traffic matches thefirst packet classifier. Since the policy path already ex-ists, the local agent simply installs a microflow rule inthe access switch which (i) rewrites the UE IP addressto 10.0.0.10 and (ii) pushes “tag=2” to the source portnumber, without contacting the central controller. Sup-pose another packet arrives from UE7 with destinationport 22. This flow matches the second packet classifierand the action is “send to controller”, meaning that thepolicy path has not been installed yet. The local agentsends a request to the central controller to install a newpolicy path and return the policy tag. Then, the lo-cal agent can update the packet classifier and install amicroflow rule for the packets of this flow.

In this way, local agents cache UE-specific packetclassifiers and process most flows locally, significantlyreducing the control-plane load on the controller.

5. HANDLING NETWORK DYNAMICS

7

Page 8: SoftCell: Scalable and Flexible Cellular Core Network ...lierranli/.../SoftCell-2013.pdfSecond, nearly all tra c in cellular networks goes to and from the Internet, whereas data centers

UE 1

BS 1

BS 2

Transcoder1 CS1

CS3

CS2

CS4 Transcoder2

UE 1 Old Path

New Path

Tunnel Shortcut

Figure 5: Tunnels and shortcuts for old flows

In this section, we present how SoftCell handles net-work dynamics, mainly UE mobility and controller fail-ure. SoftCell handles churn in a scalable fashion, throughthe clean division of labor between core and edge.

5.1 Policy Consistency Under MobilitySeamless handling of device mobility is a basic re-

quirement for cellular networks. UEs move frequentlyfrom one base station to another, and carriers have nocontrol over when and where a UE moves. In addi-tion to minimizing packet loss and delay, carriers mustensure that ongoing flows continue traversing the origi-nal sequence of middleboxes (though not necessarily thesame switches), while reaching the UE at its new loca-tion. Such policy consistency is crucial for traffic goingthrough stateful middleboxes, like firewalls and intru-sion prevention systems. However, new flows shouldtraverse middlebox instances closer to the UE’s new lo-cation, for better performance. As such, SoftCell mustdifferentiate between old and new flows, and direct flowson the appropriate paths through the network.

Differentiate between old and new flows: Incom-ing packets for old flows have a destination IP addresscorresponding to the UE’s old location, so these pack-ets naturally traverse the old sequence of middleboxesto the old base station. SoftCell merely needs to di-rect these packets to the new base station, which thenremaps the old location-dependent address to the UE’spermanent address. During the transition, the con-troller does not assign the old location-dependent ad-dress to any new UEs. For the traffic sent from the UE,the old access switch has a complete list of microflowrules for the active flows. Copying these rules to thenew access switch ensures that packets in these flowscontinue to use the old IP address, to avoid a disrup-tion in service. Each UE has a relatively small numberof active connections (say, 10), limiting the overhead ofcopying the rules. To minimize hand-off latency, theSoftCell controller could copy these rules in advance, assoon as a UE moves near a new base station.

Efficiently reroute the old flows: To handle ongo-ing connections during mobility events, SoftCell main-tains long-lived tunnels between nearby base stations,as shown in Figure 5. These tunnels can carry traffic

for any UEs that have moved from one base stationto another. This “triangle routing” ensures policy con-sistency and minimizes packet loss, at the expense ofhigher latency and bandwidth consumption. The manyshort-lived connections would not experience any sig-nificant performance penalty. To handle long-lived con-nections more efficiently, the controller can establishtemporary shortcut paths for directing traffic betweenthe new base station and the old policy path, as shownin Figure 5. The controller can learn the list of ac-tive microflows from the access switch at the old basestation, and install rules in the core switches to directincoming packets over the shortcut paths. A single UEmay need multiple shortcuts, since different traffic maygo through different middleboxes. As such, these short-cut paths are created when a UE moves, and removedwhen a soft timeout expires—indicating that the oldflow has ended.

5.2 Handling Control Plane FailuresWe now describe how SoftCell handles control-plane

failures. We focus on control-plane failure because thecontroller can easily handle topology changes (e.g., switchfailures) by recomputing paths and modifying rules inthe affected switches.

Handling controller failure: Controller failure is han-dled by maintaining a distributed, consistent copy of thecontrol-plane state. The state of the central controllermainly includes: the service policy, the subscriber at-tributes, the policy paths, and the UE locations and lo-cal IP addresses. SoftCell enables fast failure recoveryby simply replicating the controller. Indeed, the firstthree parts of the controller state change very slowly,making it affordable to maintain strong consistency.Also, although the UE locations change relatively of-ten, a UE only associates with one base station at atime. Upon a controller failure, a replica can correctlyrebuild the UE location state by querying local agents.

Handling local agent failure: The state of a localagent mainly includes: (i) the packet classifiers (gener-ated from the service policy and subscriber attributesby the central controller), and (ii) the UEs’ location-dependent IP addresses. Nonetheless, that state is neverupdated by the local agent (read-only to the local agent);only the central controller can update the state. Upona local-agent failure, SoftCell restarts the local agent,which fetches the related state from the controller again.Observe that the impact of a failure is purely local anddoes not affect the agents at other base stations.

6. PERFORMANCE EVALUATIONIn this section, we demonstrate the scalability and

performance of our SoftCell architecture. Although inthe current stage we are not able to integrate and deploy

8

Page 9: SoftCell: Scalable and Flexible Cellular Core Network ...lierranli/.../SoftCell-2013.pdfSecond, nearly all tra c in cellular networks goes to and from the Internet, whereas data centers

0 50 100 150 200 250 3000

0.2

0.4

0.6

0.8

1

Number of events per second in the whole network

CD

F

UE Arrival

Handoff

(a) Events in the whole network

100

101

102

1030

0.2

0.4

0.6

0.8

1

Number of active UEs per base station

CD

F

(b) Active UEs per base station

0 10 20 30 400

0.2

0.4

0.6

0.8

1

Number of radio bearer arrivals per second per base station

CD

F

(c) Radio bearer arrivals per base station

Figure 6: Measurement Results of a LTE network

SoftCell in a real cellular network, we extensively eva-lute SoftCell with real traces from a large LTE deploy-ment, micro-benchmark on our prototype controller, andlarge-scale simulaiton experiments. We show that Soft-Cell is able to sustain several times of the load in currentLTE networks and handle thousands of service policyclauses on commodity switches.

6.1 LTE Workload CharacteristicsAs a first step towards SoftCell deployment, we mea-

sured the workload of a real cellular network to under-stand the performance requirements of the controller.In contrast to other LTE measurement works, we studythe aggregate arrival rates of UEs and flows and showthe implications on control plane load.

Dataset description: We collected about 1TB tracesfrom a large ISP’s LTE network during one week inJanuary 2013. The dataset covers a large metropolitanarea with roughly 1500 base stations and 1 million mo-bile devices (including mobile phones and tablets). Thetrace is bearer -level and includes various events such asradio bearer creation, UE arrival to the network, UEhandoff between base stations, etc. A radio bearer isa communication channel between a UE and its asso-ciated base station with a defined Quality of Service(QoS) class. When a flow arrives and there is an exist-ing radio bearer with the same QoS class, the flow willuse the existing radio bearer. Since radio bearers timeout in a few seconds, a long flow may trigger several ra-dio bearer creation and deletion events. Since we do nothave flow-level information, we use radio bearers to es-timate flow activities. We present measurement resultsfor a typical weekday.

Network-wide characteristics: Figure 6(a) showsthe CDF of UE arrivals and handoffs in the whole net-work. A UE arrival means a UE first attaches to thenetwork (e.g., after a UE is powered on). A UE hand-off event means a UE moves from one base station toanother. From the figure, we can see that the 99.999percentile of UE arrivals and handoffs per second are214 and 280, respectively. As each of these events re-

quire the central controller to contact local agents (sendpacket classifiers) or update core switches (install short-cuts for long flows), it implies that the controller shouldbe able to handle hundreds of such events per second.

Load on each base station: Figure 6(b) shows theCDF of active UEs per base station. We see that a typ-ical base station handles hundreds of active UEs with a99.999 percentile of 514. Figure 6(c) depicts the radiobearer arrival rate at each base station. The number isrelatively small, only 34 for the 99.999 percentile. Asone radio bearer typically carries a handful of concur-rent flows [24, 25], we expect the actual flow arrival rateto be around several hundred per second. These resultsimply that the local agent has to keep state for severalhundred UEs and process a maximum of tens of thou-sands new flows per second. As most policy paths wouldhave already been installed in the network, new flowrequests only require the local agent to install packetclassification rules at the access switch.

6.2 Controller Micro BenchmarkWe have implemented a SoftCell control-plane pro-

totype on top of the popular Floodlight [13] OpenFlowcontroller. The prototype implements both the SoftCellcentral controller and the SoftCell local agent. In thefollowing, we perform micro-benchmarks on the proto-type with Cbench [26], and compare the results with themeasurement results in § 6.1. Cbench emulates a num-ber of switches, generates packet-in events to the testedcontroller, and counts how many events the controllerprocesses per second (throughput). Each test server hasan Intel XEON W5580 processor with 8 cores and 6GBof RAM.

Central controller performance: First, we evalu-ate the throughput of the controller. Recall that thecontroller must send packet classifiers to local agentswhen a UE attaches or moves to a base station. We useCbench to emulates 1000 switches and let these switcheskeep sending packet-in events to the controller. Fromthe controller viewpoint, these packet-in events corre-spond to packet-classifier requests coming from 1000 of

9

Page 10: SoftCell: Scalable and Flexible Cellular Core Network ...lierranli/.../SoftCell-2013.pdfSecond, nearly all tra c in cellular networks goes to and from the Internet, whereas data centers

1000 2000 3000 4000 5000 6000 7000 80000

2500

5000

7500

10000

12500

15000

Number of service policy clauses

Sw

itch

ta

ble

siz

e (

nu

mb

er

of

rule

s)

Maximum

Median

(a) Effect of the number of policy clauses

4 5 6 7 80

500

1000

1500

2000

2500

Service policy clause length (number of middleboxes)

Sw

itch t

able

siz

e (

num

be

r of

rule

s)

Maximum

Median

(b) Effect of service policy clause length

1280 2500 4320 6860 10240 14580 200000

500

1000

1500

2000

2500

Network size (number of base stations)

Sw

itch table

siz

e (

num

ber

of ru

les)

Maximum

Median

(c) Effect of network size

Figure 7: Large-scale simulation result. Thanks to multi-dimensional aggregation, SoftCell data planeis able to support thousands of service policy clauses on commodity switches.

Cache Hit Ratio 0% 20% 40% 60% 80% 100%Throughput 1.8K 2.3K 3.0K 4.5K 8.6K 505.8K

Table 2: Effect of cache hit ratio on local agentthroughput

local agents. The controller then replies to these re-quests with packet classifiers as fast as it can. The resultis that the controller can process 2.2 million of requestsper second with 15 threads. This is more than enoughto handle the hundreds of UE arrivals or handoffs persecond for the LTE network in § 6.1.

Local agent performance: Second, we evaluate thethroughput of the local agent. Recall that the localagent handles new flows based on packet classifiers fetchedfrom the central controller. Thus its throughput de-pends on how frequently it needs to contact the cen-tral controller. Table 2 shows the evolution of the localagent throughput in function of the cache hit ratio. Acache hit ratio of 80% means that the local agent canhandle 80% of the flows locally and needs to contactthe central controller for the remaining 20%. From thetable, it is easy to see that local agent throughput issufficient to handle the number of new flows at a basestation (a small tens of thousands per second). Indeed,even in the worst case where the local agent has to con-tact the controller for every flow, it is still able to handle1.8K events per second. Further performance gains arepossible by prefetching packet classifiers from the con-troller when a UE moves in range of the base station.

The result also validates the need to employ a hier-archical control plane. If there were no local agents,requests from those thousands of base stations wouldall go to the central controller. Such load (tens or hun-dreds of millions of requests per second) is difficult tohandle by a single controller.

6.3 Large-Scale SimulationsWe now demonstrate the scalability of the SoftCell

data plane through large-scale simulations.

Methodology: We use a synthetic topology based on

[18, 27]. The topology has three layers: access, aggre-gation and core. The access layer consists of clustersof 10 base stations interconnected in a ring [27]. Theaggregation layer consists of k pods, each of which hask switches connected in full-mesh. In each pod, k/2switches are connected to k/2 base station clusters; theremaining k/2 switches are connected to k/2 switches inthe core layer. The core layer has k2 switches connectedin full-mesh. They finally connect to a gateway switch.The whole topology with parameter k has 10k3/4 basestations. For example, k = 8 (resp. k = 20) givesa network with 1280 (resp. 20000) base stations. Foreach topology, we assume that they are k different typesof middleboxes. We randomly connect one instance ofeach type in each pod in the aggregation layer and twoinstances of each type in the core layer. On top of thistopology, we generate n policy paths for each base sta-tion to the gateway switch. A policy path traverses mrandomly chosen middlebox instances. Finally, we mea-sure the number of rules in each switch flow table. Inthe base case, we consider n = 1000, m = 5 and k = 8.We vary k, n and m to show how the switch state isaffected by the number of service policy clauses, thepolicy length and the network size, respectively.

Effect of number of service policy clauses: Fig-ure 7(a) shows the maximum and median size of theswitch forwarding table with respect to the number ofservice policy clauses. We can see that switch tablesize increases linearly with the number of service policyclauses with a small slope (less than 2). In particular,to support 1000 service policy clauses (1.28 million pol-icy paths!), switches store a median of 1214 rules and amaximum of 1697 rules. This good performance is a di-rect consequence of the multi-dimensional aggregation(see § 3). It is true that one service policy clause mayinstantiate one policy path to every base station (thusthousands of policy paths for just one policy clause).But the corresponding entries can be aggregated likeCS1 in Figure 3(c) if only one middlebox instance isused for this clause or like CS2 and CS3 if multiple in-

10

Page 11: SoftCell: Scalable and Flexible Cellular Core Network ...lierranli/.../SoftCell-2013.pdfSecond, nearly all tra c in cellular networks goes to and from the Internet, whereas data centers

stances are used.

Effect of service policy clause length: Figure 7(b)shows the switch table size with respect to the policylength. When the policy length is 8, the maximumtable size is 1934. As before, we see that table sizeincreases linearly with the policy length with a smallslope. The reason is also similar. When a policy clauseis longer, more middleboxes are traversed. But most af-fected switches on the path only require one additionalrule that matches on the tag as CS1 in Figure 3(c); onlya few switches require multiple rules to dispatch trafficto multiple middlebox instances as CS2 and CS3.

Effect of network size: Figure 7(c) shows the switchtable size with respect to the network size. We see thetable size decreases as the network grows. It is true thatwith more base stations, we have to install more policypaths for the same service policy clause, thus need morerules. But remember that we can do aggregation onpolicy tags and base station prefixes. The increase ofrules is small due to aggregation, but all rules are nowdistributed over the more switches as the network islarger. This leads to the result that when the networkgrows, switches maintain smaller tables for the samenumber of service policy clauses.

In summary, SoftCell can support thousands of ser-vice policy clauses in a network of thousands of basestations with a few thousand TCAM entries. The gainessentially comes from the ability to selectively matchon multiple dimensions.

7. DISCUSSION

Leveraging multi-table capabilities: As describedin Section 3, SoftCell mainly has three types of flow en-tries: 1) matching both tags and IP prefixes, 2) match-ing tags only and, 3) matching IP prefixes only. Soft-Cell could leverage the multi-table capabilities offeredby modern switches [9]. This is interesting as only Type1 entries would require the use of TCAMs (which areexpensive and power hungry). In contrast, Type 2 andType 3 entries would be stored in tables using exactmatch and IP-prefix match, respectively. We leave thedetails to future work.

Traffic initiated from the Internet: Although mosttraffic in cellular networks today are initiated from UEs,some carriers [28] also offer public IP address options.When a gateway switch receives packets destined tothese public IP addresses, the gateway will act like anaccess switch. It will install packet classifiers that trans-late the public IP addresses and port numbers to LocIPsand policy tags. Note that these packet classifiers arenot microflow rules and don’t require communicationwith the central controller for every microflow. Theyare coarse grained (match on the UE public IPs and

port numbers) and can be installed once.

Asymmetric Internet routing: For ease of descrip-tion, we have assumed that flows leaving a gatewayswitch return to the same gateway switch. However,Internet routing is not guaranteed to be symmetric. Ifgateway switches are not border routers peering withother autonomous systems, border routers can be con-figured to route return traffic to the same gateway switch.Otherwise, the controller can install corresponding switchrules for return traffic in all possible gateway switches(mostly a small fraction of all the gateway switches).

On-path middleboxes: The only problem with on-path middleboxes is that it is unavoidable to traversethem in some cases. If service policy specifies that cer-tain flows can not traverse certain middleboxes (whichwe have not considered in our service policy), then thepath computation has to avoid them. In case no feasiblepath exists, the policy path request will be denied.

8. RELATED WORKOur quest is to build a scalable architecture to sup-

port fine-grained policies for mobile devices in cellularcore networks. SoftCell differs from prior work as sum-marized by the following four categories.

Cellular network architecture: Recently work hasexposed the complexity and inflexibility of current cel-lular data networks [5, 3]. There are several efforts [5,3, 29, 30, 31] attempting to fix the problem. However,only [31, 30] have concrete designs. OpenFlow Wire-less [31] focuses on virtualizing the data path and con-figuration. [30] proposes an integration of OpenFlowwith GTP tunnels. None of them present a scalablenetwork architecture for fine-grained policies.

Scalable data centers: Our addressing scheme sharessome similarity to prior work on scalable data center [10,11, 12]. However, they mainly deal with“east west”traf-fic. To deal with Internet traffic, [12] require intelligentInternet gateways. In contrast, our gateways are muchsimpler because we “embed” policy and location infor-mation in the packet header, rather than relying on thecontroller to install fine-grain packet-classification rulesat gateway switches. Also, these earlier papers do notpresent techniques for enforcing service policies.

Software defined networks: Recent work like De-voFlow [16] and DIFANE [32] improves upon Ethane [7]by moving some processing from the control plane tothe data plane. Their techniques cannot address fine-grained middlebox policies, nor do they ensure policyconsistency under mobility. Fabric [33] and SDIA [34]describe the idea of core/edge separation, which arguesto put most intelligence at the edge and keep the coresimple. Though sharing similar high-level idea, SoftCellpresents the new concept of asymmetric edge and gives

11

Page 12: SoftCell: Scalable and Flexible Cellular Core Network ...lierranli/.../SoftCell-2013.pdfSecond, nearly all tra c in cellular networks goes to and from the Internet, whereas data centers

a novel solution of “smart acess edge, dumb gatewayedge” to scale the system. SoftCell also gives a spe-cific solution in the core to support fine-grained servicerouting, which is not addressed in Fabric and SDIA.

Middleboxes: There have been many works on mid-dleboxes recently [17, 35, 36]. The closest ones arePLayer [17] and SIMPLE [35]. PLayer is a pioneeringwork talking about how to enforce flexible middleboxtraversals. SIMPLE takes a further step to enable bet-ter load balancing and support middleboxes that mod-ify packets. SoftCell differs from them by identifyingthe specific challenges to enable fine-grained policies forlarge cellular networks and providing novel techinquesto make the whole system scalable.

9. CONCLUSIONToday’s cellular core networks are expensive and in-

flexible. In this paper, we propose SoftCell, a scalablearchitecture for supporting fine-grained policies in cel-lular core networks. SoftCell achieves scalability in thedata plane by (i) pushing packet classification to low-bandwidth access switches and (ii) minimizing the statein core network through effective, multi-dimensional ag-gregation of forwarding rules. SoftCell achieves scalabil-ity in the control plane by caching packet classifiers andpolicy tags at local agents that update the rules in theaccess switches. Our prototype and evaluation demon-strate that SoftCell significantly improves the scalabil-ity and flexibility of cellular core networks.

10. REFERENCES[1] M. Z. Shafiq, L. Ji, A. X. Liu, J. Pang, and J. Wang, “A

first look at cellular machine-to-machine traffic: Large scalemeasurement and characterization,” in ACMSIGMETRICS, June 2012.

[2] Cisco, “Cisco visual networking index forecast projects18-fold growth in global mobile internet data traffic from2011 to 2016.” http://tinyurl.com/7gn9x9s.

[3] S. Elby, “Carrier vision of SDN and future applications toachieve a more agile mobile business,” October 2012.Keynote at the SDN & OpenFlow World Congress.

[4] “Cisco PGW packet data network gateway.” http://www.cisco.com/en/US/products/ps11079/index.html.

[5] B.-J. Kim and P. Henry, “Directions for future cellularmobile network architecture,” First Monday, vol. 17,December 2012.

[6] “Network functions virtualization: Introductory whitepaper,” October 2012. http://tinyurl.com/kj4m4de.

[7] M. Casado, M. J. Freedman, J. Pettit, J. Luo, N. Gude,N. McKeown, and S. Shenker, “Rethinking enterprisenetwork control,” IEEE TON, vol. 17, August 2009.

[8] R. Wang, D. Butnariu, and J. Rexford, “OpenFlow-basedserver load balancing gone wild,” in Hot-ICE Workshop,March 2011.

[9] B. Stephens, A. Cox, W. Felter, C. Dixon, and J. Carter,“PAST: Scalable Ethernet for data centers,” in ACMSIGCOMM CoNext Conference, December 2012.

[10] A. Greenberg, J. R. Hamilton, N. Jain, S. Kandula,C. Kim, P. Lahiri, D. A. Maltz, P. Patel, and S. Sengupta,“VL2: A scalable and flexible data center network,” inACM SIGCOMM, August 2009.

[11] R. Niranjan Mysore, A. Pamboris, N. Farrington,N. Huang, P. Miri, S. Radhakrishnan, V. Subramanya, and

A. Vahdat, “PortLand: A scalable fault-tolerant layer 2data center network fabric,” in ACM SIGCOMM, August2009.

[12] Nicira, “Network virtualization for cloud data centers.”http://tinyurl.com/c9jbkuu.

[13] “Floodlight OpenFlow Controller.”http://floodlight.openflowhub.org/.

[14] “Open vSwitch.” http://openvswitch.org/, 2013.[15] “Broadcom Trident chipset.” http://www.broadcom.com/

products/Switching/Data-Center/BCM56850-Series.[16] A. R. Curtis, J. C. Mogul, J. Tourrilhes, P. Yalagandula,

P. Sharma, and S. Banerjee, “DevoFlow: Scaling flowmanagement for high-performance networks,” in ACMSIGCOMM, August 2011.

[17] D. Joseph, A. Tavakoli, and I. Stoica, “A policy-awareswitching layer for data centers,” in ACM SIGCOMM,August 2008.

[18] M. Howard, “Using carrier Ethernet to backhaul LTE,”Infonetics Research White Paper, 2011.

[19] L. Whitney, “Ericsson demos faster LTE speeds of almost1Gbps.” http://tinyurl.com/alml5vt.

[20] X. Jin, L. E. Li, L. Vanbever, and J. Rexford, “SoftCell:Taking control of cellular core networks,” Tech. Rep.TR-95-13, Princeton University CS, May 2013.

[21] M. Reitblatt, N. Foster, J. Rexford, C. Schlesinger, andD. Walker, “Abstractions for network update,” in ACMSIGCOMM, August 2012.

[22] “The rise of soft switching, part II: Soft switching isawesome,” June 2012. http://tinyurl.com/bjz8469.

[23] Z. Wang, Z. Qian, Q. Xu, Z. Mao, and M. Zhang, “Anuntold story of middleboxes in cellular networks,” in ACMSIGCOMM, August 2011.

[24] A. Rahmati, C. Shepard, C. Tossell, A. Nicoara, L. Zhong,P. Kortum, and J. Singh, “Seamless flow migration onsmartphones without network support,” IEEE Transactionson Mobile Computing, 2013. To appear.

[25] Y. Zhang and A. Arvidsson, “Understanding thecharacteristics of cellular data traffic,” in ACM SIGCOMMCellNet Workshop, August 2012.

[26] “Cbench OpenFlow Controller Benchmark.”http://www.openflow.org/wk/index.php/Oflops.

[27] R. Nadiv and T. Naveh, “Wireless backhaul topologies:Analyzing backhaul topology strategies,” Ceragon WhitePaper, 2010.

[28] AT&T, “Wireless IP options for mobile deployments.”http://tinyurl.com/l6z6kxp.

[29] L. Li, Z. Mao, and J. Rexford, “Toward software-definedcellular networks,” in EWSDN, October 2012.

[30] J. Kempf, B. Johansson, S. Pettersson, H. Luning, andT. Nilsson, “Moving the mobile evolved packet core to thecloud,” in IEEE WiMob, October 2012.

[31] K.-K. Yap, R. Sherwood, M. Kobayashi, T.-Y. Huang,M. Chan, N. Handigol, N. McKeown, and G. Parulkar,“Blueprint for introducing innovation into wireless mobilenetworks,” in ACM VISA Workshop, August 2010.

[32] M. Yu, J. Rexford, M. J. Freedman, and J. Wang, “Scalableflow-based networking with DIFANE,” in ACMSIGCOMM, August 2010.

[33] M. Casado, T. Koponen, S. Shenker, and A. Tootoonchian,“Fabric: A retrospective on evolving SDN,” in ACMSIGCOMM HotSDN Workshop, August 2012.

[34] B. Raghavan, M. Casado, T. Koponen, S. Ratnasamy,A. Ghodsi, and S. Shenker, “Software-defined Internetarchitecture: Decoupling architecture from infrastructure,”in ACM SIGCOMM HotNets Workshop, October 2012.

[35] Z. Qazi, C. Tu, L. Chiang, R. Miao, V. Sekar, and M. Yu,“SIMPLE-fying middlebox policy enforcement using SDN,”in ACM SIGCOMM, August 2013.

[36] A. Gember, P. Prabhu, Z. Ghadiyali, and A. Akella,“Toward software-defined middlebox networking,” in ACMSIGCOMM HotNets Workshop, 2012.

12