a measurement study of a commercial-grade urban wifi meshpages.cs.wisc.edu/~suman/pubs/mcb.pdfthe...

A Measurement Study of a Commercial-grade Urban WiFi Mesh∗

Vladimir Brik, Shravan Rayanchu, Sharad Saha, Sayandeep Sen, Vivek Shrivastava,Suman Banerjee

Department of Computer Sciences, University of Wisconsin, Madison, WI 53706, USA{vladimir, shravan, sharad, sdsen, viveks, suman}@cs.wisc.edu

ABSTRACTWe present a measurement study of a large-scale urban WiFi meshnetwork consisting of more than 250 Mesh Access Points (MAPs),with paying customers that use it for Internet access. Our study, in-volved collecting multi-modal data, e.g., through continuous gath-ering of SNMP logs, syslogs, passive traffic capture, and limitedactive measurements in different parts of the city. Our study is splitinto four components — planning and deployment of the mesh,success of mesh routing techniques, likely experience of users, andcharacterization of how the mesh is utilized. During our data col-lection process that spanned 8 months, the network changed manytimes due to hardware and software upgrades. Hence to presenta consistent view of the network, the core dataset used in this pa-per comes from a two week excerpt of our dataset. This part ofthe dataset had more than 1.7 million SNMP log entries (from 224MAPs) and more than 100 hours of active measurements. The scaleof the study allowed us to make many important observations thatare critical in planning and using WiFi meshes as an Internet ac-cess technology. For example, our study indicates that the last hop2.4GHz wireless link between the mesh and the client is the majorbottleneck in client performance. Further we observe that deploy-ing the mesh access points on utility poles results in performancedegradation for indoor clients that receive poor signal from the ac-cess points.

Categories and Subject DescriptorsC.2.3 [Network Operations]: Network monitoring,Public networks;C.2.1 [Network Architecture and Design]: Wireless communica-tion

General TermsDocumentation, Experimentation, Measurement, Performance, Re-liability

∗All student authors — V. Brik, S. Rayanchu, S. Saha, S. Sen, andV. Shrivastava — are in alphabetical order.

Permission to make digital or hard copies of all or part of this work forpersonal or classroom use is granted without fee provided that copies arenot made or distributed for profit or commercial advantage and that copiesbear this notice and the full citation on the first page. To copy otherwise, torepublish, to post on servers or to redistribute to lists, requires prior specificpermission and/or a fee.IMC’08, October 20–22, 2008, Vouliagmeni, Greece.Copyright 2008 ACM 978-1-60558-334-1/08/10 ...$5.00.

KeywordsWiFi Mesh, Commercial, Measurement, Wireless, Access Network,Urban, Active Measurement, Pathloss, Client Experience

1. INTRODUCTIONThe widespread availability of WiFi and the benefits of a low cost

tetherless network deployment have spurred a significant amount ofinterest in wireless mesh networks. As the initial excitement of thisnew possibility has subsided, realities and challenges of makingsuch a network as a viable technology have set in. The researchcommunity has been spending many years and person-hours tryingto both understand performance issues of multi-hop wireless com-munication as an access technology and tackle problems in multipleinnovative ways. A few examples of recent progress include newchannel-aware routing metrics such as ETX [1] and WCETT [2];new opportunistic and broadcast-oriented routing strategies such asExOR [3], COPE [4], MORE [5], and CLONE [6]; channel plan-ning and assignment strategies that combine routing decisions [7,8], mesh deployment strategies [9, 10]. Additionally, a numberof vendors (Cisco Systems, Firetide, Mesh Networks (now part ofMotorola), Strix Systems, and Tropos Networks, to name a few)have also spent their efforts in creating and releasing commercial-grade mesh networking solutions that are being used for differ-ent purposes, including municipality-wide Internet access, publicsafety, and commercial use.

A number of detailed and insightful measurement studies in therecent years have characterized performance of various moderatescale, primarily home-grown and organically expanding, mesh net-works. Examples include pioneering work on the Roofnet testbedaround Cambridge, MA [11], the TFA-Mesh in Houston, TX [9],and the Digital Gangetic Plains project in India [12]. Mesh de-ployments with a research intent, often, have important limitations.In such cases, we typically use off-the-shelf components for costreasons, and adapt them to their individual goals. Often, in suchdeployments we provision additional capability for measurementsand experimentation to further the research goals. Additionally,the services offered by these deployments are, often, at no cost tothe user. Hence, initial expectations of performance are relativelylow. In contrast, users of a commercial network have significantlyhigh expectations of network stability and availability. Therefore,commercial networks are, often, carefully deployed, engineered,and tuned for high quality performance. While multiple studieshave documented the experiences of mesh networks, that are off-shoot of research endeavors, in this paper, we attempt to present thefirst systematic study of a commercial-grade WiFi mesh network.The mesh network we study has been operational in Madison, WI,for more than two years now. We refer to this mesh asMadMesh.MadMesh consists of more than 250 Mesh Access Points (MAPs)

Figure 1: Logical view of theMadMesh Network.

distributed in the greater downtown area of 10 square miles, cur-rently serves more than 1000 residential customers, provides trafficbackhauling capabilities for some other ISPs and small businesses,and provides additional wireless services to different public safetyorganizations of the city.

1.1 Study goalsThrough detailed measurement efforts spanning more than 8 months,

we wish to answer a broad categories of questions. In each cate-gory, apart from trying to understand existing phenomena, we alsoattempted to evaluate the relevance of various ongoing research ef-forts to improving performance. For instance, given that a lot ofefforts are being spent in designing effective network coding basedrouting strategies [4, 6, 5], how applicable are they to common de-ployment scenarios. These categories are:

• Mesh planning and deployment:What are the deploymentstrategies and their efficacies for a large-scale mesh networkspanning a substantial part of an urban area? How effectiveis such deployment in handling failures?

• Mesh routing strategies:What are common routing mecha-nisms adapted? How well do they perform?

• User experience:What is the client performance in differentparts of the mesh network?

• Usage characterization:How is the mesh utilized by the users?

While it is always dangerous to generalize observations basedon one single deployment, we believe that the relative success ofthis network makes it a reasonable starting point for other futureevaluations along these lines.

1.2 MadMesh Architecture and UseThe MadMesh network is comprised primarily of Cisco 1510

MAPs [13]. The MAPs are typically organized into a tree structure,with the root referred to as aRoot Access Point (RAP), i.e., a RAP isa MAP selected to serve as a root (see Figure 1). While a RAP typ-ically has wireline access to the Internet, in the case of MadMesh,

the RAPs use special licensed, wireless frequency bands to com-municate to an Internet fiber hub. MadMesh has multiple MAPsconfigured as RAPs, and hence, there is a separate tree correspond-ing to each RAP. Each MAP is configured to detect other nearbyMAPs, and associate with one tree for all communication. Basedon changing channel conditions, a MAP can also change its parentin the tree, or even switch to a different tree, if available. A meshcontroller directly configures and manages all the MAPs in the net-work. In particular, each MAP establishes a Layer 2 tunnel to thismesh controller soon after it boots up and joins an existing tree.

We refer to the links between different MAPs on a tree, as themesh backbone. The Cisco 1510 MAPs are equipped with two ra-dio interfaces. One interface of each MAP is dedicated for commu-nication on the mesh backbone, and is referred to as thebackboneinterface.The second interface is configured to act as a regular Ac-cess Point (AP) for regular clients, and is referred to as theaccessinterface.(When we refer to the corresponding wireless links, werefer to them as backbone link and access link respectively.) Thebackbone interface in MadMesh is configured to operate using the802.11a standards in the 5 GHz band, while the access interfaceis configured to operate using the 802.11b/g standards in the 2.4GHz band. Since each MAP dedicates a single radio interface forbackbone communication, all these interfaces of MAPs that formthe same tree are made to operate on the same 802.11a channel toestablish communication links. Different trees operate on different802.11a channels. The access interface of different MAPs operateon different 802.11 b/g channels. Clients associate to the accessinterface of a MAP using common WLAN procedures.

In the MadMesh deployment, the access interface is always con-figured to use an omni-directional antenna with 5 or 8dBi gain toachieve the desired coverage. The MAPs are mounted on street util-ity poles and the expected coverage of the AP interface is around1000 to 1500 feet. Most backbone interfaces of MAPs use a 11 dBisector antenna for more efficient communication. All of the back-haul traffic is encrypted by the MAPs using hardware-based AESto ensure privacy of the users. The APs support the 802.11i andWPA standard security authentication and encryption mechanisms.

How MadMesh is used? MadMesh is used to provide Inter-net access to users. Users are typically charged a fixed monthlyfee which varies with the quality of service (like bandwidth lim-its) promised to them. Overall, the end users of this network useit mostly from the student dormitories, university buildings, cafete-rias and other residences.

Figure 2 presents a high-level view of approximately one-thirdof MadMesh. The centers of the circles mark the positions of theMAP locations. The circle size is proportional to the number ofusers served. The lines indicate the typical connectivity structurewithin the mesh.

1.3 Main observationsWe now highlight some of the most important lessons and obser-

vations learnt about a large-scale, commercial-grade mesh networkthrough our measurement study.

Robustness — local does not mean globalEach MAP in the network has good connectivity with its peers.For example, about 60% of the MAPs had a degree greater than 3on average, while the top 10% of the MAPs had a degree of 6 orhigher. However, surprisingly there were multiple cases where asingle link failure could partition the network. Thus although thenetwork planning involved local redundancy, it did not automati-cally translate to global redundancy.

Figure 2: A third of the MadMesh deployment area. The circles represent the MAPs, their relative sizes indicate the relativenumber of users associated, averaged over three minute intervals over the duration of the study. The largest circle correspondsto an AP which had 6.55 users in average. The lines indicate their connectivity into the tree structure.

Bottleneck — it is the access linkThe performance of the mesh backbone was fairly robust. The linkqualities were usually good. Multiple hops on the backbone, goingall the way up to 6-8 hops did not significantly hinder user per-formance. However, the biggest hindrance to performance is theinterference in the access link. We believe that there are two rea-sons for it. First, the MAPs are on utility poles, and most usersare indoors (in brick or other buildings). The access link, there-fore, has poor signal quality from such indoor locations. Second,the radio interface in client devices (laptops, PDAs, etc.) often tendto operate in low-power modes than the MAPs. Thus, althoughclients can ‘hear’ MAP beacons, the uplink communication link is,often, particularly bad. Customer premise equipments (like 802.11repeaters) can potentially help mitigate some of these performanceproblems.

Routing paths — flapping is prevalentThe trees that define routing paths have a flapping behavior. Whilemany MAPs had fairly stable paths, about 10% of the MAPs hadrouting flaps in a regular fashion (more than 4 route changes perhour between the same alternative choices). Often these flaps occurdue to availability of multiple equally good or equally bad alterna-tives, and call for dampening mechanisms to be put into place.

Management — client feedback can really helpThe usual management tools at the disposal of network administra-tors rely on SNMP data collection from MAPs and other infrastructure-based components. Unfortunately, the MAPs, often, do not ob-serve the real performance problems being experienced at clients.Inferring client performance based on observations at the MAPsis harder due to the high variability and complexity of the urbanWiFi environment. A limited amount of automated client feedback(client reports) can bring many of these performance problems tolight.

Importance of client assisted network managementSimple measurements using few clients helped us observe that cov-erage holes were prevalent in the network with the disconnectionsbecoming more pronounced when the network was being accessedfrom moving vehicles. More importantly, although these areaswere shown as ’covered’ by the vendor provided tool (i.e. from theinfrastructure view of the network), client feedback helped identifythe areas with poor coverage. We believe that managing such largescale mesh network deployment is a complex task and client as-sisted measurements can provide significant corrective feedback tothe operator regarding coverage holes, which are much more accu-rate than the propagation models used in current mesh controllers.

Applicability of recent research results — network cod-ing and opportunistic routing can helpIn the recent past, new, wireless-specific, routing and MAC mecha-nisms, such as network coding and opportunistic routing, have beenproposed and demonstrated to work through research prototypes inlimited settings. However, the question of their real applicability inoutdoor mesh deployments have not been answered. Our measure-ments indicate that certain degree of topology diversity and trafficredundancy in data paths exist that will allow for these mechanismsto lead to performance gains.

User characteristics — night-time peaks and unevenusageFinally, we have also studied the usual aspects of user behavioron this network. Being primarily a residential access network, weobserve that traffic volumes peak in late evenings and the nighthours, rather than in the daytime. This is likely to be consistentwith traffic patterns of other access networks, but is contrary toobservations made in core ISPs (that see daytime peaks). Clientdistribution between MAPs is also quite uneven.

1.4 RoadmapThe rest of the paper is structured as follows. In the next section

we describe our measurement methodology. In Sections 3 to 6, we

examine different questions in the four broad categories, namelymesh planning and deployment, mesh routing strategies, user ex-perience, and usage characteristics. In Section 7, we present somerelated work and place our current effort in perspective, We, finally,conclude in Section 8.

2. MEASUREMENT METHODOLOGYFor our measurement study we have collected data over a pe-

riod greater than 8 months. The main limitation to our ability tocollect data has to do with the commercial nature of the networkwe study. Although most of our needs were accommodated by thenetwork operator, our access to logs was limited and our experi-ments had to be conducted in a manner that would not significantlyimpair the network performance. Combining the passive and activedata, however, still allowed us to capture and understand the overallcharacteristics of the network.

Periodic infrastructure logsUsing our privileged access to the mesh controller we polled SNMPrecords from all active MAPs, once every three minutes. EachSNMP record had more than 150 parameters that each MAP recordsabout its performance. For example, each MAP maintains statisticsabout the total number of clients associated to it, the MAC addressof the current parent MAP, MAC addresses of its neighbors, the cur-rent channel number, the number of failed transmissions, the noisefloor level at the MAP etc. In addition, we had access to variousmanagement tools and syslogs at the mesh controller, that trackedother global mesh parameters.

Passive monitoringWe strategically placed a few monitoring nodes at different partsof the network to gather wireless traffic passively. We used threeform of passive locations — (i) an outdoor utility pole mountedmonitoring node that was close to a MAP, a few indoor monitoringnodes co-located with a few residential users, and (iii) a mobilemonitoring node mounted on a city-bus traveling all over Madison,WI. Unfortunately, all client traffic on the mesh was encrypted, andhence, it was not feasible for us to do any application-level trafficanalysis. However, the MAC-level headers of all wireless frameswere available through this method.

Active measurementsOur log analysis revealed that passively collected data did not ad-equately describe experience of individual network clients. To ad-dress this issue, members of our teams periodically went to differ-ent parts of the city to perform limited volumes of active measure-ment, using tools such as iperf [14]. For these measurements weused laptops equipped with a Cisco AIR-PCM352 PC Card wire-less adapter and the MadWifi driver v0.9.3.

Over the duration of these 8 months, the network itself changedmany times. For example, MAPs were moved between differentutility poles, hardware was changed and upgraded, and so on. Soto present a consistent view of the network performance, the coredataset used in this paper comes from a two-week period, betweenthe end November and early December 2007. This part of thedataset had more than 1.7 million SNMP log entries (from 224MAPs) and more than 100 hours of active measurements.

Based on these data sets, we now present our observations in thefour different categories in the following four sections.

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.0

1 2 3 4 5 6 7 8910 20 30 40

Fra

ctio

n

Average degree of MAPs (in Logscale)

MadMeshRoofnet Mesh

Figure 3: Plot showing the fraction of MAPs with a given aver-age degree for MadMesh and Roofnet (CDF)

3. ON MESH PLANNING & DEPLOYMENTWe begin with our observations on various topological properties

of MadMesh. Many of these questions arise when the mesh is beingdeployed or periodically upgraded. More specifically, we focus onthe following questions in this section:

• What does the neighborhood of each MAP look like? Whatkind of connectivity does each MAP have with its peers?

• How robust is the deployment to failure scenarios?

• What are the link-level error rates and the signal qualities onthe backbone and access links? What are their contributioncauses?

• Does the network topology lend itself to new routing mech-anisms such as network coding and opportunistic routing?

3.1 Average MAP degreeWe define the degree of a MAP as the number of neighboring

MAPs with link quality above a threshold (14 dB for this study).The average degree of the MAPs in the network helps us determinethe connectivity properties of the deployed mesh network. A lowdegree of connectivity would imply that the MAPs are constrainedin their choices of parent links, which in turn implies limited re-routing choices in presence of losses. Similarly, an extremely highdegree of connectivity would imply over-provisioning in the de-ployment apart from increased possibility of self interference. Fig-ure 3 plots the CDF of the degree for all the MAPs in the mesh.We can observe that 20% of the MAPs have a degree of less than2 and about 60% of the MAPs have a degree of more than3. Thisobservation is interesting because more than 70% of the MAPs usedirectional sector antenna systems with a 45 degree beam for thebackbone, and we would have expected a much lower connected-ness. Overall, the connectivity of the MAPs within the mesh isfairly good.

However, it is interesting to note that the neighborhood distri-bution of MadMesh is still much lower than an organically grownmesh, such as the Roofnet. The latter is a network deployed inCambridge, MA, in and around MIT. Roofnet comprises of a setof wireless nodes that are hosted in homes and apartments of will-ing volunteers, and hence, there is limited fidelity in controlling

its growth and structure. In Figure 3 we also plot the degree ofnodes in the Roofnet network, and we believe that the large vari-ation in node density is a consequence of its unplanned growth.In contrast, the deployment of MadMesh is well structured, andwas preceded by detailed site surveys. Additionally, positioningof MAPs are continuously changed based on performance require-ments. Finally, MAPs in MadMesh are continuously available, andare tightly managed by the network operators, making their upti-mes more predictable. Therefore, it is logical to expect that thedensity of MAPs in MadMesh is much lower than RoofNet.

3.2 Robustness of the deploymentTo gauge the quality of the mesh planning one also needs to mea-

sure the robustness of the deployment against link failures. A welldeployed network should have more than one distinct path to thewired Internet connection. We study the robustness of the meshtopology by looking at the min-cut of each edge. The min-cut of aMAP in a graph is defined as the minimum number of edges, whoseremoval would disconnect the MAP from the graph. To understandthis, we build a graph out of connectivity reports obtained throughSNMP logs and calculate the minimum cut (min-cut) of each MAPfrom the different RAPs. Figure 4, shows the scatter-plot of theaverage min-cut of all the MAPs (since the mesh topology changesover time, so can the min-cut, necessitating averaging of this value).as it varied against average degree. As can be seen from the plot,around 8% of the MAPs have a mincut less than 2. This impliesthat the MAPs would get disconnected from the rest of the networkif less than 2 other specific MAPs fail. The figure also shows thatMAPs with neighbor degree as high as 7 can still have a min-cutlower than 2. This can be the case, if the neighbors of a MAPhave a common ancestor in the path to the RAP. Figure 4 shows aspecific instance of this phenomenon where a group of MAPs areconnected to the rest of the network via a single path (at a specifictime instant). In this case, failure of the common MAP would resultin a disconnected topology.

The analysis presented above assumes that all MAPs in the meshhave equal failure probability. However, in reality the failure of aMAP can be triggered by many independent factors including hard-ware failures, channel fluctuations or external interference in themedium. Thus the exact failure probability of each MAP may bedifferent depending on aforementioned factors and this informationneeds to be coupled with the mincut values to provide a better esti-mate of the robustness of a MAP. One way of estimating the failureprobability of a MAP is to track the uptimes of all the MAPs inthe system. Unfortunately, we currently do not have access to thisinformation for Madmesh and we hope to provide more details onnode failure in our future work. But we still believe that trackingthe mincut of the MAPs in the mesh provides a reasonable estimateof the robustness of the mesh and network planners should exam-ine the path diversity in their mesh deployments. More specifically,they need to ensure that multiple paths do exist between each MAPand the different RAPs, that can help tide over individual failures,and a high neighbor degree of each MAP does not automaticallyguarantee robustness.

3.3 Error rates of backbone and client accesslinks

The mesh network utilizes two different spectral bands (2.4 GHzfor access links and 5 GHz for backbone links) for communica-tion. We now characterize the relative performance of access andbackbone links. In order to do this, we first compare the packeterror rates (PER) for the access and backbone links. We definePER as the fraction of unicast wireless frames for which no cor-

path to the root

mincut = 1

Connectivity for thehighlighted node

11

3

18

58 23

102

17

160

56

21

39 48

34

19

0

1

3

5

7

9

11

1 2 3 4 5 6 7 8 9 10 11

Min

cut o

f Nod

e

Average Degree of Node

Figure 4: Scatter plot of the mincut and average degree ofMAPs

responding acknowledgment was received. For the calculation ofPER on the backbone links, we utilize two SNMP counters whichreport the number of packets for which the MAP did not receivean acknowledgment (F ) and the number of transmitted packets forwhich an acknowledgment was successfully received (T ). We thencalculate PER for the backbone links usingF/(T + F ). However,the SNMP data for calculating the PER on access links was notavailable to us. We therefore carried out a set of directed activemeasurements to estimate the PER of the access side. As part ofour active measurement experiments, we randomly selected a setof 35 locations in the coverage area, and at each location we con-nected with the MAP with the strongest signal and performed threesets of TCPiperf sessions, each session lasting 120 seconds. Ouriperf server was running on the mesh controller. We were also cap-turing packets in the monitor mode (on a different interface), fromwhich we determine the number of retransmissions (and hence theloss rates). We observed a total of 15 distinct MAP’s in this activemeasurement procedure. The PER is averaged for each MAP overall the runs at different locations.

We have plotted our results in Figure 5. The figure shows theCDF of the PER for all the backbone (5.2 GHz) and the access(2.4 GHz) bands. We observe that for each of the MAPs, the PERon the access side is much higher than that on the backbone side.This result indicates that the errors rate in 2.4 GHz access band ismuch higher than the 5.2 GHz backbone band, which can be at-tributed to much higher level of interference and noise in the 2.4GHz band, used widely by most wireless access points to serve endclients. Further, on the access side, the client to MAP link is morevulnerable due to the lower transmit power of the client devicesas compared to the MAP. This link asymmetry which can lead topacket losses from client to MAP, even though the signal strengthfrom the MAP to the client is high. We validated this assumptionusing a client device with a higher transmit power, and as expectedthe PER on the access band was significantly lower for that client.Since client devices typically have lower transmit power, the per-formance on an end-to-end mesh path will be severely impacted bythe interference and errors observed on the access link. Even if themesh backbone is of high quality (which is the case for MadMesh),the performance observed by clients will be limited by the inter-ference and error effects on its direct connection to the first MAP.

3.4 Channel selection in backbone and accesslinks

As described before, the packet error rates on the access linksare significantly higher than those of the backbone links. In order

0.2

0.4

0.6

0.8

1.0

.05 .1 .15 .2 .25 .3 .35

Fra

ctio

n

MAC level Error Rate

# of MAPs 15

(a) PER of Access

0.20.40.60.81.0

.02 .04 .06 .08 .10 .12

Fra

ctio

n

MAC level Error Rate

# of MAPs 224

(b) PER of Backbone

Figure 5: Plot showing the CDF of (a) PER of Access (2.4 GHz)and (b) Backbone (5.2 GHz) band.

to understand this contrast, we first examine the signal-to-noise ra-tio (SNR) of the backbone links. For a link operating on a higherSNR, there is a higher probability of successful packet transmis-sions. Similarly, links with low SNR values can result in highpacket error rates. Such low SNR values could be because of pres-ence of high levels of ambient interference in the network or due toa very low received signal strength. In Figure 6(a) we plot the CDFof the SNR values reported for all the backbone links across theduration of study. We observe that nearly99% of the links have anSNR above15 dB. These high values of SNR explain the minimalvalues of PER seen on the backbone links.

We next turn out attention to the quality of the access links. Sim-ilar to the backbone links, high SNR values in the access wouldimply that the clients would experience relatively low losses on theaccess links. To study the characteristics of the access links ideallywe would like to plot the SNR for these links. However, differ-ent client radios use different radios (quality/brand) and differenttransmit powers. This makes the reported SNR a function of theclient device used and also its proximity to the MAP, which aredifficult to quantify. We therefore analyzed the amount of ambientnoise (the raw power received from non-802.11 sources operatingin the same channel) and the amount of interference (raw powerreceived from 802.11 sources other than the client in communica-tion). These values are reported periodically in the SNMP recordsfor the duration of study. We observed that the ambient noise floorwas at an acceptable level of -90 dBm, for nearly90% of the accesslinks. In contrast, we observed that the amount of interference wasvery high. Figure 6(b) shows the CDF of the amount of interfer-ence for all the access links. As can be seen from the plot, morethan20% of the access links experience a high interference of−70dBm. We attribute such high values of interference to be one ofthe main reasons for high PER seen on the access links. Anotherpossible reason could be the low RSS of the client radios, how-ever we could not verify this as we did not have access to the MAPsoftware.

A possible remedy for the losses would involve a) forcing theclients to use a better RSSI while talking with the MAPs and b)ensure that the access side operates in the channel with the leastamount of ambient noise and interference. While, the signal strengthsfrom the clients can not be controlled by the MAPs, they can en-sure that they operate in the best possible channel (channel withleast amount of interference and ambient noise).

To find out whether the MAPs are indeed working in the bestchannel, we plot the CDF of the least amount of interference presentin any given channel for all the SNMP snapshots in Figure 6(b).The plot shows the MAPs are not working in the best channel avail-able. We believe that adoption of a channel of operation selectionalgorithm would result in better performance in the mesh. Design-ing such channel selection algorithm is nontrivial, since while se-lecting the locally best channel of operation the MAPs have to en-

0.10.20.30.40.50.60.70.80.9

10 15 20 25 30 35 40 45 50

Fra

ctio

n

SNR (in dB)

# of MAPs 224

(a) SNR of Backbone

Current

Best

# of MAPs 224

0

20

40

60

80

100

−100−90 −80 −70 −60 −50 −40

Fra

ctio

n

Interference (in dBm)

(b) Interference in Access

Figure 6: Plot showing the (a) CDF of SNR for all backbonelinks and (b) CDF of Interference of all MAPS for access, alongwith the least interference possible (in another channel).

NodesLeaf

No Overhearing

24% (Overhearing)

1

2

3

4

5

6

0 20 40 60 80 100 120 140 160 180 200MAP Index

Max

imum

cod

ing

gain

Figure 7: Plot showing the maximum coding gain achievable ateach of the MAPs.

sure that they operate in independent channels to avoid interferingwith each other. However given the huge loss rates observed in thecurrent settings, we believe that this optimization would result inimprovement of overall mesh performance.

3.5 Feasibility of Network CodingA good degree of connectivity in the mesh network has an impli-

cation in context of current ongoing research in the field of oppor-tunistic routing and coding based approaches for mesh networks [3,4, 5, 15]. These mechanisms exploit the broadcast nature of thewireless medium and are based on the possibility of overhearing(of data packets) in the network. As shown in Figure 3 nearly50%of the MAPs have a degree of more than3. This indicates that theremight be a good possibility of overhearing (of data packets) in themesh network.

In order to understand the achievable gains from overhearing-based mechanisms in the MadMesh deployment, we take the ex-ample of COPE [4], a network coding scheme based on oppor-tunistic overhearing of data packets, and calculate an estimate ofthroughput improvements achievable at each MAP. COPE achievesin-network data compression by XORing multiple data packets to-gether and transmitting a single coded packet. The number of pack-ets which can be coded together at each MAP is determined by thefollowing coding rule [4]:

To transmitn packets,p1, ..., pn, ton nexthops,r1, ..., rn,a node can XOR then packets together only if eachnext-hopri has alln− 1 packetspj for j 6= i.

The above rule indicates thatn packets (destined ton neighboringMAPs) can be coded together only if the structure of the networkpermits each of then neighboring MAPs to overhear the othern−1packets. In such a case, acoding gainof n is said to be achieved atthis MAP. In order to estimate the throughput improvements pos-sible at each MAP, we derive the maximum coding gain at eachof the MAPs using the above coding rule. Note that, the codinggain achievable in practice also depends on the underlying routingmechanism, the number of flows and the direction of these flows.Here, we are interested in finding out the network coding oppor-tunities the deployment inherently supports and we therefore lookat the best case scenario i.e., we assume that each of the MAPs al-ways has backlogged traffic to send to each of the other neighboringMAPs.

Figure 7 shows the maximum coding gain for each of the MAPsin the network, if it were to act as the relay node (i.e., the nodewhere the packets were being coded). We observe that around10%of the MAPs have no coding opportunities available as they werethe leaf MAPs (MAPs with only one neighboring MAP). For about66% of the MAPs in the network the maximum coding gain achiev-able was only2 i.e., even though the number of neighboring MAPsfor each of these MAPs were more than1, the structure of the net-work did not permit coding more than2 packets. However, thereare around24% of the MAPs where coding gains of more than2were possible with the maximum coding gain reaching6 for someof the MAPs.

This shows that techniques like network coding can potentiallyimprove the performance of such densely deployed outdoor meshdeployments.

4. ON MESH ROUTING STRATEGIESRouting in multihop wireless mesh networks has been a field of

significant research in recent times. Algorithms proposed in [3,1, 4] describe routing algorithms designed to improve the perfor-mance of the network. Studying the functioning of a routing algo-rithm in a mesh network spanning a city is an exciting problem in itsown right. Ideally, such characterization would involve large scaleexperimentation. However, the commercial nature of the mesh de-ployment constrains the amount of experimentation feasible. Inparticular, we could not change the parameters of the routing algo-rithm to observe its characteristics. Instead, we studied the perfor-mance of the routing algorithm in terms of the routing paths createdand the relative stability of the routing paths. To reason about thequality and stability (or lack thereof) of the routing paths, one needsto know about the factors which affect the routing decision. In thissection, we attempted to answer the following questions:

• How often do routes change and what specific events (fromthe routing algorithm’s perspective) trigger these changes?

• What is the consequence of the routing algorithms used onthe structure of data trees?

• What are potential inefficiencies in the routing mechanisms?

4.1 Understanding behavior of mesh routingalgorithm

In this section we present a study of the mesh routing decisionalgorithm. Our goal is to correlate each possible routing changeswith its root cause. The current mesh deployment useseasemetricfor route creation. Details of this metric are presented in [16]. Theeaseis based on a weighted sum of the SNR and hop count of thepotential MAPs. The MAP chooses a neighbor which has the bestvalue for the metric. On comparing this metric with ETX [1], we

find that ETX uses (expected transmission count) over a link as anindicator of the quality of the link, in contrast the current metricuses the SNR value as a predictor of the same. Both of them sumthe metric over the entire path.

On studying the SNMP logs to identify root cause of a routechange we found that both hop count and SNR changes were in-volved in 0.9 of the entire parent changes (1-0.1 = 0.9). For the rest0.1 of the cases both SNR and hopcount worsened due to the parentchange. On closer inspection we found that in another .06 of the(total) cases the one of the ancestors increased its hop in the routingtree which made it a bad parent option and hence forced a routingtree change. We could not account for the rest 0.04% by looking atthe logs.

4.2 Implication of the mesh routing metricsA routing metric which is a weighted sum of the link SNRs and

hop count, has some non-obvious implications on the final routingpaths being used by the MAPs. We comment on them below.

Implication on hop count of the MAPsThe SNMP data contains periodic updates about the hop count ofeach MAP. We utilize this information to plot Figure 8 which de-picts the distribution of average number of MAPs on different hopsin the network. We observe that around15% of the MAPs in thenetwork are RAPs. Also, the average number of MAPs decreasewith increase in the hop count i.e., a higher number of MAPs arepresent at the lower hops. Thus, the network is well deployed andthe routing algorithm performs well for most of the time.

However, we also observe that around8% of the MAPs havea hop count of more than5. Conventional wisdom suggests thatthe achievable throughput of the in a multihop network degradesdrastically with increasing hop count. Presence of longer paths be-

0.3

0.2

0.1

0

0.1

0.2

0.3

0 1 2 3 4 5 6 7 8

Fra

ctio

n

Hop count

# of MAPs 224, # of Clients 498

MAPs at each hopUsers at each hop

Figure 8: Plot showing fraction of MAPs at various hops, andthe number of users at each hop

tween the MAP and a RAP might be because of two reasons: (a)the network was not planned well and therefore sufficient routingchoices were not available (b) the routing algorithm is not choosingthe right path.

To investigate, whether this high hop-count is a result of wrongdeployment, we looked into the the best available neighbors interms of hop-count (i.e., neighbor MAPs with lowest hop-counts)for MAPs with a hop-count greater than4. We have plotted theCDF of the hopcounts of such neighbors with the best available

hops and reasonable link quality (link SNR higher than 14 dB) inFigure 9. As can be seen from the plot, the MAPs at hop-countshigher than4 always have a neighbor with better hopcount avail-able. This leads to the conclusion that the phenomenon is not an ar-tifact of the deployment i.e., there were other (shorter) paths avail-able in the network, but the routing algorithm did not choose to useit. This behavior is an implication of using a routing metric whichis a weighted sum of SNR and hop-count. A neighbor with whichthe MAP has a very good link (in terms of SNR) is given priorityas potential parent over another MAP with lower hopcount and arelatively lower SNR.

0.4

0.5

0.6

0.7

0.8

0.9

1.0

1 2 3 4 5 6 7 8

Fra

ctio

n of

MA

Ps

Hop count

Best Current

Figure 9: Plot showing the fraction of current and best hop-counts of MAPs with a hop-count greater than 4

Such long path selection can be avoided if the route selectionalgorithm uses a SNR threshold for selecting the potential parentsand then decides amongst the potential parents based on hop-count.

Implication on route stabilityThe MAPs in the network can form links to more than one MAPin their neighborhood. A parameter of interest to gauge the perfor-mance of the routing algorithm is the relative stability of the routingpaths. Large fluctuations in the path imply that the network condi-tions are unstable. These might be caused by the appearance ofanother potential link with better characteristics,disappearance ofthe current parent link or occurrence of certain other events suchas increased interference etc. which make a previously unattractiveparent link more attractive.

We show the stability characteristics of the mesh routing algo-rithm in Figure 10. This figure shows the relative frequency of theparent changes for all the MAPs occurring in a given duration. Thedata was collected by analyzing a parent change counter present inthe SNMP log for each MAP over the span of passive data collec-tion. This plot shows that some MAPs have a faster rate of chang-ing their parent than other MAPs. To understand the reason behindit, we revisited the SNMP logs corresponding to the MAPs withhigh parent change frequency. We observed two major reasons forthis phenomenon:

• The MAPs with high-rate of parent change are also the oneswhich have more than one neighbor with comparable SNRand hopcounts. Presence of multiple parent choices with al-most similar quality, coupled with momentary fluctuations inthe wireless characteristics makes one of the links momentar-ily better than the rest forcing the MAP to flap its route.

0

2

4

6

8

10

12

14

0 50 100 150 200

Hou

rly P

aren

t cha

nges

MAP Index

# of MAPs 200

Figure 10: Plot showing the average hourly rate of parentchanges for all the MAPs (Note: Rest of the MAPS did not showany parent change)

• The link to the parent for some intermediate MAP (ancestor)has very bad SNR (less than 10 dB), this causes the inter-mediate MAP to choose a different parent frequently, thuscausing a route flap.

A possible remedy for reducing the amount of route flappingwould be keep a threshold on the number of times a MAP canchange its route in a given quantum.

5. ON MESH USER EXPERIENCERobust client performance is important in commercial, pay per

use mesh network. Although careful evaluation of SNMP logs pro-vide us with valuable insight into the dynamics of mesh infras-tructure, client performance can be better understood by carryingout targeted active measurements at different locations in the mesh.Specifically, we want to know the following regarding client per-formance in commercial mesh deployment under study:

• How good is the quality of client to mesh connectivity inMadMesh ? Are coverage holes prevalent ? What is theimpact of client mobility on coverage holes?

• What is the maximum achievable throughput by a mesh client? What is the impact of hop-count, RSSI, channel congestionon the client throughput?

• Is there any issue of starvation at higher hops when clientsare present at lower hops as well ?

Broadly, the aforementioned questions relate to two main issues- how easily can a client connect to the network and once con-nected, what is the experience of the client. We first describe ourmeasurements for characterizing client connectivity in the mesh de-ployment, followed by a detailed analysis of client performance.

5.1 Client connectivityUbiquitous client connectivity is one of the most important goals

of large scale mesh deployments. In order to maintain client con-nectivity, it is important to have a monitoring infrastructure in placewhich can identify ‘coverage holes’ created due to obstacles, weatherand temporary interference sources. Once such coverage holes are

detected, corrective action can be taken by adjusting power levelsof different MAPs or by deploying new ones. Existing approachescommonly used by leading vendors, employ pathloss models to es-timate the expected area of coverage. Such models describe the at-tenuation experienced by wireless signal as a function of distance.In order to assess the efficacy of such pathloss models, we first per-form detailed experiments to characterize the pathloss exponent inour urban environment.

Characterizing pathloss exponentIn the following equation,α is the pathloss exponent, andε is theshadowing component that describes the variation in pathloss expo-nent.PdBm(d) is the signal strength measured at a given distanced, while PdBm(d0) is the signal strength at the reference distanced0 [17].

PdBm(d) = PdBm(d0)− 10αlog10(d

d0) + ε (1)

We follow the measurement methodology reported in prior re-search work on propagation modeling [9] and collect signal strengthinformation at 25 different locations for each MAP. We compute thepathloss exponent for different MAPs in MadMesh network. Ourresults indicate significant variations in pathloss exponent betweendifferent MAPs. Figure 11(a) and (b) show the signal strength mea-surements as a function of link distance for two MAPs, located indowntown and campus regions of the city. As shown in the fig-ure, the pathloss exponent for the downtown MAP is2.9 while thepathloss exponent for the campus MAP is2.3. This is in contrast tothe pathloss of3.3 shown in [9], who also report that their pathlossexponent is stable across different access points in their network.We attribute this significant variation in pathloss exponent to di-verse set of obstacles and external interference, which also varysignificantly from one location in the city to another. Our resultsshow that generalizing a pathloss exponent for a city wide meshdeployment may be inaccurate, and targeted experiments must beperformed to determine the pathloss in different parts of the city.This observation further reinforces the inefficacy of pathloss mod-els in determining coverage holes. Next we describe a simple clientassisted monitoring tool can detect such coverage holes efficientlyand evaluate its efficacy on the MadMesh network.

Characterizing coverage holesIn order to assess the prevalence of coverage holes in the mesh de-ployment under study, we perform extensive client measurements.We report on some sample results in in a 6×6 block area of themesh deployment. In our experiments, a few clients (IBM laptopswith Cisco Aironet wireless card) were equipped with a modulewhich continuously records the information about the location, cur-rent state of association and received signal strength. Periodically,the clients upload this information to a central server. Over a periodof time (seven days, in our case) information aggregated from theseclients is used to detect coverage holes in the network. Figure 12shows the average client connectivity in our target area. Although,the propagation model based radio map generated by the mesh con-troller shows this entire area to be ‘covered’, we found additionalcoverage holes through client assisted measurements.

Vehicular client connectivityWireless access from mobile devices has been an active area of re-search recently [18]. In that context, we wanted to evaluate MadMeshin terms of providing client connectivity from moving vehicles. To-wards this end, we repeat our client assisted measurements frommoving vehicles that makes round of the same 6x6 block area that

-100

-90

-80

-70

-60

-50

-40

-30

-20

0 50 100 150 200 250 300 350

Rec

eive

d S

igna

l Str

engt

h (d

Bm

)

Distance (m)

Pathloss Exp = 2.3082Shadowing Std. = 14

Mean+7 Stdev-7 Stdev

Measurements

-100

-90

-80

-70

-60

-50

-40

-30

-20

0 50 100 150 200 250 300 350

Rec

eive

d S

igna

l Str

engt

h (d

Bm

)

Distance (m)

Pathloss Exp = 2.9082Shadowing Std. = 8.1

Mean+4 Stdev-4 Stdev

Measurements

Figure 11: Propagation pathloss for two different MAPs in di-verse settings. The MAP in (a) is located in campus and has apathloss of 2.3, while (b) is located in downtown and its mea-sured pathloss is 2.9.

we targeted for our walking experiments. The average speed ofthe vehicle was 25 miles/hr. The coverage holes detected at suchvehicular speeds is shown in Figure 12. As shown in the figure,the holes detected by the clients at vehicular speeds is much largerthen the holes detected during earlier client measurements. In fact,we observe that about 65 % of the total path falls under the cat-egory of coverage holes at vehicular speeds. This shows that themesh deployment under study is not conducive for access at vehic-ular speeds. More importantly, we believe that such client assistedmeasurements can provide significant corrective feedback to theoperator regarding coverage holes, which are much more accuratethen the propagation models used in current mesh controllers.

5.2 Client PerformanceIn order to assess the performance of end users in the mesh de-

ployment, we undertake targeted active measurements, where werandomly sample 100 locations in the mesh coverage area and per-form bandwidth tests to determine the achievable throughput at thatlocation. At each sampled location, we associate to the MAP withstrongest signal strength and run TCPiperf[14] from the client tothe mesh controller. We use TCP as it is the dominant traffic typein mesh networks, and secondly it is less intrusive then a UDP test,which can completely saturate the link and negatively impact other

Figure 12: Actual network coverage as observed by clients, in areas estimated to be perfectly covered by infrastructure-side man-agement tools, that rely on propagation models.

client in the mesh. We perform three iterations of 100 secondseach. Figure 15 shows the distribution of TCP throughput at thesampled locations. As shown in the figure, the measured through-put closely matches a uniform distribution, with about 10% of theclients achieving less then 0.2 Mbps and 80% of the client achievethroughput less than 1 Mbps. This upper limit on client through-put is expected in view of the Service Level Agreement(SLA) ofMadMesh, which advertises a 1 Mbps service to the clients.

Further, to understand the impact of hop count, channel conges-tion and RSSI on client’s throughput, we perform targeted experi-ments on one stable 6 hop tree (shown in Figure 13), comprising ofeight MAPs and a RAP. MAPs one, two and three shown in Figure13 are located on a busy main road of the city that has substan-tial interference from other wireless hotspots in the area. On theother hand, MAPs four to eight are located inside the neighbor-hood areas, experiencing relatively less interference. We choose aminimally loaded tree, so that our experiments are not impacted bythe presence of other MadMesh users on the same tree. Our re-sults from the active measurements on chosen tree is summarizedin Table 1. The main observations are as follows:

• In all experiments, maximum client throughput is limited to1 Mbps, which indicates that bandwidth shaping may be per-formed by the mesh operator for meeting the SLA.

• Client throughput remains stable with RSSI to a point, be-yond which it drops quickly. Since per client throughputis limited by the operator, higher RSSI, which can sustainhigher data rate, does not improve client throughput.

• External interference from other wireless sources has a sig-nificant impact on the client throughput beyond the first hop.

• Throughput unfairness is observed when clients at differ-ent hops of the same tree are activated simultaneously, withclients at higher hops achieving low throughput share.

We now describe each observation in detail.

Impact of hop countAs shown in Table 1, there is no strong correlation between thethroughput and the hop count. Different throughput is seen at clients

Figure 13: Tree for our targeted experiments to understandthe impact of RSSI, hop count, channel congestion on clientperformance.

associated to MAPs at the same hop count in the tree. As shown intable 1, throughput of 0.3Mbps and 0.92Mbps is observed on twodifferent MAPs at a hop count of three. Similarly, a throughput of0.91Mbps and 0.6 Mbps is observed on MAPs at a hop count offive. This variation in throughput at different MAPs with same hopcount can be attributed to channel congestion on their access sideswhich we discuss next.

Impact of Channel CongestionPresence of traffic due to other 802.11b/g sources can have a greatimpact on the throughput observed at each hop. We estimate thechannel congestion by monitoring the traffic on each hop while do-ing the TCP iperf experiments. Table 1 shows that the throughputachieved on MAPs at the same hops is well correlated with thechannel congestion at their respective access sides. We further ob-serve that channel congestion does not have any impact on clientconnected to the first hop. This is because of the lesser number oflinks these clients have to contend for on the backbone. Howeverif the channel congestion is relatively high, this observation mightnot hold true.

Impact of Shared CongestionIn another set of experiments we study the impact on throughput ofclients at a lower hop count in the presence of other clients in the

1.1 Mbps

0.98 Mbps

0.89 Mbps

0.43 Mbps

Isolation

Shared Congestion

12

2 1

(a)

(b)

Figure 14: Effect of shared congestion on a client associated toa MAP at hop 4. In isolation it achieves close to 1 Mbps, butwhen another client at hop 3 is activated, its throughput dropsto 0.43 Mbps.

tree. We first associate only one client to a MAP at a higher hopcount in the tree and calculate its TCP throughput using iperf asshown in figure 14(a). In order to see the effect on throughput dueto other clients in the tree, we associate another client at a lowerhop count on the same tree and start running TCP iperf, shown infigure 14(b). As shown in the figure, on running the second clientthe throughput of the first client suddenly drops from 0.98 Mbps to0.43 Mbps. This is due to shared channel congestion as discussedin [19], when multiple clients try to contend for the same back-bone path. This can have a great impact on the clients connectedto MAPs at a higher hop count, which can suffer from increasedthroughput degradation with the increase in clients at lower hops(closer to RAP).

Summary: In Madcity mesh network, because of bandwidthshaping policies enforced, hop count did not really seem to be thebottleneck for performance. However, this is mostly true in absenceof shared congestion; that is in presence of multiple flows sharingthe same backbone path, the throughput of higher hop-count routeswould be lower. Hence although the penalty of using higher hopcounts is diminished due to bandwidth shaping, choosing a lowerhop path is still better due to the possibility of shared congestion inthe path.

6. ON MESH USAGE CHARACTERIZATIONWe now answer one of the most basic questions about the mesh

network –how is the network being used?Specifically, we want toknow the following:

• How many clients are using the network? How does theirnumber vary across time?

• How are the clients distributed across the coverage area?

• What is average number of clients connected to each MAP?Are there any popular MAPs?

• How does client distribution vary across different hops?

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

1

0.0 0.2 0.4 0.6 0.8 1.0 1.2

CD

F o

f sam

pled

loca

tions

TCP Throughput (Mbps)

Figure 15: CDF of throughput measured through active mea-surements at 100 random locations in the coverage area ofMadMesh. Throughput is almost uniformly distributed, withmaximum throughput being clipped around 1.2 Mbps

avg = 498

max = 627

0

100

200

300

400

500

600

700

3:00 6:00 9:00 12:00 15:00 18:00 21:00 24:00

Ave

rage

num

ber

of u

sers

Time of day

Figure 16: Distribution of the average number of clients con-nected to the network across time.

Client distribution across timeFigure 16 shows the average number of clients per hour connectedto the network over this2 week period. The error bars show the95% confidence limits. We observe that the average number ofclients varied considerably across the duration of the day, with mostnumber of clients being connected at around 10 PM and the leastnumber of clients at 5 AM. We note that the observed usage pat-tern is unique to this mesh network as it is mostly accessed by theusers from their residences. This is apparent from the fact that num-ber of clients starts increasing from around 6 AM, remains steadythroughout the afternoon and then again increases from around 6PM as the users start returning to their homes. It reaches its peakaround 10 PM when most of users are their homes and starts tailingoff as the night progresses. We observe that during the busiest houraround 627 clients were connected to the network with around 498being connected to the network on average.

Client distribution across the MAPsThe average number of clients connected to a MAP gives a measureof the amount of load experienced by that MAP. The client distri-bution across the MAPs also helps us identify ‘client hotspots’ and

MAP Hop Count Avg. RSSI. Avg. Chnl. Util. TCP Thrpt. TCP loss rate (Mbps) TCP RTT (msec)Index (MAP to client)

1 1 34 0.28 .96 0.021 111±98.42 2 33 0.27 0.4 0.092 158±1153 3 35 0.20 0.3 0.087 258.2±168.74 3 40 0.09 .92 0.007 192.5±91.95 4 33 0.10 0.7 0.021 252.2± 126.46 5 32 0.05 .91 0.007 215± 737 6 37 0.09 0.5 0.030 208.2± 117.58 5 33 0.11 0.6 0.015 278.7± 127.4

Table 1: Experimental results for the tree under study. Confidence intervals for RSSI and throughput is small and omitted forbrevity

50% (40)

25% (14)

75% (80)

0

1

2

3

4

5

6

7

0 50 100 150 200

Ave

rage

num

ber

of c

lient

s

MAP Index

Figure 17: Distribution of the average number of clients con-nected to the network across different MAPs.

accordingly deploy more MAPs in that region to evenly distributethe load across the access points. In Figure 17 we plot the numberof clients connected to each MAP averaged over the period of study.The MAPs are sorted in the decreasing order of the average numberof clients connected to them. We make several observations fromthis figure. Clearly, certain MAPs are much more popular whencompared to the others, with the average varying from around7to less than1. The figure also shows the 25, 50, 75 and 95 per-centiles of the number of the clients and the corresponding numberof MAPs to which these clients are connected. For example, onecan see that around50% of the clients are connected to 40 mostpopular MAPs which account for only20% of total the deployedMAPs. We also observe that many of the MAPs are lightly loadedin the mesh network with around 110 MAPs having on an averageless than one user connected to them.

Spatial distribution of clientsIn Figure 2, we show the spatial distribution of the clients using themesh network where each MAP is represented by a circle. The sizeof the circle represents the average number of clients connectedto the MAP. We can clearly observe the uneven distribution of theclients across the coverage area. More importantly, we note thatthe most of the popular MAPs (the MAPs with higher number ofclients connected to them on an average) are concentrated in thearea depicted which is a popular area near the downtown. Further,in this region we can observe the formation of a small number ofclusters in the areas depicted by B2, B3 and B4. We note that thereare several student dormitories in the area B2 which can explainits popularity. While B3 is very popular among the people with a

high number of coffee shops and restaurants concentrated in thatarea, B4 is popular because of an open park where student activityis prominent.

Hop count and number of clientsThe distribution of the number of clients across the different hopsof a network informs us about how good the deployment is. In awell planned deployment, one can expect to see most of the clientsconnected to network to be within a few hops. If there are popularMAPs at a higher hop count, network planners might deploy a RAPin the area in order to reduce the hop count. Figure 9 shows the dis-tribution of the clients across different hops of the deployed meshnetwork. We observe that around15% of the clients are directlyconnected to the RAPs and amongst the other MAPs, the numberof clients decrease with increase in the hopcount. In particular, weobserve that around85% of the clients are connected to the networkwithin 3 hops. We note that this distribution is similar to the distri-bution of MAPs (Figure 8) shown earlier because a large fractionof the deployed MAPs are also at a lower hop count.

Distribution of MAPs with high packet lossesWe observed that some of the MAPs had experienced a packet errorrate of more than35%. On further investigation, we observed thatmost of these MAPs had a very low number of clients connectedto them. We also observed that the MAPs with similar packet errorrates are clustered together representing the regions of high and lowinterference on the access links. For example, we observe that theMAPs in the area B3 experience very low packet losses as they aredeployed in open park. On the other hand, the MAPs with higherpacket error rates are clustered in near B3 where there are otherWiFi networks causing interference.

7. RELATED WORKIn this section we present a summary of previous work being

done on the study of wireless network deployments.A substantial body of research has already been conducted on

evaluation of the performance characteristics of wireless networks.Studies reported in [20, 21] utilize SNMP traces to understand theperformance of wireless networks. Specifically, Kotz et. al. [21,20] present a comprehensive study of the usage patterns (applica-tion popularity, temporal variation in utilization etc.), of a campuswide wireless network. While their study utilized SNMP traces ex-clusively to draw their conclusions. We have carried out a set ofdirected measurements to characterize our network. Also, the net-works considered in these studies are essentially WLANS whereaswe document the performance of a city wide multihop wireless net-work.

No. of Coverage Typical link Intended Architecture Env. Hw/SwNodes (Sq. Km) length usage technology

Roofnet [11] ∼50 4 Mostly Testbed Omni antennas Urban Single Tier(Rooftop mesh) < 500 m on rooftops 802.11b

DGP [12] 17 80 km Up to few tens Testbed High gain Rural Single Tier(Long-distance (point-to- of kms dirnl. antennas 802.11b

Mesh) to-point) on tall towersTFA@Rice [9] 18 3 Mostly Non Omni antennas Urban Two tier

(Sub-Urban Mesh) < 500 m commercial on poles 802.11bMadMesh 250 26 Mostly Commercial Omni & dirnl. Urban Two Tier

(Commercial < 500 m antennas 802.11aMesh) on poles 802.11b

Table 2: Table comparing this study with other measurement studies reported in literature.

Aguayo et. al report their findings on the link level characteris-tics of an 802.11b rooftop based mesh network in [11]. The net-work is deployed in a urban city. Their study focuses on the linklevel characteristics of the deployment. In contrast, we present re-sults on the quality of the deployment and the application level per-formance of our network along with link level characteristics of thenetwork.

Chebrolu et. al. [12] and Sheth et. al [9] also study the link levelcharacteristics of outdoor mesh networks, however their work isapplicable to rural settings. Our study was done on a commercialmesh while all of the above mentioned studies were conducted oncustom testbeds built explicitly for experimentation.

The work by Knightly et. al reports a measurement study of amesh network deployment in [9] and highlights the importance ofmeasurements in accurately planning and provisioning mesh net-works. While their deployment is has a two-tier architecture aswell, their deployment operates exclusively in 2.4 GHz settingswhile ours operates in both 2.4 GHz and 5 GHz. Also, the spanof MadMesh network (250 nodes) is far bigger than their deploy-ment (18 nodes). We summarize and contrast the our measurementstudy with prior work on mesh network deployments in Table 2. Ascan be seen from the table, the unique features of our study are, a)our deployment has a far bigger scale in terms of nodes deployedb) use of two type of RF bands for network operation (802.11 a &b), and c) the commercial nature of the MadMesh deployment.

Concurrently to our work, two other research groups have carriedout measurement studies on mesh network deployments of similarscale. The work by Knightly et. al [22] looked into methods for es-timating the quality of the available coverage on the Google WiFinetwork. Such methodolgies can be integrated with our client-driven measurement architecture to obtain better understanding of agiven mesh deployment. Afanasyev et. al [23] have also presentedtheir study on the usage of Google WiFi network. Their work fo-cuses on the usage characterization of different classes of the clientdevicess (smartphone, modem, hotspot) in terms of the applicationworkload, mobility patterns and the diversity of their usage acrossdifferent locations in the network.

8. CONCLUSIONThis paper presents the first systematic study of a commercial

grade wireless mesh network deployed in an urban setting. We findthat the planned part of the network (backbone) is performing farbetter than the access side. This disparity in performance is mostlya result of unmitigated interference in 2.4 GHz spectrum in urbansettings. The study also presents a set of interesting statistics onthe actual usage of the mesh network, which would help in cus-tomization of future deployments to make them more profitable.

We also present a set of lessons which if followed would result inmore robust deployments and stabler routing algorithms in future.

Finally, the study throws open a set of immediate next steps thatneed to be carefully addressed in real deployments to make meshnetworks viable. Some examples include (i) a better architecturaldesign to mitigate interference on the client access link, (ii) designof mechanisms to detect topology robustness in a global sense, (iii)strategies to mitigate route flapping, as common metrics that de-termine routing changes frequently, and (iv) utilization of clientfeedback in management of these networks.

9. ACKNOWLEDGMENTSWe thank Patrick Luell, Jordan Walker and Su Yong Lee for their

help in performing outdoor experiments. We would also like tothank all the reviewers for their comments. Special thanks to ourshepherd Dina Papagiannaki, whose guidance bought this paper to-gether into its final form. All authors were supported in part by theUS National Science Foundation through awards CNS-0639434,CNS-0627589, CNS-0627102, CNS-0520152, and CNS-0747177.

10. REFERENCES[1] D. De Couto, D. Aguayo, J. Bicket, and R. Morris. A

high-throughput path metric for multi-hop wireless routing.In MOBICOM, 2003.

[2] Liang Ma and Mieso K. Denko. A routing metric forload-balancing in wireless mesh networks. volume 2, pages409–414, Los Alamitos, CA, USA, 2007. IEEE ComputerSociety.

[3] Sanjit Biswas and Robert Morris. Exor: opportunisticmulti-hop routing for wireless networks. InACM Sigcomm,2005.

[4] S. Katti, H. Rahul, W. Hu, D. Katabi, M.Medard, andJ. Crowcroft. Xors in the air: Practical wireless networkcoding. InACM SIGCOMM, 2006.

[5] Szymon Chachulski, Michael Jennings, S. Katti, andD. Katabi. Trading structure for randomness in wirelessopportunistic routing. InACM SIGCOMM, 2007.

[6] Shravan Rayanchu, Sayandeep Sen, Jianming Wu, SudiptaSengupta, and Suman Banerjee. Loss-aware network codingfor unicast wireless sessions: Design, implementation, andperformance evaluation. InACM SIGMETRICS, 2008.

[7] M. Alicherry, R. Bhatia, and L. Li. Joint channel assignmentand routing for throughput optimization in multi-radiowireless mesh networks. InACM MobiCom, 2005.

[8] A. Mishra, E. Rozner, S. Banerjee, and W. Arbaugh.Exploiting partially overlapping channels in wireless

networks: Turning a peril into an advantage. InACM/USENIX IMC, 2005.

[9] J. Camp, J. Robinson, C. Steger, and E. Knightly.Measurement driven deployment of a two-tier urban meshaccess network. InACM MobiSys, 2006.

[10] J. Robinson and E. W Knightly. A performance study ofdeployment factors in wireless mesh networks. InIEEEINFOCOM, 2007.

[11] Daniel Aguayo, John Bicket, Sanjit Biswas, Glenn Judd, andRobert Morris. Link-level measurements from an 802.11bmesh network. InSIGCOMM ’04. ACM, 2004.

[12] Kameswari Chebrolu, Bhaskaran Raman, and SayandeepSen. Long-distance 802.11b links: Performancemeasurements and experience. InACM Mobicom, 2006.

[13] Product catalog: Cisco aironet 1500 series.http://www.cisco.com/en/US/products/ps7156/index.html.

[14] Iperf. www.dast.nlanr.net/Projects/Iperf/.[15] Saumitra Das, Yunnan Wu, Ranveer Chandra, and Y. Charlie

Hu. Context-based routing: techniques, applications andexperience. InNSDI’08: Proceedings of the 5th USENIXSymposium on Networked Systems Design andImplementation, pages 379–392, Berkeley, CA, USA, 2008.

[16] Cisco aironet 1500 series wireless mesh ap v5 design guide.http://www.cisco.com.ru/en/US/docs/wireless/technology/mesh/design/guide/MeshAP.pdf.

[17] G Stuber. Principles of Mobile Communication. 2000.[18] Ratul Mahajan, John Zahorjan, and Brian Zill.

Understanding wifi-based connectivity from movingvehicles. InProceedings of the 7th ACM SIGCOMMconference on Internet measurement, 2007. ACM.

[19] J. Camp, V. VMancuso, Gurewitz O., and E. W Knightly.Measurement and modeling of the origins of starvation incongestion controlled mesh networks. InIEEE INFOCOM,2008.

[20] Tristan Henderson, David Kotz, and Ilya Abyzov. Thechanging usage of a mature campus-wide wireless network.In MobiCom ’04, pages 187–201, New York, NY, USA,2004. ACM.

[21] David Kotz and Kobby Essien. Analysis of a campus-widewireless network. InMobiCom ’02, pages 107–118, NewYork, NY, USA, 2002. ACM.

[22] Joshua Robinson, Ram Swaminathan, and Edward Knightly.Assessment of urban-scale wireless networks with a smallnumber of measurements. InACM Mobicom, 2008.

[23] Mikhail Afanasyev, Tsuwei Chen, Geoffrey M. Voelker, andAlex C. Snoeren. Analysis of a mixed-use urban wifinetwork: When metropolitan becomes neapolitan. InACM/USENIX IMC, 2008.

a measurement study of a commercial-grade urban wifi meshpages.cs.wisc.edu/~suman/pubs/mcb.pdfthe...

Documents