
EURASIP Journal on Wireless Communications and Networking

Wireless Sensor Networks

Guest Editors: Biao Chen, Wendi B. Heinzelman, Mingyan Liu, and Andrew T. Campbell




Copyright © 2005 Hindawi Publishing Corporation. All rights reserved.

This is a special issue published in volume 2005 of “EURASIP Journal on Wireless Communications and Networking.” All articles are open access articles distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Editor-in-Chief
Phillip Regalia, Institut National des Telecommunications, France

Associate Editors
Thushara Abhayapala, Australia; Fary Ghassemlooy, UK; Eric Moulines, France; Farid Ahmed, USA; Alfred Hanssen, Norway; Sayandev Mukherjee, USA; Alagan Anpalagan, Canada; Stefan Kaiser, Germany; A. Nallanathan, Singapore; Anthony C. Boucouvalas, UK; G. K. Karagiannidis, Greece; Kamesh Namuduri, USA; Jonathon Chambers, UK; Hyung-Myung Kim, Korea; Athina Petropulu, USA; Biao Chen, USA; Chi Chung Ko, Singapore; H. Vincent Poor, USA; Pascal Chevalier, France; Richard J. Kozick, USA; Brian Sadler, USA; Chia-Chin Chong, Korea; Bhaskar Krishnamachari, USA; Ivan Stojmenovic, Canada; Soura Dasgupta, USA; Vincent Lau, Hong Kong; Lee Swindlehurst, USA; Petar M. Djuric, USA; Dave Laurenson, Scotland; Sergios Theodoridis, Greece; Abraham Fapojuwo, Canada; Tho Le-Ngoc, Canada; Lang Tong, USA; Michael Gastpar, USA; Tongtong Li, USA; Luc Vandendorpe, Belgium; Alex B. Gershman, Canada; Wei (Wayne) Li, USA; Yang Xiao, USA; Wolfgang Gerstacker, Germany; Steve McLaughlin, UK; Lawrence Yeung, Hong Kong; David Gesbert, France; Marc Moonen, Belgium; Weihua Zhuang, Canada

Contents

Editorial, Biao Chen, Wendi B. Heinzelman, Mingyan Liu, and Andrew T. Campbell
Volume 2005 (2005), Issue 4, Pages 459-461

Distributed Detection and Fusion in a Large Wireless Sensor Network of Random Size, Ruixin Niu and Pramod K. Varshney
Volume 2005 (2005), Issue 4, Pages 462-472

Minimum Energy Decentralized Estimation in a Wireless Sensor Network with Correlated Sensor Noises, Alexey Krasnopeev, Jin-Jun Xiao, and Zhi-Quan Luo
Volume 2005 (2005), Issue 4, Pages 473-482

Asymmetric Joint Source-Channel Coding for Correlated Sources with Blind HMM Estimation at the Receiver, Javier Del Ser, Pedro M. Crespo, and Olaia Galdos
Volume 2005 (2005), Issue 4, Pages 483-492

MAC Protocols for Optimal Information Retrieval Pattern in Sensor Networks with Mobile Access, Zhiyu Yang, Min Dong, Lang Tong, and Brian M. Sadler
Volume 2005 (2005), Issue 4, Pages 493-504

An Optimal Medium Access Control with Partial Observations for Sensor Networks, Răzvan Cristescu and Sergio D. Servetto
Volume 2005 (2005), Issue 4, Pages 505-522

Multihop Medium Access Control for WSNs: An Energy Analysis Model, Jussi Haapola, Zach Shelby, Carlos Pomalaza-Ráez, and Petri Mähönen
Volume 2005 (2005), Issue 4, Pages 523-540

Optimal Throughput and Energy Efficiency for Wireless Sensor Networks: Multiple Access and Multipacket Reception, Wenjun Li and Huaiyu Dai
Volume 2005 (2005), Issue 4, Pages 541-553

Throughput Analysis of Fading Sensor Networks with Regular and Random Topologies, Xiaowen Liu and Martin Haenggi
Volume 2005 (2005), Issue 4, Pages 554-564

Maintaining Differentiated Coverage in Heterogeneous Sensor Networks, Xiaojiang Du and Fengjing Lin
Volume 2005 (2005), Issue 4, Pages 565-572

EURASIP Journal on Wireless Communications and Networking 2005:4, 459–461
© 2005 Hindawi Publishing Corporation

Editorial

Biao Chen
Department of Electrical Engineering and Computer Science, Syracuse University, Syracuse, NY 13244, USA

Wendi B. Heinzelman
Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY 14627, USA

Mingyan Liu
Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109-2122, USA

Andrew T. Campbell
Department of Electrical Engineering and Center for Telecommunications Research, Columbia University, NY 10027, USA

Recent advances in integrated circuits and in digital wireless communication technologies have enabled the design of wireless sensor networks (WSN) to facilitate the joint processing of spatially and temporally distributed information. Such networks immensely enhance our ability to understand and evaluate complex systems and environments. Using wireless connections for sensor networks offers increased flexibility in deployment and reconfiguration of the networks and reduces infrastructure cost. These advantages enable WSN applications in areas ranging from battlefield surveillance to environment monitoring and control to telemedicine.

Enormous challenges in the understanding of sensor networks presently impede deployment of many of the envisaged applications. In particular, for WSN that employ in situ unattended sensors, physical constraints, including those of power, bandwidth, and cost, have presented significant challenges as well as research opportunities in the field. Of particular interest to this special issue are topics related to the communications and networking aspects of WSN. Indeed, one of the major concerns in sensor networks is maintaining connectivity and networking functions with geographically dispersed sensor nodes under stringent resource constraints. This is further exacerbated by the volume of data generated by the sensors, which is disproportionately large compared with the network capacity. The papers in this special issue are reflections of some of these issues.

Sensor networks are typically built to perform some system-wide missions, that is, collective inference tasks that involve all sensor nodes. Examples include detection of an event and estimation of a parameter or a process. The first three papers are concerned with designing such WSN. The first paper, coauthored by Niu and Varshney, considers the detection of an event in sensor networks with a random number of sensors. High network density and limited bandwidth impose a severe constraint on the number of bits each sensor can transmit, and the authors treat the extreme case where a single bit is sent from each sensor. Under the assumption of a Poisson model on the number of sensors, a simple counting rule is proposed at the fusion center to strike a balance between performance and requirement on a priori information. This work demonstrates that for large-scale heterogeneous sensor networks, heuristics based on intuition often trump theoretically optimal processing that is typically too demanding in its requirement. Under the same network architecture, that is, a number of sensors communicating with a single fusion center, the second paper, by Krasnopeev et al., treats an estimation problem where the unknown signal is corrupted by spatially correlated additive noises. Again, bandwidth constraints dictate that each sensor sends a finite number of bits to the fusion center. By exploiting the spatial correlation of the noise in terms of its covariance matrix, the minimum energy quantizer design is reformulated as a convex optimization problem and hence can be solved efficiently using standard convex programming. Taking the problem one step further, the third paper, coauthored by Del Ser et al., deals with the estimation of a random process. Specifically, two binary sources, whose correlation is modeled by a hidden Markov process, are transmitted and the receiver is assumed to reliably recover one of them. This then serves as side information for the decoding of the other. It was demonstrated that the hidden Markov model parameters and the transmitted source can be jointly recovered via iterative decoding.

A perennial problem encountered in large-scale sensor networks is medium access control (MAC): the lack of a central node and the stringent bandwidth and other resource constraints make it an extremely difficult problem. In the paper by Yang et al., the authors consider information retrieval and processing problems in the SENMA (sensor networks with mobile access) network architecture, where data generated by ground sensors are collected by mobile access points (e.g., unmanned aerial vehicles). Three MAC protocols are proposed to produce desired data-retrieval patterns, so as to minimize the reconstruction distortion. These MAC protocols integrate random access by the sensor nodes and the ability of the mobile access points to selectively activate subsets of the sensor field. For a more complicated network model involving multihops, the paper by Cristescu and Servetto describes a solution to the rate control problem at a relay node with partial state information at each source node. The solution reveals an interesting interplay between stability and efficiency; it also provides a distributed medium access control mechanism such that each node can independently decide when it should transmit a packet without complete knowledge of the network state information. Multihop transmission is also considered in the paper by Haapola et al., where an energy dissipation model is proposed to evaluate carrier-sense multiple-access-based MAC protocols. Three different MAC protocols are analyzed with this model and the authors demonstrate how the model can be used to determine when multihop forwarding is more energy-efficient than single-hop transmission in wireless sensor networks.

An important performance measure in sensor networks is the throughput capacity. The paper by Li and Dai considers the tradeoff between throughput and energy efficiency for a reachback channel where multiple sensors send information to an access node. Allowing for an advanced detection scheme at the multiple-antenna receiver, typically feasible for the reachback channel as the access node is not subject to stringent constraints, the authors compare two MAC schemes, round-robin and slotted-Aloha, both in throughput and in energy consumption. It was shown that multiuser scheduling brings significant gain in a fading environment, an observation that corroborates other studies in wireless networks with fading channels. The paper by Liu and Haenggi studies throughput for a multihopped network using slotted-Aloha. It was shown that while a regular network topology exhibits only marginal performance gain over a random topology in terms of per-link throughput, it does have significant advantages if the end-to-end throughput is of concern.

Finally, Du and Lin discuss in their paper a new node scheduling scheme for heterogeneous sensor networks that provide increased redundancy in certain key areas. Their approach, which utilizes a clustering scheme with high-end cluster head nodes that perform the scheduling of all nodes in their cluster, is energy-efficient by ensuring that only the necessary sensor nodes are turned on to achieve the desired coverage in each cluster.

We would like to thank the individuals who participated in the review process; their dedication has ensured the quality of this special issue and the timeliness of its publication. We also would like to thank the authors who have contributed to this special issue for their effort in abiding by the strict deadlines. We hope that this collection of papers will provide some timely research results and contribute to the literature of this very exciting area.

Biao Chen
Wendi B. Heinzelman
Mingyan Liu
Andrew T. Campbell

Biao Chen received his B.E. and E.E. degrees in electrical engineering from Tsinghua University, Beijing, China, in 1992 and 1994, respectively. From 1994 to 1995, he worked at AT&T (China) Inc., Beijing, China, before he joined the University of Connecticut, Storrs, where he received his M.S. degree in statistics and Ph.D. degree in electrical engineering, in 1998 and 1999, respectively. From 1999 to 2000 he was with Cornell University as a Postdoc Research Associate. Since 2000, he has been with Syracuse University, Syracuse, NY, as an Assistant Professor with the Department of Electrical Engineering and Computer Science. His area of interest mainly focuses on signal processing for wireless sensors and ad hoc networks and on multiuser MIMO systems.

Wendi B. Heinzelman is an Assistant Professor in the Department of Electrical and Computer Engineering at the University of Rochester. She received a B.S. degree in electrical engineering from Cornell University in 1995 and the M.S. and Ph.D. degrees in electrical engineering and computer science from MIT in 1997 and 2000, respectively. Her current research interests lie in the areas of wireless communications and networking, mobile computing, and multimedia communication. She received the NSF CAREER Award in 2005 for her research on cross-layer architectures for wireless sensor networks, and she received the ONR Young Investigator Award in 2005 for her work on balancing resource utilization in wireless sensor networks. She is a Member of Sigma Xi, the IEEE, and the ACM.


Mingyan Liu received her B.S. degree in electrical engineering in 1995 from Nanjing University of Aeronautics and Astronautics, Nanjing, China, the M.S. degree in systems engineering, and the Ph.D. degree in electrical engineering from the University of Maryland, College Park, in 1997 and 2000, respectively. She joined the Department of Electrical Engineering and Computer Science, the University of Michigan, Ann Arbor, in September 2000, where she is currently an Assistant Professor. Her research interests are in performance modeling, analysis, energy-efficiency, and resource allocation issues in wireless mobile ad hoc networks, wireless sensor networks, and terrestrial satellite hybrid networks. She is the recipient of the 2002 NSF CAREER Award and the University of Michigan Elizabeth C. Crosby Research Award in 2003.

Andrew T. Campbell is an Associate Professor of electrical engineering at Columbia University, and a member of the COMET Group. He is working on emerging architectures and programmability for wireless networks. He received his Ph.D. degree in computer science in 1996, and the NSF CAREER Award for his research in programmable mobile networking in 1999. Prior to joining academia he spent 10 years working on transport and operating systems issues in industry. He spent his sabbatical year (2003–2004) at the Computer Lab, Cambridge University, as an EPSRC Visiting Fellow.

EURASIP Journal on Wireless Communications and Networking 2005:4, 462–472
© 2005 R. Niu and P. K. Varshney

Distributed Detection and Fusion in a Large Wireless Sensor Network of Random Size

Ruixin Niu
Department of Electrical Engineering and Computer Science, Syracuse University, 335 Link Hall, Syracuse, NY 13244-1240, USA

Pramod K. Varshney
Department of Electrical Engineering and Computer Science, Syracuse University, 335 Link Hall, Syracuse, NY 13244-1240, USA

Received 11 December 2004; Revised 9 May 2005

For a wireless sensor network (WSN) with a random number of sensors, we propose a decision fusion rule that uses the total number of detections reported by local sensors as a statistic for hypothesis testing. We assume that the signal power attenuates as a function of the distance from the target, the number of sensors follows a Poisson distribution, and the locations of sensors follow a uniform distribution within the region of interest (ROI). Both analytical and simulation results for system-level detection performance are provided. This fusion rule can achieve a very good system-level detection performance even at very low signal-to-noise ratio (SNR), as long as the average number of sensors is sufficiently large. For all the different system parameters we have explored, the proposed fusion rule is equivalent to the optimal fusion rule, which requires much more prior information. The problem of designing an optimum local sensor-level threshold is investigated. For various system parameters, the optimal thresholds are found numerically by maximizing the deflection coefficient. Guidelines on selecting the optimal local sensor-level threshold are also provided.

Keywords and phrases: wireless sensor networks, distributed detection, decision fusion, deflection coefficient.

1. INTRODUCTION

Recently, wireless sensor networks (WSNs) have attracted much attention and interest, and have become a very active research area. Due to their high flexibility, enhanced surveillance coverage, robustness, mobility, and cost effectiveness, WSNs have wide applications and high potential in military surveillance, security, and monitoring of traffic and the environment. Usually, a WSN consists of a large number of low-cost and low-power sensors, which are deployed in the environment to collect and preprocess observations. Each sensor node has limited communication capability that allows it to communicate with other sensor nodes via a wireless channel. Normally, there is a fusion center that processes data from sensors and forms a global situational assessment.

In a typical WSN, sensor nodes are powered by batteries, and hence have a very frugal energy budget. To maintain longer lifetimes of the sensors, all aspects of the network should be energy efficient. In [1], a data-centric energy efficient routing protocol is proposed. By using existing wireless local area network (WLAN) technologies, in [2], authors present a cluster-based ad hoc routing scheme for a multihop sensor network. In [3], an on-demand clustering mechanism, passive clustering, is presented to overcome two limitations of ad hoc routing schemes, namely limited scalability and the inability to adapt to high-density sensor distributions.

Many other important aspects of WSNs have been investigated too, such as distributed data compression and transmission, and collaborative signal processing [4, 5]. In a WSN, detection, classification, and tracking of targets require collaboration between sensor nodes. Distributed signal processing in a sensor network reduces the amount of communication required in the network, lowers the risk of network node failures, and prevents the fusion center from being overwhelmed by a huge amount of raw data from sensors. In this paper, we focus on distributed target detection, one of the most important functions that a WSN needs to perform. There are already many papers on the conventional distributed detection problem. In [6, 7], optimum fusion rules have been obtained under the conditional independence assumption. Decision fusion with correlated observations has been investigated in [8, 9, 10, 11]. There are also many papers on the problem of distributed detection with constrained system resources [12, 13, 14, 15, 16, 17, 18]. More specifically, these papers have proposed solutions to optimal bit allocation under a communication constraint.

However, most of these results are based on the assumption that the local sensors’ detection performances, namely either the local sensors’ signal-to-noise ratio (SNR) or their probability of detection and false alarm rate, are known to the fusion center. For a dynamic target and passive sensors, it is very difficult to estimate local sensors’ performances via experiments because these performances are time varying as the target moves through the wireless sensor field. Even if the local sensors can somehow estimate their detection performances in real time, it will be very expensive to transmit them to the fusion center, especially for a WSN with very limited system resources. Usually a WSN consists of a large number of low-cost and low-power sensors, which are densely deployed in the surveillance area. Taking advantage of these unique characteristics of WSNs, in our previous paper [19], we proposed a fusion rule that uses the total number of detections (“1”s) transmitted from local sensors as the statistic.

In [19], we assumed that the total number of sensors in the region of interest (ROI) is known to the WSN. However, in many applications, the sensors are deployed randomly in and around the ROI, and oftentimes some of them are out of the communication range of the fusion center, malfunctioning, or out of battery. Therefore, at a particular time, the total number of sensors that work properly in the ROI is a random variable (RV). For example, in a battlefield or a hostile region, many microsensors can be deployed from an airplane to form a WSN. Data are transmitted from sensors to an access point, which could be an airplane that flies over the sensor field and collects data from the sensors. The total number of sensors within the network and the total number of sensors that can communicate with the access point (the flying airplane) at a particular time are RVs. In this paper, the results presented in [19] are extended to this more general situation. The performance of the fusion rule proposed in [19] will be analyzed with this extra uncertainty about the total number of sensors.

In Section 2, basic assumptions regarding the WSN are made, the signal attenuation model is provided, and the fusion rule based on the total number of detections from local sensors is introduced. In addition, it is shown that the proposed fusion rule can be adapted well to a large network with multiple-layer hierarchical structure. Analytical methods to determine the system-level detection performance are presented in Section 3. There, asymptotic detection performance is studied. In addition, the proposed fusion rule is compared to the likelihood-ratio (LR) based optimal fusion rule, which requires much more prior information. Simulation results are also provided to confirm our analyses. In Section 4, the problem of designing an optimum local sensor-level threshold is investigated, and the optimum thresholds for various system parameters are found numerically. Conclusions and discussion are provided in Section 5.

Figure 1: The signal power contour of a target located in a sensor field. (Axes: X and Y, each from −50 to 50; legend: sensors, target.)

2. SYSTEM MODEL AND DECISION FUSION RULE

2.1. Problem formulation

As shown in Figure 1, a total of $N$ sensors are randomly deployed in the ROI, which is a square with area $b^2$. $N$ is an RV that follows a Poisson distribution:

$$p(N) = \frac{\lambda^N e^{-\lambda}}{N!}, \quad N = 0, \ldots, \infty. \tag{1}$$

The locations of sensors are unknown to the WSN, but it is assumed that they are independent and identically distributed (i.i.d.) and follow a uniform distribution in the ROI:

$$f(x_i, y_i) = \begin{cases} \dfrac{1}{b^2}, & -\dfrac{b}{2} \le x_i, y_i \le \dfrac{b}{2}, \\ 0, & \text{otherwise} \end{cases} \tag{2}$$

for $i = 1, \ldots, N$, where $(x_i, y_i)$ are the coordinates of sensor $i$.

Noises at local sensors are i.i.d. and follow the standard Gaussian distribution with zero mean and unit variance:

$$n_i \sim \mathcal{N}(0, 1), \quad i = 1, \ldots, N. \tag{3}$$

For a local sensor $i$, the binary hypothesis testing problem is

$$\begin{aligned} H_1 &: s_i = a_i + n_i, \\ H_0 &: s_i = n_i, \end{aligned} \tag{4}$$

where $s_i$ is the received signal at sensor $i$, and $a_i$ is the amplitude of the signal that is emitted by the target and received at sensor $i$. We adopt the same isotropic signal power attenuation model as that presented in [19]:

$$a_i^2 = \frac{P_0}{1 + \alpha d_i^n}, \tag{5}$$

where $P_0$ is the signal power emitted by the target at distance zero, and $d_i$ is the distance between the target and local sensor $i$:

$$d_i = \sqrt{(x_i - x_t)^2 + (y_i - y_t)^2}, \tag{6}$$

and $(x_t, y_t)$ are the coordinates of the target. We further assume that the location of the target also follows a uniform distribution within the ROI. $n$ is the signal decay exponent and takes values between 2 and 3. $\alpha$ is an adjustable constant, and a larger $\alpha$ implies faster signal power decay. Note that the signal attenuation model can be easily extended to 3-dimensional problems. Our attenuation model is similar to that used in [20]. The difference is that in the denominator of (5), instead of $d_i^n$, we use $1 + \alpha d_i^n$. By doing so, our model is valid even if the distance $d_i$ is close to or equal to 0. When $d_i$ is large ($\alpha d_i^n \gg 1$), the difference between these two models is negligible.

In this paper, we do not specify the type of the passive sensors, and the power decay model adopted here is quite general. For example, in a radar or wireless communication system, for an isotropically radiated electromagnetic wave that is propagating in free space, the power is inversely proportional to the square of the distance from the transmitter [21, 22]. Similarly, when spherical acoustic waves radiated by a simple source are propagating through the air, the intensity of the waves will decay at a rate inversely proportional to the square of the distance [23].

Because the noise has unit variance, it is evident that the SNR at local sensor $i$ is

$$\mathrm{SNR}_i = a_i^2 = \frac{P_0}{1 + \alpha d_i^n}. \tag{7}$$

We define the SNR at distance zero as

$$\mathrm{SNR}_0 = 10 \log_{10} P_0. \tag{8}$$

Assuming that all the local sensors use the same threshold $\tau$ to make a decision and with the Gaussian noise assumption, we have the local sensor-level false alarm rate and probability of detection:

$$p_{fa} = \int_{\tau}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-t^2/2}\,dt = Q(\tau), \tag{9}$$

$$p_{d_i} = \int_{\tau}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-(t - a_i)^2/2}\,dt = Q(\tau - a_i), \tag{10}$$

where $Q(\cdot)$ is the complementary distribution function of the standard Gaussian, that is,

$$Q(x) = \int_{x}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-t^2/2}\,dt. \tag{11}$$
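The local operating point in (9) to (11) is easy to evaluate numerically. The following is a minimal Python sketch, not taken from the paper: the function names and the example parameter values are illustrative assumptions, and $Q(x)$ is computed through the standard identity $Q(x) = \tfrac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$.

```python
import math


def q_function(x: float) -> float:
    """Complementary CDF of the standard Gaussian, Q(x) in (11)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def amplitude(p0: float, alpha: float, n: float, d: float) -> float:
    """Signal amplitude a_i at distance d under the attenuation model (5)."""
    return math.sqrt(p0 / (1.0 + alpha * d ** n))


def local_pfa(tau: float) -> float:
    """Local sensor-level false alarm rate, as in (9)."""
    return q_function(tau)


def local_pd(tau: float, a_i: float) -> float:
    """Local sensor-level probability of detection, as in (10)."""
    return q_function(tau - a_i)


if __name__ == "__main__":
    # Illustrative values only (assumed here, not the paper's settings).
    p0, alpha, n, tau = 100.0, 0.02, 2.0, 1.0
    for d in (0.0, 10.0, 50.0):
        a = amplitude(p0, alpha, n, d)
        print(f"d={d:5.1f}  pfa={local_pfa(tau):.4f}  pd={local_pd(tau, a):.4f}")
```

Under these assumptions, the detection probability falls toward the false alarm rate as the distance grows, which is the behavior the attenuation model is meant to capture.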

We assume that the ROI is very large and the signal power decays very fast. Hence, the received signal power is significantly larger than zero only within a very small fraction of the ROI, namely the area surrounding the target. By ignoring the border effect of the ROI, we assume that the target is located at the center of the ROI, without any loss of generality. As a result, at a particular time, only a small subset of sensors can detect the target. To save communication and energy, a local sensor only transmits data (“1”s) to the fusion center when its signal exceeds the threshold $\tau$.

2.2. Decision fusion rule

We denote the binary data from local sensor $i$ as $I_i \in \{0, 1\}$ ($i = 1, \ldots, N$). $I_i$ takes the value 1 when there is a detection; otherwise, it takes the value 0.

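We know that the optimal decision fusion rule is the Chair-Varshney fusion rule [6], and it is a threshold test of the following statistic:

$$\begin{aligned} \Lambda_0 &= \sum_{i=1}^{N} \left[ I_i \log \frac{p_{d_i}}{p_{fa_i}} + (1 - I_i) \log \frac{1 - p_{d_i}}{1 - p_{fa_i}} \right] \\ &= \sum_{i=1}^{N} I_i \log \frac{p_{d_i}(1 - p_{fa_i})}{p_{fa_i}(1 - p_{d_i})} + \sum_{i=1}^{N} \log \frac{1 - p_{d_i}}{1 - p_{fa_i}}. \end{aligned} \tag{12}$$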

This fusion statistic is equivalent to a weighted summation of all the detections (“1”s) that a fusion center receives. The decision from a sensor with a better detection performance, namely higher $p_{d_i}$ and lower $p_{fa_i}$, gets a greater weight, which is given by $\log\left[ p_{d_i}(1 - p_{fa_i}) / \left( p_{fa_i}(1 - p_{d_i}) \right) \right]$.
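To make the weighting in (12) concrete, the sketch below evaluates the statistic and the per-sensor weight in Python. It is illustrative only: the function names and the numerical probabilities are assumptions, and all probabilities are taken to lie strictly between 0 and 1.

```python
import math


def chair_varshney_statistic(decisions, pd, pfa):
    """Evaluate the first form of (12) for binary decisions I_i."""
    total = 0.0
    for i_k, pd_k, pfa_k in zip(decisions, pd, pfa):
        total += i_k * math.log(pd_k / pfa_k)
        total += (1 - i_k) * math.log((1.0 - pd_k) / (1.0 - pfa_k))
    return total


def detection_weight(pd_k, pfa_k):
    """Weight attached to a '1' from one sensor, from the second form of (12)."""
    return math.log(pd_k * (1.0 - pfa_k) / (pfa_k * (1.0 - pd_k)))


if __name__ == "__main__":
    # Hypothetical local operating points for three sensors.
    pd = [0.9, 0.6, 0.7]
    pfa = [0.1, 0.1, 0.1]
    print(chair_varshney_statistic([1, 0, 1], pd, pfa))
    print([round(detection_weight(d, f), 3) for d, f in zip(pd, pfa)])
```

When every sensor shares the same $p_{d_i}$ and $p_{fa_i}$, all weights coincide and the statistic depends on the decisions only through their sum.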

As long as the threshold $\tau$ is known, the probability of false alarm at each sensor is known ($p_{fa_i} = p_{fa}$) from (9). However, at each sensor, it is very difficult to calculate $p_{d_i}$ since, according to (10), $p_{d_i}$ is determined by each sensor’s distance to the target and the amplitude of the target’s signal. To make matters worse, we do not even know the total number of sensors $N$ because the fusion center only receives data from those sensors whose received signals exceed the threshold $\tau$, as we have assumed in Section 2.1. An alternative scheme would be that each sensor transmits raw data $s_i$ to the fusion center, and the fusion center will make a decision based on these raw measurements. However, the transmission of raw data will be very expensive, especially for a typical WSN with very limited energy and bandwidth. It is desirable to transmit only binary data to the fusion center. Without knowledge of the $p_{d_i}$ values, the fusion center is forced to treat detections from every sensor equally. An intuitive choice is to use the total number of “1”s as a statistic since the information about which sensor reports a “1” is of little use to the fusion center. As proposed in [19], the system-level decision is made by first counting the number of detections made by local sensors and then comparing it with a threshold $T$:

$$\Lambda = \sum_{i=1}^{N} I_i \underset{H_0}{\overset{H_1}{\gtrless}} T, \tag{13}$$

where $I_i \in \{0, 1\}$ is the local decision made by sensor $i$. We also call this fusion rule the “counting rule.”

Figure 2: The signal power contour of a target located in a sensor field with nine cluster heads and their corresponding subregions. Points: sensors; triangles: cluster heads; star: target. (Axes: X and Y, each from −150 to 150.)
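The counting rule in (13) and the model of Section 2.1 can be exercised with a small Monte Carlo sketch. The Python code below is an illustration under stated assumptions rather than the authors' simulation: the helper names, the parameter values, and the Poisson sampler (counting unit-rate exponential arrivals in $[0, \lambda]$) are choices made here, the target is placed at the center of the ROI, and $H_1$ is declared when $\Lambda \ge T$, matching the event used in (14).

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw N ~ Poisson(lam) by counting unit-rate exponential arrivals in [0, lam]."""
    count = 0
    t = rng.expovariate(1.0)
    while t <= lam:
        count += 1
        t += rng.expovariate(1.0)
    return count


def counting_rule(decisions, threshold):
    """Fusion rule (13): declare H1 when the number of '1's reaches T."""
    return sum(decisions) >= threshold


def simulate_trial(lam, b, p0, alpha, n, tau, threshold, target_present, rng):
    """One Monte Carlo trial with the target assumed at the ROI center (0, 0)."""
    decisions = []
    for _ in range(sample_poisson(lam, rng)):
        x = rng.uniform(-b / 2.0, b / 2.0)
        y = rng.uniform(-b / 2.0, b / 2.0)
        d = math.hypot(x, y)
        a = math.sqrt(p0 / (1.0 + alpha * d ** n)) if target_present else 0.0
        s = a + rng.gauss(0.0, 1.0)  # received signal under H1 or H0, see (4)
        decisions.append(1 if s > tau else 0)
    return counting_rule(decisions, threshold)


if __name__ == "__main__":
    rng = random.Random(1)
    # Illustrative parameters only (assumed, not the paper's settings).
    args = dict(lam=1000, b=100.0, p0=100.0, alpha=0.02, n=2.0, tau=1.5, threshold=90)
    trials = 200
    pd = sum(simulate_trial(target_present=True, rng=rng, **args) for _ in range(trials)) / trials
    pfa = sum(simulate_trial(target_present=False, rng=rng, **args) for _ in range(trials)) / trials
    print(f"estimated Pd={pd:.3f}, Pfa={pfa:.3f}")
```

The fusion step needs only the count of received "1"s, so neither the sensor positions nor the individual $p_{d_i}$ values enter `counting_rule`.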

2.3. Hierarchical network structure

In this paper, we focus on the application aspect of the WSN. Routing protocols and network structures are beyond the scope of this paper. In Sections 2.1 and 2.2, a very simple network structure is implied. That is, all the sensors in the ROI report directly to the fusion center. However, our analysis results, which are based on this simple assumption and will be presented later, are quite general and can be applied to various scenarios and network structures. In this section, we give an example to show how the proposed approach can be adapted to complicated and practical applications.

Suppose that the sensor field is quite vast and the signal decays very fast as the distance from the target increases. As a result, only a tiny fraction of the sensors can detect the signals from the target, as illustrated in Figure 2. Most sensors’ measurements are just pure noises. Since the local decisions from these sensors do not convey much information about the target, it is neither very useful nor energy efficient to transmit them to the fusion center. When the sensor network is very large, there is also the issue of scalability. One reasonable solution is to use a three-layered hierarchical network structure, as shown in Figure 3. Sensors that are close to each other will form a cluster and each cluster has its own cluster head or cluster master, which serves as the local fusion center and is supposed to have more powerful computation and communication capabilities. Each cluster is in charge of the surveillance of a subregion of the whole ROI, as shown in Figure 2. Instead of transmitting data to a faraway central fusion center, sensors will send data to their corresponding cluster head. Based on data transmitted from sensors located within a specific cluster/subregion, the corresponding cluster head will make a decision about target presence/absence within that subregion. The decisions from cluster heads will be further transmitted to the fusion center to inform it if there is a target or event in specific subregions.

Figure 3: Three-layered hierarchical sensor network structure. (Top layer: fusion center; middle layer: cluster heads 1 to K; bottom layer: sensors 1 to N.)

The theoretical analysis provided later in this paper can be used to evaluate the detection performance at the cluster-head level, as long as the assumptions made in Section 2.1 are still valid within each cluster/subregion.

3. PERFORMANCE ANALYSIS

In this section, the system-level detection performance, namely the probability of false alarm $P_{fa}$ and probability of detection $P_d$ at the fusion center, will be derived, and the analytical results will be compared to simulation results.

3.1. System-level false alarm rate

At the fusion center level, the probability of false alarm $P_{fa}$ is

$$P_{fa} = \sum_{N=T}^{\infty} p(N) \Pr\{\Lambda \ge T \mid N, H_0\}. \tag{14}$$

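Obviously, for a given $N$, under hypothesis $H_0$, $\Lambda$ follows a binomial $(N, p_{fa})$ distribution. When $N$ is large enough, $\Pr\{\Lambda \ge T \mid N, H_0\}$ can be calculated by using the Laplace-De Moivre approximation [24]:

$$\Pr\{\Lambda \ge T \mid N, H_0\} = \sum_{i=T}^{N} \binom{N}{i} p_{fa}^{i} (1 - p_{fa})^{N-i} \approx Q\!\left( \frac{T - N p_{fa}}{\sqrt{N p_{fa}(1 - p_{fa})}} \right). \tag{15}$$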
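As a quick check on (15), the exact binomial tail can be compared with its Gaussian approximation. The Python sketch below is an illustration only: the function names and the test values of $N$, $p_{fa}$, and $T$ are assumptions, and the exact tail is summed in the log domain (via `math.lgamma`) to avoid overflow for large $N$.

```python
import math


def q_function(x):
    """Complementary CDF of the standard Gaussian."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def binomial_tail_exact(n_sensors, p, threshold):
    """Exact Pr{Lambda >= T | N, H0} for Lambda ~ Binomial(N, p), with 0 < p < 1."""
    total = 0.0
    for i in range(max(threshold, 0), n_sensors + 1):
        log_term = (math.lgamma(n_sensors + 1) - math.lgamma(i + 1)
                    - math.lgamma(n_sensors - i + 1)
                    + i * math.log(p) + (n_sensors - i) * math.log(1.0 - p))
        total += math.exp(log_term)
    return total


def binomial_tail_approx(n_sensors, p, threshold):
    """Gaussian approximation on the right-hand side of (15)."""
    mean = n_sensors * p
    std = math.sqrt(n_sensors * p * (1.0 - p))
    return q_function((threshold - mean) / std)


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise both expressions.
    for t in (100, 110, 120, 130):
        exact = binomial_tail_exact(1000, 0.1, t)
        approx = binomial_tail_approx(1000, 0.1, t)
        print(f"T={t}: exact={exact:.3e}, approx={approx:.3e}")
```

The approximation in (15) carries no continuity correction, so some discrepancy near the mean is expected even for large $N$.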

Figure 4: The probability mass function for Poisson distributions. (a) $\lambda = 1000$; (b) $\lambda = 10\,000$. (Axes: $N$ versus $p(N)$.)

It is well known that the kurtosis of a Poisson distribution is $3 + (1/\lambda)$. As $\lambda$ increases, the kurtosis of this Poisson distribution approaches that of a Gaussian distribution, and its distribution has a light tail. This can also be explained by the unique characteristic of the Poisson distribution. A Poisson RV with mean $\lambda$ can be deemed as the summation of $M$ i.i.d. Poisson RVs with mean $\lambda_0 = \lambda/M$. Therefore, a Poisson RV with a very large $\lambda$ is a summation of a very large number ($M$) of i.i.d. Poisson RVs with a constant mean $\lambda_0$, and its distribution approaches a Gaussian distribution, according to the central limit theorem (CLT). As a result, when $\lambda$ is large, the probability mass of $N$ will concentrate around the average value ($\lambda$). This phenomenon is illustrated in Figure 4, where the probability mass function of $N$ has been plotted for $\lambda = 1000$ and $\lambda = 10\,000$. Due to this characteristic of the Poisson distribution, using the fact that both the mean and variance of a Poisson RV are $\lambda$, we have the following approximation when $\lambda$ is large:

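$$\Pr\{\lambda - 6\sqrt{\lambda} \le N \le \lambda + 6\sqrt{\lambda}\} \simeq 1 \tag{16}$$

or

$$\sum_{N=N_1}^{N_3} \frac{e^{-\lambda}\lambda^{N}}{N!} \simeq 1, \tag{17}$$

where $N_1 = \lfloor \lambda - 6\sqrt{\lambda} \rfloor$ and $N_3 = \lceil \lambda + 6\sqrt{\lambda} \rceil$.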

Hence, for a large $\lambda$, a "typical" $N$ is also a large number. The probability that $N$ takes a small value is negligible. For example, when $\lambda = 1000$, $\Pr\{N < 810\} = 2.4 \times 10^{-10}$; when $\lambda = 10\,000$, $\Pr\{N < 9400\} = 6.6 \times 10^{-10}$. Therefore, when $\lambda$ is large enough, we have

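$$\begin{aligned}
P_{fa} &= \sum_{N=0}^{\infty} p(N) \sum_{i=T}^{N} \binom{N}{i} p_{fa}^{i} (1 - p_{fa})^{N-i} \\
&\simeq \sum_{N=N_2}^{N_3} \frac{\lambda^{N} e^{-\lambda}}{N!}\, Q\!\left(\frac{T - N p_{fa}}{\sqrt{N p_{fa}(1 - p_{fa})}}\right) \\
&= \sum_{N=N_2}^{N_3} \frac{\lambda^{N} e^{-\lambda}}{N!}\, Q\!\left(\frac{T - \mu_0}{\sigma_0}\right),
\end{aligned} \tag{18}$$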

where $N_2 = \max(T, N_1)$, $\mu_0 \triangleq N p_{fa}$, and $\sigma_0 \triangleq \sqrt{N p_{fa}(1 - p_{fa})}$. Note that for a large $N$, the Laplace-De Moivre approximation in (15) is valid, and this fact has been used in the derivation of (18). The significance of (17) also lies in the fact that the computation load in calculating $P_{fa}$ or $P_d$ (see (18) and (25)) is reduced significantly, since a summation of at most about $12\sqrt{\lambda}$ terms is sufficient, rather than a summation of an infinite number of terms.
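As a rough illustration of how (18) can be evaluated, the following Python sketch truncates the Poisson sum to the window $[N_2, N_3]$ and applies the Gaussian approximation (15) term by term. The function names and the example values $\lambda = 1000$ and $p_{fa} = 0.2$ are assumptions chosen for illustration; they are not taken from the paper.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = Pr{Z > x} for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def poisson_pmf(n, lam):
    """Poisson probability mass, evaluated in the log domain to avoid overflow."""
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))


def system_pfa(lam, p_fa, threshold):
    """Approximate system-level false alarm probability following (18)."""
    n1 = math.floor(lam - 6.0 * math.sqrt(lam))
    n3 = math.ceil(lam + 6.0 * math.sqrt(lam))
    n2 = max(threshold, n1, 1)  # N >= 1 keeps sigma_0 strictly positive
    total = 0.0
    for n in range(n2, n3 + 1):
        mu0 = n * p_fa
        sigma0 = math.sqrt(n * p_fa * (1.0 - p_fa))
        total += poisson_pmf(n, lam) * q_function((threshold - mu0) / sigma0)
    return total


if __name__ == "__main__":
    # Hypothetical operating points: lambda = 1000, p_fa = 0.2.
    for t in (200, 220, 240, 260):
        print(t, system_pfa(1000.0, 0.2, t))
```

For $\lambda = 1000$ the loop runs over fewer than 400 values of $N$, which is the saving that (17) makes possible.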

3.2. System-level probability of detection

Because of the nature of this problem, different local sensors will have different $p_{d_i}$, which is a function of $d_i$ as shown in (10). Therefore, under hypothesis $H_1$, the total number of detections ($\Lambda$) no longer follows a binomial distribution. It is very difficult to derive an analytical expression for the distribution of $\Lambda$. Instead, we will obtain $P_d$ either through approximation or through simulation. In [19], through approximation by using the CLT, we derived the system-level $P_d$ when the number of sensors $N$ is large:

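$$\Pr\{\Lambda \ge T \mid N, H_1\} \simeq Q\!\left(\frac{T - N p_d}{\sqrt{N \sigma^2}}\right), \tag{19}$$

where

$$p_d = \frac{2\pi}{b^2} \int_0^{b/2} C(r)\, r\, dr + \left(1 - \frac{\pi}{4}\right)\gamma, \tag{20}$$

$$\sigma^2 = \frac{2\pi}{b^2} \int_0^{b/2} \left[1 - C(r)\right] C(r)\, r\, dr + \left(1 - \frac{\pi}{4}\right)\gamma(1 - \gamma), \tag{21}$$

$$C(r) = Q\!\left(\tau - \sqrt{\frac{P_0}{1 + \alpha r^n}}\right), \tag{22}$$

$$\gamma = Q\!\left(\tau - \sqrt{\frac{P_0}{1 + \alpha\left(\sqrt{2}\,b/2\right)^n}}\right). \tag{23}$$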

Note that in [19], a different $\gamma$ is used:

$$\gamma' = Q(\tau) = p_{fa}. \tag{24}$$

The $\gamma$ used in this paper is slightly different from that used in [19], and it gives a more accurate approximation. But when the ROI is very large, meaning that $b$ is large, the difference is negligible. Interested readers can find the detailed derivations in [19]. Taking an average of (19) with respect to $N$, and similar to the derivation of (18), we have the system-level $P_d$ as

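$$\begin{aligned}
P_d &\simeq \sum_{N=N_2}^{N_3} \frac{\lambda^{N} e^{-\lambda}}{N!}\, Q\!\left(\frac{T - N p_d}{\sqrt{N \sigma^2}}\right) \\
&= \sum_{N=N_2}^{N_3} \frac{\lambda^{N} e^{-\lambda}}{N!}\, Q\!\left(\frac{T - \mu_1}{\sigma_1}\right),
\end{aligned} \tag{25}$$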


Figure 5: ROC curves ($P_d$ versus $P_{fa}$) obtained by analysis (approximation) and simulations. $\lambda = 1000$, $n = 2$, $b = 100$, $\alpha = 200$, and $\tau = 0.77, 0.73, 0.67$ for $P_0 = 1000, 500, 100$, respectively.

Figure 6: ROC curves ($P_d$ versus $P_{fa}$, logarithmic axes) obtained by analysis (approximation) and simulations. System parameters are the same as those listed in Figure 5.

where $\mu_1 \triangleq N p_d$, and $\sigma_1 \triangleq \sqrt{N \sigma^2}$. Again, we use the fact that for a large $\lambda$, a typical $N$ is large. Therefore, the Gaussian approximation in (19) by using the CLT is still valid.
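The chain (19)-(25) can be sketched numerically as follows. This is only an illustrative sketch: the integrals in (20) and (21) are evaluated with a simple midpoint rule, the function names and the number of integration steps are assumptions, and the Poisson sum is truncated to $[N_2, N_3]$ as in (18).

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def local_statistics(p0, alpha, n_exp, b, tau, steps=4000):
    """Evaluate p_d in (20) and sigma^2 in (21) with a midpoint rule (assumed)."""
    h = (b / 2.0) / steps
    mean_integral = 0.0
    var_integral = 0.0
    for k in range(steps):
        r = (k + 0.5) * h
        c = q_function(tau - math.sqrt(p0 / (1.0 + alpha * r ** n_exp)))  # (22)
        mean_integral += c * r * h
        var_integral += (1.0 - c) * c * r * h
    corner = math.sqrt(2.0) * b / 2.0
    gamma = q_function(tau - math.sqrt(p0 / (1.0 + alpha * corner ** n_exp)))  # (23)
    scale = 2.0 * math.pi / b ** 2
    p_d = scale * mean_integral + (1.0 - math.pi / 4.0) * gamma
    sigma2 = scale * var_integral + (1.0 - math.pi / 4.0) * gamma * (1.0 - gamma)
    return p_d, sigma2


def system_pd(lam, threshold, p_d, sigma2):
    """Approximate system-level detection probability following (25)."""
    n1 = math.floor(lam - 6.0 * math.sqrt(lam))
    n3 = math.ceil(lam + 6.0 * math.sqrt(lam))
    n2 = max(threshold, n1, 1)  # N >= 1 keeps the standard deviation positive
    total = 0.0
    for n in range(n2, n3 + 1):
        log_pmf = n * math.log(lam) - lam - math.lgamma(n + 1)
        z = (threshold - n * p_d) / math.sqrt(n * sigma2)
        total += math.exp(log_pmf) * q_function(z)
    return total


if __name__ == "__main__":
    # Parameters quoted in the caption of Figure 5 for P0 = 1000.
    p_d, sigma2 = local_statistics(1000.0, 200.0, 2, 100.0, 0.77)
    print(p_d, sigma2, system_pd(1000.0, 240, p_d, sigma2))
```

A useful sanity check on the sketch is that setting $P_0 = 0$ makes $C(r) = \gamma = Q(\tau)$, so (20) collapses to $p_d = p_{fa}$ and (21) to $\sigma^2 = p_{fa}(1 - p_{fa})$.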

3.3. Simulation results

The system-level $P_d$ and $P_{fa}$ can also be estimated by simulations. In Figures 5, 6, 7, and 8, the receiver operating characteristic (ROC) curves obtained by using the approximations in Sections 3.1 and 3.2 and those obtained by simulations are plotted for various system parameters. The simulation results in Figures 5 and 7 are based on $10^5$ Monte Carlo runs, and the simulation results in Figures 6 and 8 are obtained through $10^7$ Monte Carlo runs. From these figures, it is clear that the results obtained by approximations are very close to those obtained by simulations, even when the system-level $P_{fa}$ is very low (Figures 6 and 8).

Figure 7: ROC curves ($P_d$ versus $P_{fa}$) obtained by analysis (approximation) and simulations for $\lambda = 4000, 2000, 1000$. $P_0 = 500$, $n = 3$, $b = 100$, $\alpha = 40$, and $\tau = 0.90$.

Figure 8: ROC curves ($P_d$ versus $P_{fa}$, logarithmic axes) obtained by analysis (approximation) and simulations for $\lambda = 4000, 2000, 1000$. $P_0 = 500$, $n = 3$, $b = 100$, $\alpha = 40$, and $\tau = 0.90$.
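A Monte Carlo estimate of the system-level false alarm rate can be sketched as below. The sketch covers only hypothesis $H_0$, because that case needs nothing beyond $\lambda$ and $p_{fa}$; it is not the authors' simulation code, and the sampler, function names, seed, and example values are assumptions made for illustration.

```python
import random


def sample_poisson(lam, rng):
    """Draw a Poisson(lam) count by counting unit-rate exponential arrivals in [0, lam]."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed <= lam:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def simulate_pfa(lam, p_fa, threshold, runs, seed=0):
    """Monte Carlo estimate of Pr{Lambda >= T | H0} for the counting rule."""
    rng = random.Random(seed)
    false_alarms = 0
    for _ in range(runs):
        n_sensors = sample_poisson(lam, rng)
        # Under H0 each sensor independently reports a detection with probability p_fa.
        detections = sum(1 for _ in range(n_sensors) if rng.random() < p_fa)
        if detections >= threshold:
            false_alarms += 1
    return false_alarms / runs


if __name__ == "__main__":
    # Hypothetical operating point: lambda = 1000, p_fa = 0.2, T = 220.
    print(simulate_pfa(1000.0, 0.2, 220, runs=2000))
```

Resolving a false alarm rate near $10^{-5}$ in this way needs far more runs than the few thousand used here, which is consistent with the $10^7$ runs quoted for Figures 6 and 8.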

3.4. Asymptotic analysis

It is useful to analyze the system performance when the average number of sensors $\lambda$ is very large.

Figure 9: System-level probability of detection $P_d$ as a function of $\lambda$, for SNR$_0$ = 10, 20, and 30 dB. $n = 2$, $b = 100$, $\alpha = 200$, and $\tau = 0.5$.

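In (18), we know that

$$\max\left(T, \lfloor \lambda - 6\sqrt{\lambda} \rfloor\right) \le N \le \lceil \lambda + 6\sqrt{\lambda} \rceil. \tag{26}$$

Hence, as $\lambda \to \infty$, we have $N \to \lambda$ (in the sense that $N/\lambda \to 1$), if $T \le \lceil \lambda + 6\sqrt{\lambda} \rceil$. Assuming that the system-level threshold is in the form of $T = \beta\lambda$, we have

$$P_{fa} \simeq \sum_{N=N_2}^{N_3} \frac{\lambda^{N} e^{-\lambda}}{N!}\, Q\!\left(\frac{(\beta - p_{fa})\sqrt{\lambda}}{\sqrt{p_{fa}(1 - p_{fa})}}\right). \tag{27}$$

Similarly, from (25), we have

$$P_d \simeq \sum_{N=N_2}^{N_3} \frac{\lambda^{N} e^{-\lambda}}{N!}\, Q\!\left(\frac{(\beta - p_d)\sqrt{\lambda}}{\sqrt{\sigma^2}}\right). \tag{28}$$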

Therefore, when $\lambda \to \infty$, if $\beta < p_{fa}$, $P_{fa} = P_d = 1$; if $p_{fa} < \beta < p_d$, $P_{fa} = 0$ and $P_d = 1$; if $\beta > p_d$, $P_{fa} = P_d = 0$. As a result, as long as $\beta$ takes a value between $p_{fa}$ and $p_d$, as $\lambda \to \infty$, the WSN detection performance will be perfect with $P_d = 1$ and $P_{fa} = 0$. In Figures 9 and 10, $P_d$ and $P_{fa}$ as functions of $\lambda$ are plotted. It is clear that $P_d$ converges to 1 and $P_{fa}$ converges to 0 as $\lambda$ increases. In this example, we set $\beta$ such that $\beta = (p_{fa} + p_d)/2$. Another conclusion is that when $\lambda$ is large enough, even for a small SNR$_0$, the system can achieve a very good detection performance.
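The limiting behaviour can be seen by evaluating the $Q$-function factor shared by (27) and (28) for growing $\lambda$. In the sketch below, $p_{fa} = Q(0.5)$ follows from $\tau = 0.5$ as in Figures 9 and 10, whereas $p_d = 0.35$ and $\sigma^2 = 0.2$ are hypothetical placeholder values, and the function names are assumptions.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def limiting_term(beta, p, variance, lam):
    """Q((beta - p) * sqrt(lam) / sqrt(variance)), the factor in (27) and (28)."""
    return q_function((beta - p) * math.sqrt(lam) / math.sqrt(variance))


if __name__ == "__main__":
    p_fa = q_function(0.5)           # local false alarm rate for tau = 0.5
    p_d, sigma2 = 0.35, 0.2          # hypothetical values, for illustration only
    beta = 0.5 * (p_fa + p_d)        # threshold factor midway between p_fa and p_d
    for lam in (1e2, 1e3, 1e4, 1e5, 1e6):
        pfa_term = limiting_term(beta, p_fa, p_fa * (1.0 - p_fa), lam)
        pd_term = limiting_term(beta, p_d, sigma2, lam)
        print(int(lam), pfa_term, pd_term)
```

Because $\beta - p_{fa} > 0$ and $\beta - p_d < 0$, the two arguments diverge to $+\infty$ and $-\infty$ at rate $\sqrt{\lambda}$, which drives the first factor to 0 and the second to 1.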

3.5. Optimality of the decision fusion rule

The proposed decision fusion rule (the counting rule) is actually a threshold test in terms of the total number of detections made by local sensors, and it is intuitive. It is important to compare the performance of this fusion rule to that of the optimal decision fusion rule, which is also based on the total number of local detections from local sensors.

Figure 10: System-level false alarm rate $P_{fa}$ as a function of $\lambda$, for SNR$_0$ = 10, 20, and 30 dB. $n = 2$, $b = 100$, $\alpha = 200$, and $\tau = 0.5$.

As we know, $\Lambda$ in (13) is a lattice-type RV [24], which takes equidistant values from 0 to $N$. Hence, according to the CLT [24], for a large $N$, the probability $p_k = \Pr\{\Lambda = k \mid N\}$ is approximately equal to a sample of the Gaussian density:

$$\Pr\{\Lambda = k \mid N\} \simeq \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(k - \mu)^2/(2\sigma^2)} \quad (k = 0, \ldots, N). \tag{29}$$

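Therefore, under hypothesis $H_1$, for a large $\lambda$, we have

$$\begin{aligned}
\Pr\{\Lambda = k \mid H_1\} &= \sum_{N=0}^{\infty} p(N) \Pr\{\Lambda = k \mid N, H_1\} \\
&\simeq \sum_{N=0}^{\infty} \frac{\lambda^{N} e^{-\lambda}}{N!}\, \frac{1}{\sqrt{2\pi}\,\sigma_1(N)}\, e^{-[k - \mu_1(N)]^2/[2\sigma_1^2(N)]},
\end{aligned} \tag{30}$$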

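where $\mu_1(N) = N p_d$ and $\sigma_1(N) = \sqrt{N \sigma^2}$. Similarly, under hypothesis $H_0$, for a large $\lambda$, we have

$$\Pr\{\Lambda = k \mid H_0\} \simeq \sum_{N=0}^{\infty} \frac{\lambda^{N} e^{-\lambda}}{N!}\, \frac{1}{\sqrt{2\pi}\,\sigma_0(N)}\, e^{-[k - \mu_0(N)]^2/[2\sigma_0^2(N)]}, \tag{31}$$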

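where $\mu_0(N) = N p_{fa}$ and $\sigma_0(N) = \sqrt{N p_{fa}(1 - p_{fa})}$. Now it is easy to show that the likelihood ratio of $\Lambda$ is

$$L(\Lambda) = \frac{\Pr\{\Lambda \mid H_1\}}{\Pr\{\Lambda \mid H_0\}} \simeq \frac{\sum_{N=0}^{\infty} \dfrac{\lambda^{N}}{N!\,\sigma_1(N)}\, e^{-[\Lambda - \mu_1(N)]^2/[2\sigma_1^2(N)]}}{\sum_{N=0}^{\infty} \dfrac{\lambda^{N}}{N!\,\sigma_0(N)}\, e^{-[\Lambda - \mu_0(N)]^2/[2\sigma_0^2(N)]}}. \tag{32}$$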


Figure 11: Function $L(\Lambda)$ (logarithmic vertical axis). $\lambda = 1000$, $n = 2$, $b = 100$, $\alpha = 200$, $\tau = 0.66, 0.67, 0.73, 0.77, 0.82$ for $P_0 = 50, 100, 500, 1000, 2000$, respectively.

Hence, the optimal fusion rule at the fusion center is a likelihood ratio test:

$$L(\Lambda) \underset{H_0}{\overset{H_1}{\gtrless}} T_L. \tag{33}$$

Note that the implementation of the proposed counting rule for a Neyman-Pearson detector with a given system-level $P_{fa}$ requires only the knowledge of $\lambda$ and $\tau$ (or $p_{fa}$) in order to find the system-level threshold $T$ through (18). To choose an optimal local threshold $\tau$, as we will see later in this paper, the knowledge of $P_0$ is required too. However, the counting rule can still be implemented without an optimal $\tau$, and a good choice of $\tau$ based on some prior knowledge of $P_0$ can always be made. As a result, an exact knowledge of $P_0$ is not necessary for the implementation of the counting rule, even though it is needed in the evaluation of the system-level detection performance.

As for the implementation of the optimal fusion rule, we need to have the exact knowledge of $\alpha$, $P_0$, and $b$ to calculate $\sigma^2$ and $p_d$. Hence, the optimal fusion rule requires much more information, especially the knowledge of signal power $P_0$, which is unknown in most cases. Furthermore, because of its dependence on the exact knowledge of $P_0$, the optimal fusion rule is more sensitive to the estimation errors of $P_0$. Therefore, in this paper, the optimal fusion rule only has theoretical importance, and it is not very useful or robust in practical applications, where it is always difficult to estimate $P_0$.

As we can see from (32), $L(\Lambda)$ is a nonlinear transformation of $\Lambda$. The threshold tests of $\Lambda$ and $L(\Lambda)$ will have identical detection performances if $L(\Lambda)$ is a monotonically increasing transformation of $\Lambda$.

Figure 12: Function $L(\Lambda)$ (logarithmic vertical axis) for $\lambda = 1000, 2000, 3000, 4000, 5000$. $n = 3$, $P_0 = 500$, $b = 100$, $\alpha = 40$, $\tau = 0.90$.

In Figures 11 and 12, $L$ as a function of $\Lambda$ is plotted for different system parameters. As we can see, in all the cases, $L(\Lambda)$ is a monotonically increasing function of $\Lambda$, meaning that the counting rule and the optimal fusion rule are equivalent in terms of detection performance. In addition to the cases shown in Figures 11 and 12, we have extensively investigated the relationship between $L$ and $\Lambda$ for various system parameters. For all the system parameters we have studied, $L(\Lambda)$ is a monotonically increasing function of $\Lambda$.
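A monotonicity check of this kind can be sketched numerically. The code below evaluates the logarithm of (32), with the sums truncated to the window $[N_1, N_3]$ used in (18) (an assumption made here to keep the sums finite and to avoid the degenerate $N = 0$ term), and uses a log-sum-exp step for numerical stability. The values $p_{fa} = 0.22$, $p_d = 0.30$, and $\sigma^2 = 0.20$ are hypothetical, and the function names are illustrative.

```python
import math


def log_mixture(k, lam, mean_per_sensor, var_per_sensor):
    """Log of sum_N Poisson(N; lam) * Gaussian(k; N*mean, N*var) over N in [N1, N3]."""
    n1 = max(1, math.floor(lam - 6.0 * math.sqrt(lam)))
    n3 = math.ceil(lam + 6.0 * math.sqrt(lam))
    log_terms = []
    for n in range(n1, n3 + 1):
        mu = n * mean_per_sensor
        var = n * var_per_sensor
        log_pmf = n * math.log(lam) - lam - math.lgamma(n + 1)
        log_gauss = -0.5 * math.log(2.0 * math.pi * var) - (k - mu) ** 2 / (2.0 * var)
        log_terms.append(log_pmf + log_gauss)
    peak = max(log_terms)  # log-sum-exp: factor out the largest term
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


def log_likelihood_ratio(k, lam, p_fa, p_d, sigma2):
    """Logarithm of L(k) in (32): numerator under H1, denominator under H0."""
    return (log_mixture(k, lam, p_d, sigma2)
            - log_mixture(k, lam, p_fa, p_fa * (1.0 - p_fa)))


if __name__ == "__main__":
    # Hypothetical local statistics, for illustration only.
    for k in range(200, 321, 20):
        print(k, log_likelihood_ratio(k, 1000.0, 0.22, 0.30, 0.20))
```

Working with $\log L$ rather than $L$ does not change the conclusion, since the logarithm is itself monotonically increasing.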

In Figure 13, the ROC curves obtained by simulations (based on $10^6$ Monte Carlo runs) for the counting rule and the optimal fusion rule are shown. We can see that the ROC curves corresponding to the counting rule and those of the optimal fusion rule are indistinguishable.

4. THRESHOLD FOR LOCAL SENSORS

In addition to the ROC curve for performance comparison, one can also resort to the so-called deflection coefficient [25, 26], especially when the statistical properties of the signal and noise are limited to moments up to a given order. The deflection coefficient is defined as

$$D(\Lambda) = \frac{\left[E(\Lambda \mid H_1) - E(\Lambda \mid H_0)\right]^2}{\operatorname{Var}(\Lambda \mid H_0)}. \tag{34}$$

In the case of $\operatorname{Var}(\Lambda \mid H_1) = \operatorname{Var}(\Lambda \mid H_0)$, this is in essence the SNR of the detection statistic. It is worth noting that the use of the deflection criterion leads to the optimum LR receiver in many cases of practical importance [25]. For example, in the problem of detecting a Gaussian signal in Gaussian noise, an LR detector is obtained by maximizing the deflection measure. In the above sections, we have assumed that the threshold $\tau$ (or equivalently $p_{fa}$) is given. From (18), (20), (21), and (25), we know that both $P_{fa}$ and $P_d$ are functions of $\tau$. Hence, $\tau$ is a parameter that can be designed to achieve a better system-level performance. In this paper, we will find the optimum local sensor-level threshold $\tau$ by maximizing the deflection coefficient. The deflection coefficient for the detection problem in this paper is derived and stated in the following theorem.

Figure 13: ROC curves ($P_d$ versus $P_{fa}$, logarithmic axes) for the counting rule and the optimal fusion rule, for $P_0 = 1000, 500, 100$. System parameters are the same as those listed in Figure 5.

Figure 14: $D(\tau)$ for $\tau$ between $-3$ and 3. $\lambda = 1000$, $n = 2$, $b = 100$, $\alpha = 200$, SNR$_0$ = 30 dB (or $P_0 = 1000$).

Theorem 1. The deflection coefficient at the fusion center for the detection problem formulated in this paper is

$$D(\tau) = \frac{\lambda\left[p_d(\tau) - p_{fa}(\tau)\right]^2}{p_{fa}(\tau)}. \tag{35}$$

For the proof, see the appendix.

Figure 15: ROC curves ($P_d$ versus $P_{fa}$, logarithmic axes) for systems with different $\tau$ ($\tau = -0.23, 0.27, 0.77, 1.27, 1.77$). $\lambda = 1000$, $n = 2$, $b = 100$, $\alpha = 200$, SNR$_0$ = 30 dB (or $P_0 = 1000$).

The optimum $\tau$ can be found by maximizing $D(\tau)$ with respect to $\tau$. As we can see in Figure 14, there exists an optimal $\tau$ (0.7694) that maximizes the deflection coefficient $D$. By employing this optimum $\tau_{\mathrm{opt}}$, a significant improvement in $D$ can be achieved.
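One simple way to locate such a maximizer is a grid search over $\tau$, sketched below under stated assumptions: $p_d(\tau)$ is computed from (20), (22), and (23) with a midpoint rule, the search is restricted to the interval $[-3, 3]$ plotted in Figure 14, and the function names, step size, and number of integration steps are illustrative choices rather than the authors' procedure.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def average_pd(tau, p0, alpha, n_exp, b, steps=1000):
    """Average local detection probability p_d(tau) from (20), midpoint rule."""
    h = (b / 2.0) / steps
    integral = 0.0
    for k in range(steps):
        r = (k + 0.5) * h
        c = q_function(tau - math.sqrt(p0 / (1.0 + alpha * r ** n_exp)))  # (22)
        integral += c * r * h
    corner = math.sqrt(2.0) * b / 2.0
    gamma = q_function(tau - math.sqrt(p0 / (1.0 + alpha * corner ** n_exp)))  # (23)
    return 2.0 * math.pi / b ** 2 * integral + (1.0 - math.pi / 4.0) * gamma


def deflection(tau, lam, p0, alpha, n_exp, b):
    """Deflection coefficient D(tau) from (35)."""
    p_fa = q_function(tau)
    p_d = average_pd(tau, p0, alpha, n_exp, b)
    return lam * (p_d - p_fa) ** 2 / p_fa


def best_threshold(lam, p0, alpha, n_exp, b, lo=-3.0, hi=3.0, step=0.01):
    """Coarse grid search for the tau in [lo, hi] that maximizes D(tau)."""
    count = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(count + 1)]
    return max(grid, key=lambda tau: deflection(tau, lam, p0, alpha, n_exp, b))


if __name__ == "__main__":
    # Parameters from the caption of Figure 14.
    tau_opt = best_threshold(1000.0, 1000.0, 200.0, 2, 100.0)
    print(tau_opt, deflection(tau_opt, 1000.0, 1000.0, 200.0, 2, 100.0))
```

A grid search is used here only because it is easy to verify; any one-dimensional optimizer applied to $D(\tau)$ on the same interval would serve equally well.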

The system-level ROC curves for different $\tau$ are plotted in Figure 15. As we can see, the ROC curve corresponding to the optimal threshold $\tau_{\mathrm{opt}}$ (0.77) is above those for other thresholds, meaning that $\tau_{\mathrm{opt}}$ provides the best system-level performance. In Figures 16 and 17, $\tau_{\mathrm{opt}}$ and the corresponding optimal $p_{fa}$ as functions of SNR$_0$ and $\alpha$ are shown. It is clear that $\tau_{\mathrm{opt}}$ is a monotonically increasing function of SNR$_0$ and a monotonically decreasing function of $\alpha$. This is because with a strong target signal (high SNR$_0$ and low $\alpha$), by adopting a higher threshold, local sensors lower their false alarm rate, while at the same time they can still attain a relatively high probability of detection.

5. CONCLUSIONS

We have proposed and studied a decision fusion rule that is based on the total number of detections reported by local sensors for a WSN with a random number of sensors. Assuming that the number of sensors in a ROI follows a Poisson distribution, we have derived the system-level detection performance measures, namely the probabilities of detection and false alarm. We have shown that even at very low SNR, this fusion rule can achieve a very good system-level detection performance given that there are, on average, a sufficiently large number of sensors deployed in the ROI. The average number of sensors needed for a prespecified system-level performance can be calculated based on our analytical expressions. Another important result is that the proposed fusion rule is equivalent to the optimal fusion rule, which requires much more prior knowledge of the system parameters, for all the different system parameters we have investigated.

We have also shown that a better system performance can be achieved if we choose an optimum threshold at the local sensors by maximizing the deflection coefficient. If SNR$_0$ is high and $\alpha$ is small, a higher local sensor-level threshold $\tau$ should be chosen; otherwise, a lower $\tau$ should be employed to achieve a better performance.

Figure 16: Optimal $\tau_{\mathrm{opt}}$ (a) and the corresponding optimal $p_{fa}$ (b) as functions of SNR$_0$ (dB). $\lambda = 1000$, $n = 2$, $b = 100$, $\alpha = 200$.

Figure 17: Optimal $\tau_{\mathrm{opt}}$ (a) and the corresponding optimal $p_{fa}$ (b) as functions of $\alpha$. $\lambda = 1000$, $n = 2$, $b = 100$, SNR$_0$ = 30 dB.

APPENDIX

PROOF OF THEOREM 1

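Under hypothesis $H_0$, we have

$$E(\Lambda \mid N, H_0) = N p_{fa}, \tag{A.1}$$

$$\operatorname{Var}(\Lambda \mid N, H_0) = N p_{fa}(1 - p_{fa}). \tag{A.2}$$

Hence,

$$\begin{aligned}
E(\Lambda^2 \mid N, H_0) &= \operatorname{Var}(\Lambda \mid N, H_0) + E(\Lambda \mid N, H_0)^2 \\
&= N p_{fa}(1 - p_{fa}) + N^2 p_{fa}^2.
\end{aligned} \tag{A.3}$$

Since $N$ is a Poisson RV, we have

$$E[N] = \lambda, \tag{A.4}$$

$$\operatorname{Var}[N] = \lambda, \tag{A.5}$$

$$E(N^2) = \operatorname{Var}(N) + [E(N)]^2 = \lambda + \lambda^2. \tag{A.6}$$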

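With (A.1) and (A.4), $E(\Lambda \mid H_0)$ can be derived as follows:

$$E(\Lambda \mid H_0) = E[E(\Lambda \mid N, H_0)] = E[N p_{fa}] = \lambda p_{fa}. \tag{A.7}$$

Given (A.3), (A.4), and (A.6), it is easy to show that

$$\begin{aligned}
E(\Lambda^2 \mid H_0) &= E[E(\Lambda^2 \mid N, H_0)] \\
&= E[N p_{fa}(1 - p_{fa}) + N^2 p_{fa}^2] \\
&= \lambda p_{fa}(1 - p_{fa}) + (\lambda + \lambda^2) p_{fa}^2 \\
&= \lambda p_{fa}(1 + \lambda p_{fa}).
\end{aligned} \tag{A.8}$$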

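Therefore,

$$\begin{aligned}
\operatorname{Var}(\Lambda \mid H_0) &= E[\Lambda^2 \mid H_0] - E[\Lambda \mid H_0]^2 \\
&= \lambda p_{fa}(1 + \lambda p_{fa}) - \lambda^2 p_{fa}^2 \\
&= \lambda p_{fa}.
\end{aligned} \tag{A.9}$$

Under hypothesis $H_1$, according to [19], we have

$$E(\Lambda \mid N, H_1) = N p_d. \tag{A.10}$$

Hence,

$$E[\Lambda \mid H_1] = E[E(\Lambda \mid N, H_1)] = E[N p_d] = \lambda p_d. \tag{A.11}$$

By substituting (A.7), (A.9), and (A.11) into (34), we finally get the deflection coefficient

$$D(\tau) = \frac{\lambda\left[p_d(\tau) - p_{fa}(\tau)\right]^2}{p_{fa}(\tau)}. \tag{A.12}$$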
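The moment calculations in (A.7)-(A.9) can be checked numerically by summing the conditional moments (A.1) and (A.3) against the Poisson mass function. The following sketch does this with a truncated sum; the truncation point, function names, and the test values $\lambda = 1000$, $p_{fa} = 0.2$, $p_d = 0.25$ are assumptions made for illustration.

```python
import math


def null_moments(lam, p_fa):
    """Mean and variance of Lambda under H0 via the law of total expectation."""
    n_max = math.ceil(lam + 12.0 * math.sqrt(lam)) + 20  # truncation point (assumed)
    mean = 0.0
    second = 0.0
    for n in range(n_max + 1):
        pmf = math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))
        mean += pmf * n * p_fa                                        # uses (A.1)
        second += pmf * (n * p_fa * (1.0 - p_fa) + (n * p_fa) ** 2)   # uses (A.3)
    return mean, second - mean ** 2


def deflection_from_moments(lam, p_fa, p_d):
    """Deflection (34) assembled from the numerically computed H0 moments."""
    mean0, var0 = null_moments(lam, p_fa)
    mean1 = lam * p_d  # (A.11)
    return (mean1 - mean0) ** 2 / var0


if __name__ == "__main__":
    print(null_moments(1000.0, 0.2))                  # both close to lambda * p_fa
    print(deflection_from_moments(1000.0, 0.2, 0.25))  # compare with (A.12)
```

Both moments come out close to $\lambda p_{fa}$, in agreement with (A.7) and (A.9), so the assembled deflection matches the closed form (A.12).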


REFERENCES

[1] C. Intanagonwiwat, R. Govindan, D. Estrin, J. Heidemann, and F. Silva, “Directed diffusion for wireless sensor networking,” IEEE/ACM Trans. Networking, vol. 11, no. 1, pp. 2–16, 2003.

[2] H. Gharavi and K. Ban, “Multihop sensor network design for wide-band communications,” Proc. IEEE, vol. 91, no. 8, pp. 1221–1234, 2003.

[3] T. J. Kwon, M. Gerla, V. K. Varma, M. Barton, and T. R. Hsing, “Efficient flooding with passive clustering: an overhead-free selective forward mechanism for ad hoc/sensor networks,” Proc. IEEE, vol. 91, no. 8, pp. 1210–1220, 2003.

[4] S. Kumar, F. Zhao, and D. Shepherd, “Collaborative signal and information processing in microsensor networks,” IEEE Signal Processing Mag., vol. 19, no. 2, pp. 13–14, 2002.

[5] H. Gharavi and S. P. Kumar, “Special issue on sensor networks and applications,” Proc. IEEE, vol. 91, no. 8, pp. 1151–1153, 2003.

[6] Z. Chair and P. K. Varshney, “Optimal data fusion in multiple sensor detection systems,” IEEE Trans. Aerosp. Electron. Syst., vol. 22, no. 1, pp. 98–101, 1986.

[7] P. K. Varshney, Distributed Detection and Data Fusion, Springer, New York, NY, USA, 1997.

[8] P. K. Willett, P. F. Swaszek, and R. S. Blum, “The good, bad and ugly: distributed detection of a known signal in dependent Gaussian noise,” IEEE Trans. Signal Processing, vol. 48, no. 12, pp. 3266–3279, 2000.

[9] E. Drakopoulos and C.-C. Lee, “Optimum multisensor fusion of correlated local decisions,” IEEE Trans. Aerosp. Electron. Syst., vol. 27, no. 4, pp. 593–606, 1991.

[10] M. Kam, W. Chang, and Q. Zhu, “Hardware complexity of binary distributed detection systems with isolated local Bayesian detectors,” IEEE Trans. Syst., Man, Cybern., vol. 21, no. 3, pp. 565–571, 1991.

[11] M. Kam, Q. Zhu, and W. S. Gray, “Optimal data fusion of correlated local decisions in multiple sensor detection systems,” IEEE Trans. Aerosp. Electron. Syst., vol. 28, no. 3, pp. 916–920, 1992.

[12] C. Rago, P. K. Willett, and Y. Bar-Shalom, “Censoring sensors: a low-communication-rate scheme for distributed detection,” IEEE Trans. Aerosp. Electron. Syst., vol. 32, no. 2, pp. 554–568, 1996.

[13] F. Gini, F. Lombardini, and L. Verrazzani, “Decentralised detection strategies under communication constraints,” IEE Proceedings - Radar, Sonar and Navigation, vol. 145, no. 4, pp. 199–208, 1998.

[14] C.-T. Yu and P. K. Varshney, “Paradigm for distributed detection under communication constraints,” Optical Engineering, vol. 37, no. 2, pp. 417–426, 1998.

[15] C.-T. Yu and P. K. Varshney, “Bit allocation for discrete signal detection,” IEEE Trans. Commun., vol. 46, no. 2, pp. 173–175, 1998.

[16] T. Kasetkasem and P. K. Varshney, “Communication structure planning for multisensor detection systems,” IEE Proceedings - Radar, Sonar and Navigation, vol. 148, no. 1, pp. 2–8, 2001.

[17] J. Hu and R. S. Blum, “On the optimality of finite-level quantizations for distributed signal detection,” IEEE Trans. Inform. Theory, vol. 47, no. 4, pp. 1665–1671, 2001.

[18] J.-F. Chamberland and V. V. Veeravalli, “Decentralized detection in sensor networks,” IEEE Trans. Signal Processing, vol. 51, no. 2, pp. 407–416, 2003.

[19] R. Niu, P. K. Varshney, M. H. Moore, and D. Klamer, “Decision fusion in a wireless sensor network with a large number of sensors,” in Proc. 7th IEEE International Conference on Information Fusion (ICIF ’04), Stockholm, Sweden, June–July 2004.

[20] D. Li, K. D. Wong, Y. H. Hu, and A. M. Sayeed, “Detection, classification, and tracking of targets,” IEEE Signal Processing Mag., vol. 19, no. 2, pp. 17–29, 2002.

[21] N. Levanon, Radar Principles, John Wiley & Sons, New York, NY, USA, 1988.

[22] T. Rappaport, Wireless Communications: Principles and Practice, Prentice-Hall, Upper Saddle River, NJ, USA, 1996.

[23] L. E. Kinsler and A. R. Frey, Fundamentals of Acoustics, John Wiley & Sons, New York, NY, USA, 1962.

[24] A. Papoulis, Probability, Random Variables, and Stochastic Processes, McGraw-Hill, New York, NY, USA, 1984.

[25] B. Picinbono, “On deflection as a performance criterion in detection,” IEEE Trans. Aerosp. Electron. Syst., vol. 31, no. 3, pp. 1072–1081, 1995.

[26] S. M. Kay, Fundamentals of Statistical Signal Processing II: Detection Theory, Prentice-Hall, Englewood Cliffs, NJ, USA, 1998.

Ruixin Niu received his B.S. degree from Xi’an Jiaotong University, Xi’an, China, in 1994, his M.S. degree from the Institute of Electronics, Chinese Academy of Sciences, Beijing, in 1997, and his Ph.D. degree from the University of Connecticut, Storrs, in 2001, all in electrical engineering. He is currently a Postdoctoral Research Associate with Syracuse University, Syracuse, NY. His research interests are in the areas of statistical signal processing and its applications, including detection, estimation, data fusion, communications, and image processing. He received the Fusion 2004 Best Paper Award in the Seventh International Conference on Information Fusion, Stockholm, Sweden, in June 2004.

Pramod K. Varshney was born in Allahabad, India, on July 1, 1952. He received the B.S. degree in electrical engineering and computer science (with highest honors), and the M.S. and Ph.D. degrees in electrical engineering from the University of Illinois at Urbana-Champaign in 1972, 1974, and 1976, respectively. Since 1976, he has been with Syracuse University, Syracuse, NY, where he is currently a Professor of electrical engineering and computer science. His current research interests are in distributed sensor networks and data fusion, detection and estimation theory, wireless communications, image processing, radar signal processing, and remote sensing. He has published extensively. He has served as a Consultant to several major companies. He is the recipient of the 1981 ASEE Dow Outstanding Young Faculty Award. He was elected to the grade of Fellow of the IEEE in 1997 for his contributions in the area of distributed detection and data fusion. In 2000, he received the Third Millennium Medal from the IEEE and the Chancellor’s Citation for exceptional academic achievement at Syracuse University. He serves as a Distinguished Lecturer for the AES Society of the IEEE. He was the President of the International Society of Information Fusion during 2001.

EURASIP Journal on Wireless Communications and Networking 2005:4, 473–482
© 2005 Alexey Krasnopeev et al.

Minimum Energy Decentralized Estimation in a Wireless Sensor Network with Correlated Sensor Noises

Alexey Krasnopeev
Department of Electrical and Computer Engineering, University of Minnesota, 200 Union Street SE, Minneapolis, MN 55455, USA
Email: [email protected]

Jin-Jun Xiao
Department of Electrical and Computer Engineering, University of Minnesota, 200 Union Street SE, Minneapolis, MN 55455, USA
Email: [email protected]

Zhi-Quan Luo
Department of Electrical and Computer Engineering, University of Minnesota, 200 Union Street SE, Minneapolis, MN 55455, USA
Email: [email protected]

Received 25 November 2004; Revised 20 May 2005

Consider the problem of estimating an unknown parameter by a sensor network with a fusion center (FC). Sensor observations are corrupted by additive noises with an arbitrary spatial correlation. Due to bandwidth and energy limitations, each sensor is only able to transmit a finite number of bits to the FC, while the latter must combine the received bits to estimate the unknown parameter. We require the decentralized estimator to have a mean-squared error (MSE) that is within a constant factor of that of the best linear unbiased estimator (BLUE). We minimize the total sensor transmitted energy by selecting sensor quantization levels using the knowledge of the noise covariance matrix while meeting the target MSE requirement. Computer simulations show that our designs can achieve energy savings of up to 70% when compared to the uniform quantization strategy, whereby each sensor generates the same number of bits, irrespective of the quality of its observation and the condition of its channel to the FC.

Keywords and phrases: wireless sensor networks, decentralized estimation, power control, energy efficiency.

1. INTRODUCTION

Wireless sensor networks (WSNs) are ideal for environmental monitoring applications because of their low implementation cost, agility, and robustness to sensor failures. A popular WSN architecture consists of a fusion center (FC) and a large number of spatially distributed sensors. The FC can be either a standard base station or a mobile access point such as an unmanned aerial vehicle hovering over the sensor field. Each sensor in a WSN is responsible for local data collection as well as occasional transmission of a summary of its observations to the FC via a wireless link. In a practical WSN, each sensor has only limited computation and communication capabilities due to various design considerations such as small battery size, bandwidth, and cost. As a result, it is difficult for sensors to send their entire real-valued observations to the FC. Instead, a more practical decentralized estimation scheme is to let each sensor quantize its real-valued local measurement to an appropriate length and send the resulting discrete message (typically short) to the FC, while the latter combines all the received messages to produce a final estimate of the unknown parameter. Naturally, the message lengths are dictated by the power and bandwidth limitations, the sensor noise characteristics, and the desired final estimation accuracy.

Recently, several decentralized estimation schemes (DES) [1, 2, 3, 4] have been proposed for parameter estimation in the presence of additive sensor noise. These DESs require each sensor to send only a few bits to the fusion center, with the message length determined by the sensor’s local SNR. The performance of the resulting estimator is shown to be within a constant factor of the best linear unbiased estimator (BLUE) performance. While the designs suggested by [1, 2, 3, 4] give a guaranteed estimation performance with a low bandwidth requirement, the effect of wireless channel distortion and the important issue of total sensor energy minimization were not directly modelled.





In a practical WSN, the wireless links from sensors to the FC may have different qualities, depending on the sensor locations relative to the FC. Intuitively, the local message length should depend not only on the quality of the sensor’s observation (i.e., local SNR), but also on the quality of its wireless link to the FC. In particular, even if a sensor has a high-quality observation, it should not perform any local quantization or transmission when its wireless link to the FC is weak, in order to conserve sensor energy. In general, minimizing the total sensor energy consumption for a decentralized estimation task is essential to ensure a long lifespan of a WSN. Motivated by these considerations, the authors of [5, 6] proposed optimal coded and uncoded transmission strategies for sensor networks which can minimize the required energy per transmitted bit, although no consideration was given to the quantization effect and the accuracy of the final estimation. In the recent work of [7, 8], the authors considered the problem of optimal energy scheduling for decentralized estimation where sensor measurements are corrupted by additive noises, while communication links from sensors to the fusion center differ in quality. In particular, [7] used an adaptive modulation scheme with an exponential dependence of energy on the transmitted message size, and then derived optimal sensor power and quantization levels via convex optimization.

The aforementioned results all require the important assumption that sensor observation noises are spatially uncorrelated. Unfortunately, this assumption can be restrictive in a practical WSN, especially when sensors are densely deployed. In this paper, we consider distributed parameter estimation in situations where sensor observations are corrupted by correlated additive noises. Assuming a standard energy model [5, 6], uniform quantization at sensors, and the knowledge of the sensor noise correlation matrix, we use convex optimization techniques to derive a nearly-optimal (modulo a minor relaxation) energy scheduling strategy with a mean-squared error performance guaranteed to be within a constant factor of that of the centralized BLUE estimator. Computer simulations show that our designs can achieve energy savings of up to 70% when compared to the uniform bit allocation strategy, whereby each sensor generates the same number of bits.

Our sensor energy scheduling strategy is suitable for direct application when the sensor noise correlation matrix is available at the FC. In practice, the sensor noise correlation matrix may have to be determined in the sensor network calibration phase, possibly with the help of training signals. In the absence of this knowledge, our scheme is also useful as it provides an upper bound on the performance of all other energy scheduling schemes, both centralized and distributed. In fact, our scheme gives an estimate of the amount of energy “wasted” due to the lack of sensor noise correlation knowledge. The power schedules generated by our design also give insight into the design of distributed energy scheduling algorithms.

Our paper is organized as follows. In Section 2 we describe the DES and formulate the total energy minimization problem. In Section 3 we present a convex relaxation of the energy minimization problem and give a nearly-optimal solution in closed form. The performance of our energy-efficient design is analyzed in Section 4 by numerical simulation. Section 5 contains an extension of the work where we formulate an alternative problem of minimizing the maximal individual sensor energy and present an analytic solution. Final remarks are given in Section 6.

Figure 1: Decentralized estimation scheme (sensors $S_1, \ldots, S_N$ observe $\theta$ as $x_1, \ldots, x_N$ and send messages $m_1, \ldots, m_N$ over distances $d_1, \ldots, d_N$ to the FC, which outputs an estimate of $\theta$).

Throughout, we use the following notations. Matrices and vectors are denoted by boldface letters, capital and small correspondingly, whereas the same regular letters with indices denote their elements. A diagonal matrix with nonzero elements $a_1, \ldots, a_N$ is denoted by $\operatorname{diag}(a_1, \ldots, a_N)$. Logarithms denoted by $\log(\cdot)$ are taken to the base 2; for natural logarithms the notation $\ln(\cdot)$ is used. For any real number $x \in \mathbb{R}$, we use $\lceil x \rceil$ to denote the smallest integer greater than or equal to $x$. For any random variable $R$, we use $E_x R$ to denote the expected value of $R$ taken with respect to random variable $x$, while $E_{x|y} R$ denotes the expected value of $R$ with respect to $x$ given $y$. Finally, $\operatorname{var} R$ denotes the variance of random variable $R$.

2. PROBLEM FORMULATION

Consider the problem of estimating an unknown parameter $\theta$ by a sensor network consisting of $N$ sensors. The measurement $x_i$ of each sensor is corrupted by additive noise $n_i$ so that

$$x_i = \theta + n_i, \quad i = 1, \ldots, N. \tag{1}$$

We assume that both $\theta$ and $n_i$ have finite range, so that all $x_i$ belong to a common finite interval $[-U, U]$, with $U > 0$ a known constant. The noises $n_i$ are assumed to be zero mean and correlated across sensors with covariance matrix $\mathbf{C}$, but otherwise unknown. We assume $\mathbf{C}$ is known at the FC. Measurements $x_i$ are quantized to produce messages $m_i$ to be passed on to the fusion center; the latter then combines the received messages in order to estimate $\theta$, see Figure 1. The exact form of $m_i$ will be detailed later.

We assume that each sensor sends messages to the FC using a separate channel. This can be achieved by using a multiple access technique such as TDMA or FDMA. Each channel is corrupted by additive white Gaussian noise (AWGN) with power spectral density $N_0/2$:

$$\tilde{m}_i = d_i^{-\kappa/2} m_i + v_i, \tag{2}$$


where $\tilde{m}_i$ is the received message at the FC and $v_i$ is the AWGN. The signal power received at the FC is assumed to be inversely proportional to $d_i^{\kappa}$, where $d_i$ is the distance between sensor $i$ and the FC, and $\kappa$ is the path loss exponent. Suppose that message $m_i$ has length $b_i$ bits. We will assume that the energy $W_i$ required for transmission of $m_i$ is proportional to the number of bits in the message. This is the case, for example, if sensors use M-QAM or M-PSK modulation to transmit messages. If M-QAM is used, $W_i$ can be found as follows [5, 6]:

$$W_i = \frac{2}{3} N_f N_0 G_0 d_i^{\kappa} \left(2^s - 1\right) \ln\!\left(\frac{4\left(1 - 2^{-s}\right)}{s P_b}\right) \frac{b_i}{s} \equiv w_i b_i, \tag{3}$$

where $s = \log M$ is the number of bits per symbol, $N_f$ is the receiver noise figure, $P_b$ is the required bit error probability, and $G_0$ is the system constant defined as in [5].

2.1. Quantization strategy

Suppose that sensor observation $x_i$ is bounded to a finite interval $[-U, U]$. Suppose further that we wish to quantize $x_i$ in such a way that the resulting message $m_i$ has length $b_i$ bits, where $b_i$ is to be determined later. We therefore have $K_i = 2^{b_i}$ quantization points $\{a_j^{(i)} \in [-U, U],\ j = 1, \ldots, K_i\}$. These points are uniformly spaced so that $a_1^{(i)} = -U < a_2^{(i)} < \cdots < a_{K_i}^{(i)} = U$ and $a_{k+1}^{(i)} - a_k^{(i)} = \Delta_i$ for every $k$. Since the points $\{a_j^{(i)}\}$ divide the observation range into $K_i - 1$ intervals, it follows that $\Delta_i = 2U/(K_i - 1)$. Quantization is done in the following probabilistic manner. Suppose that $x_i \in [a_k^{(i)}, a_{k+1}^{(i)})$. Then $x_i$ is quantized to either $a_{k+1}^{(i)}$ or $a_k^{(i)}$ according to

$$P\left(m_i = a_{k+1}^{(i)}\right) = \frac{x_i - a_k^{(i)}}{\Delta_i}, \qquad P\left(m_i = a_k^{(i)}\right) = \frac{a_{k+1}^{(i)} - x_i}{\Delta_i}. \tag{4}$$

This probabilistic quantization produces a message $m_i$ whose expected value equals the observation itself:

$$\begin{aligned}
E_{p_i} m_i &= a_{k+1}^{(i)} \Pr\left(m_i = a_{k+1}^{(i)}\right) + a_k^{(i)} \Pr\left(m_i = a_k^{(i)}\right) \\
&= a_{k+1}^{(i)} \frac{x_i - a_k^{(i)}}{\Delta_i} + a_k^{(i)} \frac{a_{k+1}^{(i)} - x_i}{\Delta_i} \\
&= x_i \frac{a_{k+1}^{(i)} - a_k^{(i)}}{\Delta_i} = x_i,
\end{aligned} \tag{5}$$

where the expectation $E_{p_i}$ is taken with respect to the probabilistic quantization noise model (4).

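Next, we consider any fixed observation value of $x_i$ and bound the variance $\mathrm{var}\, m_i$ (taken with respect to the quantization noise) as follows. Suppose $x_i$ falls in the interval $[a_k^{(i)}, a_{k+1}^{(i)})$. We denote $r = a_{k+1}^{(i)} - x_i$ and $p_i = (x_i - a_k^{(i)})/\Delta_i \in [0, 1]$, so that $r = \Delta_i (1 - p_i)$. Then, we have

$$\begin{aligned}
\mathrm{var}\, m_i &= E_{p_i}\left(m_i - x_i\right)^2 \\
&= \left(\Delta_i - r\right)^2 \left(1 - p_i\right) + r^2 p_i \\
&= \Delta_i^2 \left( \left(1 - p_i\right)^2 p_i + p_i^2 \left(1 - p_i\right) \right) \\
&= \Delta_i^2\, p_i \left(1 - p_i\right).
\end{aligned} \tag{6}$$

Thus, the maximum variance of $m_i$ is equal to $\Delta_i^2/4$ and is achieved when the observation $x_i$ falls in the middle of the quantization interval $[a_k^{(i)}, a_{k+1}^{(i)})$.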
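The quantizer of (4) and the bound following (6) can be checked numerically. The following is a minimal Python sketch, not the authors' implementation: the function names `quantize` and `empirical_mean_var` are hypothetical, and the sketch assumes at least one bit per message so that $K_i - 1 \ge 1$.

```python
import random


def quantize(x, U, bits, rng=random):
    """Probabilistically quantize x in [-U, U] to one of 2**bits uniform points."""
    if bits < 1:
        raise ValueError("this sketch assumes at least 1 bit, so that K - 1 >= 1")
    K = 2 ** bits
    delta = 2.0 * U / (K - 1)               # spacing between adjacent points
    k = min(int((x + U) / delta), K - 2)    # index of the lower point a_k
    a_lo = -U + k * delta
    p_up = (x - a_lo) / delta               # P(m = a_{k+1}), as in (4)
    return a_lo + delta if rng.random() < p_up else a_lo


def empirical_mean_var(x, U, bits, trials=50_000, seed=0):
    """Monte Carlo estimate of E[m] and E[(m - x)^2] for a fixed observation x."""
    rng = random.Random(seed)
    samples = [quantize(x, U, bits, rng) for _ in range(trials)]
    mean = sum(samples) / trials
    var = sum((m - x) ** 2 for m in samples) / trials
    return mean, var
```

For a fixed observation, the empirical mean should be close to $x_i$, in line with (5), and the empirical variance should not noticeably exceed $\Delta_i^2/4$.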

2.2. A linear fusion rule

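The classical best linear unbiased estimator (BLUE) for $\theta$ is given by [9]

$$\hat{\theta} = \frac{\mathbf{1}^T \mathbf{C}^{-1} \mathbf{x}}{\mathbf{1}^T \mathbf{C}^{-1} \mathbf{1}}, \tag{7}$$

where $\mathbf{x} = (x_1, \ldots, x_N)^T$ and $\mathbf{1}$ is the vector of all ones. Estimation performance is characterized by the variance of the estimator

$$\mathrm{var}\, \hat{\theta} = \left(\mathbf{1}^T \mathbf{C}^{-1} \mathbf{1}\right)^{-1}. \tag{8}$$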
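As an illustration of (7) and (8), the sketch below computes the BLUE and its variance for a small covariance matrix. It is an assumption-laden example rather than part of the original design: the helper names `solve` and `blue` are hypothetical, and a plain Gaussian elimination is used only to keep the example free of external libraries.

```python
def solve(A, b):
    """Solve A y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for j in range(col, n + 1):
                M[r][j] -= f * M[col][j]
    y = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (M[i][n] - s) / M[i][i]
    return y


def blue(C, x):
    """Return (theta_hat, var_theta_hat) following (7) and (8)."""
    c = solve(C, [1.0] * len(x))      # c = C^{-1} 1 (C is symmetric)
    denom = sum(c)                    # 1^T C^{-1} 1
    theta_hat = sum(ci * xi for ci, xi in zip(c, x)) / denom
    return theta_hat, 1.0 / denom
```

With uncorrelated noises the estimator reduces to the familiar inverse-variance weighted average of the observations.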

To implement BLUE exactly in a WSN setup, we must have $m_i = x_i$ (i.e., a real-valued message) and assume that the channel is distortionless, both of which are unrealistic in practice. Nonetheless, the BLUE estimator serves as a good performance benchmark for the DES to be designed. Motivated by the centralized BLUE, we adopt the following fusion rule: upon receiving sensor messages $m_i$, the FC combines them into an estimator $\bar{\theta}$ given by

$$\bar{\theta} = \frac{\mathbf{1}^T \mathbf{C}^{-1} \mathbf{m}}{\mathbf{1}^T \mathbf{C}^{-1} \mathbf{1}}, \tag{9}$$

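where $\mathbf{m} = (m_1, \ldots, m_N)^T$. Equation (5) gives us an important property of $\bar{\theta}$: it is an unbiased estimator for $\theta$. Indeed, we have

$$E_{p,x} \frac{\mathbf{1}^T \mathbf{C}^{-1} \mathbf{m}}{\mathbf{1}^T \mathbf{C}^{-1} \mathbf{1}} = E_x \frac{\mathbf{1}^T \mathbf{C}^{-1} E_p \mathbf{m}}{\mathbf{1}^T \mathbf{C}^{-1} \mathbf{1}} = E_x \frac{\mathbf{1}^T \mathbf{C}^{-1} \mathbf{x}}{\mathbf{1}^T \mathbf{C}^{-1} \mathbf{1}} = \theta, \tag{10}$$

where $E_p$ denotes expectation taken with respect to all sensor quantization noises, and the last step is due to $E_x \mathbf{x} = \theta \mathbf{1}$. The mean-squared error (MSE) of the fusion estimator $\bar{\theta}$ can be expanded as follows, where $\hat{\theta}$ denotes the BLUE (7):

$$\begin{aligned}
\mathrm{MSE}(\bar{\theta}) &= E(\bar{\theta} - \theta)^2 = E(\bar{\theta} - \hat{\theta} + \hat{\theta} - \theta)^2 \\
&= E(\bar{\theta} - \hat{\theta})^2 + E(\hat{\theta} - \theta)^2 + 2 E(\bar{\theta} - \hat{\theta})(\hat{\theta} - \theta).
\end{aligned} \tag{11}$$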


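Consider the third term in the last expression. We have

$$E_{m,x} (\bar{\theta} - \hat{\theta})(\hat{\theta} - \theta) = E_x \left( E_{m|x} \left[ (\bar{\theta} - \hat{\theta})(\hat{\theta} - \theta) \right] \right) = E_x \left[ (\hat{\theta} - \theta)\, E_{m|x} (\bar{\theta} - \hat{\theta}) \right] = 0, \tag{12}$$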

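where the second step is due to the fact that the BLUE $\hat{\theta}$ is determined by $\mathbf{x}$ and hence does not depend on $\mathbf{m}$ for any fixed $\mathbf{x}$, and the last step follows from (10). Thus, we can write

$$\begin{aligned}
\mathrm{MSE}(\bar{\theta}) &= E(\bar{\theta} - \hat{\theta})^2 + E(\hat{\theta} - \theta)^2 \\
&= E\left( \frac{\mathbf{1}^T \mathbf{C}^{-1} (\mathbf{m} - \mathbf{x})}{\mathbf{1}^T \mathbf{C}^{-1} \mathbf{1}} \right)^2 + \mathrm{var}\, \hat{\theta} \\
&= \left( \frac{1}{\mathbf{1}^T \mathbf{C}^{-1} \mathbf{1}} \right)^2 \mathbf{1}^T \mathbf{C}^{-1} E\left[(\mathbf{m} - \mathbf{x})(\mathbf{m} - \mathbf{x})^T\right] \mathbf{C}^{-1} \mathbf{1} + \frac{1}{\mathbf{1}^T \mathbf{C}^{-1} \mathbf{1}} \\
&= \left( \frac{\mathbf{1}^T \mathbf{C}^{-1} \mathbf{Q} \mathbf{C}^{-1} \mathbf{1}}{\mathbf{1}^T \mathbf{C}^{-1} \mathbf{1}} + 1 \right) \mathrm{var}\, \hat{\theta},
\end{aligned} \tag{13}$$

where

$$\mathbf{Q} = E(\mathbf{m} - \mathbf{x})(\mathbf{m} - \mathbf{x})^T \tag{14}$$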

is the quantization noise correlation matrix.

In our formulation, we seek an energy-efficient DES which can deliver an MSE performance that is comparable to that of the centralized BLUE estimator. Specifically, we will minimize the transmission energy while keeping $\mathrm{MSE}(\bar{\theta})$ within a constant factor of the BLUE performance, that is, $\mathrm{MSE}(\bar{\theta}) \le (1 + \alpha)\, \mathrm{var}\, \hat{\theta}$ for some constant $\alpha > 0$. Therefore, the following condition must hold:

$$\frac{\mathbf{1}^T \mathbf{C}^{-1} \mathbf{Q} \mathbf{C}^{-1} \mathbf{1}}{\mathbf{1}^T \mathbf{C}^{-1} \mathbf{1}} \le \alpha. \tag{15}$$

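The total sensor transmission energy is equal to

$$W = \sum_{i=1}^{N} W_i = \sum_{i=1}^{N} w_i b_i, \tag{16}$$

where $w_i$ is the energy required for transmission of a single bit from sensor $i$ to the FC; see (3). Therefore, the minimum energy DES design problem becomes

$$\begin{aligned}
\text{minimize} \quad & W = \sum_{i=1}^{N} w_i b_i \\
\text{subject to} \quad & \frac{\mathbf{1}^T \mathbf{C}^{-1} \mathbf{Q} \mathbf{C}^{-1} \mathbf{1}}{\mathbf{1}^T \mathbf{C}^{-1} \mathbf{1}} \le \alpha, \quad b_i \in \mathbb{N},
\end{aligned} \tag{17}$$

where $\mathbb{N}$ denotes the set of nonnegative integers.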

To complete the formulation, we need to make explicit the dependence of $\mathbf{Q}$ on $b_i$. The unbiasedness of our quantization strategy leads to the following important property of the quantization noise correlation matrix $\mathbf{Q}$.

Lemma 1. The quantization noise matrix $\mathbf{Q}$ is diagonal.

Proof. Consider any $(i, j)$th element of the matrix $\mathbf{Q}$, with $i \ne j$. We have

$$\begin{aligned}
Q_{ij} &= E(m_i - x_i)(m_j - x_j) \\
&= E_{x_i, x_j} \left( E_{p_i, p_j | x_i, x_j} \left[ (m_i - x_i)(m_j - x_j) \mid x_i, x_j \right] \right) \\
&= E_{x_i, x_j} \left( E_{p_i | x_i}(m_i - x_i)\, E_{p_j | x_j}(m_j - x_j) \mid x_i, x_j \right) = 0.
\end{aligned} \tag{18}$$

Here we use the fact that the random variables $m_i$ and $m_j$ are conditionally independent given the corresponding observations $x_i$ and $x_j$, which together with (5) gives the desired result.

Lemma 1 states that all the off-diagonal entries of $\mathbf{Q}$ must be zero. Let $Q_{ii}$ be the $i$th diagonal element of $\mathbf{Q}$. Recalling (6) and $\Delta_i = 2U/(2^{b_i} - 1)$, we obtain the following important bound on the diagonal entries of $\mathbf{Q}$:

$$Q_{ii} = \mathrm{var}\, m_i \le \frac{\Delta_i^2}{4} = \frac{U^2}{\left(2^{b_i} - 1\right)^2}, \tag{19}$$

where $b_i$ is the number of bits in $m_i$. This bound will be useful in our final formulation of the energy minimization problem.

2.3. Total energy minimization

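We introduce the notation $\mathbf{c} = \mathbf{C}^{-1} \mathbf{1}$ and $\beta = \alpha / \mathrm{var}\, \hat{\theta}$. Since $\mathrm{var}\, \hat{\theta} = 1/(\mathbf{1}^T \mathbf{C}^{-1} \mathbf{1})$, we can rewrite the MSE condition (15) as

$$\mathbf{c}^T \mathbf{Q} \mathbf{c} \le \beta. \tag{20}$$

This constraint ensures that the MSE performance of the DES is within a factor of $1 + \alpha$ of the BLUE performance. Since the distribution of $\mathbf{x}$ is unknown in general, we enforce a stronger condition, namely

$$\max_{x, p}\ \mathbf{c}^T \mathbf{Q} \mathbf{c} \le \beta. \tag{21}$$

Recalling that $\mathbf{Q}$ is diagonal (cf. Lemma 1), we can use the bound (19) to rewrite the above condition as

$$\max_{x, p}\ \mathbf{c}^T \mathbf{Q} \mathbf{c} = \max_{x, p} \sum_{i=1}^{N} Q_{ii} c_i^2 = \sum_{i=1}^{N} \frac{U^2 c_i^2}{\left(2^{b_i} - 1\right)^2} \le \beta. \tag{22}$$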


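Now we can reformulate the original energy minimization problem (17) explicitly as follows:

$$\begin{aligned}
\text{minimize} \quad & \sum_{i=1}^{N} w_i b_i \\
\text{subject to} \quad & \sum_{i=1}^{N} \frac{c_i^2}{\left(2^{b_i} - 1\right)^2} \le \frac{\beta}{U^2}, \quad b_i \in \mathbb{N},\ i = 1, \ldots, N.
\end{aligned} \tag{23}$$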

To relate this formulation to physical parameters, we note that the wireless channel conditions, the choice of modulation and BER, and so forth determine the values of the weighting factors $w_i$, as shown in (3). The values of $c_i$ are determined by the noise covariance matrix $\mathbf{C}$. Without loss of generality we assume $c_i \ne 0$ for all $i$. If $c_i = 0$ for some sensors, we can exclude the corresponding $m_i$ from fusion, as they do not contribute to the fusion estimate $\bar{\theta}$.

3. CONVEX RELAXATION WITH A CLOSED-FORM NEARLY-OPTIMAL SOLUTION

Since $b_i$ can only take integer values, problem (23) is a nonlinear integer program, a class of problems that is NP-hard in general. To make this problem computationally tractable, we relax the integer constraints on $b_i$ to allow them to take real nonnegative values:

$$\begin{aligned}
\text{minimize} \quad & \sum_{i=1}^{N} w_i b_i \\
\text{subject to} \quad & \sum_{i=1}^{N} \frac{c_i^2}{\left(2^{b_i} - 1\right)^2} \le \frac{\beta}{U^2}, \quad b_i \ge 0,\ i = 1, \ldots, N.
\end{aligned} \tag{24}$$

The relaxed problem (24) has a linear objective function and convex inequality constraints. Therefore, a solution to problem (24) can be found efficiently by the fusion center using convex optimization techniques such as interior point methods [10]. Once the optimal $b_i$'s are found, the fusion center can round each of them up to the nearest integer and broadcast the result to the sensors for power adjustment.

In what follows, we will present an approximately optimal solution to problem (24) in closed form. Such a closed-form solution not only simplifies the energy scheduling process, but also provides valuable insight into the optimal power-scheduling scheme. To begin, we first note that, by a simple monotonicity argument, the main MSE constraint will be active (i.e., hold with equality) at any optimum point (see footnote 1), while the remaining nonnegativity constraints on $b_i$ will be inactive since $b_i = 0$ for some $i$ would violate the main MSE constraint. Therefore, we can ignore the nonnegativity constraints (since the Lagrangian multipliers associated with these constraints will be zero). Associating a multiplier $\lambda$ with the MSE constraint, we can write the Lagrangian for problem (24) as follows:

$$L(b_i, \lambda) = \sum_{i=1}^{N} w_i b_i + \lambda \left( \sum_{i=1}^{N} \frac{c_i^2}{\left(2^{b_i} - 1\right)^2} - \frac{\beta}{U^2} \right). \tag{25}$$

Footnote 1: Indeed, the left-hand side of the MSE constraint is a monotonically decreasing function of each $b_i$. Therefore, if the inequality were strict at the optimum, we could change $b_k$ in the optimal solution to some $b_k' < b_k$ for some $k$ and decrease the objective function.

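At the point of optimum we must have $\partial L / \partial b_i = 0$ for $i = 1, \ldots, N$, yielding the following set of conditions:

$$\frac{\partial L}{\partial b_i} = w_i - 2 \lambda \ln 2\, \frac{2^{b_i} c_i^2}{\left(2^{b_i} - 1\right)^3} = 0, \tag{26}$$

or alternatively

$$\frac{2^{b_i}}{\left(2^{b_i} - 1\right)^3} = \frac{w_i \lambda'}{c_i^2}, \tag{27}$$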

where $\lambda' = 1/(2 \lambda \ln 2)$. Also, the main MSE constraint holds with equality at the optimum point (as noted above), yielding

$$\sum_{i=1}^{N} \frac{c_i^2}{\left(2^{b_i} - 1\right)^2} = \frac{\beta}{U^2}. \tag{28}$$
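Conditions (27) and (28) can be solved numerically. One possible approach, sketched below in Python, exploits monotonicity: for fixed $\lambda'$ each $b_i$ in (27) is found by bisection, and $\lambda'$ itself is then bisected until (28) holds. This nested bisection is an illustrative choice, not a method taken from the paper; the function names and the search brackets are assumptions.

```python
import math


def g(b):
    """Left-hand side of (27): 2^b / (2^b - 1)^3, strictly decreasing for b > 0."""
    t = 2.0 ** b
    return t / (t - 1.0) ** 3


def bits_for(lam, w_i, c_i, lo=1e-9, hi=60.0):
    """Solve g(b) = w_i * lam / c_i**2 for b by bisection (lam stands for lambda')."""
    target = w_i * lam / c_i ** 2
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > target:
            lo = mid          # g too large, so more bits are needed
        else:
            hi = mid
    return 0.5 * (lo + hi)


def constraint_sum(b, c):
    """Left-hand side of (28)."""
    return sum(ci ** 2 / (2.0 ** bi - 1.0) ** 2 for bi, ci in zip(b, c))


def solve_kkt(w, c, beta_over_U2):
    """Return (b, lam) satisfying (27) and (28) up to numerical tolerance."""
    lo, hi = 1e-12, 1e12      # assumed bracket for lambda', searched on a log scale
    for _ in range(200):
        lam = math.sqrt(lo * hi)
        b = [bits_for(lam, wi, ci) for wi, ci in zip(w, c)]
        if constraint_sum(b, c) > beta_over_U2:
            hi = lam          # too much distortion: smaller lambda' gives more bits
        else:
            lo = lam
    lam = math.sqrt(lo * hi)
    return [bits_for(lam, wi, ci) for wi, ci in zip(w, c)], lam
```

The sketch takes the weights $w_i$, the coefficients $c_i \ne 0$, and the ratio $\beta/U^2$ as given inputs.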

The optimal solution $\{b_i, \lambda'\}$ can be found from the nonlinear equations (27) and (28), which unfortunately cannot be solved in closed form. To facilitate a closed-form solution, we consider a slightly modified system in the variables $\{b_i^*, \lambda^*\}$:

$$\sum_{i=1}^{N} \frac{c_i^2}{\left(2^{b_i^*} - 1\right)^2} = \frac{\beta}{U^2}, \tag{29}$$

$$\frac{2^{b_i^*} - 1}{\left(2^{b_i^*} - 1\right)^3} = \frac{1}{\left(2^{b_i^*} - 1\right)^2} = \frac{w_i \lambda^*}{c_i^2}. \tag{30}$$

The above system is almost identical to the original Karush-Kuhn-Tucker (KKT) system (27) and (28), except that the numerator $2^{b_i}$ on the left-hand side of (27) is replaced by $2^{b_i^*} - 1$ in (30). Summing $c_i^2/(2^{b_i^*} - 1)^2 = w_i \lambda^*$ over $i$ and using (29) shows that (29) and (30) can be solved analytically, yielding

$$\lambda^* = \frac{\beta}{U^2} \left( \sum_{i=1}^{N} w_i \right)^{-1}. \tag{31}$$

Substituting this $\lambda^*$ into (30) gives the following feasible solution to the original energy scheduling problem (24):

$$b_i^* = \log\left(1 + \frac{|c_i|}{\sqrt{\lambda^* w_i}}\right). \tag{32}$$
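The closed-form allocation is straightforward to evaluate. The sketch below computes $\lambda^*$ from (31) and $b_i^*$ from (32), and provides a helper to confirm that the MSE constraint of (24) is met; the function names are hypothetical and the inputs $w_i$, $c_i$, $\beta$, $U$ are assumed to be known.

```python
import math


def closed_form_bits(w, c, beta, U):
    """Return (b_star, lam_star) from (31) and (32); logarithms are base 2."""
    lam_star = (beta / U ** 2) / sum(w)
    b_star = [math.log2(1.0 + abs(ci) / math.sqrt(lam_star * wi))
              for wi, ci in zip(w, c)]
    return b_star, lam_star


def distortion(b, c):
    """Left-hand side of the MSE constraint in (24)."""
    return sum(ci ** 2 / (2.0 ** bi - 1.0) ** 2 for bi, ci in zip(b, c))
```

By construction, `distortion(b_star, c)` equals $\beta/U^2$ up to rounding error, and rounding each $b_i^*$ up to an integer can only decrease the distortion.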

It remains to quantify the performance of this particular energy scheduling strategy. This is the content of the next two lemmas.


Lemma 2. Let $\{b_i, \lambda'\}$ be the optimal solution to problem (24) such that $b_i \ge 1$ for all $i$, and let $\{b_i^*, \lambda^*\}$ be its approximation defined by (29) and (30). Then

$$\lambda^* \le \lambda' \le 2\lambda^*. \tag{33}$$

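Proof. Since

$$\frac{2^{b_i}}{\left(2^{b_i} - 1\right)^3} = \frac{1}{\left(2^{b_i} - 1\right)^2} + \frac{1}{\left(2^{b_i} - 1\right)^3}, \tag{34}$$

a lower bound on $\lambda'$ can be found using (27) and (28) as follows:

$$\lambda' \left( \sum_{i=1}^{N} w_i \right) = \sum_{i=1}^{N} \frac{2^{b_i} c_i^2}{\left(2^{b_i} - 1\right)^3} \ge \sum_{i=1}^{N} \frac{c_i^2}{\left(2^{b_i} - 1\right)^2} = \frac{\beta}{U^2}, \tag{35}$$

and we conclude from (31) that $\lambda^* \le \lambda'$. On the other hand, if all $b_i \ge 1$ we can write

$$\frac{1}{\left(2^{b_i} - 1\right)^2} \le \frac{2^{b_i}}{\left(2^{b_i} - 1\right)^3} \le \frac{2}{\left(2^{b_i} - 1\right)^2}, \tag{36}$$

therefore $\lambda' \le 2\lambda^*$, and the result of the lemma follows.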

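We now bound the difference $|b_i - b_i^*|$.

Lemma 3. Under the conditions of Lemma 2,

$$b_i^* - \frac{1}{2} < b_i < b_i^* + \frac{1}{2} \quad \forall\, i = 1, 2, \ldots, N. \tag{37}$$

Proof. Using the left-hand side of (36) and the right-hand side of (33), we can write

$$\frac{1}{\left(2^{b_i} - 1\right)^2} \le \frac{2^{b_i}}{\left(2^{b_i} - 1\right)^3} \le \frac{w_i}{c_i^2}\, 2\lambda^*, \tag{38}$$

which gives the lower bound on $b_i$:

$$b_i \ge \log\left(1 + \frac{|c_i|}{\sqrt{2\lambda^* w_i}}\right) = \log\left(\sqrt{2} + \frac{|c_i|}{\sqrt{\lambda^* w_i}}\right) - \frac{1}{2} > b_i^* - \frac{1}{2}. \tag{39}$$

By analogy, from the right-hand side of (36) and the left-hand side of (33) we have

$$\frac{2}{\left(2^{b_i} - 1\right)^2} \ge \frac{2^{b_i}}{\left(2^{b_i} - 1\right)^3} \ge \frac{w_i}{c_i^2}\, \lambda^*, \tag{40}$$

which further implies

$$b_i \le \log\left(1 + \frac{\sqrt{2}\, |c_i|}{\sqrt{\lambda^* w_i}}\right) = \log\left(\frac{1}{\sqrt{2}} + \frac{|c_i|}{\sqrt{\lambda^* w_i}}\right) + \frac{1}{2} < b_i^* + \frac{1}{2}. \tag{41}$$

This completes the proof.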

Lemma 3 implies that $|b_i - b_i^*| < 1$. Thus, the rounded optimal solution $\lceil b_i \rceil$ is at most one bit away from $\lceil b_i^* \rceil$. We can interpret this result as follows: when the $b_i$ are sufficiently large, for example, when high estimation precision is required, the optimal solution behaves approximately as $\log(1 + |c_i|/\sqrt{\lambda^* w_i})$. Notice that $c_i = \mathbf{e}_i^T \mathbf{C}^{-1} \mathbf{1}$ ($\mathbf{e}_i$ denotes the $i$th unit vector), so $c_i$ signifies the inverse of the “noisiness” of signal $x_i$ in relation to the other sensor observations. Recalling the definition of $\lambda^*$, we note that the product $\lambda^* w_i$ is proportional to the relative energy per bit $w_i / \sum_j w_j$, and the value of $1/\sqrt{\lambda^* w_i}$ can be interpreted as being proportional to the relative quality of the wireless link between sensor $i$ and the FC. Thus, the local message length $b_i^*$ can be intuitively interpreted as being proportional to the logarithm of the product of signal quality and channel quality at sensor $i$.

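We now consider a special case when the use of $\{b_i^*\}$ is especially appealing. Suppose that the covariance matrix $\mathbf{C}$ has a block-diagonal structure

$$\mathbf{C} = \begin{pmatrix} \mathbf{C}_1 & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \mathbf{C}_2 & \cdots & \mathbf{0} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{C}_n \end{pmatrix}. \tag{42}$$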

This situation may occur when sensors in the network are partitioned into several clusters in such a way that sensors within each group are placed relatively close to each other and far from the rest of the sensors. Thus, sensor observations are uncorrelated unless they are generated from the same cluster. In this case the matrix $\mathbf{C}^{-1}$ is also block-diagonal:

$$\mathbf{C}^{-1} = \begin{pmatrix} \mathbf{C}_1^{-1} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \mathbf{C}_2^{-1} & \cdots & \mathbf{0} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{C}_n^{-1} \end{pmatrix}. \tag{43}$$

We assume further that sensors within each group can cooperate to learn the corresponding covariance submatrix $\mathbf{C}_j$. The value of $\lambda^*$ can be computed by the fusion center and broadcast back to the sensors. Thus, each sensor can easily compute $c_i = [\mathbf{C}_j^{-1} \mathbf{1}]_i$ and independently find its own quantization level $b_i^*$. The advantage of this method is that the fusion center needs to broadcast only one universal message for all sensors.

To conclude this section we observe that our strategy can be applied even if sensor noises have infinite range. Indeed, with an appropriate choice of $U$, that is, if the tails of the noise pdf are negligible, the pdf can be approximated by a function with finite support. However, the estimator (9) will no longer be unbiased and the cross term $E(\bar{\theta} - \hat{\theta})(\hat{\theta} - \theta)$ in the MSE expression will no longer be zero. Thus, inequality (15) only defines a lower bound on estimation performance for some $\alpha$, and the gap between the left-hand side of (15) and the actual MSE is determined by the noise pdf. Therefore, full knowledge of the pdf will be required in order to specify the constants $U$ and $\alpha$ and quantify the estimation bias.


4. NUMERICAL SIMULATIONS

In this section, we present numerical simulations to compare the transmission energy requirement for two energy scheduling strategies: (i) quantization using the closed-form approximate solution (32); (ii) uniform bit allocation, in which all sensors quantize their observations to the same number of bits to achieve the same MSE. We denote by $b$ the number of bits used in the case of uniform bit allocation. We can find the minimum of $b$ from the MSE constraint

$$\sum_{i=1}^{N} \frac{c_i^2}{\left(2^{b} - 1\right)^2} \le \frac{\beta}{U^2}, \tag{44}$$

which gives

$$b \ge \log\left(1 + \sqrt{\frac{U^2}{\beta} \sum_{i=1}^{N} c_i^2}\right). \tag{45}$$

The number of bits can only take integer values, so the total minimal energy is given by

$$W_{\mathrm{uniform}} = \left\lceil \log\left(1 + \sqrt{\frac{U^2}{\beta} \sum_{i=1}^{N} c_i^2}\right) \right\rceil \sum_{i=1}^{N} w_i. \tag{46}$$

Recall that we have relaxed $b_i$ to take real values to make the problem convex. Therefore, the optimal energy obtained by allowing $b_i$ to take on real values is a lower bound on the actual optimal energy. If we round $b_i$ up to the closest integer $\lceil b_i \rceil$, we obtain an upper bound (denoted by $W_{\mathrm{opt}}$) on the actual energy. Even though we use $\lceil b_i^* \rceil$ to approximate the actual optimal solution, significant energy can be saved when compared with the uniform bit allocation strategy at the same target distortion. The percentage of saving is defined as

$$\frac{W_{\mathrm{uniform}} - W_{\mathrm{opt}}}{W_{\mathrm{uniform}}} \times 100. \tag{47}$$
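The comparison in (45) to (47) can be reproduced on toy inputs. The following sketch is only illustrative: the function names are hypothetical, $W_{\mathrm{opt}}$ is computed from the rounded closed-form allocation $\lceil b_i^* \rceil$ of (32), and the inputs are not the simulation parameters used for the figures.

```python
import math


def uniform_bits(c, beta, U):
    """Smallest integer b satisfying (45)."""
    total = sum(ci ** 2 for ci in c)
    return math.ceil(math.log2(1.0 + math.sqrt(U ** 2 / beta * total)))


def saving_percent(w, c, beta, U):
    """Percentage saving (47) of the rounded closed-form allocation vs. uniform."""
    lam_star = (beta / U ** 2) / sum(w)
    b_star = [math.ceil(math.log2(1.0 + abs(ci) / math.sqrt(lam_star * wi)))
              for wi, ci in zip(w, c)]
    w_opt = sum(wi * bi for wi, bi in zip(w, b_star))
    w_uniform = uniform_bits(c, beta, U) * sum(w)
    return 100.0 * (w_uniform - w_opt) / w_uniform
```

For identical sensors the two allocations coincide and the saving is zero; the saving grows as the $c_i$ become more unequal.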

For a positive random variable $R$ we define

$$\text{normalized deviation of } R = \frac{\sqrt{\mathrm{var}\, R}}{E R}, \tag{48}$$

which will be used as a measure of the absolute heterogeneity of $R$. The sensor noise variances $\{\sigma_i^2\}$ are taken to be $\sigma_i^2 = 1 + a^2 Z_i$, where $Z_i$ are i.i.d. random variables with $Z_i \sim \chi_1^2(z)$. As can be easily verified, $\{\sigma_i^2\}$ are also i.i.d. with $\sigma_i^2 \sim \chi_1^2((x - 1)/a^2)$. We control the heterogeneity of sensor noise variances by varying the parameter $a$. In Figure 2a, we suppose that sensor noises have the tri-diagonal correlation matrix

$$\mathbf{C} = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_N) \begin{pmatrix} 1 & \rho & \cdots & 0 & 0 \\ \rho & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & \rho \\ 0 & 0 & \cdots & \rho & 1 \end{pmatrix} \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_N), \tag{49}$$

where $\rho = 0.2$. In Figure 2b, we suppose that sensor noises have the correlation matrix

$$\mathbf{C} = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_N) \left[ (1 - \rho) \mathbf{I} + \rho\, \mathbf{1} \mathbf{1}^T \right] \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_N). \tag{50}$$

In all simulations, the total number of sensors is $N = 200$. Since all coefficients $w_i$ are scaled by a common factor, in our simulation $\{w_i\}$ are taken to be the channel path losses

$$w_i = d_i^{\kappa}. \tag{51}$$
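A simulation setup of this kind could be generated as in the sketch below. It is a hedged illustration only: the function names are hypothetical, the chi-square variable with one degree of freedom is drawn as the square of a standard normal sample, and nothing here is claimed to match the authors' simulation code.

```python
import random


def noise_std(N, a, rng):
    """Draw sigma_i with sigma_i^2 = 1 + a^2 * Z_i, Z_i chi-square with 1 d.o.f."""
    return [(1.0 + a ** 2 * rng.gauss(0.0, 1.0) ** 2) ** 0.5 for _ in range(N)]


def covariance(sigma, rho, tridiagonal=True):
    """Build C as in (49) if tridiagonal is True, otherwise as in (50)."""
    N = len(sigma)
    C = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            if i == j:
                corr = 1.0
            elif not tridiagonal or abs(i - j) == 1:
                corr = rho
            else:
                corr = 0.0
            C[i][j] = sigma[i] * corr * sigma[j]
    return C
```

Increasing `a` spreads the diagonal entries of `C`, which is how the heterogeneity of the noise variances is varied.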

Assume that the target estimation performance is fixed. From Figure 2 we can see that the amount of energy saving becomes significant when the local noise variances become more and more heterogeneous, assuming that all sensors have identical $w_i$. In Figure 3, we plot the percentage of energy savings versus the heterogeneity of channel gains, supposing that sensors have the same observation noise variances with tri-diagonal structure as in (49), where $\sigma_i^2 = 1$ for all $i$ and $\rho = 0.2$. Here we suppose that all sensors are uniformly distributed inside a unit disk whose center is at the FC. It is easy to show that in this case the normalized deviation of $w_i$ depends only on $\kappa$ (cf. (51)). In our simulation, we choose $1 \le \kappa \le 8$. We observe that the percentage of saving depends more on the heterogeneity of sensor noise variances than on that of channel gains. This can be understood from expression (32) for $b_i^*$: the quantity inside the logarithm depends on $c_i$ directly, but on $w_i$ only through $1/\sqrt{w_i}$.

Figure 2: Percentage of energy saving increases when sensor noise variances become more heterogeneous. [Two plots, (a) and (b): energy saving in percentage (0 to 70) versus normalized deviation of sensor noise variances (0 to 1.4).]

5. AN EXTENSION: MINIMAX FORMULATION

Minimizing total transmission energy results in sensors having different lifetimes. This may induce frequent changes in the network topology. An alternative approach is to minimize the maximal individual sensor energy $\max_i W_i$, which leads to maximum network lifetime. Relaxing $\{b_i\}$ as in (24), we can state the problem as follows:

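$$\begin{aligned}
\text{minimize} \quad & \max_i\ w_i b_i \\
\text{subject to} \quad & \sum_{i=1}^{N} \frac{c_i^2}{\left(2^{b_i} - 1\right)^2} \le \frac{\beta}{U^2}, \quad b_i \ge 0,\ i = 1, \ldots, N,
\end{aligned} \tag{52}$$

or alternatively

$$\begin{aligned}
\text{minimize} \quad & t \\
\text{subject to} \quad & w_i b_i \le t, \quad \sum_{i=1}^{N} \frac{c_i^2}{\left(2^{b_i} - 1\right)^2} \le \frac{\beta}{U^2}, \quad b_i \ge 0,\ i = 1, \ldots, N.
\end{aligned} \tag{53}$$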

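As in Section 3, we assume that $c_i \ne 0$ for all $i$ and ignore the nonnegativity constraints $b_i \ge 0$ (which must be inactive at the optimum). The Lagrangian for problem (53) is found to be

$$L(t, b_i, \mu_i, \lambda) = t + \sum_{i=1}^{N} \mu_i \left(w_i b_i - t\right) + \lambda \left( \sum_{i=1}^{N} \frac{c_i^2}{\left(2^{b_i} - 1\right)^2} - \frac{\beta}{U^2} \right). \tag{54}$$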

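Differentiating $L$ with respect to the primal variables, we obtain the following conditions:

$$\frac{\partial L}{\partial t} = 1 - \sum_{i=1}^{N} \mu_i = 0, \qquad \frac{\partial L}{\partial b_i} = -2 \lambda \ln 2\, \frac{2^{b_i} c_i^2}{\left(2^{b_i} - 1\right)^3} + \mu_i w_i = 0, \tag{55}$$

which give

$$\sum_{i=1}^{N} \mu_i = 1, \tag{56}$$

$$\lambda' \mu_i = \frac{2^{b_i} c_i^2}{\left(2^{b_i} - 1\right)^3 w_i}, \tag{57}$$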

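where as before $\lambda' = 1/(2 \lambda \ln 2)$. Taking the sum of (57) over all $i$ and using (56), we obtain

$$\lambda' = \sum_{i=1}^{N} \frac{c_i^2}{w_i} \frac{2^{b_i}}{\left(2^{b_i} - 1\right)^3}. \tag{58}$$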

Since each term in the right-hand side sum in (58) is positive, we conclude that $\lambda > 0$, therefore $\mu_i > 0$, and the complementary slackness condition gives

$$\sum_{i=1}^{N} \frac{c_i^2}{\left(2^{b_i} - 1\right)^2} = \frac{\beta}{U^2}, \qquad w_i b_i = t. \tag{59}$$

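Thus, the optimal value $t_{\mathrm{opt}}$ can be found as a solution to the following equation:

$$\sum_{i=1}^{N} \frac{c_i^2}{\left(2^{t/w_i} - 1\right)^2} = \frac{\beta}{U^2}. \tag{60}$$

The solution $t_{\mathrm{opt}}$ is unique due to the monotonicity of the left-hand side of (60) in $t$. The FC can solve (60) and broadcast $t_{\mathrm{opt}}$ to the sensors, which in turn can determine their quantization levels locally as $b_i = t_{\mathrm{opt}}/w_i$. In this case all sensors spend the same transmission energy $w_i b_i = t_{\mathrm{opt}}$, so differences in transmitted power do not cause sensor lifetimes to differ.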
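Because the left-hand side of (60) decreases monotonically in $t$, the equation can be solved by simple bisection. The sketch below shows one way to do this; the function name, the initial bracket, and the tolerance are assumptions made for illustration.

```python
def minimax_energy(w, c, beta_over_U2, tol=1e-12):
    """Solve (60) for t_opt by bisection; returns (t_opt, real-valued bits per sensor)."""
    def lhs(t):
        return sum(ci ** 2 / (2.0 ** (t / wi) - 1.0) ** 2 for wi, ci in zip(w, c))

    lo, hi = 0.0, 1.0
    while lhs(hi) > beta_over_U2:      # grow hi until the constraint is met
        lo, hi = hi, 2.0 * hi
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if lhs(mid) > beta_over_U2:
            lo = mid                   # distortion still too large: increase t
        else:
            hi = mid
    return hi, [hi / wi for wi in w]
```

The returned bit counts are real valued, as in the relaxed problem (53); a practical scheme would still need to round them to integers.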


Figure 3: Percentage of energy saving increases when channel gains become more heterogeneous. [Plot: energy saving in percentage (0 to 20) versus normalized deviation of channel gains (0.4 to 1.4).]

6. CONCLUSION

In this paper we have shown that the total energy consumption required for transmission in a sensor network can be minimized if the number of quantization levels for each sensor is determined jointly by the fusion center using information about the correlation of sensor observations. We have also presented a nearly-optimal solution in closed form to the energy minimization problem, which can achieve the same target estimation performance as the optimal solution. Numerical simulations show that, to attain the same MSE performance, our energy-efficient quantization scheme can achieve energy savings of up to 70% when compared to a simple uniform bit allocation scheme. We plan to consider various extensions of this work in the future. These include joint estimation of a common vector signal by a WSN, and distributed least squares and target tracking for dynamic targets.

ACKNOWLEDGMENTS

The authors would like to thank the anonymous reviewers for their valuable comments that helped improve the quality of this paper. This research is supported in part by the Natural Sciences and Engineering Research Council of Canada, Grant no. OPG0090391, by the Canada Research Chair Program, and by the National Science Foundation, Grant no. DMS-0312416.

REFERENCES

[1] Z.-Q. Luo, “Universal decentralized estimation in a bandwidth constrained sensor network,” to appear in IEEE Trans. Inform. Theory.

[2] Z.-Q. Luo, “An isotropic universal decentralized estimation scheme for a bandwidth constrained ad hoc sensor network,” IEEE J. Select. Areas Commun., vol. 23, no. 4, pp. 735–744, 2005.

[3] Z.-Q. Luo and J.-J. Xiao, “Universal decentralized estimation in an inhomogeneous environment,” to appear in IEEE Trans. Inform. Theory.

[4] Z.-Q. Luo and J.-J. Xiao, “Decentralized estimation in an inhomogeneous environment,” in Proc. IEEE International Symposium on Information Theory (ISIT ’04), pp. 517–517, Chicago, Ill, USA, June–July 2004.

[5] S. Cui, A. J. Goldsmith, and A. Bahai, “Energy-constrained modulation optimization,” to appear in IEEE Transactions on Wireless Communications.

[6] S. Cui, A. J. Goldsmith, and A. Bahai, “Joint modulation and multiple access optimization under energy constraints,” in Proc. IEEE Global Telecommunications Conference (GLOBECOM ’04), pp. 151–155, Dallas, Tex, USA, November–December 2004.

[7] J.-J. Xiao, S. Cui, Z.-Q. Luo, and A. J. Goldsmith, “Power scheduling of universal decentralized estimation in sensor networks,” to appear in IEEE Trans. Signal Processing.

[8] X. Luo and G. B. Giannakis, “Energy-constrained optimal quantization for wireless sensor networks,” in Proc. 1st IEEE Annual Communications Society Conference on Sensor and Ad Hoc Communications and Networks (SECON ’04), pp. 272–278, Santa Clara, Calif, USA, October 2004.

[9] S. M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory, Prentice-Hall, Upper Saddle River, NJ, USA, 1993.

[10] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, Cambridge, UK, 2003.

Alexey Krasnopeev received the B.S. degree in applied mathematics and physics in 1999, and the M.S. degree in applied mathematics and physics in 2001, both from the Moscow Institute of Physics and Technology, Moscow, Russia. He is currently pursuing his M.S. degree in electrical engineering at the University of Minnesota. His research interests include wireless sensor networks, information theory, and algebraic coding theory.

Jin-Jun Xiao received the B.S. degree in applied mathematics from Jilin University, China, in 1997, and the M.S. degree in mathematics from the University of Minnesota, in 2003. He is currently pursuing the Ph.D. degree in electrical engineering at the University of Minnesota. His research interests are in wireless sensor networks, information theory, and optimization.

Zhi-Quan Luo received the B.S. degree in mathematics from Peking University, China, in 1984. During the academic year of 1984/1985, he was with Nankai Institute of Mathematics, Tianjin, China. From 1985 to 1989, he studied at the Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, where he received a Ph.D. degree in operations research. In 1989, he joined the Department of Electrical and Computer Engineering, McMaster University, Hamilton, Canada, where he became a Professor in 1998 and held the Canada Research Chair in information processing since 2001. Starting April 2003, he has been a Professor in the Department of Electrical and Computer Engineering at the University of Minnesota, and holds an ADC Chair in digital technology. His research interests lie in the union of large-scale optimization, signal processing, data communications, and information theory. He is a Member of SIAM and MPS. He is presently serving as an Associate Editor for several international journals including SIAM Journal on Optimization, Mathematical Programming, Mathematics of Computation, and Mathematics of Operations Research.

EURASIP Journal on Wireless Communications and Networking 2005:4, 483–492
© 2005 Javier Del Ser et al.

Asymmetric Joint Source-Channel Coding for Correlated Sources with Blind HMM Estimation at the Receiver

Javier Del Ser
Centro de Estudios e Investigaciones Tecnicas de Gipuzkoa (CEIT), Parque Tecnologico de San Sebastian, Paseo Mikeletegi, N48, 20009 Donostia, San Sebastian, Spain
Email: [email protected]

Pedro M. Crespo
Centro de Estudios e Investigaciones Tecnicas de Gipuzkoa (CEIT), Parque Tecnologico de San Sebastian, Paseo Mikeletegi, N48, 20009 Donostia, San Sebastian, Spain
Email: [email protected]

Olaia Galdos
Centro de Estudios e Investigaciones Tecnicas de Gipuzkoa (CEIT), Parque Tecnologico de San Sebastian, Paseo Mikeletegi, N48, 20009 Donostia, San Sebastian, Spain
Email: [email protected]

Received 25 October 2004; Revised 17 May 2005

We consider the case of two correlated sources, S1 and S2. The correlation between them has memory, and it is modelled by a hidden Markov chain. The paper studies the problem of reliable communication of the information sent by the source S1 over an additive white Gaussian noise (AWGN) channel when the output of the other source S2 is available as side information at the receiver. We assume that the receiver has no a priori knowledge of the correlation statistics between the sources. In particular, we propose the use of a turbo code for joint source-channel coding of the source S1. The joint decoder uses an iterative scheme where the unknown parameters of the correlation model are estimated jointly within the decoding process. It is shown that reliable communication is possible at signal-to-noise ratios close to the theoretical limits set by the combination of the Shannon and Slepian-Wolf theorems.

Keywords and phrases: distributed source coding, hidden Markov model parameter estimation, Slepian-Wolf theorem, joint source-channel coding.

1. INTRODUCTION

Communication networks are multiuser communication systems. Therefore, their performance is best understood when viewed as resource sharing systems. In the particular centralized scenario where several users intend to send their data to a common destination (e.g., an access point in a wireless local area network), the receiver may exploit the existing correlation among the transmitters, either to reduce power consumption or to gain immunity against noise. In this context, we consider the system shown in Figure 1. The outputs of two correlated binary sources $\{X_k, Y_k\}_{k=1}^{\infty}$ are separately encoded, and the encoded sequences are sent through two different channels to a joint decoder. The only requirement imposed on the random process $\{X_k, Y_k\}_{k=1}^{\infty}$ is that it be ergodic. Notice that this includes the situation where the process $\{X_k, Y_k\}_{k=1}^{\infty}$ is modelled by a hidden Markov model (HMM); this is the case analyzed in this paper.

If the channels are noiseless, the problem is reduced to one of distributed data compression. The Slepian-Wolf theorem [1] (proven to be extensible to ergodic sources in [2]) states that the achievable compression region (see Figure 2) is given by

$$\begin{aligned}
R_1 &\ge H(S_1 \mid S_2), \\
R_2 &\ge H(S_2 \mid S_1), \\
R_1 + R_2 &\ge H(S_1, S_2),
\end{aligned} \tag{1}$$

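where $R_1$ and $R_2$ are the compression rates for sources $S_1$ and $S_2$ (bits per source symbol), and

$$\begin{aligned}
H(S_1 \mid S_2) &= \lim_{n \to \infty} \frac{1}{n} H(X_1, \ldots, X_n \mid Y_1, \ldots, Y_n), \\
H(S_1, S_2) &= \lim_{n \to \infty} \frac{1}{n} H(X_1, \ldots, X_n; Y_1, \ldots, Y_n)
\end{aligned} \tag{2}$$

are their respective conditional and joint entropy rates. In the particular case where the joint sequence $\{X_k, Y_k\}_{k=1}^{\infty}$ is i.i.d., the above entropy rates are replaced by their corresponding entropies.

Figure 1: Block diagram of a typical distributed data coding system. [Diagram: source S1 emits $X_1, \ldots, X_M$, which encoder 1 (rate $R_1 = M/N_1$) maps to $C_1, \ldots, C_{N_1}$; channel 1 outputs $V_1, \ldots, V_{N_1}$. Source S2 emits $Y_1, \ldots, Y_M$, which encoder 2 (rate $R_2 = M/N_2$) maps to $D_1, \ldots, D_{N_2}$; channel 2 outputs $Z_1, \ldots, Z_{N_2}$. A joint decoder produces the estimates $\hat{X}_1, \ldots, \hat{X}_M$ and $\hat{Y}_1, \ldots, \hat{Y}_M$.]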
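For intuition about region (1) in the i.i.d. case, the sketch below checks whether a rate pair is achievable under a simple assumed model that is not the HMM model of this paper: $X_k$ is uniform binary and $Y_k = X_k \oplus E_k$ with $E_k$ i.i.d. Bernoulli($p$), so that $H(S_1|S_2) = H(S_2|S_1) = h(p)$ and $H(S_1, S_2) = 1 + h(p)$. The function names are hypothetical.

```python
import math


def h2(p):
    """Binary entropy function in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def in_slepian_wolf_region(R1, R2, p):
    """Check (1) for uniform binary X, Y with P(X != Y) = p, i.i.d. over k."""
    h_cond = h2(p)            # H(S1|S2) = H(S2|S1) = h(p) under this model
    h_joint = 1.0 + h2(p)     # H(S1,S2) = H(S2) + H(S1|S2)
    return R1 >= h_cond and R2 >= h_cond and R1 + R2 >= h_joint
```

For example, with $p = 0.11$ the conditional entropy is close to 0.5 bit, so the corner point $(R_1, R_2) = (0.5, 1)$ lies inside the region while $(0.4, 1)$ does not.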

As already mentioned, we assume that the output of the multiterminal source $\{X_k, Y_k\}_{k=1}^{\infty}$ can be modelled by an HMM, and we analyze a more general problem of reliable communication when channels 1 and 2 in Figure 1 are additive white Gaussian noise (AWGN) and noiseless, respectively. The main goal is to minimize the energy per information bit $E_b$ sent by the source $S_1$ for a given encoding rate $R_1 < 1$ and binary phase-shift keying (BPSK) modulation (i.e., the system operates in the power-limited regime). When the complexity of both encoder and decoder is not an issue, the minimum theoretical limit $(E_b/N_0)^*$ is achieved when the source $S_1$ is compressed at its minimum rate, namely, $H(S_1 \mid S_2)$. This can be done if the compression rate $R_2$ of the source $S_2$ is greater than or equal to $H(S_2)$ (marked point in Figure 2). Without any loss of generality, we can assume that the source $S_2$ is available as side information at the decoder ($R_2 = H(S_2)$).

From the source-channel separation theorem with side information [3], the limit $(E_b/N_0)^*$ is inferred from the condition $C \ge H(S_1 \mid S_2) R_1$, where $C = (1/2) \log_2(1 + 2 E_b R_1 / N_0)$ is the capacity of the AWGN channel in bits per channel use (see footnote 1). The above condition yields

$$\left(\frac{E_b}{N_0}\right)^* = \frac{2^{2 R_1 H(S_1 \mid S_2)} - 1}{2 R_1}. \tag{3}$$
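Expression (3) is easy to evaluate numerically. The short sketch below returns the limit in decibels; the function name is hypothetical and the conditional entropy rate $H(S_1 \mid S_2)$ is assumed to be supplied by the caller.

```python
import math


def ebn0_limit_db(R1, h_cond):
    """Minimum Eb/N0 in dB from (3), for rate R1 and conditional entropy rate h_cond."""
    ratio = (2.0 ** (2.0 * R1 * h_cond) - 1.0) / (2.0 * R1)
    return 10.0 * math.log10(ratio)
```

With $R_1 = 1/2$ and no side information gain ($H(S_1 \mid S_2) = 1$), the expression gives 0 dB; a smaller conditional entropy rate lowers the limit.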

Referring to Figure 1, encoder 1 has been implemented using a binary turbo encoder [4] with coding rate $R_1$. However, with the corresponding decoding modifications, other types of probabilistic channel codes could have been employed, for example, low-density parity-check (LDPC) codes. The joint decoder bases its decision on both the output of the channel $V_k$ and the side information $Z_k = Y_k$ coming from the source $S_2$.

Footnote 1: Since the modulation scheme used is BPSK, the capacity of the constrained AWGN channel with a binary input constellation should be used instead of the unconstrained channel capacity. However, since the system operates in the power-limited regime, the difference between both capacities is small.

Figure 2: Diagram showing the achievable region for the coding rates. The displayed point $[R_1 = H(S_1 \mid S_2), R_2 = H(S_2)]$ shows the asymmetric compression pair selected in our system. [Plot: $R_2$ versus $R_1$, with $H(S_1 \mid S_2)$, $H(S_1)$, and $H(S_1, S_2)$ marked on the $R_1$ axis and $H(S_2 \mid S_1)$, $H(S_2)$, and $H(S_1, S_2)$ marked on the $R_2$ axis.]

The first practical scheme of distributed source compres-sion exploiting the potential of the Slepian-Wolf theorem wasintroduced by Pradhan and Ramchandran [5]. They focusedon the asymmetric case of compression of a source with sideinformation at the decoder and explored the use of sim-ple channel codes like linear block and trellis codes. If thisasymmetric compression pair can be reached, the other cor-ner point of the Slepian-Wolf rate region can be approachedby swapping the roles of both sources and any point be-tween these two corner points can be realized by time shar-ing. For that reason, most of the recent works reported inthe literature regarding distributed noiseless data compres-sion consider the asymmetric coding problem, although theyuse more powerful codes such as turbo [6, 7] and LDPC[8, 9] schemes. An exception is [10] that deals with sym-metric source compression. In all the above references, ex-cept in [9], the correlation between the sources is very sim-ple because they assume that this correlation does not havememory (i.e., {Xk,Yk}∞k=1 is i.i.d. and P(Xk �= Yk) = p ∀k).In [10], the correlation parameter p is estimated iteratively.However, Garcia-Frias and Zhong in [9] consider a much


Figure 3: Proposed communication system for the joint source-channel coding scheme with side information. The decoder provides an estimate $\tilde{x}_k$ of $x_k$ with the help of the side information sequence $\{y_k\}_{k=1}^{M}$ and the redundant data $\{r_k, z_k\}_{k=1}^{M}$ computed in the turbo encoder. The interleaver $\tau$ decorrelates the output of the sources. [Block diagram: source $S_1$, interleaver $\tau$, turbo encoder (Encoder 1, interleaver $\pi$, Encoder 2), P/S converter, BPSK mapper $\phi$, AWGN channel $\mathcal{N}(0, N_0/2)$, and joint source-channel decoder; source $S_2$, formed by adding the HMM output $\{e_k\}_{k=1}^{M}$ to the output of $S_1$, supplies the side information.]

more general model with hidden Markov correlation and assume that its parameters are known at the decoder.
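The time-sharing argument mentioned above can be written out explicitly. As a sketch, let $\lambda \in [0, 1]$ be the fraction of time spent at the first corner point of the Slepian-Wolf region ($\lambda$ is a symbol introduced here for illustration only). The chain rule $H(S_1, S_2) = H(S_1 \mid S_2) + H(S_2) = H(S_1) + H(S_2 \mid S_1)$ then shows that every time-shared pair stays on the sum-rate boundary:

```latex
\begin{aligned}
(R_1, R_2) &= \lambda \bigl(H(S_1 \mid S_2),\, H(S_2)\bigr)
            + (1 - \lambda) \bigl(H(S_1),\, H(S_2 \mid S_1)\bigr), \\
R_1 + R_2  &= \lambda\, H(S_1, S_2) + (1 - \lambda)\, H(S_1, S_2) = H(S_1, S_2).
\end{aligned}
```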

When one of the channels is noisy, the authors in [11] (for a binary symmetric channel, BSC) and in [12] (for BSC, AWGN, and Rayleigh channels) have proposed joint source-channel coding schemes based on turbo and irregular repeat accumulate (IRA) codes, respectively. In both cases, the correlation among the sources is again assumed to be memoryless and known at the receiver. Under the same correlation assumptions, the case of symmetric joint source-channel coding when both channels are noisy (AWGN) has been studied using turbo [13] and low-density generator-matrix (LDGM) [14] codes. Both assume that the memoryless correlation probability is known at the decoder.

In this paper, we take a further step and consider that the correlation between the sources follows a hidden Markov model (HMM), like the correlation proposed in [9] for distributed source compression. However, unlike what is assumed in [9], our proposed scheme does not require any previous knowledge of the HMM parameters. It is based on an iterative scheme that jointly estimates, within the turbo-decoding process, the parameters of the HMM correlation model. It is an extension of the estimation method presented by Garcia-Frias and Villasenor [15] (for point-to-point transmission of a single HMM source over an AWGN channel) to the distributed joint source-channel coding scenario considered here. As we show in the simulation results, the loss in BER performance that results from the blind estimation of the HMM parameters, compared to their perfect knowledge, is negligible.

The rest of this paper is organized as follows. In the next section, the proposed system is introduced and the iterative joint source-channel decoder is described. Section 3 discusses the simulation results of the joint decoding scheme. Finally, in Section 4, some concluding remarks are given.

2. SYSTEM MODEL

In this section, we present the proposed joint source-channel coding system shown in Figure 3. It uses an iterative decoding scheme that exploits the hidden Markov correlation between sources based on the side information available at the decoder. After describing the model assumed for the correlated sources, the encoding and decoding process is analyzed. We place special emphasis on the description of the iterative decoding algorithm by means of factor graphs and the sum-product algorithm (SPA). For an overview of graphical models and the SPA, we refer to [16].

2.1. Joint source model

We assume the following model for the multiterminal source (MS) sequence $\{X_k, Y_k\}_{k=1}^{\infty}$.

(i) The $X_k$ are i.i.d. binary random variables with probability distribution $P(x_k = 1) = P(x_k = 0) = 0.5$.

(ii) The output $Y_k$ from the source $S_2$ is expressed as $Y_k = X_k \oplus E_k$, where $\oplus$ denotes modulo-2 addition and $E_k$ is a binary random process generated by an HMM with parameters $\{A, B, \Pi\}$. The model is characterized by [17]

(1) the number of states $P$;

(2) the state-transition probability distribution $A = [a_{s,s'}]$, where $a_{s,s'} = P_{S^{\mathrm{MS}}_k \mid S^{\mathrm{MS}}_{k-1}}(s' \mid s)$, $s, s' \in \{0, \dots, P-1\}$;

(3) the observed-symbol probability distribution $B = [b_{s,e}]$, where $b_{s,e} = P_{E_k \mid S^{\mathrm{MS}}_k}(e \mid s)$, $s \in \{0, \dots, P-1\}$, and $e \in \{0, 1\}$;

(4) the initial-state distribution $\Pi = \{\pi_s\}$, where $\pi_s = P_{S^{\mathrm{MS}}_0}(s)$ and $s \in \{0, \dots, P-1\}$.

We may note that for this model, the outputs of both sources $S_1$ and $S_2$ are i.i.d. and equiprobable. Thus, $H(S_1) = H(X_1) = 1$ and $H(S_2) = H(Y_1) = 1$. In contrast, the correlation between the sources does have memory since

$$H(S_1 \mid S_2) = \lim_{n \to \infty} \frac{1}{n} H(X_1, \dots, X_n \mid Y_1, \dots, Y_n) = \lim_{n \to \infty} \frac{1}{n} H(E_1, \dots, E_n) = H(E) < H(E_1), \tag{4}$$

where $H(E)$ denotes the entropy rate of the random sequence $E_k$ generated by the HMM. By changing the parameters of the HMM, different values of $H(S_1 \mid S_2)$ can be obtained. Also notice that, for the particular case where $P = 1$, the correlation is memoryless, resulting in $H(S_1 \mid S_2) = H(E_1) = h(b_{0,1})$; that is, the entropy of a binary random variable with distribution $(b_{0,1}, 1 - b_{0,1})$.
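The inequality $H(E) < H(E_1)$ in (4) can be checked numerically. The following is a minimal sketch, not the authors' code: it simulates a two-state HMM with the first parameter set quoted in the simulation section ($a_{0,0} = 0.97$, $a_{1,1} = 0.98$, $b_{0,0} = 0.05$, $b_{1,0} = 0.95$) and estimates the entropy rate as $-\frac{1}{n}\log_2 P(e_1, \dots, e_n)$ with a scaled forward recursion. The function names, the block length, and the convention that a state first emits a symbol and then moves on are assumptions made for illustration.

```python
import math
import random

def simulate_hmm(a, b, pi, n, rng):
    """Draw e_1..e_n: the current state s emits e with prob. b[s][e], then moves with a[s][.]."""
    states = range(len(pi))
    s = rng.choices(states, weights=pi)[0]
    out = []
    for _ in range(n):
        out.append(rng.choices([0, 1], weights=b[s])[0])
        s = rng.choices(states, weights=a[s])[0]
    return out

def entropy_rate_estimate(e, a, b, pi):
    """Return -(1/n) * log2 P(e_1..e_n), computed with the scaled forward recursion."""
    P = len(pi)
    alpha = list(pi)                      # P(state | past observations)
    log2_prob = 0.0
    for ek in e:
        w = [alpha[s] * b[s][ek] for s in range(P)]
        c = sum(w)                        # P(e_k | e_1..e_{k-1})
        log2_prob += math.log2(c)
        alpha = [sum(w[s] * a[s][t] for s in range(P)) / c for t in range(P)]
    return -log2_prob / len(e)

def binary_entropy(p):
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

a = [[0.97, 0.03], [0.02, 0.98]]          # a[s][s']
b = [[0.05, 0.95], [0.95, 0.05]]          # b[s][e]
pi = [0.4, 0.6]                           # stationary distribution of a
errors = simulate_hmm(a, b, pi, 20000, random.Random(0))
rate = entropy_rate_estimate(errors, a, b, pi)
marginal = binary_entropy(pi[0] * b[0][1] + pi[1] * b[1][1])
print("estimated entropy rate:", rate)
print("marginal entropy H(E_1):", marginal)
```

The estimate is a Monte Carlo quantity, so it varies with the seed and the block length; the point of the sketch is only that it falls well below the marginal entropy, which is what gives the decoder something to exploit.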


Figure 4: Branch transition probabilities from the generic state $S^{\mathrm{MS}}_{k-1}$ to $S^{\mathrm{MS}}_{k}$ of the trellis describing the HMM multiterminal source. The four parallel branches are labeled $T^{\mathrm{MS}}_k(S^{\mathrm{MS}}_{k-1}, S^{\mathrm{MS}}_k, X_k = q, Y_k = v)$ for $q, v \in \{0, 1\}$.

Using the fact that $Y_k = X_k \oplus E_k$, the above model can be reduced to an equivalent HMM that directly outputs the joint sequence $\{X_k, Y_k\}_{k=1}^{\infty}$ without any reference to the variable $E_k$. Its trellis diagram has $P$ states and 4 parallel branches between states, one for each possible output combination $(X_k, Y_k)$ (see Figure 4). The associated branch a priori probabilities are easily obtained from the original HMM and the a priori probabilities $P(x_k)$ of $X_k$. For instance, the branch probability of going from state $s$ to state $s'$, associated with outputs $X_k = q$ and $Y_k = v$, $q \neq v$ ($q = v$), is given by the probability of the following three independent events: $\{S_{k-1} = s, S_k = s'\}$, $\{E_k = 1$ when being in state $s\}$ ($\{E_k = 0$ when being in state $s\}$), and $\{X_k = q\}$; that is, $a_{s,s'} \cdot b_{s,1} \cdot P(x_k = q)$ ($a_{s,s'} \cdot b_{s,0} \cdot P(x_k = q)$). Therefore,

$$T^{\mathrm{MS}}\bigl(S^{\mathrm{MS}}_{k-1} = s,\ S^{\mathrm{MS}}_{k} = s',\ X_k = q,\ Y_k = v\bigr) = \begin{cases} a_{s,s'} \cdot b_{s,0} \cdot 0.5 & \text{if } q = v, \\ a_{s,s'} \cdot b_{s,1} \cdot 0.5 & \text{if } q \neq v, \end{cases} \tag{5}$$

where $q, v \in \{0, 1\}$ and $s, s' \in \{0, \dots, P-1\}$. The MS label for the trellis branch transitions $T^{\mathrm{MS}}_k$ and state variables $S^{\mathrm{MS}}_k$ stands for multiterminal source.
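The branch probabilities in (5) translate directly into a lookup table. The sketch below is illustrative only; the function name and the nested-list layout `T[s][s2][q][v]` are assumptions, with `a[s][s2]` and `b[s][e]` holding the HMM parameters $a_{s,s'}$ and $b_{s,e}$.

```python
def branch_probabilities(a, b):
    """Build T[s][s2][q][v] = a[s][s2] * b[s][q XOR v] * 0.5, following (5)."""
    P = len(a)
    return [[[[a[s][s2] * b[s][q ^ v] * 0.5 for v in (0, 1)]
              for q in (0, 1)]
             for s2 in range(P)]
            for s in range(P)]

# Example with a two-state model (parameter values chosen for illustration).
a = [[0.9, 0.1], [0.15, 0.85]]
b = [[0.05, 0.95], [0.92, 0.08]]
T = branch_probabilities(a, b)

# For every starting state, the probabilities of all outgoing branches add up to one.
totals = [sum(T[s][s2][q][v] for s2 in range(2) for q in (0, 1) for v in (0, 1))
          for s in range(2)]
print(totals)
```

Since $q \oplus v = 0$ exactly when $q = v$, the index `q ^ v` selects $b_{s,0}$ or $b_{s,1}$ as required by the two cases of (5).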

2.2. Turbo encoder

The block sequence $\{x'_k\}_{k=1}^{M} = \{X_1 = x'_1, \dots, X_M = x'_M\}$ produced by a realization of the source $S_1$ is first randomized by the interleaver $\tau$ before entering a turbo encoder with two identical constituent convolutional encoders $C_1$ and $C_2$. The encoded binary sequence is denoted by $\{x'_{\tau(k)}, r'_{\tau(k)}, z'_{\tau(k)}\}_{k=1}^{M}$, where we assume that the coding rate is $R_1 = 1/3$, and $r'_{\tau(k)}$, $z'_{\tau(k)}$ are the redundant symbols produced by $C_1$ and $C_2$, respectively. The input to the AWGN channel is $\{\phi(x'_{\tau(k)}), \phi(r'_{\tau(k)}), \phi(z'_{\tau(k)})\}_{k=1}^{M}$, where $\phi : \{0, 1\} \to \mathbb{R}$ denotes the BPSK transformation performed by the modulator. Finally, the corresponding received sequence will be denoted by $\{\hat{x}_{\tau(k)}, \hat{r}_{\tau(k)}, \hat{z}_{\tau(k)}\}_{k=1}^{M}$.

2.3. Joint source-channel decoder

To better understand the joint source-channel decoder with side information, we begin by analyzing a simplified decoder that bases its decisions only on

(i) the received systematic symbols $\{\hat{x}_k\}_{k=1}^{M}$;

(ii) the side information sequence $\{\hat{y}_k\}_{k=1}^{M}$ generated by a realization of the source $S_2$.

The decoder will decide for the $X_k \in \{0, 1\}$ that maximizes the a posteriori probability $P(x_k \mid \{\hat{x}_j, \hat{y}_j\}_{j=1}^{M})$ (MAP decoder). This is done via the forward-backward algorithm, also known as MAP or BCJR [18]. This algorithm is a particularization of the SPA applied to factor graphs derived from an HMM or a trellis diagram, and it is an efficient marginalization procedure based on message-passing rules among the nodes in a factor graph.

From the trellis description of our source model (see Figure 4), the joint probability distribution function of the random variables $\{X_k\}_{k=1}^{M}$ conditioned on the observations $\{\hat{x}_j\}_{j=1}^{M}$ and the side information $\{\hat{y}_j\}_{j=1}^{M}$, that is, $P(x_1, \dots, x_M \mid \{\hat{x}_j, \hat{y}_j\}_{j=1}^{M})$, can be decomposed in terms of factors, one for each time instant $k$. In turn, this factorization may be represented by a factor graph [16], like the one shown in Figure 5. We keep the same convention used in [16], representing in lower case the variables involved in a factor graph. There should be no confusion from the context whether $x$ denotes an ordinary variable taking on values in some finite alphabet $\mathcal{X}$, or the realization of some random variable $X$.

Since the channel is AWGN, the local functions of $x_k$, $P(\hat{x}_k \mid x_k)$, are given by the Gaussian distribution $\mathcal{N}(\phi(x_k), N_0/2)$. On the other hand, the local functions $I_{\hat{y}_k}(y_k)$ are indicator functions taking the value 1 when $y_k = \hat{y}_k$ and 0 otherwise. This reflects the fact that the output of the source $S_2$ is known with certainty at the decoder.
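As a small illustration of the Gaussian local function, the sketch below evaluates the density $\mathcal{N}(\phi(x), N_0/2)$ at a received sample. The BPSK mapping $0 \mapsto +1$, $1 \mapsto -1$ is an assumption (the text only specifies $\phi : \{0, 1\} \to \mathbb{R}$), and the function names are hypothetical.

```python
import math

def bpsk(bit):
    # Assumed mapping: 0 -> +1.0, 1 -> -1.0.
    return 1.0 - 2.0 * bit

def channel_likelihood(received, bit, n0):
    """Density of N(phi(bit), n0 / 2) evaluated at the received sample."""
    var = n0 / 2.0
    return math.exp(-(received - bpsk(bit)) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

# A positive sample is better explained by bit 0 under the assumed mapping.
like0 = channel_likelihood(0.8, 0, n0=1.0)
like1 = channel_likelihood(0.8, 1, n0=1.0)
print(like0, like1)
```

Only the ratio of the two likelihoods matters to the decoder, because every message in the factor graph is defined up to a scale factor.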

Based on this factor graph, the decoder can now efficiently compute the a posteriori probability $P(x_k \mid \{\hat{x}_j, \hat{y}_j\}_{j=1}^{M})$ by marginalizing $P(x_1, \dots, x_M \mid \{\hat{x}_j, \hat{y}_j\}_{j=1}^{M})$ via the SPA which, in this case, reduces to the forward-backward algorithm.

In particular, the forward and backward recursion parameters $\alpha^{\mathrm{MS}}_{k-1}(s^{\mathrm{MS}}_{k-1})$ and $\beta^{\mathrm{MS}}_{k}(s^{\mathrm{MS}}_{k})$ defined in the forward-backward algorithm are the messages passed from the state variable node $s^{\mathrm{MS}}_{k-1}$ to the factor node $T^{\mathrm{MS}}_k$ and from the state variable node $s^{\mathrm{MS}}_{k}$ to $T^{\mathrm{MS}}_k$, respectively. From the sum-product update rules, the following expressions are obtained for these messages:

$$\alpha^{\mathrm{MS}}_{k}(s_k) = \sum_{\sim\{s_k\}} \alpha^{\mathrm{MS}}_{k-1}(s_{k-1}) \cdot T^{\mathrm{MS}}_{k}(s_{k-1}, s_k, x_k, y_k) \cdot P(\hat{x}_k \mid x_k) \cdot I_{\hat{y}_k}(y_k), \quad k = 1, \dots, M, \tag{6}$$

$$\beta^{\mathrm{MS}}_{k}(s_k) = \sum_{\sim\{s_k\}} \beta^{\mathrm{MS}}_{k+1}(s_{k+1}) \cdot T^{\mathrm{MS}}_{k+1}(s_k, s_{k+1}, x_{k+1}, y_{k+1}) \cdot P(\hat{x}_{k+1} \mid x_{k+1}) \cdot I_{\hat{y}_{k+1}}(y_{k+1}), \quad k = M-1, \dots, 1, \tag{7}$$

where $x_k, y_k \in \{0, 1\}$, $s_{k-1}, s_k \in \{0, \dots, P-1\}$, and $\sum_{\sim\{s_k\}}$ indicates that all variables are being summed over except the variable $s_k$. The superscript MS on the state variables has been omitted for clarity's sake. The initialization is done by setting $\alpha^{\mathrm{MS}}_{0}(j) = \pi_j$ and $\beta^{\mathrm{MS}}_{M}(j) = 1/P$ for all $j \in \{0, \dots, P-1\}$. Once the $\alpha^{\mathrm{MS}}_{k}(s_k)$ and $\beta^{\mathrm{MS}}_{k}(s_k)$ have been computed, the messages $\delta^{\mathrm{MS}}_{k}(x_k)$, passed from the factor nodes $T^{\mathrm{MS}}_{k}(s^{\mathrm{MS}}_{k-1}, s^{\mathrm{MS}}_{k}, x_k, y_k)$ to the variable nodes $x_k$, are obtained by the SPA update rules as

$$\delta^{\mathrm{MS}}_{k}(x_k) = \sum_{\sim\{x_k\}} \alpha^{\mathrm{MS}}_{k-1}(s_{k-1}) \cdot T^{\mathrm{MS}}_{k}(s_{k-1}, s_k, x_k, y_k) \cdot \beta^{\mathrm{MS}}_{k}(s_k) \cdot I_{\hat{y}_k}(y_k), \quad k = 1, \dots, M. \tag{8}$$

Figure 5: Simplified factor graph defined by the trellis of Figure 4. For simplicity, only $M = 3$ stages have been drawn.

The a posteriori probability $P(x_k \mid \{\hat{x}_j, \hat{y}_j\}_{j=1}^{M})$ is now calculated as the product of all the messages arriving at variable node $x_k$. In our case, the message passed from the local function node $P(\hat{x}_k \mid x_k)$ to the variable node $x_k$ is simply the probability function itself, whereas the message passed from the local function node $T^{\mathrm{MS}}_{k}(s^{\mathrm{MS}}_{k-1}, s^{\mathrm{MS}}_{k}, x_k, y_k)$ to the variable node $x_k$ is $\delta^{\mathrm{MS}}_{k}(x_k)$ (see Figure 5). Therefore,

$$P\bigl(x_k \mid \{\hat{x}_j, \hat{y}_j\}_{j=1}^{M}\bigr) \propto P(\hat{x}_k \mid x_k) \cdot \delta^{\mathrm{MS}}_{k}(x_k). \tag{9}$$
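The simplified decoder described by (6)-(9) can be sketched in a few lines. This is an illustrative implementation under stated assumptions rather than the authors' code: the BPSK mapping $0 \mapsto +1$, $1 \mapsto -1$, the function names, and the per-step normalization of the messages (which leaves the normalized posterior unchanged) are all choices made here. Python lists are 0-based, so list index `k` holds the quantities of trellis section $k + 1$.

```python
import math

def gaussian(received, bit, n0):
    # P(x_hat | x) for BPSK over AWGN; the mapping 0 -> +1, 1 -> -1 is assumed.
    var = n0 / 2.0
    return math.exp(-(received - (1.0 - 2.0 * bit)) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def normalise(v):
    total = sum(v)
    return [x / total for x in v]

def simplified_map_decoder(x_hat, y_hat, a, b, pi, n0):
    """Return P(x_k = 1 | x_hat, y_hat) for every k, using (6)-(9)."""
    M, P = len(x_hat), len(pi)

    def T(s, s2, x, y):
        # Branch probability of (5); the indicator fixes y to the observed y_hat.
        return a[s][s2] * b[s][x ^ y] * 0.5

    lik = [[gaussian(x_hat[k], x, n0) for x in (0, 1)] for k in range(M)]

    # Forward recursion (6), with alpha[0] = pi.
    alpha = [normalise(pi)]
    for k in range(M):
        alpha.append(normalise([
            sum(alpha[k][s] * T(s, s2, x, y_hat[k]) * lik[k][x]
                for s in range(P) for x in (0, 1))
            for s2 in range(P)]))

    # Backward recursion (7), with beta[M] = 1/P.
    beta = [None] * (M + 1)
    beta[M] = [1.0 / P] * P
    for k in range(M - 1, 0, -1):
        beta[k] = normalise([
            sum(beta[k + 1][s2] * T(s, s2, x, y_hat[k]) * lik[k][x]
                for s2 in range(P) for x in (0, 1))
            for s in range(P)])

    # Messages delta (8) and posterior (9).
    posterior = []
    for k in range(M):
        delta = [sum(alpha[k][s] * T(s, s2, x, y_hat[k]) * beta[k + 1][s2]
                     for s in range(P) for s2 in range(P)) for x in (0, 1)]
        posterior.append(normalise([lik[k][x] * delta[x] for x in (0, 1)])[1])
    return posterior

# Memoryless special case (P = 1) with uninformative channel outputs:
# the posterior is driven by the side information alone.
print(simplified_map_decoder([0.0, 0.0], [1, 0], [[1.0]], [[0.9, 0.1]], [1.0], n0=1.0))
```

In the single-state example, the channel samples carry no information, so the posterior of $x_k = 1$ reduces to the probability that the error bit makes $x_k$ and $\hat{y}_k$ agree or disagree, as the indicator functions dictate.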

The problem we want to solve in this paper is an extension of what we have just analyzed. The joint decoder must compute the a posteriori probability of the symbol $X_k$ by observing not only the corresponding received symbols $\{\hat{x}_j\}_{j=1}^{M}$ and the side information $\{\hat{y}_j\}_{j=1}^{M}$ as described before, but also the additional outputs of the channel $\{\hat{r}_j\}_{j=1}^{M}$ and $\{\hat{z}_j\}_{j=1}^{M}$, that is, $P(x_k \mid \{\hat{x}_j, \hat{r}_j, \hat{z}_j, \hat{y}_j\}_{j=1}^{M})$. The global factor graph results from properly attaching, through interleaver $\tau$, the factor graph describing a standard turbo decoder to the graph in Figure 5.

Figure 6 shows this arrangement. Observe that the three subfactor graphs have the same topology since each models a trellis (with different parameters); namely, the trellises of the two constituent convolutional decoders and the trellis of the multiterminal source.

As with the standard factor graph of a turbo decoder, the compound factor graph has cycles and the sum-product algorithm has no natural termination. To overcome this problem, the following schedule has been adopted. During the $i$th iteration, a standard SPA is separately applied to each of the three factor graphs describing the decoders D1, D2, and the multiterminal source, in this order: MS → D1 → D2. Since these subfactor graphs do not have cycles, the corresponding SPAs will terminate. Notice, however, that the updating rules for the SPA, when applied to one of the subfactor graphs, require incoming messages from the other two subfactor graphs (called extrinsic information in turbo-decoding jargon), since all share the same variable nodes $x_{\tau(k)}$. The messages computed in the previous steps are used for that purpose.

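For example, referring to Figure 6, the former SPA update expressions (see (6)–(8)) are now modified to include the extrinsic information $\xi^{\mathrm{MS}}_{k,i}(x_k)$ coming from D1 and D2 (i.e., from the turbo-decoding iteration), instead of $P(\hat{x}_k \mid x_k)$. That is,

$$\alpha^{\mathrm{MS}}_{k,i}(s_k) = \sum_{\sim\{s_k\}} \alpha^{\mathrm{MS}}_{k-1,i}(s_{k-1}) \cdot T^{\mathrm{MS}}_{k}(s_{k-1}, s_k, x_k, y_k) \cdot \xi^{\mathrm{MS}}_{k,i}(x_k) \cdot I_{\hat{y}_k}(y_k), \quad k = 1, \dots, M, \tag{10}$$

$$\beta^{\mathrm{MS}}_{k,i}(s_k) = \sum_{\sim\{s_k\}} \beta^{\mathrm{MS}}_{k+1,i}(s_{k+1}) \cdot T^{\mathrm{MS}}_{k+1}(s_k, s_{k+1}, x_{k+1}, y_{k+1}) \cdot \xi^{\mathrm{MS}}_{k+1,i}(x_{k+1}) \cdot I_{\hat{y}_{k+1}}(y_{k+1}), \quad k = M-1, \dots, 1, \tag{11}$$

$$\delta^{\mathrm{MS}}_{k,i}(x_k) = \sum_{\sim\{x_k\}} \alpha^{\mathrm{MS}}_{k-1,i}(s_{k-1}) \cdot T^{\mathrm{MS}}_{k}(s_{k-1}, s_k, x_k, y_k) \cdot \beta^{\mathrm{MS}}_{k,i}(s_k) \cdot I_{\hat{y}_k}(y_k), \quad k = 1, \dots, M, \tag{12}$$

where the subindex $i$ denotes the current iteration. The extrinsic information $\xi^{\mathrm{MS}}_{k,i}(x_k)$ is the message passed from the variable node $x_k$ to the factor node $T^{\mathrm{MS}}_{k}$ through interleaver $\tau$ (see Figure 6). Using the SPA update rules, this is given by

$$\xi^{\mathrm{MS}}_{k,i}(x_k) = \delta^{\mathrm{D1}}_{k,i-1}(x_k) \cdot \delta^{\mathrm{D2}}_{k,i-1}(x_k) \cdot P(\hat{x}_k \mid x_k), \quad k = 1, \dots, M. \tag{13}$$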
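The combination rule (13) is a pointwise product of the messages arriving at the variable node from the other subgraphs. A minimal sketch follows; the function name is hypothetical, and the final normalization is an implementation convenience added here (messages are only defined up to a scale factor), not part of (13).

```python
def extrinsic_to_ms(delta_d1, delta_d2, channel):
    """xi(x) proportional to delta_D1(x) * delta_D2(x) * P(x_hat | x), as in (13).

    Each argument is a two-element list indexed by the bit value x in {0, 1}.
    """
    raw = [delta_d1[x] * delta_d2[x] * channel[x] for x in (0, 1)]
    total = sum(raw)
    return [r / total for r in raw]

# With uninformative decoder messages, only the channel observation remains.
xi = extrinsic_to_ms([0.5, 0.5], [0.5, 0.5], [0.8, 0.2])
print(xi)
```

Setting both decoder messages to 0.5 mimics the first pass, where D1 and D2 have not yet produced any information.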

With the obvious modifications, the same set of recursions also holds for the factor graphs D1 and D2. Observe that the SPA applied to D1 and D2 is nothing more than the standard turbo-decoding procedure modified to include the extrinsic information $\delta^{\mathrm{MS}}_{k,i-1}(x_k)$ coming from the MS.

After $L$ iterations, the a posteriori probabilities $P(x_{\tau(k)} \mid \{\hat{x}_j, \hat{r}_j, \hat{z}_j, \hat{y}_j\}_{j=1}^{M})$ are calculated as the product of all messages arriving at variable node $x_{\tau(k)}$, that is,

$$P\bigl(x_{\tau(k)} \mid \{\hat{x}_j, \hat{r}_j, \hat{z}_j, \hat{y}_j\}_{j=1}^{M}\bigr) \propto \delta^{\mathrm{D1}}_{\tau(k),L}(x_{\tau(k)}) \cdot \delta^{\mathrm{D2}}_{\tau(k),L}(x_{\tau(k)}) \cdot \delta^{\mathrm{MS}}_{\tau(k),L}(x_{\tau(k)}) \cdot P(\hat{x}_{\tau(k)} \mid x_{\tau(k)}), \quad k = 1, \dots, M. \tag{14}$$

Finally, the estimated source symbol at $\tau(k)$ is given by $\arg\max_{x_{\tau(k)} \in \{0,1\}} P(x_{\tau(k)} \mid \{\hat{x}_j, \hat{r}_j, \hat{z}_j, \hat{y}_j\}_{j=1}^{M})$.


If the local functions $I_{\hat{y}_k}(y_k)$ in the factor nodes of Figure 6 were substituted by $P(y_k) = 0.5$ (i.e., if no side information were available at the decoder or the sources were not correlated), the resulting normalized messages from the SPA would be $\delta^{\mathrm{MS}}_{k,i}(x_k) = 0.5$ for all $k$, $i$ and all values of the variables $x_k$ (reflecting the fact that the source $S_1$ is i.i.d. and equiprobable). In other words, the subfactor graph of the MS would be superfluous and the decoder would reduce to a standard turbo decoder. Should we assume for $S_1$ (see Figure 3) a two-state HMM source, like the one considered in [15], instead of an i.i.d. source, the resulting overall MS HMM, combining both HMM models (for $\{E_k\}$ and $\{X_k\}$), would have $2P$ states with 4 branches between states. The corresponding branch probabilities in (5) would have to be modified accordingly. In the absence of side information, the MS factor graph would reduce to that describing the HMM of the source $S_1$. As a result, our decoding process would coincide with the scheme studied in [15].

2.4. Iterative estimation of the HMM parameters of the multiterminal source model

The updating equations (10)–(12) require knowledge of the HMM parameters $\{A, B, \Pi\}$, since they appear in the definition of the branch transition probabilities in (5). However, in most cases, this information is not available. Therefore, the joint decoder must additionally estimate these parameters. The proposed estimation method is based on a modification of the iterative Baum-Welch algorithm (BWA) [17], which was first applied in [15] to estimate the parameters of a hidden Markov source in a point-to-point transmission scenario. The underlying idea is to use the BWA over the trellis associated with the multiterminal source by reusing the SPA messages computed at each iteration.

For the derivation of the reestimation formulas, it is convenient to define the functions $a_i(s, s')$, $b_i(s, e)$, and $\pi_i(s)$, where $s$, $s'$, and $e$ are variables taking on values in $\{0, \dots, P-1\}$ and $\{0, 1\}$, respectively. The index $i$ denotes the iteration number, and the values taken by these functions at iteration $i$ are the reestimated distributions of the probability of going from state $s$ to state $s'$, the probability that the HMM outputs the symbol $e$ when being in state $s$, and the probability that the initial state of the HMM is $s$, respectively. With this new notation, the local functions $T^{\mathrm{MS}}_{k}$ of (5) in the MS factor graph, now written as $T^{\mathrm{MS}}_{k,i}(s_{k-1}, s_k, x_k, y_k, e_k)$, will depend on $i$, yielding

$$T^{\mathrm{MS}}_{k,i}(s_{k-1}, s_k, x_k, y_k, e_k) = \begin{cases} a_{i-1}(s_{k-1}, s_k) \cdot b_{i-1}(s_k, 0) \cdot 0.5 & \text{if } x_k = y_k,\ e_k = 0, \\ a_{i-1}(s_{k-1}, s_k) \cdot b_{i-1}(s_k, 1) \cdot 0.5 & \text{if } x_k \neq y_k,\ e_k = 1, \\ 0 & \text{elsewhere.} \end{cases} \tag{15}$$

Notice that the variable $e_k$ is explicitly included in the argument of $T^{\mathrm{MS}}_{k,i}$ since access to this variable is required when obtaining the reestimation formula for $b_i(s, e)$ in (17).

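Having said that, the reestimation expressions for these functions are easily derived by realizing that the conditional probability $P(s_{k-1}, s_k, x_k, y_k, e_k \mid \{\hat{x}_j, \hat{r}_j, \hat{z}_j, \hat{y}_j\}_{j=1}^{M})$ at iteration $i$ is proportional to the product $\alpha^{\mathrm{MS}}_{k-1,i}(s_{k-1}) \cdot T^{\mathrm{MS}}_{k,i}(s_{k-1}, s_k, x_k, y_k, e) \cdot \beta^{\mathrm{MS}}_{k,i}(s_k) \cdot \xi^{\mathrm{MS}}_{k,i}(x_k) \cdot I_{\hat{y}_k}(y_k)$. Using this fact in the BWA, the following reestimation equations are obtained:

$$a_i(s, s') = \frac{\sum_{k=1}^{M} \sum_{\sim\{s, s'\}} \alpha^{\mathrm{MS}}_{k-1,i}(s) \cdot T^{\mathrm{MS}}_{k,i}(s, s', x_k, y_k, e) \cdot \beta^{\mathrm{MS}}_{k,i}(s') \cdot \xi^{\mathrm{MS}}_{k,i}(x_k) \cdot I_{\hat{y}_k}(y_k)}{\sum_{k=1}^{M} \sum_{\sim\{s\}} \alpha^{\mathrm{MS}}_{k-1,i}(s) \cdot T^{\mathrm{MS}}_{k,i}(s, s', x_k, y_k, e) \cdot \beta^{\mathrm{MS}}_{k,i}(s') \cdot \xi^{\mathrm{MS}}_{k,i}(x_k) \cdot I_{\hat{y}_k}(y_k)}, \tag{16}$$

$$b_i(s, e) = \frac{\sum_{k=1}^{M} \sum_{\sim\{s, e\}} \alpha^{\mathrm{MS}}_{k-1,i}(s) \cdot T^{\mathrm{MS}}_{k,i}(s, s', x_k, y_k, e) \cdot \beta^{\mathrm{MS}}_{k,i}(s') \cdot \xi^{\mathrm{MS}}_{k,i}(x_k) \cdot I_{\hat{y}_k}(y_k)}{\sum_{k=1}^{M} \sum_{\sim\{s\}} \alpha^{\mathrm{MS}}_{k-1,i}(s) \cdot T^{\mathrm{MS}}_{k,i}(s, s', x_k, y_k, e) \cdot \beta^{\mathrm{MS}}_{k,i}(s') \cdot \xi^{\mathrm{MS}}_{k,i}(x_k) \cdot I_{\hat{y}_k}(y_k)}, \tag{17}$$

$$\pi_i(s) = \frac{\sum_{\sim\{s\}} \alpha^{\mathrm{MS}}_{0,i}(s) \cdot T^{\mathrm{MS}}_{1,i}(s, s', x_1, y_1, e) \cdot \beta^{\mathrm{MS}}_{1,i}(s') \cdot \xi^{\mathrm{MS}}_{1,i}(x_1) \cdot I_{\hat{y}_1}(y_1)}{\sum_{\sim\{\emptyset\}} \alpha^{\mathrm{MS}}_{0,i}(s) \cdot T^{\mathrm{MS}}_{1,i}(s, s', x_1, y_1, e) \cdot \beta^{\mathrm{MS}}_{1,i}(s') \cdot \xi^{\mathrm{MS}}_{1,i}(x_1) \cdot I_{\hat{y}_1}(y_1)}. \tag{18}$$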

The $\sum_{\sim\{\emptyset\}}$ in the denominator of (18) indicates that all variables are summed over. At iteration $i$, the above expressions are computed after the SPA has been applied to MS, D1, and D2. We have noticed that (18) may be omitted whenever the block length is large enough (the initial $\alpha^{\mathrm{MS}}_{0,i}(j)$ can be set to $1/P$ for all $j \in \{0, \dots, P-1\}$). We now give a brief summary of the proposed iterative decoding scheme.

(i) Phase I: $i = 0$.

(1) Perform the SPA over the factor graphs that describe the decoders D1 and D2 without considering the extrinsic information coming from the MS block (i.e., with $\delta^{\mathrm{MS}}_{k,0}(x_k) = 0.5$ for all $k \in \{1, \dots, M\}$). For each $k$, obtain an initial estimate $\tilde{x}_k$ of the source symbol $x_k$ by $\tilde{x}_k = \arg\max_{x_k \in \{0,1\}} P(x_k \mid \{\hat{x}_j, \hat{r}_j, \hat{z}_j\}_{j=1}^{M})$. Notice that this is equivalent to considering only the turbo decoder.

(2) Based on the observation $\tilde{e}_k = \tilde{x}_k \oplus \hat{y}_k$, apply the standard BWA [17] to obtain an initial estimate of the Markov parameters $a_0(s, s')$, $b_0(s, e)$, and $\pi_0(s)$, $e \in \{0, 1\}$, $s, s' \in \{0, \dots, P-1\}$.

(ii) Phase II: $i \geq 1$.

(3) $i = i + 1$.

(4) Perform the SPA over the MS factor graph using the functions $T^{\mathrm{MS}}_{k,i}$ in (15) as factor nodes. This will produce the set of messages $\delta^{\mathrm{MS}}_{k,i}(x_k)$.

(5) Perform the SPA over the factor graphs D1 and D2 with messages $\delta^{\mathrm{MS}}_{k,i}(x_k)$ as extrinsic information coming from the factor graph MS.

(6) Reestimate the HMM parameters using (16)–(18), and go back to step 3.

Figure 6: Assembly of the standard turbo decoder to the factor graph in Figure 5. For simplicity, the data length has been fixed to $M = 4$. [The graph comprises three subgraphs (decoder D1, decoder D2, and the multiterminal source) that share the variable nodes $x_{\tau(k)}$ and are connected through the interleavers $\pi$ and $\tau$.]
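Step 2 of Phase I relies on the standard Baum-Welch algorithm applied to a hard-decision error sequence. The sketch below shows one reestimation pass for a binary-output HMM, following the scaled forward-backward formulation of Rabiner's tutorial [17], in which the state at time $k$ emits $e_k$. It is an illustration under that assumption, not the authors' implementation; the function names, the test sequence, and the initial guesses are all hypothetical.

```python
import math

def forward_backward(e, a, b, pi):
    """Scaled forward and backward variables for a discrete-output HMM."""
    n, P = len(e), len(pi)
    alpha, scale = [], []
    for k in range(n):
        if k == 0:
            raw = [pi[s] * b[s][e[0]] for s in range(P)]
        else:
            raw = [sum(alpha[k - 1][s] * a[s][t] for s in range(P)) * b[t][e[k]]
                   for t in range(P)]
        c = sum(raw)
        scale.append(c)
        alpha.append([v / c for v in raw])
    beta = [[1.0] * P for _ in range(n)]
    for k in range(n - 2, -1, -1):
        beta[k] = [sum(a[s][t] * b[t][e[k + 1]] * beta[k + 1][t] for t in range(P)) / scale[k + 1]
                   for s in range(P)]
    return alpha, beta, scale

def baum_welch_step(e, a, b, pi):
    """One reestimation pass; also returns the log-likelihood of the input parameters."""
    n, P = len(e), len(pi)
    alpha, beta, scale = forward_backward(e, a, b, pi)
    gamma = [[alpha[k][s] * beta[k][s] for s in range(P)] for k in range(n)]
    new_a = [[0.0] * P for _ in range(P)]
    for k in range(n - 1):
        for s in range(P):
            for t in range(P):
                # Posterior probability of the transition s -> t at time k.
                new_a[s][t] += alpha[k][s] * a[s][t] * b[t][e[k + 1]] * beta[k + 1][t] / scale[k + 1]
    new_a = [[v / sum(row) for v in row] for row in new_a]
    new_b = []
    for s in range(P):
        total = sum(gamma[k][s] for k in range(n))
        ones = sum(gamma[k][s] for k in range(n) if e[k] == 1)
        new_b.append([(total - ones) / total, ones / total])
    loglik = sum(math.log(c) for c in scale)
    return new_a, new_b, gamma[0], loglik

# Hypothetical error sequence with bursts, and arbitrary initial guesses.
e = [1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1] * 5
a, b, pi = [[0.8, 0.2], [0.3, 0.7]], [[0.3, 0.7], [0.6, 0.4]], [0.5, 0.5]
logliks = []
for _ in range(10):
    a, b, pi, ll = baum_welch_step(e, a, b, pi)
    logliks.append(ll)
print(logliks[0], logliks[-1])
```

The reestimation in (16)-(18) differs from this textbook version in that the hard observations are replaced by the soft SPA messages $\xi^{\mathrm{MS}}_{k,i}$; the sketch only covers the initialization step, where hard decisions are available.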

3. SIMULATION RESULTS

In order to assess the performance of the proposed joint decoding/estimation scheme, a simulation has been carried out using different values of the conditional entropy rate $H(S_1 \mid S_2)$. The two constituent convolutional encoders $C_1$ and $C_2$ of the turbo code are characterized by the generator polynomial $g(Z) = [1, (Z^3 + Z^2 + Z + 1)/(Z^3 + Z^2 + 1)]$. In all simulated cases, the number of states $P$ of the HMM characterizing the joint source correlation has been set to 2. Performance comparisons with and without the decoder having a priori knowledge of the hidden Markov parameters are presented.
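For concreteness, the parity output of one constituent encoder can be sketched as a shift register. The code below assumes that $Z$ denotes a one-symbol delay, that the register starts in the all-zero state, and that no trellis termination is applied; none of these details is stated in the text, and the function name is hypothetical.

```python
def rsc_parity(bits):
    """Parity stream of a recursive systematic convolutional encoder with
    feedback polynomial 1 + Z^2 + Z^3 and feedforward polynomial 1 + Z + Z^2 + Z^3."""
    w1 = w2 = w3 = 0                      # register contents w[k-1], w[k-2], w[k-3]
    parity = []
    for u in bits:
        w = u ^ w2 ^ w3                   # feedback taps (denominator of g(Z))
        parity.append(w ^ w1 ^ w2 ^ w3)   # feedforward taps (numerator of g(Z))
        w1, w2, w3 = w, w1, w2
    return parity

u = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]
p = rsc_parity(u)
print(p)
```

A convenient sanity check is that the parity stream satisfies $(1 + Z^2 + Z^3)\,p(Z) = (1 + Z + Z^2 + Z^3)\,u(Z)$ over GF(2), which is exactly the transfer function given above.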

The simulation uses 2000 blocks of 16384 binary symbols each, and the maximum number of iterations is fixed to 35. Figure 7 displays the bit error ratio (BER) versus $E_b/N_0$ for two different values of the conditional entropy rate, $H(S_1 \mid S_2) = 0.45$ and $0.73$, and for the rate-1/3 standard turbo decoder. The HMM that generates the stationary random process $E_k$, giving rise to $H(S_1 \mid S_2) = 0.45$ ($0.73$), has transition probabilities $a_{0,0} = 0.97$ ($0.9$), $a_{1,1} = 0.98$ ($0.85$) and output probabilities $b_{0,0} = 0.05$ ($0.05$), $b_{1,0} = 0.95$ ($0.92$). In both cases, the initial-state distribution $\Pi$ is the corresponding stationary distribution of the chain.

As opposed to what happens to the joint probability distribution of $(E_1, \dots, E_n)$, the marginal distribution $P_{E_k}(e_k)$ is easily computed as $P_{E_k}(e_k) = \pi_1 \cdot b_{1,e_k} + \pi_0 \cdot b_{0,e_k}$ for all $k$. It can be checked that in both models this distribution is nearly equiprobable, giving a value for the entropy $H(E_k)$ of approximately 0.98. Since $H(X_k \mid Y_k) = H(E_k) \approx H(X_k)$, we have that $P_{X_k \mid Y_k}(x_k \mid y_k) \approx P_{X_k}(x_k)$, that is, the random variables $X_k$ and $Y_k$ are practically independent. Therefore, the correlation between the processes $\{X_k\}_{k=1}^{\infty}$ and $\{Y_k\}_{k=1}^{\infty}$ is embedded in the memory of the joint process $\{X_k, Y_k\}_{k=1}^{\infty}$ (see (4)).
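The marginal entropy quoted above can be reproduced from the listed parameters. The following sketch uses the closed-form stationary distribution of a two-state chain; the function names are hypothetical.

```python
import math

def stationary_two_state(a00, a11):
    """Stationary distribution [pi_0, pi_1] of a two-state Markov chain."""
    p01, p10 = 1.0 - a00, 1.0 - a11       # leave-state probabilities
    return [p10 / (p01 + p10), p01 / (p01 + p10)]

def binary_entropy(p):
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def marginal_entropy(a00, a11, b00, b10):
    """H(E_k) for the stationary chain, with b00 = P(E=0 | s=0) and b10 = P(E=0 | s=1)."""
    pi = stationary_two_state(a00, a11)
    p_e0 = pi[0] * b00 + pi[1] * b10
    return binary_entropy(p_e0)

h_045 = marginal_entropy(0.97, 0.98, 0.05, 0.95)   # model with H(S1|S2) = 0.45
h_073 = marginal_entropy(0.90, 0.85, 0.05, 0.92)   # model with H(S1|S2) = 0.73
print(h_045, h_073)
```

Both values come out close to 1 bit, in line with the figure of roughly 0.98 given in the text, which confirms that a decoder looking only at the marginal statistics would see almost no correlation.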

The standard turbo-decoder curve has been included in Figure 7 for reference. It shows the performance degradation that the proposed joint decoder would incur, should the side information not be used in the decoding algorithm (or, equivalently, if no correlation exists between both sources, i.e., $H(S_1 \mid S_2) = H(S_1) = 1$).

For comparison purposes, the three theoretical limits $-0.55$, $-2.2$, and $-4.6$ dB given in (3), corresponding to $H(S_1 \mid S_2) = 1$, $0.73$, and $0.45$, respectively, are also shown as vertical lines in Figure 7. For $H(S_1 \mid S_2) = 0.73$ and $H(S_1 \mid S_2) = 0.45$, one set of BER curves represents the performance when perfect knowledge of the joint source parameters is available at the decoder, whereas the other set displays the performance when no initial knowledge is available at the joint decoder. In the latter case, the estimation of the HMM parameters is run afresh for each input block, that is, without relying on any previous reestimation information.

Figure 7: BER versus $E_b/N_0$ (dB) for entropy values $H(S_1 \mid S_2) = 1.0$ (rate-1/3 standard turbo decoder), $0.73$, and $0.45$ after 35 iterations. The results for known and unknown HMM parameters are depicted with different markers. The theoretical Shannon limits are represented by the vertical solid lines. The BER range is bounded at $1/M$ (less than one error in $M = 16384$ bits).

Observe that the degradation in performance due to the lack of a priori knowledge of the source correlation statistics is negligible. We may also note that, at a given BER, the gap between the required $E_b/N_0$ and its corresponding theoretical limit widens as the conditional entropy rate decreases (i.e., as the amount of correlation between sources increases). In particular, at BER $= 10^{-4}$, the gaps are 0.65, 1, and 2.4 dB, respectively. As mentioned in [13] for the memoryless case, when the correlation between the sequences is very strong, the side information can be interpreted as an additional systematic output of the turbo decoder. As is well known in the turbo-code literature, this repetition involves a penalty in performance.

The set of curves in Figure 8 illustrates the BER performance versus $E_b/N_0$ as the number of iterations increases. Plots 8a and 8b are for the conditional entropy rates $H(S_1 \mid S_2) = 0.45$ and $H(S_1 \mid S_2) = 0.73$, respectively. Although the BER performance is similar in both cases, the convergence rate when the decoder estimates the parameters of the HMM is slower, as expected.

Finally, suppose that the joint decoder is implemented assuming that the correlation between sources is memoryless (as in [13]), that is, the state variables in the MS factor graph can only take a single value $s_k = 0$, and the factor nodes $T^{\mathrm{MS}}_{k}$ in (5) have $a_{0,0} = 1$ and $b_{0,0} = P_{E_k}(0)$. As a result, we would not achieve any performance improvement with respect to the case of no side information. As previously mentioned, the reason is that with this decoder, the compression rate for source $S_1$ would be limited to $H(X_k \mid Y_k) = H(E_1) \approx H(X_k)$, implying that there is practically no correlation (of depth $n = 1$) between $S_1$ and $S_2$.

Figure 8: BER versus $E_b/N_0$ (dB) for several iteration numbers (1, 5, 10, 20, and 35): (a) $H(S_1 \mid S_2) = 0.45$ and (b) $H(S_1 \mid S_2) = 0.73$. The label BWA stands for the case where the HMM parameters are iteratively estimated.


4. CONCLUSIONS

Given two binary correlated sources with hidden Markov correlation, this paper proposes an asymmetric distributed joint source-channel coding scheme for the transmission of one of the sources over an AWGN channel. We assume that the other source output is available as side information at the receiver. A turbo encoder and a joint decoder are used to exploit the Markov correlation between the sources. We show that, when the correlation statistics are not initially known at the decoder, they can be estimated jointly within the iterative decoding process with negligible performance degradation. Simulation results show that the system operates at signal-to-noise ratios close to those established by the combination of the Shannon and Slepian-Wolf theorems.

REFERENCES

[1] D. Slepian and J. Wolf, “Noiseless coding of correlated information sources,” IEEE Trans. Inform. Theory, vol. 19, no. 4, pp. 471–480, 1973.

[2] T. Cover, “A proof of the data compression theorem of Slepian and Wolf for ergodic sources (Corresp.),” IEEE Trans. Inform. Theory, vol. 21, no. 2, pp. 226–228, 1975.

[3] S. Shamai and S. Verdu, “Capacity of channels with uncoded side information,” European Transactions on Telecommunications, vol. 6, no. 5, pp. 587–600, 1995.

[4] C. Berrou and A. Glavieux, “Near optimum error correcting coding and decoding: turbo-codes,” IEEE Trans. Commun., vol. 44, no. 10, pp. 1261–1271, 1996.

[5] S. S. Pradhan and K. Ramchandran, “Distributed source coding using syndromes (DISCUS): design and construction,” in Proc. IEEE Data Compression Conference (DCC ’99), pp. 158–167, Snowbird, Utah, USA, March 1999.

[6] J. Bajcsy and P. Mitran, “Coding for the Slepian-Wolf problem with turbo codes,” in Proc. IEEE Global Telecommunications Conference (GLOBECOM ’01), vol. 2, pp. 1400–1404, San Antonio, Tex, USA, November 2001.

[7] A. D. Liveris, Z. Xiong, and C. N. Georghiades, “Distributed compression of binary sources using conventional parallel and serial concatenated convolutional codes,” in Proc. IEEE Data Compression Conference (DCC ’03), pp. 193–202, Snowbird, Utah, USA, March 2003.

[8] A. D. Liveris, Z. Xiong, and C. N. Georghiades, “Compression of binary sources with side information at the decoder using LDPC codes,” IEEE Commun. Lett., vol. 6, no. 10, pp. 440–442, 2002.

[9] J. Garcia-Frias and W. Zhong, “LDPC codes for compression of multi-terminal sources with hidden Markov correlation,” IEEE Commun. Lett., vol. 7, no. 3, pp. 115–117, 2003.

[10] J. Garcia-Frias, “Compression of correlated binary sources using turbo codes,” IEEE Commun. Lett., vol. 5, no. 10, pp. 417–419, 2001.

[11] A. Aaron and B. Girod, “Compression with side information using turbo codes,” in Proc. IEEE Data Compression Conference 2002 (DCC ’02), pp. 252–261, Snowbird, Utah, USA, April 2002.

[12] A. D. Liveris, Z. Xiong, and C. N. Georghiades, “Joint source-channel coding of binary sources with side information at the decoder using IRA codes,” in Proc. IEEE International Workshop on Multimedia Signal Processing (MMSP ’02), pp. 53–56, St. Thomas, US Virgin Islands, December 2002.

[13] J. Garcia-Frias, “Joint source-channel decoding of correlated sources over noisy channels,” in Proc. IEEE Data Compression Conference (DCC ’01), pp. 283–292, Snowbird, Utah, USA, March 2001.

[14] W. Zhong, H. Lou, and J. Garcia-Frias, “LDGM codes for joint source-channel coding of correlated sources,” in Proc. IEEE International Conference on Image Processing (ICIP ’03), vol. 1, pp. 593–596, Barcelona, Spain, September 2003.

[15] J. Garcia-Frias and J. D. Villasenor, “Joint turbo decoding and estimation of hidden Markov sources,” IEEE J. Select. Areas Commun., vol. 19, no. 9, pp. 1671–1679, 2001.

[16] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 498–519, 2001.

[17] L. R. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition,” Proc. IEEE, vol. 77, no. 2, pp. 257–286, 1989.

[18] L. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal decoding of linear codes for minimizing symbol error rate (Corresp.),” IEEE Trans. Inform. Theory, vol. 20, no. 2, pp. 284–287, 1974.

Javier Del Ser was born on March 13, 1979,in Barakaldo, Spain. He studied telecom-munication engineering from 1997 to 2003at the Technical Engineering School of Bil-bao (ETSI), Spain, where he obtained hisM.S. degree in 2003. As a Member of theSignal and Communication Group at theDepartment of Electronics and Telecom-munications of the University of the BasqueCountry (EHU/UPV), he developed a signalprocessing system for the measurement of quality parameters of thepower line supply. Currently, he is working toward the Ph.D. degreeat the Centro de Estudios e Investigaciones Tecnicas de Gipuzkoa(CEIT), San Sebastian, Spain. He is also a Teaching Assistant atTECNUN (University of Navarra). His research interests are fo-cused on factor graph theory, distributed source coding, and bothturbo-coding and turbo-equalization schemes, with a special inter-est in their practical application in real scenarios.

Pedro M. Crespo was born in Barcelona, Spain. In 1978, he received the Engineering degree in telecommunications from Universidad Politecnica de Barcelona, and the M.S. degree in applied mathematics and Ph.D. degree in electrical engineering from the University of Southern California (USC), in 1983 and 1984, respectively. From September 1984 to April 1991, he was a Member of the technical staff in the Signal Processing Research Group at Bell Communications Research, New Jersey, USA, where he worked in the areas of data communication and signal processing. He actively contributed to the definition and development of the first prototypes of digital subscriber line transceivers (xDSL). From May 1991 to August 1999, he was a District Manager at Telefonica Investigacion y Desarrollo, Madrid, Spain. From 1999 to 2002, he was the Technical Director of the Spanish telecommunication operator Jazztel. At present, he is the Department Head of the Communication and Information Theory Group at Centro de Estudios e Investigaciones Tecnicas de Gipuzkoa (CEIT), San Sebastian, Spain. He is also a Full Professor at TECNUN (University of Navarra). Pedro Crespo is a Senior Member of the Institute of Electrical and Electronic Engineers (IEEE) and he is a recipient of the Bell Communications Research's Award of Excellence. He holds seven patents in the areas of digital subscriber lines and wireless communications. His research interests currently include space-time coding techniques for MIMO systems, iterative coding and equalization schemes, bioinformatics, and sensor networks.

Olaia Galdos was born on April 20, 1976, in Legazpi, Spain. She studied mathematics from 1994 to 1999 at the Sciences Faculty of the University of the Basque Country, Leioa, Spain. Currently she is a Ph.D. candidate at TECNUN (University of Navarra, Spain). Her research topics are in Slepian-Wolf distributed source coding with turbo and LDPC codes, factor graph theory and its application to coding and decoding algorithms.

EURASIP Journal on Wireless Communications and Networking 2005:4, 493–504
© 2005 Zhiyu Yang et al.

MAC Protocols for Optimal Information Retrieval Pattern in Sensor Networks with Mobile Access

Zhiyu Yang
School of Electrical and Computer Engineering, Cornell University, Ithaca, NY 14853, USA
Email: [email protected]

Min Dong
Corporate Research & Development, QUALCOMM Incorporated, 5775 Morehouse Drive, San Diego, CA 92121, USA
Email: [email protected]

Lang Tong
School of Electrical and Computer Engineering, Cornell University, Ithaca, NY 14853, USA
Email: [email protected]

Brian M. Sadler
Army Research Laboratory, Adelphi, MD 20783-1197, USA
Email: [email protected]

Received 9 December 2004

In signal field reconstruction applications of sensor networks, the locations from which the measurements are retrieved affect the reconstruction performance. In this paper, we consider the design of medium access control (MAC) protocols in sensor networks with mobile access for the desirable information retrieval pattern to minimize the reconstruction distortion. Taking both performance and implementation complexity into consideration, besides the optimal centralized scheduler, we propose three decentralized MAC protocols, namely, decentralized scheduling through carrier sensing, Aloha scheduling, and adaptive Aloha scheduling. Design parameters for the proposed protocols are optimized. Finally, a performance comparison among these protocols is provided via simulations.

Keywords and phrases: medium access control, signal field reconstruction, sensor networks.

1. INTRODUCTION

In many applications, sensor networks operate in three phases: sensing, information retrieval, and information processing. As a typical example, in physical environmental monitoring, sensors first take measurements of the signal field at a particular time. The data are then collected from individual sensors to a central processing unit, where the signal field is finally reconstructed.

An appropriate network architecture for such applications is SEnsor Networks with Mobile Access (SENMA) [1, 2]. As shown in Figure 1, SENMA consists of two types of nodes: low-power low-complexity sensors randomly deployed in a large quantity, and a few powerful mobile access points communicating with the sensors. The use of mobile access points enables data collection from specific areas of the network.

We focus on the latter two operational phases in the SENMA architecture: information retrieval and processing, which are strongly coupled. To achieve the optimal performance of the sensor network, the two phases should be considered jointly. The key to information retrieval is medium access control (MAC) that regulates data retrieval from sensors to the access point. The main focus of this paper is to design MAC protocols for the optimal reconstruction of the signal field.

The MAC design for sensor network applications needs to take into account application-specific characteristics, for example, the correlation of the field, the randomness of the sensor locations, and the redundancy of the large-scale sensor deployment. The traditional MAC design criteria, such as throughput, fail to capture the characteristics of the specific sensor application; a high-throughput MAC does not imply low reconstruction distortion. In this paper, we propose a new MAC design criterion for the field reconstruction application.

Figure 1: A 1D sensor network with a mobile access point. (Legend: access point, sensor.)

The new MAC design criterion is motivated by the need to collect data evenly across the field for a given throughput. If we have an infinitely dense network, the optimal data collection strategy is to retrieve samples from evenly spaced locations. For a finite density network considered in this work, however, there may not exist sensors in the desired locations. The optimal centralized scheduler, with the location information of all sensors, calculates the optimal location set and retrieves data from the optimal set to minimize the reconstruction distortion. Such an optimal centralized scheduler comes with the substantial cost of gathering sensor-location information. Decentralized MAC protocols, on the other hand, require much less intervention from the mobile access point and fewer bandwidth resources.

We consider a one-dimensional problem for simplicity, which can be extended to a two-dimensional setup. Taking both performance and implementation complexity into consideration, besides the optimal centralized scheduler, we propose three decentralized MAC protocols. We first propose a decentralized scheduler via carrier sensing, which, under the assumption of no processing delay, incurs little performance loss compared to the optimal scheduler. Then, to simplify the implementation, we introduce a MAC scheme which uses Aloha-like random access within a resolution interval centered at the desired retrieval location. Finally, to improve the performance, we propose an adaptive Aloha scheduling scheme which adaptively chooses the desired retrieval locations based on the history of retrieved samples. Design parameters are optimized for the proposed schemes. The performance comparison under various sensor density conditions and packet collection sizes is also provided.

Problems in sensor network communications have attracted growing research interest. In terms of medium access control, many MAC protocols have been proposed aiming at the special needs and requirements of both ad hoc sensor networks [3, 4, 5, 6] and sensor networks with mobile access [2]. Most of these proposed schemes only consider the MAC layer performance, that is, throughput. The effect of MAC for information retrieval on information processing is analyzed in [7, 8] for infinite and finite sensor density networks, respectively, where the performance of the centralized scheduler and that of the decentralized random access are analyzed and compared.

The idea of using carrier sensing for energy-efficient transmission in sensor networks was first proposed in [9, 10, 11], where backoff delays are chosen as a function of the channel strength. The carrier sensing strategy presented here generalizes that in [9, 10, 11] by using carrier sensing to distinguish nodes in different locations.

2. SYSTEM MODEL AND MAC DESIGN OBJECTIVE

In this section, we introduce the system model and the signal field reconstruction distortion measure, which leads to a simple MAC design objective.

2.1. Signal field model

Consider a one-dimensional field of unit length, denoted by $\mathcal{A} = [0, 1]$. Let $S(x)$ ($x \in \mathcal{A}$) be the source of interest in $\mathcal{A}$ at a particular time. We assume that the spatial dynamic of $S(x)$ is a homogeneous Gaussian random field given by the following linear stochastic differential equation:

\[ dS(x) = -f S(x)\,dx + \sigma\,dW(x), \tag{1} \]

where $f > 0$ and $\sigma$ are known, $\{W(x) : x \ge 0\}$ is a standard Brownian motion, and $S(x) \sim \mathcal{N}(0, \sigma^2/|2f|)$ is the stationary solution of (1). The random field modeled in (1) is essentially a diffusion process, which is often used to model physical phenomena of interest. Being homogeneous in $\mathcal{A}$, $S(x)$ has the autocorrelation

\[ E\{S(x_0)S(x_1)\} = \frac{\sigma^2}{2f}\, e^{-f(x_1 - x_0)} \tag{2} \]

for $x_0 < x_1$, which is only a function of the distance between the two points $x_1$ and $x_0$.
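As an illustration that is not part of the original paper, the field in (1) can be sampled exactly at a sorted set of locations. The sketch below relies on the standard Ornstein-Uhlenbeck transition: over a gap Δ, the field satisfies S(x + Δ) = e^{−fΔ} S(x) plus independent Gaussian noise of variance (σ²/2f)(1 − e^{−2fΔ}). The function names are hypothetical.

```python
import math
import random

def stationary_variance(sigma, f):
    """Variance sigma^2 / (2 f) of the stationary solution of (1)."""
    return sigma ** 2 / (2.0 * f)

def autocorrelation(sigma, f, x0, x1):
    """Autocorrelation (2); it depends only on the distance |x1 - x0|."""
    return stationary_variance(sigma, f) * math.exp(-f * abs(x1 - x0))

def sample_field(locations, sigma, f, rng):
    """Draw S(x) at sorted locations using the exact one-step transition."""
    var = stationary_variance(sigma, f)
    samples = []
    prev_x, prev_s = None, None
    for x in locations:
        if prev_x is None:
            # First point: draw from the stationary distribution.
            s = rng.gauss(0.0, math.sqrt(var))
        else:
            a = math.exp(-f * (x - prev_x))
            s = a * prev_s + rng.gauss(0.0, math.sqrt(var * (1.0 - a * a)))
        samples.append(s)
        prev_x, prev_s = x, s
    return samples
```

For example, `sample_field([0.0, 0.5, 1.0], sigma=1.0, f=1.0, rng=random.Random(0))` returns three correlated field values; the parameter values are arbitrary.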

2.2. Sensor network model

We assume that sensors in $\mathcal{A}$ are deployed randomly, and their distribution forms a one-dimensional homogeneous spatial Poisson field with local density ρ sensors/unit length. That is, in a length-$l$ interval, the number of sensors $N(l)$ is a Poisson random variable with distribution

\[ \Pr\{N(l) = k\} = \frac{e^{-\rho l}(\rho l)^k}{k!}, \tag{3} \]

and the numbers of sensors in any two disjoint intervals are independent. To avoid the boundary effect, we assume that there is a sensor at each of the two boundary points $x = 0$ and $x = 1$. Let $N$ denote the number of sensors in the field excluding the two boundary points. Denote by $\mathbf{x}_N = [x_1, x_2, \ldots, x_N]^T$ the sensor locations, where $0 < x_1 < x_2 < \cdots < x_N < 1$.

After its deployment, each sensor obtains its own location information through some positioning method. At a prearranged time, all sensors measure their local signals, forming a snapshot of the signal field. The measurement of a sensor at location $x$ is given by

\[ Y(x) = S(x) + Z(x), \tag{4} \]

where $Z(x)$ is zero-mean, spatially white Gaussian measurement noise with variance $\sigma_Z^2$, and is independent of $S(x)$.

Each sensor stores its local measurement along with its location information in the form of a packet for future data collection.

Figure 2: Linear field. (Legend: retrieved, not retrieved.)
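A minimal sketch of the deployment model (3) and the measurement model (4), assuming the Poisson field is generated from its exponentially distributed gaps (a standard construction, not something the paper prescribes). The function names are hypothetical.

```python
import random

def deploy_sensors(rho, rng):
    """Sorted interior sensor locations of a density-rho Poisson field on the unit interval."""
    locations = []
    x = rng.expovariate(rho)  # gaps between points of a Poisson process are Exp(rho)
    while x < 1.0:
        locations.append(x)
        x += rng.expovariate(rho)
    return locations

def measure(signal_value, noise_std, rng):
    """Noisy measurement (4): Y(x) = S(x) + Z(x), with Z ~ N(0, noise_std^2)."""
    return signal_value + rng.gauss(0.0, noise_std)
```

The two boundary sensors at x = 0 and x = 1 are not included in the returned list, matching the definition of N above.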

2.3. The multiple-access channel

When the mobile access point is ready for data collection, sensors transmit their measurement packets to the access point through a common wireless channel. We assume slotted transmission in a collision channel, that is, a packet is correctly received if and only if no other users attempt transmission. To retrieve measurement packets from the field through a collision channel, some form of MAC is needed. In this paper, we propose and discuss four MAC protocols, with different performance and complexity trade-offs, to optimize the reconstruction performance.

In each time slot, sensors compete for the channel use. The channel output may be a collision, an empty slot, or a data packet that contains the measurement and the location of the sensor. We assume that the access point uses $m$ time slots to retrieve measurement data and refer to $m$ as the packet collection size. Let $q_i$, $1 \le i \le m$, denote the sample location of the $i$th channel outcome if a packet is successfully received. Otherwise, let $q_i = \emptyset$. Let $\mathbf{q} = [q_1, q_2, \ldots, q_m]^T$ denote the output location vector. To avoid the boundary effect for signal reconstruction, we assume that, in addition to the $m$ retrieval attempts, the two boundary measurements are also retrieved by the mobile access point.
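The collision channel described above can be sketched as follows. This is an illustrative helper with a hypothetical name; `None` stands in for the symbol ∅.

```python
def slot_outcome(transmitting_locations):
    """Collision channel: a packet is received iff exactly one sensor transmits.

    Returns the transmitting sensor's location on success, or None for an
    empty slot or a collision.
    """
    if len(transmitting_locations) == 1:
        return transmitting_locations[0]
    return None
```

Collecting the return values over the m slots gives the output location vector q.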

2.4. Information processing and performance measure

After the information retrieval, we reconstruct the original signal field based on the received data samples. Let $K$ denote the number of $q_i$'s not equal to $\emptyset$ in $\mathbf{q}$, excluding the two boundary points. Let $\mathbf{r}_K = [r_1, r_2, \ldots, r_K]^T$, $r_1 \le r_2 \le \cdots \le r_K$, be the ordered sample location vector constructed from $\mathbf{q}$ by ordering the non-$\emptyset$ elements. For convenience, let $r_0 = 0$ and $r_{K+1} = 1$.

We estimate $S(x)$ at location $x$ using its two immediate neighbor samples by MMSE smoothing, that is, for $r_i < x < r_{i+1}$, $0 \le i \le K$,

\[ \hat{S}(x) = E\{S(x) \mid Y(r_i), Y(r_{i+1})\}. \tag{5} \]

Figure 3: Circular field. (Legend: retrieved, not retrieved.)

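Given $\mathbf{q}$, we define the maximum field reconstruction distortion as the maximum mean-square estimation error in $\mathcal{A}$,

\[ \mathcal{E}(\mathbf{q}) \triangleq \max_{x \in \mathcal{A}} E\bigl\{ |\hat{S}(x) - S(x)|^2 \bigm| \mathbf{q} \bigr\}. \tag{6} \]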

The expected maximum distortion of the signal reconstruction in $m$ collection time slots is then given by

\[ \mathcal{E}(m) \triangleq E\{\mathcal{E}(\mathbf{q})\}, \tag{7} \]

where the expectation is taken over the output location vector $\mathbf{q}$.

2.5. MAC design objective

Our objective is to design MAC protocols that result in the smallest signal field reconstruction distortion for a fixed number of retrieval slots. In [7, 8], we have shown that the maximum distortion is determined only by the maximum distance between two adjacent data samples,

\[ \mathcal{E}(\mathbf{q}) = \frac{2f\sigma_Z^2/\sigma^2 + 1 - e^{-f d_{\max}(\mathbf{q})}}{2f\sigma_Z^2/\sigma^2 + 1 + e^{-f d_{\max}(\mathbf{q})}} \cdot \frac{\sigma^2}{2f} \triangleq \mathcal{E}\bigl(d_{\max}(\mathbf{q})\bigr), \tag{8} \]

where

\[ d_{\max}(\mathbf{q}) = \max_{0 \le i \le K} \bigl( r_{i+1}(\mathbf{q}) - r_i(\mathbf{q}) \bigr). \tag{9} \]

The maximum distortion in (6) is a monotonically increasing function of $d_{\max}$. Thus, a smaller $E\{d_{\max}\}$ indicates a smaller reconstruction distortion. Our objective now is to design MAC for the minimum $E\{d_{\max}\}$.
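The following sketch evaluates (9) and (8) numerically, which makes the monotonic dependence of the distortion on the maximum gap easy to check. The function names are hypothetical and `None` stands in for ∅.

```python
import math

def dmax_linear(q):
    """Maximum gap (9) between adjacent retrieved samples, with boundaries 0 and 1."""
    r = [0.0] + sorted(loc for loc in q if loc is not None) + [1.0]
    return max(b - a for a, b in zip(r, r[1:]))

def max_distortion(d_max, f, sigma, sigma_z):
    """Maximum distortion (8) as a function of d_max."""
    c = 2.0 * f * sigma_z ** 2 / sigma ** 2 + 1.0
    e = math.exp(-f * d_max)
    return (c - e) / (c + e) * sigma ** 2 / (2.0 * f)
```

As the gap grows without bound, `max_distortion` approaches the stationary variance σ²/(2f), that is, the samples no longer help the estimate.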

2.6. Linear field and circular field

The above 1D field model with two boundary points is referred to as the linear field (Figure 2). Another field of interest is the circular field, which is a circle with unit circumference (Figure 3). As in the linear field, sensors in the circular field are deployed according to a Poisson distribution with density ρ sensors/unit length; see (3). The location of each sensor on the circular field is described by its angle θ, $0 \le \theta < 2\pi$, as shown in Figure 3. Alternatively, the location can also be described by $x = \theta/2\pi$, $0 \le x < 1$. Let $\mathbf{x}_N = [x_1, x_2, \ldots, x_N]^T$, $x_1 \le x_2 \le \cdots \le x_N$, denote the sensor locations, where $N$ is the number of sensors in the field.¹

Similar to the linear field, let $\mathbf{q} = [q_1, q_2, \ldots, q_m]^T$ denote the output location vector, where $q_i$, $1 \le i \le m$, is the sample location of the $i$th channel outcome if a packet is successfully received in the $i$th slot, or $q_i = \emptyset$ otherwise. Let $K$ be the number of non-$\emptyset$ elements in $\mathbf{q}$ and let $\mathbf{r}_K = [r_1, r_2, \ldots, r_K]^T$ be the ordered sample location vector constructed from $\mathbf{q}$ by ordering the non-$\emptyset$ elements, with $r_1$ being the smallest. For convenience, let $r_{K+1} = 1 + r_1$. The maximum distance for the circular field is defined as

\[ d_{\max}(\mathbf{q}) \triangleq \max_{1 \le i \le K} \bigl( r_{i+1}(\mathbf{q}) - r_i(\mathbf{q}) \bigr). \tag{10} \]

To avoid ambiguity, define $d_{\max}$ to be 1 if only one sample is retrieved, or 2 if none is retrieved. Since we are not working in the extremely low-density regime, the probability of retrieving only one or no sample is small. Besides the vector form as in (9) and (10), the input parameters of $d_{\max}(\mathbf{q})$ for both fields also take other forms in this paper for ease of presentation. The MAC design objective for the circular field is also to minimize $E\{d_{\max}\}$.
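A short sketch of the circular-field definition (10), including the two conventions stated above for one or zero retrieved samples. The function name is hypothetical and `None` stands in for ∅.

```python
def dmax_circular(q):
    """Maximum arc (10) between adjacent samples on the unit-circumference circle."""
    r = sorted(loc for loc in q if loc is not None)
    if len(r) == 0:
        return 2.0  # convention from the text: no sample retrieved
    if len(r) == 1:
        return 1.0  # convention from the text: a single sample
    r.append(1.0 + r[0])  # r_{K+1} = 1 + r_1 closes the circle
    return max(b - a for a, b in zip(r, r[1:]))
```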

3. MAC FOR OPTIMAL INFORMATION RETRIEVAL PATTERN

3.1. Optimal centralized scheduling

Assume that the location information $\mathbf{x}_N$ of all sensors is available to the mobile access point. Also assume that the mobile access point is able to activate individual nodes for data transmission. The mobile access point is then able to precompute the optimal set of $m$ locations and to activate only those sensors. This results in the minimum $d_{\max}$, and therefore, the best performance. The performance under this scheduler can be used as a benchmark for performance comparison.

For a given sensor location realization $\mathbf{x}_N$ and a fixed $m$, the optimal $d_{\max}$ is

\[ d^*_{\max}(\mathbf{x}_N, m) = \min_{1 \le i_1 \le i_2 \le \cdots \le i_m \le N} d_{\max}(x_{i_1}, x_{i_2}, \ldots, x_{i_m}). \tag{11} \]

The optimal set of sensor locations are those that produce $d^*_{\max}$, and the mobile access point activates these sensors one at a time to avoid collision.

The optimization problem (11) can be solved by a brute force search. To reduce the computational complexity, we propose an efficient algorithm for the linear field, Algorithm 1. It first looks for an initial set of locations and the corresponding $d_{\max}$. Based on this $d_{\max}$, it looks for another set of locations resulting in a smaller $d_{\max}$. Iteratively, $d_{\max}$ converges to its minimum value in finite steps.

The search scheme consists of three steps.

Step 1. Location initialization. A set of $m$ sensor locations is chosen from $\mathbf{x}_N$ as the initial set, $(q_1^{(0)}, \ldots, q_m^{(0)})$. The $d_{\max}$ of the chosen set is assigned to $d_{\max}^{(0)}$. Let $i = 0$.

Step 2. Within the interval $(0, d_{\max}^{(i)})$, find the sensor location closest to $d_{\max}^{(i)}$ and assign it to $q_1^{(i+1)}$. For $1 \le j \le m - 1$: if $q_j^{(i+1)} + d_{\max}^{(i)} > 1$, let $q_{j+1}^{(i+1)} = 1$; if $q_j^{(i+1)} + d_{\max}^{(i)} \le 1$ and there exists at least one sensor in the interval $(q_j^{(i+1)}, q_j^{(i+1)} + d_{\max}^{(i)})$, let $q_{j+1}^{(i+1)}$ be the sensor location closest to the right boundary of the interval; if $q_j^{(i+1)} + d_{\max}^{(i)} \le 1$ and there are no sensors in the interval $(q_j^{(i+1)}, q_j^{(i+1)} + d_{\max}^{(i)})$, the algorithm ends and the $d_{\max}^{(i)}$ obtained previously is the minimum $d^*_{\max}$.

Step 3. After obtaining $q_1^{(i+1)}, \ldots, q_m^{(i+1)}$, calculate $d_{\max}^{(i+1)} = d_{\max}(q_1^{(i+1)}, \ldots, q_m^{(i+1)})$. If $d_{\max}^{(i+1)} < d_{\max}^{(i)}$, let $i = i + 1$ and go to Step 2. Otherwise, the search ends and $d_{\max}^{(i)}$ is the minimum $d^*_{\max}$.

When the search stops, the corresponding $(q_1^{(i)}, q_2^{(i)}, \ldots, q_m^{(i)})$ is the optimal set of locations for the given $\mathbf{x}_N$ and $m$. We select the initial set as follows. Choose $q_i^{(0)}$ to be the sensor location that is closest to $i/(m + 1)$, $1 \le i \le m$, and let the corresponding $d_{\max}$ be $d_{\max}^{(0)}$.

Algorithm 1

¹ We are reusing notations for the circular field. If a discussion is particular to the linear or the circular field, the notations should be understood in that context.

In each iteration, $d_{\max}^{(i)}$ is strictly decreasing. Algorithm 1 stops only when $d_{\max}^{(i)}$ has reached its minimum value. For a field with a finite number of sensors, the set of possible values of $d_{\max}$ is finite. Therefore, Algorithm 1 finds the optimal locations in finite steps.
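The iteration of Algorithm 1 can be sketched as follows for the linear field. This is an illustrative reading of the three steps, not the authors' implementation: all function names are hypothetical, reachability is tested on gaps (x − prev < d) so that it is numerically consistent with how d_max is computed, and a brute-force solver of (11) is included only as a cross-check for small inputs. The sensor list is assumed to be nonempty.

```python
from itertools import combinations_with_replacement

def dmax_linear(points):
    """Maximum adjacent gap of the chosen points, with boundaries 0 and 1."""
    r = [0.0] + sorted(points) + [1.0]
    return max(b - a for a, b in zip(r, r[1:]))

def initial_set(xs, m):
    """Step 1: the sensor closest to i/(m+1) for each i = 1..m."""
    return [min(xs, key=lambda x: abs(x - i / (m + 1))) for i in range(1, m + 1)]

def greedy_pass(xs, m, d):
    """Step 2: repeatedly take the farthest sensor strictly closer than d
    to the previous pick. Returns None if some gap cannot be bridged."""
    chosen, prev = [], 0.0
    for _ in range(m):
        if 1.0 - prev < d:  # the right boundary is already within reach
            break
        reachable = [x for x in xs if x > prev and x - prev < d]
        if not reachable:
            return None
        prev = max(reachable)
        chosen.append(prev)
    return chosen

def optimal_schedule(xs, m):
    """Steps 1-3: iterate until d_max stops decreasing."""
    best = initial_set(xs, m)
    d = dmax_linear(best)
    while True:
        candidate = greedy_pass(xs, m, d)
        if candidate is None:
            break
        d_new = dmax_linear(candidate)
        if d_new >= d:
            break
        best, d = candidate, d_new
    return best, d

def brute_force(xs, m):
    """Exhaustive solution of (11); practical only for small N and m."""
    return min(dmax_linear(c) for c in combinations_with_replacement(xs, m))
```

For sensors at 0.1, 0.45, 0.5, and 0.9 with m = 2, the initial set (0.45, 0.5) has a maximum gap of 0.5, and one greedy pass improves it to the set (0.45, 0.9).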

Next, we consider the circular field. Algorithm 1 can be adapted to solve the optimization of (11) by converting the circular field to the linear field. For ease of discussion, for a given $\mathbf{x}_N$, let $x_{N+j} \triangleq 1 + x_j$, $1 \le j \le N$. Suppose that $x_i$ is included in the optimal set, $1 \le i \le N$. Then we break the circle at point $x_i$, and $(x_{i+1}, \ldots, x_{N+i-1})$ are sensor locations in the linear field with $x_i$ and $x_{N+i}$ being the two boundary points. The other $m - 1$ points that minimize $d_{\max}$ under the assumption that $x_i$ is selected can be found by Algorithm 1.² Exhausting all $x_i$ gives the global optimal $d^*_{\max}$. To shorten the search time, use the smallest $d_{\max}$ obtained in previous runs of Algorithm 1 as the initialization value $d_{\max}^{(0)}$ for the new search with a new $x_i$. It can be shown that exhausting $x_1 \le x_i < x_1 + d'_{\max}$ is enough, where $d'_{\max}$ is any value greater than or equal to the global minimum $d^*_{\max}$. The initialization value $d_{\max}^{(0)}$ for the current $x_i$ can be used as $d'_{\max}$ for the exhaustion stopping criterion.

The centralized scheme gives the best performance under the condition that all sensor location information is available to the mobile access point. However, the bandwidth required for sensors reporting their locations is prohibitively large, especially for large-scale sensor networks. Decentralized schemes that do not require the knowledge of sensor locations at the mobile access point are desirable. Nonetheless, the centralized scheme gives the best possible performance and serves as a benchmark.

² Here, $m - 1$ points are sought instead of $m$ points in the linear field case.

3.2. Decentralized scheduling through carrier sensing

In practice, the sensor location information may not be available at the mobile access point. Each sensor only knows its own location. In this case, in order to retrieve data with the desired pattern and in a decentralized fashion, we propose decentralized scheduling through carrier sensing. We assume that each sensor has a transmission coverage radius $R$. Since the propagation delay is relatively small as compared to the slot length, we assume perfect carrier sensing with no propagation delay within radius $R$, that is, a sensor's transmission is detected immediately by other sensors within distance $R$.

In the proposed protocol, sensor transmissions are scheduled through carrier sensing, where the distances of sensors from the desired locations are used in the backoff scheme. The backoff time of a sensor is a function of the distance from the sensor to the desired location. A similar idea of using carrier sensing for decentralized transmission was first proposed in [9, 10, 11], where the channel state information was used in the backoff function of the carrier sensing scheme for opportunistic transmission.

Protocol. In each time slot, a segment of length $R$ is activated. Sensors within the activated region compete for the channel use. Let $p_j$ denote the center of the $j$th segment, $1 \le j \le m$. Each sensor within the activated segment computes its distance to $p_j$, that is, if $x_i$ is within the activated segment, its distance is $d_{i,j} = |x_i - p_j|$ for the linear field, or $d_{i,j} = \min(|x_i - p_j|, 1 - |x_i - p_j|)$ for the circular field. The activated sensors then choose their respective backoff times based on a backoff function $\tau(d)$, which maps the distance to a backoff time. A sensor listens to the channel during its backoff time. If it detects a transmission before its backoff timer expires, the sensor will not transmit in this time slot. Otherwise, the sensor transmits its measurement sample packet immediately when its timer expires. The function $\tau(d)$ is designed to be strictly increasing; therefore, if there are sensors in the activated region, only the packet of the sensor closest to the center of the activated segment will be received successfully in this time slot. An example of $\tau(d)$ is given in Figure 4. The activation sequence is deterministic in the sense that it does not change based on the previous data collection results.
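One slot of the carrier sensing protocol can be sketched as below. The linear backoff function is a hypothetical choice (the text only requires τ(d) to be strictly increasing), the function names are illustrative, and exact distance ties, which have probability zero under the Poisson model, are not modeled as collisions.

```python
def backoff(d, slope=1.0):
    """A hypothetical strictly increasing backoff function tau(d)."""
    return slope * d

def carrier_sensing_slot(xs, center, R, circular=False):
    """Location received in one slot, or None if the activated segment is empty."""
    def dist(x):
        gap = abs(x - center)
        return min(gap, 1.0 - gap) if circular else gap

    # The activated segment has length R and is centered at `center`.
    active = [x for x in xs if dist(x) <= R / 2.0]
    if not active:
        return None
    # The shortest backoff timer expires first; the others hear it and stay silent.
    return min(active, key=lambda x: backoff(dist(x)))
```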

Where the activation segments should be centered is a design issue. As the next lemma shows, for the circular field, the segments should be separated evenly.

Lemma 1. Consider the circular field. Suppose that in the $i$th time slot, $1 \le i \le m$, the length-$L$ segment centered at $p_i$, $0 \le p_i < 1$, is activated to compete for the collision channel use. Suppose that these segments do not overlap. Let $q_i$, $0 \le q_i < 1$, be the outcome location in the $i$th slot if a packet is successfully received, or $q_i = \emptyset$ otherwise. Define the relative outcome location $b_i$, $b_i = \emptyset$ or $-L/2 \le b_i \le L/2$, as follows:

\[ b_i(p_i, q_i) \triangleq \begin{cases} \emptyset & \text{if } q_i = \emptyset, \\ q_i - p_i & \text{if } |q_i - p_i| \le L/2, \\ q_i - p_i - 1 & \text{if } |q_i - p_i| > L/2,\ q_i > p_i, \\ q_i - p_i + 1 & \text{if } |q_i - p_i| > L/2,\ q_i < p_i, \end{cases} \tag{12} \]

where the conditions in (12) deal with the coordinate transition around θ = 0 or θ = 2π on the circular field. If the $b_i$'s are independent and identically distributed (i.i.d.), then evenly spaced segments produce the minimum $E\{d_{\max}\}$ for the circular field.

Figure 4: Backoff function τ(d). (Axes: backoff time τ versus distance d, with tick marks τ1, τ2 and d1, d2.)

For the proof, see Appendix A.

For the linear field, however, an evenly spaced activation segment sequence is not optimal because of the asymmetry introduced by the two boundary points. Nonetheless, an evenly spaced segment sequence has good performance for large $m$ and ρ since the boundary effect is negligible in this scenario. We will use the evenly spaced segment sequence $p_i = i/(m + 1)$, $1 \le i \le m$, for the linear field in the simulations.

The carrier sensing protocol has high throughput because, if there are nodes within an activation segment, the packet from the node closest to the center will be successfully received with probability one.
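The relative outcome location defined in (12) can be computed as in the sketch below. The function name is hypothetical and `None` stands in for ∅.

```python
def relative_outcome(p, q, L):
    """Relative outcome location b_i of (12) for a segment of length L centered at p."""
    if q is None:
        return None
    diff = q - p
    if abs(diff) <= L / 2.0:
        return diff
    # The segment straddles the 0/1 seam of the circle, so unwrap by one turn.
    return diff - 1.0 if q > p else diff + 1.0
```

For instance, a segment centered at p = 0.02 with L = 0.1 can produce an outcome at q = 0.98, whose relative location is −0.04 rather than 0.96.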

3.3. Aloha scheduling

The carrier sensing scheme requires additional hardware for the carrier sensing functionality. In addition, the synchronization and timing requirements are strict for the carrier sensing mechanism. Next, we present a cost-efficient protocol for sensor sample collection.

Protocol. Select a sequence of $m$ nonoverlapping length-ε segments as the activation sequence. Activate one segment in the activation sequence every time slot. Sensors within the activated region transmit their packets independently with probability $P$. The activation sequence is deterministic in the sense that it does not depend on the data collection results. Figure 5 illustrates the Aloha scheme on the linear field.

Figure 5: Aloha scheme on the linear field. A sequence of length-ε segments (centered at $p_1$, $p_2$, $p_3$ in the illustration) is activated sequentially. The sensors within the activated range transmit with probability $P$.

In the Aloha protocol, the segment length ε, the transmission probability $P$, and the center locations of the activation segments are optimization parameters.
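A minimal sketch of the Aloha scheduling protocol on the linear field, assuming evenly spaced segment centers i/(m + 1) as used later in the simulations. The function name is hypothetical, `None` stands in for ∅, and the segments are nonoverlapping only when ε ≤ 1/(m + 1).

```python
import random

def aloha_schedule(xs, m, eps, rng, P=1.0):
    """Run m Aloha slots over evenly spaced length-eps segments; returns q."""
    q = []
    for i in range(1, m + 1):
        center = i / (m + 1)
        active = [x for x in xs if abs(x - center) <= eps / 2.0]
        # Each activated sensor transmits independently with probability P.
        transmitting = [x for x in active if rng.random() < P]
        # Collision channel: success iff exactly one sensor transmits.
        q.append(transmitting[0] if len(transmitting) == 1 else None)
    return q
```

With sensors at 0.24, 0.26, 0.5, and 0.9, m = 3, ε = 0.1, and P = 1, the first slot is a collision, the second slot returns the sensor at 0.5, and the third slot is empty.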

Lemma 2. For both the linear and the circular fields, the optimal transmission probability $P$ is one and the optimal segment length ε is strictly less than 1/ρ.

For the proof, see Appendix B.

It can be shown that the result of Lemma 2 also holds in a more general setup where the transmission probability within the activation region is a function of the distance from the sensor to the center of the activation region. An intuitive way to explain Lemma 2 is that, for the same throughput, the smaller the activation interval length is, the more precise the outcome location can be. Therefore, the data collection outcomes for a smaller activation interval are closer to evenly spaced center locations, producing a smaller $E\{d_{\max}\}$. Letting $P = 1$ gives the smallest activation interval length for a given throughput. The result about ε can be explained as follows. Shortening the activation length has two effects on $E\{d_{\max}\}$: one is that it gives a lower throughput if the length is less than or equal to 1/ρ, which is a negative effect; the other is that it produces more precise control of the outcome location, a positive effect. Although ($P = 1$, ε = 1/ρ) gives the maximum throughput for Aloha, when ε is shortened a little, the throughput only decreases a little because the derivative of the throughput with respect to ε is zero at ε = 1/ρ. Thus the negative effect is small. The positive effect from the more precise location control favors an activation length strictly shorter than 1/ρ, meaning that the optimal throughput is strictly less than 1/e. Nonetheless, the gain from selecting a length shorter than 1/ρ is small for dense sensor networks. We will use ε = 1/ρ in the simulations.
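The throughput argument can be made explicit with a short calculation. Under the Poisson model (3), and with each sensor in an activated segment transmitting independently with probability P, the number of transmitters in a length-ε segment is Poisson with mean Pρε, so the per-slot success probability is the probability of exactly one transmitter. The symbol T for this throughput is introduced here for illustration only.

```latex
\begin{align*}
T(\epsilon) &= P\rho\epsilon \, e^{-P\rho\epsilon}, \\
\frac{dT}{d\epsilon} &= P\rho\,(1 - P\rho\epsilon)\, e^{-P\rho\epsilon}, \\
\left.\frac{dT}{d\epsilon}\right|_{P=1,\ \epsilon=1/\rho} &= 0,
\qquad T\!\left(\tfrac{1}{\rho}\right)\Big|_{P=1} = e^{-1}.
\end{align*}
```

Because the derivative vanishes at ε = 1/ρ, a small reduction of ε costs only a second-order loss in throughput, which is the reason given above for the optimal ε being strictly below 1/ρ.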

As shown in Lemma 1, for the circular field, evenly spaced center locations of the activation segments are optimal. As mentioned in the carrier sensing protocol, for the linear field, evenly spaced activation segments are not optimal. Nonetheless, evenly spaced segments have good performance for large $m$ and ρ, and we will use evenly spaced activation segments in the simulations for the linear field.

3.4. Adaptive Aloha scheduling

The carrier sensing and Aloha scheduling protocols presented previously are deterministic scheduling since the center location of each activation segment does not change according to previous data collection outcomes. In deterministic scheduling, the activation location information may be preset to sensors before their deployment, eliminating the need to broadcast the location information from the mobile access point and saving some hardware cost. Another approach is to let the mobile access point decide the next activation location on the fly, based on previous data collection results. Allowing the activation sequence to adapt to previous data collection results may give better performance. Next we present an adaptive scheduling for Aloha.

Figure 6: Adaptive Aloha scheduling example on the linear field. The mobile access point activates one interval of length ε in one time slot. The sensors within the activated range transmit with probability $P = 1$. The solid diamonds indicate the received packets. The algorithm tries to break the maximum distance by placing the next polling interval at the center of the two received data sample locations whose distance is $d_{\max}$.

Protocol. The basic activation strategy is similar to the Aloha protocol. The mobile access point activates an interval of length ε = 1/ρ in each time slot; the sensors within the range transmit with probability $P = 1$. The difference is that, in the adaptive version, the locations of the activation intervals depend on the previous data collection results, as described below.

After obtaining a new packet, the access point checks all the previously received data and finds the two adjacent sample locations that have the maximum distance. The access point then places the next polling interval in the middle of these two sample locations (see Figure 6 for the linear field case). If an empty slot occurs, the access point then activates the length-ε interval adjacent (either left or right) to the previous empty intervals until a success or collision occurs. If a collision occurs, the access point resolves the collision by splitting the collision interval until a packet is successfully received (similar to the splitting algorithms [12]). If a packet is received successfully, the access point recalculates and tries to break the new $d_{\max}$ of the received samples within the remaining time slots. The algorithm keeps running until it uses up the $m$ time slots.
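Two building blocks of the adaptive rule can be sketched as follows for the linear field. The function names are hypothetical, and halving the collided interval is only one simple way to realize the splitting step; the text does not specify the split beyond the reference to splitting algorithms [12].

```python
def next_polling_center(received):
    """Midpoint of the largest gap between adjacent received samples,
    with the boundary samples at 0 and 1 included."""
    r = [0.0] + sorted(received) + [1.0]
    left, right = max(zip(r, r[1:]), key=lambda pair: pair[1] - pair[0])
    return (left + right) / 2.0

def split_interval(lo, hi):
    """Halve a collided interval (an assumed splitting rule)."""
    mid = (lo + hi) / 2.0
    return (lo, mid), (mid, hi)
```

With samples received at 0.2 and 0.4, the largest gap is between 0.4 and the boundary at 1, so the next polling interval is centered at 0.7.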

The above protocol works in an environment where the mobile access point can communicate with the whole field from one location, for example, high-altitude airplanes or satellites. There are other types of adaptive scheduling schemes. For example, we can also adapt the activation sequence in a carrier sensing scheduling setup. However, as will be shown in the simulations section, the gain of adapting the activation sequence in a carrier sensing setup is small because the performance of the carrier sensing scheduling is already close to that of the optimal centralized scheduling.

4. SIMULATIONS

In this section, we compare the performance of the MAC protocols proposed in the last section through simulations. Due to the space limit, only figures for the linear field are shown. For the circular field, similar results are observed. Sensors are randomly deployed according to the Poisson distribution with density ρ. For convenience, we name these MAC protocols as follows.

Figure 7: $E\{d_{\max}\}$ versus packet collection size $m$ for sensor density ρ = 40. (Curves: centralized, carrier sensing, Aloha, adaptive Aloha.)

(i) $\pi_1$ is the optimal centralized scheduler.
(ii) $\pi_2$ is the decentralized scheduling through carrier sensing with $R = 1$.
(iii) $\pi_3$ is the Aloha scheduling.
(iv) $\pi_4$ is the adaptive Aloha scheduling.

We use the $d_{\max}$ found using $\pi_2$ as the initial maximum distance for the iteration algorithm in $\pi_1$. The search typically stops after one or two iterations. In the comparison, we use $E\{d_{\max}\}$ as the performance metric.

Figures 7 and 8 plot $E\{d_{\max}\}$ versus $m$ for sensor density ρ = 40 and 200, respectively. The expectation of $d_{\max}$ in the figures is averaged over 100 000 realizations of the Poisson sensor field. As expected, as $m$ increases, the number of data samples received at the mobile access point increases, and thus $E\{d_{\max}\}$ decreases. We see that there is little performance loss by using $\pi_2$. Notice that, when $m$ is larger than ρ (Figure 7), under $\pi_1$ and $\pi_2$, data from all sensors can be retrieved with a high probability. Therefore, the performance gap between the two protocols diminishes. The performance under $\pi_3$ is worse than that of the other schemes even when $m$ is larger than ρ. This is because, under $\pi_3$, some scheduled intervals do not have data packets received successfully due to either a collision or the absence of sensors. Unlike $\pi_3$, the location of each activation interval of $\pi_4$ is adapted to the previous data collection outcomes. When $m$ is large, it has enough slots to search for intervals within which sensors exist and to resolve collisions, therefore avoiding the problem in $\pi_3$. From Figure 7, we see that, when $m$ is large, the performance under $\pi_4$ is as good as the optimal case.

Figure 8: $E\{d_{\max}\}$ versus packet collection size $m$ for sensor density ρ = 200.

Figures 9 and 10 plot $E\{d_{\max}\}$ versus ρ for packet collection size $m = 10$ and 50, respectively. As expected, as ρ increases, the density of the sensor field increases, and the received data locations are closer to the desired locations, resulting in a sample pattern closer to evenly spaced. Therefore, $E\{d_{\max}\}$ converges to the minimum value as ρ increases. Again, we see that the performance under $\pi_2$ closely follows the optimal one. As ρ increases, we see that the performance gap between the two Aloha schemes and $\pi_1$ increases. The performance loss under $\pi_3$ is mainly due to its lower throughput than that of $\pi_1$ and $\pi_2$, which limits the number of received samples. We observe that there is a significant performance improvement of $\pi_4$ over $\pi_3$ by adaptively optimizing the retrieval pattern based on the retrieval history.

5. CONCLUSION

When a signal field is reconstructed from sensor network data, the locations of the retrieved data affect the reconstruction performance. In this paper, we design MAC protocols to obtain the desired data retrieval pattern. We propose a new MAC design criterion that takes into account the application characteristics of the signal field reconstruction. Taking both performance and implementation complexity into consideration, besides the optimal centralized scheduler, we propose three decentralized MAC protocols. We have shown that, for the carrier sensing and Aloha scheduling schemes, evenly spaced activation intervals are optimal for the circular field. For the Aloha scheduling in both the linear field and the circular field, the optimal transmission probability is one and the optimal activation interval length is strictly smaller than 1/ρ, resulting in a throughput strictly less than 1/e. Our simulations show that using the decentralized scheduling through carrier sensing results in little performance loss compared to the performance of the optimal scheduler. For the two Aloha schemes, by exploiting the history of retrieved data locations, adaptive Aloha provides a significant performance gain over the simple Aloha scheme.

[Plot omitted: E{dmax} (vertical axis) versus sensor density ρ (horizontal axis); the legend includes Aloha and adaptive Aloha.]

Figure 9: E{dmax} versus sensor density ρ for packet collection size m = 10.

APPENDICES

A. PROOF OF LEMMA 1

We first define four operations on integers or real numbers. Let $i$ and $j$ be two integers. Define $i \oplus j$ to be equal to $i + j + km$, where $k$ is the integer such that $1 \le i + j + km \le m$. Let $i \ominus j \triangleq i \oplus (-j)$. Let $x_1$ and $x_2$ be two real numbers. Define $x_1 \oplus x_2$ to be equal to $x_1 + x_2 + k$, where $k$ is the integer such that $0 \le x_1 + x_2 + k < 1$. Let $x_1 \ominus x_2 \triangleq x_1 \oplus (-x_2)$. For convenience, extend the operations $\oplus$ and $\ominus$ on real numbers to include the symbol $\emptyset$. Let $x_1$ and $x_2$ be real numbers or the symbol $\emptyset$. Define $x_1 \oplus x_2$ and $x_1 \ominus x_2$ to be $\emptyset$ if either $x_1$ or $x_2$ is equal to $\emptyset$.
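The four wrap-around operations defined above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and representing the symbol ∅ by `None` is an assumption made here, not a convention from the paper.

```python
from typing import Optional

def circ_add_int(i: int, j: int, m: int) -> int:
    """Integer wrap-around sum: the representative of i + j (mod m) in {1, ..., m}."""
    return (i + j - 1) % m + 1

def circ_sub_int(i: int, j: int, m: int) -> int:
    """Integer wrap-around difference, defined as the wrap-around sum of i and -j."""
    return circ_add_int(i, -j, m)

def circ_add_real(x1: Optional[float], x2: Optional[float]) -> Optional[float]:
    """Real wrap-around sum in [0, 1); None (standing in for the empty symbol) propagates."""
    if x1 is None or x2 is None:
        return None
    return (x1 + x2) % 1.0

def circ_sub_real(x1: Optional[float], x2: Optional[float]) -> Optional[float]:
    """Real wrap-around difference in [0, 1); None propagates."""
    if x1 is None or x2 is None:
        return None
    return (x1 - x2) % 1.0
```

Python's `%` returns a non-negative result for a positive modulus, which is what makes the one-line forms match the "add the integer k" definitions.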

It can be verified that the inverse function of (12) is given by

\[ q_i(p_i, b_i) = p_i \oplus b_i. \tag{A.1} \]

The average $d_{\max}$ when $p$ is the center location vector is given by

\[ E_q\{d_{\max}(q); p\} = E_b\{d_{\max}(p \oplus b)\}, \tag{A.2} \]

where $p \oplus b$ is the vector with $p_i \oplus b_i$ as the $i$th entry. Without loss of generality, assume that $p$ is an ordered vector with $p_1$ being the smallest. Let $\bar{p}$ be an equally spaced location vector on the circular field. Without loss of generality, let $\bar{p}_i = (i-1)/m$, $1 \le i \le m$. The proof is concluded if we show that, for all $p$,

\[ E_b\{d_{\max}(p \oplus b)\} \ge E_b\{d_{\max}(\bar{p} \oplus b)\}. \tag{A.3} \]

[Plot omitted: E{dmax} (vertical axis) versus sensor density ρ (horizontal axis); the legend includes Aloha and adaptive Aloha.]

Figure 10: E{dmax} versus sensor density ρ for packet collection size m = 50.

Let $b^{(k)}$ be the $k$th rotated vector of $b$, that is, $b^{(k)}_i = b_{i \oplus k}$, for $0 \le k \le m-1$ and $1 \le i \le m$. Since the $b_i$'s are i.i.d., we have, for $0 \le k \le m-1$,

\[ E_b\{d_{\max}(p \oplus b)\} = E_b\{d_{\max}(p \oplus b^{(k)})\}. \tag{A.4} \]

Therefore, the left-hand side of (A.3) can be expressed as

\[ E_b\Big\{ \frac{1}{m} \sum_{k=0}^{m-1} d_{\max}(p \oplus b^{(k)}) \Big\}. \tag{A.5} \]

Hence, it suffices to show that for any $b$ and $p$,

\[ \frac{1}{m} \sum_{k=0}^{m-1} d_{\max}(p \oplus b^{(k)}) \ge d_{\max}(\bar{p} \oplus b). \tag{A.6} \]

For a given $b$ with one or no non-$\emptyset$ element, by definition, $d_{\max}$ is equal to 1 or 2, respectively, for both $p$ and $\bar{p}$. Therefore, (A.6) holds.

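Let $L(i, j)$ be the set of indices between $i$ and $j$ counterclockwise, $1 \le i, j \le m$ and $i \ne j$, that is, $L(i, j) = \{l : i < l < j\}$ if $i < j$, or $\{l : i < l \le m, \text{ or } 1 \le l < j\}$ if $i > j$. For a given $b$ with at least two non-$\emptyset$ entries, search $d_{\max}$ among the output locations $\bar{p} \oplus b$ on the circular field. Suppose that $d_{\max}$ occurs from the $i$th point to the $j$th point counterclockwise, that is, $b_i, b_j \ne \emptyset$, $b_l = \emptyset$ for $l \in L(i, j)$, and

\begin{align}
d_{\max}(\bar{p} \oplus b) &= (\bar{p}_j \oplus b_j) \ominus (\bar{p}_i \oplus b_i) \notag \\
&= (\bar{p}_j \ominus \bar{p}_i) + (b_j - b_i) \tag{A.7} \\
&= \frac{j \ominus i}{m} + (b_j - b_i), \notag
\end{align}

where (A.7) holds because $\bar{p}_j \ominus \bar{p}_i > L > b_j - b_i$. Since $b_l = \emptyset$ for $l \in L(i, j)$, in the outcome locations $p \oplus b^{(k)}$, there are no valid samples from $p_{i \ominus k} \oplus b^{(k)}_{i \ominus k}$ counterclockwise to $p_{j \ominus k} \oplus b^{(k)}_{j \ominus k}$. Hence $d_{\max}(p \oplus b^{(k)})$ is at least as large as the distance from $p_{i \ominus k} \oplus b^{(k)}_{i \ominus k}$ counterclockwise to $p_{j \ominus k} \oplus b^{(k)}_{j \ominus k}$. Thus,

\begin{align}
\sum_{k=0}^{m-1} d_{\max}(p \oplus b^{(k)})
&\ge \sum_{k=0}^{m-1} \Big( (p_{j \ominus k} \oplus b^{(k)}_{j \ominus k}) \ominus (p_{i \ominus k} \oplus b^{(k)}_{i \ominus k}) \Big) \notag \\
&= \sum_{k=0}^{m-1} \Big( (p_{j \ominus k} \ominus p_{i \ominus k}) + (b_j - b_i) \Big) \tag{A.8} \\
&= \Big( \sum_{k=0}^{m-1} \sum_{l=1}^{j \ominus i} (p_{i \ominus k \oplus l} \ominus p_{i \ominus k \oplus l \ominus 1}) \Big) + m (b_j - b_i) \notag \\
&= \Big( \sum_{l=1}^{j \ominus i} \sum_{k=0}^{m-1} (p_{i \ominus k \oplus l} \ominus p_{i \ominus k \oplus l \ominus 1}) \Big) + m (b_j - b_i) \notag \\
&= \sum_{l=1}^{j \ominus i} 1 + m (b_j - b_i) \tag{A.9} \\
&= (j \ominus i) + m (b_j - b_i) = m \, d_{\max}(\bar{p} \oplus b), \notag
\end{align}

where (A.8) holds because $p_{j \ominus k} \ominus p_{i \ominus k} > L > b_j - b_i$, and (A.9) holds because $\sum_{k=0}^{m-1} (p_{i \ominus k \oplus l} \ominus p_{i \ominus k \oplus l \ominus 1})$ is equal to the circumference of the circular field, which is one.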
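As a sanity check on the rotation argument, inequality (A.6) can be tested numerically. The sketch below makes several assumptions that are not taken from the paper: $d_{\max}$ is computed as the largest gap between adjacent received samples on the unit circle (with the values 1 and 2 for one and zero samples, as stated in the proof), ∅ is represented by `None`, indices are 0-based, and the random instances keep the offsets small enough that the circular order of the samples is preserved. All function names are hypothetical.

```python
import random

def d_max(samples):
    """Largest gap between adjacent received samples on the unit circle.
    None marks a missing sample; 1 (one sample) and 2 (no sample) follow the text."""
    pts = sorted(s for s in samples if s is not None)
    if not pts:
        return 2.0
    if len(pts) == 1:
        return 1.0
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(1.0 - pts[-1] + pts[0])  # wrap-around gap
    return max(gaps)

def outcome(p, b):
    """Entrywise wrap-around sum of centers p and offsets b, with None propagated."""
    return [None if bi is None else (pi + bi) % 1.0 for pi, bi in zip(p, b)]

def rotate(b, k):
    """k-th rotation of b: entry i of the result is b[(i + k) mod m]."""
    m = len(b)
    return [b[(i + k) % m] for i in range(m)]

def check_a6(p, b):
    """Return (left-hand side, right-hand side) of (A.6) for centers p and offsets b."""
    m = len(p)
    p_bar = [i / m for i in range(m)]  # equally spaced centers
    lhs = sum(d_max(outcome(p, rotate(b, k))) for k in range(m)) / m
    rhs = d_max(outcome(p_bar, b))
    return lhs, rhs

def random_instance(m, rng):
    """Assumed test instance: neighbouring centers stay at least 0.5/m apart and
    offsets are limited to +-0.2/m, so the samples keep their circular order."""
    p = [(i + rng.uniform(0.0, 0.5)) / m for i in range(m)]
    b = [rng.uniform(-0.2, 0.2) / m if rng.random() < 0.6 else None for _ in range(m)]
    return p, b
```

Calling `check_a6(*random_instance(6, random.Random(0)))` returns the two sides of (A.6); under the stated assumptions the first should not be smaller than the second, up to floating-point rounding.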

B. PROOF OF LEMMA 2

We prove Lemma 2 for the linear field. The proof for the circular field is basically the same except that extra care should be taken for coordinate transitions around location $x = 0$ or $x = 1$. Consider a more general scheme which does not require that each activation segment has the same length and transmission probability. Let $p_i$, $P_i$, and $\varepsilon_i$ denote the center, the transmission probability, and the length of the $i$th activation segment, respectively, $1 \le i \le m$. Let $q_i$ be the outcome location of the $i$th channel competition, or $q_i = \emptyset$ if no sample packet is received successfully in the $i$th time slot, due to either collision or no transmission. The throughput of the $i$th time slot is

\[ s_i \triangleq \Pr\{q_i \ne \emptyset\} = \varepsilon_i P_i \rho \, e^{-\varepsilon_i P_i \rho}. \tag{B.10} \]

Given that a packet is received successfully in the $i$th time slot, the location $q_i$ is uniformly distributed,

\[ p(q_i \mid q_i \ne \emptyset) = \frac{1}{\varepsilon_i} \mathbf{1}_{p_i - \varepsilon_i/2 \le q_i \le p_i + \varepsilon_i/2}, \tag{B.11} \]

where $\mathbf{1}_A$ is the indicator function. Let $q = [q_1, \ldots, q_m]^T$. Since the activation segments do not overlap, the $q_i$'s are independent. Let $q_{/i}$ denote the length-$(m-1)$ vector constructed by taking out $q_i$ from $q$. The expected $d_{\max}(q)$ is given by

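\begin{align}
E_q\{d_{\max}(q)\}
&= E_{q_{/i}} E_{q_i}\{ d_{\max}(q_{/i}, q_i) \mid q_{/i} \} \notag \\
&= \frac{1}{2} E_{q_{/i}} \Big\{ 2 (1 - s_i) \, d_{\max}(q_{/i}, q_i = \emptyset) \notag \\
&\qquad + \frac{s_i}{\varepsilon_i} \int_{-\varepsilon_i/2}^{\varepsilon_i/2} \big( d_{\max}(q_{/i}, q_i = p_i + a) + d_{\max}(q_{/i}, q_i = p_i - a) \big) \, da \Big\}. \tag{B.12}
\end{align}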

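Suppose that $(\tilde{\varepsilon}_i, \tilde{P}_i)$ give the same throughput as $(\varepsilon_i, P_i)$, that is, $\tilde{\varepsilon}_i \tilde{P}_i \rho \, e^{-\tilde{\varepsilon}_i \tilde{P}_i \rho} = s_i$. And suppose that $\tilde{\varepsilon}_i < \varepsilon_i$. We will show that if $(\varepsilon_i, P_i)$ are replaced by $(\tilde{\varepsilon}_i, \tilde{P}_i)$ while other parameters remain the same, then $E\{d_{\max}(q)\}$ decreases. Since the throughput $s_i$ remains the same, the first term of (B.12) remains the same. If we can show that, for all $q_{/i}$ and for $-\varepsilon_i/2 \le a \le \varepsilon_i/2$,

\begin{align}
& d_{\max}(q_{/i}, q_i = p_i + a) + d_{\max}(q_{/i}, q_i = p_i - a) \notag \\
&\qquad \ge d_{\max}\Big(q_{/i}, q_i = p_i + \frac{\tilde{\varepsilon}_i}{\varepsilon_i} a\Big) + d_{\max}\Big(q_{/i}, q_i = p_i - \frac{\tilde{\varepsilon}_i}{\varepsilon_i} a\Big), \tag{B.13}
\end{align}

then we have shown that the second term of (B.12) decreases. Therefore, we have proved that, with the same throughput, the shorter the activation length, the better the performance. Hence, the optimal $P_i$ is 1 and the optimal $\varepsilon_i$ is less than or equal to $1/\rho$ for all $i$ because these conditions in Aloha give the shortest activation length for a given throughput.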
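The last step, that $P_i = 1$ and $\varepsilon_i \le 1/\rho$ give the shortest activation length for a given throughput, follows from the shape of $x e^{-x}$ with $x = \varepsilon_i P_i \rho$. A minimal Python sketch of this observation is given below; the function names and the bisection solver are illustrative choices, not part of the paper.

```python
import math

def slot_throughput(eps: float, P: float, rho: float) -> float:
    """Throughput of one slot as in (B.10): x * exp(-x) with x = eps * P * rho."""
    x = eps * P * rho
    return x * math.exp(-x)

def shortest_activation_length(s: float, rho: float, tol: float = 1e-12) -> float:
    """Smallest activation length achieving throughput s, obtained with P = 1.

    x * exp(-x) is increasing on [0, 1] and peaks at 1/e, so the smallest
    solution of x * exp(-x) = s lies in (0, 1]; with P = 1 this gives eps = x / rho."""
    if not 0.0 < s <= math.exp(-1.0):
        raise ValueError("throughput must lie in (0, 1/e]")
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid * math.exp(-mid) < s:
            lo = mid
        else:
            hi = mid
    return hi / rho
```

For example, with `rho = 40.0` and a target throughput of `0.3`, the returned length is below `1 / 40`; the same throughput with `P = 0.5` needs an interval twice as long, since only the product of length and transmission probability enters (B.10).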

Next we prove (B.13). Let length-$m$ vectors $q'$, $\tilde{q}$, and $\tilde{q}'$ be functions of $q$ given $q_i \ne \emptyset$: $q'_j = \tilde{q}_j = \tilde{q}'_j = q_j$ for $j \ne i$, $q'_i = 2 p_i - q_i$, $\tilde{q}_i = p_i + (\tilde{\varepsilon}_i/\varepsilon_i)(q_i - p_i)$, and $\tilde{q}'_i = p_i - (\tilde{\varepsilon}_i/\varepsilon_i)(q_i - p_i)$ (Figure 11). Equivalently, we are proving that

\[ d_{\max}(q) + d_{\max}(q') \ge d_{\max}(\tilde{q}) + d_{\max}(\tilde{q}') \tag{B.14} \]

for all $q$ with $q_i \ne \emptyset$, or equivalently, for all $\tilde{q}$ with $\tilde{q}_i \ne \emptyset$. We first define three terms for ease of discussion. $d_{\max}(q)$ is said to be associated with $q_i$ if $q_i$ is one of the endpoints that produces $d_{\max}$ given $q$ as the outcome location vector. $d_{\max}(q)$ is said to be associated with $q_i$ to the inside if $d_{\max}(q)$ is associated with $q_i$ and the center $p_i$ is between the two endpoints of $d_{\max}$. $d_{\max}(q)$ is said to be associated with $q_i$ to the outside if $d_{\max}(q)$ is associated with $q_i$ and the center $p_i$ is not between the two endpoints of $d_{\max}$. We prove (B.14) by verifying all possible cases.

Figure 11: Case 1.

Case 1. Neither $d_{\max}(\tilde{q})$ is associated with $\tilde{q}_i$ nor $d_{\max}(\tilde{q}')$ is associated with $\tilde{q}'_i$. Therefore, $d_{\max}(\tilde{q})$ and $d_{\max}(\tilde{q}')$ are associated with two points other than $\tilde{q}_i$ or $\tilde{q}'_i$ (Figure 11). Since these two points are also adjacent points in $q$ and $q'$, $d_{\max}(q)$ and $d_{\max}(q')$ are at least as large as the distance between the two points. Therefore, $d_{\max}(q) + d_{\max}(q') \ge d_{\max}(\tilde{q}) + d_{\max}(\tilde{q}')$.

Case 2. Either $d_{\max}(\tilde{q})$ is associated with $\tilde{q}_i$ to the outside or $d_{\max}(\tilde{q}')$ is associated with $\tilde{q}'_i$ to the outside. Without loss of generality, assume that $d_{\max}(\tilde{q}')$ is associated with $\tilde{q}'_i$ to the outside (Figure 12). Suppose that the other endpoint for $d_{\max}(\tilde{q}')$ is $q_k$, $k \ne i$. By assumption, $q_k$ and $\tilde{q}'_i$ are on the same side of $p_i$. Thus, it can be verified that $\tilde{q}_i$ and $q_k$ are the two endpoints of $d_{\max}(\tilde{q})$. Therefore,

\[ d_{\max}(\tilde{q}) + d_{\max}(\tilde{q}') = 2 |p_i - q_k|. \tag{B.15} \]

Since $q_i$ and $q_k$ are two adjacent points in $q$, we have $d_{\max}(q) \ge |q_i - q_k|$. Similarly, $d_{\max}(q') \ge |q'_i - q_k|$. Since $q_i$ and $q'_i$ are on the same side of $q_k$, we have

\begin{align}
d_{\max}(q) + d_{\max}(q') &\ge |q_i - q_k| + |q'_i - q_k| \notag \\
&= 2 |p_i - q_k| \notag \\
&= d_{\max}(\tilde{q}) + d_{\max}(\tilde{q}'). \tag{B.16}
\end{align}

Case 3. Either $d_{\max}(\tilde{q})$ is associated with $\tilde{q}_i$ to the inside or $d_{\max}(\tilde{q}')$ is associated with $\tilde{q}'_i$ to the inside, but neither $d_{\max}(\tilde{q})$ is associated with $\tilde{q}_i$ to the outside nor $d_{\max}(\tilde{q}')$ is associated with $\tilde{q}'_i$ to the outside. Without loss of generality, assume that $d_{\max}(\tilde{q})$ is associated with $\tilde{q}_i$ to the inside (Figure 13). Since $q_i$ is further away from the center $p_i$ than $\tilde{q}_i$, we have $d_{\max}(q) > d_{\max}(\tilde{q})$. There are two subcases.

Subcase 1. $d_{\max}(\tilde{q}')$ is associated with $\tilde{q}'_i$ to the inside. Since $q'_i$ is further away from the center $p_i$ than $\tilde{q}'_i$, we have $d_{\max}(q') > d_{\max}(\tilde{q}')$. Therefore,

\[ d_{\max}(q) + d_{\max}(q') > d_{\max}(\tilde{q}) + d_{\max}(\tilde{q}'). \tag{B.17} \]

Subcase 2. $d_{\max}(\tilde{q}')$ is not associated with $\tilde{q}'_i$. With the same argument as in Case 1, we have $d_{\max}(q') \ge d_{\max}(\tilde{q}')$. Therefore, (B.17) still holds.

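The above three cases conclude the proof of (B.14). Thus we have shown that the optimal $P_i$ is 1 and the optimal $\varepsilon_i$ is less than or equal to $1/\rho$ for all $i$. Next we prove that the optimal $\varepsilon_i$ is strictly less than $1/\rho$. Since $E\{d_{\max}(q)\}$ is a continuous function of $\varepsilon_i$, it suffices to prove that, when $P_i = 1$,

\[ \frac{\partial E\{d_{\max}(q)\}}{\partial \varepsilon_i} \bigg|_{\varepsilon_i = 1/\rho} > 0. \tag{B.18} \]

From (B.12),

\begin{align}
\frac{\partial E\{d_{\max}(q)\}}{\partial \varepsilon_i}
= \rho e^{-\varepsilon_i \rho} E_{q_{/i}} \Big\{ & (\varepsilon_i \rho - 1) \, d_{\max}(q_{/i}, q_i = \emptyset) \notag \\
& - \frac{\rho}{2} \int_{-\varepsilon_i/2}^{\varepsilon_i/2} \big( d_{\max}(q_{/i}, q_i = p_i + a) + d_{\max}(q_{/i}, q_i = p_i - a) \big) \, da \notag \\
& + \frac{1}{2} \Big( d_{\max}\big(q_{/i}, q_i = p_i + \tfrac{\varepsilon_i}{2}\big) + d_{\max}\big(q_{/i}, q_i = p_i - \tfrac{\varepsilon_i}{2}\big) \Big) \Big\}. \tag{B.19}
\end{align}

The first term of (B.19) is equal to zero given that $\varepsilon_i = 1/\rho$. From (B.13),

\begin{align}
& d_{\max}\big(q_{/i}, q_i = p_i + \tfrac{\varepsilon_i}{2}\big) + d_{\max}\big(q_{/i}, q_i = p_i - \tfrac{\varepsilon_i}{2}\big) \notag \\
&\qquad \ge d_{\max}(q_{/i}, q_i = p_i + a) + d_{\max}(q_{/i}, q_i = p_i - a) \tag{B.20}
\end{align}

for $-\varepsilon_i/2 < a < \varepsilon_i/2$. Since (B.17) in Case 3 in the proof of the first part occurs with nonzero probability, strict inequality in (B.20) occurs with nonzero probability. Therefore, the sum of the second and the third terms of (B.19) is strictly larger than zero given that $\varepsilon_i = 1/\rho$, thus proving (B.18).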
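For readers who want to retrace the step from (B.12) to (B.19), the differentiation can be written out as follows. The shorthand $D_\emptyset$ and $G(a)$ is introduced here only for this sketch and does not appear in the paper; the interchange of derivative and expectation is assumed to be valid.

```latex
% Shorthand (assumed for this sketch only):
%   D_\emptyset = d_{\max}(q_{/i}, q_i = \emptyset),
%   G(a) = d_{\max}(q_{/i}, q_i = p_i + a) + d_{\max}(q_{/i}, q_i = p_i - a), so G(a) = G(-a).
% With P_i = 1, s_i = \varepsilon_i \rho e^{-\varepsilon_i \rho}, and (B.12) reads
\begin{align*}
E\{d_{\max}(q)\}
  &= E_{q_{/i}}\Big\{ \big(1 - \varepsilon_i \rho e^{-\varepsilon_i \rho}\big) D_\emptyset
     + \frac{\rho e^{-\varepsilon_i \rho}}{2} \int_{-\varepsilon_i/2}^{\varepsilon_i/2} G(a)\, da \Big\}. \\
\intertext{Term-by-term differentiation (product rule and Leibniz rule) gives}
\frac{\partial}{\partial \varepsilon_i}\big(1 - \varepsilon_i \rho e^{-\varepsilon_i \rho}\big)
  &= \rho e^{-\varepsilon_i \rho} (\varepsilon_i \rho - 1), \\
\frac{\partial}{\partial \varepsilon_i}\Big( \frac{\rho e^{-\varepsilon_i \rho}}{2}
     \int_{-\varepsilon_i/2}^{\varepsilon_i/2} G(a)\, da \Big)
  &= \rho e^{-\varepsilon_i \rho} \Big( -\frac{\rho}{2} \int_{-\varepsilon_i/2}^{\varepsilon_i/2} G(a)\, da
     + \frac{1}{2}\, G\big(\tfrac{\varepsilon_i}{2}\big) \Big), \\
\intertext{which together reproduce (B.19). At $\varepsilon_i = 1/\rho$ the first term vanishes and, since
$\tfrac{1}{2} G(\tfrac{\varepsilon_i}{2}) = \tfrac{\rho}{2} \int_{-\varepsilon_i/2}^{\varepsilon_i/2} G(\tfrac{\varepsilon_i}{2})\, da$
when $\varepsilon_i \rho = 1$,}
\frac{\partial E\{d_{\max}(q)\}}{\partial \varepsilon_i} \bigg|_{\varepsilon_i = 1/\rho}
  &= \frac{\rho^2 e^{-1}}{2}\, E_{q_{/i}}\Big\{ \int_{-\varepsilon_i/2}^{\varepsilon_i/2}
     \Big( G\big(\tfrac{\varepsilon_i}{2}\big) - G(a) \Big)\, da \Big\}.
\end{align*}
```

In this form, (B.20) states that the integrand is nonnegative, and strictness on a set of nonzero probability yields (B.18).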

ACKNOWLEDGMENTS

This work was supported in part by the National Science Foundation under Contract CCR-0311055, the Multidisciplinary University Research Initiative (MURI) under the Office of Naval Research Contract N00014-00-1-0564, and the Army Research Laboratory CTA on Communication and Networks under Grant DAAD19-01-2-0011. Part of this work was presented at MILCOM, Monterey, Calif, USA, October 2004.

Figure 12: Case 2.

Figure 13: Case 3.

REFERENCES

[1] L. Tong, Q. Zhao, and S. Adireddy, “Sensor networks with mobile agents,” in Proc. IEEE Military Communications Conference (MILCOM ’03), vol. 1, pp. 688–693, Boston, Mass, USA, October 2003.

[2] P. Venkitasubramaniam, S. Adireddy, and L. Tong, “Sensor networks with mobile access: optimal random access and coding,” IEEE J. Select. Areas Commun., vol. 22, no. 6, pp. 1058–1068, 2004.

[3] A. Woo and D. Culler, “A transmission control scheme for media access in sensor networks,” in Proc. 7th Annual ACM/IEEE International Conference on Mobile Computing and Networking (MobiCom ’01), pp. 221–235, Rome, Italy, July 2001.

[4] W. Ye, J. Heidemann, and D. Estrin, “An energy-efficient MAC protocol for wireless sensor networks,” in Proc. 21st Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM ’02), vol. 3, pp. 1567–1576, New York, NY, USA, June 2002.

[5] K. Sohrabi, J. Gao, V. Ailawadhi, and G. J. Pottie, “Protocols for self-organization of a wireless sensor network,” IEEE Pers. Commun., vol. 7, no. 5, pp. 16–27, 2000.

[6] R. Iyer and L. Kleinrock, “QoS control for sensor networks,” in Proc. IEEE International Conference on Communications (ICC ’03), vol. 1, pp. 517–521, Anchorage, Alaska, USA, May 2003.

[7] M. Dong, L. Tong, and B. M. Sadler, “Impact of MAC design on signal field reconstruction in dense sensor networks,” submitted to IEEE Trans. Signal Processing.

[8] M. Dong, L. Tong, and B. M. Sadler, “Information retrieval and processing in sensor networks: deterministic scheduling vs. random access,” in Proc. IEEE International Symposium on Information Theory (ISIT ’04), pp. 79–79, Chicago, Ill, USA, June–July 2004.

[9] Q. Zhao and L. Tong, “QoS specific medium access control for wireless sensor network with fading,” in Proc. 8th International Workshop on Signal Processing for Space Communications (SPSC ’03), Catania, Italy, September 2003.

[10] Q. Zhao and L. Tong, “Distributed opportunistic transmission for wireless sensor networks,” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’04), vol. 3, pp. 833–836, Montreal, Quebec, Canada, May 2004.

[11] Q. Zhao and L. Tong, “Opportunistic carrier sensing for energy-efficient information retrieval in sensor networks,” EURASIP Journal on Wireless Communications and Networking, vol. 2005, no. 2, pp. 231–241, 2005.

[12] D. Bertsekas and R. Gallager, Data Networks, Prentice-Hall, Englewood Cliffs, NJ, USA, 1992.

Zhiyu Yang received the B.Eng. degree in electronic engineering from Tsinghua University, Beijing, China, in 2000, and the S.M. degree in engineering sciences from Harvard University, Cambridge, Massachusetts, in 2001. Currently, he is a Ph.D. candidate in the School of Electrical and Computer Engineering, Cornell University, Ithaca, New York. His areas of interest include wireless communications, communication and sensor networks, information theory, and signal processing.

Min Dong received the B.Eng. degree from Tsinghua University, Beijing, China, in 1998, and the Ph.D. degree in electrical and computer engineering from Cornell University, Ithaca, New York, in 2004. She is currently with Corporate Research and Development, QUALCOMM Incorporated, San Diego, Calif, USA. Dr. Dong received the IEEE Signal Processing Society Best Paper Award in 2004. Her research interests include statistical signal processing, wireless communications, and communication networks.

Lang Tong is a Professor in the School of Electrical and Computer Engineering, Cornell University, Ithaca, New York. He received the B.E. degree from Tsinghua University, Beijing, China, in 1985, and M.S. and Ph.D. degrees in electrical engineering in 1987 and 1991, respectively, from the University of Notre Dame, Notre Dame, Indiana. He was a Postdoctoral Research Affiliate at the Information Systems Laboratory, Stanford University, in 1991. He was also the 2001 Cor Wit Visiting Professor at the Delft University of Technology. Dr. Tong received the Young Investigator Award from the Office of Naval Research in 1996, the Outstanding Young Author Award from the IEEE Circuits and Systems Society in 1991, the 2004 IEEE Signal Processing Society Best Paper Award (with M. Dong), and the 2004 Leonard G. Abraham Prize Paper Award from the IEEE Communications Society (with P. Venkitasubramaniam and S. Adireddy). He serves as an Associate Editor for the IEEE Transactions on Signal Processing and IEEE Signal Processing Letters. His areas of interest include statistical signal processing, wireless communications, communication networks and sensor networks, and information theory.

Brian M. Sadler received the B.S. and M.S. degrees from the University of Maryland, College Park, and the Ph.D. degree from the University of Virginia, Charlottesville, all in electrical engineering. He is a Senior Research Scientist at the Army Research Laboratory (ARL) in Adelphi, MD, USA. He was a Lecturer at the University of Maryland, and has been lecturing at Johns Hopkins University since 1994 on statistical signal processing and communications. He is an Associate Editor for the IEEE Signal Processing Letters, was an Associate Editor for the IEEE Transactions on Signal Processing, is on the editorial board for the EURASIP Journal on Wireless Communications and Networking, and is a Guest Editor for the IEEE JSAC special issue on Military Communications. He is a Member of the IEEE Technical Committee on Signal Processing for Communications, an IEEE Senior Member, and cochaired the 2nd IEEE Workshop on Signal Processing Advances in Wireless Communications (SPAWC-99). His research interests include signal processing for mobile wireless and ultra-wideband systems, and sensor signal processing and networking.

EURASIP Journal on Wireless Communications and Networking 2005:4, 505–522
© 2005 R. Cristescu and S. D. Servetto

An Optimal Medium Access Control with Partial Observations for Sensor Networks

Razvan Cristescu
Center for the Mathematics of Information, California Institute of Technology, Caltech 13693, Pasadena, CA 91125, USA
Email: [email protected]

Sergio D. Servetto
School of Electrical and Computer Engineering, College of Engineering, Cornell University, 224 Philips Hall, Ithaca, NY 14853, USA
Email: [email protected]

Received 10 December 2004; Revised 13 April 2005

We consider medium access control (MAC) in multihop sensor networks, where only partial information about the shared medium is available to the transmitter. We model our setting as a queuing problem in which the service rate of a queue is a function of a partially observed Markov chain representing the available bandwidth, and in which the arrivals are controlled based on the partial observations so as to keep the system in a desirable mildly unstable regime. The optimal controller for this problem satisfies a separation property: we first compute a probability measure on the state space of the chain, namely the information state, then use this measure as the new state on which the control decisions are based. We give a formal description of the system considered and of its dynamics, we formalize and solve an optimal control problem, and we show numerical simulations to illustrate with concrete examples properties of the optimal control law. We show how the ergodic behavior of our queuing model is characterized by an invariant measure over all possible information states, and we construct that measure. Our results can be specifically applied for designing efficient and stable algorithms for medium access control in multiple-accessed systems, in particular for sensor networks.

Keywords and phrases: MAC, feedback control, controlled Markov chains, Markov decision processes, dynamic programming, stochastic stability.

1. INTRODUCTION

1.1. Multiple access in dynamic networks

Communication in large networks has to be done over an inherently challenging multiple-access channel. An important constraint is associated with the nodes that relay transmissions from the source to the destination (relay nodes, or routers). Namely, the relay nodes have an associated maximum bandwidth, determined for instance by the limited size of their buffers and the finite rate of processing. Thus, the nodes using the relay usually need to contend for access.

A typical example of such a system is a sensor network, where deployed nodes measure some property of the environment like temperature or seismic data. Data from these nodes is transmitted over the network, using other nodes as relays, to one or more base stations, for storage or control purposes. The additional constraints in such networks result from the fact that the resources available at nodes, namely battery power and processing capabilities, are limited. Nodes have to decide on the rate with which to inject packets into a commonly shared relay, but the multiple-access strategy cannot be controlled in a centralized manner by the node that is acting as a relay, since communication with the children is very costly. Moreover, since nodes need to preserve their energy resources, they only switch on when there is relevant or new data to transmit; otherwise they turn idle. As a result, the number of active sources is variable, and thus the amount of bandwidth the nodes get is variable as well. A poorly chosen algorithm for rate control may result in a large number of losses and retransmissions. In the case of sensor networks, this is equivalent to a waste of critical resources, like battery power. It is thus necessary to design simple decentralized algorithms that adaptively regulate the access to the shared medium, keeping the system stable while still providing reasonable throughput. A realistic assumption is that nodes have only limited information available about the state of the system. Thus, the algorithms for rate control, implemented by the data sources, should rely only on limited feedback from the routing node.


Figure 1: Multiple access in a simple network.

We illustrate these issues with a simple network example shown in Figure 1. Nodes 1 and 2 need to control the rate at which they forward their measured and/or relayed data, while relying only on feedback from the router. Node 3 serves a single packet at a time. If the relay is aware of the number of nodes that access it at a given time (in this case, zero, one, or two), it can simply allocate a fair proportion of its bandwidth to each of them, thus avoiding collisions. However, such information is in general available neither at the relay nor at the nodes accessing it.

Suppose each of the two nodes 1 and 2 employs a simple random medium access protocol, defined by two Bernoulli random variables whose parameters u1, u2 are the injection probabilities. Due to the above-mentioned power and communication limitations, the nodes are not able to communicate with each other. For the same reason of minimizing the overhead, they need to control the rate of transmission by using only limited information (feedback) from the relay node. This feedback is usually restricted to acknowledgments of whether the packet sent was accepted or not. Most current protocols for data transmission, including Aloha and TCP, use this kind of information for rate control. Current proposals for medium access protocols in sensor networks make use of randomized controllers. The study of the performance and stability of such protocols is thus of obvious importance.

As an example, suppose node 1 uses a probability of injection u1 = 0.5, that is, it will try to inject on average one packet every two time slots. If it sends a packet and this is accepted (there is free space in the buffer of node 3), an adequate policy will consequently increase its rate u1, since it is probable that node 2 is not active at that particular time. As a result, node 1 accesses the buffer more often. If, on the contrary, the packet is rejected, then it is probable that node 2 is accessing the channel at the same time. Then, node 1 will decrease its rate. Note that care must be taken so that neither of the nodes alone takes full use of the buffer. This fairness can be achieved, for instance, by drastically reducing the injection probability when losses are experienced. The design and analysis of such control policies is the goal of this work.
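To make the example concrete, the following Python sketch shows one hypothetical rule of the kind described above: a small additive increase after an accepted packet and a drastic multiplicative decrease after a loss. It is not the optimal control law derived in this paper; the function name, the step sizes, the lower bound, and the encoding of the feedback as +1, −1, and 0 are all assumptions made for illustration.

```python
def update_injection_probability(u: float, feedback: int,
                                 increase: float = 0.05,
                                 decrease_factor: float = 0.5,
                                 u_min: float = 0.01) -> float:
    """Hypothetical slow-increase / fast-decrease update of the injection probability.

    feedback: +1 if the packet was accepted, -1 if it was rejected,
    0 if the node did not transmit in this slot (assumed encoding)."""
    if feedback == 1:
        # Accepted: the other node is probably idle, so probe for more bandwidth.
        return min(1.0, u + increase)
    if feedback == -1:
        # Rejected: back off sharply so that no single node monopolizes the buffer.
        return max(u_min, u * decrease_factor)
    # No transmission, hence no new information: keep the current rate.
    return u
```

Starting from u1 = 0.5, an acceptance moves the rate to 0.55 while a single loss halves it to 0.25, which is the asymmetric behavior the paragraph describes.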

For such a setting, due to frequent failures on links and the frequent need for rerouting, protocols like TCP are not suitable (e.g., the IEEE 802.11 protocol is based on a random access algorithm). On the other hand, the stability of random access systems (e.g., Aloha [1]) with private feedback is hard to analyze. Our goal is to provide an analysis of systems under variable conditions, where only partial observations are available, and the rate control actions are based on those partial observations.

In this paper, we set up a “toy” problem which is analytically tractable, and which captures in a clean manner some of these issues. We propose a hybrid model, in which nodes get only private feedback from the router, as in TCP. However, TCP behavior (including fairness) is not explicitly imposed, but as we will see further, the resulting system has the slow increase/fast decrease type of behavior specific to TCP. Note that an Aloha type of contention resolution, where no packet goes through if there is a collision, does not take full advantage of the buffering available at relaying nodes. Thus, unlike in Aloha, in our model one packet always goes out of the queue (since the relay has a finite buffer, and filling of the buffer is prevented by the rate control at nodes).

The key property of our model is that the control decisions on which rate a node should use are based on the entire history that is locally available at that node. For a network with partial observations, intuitively this is the best that can be done.

1.2. Related work

The problem of how different sources gain access to a shared queue is an abstraction of the thoroughly studied flow control problem in networks. Many practical and well-debugged algorithms have been developed over the years [2, 3], and more recently, formulations of this problem have taken more analytical approaches, based on game theoretic, optimization, and flows-as-fluids concepts [4, 5, 6, 7]. The flow control problem has also been addressed in sensor networks [8, 9].

Several important issues appear in studying the MAC problem in the sensor network context, including limited power and communication constraints, as well as interference. Contention-based algorithms include the classical examples of Aloha and carrier-sense multiple access (CSMA) [1]. Recently proposed algorithms adapted to the specific requirements of sensor networks are presented in [10, 11, 12, 13]. Scheduling-based algorithms include TDMA, FDMA, and CDMA (time/frequency/code-division multiple access) [14, 15, 16, 17, 18].

The need for a unified theory of control and information in the case of dynamic systems is underlined in the overview of [19], where the author discusses topics related to the control of systems with limited information. These issues are discussed in the context of several examples (stabilizing a single-input LTI unstable system, quantization in a distributed control two-stage setting, and LQG), where improvements in the considered cost functions can be obtained by considering information and control together, namely by “measuring information upon its effect on performance.” Extensive work along these lines is presented in [20], where the author derived techniques which consider the use of partial information for capacity optimization of Markov sources and channels, formulated as dynamic programming problems.

Figure 2: The problem of N sources sharing a single finite buffer. When each source gets to observe the state of the entire network, this problem degenerates to the single-source case. The interesting case, however, occurs when sources only have partial information about the state of the system, and they must base decisions about when to access the channel only on that partial data.

The main tool we use in this work is control theory with partial information. An important quantity in this context is the information state, which is a probability vector that captures the most that can be inferred about the state of the system at a certain time instant, given the system behavior at previous time instants. There are some important results in the literature on convergence in distribution of the information state when the state of a system can only be inferred from partial observations. Kaijser proved convergence in distribution of the information state for finite-state ergodic Markov chains, for the case when the chain transition matrix and the function which links the partial observation with the original Markov chain (the observation function) satisfy some mild conditions [21]. Kaijser’s results were used by Goldsmith and Varaiya in the context of finite-state Markov channels [22]. This convergence result is obtained as a step in computing the Shannon capacity of finite-state Markov channels, and it holds under the crucial assumption of i.i.d. inputs: a key step of that proof is shown to break down for an example of Markov inputs. This assumption is removed in a recent work of Sharma and Singh [23], where it is shown that for convergence in distribution, the inputs need not be i.i.d., but in turn the pair (channel input, channel state) should be drawn from an irreducible, aperiodic, and ergodic Markov chain. Their convergence result is proved using the more general theory of regenerative processes. However, using these results directly in our setting does not yield the sought result of weak convergence and thus stability, as we will show that the optimal control policy is a function of the information state, whereas in previous work, inputs are independent of the state of the system. This dependence due to feedback control is the main difference between our setup and previous work.
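To illustrate what an information state is, the sketch below implements the generic Bayesian filter for a finite-state hidden Markov chain: predict with the transition matrix, weight by the likelihood of the new observation, and normalize. This is the textbook recursion, shown under the assumption that observation likelihoods are available as a plain list; it is not the specific recursion of this paper, in which the likelihoods also depend on the control. The function name is hypothetical.

```python
from typing import List

def update_information_state(pi: List[float],
                             P: List[List[float]],
                             likelihood: List[float]) -> List[float]:
    """One step of a generic hidden-Markov filter.

    pi[i]         current probability that the chain is in state i
    P[i][j]       transition probability from state i to state j
    likelihood[j] probability of the latest observation given the new state j
    """
    n = len(pi)
    # Prediction: push the current belief through the transition matrix.
    predicted = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    # Correction: weight each state by how well it explains the observation.
    unnormalized = [predicted[j] * likelihood[j] for j in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [w / total for w in unnormalized]
```

For instance, with `pi = [0.5, 0.5]`, `P = [[0.9, 0.1], [0.2, 0.8]]`, and `likelihood = [0.8, 0.3]`, the predicted belief is `[0.55, 0.45]` and the updated belief puts weight `0.44 / 0.575` on the first state.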

1.3. Main contributions and organization of the paper

We formulate, analyze, and simulate a MAC system where only partial information about the channel state is available. The optimal controller for this problem satisfies a separation property: we first compute a probability measure on the state space of the chain, namely the information state, then use this measure as the new state based on which to make control decisions. Then, we show numerical simulations to illustrate with concrete examples properties of the optimal control law. Finally, we show how the ergodic behavior of our queuing model is characterized by an invariant measure over all possible information states, and we construct that measure.

Figure 3: Illustration of the proposed model. N sources switch between on/off states. When a source is in the on state, it generates symbols with a (controllable) probability $u_k^{(i)}$. When it is in the off state, it is silent.

This paper is organized as follows. In Section 2, we set up a model of a queuing system in which multiple sources compete for access to a shared buffer, we describe its dynamics, and we formulate and solve an appropriate stochastic control problem. We also present results obtained in numerical simulations to illustrate with concrete examples properties of these control boxes. Then, in Section 3, we study ergodic properties of the queuing model that result from operating the system of Section 2 under closed-loop control. There, we show how long-term averages are described succinctly in terms of a suitable invariant measure, whose existence is first proved, and then effectively constructed. The paper concludes with Section 4.

2. THE CONTROL PROBLEM

2.1. System model and dynamics

Consider the following discrete-time model (see Figure 2).

(i) N sources feed data into the network, switching between on/off states in time. While on, source $S^{(i)}$ generates a symbol at time $k$ with probability $u_k^{(i)}$, and remains silent with probability $1 - u_k^{(i)}$; while off, the source remains silent with probability 1. Given the intensity value $u_k^{(i)}$, this coin toss is independent of everything else (see Figure 3).


Figure 4: The only information a source has about the network is a sequence of 3-valued observations: acknowledgments, if the symbol was accepted by the buffer; losses, if it is rejected due to overflow; and nothing, if the decision was not to transmit at the current moment (denoted by 1, −1, 0, resp.).

(ii) The queue has a finite buffer. When a source generates a symbol to put in this buffer, if the buffer is full, then the symbol is dropped and the source is notified of this event; if there is room left in the buffer, the symbol is accepted, and the source is notified of this event as well. Note that feedback is sent only to the source that generates a symbol, and not to all of them.

(iii) The control task consists of choosing values for all $u^{(i)}$'s, at all times. A basic assumption we make is that sources are not allowed to coordinate their efforts in order to choose an appropriate set of control actions $u^{(i)}$ (i = 1, . . . , N): instead, the only cooperation we allow is in the form of having all sources implement the same control technique, based on feedback they receive from the queue.

(iv) The service rate of the queue is deterministic.

An illustration of this proposed model is shown in Figure 4. The dynamics of this system are modeled as follows.

(i) $x_k \in S = \{1, \dots, N\}$ is the number of on-sources at time k, modeled as a finite-state Markov chain (see footnote 1) with known matrix P of transition probabilities $P_{ij}$ given by $p(x_k = j \mid x_{k-1} = i)$ (independent of the source intensities $u_k^{(i)}$ and of the time index k), and known $p(x_0)$ (the initial distribution over states).

(ii) $r_k^{(i)} \in O = \{-1, 0, 1\}$ is the ternary feedback from the queue to the source. The convention we use is that −1 denotes losses, 0 denotes idle periods, and 1 denotes positive acknowledgments.

(iii) $u_k^{(i)} \in U$, where $U = (0, 1]$, are the source intensities, controllable (as defined above).

(iv) $q_{k+1} = \min(\max(q_k + a_k - c, 0), B)$ is the queue size at moment k + 1, with $a_k$ the number of accepted packets, c the number of departing packets (c has a constant value), and B the maximum buffer size. If a new packet is accepted, the queue generates an $r_k = 1$ private acknowledgment to the source from which the packet originated, and if the packet is not accepted, the queue generates an $r_k = -1$ acknowledgment.

(v) p(r | x,u) is the probability of occurrence of an observation r ∈ O, when x sources are active, and when symbols are generated by all active sources at an average rate u. These probabilities can be computed a priori: for a finite but large enough buffer, a good approximation for p(r | x,u) is illustrated in Figure 5. Note that in this approximation, the values of p(r | x,u) do not depend on the maximum size of the buffer B, nor on the instantaneous queue size $q_k$.

Footnote 1: For example, it is straightforward to prove that if the on/off process of each source is modeled as a two-state Markov process, then the total number of active sources is also a finite-state Markov chain.

[Figure 5 plot: probability (vertical axis, with marks at T, 1/i, 1 − 1/i, and 1) versus control intensity (horizontal axis, with marks at 0, 1/i, u∗, and 1), showing the curves Pr(0 | i,u), Pr(−1 | i,u), and Pr(1 | i,u).]

Figure 5: Consider a fixed (observed) state i, and assume a large finite shared buffer (for simplicity; if not, these curves would have to be replaced by curves derived from large deviations estimates such as given by the Chernoff bound). The probability of a packet loss is zero until the injection rate hits the fairness point 1/i, beyond which it increases linearly, and the probability of a packet finding available space in the shared buffer increases linearly up until the fairness point 1/i, beyond which it remains constant. Note that u∗ > 1/i is the largest u ∈ (0, 1] such that p(−1 | i,u) ≤ T; the gap between 1/i and u∗ is the “margin of freedom”: we will have to risk the loss of packets in the case when i cannot be observed.
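The observation model of Figure 5 and the queue recursion in item (iv) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the slopes of the linear segments are an assumption read off the figure (they are the ones that make the three probabilities sum to one when the idle probability is 1 − u).

```python
def obs_prob(r, x, u):
    """Approximate p(r | x, u) for x active sources and intensity u (large buffer)."""
    fair = 1.0 / x                      # fairness point 1/x
    if r == 1:                          # ack: grows linearly up to 1/x, then stays flat
        return min(u, fair)
    if r == -1:                         # loss: zero up to 1/x, then grows linearly
        return max(u - fair, 0.0)
    if r == 0:                          # idle: the source chose not to transmit
        return 1.0 - u
    raise ValueError("r must be -1, 0, or 1")


def queue_step(q, a, c, B):
    """One step of the queue recursion q_{k+1} = min(max(q_k + a_k - c, 0), B)."""
    return min(max(q + a - c, 0), B)


# Example: 4 active sources, intensity 0.3 (just above the fairness point 0.25).
probs = [obs_prob(r, 4, 0.3) for r in (-1, 0, 1)]
print(probs)
print(queue_step(q=3, a=2, c=1, B=4))
```

Note that, as stated in item (v), the approximation does not use the buffer size B or the queue size; `queue_step` is shown only to make the recursion concrete.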

These dynamics are illustrated in Figure 6.

There are two important observations to make about how we have chosen to set up our model. Describing the probabilities of observations p(r | x,u) only in terms of the number of active sources x and the average injection rate u of all the active sources does require some justification: how can we assume that all sources inject the same amount of data, when the data on which these decisions are based (feedback from the queue) is not shared, and each source gets its own private feedback? Although this might seem unjustified, that is not the case. Once we study in some detail the control problem we are setting up here, we will find that the optimal control action $u_k$ at time k is given by a memoryless function $u_k = g(\pi_k)$ of a random vector π that has the same distribution for all sources, and with well-defined ergodic properties (a precise study of these ergodic properties is the subject of Section 3). Therefore, even though at any point in time there will likely be some sources getting more and some other sources getting less than their fair share, on average all get the same. This issue is further discussed below, both analytically (in Section 3) and in terms of numerical results (in Figure 10).

[Figure 6 diagram: a hidden Markov chain over the states 1, ..., i − 1, i, i + 1, ..., N − 1, N, with transitions labeled p(i | i − 1), p(i − 1 | i), p(i − 1 | i − 1), p(i | i), and p(i + 1 | i + 1); each state emits the observations Loss, Ack, and N with probabilities pL, pA, and pN, labeled p(−1 | i,u), p(1 | i,u), and p(0 | i,u).]

Figure 6: An illustration of the model from the point of view of a single source, based on a simple birth-and-death chain for the evolution of the number of active sources.

Another important thing to note is that there are strong similarities between our model and the formalization of multiaccess communication that led to the development of the Aloha protocol. However, the fact that feedback is not broadcast to all active sources in our model is a major difference between our formulation and that one. In fact, we conceived our model as an analytically tractable “hybrid” between Aloha and TCP. As in slotted Aloha, time is discrete, feedback is instantaneous, and the state follows a Markovian evolution; but as in TCP, feedback is private to the source that generated a transmitted packet.

Hajek [24] reviews a series of results for the two usual models for Aloha (finite number of users and one packet at a time, and infinite number of users). Decentralized policies for the injection probabilities that maintain stability in the case of private acknowledgment feedback are hard to derive for the infinite-nodes case with Poisson arrivals. There is, however, important work [24] about stability in the finite-nodes study of Aloha. The theory in [24] is applied, as an example, to finding conditions of stability for multiplicative policies for sources that are supplied with Poisson arrivals. We expect that the theory we develop in this paper will provide a useful background for an Aloha model with random arrivals (not necessarily Poisson), with a finite number of backlogged packets, and its extension to the infinite-user model.

2.2. Formal problem statement

Intuitively, what we would like to do is to maximize the rate at which information flows across this queue, subject to the constraint of not losing too many packets. Since each time we attempt to put a packet into the shared buffer there is a chance that this packet may be lost, it seems intuitively clear that without accepting the possibility of losing a few packets, the throughput that can be achieved will be low; at the same time, we do not want a high packet loss rate, as this would correspond to a highly unstable mode of operation for our system.

This intuition is formalized as follows. Our goal is to find a policy $g = (u_1, \dots, u_K)$ that solves

$$\max_{g} \; \limsup_{K \to \infty} \frac{1}{K} \sum_{k=1}^{K} p(r_k = 1 \mid x_k, u_k), \quad \text{subject to } p(r_k = -1 \mid x_k, u_k) \le T, \; \forall k, \tag{1}$$

where T ∈ (0, 1] is a parameter that specifies the maximum acceptable rate of packet losses (see footnote 2). Note that we use a lim sup in the definition of our utility function (instead of a regular limit) because we do not know yet that the limit actually exists, although it certainly does, as will be shown later.

2.3. Warming up: finite horizon and observed state

We start with the solution to an “easier” version of our control problem: one in which the state of the chain (i.e., the number of active sources at any time) is known to all the sources. Although this would certainly not be a reasonable assumption to make (it does trivialize the problem), we find that looking at the solution to the general problem in this specific case is actually quite instructive, and so we start here as a step towards the solution of the case of true interest (hidden state).

The problem formulated above is a textbook example of a problem of optimal control for controlled Markov chains, and its solution is given by an appropriate set of dynamic programming equations [25]. Define $c(u) = [p(1 \mid 1,u) \cdots p(1 \mid N,u)]^{\top}$, and then

$$V_K(i) = 0, \tag{2}$$

$$V_k(i) = \sup_{u : p(-1 \mid i,u) \le T} \{ c(u) + P V_{k+1} \} = \sup_{u : p(-1 \mid i,u) \le T} \{ c(u) + C \} \quad (C \text{ independent of } u). \tag{3}$$

Equation (2) is set to 0 because this is only a finite-horizon approximation, but we are interested in the infinite-horizon case, and in this case, the boundary condition given by $V_K = 0$ has a vanishing effect as we let K → ∞. What is more interesting is that from (3), it follows that a greedy controller is optimal: this is not at all unexpected, since in our model the transition probabilities P are not affected by control, only observations are. The interplay between control and the different probabilities of observations is illustrated in Figure 5.

Footnote 2: In Figure 9, on numerical simulations, we illustrate how this parameter affects the behavior of the controller.

[Figure 7 diagram: a feedback loop in which the System produces an Observation, Estimation turns it into an Information state, and the Control law maps that state to a Control action that is fed back into the System.]

Figure 7: Illustrates the separation of estimation and control. Suppose we have a controlled system, which produces certain observable quantities related to its unobserved state. Based on these observations, we compute an information state, a quantity that somehow must capture all we can infer about the state of the system given all the information we have seen so far (this concept will be made rigorous later). This information state is fed into a control law that uses it to decide which control action to choose, and this action is fed back into the system.
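The greedy controller for the observed-state case of Section 2.3 can be sketched as follows. This is a minimal sketch under the piecewise-linear approximation of Figure 5 (with slopes assumed to be 1); the function names are hypothetical. Because the loss probability is max(u − 1/i, 0), the largest feasible intensity has the closed form min(1, 1/i + T), which is the point u∗ of Figure 5.

```python
def ack_prob(i, u):
    """p(1 | i, u): linear up to the fairness point 1/i, constant afterwards."""
    return min(u, 1.0 / i)


def loss_prob(i, u):
    """p(-1 | i, u): zero up to the fairness point 1/i, linear afterwards."""
    return max(u - 1.0 / i, 0.0)


def greedy_control(i, T):
    """Largest u in (0, 1] with p(-1 | i, u) <= T when the state i is observed."""
    return min(1.0, 1.0 / i + T)


T = 0.05
for i in (1, 2, 5, 10):
    u = greedy_control(i, T)
    # Columns: state, chosen intensity, acknowledgment probability, loss probability.
    print(i, u, ack_prob(i, u), loss_prob(i, u))
```

Under this approximation every u between 1/i and u∗ yields the same acknowledgment probability 1/i, so the sketch simply returns the largest feasible value.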

2.4. One step closer to reality: partial information

Definition 1. Denote the simplex of N-dimensional probability vectors by $\Pi = \{(p_1, \dots, p_N) \in \mathbb{R}^N : p_i \ge 0, \; \sum_{i=1}^{N} p_i = 1\}$.

The case of partial information (i.e., when the underlying Markov chain cannot be observed directly) poses new challenges. The problem in this case is that Markovian control policies based on state estimates are not necessarily optimal. Instead, optimal policies satisfy a “separation” property, illustrated in Figure 7 and extensively discussed in [25, pages 84–87].

Formally, an information state $\pi_k$ is a function of the entire history of observations and controls $r_0 \cdots r_{k-1}$, $u_0 \cdots u_{k-1}$, with the extra requirement that $\pi_{k+1}$ can be computed from $\pi_k$, $r_k$, $u_k$ (see footnote 3). A typical choice is to let $\pi_k$ be $p(x_k \mid r^{k-1}, u^{k-1})$, the conditional probability of $x_k$ given all the past observations $r^{k-1} = (r_0, \dots, r_{k-1})$ and applied controls $u^{k-1} = (u_0, \dots, u_{k-1})$. Then, an optimal controller for partially observed Markov chains also satisfies a set of dynamic programming equations, but instead of being over the states of the chain (a finite number), these equations are defined over information states [25] (i.e., over all points in the simplex Π of probabilities over N points):

$$V_K(\pi) = 0, \qquad V_k(\pi) = \sup_{u : E_\pi p(-1 \mid i,u) \le T} E_\pi \{ c(i,u) + V_{k+1}(F[\pi, u, r]) \}, \tag{4}$$

where F denotes the recursive updates of π, and where the notation $E_\pi$ denotes expectation relative to the measure π.

Footnote 3: Note that this is a very reasonable requirement to make of something that we would like to think of as capturing some notion of state for our system.

A straightforward derivation gives the information-state transition function F:

$$\pi_{k+1} = F[\pi_k, u_k, r_k] = C_{\pi_k} \cdot \pi_k \cdot D(u_k, r_k) \cdot P, \tag{5}$$

with $C_{\pi_k}$ a normalizing constant, P the transition-probability matrix of the underlying chain, and $D(u, r) = \mathrm{diag}[p(r \mid 1,u) \cdots p(r \mid N,u)]$ a diagonal matrix. This is essentially the same set of DP equations as before, but where dependence on states is removed by averaging with respect to the current information state $\pi_k$. And as before, the optimal controller is chosen by recording for each π the value of u that achieves the supremum in the right-hand side of (4). The optimal control will thus be a function of only the information state, u = g(π).
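The recursion (5) is a standard filtering step, and a small sketch may help make it concrete. The code below is illustrative only: the helper names, the 3-state transition matrix, and the intensity 0.5 are hypothetical, and the observation probabilities are the Figure 5 approximation with slopes assumed to be 1.

```python
def obs_prob(r, x, u):
    """Approximate p(r | x, u) read off Figure 5 (assumed slopes of 1)."""
    fair = 1.0 / x
    return {1: min(u, fair), -1: max(u - fair, 0.0), 0: 1.0 - u}[r]


def update(pi, u, r, P):
    """F[pi, u, r] from (5): normalize the row vector pi * D(u, r) * P."""
    n = len(pi)
    weighted = [pi[i] * obs_prob(r, i + 1, u) for i in range(n)]   # pi * D(u, r)
    moved = [sum(weighted[i] * P[i][j] for i in range(n)) for j in range(n)]
    total = sum(moved)                                             # equals 1 / C_pi
    if total == 0.0:
        raise ValueError("observation r has zero probability under pi and u")
    return [m / total for m in moved]


# Hypothetical 3-state symmetric birth-and-death chain.
P = [[0.9, 0.1, 0.0],
     [0.1, 0.8, 0.1],
     [0.0, 0.1, 0.9]]
pi = [1 / 3, 1 / 3, 1 / 3]
after_ack = update(pi, 0.5, 1, P)     # mass moves towards few active sources
after_loss = update(pi, 0.5, -1, P)   # mass moves towards many active sources
after_idle = update(pi, 0.5, 0, P)    # D is a multiple of the identity: pi * P
print(after_ack, after_loss, after_idle)
```

In this example an acknowledgment shifts probability mass towards states with few active sources and a loss shifts it the other way, which matches the qualitative behavior described for Figure 8.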

2.5. Infinite horizon

In the previous sections, we derived the solution for the optimal control in the case of partial observations when the time horizon is finite. We can now return to the infinite-horizon problem stated in (1). The dynamic programming algorithm becomes a fixed-point system of equations with the unknowns spanning the simplex Π. Indeed, we start from the finite-horizon case:

$$V_K(\pi) = \sup_{u : E_\pi p(-1 \mid i,u) \le T} E_\pi \left[ c(i,u) + V_{K-1}(F[\pi, u, r_k]) \right]. \tag{6}$$

We rewrite (6) as [25]

$$\frac{V_K(\pi)}{K} = \sup_{u : E_\pi p(-1 \mid i,u) \le T} E_\pi \left[ c(i,u) + V_{K-1}(F[\pi, u, r_k]) - V_K(\pi) + \frac{V_K(\pi)}{K} \right]. \tag{7}$$

Assume that the following limits exist for all π ∈ Π and some $J^*$:

$$\lim_{K \to \infty} \left( V_K(\pi) - K J^* \right) = V_\infty(\pi). \tag{8}$$

Then by taking the limit K →∞ in (7), we finally get

$$J^* + V_\infty(\pi) = \sup_{u : E_\pi p(-1 \mid i,u) \le T} E_\pi \left[ c(i,u) + V_\infty(F[\pi, u, r_k]) \right]. \tag{9}$$
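As a reading aid, the passage to (9) can also be sketched by substituting the assumed limit (8) directly into (6). The following is a heuristic sketch rather than part of the original derivation: it assumes that the convergence in (8) is uniform over Π, so that the vanishing error term o(1) can be taken outside the supremum and the expectation.

```latex
% From (8): V_K(\pi) = K J^* + V_\infty(\pi) + o(1) as K \to \infty.
% Substituting this for V_K and V_{K-1} in (6):
\begin{align*}
K J^* + V_\infty(\pi) + o(1)
  &= \sup_{u : E_\pi p(-1 \mid i,u) \le T}
     E_\pi\bigl[ c(i,u) + (K-1) J^* + V_\infty(F[\pi,u,r_k]) \bigr] + o(1) \\
\Longrightarrow \quad
J^* + V_\infty(\pi)
  &= \sup_{u : E_\pi p(-1 \mid i,u) \le T}
     E_\pi\bigl[ c(i,u) + V_\infty(F[\pi,u,r_k]) \bigr] + o(1).
\end{align*}
% Letting K \to \infty removes the o(1) term and gives (9).
```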

The DP equation in (9) actually holds under more general conditions that are easy to verify for our model [25]. In our model, the transition-probability matrix P does not depend on the control policy. Further, the Markov chain given by the number of active sources is irreducible in normal circumstances. It is shown in [25] that if these conditions are fulfilled, then the DP equation system for the average cost criterion is as in (9), and there exist V(π), π ∈ Π, and $J^*$ that solve it. Also, $J^*$ is the optimal average value (a minimum average cost in the formulation of [25]; here, the maximum average throughput), and a policy g is optimal if g(π) attains the supremum in (9).


One might attempt to solve the fixed-point system in (9) with an iterative algorithm on a discretized version of the system of equations. However, there are practical difficulties in implementing and simulating the optimal controller in the partial information case as defined above, having to do with the fact that our state space is the whole simplex of probability distributions Π. Our approach to finding an approximate solution for the optimization problem (1) is to solve the dynamic programming system for the finite-horizon case (finite K), and to study the properties of the obtained control policy by numerical simulations.

2.6. Numerical simulations

To help develop some intuition for what kind of properties result from the optimal control laws developed in previous sections, in this section we present results obtained in numerical simulations. Our approximation consists in choosing the maximum control at time k that still obeys the loss constraint, since this will also maximize the throughput. In Figure 8, we present a typical evolution over time of the information state; in Figure 9, we illustrate how different values of the threshold T influence the behavior of the controller; and in Figure 10, we address the fairness issue raised at the end of Section 2.1.
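The approximation described above (the largest control that still obeys the loss constraint under the current information state) can be sketched as a one-dimensional search. This is a hypothetical sketch, not the simulation code behind the figures: it assumes the Figure 5 loss model with slope 1 and uses bisection, which applies because the expected loss is continuous and nondecreasing in u.

```python
def expected_loss(pi, u):
    """E_pi p(-1 | i, u), with p(-1 | i, u) = max(u - 1/i, 0) as in Figure 5."""
    return sum(w * max(u - 1.0 / (i + 1), 0.0) for i, w in enumerate(pi))


def max_feasible_control(pi, T, iters=60):
    """Largest u in (0, 1] with expected_loss(pi, u) <= T, found by bisection."""
    if expected_loss(pi, 1.0) <= T:
        return 1.0
    lo, hi = 0.0, 1.0                   # lo is always feasible, hi never is
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if expected_loss(pi, mid) <= T:
            lo = mid
        else:
            hi = mid
    return lo


N, T = 10, 0.04                          # values quoted in the caption of Figure 8
uniform = [1.0 / N] * N
print(max_feasible_control(uniform, T))
```

When π is concentrated on a single state i, the search returns (up to numerical precision) the observed-state value min(1, 1/i + T).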

In all our simulations, we compare our controller, which operates with partial observations, with the optimal genie-aided controller that would be used if the number of active sources were known. Note that the difference between the optimal genie-aided controller and the controller derived by our algorithm depends on the two defining parameters of the system: the loss threshold T and the transition-probability matrix P. Namely, our controller adapts faster to the network conditions if the transition matrix P corresponds to a slowly changing Markov chain; on the other hand, a larger threshold T implies better adaptation, at the expense of an increased level of losses.

3. PERFORMANCE ANALYSIS

3.1. Overview

3.1.1. Problem formulation

In Section 2, we gave a model for the system of interest, we described its dynamics, we formulated an optimal control problem, we showed how this problem can be solved using standard techniques developed in the context of controlled Markov chains [25], and we developed numerical simulations to illustrate with concrete examples properties of the queues operating under feedback control. Now, once we have that optimal control algorithm, each source gets to operate the queue based on its local controller, thus resulting in a “decoupling” of the problem, as illustrated in Figure 11.

Perhaps the first question that comes to mind once we formulate the picture shown in Figure 11 is about ergodic properties of the resulting controlled queues. Specifically, we will be interested in two quantities.

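(i) Average throughput:

$$J(g) = \lim_{K \to \infty} \frac{1}{K} \sum_{k=1}^{K} p(1 \mid x_k, g(\pi_k)) \overset{?}{=} \int_{\{x,\pi\}} p(1 \mid x, g(\pi)) \, d\nu(x,\pi). \tag{10}$$

(ii) Average loss rate:

$$\lim_{K \to \infty} \frac{1}{K} \sum_{k=1}^{K} p(-1 \mid x_k, g(\pi_k)) \overset{?}{=} \int_{\{x,\pi\}} p(-1 \mid x, g(\pi)) \, d\nu(x,\pi). \tag{11}$$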

Therefore we see that, in both cases, the questions of interest are formulated in terms of a suitable invariant measure. Since we have assumed the underlying finite-state Markov chain to be irreducible and aperiodic, this chain does admit a stationary distribution. Therefore, a sufficient condition for the existence of the sought measure ν is the weak convergence of the sequence of information states $\pi_k$ to some limit distribution over the simplex Π of probability distributions on N points. To start developing some intuition on what to expect in terms of the sought convergence result, it is quite instructive to look at typical trajectories of the information state, as shown in Figure 12.
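The time averages on the left-hand sides of (10) and (11) can be estimated by simulating the closed loop from the point of view of a single source. The sketch below is only an illustration under several assumptions that are not in the text: the Figure 5 observation model with slopes of 1, a hypothetical symmetric birth-and-death chain, the "largest feasible control" rule of Section 2.6 for g, and arbitrary parameter values.

```python
import random


def obs_prob(r, x, u):
    """Approximate p(r | x, u) read off Figure 5 (assumed slopes of 1)."""
    fair = 1.0 / x
    return {1: min(u, fair), -1: max(u - fair, 0.0), 0: 1.0 - u}[r]


def birth_death(n, p):
    """Symmetric birth-and-death transition matrix (hypothetical example chain)."""
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if i > 0:
            P[i][i - 1] = p
        if i < n - 1:
            P[i][i + 1] = p
        P[i][i] = 1.0 - sum(P[i])
    return P


def control(pi, T, iters=40):
    """Largest u in (0, 1] with E_pi p(-1 | i, u) <= T, by bisection."""
    def loss(u):
        return sum(w * obs_prob(-1, i + 1, u) for i, w in enumerate(pi))
    if loss(1.0) <= T:
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if loss(mid) <= T else (lo, mid)
    return lo


def update(pi, u, r, P):
    """Information-state recursion (5): normalize pi * D(u, r) * P."""
    n = len(pi)
    w = [pi[i] * obs_prob(r, i + 1, u) for i in range(n)]
    nxt = [sum(w[i] * P[i][j] for i in range(n)) for j in range(n)]
    total = sum(nxt)
    return [v / total for v in nxt]


def simulate(n=5, p=0.05, T=0.05, steps=5000, seed=0):
    """Return the empirical time averages appearing on the left of (10) and (11)."""
    rng = random.Random(seed)
    P = birth_death(n, p)
    pi = [1.0 / n] * n                     # start from the uniform stationary law
    x = rng.randrange(1, n + 1)            # hidden number of active sources
    ack = loss = 0.0
    for _ in range(steps):
        u = control(pi, T)
        ack += obs_prob(1, x, u)
        loss += obs_prob(-1, x, u)
        weights = [obs_prob(r, x, u) for r in (-1, 0, 1)]
        r = rng.choices((-1, 0, 1), weights=weights)[0]   # private feedback
        pi = update(pi, u, r, P)
        x = rng.choices(range(1, n + 1), weights=P[x - 1])[0]
    return ack / steps, loss / steps


throughput, loss_rate = simulate()
print(throughput, loss_rate)
```

Such a simulation only estimates the time averages; whether they coincide with the integrals against ν is exactly the question that Theorem 1 addresses.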

We state now the main theorem of this paper.

Theorem 1. The sequence $\pi_k$ converges weakly to a limit distribution ν over the simplex Π.

The proof will follow after we briefly review some previous related work.

3.1.2. Some related work

Note that the stability of the control policy cannot in general be proven using a Lyapunov function, since the dependence of the optimal control on the information state is not a closed-form function.

In view of the previous results [21, 22, 23], a seemingly feasible approach to establish the sought convergence for our system would have been to consider the control action u ∈ U to play the role of a channel input in the setup of [22], while the observations r ∈ O could have played the role of a channel output (thus making the control u and the observation r the available partial observations). However, this approach does not yield the sought result. In our system, the control u is a function of the information state, that is, it depends on the state of the system, but in those previous papers, inputs are independent of the state of the system.

3.1.3. Weak convergence of the information state: steps of the proof

The proof of weak convergence of π involves five steps.

(1) First, we show that the sequence of information states $\pi_k$ has the Markov property itself. This is a Markov chain taking values in an uncountable space, though (the simplex Π).

[Figure 8 plots: six panels of state probability (vertical axis, 0 to 0.4) versus number of sources (horizontal axis, 1 to 10); the recoverable panel titles are "Moment k = 4; obs. = 0" and "Moment k = 19; obs. = −1".]

Figure 8: Illustrates typical dynamics of π. This plot corresponds to a symmetric birth-and-death chain as shown in Figure 6, with probability of switching to a different state p = 0.001, N = 10 sources, and loss threshold T = 0.04. At time 0, the initial $\pi_0$ is taken to be $\pi_s(i) = 1/N$, the stationary distribution of the underlying birth-and-death chain. While there are no communication attempts (up until time k = 6), $\pi_k$ remains at $\pi_s$. Then at time 6, a packet is injected into the network and it is accepted, and as a result, there is a shift in the probability mass towards the region in which there is a small number of active sources. Then at time 19, another communication attempt takes place but this time the packet is rejected, and as a result, now the probability mass shifts to the region of a large number of active sources. We have observed this type of oscillation repeatedly, and it gives a very pleasing intuitive interpretation of what the optimal controller does: keep pushing the probability mass to the left (because that is the region where more frequent communication attempts occur, which leads to maximization of throughput), while dealing with the fact that losses push the mass back to the right. Similar oscillations are also typical of linear-increase multiplicative-decrease flow control algorithms such as the one used in TCP.

(2) Then we discretize the simplex Π, and we show that for all “small enough” discretizations, there is at least one observation taking $\pi_k$ out of any cell with positive probability. With this, we make sure that there are no absorbing cells, that is, no cells in which the chain gets stuck forever once it hits them.

(3) Then we show that the stationary distribution $\pi_s$ of the underlying (finite-state) Markov chain is a point reachable from anywhere in the simplex. With this, we make sure that there is at least one cell which can be reached from any initial point in Π, and hence that the set of recurrent cells is not empty.

[Figure 9 plots: three panels (a), (b), (c) of control (vertical axis, 0 to 1) versus time (horizontal axis, 0 to 600); legend: Desired source, Oracle.]

Figure 9: Illustrates how the value of the loss threshold T affects the optimal control law. In this case, we consider the same birth-and-death model considered in Figure 8, with three different values for T: top-left, T = 0.1; top-right, T = 0.02; bottom, T = 0.05. In all plots, the horizontal axis is time, the vertical axis is control intensity, and two controllers are shown: the thick black line corresponds to our optimal control law, the thin dotted line corresponds to a genie-aided controller that can observe the hidden state. We observe a number of interesting things: (i) when T is large (a), our optimal control stays most of the time above the fair share point determined by the actions of the genie-aided controller; (ii) also when T is large, we see that sudden increases in bandwidth are quickly discovered by our optimal law; (iii) when T is small (b), the gap between the control actions of our optimal law and the genie-aided law is smaller, but our law has a hard time tracking a sudden increase in available bandwidth; (iv) for intermediate values of T (c), both the size of the gap and the speed with which changes in available bandwidth can be tracked are in between the previous two cases. These plots also suggest another intuitively very pleasing interpretation: T is a measure of how “aggressive” our optimal control law is.

(4) Consider next any “small enough” discretization of the space, and define a new process whose values are the cells of this discretization, based on whether $\pi_k$ hits a particular cell. Then, this new process is (finite-state) Markov, and positive recurrent on a nonempty subset of the cells, and therefore it admits an invariant measure itself.

[Figure 10 plots: two panels (a), (b) of control (vertical axis, 0.3 to 1) versus time (horizontal axis, 0 to 600); legend of (a): Maximum, Minimum, Oracle; legend of (b): Oracle, Desired source.]

Figure 10: Illustrates the fairness issue raised at the end of Section 2.1. In this case, we also consider a birth-and-death chain model as in previous examples, but now with only two sources (N = 2). In (a), we show the maximum and the minimum control values chosen by either one of the sources over time: the thick black line shows the minimum, the thin solid line shows the maximum (for reference, the genie-aided controller is also shown); in (b), the thick line corresponds to the control actions of only one of the sources, all the time. Observe how, around time steps 150–250, the source shown at the bottom is the one that achieves the maximum at the top; but around time steps 500–600, the same source achieves the minimum of those injection rates. This is yet another intuitively very pleasing pattern that we have observed repeatedly in many simulations: the control law is essentially fair in the sense that, although we do not have enough information to make sure that at any time instant all controllers will use the same injection rate, at least over time the different controllers “take turns” to go above and below each other.

[Figure 11 diagram: on the left, N sources B(u1), B(u2), ..., B(uN) share a single queue with service rate c; this is turned into, on the right, N separate sources B(g(π)), each feeding its own queue with service rate c(π).]

Figure 11: Illustrates how the original problem is broken into N independent identical subproblems. Since all the nodes execute exactly the same control algorithm, the distribution of π is the same for all nodes. But other than through this statistical constraint, all decisions are taken locally by each node, based on private data that is not available to any other node, and therefore completely independent.

(5) Finally, we construct a measure as the limit of the “simple” measures from step 4 (as we let the size of the discretization vanish), and we show that this limit is invariant over Π. This requires some further steps, largely based on the elegant framework of [26], as follows.

(5.1) We show that the limit exists and is well defined (it is independent of the particular sequence of discretizations considered).

(5.2) We construct a simple ϕ-irreducibility measure on Π, and from there, we conclude the existence of a unique maximal ψ-irreducibility measure.

(5.3) We construct a family of accessible atoms in Π, and show that $\pi_k$ is positive recurrent. From this and from (5.2), using a theorem from [26], we conclude that there exists a unique invariant measure on Π.

(5.4) We show that the limit measure of (5.1) is indeed invariant, and therefore conclude that it must be the unique measure of (5.3).

Although steps 2–4 can be dealt with using classical finite-state Markov chain theory, steps 1 and 5 cannot. This is because $\pi_k$ is a Markov chain defined on an (uncountable) metric space, and therefore to analyze its properties, we need to resort to a more general theory of Markov processes. Meyn and Tweedie provide excellent coverage of the problem of Markov chains on general spaces [26], which we found to be an invaluable tool in our work.

[Figure 12 plot: a three-dimensional view with axes π(1), π(2), and π(3), each from 0 to 1, showing the 2D simplex π(1) + π(2) + π(3) = 1, the point πS, and trajectory segments labeled by the observations r = 1, r = −1, and r = 0.]

Figure 12: Looking at the evolution in time of the information state. For a system with three sources, π is a point in the 2D simplex as shown in this figure. After letting the system run for some time, we find that there are regions of space visited fairly often (bottom right), regions visited less often (bottom left), and regions never visited (top right). Yet each point on this simplex determines a choice of an injection rate, and therefore the frequency with which each point is visited is clearly a fundamental performance analysis tool.

We now give the formal proofs of the steps outlined above.

3.2. Weak convergence—steps 1–4

Step 1: π is Markov

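Although involving a chain defined over a metric space, this proof is elementary, since all we need to invoke is the standard definition of the Markov property and the total probability law:

$$
\begin{aligned}
p(\pi_{k+1} \mid \pi_k, \dots, \pi_0)
&\overset{(a)}{=} \sum_{r_k} p(\pi_{k+1} \mid \pi_k, \dots, \pi_0, r_k) \, p(r_k \mid \pi_k, \dots, \pi_0) \\
&\overset{(b)}{=} \sum_{r_k} p(\pi_{k+1} \mid \pi_k, r_k, u_k) \sum_{x_k} p(r_k \mid x_k, \pi_k, \dots, \pi_0) \, p(x_k \mid \pi_k, \dots, \pi_0) \\
&\overset{(c)}{=} \sum_{r_k} p(\pi_{k+1} \mid \pi_k, g(\pi_k), r_k) \sum_{x_k} p(r_k \mid x_k, g(\pi_k)) \, \pi_k(x_k) \\
&= p(\pi_{k+1} \mid \pi_k),
\end{aligned}
\tag{12}
$$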

where (a) results from the total probability law; (b) holds because, when conditioned on $\pi_k$, we can add $u_k$ to the conditioning, since $u_k = g(\pi_k)$, and because of the total probability law; and (c) holds because, conditioned on anything else, $r_k$ depends only on $x_k$ and $u_k$, and $\pi_k$ contains all information about $x_k$ given the past. So we see that when conditioning on the past values, $\pi_{k+1}$ depends only on $\pi_k$, and hence π is Markov.

An interesting observation to make here (which gives some insight into structural properties of our model that will allow us to prove the sought weak convergence result) is that the intensity of the arrivals process is a memoryless function of π. Although we have not attempted to prove this, it seems at least intuitively clear to us that if instead of the optimal controller we used a suboptimal one (typically based on the formation of an estimate of the current state), then the optimal decision would not be a memoryless function of the state estimate, but would actually require past state estimates as well.

Step 2: nonabsorbing small discretization cells

The next step is to show that there is a constant C > 0 such that, for any information state π ∈ Π, there exists an observation r ∈ O for which the distance between $\pi_k$ and the next-step information state $\pi_{k+1}$ corresponding to r is larger than C. This allows us to quantize the simplex Π and make sure that, provided that the size of a quantization cell is small enough, at least one observation will take the current information state to a different cell.

Lemma 2. There exists a constant C such that for any π ∈ Π, there is an observation r for which ‖π − F[π, g(π), r]‖ ≥ ε, for all 0 < ε ≤ C, and for any norm ‖ · ‖.

Proof. This basically means that for any state, there is at least one observation that moves the chain a finite nonzero distance away from that given state. We prove this by contradiction. We show that if all jumps are infinitesimally small, then the only information state that can satisfy this condition is the stationary distribution $\pi_s$ of the original chain. But for this particular information state, any observation different from r = 0 does allow jumps of finite size away from it.

Suppose that for any C > 0, there exists a point π ∈ Π such that for any observation r ∈ O, ‖π − F[π, g(π), r]‖ < C. Denote by $Q_C$ the set of points π verifying the above condition for a given C. Then, if $C_1 > C_2$, then $Q_{C_2} \subseteq Q_{C_1}$. Denote by $Q_0$ the intersection of all $Q_C$ sets. Then the supposition that we want to contradict is equivalent to $Q_0 \neq \emptyset$.

Consider now any $\pi \in Q_0$. Then for any C arbitrarily close to 0, and for any observation r, ‖π − F[π, g(π), r]‖ < C (all jumps are arbitrarily close to zero).

In what follows, $k_{(\cdot)}$ are normalizing constants. If r = 0, it results that π is arbitrarily close to πP. This means that π is arbitrarily close to $\pi_s$ (the stationary distribution of P). Also, for r = −1 or r = 1, consider the respective D(g(π), r) diagonal matrices. It results that π is arbitrarily close to $\frac{1}{k_{\pi,r}} \pi D(g(\pi), r) P$. But π is arbitrarily close to $\pi_s$ as well, so it results that $\pi_s$ is arbitrarily close to $\frac{1}{k_{\pi_s,r}} \pi_s D(g(\pi_s), r) P$. In the limit, $\pi_s = \frac{1}{k_{\pi_s,r}} \pi_s D(g(\pi_s), r) P$. But this cannot be true because D is not the identity matrix. Actually, D is a diagonal matrix with monotone diagonal elements $(d_1, \dots, d_N)$: under the approximation of Figure 5, they are nonincreasing for r = 1 and nondecreasing for r = −1. If, for example, r = 1, then $\pi_s = \frac{1}{k_{\pi_s,1}} \pi_s D(g(\pi_s), 1) P$.


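[Figure 13 diagram: the Π space between the vertices (1, 0, 0, ..., 0) and (0, 0, ..., 0, 1), showing a path π0, π1, ..., πk−1, πk, πk+1 whose branches are labeled by the observations −1, 0, and 1, ending in an ε-cell around πs.]

Figure 13: A sequence of r = 0 observations leads the chain arbitrarily close to πs.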

This would mean that there exists $\pi_1 = \frac{1}{k_{\pi_s,1}} \pi_s D(g(\pi_s), 1)$ with $\pi_s = \pi_1 P$. We know that $\pi_s = \pi_s P$, so if the chain admits only one stationary distribution, it results that $\pi_1 = \pi_s$. However, this is not possible, since $\pi_s D(g(\pi_s), 1)$ moves the mass of the new probability vector towards (1, 0, . . . , 0) with respect to $\pi_s$.
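The last part of the argument (at πs, an idle observation leaves the information state in place, while an acknowledgment or a loss moves it by a finite amount) can be checked numerically on a small example. The sketch below is an illustration, not a proof: the 4-state chain, the intensity 0.4 standing in for g(πs), and the Figure 5 slopes of 1 are all assumptions.

```python
def obs_prob(r, x, u):
    """Approximate p(r | x, u) read off Figure 5 (assumed slopes of 1)."""
    fair = 1.0 / x
    return {1: min(u, fair), -1: max(u - fair, 0.0), 0: 1.0 - u}[r]


def update(pi, u, r, P):
    """F[pi, u, r] from (5): normalize pi * D(u, r) * P."""
    n = len(pi)
    w = [pi[i] * obs_prob(r, i + 1, u) for i in range(n)]
    nxt = [sum(w[i] * P[i][j] for i in range(n)) for j in range(n)]
    total = sum(nxt)
    return [v / total for v in nxt]


# Hypothetical symmetric birth-and-death chain on 4 states; its stationary
# distribution is uniform because the matrix is doubly stochastic.
P = [[0.9, 0.1, 0.0, 0.0],
     [0.1, 0.8, 0.1, 0.0],
     [0.0, 0.1, 0.8, 0.1],
     [0.0, 0.0, 0.1, 0.9]]
pi_s = [0.25, 0.25, 0.25, 0.25]
u = 0.4                                   # stand-in for g(pi_s)

# Sup-norm size of the jump away from pi_s for each observation.
jump = {}
for r in (-1, 0, 1):
    nxt = update(pi_s, u, r, P)
    jump[r] = max(abs(a - b) for a, b in zip(nxt, pi_s))
print(jump)
```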

Step 3: πs is reachable from anywhere

Lemma 3. For any π ∈ Π, there is a nonzero probability that in the limit, the chain reaches arbitrarily close to the state $\pi_s$, when starting in state π.

Proof. We illustrate in Figure 13 the intuition on which we base our proof. The proof relies on the observation that finite-length sequences of r = 0 observations move the state arbitrarily close to πs. If the observation at time k is rk = 0, then the matrix D(uk, rk) becomes diagonal with elements dii = 1 − uk, so it equals the identity matrix multiplied by a constant; then the recursion for the information state, whenever the source decides not to transmit, can be expressed as πk+1 = πkP. The fixed point of this recursion is the stationary distribution πs. It follows that for any π ∈ Π as initial state of the chain, there is a path by which the chain reaches in the limit the stationary distribution state πs, for example via a sequence of successive rk = 0 observations. But any arbitrary finite-length sequence of rk = 0 observations may happen with nonzero probability, so for any εs > 0, there is a finite time K with rk = 0, k ≤ K, in which the chain can reach with nonzero probability a state πεs such that ‖πεs − πs‖ ≤ εs.
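The r = 0 recursion in the proof of Lemma 3 can be checked numerically. The following is a minimal sketch, not the authors' code: the 4-state symmetric birth-and-death chain with reflecting ends, the constant control u = 0.3, and the function names are all illustrative assumptions. With D = (1 − u)I, the normalization cancels the factor 1 − u, so each step reduces to π ← πP and the iterates approach πs.

```python
import numpy as np

def symmetric_birth_death(n_states, p_switch):
    """Symmetric birth-and-death transition matrix (reflecting ends are an assumption)."""
    P = np.zeros((n_states, n_states))
    for i in range(n_states):
        if i > 0:
            P[i, i - 1] = p_switch      # move one state down
        if i < n_states - 1:
            P[i, i + 1] = p_switch      # move one state up
        P[i, i] = 1.0 - P[i].sum()      # remaining mass stays in place
    return P

def update_no_transmission(pi, P, u):
    """One information-state step for r = 0, where D = (1 - u) I (requires u < 1)."""
    D = (1.0 - u) * np.eye(len(pi))
    unnormalized = pi @ D @ P
    return unnormalized / unnormalized.sum()   # the factor (1 - u) cancels here

P = symmetric_birth_death(4, 0.2)
pi = np.array([1.0, 0.0, 0.0, 0.0])            # start at a vertex of the simplex
for _ in range(500):
    pi = update_no_transmission(pi, P, u=0.3)
print(pi)
```

Under these assumptions the chain is irreducible and aperiodic, so the iterates settle on the stationary distribution of P regardless of the starting vertex.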

Step 4: positive recurrent discretization on a nonempty subset

We consider now quantizations of the Markov chain formed by the sequence of information states, with quantization cells of size ε ≤ C. If the cell size is small enough, then from Lemma 2, it follows that for any π inside a discretization cell, there is at least one observation happening with nonzero probability for which the chain jumps outside the cell. This ensures that there is no state of the chain in which the system stays forever, so the recurrent irreducible subset of discretized cells has more than one element. With this procedure, we define a family of quantizations of Π, with members of the family of the form qε = {qε1, . . . , qεNε}, where qεi are the Nε compact sets contained in qε and

$$\bigcup_i q^\varepsilon_i = \Pi, \qquad \bigcap_i q^\varepsilon_i = \emptyset.$$

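For simplicity, we will denote by qε(π) the cell to which the instantaneous information state π belongs. We note that

$$
\begin{aligned}
p\left(q^\varepsilon_{k+1} \mid q^\varepsilon_k, q^\varepsilon_{k-1}, \ldots\right)
&= \int_{\pi_{k+1} \in q^\varepsilon_{k+1},\, \pi_k \in q^\varepsilon_k,\, \pi_{k-1} \in q^\varepsilon_{k-1}, \ldots} p\left(\pi_{k+1} \mid \pi_k, \pi_{k-1}, \ldots\right) d\pi_{k+1}\, d\pi_k\, d\pi_{k-1} \cdots \\
&= \int_{\pi_{k+1} \in q^\varepsilon_{k+1},\, \pi_k \in q^\varepsilon_k} p\left(\pi_{k+1} \mid \pi_k\right) d\pi_{k+1}\, d\pi_k \\
&= p\left(q^\varepsilon_{k+1} \mid q^\varepsilon_k\right)
\end{aligned}
\tag{13}
$$

since the process πk is Markov. The measure with respect to which we are integrating is Lebesgue measure over the Π space. We just count how often the continuous chain falls in a given cell. Thus the process qε(πk) forms a finite-state chain, also having the Markov property (inherited from the continuous chain).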
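The map π ↦ qε(π) can be made concrete with a small sketch. This is only an illustration under stated assumptions: cells are taken to be axis-aligned grid boxes of side ε over the first N − 1 coordinates of π (the last coordinate is determined by the others), and the function name `cell_index` is hypothetical. The paper only requires a finite family of cells covering Π, so other cell shapes would serve equally well.

```python
import math

def cell_index(pi, eps):
    """Return the grid cell containing the probability vector pi.

    Only the first N - 1 coordinates are used, since the last one equals
    1 minus their sum. Each coordinate is mapped to floor(p / eps) and
    clamped so that p = 1 falls in the last cell.
    """
    n_cells = math.ceil(1.0 / eps)
    return tuple(min(math.floor(p / eps), n_cells - 1) for p in pi[:-1])

# Two nearby information states fall in the same cell when eps = 0.25.
print(cell_index([0.26, 0.74], 0.25))
print(cell_index([0.30, 0.70], 0.25))
```

Counting how often a simulated trajectory of πk visits each index then gives an empirical estimate of the finite-state chain described above.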

Lemma 4. For any ε ≤ C, there is a subset Pε ⊆ Π, which contains the stationary distribution πs of the original chain xk, and on which the discretized chain qε(πk) is positive recurrent.

Proof. We show in Figure 14 a typical behavior of the chain, which illustrates the existence of a recurrent subset Pε ⊆ Π. We base our proof on the fact that πs is recurrent, so its properties will be induced on a recurrent closure of the discretized version of the simplex Π. As we showed in Lemma 3, the information state πs can be reached in the limit with nonzero probability from any initial state π0 of the Markov chain. For any initial π0, there is a sequence π0, . . . , πk, . . . such that πk → πs when k → ∞, and the size of the quantization cell is strictly positive. Then the time in which the discretized chain qε(πk) reaches the cell containing state πs is finite, so without loss of generality, we may consider our limit results with πs as initial value for the information state.


Figure 14: After passing through a sequence of transient states, the chain reaches a recurrent subset of the discretized simplex Π. (Diagram of the Π space between the vertices (1, 0, 0, . . . , 0) and (0, 0, . . . , 0, 1), showing a path π0, π1, π2, π3, . . . , πk that enters the Pε subspace containing πs.)

Denote by Pε the set of reachable quantization cells qε(π) if the chain starts in πs. We already proved that the cell containing πs is accessible in a finite number of steps from any other information state π, so implicitly from any cell qεi ∈ Pε as well. Moreover, by construction, any cell qεi ∈ Pε is reachable from qε(πs). Since ε ≤ C, then for any qε(πk), there is at least one observation r for which the transition from πk to πk+1 leads to qε(πk+1) ≠ qε(πk). It follows that the chain with states in Pε is irreducible (and aperiodic as well, since the cell containing πs is one-step reachable from itself, via an r = 0 observation). The state space is finite, so the chain is positive recurrent, and thus it has a stationary distribution. Denote by pε this probability distribution over the Pε state space. If qεi ∉ Pε, then pε(qεi) = 0.

We will prove now that there is a limit probability measure on Π to which pε converges in the limit ε → 0, and study the properties of that measure.

3.3. Weak convergence—step 5

There exists a unique limit invariant measure over Π, νε → ν, when ε → 0.

Step 5.1: existence of the limit measure

We will show that the limit measure exists by considering, for any subset A of Π, sequences of measures on subsets of the discretized simplex that cover, and respectively, intersect A. We show that they converge to the same limit.

Definition 2. Define the inner and outer sequences of measures over the simplex Π, corresponding to the set of ε-discretizations:

$$
\nu^I_\varepsilon(A) = \sum_{S \in P^\varepsilon :\, S \subseteq A} p^\varepsilon(S),
\qquad
\nu^O_\varepsilon(A) = \sum_{S \in P^\varepsilon :\, S \cap A \neq \emptyset} p^\varepsilon(S),
\tag{14}
$$

where A is any subset in the σ-algebra of Π.

We want to prove that, for any given A, both νIε(A) and νOε(A) converge to the same limit, as ε → 0. That limit will be our limit invariant measure ν(A).

We will first prove convergence of each of the two sequences. Consider νIε; we will prove that the sequence is Cauchy for any set A, and it trivially has a convergent subsequence, which will mean that the whole sequence is convergent.

For a given set A, denote by An = ∪{S ∈ Pεn : S ⊆ A} the inner cover of set A corresponding to discretization step εn. We will prove first that the normalized volume of the difference set between two inner covers of the set A tends to zero, and consequently the probability measure over that difference set tends to zero. Define the metric d(X, Y) = µLeb((X − Y) ∪ (Y − X))/µLeb(Π), on the σ-algebra B of Π, X, Y ∈ B (this represents the normalized volume of the set where the two subsets X, Y differ from each other; µLeb is Lebesgue measure). It is easy to verify that d(·, ·) is indeed a valid metric.

Let εn be a decreasing sequence of discretization steps, with limn→∞ εn = 0. Then due to the fact that Pεn is a sequence of successive discretizations of the space Π when n → ∞, it follows that limn→∞ An = A. Since An is convergent, it is also Cauchy in the metric space (B, d). This means that for any δ > 0, there exists nδ such that d(An, Am) < δ, for any m > n ≥ nδ. So the normalized volume of the set difference between two set elements of the sequence becomes arbitrarily small. That also means that if εn, εm → 0, then νIεnεm(d(An, Am)) → 0, as νεnεm is a stationary distribution over finite spaces with decreasing cell size. Then for any δν > 0, there is nδν such that νIεnεm((An − Am) ∪ (Am − An)) < δν, for any m > n ≥ nδν. Note that νIεnεm((An − Am) ∪ (Am − An)) ≥ |νIεnεm(An) − νIεnεm(Am)|.

Finally, we note that |νIεnεm(An) − νIεnεm(Am)| = |νIεn(An) − νIεm(Am)|, due to the property of inclusion for the sequence of measures (the sum of probabilities of the ε′-discretization cells that cover exactly an ε-discretization cell is equal to the probability of that ε-discretization cell), and the way An, Am are constructed (cells corresponding to multiples of ε are all included in the cells corresponding to ε).

We conclude that for any δν > 0, there is nδν such that |νIεn(An) − νIεm(Am)| < δν, for any m > n ≥ nδν. This means that the sequence νIεn(An) is Cauchy as well. It is trivial to show that there is a convergent subsequence of νIεn(An). Pick, for example, εn = ε0/2^n; then the corresponding subsequence is bounded from above by 1, and monotonically increasing; it follows that the subsequence is convergent. But a Cauchy sequence with a convergent subsequence is convergent, which proves that νIε(A) is convergent for any set A.

The proof for convergence of νOε is similar and we will omit it. Both limits exist, and it is obvious that they fulfill the inequality

$$
\lim_{\varepsilon \to 0} \nu^I_\varepsilon(A) \le \lim_{\varepsilon \to 0} \nu^O_\varepsilon(A), \quad \text{for any } A \subset \Pi.
\tag{15}
$$

We want to prove that the inequality above holds in fact with equality. Assume that the inequality is strict; then let δ = νO0(A) − νI0(A) > 0, where νI0(A) and νO0(A) denote the two limits in (15). But this would mean that there exists at least a cell in any partition Pε of size δ > 0, for all εn → 0. However, in the limit, the two sets of summation become equal (with union A), so a contradiction results.


Definition 3. Define the measure ν over the simplex Π as the common limit of the two sequences of measures:

$$
\nu(A) = \lim_{\varepsilon \to 0} \nu^I_\varepsilon(A) = \lim_{\varepsilon \to 0} \nu^O_\varepsilon(A)
\tag{16}
$$

for a given A ⊆ Π.

For the proofs in the next two sections, we will use definitions and notations also found in [26].

Step 5.2: existence of a unique maximal ψ-irreducibility measure

Definition 4. Denote by B(π0, δ) = {π ∈ Π : ‖π − π0‖ < δ} the open ball, with δ > 0.

Definition 5. Denote by B(Π) the σ-field generated by the open balls in Π.

Definition 6. Denote, for any state π ∈ Π and subset A ∈ B(Π), the probability that, when starting in state π, the chain reaches subset A:

$$
L(\pi, A) = P_\pi\left(\tau_A < \infty\right).
\tag{17}
$$

Lemma 5. Let πn ∈ Π, n = 0, 1, . . . be a sequence of information states. Then πn is φ-irreducible on B(Π).

Proof. Let πs be the stationary distribution of the underlying chain. Define the measure φ on B(Π) as

$$
\varphi\left(B(\pi_s, \delta)\right) = \mu_{\mathrm{Leb}}\left(B(\pi_s, \delta)\right), \qquad \varphi(A) = 0 \ \text{otherwise}.
\tag{18}
$$

In step 3 of the proof, we proved that πs is reachable from anywhere. Hence, for all π ∈ Π, we have L(π, B(πs, δ)) > 0, and φ is an irreducibility measure.

Note. If a φ-irreducibility measure exists, then there is a part of the space reachable from anywhere, so one might expect independence of the chain from the initial conditions, by analogy with finite chains.

Proposition 1. If πn is φ-irreducible, then there exists a unique “maximal” measure ψ on B(Π) such that πn is ψ-irreducible and φ ≤ ψ. Denote by B+(Π) the σ-algebra of Π with sets on which ψ is positive.

Proof. The proof involves concepts outside the scope of this paper, but is standard for chains fulfilling the previous conditions, and it can be found in [26].

Step 5.3: uniqueness of the invariant measure on Π

Definition 7. α ∈ B(Π) is called an atom for a sequence πn if there exists a measure µ on B(Π), such that, for any π ∈ α, P(π, A) = µ(A) (for any A ∈ B(Π)).

Definition 8. α is called an accessible atom for a sequence πn, if πn is ψ-irreducible and ψ(α) > 0.

Note. Atoms behave like states in finite chains. From the development in [26], it turns out that the reason why so many results about finite chains carry over to more general settings is precisely the fact that it is always possible to construct atoms.

Proposition 2. All balls B(πs, δ), with δ > 0, are accessible atoms for any sequence πn.

Proof. Let α be a set in B(Π), and let π ∈ α. Then, depending on the current observation r, there are three possible transitions from π, via the recursion function F[π, g(π), r]. Then for any A ∈ B(Π), we can consider the measure

$$
\mu(A) =
\begin{cases}
p(r) & \text{if } F[\pi, g(\pi), r] \in A, \\
0 & \text{otherwise}.
\end{cases}
\tag{19}
$$

Then any α = B(πs, δ) is an accessible atom.

Definition 9. Denote by Eπ[ηA] the expected number of returns of the chain to subset A ∈ B(Π) when starting in state π.

Definition 10. A set A ∈ B(Π) is called recurrent if Eπ[ηA] = ∞ for all π ∈ A (when starting in A, the expected number of returns to A is infinite).

Lemma 6. If πn is ψ-irreducible and admits a recurrent atom α, then every set in B+(Π) is recurrent.

Proof. If A ∈ B+(Π), then for any π, there exist r, s such that P^r(π, α) > 0, P^s(α, A) > 0 and we can write, by considering the paths of the chain that go from π to A via the atom α,

$$
\sum_n P^{r+s+n}(\pi, A) \ge P^r(\pi, \alpha) \left[ \sum_n P^n(\alpha, \alpha) \right] P^s(\alpha, A) = \infty,
\tag{20}
$$

since α is a recurrent atom, which implies that the sum of P^n(α, α) over n diverges.

Note. Observe again the analogy between atoms and states of a finite chain.

Definition 11. A sequence πn is called recurrent if and only if it is ψ-irreducible, and Eπ[ηA] = ∞ for any π ∈ Π and A ∈ B+(Π).

Lemma 7. Any sequence of information states drawn from the Markov chain πn is recurrent.

Proof. From Lemma 5 and Proposition 1, it results that πn is ψ-irreducible. Furthermore, from step 3 of the proof, it results that all the balls B(πs, δ), with δ > 0, are recurrent atoms. Then every A ∈ B+(Π) is recurrent, and from Definition 10, it results that Eπ[ηA] = ∞ for all π ∈ A. We still need to prove that even if π ∉ A, we still have L(π, A) = 1. By definition, L(π, ⋃B+(Π)) = 1. Suppose that the sequence πn hits at some time a set B ∈ B+(Π). If A = B, then the sought result follows. Otherwise, we will have L(y, A) > 0 for all y ∈ B, because of the ψ-irreducibility over B+. But B ∈ B+(Π) and Ey[ηB] = ∞, so it results that L(y, A) = 1. So finally, L(π, A) = 1 and this case is reduced to the previous one (where π ∈ A). Hence, πn is recurrent.

Definition 12. A sequence πn is called positive if and only if it is ψ-irreducible and admits an invariant measure γ.

Lemma 8 (Kac’s theorem). If a sequence πn is recurrent and admits an atom α ∈ B+(Π), then πn is positive if and only if Eα[τα] < ∞.

Proof. If Eα[τα] < ∞, then obviously L(α, α) = 1, so it results that πn is recurrent. It also results from the structure of γ (see [26]) that γ is finite, so πn is positive as well. The converse results from the structure of γ as well.

Lemma 9. The sequence of information states πn is positive.

Proof. From Lemma 7, it results that πn is recurrent. Also, from step 3 of the proof, it results that every ball α = B(πs, δ) ∈ B+(Π) is an atom, and Eα[τα] < ∞. Then it results that πn is positive.

Theorem 10. There exists a unique invariant probability measure of πn.

Proof. The proof for this theorem is valid for chains having the properties we have analyzed until now, and can be found in [26].

Step 5.4: invariance of ν

We prove now Theorem 1. The measure ν (as constructed in step 5.1) is the unique invariant probability measure on Π.

Proof. For invariance of ν, we need to prove that

$$
\nu(A) = \int_\Pi \nu(dy)\, P(y, A).
\tag{21}
$$

From the definition of ν, we have that for any ε > 0,

$$
\nu^I_\varepsilon(A) \le \nu(A) \le \nu^O_\varepsilon(A).
\tag{22}
$$

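If we denote by Pε(·, ·) the transition-probability kernel for the ε-discretization, then we can rewrite the leftmost term of the inequality (22) as

$$
\begin{aligned}
\nu^I_\varepsilon(A) &= \sum_{S \in P^\varepsilon :\, S \subseteq A} p^\varepsilon(S) \\
&= \sum_{S \in P^\varepsilon :\, S \subseteq A} \ \sum_{T \in P^\varepsilon} p^\varepsilon(T)\, P^\varepsilon(T, S) \\
&= \sum_{T \in P^\varepsilon} \ \sum_{S \in P^\varepsilon :\, S \subseteq A} p^\varepsilon(T)\, P^\varepsilon(T, S) \\
&= \sum_{T \in P^\varepsilon} p^\varepsilon(T)\, P^\varepsilon\left(T, \cup\{S \in P^\varepsilon : S \subseteq A\}\right).
\end{aligned}
\tag{23}
$$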

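In a similar manner, we can rewrite the expression for νOε(A):

$$
\nu^O_\varepsilon(A) = \sum_{T \in P^\varepsilon} p^\varepsilon(T)\, P^\varepsilon\left(T, \cup\{S \in P^\varepsilon : S \cap A \neq \emptyset\}\right).
\tag{24}
$$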

By taking now the limit in expression (22), we know that both left and right limits exist and are equal, so it results that

$$
\begin{aligned}
\nu(A) &= \lim_{\varepsilon \to 0} \sum_{T \in P^\varepsilon} p^\varepsilon(T)\, P^\varepsilon\left(T, \cup\{S \in P^\varepsilon : S \subseteq A\}\right) \\
&= \lim_{\varepsilon \to 0} \sum_{T \in P^\varepsilon} p^\varepsilon(T)\, P^\varepsilon\left(T, \cup\{S \in P^\varepsilon : S \cap A \neq \emptyset\}\right) \\
&\overset{(a)}{=} \int_\Pi \nu(dy)\, P(y, A),
\end{aligned}
\tag{25}
$$

where equality (a) holds because, under some continuity conditions, in the limit ε → 0, the sum becomes an integral; the probability limit ν exists; the quantization cell T ∈ Pε becomes the infinitesimal integration variable T → dy; the transition-probability kernel Pε(·, ·) → P(·, ·); and both the unions of cells included in A and, respectively, intersecting A cover the whole set A.

It results that ν is invariant, and thus it is the unique invariant measure on Π.

3.4. Numerical simulations

In this section, we show results of numerically evaluating the integrals above. We simulated a system with N = 2, N = 4, and N = 8 sources, and with different values for the loss threshold T = 0.02, T = 0.05, and T = 0.1. The chain is birth-and-death with probability Pswitch. We let the system run for t = 100 000 time steps. We plot the average throughput and loss as a function of the transition probability Pswitch. The resulting plots are shown in Figure 15. We see that the plots do not depend significantly on Pswitch. The dependence is essentially on the stationary probability πs of the original chain P, which is the same for any symmetric birth-and-death chain. As expected, larger values of T imply larger throughput, as the controller is allowed to probe the environment more often; this comes at the expense of increased losses.
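One way to see why the plots barely depend on Pswitch is to compute πs directly for several values of the switching probability. The sketch below is illustrative only: it assumes a 4-state symmetric birth-and-death chain with reflecting ends (the paper does not spell out the boundary behavior or the number of states), and the function names are hypothetical. Under these assumptions the transition matrix is symmetric, so the stationary distribution is uniform for every Pswitch.

```python
import numpy as np

def symmetric_birth_death(n_states, p_switch):
    """Symmetric birth-and-death transition matrix (reflecting ends are an assumption)."""
    P = np.zeros((n_states, n_states))
    for i in range(n_states):
        if i > 0:
            P[i, i - 1] = p_switch      # move one state down
        if i < n_states - 1:
            P[i, i + 1] = p_switch      # move one state up
        P[i, i] = 1.0 - P[i].sum()      # remaining mass stays in place
    return P

def stationary_distribution(P):
    """Left eigenvector of P for the eigenvalue closest to 1, scaled to sum to 1."""
    eigenvalues, eigenvectors = np.linalg.eig(P.T)
    idx = np.argmin(np.abs(eigenvalues - 1.0))
    v = np.real(eigenvectors[:, idx])
    return v / v.sum()

for p_switch in (0.05, 0.25, 0.5):
    pi_s = stationary_distribution(symmetric_birth_death(4, p_switch))
    print(p_switch, np.round(pi_s, 6))
```

Pswitch still affects how fast the chain mixes, which is consistent with the plots showing only a weak residual dependence.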

4. CONCLUSIONS

We proposed a new mechanism for rate control in sensor networks, based on partial observation about the state of the system. We show that, when the system state is Markov, the optimal controller essentially depends on the information state, a quantity that takes into account all previous controls and observations about the system. Then, we show results on the convergence of the information state, which imply stability of our control policy. Namely, (a) we formulated a queueing problem in which the process of arrivals is not independent of the (partially observed) state of the queue, (b) we solved the corresponding optimal control problem, and (c) we proved a theorem regarding its ergodic behavior, namely the existence of a suitable invariant measure. Our main insight to tackle these problems was that by conditioning on information states, the arrivals process does become a process with independent increments; since this conditioning term is itself Markov, this is how the model is rendered analytically tractable.

Figure 15: The plots from top to bottom: N = 2, 4, 8 sources. Left: average throughput; right: average loss. Legend: dotted plot, T = 0.1; dashed plot, T = 0.05; solid plot, T = 0.01. (Horizontal axis of every panel: transition probability Pswitch, from 0.05 to 0.5; vertical axes: average performance throughput and average loss, from 0 to 0.8.)

An interesting conclusion that we draw from our results relates to the observation that to make a system efficient in an information-theoretic sense requires driving the system very close to the instability point. Consistent with this observation, for our model to be efficient, the queue needs to be driven past the stability point and into the instability region. This is because, without being able to observe the state of the system but being able to observe when the system becomes unstable, this instability is the only indication that we are operating at peak efficiency. Note that this is a key idea behind the implementation of TCP (increase window size while packets get acknowledged, decrease when packets get lost). In our model, without forcing “TCP compatibility,” we observe the exact same type of behavior. This provides a strong argument for the (additive probing probability)/(multiplicative back-off) paradigm in the design of MAC access protocols in sensor networks, where only limited partial observations of the state system are available to the medium access controllers.
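The additive-probing/multiplicative-back-off pattern mentioned above can be sketched in a few lines. This is a generic illustration of the paradigm, not the optimal controller derived in the paper: the function name `aimd_update`, the step size, the back-off factor, and the bounds are all assumed values chosen for the example.

```python
def aimd_update(u, loss_observed, step=0.05, backoff=0.5, u_min=0.01, u_max=1.0):
    """Update a transmission (probing) probability u after one observation.

    On a loss, u is multiplied by `backoff`; otherwise it is increased
    by `step`. The result is kept within [u_min, u_max].
    """
    if loss_observed:
        return max(u_min, u * backoff)
    return min(u_max, u + step)

# Probe upward while no loss is seen, then back off sharply on the first loss.
u = 0.4
trace = []
for loss in (False, False, True, False):
    u = aimd_update(u, loss)
    trace.append(round(u, 4))
print(trace)
```

The slow increase keeps probing for spare capacity, while the sharp decrease moves the source quickly back toward the stable region once a loss signals that it has been left.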

ACKNOWLEDGMENTS

This work was largely completed while both authors were with l’Ecole Polytechnique Federale de Lausanne (EPFL), and supported primarily by the Swiss National Science Foundation under Grant 21-61831.00 (with additional support from the National Science Foundation under Grant CCR-0227676). Parts of this work have been presented at the 35th Annual Conference on Information Sciences and Systems (CISS 2001), and at the IEEE International Symposium on Information Theory (ISIT 2002).

REFERENCES

[1] D. Bertsekas and R. Gallager, Data Networks, Prentice-Hall, Englewood Cliffs, NJ, USA, 2nd edition, 1992.

[2] S. Floyd and V. Jacobson, “Random early detection gateways for congestion avoidance,” IEEE/ACM Trans. Networking, vol. 1, no. 4, pp. 397–413, 1993.

[3] V. Jacobson, “Congestion avoidance and control,” in Proc. ACM Symposium on Communications Architectures and Protocols (ACM SIGCOMM ’88), pp. 314–329, Stanford, Calif, USA, August 1988.

[4] F. Kelly, “Mathematical modelling of the Internet,” in Mathematics Unlimited - 2001 and Beyond, B. Engquist and W. Schmid, Eds., pp. 685–702, Springer, Berlin, Germany, 2001, available from http://www.statslab.cam.ac.uk/frank/PAPERS/.

[5] R. J. La and V. Anantharam, “Utility-based rate control in the Internet for elastic traffic,” IEEE/ACM Trans. Networking, vol. 10, no. 2, pp. 272–286, 2002.

[6] S. H. Low and D. E. Lapsley, “Optimization flow control. I. Basic algorithm and convergence,” IEEE/ACM Trans. Networking, vol. 7, no. 6, pp. 861–874, 1999.

[7] S. H. Low, F. Paganini, and J. C. Doyle, “Internet congestion control,” IEEE Control Syst. Mag., vol. 22, no. 1, pp. 28–43, 2002.

[8] Y. Sankarasubramaniam, I. F. Akyildiz, and S. W. McLaughlin, “Energy efficiency based packet size optimization in wireless sensor networks,” in Proc. 1st IEEE International Workshop on Sensor Network Protocols and Applications (SNPA ’03), pp. 1–8, Anchorage, Alaska, USA, May 2003.

[9] C.-Y. Wan, S. Eisenman, and A. Campbell, “CODA: COngestion Detection and Avoidance in sensor networks,” in Proc. ACM SenSys ’03, Los Angeles, Calif, USA, November 2003.

[10] M. Carvalho and J. J. Garcia-Luna-Aceves, “A Scalable model for channel access protocols in multihop Ad-hoc networks,” in Proc. 10th Annual ACM International Conference on Mobile Computing and Networking (MobiCom ’04), Philadelphia, Pa, USA, September–October 2004.

[11] V. Naware and L. Tong, “Smart antennas, dumb scheduling for medium access control,” in Proc. 37th Annual Conference on Information Sciences and Systems (CISS ’03), Baltimore, Md, USA, March 2003.

[12] S. Singh and C. S. Raghavendra, “PAMAS: power aware multi-access protocol with signalling for Ad-hoc networks,” ACM Computer Communication Review, vol. 28, no. 3, pp. 5–26, 1998.

[13] A. Woo and D. Culler, “A transmission control scheme for media access in sensor networks,” in Proc. 7th Annual ACM International Conference on Mobile Computing and Networking (Mobicom ’01), pp. 221–235, Rome, Italy, July 2001.

[14] L. Bao and J. J. Garcia-Luna-Aceves, “Transmission scheduling in Ad-hoc networks with directional antennas,” in Proc. 8th Annual ACM International Conference on Mobile Computing and Networking (MobiCom ’02), pp. 48–58, Atlanta, Ga, USA, September 2002.

[15] W. R. Heinzelman, A. Chandrakasan, and H. Balakrishnan, “Energy-efficient communication protocol for wireless microsensor networks,” in Proc. 33rd IEEE Annual Hawaii International Conference on System Sciences (HICSS ’00), vol. 2, Maui, Hawaii, USA, January 2000.

[16] E.-S. Jung and N. H. Vaidya, “A power control MAC protocol for Ad-hoc networks,” in Proc. 8th Annual ACM International Conference on Mobile Computing and Networking (MobiCom ’02), pp. 36–47, Atlanta, Ga, USA, September 2002.

[17] K. Sohrabi and G. J. Pottie, “Performance of a novel self-organization protocol for wireless Ad-hoc sensor networks,” in Proc. 50th IEEE Vehicular Technology Conference (VTC ’99), vol. 2, pp. 1222–1226, Amsterdam, Netherlands, September 1999.

[18] W. Ye, J. Heidemann, and D. Estrin, “Medium access control with coordinated adaptive sleeping for wireless sensor networks,” IEEE/ACM Trans. Networking, vol. 12, no. 3, pp. 493–506, 2004.

[19] S. Mitter, “Control with limited information,” European Journal of Control, vol. 7, no. 2-3, pp. 122–131, 2001.

[20] S. Tatikonda, Control Under Communication Constraints, Ph.D. thesis, M. I. T. Press, Cambridge, Mass, USA, 2000.

[21] T. Kaijser, “A limit theorem for partially observed Markov chains,” The Annals of Probability, vol. 3, no. 4, pp. 667–696, 1975.

[22] A. J. Goldsmith and P. P. Varaiya, “Capacity, mutual information, and coding for finite-state Markov channels,” IEEE Trans. Inform. Theory, vol. 42, no. 3, pp. 868–886, 1996.

[23] V. Sharma and S. K. Singh, “Entropy and channel capacity in the regenerative setup with applications to Markov channels,” in Proc. IEEE International Symposium on Information Theory (ISIT ’01), Washington, DC, USA, June 2001.

[24] B. Hajek, “Stochastic approximation methods for decentralized control of multiaccess communications,” IEEE Trans. Inform. Theory, vol. 31, no. 2, pp. 176–184, 1985.

[25] P. R. Kumar and P. Varaiya, Stochastic Systems: Estimation, Identification and Adaptive Control, Prentice-Hall, Englewood Cliffs, NJ, USA, 1986.

[26] S. P. Meyn and R. L. Tweedie, Markov Chains and Stochastic Stability, Springer, London, UK, 1993.



Razvan Cristescu was born in Ploiesti, Romania, in 1974. He received his Ph.D. degree from l’Ecole Polytechnique Federale de Lausanne (EPF, Lausanne) in 2004, his Lic. Tech. in 2000 from Helsinki University of Technology, and his M.S. degree in 1998 from Polytechnic University of Bucharest. Since October 2004, he has been a Postdoctoral Fellow in the Center for the Mathematics of Information at California Institute of Technology. His research interests include information theory, network theory, computational complexity, signal processing for communications, and applications in sensor networks.

Sergio D. Servetto was born in Argentina, on January 18, 1968. He received a Licenciatura en Informatica from Universidad Nacional de La Plata (UNLP, Argentina) in 1992, and the M.S. degree in electrical engineering, and the Ph.D. degree in computer science from the University of Illinois at Urbana-Champaign (UIUC), in 1996 and 1999. Between 1999 and 2001, he worked at the Ecole Polytechnique Federale de Lausanne (EPFL), Lausanne, Switzerland. Since fall 2001, he has been an Assistant Professor in the School of Electrical and Computer Engineering at Cornell University, and a member of the field of applied mathematics. He was the recipient of the 1998 Ray Ozzie Fellowship, given to “outstanding graduate students in computer science,” and of the 1999 David J. Kuck Outstanding Thesis Award for the best doctoral dissertation of the year, both from the Department of Computer Science at UIUC. He is also the recipient of a 2003 NSF Career Award. His research interests are centered around information-theoretic aspects of networked systems, with a current emphasis on problems that arise in the context of large-scale sensor networks.

EURASIP Journal on Wireless Communications and Networking 2005:4, 523–540
© 2005 Jussi Haapola et al.

Multihop Medium Access Control for WSNs: An Energy Analysis Model

Jussi Haapola
Centre for Wireless Communications (CWC), University of Oulu, P.O. Box 4500, 90014 Oulu, Finland
Email: [email protected]

Zach Shelby
Centre for Wireless Communications (CWC), University of Oulu, P.O. Box 4500, 90014 Oulu, Finland
Email: [email protected]

Carlos Pomalaza-Raez
Centre for Wireless Communications (CWC), University of Oulu, P.O. Box 4500, 90014 Oulu, Finland
Email: [email protected]

Petri Mahonen
Institute of Wireless Networks, RWTH Aachen University, Kackertstraße 9, 52072 Aachen, Germany
Email: [email protected]

Received 30 November 2004; Revised 30 March 2005

We present an energy analysis technique applicable to medium access control (MAC) and multihop communications. Furthermore, the technique’s application gives insight on using multihop forwarding instead of single-hop communications. Using the technique, we perform an energy analysis of carrier-sense-multiple-access (CSMA)-based MAC protocols with sleeping schemes. Power constraints set by battery operation raise energy efficiency as the prime factor for wireless sensor networks. A detailed energy expenditure analysis of the physical, the link, and the network layers together can provide a basis for developing new energy-efficient wireless sensor networks. The presented technique provides a set of analytical tools for accomplishing this. With those tools, the energy impact of radio, MAC, and topology parameters on the network can be investigated. From the analysis, we extract key parameters of selected MAC protocols and show that some traditional mechanisms, such as binary exponential backoff, have inherent problems.

Keywords and phrases: energy efficiency, wireless sensor networks, medium access control, multihop communications.

1. INTRODUCTION

Sensor network applications have recently become of significant interest due to cheap single-chip transceivers and microcontrollers. Sensor nodes are usually battery operated and their operational lifetime should be maximized, hence energy consumption is a crucial issue. Many wireless sensors and therefore sensor networks are expected to operate using single-chip transceivers like the RFM TR1000 [1] or its European versions, all of which work in ISM bands. The radio parameters of the RFM TR1000 represent a typical transceiver operating in the lower-frequency ISM bands. Therefore, the RFM TR1000 is used in this paper as a representative example. Regulations in many countries impose a duty cycle [2, 3], which is normally 10% in the 434 MHz band and 1% in the 868 MHz band. The duty cycle is defined as the ratio, expressed as a percentage, of the maximum transmitter on-time, relative to a one-hour period. When a sensor network is expected to work continuously, this duty cycle has to be taken into account and it can affect the energy efficiency of a network. In data-centric sensor networks, the performance of sink nodes in particular will often be challenged by duty-cycle constraints. Multihop communications presents another challenge to sensor networks. Tools are needed to understand the point where multihop provides real energy savings and should be applied.
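The duty-cycle definition above translates directly into a transmit-time budget. The following sketch simply applies that definition to the two percentages quoted in the text; the function name is a hypothetical helper, and the one-hour reference window is the one stated in the definition.

```python
def max_on_time_seconds(duty_cycle_percent, window_seconds=3600):
    """Maximum transmitter on-time allowed within the reference window."""
    return window_seconds * duty_cycle_percent / 100.0

# Budgets for the two duty-cycle limits quoted in the text.
print(max_on_time_seconds(10))   # 434 MHz band, 10% duty cycle
print(max_on_time_seconds(1))    # 868 MHz band, 1% duty cycle
```

A sink node that relays traffic for many sensors consumes this budget faster than a leaf node, which is why duty-cycle limits matter most at the sink.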

The contribution of this paper is to present an analytical energy consumption evaluation technique applicable to MAC protocols and multihop communications. The presented technique can be applied to predict when to use multihop forwarding in wireless sensor networks. Also, applying the presented technique, we make an analysis on CSMA-based sensor MAC protocols with sleeping schemes.

Figure 1: A simple linear sensor network of N nodes. Nodes are separated by distance d and to reach the sink node, node n’s packets require n hops resulting in an overall distance of R = nd.

Figure 2: A simple linear multihop model in a large network producing a linear path. The large network may contain several linear paths.

We start from the simple linear multihop communications model of Figure 1 without medium access control to show the basic effects of radio parameters on the energy consumption of a network. Thereafter, we create an energy analysis technique for MAC protocols using the same radio parameters. Sleep scheduling is included in the analysis as well as multihop communications. The simple linear multihop communications model is used with the exception that MAC modelling considers the multihop forwarding model in a network with a very large number of nodes and creates background traffic for the network. The modelling in this paper uses the term “linear path” which is illustrated in Figure 2. As a result of the presented technique, we firstly perform a single-hop energy consumption comparison between three CSMA-based MAC protocols. Secondly, we compare how the basic multihop scenario without medium access control relates to the case also considering MAC protocol effects. Thirdly, a single-hop versus multihop analysis with MAC protocols is made. Lastly, a few key parameters that can be extracted from the technique presented are discussed.

The linear topology model, whether uniformly or randomly spaced, represents a common network after route discovery has been accomplished. We propose an energy consumption model for the transmission and reception of MAC frames, develop a coordinated sleep group energy consumption model, and analytically investigate the effect of sleep on sensor networks. From the analysis, we show that although in an ideal scenario multihop communications performs better than single-hop communications, realistic energy models and especially the MAC design have a significant impact. The radio transceiver energy model takes into account several important radio parameters; in this paper, we use the RFM TR1000 and RFM radio designers’ guide [4] as an example of realistic transceiver parameters. The main metric used is absolute energy consumption per useful successfully transmitted bit. This implies that only the MAC service data unit (MSDU), that is, the data from higher layer, will be considered useful and all the other communicated bits, headers, control frames, preambles, and so forth are considered to be overhead. For linear topology scenarios, we begin with optimum uniform spacing and optimal power control and proceed to random node spacing using more realistic four-level transmit power control. As intermediate steps, we cover non-optimum uniform spacing with optimal power control and nonuniform spacing with fixed transmission power.
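The "energy per useful bit" metric can be illustrated with a small sketch. It is a deliberately simplified reading of the definition above: it assumes a single constant energy cost per communicated bit and lumps headers, control frames, and preambles into one overhead count. The function name and all numbers are hypothetical and are not taken from the RFM TR1000 data.

```python
def energy_per_useful_bit(energy_per_bit_joules, msdu_bits, overhead_bits):
    """Total energy of an exchange divided by the useful (MSDU) bits only."""
    if msdu_bits <= 0:
        raise ValueError("msdu_bits must be positive")
    total_energy = energy_per_bit_joules * (msdu_bits + overhead_bits)
    return total_energy / msdu_bits

# Hypothetical frame: 256 payload bits, 64 overhead bits, 1 microjoule per bit.
cost = energy_per_useful_bit(1e-6, msdu_bits=256, overhead_bits=64)
print(cost)
```

Because overhead enters only the numerator, short payloads are penalized more heavily than long ones under this metric.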

The rest of the paper is organized as follows. Related work and some MAC protocols, namely, nonpersistent CSMA, S-MAC, nanoMAC, and the IEEE Std 802.15.4, are discussed in Section 2. Section 3 describes the radio propagation energy model and presents the simple linear multihop communications model without medium access control. Section 4 presents the MAC energy consumption models for the transmission and reception of data, and Section 5 deals with regular sleep periods and presents the worst-case energy consumption results and the energy savings achieved by regular sleeping. Section 6 addresses the single-hop versus multihop problem, and in Section 7 we present an analysis for nonoptimal and randomly spaced multihop networks using shortest-hop and longest-hop strategies. Conclusions are drawn and discussion is presented in Section 8.

2. RELATED WORK

2.1. Radio modelling

The radio model and physical layer characteristics in this paper are based on the work of [5, 6, 7]. In [5] optimal transmittable packet sizes are discussed with respect to energy efficiency over single hops. The authors present an energy consumption model and determine optimal packet payload sizes for various channel bit error rates (BERs) and coding schemes. In [6, 7] a linear radio model is presented, as seen in Figure 1, for multihop analysis. The latter also presents an optimal hop distance characteristic for multihop communications which is a function of radio parameters and heavily dependent on the individual radio used. A single-hop radio energy consumption model taking into account startup energies and decoding energy was presented in [8]. The paper describes the total power consumption of a single hop and assumes a linear radio model as well as the simple linear network of Figure 1.


2.2. Topology and network protocols

There has been a lot of research on efficient wireless sensor network topologies, including LEACH [6], SPIN [9], data funnelling [10], and directed diffusion [11]. Each of them suggests a method of energy-efficient network formation. LEACH builds dynamic clusters to ensure that most nodes need to transmit only small distances, and SPIN sensor nodes advertise the data they have so that only interested nodes request the data. Data funnelling creates sensing areas with border nodes so that data from an area is gathered to border nodes that in turn find and use a multihop path to the sink node. In directed diffusion, the sink node broadcasts what data it is interested in and builds gradients to nodes that have the data of interest. All of the mentioned protocols are data-centric, which is a good assumption for sensor networks and implies that the data itself is the key element in the network, not the sensor nodes that sent it. Of the mentioned protocols, SPIN, data funnelling, and directed diffusion can be modelled in steady state with the linear network shown in Figure 1.

2.3. Cross-layer studies

The closest related work to our paper was presented in [12]. The paper is a MAC-routing protocol cross-layer study for ad hoc networks. Although the work is on ad hoc protocols and does not take energy usage into account, it shows the importance of considering different layers when designing a new protocol. This is demonstrated with ad hoc on demand distance vector (AODV) routing and IEEE Std 802.11. AODV is designed to work specifically on top of the IEEE Std 802.11 MAC protocol and achieves its best performance with that MAC and also has the best overall throughput of the MAC-routing protocol combinations presented in the paper.

2.4. Medium access control

During the past few years, there has been an increasing amount of research on energy efficient MAC protocols specifically for use with sensor networks [13, 14, 15]. However, such protocols are usually modifications from traditional ad hoc networking and have some inherent flaws for sensor networks. The PAMAS [13] protocol was one of the first attempts to reduce unnecessary power consumption by putting overhearing nodes to sleep. The protocol, however, needs a separate control channel for coordination and avoiding overhearing. It also does not take into account idle listening in any way, which accounts for a large portion of energy consumption. The sensor MAC (S-MAC) [14] is a protocol designed for sensor networks and its prime functionality is to reduce idle listening. S-MAC’s foundations lie on IEEE Std 802.11 [16] and MACAW [17], which is the basis of IEEE Std 802.11. They both implement carrier sense multiple access with collision avoidance (CSMA/CA), a four-way handshake using binary exponential (BE) backoff, and other similar functionalities. S-MAC also implements a regular sleep period and a special synchronization scheme to reduce idle listening and maintain global connectivity. The method is called virtual clustering, where irregular synchronization messages urge, but do not enforce, a common schedule. Even though S-MAC outperforms IEEE Std 802.11-like protocols in the energy perspective, it is still a traditional ad hoc protocol in many ways. The timeout MAC (T-MAC) [15] is an evolution of S-MAC into even lower energy consumption by not only reducing idle listening but also making the active periods of the protocol dynamic. Data communication in T-MAC is highly bursty, minimizing the active time and forcing the bursty periods to operate in a very high contention environment. It shares many of the features of S-MAC but achieves superior performance over S-MAC in certain cases.

The IEEE Std 802.15.4 [18] is the IEEE’s contribution to flexible sensor MAC protocols with a low-rate wireless personal area network (LR-WPAN). The design goal has been low-cost and very low-power short-range wireless communications. The standard provides two frequency ranges: the 868/915 MHz ISM band supporting 20/40 kbps communications and the 2450 MHz ISM band supporting a data rate of 250 kbps. Like other IEEE 802.15 protocols, the standard operates using piconets, that is, every WPAN has a central coordinator called the PAN coordinator. However, IEEE Std 802.15.4 provides more flexible topologies than the other IEEE 802.15 family protocols, including star network, mesh topology, and a clustered network approach. The piconet can also operate in beacon-enabled or beaconless modes, allowing more flexibility to nodes with special requirements, like advanced sleeping schemes with very low duty cycle or low delay. The channel access method for the standard is CSMA/CA, except in guaranteed time slots (GTS) provided by the PAN coordinator in beacon-enabled mode, where communication is reserved for a single node. The standard does not describe any specific sleep algorithms and its channel access is very similar to the other protocols we are considering in this work; therefore, it is not included in the forthcoming analysis.

The MAC protocols used for the energy analysis in this paper, namely, nonpersistent CSMA, S-MAC, and nanoMAC, are described in the following subsections. Nonpersistent CSMA is a well-known and normally well-performing MAC protocol in almost any scenario. It gives the worst-case energy performance that any sensor MAC protocol should outperform. S-MAC is the current sensor MAC benchmark protocol, which is used to highlight some of the faults of traditionally designed sensor MAC protocols. We compare these two protocols to nanoMAC, a protocol designed to operate in a sensor networking environment.

2.4.1. Nonpersistent CSMA

[Figure 3 diagram: k bits enter the transmitter electronics ($e_{te}$) and the TX amplifier ($e_{ta}$), cross the distance d, and are received by the receiver electronics ($e_{rx}$) as k bits.]

Figure 3: Typical narrowband radio energy consumption model where k bits are transmitted and $e_{te}$ and $e_{ta}$ are the transmitter electronics and amplifier energy consumption per bit, respectively. The transmission distance is d and the k bits are received by the receiver electronics consuming $e_{rx}$ energy per bit.

Carrier sense multiple access was originally presented in [19] and has been widely referenced afterwards. Nonpersistent CSMA (np-CSMA) is considered in this paper because it performs quite well under most circumstances, even though it is theoretically an unstable protocol. It also functions as the worst-case model for sensor MAC protocols. When a node using np-CSMA has data to send, it first uses carrier sensing (CS) to sense the channel. If the channel is found to be vacant for the whole duration of the CS, the node sends the data; otherwise, it does not persist in sensing the channel, but chooses a random time in the future to perform CS again. Once the data has been sent, np-CSMA waits for an acknowledgement (ACK) frame from the intended recipient, and if it is received before a timeout, the data is known to be successfully received. Otherwise, the data has to be retransmitted at a later time. As a deviation from the original paper, the ACK frame is transmitted on the same channel as data.
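The np-CSMA sender behaviour described above can be sketched as follows. This is a minimal illustration, not the implementation used in the analysis: the callback names, the bound `max_attempts`, and the uniform backoff range `max_backoff` are assumptions introduced only for this example.

```python
import random

def np_csma_send(channel_is_vacant, transmit, wait_for_ack,
                 max_attempts=8, max_backoff=1.0, rng=None):
    """Try to deliver one frame with nonpersistent CSMA.

    channel_is_vacant(): True if the channel stayed vacant for the whole CS.
    transmit():          sends the data frame.
    wait_for_ack():      True if an ACK arrived before the timeout.
    Returns (delivered, backoffs); backoffs lists the random delays the
    node scheduled before sensing the channel again.
    max_attempts and max_backoff are hypothetical parameters.
    """
    rng = rng or random.Random()
    backoffs = []
    for _ in range(max_attempts):
        if channel_is_vacant():
            transmit()
            if wait_for_ack():
                return True, backoffs   # data known to be received
        # Busy channel or missing ACK: do not persist in sensing, but
        # choose a random time in the future for the next carrier sense.
        backoffs.append(rng.uniform(0.0, max_backoff))
    return False, backoffs

# Usage: the channel is busy twice, then vacant, and the ACK arrives.
sensed = iter([False, False, True])
delivered, delays = np_csma_send(lambda: next(sensed),
                                 lambda: None,
                                 lambda: True)
```

The caller is expected to sleep or idle for each scheduled delay; the sketch only records the delays so that the control flow stays visible.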

2.4.2. S-MAC

The S-MAC [14] operation and frame is divided into two periods: the active period and the sleep period. During the sleep period, all nodes sharing the same schedule sleep and save energy. The sleep period is usually several times longer than the active period. The active period also consists of two subperiods: the listen for synchronization (SYNC) frame period and the listen for request-to-send (RTS) period. Nodes listen for a SYNC frame in every cycle, and the SYNC frame is transmitted by a device infrequently to achieve and maintain virtual clustering. In the listen for RTS part, the nodes can communicate using a CSMA/CA channel access method with binary exponential backoff. S-MAC also implements a technique called message passing, which can be applied when the network layer has a packet larger than a single frame to transmit. Using message passing, S-MAC splits up the packet into smaller pieces and transmits them as a burst of consecutive data-ACK frames. Overhearing nodes sleep during the data transfer. Should a data transmission continue beyond the active period, the transmitting and receiving nodes using S-MAC can prolong their awake time for the duration of the data transmission.

2.4.3. NanoMAC

Because CSMA/CA is a powerful protocol for medium access control, the nanoMAC protocol also implements CSMA/CA. NanoMAC has been discussed in detail in [20, 21], and [22] presents more details of it with part of the analysis later presented in this paper. Briefly described, nanoMAC is p-nonpersistent, that is, with probability p, the protocol will act as nonpersistent and with probability 1 − p, the protocol will refrain from sending even before CS and schedule a new time to attempt it. Nodes contending for the channel do not constantly listen to the channel, contrary to the normal binary exponential backoff mechanism, but sleep during the random contention window. When the backoff timer expires, the node wakes up to sense the channel. The CS for nanoMAC is relatively short but long enough to guarantee carrier detection on the channel with high confidence. This feature keeps the actual carrier sensing time short, even though the backoff mechanism is binary exponential, and saves energy. In the request-to-send/clear-to-send (RTS/CTS) frames, nanoMAC does virtual carrier sensing in addition to informing overhearing nodes of the time they are required to refrain from transmission. Virtual carrier sensing enables overhearing nodes to sleep during that period. Unlike S-MAC, nanoMAC supports 48-bit IEEE MAC addresses, and sleep information for virtual clustering and the number of data frames to be transmitted are also included in the RTS and CTS frames.

The data frames carry only temporary, short, random addresses to minimize the data frame overhead. With one RTS/CTS reservation, a maximum of 10 data frames can be transmitted using a frame train ideology. The idea is similar to message passing in S-MAC, but it is a default characteristic in nanoMAC, as data is always divided into 35-octet blocks. The transmitted data frames are acknowledged by a single, common ACK frame that has a separate acknowledgement bit reserved for each data frame. The ACK frame is therefore an acknowledgement/negative acknowledgement (ACK/NACK) combination. In this way, only the corrupted frames need to be retransmitted and not the whole packet. Without forward error correction (FEC) methods, the frame train method promises to be efficient. If FEC is used, frames can be made longer. When best utilized, nanoMAC has low overhead even with low data-rate, small frame-size applications. For a 350-octet payload, the MSDU-to-packet ratio for nanoMAC is ∼75%, while for S-MAC and CSMA the values are ∼64% and ∼44%, respectively.
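The frame train and the combined ACK/NACK frame can be illustrated with a short sketch. The 35-octet block size and the limit of 10 frames per reservation come from the text; the function names and the bit ordering of the acknowledgement field (bit i set means frame i was received correctly) are assumptions made for this example, since the actual frame layout is not given here.

```python
BLOCK_OCTETS = 35   # data block size stated in the text
MAX_TRAIN = 10      # maximum data frames per RTS/CTS reservation

def split_into_trains(payload: bytes):
    """Split a payload into 35-octet blocks, grouped into trains of <= 10."""
    blocks = [payload[i:i + BLOCK_OCTETS]
              for i in range(0, len(payload), BLOCK_OCTETS)]
    return [blocks[i:i + MAX_TRAIN]
            for i in range(0, len(blocks), MAX_TRAIN)]

def build_ack_bitmap(received_ok):
    """Pack one acknowledgement bit per data frame (assumed bit order)."""
    bitmap = 0
    for i, ok in enumerate(received_ok):
        if ok:
            bitmap |= 1 << i
    return bitmap

def frames_to_retransmit(bitmap, train_length):
    """Return the indices of the frames whose bit is clear (NACKed)."""
    return [i for i in range(train_length) if not (bitmap >> i) & 1]

# Usage: a 350-octet payload fits one train; frames 3 and 7 are corrupted.
train = split_into_trains(bytes(350))[0]
ack = build_ack_bitmap([i not in (3, 7) for i in range(len(train))])
resend = frames_to_retransmit(ack, len(train))
```

Only the frames listed in `resend` would be sent again, which is the saving the common ACK frame is meant to provide.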

3. BASELINE MULTIHOP COMMUNICATIONS MODEL

In this section, we describe the simple multihop communications model without medium access control. The analysis applies to the case where the MAC is considered to be ideal: the MAC produces no overhead, adds no delays, and the channel access never causes collisions. The analysis without medium access control provides insight into the energy consumption effects of radio parameters.

3.1. Radio power consumption

Power consumption models of the radio in embedded devices, illustrated by Figure 3, must take both transceiver and startup power consumption into account along with an accurate model of the amplifier. The startup consumption actually becomes dominant with small packet sizes and long transition times to receive mode because of the frequency synthesizer settle-down time. In [5] a model for radio power consumption is given for energy per bit $e_b$ as

$$ e_b = e_{tx} + e_{rx} + \frac{E_{dec}}{\iota}, \tag{1} $$

where $e_{tx}$ and $e_{rx}$ are the transmitter and receiver energy consumptions per bit, respectively, $E_{dec}$ is the energy required for decoding a packet, and ι is the payload length in bits. The encoding energy of data is assumed to be negligible. This model takes into account the energy needed to transmit a frame from a transmitter to a receiver over a single hop. In [5] the model was used over a single hop to optimize frame sizes and coding techniques. In this paper, we extend the model to multihop scenarios and to different traffic models. It is then used later in the paper to produce a baseline comparison for multihop MAC efficiency using the same radio parameters.

The term $e_{tx}$ from (1) with optimal power control can be represented as

$$ e_{tx} = e_{te} + e_{ta} d^{\alpha}, \tag{2} $$

where $e_{te}$ is the energy consumption of the transmitter electronics per bit, $e_{ta}$ is the energy consumption of the transmit amplifier per bit over a distance of 1 meter, d is the transmission distance, and α is the path loss exponent. Often in the literature generic approximations are used for these terms. However, an explicit expression for $e_{ta}$ has been presented in [7] as

$$ e_{ta} = \frac{(S/N)_r \,(NF_{Rx})\,(N_0)\,(BW)\,(4\pi/\lambda)^{\alpha}}{(G_{ant})\,(\eta_{amp})\,(R_{bit})}, \tag{3} $$

where $(S/N)_r$ is the desired signal-to-noise ratio at the receiver’s demodulator, $NF_{Rx}$ is the receiver noise figure, $N_0$ is the thermal noise floor for 1 Hz bandwidth, BW is the channel noise bandwidth, λ is the wavelength in meters, $G_{ant}$ is the antenna gain, $\eta_{amp}$ is the transmitter efficiency, and $R_{bit}$ is the raw channel rate in bits per second. This expression for $e_{ta}$ can be used for those cases where a particular hardware configuration is being considered, as in this paper. In the same paper, the authors have shown that an optimal multihop distance, the characteristic distance $d_{char}$, can be defined as

$$ d_{char} = \sqrt[\alpha]{\frac{e_{te} + e_{rx}}{e_{ta}(\alpha - 1)}}. \tag{4} $$

The characteristic distance is a radio-specific parameter which describes when the energy consumptions of the transmitter and receiver circuitries are in balance with the energy consumption of the transmitter amplifier. For a typical low frequency band transceiver like the RFM TR1000 with the electronics values presented in Table 1, the characteristic distance is found to be 31.5 meters with a BER of $10^{-4}$ assuming noncoherent FSK modulation. For sensor networks, this value of $d_{char}$ is a long link distance, but it is the most energy efficient from the point of view of the transceiver electronics. Most communications in sensor networks can thus be completed using single-hop communications with this particular radio. In this paper, we analyze topology, traffic, and medium access control effects on multihop energy efficiency. With the parameters of Table 1, Sankarasubramaniam et al. [5] suggest that a frame size of 41 octets with a BER of $5 \times 10^{-4}$ is close to optimal energy efficiency.

Table 1: Radio parameters of a typical ISM transceiver, the RFM TR1000 at 19.2 kbps, which is used in the analysis of the paper.

| Parameter | Symbol | Value |
| Transmitter circuitry | $e_{te}$ | 1.066 µJ/bit |
| Receiver circuitry | $e_{rx}$ | 0.533 µJ/bit |
| SNR at the receiver | $(S/N)_r$ | 40 dB |
| Receiver noise figure | $NF_{Rx}$ | 10 dB |
| Thermal noise floor | $N_0$ | 4.17 × 10^−21 J |
| Bandwidth | BW | 19 200 Hz |
| Wavelength | λ | 0.327 m |
| Path loss exponent | α | 2.5 |
| Antenna gain | $G_{ant}$ | −10 dB |
| Transmitter efficiency | $\eta_{amp}$ | 0.2 |
| Raw bit rate | $R_{bit}$ | 19 200 bps |
| Sleep mode energy | $e_{slp}$ | 120 pJ/bit |
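As a worked check of (2), (3), and (4), the short script below evaluates the amplifier energy and the characteristic distance from the Table 1 values. It is a sketch for illustration only; the function and constant names are introduced here, and the decibel entries of Table 1 are assumed to be power ratios that are converted to linear scale before use.

```python
import math

def db_to_linear(db):
    """Convert a decibel figure to a plain power ratio."""
    return 10.0 ** (db / 10.0)

# Values copied from Table 1 (RFM TR1000 at 19.2 kbps).
E_TE = 1.066e-6               # transmitter electronics, J/bit
E_RX = 0.533e-6               # receiver electronics, J/bit
SNR_R = db_to_linear(40.0)    # required SNR at the receiver
NF_RX = db_to_linear(10.0)    # receiver noise figure
N0 = 4.17e-21                 # thermal noise floor for 1 Hz bandwidth, J
BW = 19200.0                  # channel noise bandwidth, Hz
WAVELENGTH = 0.327            # wavelength, m
ALPHA = 2.5                   # path loss exponent
G_ANT = db_to_linear(-10.0)   # antenna gain
ETA_AMP = 0.2                 # transmitter efficiency
R_BIT = 19200.0               # raw bit rate, bit/s

def amplifier_energy(alpha=ALPHA):
    """Equation (3): amplifier energy per bit over 1 m."""
    numerator = (SNR_R * NF_RX * N0 * BW
                 * (4.0 * math.pi / WAVELENGTH) ** alpha)
    return numerator / (G_ANT * ETA_AMP * R_BIT)

def tx_energy_per_bit(d, alpha=ALPHA):
    """Equation (2): transmit energy per bit over distance d (m)."""
    return E_TE + amplifier_energy(alpha) * d ** alpha

def characteristic_distance(alpha=ALPHA):
    """Equation (4): characteristic distance in meters."""
    ratio = (E_TE + E_RX) / (amplifier_energy(alpha) * (alpha - 1.0))
    return ratio ** (1.0 / alpha)

print(f"e_ta   = {amplifier_energy():.3e} J/bit")
print(f"d_char = {characteristic_distance():.1f} m")
```

With these inputs the characteristic distance evaluates to roughly 31.5 m, in line with the value quoted in the text.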

3.2. Multihop power consumption

In this section, an analytical model for multihop communications is introduced that takes detailed overheads into account. The linear model is used with variable spacing between nodes, assuming a sink node that collects data and is not energy constrained. No medium access control is assumed. Energy per bit, energy efficiency, and total energy are derived for various traffic cases and node distributions.

A similar analysis can be made as in [8] by extending (1) to take the linear multihop scenario shown in Figure 1 into account, assuming optimal power control. Instead of the total power derived in [8], we can derive the multihop energy per useful bit from (1) as

$$ e_b = \bigl(n(e_{te} + e_{ta} d^{\alpha}) + (n-1)e_{rx}\bigr)\left(1 + \frac{\beta + \tau}{\iota}\right) + \frac{nE_{st} + (n-1)(E_{sr} + E_{dec})}{\iota}, \tag{5} $$

where n is the number of hops, β is the preamble length, τ is the coding overhead, and $E_{st}$ and $E_{sr}$ are startup energies from sleep to transmit and receive, respectively. The reception energy consumption of the sink node is not included because it is not considered to be energy constrained and does not affect the multihop comparison.

For this same topology, we can also calculate the total energy consumed in the network. Using the same notation as in (5), the total multihop energy consumption $E_{MH}$ incurred by node n transmitting $k = \beta + \iota + \tau$ total bits over n hops to the sink is

$$ E_{MH} = n\bigl(k(e_{te} + e_{ta} d^{\alpha}) + E_{st}\bigr) + (n-1)\bigl(k e_{rx} + E_{sr} + E_{dec}\bigr). \tag{6} $$
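As a consistency check between (5) and (6), dividing the total energy by the payload length ι should recover the energy per useful bit. The short derivation below uses only the symbols already defined and is a sketch of that check:

$$
\frac{E_{MH}}{\iota}
= \frac{k}{\iota}\bigl(n(e_{te} + e_{ta} d^{\alpha}) + (n-1)e_{rx}\bigr)
+ \frac{nE_{st} + (n-1)(E_{sr} + E_{dec})}{\iota},
\qquad
\frac{k}{\iota} = \frac{\beta + \iota + \tau}{\iota} = 1 + \frac{\beta + \tau}{\iota}.
$$

Substituting the second relation into the first gives the right-hand side of (5), so that $e_b = E_{MH}/\iota$.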

[Figure 4 plot: single-hop and multihop surfaces over the number of hops and the distance/hop (m); vertical axis: total energy per useful bit (J).]

Figure 4: Total energy for the node n transmitting case. This plot shows the relationship between multihop and single-hop energy efficiency. Single hop is typically more efficient within the radio’s transmission range. The path loss exponent α is 2.3 in this case.

The analysis used to this point has assumed an unrealistic traffic model, that is, only node n (furthest from the sink) transmits data. This was necessary for calculating energy per bit and energy efficiency, which are frame-centric metrics. However, in most useful scenarios, all nodes will transmit data. We can take that into account by assuming that all nodes have a single frame to transmit towards the sink. We consider the scenario of Figure 1 where all the nodes transmit a frame to the sink. From (6) the total energy $E^{all}_{MH}$ consumed in the network by each node transmitting its own frame and forwarding the other nodes’ frames towards the sink for this scenario is

$$ E^{all}_{MH} = \frac{n(n+1)}{2}\bigl(k(e_{te} + e_{ta} d^{\alpha}) + E_{st}\bigr) + \frac{n(n-1)}{2}\bigl(k e_{rx} + E_{sr} + E_{dec}\bigr). \tag{7} $$

We can compare this multihop case to the single-hop case where each node transmits its frame directly to the sink node, that is, no forwarding is performed. Node n has to transmit a total distance of nd, node n − 1 a distance of (n − 1)d, and so forth. From (5) by summation we get the single-hop total energy $E^{all}_{SH}$ consumed in the network as

$$ E^{all}_{SH} = \sum_{i=1}^{n}\bigl(k(e_{te} + e_{ta}(id)^{\alpha}) + E_{st}\bigr). \tag{8} $$

The intermediate nodes between the transmitting node and the sink in the single-hop case do not overhear the transmissions. The channel is also considered to be errorless with the parameters of Table 1. Note that in a realistic scenario, the traffic model is usually somewhere in between the two aforementioned models.
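The totals in (6), (7), and (8) are straightforward to evaluate numerically. The sketch below is illustrative only: the electronics values are those of Table 1 and the amplifier energy is the value obtained from (3) with the Table 1 inputs, but the startup and decoding energies are hypothetical placeholders, because their values are not listed in this section.

```python
from dataclasses import dataclass

@dataclass
class Radio:
    e_te: float = 1.066e-6    # J/bit, Table 1
    e_rx: float = 0.533e-6    # J/bit, Table 1
    e_ta: float = 1.909e-10   # J/bit, from (3) with Table 1 values
    alpha: float = 2.5        # path loss exponent, Table 1
    e_st: float = 10e-6       # J, startup to transmit (hypothetical)
    e_sr: float = 10e-6       # J, startup to receive (hypothetical)
    e_dec: float = 0.0        # J, decoding energy (hypothetical)

def hop_tx(radio, k, d):
    """Energy for one node to transmit k bits over distance d."""
    return k * (radio.e_te + radio.e_ta * d ** radio.alpha) + radio.e_st

def hop_rx(radio, k):
    """Energy for one node to receive and decode k bits."""
    return k * radio.e_rx + radio.e_sr + radio.e_dec

def e_mh(radio, n, k, d):
    """Equation (6): node n sends one frame over n hops of length d."""
    return n * hop_tx(radio, k, d) + (n - 1) * hop_rx(radio, k)

def e_all_mh(radio, n, k, d):
    """Equation (7): every node sends one frame, forwarded hop by hop."""
    return (n * (n + 1) / 2 * hop_tx(radio, k, d)
            + n * (n - 1) / 2 * hop_rx(radio, k))

def e_all_sh(radio, n, k, d):
    """Equation (8): every node sends its frame directly to the sink."""
    return sum(hop_tx(radio, k, i * d) for i in range(1, n + 1))

# Usage with a hypothetical frame of k = 4200 bits and 10 m spacing.
radio = Radio()
for hops in (1, 4, 8, 16):
    print(hops, e_all_mh(radio, hops, 4200, 10.0),
          e_all_sh(radio, hops, 4200, 10.0))
```

Because node i is i hops from the sink, (7) equals the sum of (6) over i = 1, ..., n, which is a convenient sanity check on any implementation. Where the two totals cross depends strongly on the assumed startup energies and on α, so the printed numbers should not be read as results of the paper.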

3.3. Baseline results

The parameters used for the analysis are shown in Table 1, with the exception of α being 2.3 in Figure 4 for clearer illustration. Matlab was used as a tool for producing the figures. In addition, a 350-octet payload with 4B/6B coding is assumed for comparison with the results obtained later including the MAC protocol effects. Using this model, we can compare the use of single-hop and multihop communications in low-power networks. The real question is whether transmit energy or receive and startup energy is the dominant factor, the former favoring the theory that multihop is always more efficient. However, when accurately taking startup energies and other overheads into account, it can be shown that in most practical cases single-hop techniques are preferred for energy efficiency.

[Figure 5 plot: total energy per useful bit (J) versus number of hops; curves: multihop, single hop, multihop (all), single hop (all).]

Figure 5: Comparison of the node n only and all node transmission traffic cases. It can be seen that the crossover point is further in the all nodes transmitting case. Node spacing d is 10 m and the path loss exponent α is 2.5.

The relationship between multihop and single-hop energy efficiency is shown in Figure 4. Here we can see how the planes of multihop and single hop intersect. Multihop is more efficient with a small number of hops over larger distances. Past the typical transmission range of the radio (∼80 m in our case, $d_{char}$ being less), single hop becomes less efficient because of the path loss. In Figure 5, we can see how the traffic model affects this intersection. The all nodes transmitting case increases the range under which single hop is more efficient. Note that in both cases the intersection is beyond the practical range of the radio. These results are highly influenced by radio and channel parameters, especially the path loss exponent, and thus are meant only to show the general relationship. In the next section, we develop the MAC protocol energy analysis model and later use the same radio and topology parameters as in this section in order to make a comparison of MAC effects.

4. ENERGY CONSUMPTION MODEL WITH MEDIUM ACCESS CONTROL

In this section, we describe a theoretical analysis for the energy consumption of MAC protocols and the underlying physical layer. This analysis can be used for the study of networks with a large number of nodes.¹ The model consists of the energy consumed in a network in the transmission of data, taking into account average contention times, average backoff times, and possible frame collisions. The model takes the reception of data into account as the average probabilities for receiving data correctly. A similar model was originally presented in [23] for the delay analysis of the FAMA-NTR protocol, but we have modified it for energy consumption calculations by investigating the probabilities of transitions from one MAC protocol state to another state and the related times consumed in transmit, receive, idle, and sleep. In the model, one consumes energy in the process of arriving at a state. The states themselves are transitory, and with certain probabilities one of all possible paths is chosen to arrive at a new state (in some cases the same state as before). Usually, in the case of ISM-band transceivers, receive and idle modes can be considered as a single mode or the difference is marginal. Throughout the presentation of the analytical model, we use nanoMAC as an example, but an equivalent analysis can be applied to np-CSMA and S-MAC as well as to other MAC protocols.

[Figure 6 state diagram. States: Arrive, Backoff, Attempt, Success. Transitions:
(entry) to Arrive: carrier sense.
Arrive to Attempt: $(1 - P_b)P_{ers}$, transmit RTS.
Arrive to Backoff: $P_b$ or $(1 - P_b)(1 - P_{ers})$, refrain from transmission.
Backoff to Attempt: $P_c P_{ers}$, channel detected vacant, transmit RTS.
Backoff to Backoff: $(1 - P_c)$ or $P_c(1 - P_{ers})$, channel detected busy, stay in backoff.
Attempt to Success: $P_s$, transmit data, receive ACK.
Attempt to Backoff: $(1 - P_s)$, collision, go to backoff.]

Figure 6: Transmit energy model for nanoMAC. The arrows represent energy-consuming transitions from one state to a new state, while the states are instant and do not consume energy. $P_b$, $P_{ers}$, $P_s$, and $P_c$ are transition probabilities.

4.1. Transmit energy

¹ We assume a Poisson process of data arrival and that the number of nodes in the network approaches infinity. Therefore, the probabilities used in our analysis are exponential.

The energy consumption model for transmission can be found from Figure 6. There are four different states: Arrive, Backoff, Attempt, and Success. The Arrive state is the entry point to the system for a node with new data to transmit. In the case of CSMA protocols, carrier sensing is always performed before arriving at the Arrive state, which consumes $E_{Arrive}$ joules of energy. To calculate the average energy consumption, we solve a system of equations implied by Figure 6. Let $E_{Tx}$ equal the expected energy consumption by a node with new data at the Arrive state until the node reaches the Success state. Let E(A) equal the average energy consumption on each visit by the node to the Attempt state, and let E(B) equal the energy consumption on each visit to the Backoff state. On every arrival at one of the states, energy is consumed. This energy consumption is determined by certain times, for example, the time needed to transmit a preamble and an RTS frame, and by the transceiver mode in which that time is spent, for example, transmit ($M_{Tx}$) in this case. There are probabilities attached to each of the arrivals depicting a certain exponential probability to choose that path. The sum of all probabilities out of a specific state is always 1. To reach the Success state, which is the exit point of the data transfer, all the possible transitions starting from the Arrive state and ending at the Success state have to be calculated. The average energy consumption upon transmission, from the point of packet arrival from the upper layer to the point of receiving an ACK frame, is in general of the form

$$ E_{Tx} = E_{Arrive} + P_{prob1} E(A) + (1 - P_{prob1}) E(B), \tag{9} $$

$$ E(A) = P_{prob2} E_{Success} + (1 - P_{prob2}) E(B), \tag{10} $$

$$ E(B) = P_{prob3} E(A) + (1 - P_{prob3}) E(B), \tag{11} $$

where $P_{prob\{1,2,3\}}$ are different probabilities related to arriving at a certain state (each $P_{prob\{1,2,3\}}$ may contain several probabilities), $E_{Arrive}$ is the carrier sensing energy consumption when coming to the Arrive state, and $E_{Success}$ is the expected energy consumption upon reaching the Success state from the Attempt state. For nanoMAC, presenting the probabilities, the times, and the transceiver modes explicitly, (9) translates to

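$$
\begin{aligned}
E_{Tx} ={}& T_{CS} M_{Rx} + P_b\left(T_{bb} + \frac{T_r}{2}\right) M_{Slp} + P_b E(B) \\
&+ (1 - P_b)(1 - P_{ers})\left(T_{bp} + \frac{T_r}{2}\right) M_{Slp} \\
&+ (1 - P_b) P_{ers} E(A) + (1 - P_b) P_{ers} (T_{pr} + RTS) M_{Tx} \\
&+ (1 - P_b)(1 - P_{ers}) E(B).
\end{aligned} \tag{12}
$$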

In (12) the notation is as follows.

(i) $M_{Tx}$ is the transceiver transmit power consumption and is related to the time consumed arriving at a state. Similarly, $M_{Rx}$ and $M_{Slp}$ are transceiver reception and sleep power consumptions, respectively.
(ii) $T_{CS}$ is the time required for carrier sensing.
(iii) $T_{bb}$ and $T_{bp}$ represent the average values of binary exponential backoff. $T_{bb}$ is the incremented backoff time and $T_{bp}$ is the base backoff time.
(iv) $P_b$ is the probability of finding the channel busy during CS.
(v) $T_r/2$ is the average random delay obeying uniform distribution.
(vi) $P_{ers}$ is the nonpersistence value of nanoMAC.
(vii) $T_{pr}$ and RTS are times to transmit a preamble and an RTS frame, respectively.

[Figure 7 state diagram. States: Idle, Reply, Received. Transitions:
Idle to Reply: $P_s$, valid RTS received.
Idle to Idle: $(1 - P_s)$, no valid RTS received, stay in idle.
Reply to Received: $P_{senh}$, receive data packet.
Reply to Idle: $(1 - P_{senh})$, collision during CTS.]

Figure 7: The receive energy model for nanoMAC. The arrows represent energy-consuming transitions from one state to a new state, while the states are instant and do not consume energy. Idle is the entry point to the system and no energy is consumed before a transmission by another device is attempted. $P_s$ and $P_{senh}$ are transition probabilities.

From Backoff, (11), and Attempt, (10), we make the same analysis as from the Arrive state, (9), and solve a system of equations. For nanoMAC, E(B) of (11) after algebra translates to

$$ E(B) = (\omega + P_c P_{ers}\delta)(P_{ers} P_c P_s)^{-1}, \tag{13} $$

where $P_c$ is the probability of finding no transmissions during time e and $P_s$ is the probability of no collision during an RTS frame. The symbol ω represents the energy model’s transition from Backoff state to Attempt state or Backoff state. The explicit form of ω is presented in Appendix A and in form it is similar to (12). Similarly, δ represents the model’s transition from Attempt state to Backoff state or Success state, and the explicit form can be found in Appendix A. After algebra, E(A) of (10) for nanoMAC can also be found and is

$$ E(A) = \delta + (1 - P_s)(\omega + P_c P_{ers}\delta)(P_{ers} P_c P_s)^{-1}, \tag{14} $$

where the term E(A) gives a constraint: the probability of no collision with retransmit RTS $P_c > 0$ and the probability of successful data transmission $P_s > 0$ → G ∈ [0,∞]. Note that we are not modelling the BE backoff with a Markov chain here. We are using average values of BE backoff modified by G, where G is the normalized, average traffic offered to the channel. This assumption does not affect the energy consumption result.
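The closed forms (13) and (14) can be checked numerically against the state machine of Figure 6. The sketch below assumes the following reading of the figure, which is an interpretation made for this example rather than the explicit derivation of Appendix A: leaving Attempt costs δ on average and returns to Backoff with probability $1 - P_s$, while leaving Backoff costs ω on average and reaches Attempt with probability $P_c P_{ers}$. The function names and the numeric inputs are hypothetical.

```python
def closed_form(omega, delta, p_c, p_ers, p_s):
    """Evaluate (13) and (14); returns (E(A), E(B))."""
    denom = p_ers * p_c * p_s
    if denom <= 0.0:
        raise ValueError("P_ers, P_c and P_s must all be positive")
    e_b = (omega + p_c * p_ers * delta) / denom      # equation (13)
    e_a = delta + (1.0 - p_s) * e_b                  # equation (14)
    return e_a, e_b

def solve_by_iteration(omega, delta, p_c, p_ers, p_s, iterations=500):
    """Iterate the assumed recursions until they settle.

    E(A) = delta + (1 - P_s) E(B)
    E(B) = omega + q E(A) + (1 - q) E(B), with q = P_c P_ers
    """
    q = p_c * p_ers
    e_a = e_b = 0.0
    for _ in range(iterations):
        e_a = delta + (1.0 - p_s) * e_b
        e_b = omega + q * e_a + (1.0 - q) * e_b
    e_a = delta + (1.0 - p_s) * e_b
    return e_a, e_b

# Usage with arbitrary illustrative numbers (energies in joules).
exact = closed_form(2.0e-4, 5.0e-4, 0.5, 0.8, 0.9)
approx = solve_by_iteration(2.0e-4, 5.0e-4, 0.5, 0.8, 0.9)
```

Under this reading both routes agree, and the guard in `closed_form` mirrors the stated constraint that $P_c$ and $P_s$ must be greater than zero.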

For np-CSMA and S-MAC, a state machine similar to Figure 6 can be drawn but with different probabilities and values. Equations (9), (10), and (11) apply and the transmit energy consumption of np-CSMA and S-MAC is of the form $E_{Tx} = \gamma + \sigma E(B) + \varphi + (1 - \sigma)E(A)$, where γ and φ are sums of products of probabilities, times, and transceiver modes (similar to ω and δ) and σ is a probability based on the value of the congestion window.

4.2. Receive energy

The reception energy consumption model of a packet for nanoMAC can be found in Figure 7. Idle listening is not taken into account in the model of Figure 7; instead, it is covered in the next section. For analysis, the reception energy model is similar to the transmit energy model, and the average receive energy consumption $E_{Rx}$ from listening for a transmission to detecting and receiving a valid packet and being the proper destination can be found to be

$$ E_{Rx} = E(I) = (\mu + P_s\theta)(P_s P_{senh})^{-1}, \tag{15} $$

where the notation is as follows.

(i) E(I) is the energy incurred in each visit to state Idle.
(ii) µ represents the energy model’s transitions from state Idle and is explicitly described in Appendix B. It is similar to ω of the previous subsection.
(iii) θ represents the energy model’s transitions from state Reply and is explicitly described in Appendix B. It is also similar to ω of the previous subsection.
(iv) $P_s$ and $P_{senh}$ are the probabilities of no collision during RTS or CTS, respectively.

Details for receive energy consumption can be found in Appendix B. For reception, the constraint $P_s P_{senh} > 0$ → G < ∞ is introduced. The energy consumption for np-CSMA and S-MAC for reception can be calculated using Figure 7 and replacing the probabilities, times, and transceiver modes with appropriate ones.
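One reading of Figure 7 that reproduces (15) is sketched below. It is offered as an illustration under stated assumptions, not as the derivation of Appendix B: E(R) is introduced here to denote the expected energy from the Reply state onwards, µ and θ are taken as the average costs of leaving Idle and Reply, respectively, and a collision during CTS is assumed to return the node to Idle.

$$
\begin{aligned}
E(I) &= \mu + P_s E(R) + (1 - P_s) E(I), \\
E(R) &= \theta + (1 - P_{senh}) E(I).
\end{aligned}
$$

Substituting the second line into the first gives $P_s E(I) = \mu + P_s\theta + P_s(1 - P_{senh})E(I)$, hence $P_s P_{senh} E(I) = \mu + P_s\theta$ and

$$ E(I) = \frac{\mu + P_s\theta}{P_s P_{senh}}, $$

which is finite only when $P_s P_{senh} > 0$, matching the stated constraint.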

The average energy per useful bit for transmission andreception is depicted in Figure 8. A network with a very largenumber of nodes using a Poisson process is assumed. The ra-dio parameters can be found in Table 1 and we can see thatnp-CSMA transmission energy consumption is the highest asexpected and about 40% higher than with nanoMAC and 7%higher than with S-MAC. Surprisingly, the reception energyconsumption of S-MAC is the highest of the three protocols.This is due to three factors: in the calculations done in Mat-lab, artificially small ACK frames of 1 octet were used for np-CSMA. This is due to the fact that longer ACK frames for np-CSMA would lead to a deadlock situation in the worst-caseenergy consumption scenario presented in the next chap-ter. Secondly, binary exponential backoff causes S-MAC andalso np-CSMA to spend on the average a relatively long timein transceiver RX mode before data transmission. Thirdly,S-MAC has a cyclic listen for SYNC period, in which the


Figure 8: Transmission and reception energy consumption of np-CSMA, S-MAC, and nanoMAC per MSDU bit. The traffic assumes a Poisson process over a single hop, and a fully connected network with a very large number of nodes. (Horizontal axis: normalized traffic G (Erlang), 10^-3 to 10^4; vertical axis: absolute energy consumption E (J), scale ×10^-6. Curves: TX P0.01 nanoMAC, TX P0.1 nanoMAC, TX P1 nanoMAC, TX np-CSMA, TX S-MAC, RX np-CSMA, RX nanoMAC, RX S-MAC.)

transceiver has to be in RX mode. No actual data can be communicated during that time, so a potential transmitter and receiver have to spend extra time in RX mode. In nanoMAC, the synchronization is handled in RTS, CTS, and ACK frames, so no extra listening is required per transmitted data packet. NanoMAC reception therefore consumes only two fifths of the energy per useful bit compared to S-MAC.

5. REGULAR SLEEP PERIODS

In the previous section, we presented a MAC energy model for the transmission and reception of data. In a more realistic analysis of wireless sensor MAC protocols, we have to include periods when there is no data communication ongoing, as well as sleeping to save energy. These issues are addressed in this section by including idle listening and describing a sleep mechanism, both of which are appended to the model of the previous section. A comparison of energy consumption with and without sleep is also made.

We evaluate the average, maximum, single-hop power consumption for a node using the RFM TR1000 and nanoMAC with and without sleep periods, as well as np-CSMA without sleep. Because S-MAC has an inherent sleep cycle, we use a similar model for evaluation. A legal duty cycle of 10%, common to ISM channels, is used, implying that a node is allowed to transmit only one tenth of its active time. That is, whenever a node sends a packet to some other node, it has to refrain from transmission for a period of 9 times the time it took to transmit the packet. The data arrival rate to

Table 2: MAC protocol specific frame sizes, MSDU size, communicating MSDU on the channel, and transmitted portions by the data originator and the recipient in octets.

Parameter (octets)                  NanoMAC    CSMA    S-MAC
Control frame size                  18         1       10
Data frame size                     41         41      43
Data frame payload                  35         25      35
MSDU Apkt                           350        25      350
Packet on the channel Cpkt          507.25     49      627
Cpkt; sender transmitter STx        464.25     44.5    478.5
Cpkt; receiver transmitter RTx      43         4.5     148.5

the system is Poisson distributed, and Table 2 lists the relevant parameters for the data packet communications. We consider a 350-octet MSDU Apkt arriving from an upper layer process for nanoMAC and S-MAC and a 25-octet MSDU for np-CSMA. In this way, the least overhead is used by each of the protocols. The length of the data transmitted on the channel, Cpkt in octets, is known after appending the necessary control frames, headers, and preambles. Of Cpkt, STx octets are transmitted by the data originator transmitter and RTx octets are transmitted by the receiver transmitter as control frames and acknowledgements. Protocols have their own frame structure and communications method, and therefore the values are different for each protocol.

We consider a maximal usage case, called the worst-case scenario, in which node(i) transmits a packet as often as possible, without buffering, and is the recipient of all packets sent in the channel except those it transmits itself.

5.1. Worst-case scenario

Whenever a node transmits data, control frames, or acknowledgements, it has to obey duty-cycle constraints. Because of the duty-cycle constraints, a node can transmit a packet every Ttp seconds,

\[
T_{tp} = \frac{S_{Tx}}{R_d C_d} + \mathrm{MAX}(r)\left(\frac{R_{Tx}}{R_d C_d}\right) G_{mod}, \tag{16}
\]

where Rd is the data rate (bps), Cd the duty cycle, and r the number of packets addressed to node(i) that node(i) receives during a wait between packet transmissions Ttp. Gmod is the average normalized traffic, limited so that G > 1 → Gmod = 1. MAX(r) is the maximum possible value of r in a Ttp at G = 1, given by

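\[
\mathrm{MAX}(r) = \left(\frac{S_{Tx}}{C_d\left(C_{pkt} + T_{proc}\right)} - 1\right)\left(1 - \frac{R_{Tx}}{C_d\left(C_{pkt} + T_{proc}\right)}\right)^{-1}. \tag{17}
\]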

The processing delay Tproc is expressed in bits. We use a 1-octet ACK for np-CSMA because using a 15-octet-long ACK frame (ACK frame with IEEE sender/recipient MAC addresses) with np-CSMA leads to a deadlock. The deadlock is expressed by MAX(r) reaching negative values. Negative values correspond to a situation where a node first transmits a data frame and then, while refraining from transmission until the duty cycle is satisfied, receives data frames; the ACK frames sent to acknowledge those frames delay the next data transmission indefinitely.
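The following sketch evaluates (16) and (17) for the np-CSMA column of Table 2 and shows how the deadlock appears as a negative MAX(r). Several inputs are assumptions made only for illustration: the data rate of 19.2 kbit/s (Table 1 is not reproduced in this section), a processing delay Tproc of zero, and the longer-ACK case, which simply adds 14 octets to RTx and Cpkt rather than using the exact frame sizes of the 15-octet ACK. All lengths are converted to bits so that they share one unit.

```python
OCTET = 8  # bits per octet


def max_receptions(s_tx, r_tx, c_pkt, duty_cycle, t_proc=0.0):
    """MAX(r) of (17); every length argument is in bits."""
    budget = duty_cycle * (c_pkt + t_proc)
    return (s_tx / budget - 1.0) / (1.0 - r_tx / budget)


def transmit_period(s_tx, r_tx, max_r, data_rate, duty_cycle, g):
    """T_tp of (16) in seconds; G_mod saturates at 1."""
    g_mod = min(g, 1.0)
    per_bit = 1.0 / (data_rate * duty_cycle)
    return s_tx * per_bit + max_r * r_tx * per_bit * g_mod


DUTY_CYCLE = 0.1    # 10% duty cycle, from the text
DATA_RATE = 19_200  # bps, ASSUMED (not taken from Table 1)

# np-CSMA column of Table 2 (1-octet ACK), converted to bits.
s_tx, r_tx, c_pkt = 44.5 * OCTET, 4.5 * OCTET, 49 * OCTET
max_r = max_receptions(s_tx, r_tx, c_pkt, DUTY_CYCLE)
t_tp = transmit_period(s_tx, r_tx, max_r, DATA_RATE, DUTY_CYCLE, g=1.0)
print(f"MAX(r) = {max_r:.1f}, T_tp at G = 1: {t_tp:.3f} s")

# HYPOTHETICAL longer ACK: 14 extra octets added to R_Tx and C_pkt.
deadlock_r = max_receptions(s_tx, 18.5 * OCTET, 63 * OCTET, DUTY_CYCLE)
print(f"MAX(r) with the longer ACK = {deadlock_r:.2f} (negative: deadlock)")
```

Under these assumptions the second factor of (17) changes sign as soon as RTx exceeds Cd(Cpkt + Tproc), that is, when the receiver's own control traffic alone would exhaust the duty-cycle budget.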

5.2. NanoMAC sleep groups

We implement four-level sleep scheduling for nanoMAC. The sleep scheduling operates in cycles of 9.6 seconds, after which all of the nodes in the network resynchronize themselves. After the resynchronizing timer expires in a node, the node turns its radio to listen mode. The node then only listens to the channel for a period of time to confirm that every node in its area of influence is awake. After this period, the node starts a random timer, after which it broadcasts a special synchronization preamble to resynchronize all of the nodes. Should the node receive the special synchronization preamble before its own transmission, it synchronizes with that preamble and resumes normal operation. A new cycle of 9.6 seconds begins from the end of transmission of the special preamble. If the node has data to transmit, it can piggyback the data. If a node cannot resynchronize with the network, it has to immediately change its sleep group to SG 00 (always awake) until it receives a valid resynchronization preamble. On average, nodes have to spend 500 milliseconds in receive mode to resynchronize, producing an extra energy cost of 5.1 mJ in 10.1 (9.6 + 0.5) seconds, corresponding to 28 nJ/bit in a cycle.
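The resynchronization overhead quoted above can be checked with a few lines of arithmetic. The sketch assumes a data rate of 19.2 kbit/s and that the per-bit figure is taken over the bits that fit in one 9.6-second cycle; both are assumptions, chosen because together they reproduce the quoted 28 nJ/bit, and Table 1 is not reproduced here to confirm them.

```python
RESYNC_LISTEN_S = 0.5     # average time in receive mode, from the text
RESYNC_ENERGY_J = 5.1e-3  # extra energy per resynchronization, from the text
CYCLE_S = 9.6             # sleep-scheduling cycle, from the text
DATA_RATE_BPS = 19_200    # ASSUMED data rate

# Receive-mode power implied by 5.1 mJ spent over 500 ms.
rx_power_w = RESYNC_ENERGY_J / RESYNC_LISTEN_S

# Bits that could be carried in one cycle at the assumed rate.
bits_per_cycle = DATA_RATE_BPS * CYCLE_S

# Resynchronization energy spread over those bits.
overhead_j_per_bit = RESYNC_ENERGY_J / bits_per_cycle

print(f"implied RX power: {rx_power_w * 1e3:.1f} mW")
print(f"resync overhead:  {overhead_j_per_bit * 1e9:.1f} nJ/bit")
```

With these inputs the overhead comes out at roughly 27.7 nJ/bit, which rounds to the 28 nJ/bit stated in the text.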

The sleep group information in nanoMAC is transmitted in the control frames which every awake node can overhear: RTS, CTS, and ACK. Each control frame has a 1-octet sleep field which is divided into two parts.

(i) Sleep group: this field announces the sleep group the node is currently following. There are four different sleep groups: SG 00 with no sleep periods, SG 01 in which nodes wake up every 0.4 second, SG 10 with 0.96-second wake-ups, and SG 11 with 1.6-second wake-ups.

(ii) Next wake-up: this field indicates the next time the node will be awake for communication. The resolution of the field depends on the sleep group.

The above values are just carefully selected examples, and one could use other values. After wake-up, the nodes stay awake for an active period of 85 milliseconds and, in addition, for a period of {0 − Cpkt/Rd} seconds (the time of a data packet communication). The additional period is spent awake only if a valid packet is being transmitted or received. Any node overhearing one of the control frames can calculate the times when the source node will be awake. Every node keeps the schedules of all its immediate neighbors, or at least the schedules of the neighbors it wishes to communicate with if the additional memory consumption of keeping track of all nodes is not justified.
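To make the sleep-group timing concrete, the sketch below lists the wake-up instants of each group within one 9.6-second cycle, finds the instants at which all groups are awake together, and computes the sleep : awake ratio for the 85-millisecond active period. It assumes that all groups start their schedule at the same instant at the beginning of a cycle and ignores the optional extension of the active period during a packet exchange; the function names are illustrative only.

```python
from functools import reduce
from math import gcd

CYCLE_MS = 9600   # resynchronization cycle, from the text
ACTIVE_MS = 85    # active period after each wake-up, from the text
WAKE_PERIOD_MS = {"SG 01": 400, "SG 10": 960, "SG 11": 1600}


def lcm(a: int, b: int) -> int:
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def wake_times(period_ms: int, cycle_ms: int = CYCLE_MS) -> list:
    """Wake-up instants (ms) in one cycle, assuming a wake-up at t = 0."""
    return list(range(0, cycle_ms, period_ms))


def common_wake_times(periods, cycle_ms: int = CYCLE_MS) -> list:
    """Instants (ms) at which every listed group wakes up simultaneously."""
    return list(range(0, cycle_ms, reduce(lcm, periods)))


def sleep_to_awake_ratio(period_ms: int, active_ms: int = ACTIVE_MS) -> float:
    """Ratio of sleep time to awake time within one wake-up period."""
    return (period_ms - active_ms) / active_ms


for name, period in WAKE_PERIOD_MS.items():
    print(name, len(wake_times(period)), "wake-ups per cycle,",
          f"sleep:awake = {sleep_to_awake_ratio(period):.1f}:1")
print("all groups awake at (ms):", common_wake_times(WAKE_PERIOD_MS.values()))
```

Under the stated assumption, the three sleeping groups coincide every 4.8 seconds, twice per cycle, while SG 00 is always awake.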

5.3. Energy consumption with sleep groups

In the last two subsections, we defined the scenario and presented a sleep group model for analysis with the MAC energy model derived before. Next, these are combined to consider single-hop communications and MAC energy consumption with idle listening and sleeping, taking into account the radio characteristics.

When considering sleep groups, we assume that the sender and recipient are synchronized in time so that when the sender transmits, the recipient is awake to receive data. Because the transmitter and receiver are synchronized in time, sleeping mainly reduces idle listening. Sleeping also increases the traffic offered to the channel because some arrivals occur during the sleep period, and every new arrival can be allocated to a new node to satisfy the Poisson process. The total worst-case energy consumption with sleep, EWCS, consists of the energy consumed in transmission ETx, reception ERx, sleeping, and idle listening. The exact derivation of EWCS is presented in Appendix C and the resulting formula is

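\[
\begin{aligned}
E_{WCS} ={}& \frac{m T_{aw} G_{imod}}{T_{tp}}\left(\frac{1}{C_{pkt}} - \frac{1}{R_d T_{tp}}\right)\times\left(1 - \frac{A_{pkt}}{R_d T_{tp} G_{inc}}\right)E_{Rx} + \frac{m\left(T_{wup} - T_{aw}\right)}{A_{pkt}}\,M_{Slp} \\
&+ E_{Tx} + \frac{m T_{aw}\left(1 - G_{imod}\right)}{T_{tp}}\,\frac{T_{idleRX}\,M_{idleRX}}{A_{pkt}},
\end{aligned}
\tag{18}
\]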

where m = Ttp/Twup is the number of wake-ups during Ttp, Twup the wake-up period defined by sleep groups, Taw the period a node is awake, Gimod the increased traffic offered to the channel due to sleeping with a maximum value of 1, Ginc the increased traffic due to sleeping, TidleRX the time in one Ttp a node spends in idle mode, and MidleRX the transceiver power consumption in idle receive mode (here, the same as MRx). Traffic offered to the channel is increased because there are arrivals while nodes are sleeping, and when the nodes wake up, there will be increased contention.

The radio parameters are listed in Table 1. The total energy consumption per useful transmitted bit in the worst-case scenario with and without sleep groups is depicted in Figure 9. The behavior of the curves needs some explanation. The high energy consumption per bit at low values of G is explained by the fact that the offered traffic to the channel is very low and nodes spend most of their time in idle listening. The actual energy consumed in the transmission of a packet is negligible compared to the energy consumed in idle listening between successive data packet transmissions. This behavior is common to all of the MAC protocols we consider. We can see that the introduction of sleep groups and S-MAC's inherent sleep schedule help to compensate for the idle listening, but one needs at least a 15 : 1 sleep : awake cycle (nanoMAC SG 11) to keep the energy-per-useful-bit value low. When G increases, nanoMAC with a nonpersistence of 1 performs very well for a wide range of G,


Figure 9: Energy consumption of nanoMAC with sleep groups, np-CSMA, and S-MAC per transmitted MSDU bit in the worst-case scenario with respect to G. A node transmits as often as possible with a 10% duty-cycle constraint and is the recipient for all the other transmissions in the channel. (Horizontal axis: normalized traffic G (Erlang), 10^-3 to 10^4; vertical axis: absolute energy consumption E (J), scale ×10^-4. Curves: NanoMAC P1 no sleep, NanoMAC P1 SG 01, NanoMAC P1 SG 10, NanoMAC P1 SG 11, Np-CSMA, S-MAC.)

but eventually, in extremely high bursts of G, the energy consumption increases exponentially. NanoMAC achieves its good performance by being passive and sleeping. The tradeoff for the low energy consumption is an increase in delay, as our work in [21] implies (with throughput-delay calculations). The good performance of nanoMAC is also due to the fact that overhearing nodes sleep for the duration of data transmission as well as for the duration of the backoff times.

Similar behavior can be seen for S-MAC, but there is a clear energy consumption minimum around G = 0.07. At this point there is exactly one data packet arrival per Ttp. When the traffic load increases, node(i) begins to receive data packets in addition to its own transmissions. Idle time is reduced, but the high energy cost of receiving increases the overall energy consumption. The energy consumption per useful transmitted bit soon reaches a steady state, or saturation point, where extra traffic no longer increases the amount of data node(i) receives per Ttp. Because Ttp has reached its maximum value, no more traffic can be communicated in the channel. When the instantaneous traffic offered to the channel reaches very high values, the number of collisions effectively blocks communications on the channel and the energy per useful transmitted bit grows exponentially.

The performance of np-CSMA, on the other hand, seems quite interesting, but upon closer inspection the behavior is exactly the same as for S-MAC. At low values of G the performance of np-CSMA is similar to that of nanoMAC without sleep, for the same reasons as for nanoMAC. When G increases beyond the point of more than one arrival (during Ttp) to the system, the energy consumption starts increasing linearly because the number of received packets per Ttp grows linearly. The increase of reception continues for a while until the channel starts to saturate with data packets. Because np-CSMA is a simple protocol, high bursts of traffic lead to a rapid increase in energy consumption per useful bit.

The energy saving effect of regular sleeping can be observed with low values of G and occurs because the amount of idle listening is reduced by a large factor. We expect that the same energy saving behavior is not limited to this worst-case scenario, but is applicable whenever G is low.

6. MULTIHOP ANALYSIS

We have described an analytical model for MAC energy evaluation in the previous sections, but until now we have only considered a single-hop model. From here on we extend our analysis to include the multihop topologies of Figures 1 and 2. In Figure 1, N is the total number of nodes in the multihop chain with uniform optimum spacing d. With multihop, one hop is d meters and node N's packets make N hops to reach the sink node, whereas for single hop, node N transmits the same data with one hop of distance Nd.
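The trade-off between N hops of length d and one hop of length Nd can be illustrated with a deliberately simplified per-bit radio model that ignores medium access control altogether. This is only a sketch under stated assumptions: the per-hop cost is taken as a fixed electronics term plus an amplifier term growing as d^α, the constants E_ELEC and E_AMP are hypothetical and are not the Table 1 values, and the characteristic distance is the one that minimizes this simplified model rather than a reproduction of (4).

```python
ALPHA = 2.5       # path loss exponent used in the figure captions
E_ELEC = 100e-9   # HYPOTHETICAL TX + RX electronics energy per bit per hop (J)
E_AMP = 0.1e-9    # HYPOTHETICAL amplifier energy per bit per m**alpha (J)


def single_hop_energy(n: int, d: float, alpha: float = ALPHA) -> float:
    """Energy per bit for one hop covering the whole distance n * d."""
    return E_ELEC + E_AMP * (n * d) ** alpha


def multihop_energy(n: int, d: float, alpha: float = ALPHA) -> float:
    """Energy per bit for n hops of length d each."""
    return n * (E_ELEC + E_AMP * d ** alpha)


def characteristic_distance(alpha: float = ALPHA) -> float:
    """Hop length minimizing the per-distance energy of this simple model."""
    return (E_ELEC / (E_AMP * (alpha - 1.0))) ** (1.0 / alpha)


d_char = characteristic_distance()
for n in (1, 2, 5, 10):
    at_char = multihop_energy(n, d_char) <= single_hop_energy(n, d_char)
    at_short = multihop_energy(n, d_char / 3) <= single_hop_energy(n, d_char / 3)
    print(f"N = {n:2d}: multihop wins at d_char: {at_char}, at d_char/3: {at_short}")
```

In this simplified model, multihop is never worse than single hop when the spacing equals the characteristic distance, whereas for shorter spacing single hop is cheaper up to a crossover number of hops.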

We assume that sleep scheduling similar to nanoMAC's can be made for np-CSMA. ACK frames for np-CSMA are 1 octet long. Three different scenarios are investigated: one with perfect sleep scheduling, one with the multigroup sleep scheduling described in the previous section, and one with common sleep scheduling. In perfect sleep scheduling, only the source and the immediate destination are awake during any given transmission and there are no overhearing nodes. With multigroup sleep scheduling, we assume that 25% of nodes obey each sleep schedule. Notice that all of the sleep schedules overlap in certain wake periods to keep the network fully connected, and all the nodes awake during a transmission will overhear it if they are within the range of the transmission. When common sleep scheduling is used, we assume that all N nodes in the linear network are awake at the same time, so all the nodes within the transmission radius will overhear the transmissions. The MAC model produces background traffic in the network, resulting in the scenario of Figure 2.

Figure 10 illustrates the energy consumption behavior of the modified np-CSMA with optimum spacing,^2 where d is the characteristic distance dchar of (4) and G = 0.22. We observe the following: with dchar, the optimum multihop power consumption distance, multihop communications always has lower energy consumption than single-hop communications. This behaviour is independent of the MAC protocol even though only np-CSMA is shown in Figure 10. The lower energy consumption of multilevel (SG 01) and perfect sleep can only be seen in MAC protocols like np-CSMA because the overheard frames are long. In S-MAC and nanoMAC, the overheard frames are limited to small control frames, implying that even perfect sleep

^2 By optimum spacing we mean that all the nodes in the chain are equidistant and the separation of nodes d is exactly the characteristic distance dchar, which depends on the radio characteristics.


Figure 10: Nonpersistent CSMA with a linear topology. Comparison of single-hop versus multihop communications with characteristic distance dchar. All nodes are transmitting and different sleep groups are utilized (d = 31.5 m times multiplier N, path loss exponent α = 2.5). (Horizontal axis: number of hops, 0 to 20; vertical axis: energy consumption E (J), scale ×10^-3. Curves: Np-CSMA with perfect sleep, SG 01, and common sleep, each for single-hop and for multihop.)

scheduling does not provide better energy performance than common sleeping. The benefits of an advanced sleep algorithm therefore help only in situations where the channel is lightly loaded, as seen in Figure 9. Figure 11 illustrates all three MAC protocols for uniform optimum spacing dchar and a common sleep group. We observe that the MAC protocols have different energy consumptions even for a small number of hops, but nanoMAC single-hop communications is more energy efficient than np-CSMA and S-MAC multihop communications for up to 2 hops.

Next we make the same analysis as above, but change from uniform optimum spacing to uniform nonoptimal spacing. We choose d to be 10 meters and calculate Figure 12. In the legend, the first three curves are for the single-hop case and the latter three are for multihop. All of the MAC protocols' single-hop and multihop energy consumption curves cross. Each of the crossing points is outside the feasible single-hop transmission distances of our ISM radio, and the protocols illustrate similar behavior to that of Figures 4 and 5, which are calculated without medium access control. The differences between Figures 5 and 12 are mainly in the energy consumption, showing that medium access control consumes almost 2 orders of magnitude more energy than in the analysis without medium access control. Therefore, a simpler analysis can illustrate equivalent behavior in some cases even though the absolute values differ. From the figures we deduce that the use of single-hop communications is more energy efficient in wireless sensor networks where the offered traffic is usually low or moderate and node separation is small.

Figure 11: Optimal spacing with characteristic distance dchar. Common sleep group for single-hop versus multihop communications. With dchar, multihop forwarding always outperforms single-hop communications but, for example, nanoMAC single-hop communications outperform np-CSMA and S-MAC multihop forwarding with 2 hops (d = 31.5 m times multiplier N, path loss exponent α = 2.5). (Horizontal axis: number of hops, 1 to 3; vertical axis: absolute energy consumption E (J), scale ×10^-5. Curves: P1 nanoMAC, Np-CSMA, and S-MAC, each for single hop and for multihop.)

When we compare the protocols with one another, we can see the importance of proper design of the MAC protocol with its neighboring layers; nanoMAC as a sensor MAC protocol achieves over 50% energy savings compared to np-CSMA.

7. MULTIHOP WITH RANDOM SPACING

Lastly, we use the developed analysis technique in realistic wireless sensor networks. To this point we have assumed the node separation in sensor networks to be uniform, but in reality this is generally not achievable due to, for example, the terrain or the deployment method. Usually, nodes are scattered around randomly, subject to certain minimum and maximum separation thresholds. Also, in the case of spatially large sensor networks, single-hop communications can become impossible because node-sink distances are too long. Therefore, we adopt new communication styles: shortest hop and longest hop, corresponding to the former multihop and single-hop communications, respectively. Shortest-hop communications applies to many routing protocols, where one chooses a close or the closest neighbor towards the sink and routes the data via that neighbor. In the longest-hop strategy, a node tries to transmit to the furthest neighbor it can within the feasible transmission distance of the radio. We use the radio characteristics provided by Table 1 and, based on measurements, choose 100 meters to be the maximum feasible transmission radius of a node with legal transmission power. We also discard the usage of optimal power control and apply a four-level discrete power control achievable by


Figure 12: Np-CSMA, S-MAC, and nanoMAC energy consumption with nonoptimal spacing of d = 10 meters. Common sleep group applied for all the protocols comparing single-hop versus multihop communications. All the protocols' curves cross, implying single-hop communications outperform multihop communications up to the crossover point (d = 10 m times multiplier N, path loss exponent α = 2.5). (Horizontal axis: number of hops, 0 to 20; vertical axis: absolute energy consumption E (J), scale ×10^-3. Curves: P1 nanoMAC, Np-CSMA, and S-MAC, each for single hop and for multihop.)

cheap sensor nodes. The power levels enable transmission to full range and to 3/5, 1/3, and 1/10 of full range.
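A short sketch of the four-level discrete power control follows: for a given link distance, the node picks the lowest power level whose range still covers the link. The selection rule and the function name are assumptions made for illustration; the text specifies only the four ranges and the 100-meter maximum.

```python
from fractions import Fraction

MAX_RANGE_M = 100  # maximum feasible transmission radius, from the text
# Range of each power level as a fraction of full range, lowest first.
LEVEL_FRACTIONS = (Fraction(1, 10), Fraction(1, 3), Fraction(3, 5), Fraction(1))


def select_range(distance_m: float):
    """Return the range (m) of the lowest level that reaches distance_m.

    Returns None when the link is longer than the full range.
    """
    for fraction in LEVEL_FRACTIONS:
        reach = fraction * MAX_RANGE_M
        if distance_m <= reach:
            return float(reach)
    return None


for link in (8, 12, 50, 95, 120):
    print(f"link of {link} m -> level range {select_range(link)}")
```

A 12-meter link, for example, already needs the level that reaches one third of full range, so the node radiates well beyond its intended receiver; this overshoot is the source of the extra overhearing that discrete power steps cause.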

7.1. Short node distances

We set up the network with hop distances randomly chosen between 3 and 15 meters. All the nodes in the network are transmitting data to the sink. We average over 20 random scenarios with bounded d, and the results are illustrated in Figure 13. When we compare the figure to Figure 12, we see that there is no crossing point and the longest-hop method outperforms the shortest-hop method. The behavior is explained by two factors. Firstly, discrete-step power control causes more overhearing by the shortest-hop communications as well as higher than necessary transmission power. Secondly, although the shortest-hop method could occasionally reach two hops away, it always communicates with the closest node, as many traditional ad hoc routing protocols do. Therefore, the shortest-hop method wastes energy, which causes the difference between the figures.

7.2. Large node distances

First, we take a look at a special case where there is no power control and nodes always transmit at their set power, full power. We use random hop distances with 5 to 70 meter hops and again average over 20 independent networks. The path loss exponent α is given values between 2 and 4. Figure 14 presents the case of α = 4, and we argue that the longest-hop communications mode outperforms the shortest-hop

Figure 13: Np-CSMA, S-MAC, and nanoMAC with random [3, 15] meter node spacing. The MAC protocols use common sleep groups and longest-hop versus shortest-hop communications are compared. Note that there is no crossing of curves and longest-hop communications clearly outperform the shortest-hop strategy (total distance (m) with random length hops, N = 20, path loss exponent α = 2.5). (Horizontal axis as labeled in the plot: number of nodes, 20 to 180; vertical axis: absolute energy consumption E (J), scale ×10^-3. Curves: Np-CSMA, S-MAC, and NanoMAC, each for longest hop and for shortest hop.)

method even with a very harsh radio propagation environment. The results are not unexpected, since the shortest-hop method transmits at the same power as the longest-hop method, but always to the nearest neighbor. When α is lower, S-MAC energy performance achieves better results than the modified np-CSMA. We expect that communications using the shortest-hop method occurs within a longer period of time and therefore multiple packet forwarding does not increase G significantly.

Last, we consider a case of random 5 to 50 meter hops with four-level power control. The radio parameters are found in Table 1, but we vary α from 2 (free space) upward. Figure 15 illustrates the energy consumption of nanoMAC with the longest- and shortest-hop communication methods and varying path loss. In a free space environment, the longest-hop communications method has superior energy performance compared to the shortest-hop method. In open fields, where α is usually close to 2.2, the longest-hop method still clearly outperforms the shortest-hop method, but already with light woods (α ∼ 2.4) the shortest-hop communication achieves better energy performance per bit than the longest hop. Therefore, we deduce that choosing the proper communications method depends heavily on the environment the sensor network is supposed to work in. In large open spaces one should favor longest-hop communications, whereas in large industrial halls, where the path loss can be high (close to 4), the shortest-hop communications method is the best choice. The MAC protocol chosen


Figure 14: Np-CSMA, S-MAC, and nanoMAC with no power control and random [5, 70] meter spacing. Nodes transmit with full power and a common sleep group is applied. Longest-hop versus shortest-hop communications are compared and the longest-hop strategy performs better even with harsh radio environments (total distance (m) with random length hops, N = 1 − 20, path loss exponent α = 4). (Horizontal axis as labeled in the plot: number of nodes, 100 to 800; vertical axis: energy consumption E (J), 0 to 3000. Curves: Np-CSMA, S-MAC, and NanoMAC with longest hop, average max no. hops 10, and with shortest hop, average max no. hops 20.)

has some importance because in the presented scenario the nanoMAC longest-hop method still has a marginally better behavior than S-MAC shortest-hop methods with α = 2.5 (not shown). The individual differences between MAC protocols are still great, for example, resulting in over 35% better energy performance per useful bit from nanoMAC to modified np-CSMA with common sleep mode for both the shortest- and longest-hop methods with α = 2.5.

8. CONCLUSIONS AND DISCUSSION

In this paper, we have presented an energy analysis technique applicable to medium access control and multihop communications. By applying this technique, we have gained insight into when to use single-hop communications instead of multihop forwarding. As an application of the presented technique, we have made an energy analysis of the np-CSMA, S-MAC, and nanoMAC protocols with sleeping schemes. Based on the analysis, we have discovered many important results that relate to MAC protocol features.

Firstly, when a realistic radio model is applied for a sensor network, we discovered that with feasible transmission distances single-hop communications can be more efficient than multihop from an energy perspective. This phenomenon applies to uniform hop distances of less than the radio-specific optimum transmission distance dchar with optimal power control, nonuniform random short and long

Figure 15: The effect of path loss on nanoMAC energy consumption with random [5, 50] meter spacing. A common sleep group for longest-hop versus shortest-hop communications is used. The path loss heavily affects which of the communication styles performs better, with high path loss favoring shortest-hop communications (total distance (m) with random length (5 − 50 m) hops, N = 20). (Horizontal axis as labeled in the plot: number of nodes, 0 to 500; vertical axis: energy consumption E (J), scale ×10^-3. Curves: NanoMAC longest hop and NanoMAC shortest hop, each for α = 2, α = 2.2, and α = 2.4.)

(long with α < 2.4) link distances with discrete four-level power control, and in cases where no power control can be exercised. Secondly, a well-designed sensor MAC protocol has similar behavior to the case where the MAC protocol can be considered ideal; only the absolute energy consumption is higher, by about one order of magnitude.

Thirdly, there are some inherent flaws in adapting existing ad hoc MAC protocols to sensor networks. Idle listening and overhearing avoidance are important factors, as already discussed in publications such as [14, 15], but also any listening that is not absolutely necessary, like listening for the SYNC in S-MAC, decreases the energy performance of a sensor MAC. Binary exponential backoff, which causes nodes to listen to the channel for the duration of the contention window before transmitting, also increases energy consumption, especially when the offered traffic to the channel increases (see Figures 8 and 9). If message passing techniques are used (transmitting an ACK frame and the related turnaround times consume a large amount of energy and occupy the channel for a longer time), ACKs should be combined. On the other hand, combining ACK frames makes the common ACK more important and communications more vulnerable to losing the frame. ACK combining is implemented in nanoMAC and proves more energy efficient in our analysis.

Introducing regular sleep periods can have a major impact on the energy consumption of a node, especially with low traffic loads. The low duty cycle of ISM bands also demands regular sleep periods. Sleep periods increase the delay, but they can be justified by the energy savings. Regular, coordinated multigroup sleeping also decreases the energy consumption per transmitted useful bit in both single-hop and multihop communications because it limits the number of overhearing nodes. The energy savings depend heavily on the MAC protocol used as well as on whether single-hop or multihop communications is used. The energy saving effect is strongest with MAC protocols where the overheard frames are long, like np-CSMA (see Figure 10).

When designing sensor networks, several factors have to be taken into account. Firstly, the environment the sensor network is going to operate in suggests whether communications with the longest possible links or the shortest possible links is more energy efficient. In small areas and large open areas, utilizing the longest feasible links is most energy efficient, and in large indoor areas shortest-link communications is best suited. Secondly, the availability of power control on the transmitter amplifier is an important consideration. If no power control is available, longest feasible hops are recommended no matter the environment. The more adjustable power levels there are, the better short-link multihop communication performs. With optimal power control, a range of communications where using the longest possible hops is more beneficial generally exists, but if the sensor nodes can be placed the radio-specific characteristic distance dchar apart, shortest-hop communications will perform best. Thirdly, if delay is not an important factor, minimize the amount of time the MAC protocol spends listening. Periodic listen times after a sleep period should be made as short as possible, with functionality to dynamically extend the listening time if data is being received. The listening includes backoff periods, network synchronization periods, and contention for the channel. Finally, the radio parameters of the transceiver used strongly influence the system energy performance. For example, if the reception circuitry of a radio consumes more energy than transmission at full power, as in Bluetooth, single-hop communications becomes much more favorable than multihop communications. The same behavior is observed if the power consumption of the transmitter electronics is dominant. When the transmitter amplifier energy consumption is highly dominant, multihop communications is better.

9. FUTURE WORK

In order to continue the analysis, further analytical results will be compared with real measurements. We have implemented nanoMAC on TinyOS for the Berkeley MICA2-mote nodes and on the CWC's WIRO sensor platform to make measurements. Also, we have assumed an error-free or nearly error-free (BER 10^-4) channel and need to analyze the energy behavior with different bit error rates. This implies modifications to the MAC energy model or a switch to Markov chains and a finite number of nodes, and the use of energy saving error control codes for low BER values. Different sensor network traffic models influence the energy consumption and the types of protocols used, so the definition of traffic models other than data-centric nodes transmitting to the sink is also needed. Finally, the problem also needs to be considered from the transport and application layers. Different schemes for packet forwarding in sensor networks should be compared using a similar cross-layer analysis.

APPENDICES

A. TRANSMIT ENERGY

From Figure 6 for nanoMAC,

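(1)

\[
\begin{aligned}
E_{Tx} ={}& T_{CS}M_{Rx} + P_b\left(T_{bb} + \frac{T_r}{2}\right)M_{Slp} + P_b E(B) \\
&+ \left(1 - P_b\right)\left(1 - \mathrm{Pers}\right)\left(T_{bp} + \frac{T_r}{2}\right)M_{Slp} \\
&+ \left(1 - P_b\right)\mathrm{Pers}\,E(A) + \left(1 - P_b\right)\mathrm{Pers}\left(T_{pr} + \mathrm{RTS}\right)M_{Tx} \\
&+ \left(1 - P_b\right)\left(1 - \mathrm{Pers}\right)E(B).
\end{aligned}
\tag{A.1}
\]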

MTx, MRx, and MSlp are transceiver power consumptions in TX, RX, and sleep modes, respectively. TCS is the time required for carrier sensing, and Tbb and Tbp are the weighted average and base backoff times, respectively. Pb denotes the probability of finding the channel busy during CS, Tr/2 is the average random delay, Pers is the nonpersistence value of nanoMAC, and Tpr and RTS are the times to transmit a preamble and an RTS frame, respectively.

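(2)

\[
\begin{aligned}
E(A) ={}& P_s \Delta M_{Rx} + P_s \Psi M_{Tx} + (1 - P_s) P_{nf} T_f M_{Rx} \\
&+ (1 - P_s) P_{nf} E(B) + (1 - P_s)(1 - P_{nf}) E(B) \\
&+ (1 - P_s)(1 - P_{nf}) \left(\frac{T_f}{2} + T_{pr} + \mathrm{RTS}\right) M_{Rx} \\
&+ (1 - P_s)(1 - P_{nf}) \left(T_{pkt} - T_{pr} - \mathrm{RTS}\right) M_{Slp},
\end{aligned}
\tag{A.2}
\]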

where $\Delta = 2T_{pr} + \mathrm{CTS} + \mathrm{ACK} + T_{to}/2$, $\Psi = T_{pr} + 9(T_{prs} + T_{sym}) + 10\,\mathrm{Data}$, and $P_{nf}$ is the probability of not having a new data transmission by another device during the failed period $T_f$. $T_{pkt} = 4T_{pr} + \mathrm{RTS} + \mathrm{CTS} + 9(T_{prs} + T_{sym}) + 10\,\mathrm{Data} + \mathrm{ACK} + T_{to}$, where $T_{prs}$ is a short preamble, $T_{sym}$ is the time of 1 symbol, $\mathrm{CTS}$, $\mathrm{Data}$, and $\mathrm{ACK}$ are the corresponding times to transmit those frames, and $T_{to}$ is the timeout delay.

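(3)

\[
\begin{aligned}
E(B) ={}& (1 - P_c)\left(T_{CS} + \mathrm{RTS}\right) M_{Rx} + (1 - P_c) E(B) \\
&+ (1 - P_c)\left(T_{pkt} - T_{pr} - \mathrm{RTS}\right) M_{Slp} + P_c (1 - P_{ers}) E(B) \\
&+ P_c (1 - P_{ers}) \left(T_{bp} + \frac{T_r}{2} + e\right) M_{Slp} \\
&+ P_c P_{ers} \left(M_{Rx} T_{CS} + M_{Tx}\left(T_{pr} + \mathrm{RTS}\right)\right) + P_c P_{ers} E(A),
\end{aligned}
\tag{A.3}
\]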


where $P_c$ is the probability of finding no transmissions during time $e$. Inserting $E(A)$ into the above equation for $E(B)$, we can solve for $E(B)$ and get

\[
E(B) = \left(\omega + P_c P_{ers} \delta\right)\left(P_{ers} P_c P_s\right)^{-1},
\tag{A.4}
\]

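where

\[
\begin{aligned}
\omega ={}& P_c P_{ers} \left(M_{Rx} T_{CS} + M_{Tx}\left(T_{pr} + \mathrm{RTS}\right)\right) \\
&+ P_c (1 - P_{ers}) \left(M_{Slp}\left(T_{bp} + \frac{T_r}{2} + e\right)\right) \\
&+ (1 - P_c)\left(M_{Rx}\left(T_{CS} + \mathrm{RTS}\right) + M_{Slp}\left(T_{pkt} - T_{pr} - \mathrm{RTS}\right)\right), \\
\delta ={}& P_s \left(M_{Rx} \Delta + M_{Tx} \Psi\right) + (1 - P_s) P_{nf} \left(M_{Rx} T_f + M_{Slp}\left(T_{pkt} - T_{pr} - \mathrm{RTS}\right)\right) \\
&+ (1 - P_s)(1 - P_{nf}) \left(M_{Rx}\left(\frac{T_f}{2} + T_{pr} + \mathrm{RTS}\right)\right).
\end{aligned}
\tag{A.5}
\]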

Now we can solve for $E(A)$ as follows:

\[
E(A) = \delta + (1 - P_s)\left(\omega + P_c P_{ers} \delta\right)\left(P_{ers} P_c P_s\right)^{-1}.
\tag{A.6}
\]

The expression for $E(A)$ imposes a constraint: the probability of no collision with a retransmitted RTS must satisfy $P_c > 0$, and the probability of successful data transmission must satisfy $P_s > 0$, which implies $G \in [0, \infty)$.
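As a sanity check on (A.4) and (A.6), the two coupled relations can be iterated numerically and compared with the closed forms. The sketch below is illustrative only. It assumes that $\omega$ and $\delta$ stand for the non-recursive terms of (A.3) and (A.2), so that $E(A) = \delta + (1 - P_s)E(B)$ and $E(B) = \omega + (1 - P_c P_{ers})E(B) + P_c P_{ers} E(A)$. The function names and all numeric values are placeholders, not measured nanoMAC parameters.

```python
def closed_form(omega, delta, p_c, p_ers, p_s):
    """Return (E(A), E(B)) from the closed forms (A.6) and (A.4)."""
    denom = p_ers * p_c * p_s
    if denom <= 0.0:
        # Mirrors the constraint stated in the text: P_c > 0 and P_s > 0.
        raise ValueError("P_ers, P_c and P_s must all be positive")
    e_b = (omega + p_c * p_ers * delta) / denom
    e_a = delta + (1.0 - p_s) * e_b
    return e_a, e_b


def fixed_point(omega, delta, p_c, p_ers, p_s, iterations=500):
    """Iterate the two recursive relations until they settle."""
    e_a = e_b = 0.0
    for _ in range(iterations):
        e_a = delta + (1.0 - p_s) * e_b
        e_b = omega + (1.0 - p_c * p_ers) * e_b + p_c * p_ers * e_a
    return e_a, e_b


# Placeholder inputs (joules and probabilities), chosen only for illustration.
OMEGA, DELTA = 2.0e-3, 5.0e-3
P_C, P_ERS, P_S = 0.8, 0.5, 0.9

if __name__ == "__main__":
    print(closed_form(OMEGA, DELTA, P_C, P_ERS, P_S))
    print(fixed_point(OMEGA, DELTA, P_C, P_ERS, P_S))
```

The iteration contracts with factor $1 - P_c P_{ers} P_s$, which is why it stops converging exactly when one of the probabilities reaches zero.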

B. RECEIVE ENERGY

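Based on Figure 7, we can solve for the average reception energy consumption $E_{Rx}$ of nanoMAC by analyzing the Idle $E(I)$ and Reply $E(R)$ states:

(1)

\[
\begin{aligned}
E(I) ={}& (1 - P_s)(1 - P_{nf}) \left(2T_{CSRX} + 2\,\mathrm{RTS} + T_{pr} + \frac{T_f}{2}\right) M_{Rx} \\
&+ (1 - P_s)(1 - P_{nf}) \left(T_{pkt} + \mathrm{RTS} + T_{pr}\right) M_{Slp} \\
&+ (1 - P_s)(1 - P_{nf}) E(I) + (1 - P_s) P_{nf} E(I) \\
&+ (1 - P_s) P_{nf} \left(T_{CSRX} + \mathrm{RTS} + \frac{T_f}{2} + e\right) M_{Rx} \\
&+ P_s \left(T_{CSRX} + \mathrm{RTS} + T_{proc} + e\right) M_{Rx} + P_s E(R),
\end{aligned}
\tag{B.1}
\]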

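where $T_{CSRX}$ is the receive carrier sense delay and $T_{proc}$ is the processing delay in the MAC protocol.

(2)

\[
\begin{aligned}
E(R) ={}& P_{senh} \left(2T_{pr} + \mathrm{CTS} + \mathrm{ACK}\right) M_{Tx} \\
&+ P_{senh} \left(T_{pr} + 9\left(T_{prs} + T_{sym}\right) + 10\,\mathrm{Data} + T_{proc}\right) M_{Rx} \\
&+ (1 - P_{senh}) \left(2T_{pr} + 9\left(T_{prs} + T_{sym}\right) + 10\,\mathrm{Data} + T_{to}\right) M_{Rx} \\
&+ (1 - P_{senh}) \left(2T_{pr} + \mathrm{CTS} + \mathrm{ACK}\right) M_{Tx} + (1 - P_{senh}) E(I),
\end{aligned}
\tag{B.2}
\]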

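where $P_{senh}$ is the enhanced probability of not having a collision during CTS transmission, and $\mathrm{CTS}$ and $\mathrm{Data}$ are the times needed to transmit a CTS frame and a data frame, respectively. From the above equations we can solve for $E_{Rx}$ and get

\[
E_{Rx} = E(I) = \left(\mu + P_s \theta\right)\left(P_s P_{senh}\right)^{-1},
\tag{B.3}
\]

where

\[
\begin{aligned}
\mu ={}& (1 - P_s)(1 - P_{nf}) \left[\left(T_{pkt} - T_{pr} - \mathrm{RTS}\right) M_{Slp} + \left(2T_{CSRX} + 2\,\mathrm{RTS} + T_{pr} + \frac{T_f}{2}\right) M_{Rx}\right] \\
&+ (1 - P_s) P_{nf} \left(T_{CSRX} + \mathrm{RTS} + \frac{T_f}{2} + e\right) M_{Rx} \\
&+ P_s \left(T_{CSRX} + \mathrm{RTS}\right) M_{Rx}, \\
\theta ={}& P_{senh} \left[\left(2T_{pr} + \mathrm{CTS} + \mathrm{ACK}\right) M_{Tx} + \left(T_{pr} + 9\left(T_{prs} + T_{sym}\right) + 10\,\mathrm{Data} + T_{proc}\right) M_{Rx}\right] \\
&+ (1 - P_{senh}) \left[\left(2T_{pr} + \mathrm{CTS} + \mathrm{ACK}\right) M_{Tx} + \left(T_{pr} + 9\left(T_{prs} + T_{sym}\right) + 10\,\mathrm{Data} + T_{to}\right) M_{Rx}\right].
\end{aligned}
\tag{B.4}
\]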

For reception, the constraint $P_s P_{senh} > 0 \rightarrow G < \infty$ is introduced.
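The elimination that leads from (B.1) and (B.2) to (B.3) is short, and writing it out makes the origin of the constraint visible. The steps below are a reconstruction, not part of the original derivation. They assume that $\mu$ and $\theta$ denote the terms of (B.1) and (B.2), respectively, that contain neither $E(I)$ nor $E(R)$.

```latex
\begin{align*}
E(I) &= \mu + (1 - P_s)\,E(I) + P_s\,E(R), \\
E(R) &= \theta + (1 - P_{senh})\,E(I). \\
\intertext{Substituting the second relation into the first and collecting $E(I)$:}
P_s\,E(I) &= \mu + P_s\,\theta + P_s (1 - P_{senh})\,E(I), \\
P_s P_{senh}\,E(I) &= \mu + P_s\,\theta, \\
E(I) &= \left(\mu + P_s\,\theta\right)\left(P_s P_{senh}\right)^{-1}.
\end{align*}
```

The final division is only defined when $P_s P_{senh} > 0$, which is the constraint stated above.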

C. ENERGY CONSUMPTION WITH SLEEP GROUPS

The total energy consumption with sleep has to be expressed in parts as a function of $G$, the average normalized traffic offered to the channel. Let $R_d$ denote the data rate, $A_{pkt}$ the MSDU size, and $T_{tp}$ of (16) the minimum period between two consecutive packet transmissions by node(i). When $G = 1$ (the capacity of the channel), $(R_d/A_{pkt})T_{tp}$ new packets arrive to the system in a period $T_{tp}$. When $G = A_{pkt}/(R_d T_{tp})$, only one packet is generated for transmission every $T_{tp}$. When $G \leq (R_d/A_{pkt})T_{tp}$, the transceiver of node(i) stays in idle listening for

\[
T_{idleRX} = \frac{A_{pkt}}{R_d T_{tp} G}\left(T_{tp} - \frac{C_{pkt}}{R_d}\right)
\tag{C.1}
\]

seconds for every packet transmitted.

When $(R_d/A_{pkt})T_{tp} \leq G \leq 1$, at least one packet is generated every $T_{tp}$ and one of the generated packets can be assigned for transmission by node(i). In this worst-case scenario, all the other packets generated during the period $T_{tp}$ will be assigned for reception by node(i). So, for the total energy consumption $E_{tot}$, we get

\[
E_{tot} = E_{Tx} + \left(\frac{1}{C_{pkt}} - \frac{1}{R_d T_{tp}}\right)\left(1 - \frac{A_{pkt}}{R_d T_{tp} G}\right) E_{Rx} + \frac{T_{idleRX} M_{idleRX}}{A_{pkt}},
\tag{C.2}
\]

where the energy consumption is per successfully transmitted useful bit by node(i), $E_{Tx}$ is found from (12) and divided by $A_{pkt}$, $E_{Rx}$ is found from (15) and divided by $A_{pkt}$, and $M_{idleRX}$ is the power consumption for listening to an empty channel.

When $G \geq 1$, exactly one packet is generated for transmission by node(i) in a $T_{tp}$ and almost all the rest of the time is spent receiving packets, but due to the multiple access environment, a small amount of channel capacity is still used for idle listening. In order to use $E_{tot}$, we have to set some constraints for the equation to be valid with different values of $G$. The constraints are

\[
E_{Rx} =
\begin{cases}
0, & \text{if } G < \dfrac{A_{pkt}}{R_d T_{tp}}, \\[1.5ex]
\dfrac{\text{(15)}}{A_{pkt}}, & \text{else.}
\end{cases}
\tag{C.3}
\]

Lastly, sleeping is taken into account in $E_{tot}$. A device stays awake for a period $T_{aw}$ of 85 milliseconds plus between 0 and $C_{pkt}/R_d$ ($T_{bwu}$ = 85 milliseconds), where $C_{pkt}/R_d$ is the time needed to communicate one packet, $C_{pkt}$ is the length of the packet on the channel, and $T_{bwu}$ is the base awake time of node(i). When $G < A_{pkt}/(R_d T_{tp})$, $T_{aw}$ = 85 milliseconds. When $G \geq 1$, $T_{aw}$ = 85 milliseconds + $C_{pkt}/R_d$ with high probability. Therefore, we can expect

\[
T_{aw} = T_{bwu} + G_{mod}\frac{C_{pkt}}{R_d},
\tag{C.4}
\]

where $G_{mod}$ is the channel-capacity-limited traffic offered to the channel, with $G_{mod} = 1$ when $G > 1$. Node(i) will sleep for $T_{wup} - T_{aw}$ seconds, where $T_{wup}$ is the wake-up period defined by the sleep groups (SG 00, SG 01, SG 10, and SG 11 of Section 5.2). If a device does not sleep, it belongs to group SG 00, and $T_{wup} = T_{aw}$.
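A small numerical sketch of (C.4) and the resulting sleep time may help. It assumes $G_{mod} = \min(G, 1)$, which the text states only for $G > 1$, and it uses a hypothetical packet length and data rate. The function and parameter names are illustrative and are not taken from nanoMAC.

```python
def awake_time(g, t_bwu, c_pkt, r_d):
    """Awake period T_aw of (C.4), in seconds.

    g     -- offered traffic G (normalized)
    t_bwu -- base awake time T_bwu in seconds
    c_pkt -- packet length on the channel in bits
    r_d   -- data rate in bits per second
    """
    g_mod = min(g, 1.0)  # assumption: offered traffic clipped at capacity
    return t_bwu + g_mod * c_pkt / r_d


def sleep_time(g, t_bwu, c_pkt, r_d, t_wup=None):
    """Sleep period T_wup - T_aw; zero for a node in group SG 00."""
    t_aw = awake_time(g, t_bwu, c_pkt, r_d)
    if t_wup is None:  # SG 00: the node never sleeps, so T_wup = T_aw
        return 0.0
    if t_wup < t_aw:
        raise ValueError("wake-up period shorter than awake time")
    return t_wup - t_aw


if __name__ == "__main__":
    # Hypothetical values: 1000-bit packets at 100 kbit/s, 1 s wake-up period.
    print(awake_time(2.0, 0.085, 1000, 100_000))
    print(sleep_time(0.5, 0.085, 1000, 100_000, t_wup=1.0))
```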

If node(i) sleeps, the number of packets received in a $T_{tp}$ is reduced, and this in turn reduces the time $T_{tp}$ itself by reducing $\mathrm{MAX}(r)$ of (17), the maximum number of received packets between two consecutive transmissions. The new $\mathrm{MAX}(r)$, denoted $\mathrm{MAX}(r_{slp})$, can be calculated by the formula

\[
\mathrm{MAX}(r_{slp}) = \left(\frac{T_{aw}}{T_{wup}}\left(\frac{S_{Tx}}{C_d\left(C_{pkt} + T_{proc}\right)} - 1\right)\right) \times \left(1 - \frac{R_{Tx}}{C_d\left(C_{pkt} + T_{proc}\right)}\right)^{-1},
\tag{C.5}
\]

where $S_{Tx}$ is the amount of data the originator transmits in a packet exchange, $T_{proc}$ is the processing delay measured here in bits, $R_{Tx}$ is the amount of data the recipient transmits in a packet exchange, and $C_d$ is the duty cycle. $T_{tp}$ can be calculated with $\mathrm{MAX}(r_{slp})$.

In a $T_{tp}$, there are $m = T_{tp}/T_{wup}$ wake-ups, and during sleep there are $(G(T_{wup} - T_{aw})R_d)/A_{pkt}$ new arrivals, which increase the traffic offered to the channel to $G_{inc}$:

\[
G_{inc} = G\left(1 + \frac{\left(T_{wup} - T_{aw}\right)^2 R_d}{T_{wup} A_{pkt}}\right).
\tag{C.6}
\]

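With the above equations, we can present the total energy consumption $E_{tot}$ with sleep groups, $E_{WCS}$, as

\[
\begin{aligned}
E_{WCS} ={}& \frac{m T_{aw} G_{imod}}{T_{tp}}\left(\frac{1}{C_{pkt}} - \frac{1}{R_d T_{tp}}\right) \times \left(1 - \frac{A_{pkt}}{R_d T_{tp} G_{inc}}\right) E_{Rx} + \frac{m\left(T_{wup} - T_{aw}\right)}{A_{pkt}} M_{Slp} \\
&+ E_{Tx} + \frac{m T_{aw}\left(1 - G_{imod}\right)}{T_{tp}} \cdot \frac{T_{idleRX} M_{idleRX}}{A_{pkt}},
\end{aligned}
\tag{C.7}
\]

where $G_{imod} = G_{inc}$ when $G_{inc} \leq 1$, and $G_{imod} = 1$ otherwise.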
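The traffic inflation of (C.6) and the clipping that defines $G_{imod}$ can be sketched as follows. The sketch assumes the reading of (C.6) in which $(T_{wup} - T_{aw})$ is squared, which is the dimensionally consistent one. The function names and numeric values are hypothetical.

```python
def increased_traffic(g, t_wup, t_aw, r_d, a_pkt):
    """G_inc of (C.6): offered traffic inflated by arrivals during sleep.

    t_wup, t_aw -- wake-up period and awake time in seconds
    r_d         -- data rate in bits per second
    a_pkt       -- MSDU size in bits
    """
    return g * (1.0 + (t_wup - t_aw) ** 2 * r_d / (t_wup * a_pkt))


def clipped_traffic(g_inc):
    """G_imod: equal to G_inc up to the channel capacity, 1 beyond it."""
    return min(g_inc, 1.0)


if __name__ == "__main__":
    # Hypothetical node: 1 s wake-up period, 0.1 s awake, 100 kbit/s, 1000-bit MSDU.
    g_inc = increased_traffic(0.5, 1.0, 0.1, 100_000, 1000)
    print(g_inc, clipped_traffic(g_inc))
```

With no sleep ($T_{wup} = T_{aw}$) the correction term vanishes and $G_{inc} = G$, which matches the SG 00 case.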

REFERENCES

[1] RFM, "Tr-1000 product technical information sheet," available online on http://www.rfm.com/products/data/.

[2] THK, "Regulation on Collective Frequencies for Certain Radio Transmitters and Their Use," June 2001, Telehallintokeskus, Unofficial Translation, available online on http://www.ficora.fi/englanti/document/.

[3] ERC REPORT 25, "Frequency Range 29.7 MHz to 105 GHz and Associated European Table of Frequency Allocations and Utilisations," February 1998, Brussels, June 1994, revised in Bonn, March 1995 and in Brugge, February 1998. European Radiocommunications Committee (ERC) within the CEPT, available online on http://www.ero.dk/.

[4] RFM, "ASH Transceiver Designer's Guide," available online on http://www.rfm.com/products/tr_des24.pdf.

[5] Y. Sankarasubramaniam, I. F. Akyildiz, and S. W. McLaughlin, "Energy efficiency based packet size optimization in wireless sensor networks," in Proc. 1st IEEE International Workshop on Sensor Network Protocols and Applications (SNPA '03), pp. 1–8, Anchorage, Alaska, USA, May 2003.

[6] W. R. Heinzelman, A. Chandrakasan, and H. Balakrishnan, "Energy-efficient communication protocol for wireless microsensor networks," in Proc. 33rd Annual Hawaii International Conference on System Sciences (HICSS '00), vol. 2, pp. 1–10, Maui, Hawaii, USA, January 2000.

[7] P. Chen, B. O'Dea, and E. Callaway, "Energy efficient system design with optimum transmission range for wireless ad hoc networks," in Proc. IEEE International Conference on Communications (ICC '02), vol. 2, pp. 945–952, New York, NY, USA, April-May 2002.

[8] R. Min, M. Bhardwaj, N. Ickes, A. Wang, and A. Chandrakasan, "The hardware and the network: total-system strategies for power aware wireless microsensors," in Proc. IEEE CAS Workshop on Wireless Communications and Networking, Pasadena, Calif, USA, September 2002.

[9] W. R. Heinzelman, J. Kulik, and H. Balakrishnan, "Adaptive protocols for information dissemination in wireless sensor networks," in Proc. 5th Annual ACM/IEEE International Conference on Mobile Computing and Networking (MobiCom '99), pp. 174–185, Seattle, Wash, USA, August 1999.

[10] D. Petrovic, R. C. Shah, K. Ramchandran, and J. Rabaey, "Data funneling: routing with aggregation and compression for wireless sensor networks," in Proc. 1st IEEE International Workshop on Sensor Network Protocols and Applications (SNPA '03), pp. 156–162, Anchorage, Alaska, USA, May 2003.

[11] C. Intanagonwiwat, R. Govindan, and D. Estrin, "Directed diffusion: a scalable and robust communication paradigm for sensor networks," in Proc. 6th Annual ACM/IEEE International Conference on Mobile Computing and Networking (MobiCom '00), pp. 56–67, Boston, Mass, USA, August 2000.

[12] E. M. Royer, S.-J. Lee, and C. E. Perkins, "The effects of MAC protocols on ad hoc network communication," in Proc. IEEE Wireless Communications and Networking Conference (WCNC '00), vol. 2, pp. 543–548, Chicago, Ill, USA, September 2000.

[13] S. Singh and C. S. Raghavendra, "PAMAS—power aware multi-access protocol with signalling for ad hoc networks," ACM SIGCOMM Computer Communication Review, vol. 28, no. 3, pp. 5–26, 1998.

[14] W. Ye, J. Heidemann, and D. Estrin, "Medium access control with coordinated adaptive sleeping for wireless sensor networks," IEEE/ACM Trans. Networking, vol. 12, no. 3, pp. 493–506, 2004.

[15] T. van Dam and K. Langendoen, "An adaptive energy-efficient MAC protocol for wireless sensor networks," in Proc. 1st International Conference on Embedded Networked Sensor Systems (SenSys '03), pp. 171–180, Los Angeles, Calif, USA, November 2003.

[16] IEEE-802.11, "Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications," Tech. Rep., Institute of Electrical and Electronics Engineers, Bellevue, Wash, USA, June 1997. IEEE Std 802.11-1997.

[17] V. Bharghavan, A. Demers, S. Shenker, and L. Zhang, "MACAW: a media access protocol for wireless LANs," in Proc. ACM SIGCOMM Conference (ACM SIGCOMM '94), pp. 212–225, London, UK, September 1994.

[18] IEEE-802.15.4, "Part 15.4: Wireless Medium Access Control (MAC) and Physical Layer (PHY) Specifications for Low-Rate Wireless Personal Area Networks (LR-WPANs)," Tech. Rep., Institute of Electrical and Electronics Engineers, Bellevue, Wash, USA, May 2003. IEEE Std 802.15.4-2003.

[19] L. Kleinrock and F. A. Tobagi, "Packet switching in radio channels: Part I—Carrier sense multiple-access modes and their throughput-delay characteristics," IEEE Trans. Commun., vol. 23, no. 12, pp. 1400–1416, 1975.

[20] J. Haapola, "Low-power wireless measurement system for physics sensors," Master's thesis, Department of Physical Sciences, University of Oulu, Oulu, Finland, 2002, unpublished, available online on http://www.ee.oulu.fi/~jhaapola/.

[21] J. Haapola, "NanoMAC: a distributed MAC protocol for wireless sensor networks," in Proc. 18th Convention on Radio Science & IV Finnish Wireless Communication Workshop (FWCW '03), pp. 17–20, Oulu, Finland, October 2003.

[22] J. Haapola, Z. Shelby, C. Pomalaza-Raez, and P. Mahonen, "Cross-layer energy analysis of multi-hop wireless sensor networks," in Proc. 2nd European Workshop on Wireless Sensor Networks (EWSN '05), pp. 33–44, Istanbul, Turkey, January–February 2005.

[23] C. L. Fullmer, Collision avoidance techniques for packet-radio networks, Ph.D. dissertation, University of California, Santa Cruz, Calif, USA, June 1998.

Jussi Haapola graduated with an M.S. degree in physics from the University of Oulu, Finland, in 2002. Currently, he is a Ph.D. student of telecommunications at the Department of Electrical Engineering. He is also a Researcher at the Centre for Wireless Communications working in the field of low-power wireless networking with an emphasis on medium access control. Other research interests include energy optimization in heterogeneous and multihop wireless networks.

Zach Shelby is a Ph.D. student and Research Scientist at the Centre for Wireless Communications, University of Oulu. He holds a B.S. degree from Michigan Technological University (1999) and an M.S. (Tech) degree from the University of Oulu (2003). His interests are in wireless energy-efficient networks, especially in the area of embedded and sensor networks.

Carlos Pomalaza-Raez is an electrical engineering Professor at Indiana-Purdue University, USA. He received his B.S.M.E. and B.S.E.E. degrees from Universidad Nacional de Ingeniería, Lima, Peru, in 1974, and the M.S. and Ph.D. degrees in electrical engineering from Purdue University, West Lafayette, Indiana, in 1977 and 1980, respectively. He has been a faculty member of the University of Limerick, Ireland, and of Clarkson University, Potsdam, New York. He has also been a member of the technical staff at the Jet Propulsion Laboratory, the California Institute of Technology, where he was involved in the design of the advanced receiver for the Voyager II deep space program. He has extensive experience in the design, development, and implementation of routing algorithms for ad hoc tactical communication networks. In 2003 and 2004, under the auspices of a Nokia-Fulbright Scholar Award, he was a Visiting Professor at the Centre for Wireless Communications, University of Oulu, Finland. His research interests are wireless communications networks and signal processing applications.

Petri Mahonen is currently a Full Professor and Chair of wireless networks at the Aachen University (RWTH Aachen). Previously, he has studied and worked in the United States, United Kingdom, and Finland. He has been principal investigator in several international research projects, including initiating and leading several large European Union research projects. He has published over 100 peer-reviewed conference and journal articles. His current research with his group focuses on wireless Internet, cognitive networking and radios, applied mathematical methods for telecommunications, and low-power communications including sensors, cooperative and ad hoc networks.


EURASIP Journal on Wireless Communications and Networking 2005:4, 541–553
© 2005 W. Li and H. Dai

Optimal Throughput and Energy Efficiency for Wireless Sensor Networks: Multiple Access and Multipacket Reception

Wenjun Li
Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC 27695-7914, USA
Email: [email protected]

Huaiyu Dai
Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC 27695-7914, USA
Email: huaiyu [email protected]

Received 9 December 2004; Revised 1 April 2005

We investigate two important aspects in sensor network design: the throughput and the energy efficiency. We consider the uplink reachback problem where the receiver is equipped with multiple antennas and linear multiuser detectors. We first assume Rayleigh flat-fading, and analyze two MAC schemes: round-robin and slotted-ALOHA. We optimize the average number of transmissions per slot and the transmission power for two purposes: maximizing the throughput, or minimizing the effective energy (defined as the average energy consumption per successfully received packet) subject to a throughput constraint. For each MAC scheme with a given linear detector, we derive the maximum asymptotic throughput as the signal-to-noise ratio goes to infinity. It is shown that the minimum effective energy grows rapidly as the throughput constraint approaches the maximum asymptotic throughput. By comparing the optimal performance of different MAC schemes equipped with different detectors, we draw important tradeoffs involved in the sensor network design. Finally, we show that multiuser scheduling greatly enhances system performance in a shadow fading environment.

Keywords and phrases: throughput, energy efficiency, multiuser diversity, scheduling, slotted-ALOHA, linear multiuser detector.

1. INTRODUCTION

Wireless sensor networks have become one of the burgeoning research fields in recent years, as they are envisioned to have wide applications in military, environmental, and many other fields [1]. Since sensors typically operate on batteries, replenishment of which is often difficult, a lot of work has been done to minimize the energy expenditure and prolong the sensor lifetime through energy-efficient designs across layers [2, 3, 4, 5, 6]. Meanwhile, the sensor network should be able to maintain a certain throughput (which is equivalent to a certain delay constraint) in order to fulfill the QoS requirement of the end user and to ensure the stability of the network. Typically, throughput and energy efficiency are conflicting goals, and there exists a tradeoff between the two measures. The objective of this work is to explore the maximum achievable throughput under certain network configurations and receiver structures, as well as optimal network designs that achieve the desired throughput with minimal energy consumption.

We consider the reachback problem where all sensor nodes in the sensor field transmit to a common receiver. The receiver has a replenishable power supply and possesses sophisticated data reception and processing capabilities. An alternative way of transmitting data, typically in a nonhierarchical sensor network, is multihop communication, whereby a packet is received and forwarded by intermediate nodes several times before reaching the destination. While multihop communication may lower the transmission energy by mitigating the exponential decay in the signal power as a function of the distance, this energy saving can hardly justify the extra energy spent on packet reception, processing, routing, and forwarding. Moreover, multihop communication also incurs more contention, interference, and delay, as indicated in [7, 8]. As exemplified by the sensor networks with mobile agents (SENMA) [9], employing a powerful receiver, such as a mobile agent, conserves the sensors' energy by freeing them from packet relaying, routing, and data processing routines, and good performance can be guaranteed even with minimal transmission power.

We assume that each node constantly has packets to transmit; the transmission is slotted and the slot length $T$ equals the transmission time of one packet. The sensors and the receiver constitute a multiple access network. Under the traditional collision channel model (i.e., a single transmission means success and simultaneous transmissions result in failure), the throughput of the multiple access network is limited: the maximum throughput per slot is 1 for time-division-multiple-access (TDMA), and is only $1/e$ for slotted-ALOHA with optimal decentralized control [10]. Such a throughput may not be sufficient for sensor network applications. Nevertheless, advanced signal processing techniques such as multiuser detection [11] enable correct reception of simultaneously transmitted packets at the physical layer, and consequently, Ghez et al. proposed the multipacket reception model [12], which revolutionized the underlying assumption of MAC layer design. In this paper, we assume that the receiver is equipped with $N$ antennas and a linear multiuser detector followed by single-user decoders. The packet transmission is considered successful as long as the output signal-to-interference ratio (SIR) of the linear detector is above a certain threshold $\beta$ [13]. The transmitting sensors and the receive antenna array thus form a virtual multiple-input-multiple-output (MIMO) system, which can also be viewed as a space-division-multiple-access (SDMA) system. Note that due to the analogy between the direct-sequence code-division-multiple-access (DS-CDMA) system and the MIMO system, the analysis in this paper can also be adapted to the DS-CDMA system with a single receive antenna and spreading gain $N$. But since the received power adds up across the antennas, the MIMO system requires only $1/N$ of the transmission power of the corresponding DS-CDMA system. A hybrid of CDMA and a multiple receive antenna system is also possible, in which case the performance is further enhanced by the effect of "resource-pooling" [14].
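For reference, the $1/e$ figure for slotted-ALOHA under the collision model follows from a one-line calculation: with $n$ nodes each transmitting independently with probability $p$, a slot carries a success only when exactly one node transmits, which happens with probability $np(1-p)^{n-1}$. The sketch below is a plain illustration with function names of our own choosing. It evaluates this expression at the maximizing value $p = 1/n$ and shows the throughput approaching $1/e$ as $n$ grows.

```python
import math


def collision_throughput(n, p):
    """Expected successes per slot when exactly one of n nodes must transmit."""
    return n * p * (1.0 - p) ** (n - 1)


def best_collision_throughput(n):
    """Throughput at the maximizing transmission probability p = 1/n."""
    return collision_throughput(n, 1.0 / n)


if __name__ == "__main__":
    for n in (2, 10, 100, 1000):
        print(n, best_collision_throughput(n))
    print("1/e =", 1.0 / math.e)
```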

A sensor field usually consists of hundreds or thousands of sensors, and the number of transmissions in each slot in the same frequency band is typically much smaller, to avoid excessive multiple access interference. Therefore, in addition to the SDMA that defines the channelization in each slot, another level of medium access control is necessary to determine which sensors should transmit during each slot, and the MAC scheme for this purpose can be either coordinated or random. For coordinated access, we consider round-robin, which is TDMA in essence: adjacent sensors form a transmission group and the groups are scheduled for access one by one. For random access, we consider the simplest form of slotted-ALOHA, known as delayed first transmission (DFT) [15]: in each slot every sensor node independently transmits a packet (new or retransmission) with the same probability $p$. We assume that the receiver transmits a beacon at the beginning of each slot for synchronization [9, 16]. Some overhead may be required for the sensor nodes to obtain delay estimates for synchronization purposes, after which they can adjust their timing when transmitting simultaneously. It is known that slotted-ALOHA is simple and is preferred when the traffic is bursty, but it suffers a certain performance degradation relative to centrally controlled networks, and we will investigate the exact performance loss in our system. In addition to different MAC schemes, the linear multiuser detector at the receiver can be the single-user matched filter, the decorrelating detector, or the linear MMSE detector. As we will see, both the MAC scheme and the receiver structure employed have a significant impact on the system performance. For a given MAC scheme with a given linear detector, we optimize the transmit power, as well as the transmission group size (for round-robin) or the transmission probability (for slotted-ALOHA). We study two optimization problems: one is to maximize the throughput, and the other is to minimize the energy consumption subject to a throughput constraint.

We then modify our assumption of pure Rayleigh fading by admitting shadow fading into our system model. Multiuser diversity can be realized in such a system by allowing the sensor group with the best shadowing coefficient to transmit during each slot, and is shown to have great significance in energy conservation for sensor networks. Fairness concerns of multiuser scheduling can be remedied by enabling the movement of the receiver to induce a dynamic shadowing environment, or by other known algorithms with little throughput sacrifice (see Section 6).

Most related papers on the performance and optimal resource allocation of multiple access networks are based on the collision model. The optimization of the transmission probability for the slotted-ALOHA scheme with or without uplink CSI is studied in [17, 18, 19]. Relatively few works in this direction have adopted the multipacket reception model [16, 20]. The design of the transmission probability of the slotted-ALOHA scheme by exploiting uplink CSI in a distributed fashion is studied in [16]. In [20], the authors analyze slotted-ALOHA sensor networks with multiple mobile agents, whose coverage areas can be optimally designed to maximize the throughput or to maximize the energy efficiency. The performance analysis of sensor networks using both CDMA and multiple receive antennas is presented in [21] based on the results on large random networks in [14]. The analysis in this paper does not rely on the large network approximation. Meanwhile, most studies on multiuser scheduling for uplink or downlink wireless networks have focused on maximizing the information-theoretic capacity [22, 23, 24, 25]. In [26], the authors present a scheduling algorithm which maximizes a certain performance value estimated by the user or calculated by the base station, such as a linear function of the SINR. In contrast, we study multiuser scheduling by assuming the MPR model due to suboptimal receivers, where the main performance measure is the throughput in terms of the average number of successful packets per slot.

The main contributions of this paper are as follows.

(1) We derive the throughput and the effective energy (average energy consumption for each successful packet) for a multiple access network employing round-robin and slotted-ALOHA in Rayleigh flat-fading.

(2) We optimize the transmission power and the average number of transmissions per slot to

(a) maximize the throughput: for each MAC scheme with a linear detector, we derive the maximum asymptotic throughput when the signal-to-noise ratio goes to infinity,

(b) minimize the effective energy subject to a throughput constraint: it is shown that the minimum effective energy grows rapidly as the throughput constraint approaches the maximum asymptotic throughput.

(3) By comparing the optimal performance of different MAC schemes equipped with different detectors, we draw important tradeoffs involved in the sensor network design.

(4) We show that multiuser scheduling can significantly enhance the system performance in a shadow fading environment.

The organization of the paper is as follows. In Section 2, we introduce the system model, some assumptions of our work, and the general measures of the throughput and the energy efficiency. In Section 3, we briefly describe the three linear detectors of interest and derive the analytical results to be used later. In Section 4, we first derive the energy efficiency and the throughput of the round-robin and slotted-ALOHA schemes, and then study the two optimization problems, throughput maximization and throughput-constrained energy minimization, respectively. Numerical results and discussions are presented in Section 5. Section 6 studies multiuser scheduling in the shadow fading environment. Section 7 contains the concluding remarks.

2. SYSTEM DESCRIPTION

We assume that there are $n$ sensors in total in the sensor field, the receiver is equipped with $N$ antennas, and the SIR threshold is $\beta$. The diameter of the sensor field is much smaller than the distance between the sensor field and the receiver, and there exists a rich-scattering environment between the sensor field and the receiver, for example when the sensors are deployed in a building or a forest. Therefore the channel states between each sensor and each receive antenna can be modeled as independent, identically distributed Rayleigh variables. We assume that sensors have no knowledge of uplink channel state information (CSI), and transmit with equal power $P$. If $m$ sensors simultaneously transmit, the $m$ sensors and $N$ receive antennas form a virtual MIMO system, and the discrete-time model is given by

\[
\mathbf{y} = \sqrt{G} \sum_{i=1}^{m} \mathbf{h}_i x_i + \mathbf{n},
\tag{1}
\]

where $x_i$ is the transmitted signal of the $i$th sensor and $E[\|x_i\|^2] = P$, $\mathbf{h}_i$ is the $N \times 1$ spatial signature of the $i$th sensor, whose entries are independent circularly-symmetric complex Gaussian variables with zero mean and unit variance, $G$ is the common pathloss, $\mathbf{n}$ is the noise vector with zero-mean circularly-symmetric complex Gaussian entries and covariance matrix $\sigma^2 \mathbf{I}$, and $\mathbf{y}$ is the received signal vector. The average received SNR of a packet at one receive antenna is given by $\rho = PG/\sigma^2$. In the following we denote the matrix $\mathbf{H} = [\mathbf{h}_1, \mathbf{h}_2, \ldots, \mathbf{h}_m]$.

We assume that a feedback channel exists from the receiver to the sensor nodes, which is used for synchronization, acknowledgements, group selection, and other signaling on the MAC layer. The bandwidth of the feedback channel is typically small, and thus the energy consumption for receiving the signaling is assumed to be negligible throughout the paper. For simplicity, we also ignore the circuit energy consumption; it can be incorporated, and the optimizations described in this paper can then be performed with minor modifications. Some measures of a sensor network's energy efficiency have been explored in the literature: in [5], the energy consumption per bit to achieve a desired bit error rate is evaluated, and in [20], the metric efficiency, defined as the average number of successes over the total number of transmissions, is studied for SENMA networks. The former metric does not assume a multipacket reception model, and the latter does not characterize the exact energy expenditure, as a transmission scheme with high efficiency is not necessarily energy efficient if the transmit power is not constrained. We combine the ideas in these two papers and measure the energy efficiency by the effective energy [21], defined as the average energy consumption per successfully transmitted packet:

\[
E_e = \frac{PT}{\Pr[\mathrm{succ}]},
\tag{2}
\]

where $\Pr[\mathrm{succ}]$ is the average probability of success for a transmitted packet. Note that the effective energy directly determines the number of packets a sensor can successfully transmit during its lifetime. The throughput, denoted by $\lambda$, is defined as the average number of successful transmissions per slot. Denoting by $a$ the average number of transmissions per slot, we have

\[
\Pr[\mathrm{succ}] = \frac{\lambda}{a}.
\tag{3}
\]

Throughout the paper we assume that the number of receive antennas $N$, the total number of sensors $n$, the SIR threshold $\beta$, the common pathloss $G$, as well as the noise variance $\sigma^2$ are fixed. When $G$ and $\sigma^2$ are fixed, the optimization of the transmission power $P$ is the same as the optimization of $\rho$.
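Combining (2) and (3) gives $E_e = PTa/\lambda$, so for a fixed power and slot length the effective energy is driven entirely by the ratio of attempted to successful transmissions. The following minimal sketch evaluates this relation. The function names and the numeric values are illustrative assumptions and do not come from the paper.

```python
def success_probability(throughput, avg_tx_per_slot):
    """Pr[succ] of (3): successful transmissions over attempted transmissions."""
    if not 0.0 < throughput <= avg_tx_per_slot:
        raise ValueError("need 0 < throughput <= average transmissions per slot")
    return throughput / avg_tx_per_slot


def effective_energy(power_w, slot_s, avg_tx_per_slot, throughput):
    """Effective energy E_e of (2), in joules per successfully sent packet."""
    return power_w * slot_s / success_probability(throughput, avg_tx_per_slot)


if __name__ == "__main__":
    # Hypothetical numbers: 10 mW transmit power, 1 ms slots,
    # 4 transmissions per slot on average, 2 of which succeed.
    print(effective_energy(0.01, 0.001, 4.0, 2.0))
```

In this example half of the transmissions fail, so each delivered packet costs twice the energy $PT$ of a single transmission.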

3. LINEAR MULTIUSER DETECTORS IN RAYLEIGH FADING CHANNELS

Assume that $m$ sensors simultaneously transmit and the SNR is $\rho$; then the outcome of the $i$th transmitted packet (success is denoted by 1 and failure by 0) is a random variable determined by the channel realization:

\[
o_i(\mathbf{H}) = I\left(\mathrm{SIR}_i \geq \beta \mid m, \rho, \mathbf{H}\right),
\tag{4}
\]

where $I(\cdot)$ denotes the indicator function. The expected value of the outcome averaged over all channel realizations is denoted by $q(m, \rho)$, which is the same for all $i$:

\[
q(m, \rho) = E_{\mathbf{H}}\left[o_i(\mathbf{H})\right] = \Pr\left[\mathrm{SIR}_i \geq \beta \mid m, \rho\right].
\tag{5}
\]

In an ergodic channel, the average number of successes when there are $m$ transmissions per slot and the SNR is $\rho$ is given by

\[
E_{\mathbf{H}}\left[\sum_{i=1}^{m} o_i(\mathbf{H})\right] = \sum_{i=1}^{m} E_{\mathbf{H}}\left[o_i(\mathbf{H})\right] = m\,q(m, \rho).
\tag{6}
\]

As we will see, the throughput and the effective energy for round-robin and slotted-ALOHA are functions of $q(m, \rho)$, which is determined by the physical channel and the linear detector used. In general $q(m, \rho)$ decreases with $m$ and increases with $\rho$. In this section, we briefly describe the three linear detectors of interest, and derive the expression of $q(m, \rho)$ in Rayleigh fading channels for each detector. Moreover, as we will frequently use the asymptotic value of $q(m, \rho)$ as $\rho \to \infty$ in later analysis, we also derive the expression of $q(m, \infty) \doteq \lim_{\rho \to \infty} q(m, \rho)$. The reader is referred to [11] for more details of these multiuser detectors.

3.1. Matched filter

The matched filter only requires knowledge of the spatial signature of the desired user, which suits the downlink but offers little advantage on the uplink, where the spatial signatures of all users are known. The SIR of the $i$th user after matched filtering is given by

\[
\mathrm{SIR}_i = \frac{PG\left\|\mathbf{h}_i\right\|^4}{\sigma^2\left\|\mathbf{h}_i\right\|^2 + PG\sum_{j=1,\, j\neq i}^{m}\left\|\mathbf{h}_i^{\dagger}\mathbf{h}_j\right\|^2},
\tag{7}
\]

where $\dagger$ denotes conjugate transpose.

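Lemma 1. The q(m, ρ) of the matched filter in the Rayleigh fading channel is given by

$$q_{\mathrm{mf}}(m, \rho) = \begin{cases} 1 - \Gamma\left(\dfrac{\beta}{\rho}, N\right), & m = 1, \\[2ex] \dfrac{1}{(m-2)!}\displaystyle\int_0^{\infty}\left[1 - \Gamma\left(\beta y + \dfrac{\beta}{\rho}, N\right)\right] y^{m-2} e^{-y}\,dy, & m > 1, \end{cases} \tag{8}$$

where Γ(a, x) is the regularized gamma function given by $\Gamma(a, x) = \int_0^x t^{a-1}e^{-t}\,dt \big/ \int_0^{\infty} t^{a-1}e^{-t}\,dt$.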

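In the case ρ → ∞,

$$q_{\mathrm{mf}}(m, \infty) = \begin{cases} 1, & m = 1, \\[1ex] 1 - I\left(\dfrac{\beta}{1+\beta};\, N,\, m-1\right), & m > 1, \end{cases} \tag{9}$$

where I(x; a, b) is the regularized beta function, given by $I(x; a, b) = \int_0^x t^{a-1}(1-t)^{b-1}\,dt \big/ \int_0^1 t^{a-1}(1-t)^{b-1}\,dt$.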

Proof. See Appendix A.
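As a numerical illustration of Lemma 1, the following minimal sketch evaluates (8) and (9) using only the Python standard library. It assumes that Γ(x, N) in (8) denotes the regularized lower incomplete gamma function with integer shape N evaluated at x, so that 1 − Γ(x, N) = e^{−x} Σ_{k<N} x^k/k!. The function names, the Simpson-rule integration, and the truncation point `y_max` are choices made for this example and are not part of the paper.

```python
import math

def gamma_q(N, x):
    """1 - Gamma(x, N): regularized upper incomplete gamma for integer N >= 1."""
    term, total = 1.0, 1.0
    for k in range(1, N):
        term *= x / k
        total += term
    return math.exp(-x) * total

def beta_i(x, a, b):
    """Regularized beta function I(x; a, b) for integer a, b >= 1."""
    n = a + b - 1
    return sum(math.comb(n, j) * x**j * (1.0 - x)**(n - j) for j in range(a, n + 1))

def q_mf(m, rho, N, beta, y_max=60.0, steps=6000):
    """Success probability of the matched filter, following (8)."""
    if m == 1:
        return gamma_q(N, beta / rho)

    def f(y):
        return gamma_q(N, beta * y + beta / rho) * y ** (m - 2) * math.exp(-y)

    # Composite Simpson rule on [0, y_max]; the tail beyond y_max is negligible.
    h = y_max / steps
    s = f(0.0) + f(y_max)
    for k in range(1, steps):
        s += (4 if k % 2 else 2) * f(k * h)
    return s * h / 3.0 / math.factorial(m - 2)

def q_mf_inf(m, N, beta):
    """Asymptotic success probability of the matched filter, following (9)."""
    if m == 1:
        return 1.0
    return 1.0 - beta_i(beta / (1.0 + beta), N, m - 1)
```

For a very large ρ, `q_mf(m, rho, N, beta)` should approach `q_mf_inf(m, N, beta)`, which gives a quick consistency check between (8) and (9).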

3.2. Decorrelating detector

The decorrelating detector is optimal according to three different criteria: least squares, near-far resistance, and maximum likelihood when the received amplitudes are unknown [11]. When the spatial signatures are independent, the decorrelator performs better than the matched filter except at low signal-to-noise ratios, and it converges to the linear MMSE detector at high signal-to-noise ratios. In general, the decorrelator admits simpler expressions because it decomposes a multiuser channel into parallel single-user Gaussian channels. If H†H is invertible, the SIR of the ith user using a decorrelating detector is given by

$$\mathrm{SIR}_i = \frac{\rho}{\left[\left(\mathbf{H}^{\dagger}\mathbf{H}\right)^{-1}\right]_{ii}}, \tag{10}$$

and when H†H is singular, SIRi is zero.

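Lemma 2. The q(m, ρ) of the decorrelator in the Rayleigh fading channel is given by (cf. (8) for the definition of the Γ(a, x) function)

$$q_{\mathrm{dec}}(m, \rho) = \begin{cases} 1 - \Gamma\left(\dfrac{\beta}{\rho},\, N - m + 1\right), & m \le N, \\[1ex] 0, & m > N. \end{cases} \tag{11}$$

When ρ → ∞,

$$q_{\mathrm{dec}}(m, \infty) = \begin{cases} 1, & m \le N, \\ 0, & m > N. \end{cases} \tag{12}$$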

Proof. See Appendix B.
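Lemma 2 is simple enough to evaluate directly. The sketch below is one possible standard-library implementation of (11) and (12); it assumes, as in the proof, that Γ(x, k) is the regularized lower incomplete gamma function with integer shape k evaluated at x. The function names are hypothetical.

```python
import math

def gamma_q(k, x):
    """1 - Gamma(x, k): regularized upper incomplete gamma for integer k >= 1."""
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= x / j
        total += term
    return math.exp(-x) * total

def q_dec(m, rho, N, beta):
    """Success probability of the decorrelator, following (11)."""
    if m > N:
        return 0.0  # H^H H is singular, so the SIR is zero
    return gamma_q(N - m + 1, beta / rho)

def q_dec_inf(m, N):
    """Asymptotic success probability of the decorrelator, following (12)."""
    return 1.0 if m <= N else 0.0
```

For example, `q_dec(3, 1.0, 4, 1.0)` evaluates 1 − Γ(1, 2) = 2e^{−1}, the success probability with N = 4 antennas, m = 3 simultaneous transmissions, and ρ = β = 1.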

3.3. Linear MMSE detector

The linear MMSE detector cancels the interference and noise in an optimal way, such that the mean squared error is minimized among linear detectors. It can be shown that the linear MMSE detector also maximizes the SIR [11]; hence it is optimal among linear detectors under the multiple packet reception model, where the success probability depends only on the SIR. For the linear MMSE receiver, it can be shown that the SIR of the ith user is given by

$$\mathrm{SIR}_i = \mathbf{h}_i^{\dagger}\left(\mathbf{H}_i\mathbf{H}_i^{\dagger} + \frac{1}{\rho}\mathbf{I}\right)^{-1}\mathbf{h}_i, \tag{13}$$

where H_i denotes the matrix obtained by striking out the ith column of H. There is no straightforward closed-form expression of q(m, ρ) for the linear MMSE detector in the Rayleigh fading channel. An approximation of q_mmse(m, ρ) can be obtained by using recent results on linear multiuser detectors in large random networks [27], where the SIR is shown to approach a Gaussian distribution as N approaches infinity, with α = m/N fixed. However, simulations show that such approximations are not accurate enough when N is relatively small, so in this paper we use exact success probabilities obtained through simulations for the linear MMSE detector. Nevertheless, when ρ → ∞, the success probability of the linear MMSE detector has a simple form, given by the following lemma.

Lemma 3. For Rayleigh fading channels (cf. (9) for the definition of the I(x; a, b) function),

$$q_{\mathrm{mmse}}(m, \infty) = \begin{cases} 1, & m \le N, \\[1ex] 1 - I\left(\dfrac{\beta}{1+\beta};\, N,\, m - N\right), & m > N. \end{cases} \tag{14}$$

Proof. See Appendix C.
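Since q_mmse(m, ρ) is obtained through simulation, a small Monte Carlo sketch may help make the procedure concrete. The code below is an illustration only, not the simulation used in the paper: it assumes i.i.d. CN(0, 1) channel entries, evaluates the SIR in (13) with a small hand-written complex linear solver (to stay within the standard library), and compares the high-SNR estimate with (14). All function names, the trial count, and the seed are assumptions of this example.

```python
import math
import random

def cgauss(rng):
    """One circularly symmetric complex Gaussian sample with unit variance."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[r]] for r, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / M[r][r]
    return x

def mmse_sir(h, others, rho):
    """SIR of (13): h^H (H_i H_i^H + I / rho)^{-1} h."""
    N = len(h)
    A = [[sum(g[r] * g[c].conjugate() for g in others) + (1.0 / rho if r == c else 0.0)
          for c in range(N)] for r in range(N)]
    x = solve(A, h)
    return sum(hr.conjugate() * xr for hr, xr in zip(h, x)).real

def q_mmse_mc(m, rho, N, beta, trials=4000, seed=1):
    """Monte Carlo estimate of q_mmse(m, rho) for user 1."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        cols = [[cgauss(rng) for _ in range(N)] for _ in range(m)]
        if mmse_sir(cols[0], cols[1:], rho) >= beta:
            hits += 1
    return hits / trials

def q_mmse_inf(m, N, beta):
    """Asymptotic success probability of (14), integer N and m."""
    if m <= N:
        return 1.0
    x, a, b = beta / (1.0 + beta), N, m - N
    n = a + b - 1
    return 1.0 - sum(math.comb(n, j) * x**j * (1.0 - x)**(n - j) for j in range(a, n + 1))
```

With N = 4, m = 6, and β = 1, (14) gives 1 − I(1/2; 4, 2) = 0.8125, and `q_mmse_mc(6, 1e6, 4, 1.0)` should land close to that value up to Monte Carlo error.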

4. THROUGHPUT AND ENERGY OPTIMIZATIONS

In this section, we first derive the general expressions of the throughput and the effective energy for the round-robin and slotted-ALOHA schemes, and then study the two optimization problems, throughput maximization and throughput-constrained energy minimization, for both MAC schemes.

4.1. Throughput and effective energy of round-robin and slotted-ALOHA

4.1.1. Round-robin

Round-robin is a fair scheduling scheme and is relatively easy to implement: m sensors in close proximity form a group. For simplicity we assume that n is a multiple of m, so there are K = n/m groups in total. Groups are scheduled for access one by one, and when a group is scheduled in a slot, all the sensors in that group transmit simultaneously. It follows from (6) at the beginning of Section 3 that, in an ergodic fading channel, the throughput of round-robin is

$$\lambda_{rr}(m, \rho) = m\,q(m, \rho). \tag{15}$$

With P = ρσ²/G, the effective energy of round-robin is given by

$$E_{e,rr}(m, \rho) = \frac{\rho\sigma^2 T/G}{q(m, \rho)}. \tag{16}$$

4.1.2. Slotted-ALOHA

Employing the decorrelating detector or the linear MMSE detector in a slotted-ALOHA system requires that the receiver know the number and the channels of the transmitting nodes. For example, the sensors can signal their intention to transmit in a short reservation period at the beginning of each slot. We consider the type of slotted-ALOHA where the transmission probability for all packets (new or retransmissions) is the same. Denoting the transmission probability of each user by p, the throughput of slotted-ALOHA is given by

$$\lambda_{sa} = \sum_{k=1}^{n}\binom{n}{k} p^k (1-p)^{n-k}\, k\, q(k, \rho). \tag{17}$$

The average number of transmissions per slot is a = np. When n is large and p is small, we can approximate the binomial probabilities with Poisson probabilities and obtain

$$\lambda_{sa}(a, \rho) = e^{-a}\sum_{k=1}^{n}\frac{a^k}{k!}\, k\, q(k, \rho) = e^{-a}\sum_{k=1}^{n}\frac{a^k}{(k-1)!}\, q(k, \rho). \tag{18}$$

The average success probability is Pr[succ] = λ_sa(a, ρ)/a; thus the effective energy is given by

$$E_{e,sa}(a, \rho) = \frac{\rho\sigma^2 T/G}{\lambda_{sa}(a, \rho)/a}. \tag{19}$$

The receiver can simply inform the sensors of the transmission probability, or the sensors can compute the optimum transmission probability if they know n. Slotted-ALOHA also has built-in fairness, since the transmission probability is independent of the channel states of individual sensors.
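To see how close the Poisson approximation (18) is to the binomial expression (17), the following sketch evaluates both for a user-supplied success probability function `q(k)`. It is a minimal standard-library illustration; the function names are hypothetical, and the decorrelator success probability of (11) is used only as an example choice of q.

```python
import math

def gamma_q(k, x):
    """1 - Gamma(x, k): regularized upper incomplete gamma for integer k >= 1."""
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= x / j
        total += term
    return math.exp(-x) * total

def q_dec(m, rho, N, beta):
    """Decorrelator success probability, following (11)."""
    return gamma_q(N - m + 1, beta / rho) if m <= N else 0.0

def aloha_throughput_binomial(n, p, q):
    """Throughput of slotted-ALOHA, following (17)."""
    pmf = (1.0 - p) ** n  # Pr[no transmission]; updated recursively below
    total = 0.0
    for k in range(1, n + 1):
        pmf *= (n - k + 1) / k * p / (1.0 - p)
        total += pmf * k * q(k)
    return total

def aloha_throughput_poisson(n, a, q):
    """Poisson approximation of the throughput, following (18)."""
    term = math.exp(-a)  # becomes e^{-a} a^k / (k-1)! inside the loop
    total = 0.0
    for k in range(1, n + 1):
        term *= a / (k - 1) if k > 1 else a
        total += term * q(k)
    return total

# Example: n = 1000 sensors, a = 5 transmissions per slot on average,
# N = 10 antennas, beta = 1, rho = 1 (0 dB).
q = lambda k: q_dec(k, 1.0, 10, 1.0)
lam_binomial = aloha_throughput_binomial(1000, 5.0 / 1000, q)
lam_poisson = aloha_throughput_poisson(1000, 5.0, q)
```

With n = 1000 and p = 0.005 the two values should agree closely, which is the regime in which (18) is intended to be used.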

4.2. Throughput maximization

As we have shown, the throughput depends on both the MAC scheme and the type of linear detector used. For a given MAC scheme with a given linear detector, the throughput is a function of the SNR ρ and the average number of transmissions per slot a (for round-robin, a = m, and for slotted-ALOHA, a = np). These parameters can be chosen judiciously such that the throughput is maximized. The performance of various MAC schemes with different linear detectors can then be compared in terms of the maximum throughput. In the following we focus on the joint optimization of a and ρ; the optimization of a single parameter is straightforward and is therefore omitted.

First assume that a is fixed. Since Pr[succ] increases with ρ, the maximum throughput for any fixed a is achieved when ρ → ∞. Therefore the maximum throughput jointly optimized over a and ρ is obtained by letting ρ → ∞ and searching for the optimal a that achieves the global maximum. In practical systems, the sensors' power amplifier has a maximum output limit [5], which in turn imposes an upper limit on ρ, denoted by ρ_max. Then the maximum throughput is achieved at ρ_max, and the problem again reduces to a single-parameter optimization problem. Nevertheless, the maximum throughput with no power constraint (ρ → ∞) is of special interest, as it represents the upper bound on the throughput that can be achieved by a MAC scheme with a given type of linear detector. In the following we discuss this case in detail.


For a given MAC scheme with a given linear detector, we define the maximum asymptotic throughput as the maximum throughput achievable with a given number of receive antennas as the SNR ρ approaches infinity, and denote it by $\Lambda(\infty) \doteq \max_a \lambda(a, \infty)$. The maximum asymptotic throughput plays an important role in the throughput-constrained energy minimization to be discussed in Section 4.3, in the sense that any throughput constraint larger than Λ(∞) cannot be attained. With a general linear detector, we have the following proposition.

Proposition 1. The maximum asymptotic throughputs of round-robin and slotted-ALOHA are, respectively, given by

$$\Lambda_{rr}(\infty) = \max_{m}\; m\,q(m, \infty); \qquad \Lambda_{sa}(\infty) = \max_{a}\; e^{-a}\sum_{k=1}^{n}\frac{a^k}{(k-1)!}\, q(k, \infty). \tag{20}$$

The above expressions can be evaluated for different detectors using (9), (12), and (14).

Remark 1. With the decorrelating detector, the maximum asymptotic throughputs of the two MAC schemes are, respectively, given by

$$\Lambda^{\mathrm{dec}}_{rr}(\infty) = N \ \text{with}\ m = N; \qquad \Lambda^{\mathrm{dec}}_{sa}(\infty) = \max_{a}\; e^{-a}\sum_{k=1}^{N}\frac{a^k}{(k-1)!}. \tag{21}$$

The above are direct consequences of applying (12). Note that with the decorrelator, the maximum asymptotic throughput of slotted-ALOHA can be much smaller than that of round-robin. For example, when N = 10, the maximum asymptotic throughput of slotted-ALOHA with the decorrelator is 5.831, which is achieved at a = 7.297.
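The numbers quoted for N = 10 can be reproduced with a short grid search over a in the slotted-ALOHA expression of (21). The sketch below is one simple way to do so; the grid range and resolution are arbitrary choices for this example.

```python
import math

def aloha_dec_asymptotic(a, N):
    """e^{-a} * sum_{k=1}^{N} a^k / (k-1)!, the objective in (21)."""
    return math.exp(-a) * sum(a**k / math.factorial(k - 1) for k in range(1, N + 1))

# Grid search over a in (0, 20] with step 0.001 (resolution chosen for illustration).
grid = [i / 1000.0 for i in range(1, 20001)]
a_star = max(grid, key=lambda a: aloha_dec_asymptotic(a, 10))
lam_star = aloha_dec_asymptotic(a_star, 10)
```

For N = 10 the search returns a maximizer near a = 7.297 and a maximum near 5.831, in line with the values stated in Remark 1, and well below the round-robin value of N = 10.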

Remark 2. While no straightforward closed-form expressions for the maximum asymptotic throughput are available for the matched filter and the linear MMSE detector, some qualitative results are possible. For round-robin, comparing (9), (12), and (14) reveals the following.

(1) $\Lambda^{\mathrm{mmse}}_{rr}(\infty) \ge \Lambda^{\mathrm{mf}}_{rr}(\infty)$, with equality when N = 1.

(2) $\Lambda^{\mathrm{mmse}}_{rr}(\infty) \ge \Lambda^{\mathrm{dec}}_{rr}(\infty)$; equality holds if and only if the throughput of the linear MMSE detector with m = N + 1 does not exceed that with m = N, that is,

$$(N+1)\left[1 - I\left(\frac{\beta}{1+\beta};\, N,\, 1\right)\right] \le N, \tag{22}$$

which yields

$$\beta \ge \frac{1}{(N+1)^{1/N} - 1}. \tag{23}$$

In other words, the linear MMSE detector can support a throughput larger than the number of receive antennas N (and surpass the decorrelator) if and only if $\beta < 1/((N+1)^{1/N} - 1)$. Note that the right-hand side of this inequality is a strictly increasing function of N, going from 1 to +∞.

(3) The relative performance of the decorrelator and the matched filter depends on β. It can be shown that when β ≥ 1, $\Lambda^{\mathrm{dec}}_{rr}(\infty) \ge \Lambda^{\mathrm{mf}}_{rr}(\infty)$.
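The step from (22) to (23) in item (2) can be filled in as follows. This is a short worked derivation added for convenience; it relies only on the definition of I(x; a, b) given with (9), which for b = 1 reduces to I(x; N, 1) = x^N.

```latex
\begin{align*}
(N+1)\left[1 - \left(\frac{\beta}{1+\beta}\right)^{N}\right] \le N
  &\iff \left(\frac{\beta}{1+\beta}\right)^{N} \ge \frac{1}{N+1} \\
  &\iff \frac{1+\beta}{\beta} \le (N+1)^{1/N} \\
  &\iff \frac{1}{\beta} \le (N+1)^{1/N} - 1 \\
  &\iff \beta \ge \frac{1}{(N+1)^{1/N} - 1}.
\end{align*}
```

For N = 1 the right-hand side equals 1/(2 − 1) = 1, which matches the lower end of the range stated in item (2).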

Remark 3. As for slotted-ALOHA, since q_mmse(m, ∞) ≥ max{q_mf(m, ∞), q_dec(m, ∞)} for all m, the maximum asymptotic throughput with the linear MMSE detector is always the best, while it is not immediate whether the matched filter or the decorrelator is the worst.

4.3. Throughput-constrained energy minimization

In this section we study the optimization to achieve the greatest energy efficiency, that is, to minimize the effective energy. In particular, we study the minimization of the effective energy subject to a throughput constraint λ ≥ ∆. There are two reasons for doing this. First, it is only fair to compare the energy efficiency of different MAC schemes if they achieve the same throughput. Second and more importantly, in a practical sensor network there is usually a minimum throughput constraint, which may arise from a QoS demand from the end user, or from a mild delay constraint to ensure the stability of the network. As discussed in Section 4.2, the maximum asymptotic throughput is the upper limit on the throughput supportable by each MAC scheme with a given linear detector, so the given throughput constraint must not exceed this limit; otherwise it cannot be met. Comparing (16) and (19), we observe that σ²T/G is a common factor and is fixed. Therefore, to minimize E_e it suffices to find

$$\min_{a,\,\rho}\; \frac{a\rho}{\lambda(a, \rho)} \tag{24}$$

subject to

$$\lambda(a, \rho) \ge \Delta. \tag{25}$$

In the following we briefly describe both single-parameter optimization and joint optimization.

4.3.1. Fixed ρ

For a fixed ρ, the throughput constraint ∆ can be met if and only if $\Lambda(\rho) \doteq \max_a \lambda(a, \rho)$, the maximum throughput given ρ, satisfies Λ(ρ) ≥ ∆. When ρ is fixed, for each MAC scheme, the values of a that satisfy λ(a, ρ) ≥ ∆ form a closed interval (of reals or integers). Since Pr[succ] decreases with a, the effective energy is minimized by the minimum a with which the throughput constraint is satisfied, that is,

$$a_{\mathrm{opt}}(\rho) = \min\{a \mid \lambda(a, \rho) \ge \Delta\}. \tag{26}$$


Figure 1: Maximum asymptotic throughput of round-robin with different linear detectors. (Plot omitted; horizontal axis: number of receive antennas N; vertical axis: Λ(∞); curves: MF, decorrelator, and LMMSE, each for β = 1 and β = 3.)

4.3.2. Fixed a

When a is fixed, the throughput constraint ∆ can be met if and only if $\lambda(a, \infty) \doteq \lim_{\rho\to\infty}\lambda(a, \rho)$, the maximum throughput given a, satisfies λ(a, ∞) ≥ ∆. Since the throughput is a monotonically increasing function of ρ, we can find the smallest ρ that meets the throughput constraint, which is denoted by ρ_min(a) = min{ρ | λ(a, ρ) ≥ ∆}. Thus the minimum effective energy for fixed a is given by

$$E_{e,\min}(a) = \min_{\rho \ge \rho_{\min}(a)} \frac{a\rho}{\lambda(a, \rho)}. \tag{27}$$

4.3.3. Joint optimization

If we can jointly optimize a and ρ and there is no power constraint, the throughput constraint ∆ can be met as long as the maximum asymptotic throughput satisfies Λ(∞) ≥ ∆. The joint optimization can proceed in two steps: first, find the minimum effective energy when a is fixed, as described above; then find the global minimum across all a. This is characterized by the following proposition.

Proposition 2. For a given throughput constraint ∆, if ∆ ≤ Λ(∞), the minimum effective energy jointly optimized over a and ρ is given by

$$E_{e,\min} = \min_{a} E_{e,\min}(a) = \min_{a}\; \min_{\rho \ge \rho_{\min}(a)} \frac{a\rho}{\lambda(a, \rho)}, \tag{28}$$

while if ∆ > Λ(∞), the throughput constraint cannot be met.
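The two-step procedure behind Proposition 2 can be sketched for round-robin with the decorrelator, where λ(m, ρ) = m q_dec(m, ρ) and the objective aρ/λ reduces to ρ/q_dec(m, ρ). The code below is a brute-force illustration over a logarithmic grid of ρ values, not an optimized solver; the grid, the function names, and the scaling σ²T/G = 1 are assumptions of this example.

```python
import math

def gamma_q(k, x):
    """1 - Gamma(x, k): regularized upper incomplete gamma for integer k >= 1."""
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= x / j
        total += term
    return math.exp(-x) * total

def q_dec(m, rho, N, beta):
    """Decorrelator success probability, following (11)."""
    return gamma_q(N - m + 1, beta / rho) if m <= N else 0.0

def min_energy_rr(delta, N, beta, rhos):
    """Return (energy, m, rho) minimizing m*rho/lambda subject to lambda >= delta."""
    best = None
    for m in range(1, N + 1):          # outer step: search over the group size
        for rho in rhos:               # inner step: search over the SNR
            q = q_dec(m, rho, N, beta)
            if m * q >= delta:         # throughput constraint (25)
                energy = rho / q       # objective (24) with a = m, lambda = m*q
                if best is None or energy < best[0]:
                    best = (energy, m, rho)
    return best

# Logarithmic SNR grid from 0.01 to 1000 (resolution chosen for illustration).
rhos = [10.0 ** (-2.0 + 5.0 * i / 5000) for i in range(5001)]
energy, m_opt, rho_opt = min_energy_rr(5.0, 10, 1.0, rhos)
```

For ∆ = 5, N = 10, and β = 1, this search selects m = 6, the group size reported for round-robin in Example 4.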

5. NUMERICAL RESULTS AND DISCUSSIONS

In this section we present the numerical results and draw some observations on the comparative performance of different MAC schemes, as well as on the comparative performance of different linear detectors.

Figure 2: Maximum asymptotic throughput of round-robin and slotted-ALOHA with different linear detectors, β = 1. (Plot omitted; horizontal axis: number of receive antennas N; vertical axis: maximum asymptotic throughput Λ(∞); curves: round-robin and slotted-ALOHA, each with MF, DEC, and LMMSE.)

5.1. Maximum throughput

Example 1 (comparison of detectors; joint optimization). In Figure 1 we plot the maximum asymptotic throughput (result of joint optimization) of round-robin with three linear detectors when β = 1 and β = 3. Note that the two curves for the decorrelator coincide. When β = 1, the maximum asymptotic throughput of the linear MMSE detector exceeds that of the decorrelator (which is N) for all values of N except N = 1, since $1/((N+1)^{1/N} - 1) > 1$ for all N > 1. When β = 3, the maximum asymptotic throughput of the linear MMSE detector exceeds that of the decorrelator when N ≥ 8, for which $1/((N+1)^{1/N} - 1) > 3$. As β gets larger, a larger N is required for the linear MMSE detector to surpass the decorrelator in terms of the maximum asymptotic throughput.
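The thresholds quoted in Example 1 follow directly from (23). The short sketch below evaluates the right-hand side of (23) and finds the smallest N for which it exceeds a given β; the function names and the search bound `n_max` are choices made for this example.

```python
def beta_threshold(N):
    """Right-hand side of (23): 1 / ((N + 1)^(1/N) - 1)."""
    return 1.0 / ((N + 1) ** (1.0 / N) - 1.0)

def min_antennas(beta, n_max=1000):
    """Smallest N for which beta < beta_threshold(N), or None if not found."""
    for N in range(1, n_max + 1):
        if beta_threshold(N) > beta:
            return N
    return None
```

Here `min_antennas(1.0)` returns 2 and `min_antennas(3.0)` returns 8, consistent with the observations that the linear MMSE detector surpasses the decorrelator for all N > 1 when β = 1 and for N ≥ 8 when β = 3.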

Example 2 (comparison of MAC schemes and detectors; joint optimization). Figure 2 shows the maximum asymptotic throughput of round-robin and slotted-ALOHA with three linear detectors when β = 1. Note that the relative performance loss of slotted-ALOHA with respect to round-robin is much larger with the decorrelator than with the matched filter and the linear MMSE detector. When N is small, the matched filter outperforms the decorrelator for slotted-ALOHA, and when N is large, the opposite is true. For both MAC schemes the linear MMSE detector is clearly superior and, when β = 1, achieves a maximum asymptotic throughput greater than N.


Figure 3: Minimum effective energy with throughput constraint for different MAC schemes with the decorrelator (fixed m or a), ∆ = 5, N = 10, β = 1, σ²T/G = 1. (Plot omitted; horizontal axis: average number of transmissions in each slot (m or a); vertical axis: minimum effective energy; curves: round-robin and slotted-ALOHA.)

5.2. Minimum effective energy with throughput constraint

In the following we present the results of the throughput-constrained energy minimization described in Section 4.3. We show the results of optimization with fixed a and of joint optimization. For all simulations in this section we use the following values: N = 10, β = 1, and σ²T/G = 1 (scaling factor of E_e).

Example 3 (comparison of MAC schemes; fixed a). Assume that the decorrelator is used. Figure 3 plots the minimum effective energy of the two MAC schemes under the throughput constraint ∆ = 5 when a is fixed. Note that the throughput constraint implies that m ≥ 5 for round-robin, and 5.21 ≤ a ≤ 9.43 for slotted-ALOHA. We observe that, except for m = 5 (where the minimum effective energy of round-robin goes to infinity and is not shown in the figure), round-robin is much more energy-efficient than slotted-ALOHA for the same value of a.

Example 4 (comparison of MAC schemes; joint optimization). Assume that the decorrelator is used. When ∆ = 5, Figure 3 reveals that the minimum effective energy is achieved at m = 6 for round-robin, and at about a = 6.2 for slotted-ALOHA. The minimum effective energy corresponding to different throughput constraints, obtained by jointly optimizing a and ρ, is shown in Figure 4, and the corresponding optimal a is shown in Figure 5. Note that the largest throughput achievable by round-robin is Λ(∞) = N = 10, and that of slotted-ALOHA is 5.831. The minimum effective energy curve for round-robin is not smooth at values of m where a jump in the optimal group size m occurs. For slotted-ALOHA, the optimal a is a smooth function of ∆, and so is the minimum effective energy. It can be seen from Figure 4 that the minimum effective energy increases rapidly as ∆ approaches the maximum asymptotic throughput for each MAC scheme: the minimum effective energy approaches infinity for slotted-ALOHA and round-robin, respectively, as ∆ → 5.831 and as ∆ → 10. When ∆ is relatively small (e.g., ∆ ≤ 3), slotted-ALOHA does not incur much more energy expenditure than round-robin. As ∆ increases, the energy saving of round-robin relative to slotted-ALOHA grows, and round-robin can support a throughput that cannot be achieved by slotted-ALOHA.

Figure 4: Minimum effective energy with throughput constraint for different MAC schemes with the decorrelator (joint optimization), N = 10, β = 1, σ²T/G = 1. (Plot omitted; horizontal axis: throughput constraint ∆; vertical axis: minimum effective energy.)

Example 5 (comparison of linear detectors; joint optimization). The throughput-constrained minimum effective energy for round-robin with various linear detectors is shown in Figure 6. When N = 10, the maximum asymptotic throughputs of round-robin with the matched filter, the decorrelator, and the linear MMSE detector are about 6.4, 10, and 13.8, respectively (cf. Figure 1). Again, it can be seen that the minimum effective energy approaches infinity as the throughput constraint approaches the maximum asymptotic throughput. When ∆ is small, we can use any one of the three detectors, but with different energy expenditures. As ∆ gets larger, we are left with fewer choices of detector. The linear MMSE detector is favorable in all scenarios.

6. MULTIUSER SCHEDULING UNDER SHADOW FADING

With the assumption of independent fading across space, multiuser diversity can be exploited in a multiuser environment to achieve a scheduling gain for delay-tolerant applications. It has been shown that in a single-antenna system, the information capacity is maximized by the so-called “opportunistic transmission,” that is, allowing only the user with the best channel to transmit in every slot [28]. Multiuser scheduling for systems with spatial diversity has been studied in [22, 23, 24, 25], and all these works aim to maximize the information capacity. Consider a setup similar to round-robin, that is, sensors form groups of size m. The optimal scheduler under the multipacket reception model should maximize the throughput in terms of the number of successfully received packets in each slot. That is, in each slot, the optimal scheduler selects the group with the highest number of sensors that meet the SIR threshold. Although such an optimal scheduler is theoretically appealing, its realization requires the receiver to know the spatial signatures of all sensors at the beginning of each slot, which is infeasible when the number of sensors is large.

Figure 5: Optimal m or a for minimum effective energy with throughput constraint for different MAC schemes with the decorrelator (joint optimization), N = 10, β = 1, σ²T/G = 1. (Plot omitted; vertical axis: optimum m or a.)

Another known problem with multiuser scheduling for the system described in Section 2 is that, under pure Rayleigh fading, the relative scheduling gain vanishes as m and N increase (indicating a tradeoff between multiple antennas and multiuser diversity) [25]. Shadow fading generally increases the variability of individual link quality, which leads to a larger outage probability and is unfavorable to real-time applications. It can, however, enhance the scheduling gain in a multiuser environment for delay-tolerant applications [25]. By slightly modifying our system model, we can investigate the multiuser scheduling gain that is realizable under shadow fading.

We assume that the sensors in each group are adjacent to each other such that they experience the same shadow fading, while sensors in different groups experience independent identically distributed shadow fading. In each slot, the scheduler selects the group with the highest shadowing coefficient. Although this scheduler is not optimal in terms of throughput, it requires only about 1/Nm of the channel knowledge needed by the optimal scheduler. Ideally, the receiver is a mobile agent which moves at the end of each slot to induce a dynamic environment such that all groups have similar chances to enjoy the best channel in the long run. Fairness can be further guaranteed by employing other methods, such as those in [29, 30].

Figure 6: Minimum effective energy with throughput constraint for round-robin with different linear detectors (joint optimization), N = 10, β = 1, σ²T/G = 1. (Plot omitted; vertical axis: minimum effective energy; curves: MF, decorrelator, and LMMSE.)

Denote the channel gain of the kth (k = 1, . . . , K) group by G_k; then for the kth group the system model in (1) is modified as

$$\mathbf{y} = \sqrt{G_k}\sum_{i=1}^{m}\mathbf{h}_i x_i + \mathbf{n}. \tag{29}$$

G_k is modeled as log-normally distributed, with area mean E[G_k] = G = G_L (dB) and decibel spread σ_L (dB). The average SNR is given by ρ = PG/σ². Writing $G_k = e^{z_k}$, $z_k \sim \mathcal{N}(\kappa G_L, (\kappa\sigma_L)^2)$ is a Gaussian variable, where κ = ln 10/10.

Lemma 4 (see [31]). If Z_1, . . . , Z_K are i.i.d. Gaussian with mean µ and variance σ², as K → ∞,

$$\max_{1 \le k \le K} Z_k \longrightarrow \mu + \sigma\sqrt{2\ln K}. \tag{30}$$

Applying the lemma to z_k as defined above, we have $\max z_k \to \kappa G_L + \kappa\sigma_L\sqrt{2\ln K}$ (dB), or $\max G_k \to G \cdot e^{\kappa\sigma_L\sqrt{2\ln K}}$. Denoting the individual SNR by $\rho_k = PG_k/\sigma^2$, we then have $\max\rho_k \to (PG/\sigma^2)\cdot e^{\kappa\sigma_L\sqrt{2\ln K}} \doteq \xi\rho$, where $\xi = e^{\kappa\sigma_L\sqrt{2\ln K}}$ roughly characterizes the scheduling gain in terms of the improvement of SNR.

Figure 7: Throughput comparison: multiuser scheduling versus round-robin (with the decorrelator), N = 10, n = 1000, σ_L = 8 dB, β = 1. (Plot omitted; horizontal axis: number of transmitting sensors m; vertical axis: throughput; curves: scheduling with shadowing, round-robin with shadowing, and round-robin without shadowing, each for ρ = 0 dB and ρ = −10 dB.)

The throughput and the effective energy of the scheduling algorithm converge, respectively, to

$$\lambda_{\mathrm{sch}}(m, \rho) = m\,q(m, \xi\rho), \tag{31}$$

$$E_{e,\mathrm{sch}}(m, \rho) = \frac{\rho\sigma^2 T/G}{q(m, \xi\rho)}. \tag{32}$$
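To get a feel for the size of the scheduling gain, note that ξ = exp(κσ_L√(2 ln K)) with κ = ln 10/10 corresponds to 10 log₁₀ ξ = σ_L√(2 ln K) dB. The following small sketch evaluates both forms; the function names are hypothetical, and the example values (n = 1000 sensors in groups of m = 10, hence K = 100, with σ_L = 8 dB) are taken from the setting of Figure 7.

```python
import math

def scheduling_gain(sigma_l_db, K):
    """xi = exp(kappa * sigma_L * sqrt(2 ln K)), with kappa = ln(10) / 10."""
    kappa = math.log(10.0) / 10.0
    return math.exp(kappa * sigma_l_db * math.sqrt(2.0 * math.log(K)))

def scheduling_gain_db(sigma_l_db, K):
    """The same gain expressed in dB: sigma_L * sqrt(2 ln K)."""
    return sigma_l_db * math.sqrt(2.0 * math.log(K))

K = 1000 // 10                      # number of groups for n = 1000, m = 10
xi = scheduling_gain(8.0, K)
xi_db = scheduling_gain_db(8.0, K)  # about 24.3 dB
```

Under this asymptotic approximation, the best of K = 100 groups sees an SNR roughly 24 dB above the average, which suggests why q(m, ξρ) can be close to 1 even when ρ itself is low.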

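In comparison, the throughput and effective energy of the same system using the round-robin approach are given by

$$\begin{aligned} \lambda_{rr}(m, \rho) &= E_{\rho_k}\left[m\,q(m, \rho_k)\right] \\ &= m\int_{-\infty}^{+\infty} q(m, e^{z})\, \frac{e^{-(\ln z - \ln\rho)^2/2(\kappa\sigma_L)^2}}{z\sqrt{2\pi}\,\kappa\sigma_L}\, dz \doteq m\,q(m, \rho), \\ E_{e,rr}(m, \rho) &= \frac{\rho\sigma^2 T/G}{q(m, \rho)}. \end{aligned} \tag{33}$$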

The throughputs of multiuser scheduling and round-robin in shadow fading, both with the decorrelator, are depicted in Figure 7, where N = 10, n = 1000, σ_L = 8 dB, β = 1, and two SNR values, −10 dB and 0 dB, are shown. The throughputs of round-robin without shadowing (i.e., pure Rayleigh fading) are also plotted for comparison. We observe that even for round-robin, shadowing is beneficial when the SNR is low, while the opposite is true when the SNR is high: shadowing degrades the throughput. This can be readily explained by Jensen's inequality and the properties of the q(m, ρ) function: for all three detectors, it can be shown that q(m, ρ) is convex in the low-SNR range and concave in the high-SNR range, and approaches q(m, ∞) as ρ → ∞ (see (9), (12), and (14)). Meanwhile, the throughput of multiuser scheduling is almost independent of the SNR, and is roughly equal to the number of transmissions. This means that regardless of the average SNR, the group with the best channel has an effective SNR at which the success probability is 1. This demonstrates that multiuser scheduling is most useful when the SNR is low, which is of particular significance for sensor networks.
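The Jensen-type argument can be checked numerically for the decorrelator. The sketch below averages m q_dec(m, ρ_k) over log-normal shadowing and compares the result with the unshadowed throughput m q_dec(m, ρ). It assumes that ln ρ_k is Gaussian with mean ln ρ and standard deviation κσ_L, which is one reading of the density appearing in (33); the trapezoidal integration, its truncation at ±8 standard deviations, and the function names are choices made for this example.

```python
import math

def gamma_q(k, x):
    """1 - Gamma(x, k): regularized upper incomplete gamma for integer k >= 1."""
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= x / j
        total += term
    return math.exp(-x) * total

def q_dec(m, rho, N, beta):
    """Decorrelator success probability, following (11)."""
    return gamma_q(N - m + 1, beta / rho) if m <= N else 0.0

def rr_throughput_shadowed(m, rho, N, beta, sigma_l_db, steps=4000):
    """m * E[q_dec(m, rho * e^w)] with w ~ N(0, (kappa * sigma_L)^2) (assumed model)."""
    s = math.log(10.0) / 10.0 * sigma_l_db
    h = 16.0 * s / steps
    total = 0.0
    for k in range(steps + 1):
        w = -8.0 * s + k * h
        weight = 0.5 if k in (0, steps) else 1.0   # trapezoidal end-point weights
        pdf = math.exp(-w * w / (2.0 * s * s)) / (s * math.sqrt(2.0 * math.pi))
        total += weight * q_dec(m, rho * math.exp(w), N, beta) * pdf
    return m * total * h

# N = 10 antennas, m = 5 sensors per group, beta = 1, sigma_L = 8 dB.
low_shadowed = rr_throughput_shadowed(5, 0.1, 10, 1.0, 8.0)    # rho = -10 dB
low_plain = 5 * q_dec(5, 0.1, 10, 1.0)
high_shadowed = rr_throughput_shadowed(5, 1.0, 10, 1.0, 8.0)   # rho = 0 dB
high_plain = 5 * q_dec(5, 1.0, 10, 1.0)
```

Under these assumptions, shadowing raises the round-robin throughput at −10 dB and lowers it at 0 dB, which is the qualitative behavior described for Figure 7.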

It is not difficult to show from (31) that the multiuser scheduling algorithm has the same maximum asymptotic throughput as round-robin. However, the fact that q(m, ξρ) can be made virtually 1 for a modest ρ when the number of sensors is large implies that there is no loss in the energy consumption, and that the minimum effective energy remains low for all throughput constraints ∆ < Λ(∞).

7. CONCLUSIONS

In this paper we have presented a detailed investigation of two important aspects of sensor network design, the throughput and the energy efficiency, which are typically two conflicting measures. We have considered the uplink reachback problem with simultaneous transmissions and multiple receive antennas. Simultaneous transmissions are favored for their dramatically increased throughput and are supported by the advanced signal processing exploited in the physical layer. We consider both coordinated and random medium access control schemes, represented, respectively, by round-robin and slotted-ALOHA. We measure the energy efficiency with the effective energy, defined as the average energy consumption for each successfully transmitted packet. We optimize the average number of transmissions per slot a and the transmission power per sensor node to meet two objectives: throughput maximization and throughput-constrained effective energy minimization. There are interesting connections between these two optimization problems. In particular, the maximum asymptotic throughput as the SNR goes to infinity defines the upper limit on the throughput constraint that can be achieved.

Under the assumption of a Rayleigh flat-fading channel, we show that slotted-ALOHA suffers the greatest performance loss when paired with the decorrelator. While slotted-ALOHA has a minimum effective energy similar to that of round-robin for small throughput constraints, it soon becomes energy-inefficient as the throughput constraint increases. For both MAC schemes, the linear MMSE detector significantly outperforms the decorrelator and the matched filter in both the throughput and the energy efficiency. Finally, we consider the shadowing effect on the system performance and show that multiuser scheduling greatly boosts the throughput in the low-SNR region and hence is of particular significance for sensor network applications.


APPENDICES

A. PROOF OF LEMMA 1

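For the matched filter, when m = 1,

$$\mathrm{SIR}_i = \rho\,\|\mathbf{h}_i\|^2, \tag{A.1}$$

where $\|\mathbf{h}_i\|^2 \sim \chi^2_{2N}$. Thus

$$\Pr\left[\mathrm{SIR}_i \ge \beta\right] = \Pr\left[\|\mathbf{h}_i\|^2 \ge \frac{\beta}{\rho}\right] = 1 - \Gamma\left(\frac{\beta}{\rho}, N\right), \tag{A.2}$$

where Γ(a, x) is the regularized gamma function given by $\Gamma(a, x) = \int_0^x t^{a-1}e^{-t}\,dt \big/ \int_0^{\infty} t^{a-1}e^{-t}\,dt$. When m > 1, we can write the SIR in (7) as

$$\mathrm{SIR}_i = \frac{\|\mathbf{h}_i\|^2}{1/\rho + \mathbf{h}_i^{\dagger}\left(\mathbf{H}_i\mathbf{H}_i^{\dagger}\right)\mathbf{h}_i/\|\mathbf{h}_i\|^2}, \tag{A.3}$$

where H_i denotes the matrix obtained by deleting the ith column of H. $\mathbf{H}_i\mathbf{H}_i^{\dagger}$ has a complex central Wishart distribution with m − 1 degrees of freedom and covariance matrix $\mathbf{I}_{m-1}$, denoted as $\mathbf{H}_i\mathbf{H}_i^{\dagger} \in \mathcal{CW}_N(m-1, \mathbf{I}_{m-1})$. Since h_i and H_i are independent, according to [32, Theorem 3.2.8] we have

$$Y \doteq \frac{\mathbf{h}_i^{\dagger}\left(\mathbf{H}_i\mathbf{H}_i^{\dagger}\right)\mathbf{h}_i}{\|\mathbf{h}_i\|^2} \sim \chi^2_{2(m-1)}, \tag{A.4}$$

and Y is independent of h_i. Denote $X \doteq \|\mathbf{h}_i\|^2 \sim \chi^2_{2N}$. Therefore, the probability of success is

$$\Pr\left[\mathrm{SIR}_i \ge \beta\right] = \Pr\left[X \ge \beta\left(Y + \frac{1}{\rho}\right)\right] = \frac{1}{(m-2)!}\int_0^{\infty}\left[1 - \Gamma\left(\beta y + \frac{\beta}{\rho}, N\right)\right] y^{m-2}e^{-y}\,dy. \tag{A.5}$$

In summary,

$$q_{\mathrm{mf}}(m, \rho) = \begin{cases} 1 - \Gamma\left(\dfrac{\beta}{\rho}, N\right), & m = 1, \\[2ex] \dfrac{1}{(m-2)!}\displaystyle\int_0^{\infty}\left[1 - \Gamma\left(\beta y + \dfrac{\beta}{\rho}, N\right)\right] y^{m-2}e^{-y}\,dy, & m > 1. \end{cases} \tag{A.6}$$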

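As ρ → ∞, when m > 1, we have

$$\mathrm{SIR}_i = \frac{X}{Y} = \frac{N}{m-1}\cdot\frac{X/2N}{Y/2(m-1)} \doteq \frac{N}{m-1}F, \tag{A.7}$$

where F = (X/2N)/(Y/2(m − 1)) has an $F_{2N,\,2(m-1)}$ distribution. Therefore,

$$\Pr\left[\mathrm{SIR}_i \ge \beta\right] = \Pr\left[F \ge \beta\,\frac{m-1}{N}\right] = 1 - I\left(\frac{\beta}{1+\beta};\, N,\, m-1\right), \tag{A.8}$$

where I(x; a, b) is the regularized beta function given by $I(x; a, b) = \int_0^x t^{a-1}(1-t)^{b-1}\,dt \big/ \int_0^1 t^{a-1}(1-t)^{b-1}\,dt$. It is obvious that when ρ → ∞, Pr[SIR_i ≥ β] = 1 for m = 1; thus we have

$$q_{\mathrm{mf}}(m, \infty) = \begin{cases} 1, & m = 1, \\[1ex] 1 - I\left(\dfrac{\beta}{1+\beta};\, N,\, m-1\right), & m > 1. \end{cases} \tag{A.9}$$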

B. PROOF OF LEMMA 2

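Denote the m × m matrix $\mathbf{Z} \doteq \mathbf{H}^{\dagger}\mathbf{H}$; then Z has a complex central Wishart distribution, that is, $\mathbf{Z} \in \mathcal{CW}_m(N, \mathbf{I}_N)$. It is known that when m ≤ N, the determinant of Z is distributed as $\prod_{i=1}^{m}\chi^2_{2(N-i+1)}$, and when m > N, Z is singular [32]. Therefore we have, when m ≤ N,

$$z_i \doteq \frac{1}{\left(\mathbf{Z}^{-1}\right)_{ii}} = \frac{\det(\mathbf{Z})}{\det\left(\mathbf{Z}_{[i]}\right)} = \det\left(\mathbf{Z}^{sc}_{[i]}\right), \tag{B.1}$$

where $\mathbf{Z}_{[i]}$ denotes the matrix obtained by striking out the ith row and the ith column of Z, and $\mathbf{Z}^{sc}_{[i]}$ denotes the Schur complement of $\mathbf{Z}_{[i]}$, which is also complex Wishart distributed, that is, $\mathbf{Z}^{sc}_{[i]} \in \mathcal{CW}_1(N-m+1, \mathbf{I}_{N-m+1})$. Therefore, $z_i = \det(\mathbf{Z}^{sc}_{[i]}) \sim \chi^2_{2(N-m+1)}$. We get for the decorrelating detector, when m ≤ N,

$$\Pr\left[\mathrm{SIR}_i \ge \beta\right] = \Pr\left[z_i \ge \frac{\beta}{\rho}\right] = 1 - \Gamma\left(\frac{\beta}{\rho},\, N-m+1\right), \tag{B.2}$$

while when m > N, since SIR_i = 0, Pr[SIR_i ≥ β] = 0. In summary,

$$q_{\mathrm{dec}}(m, \rho) = \begin{cases} 1 - \Gamma\left(\dfrac{\beta}{\rho},\, N-m+1\right), & m \le N, \\[1ex] 0, & m > N. \end{cases} \tag{B.3}$$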

C. PROOF OF LEMMA 3

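For the linear MMSE detector, when m ≤ N, consider the limiting SIR as ρ → ∞:

$$\lim_{\rho\to\infty}\mathrm{SIR}_i = \lim_{\rho\to\infty}\mathbf{h}_i^{\dagger}\left(\mathbf{H}_i\mathbf{H}_i^{\dagger} + \frac{1}{\rho}\mathbf{I}\right)^{-1}\mathbf{h}_i = \lim_{\rho\to\infty}\rho\,\mathbf{h}_i^{\dagger}\left(\rho\,\mathbf{H}_i\mathbf{H}_i^{\dagger} + \mathbf{I}\right)^{-1}\mathbf{h}_i. \tag{C.1}$$

Denote the spectral decomposition of the matrix $\mathbf{H}_i\mathbf{H}_i^{\dagger} = \mathbf{U}\mathbf{D}\mathbf{U}^{\dagger}$, where D is the matrix containing the eigenvalues of $\mathbf{H}_i\mathbf{H}_i^{\dagger}$ in decreasing order, and U is the unitary matrix containing the eigenvectors of $\mathbf{H}_i\mathbf{H}_i^{\dagger}$. Substituting this into the above and evaluating the limit, we get

$$\lim_{\rho\to\infty}\mathrm{SIR}_i = \rho\,\mathbf{h}_i^{\dagger}\mathbf{U}\mathbf{Q}\mathbf{U}^{\dagger}\mathbf{h}_i = \rho\,\mathbf{v}_i^{\dagger}\mathbf{Q}\mathbf{v}_i, \tag{C.2}$$

where Q = diag(0, . . . , 0, 1, . . . , 1), the number of 1's is the number of zero eigenvalues of $\mathbf{H}_i\mathbf{H}_i^{\dagger}$, which is N − m + 1, and $\mathbf{v} = \mathbf{U}^{\dagger}\mathbf{h}_i$. Since h_i is circularly symmetric complex Gaussian, v has the same distribution as h_i. Therefore, $\lim_{\rho\to\infty}\mathrm{SIR}_i = \rho\sum_{i=1}^{N-m+1}\|v_i\|^2 \sim \rho\,\chi^2_{2(N-m+1)}$, which is the same as for the decorrelator. Thus $q(m, \infty) = \lim_{\rho\to\infty} 1 - \Gamma(\beta/\rho, N-m+1) = 1$.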

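When m > N, $\mathbf{H}_i\mathbf{H}_i^{\dagger}$ is invertible, so as ρ → ∞,

$$\mathrm{SIR}_i \longrightarrow \mathbf{h}_i^{\dagger}\left(\mathbf{H}_i\mathbf{H}_i^{\dagger}\right)^{-1}\mathbf{h}_i. \tag{C.3}$$

Since h_i and H_i are independent, and $\mathbf{H}_i\mathbf{H}_i^{\dagger} \in \mathcal{CW}_N(m-1, \mathbf{I}_{m-1})$, using [32, Theorem 3.2.12] we obtain

$$Z \doteq \frac{\|\mathbf{h}_i\|^2}{\mathbf{h}_i^{\dagger}\left(\mathbf{H}_i\mathbf{H}_i^{\dagger}\right)^{-1}\mathbf{h}_i} \sim \chi^2_{2(m-N)}, \tag{C.4}$$

and Z is independent of h_i. Denoting $X \doteq \|\mathbf{h}_i\|^2$, we get

$$\mathbf{h}_i^{\dagger}\left(\mathbf{H}_i\mathbf{H}_i^{\dagger}\right)^{-1}\mathbf{h}_i = \frac{X}{Z} = \frac{N}{m-N}\cdot\frac{X/2N}{Z/2(m-N)} \doteq \frac{N}{m-N}F, \tag{C.5}$$

where F = (X/2N)/(Z/2(m − N)) has an $F_{2N,\,2(m-N)}$ distribution. Therefore,

$$\Pr\left[\mathrm{SIR}_i \ge \beta\right] = \Pr\left[F \ge \beta\,\frac{m-N}{N}\right] = 1 - I\left(\frac{\beta}{1+\beta};\, N,\, m-N\right). \tag{C.6}$$

In summary,

$$q_{\mathrm{mmse}}(m, \infty) = \begin{cases} 1, & m \le N, \\[1ex] 1 - I\left(\dfrac{\beta}{1+\beta};\, N,\, m-N\right), & m > N. \end{cases} \tag{C.7}$$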

REFERENCES

[1] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, “A survey on sensor networks,” IEEE Commun. Mag., vol. 40, no. 8, pp. 102–114, 2002.

[2] A. J. Goldsmith and S. B. Wicker, “Design challenges for energy-constrained ad hoc wireless networks,” IEEE Wireless Communications, vol. 9, no. 4, pp. 8–27, 2002.

[3] E. Shih, S. Cho, N. Ickes, et al., “Physical layer driven protocol and algorithm design for energy-efficient wireless sensor networks,” in Proc. 7th Annual International Conference on Mobile Computing and Networking (MobiCom ’01), pp. 272–287, Rome, Italy, July 2001.

[4] A. F. Dana and B. Hassibi, “On the power-efficiency of sensory and ad-hoc wireless networks,” submitted to IEEE Trans. Inform. Theory.

[5] S. Cui, A. J. Goldsmith, and A. Bahai, “Energy-constrained modulation optimization,” to appear in IEEE Transactions on Wireless Communications.

[6] J. Chou, D. Petrovic, and K. Ramachandran, “A distributed and adaptive signal processing approach to reducing energy consumption in sensor networks,” in Proc. 22nd Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM ’03), vol. 2, pp. 1054–1062, San Francisco, Calif, USA, March–April 2003.

[7] S. Cui, R. Madan, A. J. Goldsmith, and S. Lall, “Joint routing, MAC, and link layer optimization in sensor networks with energy constraints,” in Proc. IEEE International Conference on Communications (ICC ’05), vol. 2, pp. 725–729, Seoul, South Korea, May 2005.

[8] A. Ephremides, “Energy concerns in wireless networks,” IEEE Wireless Communications, vol. 9, no. 4, pp. 48–59, 2002.

[9] L. Tong, Q. Zhao, and S. Adireddy, “Sensor networks with mobile agents,” in Proc. IEEE Military Communications Conference (MILCOM ’03), vol. 1, pp. 688–693, Boston, Mass, USA, October 2003.

[10] L. Roberts, “ALOHA Packet system with and without slots and capture,” Computer Communications Review, vol. 5, no. 2, pp. 28–42, 1975.

[11] S. Verdu, Multiuser Detection, Cambridge University Press, Cambridge, UK, 1998.

[12] S. Ghez, S. Verdu, and S. C. Schwartz, “Stability properties of slotted Aloha with multipacket reception capability,” IEEE Trans. Automat. Contr., vol. 33, no. 7, pp. 640–649, 1988.

[13] B. Hajek, A. Krishna, and R. O. LaMaire, “On the capture probability for a large number of stations,” IEEE Trans. Commun., vol. 45, no. 2, pp. 254–260, 1997.

[14] S. V. Hanly and D. N. C. Tse, “Resource pooling and effective bandwidths in CDMA networks with multiuser receivers and spatial diversity,” IEEE Trans. Inform. Theory, vol. 47, no. 4, pp. 1328–1351, 2001.

[15] S. Ghez, S. Verdu, and S. C. Schwartz, “Optimal decentralized control in the random access multipacket channel,” IEEE Trans. Automat. Contr., vol. 34, no. 11, pp. 1153–1163, 1989.

[16] S. Adireddy and L. Tong, “Exploiting decentralized channel state information for random access,” IEEE Trans. Inform. Theory, vol. 51, no. 2, pp. 537–561, 2005.

[17] A. Chockalingam, M. Zorzi, L. B. Milstein, and P. Venkataram, “Performance of a wireless access protocol on correlated Rayleigh-fading channels with capture,” IEEE Trans. Commun., vol. 46, no. 5, pp. 644–655, 1998.

[18] J. T.-K. Liu and A. Polydoros, “Retransmission control and fairness issue in mobile slotted ALOHA networks with fading and near-far effect,” Mobile Networks and Applications, vol. 2, no. 1, pp. 101–110, 1997.

[19] X. Qin and R. Berry, “Exploiting multiuser diversity in wireless ALOHA networks,” in Proc. Allerton Conference on Communication, Control and Computing, Allerton, Ill, USA, October 2001.

[20] P. Venkitasubramaniam, Q. Zhao, and L. Tong, “Sensor network with multiple mobile access points,” in Proc. 38th Annual Conference on Information Sciences and Systems (CISS ’04), Princeton, NJ, USA, March 2004.

[21] W. Li and H. Dai, “Throughput and energy efficiency of sensor networks with multiuser receivers and spatial diversity,” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’05), Philadelphia, Pa, USA, March 2005.

[22] E. G. Larsson, “On the combination of spatial diversity and multiuser diversity,” IEEE Commun. Lett., vol. 8, no. 8, pp. 517–519, 2004.

[23] J. Jiang, R. M. Buehrer, and W. H. Tranter, “Antenna diversity in multiuser data networks,” IEEE Trans. Commun., vol. 52, no. 3, pp. 490–497, 2004.

[24] D. Aktas and H. El Gamal, “Multiuser scheduling for MIMO wireless systems,” in Proc. 58th IEEE Vehicular Technology Conference (VTC ’03), vol. 3, pp. 1743–1747, Orlando, Fla, USA, October 2003.

[25] B. M. Hochwald, T. L. Marzetta, and V. Tarokh, “Multiple-antenna channel hardening and its implications for rate feedback and scheduling,” IEEE Trans. Inform. Theory, vol. 50, no. 9, pp. 1893–1909, 2004.

[26] X. Liu, E. K. P. Chong, and N. B. Shroff, “Opportunistic transmission scheduling with resource-sharing constraints in wireless networks,” IEEE J. Select. Areas Commun., vol. 19, no. 10, pp. 2053–2064, 2001.

[27] D. N. C. Tse and O. Zeitouni, “Linear multiuser receivers in random environments,” IEEE Trans. Inform. Theory, vol. 46, no. 1, pp. 171–188, 2000.

[28] R. Knopp and P. A. Humblet, “Information capacity and power control in single-cell multiuser communications,” in Proc. IEEE International Conference on Communications (ICC ’95), vol. 1, pp. 331–335, Seattle, Wash, USA, June 1995.

[29] P. Viswanath, D. N. C. Tse, and R. Laroia, “Opportunistic beamforming using dumb antennas,” IEEE Trans. Inform. Theory, vol. 48, no. 6, pp. 1277–1294, 2002.

[30] M. Sharif and B. Hassibi, “A delay analysis for opportunistic transmission in fading broadcast channels,” in Proc. 24th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM ’05), vol. 4, pp. 2720–2730, Miami, Fla, USA, March 2005.

[31] H. A. David and H. N. Nagaraja, Order Statistics, John Wiley & Sons, New York, NY, USA, 3rd edition, 2003.

[32] R. J. Muirhead, Aspects of Multivariate Statistical Theory, John Wiley & Sons, New York, NY, USA, 1982.

[33] B. Chen and L. Tong, “Traffic modeling and tracking for multiuser detection for random access networks,” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’00), vol. 5, pp. 2601–2604, Istanbul, Turkey, June 2000.

[34] S. Cui, A. J. Goldsmith, and A. Bahai, “Energy-efficiency of MIMO and cooperative MIMO techniques in sensor networks,” IEEE J. Select. Areas Commun., vol. 22, no. 6, pp. 1089–1098, 2004.

[35] L. Tong, V. Naware, and P. Venkitasubramaniam, “Signal processing in random access,” IEEE Signal Processing Mag., vol. 21, no. 5, pp. 29–39, 2004.

Wenjun Li received the B.S. degree in electronic and information engineering from Shanghai Jiaotong University, Shanghai, China, in 2002, and the M.S. degree in electrical engineering from North Carolina State University, Raleigh, NC, in 2004. She is currently working towards her Ph.D. degree in the Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC. Her current research focuses on wireless communications, and energy-efficient communications and signal processing for wireless sensor networks.

Huaiyu Dai received the B.E. and M.S. degrees in electrical engineering from Tsinghua University, Beijing, China, in 1996 and 1998, respectively, and the Ph.D. degree in electrical engineering from Princeton University, Princeton, NJ, in 2002. He worked at Bell Labs, Lucent Technologies, Holmdel, NJ, during the summer of 2000, and at AT&T Labs-Research, Middletown, NJ, during the summer of 2001. Currently he is an Assistant Professor of electrical and computer engineering at North Carolina State University, Raleigh, NC. His research interests are in the general areas of communication systems and networks, advanced signal processing for digital communications, and communication theory and information theory. He has worked in the areas of digital communication system design, speech coding and enhancement, and DSL transmission. His current research focuses on space-time communications and signal processing, the turbo principle and its applications, multiuser detection, and the information-theoretic aspects of multiuser communications and networks.

EURASIP Journal on Wireless Communications and Networking 2005:4, 554–564
© 2005 X. Liu and M. Haenggi

Throughput Analysis of Fading Sensor Networks with Regular and Random Topologies

Xiaowen Liu
Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
Email: [email protected]

Martin Haenggi
Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
Email: [email protected]

Received 30 November 2004; Revised 5 June 2005

We present closed-form expressions of the average link throughput for sensor networks with a slotted ALOHA MAC protocol in Rayleigh fading channels. We compare networks with three regular topologies in terms of throughput, transmit efficiency, and transport capacity. In particular, for square lattice networks, we present a sensitivity analysis of the maximum throughput and the optimum transmit probability with respect to the signal-to-interference ratio threshold. For random networks with nodes distributed according to a two-dimensional Poisson point process, the average throughput is analytically characterized and numerically evaluated. It turns out that although regular networks have only a slightly higher average link throughput than random networks for the same link distance, regular topologies have a significant benefit when the end-to-end throughput in multihop connections is considered.

Keywords and phrases: throughput, Rayleigh fading, slotted ALOHA, network topology, interference.

1. INTRODUCTION

A sensor network [1] consists of a large number of sensor nodes which are placed inside or near a phenomenon. Uniformly random or Poisson distributions are widely accepted models for the location of the nodes in wireless sensor networks, if nodes are deployed in large quantities and there is little control over where they are dropped. A typical scenario is a deployment from an airplane for battlefield monitoring. On the other hand, depending on the application, it may also be possible to place sensors in a regular topology, for example, in a square grid.

Throughput is a traditional measure of how much traffic can be delivered by the network [2, 3]. There is a rich literature on throughput capacity for wireless networks [2, 4, 5] with random or regular topologies. The seminal paper [2] shows that, for peer-to-peer traffic, in a static two-dimensional network with $N$ nodes and $N/2$ randomly selected source-destination pairs, the end-to-end throughput of a connection is $\Theta(W/\sqrt{N})$, where $W$ is the maximum transmission rate for each node. The reason for this poor scaling behavior is that the per-link¹ throughput remains constant while the number of hops grows with $\sqrt{N}$. Marco et al. [6] show that with many-to-one traffic, the per-node transport capacity is $\Theta(1/N)$. Such “order of” results do not provide any guidelines for protocol design, since the scaling behavior is very robust against changes in MAC and routing protocols [7]. All the above research work assumes networks with randomly located nodes. There are also research efforts focusing on networks with regular topologies. Silvester and Kleinrock [4] calculate the throughput of regular square networks with a slotted ALOHA channel access scheme. Xie and Kumar [7] prove that the $\Theta(N)$ upper bound on transport capacity is tight for regular networks where nodes are placed on integer lattice points for path loss exponents greater than 3 and is achieved by multihop transmission. De et al. [8] compare the performance of regular topologies with random topology in wireless CDMA sensor networks. The authors in [9, 10] evaluate the performance for regular grid and random topologies. They assume a “torus” network to avoid boundary effects and use the expected interference power to replace the exact interference power. In particular at high load, replacing the actual interference by its mean yields overly pessimistic results. Indeed, the expected interference may be infinite [11].

¹ The link throughput is the total achievable throughput over a link, aggregated over the flows or connections that are served by the link.

Most of the work above is based on a “disk model,” where it is assumed that the radius for a successful transmission of a packet has a fixed and deterministic value, irrespective of the condition and the realization of the wireless channel. Such simplified link models ignore the stochastic nature of the wireless channel. Our analysis is based on a Rayleigh fading channel model, which includes both large-scale path loss and stochastic small-scale variations in the channel characteristics. Note that even with static nodes as assumed in this paper, the channel quality varies because any movement in the environment affects the multipath geometry of the RF signal, which is easily confirmed experimentally [12, page 45]. The significant variation of the link quality when nodes are immobile is also pointed out in [13, 14, 15], and the shortcomings of the “disk model” are discussed in [11].

This paper addresses the throughput problem for large sensor networks with Rayleigh fading channels. To provide insight on the impact of the topology on the network performance, we compare networks with a random topology and three regular topologies. Placing nodes in regular lattices has an obvious advantage in terms of coverage [16], so we are not addressing coverage issues here. We define the (per-link) throughput as the expected number of successful packet transmissions of a given link per timeslot. The end-to-end throughput over a multihop connection, defined as the minimum of the throughput values of the links involved, is a performance measure of a route and the MAC scheme.

We consider a variant of the slotted ALOHA channel access scheme, originally devised in [17], that takes advantage of spatial reuse. It is assumed, as in [4, 18, 19, 20], that in every timeslot, each node transmits independently with a certain fixed probability $p$. While often a “heavy traffic” model is used [4, 20], where nodes always have packets to transmit and $p$ only reflects the channel access probability, we do not restrict ourselves to this “MAC-centric” case. Rather, we consider $p$ to be composed of two factors, that is, $p = p_q p_t$, where $p_q$ is the probability that there is a packet in a node’s queue awaiting transmission, and $p_t$ is the probability of transmission conditioned on having a packet in the queue (the channel access probability). So, $p_q$ is given by the traffic model, $p_t$ is the actual slotted ALOHA channel access probability, and $p$ is the unconditioned probability of transmission. The heavy traffic case mentioned above corresponds to $p_q = 1$, $p_t = p$, and the other extreme case is $p_q = p$, $p_t = 1$, where Bernoulli traffic is generated with probability $p_q$ and each node with a packet to transmit has immediate access to the channel. Since there is no need for a MAC scheme in this case, we may denote it as “traffic-centric.” Hence, the decomposition of $p$ shows that the throughput analysis and optimization with respect to $p$ in fact includes a range of traffic intensities and channel access probabilities. The Bernoulli traffic model is well justified by the following three observations: (1) in [18], it was shown that the traffic from a slotted ALOHA population of nodes can indeed be modeled as Bernoulli; (2) in [21, page 278], it is pointed out that the retransmission traffic is usually Bernoulli (since an unsuccessfully transmitted packet reenters the queue); and (3) the Bernoulli traffic model is memoryless and thus the discrete-time counterpart of the ubiquitous Poisson model.

The traffic distribution in a sensor network is usually spatially and temporally bursty, that is, busy periods alternate temporally and busy areas alternate spatially with periods and areas with little or no traffic. It may therefore be impractical to employ reservation-based MAC schemes such as TDMA and FDMA that require a substantial amount of coordination traffic and cannot be implemented efficiently and in a fully distributed fashion.² In any case, the slotted ALOHA scheme is the simplest meaningful MAC scheme and therefore provides a lower bound on the performance for more elaborate schemes. Since areas of the network or periods with little or no traffic pose no problems, our analysis focuses on and applies to busy areas and busy periods of the network where collisions are unavoidable and the throughput is interference-limited. During such a burst of traffic, we assume that the parameters $p$, $p_q$, and $p_t$ remain constant. An important example of a busy area is certainly the critical area around the base station or fusion center, where traffic accumulation due to the many-to-one transmission scheme often results in heavy traffic [22].

In Section 2, the Rayleigh fading link model is introduced. For a slotted ALOHA MAC scheme, the conditional success probability of a transmission for a node given the transmitter-receiver and interference-receiver distances is derived. Section 3 evaluates the throughput for regular networks with three topologies and compares their performance. Section 4 investigates the average throughput for random networks for fixed and random transmitter-receiver distances $d_0$. This section also analyzes the transport capacity and end-to-end throughput. Section 5 concludes the paper.

2. THE RAYLEIGH FADING LINK MODEL

We assume a narrowband Rayleigh block fading channel. A transmission from node $i$ to node $j$ is successful if the signal-to-noise-and-interference ratio (SINR) $\gamma_{ij}$ is above a certain threshold $\Theta$ that is determined by the communication hardware and the modulation and coding scheme [14]. The SINR $\gamma$ is given by $\gamma = Q/(N_0 + I)$, where $Q$ is the received power, which is exponentially distributed with mean $\bar{Q}$. Over a transmission of distance $d$ with an attenuation $d^{\alpha}$, we have $\bar{Q} = P_0 d^{-\alpha}$, where $P_0$ denotes the transmit power and $\alpha$ is the path loss exponent. $N_0$ denotes the noise power, and $I$ is the interference power, that is, the sum of the received power from all the undesired transmitters. Our analysis is based on the following theorem.

Theorem 1. In a Rayleigh fading network with slotted ALOHA, where nodes transmit at equal power levels with probability $p$, the success probability of a transmission given a desired transmitter-receiver distance $d_0$ and $n$ other nodes at distances $d_i$ ($i = 1, \ldots, n$) is

$$P_{s|d_0,\ldots,d_n} = \exp\left(-\frac{\Theta N_0}{P_0 d_0^{-\alpha}}\right) \cdot \prod_{i=1}^{n}\left(1 - \frac{\Theta p}{\left(d_i/d_0\right)^{\alpha} + \Theta}\right), \tag{1}$$

where $P_0$ is the transmit power, $N_0$ the noise power, and $\Theta$ the SINR threshold.

² In general, this problem is NP-hard.

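Proof. Let $Q_0$ denote the received power from the desired transmitter and $Q_i$, $i = 1, \ldots, n$, the received power from $n$ potential interferers. All the received powers are exponentially distributed, that is, $p_{Q_i}(q_i) = (1/\bar{Q}_i)\, e^{-q_i/\bar{Q}_i}$, where $\bar{Q}_i$ denotes the average received power $\bar{Q}_i = P_i d_i^{-\alpha}$. The cumulated interference power at the receiver is

$$I = \sum_{i=1}^{n} S_i Q_i, \tag{2}$$

where $S_i$ is a sequence of i.i.d. Bernoulli random variables with $P(S_i = 1) = p$ and $P(S_i = 0) = 1 - p$. The success probability of a transmission is³

$$\begin{aligned}
P_{s|d_0,d_1,\ldots,d_n} &= E_I\left[P\left[Q_0 \ge \Theta(I + N_0) \mid I\right]\right] \\
&= E_{Q,S}\left[\exp\left(-\frac{\Theta\left(\sum_{i=1}^{n} S_i Q_i + N_0\right)}{\bar{Q}_0}\right)\right] \\
&= \exp\left(-\frac{\Theta N_0}{\bar{Q}_0}\right) E_{Q,S}\left[\prod_{i=1}^{n} \exp\left(-\frac{\Theta S_i Q_i}{\bar{Q}_0}\right)\right] \\
&= \exp\left(-\frac{\Theta N_0}{P_0 d_0^{-\alpha}}\right) \times \prod_{i=1}^{n}\left\{P(S_i = 1) \cdot \int_0^{\infty} \exp\left(-\frac{\Theta q_i}{\bar{Q}_0}\right) p_{Q_i}(q_i)\, dq_i + P(S_i = 0)\right\} \\
&= \exp\left(-\frac{\Theta N_0}{P_0 d_0^{-\alpha}}\right) \prod_{i=1}^{n}\left(\frac{p}{1 + \Theta\left(d_0/d_i\right)^{\alpha}} + 1 - p\right) \\
&= \exp\left(-\frac{\Theta N_0}{P_0 d_0^{-\alpha}}\right) \prod_{i=1}^{n}\left(1 - \frac{\Theta p}{\left(d_i/d_0\right)^{\alpha} + \Theta}\right).
\end{aligned} \tag{3}$$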

Since the throughput in large sensor networks is limited by the interference, in the following, we focus on the interference part (the second factor of (3), assuming $N_0 = 0$) to determine bounds that are fundamental in the sense that they cannot be exceeded even if the transmit power is not constrained. The first exponential term is easily evaluated if $N_0 \neq 0$.

³ A similar calculation has been carried out in [23] for the case where in every timeslot it is known exactly which node is transmitting. In contrast, Theorem 1 incorporates the uncertainty at the MAC level: we only assume we know the probability of a transmission, but not exactly which node is transmitting in every timeslot.
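The closed form of Theorem 1 can be checked against a direct simulation of the fading model. The following is a minimal sketch, assuming independent exponentially distributed received powers with means $P_0 d^{-\alpha}$ and independent Bernoulli($p$) transmit decisions as in the proof; the function names, default arguments, trial count, and seed are illustrative choices that do not come from the original.

```python
import math
import random


def success_probability(d0, interferer_dists, p, theta, alpha, n0=0.0, p0=1.0):
    """Closed-form success probability of equation (1)."""
    noise_term = math.exp(-theta * n0 / (p0 * d0 ** (-alpha)))
    product = 1.0
    for d in interferer_dists:
        product *= 1.0 - theta * p / ((d / d0) ** alpha + theta)
    return noise_term * product


def simulate_success(d0, interferer_dists, p, theta, alpha,
                     n0=0.0, p0=1.0, trials=100_000, seed=1):
    """Monte Carlo estimate of P[Q0 >= theta * (I + N0)] under Rayleigh fading."""
    rng = random.Random(seed)
    mean_q0 = p0 * d0 ** (-alpha)
    successes = 0
    for _ in range(trials):
        # Desired signal power: exponential with mean P0 * d0^(-alpha).
        q0 = rng.expovariate(1.0 / mean_q0)
        interference = 0.0
        for d in interferer_dists:
            # Each potential interferer transmits with probability p.
            if rng.random() < p:
                interference += rng.expovariate(1.0 / (p0 * d ** (-alpha)))
        if q0 >= theta * (interference + n0):
            successes += 1
    return successes / trials
```

With a few interferers and a moderate number of trials, the simulated frequency should agree with `success_probability` up to Monte Carlo noise.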

Corollary 1. Under the same assumptions as in Theorem 1 but with $N_0 = 0$ and unit transmit power $P_i = 1$, the success probability given a desired link of normalized distance $r_0 = d_0/d_0 = 1$ and $n$ other nodes at normalized distances $r_i = d_i/d_0$ is

$$P_{s|r_0,r_1,\ldots,r_n} = \prod_{i=1}^{n}\left(1 - \frac{p}{1 + r_i^{\alpha}/\Theta}\right) = \mathcal{L}_I(\Theta), \tag{4}$$

which is the Laplace transform of the interference power $I$ evaluated at the SIR threshold $\Theta$.

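Proof. With unit transmit power, the mean power from the $i$th interferer at distance $r_i$ is $1/r_i^{\alpha}$. The Laplace transform of the exponential distribution with mean $1/\mu$ is $\mu/(\mu + s)$, thus the Laplace transform of $I$ is [24]

$$\mathcal{L}_I(s) = \prod_{i=1}^{n}\left(\frac{p\, r_i^{\alpha}}{r_i^{\alpha} + s} + 1 - p\right) = \prod_{i=1}^{n}\left(1 - \frac{p}{1 + r_i^{\alpha}/s}\right). \tag{5}$$

From (3) with $N_0 = 0$ and normalized distances $r_i = d_i/d_0$, we have

$$P_{s|r_0,r_1,\ldots,r_n} = \prod_{i=1}^{n}\left(1 - \frac{p}{1 + r_i^{\alpha}/\Theta}\right), \tag{6}$$

which equals (5) evaluated at $s = \Theta$, and we get (4).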

3. REGULAR NETWORKS

In this section, we investigate networks with three regular topologies (square, triangle, hexagon) in which every node has the same number of nearest neighbors and the same distance to all nearest neighbors.

3.1. Square networks

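We first analyze square networks with $N$ nodes placed in the vertices of a square grid with distance 1 between all pairs of nearest nodes (density 1). The next-hop receiver of each packet is one of the four nearest-neighbor nodes of the transmitter, so the transmitter-receiver distance $d_0 = 1$. If the receiver node $O$ is located in the center of the network as shown in Figure 1 and node $A$ is the desired transmitter, the success probability for node $O$ based on (6) can be written as

$$\begin{aligned}
P_s(p) ={}& \left(1 - \frac{\Theta p}{1^{\alpha} + \Theta}\right)^{3} \cdot \left(1 - \frac{\Theta p}{(\sqrt{2})^{\alpha} + \Theta}\right)^{4} \\
&\times \prod_{i=2}^{\sqrt{N}/2}\left\{\left(1 - \frac{\Theta p}{i^{\alpha} + \Theta}\right)^{4} \cdot \left(1 - \frac{\Theta p}{\left(\sqrt{2i^2}\right)^{\alpha} + \Theta}\right)^{4} \cdot \prod_{j=1}^{i-1}\left(1 - \frac{\Theta p}{\left(\sqrt{i^2 + j^2}\right)^{\alpha} + \Theta}\right)^{8}\right\}.
\end{aligned} \tag{7}$$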


Figure 1: The topology of a square network. Node $O$ is the receiver and node $A$ is the desired transmitter such that the link distance $d_0 = |OA| = 1$.

Figure 2: The analytic throughput $g(p)$ based on (7) for a square network with 40 × 40 nodes, with $\Theta = 10$ (axes: $g$ versus $p$; curve labels $\alpha = 2$ and $\alpha = 5$).

The first term in (7) accounts for the other three nearest-neighbor nodes of the receiver; the second term for the 4 diagonal nodes at distance $\sqrt{2}$; all the other terms from the nodes located on the dashed squares with edge $\geq 2$ in Figure 1. The throughput⁴ is given by

$$g(p) = p(1 - p)P_s(p), \tag{8}$$

where $p$ is the probability that $A$ transmits and $1 - p$ is the probability that $O$ does not transmit in the same timeslot. Note that $g$ is the throughput achievable with a simple ARQ scheme (with error-free feedback) [25]. The analytic throughput $g(p)$ based on (7) and (8) for a regular square network with 40 × 40 nodes with node density $\lambda = 1$ is displayed in Figure 2. For $\alpha = 4$, the maximum throughput $g_{\max} = 0.0247$ is achieved at an optimal transmit probability $p_{\mathrm{opt}} = 0.066$. The transmit efficiency, defined as $T_{\mathrm{eff}} = g_{\max}/p_{\mathrm{opt}}$, is 37.4%.

⁴ The throughput is calculated as the throughput of the center link of the busy area under consideration. This is the worst case since most other nodes experience a lower interference. In the case of infinite networks, the interference distribution is the same at every node.
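The maximization of (8) for the square lattice can be sketched numerically. The snippet below is an illustration under stated assumptions rather than the authors' procedure: it enumerates all lattice points with coordinates between −20 and 20 (the squares with $i \le \sqrt{N}/2 = 20$ in (7)), excludes the receiver at the origin and the desired transmitter at (1, 0), and finds the maximum by a simple grid search over $p$. The function names and the grid resolution are hypothetical.

```python
import math


def lattice_weights(half_width=20, theta=10.0, alpha=4.0):
    """Per-interferer factors Theta / (d^alpha + Theta) for a square lattice.

    The receiver sits at (0, 0) and the desired transmitter at (1, 0);
    both are excluded from the interferer set.
    """
    weights = []
    for x in range(-half_width, half_width + 1):
        for y in range(-half_width, half_width + 1):
            if (x, y) in ((0, 0), (1, 0)):
                continue
            weights.append(theta / (math.hypot(x, y) ** alpha + theta))
    return weights


def throughput(p, weights):
    """Throughput g(p) = p (1 - p) P_s(p), with P_s(p) as the product in (7)."""
    ps = 1.0
    for w in weights:
        ps *= 1.0 - p * w
    return p * (1.0 - p) * ps


def maximize_throughput(weights, steps=1000):
    """Grid search over p in (0, 1); returns (p_opt, g_max)."""
    best_g, best_p = max(
        (throughput(k / steps, weights), k / steps) for k in range(1, steps)
    )
    return best_p, best_g
```

Calling `maximize_throughput(lattice_weights())` for $\alpha = 4$ and $\Theta = 10$ should land close to the values quoted in the text ($p_{\mathrm{opt}} \approx 0.066$, $g_{\max} \approx 0.0247$), up to the grid resolution.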

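For the sensitivity analysis of the throughput with respect to $\Theta$, we need to determine $p_{\mathrm{opt}}(\Theta)$ and $g_{\max}(\Theta)$. We use three analytic approximations for $p_{\mathrm{opt}}(\Theta)$ and $g_{\max}(\Theta)$. From (6), $g$ can be written as

$$g = p(1 - p)\prod_{i=1}^{n}\left(1 - \frac{p}{1 + r_i^{\alpha}/\Theta}\right), \tag{9}$$

where $r_i = d_i/d_0$. Since $p_{\mathrm{opt}} = \arg\max_p g(p) = \arg\max_p \log(g(p))$, we maximize

$$\log(g) = \log(p) + \log(1 - p) + \sum_{i=1}^{n}\log\left(1 - \frac{p}{1 + r_i^{\alpha}/\Theta}\right), \tag{10}$$

using $\log(1 + x) \approx x$ for small $x$,⁵ so that the sum becomes $-p/s$ with $s$ as defined in (12). Setting the derivative with respect to $p$ to zero yields

$$p_{\mathrm{opt}}^2 - p_{\mathrm{opt}}(1 + 2s) + s = 0, \tag{11}$$

with

$$s = \frac{1}{\sum_{i=1}^{n}\left(1/\left(1 + r_i^{\alpha}/\Theta\right)\right)}. \tag{12}$$

Note $r_i = d_i$ for $d_0 = 1$. So, $p_{\mathrm{opt}}$ is given by

$$p_{\mathrm{opt}} = s + \frac{1}{2}\left(1 - \sqrt{1 + 4s^2}\right). \tag{13}$$

$g_{\max}$ can be obtained by $g_{\max} = p_{\mathrm{opt}}(1 - p_{\mathrm{opt}})P_s(p_{\mathrm{opt}})$, where $P_s(p_{\mathrm{opt}})$ is obtained by plugging $p_{\mathrm{opt}}$ into (7). This method is called Analytic 1.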

⁵ The approximation is accurate for $p$ in the range of interest, that is, $0 < p < 0.3$.

For $\alpha = 4$, we use $i^2$ to approximate $d_i^4$ for the nodes located in one quadrant. As shown in Figure 3, the distance of node $i$ ($i = 1, \ldots, 8$) in the first quadrant to the receiver node $O$ is $d_i$. Table 1 compares $d_i^4$ and $i^2$ for $i = 1, \ldots, 8$. By Euler’s summation formula, $d_i^4 \approx i^2$ allows a simplification (the node at distance 1 is the desired transmitter):

$$\sum_{i=2}^{k+1}\frac{1}{1 + i^2/\Theta} \approx \sqrt{\Theta}\left(\arctan\frac{k + 3/2}{\sqrt{\Theta}} - \arctan\frac{3}{2\sqrt{\Theta}}\right). \tag{14}$$

For $k \to \infty$,

$$s \approx \frac{1}{4\sqrt{\Theta}\left(\pi/2 - \arctan\left(3/(2\sqrt{\Theta})\right)\right)}, \tag{15}$$

where the factor 4 in (15) comes from the fact that nodes are located in 4 quadrants. Plugging (15) into (13) is our method Analytic 2.

Figure 3: Node numbering scheme pertaining to Table 1 for nodes in the first quadrant of a square network. $O$ is the receiver.

Table 1: Comparison of $d_i^4$ and $i^2$.

i        1    2    3    4    5    6    7    8
d_i^4    1    1    4   16   16   25   25   64
i^2      1    4    9   16   25   36   49   64

In method Analytic 3, we use the approximation $s \approx 1/(4\sqrt{\Theta})$, which is within $\mp 20\%$ for the practical range $9/(2\cot(0.8))^2 \approx 2.4 < \Theta < 9/(2\cot(1.2))^2 \approx 14.9$, and substitute it into (13), which yields

$$p_{\mathrm{opt}} = \frac{1}{4\sqrt{\Theta}} + \frac{1}{2}\left(1 - \sqrt{1 + \frac{1}{4\Theta}}\right). \tag{16}$$

Based on (10) and (12), $g_{\max}$ is given by

$$g_{\max} = p_{\mathrm{opt}}\left(1 - p_{\mathrm{opt}}\right)e^{-p_{\mathrm{opt}}/s}. \tag{17}$$
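The approximations Analytic 2 and Analytic 3 reduce to a few lines of arithmetic, sketched below for the case $\alpha = 4$ treated in the text. The function names are illustrative; `p_opt` implements (13), `g_max` implements (17), and the two `s_*` helpers implement (15) and the simplification $s \approx 1/(4\sqrt{\Theta})$.

```python
import math


def s_analytic2(theta):
    """Spatial reuse measure s from equation (15), valid for alpha = 4."""
    root = math.sqrt(theta)
    return 1.0 / (4.0 * root * (math.pi / 2.0 - math.atan(3.0 / (2.0 * root))))


def s_analytic3(theta):
    """Simplified s = 1 / (4 sqrt(Theta)) used in method Analytic 3."""
    return 1.0 / (4.0 * math.sqrt(theta))


def p_opt(s):
    """Optimum transmit probability from equation (13)."""
    return s + 0.5 * (1.0 - math.sqrt(1.0 + 4.0 * s * s))


def g_max(s):
    """Maximum throughput from equation (17)."""
    p = p_opt(s)
    return p * (1.0 - p) * math.exp(-p / s)


def transmit_efficiency(s):
    """T_eff = g_max / p_opt = (1 - p_opt) * exp(-p_opt / s)."""
    p = p_opt(s)
    return (1.0 - p) * math.exp(-p / s)
```

For $\Theta = 10$ (linear scale), `p_opt(s_analytic2(10.0))` and `p_opt(s_analytic3(10.0))` can be compared directly with the numerically optimized value $p_{\mathrm{opt}} = 0.066$ reported for the square lattice.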

The numerical result obtained by direct maximization of (7) for different $\Theta$ is compared with the results from the three analytical approximations in Figure 4. In Analytic 2, approximating interfering nodes at distance $d_i$ by the larger distance $i^{1/2}$ (shown in Table 1) results in lower interference. The interference has a more significant impact on the throughput (and $p_{\mathrm{opt}}$) for small $\Theta$ (see (14)). Thus for small $\Theta$, this lower interference leads to a higher $p_{\mathrm{opt}}$ than for Analytic 1. The transmit efficiency is $T_{\mathrm{eff}} = g_{\max}/p_{\mathrm{opt}} = (1 - p_{\mathrm{opt}})e^{-p_{\mathrm{opt}}/s}$, which is monotonically increasing from $\lim_{s\to 0} T_{\mathrm{eff}} = e^{-1} \approx 0.37$ to $\lim_{s\to\infty} T_{\mathrm{eff}} = 1/2$. The upper bound is achieved if the interference goes to zero, in which case $p_{\mathrm{opt}} = 1/2$ and $g_{\max} = 1/4$. For the lower bound, as $s \to 0$, we have $p_{\mathrm{opt}} \to 0$ and $g_{\max} \to 0$, and $T_{\mathrm{eff}}$ converges to $e^{-1}$. Hence, $s$ is a measure for spatial reuse. Indeed for $s \to 0$, which happens for $\alpha \to 0$⁶ or $\Theta \to \infty$, the network does not permit any spatial reuse. In this case, the transmit efficiency reduces to the efficiency of conventional slotted ALOHA [17], where for a network with $N$ nodes, $p_{\mathrm{opt}} = 1/N$ and $T_{\mathrm{eff}} = \lim_{N\to\infty}(1 - 1/N)^{N-1} = e^{-1}$ [4]. The fact that our limit coincides with the limit for conventional slotted ALOHA further validates our approximations.

⁶ In fact, $\alpha \to 2$ is sufficient for infinite networks.

Figure 4: For a square network with 40 × 40 nodes and $\alpha = 4$, the numerical results and analytic results from Analytic 1, Analytic 2, and Analytic 3 for (a) the relationship between $p_{\mathrm{opt}}$ and $\Theta$; (b) the relationship between $g_{\max}$ and $\Theta$ (horizontal axis: $\Theta$ in dB).

3.2. Triangle networks and hexagon networks

Other regular topologies of interest are the triangle topology and its dual, the hexagon topology (Figure 5). For each triangle, there are three vertices and six nearest neighbors for each vertex, while for the hexagon, there are six vertices for each hexagon and three nearest neighbors for each vertex. Again, the next-hop receiver of each packet is one of the nearest-neighbor nodes of the transmitter, so the transmitter-receiver distance $d_0$ is equal to the side length of the regular polygon. In the triangle network, each node is located in a hexagon with area $(\sqrt{3}\,d_0^2)/2$. For node density equal to 1, $d_0^2 = 2/\sqrt{3}$. Similarly, for hexagon networks, $d_0^2 = 4/(3\sqrt{3})$.

Figure 5: The topology of (a) triangle network and (b) hexagon network.

Figure 6: The analytic throughput $g(p)$ versus $p$ for two-dimensional networks with (a) triangle topology and (b) hexagon topology, where $\Theta = 10$ and $N = 1600$ nodes (curve labels $\alpha = 2$ and $\alpha = 5$).

Similar to the calculation of square lattice networks as in (7), we obtain the relationship between the throughput $g$ and the transmit probability $p$ and compare the performance of triangle and hexagon networks in Figure 6. For a fair comparison, we introduce the transport capacity which can be defined as $Z := g_{\max} d_0$. The results for square, triangle, and hexagon networks for $\alpha = 4$ are shown in Table 2. The performance difference among the three topologies can be explained by the distance and number of the potential interfering nodes. Note that the transmit efficiency $T_{\mathrm{eff}}$ is very close to the one of conventional slotted ALOHA and does not depend on the topology.
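The link distances at unit density and the transport capacity $Z = g_{\max} d_0$ follow from simple arithmetic, sketched below. The $g_{\max}$ values are copied from Table 2 ($\alpha = 4$, $\Theta = 10$); the dictionary names are illustrative.

```python
import math

# Link distance d0 at node density 1 for each regular topology.
D0 = {
    "square": 1.0,
    "triangle": math.sqrt(2.0 / math.sqrt(3.0)),          # d0^2 = 2 / sqrt(3)
    "hexagon": math.sqrt(4.0 / (3.0 * math.sqrt(3.0))),   # d0^2 = 4 / (3 sqrt(3))
}

# Maximum throughput values as reported in Table 2.
G_MAX = {"square": 0.0247, "triangle": 0.0213, "hexagon": 0.0326}

# Transport capacity Z = g_max * d0.
TRANSPORT_CAPACITY = {name: G_MAX[name] * D0[name] for name in D0}
```

The resulting products can be compared with the last column of Table 2.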

4. RANDOM NETWORKS

Here, we assume that the positions of the nodes constitute a Poisson point process.⁷ In the following, we will investigate the throughput averaged over network realizations when the transmitter-receiver distance $d_0$ is fixed (Section 4.1) and not fixed (Section 4.2).

4.1. Average throughput for fixed d0

In this case, we assume the distance between the desired transmitter and receiver is fixed and there are $N$ other nodes constituting a two-dimensional Poisson point process. Although (6) gives the success probability conditioned on node distances, we still need to find the joint density of $d_1, d_2, \ldots, d_N$ (ordered distances). It is well known that for one-dimensional Poisson point processes with density $\lambda$, the ordered distances from nodes to the desired receiver form the arrival times of a Poisson process [24]. The interarrival intervals are i.i.d. exponential with parameter $\lambda$:

$$f_{d_i - d_{i-1}}\left(x_i - x_{i-1}\right) = \lambda e^{-\lambda(x_i - x_{i-1})}. \tag{18}$$

So, for the ordered distances $0 \le d_1 \le \cdots \le d_N$, the joint density function follows from that of the interarrival intervals:

$$\begin{aligned}
f_{d_1,d_2,\ldots,d_N}\left(x_1, x_2, \ldots, x_N\right) &= f_{d_1,\,d_2 - d_1,\ldots,\,d_N - d_{N-1}}\left(x_1, x_2 - x_1, \ldots, x_N - x_{N-1}\right) \\
&= \left(\lambda e^{-\lambda x_1}\right)\left(\lambda e^{-\lambda(x_2 - x_1)}\right)\cdots\left(\lambda e^{-\lambda(x_N - x_{N-1})}\right) \\
&= \lambda^N e^{-\lambda x_N}, \quad 0 \le x_1 \le x_2 \le \cdots \le x_N.
\end{aligned} \tag{19}$$

⁷ For large networks, this is equivalent to a uniformly random distribution for all practical purposes.

Table 2: Comparison of square, triangle, and hexagon networks for $\alpha = 4$ and $\Theta = 10$, where $p_{\mathrm{opt}}$, $g_{\max}$, and $T_{\mathrm{eff}}$ denote the optimum transmit probability, maximum throughput, and transmit efficiency.

            p_opt     g_max     T_eff    d_0       g_max d_0
Square      0.0660    0.0247    0.37     1.0       0.0247
Triangle    0.0570    0.0213    0.37     1.0746    0.0229
Hexagon     0.0870    0.0326    0.37     0.8774    0.0286

When nodes are distributed according to a two-dimensional Poisson point process with density $\lambda$, the squared ordered distances from the desired receiver have the same distribution as the arrival times of a Poisson process with density $\lambda\pi$ [24]. The squared ordered distances have a joint distribution with density

$$f_{d_1^2,\ldots,d_N^2}\left(x_1, \ldots, x_N\right) = (\lambda\pi)^N e^{-\lambda\pi x_N}, \quad 0 \le x_1 \le x_2 \le \cdots \le x_N, \tag{20}$$

because from [26], we have

$$f_{d_i^2 - d_{i-1}^2}\left(x_i - x_{i-1}\right) = \lambda\pi e^{-\lambda\pi(x_i - x_{i-1})}. \tag{21}$$
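Equations (20) and (21) suggest a simple way to sample the ordered node distances without placing points in the plane: accumulate i.i.d. exponential increments with rate $\lambda\pi$ and take square roots. The sketch below illustrates this; the function name and the optional `rng` argument are illustrative choices.

```python
import math
import random


def ordered_distances(n, density=1.0, rng=None):
    """Sample the n smallest distances from the origin in a 2D Poisson process.

    The squared ordered distances are generated as arrival times of a
    one-dimensional Poisson process with rate density * pi, as in (21).
    """
    rng = rng or random.Random()
    squared = 0.0
    distances = []
    for _ in range(n):
        squared += rng.expovariate(density * math.pi)
        distances.append(math.sqrt(squared))
    return distances
```

As a quick check, the sample mean of the squared nearest distance should approach $1/(\lambda\pi)$.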

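The conditional success probability can be written as (see (6))

$$P_{s|d_0,d_1,\ldots,d_N} = \prod_{i=1}^{N}\frac{\left(d_i^2\right)^{\alpha/2} + (1 - p)\Theta d_0^{\alpha}}{\left(d_i^2\right)^{\alpha/2} + \Theta d_0^{\alpha}}. \tag{22}$$

Integrating (22) with respect to the joint density (20), and in particular, evaluating it for $\alpha = 4$, we obtain

$$P_{s|d_0} = \int_0^{\infty}(\lambda\pi)^N e^{-\lambda\pi x_N} \times \left\{\int_0^{x_N}\cdots\int_0^{x_2}\prod_{i=1}^{N}\frac{x_i^2 + (1 - p)\Theta d_0^4}{x_i^2 + \Theta d_0^4}\, dx_1\cdots dx_{N-1}\right\} dx_N. \tag{23}$$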

Figure 7: For $\alpha = 4$ and $\Theta = 10$, the analytical average throughput $E[g|d_0 = 1]$ based on (25) for networks with node number $N = 100$, 121, and 144 (axes: $E[g|d_0]$ versus $p$).

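By applying a similar inductive technique as in [24], it can be shown that

$$\int_0^{x_N}\cdots\int_0^{x_2}\prod_{i=1}^{N-1}\frac{x_i^2 + (1 - p)\Theta d_0^4}{x_i^2 + \Theta d_0^4}\, dx_1\cdots dx_{N-1} = \frac{1}{(N-1)!}\left(x_N - p\sqrt{\Theta d_0^4}\,\arctan\left(\frac{x_N}{\sqrt{\Theta d_0^4}}\right)\right)^{N-1}. \tag{24}$$

Combining (23) and (24), we have

$$P_{s|d_0} = \int_0^{\infty}\frac{(\lambda\pi)^N}{(N-1)!}\, e^{-\lambda\pi x}\, \frac{x^2 + (1 - p)\Theta d_0^4}{x^2 + \Theta d_0^4} \times \left(x - p\sqrt{\Theta d_0^4}\,\arctan\left(\frac{x}{\sqrt{\Theta d_0^4}}\right)\right)^{N-1} dx. \tag{25}$$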

Based on (25), we numerically evaluate the average throughput $E[g|d_0] = p(1 - p)P_{s|d_0}$ (averaged over all network realizations) and plot it as a function of $p$ in Figure 7 for a network with node numbers $N = 100$, 121, and 144, where $d_0 = 1$. The three curves are very close, indicating that only a portion of the nodes interfere at the receiver and nodes further away have little impact on the transmission.
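One way to evaluate the single integral (25) numerically is a plain midpoint rule with the large factors $(\lambda\pi)^N/(N-1)!$ and the $(N-1)$th power handled in the log domain to avoid overflow. The sketch below is an illustration, not the authors' implementation; the function name, the step count, and the truncation of the integration range at roughly twelve standard deviations of the Gamma($N$, $\lambda\pi$) weight are all assumptions.

```python
import math


def success_prob_fixed_d0(p, n, theta=10.0, d0=1.0, density=1.0, steps=20000):
    """Midpoint-rule evaluation of equation (25) for alpha = 4."""
    rate = density * math.pi
    c = math.sqrt(theta) * d0 ** 2            # sqrt(Theta * d0^4)
    upper = (n + 12.0 * math.sqrt(n)) / rate  # truncation point (assumed)
    h = upper / steps
    log_const = n * math.log(rate) - math.lgamma(n)  # log of (lambda pi)^N / (N-1)!
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        ratio = (x * x + (1.0 - p) * c * c) / (x * x + c * c)
        inner = x - p * c * math.atan(x / c)  # positive for x > 0
        total += ratio * math.exp(log_const - rate * x + (n - 1) * math.log(inner))
    return total * h


def average_throughput_fixed_d0(p, n, **kwargs):
    """E[g | d0] = p (1 - p) P_{s|d0}."""
    return p * (1.0 - p) * success_prob_fixed_d0(p, n, **kwargs)
```

For $p = 0$ the integrand is a Gamma density, so the result should be 1, which gives a direct check of the quadrature.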

4.2. Average throughput for variable d0

In the previous analysis, we assumed that the transmitter-receiver distance $d_0$ is fixed and there are $N$ potential interfering nodes uniformly distributed. Now we assume that the receiver located at the center selects its nearest-neighbor node as its desired transmitter. Then there are $N - 1$ nodes further away than the desired transmitter. The distance to the nearest neighbor has the Rayleigh density function (as shown in [23])

$$f_{d_0}(x) = 2\pi x e^{-\pi x^2}. \tag{26}$$


Figure 8: For $\alpha = 4$ and $\Theta = 10$, $E[g]$ versus $p$ for random network with $N = 144$. The analytic result from (27) and (30) is displayed by solid line; the simulation result over 10 000 runs by + mark.

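Since $d_0$ is the nearest distance, $d_i^2$ in (22) can be varying from $d_0^2$ to $d_{i+1}^2$. So we integrate $x_i$ from $d_0^2$ to $x_{i+1}$:

$$P_{s|d_0} = \int_{d_0^2}^{\infty} f_{d_1^2,\ldots,d_{N-1}^2|d_0^2}\left(x_1, \ldots, x_{N-1} \mid d_0^2\right) \times \left\{\int_{d_0^2}^{x_{N-1}}\cdots\int_{d_0^2}^{x_2}\prod_{i=1}^{N-1}\frac{x_i^2 + (1 - p)\Theta d_0^4}{x_i^2 + \Theta d_0^4}\, dx_1\cdots dx_{N-2}\right\} dx_{N-1}, \tag{27}$$

$$f_{d_1^2,\ldots,d_{N-1}^2|d_0^2}\left(x_1, \ldots, x_{N-1} \mid d_0^2\right) = (\lambda\pi)^{N-1} e^{-\lambda\pi\left(x_{N-1} - d_0^2\right)}, \tag{28}$$

where $0 \le d_0^2 \le x_1 \le \cdots \le x_{N-1}$.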

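By induction, it can be shown that

$$\int_{d_0^2}^{x_{N-1}} \cdots \int_{d_0^2}^{x_2} \prod_{i=1}^{N-2} \frac{x_i^2 + (1-p)\Theta d_0^4}{x_i^2 + \Theta d_0^4}\, dx_1 \cdots dx_{N-2} = \frac{1}{(N-2)!} \left\{ x_{N-1} - d_0^2 - p\sqrt{\Theta d_0^4} \cdot \left[ \arctan\!\left( \frac{x_{N-1}}{\sqrt{\Theta d_0^4}} \right) - \arctan\!\left( \frac{d_0^2}{\sqrt{\Theta d_0^4}} \right) \right] \right\}^{N-2}. \tag{29}$$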

The success probability is Ps|d0 averaged over d0:

$$P_s = \int_0^{\infty} f_{d_0}(x)\, P_{s|d_0}\, dx. \tag{30}$$

Substituting (28) and (29) into (27) and evaluating (30) with (26), we obtain the relationship between E[g] = p(1 − p)Ps and p, which is plotted in Figure 8. The analytic result (solid line) and the simulation result (marked by +) are in excellent agreement.

Figure 9: For α = 4 and Θ = 10, average throughput (a) E[g|d0] versus p for d0 from 0.5 to 1.5; (b) E[g|d0] versus p for d0 = 0.1, 0.5, 1.0, and 1.5.

Figure 8 implies that random networks have better average throughput for local data exchange than regular networks. This can be explained by d0, the transmitter-receiver distance. In random networks, a variable d0 leads to a variable throughput. Figure 9a displays E[g|d0] versus p for d0 from 0.5 to 1.5. Figure 9b shows the relationship for d0 = 0.1, 0.5, 1.0, and 1.5. Not surprisingly, smaller d0 results in higher throughput. For the variable d0 case, it is assumed that the desired transmitter is the nearest neighbor of the receiver. With the pdf of (26), the probability that d0 is greater than 1 (the transmitter-receiver distance in the square lattice network) is P[d0 > 1] = e^{−π} ≈ 0.043. So for most nodes, the received signal power from the desired transmitter is greater than that in regular networks. In Figure 9b, for d0 = 0.1, the strong signal power resulting from the very small d0 offsets the impact of interference even for high transmit probabilities p.
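The tail probability quoted above follows directly from the nearest-neighbor density (26); the short derivation below is included only as a worked check.

```latex
P[d_0 > 1] = \int_1^{\infty} 2\pi x\, e^{-\pi x^2}\, dx
           = \left[ -e^{-\pi x^2} \right]_1^{\infty}
           = e^{-\pi} \approx 0.0432 .
```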


Figure 10: Comparison of the average throughput E[g|d0] versus p of a regular square network and a random network. For both networks, N = 1600, d0 = 1, α = 4, and Θ = 10.

Now consider the generic routing strategy from [23]: each node in the path sends packets to its nearest neighbor that lies within a sector φ, that is, within ±φ/2 of the source-destination direction. The previous scheme, where d0 is obtained as the distance to the nearest neighbor, makes no progress in the source-destination direction. Such a choice of d0 would correspond to routing within φ = 2π, clearly an inefficient choice of φ. More sensible is φ ≤ π. Let d0 be the distance to the nearest neighbor within sector φ. The probability density of d0 is given by [23]

$$f_{d_0}(x) = x\varphi\, e^{-x^2 \varphi / 2}. \tag{31}$$

If the routing sector is φ = π/2, then E[d0] = 1. For d0 = 1, Figure 10 displays the throughput for a square network and a random network with N = 1600. It turns out that for the same transmitter-receiver distance, square networks have a slightly higher average throughput than random networks.
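The value E[d0] = 1 for φ = π/2 can be checked from (31), which is a Rayleigh density with parameter σ² = 1/φ (the symbol σ is used here only for this check):

```latex
E[d_0] = \int_0^{\infty} x \cdot x\varphi\, e^{-x^2 \varphi / 2}\, dx
       = \sigma \sqrt{\frac{\pi}{2}}
       = \sqrt{\frac{\pi}{2\varphi}},
\qquad
\varphi = \frac{\pi}{2} \;\Rightarrow\; E[d_0] = 1 .
```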

We compare the transport capacity gmax d0 of regular and random networks. Figure 11a shows gmax versus d0 and popt versus d0 for a random network. Figure 11b compares the transport capacity of random and regular networks. At a given transmitter-receiver distance d0, regular networks slightly outperform random networks in terms of transport capacity.

4.3. End-to-end throughput gEE in a random network

In wireless sensor networks with multihop communication, the end-to-end throughput (the minimum of the throughput values of the links involved) of a route with an average number of hops is a better performance indicator than the average throughput. For two-dimensional random sensor networks (busy area m × m, density 1, routing within sector φ) with uniformly randomly selected source and fixed destination located at the corner,⁸ we can approximate the average path length in hops as

$$h \approx \frac{r}{D\eta}, \tag{32}$$

where r denotes the expected distance between the source and the destination, D the expected hop length, and η the expected path efficiency, where the path efficiency is the ratio between the Euclidean distance and the travelled distance of a path. Dη can be viewed as the effective hop length, that is, the average hop length projected onto the source-destination axis. The expected distance from a random point in a square to a corner can be derived from [27, Exercise 2.4.5]:

$$r = \left[ \frac{\sqrt{2}}{3} + \frac{1}{3} \operatorname{arctanh}\!\left( \frac{1}{\sqrt{2}} \right) \right] m \approx 0.765\, m. \tag{33}$$

From [23], we know that

$$D = \sqrt{\frac{\pi}{2\varphi}}, \qquad \eta = \frac{2}{\varphi} \sin\!\left( \frac{\varphi}{2} \right). \tag{34}$$

So the average path length in hops can be approximated by substituting (33) and (34) into (32). To evaluate the end-to-end throughput of a route with h hops, we use a semianalytic approach: we generate an h-hop path with each hop length obtained as a realization of the hop length according to the pdf in (31), and evaluate the throughput of each hop based on Figure 9a. The average end-to-end throughput is then obtained by taking the minimum link throughput of each path and averaging this minimum over the realizations of the simulated routes. Figure 12 shows that the maximum end-to-end throughput gEE is 0.0086, 0.0053, and 0.0039 for φ = π, π/2, and π/3, respectively.

What is the end-to-end throughput for regular networks? It can be obtained directly from Figures 2 and 6: it is 0.0247, 0.0213, and 0.0326 for square, triangle, and hexagon networks, respectively. For regular networks, every hop has the same length, and the throughput is calculated for a link in the center of the network, which is the worst case, so the end-to-end throughput is the throughput of the center link of the busy area. In terms of the end-to-end throughput for multihop communication, regular networks significantly outperform random networks. For larger networks, the benefit is larger since larger m results in longer paths.
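The semianalytic procedure for random networks described above can be sketched as follows. This is only an illustration under stated assumptions: hop lengths are drawn from (31) by inverse-transform sampling, and `link_throughput` is a hypothetical user-supplied function standing in for the E[g|d0] values read from Figure 9a; none of the function names come from the original.

```python
import math
import random

def expected_hops(m, phi):
    """Average path length h from (32), with r from (33) and D, eta from (34)."""
    r = (math.sqrt(2.0) / 3.0 + math.atanh(1.0 / math.sqrt(2.0)) / 3.0) * m
    hop_len = math.sqrt(math.pi / (2.0 * phi))        # D in (34)
    efficiency = (2.0 / phi) * math.sin(phi / 2.0)    # eta in (34)
    return r / (hop_len * efficiency)

def sample_hop_length(phi, rng):
    """Draw d0 from (31) by inverting its CDF 1 - exp(-x^2 phi / 2)."""
    u = 1.0 - rng.random()                            # u lies in (0, 1]
    return math.sqrt(-2.0 * math.log(u) / phi)

def end_to_end_throughput(hops, phi, link_throughput, runs, rng):
    """Average, over simulated routes, of the minimum link throughput."""
    total = 0.0
    for _ in range(runs):
        total += min(link_throughput(sample_hop_length(phi, rng))
                     for _ in range(hops))
    return total / runs
```

A typical call would round `expected_hops(m, phi)` to an integer number of hops and pass a seeded `random.Random` instance so that the simulated routes are reproducible.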

⁸ For the many-to-one traffic typical in sensor networks, we assume the data sink for all connections to be in one of the corners of the (square) network.

Figure 11: With N = 1600, α = 4, and Θ = 10, (a) gmax versus d0 and popt versus d0 for a random network; (b) transport capacity gmax d0 for random and regular (square, triangle, hexagon) networks with the same size and node density. For random networks, E[d0] = 1 for φ = π/2.

Figure 12: The average end-to-end throughput gEE of random networks versus p for different routing sectors φ (φ = π, π/2, and π/3), where α = 4 and Θ = 10.

5. CONCLUSIONS

We have shown that for a noiseless Rayleigh fading network with slotted ALOHA, the success probability of a transmission is the Laplace transform of the interference evaluated at the SIR threshold Θ. We assume that in every timeslot, each node transmits independently with a certain fixed probability p = pq pt, where pq is the intensity of the Bernoulli traffic and pt is the channel access probability. This decomposition of p shows that the throughput analysis and optimization with respect to p includes a range of traffic intensities and channel access probabilities.

Among the three regular networks (square, triangle, hexagon), the hexagon network provides the highest throughput since every node has only three nearest neighbors, which is the smallest number among the three networks. The sensitivity analysis of the maximum throughput gmax and optimum transmit probability popt with respect to Θ for square networks explains why the transmit efficiency Teff = gmax/popt is approximately 37%. These results hold quantitatively for the other two regular networks, that is, triangle and hexagon networks.

For random networks, two scenarios are considered: fixed and variable transmitter-receiver distances d0. If d0 is the same for regular and random networks, regular networks slightly outperform random networks in terms of throughput and transport capacity. In the case of variable d0, where the receiver selects the nearest-neighbor node as its desired transmitter, the average throughput of random networks is better than that of regular ones. This is because strong signal powers resulting from very small d0 offset the impact of interference even for high transmit probabilities. This result, however, only pertains to local data exchange. When multihop communication and routing are taken into account, regular topologies have a significant advantage in terms of end-to-end throughput. The reason for the inferior end-to-end performance of random networks is the large variance in the node distances.

ACKNOWLEDGMENT

The support of the US National Science Foundation (Grants ECS 03-29766 and CAREER CNS 04-47869) is gratefully acknowledged.

REFERENCES

[1] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, “Wireless sensor networks: a survey,” Computer Networks, vol. 38, no. 4, pp. 393–422, 2002.

[2] P. Gupta and P. R. Kumar, “The capacity of wireless networks,” IEEE Trans. Inform. Theory, vol. 46, no. 2, pp. 388–404, 2000.

[3] S. Toumpis and A. J. Goldsmith, “Capacity regions for wireless ad hoc networks,” IEEE Transactions on Wireless Communications, vol. 2, no. 4, pp. 736–748, 2003.

[4] J. Silvester and L. Kleinrock, “On the capacity of multihop slotted ALOHA networks with regular structure,” IEEE Trans. Commun., vol. 31, no. 8, pp. 974–982, 1983.

[5] M. Grossglauser and D. Tse, “Mobility increases the capacity of ad hoc wireless networks,” in Proc. 20th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM ’01), vol. 3, pp. 1360–1369, Anchorage, Alaska, USA, April 2001.

[6] D. Marco, E. Duarte-Melo, M. Liu, and D. L. Neuhoff, “On the many-to-one transport capacity of a dense wireless sensor network and the compressibility of its data,” in Proc. 2nd IEEE International Workshop on Information Processing in Sensor Networks (IPSN ’03), pp. 1–16, Palo Alto, Calif, USA, April 2003.

[7] L.-L. Xie and P. R. Kumar, “A network information theory for wireless communication: scaling laws and optimal operation,” IEEE Trans. Inform. Theory, vol. 50, no. 5, pp. 748–767, 2004.

[8] S. De, C. Qiao, D. A. Pados, and M. Chatterjee, “Topological and MAI constraints on the performance of wireless CDMA sensor networks,” in Proc. 23rd Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM ’04), vol. 1, Hong Kong, China, March 2004.

[9] G. Ferrari and O. K. Tonguz, “Performance of ad hoc wireless networks with Aloha and PR-CSMA MAC protocol,” in Proc. IEEE Global Telecommunications Conference (GLOBECOM ’03), vol. 5, pp. 2824–2829, San Francisco, Calif, USA, December 2003.

[10] S. Panichpapiboon, G. Ferrari, and O. K. Tonguz, “Sensor networks with random versus uniform topology: MAC and interference considerations,” in Proc. IEEE Vehicular Technology Conference (VTC ’04), vol. 4, pp. 2111–2115, Milan, Italy, May 2004.

[11] E. S. Sousa and J. A. Silvester, “Optimum transmission ranges in a direct-sequence spread-spectrum multihop packet radio network,” IEEE J. Select. Areas Commun., vol. 8, no. 5, pp. 762–771, 1990.

[12] N. H. Shepherd, et al., “Coverage prediction for mobile radio systems operating in the 800/900 MHz frequency range,” IEEE Trans. Veh. Technol., vol. 37, no. 1, pp. 3–72, 1988, Special issue on radio propagation.

[13] A. J. Goldsmith and S. B. Wicker, “Design challenges for energy-constrained ad hoc wireless networks,” IEEE Wireless Communications, vol. 9, no. 4, pp. 8–27, 2002.

[14] A. Ephremides, “Energy concerns in wireless networks,” IEEE Wireless Communications, vol. 9, no. 4, pp. 48–59, 2002.

[15] A. Woo, T. Tong, and D. Culler, “Taming the underlying challenges of reliable multihop routing in sensor networks,” in Proc. 1st International Conference on Embedded Networked Sensor Systems, pp. 14–27, Los Angeles, Calif, USA, November 2003.

[16] S. Megerian, F. Koushanfar, G. Qu, G. Veltri, and M. Potkonjak, “Exposure in wireless sensor networks: theory and practical solutions,” Wireless Networks, vol. 8, no. 5, pp. 443–454, 2002.

[17] N. Abramson, “The aloha system - another alternative for computer communications,” in Proc. Fall Joint Computer Conference, AFIPS Conference, vol. 37, pp. 281–285, Houston, Tex, USA, November 1970.

[18] F. Tobagi, “Analysis of a two-hop centralized packet radio network - Part I: slotted ALOHA,” IEEE Trans. Commun., vol. 28, no. 2, pp. 196–207, 1980.

[19] F. Baccelli, B. Blaszczyszyn, and P. Muhlethaler, “A Spatial Reuse Aloha MAC Protocol For Multihop Wireless Mobile Networks,” Tech. Rep. 4955, Institut National de Recherche en Informatique et en Automatique (INRIA), Rocquencourt, Le Chesnay Cedex, France, October 2003, Available at http://www.terminodes.org/MV2003-Present/Me15/Spacial-Baccelli.pdf.

[20] R. Nelson and L. Kleinrock, “The spatial capacity of a slotted ALOHA multihop packet radio network with capture,” IEEE Trans. Commun., vol. 32, no. 6, pp. 684–694, 1984.

[21] D. Bertsekas and R. Gallager, Data Networks, Prentice-Hall, Englewood Cliffs, NJ, USA, 2nd edition, 1991.

[22] M. Haenggi, “Energy-balancing strategies for wireless sensor networks,” in Proc. IEEE International Symposium on Circuits and Systems (ISCAS ’03), vol. 4, pp. IV-828–IV-831, Bangkok, Thailand, May 2003.

[23] M. Haenggi, “On routing in random Rayleigh fading networks,” to appear in IEEE Transactions on Wireless Communications, http://www.nd.edu/~mhaenggi/.

[24] R. Mathar and J. Mattfeldt, “On the distribution of cumulated interference power in Rayleigh fading channels,” Wireless Networks, vol. 1, no. 1, pp. 31–36, 1995.

[25] N. Ahmed and R. G. Baraniuk, “Throughput measures for delay-constrained communications in fading channels,” in Proc. Allerton Conference on Communication, Control and Computing, Monticello, Ill, USA, October 2003.

[26] M. Hellebrandt and R. Mathar, “Cumulated interference power and bit-error-rates in mobile packet radio,” Wireless Networks, vol. 3, no. 3, pp. 169–172, 1997.

[27] A. M. Mathai, An Introduction to Geometrical Probability, Gordon and Breach Science Publishers, New York, NY, USA, 1999.

Xiaowen Liu received the M.S. degree in signal and information processing from the Institute of Acoustics, Chinese Academy of Sciences, in 1998. She entered the University of Notre Dame as a graduate student in January 2001. She earned the Master's degree in electrical engineering in 2002. Currently, she is working toward the Ph.D. degree in the Department of Electrical Engineering. Her current research interests include wireless networks and communications, especially the performance analysis of wireless ad hoc and sensor networks.

Martin Haenggi received the Dipl. Ing. (M.S.) degree in electrical engineering from the Swiss Federal Institute of Technology in Zurich (ETHZ) in 1995. In 1995, he joined the Signal and Information Processing Laboratory at ETHZ as a Teaching and Research Assistant. In 1996, he earned the Dipl. NDS ETH (post-diploma) degree in information technology, and in 1999, he completed his Ph.D. thesis on the analysis, design, and optimization of cellular neural networks. After a postdoctoral year at the Electronics Research Laboratory, the University of California in Berkeley, he joined the faculty of the Electrical Engineering Department, the University of Notre Dame, as an Assistant Professor in January 2001. For both his M.S. and his Ph.D. theses, he was awarded the ETH Medal, and he received a CAREER Award from the US National Science Foundation in 2005. He is a Member of the Editorial Board of the Elsevier Journal on Ad Hoc Networks. His scientific interests include networking and wireless communications, with an emphasis on ad hoc and sensor networks.


EURASIP Journal on Wireless Communications and Networking 2005:4, 565–572
© 2005 X. Du and F. Lin

Maintaining Differentiated Coverage in Heterogeneous Sensor Networks

Xiaojiang Du
Department of Computer Science, North Dakota State University, Fargo, ND 58105, USA
Email: [email protected]

Fengjing Lin
Department of Computer Science, North Dakota State University, Fargo, ND 58105, USA
Email: [email protected]

Received 27 November 2004; Revised 22 March 2005

Most existing research considers homogeneous sensor networks, which suffer from performance bottlenecks and poor scalability. In this paper, we adopt a heterogeneous sensor network model to overcome these problems. Sensing coverage is a fundamental problem in sensor networks and has been well studied over the past years. However, most coverage algorithms only consider the uniform coverage problem, that is, all the areas have the same coverage degree requirement. In many scenarios, some key areas need a high coverage degree while other areas only need a low coverage degree. We propose a differentiated coverage algorithm which can provide different coverage degrees for different areas. The algorithm is energy efficient since it keeps only the minimum number of sensors working. The performance of the differentiated coverage algorithm is evaluated through extensive simulation experiments. Our results show that the algorithm performs much better than any other differentiated coverage algorithm.

Keywords and phrases: heterogeneous sensor networks, sensing coverage, differentiated coverage.

1. INTRODUCTION

Sensor networks hold the promise of facilitating large-scale, real-time data processing in complex environments. Existing research mainly considers homogeneous sensor networks, that is, all sensor nodes have identical capabilities in terms of communication, computation, sensing, reliability, and so forth. However, a homogeneous ad hoc network suffers from poor scalability. Recent research has demonstrated its performance bottleneck both theoretically (Gupta and Kumar [1] showed that the per-node throughput in a homogeneous ad hoc network is Θ(1/√n), where n is the number of nodes) and through simulation experiments and testbed measurements [2]. In this paper, we adopt a heterogeneous sensor network model to achieve good performance and scalability. Scalability is particularly important to large-scale sensor networks with hundreds or thousands of sensor nodes.

One of the fundamental problems in sensor networks is the sensing coverage problem. Sensing coverage characterizes the monitoring quality provided by a sensor network in a designated region. Energy is a paramount concern in wireless sensor network applications that need to operate for a long time on battery power. For example, habitat monitoring may require continuous operation for months, and monitoring civil structures (e.g., bridges) requires an operational lifetime of several years. Most sensor networks are deployed with high density (up to 20 nodes/m³ [3]) in order to prolong the network lifetime. Recent research has found that significant energy savings can be achieved by dynamic management of node duty cycles in sensor networks with high node density. In this approach, some nodes are scheduled to sleep (or enter a power saving mode) while the remaining active nodes provide continuous service. A fundamental problem is to minimize the number of nodes that remain active while still achieving acceptable quality of service for applications.

Most existing research considers the uniform sensing coverage problem in sensor networks, for example, PEAS [4] and OGDC [5]. In these algorithms, nodes switch to the sleeping state as long as their neighbors can provide sensing coverage for them. These algorithms provide the same coverage degree for the entire network area. However, in many scenarios such as battlefields, there are certain geographic sections, such as the command headquarters, that need a higher coverage degree than other areas. Since typical sensor nodes are unreliable devices that can fail or run out of power, and single sensing readings can easily be distorted by background noise and cause false alarms, it is desirable to provide a higher degree of coverage for critical areas. However, it is not efficient to support the same high degree of coverage for less important areas. To handle this issue, in this paper we propose a differentiated coverage algorithm for sensor networks. Differentiated coverage means providing different degrees of sensing coverage for different areas in a sensor network according to the requirement.

The main contributions of this paper are the following. (1) We adopt a heterogeneous sensor network model to achieve good performance and scalability. (2) We propose a novel differentiated coverage algorithm for sensor networks. The rest of this paper is organized as follows. Section 2 reviews the related work in the literature. In Section 3, we introduce the differentiated coverage algorithm. Section 4 presents the simulation results. Section 5 concludes the paper.

2. RELATED WORKS

Sensing coverage in sensor networks has been well studied. Several algorithms aim to find close-to-optimal solutions based on global information. In [6], a linear programming technique is applied to select the minimal set of active nodes for maintaining coverage. In [7], sensor deployment strategies were investigated to provide sufficient coverage for distributed detection. In [4], Ye et al. presented PEAS, a probing-based sensing coverage algorithm. Tian and Georganas [8] proposed an algorithm that provides complete coverage using the concept of “sponsored area.” Both [4, 8] only consider the metric in terms of the total amount of energy consumed, regardless of the distribution of the energy among the nodes. The unbalanced energy dissipation causes some nodes to die much faster than others; therefore, the half-life of the network is dramatically reduced in the unbalanced approach. In [5], Zhang and Hou showed that coverage with minimal overlap is achieved when three sensor nodes form an equilateral triangle, and they proposed a localized density control algorithm, OGDC, based on this result.

In [9], Yan et al. proposed a differentiated surveillance algorithm for sensor networks. In the algorithm, the sensor network is covered by uniformly distributed grid points, and the coverage of the network is converted to the coverage of all the grid points. Each sensor node chooses a random time reference point Ref within [0,T] (T is the operation round), and broadcasts its location and Ref to the neighbors. Then each node locally decides its schedule of sleep and work, based on the Ref and location information of the neighbors that cover the same grid point. Since each sensor node usually covers several grid points, a scheme is needed to combine the schedules for covering multiple grid points. In [9], the final schedule of a sensor node is the union of its schedules for all the grid points that it can cover. However, since the Ref point is randomly selected, the probability that several Ref points are close to each other is very small. In other words, the multiple Ref points are usually scattered across the [0,T] time period, and thus the union of schedules usually leads to a very long working duration, which means that a sensor node will work for most of the time. For example, if a sensor node needs to cover three grid points, and the schedules for the grid points are [0,T/3], [T/2, 2T/3], and [2T/3,T], respectively, then the union of these schedules has a duration of 5T/6, which means the sensor node needs to work for 5/6 of the time. Thus, the differentiated surveillance algorithm in [9] is not efficient.
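The 5T/6 figure in the example above is simply the total length of the union of the three intervals. A minimal sketch that reproduces this arithmetic is given below; it uses exact fractions with T normalized to 1, and the function name `union_duration` is hypothetical.

```python
from fractions import Fraction

def union_duration(intervals):
    """Total length of the union of closed intervals given as (start, end)."""
    total = Fraction(0)
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Disjoint from the running block: close it and start a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or touching: extend the running block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total

# Schedules from the example, with the round length T normalized to 1.
schedules = [
    (Fraction(0), Fraction(1, 3)),
    (Fraction(1, 2), Fraction(2, 3)),
    (Fraction(2, 3), Fraction(1)),
]
working_fraction = union_duration(schedules)
```

Here the last two intervals touch at 2T/3 and merge into [T/2, T], so the union consists of [0, T/3] and [T/2, T].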

Recently deployed sensor network systems are increasingly following heterogeneous designs, incorporating a mixture of sensors with widely varying capabilities [10]. For example, in a smart home environment, sensors may be powered by AA batteries, AAA batteries, or even button batteries. Researchers have studied various issues in heterogeneous sensor networks. In [11], Mhatre et al. studied the optimum node density and node energies to guarantee a lifetime in heterogeneous sensor networks. Duarte-Melo and Liu analyzed energy consumption and lifetime of heterogeneous sensor networks in [12].

In this paper, we adopt a heterogeneous sensor network model to overcome the poor scalability and performance bottleneck of homogeneous sensor networks. We propose a novel differentiated coverage algorithm for wireless sensor networks.

3. THE ENERGY-EFFICIENT DIFFERENTIATED COVERAGE ALGORITHM

In this section, we present our differentiated coverage (DC) algorithm for heterogeneous sensor networks. We consider a heterogeneous sensor network (HSN) consisting of two types of nodes: a small number of powerful high-end sensors (H-sensors) and a large number of low-end sensors (L-sensors). One can build a heterogeneous sensor network by distributing H-sensors and L-sensors at the same time, or by adding a small number of H-sensors to an existing homogeneous sensor network. H-sensors and L-sensors are assumed to be uniformly and randomly distributed in the field. Both H-sensors and L-sensors are assumed to know their location information. Sensor nodes can use location services such as those in [13, 14] to estimate their locations, so no GPS receiver is required at each node. The operation of a sensor network is divided into several rounds, each of the same duration T. We assume that the L-sensor's transmission range rt is at least twice its sensing range rs, that is, rt ≥ 2rs. This is true for many sensor nodes, including the Mica II sensor [15].

In Section 3.1, we describe the cluster formation scheme in an HSN. In Section 3.2, we present the scheme that provides uniform coverage in a sensor network. The uniform coverage problem is a special case of the differentiated coverage problem. In Section 3.3, we present the differentiated coverage (DC) algorithm.

3.1. Cluster formation in HSN

During the initialization phase, all H-sensors broadcast Hello messages to nearby L-sensors with a random delay. The random delay is to avoid the collision of Hello messages from two neighbor H-sensors. The Hello message includes the ID of the H-sensor and its location. Since the locations of H-sensors are random, H-sensors use the maximum transmission power to broadcast the Hello messages. With a sufficient number of H-sensors uniformly and randomly distributed in the network, most L-sensors can receive Hello messages from multiple H-sensors, and most H-sensors can hear Hello messages from neighbor H-sensors. Then each L-sensor chooses the H-sensor whose Hello message has the best signal strength as the cluster head. Each L-sensor also records the other H-sensors from which it receives Hello messages, and these H-sensors are listed as backup cluster heads in case the primary cluster head fails.

If an L-sensor does not hear any Hello message during the initialization phase (e.g., T seconds after deployment), the node will broadcast an Explore message. When the neighbor L-sensors receive the Explore message, they will respond with an Ack message after a random delay. The Ack message includes the location and ID of the sender's cluster head. An L-sensor will not send an Ack message if it overhears an Ack response from another neighbor. This mechanism reduces the number of response messages and thus the consumed energy. Then the L-sensor can select a cluster head based on the Ack message. This ensures that each L-sensor finds a cluster head.

The sensor network is divided into multiple clusters, where H-sensors serve as the cluster heads. For simplicity, assume the network is a two-dimensional plane; then each L-sensor will select the closest H-sensor as the cluster head (except when there is an obstacle in between), and this leads to the formation of a Voronoi diagram where the cluster heads are the nuclei of the Voronoi cells. An example of the cluster formation is shown in Figure 1. The large rectangle nodes in Figure 1 are H-sensors and the small square nodes are L-sensors. During initialization, each H-sensor also records the locations of the neighbor H-sensors (based on the Hello messages), and H-sensors can calculate the boundary of the Voronoi cells based on the locations of neighbor H-sensors.
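The cluster formation described above can be illustrated with a small sketch. It assumes an obstacle-free plane in which the strongest Hello signal comes from the closest H-sensor, and that every L-sensor hears every H-sensor; the function name `form_clusters` and the dictionary layout are hypothetical.

```python
import math

def form_clusters(h_sensors, l_sensors):
    """Assign each L-sensor to its closest H-sensor (its Voronoi cell).

    h_sensors, l_sensors: dicts mapping a node ID to an (x, y) position.
    Returns (clusters, backups): clusters maps an H-sensor ID to the list of
    its L-sensors; backups maps an L-sensor ID to the remaining H-sensors,
    ordered from nearest to farthest, as backup cluster heads.
    """
    clusters = {h_id: [] for h_id in h_sensors}
    backups = {}
    for l_id, pos in l_sensors.items():
        # Distance is used as a stand-in for received signal strength.
        ranked = sorted(h_sensors, key=lambda h: math.dist(pos, h_sensors[h]))
        clusters[ranked[0]].append(l_id)
        backups[l_id] = ranked[1:]
    return clusters, backups
```

With obstacles or irregular propagation, the signal-strength ordering would have to come from measurements rather than from geometry, and the resulting cells would no longer be exact Voronoi cells.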

3.2. The uniform sensing coverage scheme

We first present the scheme that provides uniform coverage in a sensor network. A grid is installed in the sensor network, and the grid points are uniformly distributed in the network. An example is shown in Figure 2, where the crosses are the grid points. Assuming all H-sensors know the location of a reference grid point and the grid size (e.g., by storing this information before deployment), the H-sensors know the locations of all the grid points. An H-sensor can determine which grid points are covered by an L-sensor based on the L-sensor's location and sensing range. We first study the problem of covering all the grid points while minimizing sensor energy consumption. When a reduced sensing range is used for node scheduling, it can be shown that covering all grid points is equivalent to covering the whole field. The reduced sensing range should satisfy rc < ra − d/√2, where rc, ra, and d are the reduced sensing range, the actual sensing range, and the grid side length, respectively. We will not present the details here. In [9], Yan et al. also showed the above equivalence. The goal is to design a node-scheduling scheme that ensures that all the grid points have the required coverage, while at the same time minimizing the total energy consumption in the network and balancing node energy consumption.

Figure 1: Voronoi cells in an HSN.

Figure 2: Coverage for grid points (grid points 1 to 4 and L-sensors A to F).
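The grid-point coverage test with a reduced sensing range can be sketched as follows. The sketch assumes a rectangular grid with `nx` by `ny` points starting at `origin`; these parameters and the function name `covered_grid_points` are hypothetical. A grid point is counted as covered when its distance to the sensor is strictly below ra − d/√2, which is the same as saying that some admissible reduced range rc < ra − d/√2 reaches it.

```python
import math

def covered_grid_points(sensor, r_actual, d, origin, nx, ny):
    """Return the set of grid indices (i, j) covered under the reduced range.

    sensor:   (x, y) position of the L-sensor
    r_actual: actual sensing range ra
    d:        grid side length
    origin:   (x, y) position of grid point (0, 0)
    """
    # Any field point lies within d / sqrt(2) of some grid point, hence
    # the margin subtracted from the actual sensing range.
    bound = r_actual - d / math.sqrt(2.0)
    covered = set()
    for i in range(nx):
        for j in range(ny):
            gx = origin[0] + i * d
            gy = origin[1] + j * d
            if math.hypot(sensor[0] - gx, sensor[1] - gy) < bound:
                covered.add((i, j))
    return covered
```

A cluster head could call this once per L-sensor and invert the result to obtain, for each grid point, the list of L-sensors that cover it.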

The node scheduling is processed in each cluster independently. In a sensor network, all the grid points are numbered in a certain way, for example, from top to bottom and from left to right. In each cluster, the node scheduling is processed in increasing order of grid point number. That is, the schedule of sensors covering grid point 1 is determined first, then the schedule of sensors covering grid point 2 is determined, and so on.

In the sensing coverage scheme, a cluster head determines the node scheduling for all the L-sensors in its cluster. After initialization, each L-sensor sends its location information to the cluster head. Since the location of the cluster head is known from the Hello message, the greedy geographic routing protocol GPSR [10] is used for intra-cluster routing. An L-sensor sends the packet to the active neighbor that has the shortest distance to the cluster head, and the next node does the same, until the packet reaches the cluster head. Since nodes within a cluster are not far away from the cluster head, greedy geographic routing should be able to route packets to the cluster head with high probability. The chance of encountering a void during greedy geographic routing (i.e., all the neighbors have a longer distance to the cluster head than the node itself) is small. In case this happens, several recovery schemes can be used to solve the problem; for example, GPSR [10] and GOAFR [16] route a packet around the faces of a planar subgraph extracted from the original network.
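The greedy forwarding step described above can be sketched as follows. This illustrates only the greedy choice and the detection of a void; it is not an implementation of GPSR or GOAFR, whose face-routing recovery is omitted. The function name `next_hop` is hypothetical.

```python
import math

def next_hop(node, head, active_neighbors):
    """Pick the active neighbor closest to the cluster head.

    node, head: (x, y) positions of the current node and the cluster head.
    active_neighbors: iterable of (x, y) positions of awake neighbors.
    Returns the chosen neighbor position, or None when no neighbor is
    closer to the head than the node itself (a void).
    """
    own_distance = math.dist(node, head)
    best = min(active_neighbors, key=lambda n: math.dist(n, head), default=None)
    if best is None or math.dist(best, head) >= own_distance:
        return None  # void: a recovery scheme would take over here
    return best
```

Repeating this step from node to node moves the packet strictly closer to the cluster head at every hop, so the greedy phase cannot loop.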

After a certain time, a cluster head should have received the location information from all the L-sensors in its cluster. The cluster head then starts determining the node schedule for each grid point in the cluster, in increasing order of the grid point number. In the following, we use the example in Figure 2 to illustrate the scheme that determines the node schedule for a grid point. Based on the locations of the L-sensors, the cluster head (say H) knows which L-sensors cover a grid point, that is, the L-sensors within the circle centered at the grid point with radius rs (the sensing range). In Figure 2, three L-sensors (D, E, F) cover grid point 2.

H counts the total number (say k) of L-sensors that cover grid point 2. An ideal schedule for the k sensors is that each L-sensor works for T/k time and sleeps for T − T/k time in a round T. This ensures that the total energy consumption is minimized and that each node has similar remaining energy. However, a sensor node may also need to cover other grid points, and some sensors may already have one or more assigned working slots. H considers the assigned working slots of each L-sensor and tries to assign a time slot that has the maximal overlap with the existing working slots. For example, if node D already has a working slot of [0, T/4] (for covering grid point 1), then H can assign the working slot [0, T/3] to D. Thus D only needs to be active during [0, T/3] and covers both grid points 1 and 2. If there is a conflict, a node may receive an additional (or overlapping) working slot besides its existing working slots.
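The example with node D can be reproduced with a short overlap computation. The sketch below normalizes the round length to T = 1 and uses exact fractions; the helper names `overlap` and `best_slot` are hypothetical, and the sketch assumes a sensor's existing working slots do not overlap each other.

```python
from fractions import Fraction

def overlap(a, b):
    """Length of the intersection of intervals a and b (0 if they are disjoint)."""
    return max(Fraction(0), min(a[1], b[1]) - max(a[0], b[0]))

def best_slot(k, existing, T=Fraction(1)):
    """Among the k equal slots of a round, pick the one overlapping most with
    the sensor's existing working slots (the earliest slot wins ties)."""
    slots = [(i * T / k, (i + 1) * T / k) for i in range(k)]
    return max(slots, key=lambda s: sum(overlap(s, e) for e in existing))
```

With k = 3 and the existing slot [0, T/4], the slot [0, T/3] has an overlap of T/4 while the other two slots have none, which matches the assignment described above.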

After determining the node schedule for all grid points in the cluster, the cluster head H includes the working slots for all the L-sensors in one packet and broadcasts the packet to all L-sensors in its cluster. Each L-sensor records its own working slots as well as the working slots of its neighbors. The neighbor working slot information is used by the greedy geographic routing protocol, GPSR [10]. When an L-sensor wants to send a packet, it sends the packet to an active neighbor that has the shortest distance to the cluster head.

Periodically, all L-sensors wake up and enter a listen state, and cluster heads reschedule working slots for the L-sensors. This ensures that the coverage algorithm is robust to sensor failures. For example, at the end of each round, all L-sensors wake up and enter a listen state, and each cluster head broadcasts a rescheduling message to the L-sensors in its cluster. Each live L-sensor then sends a packet to the cluster head, including its location and node ID. Cluster heads determine the node schedule based on the coverage algorithm.

For the sensing coverage scheme to work well, the L-sensors in a cluster need to be synchronized. However, L-sensors from different clusters need not be synchronized, since the node scheduling is determined in each cluster independently. For our heterogeneous sensor network model, a simple scheme can be used to synchronize the L-sensors within a cluster. Each time before a cluster head H broadcasts the node schedule, H broadcasts a short synchronization message including its local time, and all the L-sensors synchronize their time with cluster head H.

3.3. The differentiated coverage algorithm

The above sensing coverage scheme can be easily extended to provide differentiated coverage for sensor networks. To adjust the sensing coverage degree of a certain area to an arbitrary degree c, the cluster head correspondingly increases or decreases the work time of each L-sensor in the area. For a grid point covered by k sensor nodes, the work time for each sensor node is T/k (in each round T) to provide degree-1 coverage. For degree-c coverage, the work time for each sensor node is cT/k. Thus, it is easy to provide differentiated coverage for a sensor network by using our scheme. The differentiated coverage algorithm is presented in Algorithm 1. In the following, we use the example in Figure 2 to describe the details of the differentiated coverage algorithm.

A cluster head H determines the schedules of all L-sensors in its cluster. For a grid point (say point 2 in Figure 2) in its cluster, H first counts the total number (say k) of L-sensors that cover this grid point. If k ≤ c, then all the L-sensors that cover point 2 need to be active all the time. If k > c, H determines the working slots for each L-sensor. An ideal schedule for the k sensors is that each L-sensor works for cT/k time and sleeps for T − cT/k time in a round T. This ensures that the total energy consumption is minimized and that each node has similar remaining energy. However, a sensor node may also need to cover other grid points, and some sensors may already have one or more assigned working slots. H considers the assigned working slots of each L-sensor and tries to assign a time slot that has the maximal overlap with the existing working slots.

In the scheduling algorithm, each round T is divided into k equal time slots, that is, [0, T/k], [T/k, 2T/k], ..., [(k − 1)T/k, T], and these time slots are indexed by 1, 2, ..., k. A set I holds the indexes of the available time slots. Initially, set I includes all the time slots, that is, I = {1, 2, ..., k}.

Each L-sensor is assigned c time slots for a required coverage degree c, and this is done by the second FOR loop in Algorithm 1. In each iteration of the second FOR loop, one time slot is selected for each of the k L-sensors. To avoid assigning the same time slot to an L-sensor twice, a set Bl is used to store the selected time slots for an L-sensor l. In the third FOR loop, for each L-sensor l, the algorithm finds a time slot j that belongs to set I but not to set Bl while maximizing the overlap with node l’s existing working slots (which are used to cover other grid points). If I ⊆ Bl, that is, all the time slots left in set I are also in set Bl, then a time slot not in set Bl is randomly selected. After selecting all the c time slots


Notations:
H is the cluster head.
U is the set of grid points in H’s cluster.
u is a grid point in H’s cluster, that is, u ∈ U.
L(u) is the set of L-sensors that cover grid point u.
k = |L(u)| is the total number of L-sensors that cover grid point u.
c is the required coverage degree.
Each round T is divided into k equal time slots, that is, [0, T/k], [T/k, 2T/k], ..., [(k − 1)T/k, T], and these time slots are indexed by 1, 2, ..., k.
I is the set of indexes of the available time slots. Initially I = {1, 2, ..., k}.
Bl is the set of selected time slots for an L-sensor l. Initially Bl = ∅.
The following scheduling algorithm runs in each cluster head.

FOR each grid point u ∈ U
    // Iterating c times for a required coverage degree c.
    FOR i = 1 to c
        Resetting the available time slot set I = {1, 2, ..., k}.
        // For each L-sensor l ∈ L(u), 1 ≤ l ≤ k.
        FOR l = 1 to k
            IF I ⊄ Bl, find a time slot j that satisfies the following 3 conditions:
                (1) j ∈ I; // Selecting j from available slots.
                (2) j ∉ Bl; // j should not be the same as any previously selected slot.
                (3) j has the maximal overlap with l’s existing working slots.
            ELSE // That is, I ⊆ Bl
                A time slot j not in Bl is randomly selected.
            ENDIF
            Adding time slot j to set Bl.
            Removing j from the available time slot set I, that is, I = I − {j}.
        END // End of the third FOR loop.
    END // End of the second FOR loop.
    // Adding the selected slots to the working slots.
    FOR each L-sensor l ∈ L(u)
        Adding set Bl to l’s working slots.
    END
END // End of the first FOR loop.

Algorithm 1: The differentiated coverage algorithm.

for each L-sensor, the selected time slots are added to that L-sensor’s working slots.
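The body of the first FOR loop in Algorithm 1 (the scheduling of one grid point) can be sketched in Python as follows. This is an illustrative reading of the pseudocode, not the authors' implementation: the function and parameter names, the normalization T = 1, the earliest-slot tie-breaking, and the handling of the k ≤ c case as a single always-on interval are assumptions.

```python
import random
from fractions import Fraction

def slot_interval(j, k, T):
    """Interval of the j-th of k equal slots in a round of length T (j is 1-based)."""
    return ((j - 1) * T / k, j * T / k)

def overlap(a, b):
    """Length of the intersection of two intervals (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))

def schedule_grid_point(sensors, working, c, T=Fraction(1), rng=random):
    """Assign c slots to every sensor covering one grid point.

    sensors: IDs of the L-sensors in L(u); working: dict ID -> list of
    existing working intervals (extended in place). Returns the sets Bl.
    """
    k = len(sensors)
    all_slots = set(range(1, k + 1))
    if k <= c:  # too few sensors: everyone stays active for the whole round
        for l in sensors:
            working[l].append((0 * T, T))
        return {l: set(all_slots) for l in sensors}
    B = {l: set() for l in sensors}
    for _ in range(c):                      # second FOR loop
        I = set(all_slots)                  # reset the available slots
        for l in sensors:                   # third FOR loop
            candidates = sorted(I - B[l])
            if candidates:                  # I is not a subset of Bl
                j = max(candidates, key=lambda s: sum(
                    overlap(slot_interval(s, k, T), w) for w in working[l]))
            else:                           # I is a subset of Bl: random fallback
                j = rng.choice(sorted(all_slots - B[l]))
            B[l].add(j)
            I.discard(j)
    for l in sensors:                       # add the selected slots
        working[l].extend(slot_interval(j, k, T) for j in sorted(B[l]))
    return B
```

Working slots are kept as a plain list of intervals, so a sensor's active time is the union of that list. The sketch only guarantees that each sensor receives c distinct slots; it does not check the coverage degree achieved in each slot.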

In [5], Zhang and Hou prove that a radio range of at least twice the sensing range is both necessary and sufficient to ensure that coverage implies connectivity. In [17], Wang et al. show a similar result. Our sensing coverage algorithm ensures coverage in a sensor network and thus guarantees connectivity in the network when rt ≥ 2rs.

4. PERFORMANCE EVALUATION

In this section, we evaluate the performance of the differentiated coverage (DC) algorithm and compare DC with another differentiated coverage algorithm from [9], which we refer to as the differentiated surveillance (DS) algorithm. The following metrics are used to show the energy savings and efficient coverage provided by the DC algorithm: (1) total energy consumption, (2) energy variation among nodes, (3) sensing coverage over time, (4) energy consumption for differentiated coverage, and (5) the number of working nodes.

We implemented the DC algorithm in QualNet. For comparison, the DS algorithm was also implemented in QualNet. The underlying medium access control protocol is IEEE 802.11 DCF. We adopt the same energy model as in TTDD [18]: a sensor node’s transmitting, receiving, and idling power consumption rates are 0.66 W, 0.395 W, and 0.035 W, respectively. In DC, GPSR [10] is used as the routing protocol for transmissions from L-sensors to cluster heads. The default simulation testbed has 1 sink and 300 L-sensors uniformly and randomly distributed in a 200 m × 200 m area. The sensing range and communication range of an L-sensor are 10 m and 25 m, respectively. The grid size d is 4 m. For DC, there are an additional 10 H-sensors in the network. Although H-sensors also provide sensing coverage, for a fair comparison we do not count the coverage from H-sensors in the following experiments.
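To make the energy model concrete, the sketch below accumulates power × time over the radio states using the three rates quoted above. It is only an illustration: the function name, the treatment of an awake-but-silent node as idling, and a sleep power of zero are assumptions not stated in the text, and the result is in joules, which need not be the "unit" used on the figure axes.

```python
# Power draw per state in watts, as quoted from the TTDD energy model [18].
POWER_W = {"tx": 0.66, "rx": 0.395, "idle": 0.035}

def node_energy_joules(seconds_in_state, sleep_power_w=0.0):
    """Sum power * time over states.

    seconds_in_state: dict mapping "tx", "rx", "idle" or "sleep" to seconds.
    sleep_power_w is an assumption of this sketch (defaults to 0 W).
    """
    energy = 0.0
    for state, seconds in seconds_in_state.items():
        power = sleep_power_w if state == "sleep" else POWER_W[state]
        energy += power * seconds
    return energy
```

Under these assumptions, a node that idles for one quarter of a 500-second round and sleeps for the rest uses a quarter of the energy of a node that idles for the whole round.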

Each simulation runs for 2000 seconds, and each experiment is run 10 times with different node deployments and different random seeds. Each round T is set to 500 seconds, so there are 4 rounds in each simulation. In the DC algorithm, all L-sensors enter the listen state after every 500 seconds (one round) and are rescheduled by the cluster heads. In the following tests, the communication cost of transferring data packets is not included in the energy consumption, since it is highly application dependent. Sections 4.1, 4.2, and 4.3 consider the uniform coverage case, and Sections 4.4 and 4.5 consider differentiated coverage.

4.1. Total energy consumption

In Figure 3, we compare the total energy consumed at different node densities using the DC algorithm and the DS algorithm. The total number of L-sensors varies from 200 to 500 in increments of 50. The number of H-sensors in DC does not change. The total energy consumption when all sensor nodes are working is also plotted in Figure 3. The total energy consumption in DC also includes the energy consumption of the H-sensors.

From Figure 3, we can see that DC consumes much less energy than both DS and the “all working” case. DS consumes less energy than “all working” when sensor density is high. For the “all working” case, the total energy consumed is close to a linear function of the number of sensors and increases quickly as the number of sensors increases, while the energy consumption under DS and DC increases slowly when the number of sensors becomes large. In the DS and DC algorithms, only a portion of the sensors (enough to cover the area) are active at any time. When sensor density increases, the required coverage degree does not change, so their energy consumption does not increase much. The small increase in energy consumption in DS and DC mainly comes from the communication overhead of determining the node schedule.


Figure 3: The total energy consumption. (Axes: total energy consumption (unit) versus number of L-sensors, 200 to 500; curves: all working, differentiated surveillance, differentiated coverage.)

Figure 3 also shows that the total energy consumed in DC is only about 1/3 of that in DS. In DS, each node decides its own schedule, and the integrated schedule is the union of the schedules for all the grid points that it can cover. For a sensor node using the DS algorithm, the working slot for covering a grid point is randomly selected. Because of this randomness, the working slots for different grid points are usually different, so the union of the schedules for different grid points leads to a long working duration. For example, consider a node C that covers three grid points. If the work times for the three grid points are [0, T/4], [T/3, 2T/3], and [3T/4, T], respectively, then node C works for T/4 + T/3 + T/4 = 5T/6 in each round (T seconds). In the DC algorithm, on the other hand, the cluster head considers the existing working slots when it makes the schedule for covering the current grid point and tries to maximize the overlap between the existing and new working slots, which dramatically reduces the total work time of each node. In the above example, sensor node C could be scheduled to work only during [T/3, 2T/3] to cover the three grid points; the work duration is then only T/3, much less than the 5T/6 under the DS algorithm.
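The work durations in the node C example are the lengths of interval unions, which can be checked with a short sketch. The function name `union_length` is hypothetical, and the round length is normalized to T = 1 with exact fractions.

```python
from fractions import Fraction

def union_length(intervals):
    """Total length of the union of a list of (start, end) intervals."""
    total = Fraction(0)
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                total += cur_end - cur_start  # close the previous merged run
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)       # extend the current merged run
    if cur_end is not None:
        total += cur_end - cur_start
    return total

F = Fraction
ds_slots = [(F(0), F(1, 4)), (F(1, 3), F(2, 3)), (F(3, 4), F(1))]  # random, disjoint
dc_slots = [(F(1, 3), F(2, 3))] * 3                                 # fully overlapped
```

Here `union_length(ds_slots)` evaluates to 5/6 and `union_length(dc_slots)` to 1/3, matching the 5T/6 and T/3 figures in the text.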

4.2. Balancing node energy consumption

In this study, we investigate the energy consumption of individual sensor nodes. Specifically, we want to check whether the energy consumption is balanced among different sensor nodes. We measure the average value (Ave) and standard deviation (Std) of the energy consumed by each node under different node densities, and the results are reported in Figure 4.
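The two balance metrics can be computed from a list of per-node energy values as sketched below. The function name is hypothetical, and the use of the population standard deviation (rather than the sample version) is an assumption, since the text does not say which one is reported.

```python
import statistics

def energy_balance(per_node_energy):
    """Return (average, standard deviation) of the per-node energy values.

    Uses the population standard deviation; this choice is an assumption.
    """
    return statistics.mean(per_node_energy), statistics.pstdev(per_node_energy)
```

A smaller second value for the same mean indicates that energy use is spread more evenly across nodes.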

Figure 4 shows that in both the DS and DC algorithms, the average energy consumption of an individual node decreases as the network node density increases. This is reasonable, since more nodes means less work time for each node and thus less energy consumed. The average energy consumption of each node in DC is always lower than that in DS, which shows that DC is more energy efficient than DS. The reason is given in Section 4.1. In addition, Figure 4 shows that the standard deviation in DC is also smaller than that in DS, which means that node energy consumption is better balanced in DC than in DS.

Figure 4: Average and standard deviation of node energy consumption. (Axes: energy consumption (unit) versus number of L-sensors, 200 to 500; curves: DS-Ave, DC-Ave, Std-DS, Std-DC.)

4.3. Coverage over time

The coverage of a sensor network at different time instants after network deployment is an important performance metric. We measure the sensing coverage over time by running the simulation for a longer period, 6000 seconds. Each sensor node has a fixed energy supply, and it dies when the energy supply runs out. We test the sensing coverage for two different node densities: 300 nodes and 450 nodes. The test results are reported in Figure 5.

Figure 5 shows that before 2000 seconds, the sensing coverage under DC and DS is close. After 2000 seconds, the coverage under the DS algorithm drops rapidly as time increases, and the sensing coverage is less than 30% at 6000 seconds. On the other hand, the sensing coverage under the DC algorithm drops slowly as time increases. At 6000 seconds, the coverage under DC is still above 80% for the 450-node network and close to 70% for the 300-node network. Sensor nodes using the DS algorithm have much longer work (active) times and die earlier than nodes using the DC algorithm. That is why the sensing coverage under DS drops very fast, and why a sensor network using DS can only provide low coverage after a long period of time.
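One plausible way to compute a coverage percentage at a given instant is to count the grid points that lie within the sensing range of at least one live, active sensor. The sketch below follows that reading; the text does not spell out the exact measurement procedure, so the definition, the function name, and the sample coordinates in the usage note are assumptions.

```python
import math

def coverage_percentage(grid_points, active_sensors, r_s):
    """Percentage of grid points within r_s of at least one active sensor.

    grid_points, active_sensors: lists of (x, y) positions.
    """
    if not grid_points:
        return 0.0
    covered = sum(
        1 for g in grid_points
        if any(math.dist(g, s) <= r_s for s in active_sensors)
    )
    return 100.0 * covered / len(grid_points)
```

For instance, with grid points at x = 0, 4, 8, and 40 on a line, one active sensor at the origin, and a sensing range of 10 m, three of the four points are covered.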

4.4. Energy consumed for differentiated coverage

In this subsection, we measure the performance of the DC algorithm for differentiated coverage and compare the total energy consumed in DC with that in DS for different desired coverage degrees. In this experiment, different areas in the network have different desired coverage degrees. To make the comparison meaningful, the same differentiated coverage requirements are used for both the DC and DS algorithms, that is, the same desired coverage degree is used for the same grid point in both DC and DS. The average required coverage degrees (over the network) tested are 1, 2, 3, and 4. The test results are reported in Figure 6, where a sensor network with 600 L-sensors is used. From Figure 6, we can see that the total energy consumption increases roughly linearly with the desired coverage degree in both the DC and DS algorithms. The energy consumed at a higher average coverage degree k is slightly less than k times the energy consumed at coverage degree 1, because the communication overhead does not increase proportionally with the desired coverage degree. Figure 6 shows that the total energy consumed in DC is much lower than that in DS for all the desired coverage degrees tested.

Figure 5: Sensing coverage over time. (Axes: coverage percentage (%) versus simulation time (s), 0 to 6000; curves: DS, 450 nodes; DC, 450 nodes; DS, 300 nodes; DC, 300 nodes.)

Figure 6: Total energy consumption for differentiated coverage. (Axes: total energy consumption (unit) versus average desired coverage degree, 1 to 4; curves: differentiated surveillance, differentiated coverage.)

Figure 7: The number of working nodes for different node densities. (Axes: average number of working nodes versus number of sensor nodes, 300 to 600; curves: differentiated surveillance, differentiated coverage.)

4.5. The number of working nodes

To reduce the total energy consumption in sensor networks, the number of active sensors should be kept to a minimum. The average number of working nodes is measured for different sensor node densities, varying from 300 to 600 nodes. The results under DS and DC are plotted in Figure 7, where the required average coverage degree is two. Figure 7 shows that the number of working nodes in DC does not change much as sensor density increases. In DC, cluster heads combine a node’s working slots (for covering different grid points) and thus dramatically reduce the total work time of a node, which in turn reduces the average number of working nodes in the network. Since the required coverage degree does not change, the number of working nodes in DC does not change much. In DS, the work time of a node is the union of the schedules for covering multiple grid points, and in many cases it is much longer than the work time in DC. Thus, the average number of working nodes in DS is larger than that in DC. The higher node density is not well utilized by DS because of the randomness in setting work time: as node density increases, more nodes in DS have long work times, so the gap in the number of working nodes between DS and DC grows.

5. CONCLUSIONS

In this paper, we adopted a heterogeneous sensor network model to overcome the poor scalability and performance bottleneck of homogeneous sensor networks. A small number of high-end sensors are mixed with a large number of low-end sensors to form a heterogeneous sensor network. We proposed the differentiated coverage (DC) algorithm for heterogeneous sensor networks, which can provide different coverage degrees for different areas. In DC, cluster heads integrate each sensor’s work time for covering multiple grid points and dramatically reduce the total active time of each sensor. The energy consumption and sensing coverage of the DC algorithm are evaluated through simulation experiments and compared with another differentiated coverage algorithm, DS. Our test results show that the DC algorithm consumes less total energy, balances node energy consumption better, maintains sensing coverage longer, and uses fewer working nodes than the DS algorithm.

REFERENCES

[1] P. Gupta and P. R. Kumar, “The capacity of wireless networks,” IEEE Trans. Inform. Theory, vol. 46, no. 2, pp. 388–404, 2000.

[2] K. Xu, X. Hong, and M. Gerla, “An ad hoc network with mobile backbones,” in Proc. IEEE International Conference on Communications (ICC ’02), vol. 5, pp. 3138–3143, New York, NY, USA, April–May 2002.

[3] E. Shih, S.-H. Cho, N. Ickes, et al., “Physical layer driven protocol and algorithm design for energy-efficient wireless sensor networks,” in Proc. 7th Annual International Conference on Mobile Computing and Networking (MobiCom ’01), pp. 272–287, Rome, Italy, July 2001.

[4] F. Ye, G. Zhong, S. Lu, and L. Zhang, “PEAS: a robust energy conserving protocol for long-lived sensor networks,” in Proc. 23rd International Conference on Distributed Computing Systems (ICDCS ’03), pp. 169–177, Providence, RI, USA, May 2003.

[5] H. Zhang and J. C. Hou, “Maintaining sensing coverage and connectivity in large sensor networks,” International Journal of Wireless Ad Hoc and Sensor Networks, vol. 1, no. 1-2, pp. 89–124, 2005.

[6] K. Chakrabarty, S. S. Iyengar, H. Qi, and E. Cho, “Grid coverage for surveillance and target location in distributed sensor networks,” IEEE Trans. Comput., vol. 51, no. 12, pp. 1448–1453, 2002.

[7] T. Clouqueur, V. Phipatanasuphorn, P. Ramanathan, and K. K. Saluja, “Sensor deployment strategy for target detection,” in Proc. 1st International Workshop on Wireless Sensor Networks and Applications (WSNA ’02), pp. 42–48, Atlanta, Ga, USA, September 2002.

[8] D. Tian and N. D. Georganas, “A coverage-preserved node scheduling scheme for large wireless sensor networks,” in Proc. 1st International Workshop on Wireless Sensor Networks and Applications (WSNA ’02), pp. 32–41, Atlanta, Ga, USA, September 2002.

[9] T. Yan, T. He, and J. A. Stankovic, “Differentiated surveillance for sensor networks,” in Proc. 1st International Conference on Embedded Networked Sensor Systems (SenSys ’03), pp. 51–62, Los Angeles, Calif, USA, November 2003.

[10] B. Karp and H. T. Kung, “GPSR: greedy perimeter stateless routing for wireless networks,” in Proc. 6th Annual International Conference on Mobile Computing and Networking (MobiCom ’00), pp. 243–254, Boston, Mass, USA, August 2000.

[11] V. P. Mhatre, C. Rosenberg, D. Kofman, R. Mazumdar, and N. Shroff, “A minimum cost heterogeneous sensor network with a lifetime constraint,” IEEE Transactions on Mobile Computing, vol. 4, no. 1, pp. 4–15, 2005.

[12] E. J. Duarte-Melo and M. Liu, “Analysis of energy consumption and lifetime of heterogeneous wireless sensor networks,” in Proc. IEEE Global Telecommunications Conference (GLOBECOM ’02), vol. 1, pp. 21–25, Taipei, Taiwan, November 2002.

[13] N. Bulusu, J. Heidemann, and D. Estrin, “GPS-less low-cost outdoor localization for very small devices,” IEEE Pers. Commun., vol. 7, no. 5, pp. 28–34, 2000.

[14] L. Doherty, K. S. J. Pister, and L. El Ghaoui, “Convex position estimation in wireless sensor networks,” in Proc. 20th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM ’01), vol. 3, pp. 1655–1663, Anchorage, Alaska, USA, April 2001.

[15] Crossbow Technology, San Jose, Calif, USA, http://www.xbow.com/.

[16] F. Kuhn, R. Wattenhofer, and A. Zollinger, “Worst-case optimal and average-case efficient geometric ad-hoc routing,” in Proc. 4th ACM International Symposium on Mobile Ad-Hoc Networking and Computing (MobiHoc ’03), pp. 267–278, Annapolis, Md, USA, June 2003.

[17] X. Wang, G. Xing, Y. Zhang, C. Lu, R. Pless, and C. Gill, “Integrated coverage and connectivity configuration in wireless sensor networks,” in Proc. 1st International Conference on Embedded Networked Sensor Systems (SenSys ’03), pp. 28–39, Los Angeles, Calif, USA, November 2003.

[18] F. Ye, H. Luo, J. Cheng, S. Lu, and L. Zhang, “A two-tier data dissemination model for large scale wireless sensor networks,” in Proc. 8th Annual International Conference on Mobile Computing and Networking (MobiCom ’02), pp. 148–159, Atlanta, Ga, USA, September 2002.

Xiaojiang Du is an Assistant Professor in the Department of Computer Science at North Dakota State University. He received his B.E. degree in electrical engineering from Tsinghua University, Beijing, China, in 1996, and his M.S. and Ph.D. degrees in electrical engineering from the University of Maryland, College Park, in 2002 and 2003, respectively. His research interests are wireless sensor networks, mobile ad hoc networks, wireless networks, computer networks, network security, and network management. He is a technical program committee member for several international conferences (including IEEE ICC 2006, Globecom 2005, BroadNets 2005, WirelessCom 2005, IPCCC 2005, and BroadWise 2004). He is a Member of IEEE.

Fengjing Lin is currently a Ph.D. student in the Department of Computer Science at North Dakota State University. She received her B.S. degree in education from Jia-Ying University, China, in 1999, and her M.S. degree in computer science from Southeastern University, Washington, DC, in 2003. Her research interests are wireless sensor networks, mobile ad hoc networks, and computer networks.
