a theoretical study of internet congestion control
TRANSCRIPT
A Theoretical Study of Internet Congestion Control:Equilibrium and Dynamics
Thesis by
Jiantao Wang
In Partial Fulfillment of the Requirements
for the Degree of
Doctor of Philosophy
California Institute of Technology
Pasadena, California
2005
(Defended July 28, 2005)
iv
Acknowledgements
I would like to express my deepest gratitude and appreciation to my advisors John Doyle
and Steven Low. Without their generosity and assistance, the completion of this thesis
would not have been possible. John provided me the great opportunity to pursue my doc-
toral degree at Caltech. His great vision, incredible insight, and amazing big picture have
always been a source of inspiration to my research. Steven has given me constant encour-
agement and excellent guidance through my research. There were countless individual
meetings and email exchanges from him providing me detailed help from formulating the
research idea, to solving the problem and to presenting the solution rigorously. It is hard to
imagine having done my Ph.D. without his help.
My gratitude also extends to Mani Chandy, Richard Murray, and Babak Hassibi for
serving on my thesis committee despite their busy schedules and for providing valuable
feedback.
I extend my gratitude to all my friends and colleagues in Netlab. To Ao Tang for the
fruitful and enjoyable collaboration on various research projects. To Lun Li for the collab-
oration and discussion in the TCP/IP routing project. To Xiaoliang Wei for the simulation
and discussion on the FAST stability project. These projects contribute to a major part of
this dissertation. I would like to thank Cheng Jin for numerous conversations on FAST
TCP. I also appreciate the helpful discussions with Hyojeong Choe about the stability of
Vegas and with Joon-Young Hyojeong about the global stability of FAST. I would like to
acknowledge Sanjeewa Athuraliya for introducing me to thens-2simulator. Thanks also
go to Christine Ortega for her arrangement of all my conference trips.
I would also like to acknowledge my friends in CDS: William Dunbar, Shane Ross,
Francois Lekien, Yindi Jing, Nizar Batada, and Xin Liu. I would also like to thank Yong
v
Wang who helped me to settle down when I first came to the States. I thank Aristotelis
Asimakopoulos and Lun Li for sharing an office with me.
I would like to thank my girlfriend Huirong Ai for her love and support. I would also
like to express my gratitude to my parents, and this dissertation is dedicated to them.
vi
Abstract
In the last several years, significant progress has been made in modelling the Internet con-
gestion control using theories from convex optimization and feedback control. In this dis-
sertation, the equilibrium and dynamics of various congestion control schemes are rigor-
ously studied using these mathematical frameworks.
First, we study the dynamics of TCP/AQM systems. We demonstrate that the dynamics
of queue and average window in Reno/RED networks are determined predominantly by the
protocol stability, not by AIMD probing nor noise traffic. Our study shows that Reno/RED
becomes unstable when delay increases and more strikingly, when link capacity increases.
Therefore, TCP Reno is ill suited for the future high-speed network, which has motivated
the design of FAST TCP. Using a continuous-time model, we prove that FAST TCP is
globally stable without feedback delays and provide a sufficient condition for local stability
when feedback delays are present. We also introduce a discrete-time model for FAST TCP
that fully captures the effect ofself-clockingand derive the local stability condition for
general networks with feedback delays.
Second, the equilibrium properties (i.e., fairness, throughput, and capacity) of TCP/AQM
systems are studied using the utility maximization framework. We quantitatively capture
the variations in network throughput with changes in link capacity and allocation fairness.
We clarify the open conjecture of whether a fairer allocation isalwaysmore efficient. The
effects of changes in routing are studied using a joint optimization problem over both source
rates and their routes. We investigate whether minimal-cost routing with proper link costs
can solve this joint optimization problem in a distributed way. We also identify the tradeoff
between achievable utility and routing stability.
At the end, two other related projects are briefly described.
vii
Contents
Acknowledgements iv
Abstract vi
1 Introduction 1
1.1 Challenges of developing theories for the Internet . . . . . . . . . . . . . .1
1.2 Related work in congestion control . . . . . . . . . . . . . . . . . . . . . .3
1.3 Summary of main results . . . . . . . . . . . . . . . . . . . . . . . . . . .5
1.3.1 Dynamics and stability . . . . . . . . . . . . . . . . . . . . . . . .5
1.3.1.1 Local stability of TCP/RED . . . . . . . . . . . . . . . . 6
1.3.1.2 Modelling and dynamics of FAST TCP . . . . . . . . . .6
1.3.2 Equilibrium and performance . . . . . . . . . . . . . . . . . . . .7
1.3.2.1 Relations among throughput, fairness, and capacity . . .7
1.3.2.2 Joint utility optimization over TCP/IP . . . . . . . . . . .8
1.3.3 Other related results . . . . . . . . . . . . . . . . . . . . . . . . .10
1.3.3.1 Network equilibrium with heterogeneous protocols . . .10
1.3.3.2 Control unresponsive flow–CHOKe . . . . . . . . . . . .11
1.4 Organization of this dissertation . . . . . . . . . . . . . . . . . . . . . . .11
2 Background and Preliminaries 13
2.1 Transmission Control Protocol (TCP) . . . . . . . . . . . . . . . . . . . .14
2.1.1 TCP Reno . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .14
2.1.2 TCP Vegas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .16
2.1.3 FAST TCP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17
viii
2.2 Active Queue Management (AQM) . . . . . . . . . . . . . . . . . . . . . .18
2.2.1 Droptail . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .18
2.2.2 Random Early Detection (RED) . . . . . . . . . . . . . . . . . . .19
2.2.3 CHOKe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .20
2.3 Unified frameworks for TCP/AQM systems . . . . . . . . . . . . . . . . .21
2.3.1 General dynamic model of TCP/AQM . . . . . . . . . . . . . . . .21
2.3.2 Duality model of TCP . . . . . . . . . . . . . . . . . . . . . . . .23
3 Local Dynamics of Reno/RED 27
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .27
3.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .28
3.3 Dynamic model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .31
3.3.1 Nonlinear model of Reno/RED . . . . . . . . . . . . . . . . . . . .32
3.3.2 Linear model of Reno/RED . . . . . . . . . . . . . . . . . . . . .34
3.3.3 Validation and stability region . . . . . . . . . . . . . . . . . . . .37
3.4 Local stability analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . .39
3.5 RED parameter setting . . . . . . . . . . . . . . . . . . . . . . . . . . . .41
3.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .42
4 Modelling and Dynamics of FAST 43
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .43
4.2 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .45
4.2.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .45
4.2.2 Discrete and continuous-time models . . . . . . . . . . . . . . . .46
4.2.3 Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .47
4.3 Stability analysis with the continuous-time model . . . . . . . . . . . . . .49
4.3.1 Global stability without feedback delay . . . . . . . . . . . . . . .49
4.3.2 Local stability with feedback delay . . . . . . . . . . . . . . . . .52
4.3.3 Numerical simulation and experiment . . . . . . . . . . . . . . . .53
4.4 Stability analysis with the discrete-time model . . . . . . . . . . . . . . . .55
4.4.1 Local stability with feedback delay . . . . . . . . . . . . . . . . .56
ix
4.4.2 Global stability for one link without feedback delay . . . . . . . . .61
4.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .65
4.6 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .66
4.6.1 Proof of Lemma 4.1 . . . . . . . . . . . . . . . . . . . . . . . . .66
4.6.2 Proof of Theorem 4.3 . . . . . . . . . . . . . . . . . . . . . . . . .67
4.6.3 Proof of Lemma 4.3 . . . . . . . . . . . . . . . . . . . . . . . . .69
4.6.4 Proof of Lemma 4.4 . . . . . . . . . . . . . . . . . . . . . . . . .70
4.6.5 Proof of Lemma 4.8 . . . . . . . . . . . . . . . . . . . . . . . . .71
5 Cross-Layer Optimization in TCP/IP Networks 73
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .73
5.2 Related work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .75
5.3 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .76
5.4 Equilibrium of TCP/IP . . . . . . . . . . . . . . . . . . . . . . . . . . . .80
5.5 Dynamics of TCP/IP . . . . . . . . . . . . . . . . . . . . . . . . . . . . .88
5.5.1 Simple ring network . . . . . . . . . . . . . . . . . . . . . . . . .89
5.5.2 Utility and stability of pure dynamic routing . . . . . . . . . . . . .90
5.5.3 Maximum utility of minimum-cost routing . . . . . . . . . . . . .93
5.5.4 Stability of minimum-cost routing . . . . . . . . . . . . . . . . . .97
5.5.5 General network . . . . . . . . . . . . . . . . . . . . . . . . . . .101
5.6 Resource provisioning . . . . . . . . . . . . . . . . . . . . . . . . . . . .103
5.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .105
5.8 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .106
5.8.1 Proof of duality gap . . . . . . . . . . . . . . . . . . . . . . . . .106
5.8.2 Proof of primal-dual optimality . . . . . . . . . . . . . . . . . . .108
5.8.3 Proof of Lemma 5.1 . . . . . . . . . . . . . . . . . . . . . . . . .109
6 Throughput, Fairness, and Capacity 111
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .111
6.2 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .113
6.3 Basic results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .115
x
6.4 Is fair allocation always inefficient? . . . . . . . . . . . . . . . . . . . . .118
6.4.1 Conjecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .118
6.4.2 Special cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . .120
6.4.3 Necessary and sufficient conditions . . . . . . . . . . . . . . . . .122
6.4.4 Counter-example . . . . . . . . . . . . . . . . . . . . . . . . . . .126
6.5 Does increasing capacity always raise throughput? . . . . . . . . . . . . .128
6.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .132
7 Other Related Projects 134
7.1 Network equilibrium with heterogeneous protocols . . . . . . . . . . . . .134
7.1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .134
7.1.2 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .135
7.1.3 Existence of equilibrium . . . . . . . . . . . . . . . . . . . . . . .136
7.1.4 Examples of multiple equilibria . . . . . . . . . . . . . . . . . . .137
7.1.5 Local uniqueness of equilibrium . . . . . . . . . . . . . . . . . . .139
7.1.6 Global uniqueness of equilibrium . . . . . . . . . . . . . . . . . .142
7.1.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .144
7.2 Control unresponsive flow–CHOKe . . . . . . . . . . . . . . . . . . . . .144
7.2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .144
7.2.2 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .145
7.2.3 Throughput analysis . . . . . . . . . . . . . . . . . . . . . . . . .147
7.2.4 Spatial characteristics . . . . . . . . . . . . . . . . . . . . . . . . .149
7.2.5 Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .152
7.2.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .155
8 Summary and Future Directions 156
Bibliography 158
xi
List of Figures
2.1 Congestion window of TCP Reno. . . . . . . . . . . . . . . . . . . . . . . .15
2.2 RED dropping function. . . . . . . . . . . . . . . . . . . . . . . . . . . . .20
2.3 General congestion control structure. . . . . . . . . . . . . . . . . . . . . .23
3.1 Window and queue traces without noise traffic. . . . . . . . . . . . . . . . .29
3.2 Queue traces with noise traffic. . . . . . . . . . . . . . . . . . . . . . . . . .31
3.3 Queue traces with heterogeneous delays. . . . . . . . . . . . . . . . . . . .32
3.4 Linear model validation. . . . . . . . . . . . . . . . . . . . . . . . . . . . .38
3.5 Stability region. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .39
4.1 Model validation–closed loop . . . . . . . . . . . . . . . . . . . . . . . . .48
4.2 Model validation–open loop. . . . . . . . . . . . . . . . . . . . . . . . . . .49
4.3 Numerical simulations of FAST TCP. . . . . . . . . . . . . . . . . . . . . .54
4.4 Dummynet experiments of FAST TCP. . . . . . . . . . . . . . . . . . . . .55
4.5 Illustration of Lemma 4.3. . . . . . . . . . . . . . . . . . . . . . . . . . . .69
5.1 Network to which integer partition problem can be reduced. . . . . . . . . .88
5.2 A ring network. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .89
5.3 The routingr(a). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .97
5.4 A random network. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .102
5.5 Aggregate utility as a function ofa for random network . . . . . . . . . . . .103
5.6 Proof of Lemma 5.1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .109
6.1 Linear network. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .120
6.2 Linear network with two long flows. . . . . . . . . . . . . . . . . . . . . . .121
xii
6.3 Fairness-efficiency tradeoff. . . . . . . . . . . . . . . . . . . . . . . . . . .122
6.4 Network for counter-example in Theorem 6.3. . . . . . . . . . . . . . . . . .126
6.5 Throughput versus efficiencyα in the counter-example. . . . . . . . . . . . .127
6.6 Source rates versusα in the counter-example. . . . . . . . . . . . . . . . . .128
6.7 Counter-example for Theorem 6.5(1). . . . . . . . . . . . . . . . . . . . . .130
6.8 Counter-example for Theorem 6.5(2). . . . . . . . . . . . . . . . . . . . . .131
7.1 Example 2: two active constraint sets. . . . . . . . . . . . . . . . . . . . . .137
7.2 Shifts between the two equilibria with different active constraint sets. . . . .138
7.3 Example 2: uncountably many equilibria. . . . . . . . . . . . . . . . . . . .138
7.4 Example 3: construction of multiple isolated equilibria. . . . . . . . . . . . .141
7.5 Example 3: vector field of (p1, p2). . . . . . . . . . . . . . . . . . . . . . . .142
7.6 Corollary 7.6: linear network. . . . . . . . . . . . . . . . . . . . . . . . . .143
7.7 Bandwidth shareµ0 v.s. Sending ratex0(1− r)/c. . . . . . . . . . . . . . .148
7.8 Illustration of Theorems 7.9 and 7.10. . . . . . . . . . . . . . . . . . . . . .151
7.9 Experiment 1: effect of UDP ratex0 on queue size and UDP share . . . . . .153
7.10 Experiment 2: spatial distributionρ(y). . . . . . . . . . . . . . . . . . . . .154
7.11 Experiment 3: effect of UDP ratex0 on queue size and UDP share with TCP
Vegas. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .155
xiii
List of Acronyms
ACK Acknowledgement
AIMD Additive Increase Multiplicative Decrease
AQM Acitve Queue Management
ATM Asynchronous Transfer Mode
ECN Explicit Congestion Notification
FDDI Fiber Distributed Data Interface
FIFO First In First Out
HTTP Hyper Text Transfer Protocol
LAN Local Area Network
IP Internet Protocol
ISP Internet Service Provider
MTU Maximum Transmission Unit
RED Random Early Detection
RFC Request for Comment
RTT Round-Trip Time
SACK Selective ACKnowledgement
SONET Synchronous Optical NETwork
TCP Transmission Control Protocol
UDP User Datagram Protocol
WWW World Wide Web
1
Chapter 1
Introduction
1.1 Challenges of developing theories for the Internet
The Internet is a worldwide-interconnected computer network that transmits data by packet
switching based on the TCP/IP protocol suite. Originated from the NSFnet with a hand-
ful of nodes, it has undergone explosive growth during the last two decades. Today, it
connects hundreds of millions of machines, reaches billions of people, and forms a glob-
ally distributed information-exchanging system. With services provided by the Internet,
encyclopedic information on every subject can be easily searched and accessed, millions
of people from every corner of the world can interact with each other via e-mail and in-
stant messengers, and businesses can be conducted in new and more efficient ways. As the
most important innovation of the last century, the Internet has fundamentally changed our
lifestyle.
The huge success of the Internet is achieved with improving designs and enriching pro-
tocols. While keeping pace with the advances in communication technology and non-stop
demand for additional bandwidth and connectivity, the Internet continuously experiences
changes and updates in almost all aspects; see [165] for well-documented details. Now,
the Internet has evolved into a large-scale, heterogeneous, distributed system with com-
plexity unparalleled by any other engineering system. For example, its scale, measured by
the number of connected hosts, has grown from two thousand at the end of 1985 to over
three hundred million in 2005 with a growth rate of 80% per year. Its heterogeneity ex-
ists and increases at almost every layer. In the link layers, there are wired, wireless, fiber,
2
and satellite links with bandwidths ranging from several Kbps to 10Gbps, and there are
propagation delays from nanoseconds to hundreds of milliseconds. Different networking
technologies such as Ethernet LANs, token ring, FDDI, ATM, and SONET are used simul-
taneously, while most of them did not even exist when the original Internet architecture
was conceived. In the application layer, new applications are continuously emerging, such
as multi-player network gaming, World Wide Web, streaming multimedia, peer-to-peer file
sharing, etc. Accompanying this increasing complexity is the decentralizing control of the
Internet. With the decommissioning of the NSFnet in 1995, large commercial ISPs began
to build and operate their own backbones. The Internet topology and inter-domain routing
became much more complex and hard to understand while every ISP is driven by profit.
As an evolving complex system with unprecedented scale and great heterogeneity, the
Internet presents an immense challenge for networking researchers to model and analyze
how it works. The innovation and development of the Internet are the results of an engi-
neering design cycle largely based on intuitions, heuristics, simulations, and experiments.
Formulating theories for such a complex heuristic system afterwards seems infeasible at the
first glance, which is partly the reason why theories for the Internet are lagging far behind
of its applications. However, in recent years large steps have been taken to build rigor-
ous mathematical foundations of the Internet in several areas, such as Internet topology
[92, 93], routing [51, 135], congestion control [98, 138], etc.
Previous Internet research has been heavily based on measurements and simulations,
which have intrinsic limitations. For example, network measurements cannot tell us the ef-
fects of new protocols before their deployment. Simulations only work for small networks
with simple topology due to the constraints of the memory size and processor speed. We
cannot assume that a protocol that works in a small network will still perform well in the
Internet. Furthermore, it is easier to verify the correctness of a mathematical analysis than
to check the feasibility of protocols in large-scale complex networks.
A theoretical framework can greatly help us understand the advantages and shortcom-
ings of current Internet technologies and guide us to design new protocols for identified
problems and future networks. Papachristodoulou et al. [128] also argued that protocol
design should be based on rigorous repeatable methodologies and systematic evaluation
3
frameworks. Design based on intuition can easily underestimate the importance of certain
system features and lead to a suboptimal solution, or even disastrous implementation. One
such example is the original design of HTTP protocol quoted from Floyd and Paxson [42]:
“The HTTP protocol used by the World Wide Web is a perfect example of a success
disaster. Had its designers envisioned it in use by the entire Internet, and had they
explored the corresponding consequences with analysis or simulation, they might have
significantly improved its design, which in turn could have led to a more smoothly
operating Internet today.”
In summary, developing theories for the Internet is very important and challenging,
as the design and analysis of protocols need rigorous frameworks. Recently, a unified
framework to study Internet congestion control has been proposed and will be described
in Section 2.3. We will study the equilibrium and dynamics of TCP systems based on this
framework.
1.2 Related work in congestion control
In recent years, large steps have been taken in bringing analytical models into Internet
congestion control. We survey some important work in this subsection.
The steady-state throughput of TCP Reno has been studied based on the stationary dis-
tribution of congestion windows, e.g., [38, 90, 122, 109]. These studies show that the TCP
throughput is inversely proportional to end-to-end delay and to the square root of packet
loss probability. Padhye et al. [124] refined the model to capture the fast retransmit mech-
anism and the time-out effect, and achieved a more accurate formula. This equilibrium
property of TCP Reno is used to define the notion ofTCP–friendlinessand motivates the
equation based congestion control TFRC [54].
Misra et al. [114, 115] proposed an ordinary differential equation model of the dynam-
ics of TCP Reno, which is derived by studying congestion window size with a stochastic
differential equation. This deterministic model treats the rate as fluid quantities (by assum-
ing that the packet is infinitely small) and ignores the randomness in packet level, in contrast
to the classical queueing theory approach, which relies on stochastic models. This model
4
has been quickly combined with feedback control theory to study the dynamics of TCP
systems, e.g., [60, 100], and to design stable AQM algorithms, e.g.,[8, 61, 82, 166, 133].
Similar flow models for other TCP schemes are also developed, e.g., [24, 101] for TCP
Vegas, and [69, 157] for FAST TCP. We will study the dynamics of TCP Reno and FAST
TCP with these models in Chapter 3 and 4.
The analysis and design of protocols for large-scale network have been made possible
with the optimization framework and the duality model. Kelly [77, 80] formulated the
bandwidth allocation problem as a utility maximization over source rates with capacity
constraints. A distributed algorithm is also provided by Kelly et al. [80] to globally solve
the penalty function form of this optimization problem. This algorithm is called the primal
algorithm where the sources adapt their rates dynamically, and the link prices are calculated
by a static function of arrival rates.
Low and Lapsley [97] proposed a gradient projection algorithm to solve its dual prob-
lem. It is shown that this algorithm globally converges to the exact solution of the original
optimization problem since there is no duality gap. This approach is called the dual algo-
rithm, where links adapt their prices dynamically, and the users’ source rates are determined
by a static function.
There is a large body of research in congestion control based on this utility maximiza-
tion framework. Local stability with feedback delay is studied for the primal algorithm in
[106, 71, 151]. For more results on global stability and stability of other algorithms, please
see [161, 134, 158, 127]. For discussion on implementation of such algorithms in the In-
ternet with ECN, see [6, 5, 89, 102, 125]. Mehyar et al. [112] analyzed converge regions
when there are price estimation errors. The extension of this framework into multi-cast and
multi-path routing is provided in [75, 74, 95]. The joint optimization over both routing and
source rates is studied in [78, 154].
Low [96] provided a duality model that leads to a unified framework to understand and
design TCP/AQM algorithms. This framework viewed the TCP source rates as primal vari-
ables and congestion measures as the dual variables, and interpreted the congestion control
as a distributed primal-dual algorithm over the Internet to solve the utility maximization
problem. The existing TCP/AQM protocols can be reverse-engineered to determine the
5
underlying utility functions. The equilibrium properties of a TCP/AQM system, such as
throughput and fairness, can be readily understood by studying the corresponding opti-
mization problems with these utility functions. We can also start with a general utility
function and design TCP/AQM to achieve this utility, e.g., FAST TCP [69]. The details of
this duality model will be briefly covered Section 2.3.
The optimization framework can not be used in certain situations, e.g., networks with
heterogeneous protocols [147]. It is worth noting that there are some other approaches to
studying Internet congestion control. For example, non-cooperative game theory is used in
[164, 49, 4, 32], and stochastic models are used in [148, 9] with large number flows.
1.3 Summary of main results
The main results of this dissertation are summarized in this subsection. There are two fun-
damental topics in this thesis: equilibrium and dynamics. First, we are concerned with
the dynamics of existing TCP algorithms and examine in particular the local and global
stabilities of the postulated equilibria using feedback control theory. Second, we study the
equilibrium properties such as fairness, throughput, and routing using the utility optimiza-
tion framework. At the end of the dissertation, we briefly describe two related projects:
equilibrium of heterogeneous protocols and characteristics of CHOKe. The existing opti-
mization framework is not applicable in these two cases, and new tools are introduced to
study them.
1.3.1 Dynamics and stability
Stability is an important property of congestion control systems. There is currently no
unified theory to understand the behavior of a distributed nonlinear feedback system with
delay when the system loses stability. It is therefore undesirable to let TCP/AQM systems
operate in an unstable regime, and unnecessary if stability can be maintained without sac-
rificing performance. In fact, instability can cause three problems. First, it increases jitters
in source rate and delay and can be detrimental to some applications. Second, it subjects
6
short-duration connections, which are typically delay and loss sensitive, to unnecessary
delay and loss. Finally, it can lead to under-utilization of network links if queues jump
between empty and full. The studies of TCP Reno and FAST TCP are shown bellow.
1.3.1.1 Local stability of TCP/RED
TCP Reno and its variants are the only congestion control schemes deployed in the Internet.
It has been observed that TCP/RED may oscillate wildly, and it is difficult to reduce the
oscillation by tuning RED parameters [110, 26]. Although the AIMD strategy employed
by TCP Reno and noisy link traffic certainly contribute to the oscillation, we show that
their effects are small in comparison with protocol instability. We demonstrate that this
oscillation behavior of queue and average window is determined predominantly by the
instability of TCP Reno/RED.
We provide a general nonlinear model of Reno/RED systems, and study the local sta-
bility of Reno/RED with feedback delays. We also validate the model with simulations
and illustrate the stability region of TCP Reno–RED. It turns out that Reno/RED becomes
unstable when delay increases and more strikingly when network capacity increases! This
work is published in [99, 100] and will be presented in Chapter 3 of this dissertation.
1.3.1.2 Modelling and dynamics of FAST TCP
The oscillation persists in TCP/RED systems, even if we smooth out AIMD. Our research
suggests that Reno/RED is ill suited for future high-speed networks, which motivates the
design of new distributed algorithms for large bandwidth-delay product networks. The re-
cent development in optimization and control theory for Internet congestion control played
an important role in the design of new TCP algorithms. It provides a framework to un-
derstand and design protocols with the desired equilibrium and dynamic properties. FAST
TCP [69] is one of such algorithms that are designed based on this theoretical framework.
The modelling and dynamics of FAST TCP is studied in this dissertation. Based on
the existing continuous–time flow model, we prove that FAST TCP is globally stable for
arbitrary networks when there is no feedback delay. However, this model predicts insta-
7
bility for homogeneous sources sharing a single link when feedback delay is large, while
experiments suggest otherwise. We conjecture that this inconsistence is partly due to the
self–clockingeffect, which is not captured by this model. A discrete–time model is intro-
duced to fully capture the effects. Using this discrete-time model, we derive a sufficient
condition for local asymptotic stability for general networks with feedback delay. The con-
dition says that local stability depends on delays only through their heterogeneity, which
implies in particular that FAST TCP is locally asymptotically stable when all sources have
the same delay no matter how large the delay is. We also prove global stability for a single
bottleneck link in the absence of feedback delay. The techniques developed in this work
are new and applicable to other protocols. These results have been published in [156, 157]
and will be presented in Chapter 4.
1.3.2 Equilibrium and performance
Recent studies have shown that any TCP congestion control algorithm can be interpreted
as carrying out a distributed primal-dual algorithm over the Internet to maximize aggregate
utility, and a user’s utility function is implicitly defined by its TCP algorithm [80, 97, 101,
96]. The equilibrium properties of TCP/AQM systems such as throughput, performance,
and fairness can be studied via the corresponding convex optimization problem.
1.3.2.1 Relations among throughput, fairness, and capacity
The relations among these equilibrium quantities are studied under the optimization frame-
work in this dissertation. More specifically, we try to answer whether a fair allocation is
always inefficient and whether increasing capacity always raises throughput. We are espe-
cially interested in a class of utility functions [116]
U(xi, α) =
(1− α)−1 x1−αi if α 6= 1
log xi if α = 1, (1.1)
whereα is a non-negative parameter. This utility function is special because it includes all
the previously considered allocation policies: maximum throughput (α = 0), proportional
8
fairness (α = 1, achieved by TCP Vegas and FAST), minimum potential delay (α = 2,
approximately achieved by TCP Reno), and max–min fairness (α = ∞ ). The parameterα
can be interpreted as a quantitative measure of fairness [107, 16]. An allocation isfair if α
is large andefficientif aggregate throughput is large.
All examples in the literature suggest that a fair allocation is necessarily inefficient.
We derive explicit expressions for the changes in throughput when the parameterα or the
capacities change. We characterize exactly the tradeoff between fairness and throughput
in general networks. This characterization allows us both to produce the first counter-
example and trivially explain all the previous supporting examples. Surprisingly, the class
of networks in our counter-example is such that a fairer allocation isalwaysmore efficient.
In particular it implies that max–min fairness can achieve a higher aggregate throughput
than proportional fairness.
Intuitively, we might expect that increasing link capacities always raises aggregate
throughput. We show that not only can throughput be reduced when some link increases
its capacity, but more strikingly, it can also be reduced whenall links increase their ca-
pacities by the same amount. If all links increase their capacities proportionally, however,
throughput will indeed increase. These examples demonstrate the intricate interactions
among sources in a network setting that are missing in a single-link topology. This work is
published in [144, 146].
1.3.2.2 Joint utility optimization over TCP/IP
The previous subsection studies the effects of changes in fairness and link capacity. In this
section, we will study the effects of routing changes by investigating the joint utility maxi-
mization over source rates and their routes and try to understand the cross-layer interaction
of TCP-AQM, minimum-cost routing, and resources allocation.
Routing in the current Internet within an Autonomous System is computed by IP and
uses single-path, minimum-cost routing, which generally operates on a slower time scale
than TCP/AQM. The joint utility maximization over both source rates and their routes can
9
be formulated as
maxR∈R
maxx≥0
∑i
Ui(xi) s. t.Rx ≤ c, (1.2)
whereR is the set of all feasible single-path routing matrices. Its Lagrangian dual is
minp≥0
∑i
maxxi≥0
(Ui(xi)− xi min
Ri∈Ri
∑
l
Rlipl
)+
∑
l
clpl, (1.3)
whereRi denotes the set of available routes for sourcei. A striking feature of the associated
dual problem is that the maximization over routes takes the form of minimal-cost routing
with prices as link costs. This raises the question whether TCP/IP might turn out to be a
distributed primal-dual algorithm to solve this joint optimization with proper choice of link
costs.
We show that the primal problem (1.2) is NP-hard and in general can not be solved by
minimal-cost routing. When the congestion prices generated by TCP–AQM are used as
link costs, TCP/IP indeed solves the dual problem (1.3) if it converges to an equilibrium.
However, this utility optimization problem is non-convex, and a duality gap generally exits
between (1.2) and (1.3). Equilibrium of TCP/IP exists if and only if there is no such gap.
We also show that this gap can be described as the penalty for not splitting traffic across
multiple paths in single-path routing.
When such equilibrium exists, it is generally unstable under pure dynamic routing. It
can be stabilized by adding a static component to the link costs, but at the expense of a
reduced achievable utility in equilibrium. We demonstrate this inevitable tradeoff between
utility maximization and routing stability with a simple ring network. We also present
numerical results to validate this tradeoff in a general network topology. These results also
suggest that routing instability can reduce aggregate utility to less than that achievable by
pure static routing.
We show that if the link capacities are optimally provisioned, thenpure staticrouting
is enough to maximize utility even for general networks. Moreover single-path routing
achieves the same utility as multi-path routing at optimality. This work is presented in
10
[153, 154].
1.3.3 Other related results
The following two projects are studied outside the optimization framework. We will briefly
describe the model, approach, and results in Chapter 7. See [147, 142, 155, 143, 145] for
details of these two projects.
1.3.3.1 Network equilibrium with heterogeneous protocols
An important assumption in the duality model is that all the TCP sources are homogeneous,
that means that they all adapt to the same type of congestion signals, e.g., loss probability
in TCP Reno and queueing delay in FAST [69]. During the incremental deployment of
new congestion control protocols such as FAST, there is an important and inevitable phase
where heterogeneous TCP algorithms reacting to different congestion signals coexist in the
same network. In this situation, the current optimization framework breaks down, and the
resulting equilibrium can no longer be interpreted as a solution to a utility maximization
problem. Characterizing the equilibrium of a general network with heterogeneous protocols
is substantially more difficult than in the homogeneous case.
We prove that, under mild assumptions, equilibrium still exists despite the lack of an
underlying optimization problem using the Nash theorem in game theory. In contrast to
the homogeneous protocol case with a unique equilibrium, there can be uncountably many
equilibria with heterogeneous protocols as illustrated by our examples. However, we can
also show that almost all networks have finitely many equilibria, and they are necessarily
locally unique. Multiple locally unique equilibria can arise in two ways. First, the set of
bottleneck links can be non-unique. The equilibria associated with different sets of bottle-
neck links are necessarily distinct. Second, even when there is a unique set of bottleneck
links, network equilibrium can still be non-unique, but is always finite and odd in number.
They cannot all be locally stable unless the equilibrium is globally unique. We also provide
various sufficient conditions for global uniqueness. This work also appears in [147, 142].
11
1.3.3.2 Control unresponsive flow–CHOKe
All our previous studies have assumed that all sources utilize a certain TCP scheme to adapt
their rates based on network congestion. The number of non-rate-adaptive (e.g., UDP-
based) applications is growing in the Internet. Without a proper incentive structure, these
applications may result in more severe congestion by monopolizing the network bandwidth
to the detriment of rate-adaptive applications. This has motivated a new AQM algorithm
CHOKe [126], which is stateless, simple to implement, and yet surprisingly effective in
protecting TCP from unresponsive UDP flows.
We present a deterministic fluid model that explicitly models both the feedback equilib-
rium of the TCP/CHOKe system and the spatial characteristics of the queue. We prove that,
provided the number of TCP flows is large, the UDP bandwidth share peaks at(e+1)−1 =
0.269 when UDP input rate is slightly larger than link capacity and drops to zero as UDP
input rate tends to infinity. We clarify the spatial characteristics of the leaky buffer under
CHOKe that produce this throughput behavior. Specifically, we prove that, as UDP input
rate increases, even though the total number of UDP packets in the queue increases, their
spatial distribution becomes more and more concentrated near the tail of the queue and
drops rapidly to zero toward the head of the queue. In stark contrast to a non-leaky FIFO
buffer where UDP bandwidth share would approach 1 as its input rate increases without
bound, under CHOKe, UDP simultaneously maintains a large number of packets in the
queue and receives a vanishingly small bandwidth share, the mechanism through which
CHOKe protects TCP flows. This work is published in [155, 143, 145].
1.4 Organization of this dissertation
The rest of this dissertation is organized as follows:
Chapter 2 provides background information in congestion control research. First, var-
ious existing Transmission Control Protocols and Active Queue Management schemes are
briefly described. Then a general network model of TCP/AQM systems is presented. We
also review the resource allocation problem based on utility maximization. The duality
12
model, which interprets the TCP/AQM as a distributed primal–dual algorithm, is presented
with details. These models form the basis of our studies on network dynamics and equilib-
ria in the following chapters.
Chapter 3 and 4 include our studies on the dynamics of TCP/AQM systems. We show
that Reno/RED becomes unstable when delay increases and when network capacity in-
creases. This motivated the design of FAST. The modelling of FAST and several stability
results are presented.
Chapter 6 presents our research on the equilibrium properties of TCP systems. The
relation between fairness and efficiency, and the relation between link capacity and source
throughput are studied in an analytical way.
Chapter 5 describes the joint utility maximization problem over both source rates and
their routes, and tries to answer whether TCP/IP with minimal-cost routing distributedly
solves this problem by proper choice of link costs.
Chapter 7 briefly covers two other related projects. In Section 7.1, we study the equi-
librium structures of networks with heterogeneous congestion control protocols that react
to different congestion signals. In Section 7.2, we analyze CHOKe, which is a new AQM
that aims to protect TCP sources from unresponsive flows. Both the feedback equilibrium
of the TCP/CHOKe system and the spatial characteristics of the leaky queue are studied.
Chapter 8 concludes this dissertation and points out several future research directions.
13
Chapter 2
Background and Preliminaries
Internet congestion occurs when the aggregate demand for certain resources (e.g., link
bandwidth) exceeds the available capacity. Results of this congestion include long trans-
fer delay, high packet loss, constant packet retransmission, and even possible congestion
collapse [63], in which network links are fully utilized, but the throughput, which an ap-
plication obtains, is close to zero. It is clear that in order to maintain good network perfor-
mance, certain mechanisms must be provided to prevent the network from being severely
congested for any significant period of time.
One intuitive solution is to use network provision to provide more resources. However,
Jain [66] had shown that large memory, high-speed links, and fast processors would not
solve the congestion problem in computer networks. Although the bandwidth exponen-
tially increased in the last decade, the request for additional bandwidth remained, and new
applications consumed much more bandwidth than expected, e.g., peer-to-peer file sharing
[52, 43]. The need for good congestion control schemes has been intensified by the increas-
ing capacity of the Internet instead of being alleviated, while what we want to achieve is
performance, stability, and fairness in a more heterogeneous environment [66]. Therefore,
congestion control is still a very important subject even in the future high-speed network.
Congestion control studies the design and analysis of distributed algorithms to share
network resources among competing users. The goal is to match the demand with available
resources to reduce congestion and under-utilization and to allocate the resources fairly.
There are two components in Internet Congestion Control. The first is a source algorithm
implemented in Transmission Control Protocol to dynamically adjust the sending rate based
14
on congestion along its path. The other is the Active Queue Management algorithm running
on the routers, which updates the congestion information and feeds it back to sources im-
plicitly or explicitly in the form of packet loss, delay, or marking. We will briefly describe
several such algorithms in the following subsections.
2.1 Transmission Control Protocol (TCP)
The early version of TCP used for the Internet before 1988 did not have a proper conges-
tion control scheme built in, and its main purpose was to guarantee reliable data transfer
across the unreliable best-effort network. This resulted in frequent congestion collapses
throughout the mid-1980s until the algorithm to dynamically adapt source rate based on
packet loss was introduced by Jacobson [63]. The algorithm has undergone many minor,
but important changes, e.g., [64, 140, 108, 3, 40, 57]. It has several slightly different im-
plemented versions such as TCP Tahoe, Reno, NewReno, and SACK, which have similar
essential features of additive increase and multiplicative decrease. We will not distinguish
them and will refer to them as TCP Reno in this dissertation.
2.1.1 TCP Reno
TCP Reno has performed remarkably well and has prevented severe congestion as the In-
ternet expanded by five orders of magnitude in size, speed, load, and connectivity. Mea-
surements in core routers have indicated that about 90% of all the traffic is generated by
TCP Reno sources [137]. TCP Reno is the only deployed congestion control scheme in
the current Internet, and it is very important for us to have a solid understanding of how
it works. In this subsection, we will describe the congestion control mechanism of TCP
Reno.
A TCP Reno source sends packets using a sliding window algorithm, see [129] for
details. Its sending rate is controlled by the congestion window size, which is the maxi-
mum number of packets that have been sent, yet not acknowledged. When the congestion
window is exhausted, the source must wait for an acknowledgement before sending a new
15
packet. This is the ”self-clocking” feature [98], which automatically slows down the source
when a network becomes congested and round-trip time (RTT) increases. Since there is
roughly one window of packets sent out for every RTT, the source rate is controlled by
the window size divided by RTT. The key idea in this algorithm is to additively increase
congestion window size for additional bandwidth and multiplicatively decrease it while
network congestion is detected.
A connection starts with a small window size of one packet, and the source increments
its window by one every time it receives an acknowledgement. This doubles the window
every RTT and is calledslow start, see Figure 2.1. In this phase, the source exponentially
increases its rate and can grab the available bandwidth quickly. (It isslow compared to
the old design where the source sends as many packets as the receiver’s advertised window
size.) When the window size reaches the slow-start threshold (ssThreshold), the source
enters thecongestion avoidancephase, where it increases its window by the reciprocal
of the current window size for each acknowledgement (ACK). This increases the window
by one in each round-trip time and is referred to as additive increase. When a loss is
detected through duplicate ACKs, the source halves its window size, updates the value of
ssThreshold, and performs afast recoveryby retransmitting the lost packets. When a loss is
detected through timeout expiration, the congestion window is reset to one, and the source
re-enters theslow-startphase. The mathematical model and dynamics of TCP Reno will
be studied in detail in Chapter 3.
cwnd
TimeTime
Slow Start
W Congestion Avoidance
RTT
1
W/2
ssThreshold
1
Timeout
Figure 2.1: Congestion window of TCP Reno.
16
There are some drawbacks in using packet loss as an indication of congestion. First,
high utilization can be achieved only with full queues, i.e., when the network operates at the
boundary of congestion [98]. This is ill-suited to the heavy-tailed TCP traffic, as observed
in [162, 91, 167]. While most TCP connections are “mice” (small, requiring low latency
[53]), a few “elephants” (long TCP connections, tolerating large latency) generate most of
the traffic. First, operating around a state with full queue, the mice suffer unnecessary loss
and queuing delay. Second, the performance of a loss-based TCP source will be degraded
in the situation where losses are due to other effects (e.g., wireless links).
There are also some other TCP alternatives we will briefly describe below.
2.1.2 TCP Vegas
Instead of using packet loss as a measure of congestion, there is another class of congestion
control algorithms that adapt their congestion window size based on end-to-end delay. This
approach is originally described by Jain [65] and is represented by TCP Vegas [19, 20] and
FAST TCP [69].
There are several key differences between TCP Vegas and TCP Reno. In slow-start
phase, TCP Vegas incorporates its congestion detection mechanism into slow-start with
minor modifications to grow the window size more cautiously. When packet loss is de-
tected, TCP Vegas uses a new retransmission mechanism and treats the receipt of certain
ACKs as a trigger to check if a timeout should happen [19]. The most important difference
between them is that TCP Vegas updates its congestion window size based on end-to-end
delay.
TCP Vegas source estimates its round-trip propagation delay as the minimal RTT, mea-
suring the current RTT for each ACK received. Then, it can figure out the number of its
own packets buffered along the path as the product of end-to-end queueing delay and its
sending rate. The source will try to keep this number in a region, specified by two param-
etersα andβ. The window size linearly increases, decreases, or maintains the same by
comparing this number withα andβ. The aim is to maintain a small number of packets in
the buffer to fully utilize the link and experience a small queueing delay.
17
Low et al. [101] provided a duality model for TCP Vegas and studied its equilibrium in
detail. It is shown that TCP Vegas achieves weighted proportional fairness at the equilib-
rium when there is sufficient buffer. Choe and Low [24] studied the dynamics of the TCP
Vegas algorithm, showing that it can become unstable in the presence of network delay,
and provided modification for better stability.
2.1.3 FAST TCP
It is shown [100, 59] that the current congestion control algorithms, TCP Reno, and its
variants do not scale with bandwidth-delay products of the Internet as it continues to grow,
and will eventually become performance bottlenecks. This has motivated the design of
FAST TCP [69, 70], which targets high-speed networks with long latency. Unlike other
congestion control algorithms, it is designed based on a theoretical framework [98, 102]
and aims to achieve high throughput while maintaining a stable and fair equilibrium.
FAST TCP adjusts its congestion window size based on queueing delay instead of
packet loss. In networks with large bandwidth-delay products, packet losses are rare events,
and each packet loss only provides one bit of information. The queueing delay can be mea-
sured for each ACK packet, and the results provide multi-bit information. The measured
queueing delay is processed with a low-pass filter to provide more accurate and smooth
information about the congestion in the networks. This measured queueing delay is fed
into an equation to decide the changes in the congestion window size. In the congestion
avoidance phase, FAST periodically updates the congestion window according to [69]:
w ←− min
{2w, (1− γ)w + γ
(baseRTT
RTTw+ α
) }
whereγ ∈ (0, 1], baseRTT is the minimum RTT observed so far, andα is a constant.
Although FAST TCP and TCP Vegas have different window update algorithms and dy-
namics, they share the same equilibrium properties. Similar to TCP Vegas, FAST achieves
weighted proportional fairness, and the constantα is also the number of packets a flow
attempts to maintain in the network buffers at equilibrium.
There are some other important implementation features of FAST that are not described
18
here, for example, burst control and window pacing. The details of the architecture, algo-
rithms, extensive experimental evaluations of FAST TCP, and comparison with other TCP
variants can be found in [68]. I will provide mathematical models of FAST TCP, and will
study its dynamics in detail in Chapter 4.
There are also some other TCP congestion control proposals for high-speed networks,
which will not be covered in detail here. The eXplicit Congestion control Protocol (XCP)
[76], proposed by Katabi et al., is designed based on control theory and requires explicit
feedback from the routers to achieve stability and fairness. The High Speed TCP (HSTCP)
[39], proposed by Floyd, is a modification of current TCP to increase more aggressively and
decrease more cautiously in large congestion window situations. The scalable TCP [81],
proposed by Kelly, uses multiplicative increase and multiplicative decease instead of TCP
Reno’s AIMD. The BIC TCP [163], proposed by Xu et al., uses binary search increase and
additive increase. See [21, 118] for experiments and performance comparisons between
these new proposals.
2.2 Active Queue Management (AQM)
The AQM algorithm runs on a router, which updates and feedbacks congestion information
to end-users. The feedback is usually in the form of packet loss, delay, or marking. There
is a very large body of AQMs proposed, and I will just describe few common AQMs in this
subsection.
2.2.1 Droptail
Droptail is the simplest AQM scheme in the current Internet. It is just a first-in-first-
out(FIFO) queue with limited capacity, and it simply drops any incoming packets when
the queue is full. Since it is simple and easy to implement, Droptail is the dominant AQM
in the current Internet. This FIFO queue helps to achieve better link utilization and absorbs
the bursty traffic.
The congestion information in a Droptail queue is updated by the queueing process and
19
is represented by the size of the backlog buffer. The delay-based TCP algorithms, e.g.,
TCP Vegas and FAST, receive this information by sensing the changes in the round-trip
delay. The dynamics of FAST TCP will be studied in Chapter 4 using Droptail routers with
sufficient buffer.
For loss-based sources, the Droptail queue sends back one bit of information by a packet
drop, which indicates that the router buffer is full and the network is congested. When
working with TCP Reno, Droptail routers have two drawbacks: thelock-outand thefull-
queuephenomena, which are pointed out in Braden et al. [17]. Thelock–outphenomenon
involves a single or a few sources that monopolize the bandwidth. This situation is usually
the result of synchronization [55, 117]. Thefull-queuephenomenon refers to the effect
that the queue can be full (or almost full) for long periods of time, which produces large
end-to-end delays.
One possible solution to overcome these problems is to detect congestion early and to
convey congestion notification to sources before queue overflow. We describe one such
solution below.
2.2.2 Random Early Detection (RED)
The Random Early Detection algorithm, or RED, is proposed by Floyd and Jacobson [41] to
solve the synchronization and full queue problems of Droptail. In contrast to Droptail that
drops packets deterministically when the buffer is full, the RED algorithm drops arriving
packets probabilistically based on average queue size. The packet is dropped randomly to
break up synchronized processes that lead to the lock-out phenomenon, and RED controls
the average queue size to avoid queue overflow.
There are two components in the RED algorithm. The first is the estimation of average
queue size using the exponential weighted average, which can also be interpreted as a low
pass filter to get rid of noise. The other part of the algorithm decides whether to drop an
incoming packet. There are three RED parametersminth , maxth , andmaxp controlling
the dropping probability as shown in Figure 2.2. When the average queue size is less
than the minimum thresholdminth , the dropping probability is zero. When it exceeds
20
the maximum thresholdmaxth , all the incoming packets will be dropped. When it is in
between, packets will be dropped with a probability that varies linearly from 0 tomaxp.
maxth avg. queue size
p
minth
maxp
Figure 2.2: RED dropping function.
The RED can also mark the incoming packets instead of dropping them with the deploy-
ment of Explicit Congestion Notification (ECN) [131] to prevent packet loss and improve
throughput. The basic idea of ECN is to give the network the ability to explicitly signal
TCP sources of congestion using one additional bit in the IP packet header and to have the
TCP sources reduce their transmission rates in response to the marked packets.
The dynamics of Reno/RED systems will be studied in details in Chapter 3. It is shown
that the system becomes unstable when the delay increases or when the link capacity in-
creases. It is very difficult to configure the RED parameters to achieve better performance.
There has been a large body of AQM schemes proposed recently. Some notable exam-
ples include, Stabilized RED [123], PI controller [58], REM [5] , AVQ [86], BLUE [35],
etc.
2.2.3 CHOKe
CHOKe [126], which stands for “CHOose and Keep for responsive flows, CHOose and
Kill for unresponsive flows”, is proposed by Pan et al. in 2001. It aims to penalize the un-
responsive flows (e.g., UDP sources), to protect the rate-adaptive flows (e.g., TCP sources),
and to ensure fairness.
The scheme, CHOKe, is particularly interesting in that it does not require any state
information and yet can provide a minimum throughput to TCP flows. The basic idea of
21
CHOKe is explained in the following quote from [126]:
When a packet arrives at a congested router, CHOKe draws a packet at random from
the FIFO (first-in-first-out) buffer and compares it with the arriving packet. If they both
belong to the same flow, then they are both dropped; else the randomly chosen packet
is left intact and the arriving packet is admitted into the buffer with a probability that
depends on the level of congestion (this probability is computed exactly as in RED).
The surprising feature of this extremely simple scheme is that it can bind the bandwidth
share of UDP flows regardless of their arrival rate.
Its queue characteristics and the maximum throughput of unresponsive flows is studied
in [143, 155, 145]. These results will be briefly covered in Section 7.2.
2.3 Unified frameworks for TCP/AQM systems
In this subsection, we will review the general frameworks for studying the equilibrium and
dynamics of TCP/AQM systems. These models will be used throughout this dissertation.
2.3.1 General dynamic model of TCP/AQM
A network is modelled as a set ofL links with finite capacitiesc = (cl, l ∈ L). They are
shared by a set ofN sources indexed byi. Each sourcei uses a subsetLi ⊆ L of links.
The setsLi define anL×N routing matrix
Rli =
1 if l ∈ Li
0 otherwise. (2.1)
We use the deterministic flow model developed in [115, 99] to describe transmission
rates. Two assumptions are made when using this model. First, the packets are infinitely
small and the sending rate is differentiable (like fluid flow). Second, the congestion signal
is fed back continuously.
22
Each sourcei has an associated transmission ratexi(t), and each linkl has an aggregate
incoming flow rateyl(t). Since all sources whose paths include linkl contribute toyl(t),
we have the equation:
yl(t) =∑
i
Rlixi(t− τ fli), (2.2)
whereτ fli denotes the forward transmission delay from sourcei to link l.
Each link l has an associatedcongestion measure(or price) pl(t), which is a non-
negative quantity maintained by AQM algorithms. The sources are assumed to have access
to the aggregate price of all links in their route1,
qi(t) :=∑
l
Rlipl(t− τ bli), (2.3)
whereτ bli denotes the backward transmission delay from linkl to sourcei. The total round-
trip time τi for sourcei thus satisfies
τi = τ fli + τ b
li
for every linkl in its path.
As shown in [96], this model includes, to a good approximation, the mechanism present
in existing protocols with a different interpretation for price in different protocols (e.g.,
marking or dropping probability in TCP Reno, queueing delay in TCP Vegas).
In this framework, a complete feedback-control system is specified by supplying two
additional blocks: the source rates change according to aggregate prices in the TCP algo-
rithm and the link prices update based on link utilization. The complete system determines
both the equilibrium and dynamic characteristics of the TCP/AQM network.
Since the TCP/AQM is decentralized, the sources only have access to their local in-
formation. Therefore, the key restriction in the above control laws is that they must be
1This is true when delay is used as congestion price. It is approximately true for random marking anddropping when the probability is small.
23
decentralized. Therefore, we can model the dynamics of TCP in a general form
xi(t) = Fi(xi(t), qi(t)). (2.4)
Similarly, the dynamics at links can be written as2
pl(t) = Gl(pl(t), yl(t)). (2.5)
The overall structure of this congestion control system is shown in Figure 2.3.
F1
FN
F1
FN
G1
GL
G1
GL
∑ −=i
f
liilil txRty )()( τ∑ −=i
f
liilil txRty )()( τ
∑ −=l
b
lillii tpRtq )()( τ∑ −=l
b
lillii tpRtq )()( τ
y
pq
x
TCP AQM
Figure 2.3: General congestion control structure.
We will study the dynamics of TCP within this general framework in Chapter 3 and 4.
The equilibrium properties will be studied with the duality model introduced in the next
subsection.
2.3.2 Duality model of TCP
In this section, it will be shown that the above feedback-control system solves a utility
maximization problem at its equilibrium.
Suppose that the equilibrium rates and prices are given byx∗, y∗, p∗, andq∗. Based on
2A more accurate formulation is given in [96] that includes the internal variables of AQM in the parametersof Gl.
24
(2.3) and (2.2), we have following equilibrium relationships
y∗ = Rx∗, q∗ = RT p∗. (2.6)
Assume that equilibrium rates satisfy
x∗i = fi(q∗i ), (2.7)
wherefi(·) is implicitly defined byFi(x∗, q∗i ) = 0 or given by the source static law, e.g.,
[97]. fi(·) is usually a positive, strictly monotonic decreasing function, since the source
decreases its rate with increasing congestion.
Let f−1i (xi) be the inverse function of (2.7), and let a utility functionUi(xi) be its
integral
Ui(xi) :=
∫f−1
i (xi)dxi. (2.8)
This relation implies thatUi(xi) is a monotonic increasing and strictly concave function. It
is easy to check that the equilibrium ratex∗i uniquely solves
maxxi≥0
Ui(xi)− xiq∗i . (2.9)
We interpretUi(xi) as the benefit the source receives by transmitting at ratexi andq∗i as the
price per unit. Then (2.9) is a maximization of the source’s profit. This interpretation makes
few assumptions regarding TCP and AQM and can be used for various TCP schemes.
The global optimization problem to maximize aggregate utility with capacity con-
straints is formulated by Kelly in [77, 80],
maxx≥0
∑i
Ui(xi) (2.10)
subject to Rx ≤ c. (2.11)
It has a unique solution, since it is maximizing a concave function over a convex set. Now
25
we interpret the equilibrium price as the dual variables (or as the Lagrange multipliers) for
the problem (2.10-2.11). Then its Lagrangian is
L(x, p) =∑
i
Ui(xi)−∑
l
pl(yl − c) =∑
i
(Ui(xi)− qixi) +∑
l
plcl. (2.12)
The dual problem is
minp≥0
∑i
Bi(qi) +∑
l
plcl, (2.13)
where
Bi(qi) = maxxi≥0
Ui(xi)− xiqi. (2.14)
Convex duality implies that at the optimump∗, the correspondingx∗, which maximizes
individual optimality (2.9), is exactly the unique solution to the primal problem (2.10-2.11)
since (2.14) is identical to (2.9). Therefore, provided the equilibrium pricesp∗ can be made
to align with the Lagrange multipliers, the equilibrium ratex∗ solves the primal problem in
a distributed way. It is proven in [96] that any link algorithm that satisfies
y∗l ≤ cl with equality ifp∗ > 0 for anyl (2.15)
will guarantee this alignment. In this case,x∗ is the unique primal optimal solution, and
p∗ is a dual optimal solution. It has been argued [96] that the condition (2.15) is satisfied
by any AQM that stabilizes the queue, e.g., RED, REM, and Droptail. Therefore, various
TCP/AQM protocols can be interpreted as different distributed primal-dual algorithms to
solve the global optimization problem (2.10-2.11) with different utility functions.
The equilibrium structures of different congestion control schemes are characterized by
their corresponding utility functions. This model provides us with a rigorous framework
in which to study various equilibrium properties such as fairness, efficiency, and effects of
different network parameters. In Chapter 6, I will present the methods and results following
this approach.
26
This optimization framework can also be extended to study the interaction of TCP at a
fast timescale and IP routing at a slow timescale. See Chapter 5 for details.
27
Chapter 3
Local Dynamics of Reno/RED
3.1 Introduction
It is well known that TCP Reno/RED can oscillate wildly and it is extremely hard to reduce
the oscillation by tuning RED parameters, e.g., [110, 25]. This oscillation could be the
outcome of the AIMD bandwidth probing strategy employed by TCP Reno and noise-like
traffic that are not effectively controlled by TCP (e.g., short lived TCP source). Recent
models e.g., [36, 59], imply however that oscillation is an inevitable outcome of the pro-
tocol itself. We present more evidence to support this view. We argue that Reno/RED
oscillates not only because of the AIMD probing and noise traffic, but more fundamentally,
it is due to instability. Therefore, even if there is no AIMD, and the congestion window is
periodically adjusted by the average of AIMD based on loss probability, the oscillation per-
sists. We illustrate usingns-2simulations that, after smoothing out the AIMD component
of the oscillation, the average behavior can either be steady with small random fluctuations
(when the protocol is stable), or exhibit limit cycles of amplitude much larger than ran-
dom fluctuations (when it is unstable). Moreover, this qualitative behavior persists even
when a large amount of noise traffic is introduced, and even when sources have different
delays. We conclude that it is the protocol stability that largely determines the dynamics of
Reno/RED.
This motivates the stability characterization of Reno/RED. In Section 3.3 we develop a
general nonlinear model of Reno/RED. The equilibrium structure of this system is analyzed
using duality model, and a unique equilibrium exists because it is the unique solving of the
28
underling utility maximization problem, see [96] for details. Here, we study local stability
by linearizing the model around this equilibrium. The linear model generalizes the single
link identical source model of [59]. We validate our model with simulations and illustrate
the stability region of Reno/RED. We derive a sufficient stability condition for the special
case of a single link withheterogeneoussources. It shows that Reno/RED becomes unstable
when delay increases, or more strikingly, when link capacity increases!
In the linearized model, the gain introduced by TCP Reno increases rapidly with delay
and link capacity. This induces instability and makes compensation by RED extremely
difficult. In particular, RED parameters can be tuned to improve stability, but only at the
cost of a large queue, even when they are dynamically adjusted. Our results suggest that
Reno/RED is ill suited for future high-speed networks, which motivates the design of new
distributed algorithms for high speed long latency networks.
3.2 Motivation
Why does Reno/RED oscillate? What is the effect of AIMD probing, noise traffic, and
heterogeneity of delays on average congestion window and instantaneous queue size? In
this section, we show that their effect is insignificant in comparison with that of protocol
instability. This protocol instability is the dominant reason for oscillation in the Reno/RED
system. Therefore, it is very important to study the protocol stability of Reno/RED system.
We simulate a single bottleneck network usingns-2. The bottleneck link has a capacity
9 pkts/ms with a constant packet size of 1000 bytes. The AQM running on this link is RED
with ECN marking inbytemode (i.e., ACK packets are marked with negligible probability).
The RED parameters aremaxp = 0.1,minth = 50 pkts,maxth = 550 pkts, and weight
for queues averagingα = 10−4. The link is shared by 50 persistent TCP Reno sources.
We have run simulations with both one-way and two-way traffic, and the behavior is very
similar. The results in Figures 3.1 and 3.2 are for two-way traffic, and those in Figure 3.3
are for one-way traffic. The measurements on the Internet [2] show that most connections
have round-trip delays between 15-500ms. We perform simulations within this range of
delays.
29
0 5 10 15 20 25 30 35 400
10
20
30
40
50
60
70
Win
dow
(pkt
s)
Individual windowAverage window
time(s)
Individual window
Average window
(a) Window (delay = 40ms)
0 5 10 15 20 25 30 35 400
100
200
300
400
500
600
700
800
Inst
aneous
queue(p
kts)
time(s)
(b) Queue (delay = 40ms)
0 5 10 15 20 25 30 35 400
20
40
60
80
100
120
140
Win
dow
(pkt
s)
Individual windowAverage window
time(s)
Individual window
Average window
(c) Window (delay = 200ms)
0 5 10 15 20 25 30 35 400
100
200
300
400
500
600
700
800
Inst
aneous
queue(p
kts)
time(s)
(d) Queue (delay = 200ms)
Figure 3.1: Window and queue traces without noise traffic.
Figure 3.1 gives the result of two cases where connections have identical roundtrip
propagation delay and generate traffic in both directions. Figure 3.1(a) shows an individual
window and the average window that is mean window size of all 50 sources, as a function
of time. They are typical traces when round-trip propagation delay is small (40ms in this
case). Oscillations due to AIMD are prominent in the individual window, but disappear
in the average window. Since the queue averages individual windows, it also displays a
smooth trace with small random fluctuations, as shown in Figure 3.1(b). We consider the
averagebehavior of the protocol stable (non-oscillatory) in this case.
30
Figures 3.1(c) and (d) show the corresponding windows and queue when round-trip
propagation delay is increased to 200ms. Not only does the individual window oscillate
with a larger amplitude, more importantly, its average displays a deterministic limit cycle.
This also shows up in the queue trace. We say the protocol is in anunstableregime.
What is the effect of noise-like mice traffics that are not effectively controlled by
Reno/RED? To get a qualitative understanding, we add additional short HTTP sources to
the 50 persistent bi-directional TCP flows. Each HTTP source sends a single-packet request
to its destination, which then replies with a file of size that is exponentially distributed. Af-
ter the source completely receives the data, it waits for a random time that is exponentially
distributed with a mean of 500 milliseconds and repeats the process. Both the request and
the response are carried over TCP connections. Two sets of simulations are conducted:
the first with 60 http sources generating 10% noise (i.e., persistent TCP sources occupied
90% of bottleneck link capacity), and the second set with 180 http sources generating 30%
noise.
The queue traces when propagation delays are 40ms (stable) and 200ms (unstable),
respectively, are shown in Figures 3.2(a) and (b) for a noise intensity of 10% and in Figure
3.2(c) and (d) for a noise intensity of 30%. The behavior of the queue is dominated by the
stability of the protocol, not by noise-like mice traffic (compare with Figures 3.1(b) and
(d)). In the stable regime (40ms delay), the noise traffic increases the average queue length
slightly. This increases the marking probability and reduces the average window of the
persistent TCP sources.
All our previous simulations are for sources with identical propagation delay. Will the
dynamic behavior be very different when sources have different delays? We repeat the
previous experiments, without noise, with 50 persistent connections having delays ranging
from 40ms to 64ms at 1ms increments, with 2 sources to each delay value. We study their
dynamic behavior when all delays are scaled up, or down, over a wide range. The behavior
is qualitatively similar to the case of identical delay, with more severe queue oscillation.
Figure 3.3(a) shows the instantaneous queue when the scaling factor is0.3 (delays range
from 12ms to19.2ms), with an average delay of15.6ms, averaged over all sources. Figure
3.3(b) shows the queue when the scaling factor is 4, with an average delay of 208ms.
31
0 5 10 15 20 25 30 35 400
100
200
300
400
500
600
700
800
Inst
aneous
queue(p
kts)
time(s)
(a) Queue (delay = 40ms, 10% noise)
0 5 10 15 20 25 30 35 400
100
200
300
400
500
600
700
800
Inst
ane
ou
s queue(p
kts)
time(s)
(b) Queue (delay = 200ms, 10% noise)
0 5 10 15 20 25 30 35 400
100
200
300
400
500
600
700
800
Inst
aneous
queue(p
kts)
time(s)
(c) Queue (delay = 40ms, 30% noise)
0 5 10 15 20 25 30 35 400
100
200
300
400
500
600
700
800
Inst
aneous
queue(p
kts)
time(s)
(d) Queue (delay = 200ms, 30% noise)
Figure 3.2: Queue traces with noise traffic.
Hence it is protocol stability that largely determines the dynamics of Reno/RED. We
now characterize when Reno/RED is stable.
3.3 Dynamic model
In this section we develop a model of Reno/RED and use it to study the local dynamics
of Reno/RED. We start with a nonlinear model, make a few remarks about its equilibrium
properties, and then linearize the model around the equilibrium. We validate our linear
32
0 10 20 30 40 50 60 70 80 90 1000
100
200
300
400
500
600
700
800
time (sec)
Instantaneous queue (pkts)
inst
anta
neou
s qu
eue
(pkt
s)
(a) Queue (delays from 12ms to 19ms)
0 10 20 30 40 50 60 70 80 90 1000
100
200
300
400
500
600
700
800
time (sec)
Instantaneous queue (pkts)
inst
anta
neou
s qu
eue
(pkt
s)
(b) Queue (delays from 160ms to 254ms)
Figure 3.3: Queue traces with heterogeneous delays.
model withns-2simulations and illustrate the stability region of Reno/RED. Finally we
derive a stability condition for the special case of a single link with heterogeneous sources.
3.3.1 Nonlinear model of Reno/RED
The general nonlinear model for TCP/AQM systems has been presented in Section 2.3.
Here more details are given in order to study Reno/RED systems.
A network is modelled as a set ofL links with finite capacitiesc = (cl, l ∈ L). They
are shared by a set ofN sources. The interactions between them are specified by a routing
matrixR whereRli = 1 if link l is in the path of sourcei, andRli = 0 otherwise.
Denoteτi(t) as the round-trip time of sourcei at time t; it is the sum of round-trip
propagation delaydi and the round-trip queueing delay∑
l Rlibl(t)/cl. Sourcei’s sending
ratexi(t) can be formulated as
xi(t) =wi(t)
τi(t), (3.1)
wherewi(t) denotes the congestion window size. The aggregate flow rate at linkl is
yl(t) =∑
i
Rlixi(t− τ fli(t)), (3.2)
33
whereτ fli(t) is the forward delay from sourcei to link l.
The link congestion pricepl(t) in the general model corresponds to the packet marking
(or loss) probability in TCP Reno. The end-to-end marking probability observed at sourcei
is actuallyqi(t) = 1−∏l∈Li
(1−pl(t−τ bli(t))) whereτ b
li(t) is the backward delay from link
l to sourcei. We assume thatpl(t) is small for allt so that, approximately, the end-to-end
probability is
qi(t) =∑
l
Rlipl(t− τ bli(t)). (3.3)
The forward and backward delays are related to the round-trip time through
τi(t) = τ fli(t) + τ b
li(t)
for all l ∈ Li.
We now model TCP Reno and RED to provide the(F, G) functions in the general
framework. We focus on the AIMD algorithm of TCP Reno at thecongestion avoidance
phase. The congestion window change in this phase has been described in Section 2.1. At
time t, sourcei transmits at ratexi(t) packets/second; hence, it receives acknowledgments
at ratexi(t− τi(t)), assuming every packet is acknowledged. A fraction(1−qi(t)) of these
acknowledgments are positive, each incrementing the windowwi(t) by 1/wi(t); hence the
windowwi(t) increases, on average, at the rate ofxi(t− τi(t))(1− qi(t))/wi(t). Similarly
negative acknowledgments are received at an average rate ofxi(t−τi(t))qi(t), each halving
the window, and hence the windowwi(t) decreases at a rate ofxi(t − τi(t))qi(t)wi(t)/2.
Hence, the window evolves under Reno according to
wi(t) = xi(t− τi(t))(1− qi(t))1
wi(t)− xi(t− τi(t))qi(t)
wi(t)
2, (3.4)
whereqi(t) is given by (3.3).
To model RED, letbl(t) denote the instantaneous queue length at timet that evolves
34
according to
bl(t) = yl(t)− cl, (3.5)
whereyl(t) is the flow rate given by (3.2) andcl is the link capacity. Define the average
queue length asrl(t). It is updated according to
rl(t) = −αlcl (rl(t)− bl(t)), (3.6)
whereα is the averaging weight. Given the average queue lengthrl(t), the marking proba-
bility is given by
pl(t) =
0 rl(t) ≤ bl
ρl(rl(t)− bl) bl < rl(t) < bl
1 rl(t) ≥ bl
, (3.7)
wherebl, bl, andpl are RED parameters, andρl = pl/(bl − bl).
In summary, Reno/RED is modelled by (3.4–3.7), and their interconnection through the
network is modelled by (3.2–3.3).
The equilibrium of this system is studied in [96, 101]. The Reno/RED model is inter-
preted as carrying out a distributed, primal-dual algorithm to maximize the aggregate utility
over the Internet. The utility function of TCP Reno is derived to be
Ui(xi) =
√2
τi
tan−1
(τixi√
2
).
The equilibrium properties can be studied by solving the underlying convex program. It
also implies that the Reno/RED system has a unique equilibrium.
3.3.2 Linear model of Reno/RED
We linearize the Reno/RED equations (3.4-3.7) to study its stability around equilibrium.
We make several simplifying assumptions. First we assume that the routing matrixR has
35
full row rank so there is a unique equilibrium loss probability vectorp. Second, only
congested links at the equilibrium are considered in the linear model. Moreover we assume
that the system operates in regionbl < rl(t) < bl, so that the marking probability is affine
in the average queue length,pl(t) = ρl(rl(t)− bl).
We make a key assumption on the time-varying, round-trip delay. Round-trip delay ap-
pears in two places: first, in the relation between windowwi(t) and ratexi(t), as expressed
in (3.1), and second, in the time argument of flow rateyl(t), as expressed in (3.2), and
the end-to-end marking probabilityqi(t), as expressed in (3.3). Inclusion of instantaneous
queueing delay in the first place yields a qualitatively different model than if queueing de-
lay is ignored or assumed constant. It means that the queue is not an integrator but has more
complicated dynamics; see (3.9) below. As the proof of Theorem 3.1 shows, this dynamic
is critical to the stability of Reno/RED. The resulting linear model matches simulations
significantly better than if queueing delay is assumed constant. Time-varying delay in the
second place makes linearization difficult, and we replace it by its (constant) equilibrium
value (including equilibrium queueing delay). We approximate the delaysτi(t), τ fli(t), and
τ bli(t) by their equilibrium values in (3.2) and (3.3). With these assumptions, we linearize
Reno/RED around the unique equilibrium. From (3.4), Reno becomes
wi(t) =
(1−
∑
l
Rlipl(t− τ bli)
)wi(t− τi)
τi(t− τi)
1
wi(t)− 1
2
∑
l
Rlipl(t− τ bli)
wi(t− τi) wi(t)
τi(t− τi).
Let w∗i , p
∗l , . . . be equilibrium quantities andδwi(t) = wi(t)− w∗
i , . . . . be the small varia-
tions near the equilibrium. Linearization yields
δwi(t) = − 1
τiq∗i
∑
l
Rliδpl(t− τ bli) −
q∗i w∗i
τi
δwi(t).
Around the equilibrium, the buffer process under RED evolves according to
bl(t) =∑
l
Rliwi(t− τ f
li)
τi(t− τ fli)
− cl =∑
l
Rliwi(t− τ f
li)
di +∑
k Rkibk(t− τ fli)/ck
− cl.
Let τi = di +∑
k Rkib∗k/ck be the equilibrium round-trip time (including queueing delay).
36
Linearizing, we have
δbl(t) =∑
i
Rliδwi(t− τ f
li)
τi
−∑
k
∑i
RliRki δbk(t− τ fli)
w∗i
τ 2i ck
.
The second term above is ignored if we have neglected or assumed constant the queueing
delay in round-trip time. The double summation sums over all linksk that share any source
i with link l. It says that the link dynamics in the network are coupled through shared
sources. The termδbk(t − τ fli)w
∗i /(τick) is roughly the backlog at linkk due to packets of
sourcei, under FIFO queueing. Hence the backlogbl(t) at link l decreases at a rate that
is proportional to the backlog of this shared sourcei at another linkk. This is because
backlog in the path of sourcei reduces therate at which sourcei packets arrive at linkl,
decreasingbl(t).
Putting everything together, Reno/RED is described by, in Laplace domain,
δw(s) = −(sI + D1)−1D2R
Tb (s)δp(s),
δp(s) = (sI + D3)−1D4δb(s),
δb(s) = (sI + Rf (s)D5RT D6)
−1Rf (s)D7δw(s),
where the diagonal matrices areD1 = diag (q∗i w∗i /τi), D2 = diag (1/(τiq
∗i )), D3 =
diag (αlcl), D4 = diag (αlclρl), D5 = diag (w∗i /τ
2i ), D6 = diag (1/cl), andD7 =
diag (1/τi), andRf (s) and Rb(s) are delayed forward and backward routing matrices,
defined as
[Rf (s)]li =
e−τflis if l ∈ Li
0 otherwise, and [Rb(s)]li =
e−τblis if l ∈ Li
0 otherwise..(3.8)
This model generalizes the single-link, identical-source model of [59] to multiple links
with heterogeneous sources.
37
3.3.3 Validation and stability region
We present a series of experiments to validate our linear model when the system is stable
or barely unstable, and to illustrate numerically the stability region.
We consider a single link of capacityc pkts/ms shared byN sources with identi-
cal round-trip propagation delayd ms. ForN = 20, 30, . . . , 60 sources, capacityc =
8, 9, . . . , 15 pkts/ms, and propagation delayd = 50, 55, . . . , 100 ms, we examine the
Nyquist plot of the loop gain of the feedback system (L(jω) in (3.9) below). For each
(N, c) pair, we determine the delaydm(N, c), at5ms increments, at which the smallest in-
tercept of the Nyquist plot with the real axis is closest to−1. This is the delay at which the
system(N, c) transits from stability to instability according to the linear model. For this
delay, we compute the critical frequencyfm(N, c) at which the phase ofL(jω) is−π. Note
that the computation ofL(jω) requires equilibrium round-trip timeτ , the sum of propa-
gation delaydm(N, c), and equilibrium queueing delay. The queueing delay is calculated
from the duality model [96]. Hence, for each(N, c) pair that becomes barely unstable at
a delay between 50ms and 100ms, we obtain the critical (propagation) delaydm(N, c) and
the critical frequencyfm(N, c) from the analytical model. For all experiments, we have
fixed the parameters atα = 10−4, ρ = 0.1/(540− 40) = 0.0002.
We repeat these experiments inns-2, using persistent TCP sources and RED with ECN
marking. The RED parameters are (0.1, 40pkts, 540pkts,10−4), corresponding to theα and
ρ values in the model. For each(N, c) pair, we examine the queue and window trajectories
to determine the critical delaydns(N, c) when the system transits from stability to insta-
bility. We measure the critical frequencyfns(N, c), the fundamental frequency of queue
oscillation, from the fast fourier transform of the queue trajectory. Thus, corresponding
to the linear model, we obtain the critical delaydns(N, c) and frequencyfns(N, c) from
simulations.
We compare model prediction with simulation. Figure 3.4(a) plots the critical delay
dns(N, c) from ns-2simulations versus the critical delaydm(N, c) computed from the lin-
ear model. Each data point corresponds to a particular(N, c) pair. The dashed line is where
all points should lie if the linear model agrees perfectly with the simulation. Figure 3.4(b)
38
50 55 60 65 70 75 80 85 90 95 10050
55
60
65
70
75
80
85
90
95
100Round trip propagation delay at critical frequency (ms)
delay (NS)
de
lay (
mo
de
l)
30 data points
(a) Critical delay (ms)
0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 50
0.5
1
1.5
2
2.5
3
3.5
4
4.5
5Critical frequency (Hz)
frequency (NS)
fre
qu
en
cy (
mo
de
l)
dynamiclink model
staticlink model
30 data points
(b) Critical frequency (Hz)
Figure 3.4: Linear model validation.
gives the corresponding plot for critical frequenciesfns(N, c) versusfm(N, c). The agree-
ment between model and simulation seems quite reasonable (recall that delay values have
a resolution of 5ms).
Consider a static link model where marking probability is a function of link flow rate
pl(t) = fl(yl(t)).
Then the linearized model is
δpl(t) = f ′l (y∗l ) δyl(t),
wheref ′l (y∗l ) is the derivative offl evaluated at equilibrium. Also shown in Figure 3.4(b)
are critical frequencies predicted from this static-link model (withf ′l (y∗l ) = ρ = 0.0002;
this does not affect the critical frequency), using the same Nyquist plot method described
above. It shows that queue dynamics are significant at the time-scale of interest.
Figure 3.5 illustrates the stability region implied by the linear model. For eachN ,
it plots the critical delaydm(N, c) versus capacityc. The curve separates stable (below)
from unstable regions (above). The negative slope shows that Reno/RED becomes unstable
39
8 9 10 11 12 13 14 1550
55
60
65
70
75
80
85
90
95
100Round trip propagation delay at critical frequency
capacity (pkts/ms)
de
lay (
ms)
N=40
N=30
N=20
N=50 N=60
Figure 3.5: Stability region.
when delay or capacity is large. AsN increases, the stability region expands, i.e., small
load induces instability. Intuitively, a larger delay or capacity, or a smaller load, leads to
a larger equilibrium window; this confirms the folklore that TCP behaves poorly at large
window size.
3.4 Local stability analysis
We now characterize the stability region in the case of a single link withN heterogeneous
sources. Writing forward delay as a fractionβi ∈ (0, 1) of round-trip time,τ fi = βiτi, and
dropping link subscriptl, the open-loop transfer function is
L(s) = Rf (s)D7(sI + D1)−1D2R
Tb (s)(sI + D3)
−1D4(sI + Rf (s)D5RT D6)
−1
=∑
i
1
τip∗(τis + p∗w∗i )· αcρ
s + αc· 1
s + 1c
∑n
x∗nτn
e−βiτns· e−τis. (3.9)
The first term on the right-hand side describes Reno dynamics, the second term describes
RED averaging, the third term is the buffer process, and the last term represents network
delay. The special case where all sources have identical round-trip times,τi = τ , and
forward delays are zero,βi = 0, is analyzed in [59]. They provide sufficient conditions for
closed-loop stability and use them to tune RED parametersα andρ.
40
We start with a lemma that collects some equilibrium properties. It can be proved di-
rectly from the fixed point of (3.4)–(3.7). Letτ := maxi τi, τ := mini τi, τ := (∑
i 1/τi)−1,
andβ := maxi βi.
Lemma 3.1.Letp∗ be the equilibrium loss probability, and letw∗i andx∗i be the equilibrium
window and rate respectively. Thenp∗ = 2/(2 + (cτ)2), w∗i = cτ for all sourcesi, x∗i =
w∗i /τi and
∑i x
∗i /c = 1.
A sufficient condition for local stability is provided by the following theorem.
Theorem 3.1.The closed-loop system described by (3.9) is stable if
ρτ 2
τ τp∗2w∗1
(1 +
1
cτα+
1
p∗w∗1
)<
π(1− β)2
√4β
2+ π2(1− β)2
.
Proof. See [100] for detailed proof.
The left-hand side of the (sufficient) stability condition depends on network parameters
(c and τi) as well as RED parameters (α and ρ). The right-hand side is a property of
the network node that is independent of these parameters. For stability, the left-hand side
must be small. This requires small capacityc and delaysτi and largeN , confirming the
simulation results of the last section. To understand this, note thatcτ is the equilibrium
window size of all sources. Assumingw∗1 = cτ À 2 so thatp = 2/w∗2
1 , then the stability
condition can be re-written as
ρw∗3
1 N
4
(w∗
1
2+ 1 +
N
w∗1α
)<
π(1− β)2
√4β
2+ π2(1− β)2
.
This suggests that the system becomes unstable when window sizew∗1 becomes large,
agreeing with our empirical experience that TCP behaves poorly at large window size.
Roughly, whenc doubles, the equilibrium rate doubles, and hence the window is halved
with twice the magnitude at twice the frequency, resulting in a quadratic increase in control
gain and pushing the system into instability.
The dependence of the stability condition onc, τ , andN is most clearly exhibited in
the case of identical sources, withτ = τi = τ = τ = Nτ .
41
Corollary 3.1. Supposep = 2/w∗21 . Then the stability condition in Theorem 3.1 becomes
ρc3τ 3
4N2
(cτ
2N+ 1 +
1
αcτ
)<
π(1− β)2
√4β
2+ π2(1− β)2
.
The stability condition also suggests that a smallerρ and a largerα enhance stability.
A smallerρ implies a larger equilibrium queue length [96]. A largerα incorporates the
current queue length into the marking probability more quickly. See [100] for proofs of
this Corollary.
3.5 RED parameter setting
It is suggested in [34] that the RED parametermaxp be dynamically adjusted: reduce
maxp asN decreases and raise it otherwise. Raisingmaxp, or reducingmaxth-minth ,
is equivalent to increasingρ ( = maxp/(maxth-minth) ) in the direction consistent
with the stability condition in Theorem 3.1. Theorem 3.1 sets an upper bound onρ,
given N, c, andτ , and hence a lower bound on equilibrium queue length, to ensure sta-
bility. Adapting RED parameterscannotprevent the inevitable choice between stability
and performance: eitherρ is set small to stabilize the queue, around a large value, or,
alternatively, it is set large to reduce the queue, at the risk of violent oscillation. What
adaptation can hope to achieve is to dynamically find a good compromise when network
condition changes.
The same stability analysis can also be applied to other AQM schemes, such as Virtual
Queue [50, 85, 87] and REM/PI [5, 59], and clarifies the role of AQM. The stability proof
relies on bounding a set of the formK · co{h(v, θ)} to the right of(−1, 0). The gainK
and the trajectoryh depend on TCP as well as AQM. For instance, for the case of a single
link with capacityc shared byN identical sources with delayτ , TCP and network delay
contribute a factor
htcp =e−jv
jv + p∗w∗1
42
to the trajectoryh and a factor
Ktcp =c2τ 2
2N(3.10)
to the gainK, assuming the equilibrium window is large so thatp∗ = 2/w2i = 2N/cτ .
AQM compensates for the high gain introduced by TCP by shapingh and reducingK.
With RED, for instance,
h(v, θ) =1
jv + αcτ
e−jθ
v· htcp, and K =
cταρ
1− β·Ktcp.
The first term inh is due to RED averaging, the second term is due to queue dynamics that
also boundsθ ≤ θ0. Hence both the queue and RED add phase lag toh. More importantly,
RED adds anothercτ to the gainK, necessitating a smallαρ for stability and leading to
sluggish response and large equilibrium queue. The factorτ/(1− β) in K comes from the
queue.
The high gainKtcp in (3.10) is mainly responsible for instability at high delay, high
capacity or low load. It makes it difficult for any AQM algorithm to stabilize the current
TCP.
3.6 Conclusion
We have presented simulation results to demonstrate that it is protocol stability more than
other factors that determine the dynamics of TCP/RED. We have developed a multi-link,
multi-source model that can be used to study the stability of general TCP/AQM. We have
presented a sufficient stability condition for the case of a single link with heterogeneous
sources and illustrated the form of Reno/RED’s stability region. It implies that Reno/RED
becomes unstable when the network scales up in delay or capacity. Our analysis indicates
the role, and the difficulty, of RED in stabilizing Reno.
43
Chapter 4
Modelling and Dynamics of FAST
4.1 Introduction
Congestion control is a distributed feedback algorithm to allocate network resources among
competing users. The algorithms in the current Internet, TCP Reno and its variants, have
prevented severe congestion while the Internet underwent explosive growth during the last
decade. In the previous chapter, we have shown that TCP Reno is ill suited for the future
high-speed networks. It is well known that it does not scale as the bandwidth-delay product
as the Internet continues to grow [59, 100]. This has motivated several recent proposals for
congestion control of high-speed networks, including HSTCP [39], Scalable TCP [81],
FAST TCP [69], and BIC TCP [163] (see [69] for extensive references). We have briefly
described the motivation, background theory, and congestion window update functions of
FAST TCP in Chapter 2. The details of the architecture, algorithms, extensive experimental
evaluations of FAST TCP, and comparison with other TCP variants can be found in [69].
Local stability of FAST TCP in the absence of feedback delay is proved in [69] for the case
of a single link. We extend the analysis to local stability with feedback delay and global
stability without feedback delay, both for general networks.
Most of the stability analysis in the literature is based on the fluid model introduced in
[59] (see surveys in [98, 79, 138] for extensions and related models). Key features of many
of these models are that a source controls its sending rate directly1 and that the queueing
1Even when the congestion window size is used as the control variable, sending rate is often taken to bethe window normalized by aconstantround-trip time, and hence a source still controls its rate directly.
44
delay at a link is proportional to the integral of the excess demand for its bandwidth.
In reality, a source dynamically sets its congestion window rather than its sending rate.
These models do not adequately capture the self-clocking effect where a packet is sent only
when an old one is acknowledged, except briefly and immediately after the congestion win-
dow is changed. This automatically constrains the inputrate at a link to the link capacity,
after a brief transient, no matter how large the congestion windows are set. Recently, a
new discrete-time link model was proposed in [160, 69] to capture this effect, and detailed
experimental validations have been carried out in [160]. While the traditional continuous-
time link model does not consider self-clocking, the new discrete-time link model ignores
the fast dynamics at the links. We first present both models of FAST TCP in Section 4.2.
Experimental results are provided to show that, despite errors in these models, both of them
track queueing delays reasonably well.
In Section 4.3, we prove that FAST TCP is globally stable for arbitrary networks when
there is no feedback delay using the continuous-time model. We also derive a sufficient
condition for local asymptotic stability for arbitrary networks with feedback delay, using
the techniques developed in [125, 152]. This condition is also necessary when the sources
are homogeneous with a single bottleneck link. We compare the predictions of stability
based on this condition to experiments on the Dummynet Testbed with such topology. Our
experiments suggest that FAST TCP is always stable for homogeneous sources with a sin-
gle link, while the model with delay predicts instability when the delay is large. We conjec-
ture that this conflict maybe due to the self-clocking effect ignored in the continuous-time
model.
In Sections 4.4, we analyze the stability of FAST TCP using the discrete-time model.
First, we prove that local asymptotic stability of FAST TCP in arbitrary networks in the
presence of delay depends on feedback delays only through their heterogeneity. It implies
in particular that a network where all sources have the same delay is always stable, no
matter how large the delay is. It also confirms the common belief that a slower update
enhances stability. Then we restrict ourselves to a single link without feedback delay and
prove the global stability of FAST TCP. The techniques developed for this discrete-time
model are new and applicable to analyzing other protocols.
45
Finally, we conclude in Section 4.5 with limitations of this work.
4.2 Model
4.2.1 Notation
A network consists of a set ofL links indexed byl with finite capacitycl. It is shared by a
set ofN flows identified by their sources indexed byi. Let R be the routing matrix where
Rli = 1 if sourcei uses linkl, and0 otherwise.
We uset for time in the continuous model, and for time step in the discrete-time model.
The meaning oft should be clear from the context. FAST TCP updates its congestion
window every fixed time period, which is used as the time unit.
Let di denote the round-trip propagation delay of sourcei, andqi(t) denote the round-
trip queueing delay. The round-trip time is given byTi(t) := di + qi(t). We denote the
forward feedback delay from sourcei to link l by τ fli and the backward feedback delay
from link l to sourcei asτ bli. The sum of forward delay from sourcei to any link l and
the backward delay from linkl to sourcei is fixed, i.e.,τi := τ fli + τ b
li for any link l on
the path of sourcei. We make a subtle assumption here. In reality, the feedback delays
τ fli , τ b
li include queueing delay and are time-varying. We assume for simplicity that they
are constant, and mathematically unrelated toTi(t). Later, when we analyze linear stability
around the network equilibrium in the presence of feedback delay, we can interpretτi as
the equilibrium value ofTi.
Let wi(t) be sourcei’s congestion window at timet (discrete or continuous time). The
sending rate of sourcei at timet is defined as
xi(t) =wi(t)
Ti(t), (4.1)
whereTi(t) = di + qi(t). The aggregate rate at linkl is
yl(t) =∑
i
Rlixi(t− τ fli). (4.2)
46
Let pl(t) be the queueing delay at linkl. The end-to-end queueing delayqi(t) observed by
sourcei is
qi(t) =∑
l
Rlipl(t− τ bli). (4.3)
4.2.2 Discrete and continuous-time models
FAST TCP source periodically updates its congestion windoww based on the average RTT
and estimated queueing delay. The pseudo-code is
w← (1− γ)w + γ
(baseRTT
RTTw+ α
),
whereγ ∈ (0, 1], baseRTT is the minimum RTT observed, andα is a constant. We model
this by the following discrete time equation
wi(t + 1) = γ
(diwi(t)
di + qi(t)+ αi
)+ (1− γ)wi(t), (4.4)
wherewi(t) is the congestion window of theith source,γ ∈ (0, 1], andαi is a constant for
sourcei. The corresponding continuous-time model is
wi(t) = γ
(diwi(t)
di + qi(t)+ αi − wi(t)
), (4.5)
where the time is measured in the unit of update period in FAST TCP.
For the continuous-time model, queueing delay has been traditionally modelled with
pl(t) =1
cl
(yl(t)− cl). (4.6)
However, TCP uses self-clocking: the source always tries to maintain that the number
of packets in fly equals to the congestion window size. When the congestion window is
fixed, the source will send a new packet exactly after it receives an ACK packet. When the
congestion window changes, the source sends out bulk traffic in burst, or sends nothing in
a short time period. Therefore, one round-trip time after a congestion window is changed,
packet transmission will be clocked at the same rate as the throughput the flow receives.
47
We assume that the disturbance in the queues due to congestion window changes settles
down quickly compared with the update period of the discrete-time model; see [160] for
detailed justification and validation experiments for these arguments. A consequence of this
assumption is that the link queueing delay vector,p(t) = (pl(t), for all l), is determined
implicitly by sources’ congestion windows in a static manner
∑i
Rliwi(t− τ f
li)
di + qi(t− τ fli)
= cl if pl(t) > 0
≤ cl if pl(t) = 0, (4.7)
where theqi is the end-to-end queueing delay given by (4.3).
In summary, the continuous-time model is specified by (4.5) and (4.6), and the discrete-
time model is specified by (4.4) and (4.7), where the source rates and aggregate rates at
links are given by (4.1) and (4.2), and the end-to-end delays are given by (4.3). While
the continuous-time model does not take self-clocking into full account, the discrete-time
model ignores the fast dynamics at the links. Before comparing these models, we clarify
their common equilibrium structure by the following theorem cited from [69].
Theorem 4.1. Suppose that the routing matrixR has full row rank. A unique equilibrium
(x∗, p∗) of the network exists, andx∗ is the unique maximizer of
maxx≥0
∑i
αi log xi s.t. Rx ≤ c (4.8)
with p∗ as the corresponding optimum of its Lagrangian dual. This implies in particular
that the equilibrium ratex∗ is αi-weighted proportionally fair.
4.2.3 Validation
The continuous-time link model implies that the queue takes an infinite amount of time
to converge after a window change. In the other extreme, the discrete-time link model
assumes that the queue settles down in one sampling time. Neither is perfect, but we now
present experimental results that suggest both track the queue dynamics well.
All the experiments reported in this paper are carried out on the Dummynet Testbed
48
[132]. A FreeBSD machine is configured as a Dummynet router that provides different
propagation delays for different sources. It can be configured with different capacity and
buffer size. In our experiments, the bottleneck link capacity is800Mbps, and the buffer size
is 4000 packets with a fixed packet length of1500 bytes. A Dummynet monitor records
the queue size every0.4 second. The congestion window size and RTT are recorded at the
host every 50ms. TCP traffic is generated usingiperf. The publicly released code of FAST
[33] is used in all experiments involving FAST. We present two experiments to validate the
model, one closed-loop and one open-loop.
In the first (closed-loop) experiment, there are 3 FAST TCP sources sharing a Dum-
mynet router with a common propagation delay of 100ms. The measured and predicted
queue sizes are given in Figure 4.1. In the beginning of the experiment, the FAST sources
are in the slow start phase, and none of the models gives accurate prediction. After the
FAST TCP enters the congestion avoidance phase, both models track the queue size well.
0 2 4 6 8 10 12 14
0
100
200
300
400
500
600
700
Experiment time (seconds)
Queue s
ize (
pkts
)
Real Queue Discrete time model Continuous time model
Figure 4.1: Model validation–closed loop
To eliminate the modelling error in the congestion window adjustment algorithm itself
while validating the link models, we decouple the TCP and queue dynamics by using open-
loop, window control. The second experiment involves three sources with propagation
delays 50ms, 100ms, and 150ms sharing the same Dummynet router.
We changed the Linux 2.4.19 kernel so that the sources vary their window sizes ac-
49
cording to the schedules shown in Figure 4.2(a). The sequences of congestion window
sizes are then used in (4.1)–(4.2) and (4.6) to compute the queueing delay predicted by the
continuous-time model. We also use them in (4.1)–(4.2) and (4.7) to compute the predic-
tions of the discrete-time model. The queueing delay measured from the Dummynet and
those predicted by these two models are shown in Figure 4.2(b), which indicates that both
models track the queue sizes well. We next analyze the stability properties of these two
models.
0 5 10 15 20 25 300
200
400
600
800
1000
1200
1400
1600
1800
2000
2200
Experiment time (sec)
Congestion w
indow
(pkt)
Flow 1Flow 2Flow 3
(a) Scheduled congestion windows
0 5 10 15 20 25 300
500
1000
1500
2000
2500
3000
Queue S
ize (
pkt)
Experiment time (sec
Real queueContinuous time modelDiscrete time model
(b) Resulting queue size
Figure 4.2: Model validation–open loop.
4.3 Stability analysis with the continuous-time model
We present the stability analysis of the continuous model in general networks with and
without feedback delays.
4.3.1 Global stability without feedback delay
In this subsection, we show that FAST is globally stable for general networks by designing a
Lyapunov. When there is no feedback delay, the equations (4.2) and (4.3) can be simplified
50
as
yl(t) =∑
i
Rlixi(t) and qi(t) =∑
l
Rlipl(t). (4.9)
Suppose thatR is full row rank, and the system has unique equilibrium source rates
and link prices. Letwi, pl, . . . be the equilibrium quantities, and denoteδwi(t) := wi(t)−wi, δpl(t) = pl(t)− pl, . . . . From (4.5) we can formulate the equilibrium window as
wi =αiTi
qi
, (4.10)
whereTi is the equilibrium round-trip delayTi = di + qi.
Based on (4.5) and (4.10), we can write the derivative ofwi(t) as
1
γwi(t) = αi − qi(t)wi(t)
Ti(t),
= αi − qi(t)
Ti(t)(wi + δwi(t)),
= − qi(t)
Ti(t)δwi(t) + αi
Ti(t)qi − qi(t)Ti
Ti(t)qi
,
= − qi(t)
Ti(t)δwi(t)− αi
diδqi(t)
Ti(t)qi
.
Therefore, we have
1
γδwi(t) = − qi(t)
Ti(t)δwi(t)− αidi
Ti(t)qi
δqi(t). (4.11)
Based on (4.1) and (4.10) we have
δxi(t) =wi + δwi(t)
Ti(t)− wi
Ti
=δwi(t)
Ti(t)− (
1
Ti
− 1
Ti(t))wi =
δwi(t)
Ti(t)− δqi(t)
Ti(t)Ti
αiTi
qi
.
Therefore, we have
δxi(t) =1
Ti(t)δwi(t)− αi
Ti(t)qi
δqi(t). (4.12)
51
Based on (4.12) and (4.9), the derivative of link price is
pl(t) =1
cl
(∑
i
Rlixi(t)− cl) =1
cl
∑i
Rliδxi(t). (4.13)
From (4.12) and (4.13), we have
δpl(t) =1
cl
∑i
Rli
(1
Ti(t)δwi(t)− αi
Ti(t)qi
δqi(t)
). (4.14)
With the preliminary results, we present and prove the following theorem.
Theorem 4.2. The continues-time model of FAST TCP is globally asymptotically stable
when there is no feedback delay andR is full row rank.
Proof: Considering the functionV (w(t), p(t)) defined as
V (w(t), p(t)) =1
2γ
∑i
qi
αidi
(wi(t)− wi)2 +
1
2
∑
l
cl(pl(t)− pl)2. (4.15)
Clearly, the functionV (w, p) is non-negative for all(w(t), p(t)). It is zero if and only if
w(t) = w andp(t) = p, where the system is at its equilibrium. DifferentiatingV (w(t), p(t))
with respect to the solution trajectory using (4.14) and (4.11) yields
V (w(t), p(t)) =∑
i
qi
γαidi
δwi(t)δwi(t) +∑
l
clδpl(t)δpl(t),
=∑
i
qi
αidi
δwi(t)
(− qi(t)
Ti(t)δwi(t)− αidi
Ti(t)qi
δqi(t)
)
+∑
l
∑i
Rli
(1
Ti(t)δwi(t)− αi
Ti(t)qi
δqi(t)
)δpl(t),
= −∑
i
qiqi(t)
Ti(t)αidi
δwi(t)2 −
∑i
1
Ti(t)δwi(t)δqi(t)
+∑
i
1
Ti(t)δwi(t)
∑
l
Rliδpl(t)−∑
i
αi
Ti(t)qi
δqi(t)∑
l
Rliδpl(t),
= −∑
i
q∗i qi(t)
Ti(t)αidi
δwi(t)2 −
∑i
αi
Ti(t)qi
δqi(t)2
From the above equation,V (w(t), p(t)) ≤ 0. Sinceα > 0, qi > 0 andTi(t) > di > 0, if the
52
equality holds,δqi(t) = 0, which also implies thatδwi(t) = 0. ThereforeV (w(t), p(t)) =
0, if and only if the source rates are at equilibrium. Therefore, the system is at its unique
equilibrium under our assumption thatR is full row rank.
From the above argument,V (w(t), p(t)) is a system Lyapunov function, and the system
is globally asymptotically stable.
From the proof, it is clear thatV (w(t), p(t)) = 0 only implies that the source rates
are in equilibrium. WhenR is not full rank, the source rates still globally converge to
their equilibrium values, but the equilibrium link pricep is no longer unique and may not
converge to a fixed value.
In our continuous model, we ignored that positive projection in the link price updates
(i.e., the link prices have to be non-negative). This is equivalent to assuming that the bot-
tleneck link is unchanged in this dynamical system. We need to consider this saturation
problem in our future research.
4.3.2 Local stability with feedback delay
When feedback delays are present, the global stability analysis for FAST TCP in general
networks is still open. In this section, we try to provide a condition to ensure local stability.
Since there exists a unique equilibrium as described in Theorem 4.1, we can linearize
the model (4.5) and (4.6) around this equilibrium. Define routing matrices with feedback
delay in frequency domain as
(Rf (s))li :=
e−τflis if Rli = 1
0 if Rli = 0(Rb(s))li :=
e−τblis if Rli = 1
0 if Rli = 0.
Let xi, qi, andTi be the corresponding equilibrium values associated with sourcei. The
following Lemma provides the open-loop transfer function.
Lemma 4.1. The open-loop transfer function of the linearized FAST TCP system is
L(s) = D3Rf (s)Λ(s)XRTf (−s), (4.16)
53
where
D3 := diag
(1
cl
), X := diag(xi), and Λ(s) := diag(
e−τis
Tis
Tis + γTi
Tis + γqi
).
Proof. See Appendix 4.6.1.
The following theorem provides a sufficient condition for local stability.
Theorem 4.3.The FAST TCP system described by (4.5) and (4.6) is locally stable if
M
φ
√φ2 + γ2T 2
max
φ2 + γ2q2min
< 1, (4.17)
whereM is the maximal number of links in the path of any source,qmin = mini qi, Tmax =
maxi Ti and
φ := mini
(π
2− tan−1 1− qi/Ti
2√
qi/Ti
). (4.18)
Proof. See the Appendix 4.6.2.
This is actually a very weak theorem. The condition (4.18) can hardly be satisfied when
M is large. But it can provide us with some information about the effect of various param-
eters on the stability. For example, this condition suggests that the equilibrium queueing
delay should be large to guarantee stability. In general, this condition is only sufficient.
When there is only one link and all sources have the same feedback delays, it becomes
necessary as well. Our numerical simulations for this model validate this.
4.3.3 Numerical simulation and experiment
The condition in Theorem 4.3 implies that FAST TCP may become unstable in a single
bottleneck network with homogeneous sources. However our experiments with FAST TCP
on the Dummynet Testbed have always been stable.
We now present an experiment that violates the local stability condition. Moreover, nu-
merical simulation of the continuous-time model exhibits instability. Yet, the same network
54
on Dummynet is clearly stable. This suggests that the discrepancy is not in the stability the-
orem but rather in the continuous-time model.
In our experiment, the sources have identical propagation delay of100ms with a con-
stantα value of70 packets. They share a bottleneck with capacity of800Mbps. The simu-
lations and experiments consist of three intervals. The interval length is10 seconds for the
continuous-time model simulation and 100 seconds for the experiment2. Three sources are
active from the beginning of the experiment, seven additional sources activate in the sec-
ond interval, and in the last interval, all sources become inactive except five of them. The
simulation and experimental results are shown in Figure 4.3 and Figure 4.4, respectively.
10 12 14 16 18 20 22 24 26 28 300
500
1000
1500
2000
2500
3000
3500
4000
Simulation time (sec)
Queue s
ize (
pkts
)
(a) Queue size
10 12 14 16 18 20 22 24 26 28 300
500
1000
1500
2000
2500
3000
Congestion w
indow
(pkt)
Simulation time (sec)
Flow 1Flow 4Flow 10
(b) Window size
Figure 4.3: Numerical simulations of FAST TCP.
The stability condition in Theorem 4.3 is not satisfied and, as expected, the numerical
simulation based on the continuous-time model exhibits periodic oscillation. However, in
the Dummynet experiment, FAST TCP is actually stable (see Figure 4.4).3
We believe that the discrepancy is largely due to the fact that the continuous-time model
does not capture the self-clocking effect accurately. Self-clocking ensures that packets are
2We use a long duration in the Dummynet experiment because a FAST TCP source takes longer to con-verge due to slow-start, which is not included in our model.
3The regular spikes every 10 seconds in the queue size are probably due to a certain background task inthe sending host.
55
50 100 150 200 2500
100
200
300
400
500
600
700
800
Dum
mynet queue s
ize (
pkt)
Experiment time (sec)
(a) Queue size
50 100 150 200 2500
500
1000
1500
2000
2500
Simulation time (sec)
Congestion w
indow
(pkt)
Flow 1Flow 4Flow 10
(b) Window size
Figure 4.4: Dummynet experiments of FAST TCP.
sent at the same rate as the goodput the source receives, except briefly when the window
size changes. This self-clocking feature can actually help the system approach an equi-
librium. Indeed, for the case of one source for one link, a discrete-event model is used
in [160] to prove that TCP FAST and Vegas are always stable regardless of the feedback
delay. It also provides justification for the discrete-time models in (4.4) and (4.7) based on
the self-clocking feature introduced in the last section.
4.4 Stability analysis with the discrete-time model
We now analyze the stability of this model. We will see that the discrete-time model pre-
dicts that a network of homogeneous sources with the same feedback delay is locally stable
no matter how large the delay is, agreeing with our experimental experience. In the follow-
ing subsection, we study the local stability of FAST TCP using the discrete-time model for
arbitrary networks with feedback delays.
56
4.4.1 Local stability with feedback delay
A network of FAST TCP sources is modelled by equations (4.3), (4.4), and (4.7). This
generalizes the model in [69] by including feedback delay. When local stability is studied,
we ignore all un-congested links (links where prices are zero in equilibrium) and assume
that equality always holds in (4.7).
The main result of this section provides a sufficient condition for local asymptotic sta-
bility in general networks with common feedback delay.
Theorem 4.4. FAST TCP is locally stable for arbitrary networks ifγ ∈ (0, 1] and if all
sources have the same round-trip feedback delayτi = τ for all i.
The stability condition in the theorem does not depend on the value of the feedback
delay, but only on the heterogeneity among them. In particular, when all feedback delays
are ignored,τi = 0 for all i, then FAST TCP is locally asymptotically. This generalizes the
stability result in [69].
Corollary 4.1. FAST TCP is locally asymptotically stable in the absence of feedback delay
for general networks with anyγ ∈ [0, 1).
The rest of this subsection is devoted to the proof of Theorem 4.4.
We applyZ-transform to the linearized system and use the generalized Nyquist criterion
to derive a sufficient stability condition. Define the forward and backwardZ-transformed
routing matricesRf (z) andRb(z) as
(Rf (z))li :=
z−τfli if Rli = 1
0 if Rli = 0and (Rb(z))li :=
z−τbli if Rli = 1
0 if Rli = 0.
The relationτ fli + τ b
li = τi gives
Rb(z) = Rf (z−1) · diag(z−τi). (4.19)
DenoteW (z), Q(z), andP (z) as the correspondingZ-transforms ofδw(t), δq(t), andδp(t)
for the linearized system. Letq andw be the end-to-end queueing delay and congestion
57
window at equilibrium. Linearizing (4.7) yields
∑i
Rli
(δwi(t− τ f
li)
di + qi
− wiδqi(t− τ f
li)
(di + qi)2
)= 0,
where the equality is used in (4.7). The correspondingZ-transform in matrix form is
Rf (z)D−1MW (z)−Rf (z)BQ(z) = 0, (4.20)
where the diagonal matricesB, D, andM are
B := diag
(wi
(di + qi)2
),M := diag
(di
di + qi
), andD := diag(di).
SinceRf (z) is generally not a square matrix, we cannot cancel it in (4.20).
Equation (4.3) is already linear, and the correspondingZ-transform in matrix form is
Q(z) = Rb(z)T P (z). (4.21)
By combining (4.20) and (4.21), we obtain
I −RT
b (z)
Rf (z)B 0
Q(z)
P (z)
=
0
Rf (z)D−1M
W (z).
Solving this equation with block matrix inverse gives the transfer function fromW (z) to
Q(z)
Q(z)
W (z)= RT
b (z)(Rf (z)BRTb (z))−1Rf (z)D−1M.
TheZ-transform of the linearized, congestion window update algorithm is
zW (z) = γ (MW (z)−DBQ(z)) + (1− γ)W (z).
By combining the above equations, we get the open-loop transfer functionL(z) from W (z)
58
to W (z) as
L(z) = − (γ
(M −DBRT
b (z)(Rf (z)BRTb (z))−1Rf (z)D−1M
)+ (1− γ)I
)z−1.
A sufficient condition for local stability can be developed based on the generalized Nyquist
criterion [23, 31]. Since the open-loop system is stable, if we can show that the eigenvalue
loci of L(ejw) does not enclose−1 for ω ∈ [0, 2π), the closed-loop system is stable.
Therefore, if the spectral radius ofL(ejw) is strictly less than 1 forω ∈ [0, 2π), the system
will be stable.
Whenz = ejw, the spectral radii ofL(z) and−zL(z) are the same. Hence, we only
need to study the spectral radius of
J(z) : = γ(M −DBRTb (z)
(Rf (z)BRT
b (z))−1
Rf (z)D−1M + (1− γ)I.
Clearly, the eigenvalues ofJ(z) are dependent onγ. For any givenz = ejω, let the eigen-
values ofJ(z) be denoted byλi(γ), i = 1 . . . N , as functions ofγ ∈ (0, 1]. It is clear
that
|λi(γ)| = |γλi(1) + (1− γ)| ≤ γ|λi(1)|+ (1− γ).
Hence ifρ(J(z)) < 1 for any z = ejω for γ = 1, it will also hold for all γ ∈ (0, 1].
Therefore, it suffices to study the stability condition forγ = 1.
Let µi be theith diagonal entry of matrixM with µi = di/(di + qi). Denoteµmax :=
maxi µi. Since the end-to-end queueing delayqi cannot be zero at equilibrium (otherwise
the rate will be infinitely large), we haveqi > 0 andµmax < 1. The following lemma
characterizes the eigenvalues ofJ(z) with γ = 1.
Lemma 4.2. Whenz = ejω with ω ∈ [0, 2π) andγ = 1, the eigenvalues ofJ(z) have the
following properties:
1. There areL zero eigenvalues with the corresponding eigenvectors as the columns of
matrixM−1DBRTb (z).
59
2. The nonzero eigenvalues have moduli less than1 if τmax− τmin < 1/4, whereτmax =
maxi τi andτmin = mini τi .
Proof: At γ = 1, the matrixJ(z) is
M −DBRTb (z)(Rf (z)BRT
b (z))−1Rf (z)D−1M.
It is easy to check that
J(z)M−1DBRTb (z) = DBRT
b (z)−DBRTb (z) = 0.
SinceM−1DBRTb (z) has full column rank, it consists ofL linearly independent eigenvec-
tors ofJ(z) with corresponding eigenvalue 0. This proves the first assertion.
For the second assertion, suppose thatλ is an eigenvalue ofJ(z) for a givenz. Define
matrixA as
A : = J(z)− λI = (M − λI)−DBRTb (z)(Rf (z)BRT
b (z))−1Rf (z)D−1M,
which is singular by definition. Based on the matrix inversion formula (see, e.g., [62])
(J + EHS)−1 = J−1 − J−1E(H−1 + SJ−1E)−1SJ−1,
if J + EHS is singular, then eitherJ or H−1 + SJ−1E is singular. We can let
J := M − λI, E := −DBRTb (z), H := (Rf (z)BRT
b (z))−1, andS := Rf (z)D−1M.
SinceA = J + EHS is singular, eitherJ = M − λI or H−1 + SJ−1E is singular. The
second term can be reformulated intoRf (z)(B −M(M − λI)−1B)RTb (z).
Case 1:M − λI is singular. SinceM is diagonal, then
0 < λ =di
di + qi
= µi ≤ µmax < 1.
Case 2:Rf (z)(B −M(M − λI)−1B)RTb (z) is singular.
60
It is clear that
B −M(M − λI)−1B = diag(1− µi(µi − λ)−1βi
)= −λdiag
(βi
µi − λ
),
whereβi is the ith diagonal entry of matrixB. Hence,λ = 0 is always an eigenvalue,
which is claimed before. Ifλ is nonzero, it has to be true that
det
(Rf (z)diag
(βi
µi − λ
)RT
b (z)
)= 0. (4.22)
Whenz = ejω, we havez−1 = z. Hence, equation (4.19) can be rewritten as
RTb (z) = diag(z−τi)RT
f (z) = diag(z−τi)R∗f (z).
Substituting the above equation into (4.22) withz = ejω yields
det
(Rf (z)diag
(e−jωτiβi
µi − λ
)R∗
f (z)
)= 0. (4.23)
Therefore, the following formula is also zero
e−j(ωτmax+ψ) det
(Rf (z)diag
(ej(θi+ψ)βi
µi − λ
)R∗
f (z)
)= 0.
whereθi = (τmax − τi)ω, andψ can be any value. Whenτmax − τmin < 1/4, we have
0 ≤ θi = (τmax − τi)ω ≤ π/2.
Suppose that there is a solution such that|λ| ≥ 1. Based on Lemma 4.3, which will be
presented later, there exists aψ s.t. Im(diag(ej(θi+ψ)βi/(µi − λ))
)is a positive diagonal
matrix. Therefore the imaginary part of matrixRf (z)diag(ej(θi+ψ)βi/(µi − λ))
)R∗
f (z)
is positive definite, and the real part is symmetric. From Lemma 4.4 below, it has to be
nonsingular. This contradicts the equation
det
(Rf (z)diag
(ej(θi+ψ)βi
µi − λ
)R∗
f (z)
)= 0.
61
Hence, we have|λ| < 1.
The proof of Theorem 4.4 will be complete after the next two lemmas.
Lemma 4.3. Suppose that0 < µi < 1 and0 ≤ θi < π/2. If |λ| ≥ 1 , there exists aψ such
that
Im
(ej(θi+ψ)βi
µi − λ
)> 0 for i = 1 . . . N.
Proof: See Appendix 4.6.3.
Lemma 4.4. If the real part of a complex matrix is symmetric, and the imaginary part is
positive definite, then the matrix is nonsingular.
Proof: See Appendix 4.6.4.
4.4.2 Global stability for one link without feedback delay
In the absence of feedback delay, when there is only one link, the FAST TCP model can be
simplified into
wi(t + 1) = γ
(diwi(t)
di + q(t)+ αi
)+ (1− γ)wi(t), (4.24)
∑i
wi(t)
di + q(t)≤ c with equality if q(t) > 0, (4.25)
whereq(t) is the queueing delay at the link (subscript is omitted). The main result of this
section proves that the above system (4.24)–(4.25) is globally asymptotically stable and
converges to the equilibrium exponentially fast starting from any initial value.
Theorem 4.5. On a single link, FAST TCP converges exponentially to the equilibrium, in
the absence of feedback delay.
In the rest of this section, we prove the theorem in several steps. The first result is that
after finite stepsK1, equality always holds in (4.25) andq(t) > 0 for any t > K1. Define
62
the normalized congestion window sum asY (t) :=∑
i wi(t)/di. From (4.25), it is clear
thatq(t) > 0 if and only if Y (t) > c.
Lemma 4.5. There existsK1 > 0 such that the following claims are true for allt > K1:
1. q(t) > 0.
2. ν(t + 1) = (1− γ)ν(t) whereν(t) := Y (t)− c−∑i αi/di .
Proof: If initially q(t) = 0, which also meansY (t) ≤ c, from (4.24) we haveY (t + 1) =
Y (t) + γ∑
i αi/di, which linearly increases witht. ThenY (t) > c after some finite steps.
Therefore, there exists aK1 such thatY (t) > c andq(t) > 0 at t = K1.
We will show thatY (t) > c impliesY (t + 1) > c. Henceq(t) > 0 for all t > K1.
Moreover,ν(t) converges exponentially to 0.
SupposeY (t) > c. From∑
i wi(t)/(di + qi(t)) = c, we have
ν(t + 1) =∑
i
wi(t + 1)
di
−∑
i
αi
di
− c,
= (1− γ)∑
i
wi(t)− αi
di
+ γ∑
i
wi(t)
di + q(t)− c,
= (1− γ)
(∑i
wi
di
− c−∑
i
αi
di
)= (1− γ) ν(t).
This proves the second assertion. Moreover it implies
Y (t + 1) = (1− γ)Y (t) + γ
(∑i
αi
di
+ c
).
Hence,Y (t) > c impliesY (t + 1) > c andq(t + 1) > 0. This completes the proof.
For the rest of this subsection, we pick a fixedε with 0 < ε <∑
i αi/di. Define
qmin :=dmin
c
(∑i
αi
di
− ε
), and qmax :=
dmax
c
(∑i
αi
di
+ ε
),
wheredmin := mini di anddmax := maxi di.
Thenq(t) is bounded by these two values after finite steps.
63
Lemma 4.6. There exists a positiveK2 such thatqmin ≤ q(t) ≤ qmax for anyt ≥ K2.
Proof: From Lemma 4.5, after finite stepsK1, ν(t + 1) = (1 − γ)ν(t). Therefore, there
exists aK2 such that|ν(t)| < ε for all t ≥ K2. It implies
∑i
αi
di
<∑
i
wi(t)
di
− c + ε =∑
i
(wi(t)
di
− wi(t)
di + q(t)
)+ ε,
≤∑
i
q(t)wi(t)
dmin(di + q(t))+ ε =
q(t)c
dmin
+ ε.
.
Therefore,
q(t) ≥ dmin
c
(∑i
αi
di
− ε
)= qmin.
The proof forqmax is the same.
Define µi(t) := di/(di + q(t)), and denoteµmax := maxi di/(di + qmin), µmin :=
mini di/(di + qmax). Based on Lemma 4.6, we have1 > µmax ≥ µi(t) > µmin > 0 for any
t ≥ K2. Define
ηi(t) :=wi(t)− αi
αidi
− 1
q(t), (4.26)
and denoteηmax(t) := maxi ηi(t), ηmin(t) := mini ηi(t). We will show that the window
update for sourcei is proportional toηi(t), and the system is at equilibrium if and only if
all ηi(t) are zero. The next lemma gives bounds onηi(t).
Lemma 4.7. There exist two positive numbersδ1 andδ2 such that for allt ≥ K2
ηmax(t) > −δ1(1− γ)t and ηmin(t) < δ2(1− γ)t.
Proof: From (4.26), it is easy to check thatY (t + 1) − Y (t) = −γν(t). By Lemma 4.5,
whent ≥ K2 we have
Y (t + 1)− Y (t) = −γν(t) ≤ γ(1− γ)t−K2|ν(K2)| = κ(1− γ)t, (4.27)
64
whereκ := γ(1− γ)−K2|ν(K2)|.The update of sourcei’s congestion window is
wi(t + 1)− wi(t) = γ
(diwi(t)
di + q(t)+ αi − wi(t)
)=
γq(t)
di + q(t)
(αidi
q(t)− (wi(t)− αi)
),
= −γαidiq(t)
di + q(t)
(wi(t)− αi
αidi
− 1
q(t)
)= −γαiq(t)µi(t)ηi(t).
Chooseδ1 large enough such thatδ1Nγαminqminµmin/dmax > κ whereαmin := mini αi.
We now proveηmax(t) > −δ1(1 − γ)t for all t ≥ K2 by contradiction. Suppose that
there is a timet ≥ K2 such thatηmax(t) ≤ −δ1(1 − γ)t. Then all theηi(t) are negative,
which implies
Y (t + 1)− Y (t) =∑
i
(wi(t + 1)− wi(t))/di =∑
i
−γαiq(t)µi(t)ηi(t)/di,
≥ N(−ηmax)γαminqminµmin/dmax,
≥ δ1N(1− γ)tγαminqminµmin/dmax > κ(1− γ)t.
This contradicts equation (4.27), which proves the claim. The proof forηmin(t) is similar.
DefineL(t) as:
L(t) := ηmax(t)− ηmin(t). (4.28)
The following lemma implies that the difference between differentηi(t) goes to zero expo-
nentially fast.
Lemma 4.8. There are two positive numbersδ3 andδ4, such that fort ≥ K2 we have
1. L(t) ≥ 0.
2. L(t + 1) ≤ (1− γ + γµmax)L(t) + δ3(1− γ)t.
3. L(t) ≤ δ4(1− γ + γµmax)t.
Proof: See Appendix 4.6.5.
65
Lemma 4.9. Bothηmax(t) andηmin(t) exponentially converge to zero.
Proof: Whent ≥ K2, combining Lemma 4.7 and Lemma 4.8 yields an upper bound for
ηmax(t),
ηmax(t) = L(t) + ηmin(t) ≤ δ4(1− γ + γµmax)t + δ2(1− γ)t.
The lower bound ofηmax is−δ1(1−γ)t ≤ ηmax(t). Since both the upper and lower bounds
of ηmax(t) converge to zero exponentially fast, it exponentially goes to zero. The proof for
ηmin(t) is similar.
Proof of Theorem 4.5: The system is at equilibrium if and only ifwi(t) = wi(t + 1) for
all i. This is equivalent toηi(t) = 0 for all i because of the equation
wi(t + 1)− wi(t) = −γαiq(t)µi(t)ηi(t).
Since bothηmax(t) andηmin(t) converge to zero exponentially from any initial value, the
system converges to the equilibrium defined byηi(t) = 0 globally.
4.5 Conclusion
We have introduced the traditional continuous-time model for FAST TCP. We analyze its
stability for general networks. We prove that the FAST TCP system is globally stable
without feedback delay. When the feedback delays are present, a sufficient condition is
provided for local stability. However, there are certain inconsistencies between this model
and our experiments, which maybe due to the self-clocking effects.
We present a new discrete-time link model that fully captures the effect of self-clocking.
Using this discrete-time model, we have derived a sufficient condition for local asymptotic
stability for general networks in the presence of feedback delay. The condition states that
the system is stable if the difference among delays of the sources is small. This implies, in
particular, that a network with homogeneous sources is always stable, consistent with our
66
experimental experience so far. We also prove that FAST TCP is globally stable on a single
link in the absence of feedback delay.
This work can be extended in several ways. First, the condition for local asymptotic
stability derived appears more restrictive than our experiments suggest. Moreover, we have
also found scenarios where predictions of the discrete-time model disagree with experi-
ment. These discrepancies should be clarified. Second, it will be interesting to extend
the global stability analysis to general networks with feedback delays. Finally, the new
model and the analysis techniques here can be applied to analyze other congestion control
algorithms.
4.6 Appendix
4.6.1 Proof of Lemma 4.1
The FAST TCP model (4.1, 4.3, 4.5, 4.2, and 4.6) can be linearized into
δqi(t) =∑
l
Rliδpl(t− τ bli), δxi(t) =
δwi(t)
di + qi
− wiδqi(t)
(di + qi)2,
δwi(t) = γ
(−qiδwi(t)
di + qi
− diwiδqi(t)
(di + qi)2
), δyl(t) =
∑i
Rliδxi(t− τ fli),
δpl(t) = δyl(t)/cl,
wherewi andqi are the equilibrium values. Sinceτi = τ fli + τ b
li for all link l on the path of
sourcei, the following equation holds
RTb (s) = diag(e−τis)RT
f (−s). (4.29)
67
The Laplace transform of the linearized system in matrix form is
Q(s) = Rb(s)T P (s),
X(s) = D1W (s)−BQ(s, )
sW (s) = γ ((DD1 − I)W (s)−DBQ(s)) ,
Y (s) = Rf (s)X(s),
sP (s) = D3Y (s),
where the diagonal matrices are
D := diag(di) , D1 := diag
(1
di + qi
),
B := diag
(wi
(di + qi)2
), D3 := diag
(1
cl
).
The open-loop transfer function fromP (s) to P (s) can be derived based on the above
equations as
1
sD3Rf (s)
(γDD1(sI − γ(DD1 − I))−1 + 1
)D2R
Tb (s).
By using the fact thatTi = di + qi, xi = wi/Ti and (4.29), we can simplify the open loop
transfer functionL(s) into (4.16).
4.6.2 Proof of Theorem 4.3
It is sufficient to show that the eigenvalues of the open-loop transfer function do not encircle
−1 in the complex plain fors = jω, ω ≥ 0 when the condition is satisfied. Since both
X andΛ(s) are diagonal matrices, by using the similar technique in [24], it is not difficult
to check that whens = jω, the set of eigenvalues ofL(s) is same as that ofL(jω) =
Λ(s)RT (−jω)R(jω) except for some zero eigenvalues whereR(jω) is defined as
R(jω) := diag(1√cl
)Rf (jω)diag(√
xi).
68
Following the argument of [152], we study the convex hull as a function ofjω formed by
N Nyquist trajectories. More specifically, the spectrum ofL(jω) satisfies
σ(L(jω)) = σ(Λ(s)RT (−jω)R(jω)
)⊆ ρ
(RT (−jω)R(jω)
)· co(0 ∪ {Λi(jω)}) ,
wherei = 1 . . . N , co(·) denotes the convex hull, and
Λi(jω) :=e−jωTi
jωTi
jωTi + γTi
jωTi + γqi
,
where theτi is replaced withTi sinceτi = Ti at equilibrium. Similar to [24], the spectral
radius ofRT (−jω)R(jω) is less thanM , which is the maximal number of links in the path
of any source,M = maxi
∑l Rli. It implies
σ(L(jω)) ⊆ M · co(0 ∪ {Λi(jω), i = 1 . . . N}) .
Therefore a sufficient condition for local stability is thatMΛi(jω) does not encircle−1 for
any i.
It is a standard control theory result that the largest phase lag of(jωTi + γTi)/(jωTi +
γqi) is produced whenωTi =√
γTi · γqi , which is
∠j√
γTi · γqi + γTi
j√
γTi · γqi + γqi
= − tan−1 1− qi/Ti
2√
qi/Ti
.
The above equation yields
∠Λi(jω) ≥ −ωTi − π
2− tan−1 1− qi/Ti
2√
qi/Ti
.
Suppose that at frequencyωi the phase lag ofΛi(jω) is−π. Hence,
−π = Λi(jωi) ≥ −ωiTi − π
2− tan−1 1− qi/Ti
2√
qi/Ti
.
69
Based on (4.18) we have
ωiTi ≥ φ for i = 1 . . . N.
It is easy to check that the magnitude ofΛi(jω) is a decreasing function ofω. Therefore,
M |Λi(jωi)| ≤ M |Λi(jφ
Ti
)| = M
φ
√φ2 + γ2T 2
i
φ2 + γ2q2i
≤ M
φ
√φ2 + γ2T 2
max
φ2 + γ2q2min
< 1,
andMΛ(jωi) can not encircle−1. Based on the above argument, the system is locally
stable if (4.17 ) is satisfied.
4.6.3 Proof of Lemma 4.3
O
Z
Re
Im
A B
λ
λ
Figure 4.5: Illustration of Lemma 4.3.
Proof: There is a complex plane in Figure 4.5. Let the pointsA, B, andλ represent the
value ofµmin, µmax, andλ, respectively.Z is the intersection of segmentAλ and the unit
circle, andλ stands for the complex conjugate ofλ.
Let φi ∈ [0, 2π) be the phase of1/(µi − λ). Clearly, φi ∈ [0, π) if Im(λ) ≤ 0,
andφi ∈ (π, 2π) otherwise. Denoteφmax := maxi φi andφmin := mini φi, then0 ≤φmax − φmin ≤ π. Since everyµi is in the range[µmin, µmax], it is easy to check that every
70
φi is in the range formed by the phases of1/(µmin − λ) and1/(µmax − λ). This implies
φmax − φmin ≤ |∠ 1
µmin − λ− ∠ 1
µmax − λ| = ∠AλB = ∠AλB < ∠OZB < π/2.
Let ε > 0 be small enough such thatφmax − φmin < π/2 − ε. Choosingψ = −φmin + ε
gives us
∠ej(ψ+θi)βi
µi − λ= φi + ψ + θi,
= φi − φmin + ε + θi, (greater than 0)
< φmax − φmin + ε + π/2 < π.
The fact that its phase is in(0, π) implies that
Im
(ej(ψ+θi)βi
µi − λ
)> 0.
4.6.4 Proof of Lemma 4.4
Suppose thatA = Ar + jAi whereAr = ATr andAi is positive definite. IfA is singular,
there exists a nonzero vectorv such thatAv = 0. Suppose thatv = α + jβ. ThenAv = 0
gives
Arα− Aiβ = 0, (4.30)
Arβ + Aiα = 0. (4.31)
Multiplying βT to equation (4.30) yields
βT Arα = βT Aiβ ≥ 0. (4.32)
71
Multiplying αT to equation (4.31) gives us
αT Arβ = −αT Aiα ≤ 0. (4.33)
SinceβT Arα = αT ATr β = αT Arβ, both (4.32) and (4.33) must hold with equality. This
means that bothα andβ are zero. It contradicts the assumption thatv is nonzero.
4.6.5 Proof of Lemma 4.8
It is obvious thatL(t) ≥ 0 because of its definition in (4.28). We start with the update of
ηi(t)
ηi(t + 1)− ηi(t) =wi(t + 1)− wi(t)
αidi
− 1
q(t + 1)+
1
q(t),
= −γαiq(t)µi(t)ηi(t)
αidi
− 1
q(t + 1)+
1
q(t),
= −γq(t)ηi(t)
di + q(t)− 1
q(t + 1)+
1
q(t),
= −γ(1− µi(t))ηi(t)− 1
q(t + 1)+
1
q(t).
For simplicity, we letai(t) := 1 − γ + γµi(t) and denoteamax := 1 − γ + γµmax, then
ai(t) ≤ amax. This definition simplifies the above equation into
ηi(t + 1) = ai(t)ηi(t)− 1
q(t + 1)+
1
q(t). (4.34)
By comparing equation (4.34) for sourcei andj, we obtain
ηi(t + 1)− ηj(t + 1) = ai(t)ηi(t)− aj(t)ηj(t). (4.35)
Without loss of generality, suppose that at timet + 1, the largest and smallest values ofη
are achieved at sourcesi andj, respectively. This assumption implies
L(t + 1) = ηi(t + 1)− ηj(t + 1).
72
The upper bound ofL(t+1) is derived by considering the following three cases separately.
Case 1:ηi(t) andηj(t) have different signs. It is easy to see that
L(t + 1) = ai(t)ηi(t)− aj(t)ηj(t) ≤ amax(ηi(t)− ηj(t)),
≤ amax(ηmax(t)− ηmin(t)) = amaxL(t).
Case 2:Bothηi(t) andηj(t) are positive. It yields
L(t + 1) = ai(t)µi(t)ηi(t)− aj(t)ηj(t) ≤ amaxηmax(t),
= amaxL(t) + amaxηmin(t) ≤ amaxL(t) + amaxδ2(1− γ)t,
≤ amaxL(t) + δ3(1− γ)t,
where the last step is choosingδ3 larger thanamaxδ2.
Case 3:Bothηi(t) andηj(t) are negative. The proof is similar to that for Case 2.
Summarizing all the above cases, we have provedL(t+1) ≤ amaxL(t)+ δ3(1−γ)t for
all t ≥ K2. Denoteb := 1 − γ, then1 > amax > b ≥ 0. For anyt ≥ K2, an upper bound
of L(t) is
L(t) ≤ amaxL(t− 1) + δ3bt−1 ≤ at−K2
max L(K2) + δ3(bt−1 + bt−2amax + · · ·+ bK2at−K2−1
max ),
=
(a−K2
max L(K2)− δ3bK2a−K2
max
b− amax
)at
max +δ3
b− amax
bt.
Note that the coefficient ofbt is negative. By choosingδ4 as the coefficient ofatmax, we get
L(t) ≤ δ4atmax = δ4(1− γ + γµmax)
t.
73
Chapter 5
Cross-Layer Optimization in TCP/IPNetworks
5.1 Introduction
Recent studies have shown that any TCP congestion control algorithm can be interpreted
as carrying out a distributed primal-dual algorithm over the Internet to maximize aggregate
utility, and a user’s utility function is defined by its TCP algorithm, see, e.g., [80, 97, 116,
107, 101, 88, 96] for unicast, [75, 30] for multi-cast, and [98, 79, 138] for recent surveys
and further references. All of these works assume that routing is given and fixed at the
timescale of interest, and TCP, together with active queue management (AQM), attempts
to maximize aggregate utility over source rates. In this chapter, we study the cross-layer
utility maximization at the timescale of route changes.
We focus on the situation where a single minimum-cost route (shortest path) is selected
for each source-destination pair. This models IP routing in the current Internet within an
Autonomous System using common routing protocols such as OSPF [119]1 or RIP [56].
Routing is typically updated at a much slower timescale than TCP–AQM. We model this
by assuming that TCP and AQM converge instantly to equilibrium after each route update
to produce source rates and “congestion prices” for that update period. These congestion
prices may represent delays or loss probabilities across network links. They determine
the next routing update in the case of dynamic routing, similar to the system analyzed in
1Even though OSPF implements a shortest-path algorithm, it allows multiple equal-cost paths to be uti-lized. Our model ignores this feature.
74
[48]. Thus TCP–AQM/IP forms a feedback system where routing interacts with congestion
control in an iterative process. We are interested in the equilibrium and stability properties
of this iterative process. To simplify notation, we will henceforth use TCP–AQM/IP and
TCP/IP interchangeably.
Here are our main results. In the case of pure dynamic routing, i.e., when link costs are
the congestion prices generated by TCP–AQM, it turns out that we can interpret TCP/IP
as a distributed primal-dual algorithm to maximize aggregate utility overbothsource rates
(by TCP–AQM) and routes (by IP) if TCP/IP converges. We consider the problem, and its
Lagrangian dual, of maximizing utility over source rates and over routing that use only a
singlepath for each source-destination pair. Unlike the TCP-AQM problem or the multi-
path routing problem that are convex optimizations with no duality gap, the single path
TCP/IP problem is non-convex and generally has a duality gap. Equilibrium of the TCP/IP
system exists if and only if this problem has no duality gap. In this case, TCP/IP equilibrium
solves both the primal and the dual problem. Moreover, it incurs no penalty for not splitting
traffic across multiple paths: optimal single-path routing achieves the same aggregate utility
as optimal multi-path routing. Multi-path routing can achieve a strictly higher utility only
when there is a duality gap between the single-path primal and dual problems, but in this
case, the TCP/IP iteration does not even have an equilibrium, let alone solving the utility
maximization problem.
Even when the single-path problem has no duality gap and TCP/IP has an equilibrium,
the equilibrium is generally unstable under pure dynamic routing. It can be stabilized by
adding a sufficiently large static component to the definition of link cost. The existence
and characterization of TCP/IP equilibrium when the link costs are not pure congestion
prices, however, are open problems. To proceed, we specialize to a ring network with a
common destination and demonstrate an inevitable tradeoff between utility maximization
and routing stability (Section 5.5). Specifically, we show that the TCP/IP system over the
special ring network is indeed unstable when link costs are pure prices. It can be stabilized
by adding a static component to the link cost, but at the expense of a reduced utility in
equilibrium. The loss in utility increases with the weight on the static component. Hence,
while stability requires a small weight on prices, utility maximization favors a large weight.
75
We present numerical results to validate these qualitative conclusions in a general network
topology. These results also suggest that routing instability can reduce aggregate utility to
less than that achievable by (the necessarily stable) pure static routing.
Indeed we show that if the link capacities are optimally provisioned, thenpure static
routing is enough to maximize utility even for general networks (Section 5.6). Moreover,
it is optimal within the class of multi-path routing: again, there is no penalty at optimality
in not splitting traffic across multiple paths.
Finally, we discuss some implications and limitations of these results (Section 5.7).
5.2 Related work
Our goal is to understand equilibrium and stability issues in cross-layer optimization of
TCP/IP networks. Another approach to joint routing and congestion control is to allow
multi-path routing, i.e., a source can transmit its data along multiple paths to its destination.
In this formulation, a source’s decision is decomposed into two: how much traffic to send
(congestion control) and how to distribute it over the available paths (multi-path routing
or load balancing) in order to maximize aggregate utility. This has been analyzed in, e.g.,
[46, 80, 74]. The general intuition is that, for each source-destination pair, only paths with
the minimum, and hence equal, “congestion prices” will be used, and this minimum price
determines the total source rate as in the single-path case. In contrast to TCP/IP networks,
this formulation assumes that both decisions operate on the same timescale. However, it
provides an upper bound to the problem TCP/IP attempts to solve (see Section 5.4).
The multi-path problem is equivalent to the multicommodity flow problem which has
been extensively studied; see, e.g., [1, 13]. The standard formulation is to maximize aggre-
gate throughput, corresponding to a common and linear utility function. It is then a linear
program and therefore can be solved in polynomial time, though there are combinatorial
algorithms for this class of problems that are more efficient. Recently, fast approxima-
tion algorithms and their competitive ratios have been developed for network flow, and
other, problems, e.g., [130, 48, 7]. Since the multi-path problem upper bounds the TCP/IP
problem, the work on network flow problems provides insight to the performance limit of
76
TCP/IP. There are however differences. First, our single-path routing problem is NP-hard
(see Section 5.4) and generally has a duality gap, whereas the network flow problem is gen-
erally a linear program that is in P and has no duality gap. Second, the utility functions that
correspond to common TCP algorithms are strictly concave whereas they are typically lin-
ear, in fact, identity, functions in network flow problems. Third, the algorithms developed
for network flow problems are typically centralized and therefore cannot model TCP/IP
iterations or be carried out in a large network where they must be decentralized.
Instability of single-path routing is not surprising as it is well known that stability gen-
erally requires that the relative weight on the dynamic (traffic-sensitive) component of the
link cost be small. Indeed, our conclusions are similar to those reached in [12, 103] that
study the same ring network for routing stability using different link costs. Here, since the
dynamic component is the dual-optimal price for the utility maximization problem com-
puted by TCP–AQM, this implies a tradeoff between routing stability and utility maxi-
mization.
5.3 Model
In general, we use small letters to denote vectors, e.g.,x with xi as itsith component;
capital letters to denote matrices, e.g.,H,W,R, or constants; e.g.,L,N, K i; and script
letters to denote sets of vectors or matrices, e.g.,Ws,Wm,Rs,Rm. Superscript is used to
denote vectors, matrices, or constants pertaining to sourcei, e.g.,yi, wi, H i, K i.
A network is modelled as a set ofL uni-directional links with finite capacitiesc =
(cl, l = 1, . . . , L), shared by a set ofN source-destination pairs, indexed byi (we will also
refer to the pair simply as “sourcei”). There areKi acyclic paths for sourcei represented
by aL×K i 0-1 matrixH i where
H ilj =
1, if path jof sourcei uses linkl
0, otherwise.
Let Hi be the set of all columns ofH i that represents all the available paths toi under
77
single-path routing. Define theL×K matrixH as
H = [H1 . . . HN ],
whereK :=∑
i Ki, andH defines the topology of the network.
Let wi be aK i × 1 vector where thejth entry represents the fraction of sourcei on its
jth path such that
wij ≥ 0 ∀j, and 1T wi = 1,
where1 is a vector of an appropriate dimension with the value1 in every entry. We require
wij ∈ {0, 1} for single-path routing and allowwi
j ∈ [0, 1] for multi-path routing. Collect
the vectorswi, i = 1, . . . , N , into aK ×N block-diagonal matrixW . LetWs be the set of
all such matrices corresponding to single-path routing defined as
{W |W = diag(w1, . . . , wN) ∈ {0, 1}K×N , 1T wi = 1 }.
Define the corresponding setWm for multi-path routing as
{W |W = diag(w1, . . . , wN) ∈ [0, 1]K×N , 1T wi = 1 }. (5.1)
As mentioned above,H defines the set of acyclic paths available to each source and rep-
resents the network topology.W defines how the sources load balance across these paths.
Their product defines aL × N routing matrixR = HW that specifies the fraction ofi’s
flow at each linkl. The set of all single-path routing matrices is
Rs := { R | R = HW,W ∈ Ws }, (5.2)
and the set of all multi-path routing matrices is
Rm := { R | R = HW,W ∈ Wm }. (5.3)
78
The difference between single-path routing and multi-path routing is the integer constraint
onW andR. A single-path routing matrix inRs is an 0-1 matrix
Rli =
1, if link l is in a path of sourcei
0, otherwise.
A multi-path routing matrix inRm is one whose entries are in the range[0, 1]
Rli
> 0, if link l is in a path of sourcei
= 0, otherwise.
The path of sourcei is denoted byri = [R1i . . . RLi]T , the ith column of the routing
matrixR.
We consider the situation where TCP–AQM operates at a faster timescale than routing
updates. We assume asinglepath is selected for each source-destination pair that mini-
mizes the sum of the link costs in the path, for some appropriate definition of link cost. In
particular, traffic is not split across multiple paths from the source to the destination even
if it is available. This models, for example, IP routing within an Autonomous System. We
focus on the timescale of the route changes and assume TCP–AQM is stable and converges
instantly to equilibrium after a route change. As in [96], we will interpret the equilibria of
various TCP and AQM algorithms as solutions of a utility maximization problem defined
in [80]. Different TCP algorithms solve the same prototypical problem (5.4) with different
utility functions [96, 101].
Specifically, suppose each sourcei has a utility functionUi(xi) as a function of its total
transmission ratexi. We assumeUi is strictly concave increasing (which is the case for
common TCP algorithms [96]). LetR(t) ∈ Rs be the (single-path) routing in periodt. Let
the equilibrium ratesx(t) = x(R(t)) and pricesp(t) = p(R(t)) generated by TCP–AQM
in periodt, respectively, be the optimal solutions of the constrained maximization problem
maxx≥0
∑i
Ui(xi) s. t. R(t)x ≤ c, (5.4)
79
and its Lagrangian dual
minp≥0
∑i
maxxi≥0
(Ui(xi)− xi
∑
l
Rli(t)pl
)+
∑
l
clpl. (5.5)
The prices,pl(t), l = 1, . . . , L, are measures of congestion, such as queueing delays or loss
probabilities [96, 101]. We assume the link costs in periodt are
dl(t) = apl(t) + bτl, (5.6)
wherea ≥ 0, b ≥ 0, andτl > 0 are constants. Based on these costs, each source computes
its new routeri(t + 1) ∈ Hi individually that minimizes the total cost on its path
ri(t + 1) = arg minri∈Hi
∑
l
dl(t)ril . (5.7)
In (5.6),τl are traffic insensitive components of the link costdl(t), e.g.,τl = 1/cl. If τl
represent the fixed propagation delays across linksl andpl(t) the queueing delays at link
l, thendl(t) represent total delays across linkl. The protocol parametersa andb determine
the responsiveness of routing to network traffic:a = 0 corresponds to static routing,b = 0
corresponds to purely dynamic routing, and the larger the ratio ofa/b, the more responsive
routing is to network traffic. As we will see below, they determine whether an equilibrium
exists and whether it is stable, and the achievable utility at equilibrium.
An equivalent way to specify the TCP–AQM/IP system as a dynamical system, at the
timescale of route changes, is to replace (5.4)–(5.5) by their optimality conditions. The
routing is updated according to
ri(t + 1) = arg minri∈Hi
∑
l
(apl(t) + bτl)ril , for all i, (5.8)
80
wherep(t) andx(t) are given by
∑
l
Rli(t)pl(t) = U ′i(xi(t)) for all i (5.9)
∑i
Rli(t)xi(t)
≤ cl if pl(t) ≥ 0
= cl if pl(t) > 0for all l (5.10)
x(t) ≥ 0, p(t) ≥ 0. (5.11)
This set of equations describe how the routingR(t), ratesx(t), and pricesp(t) evolve.
Note thatx(t) andp(t) depend only onR(t) through (5.9)–(5.11), implicitly assuming that
TCP–AQM converges instantly to an equilibrium given the new routingR(t).
We say that(R∗, x∗, p∗) is an equilibrium of TCP/IPif it is a fixed point of (5.4)–
(5.7), or equivalently, (5.8)–(5.11), i.e., starting from routingR∗ and associated(x∗, p∗),
the above iterations yield(R∗, x∗, p∗) in the subsequent periods.
5.4 Equilibrium of TCP/IP
In this section, we study the condition under which TCP/IP as modelled by (5.4)–(5.7) or
(5.8)–(5.11) has an equilibrium and characterize the equilibrium as the optimal solution
of an optimization problem. Since (5.8)–(5.11) is a system of mixed integer nonlinear
inequalities, characterization of its equilibrium, even the basic question of existence and
uniqueness, is in general difficult to determine. The case of pure dynamic routing,a > 0
andb = 0, is the simplest and most instructive.
We completely characterize the case of pure dynamic routing,a > 0 andb = 0 in this
section. Without loss of generality, we seta = 1 in (5.7) and (5.8) whenb = 0.
Consider the joint optimization problem
maxR∈Rs
maxx≥0
∑i
Ui(xi) s. t.Rx ≤ c, (5.12)
81
and its Lagrangian dual
minp≥0
∑i
maxxi≥0
(Ui(xi)− xi min
ri∈Hi
∑
l
Rlipl
)+
∑
l
clpl, (5.13)
whereri is the ith column ofR with ril = Rli. While (5.4)–(5.5) maximize utility over
source rates only, problem (5.12) maximizes utility over both rates and routes. While (5.4)
is a convex program without duality gap, problem (5.12) is non-convex because the variable
R is discrete and generally has a duality gap.2 The interesting feature of the dual problem
(5.13) is that the maximization overR takes the form of minimum-cost routing with prices
p generated by TCP–AQM as link costs. This suggests that TCP/IP might turn out to be
a distributed algorithm that attempts to maximize utility, with proper choice of link costs.
This is indeed true when equilibrium of TCP/IP exists.
Theorem 5.1.Under pure dynamic routing, that is,a = 1 andb = 0.
1. An equilibrium(R∗, x∗, p∗) of TCP/IP exists if and only if there is no duality gap
between (5.12) and (5.13).
2. In this case, the equilibrium(R∗, x∗, p∗) is a solution of (5.12) and (5.13).
Hence, one can regard the layering of TCP and IP as a decomposition of the utility
maximization problem over source rates and routes into a distributed and decentralized
algorithm, carried out on two different timescales, in the sense that an equilibrium of the
TCP/IP iteration (5.8)–(5.11), if it exists, solves (5.12) and (5.13). An equilibrium may not
exist. Even if it does, it may not be stable–an issue we address in Section 5.5.
Example 1: Duality gap
A simple example, where there is a duality gap and equilibrium of TCP/IP does not exist,
consists of a single source-destination pair connected by two parallel links each of capacity
1, as shown in Figure 5.2 (takeN = 1). Clearly, under pure dynamic single-path routing,
equilibrium of TCP/IP does not exist, because the TCP/IP iteration (5.8)–(5.11) will choose
2The nonlinear constraintRx ≤ c can be converted into a linear constraint–see proof of Theorem 5.2–sointeger constraint onR is the real source of difficulty.
82
one of the two routes in each period to carry all traffic. TCP–AQM will generate positive
price for the chosen route and zero price for the other route, so that in the next period, the
other route will be selected, and the cycle repeats. The proof that there is a duality gap
between the primal problem (5.12) and the dual problem (5.13) is given in Appendix 5.8.1
(takeN = 1). Intuitively, either path is optimal (both for primal and for dual problem). For
the primal problem the optimal rate isx∗ = 1, constrained by link capacity, whereas for
the dual problem, the optimal rate isx∗ = 2, primal infeasible. Hence the primal optimal
value isU(1), strictly less than the dual optimal value ofU(2).
The duality gap is a measure of “cost of not splitting”. To elaborate, define the La-
grangian [14, 104]
L(R, x, p) =∑
i
(Ui(xi)− xi
∑
l
Rlipl
)+
∑
l
clpl.
The primal (5.12) and dual (5.13) can then be expressed respectively as
Vsp = maxR∈Rs,x≥0
minp≥0
L(R, x, p)
Vsd = minp≥0
maxR∈Rs,x≥0
L(R, x, p).
If we allow sources to distribute their traffic among multiple paths available to them, then
the corresponding problems for multi-path routing are
Vmp = maxR∈Rm,x≥0
minp≥0
L(R, x, p)
Vmd = minp≥0
maxR∈Rm,x≥0
L(R, x, p). (5.14)
Theorem 5.2.The relations among these four problems areVsp ≤ Vsd = Vmp = Vmd.
According to Theorem 5.1, TCP/IP has an equilibrium exactly when there is no dual-
ity gap in the single-path utility maximization, i.e., whenVsp = Vsd. Theorem 5.2 then
says that in this case, there is no penalty in not splitting the traffic, i.e., single-path routing
performs as well as multi-path routing,Vsp = Vmp. Multi-path routing achieves a strictly
83
higher utilityVmp precisely when TCP/IP has no equilibrium, in which case the TCP/IP it-
eration (5.8)–(5.11) cannot converge, let alone solving the single-path utility maximization
problem (5.12) or (5.13). In this case the problem (5.12) and its dual (5.13) do not charac-
terize TCP/IP, but their gap measures the loss in utility in restricting routing to single-path
and is of independent interest.
Even though minimum-cost routing is polynomial, it is shown in [153] that single-path
utility maximization is NP-hard. This is not surprising since, e.g., a related problem on
load balancing on a ring has been proved to be NP-hard in [29].
Theorem 5.3.The primal problem (5.12) is NP-hard.
Theorem 5.3 shows that the general problem (5.12) is NP-hard by reducing all instances
of the integer partition problem to some instances of the primal problem (5.12). Theorem
5.2 however implies that the sub-class of the utility maximization problems with no duality
gap are polynomial-time solvable, since they are equivalent to multi-path problems that are
concave programs and hence polynomial-time solvable. It is a common phenomenon for
sub-classes of NP-hard problems to have polynomial-time algorithms. Informally, the hard
problems are those with nonzero duality gap.
The rest of this section is devoted to the proofs for Theorems 5.1–5.3. We will first
prove Theorem 5.2. Then we show that an equilibrium of TCP/IP must solve the dual
problem (5.13). This together with Theorem 5.2 implies Theorem 5.1. Finally, we present
a proof for Theorem 5.3.
Proof of Theorem 5.2. The inequality follows from the weak duality theorem [14]. We
now proveVsd = Vmd andVmp = Vmd. We formulateVsd as
Vsd = minp≥0
maxR∈Rs,x≥0
(∑i
Ui(xi)− pT Rx
)+ pT c,
= minp≥0
maxx≥0
(∑i
Ui(xi)− minW∈Ws
pT HWx
)+ pT c,
84
whereR = HW with W ∈ Ws from (5.2). Similarly, from (5.3) we have
Vmd = minp≥0
maxx≥0
(∑i
Ui(xi)− minW∈Wm
pT HWx
)+ pT c.
Define functionsfs(x, p) andfm(x, p) as
fs(x, p) := minW∈Ws
pT HWx, fm(x, p) := minW∈Wm
pT HWx.
In order to show thatVsd = Vmd, we only need to show thatfs(x, p) = fm(x, p). Clearly
fs(x, p) ≥ fm(x, p) sinceWs ⊆ Wm. From (5.1), noting thatW = diag(wi), we have
fm(x, p) = minW
pT HWx
s. t. 1T wi = 1, 0 ≤ wij ≤ 1.
Since this is a linear programming for givenx andp, at least one of the optimal points lies
on the boundary, i.e.,wij = 0 or 1 for all i andj, and hence is inWs ⊆ Wm. Such a point
solves bothfs(x, p) andfm(x, p), i.e.,fs(x, p) = fm(x, p).
To showVmd = Vmp, transformVmp into a convex optimization with linear constraints,
which hence has no duality gap; see, e.g., [14]. Now,Vmp is equivalent to the problem
maxR∈Rm,x≥0
∑i
Ui(xi) s.t. Rx ≤ c. (5.15)
Note that this is not a convex program since the feasible set specified byRx ≤ c is generally
not convex. Define theKi × 1 vectorsyi in terms of the scalarxi and theKi × 1 vectors
wi as the new variables
yi = xiwi. (5.16)
The mapping from(xi, wi) to yi is one-to-one: the inverse of (5.16) isxi = 1T yi and
wi = yi/xi. Now change the variables in (5.15) and (5.14) from(W,x) to y by substituting
85
xi = 1T yi andRx = HWx = Hy into (5.15) and (5.14). We obtain an equivalent problem
maxy≥0
∑i
Ui(1T yi) s.t. Hy ≤ c
and its Lagrangian dual. This is a convex program with linear constraint and hence has no
duality gap. This provesVmp = Vmd.
Proof of Theorem 5.1. It is easy to show that optimal solutions exist for both the primal
problem (5.12) and its dual (5.13), so the issue is whether there is a duality gap. We will
prove the theorem in two steps. First, given an equilibrium(R, x, p) of TCP/IP, we will
show that it solves both the primal (5.12) and the dual (5.13), and hence there is no duality
gap. Then, given a solution(R∗, x∗, p∗) of the primal and the dual problems, we will show
that it is an equilibrium of TCP/IP.
Step 1: Necessity.Let (R, x, p) be an equilibrium of TCP/IP, i.e., a fixed point of (5.4)–
(5.7) witha = 1, b = 0. Then
pT ri = minri∈Hi
pT ri for all i, (5.17)
(p, x) = arg minp≥0
maxx≥0
(∑i
U(xi)− pT Rx
)+ pT c, (5.18)
whereri are theith columns of routing matrixR ∈ Rs.3 We will show that(R, x, p) solves
the dual problem (5.13). Then, since the dual problem (5.13) upper bounds the primal
problem (5.12) by Theorem 5.2, andR ∈ Rs is a single-path routing and hence primal
feasible,(R, x, p) also solves the primal (5.12).
To show that(R, x, p) solves the dual problem, we use the fact that the dual problem
has an optimal solution, denoted by(R∗, x∗, p∗) and show that both achieve the same dual
3One can exchange the order of min and max in (5.18) since givenR, there is no duality gap inmaxx≥0
∑i Ui(xi) s. t. Rx ≤ c.
86
objective value, i.e.,L(R, x, p) = L(R∗, x∗, p∗). Now
(p∗, x∗, R∗) = arg minp≥0
maxx≥0
(∑i
U(xi)− minR∈Rs
pT Rx + pT c
). (5.19)
Let
f(p) := maxx≥0
(∑i
U(xi)− pT Rx
)+ pT c,
g(p) := maxx≥0
(∑i
U(xi)− minR∈Rs
pT Rx
)+ pT c.
Then (5.18) impliesf(p) = minp≥0 f(p) and (5.19) impliesg(p∗) = minp≥0 g(p). Since
R ∈ Rs, we have
f(p) ≤ g(p) for all p ≥ 0,
and hence
f(p) = minp≥0
f(p) ≤ minp≥0
g(p) = g(p∗).
On the other hand
f(p) = maxx≥0
∑i
U(xi)− pT Rx + pT c
= maxx≥0
∑i
U(xi)−∑
i
xi
(pT ri
)+ pT c
= maxx≥0
∑i
U(xi)−∑
i
xi
(minri∈Hi
pT ri
)+ pT c
= g(p)
≥ g(p∗),
where the third equality follows from (5.17). Therefore,f(p) = g(p∗) = g(p) and
L(R, x, p) = L(R∗, x∗, p∗). Moreover,(R, x, p) is an optimal solution of the dual prob-
lem.
87
Step 2: Sufficiency. Assume that there is no duality gap and(R∗, x∗, p∗) is an optimal
solution for both the primal problem (5.12) and its dual (5.13). We claim that it is also an
equilibrium of (5.4)–(5.7) witha = 1 andb = 0, i.e., we need to show that
(p∗)T (ri)∗ = minri∈Hi
(p∗)T ri, (5.20)
(p∗, x∗) = arg minp≥0
maxx≥0
L(R∗, x, p) = arg maxx≥0
minp≥0
L(R∗, x, p), (5.21)
where(ri)∗ are theith columns ofR∗. The second equality in (5.21) follows from the fact
that there is no duality gap for the TCP–AQM problem.
Since(R∗, x∗, p∗) solves the dual problem (5.13), the optimal routing matrixR∗ satisfies
(5.20) by the saddle point theorem [14]. But(R∗, x∗, p∗) also solves the primal problem
(5.12). In particular,(x∗, p∗) solves the utility maximization problem over source rates and
its Lagrangian dual, withR∗ as the routing matrix, i.e.,(x∗, p∗) satisfies (5.21).
Proof of Theorem 5.3.We describe a polynomial time procedure that reduces an instance
of integer partition problem [47, pp. 47] to a special case of the primal problem. Given a
set of integersc1, . . . , cN , the integer partition problem is to find a subsetA ⊂ {1, . . . , N}such that
∑i∈A
ci =∑
i 6∈A
ci.
Given an instance of the integer partition problem, consider the network in Figure 5.1, with
N sources at the root, two relay nodes, andN receivers, one at each of theN leaves. The
two links from the root to the relay nodes have a capacity of∑
i ci/2 each, and the two links
from each relay node to receiveri have a capacity ofci. All receivers have the same utility
function that is increasing. The routing decision for each source is to decide which relay
node to traverse. Clearly, maximum utility of∑
i Ui(ci) is attained when each receiveri
receives at rateci, from exactly one of the relay nodes, and the links from the root to the
two relay nodes are both saturated. Such a routing exists if and only if there is a solution to
the integer partition problem.
88
ic
2
1ic
2
1
1c
2c
Nc
Nc 1
c
2c
Figure 5.1: Network to which integer partition problem can be reduced.
Comment: The case ofb > 0 for general network is completely open. Ifa = 0 and
b > 0, routingR(t) = R, for all t, is the static minimum-cost routing withτl as the link
costs. An equilibrium(R, x(R), p(R)) always exists in this case. Even thoughR minimizes
routing cost and(x(R), p(R)) solves (5.4)–(5.5), it is not known if(R, x(R), p(R)) jointly
solves any optimization problem.
For the case ofa > 0 andb > 0, even the existence of equilibrium is unknown for
general networks. In the following section, we will study the dynamics of TCP/IP under
the assumption that such an equilibrium of TCP/IP exists.
5.5 Dynamics of TCP/IP
Theorem 5.1 suggests using pure congestion pricesp(t) generated by TCP–AQM as link
costs. In this case, an equilibrium of TCP/IP, when it exists, maximizes aggregate utility
over both rates and routes. We show in this section however that the equilibrium may not
be unstable. Routing can be stabilized by including a strictly positive traffic-insensitive
(static) component in link costs (b > 0), but at a reduced achievable utility. There thus
seems to be an inevitable tradeoff between achievable utility and routing stability.
To make this precise, we start with analysis of a special ring network with a common
89
destination. As remarked in the last section, for a general network, we do not even know if
an equilibrium exists whenb > 0, let alone characterizing it. For the ring network, however,
not only does equilibrium always exists, but we can also study rigorously its stability and
achievable utility, as well as their tradeoff under minimum-cost routing. We also illustrate
through a numerical example that the qualitative conclusions derived from this network
seem to generalize to a general network.
5.5.1 Simple ring network
Consider a ring network withN + 1 nodes, indexed byi = 0, 1, . . . , N . Nodesi ≥ 1 are
sources and their common destination is node 0; see Figure 5.2. For notational convenience
r
2
0
N 1
Figure 5.2: A ring network.
we will also refer to node0 as nodeN + 1. Each pair of nodes is connected by two links,
one in each direction. We will refer to the two uni-directional links between nodei − 1
and i as link i; the direction should be clear from the context. The fixed delay on linki
is denoted asτi > 0, i = 1, . . . , N + 1, in each direction. We construct the cost on link
i in periodt asdi(t) = api(t) + bτi, wherepi(t) is the price on linki. At time t, source
i routes all of its traffic in the direction, counterclockwise or clockwise, with the smaller
cost. The ring network is particularly simple because the routing of the whole network
can be represented by a single numberr. Note that under minimum-cost routing, if node
i sends in the counterclockwise direction, so must nodei − 1, and if nodei sends in the
clockwise direction, so must nodei + 1. Hence, we can represent routing on the network
90
by r ∈ {0, . . . , N} with the interpretation that nodes1, . . . , r send in the counterclockwise
direction and nodesr + 1, . . . , N send in the clockwise direction.
For this special case, we now show that the duality gap is trivial, that minimum-cost
routing based just on prices (b = 0) indeed solves the primal and dual problems as Theorem
5.1 guarantees, but that the equilibrium is unstable. Using a continuous model, we then
show that routing can be stabilized if the weightb on the fixed delay is nonzero and the
weighta on price is small enough. The maximum achievable utility however decreases with
smallera. There is thus an inevitable tradeoff between utility maximization and routing
stability.
5.5.2 Utility and stability of pure dynamic routing
Suppose all sourcesi have the same utility functionU(xi), and all links have the same
capacity ofc = 1 unit. We assume thatU is strictly concave increasing and differen-
tiable. Then at any time, only link 1, in the counterclockwise direction, and linkN + 1,
in the clockwise direction, can be saturated and have strictly positive price. The utility
maximization problem (5.12) reduces to the following simple form
maxr∈{0,...,N}
maxxi
∑i
U(xi) (5.22)
subject tor∑
i=1
xi ≤ 1, andN∑
i=r+1
xi ≤ 1. (5.23)
When routing isr, nodesi = 1, . . . , r see pricep1(r) on their paths while nodesi =
r + 1, . . . , N see pricepN+1(r) on their paths. Since these ratesxi(r) and pricespi(r) are
primal and dual optimal, they satisfy [97]
U ′(xi(r)) = p1(r) for i = 1, . . . , r, (5.24)
U ′(xi(r)) = pN+1(r) for i = r + 1, . . . , N. (5.25)
This implies thatx1(r) = · · · = xr(r) andxr+1(r) = · · · = xN(r).
It is easy to see that the optimal routingr∗ 6= 0 or N . Hence both constraints are active
91
at optimality, implying that (from (5.23))
x1(r) = · · · = xr(r) = 1r, (5.26)
xr+1(r) = · · · = xN(r) = 1N−r
. (5.27)
The problem (5.22)–(5.23) thus becomes
maxr∈{1,...,N−1}
r U
(1
r
)+ (N − r) U
(1
N − r
).
Dividing the objective function byN and using the strict concavity ofU , we have
r
NU
(1
r
)+
N − r
NU
(1
N − r
)≤ U
(2
N
),
with equality if and only ifr = N/2. This implies that the optimal routing is
r∗ := bN/2c (5.28)
and the maximum utility is
V ∗ :=
⌊N
2
⌋U
(1
bN/2c)
+
⌈N
2
⌉U
(1
dN/2e)
, (5.29)
wherebyc is the largest integer less or equal toy anddye is the smallest integer greater or
equal toy.
It can be shown that there is no duality gap for the ring network considered here
whenN is even, by verifying that routingr∗ in (5.28), ratesxi(r∗) in (5.27), and prices
p1(r∗), pN+1(r
∗) in (5.24)–(5.25) are indeed primal-dual optimal.4 WhenN is odd, there
is generally a duality gap due to integral constraint onr; see Appendix 5.8.1 for a proof.
This duality gap disappears in the convexified problem when routing is allowed to take
real value in[0, N ], a model we consider in the next subsection. This suggests that TCP
together with minimum-cost routing based on prices can potentially maximize utility for
4This also follows from Theorem 5.1 and the fact thatr = N/2 is by symmetry the equilibrium routingwhenN is even.
92
this ring network. We next show, however, that minimum-cost routing based only on prices
is unstable.
Given routingr, we can combine (5.24)–(5.25) and (5.27) to obtain the pricesp1(r) and
pN+1(r) on links 1 andN + 1
p1(r) = U ′(
1
r
)andpN+1(r) = U ′
(1
N − r
). (5.30)
The path cost for nodei in the counterclockwise direction is
D−(i; r) =i∑
j=1
bτj + ap1(r) = b
i∑j=1
τj + aU ′(
1
r
), (5.31)
and the path cost in the clockwise direction is
D+(i; r) =N+1∑
j=i+1
bτj + apN+1(r) = b
N+1∑j=i+1
τj + aU ′(
1
N − r
). (5.32)
In the next period, each nodei will choose the counterclockwise or clockwise direction
accordingly asD−(i; r) or D+(i; r) is smaller. Definef(r) as
f(r) := max {i | D−(i; r) ≤ D+(i; r)}. (5.33)
Then the resulting routing satisfies the recursive relation
r(t + 1) =
0 if D−(1; r(t)) > D+(1; r(t))
N if D−(N ; r(t)) < D+(N ; r(t))
f(r(t)) otherwise.
Theorem 5.4. If b = 0 anda > 0, then, starting from any routingr(0) except the equilib-
rium N/2 whenN is even, the subsequent routing oscillates between 0 andN .
93
Proof. For anyr(0) ∈ {0, . . . , N}, we have
D−(1; r(0))−D+(1; r(0)) = D−(N ; r(0))−D+(N ; r(0)),
= a
(U ′
(1
r(0)
)− U ′
(1
N − r(0)
)).
If N is even, thenN/2 is the unique equilibrium routing that solvesD−(i; N/2) = D+(i; N/2).
Supposer(0) 6= N/2. If r(0) > N/2, then1/r(0) < 2/N < 1/(N − r(0)). SinceU ′ is
strictly decreasing,U ′(1/r(0)) > U ′(1/(N − r(0)) and henceD−(1; r(0)) > D+(1; r(0))
andr(1) = 0. Similarly, if r(0) < N/2, thenD−(N ; r(0)) < D+(N ; r(0)) andr(1) = N .
Hencer oscillates between 0 andN henceforth.
Even though purely dynamic routing based on prices(b = 0) maximizes utility, The-
orem 5.4 says that it is unstable. We will henceforth, without loss of generality, setb = 1
and consider the effect ofa on utility maximization and stability.
5.5.3 Maximum utility of minimum-cost routing
As mentioned above, the duality gap for the ring network we consider is of a trivial kind
that disappears when integer constraint on routing is relaxed. For the rest of this section,
we consider a continuous model where every point on the ring is a source. A point on the
ring is labelled bys ∈ [0, 1], and the common destination is the point0 (or equivalently 1).
The utility maximization problem becomes
maxr∈[0,1]
maxx(·)
∫ 1
0
U(x(u))du (5.34)
subject to∫ r
0
x(u)du ≤ 1, and∫ 1
r
x(u)du ≤ 1. (5.35)
As in the discrete case, both constraints are active at optimality, and hence the problem
reduces to
maxr∈(0,1)
rU
(1
r
)+ (1− r)U
(1
1− r
),
94
which, by concavity, yields the optimal routingr∗ and maximum utilityV ∗
r∗ =1
2, and V ∗ = U(2). (5.36)
To see that there is no duality gap, note that the problem (5.34)–(5.35) is equivalent to
maxr∈[0,1]
maxx−,x+≥0
rU(x−) + (1− r)U(x+),
subject to rx− ≤ 1, (1− r)x+ ≤ 1.
Define the Lagrangian as
L(r, x−, x+, p−, p+) = rU(x−) + (1− r)U(x+) + p−(1− rx−) + p+(1− (1− r)x+).
It is easy to verify that
r∗ =1
2, x−∗ = x+∗ = 2, p−∗ = p+∗ = U ′(2) (5.37)
are primal-dual optimal and there is no duality gap; see Appendix 5.8.2.
We now look at the maximum utility achievable by the equilibrium of minimum-cost
routing. Let the delay froms to the destination in the counterclockwise direction be
T (s) :=
∫ s
0
τ(u)du,
and the delay in the clockwise direction be
T (1)− T (s) =
∫ 1
s
τ(u)du,
whereτ(u), u ∈ [0, 1], is given. Here,τ(u) corresponds to link cost in the discrete model.
Given routingr ∈ [0, 1], the price in the counterclockwise direction isU ′(1/r), and the
price in the clockwise direction isU ′(1/(1− r)). Then the cost of sources in the counter-
95
clockwise direction is
D−(s; r) = T (s) + aU ′(
1
r
), (5.38)
and the cost in the clockwise direction is
D+(s; r) = T (1)− T (s) + aU ′(
1
1− r
). (5.39)
A routing r is in equilibrium if the costs of sourcer in both directions are the same.
Definition 5.1. A routingr is called anequilibrium routingif D−(r; r) = D+(r; r). It is
denoted byra or r(a).
By definition,ra is the solution of
g(r) := 2T (r)− T (1) + a
(U ′
(1
r
)− U ′
(1
1− r
))= 0. (5.40)
Sinceg(0) < 0, g(1) > 0 andg′(r) > 0, the equilibriumra is in (0, 1) and is unique. Given
a routingr, its utility is
V (r) := rU
(1
r
)+ (1− r)U
(1
1− r
).
The maximum utility achieved by minimum-cost routing, with parametera, is thenV (ra) ≤V (r∗) = V ∗.
The next result implies thatra varies betweenr0 andr∗ and converges monotonically to
r∗ asa →∞. As a result, the lossV ∗ − V (ra) ≥ 0 in utility also approaches0 asa →∞.
Denote the interval in which1/ra and1/(1− ra) vary asI := [2, 1/ min{r0, 1− r0}].
Theorem 5.5.SupposeU ′′ exists and is bounded onI. For all a ≥ 0, |ra − r∗| is a strictly
decreasing function ofa. Moreover, asa →∞, |ra − r∗| andV ∗ − V (ra) approach 0.
Proof. The equation (5.40) defines the equilibrium routingr(a) := ra as an implicit func-
96
tion of a. By the implicit function theorem,r′(a) satisfies
1
r′(a)
[U ′
(1
1− ra
)− U ′
(1
ra
)]= 2τ(ra)− a
r2a
U ′′(
1
ra
)− a
(1− ra)2U ′′
(1
1− ra
).
The right-hand side is positive sinceU is strictly concave. Hencer′(a) has the same sign
as the term in the square bracket, i.e., sinceU ′ is decreasing,
r′(a) =
> 0 if ra < r∗
< 0 if ra > r∗
= 0 if ra = r∗
. (5.41)
This implies that|ra − r∗| is a strictly decreasing function ofa; see Figure 5.3.
Hence|ra − r∗| converges to a limit asa → ∞. SinceU ′′ is bounded on the closed
intervalI, so isU ′. Hence, from (5.40), we must have
U ′(1/ra)− U ′(1/(1− ra)) → 0, or U ′(1/ lima→∞
ra) = U ′(1/(1− lima→∞
ra)).
SinceU ′ is strictly decreasing, this implies thatlima→∞ ra = 1− lima→∞ ra = r∗.
To show thatV ∗ − V (ra) ≥ 0 also converges to 0, note thatV ′(r∗) = 0 and hence we
have, by Taylor expansion,
V (ra)− V ∗ =1
2V ′′(u)(ra − r∗)2
for someu betweenra andr∗. Here
V ′′(u) =1
u3U ′′
(1
u
)+
1
(1− u)3U ′′
(1
1− u
)≥ − 2µ
(min{r0, 1− r0})3
whereµ is the upper bound ofU ′′ on I. Hence
0 ≤ V ∗ − V (ra) ≤ µ(ra − r∗)2
(min{r0, 1− r0})3.
Since|ra − r∗| → 0, the proof is complete.
97
The shape ofr′(a) in (5.41) implies that, ifr(0) > r∗ thenr(a) ≥ r∗ for all a but r(a)
decreases tor∗ asa →∞, and ifr(0) < r∗ thenr(a) ≤ r∗ for all a butr(a) increases tor∗
monotonically, as illustrated in Figure 5.3. This is a consequence of the continuity ofr(a).
r(0)
r*
r(0)
Figure 5.3: The routingr(a).
5.5.4 Stability of minimum-cost routing
We now turn to the stability ofra. For simplicity, we will takeU(x) = log x, the utility
function of TCP Vegas [101] and FAST [69]. With this logarithm utility,V ′(ra) = log(1−r)/r and hence Theorem 5.5 can be strengthened to show thatV ∗ − V (ra) is a strictly
decreasing function ofa, and hence converges monotonically to 0 asa →∞.
Givenr, let f(r) denote the solution of
D−(s; r) = D+(s; r).
It is in the range[0, 1] if and only if 0 ≤ T (s) ≤ T (1), or if and only if
r∗ − T (1)
2a≤ r ≤ r∗ +
T (1)
2a.
98
We will assume thatminu∈[0,1] τ(u) > 0. ThenT−1 exists and
f(r) = T−1
(1
2(T (1) + a)− ar
). (5.42)
The routing iteration is
r(t + 1) = [f(r(t))]10 , (5.43)
where[r]10 = max{0, min{1, r}}.
Definition 5.2. The equilibrium routingra is (globally) stable, if starting from any routing
r(0), r(t) defined by (5.42)–(5.43) converges tora ast →∞.
Example 2: Uniform τ
Suppose delay is uniform on the ring,τ(u) = τ for all u ∈ [0, 1], so thatT (r) = rτ . From
(5.40), the equilibrium routing is
ra =1
2= r∗, ∀a ≥ 0
coinciding with the utility-maximizing routingr∗.
Supposea < τ . Then the routing iteration becomes
r(t + 1) =1
2τ(τ + a)− a
τr(t) = f(r(t)).
Since|f(s)− f(r)| = (a/τ)|s− r| < |s− r|, f(r) is a contraction mapping and hencera
is globally stable for all0 ≤ a < τ .
Hence for the uniform delay case, adding a static component to link cost stabilizes
routing provided the weight on prices is smaller than link delay. Moreover, the static com-
ponent does not lead to any loss in utility (ra = r∗). The stability condition generalizes to
the general delay case. The following theorem says that ifa is smaller than the minimum
“link delay,” thenra is globally stable; ifa is bigger than the maximum “link delay,” then it
99
is globally unstable (diverge from any initial routing exceptra); otherwise, it may converge
or diverge depending on initial routing.
Theorem 5.6.The stability is affected by parametera such that
1. If a < minu∈[0,1] τ(u) thenra is globally stable.
2. Supposea ≥ T (1). Then there existsr < ra < r such that
(a) If r(0) = r or r(0) = r then subsequent routings oscillate betweenr andr.
(b) If r(0) < r or r(0) > r then subsequent routings after a finite number of
iterations oscillate between 0 and 1.
(c) If r < r(0) < r thenr(t) converges tora provideda < minu∈(r,r) τ(u).
3. If a > maxu∈[0,1] τ(u) then starting from any initial routingr(0) 6= ra, subsequent
routings after a finite number of iterations oscillate between 0 and 1.
Proof. 1. We show that the routing iteration (5.43) is a contraction mapping ifa <
minu∈[0,1] τ(u). Now
∣∣[f(s)]10 − [f(r)]10∣∣ ≤ |f(s)− f(r)| ,
=
∣∣∣∣T−1
(T (1) + a− 2as
2
)− T−1
(T (1) + a− 2ar
2
)∣∣∣∣ ,
=
∣∣∣∣1
T ′(u)(as− ar)
∣∣∣∣ ,
≤ a
minu∈[0,1] τ(u)|s− r|,
for someu betweenr and s, by the mean value theorem. Henceh(r) is a contraction
mapping and starting from anyr(0) ∈ [0, 1], r(t) converges exponentially tora.
2. Defineh(r) = (T (1) + a)/2− ar. The routing iteration can be written as
T (r(t + 1)) = [h(r(t))]10. (5.44)
100
Define the following sequences
a0 = 0, b0 = T (0),
an+1 = h−1(bn), bn+1 = T (an+1).
Note that(an, n ≥ 0) is a routing sequence going backward in time. The following lemma
is proved in the appendix, following [103].
Lemma 5.1. LetTa = T (ra) = h(ra). Then
0 = a0 < a2 < · · · < ra < · · · < a3 < a1 < 1,
T (0) = b0 < b2 < · · · < Ta < · · · < b3 < b1 < T (1).
Since the sequences are monotone, the lemma implies that there arer andr with 0 <
r < ra < r < 1 such that
limn→∞
a2n = r, and limn→∞
a2n+1 = r.
By continuity ofT andh, we have
T (r) = h(r), and T (r) = h(r).
This implies that starting fromr(0) = r or r(0) = r, the subsequent routings oscillate
betweenr andr.
To show the second claim, supposer(0) < r. Specifically, supposea2n−2 < r(0) < a2n
for somen. If h(r(0)) > T (1) (possible sincea ≥ T (1)), thenr(1) = 1 and subsequent
routings oscillate between 0 and 1. Otherwise, from (5.44),r(0) = h−1(T (r(1))), and
hencea2n−2 < h−1(T (r(1))) < a2n. Sinceh is strictly decreasing, we haveb2n−1 <
T (r(1)) < b2n−3 by definition ofbn. Hence, sinceT is strictly increasing,a2n−1 < r(1) <
a2n−3. The same argument then shows thata2n−4 < r(2) < a2n−2. Hence we have shown
thatr(0) < a2n impliesr(2) < a2n−2. This proves the second claim. The proof of the third
101
claim follows the same argument of part 1.
3. By the mean value theorem, we have, for allα, α′ in [0, 1],
|h−1(T (α))− h−1(T (α′))| =T ′(u)
a|α− α′|,
for someu betweenα andα′. Hence the iteration map
an+1 = h−1(T (an))
is a contraction provideda > maxu∈[0,1] τ(u). This implies that the sequence(an, n ≥ 0)
converges and, sincera is the unique fixed point ofh−1(T (·)), r = r = ra. The assertion
then follows from part 2(b).
5.5.5 General network
It is difficult to derive an analytical bound ona to guarantee routing stability or to compute
optimal routing for general networks. In this section, we present numerical results to illus-
trate that the intuition from the simple ring network analyzed in the previous subsections
extends to general topology.
We generate a random network based on Waxman’s algorithm [159]. The nodes are
uniformly distributed in a two-dimensional plane. The probability that a pair of nodesu, v
are connected is given by
Prob(u, v) = α exp
(d(u, v)
βL
),
where the maximum probabilityα > 0 controls connectivity,β ≤ 1 controls the length of
the edges with a largerβ favoring longer edges,d(u, v) is the Euclidean distance between
nodesu andv, andL is the maximum distance between any two nodes. In our example, we
set the number of nodesN = 30, α = 0.8, β = 0.3, which generated 90 bidirectional links;
see Figure 5.4. The fixed delayτl of each linkl is randomly chosen according to a uniform
distribution over [100, 400]ms. The link capacities are randomly chosen from the interval
102
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
x
y
random network with N=30,α=0.8,β=0.3
Figure 5.4: A random network.
[1000, 4000] packets/sec, also with uniform distribution. There are 60 flows on the network
with randomly chosen source and destination nodes. Routing on this network is computed
using the Bellman-Ford minimum-cost algorithm, with link costdl(t) = τl +apl(t) in each
update periodt, on a slower timescale than congestion control. In each routing periodt,
we first solve the link prices based on the current routing, using the gradient projection
algorithm of [97]. We iterate the source algorithm to update rates and the link algorithm to
update prices, until they converge. The link prices are then used to compute the minimum-
cost for the next period.
We measure the performance of the scheme at differenta by the sum of all of the
source’s utilities. If the routing is stable (at smalla), the aggregate utility is computed using
the equilibrium routing. Otherwise, the routing oscillates and the time-averaged aggregate
utility is used. The result is shown in Figure 5.5.
As expected, whena is small, routing is stable and the aggregate utility increases with
a, as in the ring network analyzed in Section 5.5.3 (Theorem 5.5). Whena < 4, the static
delayτl dominates the link cost, and the routes computed withdl(t) remain the same as
with static routing (a = 0), and hence the aggregate utility is independent ofa. Routing
103
100
101
102
392
394
396
398
400
402
404
406
408
410
412
α value
ag
gre
ga
te u
tilit
y
Figure 5.5: Aggregate utility as a function ofa for random network
becomes unstable at arounda = 10. Even though the time-averaged utility continues to
rise after routing instability sets in, eventually it peaks and drops to a level less than the
utility achievable by the necessarily stable static routing.
5.6 Resource provisioning
Results in the previous sections show that even though an equilibrium of TCP/IP, when
it exists, maximizes utility under pure dynamic routing, it can be unstable and hence not
attainable by the TCP/IP system. In this section, we show that if the link capacities are op-
timally provisioned, however, purestaticrouting is enough to maximize utility. Moreover,
it is optimal even within the class of multi-path routing: again, there is no penalty in not
splitting traffic across multiple paths.
Suppose it costsαl > 0 amount to provision a unit of capacity at linkl, and letα = (αl,
for all l) be the vector of unit costs. For instance, a longer link may have a largerαl. The
total capacity cost over the entire network isαT c. Suppose the budget for provisioning the
capacity isB. Consider the problem of optimally selecting capacities, routing, and source
104
rates to maximize utility
maxc≥0
maxR∈Rm
maxx≥0
∑i
Ui(xi), (5.45)
subject to Rx ≤ c, (5.46)
αT c ≤ B. (5.47)
whereUi are concave increasing utility functions. Note thatR ranges inRm and hence
multi-path routing is allowed. This problem may arise when optical lightpaths can be
dynamically reconfigured at a connection timescale.
Theorem 5.7.SupposeU ′i(xi) > 0 for all i andxi ≥ 0. At optimality,
1. There is an optimal solution(c∗, R∗, x∗) whereR∗ ∈ Rs is a single-path routing.
2. Moreover,R∗ is pure static routing usingαl as link costs.
3. R∗x∗ = c∗, i.e., there is no slack capacity.
4. αT c = B, i.e., there is no slack in budget.
5. Link prices generated by TCP–AQM are proportional to the provisioning costs,p∗ =
γ∗α for someγ∗ > 0.
Proof. It is easy to show the existence of an equilibrium. Define the Lagrangian of (5.45)-
(5.47) as
L(c, R, x, p, γ) =∑
i
Ui(xi)− pT (Rx− c)− γ(αT c−B).
At optimality, the KKT condition holds: there existp∗ ≥ 0 andγ∗ ≥ 0 such that
U′(x∗i ) = (R∗)T p∗, (5.48)
p∗ = γ∗α, (5.49)
(p∗)T (R∗x∗ − c∗) = 0, (5.50)
γ∗(αT c∗ −B) = 0. (5.51)
105
From (5.49), we obtain the last claim in the theorem. Moreover, (5.49) andU ′i(x
∗i ) > 0
imply that γ∗ > 0 andp∗l > 0 for all l, sinceα > 0. Hence, from (5.50) and (5.51),
equality holds in (5.46) and (5.47), proving the third and fourth claims.
To prove the first two claims, express the routing matrixR asR = HW whereW ∈Wm. Using the equalities in (5.46) and (5.47) to eliminatec, we can transform the utility
maximization problem (5.45)–(5.47) into:
maxW∈Wm
maxx≥0
∑i
Ui(xi),
subject to∑
i
(αT H iwi
)xi = B,
whereW =diag(wi). SinceUi is nondecreasing and both the objective and the constraints
above are separable ini, in order to maximize utility,wi should be chosen to be a solution
of
minwi
αT H iwi,
subject to 1T wi = 1, 0 ≤ wij ≤ 1.
Since this is a linear program, there exists an optimal point on the boundary. Hence there
is an optimalW ∗ ∈ Ws, i.e., minimum-cost single-path routing usingαl as link costs is
optimal.
5.7 Conclusion
Given a routing, TCP-AQM can be interpreted as a distributed primal-dual algorithm over
the Internet to maximize aggregate utility over source rates. In this chapter, we study
whether TCP-AQM together with IP (modelled by minimum-cost routing) can maximize
utility over both source rates and routing, on a slower time scale. We show that we can
indeed interpret TCP/IP asattemptingto maximize utility in the sense that its equilibrium,
if it exists, solves the utility maximization problem and its dual, provided congestion prices
generated by TCP-AQM are used as link costs. TCP/IP equilibrium exists if and only if
106
there is no penalty in not splitting traffic across multiple paths. Even if equilibrium exists,
however, TCP/IP with pure dynamic routing can be unstable. Specializing to a special ring
network, we show that routing is indeed unstable when link costs are congestion prices. It
can be stabilized by adding a static component to the definition of link cost, but the static
component reduces the achievable utility. There thus seems to be an inevitable tradeoff
between routing stability and utility maximization, for a given set of link capacities. We
show, however, that if link capacities are optimally provisioned, then pure static (and hence
stable) routing is sufficient to maximize utility even for general networks, and link costs
are proportional to the provisioning costs. Moreover single-path routing can achieve the
same utility as multi-path routing. Hence, one can regard the layering of TCP and IP as
a decomposition of the utility maximization problem over source rates and routes into a
distributed and decentralized algorithm, carried out on different time scales, at least when
network capacities are well provisioned.
5.8 Appendix
5.8.1 Proof of duality gap
We prove that there is generally a duality gap between the primal problem (5.22)–(5.23)
and its dual whenN is odd.
It is easy to see that the primal optimal routing is
r∗ =N − 1
2, or
N + 1
2.
Suppose without loss of generality thatr∗ = (N − 1)/2 (the other case is similar). Then,
the source rates are
x1 = · · · = xr∗ =2
N − 1, and xr∗+1 = · · · = xN =
2
N + 1
107
yielding a primal objective value of
N − 1
2U
(2
N − 1
)+
N + 1
2U
(2
N + 1
)
= N
{ (1
2− 1
N
)U
(2
N − 1
)+
(1
2+
1
N
)U
(2
N + 1
)}
< NU
(2
N
),
where the last inequality follows from the strict concavity ofU . We now show that the
right-hand side is the optimal dual objective value, and hence there is a duality gap.
The dual problem of (5.22)–(5.23) is (e.g., [97])
minp1,pN+1≥0
(N∑
i=1
maxxi
φ(xi, p1, pN+1) + (p1 + pN+1)
),
whereφ(xi, p1, pN+1) = U(xi)−xi min{p1, pN+1}. First, note that the minimizing(p1, pN+1)
must satisfyp1 = pN+1, for otherwise, if (say)p1 < pN+1, then the dual objective value is
N∑i=1
maxxi
(U(xi)− xip1) + (p1 + pN+1)
and can be reduced by decreasingpN+1 to p1. Hence the dual problem is equivalent to
minp≥0
N∑i=1
maxxi
(U(xi)− xi p) + 2p. (5.52)
Let p∗ denote the minimizer andx∗i = xi(p∗) = x(p∗) =: x∗ denote the corresponding
maximizers (they are equal for alli by symmetry). Then we have
U ′(x∗) = p∗. (5.53)
Differentiating the objective function in (5.52) with respect top and setting it to zero, we
have
0 = N(U ′(x∗)x′(p∗)− p∗x′(p∗)− x∗) + 2. (5.54)
108
Using (5.53), we havex∗ = 2/N, and hence the minimum dual objective value is
N(maxx∗
U(x∗)− x∗ p∗) + 2p∗ = NU
(2
N
)
as desired.
5.8.2 Proof of primal-dual optimality
We prove that the solution given by (5.37) is primal-dual optimal using the saddle-point
theorem (e.g., [14, pp. 427]). Clearly,(r∗, x−∗, x+∗) is primal feasible and(p−∗, p+∗)
is dual feasible. We now show that(r∗, x−∗, x+∗, p−∗, p+∗) is a saddle point, i.e., for all
(r, x−, x+, p−, p+). Now
L(r, x−, x+, p−∗, p+∗) ≤ L(r∗, x−∗, x+∗, p−∗, p+∗) ≤ L(r∗, x−∗, x+∗, p−, p+).
For the right inequality, substituting(r∗, x−∗, x+∗) from (5.37) intoL(r∗, x−∗, x+∗, p−, p+)
to get, for all(p−, p+),
L(r∗, x−∗, x+∗, p−, p+) = U(2).
But U(2) = L(r∗, x−∗, x+∗, p−∗, p+∗), establishing the right inequality. For the left in-
equality, denotingp∗ := p−∗ = p+∗, from (5.37) we have
L(r, x−, x+, p−∗, p+∗) = rU(x−) + (1− r)U(x+)− (rx− + (1− r)x+)p∗ + 2p∗
≤ U(y)− yp∗ + 2p∗ (concavity ofU), (5.55)
with y := rx− + (1− r)x+, where equality holds if and only ifx− = x+ sinceU is strictly
concave. Notice that the right-hand side is maximized overy if and only if y satisfies
U ′(y) = p∗. This implies thaty = x−∗ = x+∗ = 2 sinceU ′ is strictly monotonic. Substitute
y = 2 into (5.55) yields, for all(r, x−, x+),
L(r, x−, x+, p−∗, p+∗) ≤ U(2)
109
as desired, sinceU(2) = L(r∗, x−∗, x+∗, p−∗, p+∗).
5.8.3 Proof of Lemma 5.1
We will prove the lemma by induction. Note thatb0 < Ta implies thata1 = h−1(b0) >
h−1(Ta) = ra. Sincea ≥ T (1) andh(1) < 0, a1 = h−1(b0) < 1 (see Figure 5.6). Hence
h(r)
a2
T(r)
b1
T(0)
b0
r
a1a0
Figure 5.6: Proof of Lemma 5.1.
0 = a0 < ra < a1 < 1.
This implies thatb1 = T (a1) satisfies
T (0) = b0 < Ta < b1 < T (1).
Sinceb1 < T (1) < h(0), a2 = h−1(b1) > h−1(h(0)) = 0, we have
0 = a0 < a2 < ra < a1 < 1.
110
Let the induction hypothesis be
a0 < . . . < a2n < ra < a2n−1 < . . . < a1
b0 < . . . < b2n−2 < Ta < b2n−1 < . . . < b1.
Thenb2n = T (a2n) > T (a2n−2) = b2n−2 andb2n = T (a2n) < T (ra) = Ta. Hence,
b2n−2 < b2n < Ta.
This implies thatra < a2n+1 < a2n−1, which in turn implies thatTa < b2n+1 < b2n−1. This
completes the induction.
111
Chapter 6
Throughput, Fairness, and Capacity
6.1 Introduction
Recent studies, e.g. [80, 97, 116, 164, 101, 88, 96], have shown that a bandwidth allocation
policy can be formulated as a utility maximization problem where the bandwidth allocation
x∗ (source rates) solves [80]
maxx
∑i
Ui(xi) subject toRx ≤ c. (6.1)
It is remarkable that as long as traffic sources adapt their rates to the aggregate (sum of)
congestion in their paths, they are implicitly maximizing some utility. The optimization
problem (6.1) is a convenient characterization of the equilibrium properties of various
TCP/AQM systems. We can derive the underlying utility functions of various TCP al-
gorithms and use them to study the relations among network throughput, fairness, and ca-
pacity. Our work reveals some counter-intuitive behaviors, which will be briefly presented
in this chapter. See [144, 146] for more detailed results and proofs.
We refer to network throughput as the total traffic through the network, which measures
the efficiency of the bandwidth allocation policy under which the network operates. There
are many examples in the literature that point to an inevitable tradeoff between fairness and
aggregate throughput (efficiency), yet there is no general theorem clarifying this folklore.
How do we balance fairness and efficiency in designing bandwidth allocation policies?
Will adding additional link capacities necessarily result in higher aggregate throughput?
112
In this chapter, we rigorously study these questions in general networks using an ana-
lytical model . Here are our main results.
Suppose that the bandwidth allocation policies, represented by utility functions, are
parameterized by a common scalarα ≥ 0. We derive explicit expressions for the changes
in source rates and congestion prices when the parameterα or the capacities change for
general utility functions.
We specialize to a particular class of utility functions [116] that characterize various
TCP variants and include various fairness criterions as special cases. The parameterα in
these utility functions can be interpreted as a quantitative measure of fairness [107, 16], and
an allocation isfair if α is large. All examples in the literature indicate that a fair allocation
is necessarily inefficient. We quantitatively formulate the relations between fairness and
efficiency in general networks. This characterization allows us both to produce the first
counter-example (Theorem 6.3) and trivially explain all the previous supporting examples
(Corollary 6.2). Surprisingly, the class of networks in our counter-example indicates that
a fairer allocation could bealwaysmore efficient. In particular it implies that max-min
fairness can achieve a higher aggregate throughput than proportional fairness.
Intuitively, we might expect that the aggregate throughput will always rise when some
links increase their capacities. This turns out to be wrong, and we characterize exactly the
condition under which this is true (Theorem 6.4). Not only can the aggregate throughput be
reduced when some link increases its capacity, more strikingly, it can also be reduced even
whenall links increase their capacities by the same amount (Theorem 6.5). Moreover, this
holds under all bandwidth allocation policies . This paradoxical result seems less surprising
in retrospect: raising link capacities always increases the aggregate utility, but mathemat-
ically there is no a priori reason that it should also increase the aggregate throughput. If
all links increase their capacities proportionally, however, the aggregate throughput will
indeed increase, under the class of utility functions proposed in [116] (Theorem 6.6).
It is well known that counter-intuitive behavior can arise in a distributed system where
agents optimize their own objectives, e.g., the Braess paradox in transportation networks.
It was discovered theoretically in 1968 [18, 120, 44] and verified in real world years later
[37]. It shows that adding a new road to a transportation network may causelonger travel
113
time foreverycar. Subsequent paradoxes have been discovered in mechanical and electrical
networks [27], in queueing networks [28, 136, 83, 11, 84], and in computer systems [72,
73]. Even though our results have the same flavor, they differ in important ways from the
Braess paradox.
First, in the Braess paradox, the performance degradation is due to misalignment of
individual and social optimalities. In our case, it is due to misalignment of two social
objectives (utility maximization versus throughput maximization). Second, in the Braess
paradox, the addition of new road leads to degraded performance forall flows, and hence
the new equilibrium point is not Pareto optimal. In our case, all equilibrium points are
Pareto optimal, and hence some flows are worse off and some better off in the new equi-
librium point. Finally, examples of the Braess paradox always involve the addition of new
paths and flows that re-route to maximize their own objectives. In our case, only link
capacities are changed, while network topology and routing are fixed.
6.2 Model
A network consists ofL links with finite capacitycl. It is shared byN sources indexed byi.
R is the routing matrix whereRli = 1 if sourcei uses linkl and0 otherwise. Letxi be the
transmission rate of sourcei, andUi(xi; α) be its utility. All the utility functionsUi(xi; α)
are parameterized by a scalarα ≥ 0. SupposeUi(xi; α) are concave inxi for α ≥ 0 and
strictly concave whenα > 0. Whenα andc are clear from the context, we may useUi(xi)
in place ofUi(xi; α).
Consider the utility maximization problem
maxx≥0
∑i
Ui(xi; α) subject to Rx ≤ c, (6.2)
and its Lagrangian dual
minp≥0
∑i
maxxi≥0
(Ui(xi; α)− xi
∑
l
Rlipl
)+
∑
l
clpl. (6.3)
114
A maximizerx = x(α, c) for (6.2) and a minimizerp = p(α, c) for (6.3) exist forα ≥0, c > 0, since the utility functionsUi(xi) are concave.
Unless otherwise specified, we will assume thatα > 0, andR is full row rank, so
that the solutionsx = x(α, c) and p = p(α, c) are unique. The aggregate throughput
T = T (α, c) is defined in terms of the unique solution,
T (α, c) :=∑
i
xi(α, c). (6.4)
From Lemma 6.1 below,x(α, c) andp(α, c) are continuous functions ofα andc. More-
over, they are differentiable except at a finite number of points when the active constraint
set at optimal changes asα or c is perturbed. We can study∂T/∂α and∂T/∂c in be-
tween these points. Hence, all our statement should be interpreted piecewise in between
non-differentiable point. For the rest of the paper, we will thus focus on the utility max-
imization with equality constraints that represent only those constraints that are active at
optimality
maxxi≥0
∑i
Ui(xi; α) s.t. Rx = c. (6.5)
In this case the dual problem (6.3) should be interpreted as the Lagrangian dual of (6.2)
with a possibly reducedR, as opposed to the dual of (6.5).
SupposeR has full row rank. SupposeN ≥ L and letM = N − L be the difference
between the number of sources and the number of links. ThenM is the dimension of the
null space ofR. Let (zm,m = 1, . . . , M), zm ∈ <N be any basis of the null space ofR,
and letZ = [z1 z2 . . . zM ] be the matrix withzm as its columns. LetV = V (x; α) :=∑
i Ui(xi; α) be the aggregate utility function. LetD = D(α, c) denote the curvature of
the aggregate utility function
D := −∂2V
∂x2, (6.6)
115
andb = b(α, c) be
b :=∂2V
∂x∂α(6.7)
at the optimal allocationx = x(α, c).
6.3 Basic results
In the rest of this chapter, we assume that the active constraint set is unchanged whenα or
c is perturbed locally (i.e., we consider problem (6.5) instead of problem (6.2)). WhenR
is full row rank, the following lemma cited from [150] guarantees the differentiability of
x(α, c) andp(α, , c).
Lemma 6.1. For any α > 0, c > 0, the unique solutionx(α, c) and p(α, c) of (6.5) is
continuous and differentiable at(α, c).
The basic results on how throughput and prices vary as the utility parameterα and ca-
pacityc change are given in the next theorem. In the next two sections, we will specialize to
a particular class of utility functions to study the throughput-fairness tradeoff and whether
increasing capacity always raises throughput.
Theorem 6.1. The optimal ratesx = x(α, c) of (6.5) and optimal pricesp = p(α, c) of
(6.3) satisfy the following equations
∂x
∂α= (D−1 −D−1RT (RD−1RT )−1RD−1)b,
∂x
∂c= D−1RT (RD−1RT )−1,
∂p
∂α= (RD−1RT )−1RD−1b,
∂p
∂c= −(RD−1RT )−1,
where matrixD and vectorb are defined in (6.6) and (6.7), respectively.
116
Proof: At the optimal point, the Karush-Kuhn-Tucker condition holds. We have
Rx = c and RT p− ∂V
∂x= 0. (6.8)
Define
y :=
x
p
, w :=
c
α
, and G(w, y) =
Rx− c
RT p− ∂V∂x
.
Then (6.8) can be rewritten asG(w, y) = 0. The derivatives of functionG are
∂G
∂y=
R 0
−∂2V∂x2 RT
=
R 0
D RT
,
∂G
∂w=
−I 0
0 − ∂2V∂x∂α
= −
I 0
0 b
.
SinceR is full row rank andD is positive definite,RD−1RT is positive definite. Then
∂G/∂y is always invertible, and it can be checked that
R 0
D RT
−1
=
D−1RT (RD−1RT )−1 D−1 −D−1RT (RD−1RT )−1RD−1
−(RD−1RT )−1 (RD−1RT )−1RD−1
.
All the above matrices are well defined becauseRD−1RT is invertible. From the implicit
function theorem, the vectory can be uniquely solved in terms ofw locally. Moreover
dy
dw= −
(∂G
∂y
)−1∂G
∂w=
R 0
D RT
−1
I 0
0 b
.
From the definitions ofy andw, we have
∂x
∂α= (D−1 −D−1RT (RD−1RT )−1RD−1)b,
∂x
∂c= D−1RT (RD−1RT )−1,
∂p
∂α= (RD−1RT )−1RD−1b,
∂p
∂c= −(RD−1RT )−1.
117
Since the optimalx always satisfies the constraintsRx = c, for a fixedc, the change in
x should be in the null space ofR asα varies. This is captured by the following corollary.
Corollary 6.1. The derivative∂x/∂α can also be expressed as
∂x
∂α= Z(ZT DZ)−1ZT b,
where the columns of matrixZ form a basis of the null space ofR.
Proof: Denote
∆ := D−1 −D−1RT (RD−1RT )−1RD−1 − Z(ZT DZ)−1ZT .
From Theorem 6.1 and the definition of∆, we only need to show∆ = 0. By the definition
of matrixZ we have
RZ = 0, and ZT RT = 0.
It is clear that
R
ZT D
∆ =
RD−1 −RD−1 − 0
ZT − 0− ZT
= 0.
The next step is to show that the matrix
R
ZT D
is full rank so that∆ must be the zero
matrix. Suppose it is not, then there exists a nonzero vectorv such that
R
ZT D
v = 0. (6.9)
Hence,Rv = 0, i.e.,v is in the null space ofR. Since the columns ofZ form a basis of the
118
null space ofR, there existsw such thatv = Zw. Substituting into (6.9), we have
ZT Dv = ZT DZw = 0.
SinceZT DZ is positive definite and invertible, we must havew = 0 andv = Zw = 0. This
contradicts the assumption thatv 6= 0. Therefore
R
ZT D
is full rank and∆ = 0.
We will apply these results to a particular class of utility functions to study the effect of
changes in fairness (α) and capacity (c) on throughput in general networks.
6.4 Is fair allocation always inefficient?
Now we apply the expression for∂x/∂α in Corollary 6.1 to study the effect of changes in
fairness (α) on throughputT (α) = T (α, c) for a fixedc > 0. It clarifies a folklore about
the tradeoff between efficiency and fairness of a bandwidth allocation policy.
6.4.1 Conjecture
Recent studies show that bandwidth allocations can be formulated as a utility maximization
problem [80, 97, 96], and allocation properties such as throughput and fairness can be
studied by analyzing the underlying convex optimization problem.
Kelly et al. [80] introduceproportional fairness, characterized by utility function
Ui(xi) = log xi, which is achieved by TCP Vegas and FAST. Massoulie et al. [107] propose
another allocation policyminimum potential delaywith Ui(xi) = −1/xi, which has been
shown to approximate the fairness of TCP Reno by Kunniyur and Srikant [88]. Mo and
Walrand [116] present the following class of utility functions
U(xi, α) =
(1− α)−1 x1−αi if α 6= 1
log xi if α = 1. (6.10)
It includes all the previously considered allocation policies as special cases–maximum
throughput (α = 0), proportional fairness (α = 1), minimum potential delay (α = 2),
119
and max-min fairness (α = ∞). It provides a convenient way to compare the fairness of
different allocation policies . Moreover, it can also generate utility functions of major TCP
congestion control algorithms, e.g., Reno (α = 2), HSTCP [39] (α = 1.2), and Vegas,
FAST [69], Scalable TCP [81] (α = 1).
We are not concerned with fairnessacross different flowsunder the same allocation
policy represented by a givenα value, as, e.g., Jain’s fairness index is [67]. Rather, we
want to compare fairnessacross allocation policies. While there are no generally accepted
criteria to compare the fairness of allocation policies, many examples in the networking
literature (e.g., [107, 116, 16, 105, 139]) informally compare specific allocation policies
in terms of theirα. For instance,α = 0 maximizes throughput but can be extremely
unfair. Proportional fairness (α = 1) is considered fairer, and max-min fairness (α =
∞) the fairest because it generalizes equal sharing at a single resource to a network of
resources in a way that maintains Pareto optimality [45, 15]. Comparison of fairness of
these polices [107] in a simple network shows that the minimum-potential-delay policy
(α = 2) “penalizes more (less) severely long routes than max-min (proportional) fairness.”
We extrapolate this intuition based on special cases to a continuum of allocation policies
indexed byα, and interpretα as a quantitative measure of fairness.
Is a fairer policy (largerα) always less efficient (smaller aggregate throughputT (α))?
This conjecture is prompted by the various examples in resource allocation in the literature
of wired networks [107, 116, 16], wireless networks, [105, 139], economics, [22], etc.
These examples seem to illustrate (quoted from [105])
“the fundamental conflict between achieving flow fairness and maximizing
overall system throughput. ... The basic issue is thus the tradeoff between
these two conflicting criteria.”
This conjecture can be analytically expressed as
Conjecture 6.1. T (α) is non-increasing
∂T
∂α≤ 0 for α > 0.
120
6.4.2 Special cases
We review several examples in the literature that have motivated Conjecture 6.1. The con-
jecture is checked in special networks for max-min fairness, minimum potential delay,
proportional fairness, and the maximum-throughput policy by analytically solving (6.2)
or numerically computingT (α). However, these techniques are not applicable to general
networks. As is shown in the next subsection, the underlying network topology in these
examples possesses a special structure that leads to trivial sufficient conditions for the con-
jecture to be true.
Example 1: Linear network with uniform capacity [107, 16]
Consider the classical linear network withL unitary capacity links andN = L + 1 com-
peting sources, shown in Figure 6.1. The ratesxi(α) are computed by solving (6.2) [16] ,
1x 2x Lx
0x
Figure 6.1: Linear network.
which gives
x0(α) =1
L1/α + 1, and xi(α) =
L1/α
L1/α + 1for i ≥ 1.
Using this, we can easily check that, forα > 0,
∂T
∂α=
−L1/α(L− 1) log L
α2 (1 + L1/α)2
= 0, L = 1
< 0, L ≥ 2.
Hence, except for the single-link case,T (α) is strictly decreasing inα for the linear network
with uniform link capacity. After examining this special case with several specialα values,
121
Massoulie et al. [107] made a cautious comment: “It is not known whether the same
ordering holds for arbitrary network topologies.”
Example 2: Linear network with nonuniform capacity [116]
The same network topology in Example 1 is considered in [116] withL = 2 with link
capacitiesc1 < c2. The source rates under max-min fairness are
x0(∞) = x1(∞) =c1
2, x2(∞) = c2 − c1
2, T (∞) = c2 +
c1
2.
It is not hard to solve (6.2) to obtain the source rates and derive the aggregate throughput
for proportional fairness
T (1) =2
3c1 +
1
3
√c21 + c2
2 − c1c2 +2
3c2 ≥ c2 +
c1
2= T (∞),
which supports the conjecture forα = 1 andα = ∞.
Example 3: Linear network with two long flows
Consider a linear network with two long flows, shown in Figure 6.2. The link capacities
1x
7x
2x 3x 4x 5x
6x
1c 2c 3c 4c 5c
Figure 6.2: Linear network with two long flows.
arec = (500, 400, 300, 200, 500)T , and the aggregate throughputT (α) can be numerically
solved for anyα ≥ 0. The result is shown in Figure 6.3. It suggests that the conjecture is
true for allα > 0 for this network. Corollary 6.2 below implies that, indeed, it is.
122
0 2 4 6 8 10 121500
1550
1600
1650
1700
1750
1800
α value
Tota
l Thro
ughput
Figure 6.3: Fairness-efficiency tradeoff.
6.4.3 Necessary and sufficient conditions
We now investigate the conjecture in general networks. The aggregate throughputT is a
function of source ratex(α)
T (x(α)) = 1T x(α), (6.11)
where1 = (1, . . . , 1)T . From Corollary 6.1, we have
∂T
∂α= 1T Z(ZT DZ)−1ZT b. (6.12)
When the utility functionU(x, α) is defined as in (6.10), the matrixD and vectorb defined
in (6.6) and (6.7) take the forms
D = α diag(x−α−11 , . . . , x−α−1
N ), b = (x−α1 log x1, . . . , x
−αN log xN)T ,
wherex = x(α) = x(α, c) are the optimal rates. Letµ = µ(α, c), β = β(α, c) and
A = A(α, c) be defined by
µi := zTi b, βi := −1T zi, and A := ZT DZ, (6.13)
123
wherezi are theith columns ofZ. Note thatA is positive definite and hence invertible.
Let Ai(α, c) denote the matrix obtained from replacing theith row of A with row vector
βT = (β1, β2 . . . βM). From the above definitions and (6.12) we have
∂T
∂α= −βT A−1µ. (6.14)
Our first main result is a necessary and sufficient condition for the conjecture to hold.
Note that the condition is a function ofα even though this is not explicit in the notation.
Theorem 6.2.For anyα > 0
∂T
∂α≤ 0 if and only if
M∑i=1
µi det Ai ≥ 0.
Proof: The key observation is the following expression for the row vector
βT A−1 =1
det A
(det A1, det A2, . . . , det An
), (6.15)
which follows from the following formula for matrix inverse [62]
A−1 =1
det AA∗,
whereA∗ is the adjoint matrix ofA. Combining (6.14) and (6.15), we have
∂T
∂α= − 1
det A
M∑i=1
µi det Ai.
Theorem 6.2 characterizes exactly the set of networks(R, c) in which Conjecture 6.1
is true. Though difficult to understand intuitively, this characterization leads directly to
two sufficient conditions that explain all the examples in Section 6.4.2. The first condition
implies that the conjecture is true when every link has a single-link flow and there is only
one long flow. This condition is satisfied by Examples 1 and 2. The second condition
124
implies that the conjecture is true when there are two long flows but both pass through the
same number of links. This condition is satisfied by Example 3. The corollary implies that,
while the diversity of capacitiescl in Examples 2 and 3 makes the optimization problem
(6.2) hard to solve and the previous analysis methods complicated, they are not relevant at
all to the truth of the conjecture for these examples.
Corollary 6.2. Suppose every link has a single-link flow.
1. If dim(Z) = 1, then∂T/∂α ≤ 0 for all α > 0.
2. If dim(Z) = 2 and the only two long flows pass through the same number of links,
then∂T/∂α ≤ 0 for all α > 0.
To prove the corollary, we now specialize to a particular basisZ of the null space of the
routing matrixR, making use of the fact that every link has a single-link flow. Rearrange
the column of routing matrixR to expressR as
R =[
IL R1
],
whereIL is theL× L identity matrix andR1 is aL×M matrix,N = L + M . We choose
a set of basis for the null space ofR such that matrixZ can be expressed as
Z =
−R1
IM
.
Clearly rank(Z) = dim(Z) = M .
Lemma 6.2. Suppose every link has a single-link flow. ForZ in the form of (6.16), we have
1. µm ≥ 0 for m = 1, . . . ,M .
2. amm ≥ amn for all m,n = 1, . . . , M .
Proof: The proof is a series calculations based on the Karush-Kuhn-Tucker conditions.
See [144, 146] for the detailed proof.
125
We are now ready to prove Corollary 6.2 with the above lemma.
Proof of Corollary 6.2: 1) In this case,M = 1 and Z ∈ <N×1 is a column vector.
There areL single-link flows, one at each of theL links, and exactly one other flow that
can traverse one or more links. This means∑L
j=1−z1j ≥ 1 since the long flow at least
transverses one link. Hence
β1 = −1T z1 =L∑
j=1
−z1j − 1 ≥ 0.
From Lemma 6.2, we know thatµ1 > 0. From Theorem 6.2 we have
∂T
∂α= − µ1β1
det A≤ 0
since matrix A is positive definite.
2) In addition to theL single-link flows, there are two flows that traverse one or more
links. Since they traverse the same number of links, we have
β1 = β2 = −1T z1 ≥ 0, (6.16)
as in the first assertion. We also have
µ1 det A1 + µ2 det A2 = β1 [µ1(a22 − a21) + µ2(a11 − a12)] .
Lemma 6.2 and (6.16) then imply that the above quantity is nonnegative. Hence,
∂T
∂α= −µ1 det A1 + µ2 det A2
det A≤ 0.
126
6.4.4 Counter-example
The condition in the second part of Corollary 6.2 that both long flows pass through the
same number of links is important. When it fails, there are networks where theoppositeof
the conjecture is true!
Theorem 6.3.Whendim(Z) ≥ 2, for anyα0 > 0, there exists a network such that
∂T
∂α> 0 for all α > α0
Proof: See the detailed proof in [144, 146] .
Since in reality there are many more flows than bottleneck links and therefore dim(Z)
is typically large, it is conceivable that Conjecture 6.1 is wrong more often than right in
practice.
Example 4: Counter-example
Consider the linear network withL = 5 links andN = 7 sources, shown in Figure 6.4. The
1x
6x
2x 3x 4x 5x
7x
Sc Sc Lc Lc Lc
Figure 6.4: Network for counter-example in Theorem 6.3.
null space ofR has a dimension dim(Z) = N − L = 2. There are five one-link flows with
ratesx1, . . . , x5 and two long flows with ratesx6, x7. Links 1 and 2 have a small capacity
cS, and links 3, 4, and 5 have a large capacitycL. We solve the utility maximization (6.5)
numerically to computeT (α) for α ∈ [0.5, 10].
The aggregate throughputT (α) is plotted in Figure 6.5 as a function ofα, for cS = 10
and cL = 1, 000. The minimal throughput is achieved aroundα0 = 0.95 and will be
127
achieved aroundα0 = 0.75 if we changecL to be5, 000. T (α) is strictly increasing beyond
α0. In particular,
T (∞) > T (2) > T (1).
The example is surprising at first because the conventional wisdom in networking is that
0 2 4 6 8 10 123006.2
3006.4
3006.6
3006.8
3007
3007.2
3007.4
3007.6
3007.8
α value
To
tal T
hro
ug
hp
ut
Figure 6.5: Throughput versus efficiencyα in the counter-example.
increasingα favors long flows that take up more resources, leading to a drop in aggregate
throughput. This is not exactly right. Recall that the pricepl at a link is a precise measure
of congestion at that link. A more precise intuition is that increasingα favors “expensive”
flows, flows that have the largest sum of link prices in their paths. In Example 4, the link
capacitycS is small andcL is large, so that prices are high at links 1 and 2, and low at links
3, 4, and 5. Even thoughx6 traverses more links, it has a lower aggregate price over its
path thanx7. Hence, whenα increases,x7 increases, leading to a reduction inx6 (because
of sharing at link 2). This reduction allows increases in flowsx3, x4,andx5 , so that the net
change in aggregate throughputT (α) is positive. Hence the counter-example relies on the
design that the longest flow is not the most expensive.
Indeed, one can prove that for the network in Figure 6.4,∂x7/∂α > 0 and∂x6/∂α < 0
128
for all α > α0, as illustrated in Figure 6.6. In this example, the decrease inx6 allows
0 2 4 6 8 10 121.5
2
2.5
3
3.5
4
α value
so
urc
e r
ate
x6
x7
Figure 6.6: Source rates versusα in the counter-example.
increases in just three one-link flows, and yet this is enough to produce a net increase in the
aggregate throughput. Our example actually is compact in that our proof shows thatx6 has
to pass through at least three links (link 3,4,5) to make∂T/∂α > 0.
One may notice that the amount of increment in Figure 6.5 is quite small. In fact, an
easy and loose upper-bound for the increment of aggregate throughput iscS/2. Currently,
we don’t know whether this small variation is true only for this example or for general
networks (R, c).
6.5 Does increasing capacity always raise throughput?
We have seen how fairness, as measured byα, can affect efficiency, as measured byT , in
unexpected ways due to interaction among sources in general networks. In this section, we
study how increasing capacityc affects the aggregate throughputT . The results here can
be useful in deciding in which links resources should be invested to maximize aggregate
throughput.
Let δ ∈ RL be the vector that represents the increases in link capacities in the entire
129
network. For instance, whenδ = εel, whereε > 0 is a small scalar, andel is aL-vector that
has all its entries 0 except thel-th entry which is 1, then only linkl increases its capacity
from cl to cl + ε. Whenδ = ε1, then all links increase their capacities byε unit. When
δ = εc, then all links increase their capacities by amounts proportional to their current
capacities.
The change in aggregate throughput per unit of an infinitesimal changeδ in capacities
is measured by the directional derivativeDT of T in directionδ, defined as
DT (α, δ) = DT (α, δ; c) := limε→0
T (α, c)− T (α, c + εδ)
ε.
From (6.11), we have
DT (α, δ) = 1T ∂x
∂cδ,
where∂x/∂c is evaluated at the optimal ratex(α, c). We will takeδ to denote thedirec-
tion of increase in capacity, with the understanding thatε DT (α, δ) provides an estimate of
change in aggregate throughput whenc is changed toc + εδ. Our results should be inter-
preted in the context of small perturbations that do not change the active constraint set in
(6.2).
DefineB = RD−1RT , η = 1T D−1RT , andBi is the matrix obtained by replacingith
row of B by η. A similar argument to the proof of Theorem 6.2 yields the following:
Theorem 6.4.For anyδ, α > 0
DT (α, δ) ≥ 0 if and only ifL∑
i=1
δi det Bi ≥ 0.
Theorem 6.4 characterizes exactly the set of all networks(R, c), and directionsδ, in
which aggregate throughput will increase, forall fairnessα > 0. An easy consequence is
the following:
Corollary 6.3. If R has only two rows, thenDT (α, δ) ≥ 0 for anyα > 0 and anyδ ≥ 0.
Proof: Let Bij denote the(i, j) element ofB. A similar argument to the proof of Lemma
130
6.2 shows that
ηi = Bii, and Bii ≥ Bij = Bji for i, j = 1, 2.
Then
det(B1) = η1B22 − η2B21 = b22(B11 −B21) ≥ 0.
Similarly we havedet(B2) ≥ 0. From Theorem 6.4, we haveDT (α, δ) ≥ 0.
Corollary 6.3 says that increasing link capacity always raises aggregate throughput,
provided there are only two bottleneck links. Intuitively, one might expect this to hold
more generally. This is however not the case. We provide three interesting examples, with
different instantiations of directionδ, as an illustration.
The first result says that not only can the aggregate throughput be reduced when some
link increases its capacity; paradoxically, it can also be reduced whenall links increase
their capacities by the same amount. This is true for almost all fairnessα.
Theorem 6.5.Given anyα0 > 0,
1. there exists a network (R, c) such that for allα > α0, DT (α, el) < 0 for some linkl.
2. there exists a network (R, c) such that for allα > α0, DT (α, 1) < 0.
Proof: The proof is by construction. For the first claim, consider the network in Figure 6.7.
c5
c1
c2
c3
c4
Figure 6.7: Counter-example for Theorem 6.5(1).
There is a single-link flowxl at each linkl, for l = 1, . . . 4. The flowx5 transverses links
131
1, 2, and5, and flowx6 transverses links3, 4, and5 respectively. The capacities of the links
arec1 = c2 = c3 = c4 = cL andc5 = cS. We increase only link 5’s capacity by 1, which
corresponds toδ = e5. For any fixedα0 > 0, we can choosecL/cS large enough, such that
for anyα > α0 all links are fully utilized. Calculating the change in aggregate throughput
using Mathematica gives
DT (α, e5) = 1T ∂x
∂ce5 = 1T D−1RT
(RD−1RT
)−1e5 = −1.
For the second claim, consider the network shown in Figure 6.8. There is a single-link
c2
c1
c3
c4 c5 c10
x13
x11
x12
Figure 6.8: Counter-example for Theorem 6.5(2).
flow xl at each linkl, for l = 1, ..., 10. The link capacities arecl = cS for l = 1, 2, 3 and
cl = cL for l = 4, . . . , 10. We skip the detailed proof ofDT (α, 1) < 0, which can be found
in [144, 146].
Example 5: DT (∞, 1) < 0 for somec in Figure 6.8
To illustrate, we calculate the change in aggregate throughput for the network in Figure 6.8
under max–min policyα = ∞. Let the link capacities becS = 2 andcL = 10. The source
can be easily calculated under max-min fairness, and it is easy to check that the aggregate
throughput is55. When all capacities are increased by2ε with 0 ≤ ε ≤ 1, we can check
that the total throughputT (ε) changes into55 − ε, i.e.,T (ε) is a decreasing function ofε.
IndeedDT (∞, 1) = −1/2 < 0 in this situation.
132
If link capacities are increased proportionally, i.e., ifc is increased to(1 + ε)c, then
aggregate throughput will always rise. Note that even though increasing all link capacities
proportionally may be interpreted as changing the unit of capacity, it doesnot imply that
general utility functions increase proportionally in optimal flow vectors. It does however
imply this for the class of utility functions defined in (6.10).
Theorem 6.6.For any network (R,c) and for allα > 0, DT (α, c) > 0.
Proof: The necessary and sufficient condition for anyx ≥ 0 andp ≥ 0 to be primal and
dual optimal are
Rx = c, and RT p =∂V
∂x= (x−α
1 , . . . , x−αN )T .
Supposex andp are optimal with link capacitiesc. Whenc is increased to(1 + ε)c for
ε > 0, we claim thatx(1 + ε) and(1 + ε)−αp are the new optimal rate and price vectors,
respectively. We can check that these vectors satisfy the optimality condition for capacity
(1 + ε)c
Rx(1 + ε) = c(1 + ε),
RT p(1 + ε)−α = (1 + ε)−α(x−α1 , . . . , x−α
N )T =∂V
∂x.
Therefore, the aggregate throughput is increased from the original valueT to (1 + ε)T .
Hence, we haveDT (α, c) = T > 0.
6.6 Conclusion
A bandwidth allocation policy can be defined in terms of utility functions parameterized
by some protocol parameterα. We have studied how throughputs and prices change as
link capacities orα changes. We then focus on a specific class of utility functions where
α can be interpreted as a quantitative measure of fairness. We say an allocation is fair ifα
is large and efficient if the aggregate throughput is large. We use this model to investigate
whether a fairer allocation is always more inefficient and whether increasing link capacities
133
always raises throughput. We characterize exactly the set of all networks(R, c) in which
the answers are “yes.” Though these characterizations are difficult to understand intuitively,
they have led to simple corollaries that explain all the examples we found in the literature
and to the discovery of the first counter examples.
There are a number of ways this preliminary work can be extended. First, we have
focused of how throughputsx change in response to changes inα andc, which is only half
of Theorem 6.1. The application of the other half of Theorem 6.1 on how prices change
has not been exploited. Second, the necessary and sufficient conditions for the conjectures
are hard to understand intuitively and check for large networks. It is not clear whether
this condition is likely to hold or fail in practice. It would be useful to derive equivalent
characterizations that are more intuitive or more general corollaries than reported here.
Finally, we have assumed that every source has the same utility function. It would be
interesting to see how the fairness definition and tradeoff results should generalize when
sources have the same class of utility functions but with differentαi parameters, or have
different utility functions.
134
Chapter 7
Other Related Projects
7.1 Network equilibrium with heterogeneous protocols
7.1.1 Introduction
As we have seen in the previous chapters, congestion control protocols can be modelled
as distributed algorithms to maximize the aggregate utility, e.g., [80, 97, 116, 164, 88,
96]. However, these studies assume that all sources are homogeneous, that is, even though
they may control their rates using different algorithms, they all adapt to the same type of
congestion signals, e.g., loss probabilities in TCP Reno and queueing delay in FAST TCP
[69]. When sources withheterogeneousprotocols that react to different congestion signals
share the same network, the current duality framework is no longer applicable. With new
congestion control algorithms proposed for large bandwidth-delay product networks and
usage of congestion signals other than packet losses (including explicit feedbacks with
ECN), we need a rigorous framework to understand the behavior of large-scale networks
with heterogeneous protocols.
A congestion control protocol generally takes the form
pl = gl
∑
j:l∈L(j)
xj(t), pl(t)
, xj = fj
xj(t),
∑
l∈L(j)
mjl (pl(t))
. (7.1)
As we have shown in Chapter 2, heregl(·) models a queue management algorithm, and
fj(·) models a TCP algorithm. The effective pricesmjl (pl(t)) are functions of the link
135
pricespl(t), which in general can vary depending on the links and source.
When all algorithms use the same pricing signal (i.e., homogeneous protocols with
mjl = ml for all j), the equilibrium of (7.1) turns out to be very simple. At the equilibrium,
the source ratesxi solve a utility maximization problem, and the link congestion measure
pl serves as a Lagrange dual. When heterogeneous algorithms that use different pricing
signals share the same network, i.e.,mjl are different for different sourcesj, the situation
is much more complicated. For instance, when TCP Reno and FAST TCP share the same
network, neither loss probability nor queueing delay can serve as the Lagrange multiplier at
the link, and (7.1) can no longer be interpreted as solving the standard utility maximization
problem. Basic questions, such as the existence and uniqueness of equilibrium, and its
local and global stability, need to be re-examined.
7.1.2 Model
A network consists of a set ofL links with finite capacitiescl. There areJ different proto-
cols indexed by superscriptj, and there areN j sources using protocolj indexed by(j, i).
The total number of sources isN :=∑
j N j. The L × N j routing matrixRj for type
j sources is defined byRjli = 1 if source(j, i) uses linkl, and 0 otherwise. The overall
routing matrix is denoted byR =[
R1 R2 · · · RJ
].
Every link l has a pricepl. A type j source reacts to the “effective price”mjl (pl) in
its path. By specifying functionmjl , we can let the link feed back different congestion
signals to sources using different protocols. The end-to-end prices for source(j, i) is qji =
∑l R
jlim
jl (pl). Let qj = (qj
i , i = 1, . . . , N j), q = (qj, j = 1 . . . , J), mj(p) = (mjl (pl), l =
1, . . . L) andm(p) = (mj(pl), j = 1, . . . J) be vector forms. Thenqj = (Rj)T
mj(p) and
q = RT m(p).
Let xj be a vector with the ratexji of source(j, i) as itsith entry, and letx be the vector
of xj. We suppose that source(j, i) has a utility functionU ji (xj
i ) that is strictly concave
and increasing.
A network is in equilibrium when each source(j, i) maximizes its net benefit and the
demand for and supply of bandwidth at each bottleneck link are balanced. Formally, a
136
network equilibrium is defined as follows.
Given pricesp, the end-to-end price vectorq is formulated asq = RT m(p). The
source ratexji is uniquely determined byqj
i , and it uniquely solvesmaxz≥0 U ji (z) − zqj
i .
Therefore, the source rates vectorx is a function of link pricesp, denoted asx(p). Denote
y(p) as the aggregate source rates at links, theny(p) = Rx(p).
In equilibrium, the aggregate rate at each link is no more than the link capacity, and
they are equal if the link price is strictly positive. Formally, we callp anequilibrium if it
satisfies
P (y(p)− c) = 0, y(p) ≤ c, p ≥ 0 (7.2)
whereP := diag(pl) is a diagonal matrix. We will study the existence and uniqueness
properties of network equilibrium specified by the above equations.
7.1.3 Existence of equilibrium
We prove the existence of equilibrium under the following assumptions.
A1: Utility functions U ji are strictly concave, increasing, and twice differentiable. Price
mapping functionsmjl are differentiable and strictly increasing withmj
l (0) = 0.
A2: For anyε > 0, there exists a numberpmax such that ifpl > pmax for link l, then
xji (p) < ε for all (j, i) with Rj
li = 1.
These assumptions are mild. Concavity and monotonicity of utility functions are often
assumed in network pricing for elastic traffic. The assumption onmjl preserves the relative
order of prices and maps zero price to zero effective price. Assumption A2 says that when
pl is high enough, then every source going through linkl has a rate less thanε.
Theorem 7.1. Suppose A1 and A2 hold. There exists an equilibrium pricep∗ for any
network(c,m, R, U).
The mathematical tool used to prove this theorem is the Nash theorem in game theory
[121, 10], which is an application of Kakutani’s generalized fixed point theorem. The
detailed proof can be found in [147].
137
7.1.4 Examples of multiple equilibria
Theorem 7.1 guarantees the existence of network equilibrium; however the equilibrium
may not be unique. We show two examples of multiple equilibria.
In a single-protocol network, if the routing matrix R has full row rank, then there is
a unique active constraint set. In contrast, the active constraint set in a multi-protocol
network can be non-unique even if R has full row rank as shown in Example 1. Clearly, the
equilibrium prices associated with different active constraint sets are different.
Example 1: Multiple equilibria with different active constraint sets
Consider a symmetric network in Figure 7.1 with three flows. The link 1 and link 3 have
1x12x1
1x2
Figure 7.1: Example 2: two active constraint sets.
identical parameters, and flows(1, 1) and(1, 2) have identical utility function. We show
that [142], under certain conditions, the network has two equilibria with different congested
links. We also carry out experiments with TCP Reno, which reacts to loss probability,
and TCP Vegas/FAST, which reacts to delay, and we set the experiment parameters such
that the conditions are satisfied. The prices (queues) at link 1 and link 2 are shown in
Figure 7.2. This result unambiguously exhibits that there are two equilibria with different
active constraint sets. The queue flip is produced when the network operates at different
equilibria.
Example 2: Multiple equilibria with a unique active constraint set
When the active constraint set is unique, it is still possible to have multiple equilibria,
and even uncountable many of them. We show that such an example withJ = 3. The
138
0 100 200 300 400 500 600 700 800 900 10000
100
200
300
400
Simulation time (sec)
Lin
k1
Qu
eu
e S
ize
(p
kts
)
0 100 200 300 400 500 600 700 800 900 10000
50
100
150
200
250
300
Simulation time (sec)
Lin
k2
Qu
eu
e S
ize
(p
kts
)
Figure 7.2: Shifts between the two equilibria with different active constraint sets.
network is shown in Figure 7.3 with three unit-capacity links,cl = 1. All the sources use
1x12x1
1x2
1x3
3x1
2x2
Figure 7.3: Example 2: uncountably many equilibria.
the same utility functionU ji (xj
i ) = − (1− xj
i
)2/2, and the price mapping function is linear
mj(p) = Kjp, whereKj areL × L diagonal matrices withK1 = I, K2 = diag(5, 1, 5),
andK3 = diag(1, 3, 1).
It can be shown that the equilibrium pricep satisfies
∑j
Rj(Rj)T Kjp =∑
j
Rj1− c
which is a linear equation inp. It has a unique solution if the determinantdet(∑
j Rj(Rj)T Kj)
139
is nonzero, but has no or multiple solutions otherwise. By choosing appropriate parameters
(as shown above), we can make this determinant zero. It is easy to check that all of the
following are equilibrium prices for this network
p11 = p1
3 = 1/8 + ε, and p12 = 1/4− 2ε where ε ∈ [0, 1/24].
The corresponding source rates can also be derived, and all capacity constraints are tight
with these rates, yet there are uncountably many equilibria.
Examples 1 and 2 show that global uniqueness is generally not guaranteed in a multi-
protocol network. We will show, however, that local uniqueness is basically a generic
property of the equilibrium set.
7.1.5 Local uniqueness of equilibrium
We denoteE as the equilibrium set wherep is in E if and only if it satisfies (7.2). Fix an
equilibrium pricep∗ ∈ E. Let theactive constraint setL = L(p∗) ⊆ L be the set of links
at whichp∗l > 0. Consider the reduced system that consists only of links inL, and denote
all variables in the reduced system byc, p, y, etc. Then, sinceyl(p) = cl for everyl ∈ L,
we havey(p) = c. Let the Jacobian for the reduced system beJ(p) = ∂y(p)/∂p, and
J(p) =∑
j
Rj ∂xj(p)
∂qj
(Rj
)T ∂mj(p)
∂p. (7.3)
Since the equilibrium pricep∗ for the links inL is a solution ofy(p) = c, by the inverse
function theorem, the equilibrium pricep∗, is locally uniqueif the Jacobian matrixJ(p∗) =
∂y/∂p is nonsingular atp∗. We call a networkregular if all its equilibrium prices are locally
unique. The following theorem shows that almost all networks are regular and that regular
networks have finitely many equilibrium prices. This implies that the uncountablly many
equilibria shown in Example2 almost never happens in real networks.
Theorem 7.2. Suppose assumptions A1 and A2 hold. Given any price mapping functions
m, any routing matrixR and utility functionsU ,
140
1. the set of link capacitiesc for which not all equilibrium prices are locally unique has
Lebesgue measure zero in<L+.
2. the number of equilibria for a regular network(c, m,R, U) is finite.
We now narrow our attention to networks that satisfy an additional assumption:
A3: Every link l has a single-link flow.(j, i) with(U j
i
)′(cl) > 0.
Assumption A3 says that when the price of linkl is small enough, the aggregate rate
through it will exceed its capacity. It implies that the active constraint set is unique and
contains every link.
Since all the equilibria of a regular network have nonsingular Jacobian matrices, we
can define theindexI(p) of p ∈ E as
I(p) =
1 if det (J(p)) > 0
−1 if det (J(p)) < 0.
Theorem 7.3.Suppose assumptions A1–A3 hold. Given any regular network, we have
∑p∈E
I(p) = (−1)L
whereL is the number of links.
The proof of this theorem is based on the Poincare-Hopf index Theorem [149, 113].
First, we construct a vector field formed by a continuous-time gradient project algorithm
[97] with multiple protocols. Clearly,p∗ is an equilibrium point of this vector field if and
only if it is a network equilibrium. Under the assumption A3, there will be a contraction
region in this vector field, and all the equilibria are in this region. The Jacobian matrix of
the vector field equilibrium point is the same as 7.3 if uniform stepsize is used. Since the
network is regular, every equilibrium has an index. Using the Poincare-Hopf index theorem
gives us the result in Theorem 7.3. An obvious consequence of this theorem is:
Corollary 7.1. Suppose assumptions A1–A3 hold. A regular network has an odd number
of equilibria.
141
Corollary 7.1 also implies the existence of an equilibrium, although we show this in a
more general setting in Section 7.1.3. Next we will present another example to illustrate
Theorem 7.3 and Corollary 7.1.
Example 3: Illustration of Theorem 7.3 and Corollary 7.1
Recall that in Example 1, there are uncountably many equilibria. The componentsx11
andq11 of these equilibrium points are shown by the (red) solid line in Figure 7.4. We can
change the utility function of every source into the following form
U ji (xj
i , αji ) =
βji (x
ji )
1−αji /(1− αj
i ) if αji 6= 1
βji log xj
i if αji = 1
,
whereαji andβj
i are parameters. We pick two points (the two black dots), and choose ap-
propriate parametersαji andβj
i for every source, such that these two points will be isolated
equilibria.
After this perturbation, we can check whether the two designed equilibria are locally
unique, and the network is regular. Corollary 7.1 predicts an odd number of equilibria. We
indeed can find another equilibrium, and three of them in total 7.4.
x 11
q 11
1=(β11/q
11)1/α1
1 x 1
5/6
7/8
1/8 1/6
(0.135,0.865)
(0.165,0.835)
Figure 7.4: Example 3: construction of multiple isolated equilibria.
We further check the local stability of these three equilibria under the gradient algo-
rithm. It turns out that one of them is not stable and has index 1, while the other two are
142
stable with index−1. The dynamics of this network under the gradient algorithm can be
illustrated by a vector field. We draw the vector field restricted on the planep1 = p3, and
the phase portrait is shown in Figure 7.5. The (red) dots represent the three equilibria. Note
that the equilibrium in the middle is a saddle point, and it is therefore unstable. The (red)
arrows give the direction of this vector field. Individual trajectories are plotted with slim
(blue) lines.
0.1 0.11 0.12 0.13 0.14 0.15 0.16 0.17 0.18 0.19 0.20.15
0.16
0.17
0.18
0.19
0.2
0.21
0.22
0.23
0.24
0.25
p1
p 2
Figure 7.5: Example 3: vector field of (p1, p2).
7.1.6 Global uniqueness of equilibrium
The exact condition under which network equilibrium is globally unique is generally hard
to prove. We provide several special cases for global uniqueness.
Theorem 7.4. Suppose assumptions A1–A3 hold. If all equilibria have index(−1)L, then
E contains exactly one point. In particular, if all equilibria are locally stable, thenE
contains exactly one point.
The first claim of the theorem directly follows from Theorem 7.3. It can be checked
that a local stable equilibrium also has an index of(−1)L, and the second claim also holds.
This result relates the local stability of an algorithm to the uniqueness property of a
network. Local stability can be checked in several ways, and it can be used to prove global
uniqueness. We will concentrate on several special cases in the rest of this section.
143
Theorem 7.5. Suppose assumptions A1–A3 hold andR has full row rank. If for allj and
l, mjl (pl) = kjpl for some scalarkj > 0, then there is a unique network equilibrium .
Under the assumption of Theorem 7.5, it is easy to show that we have an unusual situ-
ation in the theory of heterogeneous protocols where the equilibrium rate vectorx solves
the following concave maximization problem
maxx
∑i,j
kjU ji (xj
i ) s. t.Rx ≤ c.
Therefore, such a network always has a globally unique equilibrium whenU ji are strictly
concave. It can also be proved using Theorem 7.4 by showing that every equilibrium is
locally stable under the gradient projection algorithm.
Theorem 7.6. Suppose assumptions A1–A2 hold. The linear network in Figure 7.6 has a
unique equilibrium.
7.6. We can show that every equilibrium in this network is locally stable and that even
1x
L+1x
2x LxFigure 7.6: Corollary 7.6: linear network.
every source uses different protocols. By Theorem 7.4, the equilibrium is globally unique.
Theorem 7.4 also implies the global uniqueness of equilibrium for any network in which
no flow passes through more than 2 links in the active constraint set, when A1–A3 hold. In
this case, the Jacobian matrix is strictly diagonally dominant with negative diagonal entries,
and hence its determinant is(−1)L.
Theorem 7.7.Suppose assumptions A1–A3 hold. A network where all flows using at most
two links has a unique equilibrium.
144
7.1.7 Conclusion
When sources sharing the same network react to different pricing signals, the current dual-
ity model no longer explains the equilibrium of bandwidth allocation. We have introduced a
mathematical formulation of network equilibrium for multi-protocol networks and studied
several fundamental properties, such as existence, local uniqueness, number of equilibria,
and global uniqueness. We prove that equilibria exist and are almost always locally unique.
The number of equilibria is almost always finite and must be odd when they are associ-
ated with the same active constraint set. We provide four sufficient conditions for global
uniqueness.
The utility maximization problem that underlies a single-protocol network implies that
the equilibrium source rates exist and are always unique [96]. In the heterogeneous protocol
case, we prove that equilibrium still exists, under mild conditions, despite the lack of an un-
derlying concave optimization problem. There can be uncountably many equilibria, and the
bottleneck links set can be also be non-unique. However, we prove that almost all networks
have finitely many equilibria and that they are necessarily locally unique. Non-uniqueness
can arise in two ways. First, the equilibria associated with different sets of bottleneck links
are always distinct. Second, the number of equilibria associated with each set of bottleneck
links can be more than one, though always odd. Moreover, these equilibria cannot all be
locally stable unless the equilibrium is globally unique. We also provide several special
cases for global uniqueness of network equilibrium. We also provide numerical examples
to illustrate the theorem and equilibrium properties.
7.2 Control unresponsive flow–CHOKe
7.2.1 Introduction
TCP is believed to be largely responsible for preventing congestion collapse while the In-
ternet has undergone dramatic growth in the last decade. Indeed, numerous measurements
have consistently shown that more than 90% of traffic on the current Internet still consists
of TCP packets, which, fortunately, are congestion controlled. Without a proper incen-
145
tive structure, however, this state of affairs is fragile and can be disrupted by the growing
number of non-rate-adaptive (e.g., UDP-based) applications that can monopolize network
bandwidth to the detriment of rate-adaptive applications. This has motivated several active
queue management schemes, e.g., [111, 41, 94, 141, 123, 126, 35], that aim at penalizing
aggressive flows and ensuring fairness. The scheme, CHOKe, of [126] is particularly in-
teresting in that it does not require any state information and yet can provide a minimum
throughput to TCP flows.
The basic idea of CHOKe is explained in the following quotation from [126]:
“When a packet arrives at a congested router, CHOKe draws a packet at random from
the FIFO (first-in-first-out) buffer and compares it with the arriving packet. If they both
belong to the same flow, then they are both dropped; else the randomly chosen packet
is left intact and the arriving packet is admitted into the buffer with a probability that
depends on the level of congestion (this probability is computed exactly as in RED). ”
The surprising feature of this simple scheme is that it can bound the bandwidth share of
UDP flows regardless of their arrival rate. Extensive simulation results in [126] show that
as the arrival rate of UDP packets increases without bound, their bandwidth share peaks and
then drops to zero! It seems intriguing that a flow that maintains a much larger number of
packets in the queue does not receive a larger share of bandwidth, as in the case of a regular
FIFO buffer. We provide an analytical model of CHOKe that explains both this throughput
behavior and the spatial characteristics of its leaky buffer. In this section, we will present
the model, analysis, and simulations of CHOKe very briefly. See [155, 143, 145] for details.
7.2.2 Model
We focus on a network with a single bottleneck link with capacityc pkts/sec, which is
shared by byN identical TCP sources and a UDP flow with a constant sending rate. We
study the network’s equilibrium behavior.
Equilibrium quantities (rate, dropping probability, etc.) associated with the UDP flow
are indexed by 0. Since the TCP sources are identical, we will use index 1 for all TCP
146
flows. The definition of all variables and some of their obvious properties are collected
below:
d: common round-trip propagation delay for TCP sources.
τ : common queueing delay, and round-trip delay isd + τ .
bi: packet backlog from flowi, i = 0, 1.
b: total backlog;b = b0 + b1N .
r: congestion based dropping probability. In general,r = g(b, τ) whereg is a function
of aggregate backlogb and queueing delayτ .
xi: source rate of flowi. In general,x1 = f(p1, τ) wheref is a function of overall loss
probabilityp1 and queueing delayτ .
hi: the probability that an incoming packet of flowi is dropped by CHOKehi = bi/b.
pi: overall probability that a packet of flowi is dropped before it gets through.
A packet may be dropped, either on arrival due to CHOKe or congestion (e.g., according
to RED) or after it has been admitted into the queue when a future arrival from the same
flow triggers a comparison. Every arrival packet from flowi can trigger either0 packet loss
from the buffer, 1 packet loss due to RED, or 2 packet losses due to CHOKe. These events
happen with respective probabilities of(1 − hi)(1 − r), (1 − hi)r, andhi. Hence, each
arrival to the buffer is accompanied by an average packet loss of
pi = 2hi + (1− hi)r + 0 · (1− hi)(1− r) = 2hi + r − rhi. (7.4)
Consider a packet of flowi that eventually goes through the queue without being dropped.
The probability that it is not dropped on arrival is(1− r)(1−hi). Once it enters the queue,
it takesτ time to go through it. In this time period, there are on averageτxi packets from
flow i that arrive at the queue. The probability that this packet is not chosen for comparison
147
is (1− 1/b)τxi. Hence, the overall probability that a packet of flowi survives the queue is
1− pi = (1− r)(1− hi)
(1− 1
b
)τxi
. (7.5)
The rate of the flowi’s packets getting through the buffer isxi(1 − pi). Since the link is
fully utilized, the flow throughputs sum to link capacity:
x0(1− p0) + Nx1(1− p1) = c.
The model is derived by putting together all the above equations. The only independent
variable is UDP ratex0, and there are ten dependent variables. In summary, the model is
described by the following ten equations:
pi = 2hi + r − rhi, i = 0, 1 (7.6)
pi = 1− (1− r)(1− hi)
(1− 1
b
)τxi
, i = 0, 1 (7.7)
hi =bi
b, i = 0, 1 (7.8)
b = b0 + Nb1 (7.9)
c = x0(1− p0) + Nx1(1− p1) (7.10)
x1 = f(p1, τ) (TCP) (7.11)
r = g(b, τ) (e.g. RED) (7.12)
Substituting(f, g) with the analytical model of Reno/RED, this set of nonlinear equations
(7.6)–(7.12) can be solved numerically using Matlab. The solution is accurately validated
with ns-2 simulations shown in Section 7.2.5. This solution can then be used in the differ-
ential equation model described later to solve for spatial properties of the leaky buffer.
7.2.3 Throughput analysis
By making three approximations, we can derive the maximum achievable UDP throughput,
and prove that UDP throughput approaches zero when it sends infinitely fast.
148
First, we approximate the system by one in which the order of congestion based drop-
ping and CHOKe is reversed. Second, we assume thatN is large such that a comparison
triggered by aTCParriving packet never yields a match. The last assumption is that we can
approximate(1− 1/b)b ' e−1. Under these assumptions, we can eliminateτ in the model
(7.6)–(7.12) and get our key equation
1− h0
1− 2h0
= exp
(x0(1− r)(1− h0)
c− x0(1− r)(1− 2h0)
). (7.13)
Let µ0 = µ0(x0) denote the UDP throughput share,µ0 = x0(1 − p0)/c, and let
µ∗0 = maxx0≥0 µ0(x0) denote the maximum achievable UDP share. The UDP through-
put behavior can be totally captured using equation (7.13), which is independent of TCP
and AQM algorithms. We show the bandwidth properties in Theorem 7.8 and visualize it
in Figure 7.7 which shows UDP throughput versus sending rates.
100
101
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
0.5
µ0 U
DP
bandw
idth
share
x0(1-r)/c
Peak Value 0.269
Figure 7.7: Bandwidth shareµ0 v.s. Sending ratex0(1− r)/c.
Theorem 7.8.The UDP throughput has the following properties:
1. The maximum UDP bandwidth share isµ∗0 = (e + 1)−1 = 0.269. It is attained
when the UDP input rate after congestion based dropping isx∗0(1 − r∗) = c(2e −1)/(e + 1) = 1.193c. In this case, the CHOKe dropping rate for UDP ish∗0 =
(e− 1)/(2e− 1) = 0.387.
149
2. As the UDP sending rate grows without bound, even though UDP packets occupy up
to half of the queue, its throughput drops to zero. That is, asx0 →∞, b0 → b/2 but
µ0 → 0.
The second result of the theorem can also be proved without using the three approxi-
mations. See proofs and more details in [155, 143, 145].
7.2.4 Spatial characteristics
We now derive the spatial characteristics of the leaky buffer under CHOKe that give rise to
the macroscopic properties of maximum and asymptotic throughput proved in the previous
subsection.
Lety ∈ [0, b] denote a position in the queue, withy = 0 being the tail. Definev(y) as the
velocity at which the packet at positiony moves toward the head of the queuev(y) = dy/dt.
For instance, the velocity at the head of the queue equals the link capacity,v(b) = c. Let
ρi(y) be the probability that the packet at positiony belongs to flowi, i = 0, 1. The
bandwidth shareµi is the probability that the head of the queue is occupied by a packet
from flow i, µi = ρi(b). We can derive an ordinary differential equation (ODE) model of
these two quantities
v′(y) = β (ρ0(y)x0 + (1− ρ0(y))x1) , (7.14)
ρ′0(y) = β(x0 − x1) ρ0(y)(1− ρ0(y))1
v(y), (7.15)
whereβ = log(1− 1/b), and the boundary conditions are
v(b) = c, and ρi(0)v(0) = xi(1− r)(1− hi), i = 0, 1.
The spatial characteristics of the leaky buffer under CHOKe are totally captured by
these differential equations. Now we present some structural properties of the velocity
v(y) and spatial distributionρ0(y), which are shown in Theorems 7.9, 7.10, and illustrated
in Figure 7.8.
150
Theorem 7.9. 1. For all x0 ≥ 0, packet velocityv(y) is a convex and strictly decreasing
function. It is linear if and only ifx0 = x1.
2. Supposex0 > x1. Thenρ0(y) is a strictly decreasing function. Moreover, ifρ0(0) ≤ρ∗, thenρ0(y) is convex. Ifρ0(b) ≥ ρ∗, thenρ0(y) is concave. Ifρ0(b) < ρ∗ < ρ0(0),
thenρ0(y) is first concave and then convex (as it is shown in Figure 7.8(b) ) , where
ρ∗ = (x0 − 2x1)/(3(x0 − x1)).
Now we study the asymptotic properties ofv(y) andρ0(y) asx0 goes to infinity. We
assume that the pointwise limits ofv(y) andρ0(y) exis and denote this byv∞(y) andρ∞0 (y).
We describe the asymptotic properties in the following theorem.
Theorem 7.10.For anyx0, every flow, including UDP flow, occupies less than half of the
queue. Whenx0 →∞, we have
1. the buffer sizeb∞ is finite. The UDP’s share of this buffer ish∞0 = (1−r∞)/(2−r∞).
2. the throughput of UDP source goes to zero, i.e.,x0(1− p0) = ρ0(b)c → 0.
3. let y∗ = b∞(1−r∞)/(2−r∞). When0 ≤ y ≤ y∗, we haveρ∞0 = 1 andv∞(y) = ∞.
Wheny∗ < y ≤ b∞, we haveρ∞0 = 0 andv∞(y) = c− β∞x∞1 (b∞ − y)
When the UDP input rate increases, even though the total number of UDP packets in
the queue increases, their spatial distribution becomes more and more concentrated near the
tail of the queue and drops rapidly to zero toward the head of the queue. This means that
most of the UDP packets are dropped before they reach the head. It is therefore possible to
simultaneously maintain a large number of packets (concentrated near the tail) and receive a
small bandwidth share, in stark contrast to the behavior of a non-leaky FIFO buffer. Indeed,
asx0 grows without bound, UDP share drops to 0. This also confirms the approximate
throughput analysis of Theorem 7.8. Second, the packet velocity is infinite before the
positiony∗ because UDP packets are being dropped at an infinite rate untily∗.
151
y
v(y)
x0 = x1
c
y
v(y)
x0 = x1
c
y
v(y)
c
10xx ≠
y
v(y)
c
∞=0
x
bb bb bbbr
r
−
−
2
1
(a) Velocityv(y)
y
ρ0(y)
x0 = x1
y
∞=0
x
ybb b
ρ0(y) ρ0(y)
br
r
−
−
2
1
x0 > x1
(b) Spatial distributionρ0(y)
Figure 7.8: Illustration of Theorems 7.9 and 7.10.
152
7.2.5 Simulations
We implement a CHOKe module in ns-2 version 2.1b9. We have conducted extensive
simulations with a single bottleneck link network. This network is shared byN newReno
TCP sources and one UDP source which sends data at a constant rate. We present three
sets of simulation results. The first set illustrates the accuracy of our TCP/CHOKe model
(7.6)–(7.12) and its macroscopic properties. The second set illustrates the spatial properties
proved in Theorem 7.10. The third example uses TCP Vegas and illustrates that these
properties are insensitive to the specific TCP algorithms.
For the newReno simulations in the first two sets the link capacity is fixed at 125
pkts/sec, and round-trip propagation delay is 100ms. We use RED+CHOKe as the queue
management with RED parameters minthb = 20 packets, maxthb = 520 packets,pmax =
0.5. The corresponding analytical model for Reno (functionf ) and RED (functiong) can
be found in Chapter 3.
In our simulation, we vary the UDP sending ratex0 from 0.1c to 10c and measure the
aggregate queue sizeb, UDP bandwidth shareµ0 = ρ0(b), and TCP throughputµ1. We
also solve for these quantities using the analytical model (7.6)–(7.12) and the approximate
model described in Section 7.2.3. The results, shown in Figures 7.9, illustrate both the
macroscopic behavior of TCP/CHOKe and the accuracy of our analytical models.
As can be seen from Figure 7.9, the aggregate queue lengthb steadily increases as
the UDP ratex0 rises. UDP bandwidth shareµ0 = ρ0(b) rises, peaks, and then drops to
less than 5% asx0 increases from0.1c to 10c, while the total TCP throughput follows an
opposite trend, eventually exceeding 95% of the capacity (not shown). These results match
closely those obtained in [126], for both the analytical model and the approximate mode.
Figure 7.9(b) also displays the UDP bandwidth share measured from the simulations for
the casesx0 = 0.1c, c, 10c. It verifies Theorems 7.8 that predicts that the UDP bandwidth
share peaks at around 0.269 and tends to zero asx0 increases.
The next set of results measures the spatial distributionsρ0(y) of UDP packets in the
above simulations shown in Figure 7.9. The simulation results, and analytical solutions, are
both shown in Figure 7.10. They match well Theorem 7.10 and agree with Figure 7.8(b)
153
10−1
100
101
0
20
40
60
80
100
120
140
160
UDP sending rate/capacity, c=125pkts/sec
queue s
ize (
pkts
)
SimulationModelApproximate Model
(a) Queue sizeb
10−1
100
101
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
0.5
UD
P b
an
dw
idth
sh
are
UDP sending rate/capacity, c=125pkts/sec
x0=0.1c
share=0.0777
x0=c
share=0.2474
x0=10c
share=0.0195
SimulationModelApproximate Model
(b) UDP bandwidth shareµ0
Figure 7.9: Experiment 1: effect of UDP ratex0 on queue size and UDP share
in Section 7.2.4. Whenx0 = 0.1c (Figure 7.10(a)), UDP packets are distributed roughly
uniformly in the queue, with probability close to 0.08 at each position. As a result, its
bandwidth share is roughly 10%. Asx0 increase,ρ0(y) concentrates more and more near
the tail of the queue and drops rapidly toward the head, as predicted by Theorem 7.10.
Also marked in Figure 7.9(b) are the UDP bandwidth shares corresponding to UDP rates in
Figure 7.10. As expected the UDP bandwidth shares in 7.9(b) are equal toρ0(b) in Figure
7.10. Whenx0 > 10c, even though roughly half of the queue is occupied by UDP packets,
almost all of them are dropped before they reach the head of the queue!
In the last set of simulations, we use TCP Vegas [20] instead of newReno. In these sim-
ulations, the link capacity is fixed atc = 1875 pkts/sec., the round-trip propagation delay is
d = 100ms, and the number of TCP sources isN = 100. We set Vegas parameterαd = 20
packet, and use RED parameters(20, 1020, 0.1). The UDP sending rate varies from0.1c to
10c. We measure the UDP bandwidth shareµ0 and queue lengthb, and compare them with
the numerical solutions of the full model and those of the approximate model described.
The results are shown in Figure 7.11. Comparison of this with Figure 7.9 for NewReno
simulations confirms that the qualitative behavior of TCP/CHOKe is insensitive to TCP
algorithms.
154
10 20 30 40 50 60 70 80 900
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Queue Position
UD
P P
acke
ts S
pa
tia
l D
istr
ibu
tio
n
SimulationModel
(a) UDP ratex0 = 0.1c
20 40 60 80 100 1200
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Queue Position
UD
P P
acke
ts S
pa
tia
l D
istr
ibu
tio
n
SimulationModel
(b) UDP ratex0 = c
20 40 60 80 100 120 1400
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Queue Position
UD
P P
acke
ts S
pa
tia
l D
istr
ibu
tio
n
SimulationModel
(c) UDP ratex0 = 10c
20 40 60 80 100 120 1400
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Queue Position
UD
P P
acke
ts S
pa
tia
l D
istr
ibu
tio
n
SimulationModel
(d) UDP ratex0 = 100c
Figure 7.10: Experiment 2: spatial distributionρ(y).
155
10−1
100
101
100
150
200
250
300
350
400
UDP sending rate/capacity, c=1875pkts/sec
queue s
ize (
pkts
)
SimulationModelApproximate model
(a) Queue sizeb
10−1
100
101
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
0.5
UDP sending rate/capacity c=1875pkts/sec
UD
P b
an
dw
idth
sh
are
x0=0.1c
share=0.0813
x0=c
share=0.2640
x0=10c
share=0.034
SimulationModelApproximate model
(b) UDP bandwidth shareµ0
Figure 7.11: Experiment 3: effect of UDP ratex0 on queue size and UDP share with TCPVegas.
7.2.6 Conclusions
We have developed a model of CHOKe. Its key features are the incorporation of the feed-
back equilibrium of TCP and a detailed modelling of the queue dynamics. We prove that as
the UDP input rate increases, its bandwidth peaks at(e+1)−1 = 0.269 when the UDP input
rate is slightly larger than link capacity, and drops to zero as the UDP input rate tends to in-
finity. To explain this phenomenon, we have introduced the concepts of spatial distribution
and velocity of packets in the queue. We prove that structural and asymptotic properties
of these quantities make it possible for UDP to simultaneously maintain a large number
of packets in the queue and receive a vanishingly small bandwidth share, the mechanism
through which CHOKe protects TCP flows.
156
Chapter 8
Summary and Future Directions
We have studied the equilibrium and dynamics of the Internet congestion control using tools
recently developed from feedback-control theory and optimization theory. We summarize
our work and list several future research directions in this chapter.
The models and dynamics of TCP Reno and FAST TCP have been studied in chapters
3 and 4. We point out several future directions in the studies of TCP dynamics.
1. TCP dynamics can be studied in several different settings, e.g., single link vs. general
network, homogeneous sources vs. heterogenous sources, local stability vs. global
stability, without feedback delay vs. with feedback delay. In general, the latter set-
tings are more difficult to deal with. Our studies only cover part of them and should
be extended to global stability with feedback delays in general networks.
2. As we mentioned in Section 7.1, during the incremental deployment of congestion
control schemes, there is an inevitable phase of heterogenous protocols running on
the same network. While the equilibrium properties of heterogenous protocols have
been studied in [146], the dynamics of such systems are still open and is one of our
future directions.
3. The fluid model of TCP has been widely used to study TCP dynamics; however this
model can not capture the self-clocking feature in the packet level. We have shown
in Chapter 4 that this model may give wrong predictions about stability. A discrete-
time model is introduced to capture thisself-clockingeffect. However, we also found
several scenarios where its predictions also disagree with the experiments. It seems
157
that both models are inaccurate. We need to clarify the discrepancies in these models
and hopefully derive a better one.
Recent studies have shown that TCP/AQM algorithms can be interpreted as carrying
out a distributed primal-dual algorithm over the Internet to maximize aggregate utility.
The equilibrium properties (i.e., fairness, throughput, capacity, and routing) of TCP/AQM
systems are studied using the utility maximization framework in Chapter 5, and 6. These
studies can be also extended in several ways.
1. We have focused on how network throughput changes in response to changes in
fairness and capacity in Chapter 6. However, how link prices and network revenue
changes has not been investigated.
2. We have identified the existence of a non-trial duality gap, in the joint utility maxi-
mization problem in Chapter 5, and we need to derive a bound for this gap which is
the penalty for not splitting the traffic.
3. Even though numerical examples suggest that the tradeoff between routing stability
and utility maximization exists in a more general network, we have not been able to
find an analytical proof.
4. When a static component is included in link cost, it is not known if TCP/IP has an
equilibrium, whether the equilibrium jointly solves a certain optimization problem,
and under what condition it is stable.
158
Bibliography
[1] R. K. Ahuja, T. L. Magnanti, and J. B. Orlin.Network Flows: Theory, Algorithms,
and Applications. Prentice-Hall, 1993.
[2] M. Allman. A web server’s view of the transport layer.ACM Computer Communi-
cation Review, 30(5), October 2000.
[3] M. Allman, V. Paxson, and W. Stevens. TCP congestion control. RFC 2581,
Internet Engineering Task Force, April 1999.http://www.ietf.org/rfc/
rfc2581.txt .
[4] T. Alpcan and T. Basar. A utility-based congestion control scheme for internet-style
networks with delay. InProceedings of IEEE Infocom, 2003.
[5] S. Athuraliya, V. H. Li, S. H. Low, and Q. Yin. REM: Active queue management.
IEEE Network, 15(3):48–53, May/June 2001.
[6] S. Athuraliya and S. Low. Optimization flow control – ii: Implementation, 2000.
[7] B. Awerbuch, Y. Azar, and S. Plotkin. Throughput competitive online routing. In
Proceedings of 34th IEEE symposium on Foundations of Computer Science, pages
32–40, 1993.
[8] J. Aweya, M. Ouellette, and D. Y. Montuno. A control theoretic approach to active
queue management.Computer Networks, 36:203–235, 2001.
[9] F. Baccelli and D. Hong. Interaction of TCP Flows as Billiards. InProceedings of
IEEE Infocom, 2003.
159
[10] T. Basar and G. J. Olsder.Dynamic Noncooperative Game Theory. Academic Press,
London and San Diego, second edition, 1995.
[11] N. Bean, F. Kelly, and P. Taylor. Braess’s paradox in loss networks.Journal of
Applied Probability, pages 155–159, 1997.
[12] D. P. Bertsekas. Dynamic behavior of shortest path routing algorithms for communi-
cation networks.IEEE Transactions on Automatic Control, pages 60–74, February
1982.
[13] D. P. Bertsekas.Linear network optimization: algorithms and codes. MIT Press,
1991.
[14] D. P. Bertsekas.Nonlinear Programming. Athena Scientific, 1995.
[15] D. P. Bertsekas and R. Gallager.Data Networks. Prentice-Hall Inc., second edition,
1992.
[16] T. Bonald and L. Massoulie. Impact of fairness on Internet performance. InPro-
ceedings of ACM Sigmetrics, pages 82–91, June 2001.
[17] B. Braden, D. Clark, J. Crowcroft, B. Davie, S. Deering, D. Estrin, S. Floyd,
V. Jacobson, G. Minshall, C. Partridge, L. Peterson, K. Ramakrishnan, S. Shenker,
J. Wroclawski, and L. Zhang. Recommendations on queue management and con-
gestion avoidance in the Internet. RFC 2309, Internet Engineering Task Force, April
1998.http://www.ietf.org/rfc/rfc2309.txt .
[18] D. Braess.Uber ein paradoxen aus der verkehrsplanung.Unternehmensforschung,
12:258–268, 1968.
[19] L. Brakmo, S. O’Malley, and L. Peterson. TCP Vegas: New techniques for conges-
tion detection and avoidance. InProceedings of ACM Sigcomm, pages 24–35, Aug.
1994.
160
[20] L. S. Brakmo and L. L. Peterson. TCP Vegas: End to end congestion avoidance on
a global Internet.IEEE Journal on Selected Areas in Communications, 13(8):1465–
1480, 1995.
[21] H. Bullot, R. L. Cottrell, and R. Hughes-Jones. Evaluation of advanced TCP stacks
on fast long-distance production networks. InProceedings of Second International
Workshop on Protocols for Fast Long-Distance Networks, 2004.
[22] M. Butler and H. Williams. The allocation of shared fixed costs, 2002.
http://www.lse.ac.uk/collections/operationalResearch/
pdf/lseor02-52.pdf .
[23] F. M. Callier and C. A. Desoer.Linear System Theory, pages 368–374. Springer-
Verlag, New York, 1991.
[24] H. Choe and S. H. Low. Stabilized Vegas. InProceedings of IEEE Infocom, April
2003.http://netlab.caltech.edu .
[25] M. Christiansen, K. Jaffay, D. Ott, and F. D. Smith. Tuning RED for web traffic. In
Proceedings of ACM Sigcomm, pages 139–150, 2000.
[26] M. Christiansen, K. Jeffay, D. Ott, and F. D. Smith. Tuning red for web traffic.
IEEE/ACM Trans. Netw., 9(3):249–264, 2001.
[27] J. Cohen and P. Horowitz. Paradoxical behaviour of mechanical and electrical net-
works. Nature, 352:699–701, August 1991.
[28] J. Cohen and F. Kelly. A paradox of congestion in a queuing network.J. Appl. Prob.,
27:730–734, 1990.
[29] S. Cosares and I. Saniee. An optimization problem related to balancing loads on
SONET rings.Telecommunications Systems, 3:165–181, 1994.
[30] S. Deb and R. Srikant. Congestion control for fair resource allocation in networks
with multicast flows.IEEE/ACM Transactions on Networking, 12(2):274–285, 2004.
161
[31] C. A. Desoer and Y. T. Yang. On the generalized nyquist stability criterion.IEEE
Transactions on Automatic Control, 25:187–196, 1980.
[32] D. Dutta, A. Goel, and J. Heidemann. Oblivious aqm and nash equilibria. InPro-
ceedigns of IEEE Infocom, 2003.
[33] FAST-Team. FAST implementation code.http://netlab.caltech.edu/
FAST/ .
[34] W. Feng, D. D. Kandlur, D. Saha, and K. G. Shin. A self-configuring RED gateway.
In Proceedings of IEEE Infocom, volume 3, pages 1320–1328, March 1999.
[35] W. Feng, K. G. Shin, D. Kandlur, and D. Saha. Stochastic Fair Blue: A queue
management algorithm for enforcing fairness. InProceedings of IEEE Infocom,
2001.
[36] V. Firoiu and M. Borden. A study of active queue management for congestion con-
trol. In Proceedings of IEEE Infocom, pages 1435–1444, March 2000.
[37] C. Fisk and S. Pallottino. Empirical evidence for equilibrium paradoxes with im-
plications for optimal planning strategies.Transportation Research, 15A:245–248,
1981.
[38] S. Floyd. Connections with multiple congested gateways in packet-switched net-
works part 1: One-way traffic.Computer Communications Review, 21(5):30–47,
Ocotober 1991.
[39] S. Floyd. Highspeed TCP for large congestion windows. RFC 3649, IETF Experi-
mental, December 2003.http://www.faqs.org/rfcs/rfc3649.html .
[40] S. Floyd and T. Henderson. The NewReno modification to TCP’s fast recovery
algorithm. RFC 2582, Internet Engineering Task Force, April 1999.http://
www.ietf.org/rfc/rfc2582.txt .
[41] S. Floyd and V. Jacobson. Random early detection gateways for congestion avoid-
ance.IEEE/ACM Transactions on Networking, 1(4):397–413, 1993.
162
[42] S. Floyd and V. Paxson. Difficulties in simulating the Internet.IEEE/ACM Transac-
tion on Networking, 9(4):392–403, 2001.
[43] C. Fraleigh, S. Moon, B. Lyles, M. Khan, D. Moll, R. Rockell, T. Seely, and C. Diot.
Packet level traffic measurements from the sprint IP backbone.IEEE Network,
17(6):6–16, Nov.-Dec. 2003.
[44] M. Frank. The braess paradox.Mathematical Programming, 20:283–302, 1981.
[45] E. M. Gafni and D. P. Bertsekas. Dynamic control of session input rates in com-
munication networks.IEEE Transactions on Automatic Control, 29(11):1009–1016,
January 1984.
[46] R. G. Gallager and S. J. Golestani. Flow control and routing algorithms for data
networks. InProceedings of the 5th International Conf. Comp. Comm., pages 779–
784, 1980.
[47] M. Garey and D. Johnson.Computers and intractability: a guide to the theory of
NP-completeness. W. H. Freeman and Co., 1979.
[48] N. Garg and J. Konemann. Faster and simpler algorithms for multicommodity flow
and other fractional packing problems. InProceedings of IEEE Symposium on Foun-
dations of Computer Science, pages 300–309, 1998.
[49] R. Garg, A. Kamra, and V. Khurana. A game-theoretic approach towards congestion
control in communication networks.SIGCOMM Comput. Commun. Rev., 32(3):47–
61, 2002.
[50] R. J. Gibbens and F. P. Kelly. Resource pricing and the evolution of congestion
control. Automatica, 35:1969–1985, 1999.
[51] T. G. Griffin, F. B. Shepherd, and G. Wilfong. The stable paths problem and inter-
domain routing.IEEE/ACM Trans. Netw., 10(2):232–243, 2002.
163
[52] K. Gummadi, R. Dunn, S. Saroiu, S. Gribble, H. Levy, and J. Zahorjan. Measure-
ment, modeling, and analysis of a peer-to-peer file-sharing workload. InProceedings
of the 19th ACM SOSP, Bolton Landing, NY, October 2003.
[53] L. Guo and I. Matta. The war between mice and elephants. InProceedings of IEEE
ICNP, November 2001.
[54] M. Handley, S. Floyd, J. Padhye, and J. Widmer. TCP Friendly Rate Control
(TFRC): Protocol specification. RFC 3168, Internet Engineering Task Force, Jan-
uary 2003.http://www.ietf.org/rfc/rfc3384.txt .
[55] E. Hashem. Analysis of random drop for gateway congestion control. Technical
Report LCS TR-465, Laboratory for Computer Science, MIT, Cambridge MA, 1989.
[56] C. Hedrick. Routing information protocol. RFC 1058, Internet Engineering Task
Force, June 1988.http://www.ietf.org/rfc/rfc1058.txt .
[57] J. C. Hoe. Improving the start-up behavior of a congestion control sheme for TCP.
In Proceedings of ACM Sigcomm, pages 270–280, New York, 1996.
[58] C. Hollot, V. Misra, and W. Gong. On designing improved controllers for AQM
routers supporting TCP flows. InProceedings of IEEE Infocom, 2001.
[59] C. Hollot, V. Misra, and W. Gong. Analysis and design of controllers for AQM
routers supporting TCP flows.IEEE Transactions on Automatic Control, 47(6),
2002.
[60] C. Hollot, V. Misra, D. Towsley, and W. Gong. A control theorietic analysis of RED.
In Proceedings of IEEE Infocom, 2001.
[61] C. Hollot, V. Misra, D. Towsley, and W. Gong. On designing improved controller
for AQM routers supporting TCP flows. InProceedings of IEEE Infocom, 2001.
[62] R. A. Horn and C. R. Johnson.Matrix Analysis. Cambridge University Press, 1985.
164
[63] V. Jacobson. Congestion avoidance and control. InProceedings of ACM Sigcomm,
pages 314–329, Stanford, CA, Aug. 1988.
[64] V. Jacobson, R. Braden, and D. Borman. TCP extensions for high performance.
RFC 1323, Internet Engineering Task Force, May 1992.http://www.ietf.
org/rfc/rfc1323.txt .
[65] R. Jain. A delay-based approach for congestion avoidance in interconnected het-
erogeneous computer networks.Comput. Commun. Rev., 19(5):56C–71, October
1989.
[66] R. Jain. Congestion control in computer networks: Issues and trends.IEEE Network,
pages 24–30, May 1990.
[67] R. Jain, W. Hawe, and D. Chiu. A quantitative measure of fairness and discrimi-
nation for resource allocation in shared computer systems. Technical Report DEC
TR-301, Digital Equipment Corporation, September 1984.
[68] C. Jin, D. X. Wei, and S. H. Low. FAST TCP: motivation, architecture, algorithms,
performance. Technical Report CaltechCSTR:2003:010, Caltech, December 2003.
[69] C. Jin, D. X. Wei, and S. H. Low. FAST TCP: motivation, architecture, algorithms,
performance. InProceedings of IEEE Infocom, March 2004.http://netlab.
caltech.edu .
[70] C. Jin, X. Wei, S. H. Low, G. Buhrmaster, J. Bunn, D. H. Choe, R. L. A. Cottrell, J. C.
Doyle, W. Feng, O. Martin, H. Newman, F. Paganini, S. Ravot, and S. Singh. FAST
TCP: From Theory to Experiments.IEEE Network, 19(1):4–11, January/February
2005.
[71] R. Johari and D. K. Tan. End-to-end congestion control for the Internet: delays and
stability. IEEE/ACM Transactions on Networking, 9(6):818–832, 2001.
165
[72] H. Kameda, E. Altman, T. Kozawa, and Y. Hosokawa. Braess-like paradoxes in dis-
tributed computer systems.IEEE Transactions on Automatic Control, 45(9):1687–
1690, 2000.
[73] H. Kameda and O. Pourtallier. Paradoxes in distributed decisions on optimal load
balancing for networks of homogeneous computers.Journal of the ACM, 49:407–
433, 2002.
[74] K. Kar, S. Sarkar, and L. Tassiulas. Optimization based rate control for multipath
sessions. InProceedings of 7th International Teletraffic Congress (ITC), December
2001.
[75] K. Kar, S. Sarkar, and L. Tassiulas. Optimization based rate control for multirate
multicast sessions. InProceedings of IEEE Infocom, pages 123–132, April 2001.
[76] D. Katabi, M. Handley, and C. Rohrs. Congestion control for high bandwidth-delay
product networks. InProceedings of ACM Sigcomm, 2002.
[77] F. Kelly. Charging and rate control for elastic traffic.European Transactions on
Telecommunications, 8:33–37, 1997.
[78] F. Kelly and T. Voice. Stability of end-to-end algorithms for joint routing and rate
control. Computer Communication Review, 35(2):5–12, 2005.
[79] F. P. Kelly. Fairness and stability of end-to-end congestion control.European Jour-
nal of Control, 9:159–176, 2003.
[80] F. P. Kelly, A. Maulloo, and D. Tan. Rate control for communication networks:
Shadow prices, proportional fairness and stability.Journal of Operations Research
Society, 49(3):237–252, March 1998.
[81] T. Kelly. Scalable TCP: improving performance in highspeed wide area networks.
ACM SIGCOMM Computer Communication Review., 33(2):83–91, 2003.
[82] K. B. Kim and S. H. Low. Analysis and design of aqm for stabilizing tcp. Technical
Report CaltechCSTR:2002:009, Caltech, March 2002.
166
[83] Y. Korilis, A. Lazar, and A. Orda. Capacity allocation under noncooperative routing.
IEEE Trans. on Automatic Control, 42(3):309–325, 1997.
[84] Y. Korilis, A. Lazar, and A. Orda. Avoiding the Braess paradox in non-cooperative
networks.J. Appl. Prob., 36:211–222, 1999.
[85] S. Kunniyur and R. Srikant. End-to-end congestion control schemes: Utility func-
tions, random losses and ECN marks. InProceedings of IEEE Infocom, pages 1323–
1332, March 2000.
[86] S. Kunniyur and R. Srikant. Analysis and design of an adaptive virtual queue (avq)
algorithm for active queue management. InProceedings of ACM Sigcomm, pages
123–134, New York, NY, USA, 2001. ACM Press.
[87] S. Kunniyur and R. Srikant. A time–scale decomposition approach to adaptive ECN
marking. InProceedings of IEEE Infocom, pages 1330–1339, April 2001.http:
//comm.csl.uiuc.edu:80/˜srikant/pub.html .
[88] S. Kunniyur and R. Srikant. End-to-end congestion control: utility functions, ran-
dom losses and ECN marks.IEEE/ACM Transactions on Networking, 11(5):689 –
702, October 2003.
[89] R. J. La and V. Anantharam. Charge-sensitive TCP and rate control in the internet.
In INFOCOM (3), pages 1166–1175, 2000.
[90] T. V. Lakshman and U. Madhow. The performance of TCP/IP for networks with
high bandwidth-delay products and random loss.IFIP Transactions C-26, High
Performance Networking, pages 135–150, 1994.
[91] W. E. Leland, M. S. Taqqu, W. Willinger, and D. V. Wilson. On the self-similar
nature of ethernet traffic (extended version).IEEE/ACM Trans. Netw., 2(1):1–15,
1994.
167
[92] L. Li, D. Alderson, W. Willinger, and J. Doyle. A first-principles approach to un-
derstanding the Internet’s router-level topology. InProceedings of ACM Sigcomm,
pages 3–14, 2004.
[93] L. Li, D. Alderson, W. Willinger, and J. Doyle. Towards a theory of scale free graphs,
definition, properties and implications.Internet Mathematics, 2005. To appear.
[94] D. Lin and R. Morris. Dynamics of random early detection. InProceedings of the
ACM Sigcomm, pages 127–137, 1997.
[95] X. Lin and N. B. Shroff. Utility maximization for communication networks with
multi-path routing. Submitted to IEEE Transactions on Automatic Control, 2004.
http://min.ecn.purdue.edu/˜linx/ .
[96] S. H. Low. A duality model of TCP and queue management algorithms.IEEE/ACM
Transactions on Networking, 11(4):525–536, August 2003.http://netlab.
caltech.edu .
[97] S. H. Low and D. E. Lapsley. Optimization flow control I: basic algorithm and
convergence.IEEE/ACM Transactions on Networking, 7(6):861–874, 1999.
[98] S. H. Low, F. Paganini, and J. C. Doyle. Internet congestion control.IEEE Control
Systems Magazine, 22(1):28–43, Feb. 2002.
[99] S. H. Low, F. Paganini, J. Wang, S. Adlakha, and J. C. Doyle. Dynamics of
TCP/RED and a scalable control. InProceedings of IEEE INFOCOM, June 2002.
http://netlab.caltech.edu .
[100] S. H. Low, F. Paganini, J. Wang, and J. C. Doyle. Linear stability of TCP/RED
and a scalable control.Computer Networks Journal, 43(5):633–647, 2003.http:
//netlab.caltech.edu .
[101] S. H. Low, L. Peterson, and L. Wang. Understanding Vegas: a duality model.Journal
of ACM, 49(2):207–235, March 2002.http://netlab.caltech.edu .
168
[102] S. H. Low and R. Srikant. A mathematical framework for designing a low-loss,
low-delay internet.Networks and Spatial Economics, 4(1), 2004.
[103] S. H. Low and P. Varaiya. Dynamic behavior of a class of adaptive routing protocols
(IGRP). InProceedings of IEEE Infocom, pages 610–616, March 1993.
[104] D. G. Luenberger.Linear and Nonlinear Programming. Addison-Wesley Publishing
Company, second edition, 1984.
[105] H. Luo, S. Lu, V. Bharghavan, J. Cheng, and G. Zhong. A packet scheduling ap-
proach to qos support in multihop wireless networks.Mob. Netw. Appl., 9(3):193–
206, 2004.
[106] L. Massoulie. Stability of distributed congestion control with heterogenous feedback
delays.IEEE Transactions on Automatic Control, 47:895–902, 2002.
[107] L. Massoulie and J. Roberts. Bandwidth sharing: objectives and algorithms.
IEEE/ACM Transactions on Networking, 10(3):320–328, June 2002.
[108] M. Mathis, J. Mahdavi, S. Floyd, and A. Romanow. TCP selective acknowledgment
options. RFC 2018, Internet Engineering Task Force, October 1996.http://
www.ietf.org/rfc/rfc2018.txt .
[109] M. Mathis, J. Semke, J. Mahdavi, and T. Ott. The macroscopic behavior of the
TCP congestion avoidance algorithm.Computer Communication Review, 27(1),
July 1997.
[110] M. May, T. Bonald, and J.-C. Bolot. Analytic evaluation of RED performance. In
Proceedings of IEEE Infocom, 2000.
[111] P. McKenny. Stochastic fairness queueing. InProceedings of IEEE Infocom, pages
733–740, 1990.
[112] M. Mehyar, D. Spanos, and S. H. Low. Optimization flow control with estimation
error. InProceedings of IEEE Infocom, March 2004.
169
[113] J. Milnor. Topology from the Differentiable Viewpoint. The University Press of
Virginia, Charlottesville, 1972.
[114] V. Misra, W. Gong, and D. Towsley. Stochastic differential equation modeling and
analysis of tcp-windowsize behavior, 1999.
[115] V. Misra, W. Gong, and D. Towsley. Fluid-based analysis of a network of AQM
routers supporting TCP flows with an application to RED. InProceedings of ACM
Sigcomm, 2000.
[116] J. Mo and J. Walrand. Fair end-to-end window-based congestion control.IEEE/ACM
Transactions on Networking, 8(5):556–567, October 2000.
[117] R. T. Morris. Scalable TCP Congestion Control. PhD thesis, Harvard University,
1999.
[118] G. Motwani and K. Gopinath. Evaluation of advanced TCP stacks in the iSCSI envi-
ronment using simulation model. InProceedings. 22nd IEEE / 13th NASA Goddard
Conference on Mass Storage Systems and Technologies, pages 210–217, 2005.
[119] J. Moy. OSPF version 2. RFC 2328, Internet Engineering Task Force, April 1998.
http://www.ietf.org/rfc/rfc2328.txt .
[120] J. Murchland. Braess’s paradox of traffic flow.Transpn. Res., 4:391–394, 1970.
[121] M. Osborne and A. Rubinstein.A Course in Game Theory. The MIT Press, 1994.
[122] T. Ott, J. Kemperman, and M. Mathis. The stationary behavior of ideal TCP conges-
tion avoidance. ftp://ftp.bellcore.com/pub/tjo/TCPwindow.ps.
[123] T. J. Ott, T. V. Lakshman, and L. Wong. SRED: Stabilized RED. InProceedings of
IEEE Infocom, 1999.ftp://ftp.bellcore.com/pub/tjo/SRED.ps .
[124] J. Padhye, V. Firoiu, D. Towsley, and J. Kurose. Modeling TCP Reno performance:
A simple model and its empirical validation.IEEE/ACM Transactions on Network-
ing, 8(2):133–145, April 2000.
170
[125] F. Paganini, Z. Wang, J. C. Doyle, and S. H. Low. Congestion control for high
performance, stability, and fairness in general networks.IEEE/ACM Transactions
on Networking, 13(1):43–56, 2005.
[126] R. Pan, B. Prabhakar, and K. Psounis. CHOKe: A stateless AQM scheme for ap-
proximating fair bandwidth allocation. InProceedings of IEEE Infocom, March
2000.
[127] A. Papachristodoulou. Global stability analysis of a TCP/AQM protocol for arbitrary
networks with delay. InProceedings of IEEE Confernece on Decision and Control,
December 2004.
[128] A. Papachristodoulou, L. Li, and J. C. Doyle. Methodological frameworks for large-
scale network analysis and design.Computer Communication Review, 34(3), 2004.
[129] L. L. Peterson and B. S. Davie.Computer Networks: A Systems Approach, chapter
5.2, pages 383–389. Morgan Kaufmann Publishers, Inc., second edition, 2000.
[130] S. Plotkin, D. Shmoys, and E. Tardos. Fast approximation algorithms for fractional
packing and covering problems.Math of Oper. Research, pages 257–301, 1995.
[131] K. Ramakrishnan, S. Floyd, and D. Black. The addition of explicit congestion noti-
fication (ECN) to IP. RFC 3168, Internet Engineering Task Force, September 2001.
http://www.ietf.org/rfc/rfc3168.txt .
[132] L. Rizzo. IP dummynet.http://http://info.iet.unipi.it/˜luigi/
ip_dummynet/ .
[133] S. Ryu, C. Rump, and C. Qiao. Advances in active queue management(AQM) based
TCP congestion control.Telecommunication System, 25(3):317–351, 2004.
[134] S. Shakkottai, R. Srikant, and S. P. Meyn. Bounds on the throughput of congestion
controllers in the presence of feedback delay.IEEE/ACM Trans. Netw., 11(6):972–
981, 2003.
171
[135] J. L. Sobrinho. Network routing with path vector protocols: theory and applications.
In Proceedings of ACM Sigcomm, pages 49–60, New York, NY, USA, 2003. ACM
Press.
[136] B. C. W. Solomon and I. Ziedins. Braess’s paradox in a queuing network with state
depending routing.Journal of Applied Probability, 34:134–154, 1997.
[137] Sprint ATL. IP monitoring project, http://ipmon.sprint.com/
packstat/packetoverview.php .
[138] R. Srikant.The Mathematics of Internet Congestion Control. Birkhauser, 2004.
[139] R. Srinivasan and A. Somani. On achieving fairness and efficiency in high-speed
shared medium access.IEEE/ACM Transactions on Networking, 11(1), February
2003.
[140] W. Stevens. TCP slow start, congestion avoidance, fast retransmit, and fast recovery.
RFC 2001, Internet Engineering Task Force, January 1997.http://www.ietf.
org/rfc/rfc2001.txt .
[141] I. Stoica, S. Shenker, and H. Zhang. Core-stateless fair queueing: Achieving ap-
proximately fair bandwidth allocations in high speed networks. InProceedings of
ACM Sigcomm, pages 118–130, 1998.
[142] A. Tang, J. Wang, S. Hedge, and S. H. Low. Equilibrium and fairness of networks
shared by TCP Reno and Vegas/FAST.To appear in Telecommunication Systems,
2005.
[143] A. Tang, J. Wang, and S. H. Low. Understanding CHOKe. InProceedings of IEEE
Infocom, April 2003. http://netlab.caltech.edu .
[144] A. Tang, J. Wang, and S. H. Low. Is fair allocation always inefficient. InProceedings
of IEEE Infocom, 2004.
[145] A. Tang, J. Wang, and S. H. Low. Understanding choke: Throughput and spatial
characteristics.IEEE/ACM Trans. Netw., 12(4):694–707, 2004.
172
[146] A. Tang, J. Wang, and S. H. Low. Counter-intuitive throughput behaviors in networks
under end-to-end control.To appear in IEEE/ACM Transactions on Networking,
June 2006.
[147] A. Tang, J. Wang, S. H. Low, and M. Chiang. Network equilibrium of heterogeneous
congestion control protocols. InProceedings of IEEE Infocom, Miami, FL, March
2005.
[148] P. Tinnakornsrisuphap and A. M. Makowski. Limit behavior of ECN/RED gateways
under a large number of TCP flows. InProceedings of IEEE Infocom, pages 873–
883, April 2003.
[149] H. Varian. A third remark on the number of equilibria of an economy.Econometrica,
43:985–986, 1975.
[150] H. R. Varian. Microeconomic analysis. W. W. Norton & Company, third edition,
1992.
[151] G. Vinnicombe. On the stability of end-to-end congestion control for the Inter-
net. CUED/F-INFENG/TR.398, December 2000.http://www-control.
eng.cam.ac.uk/gv/internet/ .
[152] G. Vinnicombe. On the stability of networks operating TCP-like protocols. In
Proceedings of IFAC, 2002.http://netlab.caltech.edu/pub/papers/
gv_ifac.pdf .
[153] J. Wang, L. Li, S. H. Low, and J. C. Doyle. Can TCP and shortest-path rout-
ing maximize utility? InProceedings of IEEE Infocom, April 2003. http:
//netlab.caltech.edu .
[154] J. Wang, L. Li, S. H. Low, and J. C. Doyle. Cross-layer optimization in TCP/IP
networks. IEEE/ACM Transactions on Networking, 2005. http://netlab.
caltech.edu .
173
[155] J. Wang, A. Tang, and S. H. Low. Maximum and asymptotic UDP throughput under
CHOKe. InProceedings of ACM Sigmetrics, pages 82–90, 2003.
[156] J. Wang, A. Tang, and S. H. Low. Local stability of FAST TCP. InProceedings of
IEEE Conference on Decision and Control, December 2004.
[157] J. Wang, D. X. Wei, and S. H. Low. Modeling and stability of FAST TCP. In
Proceedings of IEEE Infocom, Miami, FL, March 2005.
[158] Z. Wang and F. Paganini. Global stability with time delay in network congestion
control. InProceedings of IEEE CDC, December 2002.
[159] B. M. Waxman. Routing of multipoint connections.IEEE Journal on Selected Areas
in Communications, 6(9):1617–1622, 1988.
[160] D. X. Wei. Congestion control algorithms for high speed long distance tcp.
Master’s Thesis, Caltech, 2004http://www.cs.caltech.edu/˜weixl/
research/msthesis.ps .
[161] J. T. Wen and M. Arak. A unifying passivity framework for network flow control.
In Proceedings of IEEE Infocom, 2003.
[162] W. Willinger, M. S. Taqqu, R. Sherman, and D. V. Wilson. Self-similarity through
high-variability: statistical analysis of ethernet lan traffic at the source level.
IEEE/ACM Trans. Netw., 5(1):71–86, 1997.
[163] L. Xu, K. Harfoush, and I. Rhee. Binary increase congestion control for fast long-
distance networks. InProceedings of IEEE Infocom, March 2004.
[164] H. Yaiche, R. R. Mazumdar, and C. Rosenberg. A game theoretic framework for
bandwidth allocation and pricing in broadband networks.IEEE/ACM Transactions
on Networking, 8(5):667–678, 2000.
[165] R. H. Zakon. Hobbes’ Internet timeline, v8.0, 2005, link:
http://www.zakon.org/robert/internet/timeline/.
174
[166] H. Zhang, L. Baohong, and W. Dou. Design of a robust active queue management
algorithm based on feedback compensation. InProceedings of ACM Sigcomm, pages
277–285, 2003.
[167] X. Zhu, J. Yu, and J. C. Doyle. Heavy tails, generalized coding and optimal web
layout. InProceedings of IEEE Infocom, 2001.