congestion control - mitweb.mit.edu/6.976/www/notes/notes3.pdf · congestion control is essentially...

Congestion Control

• Topics

◦ Congestion control

− what & why?

◦ Current congestion control algorithms

− TCP and UDP

◦ Ideal congestion control

− Resource allocation

− Distributed algorithms

◦ Relation

− current algorithms and resource allocation

◦ Broad implications

6.976/ESD.937 1

Congestion Control: What and Why?

• Internet is used by many independent users

◦ Resources (link capacity) are finite

◦ If every user sends data at very high rate, it will cause

“congestion”

− packets will be dropped, i.e. unreliable transmission

− seemingly high utilization of resource may be actually very

low!

◦ If every user sends data at very low rate, resource will not be

well-utilized

→ Users need to send data at the “correct” rate so that

◦ Resources are well-utilized

◦ Users get “reliable” data transfer

◦ Resources should be shared “fairly”

→ This is the primary goal of congestion control


Congestion Control

• Main question:

◦ How to implement congestion control in the network

◦ With the help of only distributed protocols at the users

− is it even possible?

− if not, what is the best possible thing to do?

• An example: a sender and a receiver connected by a single link

◦ The capacity of the link is unknown to both the sender and the receiver


Congestion Control

• If senders get acknowledgment of receipt of packets

◦ They know if packets are dropped or not

• Based on this, senders can infer the following

◦ If packets are not dropped

− sender is sending at rate lower than the capacity

◦ If packets are dropped

− sender is sending at rate higher than capacity

→ Use drops of packets as “signal of congestion”

→ Change rate as a reaction to packet drop so as to achieve

“fair-share”

• Congestion control is essentially “feedback-control”


Congestion Control Mechanism

• To perform congestion control, we need two basic protocols

◦ Algorithm I: Drop detection

− ask receiver to acknowledge receipt of packets

− if a sent packet is not acknowledged fast enough the packet

is assumed to be dropped

◦ Algorithm II: Rate-control

− if packets are not dropped, there is no congestion, so increase the rate by a certain amount

− otherwise there is congestion, so decrease the rate by a certain amount

• Algorithms I and II are key ideas behind current congestion control

protocols
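The two algorithms above can be sketched as a small feedback loop. This is a minimal illustration, not the rule any particular protocol uses: the function name `adapt_rate`, the additive step `increase` and the multiplicative `decrease_factor` are assumptions chosen for the example, and "drop detection" is reduced to comparing the rate with a capacity the sender never sees directly.

```python
def adapt_rate(capacity, rate=1.0, increase=1.0, decrease_factor=0.5, rounds=50):
    """Probe an unknown link capacity using only drop signals (illustrative)."""
    history = []
    for _ in range(rounds):
        # Algorithm I (drop detection): the sender only learns whether a drop occurred.
        dropped = rate > capacity
        # Algorithm II (rate control): back off on a drop, otherwise probe upward.
        if dropped:
            rate *= decrease_factor
        else:
            rate += increase
        history.append(rate)
    return history


history = adapt_rate(capacity=10.0)
```

After the initial ramp-up, the rate in this sketch oscillates just around the capacity without ever being told its value.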


Congestion Control

• Recall, Internet has “layered” architecture

◦ Congestion control is essentially required for

− “reliable” transmission at “fair-rate” with “high-resource

utilization”

→ Implemented in the “Transport Layer”

• The congestion control protocol is also called “transport protocol”

◦ Transmission Control Protocol (TCP) is a popular protocol

− provides reliable transmission

− when all users exhibit “good-citizen” behavior

− but has higher delay (cost of reliability)

◦ User Datagram Protocol (UDP) is another protocol

− unreliable, and lets users behave selfishly (no congestion control)

− but has lower delay

− useful for media communication


TCP

• Brief History

◦ TCP (some version) has been around since ARPANET

◦ Older version(s) required users to send “probes” to the network to

detect the level of congestion

→ led network to always operate under congestion

→ often led to “congestion collapse”

• Popular well-documented example

◦ Congestion collapse in October 1986

◦ Throughput between Lawrence Berkeley Lab and UC Berkeley dropped from

32 kbps to 40 bps!


TCP

• Post-mortem of “congestion collapse”

◦ Older TCP sent more and more packets without confirming their

delivery

− violation of ‘packet conservation’ principle was key reason

for failure

◦ Did not account for inhomogeneity in network bandwidth

◦ Lack of rate-control

• Current TCP

◦ Packet conservation: inject new packet when old packet wave has

reached destination

◦ Slow-start: search for capacity starting from zero

◦ Rate-control: control rate via packet drop feedback, and be “good

user”


TCP: Current Algorithm

• Algorithm has two components

◦ Sender algorithm and receiver algorithm

• Sender algorithm:

◦ Parameters:

− STHRESH: threshold for slow-start

− Wmax: maximum window-size

− CW: current window size

◦ Initially: number the data to be sent 1, 2, . . . , N .



• Slow-start:

◦ Set STHRESH = Wmax/2

◦ Start with CW= 1, packet sent P= 0

◦ Each time, send packets P+1 through P+CW

◦ When packet P+1 is acknowledged, set P ← P+1

◦ For every acknowledgment, increase CW to CW+1

− as long as CW < STHRESH and no packet loss is detected

− if packet loss, then set STHRESH ← STHRESH/2 and CW ← CW/2

◦ Continue above until CW=STHRESH

− go to “congestion-avoidance” phase

◦ If P=N ever, transmission is over!



• Congestion-avoidance

◦ For acknowledgement of P+1,

− $CW \leftarrow CW + \frac{1}{\lfloor CW \rfloor}$

◦ If packet-loss then CW is decreased

− different versions have different ways of dealing with the

window size decrease

− one version:

set STHRESH ← CW/2

CW ← 1

go to slow-start phase
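The sender's window rules can be written as two small update functions. This is a sketch under stated assumptions: the names `on_ack` and `on_loss` are hypothetical, the window is kept as a float with `int(cw)` standing in for the integer part, and only the one loss-handling version listed on the slide is modelled for congestion avoidance.

```python
def on_ack(cw, sthresh):
    """Window after one acknowledgment."""
    if cw < sthresh:
        return cw + 1.0          # slow-start: +1 per ACK
    return cw + 1.0 / int(cw)    # congestion avoidance: +1/[CW] per ACK


def on_loss(cw, sthresh, in_slow_start):
    """Return (new_cw, new_sthresh) after a detected loss."""
    if in_slow_start:
        # slow-start rule from the slide: halve both threshold and window
        return cw / 2.0, sthresh / 2.0
    # one congestion-avoidance version: remember half the window, restart from 1
    return 1.0, cw / 2.0


# Example: grow from CW = 1 with STHRESH = 8, one ACK at a time.
cw, sthresh = 1.0, 8.0
trace = []
for _ in range(10):
    cw = on_ack(cw, sthresh)
    trace.append(cw)
```

In the example trace the window climbs by one per acknowledgment until it reaches the threshold, and by 1/8 per acknowledgment afterwards.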



• Receiver algorithm:

◦ For every received packet, send acknowledgment

◦ With request for the next packet # required

− if packets 1-10 and 13 are received, then send request for

packet number 11 (and not 14)

• Packet-loss detection:

◦ Essentially, packet is dropped when acknowledgment is not

received

◦ Specific protocol

− if ack. not received within TIME-OUT, or

− 3 consecutive requests sent for the same packet # by

receiver
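The receiver's cumulative request and the repeated-request loss signal can be sketched as follows. The function names are hypothetical, packets are assumed to be numbered from 1 as in the sender algorithm, and the time-out branch is left out because it needs a clock.

```python
def next_expected(received):
    """Smallest packet number not yet received: the number the receiver requests."""
    n = 1
    while n in received:
        n += 1
    return n


def loss_detected(requests, threshold=3):
    """True once the same packet number is requested `threshold` times in a row."""
    run = 1
    for prev, cur in zip(requests, requests[1:]):
        run = run + 1 if cur == prev else 1
        if run >= threshold:
            return True
    return False


# Slide example: packets 1-10 and 13 have arrived, so packet 11 is requested.
request = next_expected(set(range(1, 11)) | {13})
```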


TCP: An Example

• Suppose, CW = 10 at some-time

[Figure: sender/receiver timeline with time marks T = 0, 4, 6, 7, 14, 20. The sender transmits packets 1-10, then 11-21, then 14-25; the receiver returns ACK 1, ACK 10 and ACK 11-13; a TIME-OUT is marked; the window grows from CW = 10 to CW = 11 in per-ACK steps of 1/10.]


TCP: Performance

• The main performance metric

◦ Throughput of TCP

◦ That is, what is the net “equilibrium” rate of users

− as a fraction of the total capacity

• Evaluation of TCP throughput

◦ Model: TCP dynamics

− effect of slow-start vs. congestion-avoidance

◦ Characterization of equilibrium of TCP

• First, we’ll consider a model for TCP


TCP: Performance

• We’ll consider a very simple situation

◦ single link and single user

◦ ignore extra complications

• Given this model, we will identify

◦ Key system dynamics affecting its performance

• Specifically, we’ll try to compare effect of

◦ Slow-start vs. Congestion avoidance

◦ And, use this insight to evaluate performance precisely


TCP: Performance

• Model:

◦ Single user accessing a link of capacity c

◦ T be the round-trip time

◦ B be the buffer size at the link

◦ Maximum window size $W_{max} = cT + B$

→ STHRESH $= W_{max}/2 = \frac{cT + B}{2}$

◦ User wants to transmit infinite data

• Goal: Evaluate average rate of transmission


TCP: Performance

• TCP dynamics

◦ Periodic between slow-start and congestion avoidance

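[Timeline: slow-start on $[0, T_{ss}]$, congestion avoidance on $[T_{ss}, T_{ss} + T_{ca}]$, then slow-start again, and so on; $N_{ss}$ packets are sent during slow-start and $N_{ca}$ during congestion avoidance.]

$$\text{Rate} = \frac{N_{ss} + N_{ca}}{T_{ss} + T_{ca}}$$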

• Next, we evaluate

◦ Nss, Nca, Tss and Tca

◦ To obtain rate of TCP


TCP: Performance

• We first study the Slow-start phase

◦ Specifically, evolution of window size and queue-size

• Slow-start phase has cyclic behavior

◦ Divide time into cycles where window size doubles in each cycle:

− t = 0: first packet sent; W(0)= 1

− t = T: packet ACK; W(T)= 2

− t = 2T: both packets ACK; W(2T)= 3

...

→ each mini-cycle has length =T

◦ Window-size evolution: $W(nT) \approx 2^{n-1} + 1$


TCP: Performance

• Next, we study queue-length evolution

◦ In slow-start phase using window evolution

• Again, queue-size has the same cyclic behavior

◦ Queue-size at the beginning of cycle ∼ 0

◦ During the nth cycle, $2^{n-2}$ packets are sent

− total packets sent (until the nth cycle)

$$2^{n-2} + 2^{n-3} + \cdots + 2^0 = 2^{n-1} - 1$$

− total packets acknowledged by the end of the nth cycle: $\approx 2^{n-2}$

→ Max-queue length in the nth cycle,

$$Q(nT) \sim 2^{n-1} - 1 - 2^{n-2} \sim 2^{n-2} \sim W(nT)/2.$$


TCP: Performance

• In slow-start phase, buffer-overflow happens if

◦ Q is larger than B

− But window can be at most STHRESH

◦ Now, $Q = \frac{W}{2}$

− which is at most $\frac{\text{STHRESH}}{2} = \frac{B + cT}{4}$

◦ Hence, for overflow, $\frac{B + cT}{4} > B$

− That is, $cT > 3B \Leftrightarrow B < cT/3$

◦ Equivalently, no overflow when $B \ge cT/3$


TCP: Performance

• First, we consider no overflow situation

◦ That is, B≥ cT/3

• When there is no overflow, we have the following

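◦ $T_{ss}$: time for slow-start

$$W(T_{ss}) \sim \frac{B + cT}{2} \;\Leftrightarrow\; 2^{T_{ss}/T} \sim \frac{B + cT}{2} \;\Leftrightarrow\; T_{ss} = T \log_2\left(\frac{B + cT}{2}\right)$$

◦ $N_{ss}$: # of packets transmitted

= # of packets acknowledged = window-size $= \frac{B + cT}{2}$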


TCP: Performance

• Now, if overflow happens in slow-start, i.e. $B < cT/3$

◦ There are two slow-start phases:

◦ Phase 1: at the overflow

− $Q = B = W/2$; but detection happens after time T, i.e., the window has doubled

→ $W = 2B$ at $T_{ss1}$ (time to overflow)

→ $2^{T_{ss1}/T} \approx 2B \Leftrightarrow T_{ss1} = T \log_2 (2B)$

− $N_{ss1} = 2B$

− Set STHRESH $= \frac{2B}{2} = B$


TCP: Performance

◦ Phase 2: after overflow

− Now, the overflow does not happen as STHRESH is low enough!

− W = B at the end of this phase

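→ $2^{T_{ss2}/T} \sim B \;\Rightarrow\; T_{ss2} \sim T \log_2 B$

− $N_{ss2} = B$

◦ In summary, when overflow happens

$$T_{ss} = T_{ss1} + T_{ss2} = 2T \log_2 B + T, \qquad N_{ss} = 3B$$

◦ And when overflow does not happen

$$T_{ss} = T \log_2\left(\frac{B + cT}{2}\right), \qquad N_{ss} = \frac{B + cT}{2}$$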
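The slow-start summary can be evaluated numerically. This sketch simply codes the slide's approximations; the function name `slow_start_stats` is hypothetical, and B, c and T are assumed to be in consistent units (packets, packets per second, seconds).

```python
import math


def slow_start_stats(B, c, T):
    """Return (Tss, Nss) using the slide's approximations."""
    if B >= c * T / 3:
        # No buffer overflow: the window grows up to STHRESH = Wmax/2.
        w = (B + c * T) / 2
        return T * math.log2(w), w
    # Overflow: phase 1 ends with W = 2B, phase 2 ends with W = B.
    t1 = T * math.log2(2 * B)
    t2 = T * math.log2(B)
    return t1 + t2, 3 * B


# Overflow example: B = 8 < cT/3 with c = 100, T = 1.
tss, nss = slow_start_stats(B=8, c=100, T=1)
```

For this overflow example, $T_{ss1} + T_{ss2} = T\log_2 16 + T\log_2 8 = 7T$, which matches $2T\log_2 B + T$.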


TCP: Performance

• Now, we study Congestion-avoidance phase

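◦ Window size at the start of the phase

$$W_0 = \begin{cases} W_{max}/2 = \frac{cT + B}{2} & \text{if } B \ge \frac{cT}{3} \\ B & \text{if } B < \frac{cT}{3} \end{cases}$$

◦ Next, we study the evolution of the window size $W(t)$ with time t

− Let $a(t)$ = # of acknowledgments received till time t

− Then, the change in window size is

$$\frac{dW}{dt} = \frac{dW}{da} \cdot \frac{da}{dt}$$

− Rate of change of a:

$$\frac{da}{dt} = \begin{cases} c & \text{if W is large (server is busy)} \\ W/T & \text{if W is smaller} \end{cases}$$

− That is, $\frac{da}{dt} = \min\{W/T, c\}$

− Also: $\frac{dW}{da} = \frac{1}{W}$ (by definition of the algorithm)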


TCP: Performance

• Hence,

$$\frac{dW}{dt} = \begin{cases} 1/T & \text{if } W < cT \\ c/W & \text{if } W \ge cT \end{cases}$$

◦ Congestion avoidance will end when W = Wmax

• Two-phases

◦ Phase 1: W0 ≤ W < cT

◦ Phase 2: cT ≤ W ≤ Wmax

• First, we consider Phase 1


TCP: Performance

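• Phase 1: $W_0 \le W < cT$

◦ $T_{ca1} = T(cT - W_0)$, since $\frac{dW}{dt} = 1/T$

◦ $N_{ca1} = a(T_{ca1}) = \int_0^{T_{ca1}} da(t)$, where

$$\begin{aligned} \int_0^{T_{ca1}} da(t) &= \int_0^{T_{ca1}} \frac{da(t)}{dt}\,dt \\ &= \int_0^{T_{ca1}} \frac{W(t)}{T}\,dt \\ &= \int_0^{T_{ca1}} \frac{W_0 + t/T}{T}\,dt \\ &= \frac{W_0\,T_{ca1}}{T} + \frac{T_{ca1}^2}{2T^2} \\ &= (cT - W_0)\left[W_0 + \frac{cT - W_0}{2}\right] \end{aligned}$$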


TCP: Performance

• Phase 2: $W \ge cT$

◦ $\frac{dW}{dt} = \frac{c}{W(t)} \Leftrightarrow W(t)\,dW = c\,dt$

$\Leftrightarrow W^2(t) - W^2(0) = 2ct$ (where $W(0) = cT$)

$\Leftrightarrow W^2(t) = 2ct + c^2T^2$

◦ Phase ends when $W(T_{ca2}) = W_{max}$

$\Rightarrow W_{max}^2 = 2cT_{ca2} + c^2T^2$

$\Rightarrow T_{ca2} = \frac{W_{max}^2 - c^2T^2}{2c}$

◦ $N_{ca2} = cT_{ca2}$

− because the node is running at capacity c


TCP: Performance

• In summary, in congestion avoidance phase

◦ $T_{ca1} = T(cT - W_0)$,

◦ $N_{ca1} = (cT - W_0)\left[W_0 + \frac{cT - W_0}{2}\right]$,

◦ $T_{ca2} = \frac{W_{max}^2 - c^2T^2}{2c}$, and

◦ $N_{ca2} = cT_{ca2}$

• Let us compare the contributions of the slow-start and congestion-avoidance phases

◦ when $B = cT$, for large c

• We’ll find that congestion avoidance dominates

→ Only concentrate on TCP-dynamics for congestion-avoidance

→ We’ll use this insight to find throughput for simple model
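The comparison can be checked numerically. The sketch below evaluates the congestion-avoidance durations and packet counts by integrating the window dynamics directly; the function name is hypothetical, and the handling of a start window above cT (phase 2 starting from W0 instead of cT) is an added generalization, not something stated on the slides.

```python
import math


def congestion_avoidance_stats(W0, c, T, B):
    """Return (Tca, Nca) for one congestion-avoidance phase."""
    w_max = c * T + B
    t1 = n1 = 0.0
    if W0 < c * T:
        # Phase 1: dW/dt = 1/T until W reaches cT.
        t1 = T * (c * T - W0)
        # Packets sent = integral of W(t)/T dt with W(t) = W0 + t/T.
        n1 = W0 * t1 / T + t1 ** 2 / (2 * T ** 2)
        W0 = c * T
    # Phase 2: dW/dt = c/W, so W^2 grows linearly until W = Wmax.
    t2 = (w_max ** 2 - W0 ** 2) / (2 * c)
    n2 = c * t2  # link runs at capacity c
    return t1 + t2, n1 + n2


# Case B = cT with c = 100, T = 1: no overflow, W0 = (cT + B)/2 = cT.
c, T, B = 100.0, 1.0, 100.0
t_ss, n_ss = T * math.log2((B + c * T) / 2), (B + c * T) / 2
t_ca, n_ca = congestion_avoidance_stats((B + c * T) / 2, c, T, B)
rate = (n_ss + n_ca) / (t_ss + t_ca)
```

In this example slow-start contributes about 100 packets in under 7 seconds, while congestion avoidance contributes 15000 packets in 150 seconds, so the average rate is close to c.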


TCP: Performance

• We consider only congestion-avoidance dynamics

◦ Single-link

◦ Many-sources

• Modeling

◦ $T_r$: RTT of user r

◦ $W_r(t)$: window size of user r at time t

◦ $q_r(t)$: fraction of packets lost at time t for user r

◦ $x_r(t) = \frac{W_r(t)}{T_r}$: rate of user r at time t

◦ $a_r(t)$: acknowledgments received until time t for user r

• Dynamics

◦ Acknowledgment: increase window by $\frac{1}{W_r(t)}$

◦ Drop: decrease window by $\beta W_r(t)$


TCP: Performance

• To study throughput of TCP

◦ We need to study evolution of Wr(t)

• Let a fraction $q_r(t)$ of packets be dropped at time t for user r

◦ Then, the rate at which packets are dropped at time t for user r is

$$x_r(t - T_r)\, q_r(t)$$

◦ Acknowledgment rate at t: $x_r(t - T_r)\,(1 - q_r(t))$

• Then, $W_r(t)$ changes as follows:

$$\frac{dW_r(t)}{dt} = (1 - q_r(t)) \left[\frac{1}{W_r(t)} \cdot x_r(t - T_r)\right] - q_r(t) \left[\beta W_r(t) \cdot x_r(t - T_r)\right]$$

• Since $x_r(t) = W_r(t)/T_r$,

$$\frac{dx_r(t)}{dt} = \frac{(1 - q_r(t))\, x_r(t - T_r)}{T_r^2\, x_r(t)} - \beta\, q_r(t)\, x_r(t)\, x_r(t - T_r)$$


TCP: Performance

• To obtain long-term effective throughput

◦ We evaluate the equilibrium point:

$$\frac{dx_r(t)}{dt} = 0$$

◦ Ignoring the “delay $T_r$” in the equation

− usually, the delay cannot be ignored

− we’ll study when this is justified

• This gives us the following:

◦ In equilibrium

$$0 = \frac{1 - q_r}{T_r^2} - \beta x_r^2 q_r$$

◦ That is,

$$x_r = \frac{1}{T_r} \sqrt{\frac{1 - q_r}{\beta q_r}}$$
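The equilibrium rate can be compared against a direct simulation of the delay-free dynamics. This is a sketch only: the function names, the forward-Euler step size and the parameter values (T = 0.1, q = 0.01, β = 0.5) are assumptions picked for illustration, with q held constant.

```python
import math


def equilibrium_rate(T, q, beta):
    """Closed-form equilibrium x = (1/T) * sqrt((1 - q) / (beta * q))."""
    return math.sqrt((1.0 - q) / (beta * q)) / T


def simulate_rate(T, q, beta, x0=1.0, dt=1e-3, steps=50_000):
    """Forward-Euler integration of dx/dt = (1 - q)/T^2 - beta*q*x^2 (delay ignored)."""
    x = x0
    for _ in range(steps):
        x += dt * ((1.0 - q) / T ** 2 - beta * q * x ** 2)
    return x


x_eq = equilibrium_rate(T=0.1, q=0.01, beta=0.5)
x_sim = simulate_rate(T=0.1, q=0.01, beta=0.5)
```

Halving T doubles `equilibrium_rate`, and for small q quadrupling q roughly halves it.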


TCP: Performance

• The analysis has following main message

◦ Throughput is

− mainly affected by congestion avoidance phase, and

− not by slow-start

◦ Qualitatively, throughput is

− inversely proportional to RTT (Tr)

− inversely proportional to the square root of the drop probability (for small $q_r$)


TCP: Performance

• Questions:

◦ We assumed “deterministic” dynamics

− how valid is it?

− The Law-of-Large-Numbers or Fluid-models provide

justification

◦ We ignored the effect of other users

− implicit in qr( · )

◦ What happens when many links?

− naturally, hard to quantify exact relation

− however, it’s useful to ask the following basic question:

− what do we really want to achieve from TCP

− and has it achieved that?

◦ next, we address this basic question


Resource Allocation

• Suppose a piece of cake is to be shared between two people in a fair

manner:

◦ How should we do it?

◦ Well, divide it into half each, assuming both care for size only.

◦ What if one cares only for “cherry” but other cares about actual

“cake”?

→ Division scheme should care about utility of “cake” to the people.

→ The well-known literature on cake-cutting algorithms

− has inspired a lot of interesting work in Game Theory and

Algorithms


Resource Allocation

• Consider a single link with capacity C

◦ R users want to use it and let fr, 1 ≤ r ≤ R be rate at which user

r wants to send data

◦ If f1 + · · · + fR ≤ C

− allocate demanded rate to each user

◦ But, if f1 + · · · + fR > C, then

− we need an allocation mechanism to allocate rates

$x_1, \ldots, x_R$ to all users such that

$$0 \le x_r \le f_r, \quad 1 \le r \le R; \qquad \sum_{r=1}^{R} x_r \le C$$

− and, allocation should maximize the overall system utility


Resource Allocation

• Let ur(xr) be utility of rate xr to user r

◦ Then, allocation $(x_r)$ should be the solution of

$$\max \sum_{r=1}^{R} u_r(y_r) \quad \text{subject to} \quad 0 \le y_r \le f_r,\; 1 \le r \le R; \quad \sum_{r=1}^{R} y_r \le C$$

◦ Now, a “bad” user can have ur very high

→ can’t rely on users utility for fair allocation

◦ Question:

− what should be ur( · ) so that allocation is “fair”?


Fair Allocation

• Let’s consider a simple example

◦ C = 10 ; f1 = 4, f2 = 20.

◦ x1 = 4 ; x2 = 6 “makes sense” because allocation is as much equal

as possible

→ Max-min fair: $(x_r)$ is max-min fair iff

◦ For any $(r, r')$: $x_r > x_{r'}$ only if $x_{r'} = f_{r'}$

• In the above example,

◦ $x_1 = 3$; $x_2 = 7$ is not max-min fair because $x_2 > x_1$ but $x_1 \ne f_1$.

• Philosophy of max-min fair:

◦ At the fair allocation, the only way to become “richer” is to make a poor user “poorer.”
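The pairwise condition in the definition is easy to test in code. This sketch checks only that condition; it does not verify feasibility or that the link capacity is fully used, and the function name and tolerance are assumptions.

```python
def is_max_min_fair(x, f, tol=1e-9):
    """Pairwise test: x[r] > x[s] is allowed only if x[s] equals its demand f[s]."""
    for r in range(len(x)):
        for s in range(len(x)):
            if x[r] > x[s] + tol and abs(x[s] - f[s]) > tol:
                return False
    return True


# Example from the slide: C = 10, f1 = 4, f2 = 20.
fair = is_max_min_fair([4, 6], [4, 20])
unfair = is_max_min_fair([3, 7], [4, 20])
```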


Max-Min Fair Allocation

• Network resource allocation

◦ Consider network with L links with capacities C1, . . . , CL

◦ Sources 1, . . . , R with demands $f_1, \ldots, f_R$. Let S denote the set of all

sources.

◦ Source r sends data from node sr to dr using some links

(according to routing algorithm)

◦ Rates (xr) are feasible if data transmission, according to the rates,

satisfies link capacities constraints

− that is, no link is over-subscribed

• Question:

◦ What about fair allocation?

→ Similar definition generalizes.



• Definition (max-min fair):

An allocation $(x_r)$ is max-min fair for a given capacitated network if and only if

(1) $(x_r)$ is feasible

(2) for any other feasible $(y_r)$, if $y_s > x_s$ for some source s, then

there exists a source p s.t. $x_p \le x_s$ and $y_p < x_p$

• Question:

◦ How to find such an allocation?

◦ Next, we will see a simple centralized algorithm.

◦ Later we will consider distributed algorithms and their relationship

to current TCP!



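(1) Let $n_\ell$ be the number of sources in S that use link ℓ. For each ℓ s.t. $c_\ell \ne 0$, define the fair share $g_\ell$ as:

$$g_\ell = \frac{c_\ell}{n_\ell}$$

(2) Define $z_r = \min_{\ell \in r} g_\ell$, which is the min rate over links that source r is using.

(3) Define $z_{min} = \min_r z_r$.

(4) Let $R = \{r : z_r = z_{min}\}$

(5) For all $r \in R$, the max-min rate is $x_r = \min(z_r, f_r)$

(6) Set $S \leftarrow S \setminus R$;

$$c_\ell \leftarrow c_\ell - \sum_{r \in R:\, \ell \in r} x_r; \qquad n_\ell \leftarrow n_\ell - \sum_{r \in R} \mathbf{1}_{\{\ell \in r\}}$$

(7) Repeat (1)–(6) until S is empty.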
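Steps (1) to (7) can be sketched in Python as follows. The data layout (dictionaries keyed by link and source names) and the function name are assumptions made for illustration. The sketch computes fair shares only on links that still carry an active source, and its example uses infinite demands; with binding demands the literal steps may leave some capacity unused, which this sketch does not try to redistribute.

```python
import math


def max_min_rates(capacity, routes, demand=None):
    """capacity: {link: c}; routes: {source: set of links}; demand: {source: f}."""
    cap = dict(capacity)
    active = set(routes)
    demand = demand or {}
    x = {}
    while active:
        # (1) fair share on every link still used by an active source
        n = {l: sum(1 for r in active if l in routes[r]) for l in cap}
        g = {l: cap[l] / n[l] for l in cap if n[l] > 0}
        # (2)-(3) bottleneck share of each source and the smallest such share
        z = {r: min(g[l] for l in routes[r]) for r in active}
        z_min = min(z.values())
        # (4)-(5) fix the rates of the most constrained sources
        fixed = {r for r in active if z[r] == z_min}
        for r in fixed:
            x[r] = min(z[r], demand.get(r, math.inf))
        # (6) remove them and subtract the capacity they consume
        for r in fixed:
            for l in routes[r]:
                cap[l] -= x[r]
        active -= fixed
    return x


# Two links: source 0 crosses both, source 1 uses only "a", source 2 only "b".
rates = max_min_rates({"a": 10.0, "b": 4.0}, {0: {"a", "b"}, 1: {"a"}, 2: {"b"}})
```

In the example, link "b" is the first bottleneck, so sources 0 and 2 are fixed at rate 2 in the first pass and source 1 then receives the remaining capacity of link "a".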



• Algorithm

◦ Not distributed

→ Not implementable

• Fairness

◦ There can be other “fairness” criteria

• Next, we will see

◦ A range of fairness criteria

− max-min fair as one of them

◦ Study distributed algorithm for allocation based on them.


α-Fair Allocation

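• Let utility of user r be

$$u_r(x_r) = \begin{cases} w_r \dfrac{x_r^{1-\alpha_r}}{1 - \alpha_r} & \alpha_r > 0,\ \alpha_r \ne 1 \\ w_r \log x_r & \alpha_r = 1 \end{cases}$$

• Also, $f_r = \infty$, i.e. everyone wants maximal rate.

• Fair allocation is the solution to the optimization problem

$$\max_{(x_r)} \sum_r u_r(x_r) \quad \text{subject to} \quad \sum_{r:\, \ell \in r} x_r \le c_\ell \;\; \forall \ell \in L; \qquad x_r \ge 0$$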

• Next, we consider special examples of the above class of fair allocation.


Examples

(I) Minimum delay fair

◦ $\alpha_r = 2$ ; ∀r

◦ $u_r(x_r) = w_r \frac{x_r^{1-\alpha_r}}{1 - \alpha_r} = -\frac{w_r}{x_r}$

(II) Proportional fair

◦ $\alpha_r = 1$ ; ∀r

◦ $u_r(x_r) = w_r \log x_r$

(III) Max-min fair

◦ $\alpha_r = \alpha$ ; ∀r ; $\alpha \to \infty$

◦ An example
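A single shared link gives a small worked illustration of the three cases. With a common α, the first-order condition $w_r x_r^{-\alpha} = \lambda$ makes each rate proportional to $w_r^{1/\alpha}$; the sketch below uses that closed form. The function names and the example weights are assumptions, and this is not the example from the original slide.

```python
import math


def alpha_fair_utility(x, w=1.0, alpha=1.0):
    """u(x) = w * x^(1 - alpha) / (1 - alpha), or w * log(x) when alpha == 1."""
    if alpha == 1.0:
        return w * math.log(x)
    return w * x ** (1.0 - alpha) / (1.0 - alpha)


def single_link_alpha_fair(c, weights, alpha):
    """Rates on one link of capacity c: x_r proportional to w_r^(1/alpha)."""
    shares = [w ** (1.0 / alpha) for w in weights]
    total = sum(shares)
    return [c * s / total for s in shares]


proportional = single_link_alpha_fair(10.0, [1.0, 2.0], alpha=1.0)
min_delay = single_link_alpha_fair(10.0, [1.0, 2.0], alpha=2.0)
near_max_min = single_link_alpha_fair(10.0, [1.0, 2.0], alpha=1000.0)
```

As α grows the weights matter less and less, and the split approaches the equal shares of max-min fairness.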


Resource Allocation

• Consider routing matrix $M = [M_{r\ell}]_{R \times L}$

◦ $M_{r\ell} = 1$ if data of user r passes through link ℓ

◦ $y = Mx$ ; $x = (x_r)$ rate vector

• Then, resource allocation problem

$$\text{(RAC)} \quad \max_{(x_r)} \sum_r u_r(x_r) \quad \text{subject to} \quad x_r \ge 0;\; y = Mx \le C$$

• Instead of solving RAC, we will first solve

$$\text{(RAC1)} \quad \max_{x \ge 0} \; \sum_r u_r(x_r) - \sum_{\ell \in L} \int_0^{\sum_{s:\, \ell \in s} x_s} f_\ell(y)\,dy$$

• Here, constraint $y = Mx \le C$ is absent

• Instead, $f_\ell(\cdot)$ is a penalty (price) function s.t.

◦ $\int_0^y f_\ell(x)\,dx \to \infty$ as $y \to \infty$

◦ non-decreasing, continuous, non-negative


RAC1

• Note that, for all choice of (αr)

◦ ur( · ) is strictly concave, increasing

• Hence, the new utility function of RAC1

◦ $V(x) = \sum_r u_r(x_r) - \sum_\ell \int_0^{\sum_{s:\, \ell \in s} x_s} f_\ell(y)\,dy$

− is strictly concave

− Proof [pg. 24, [SRIKANT]]

• We also assume that

◦ ur(xr)→ −∞ as xr → 0.


RAC1

• For maximizing the strictly concave function $V(\cdot)$

◦ $V(x) \to -\infty$ as $\|x\| \to 0$

◦ $V(x) \to -\infty$ as $\|x\| \to \infty$

→ There is a unique solution of RAC1 that lies in the interior of the set $x \ge 0$.

• Hence, the optimal solution must satisfy

$$\frac{\partial V}{\partial x_r} = 0 \;\; \forall r \quad\Rightarrow\quad u_r'(x_r) - \sum_{\ell:\, \ell \in r} f_\ell\Big(\sum_{s:\, \ell \in s} x_s\Big) = 0 \;\; \forall r$$

→ Solution of these equations gives the optimal rates

→ But, difficult to solve in decentralized manner


Decentralized Solution for RAC1

• Each node can compute traffic through outgoing link and corresponding

price at time t; i.e.

◦ $y_\ell(t) = \sum_{s:\, \ell \in s} x_s(t)$ ; $p_\ell(t) = f_\ell(y_\ell(t))$

◦ price on a route = sum of prices on its links, i.e.

$$q_r(t) = \sum_{\ell:\, \ell \in r} p_\ell(t) \quad \text{or} \quad q = Mp$$

• Then, optimality condition is

$$u_r'(x_r) - q_r = 0$$

• A “natural” gradient algorithm is

$$\dot{x}_r = k_r(x_r)\,\big(u_r'(x_r) - q_r\big),$$

with $k_r(\cdot)$ a non-negative, non-decreasing continuous function.
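The gradient algorithm can be tried on a toy instance. Everything specific in this sketch is an assumption chosen for illustration: a single shared link, log utilities $u_r = w_r \log x_r$, the gain $k_r(x) = x$, the price function $f(y) = (y/c)^B$ with B = 4, and forward-Euler time stepping.

```python
def primal_algorithm(weights, c, B=4, dt=0.01, steps=20_000):
    """Euler steps of dx_r/dt = k_r(x_r) * (u_r'(x_r) - q_r) on one shared link."""
    x = [1.0] * len(weights)
    for _ in range(steps):
        y = sum(x)             # link load y
        p = (y / c) ** B       # link price p = f(y); the route price q_r equals p here
        # u_r'(x) = w_r / x and k_r(x) = x
        x = [xr + dt * xr * (w / xr - p) for xr, w in zip(x, weights)]
    return x, (sum(x) / c) ** B


rates, price = primal_algorithm([1.0, 2.0], c=10.0)
```

At the fixed point each user satisfies $w_r / x_r = p$, so the user with twice the weight ends up with twice the rate; with this soft penalty the total load settles somewhat below c.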


RAC1

• Then: The gradient algorithm converges to optimal solution starting

from any initial condition

• Proof [pg. 26, [SRIKANT]]

→ Implication:

◦ To reach the “modified” resource allocation, a simple gradient algorithm

based on prices and utility functions is sufficient

◦ Question: What prices/penalties lead to the “correct” algorithm?

− well, penalty functions are approximations for capacity constraints: $f_\ell(x) \to \infty$ as $x \to C_\ell^-$

− actual utilities are never exact

• Next, some exact penalty functions


Exact Penalty Function

• An explanation of penalty functions can be given in terms of “dual”

variables [explained later].

• Based on this idea, it is possible to introduce adaptive penalty function

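◦ Define $f_\ell(y_\ell, \tilde{c}_\ell) = \left(\frac{y_\ell}{\tilde{c}_\ell}\right)^{B_\ell}$; $\tilde{c}_\ell$ virtual capacity

◦ We will adapt $\tilde{c}_\ell$ during the course of the algorithm as follows:

$$\text{(A)} \quad \frac{d\tilde{c}_\ell}{dt} = \alpha_\ell\,(c_\ell - y_\ell)^+_{\tilde{c}_\ell}$$

◦ And, primal algorithm

$$\dot{x}_r = k_r(x_r)\,\big(u_r'(x_r) - q_r\big); \qquad q_r = \sum_{\ell:\, \ell \in r} p_\ell = \sum_{\ell:\, \ell \in r} f_\ell(y_\ell, \tilde{c}_\ell).$$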

• The above combination provides a primal algorithm with adaptive

price-function


Exact Penalty Function

• Then: If routing matrix M is full rank, then algorithm solves RAC,

exactly.

• Proof [pg. 30, [SRIKANT]]

• Implications:

− It is possible to change rates and price function adaptively to

obtain solution of Resource allocation problem

− The “price” has natural optimization interpretation in terms of

dual variable


RAC: Dual

• RAC:

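$$\max_{x \ge 0} \sum_r u_r(x_r) \quad \text{subject to} \quad Mx \le c.$$

• Lagrangian of RAC: $x \ge 0$ ; $\lambda \ge 0$

$$L(x; \lambda) = \sum_r u_r(x_r) - \lambda^T (Mx - c).$$

• Dual

$$\begin{aligned} D(\lambda) &= \max_{x \ge 0;\, Mx \le c} \Big[\sum_r u_r(x_r) - \lambda^T (Mx - c)\Big] \\ &= \max_{x \ge 0;\, Mx \le c} \Big[\sum_r u_r(x_r) - \sum_\ell \lambda_\ell y_\ell\Big] + \sum_\ell \lambda_\ell c_\ell, \end{aligned}$$

where $y_\ell = \sum_{s:\, \ell \in s} x_s$.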


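Dual and Its Properties

• Dual optimal

$$\inf_{\lambda \ge 0} D(\lambda)$$

• By strong duality: $\inf_{\lambda \ge 0} D(\lambda)$ = optimal value of RAC

• For given $\lambda \ge 0$: the maximizing x is s.t. for all r

$$\frac{\partial L}{\partial x_r} = 0 \;\Rightarrow\; \frac{\partial u_r(x_r)}{\partial x_r} - \sum_{\ell:\, \ell \in r} \lambda_\ell = 0$$

• Further, by strong duality, for the optimizing $(\lambda, x)$ pair

$$\lambda_\ell (y_\ell - c_\ell) = 0.$$

• Based on the above optimality conditions, a simple gradient algorithm: [pg. 28, [SRIKANT]]

◦ Set $x_r = (u_r')^{-1}\big(\sum_{\ell:\, \ell \in r} \lambda_\ell\big)$,

◦ $\dot{\lambda}_\ell = h_\ell(\lambda_\ell)\,(y_\ell - c_\ell)^+_{\lambda_\ell}$.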
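The dual algorithm can be illustrated on one link. The choices here are assumptions for the sake of a runnable sketch: log utilities $u_r = w_r \log x_r$ (so $(u_r')^{-1}(\lambda) = w_r/\lambda$), gain $h(\lambda) = 1$, a discrete step size, and a small positive floor on the price that stands in for the projection and avoids division by zero.

```python
def dual_algorithm(weights, c, lam=1.0, step=0.01, iters=5_000):
    """Price iteration on one link: users respond to the price, the link adjusts it."""
    for _ in range(iters):
        x = [w / lam for w in weights]       # x_r = (u_r')^{-1}(lambda)
        excess = sum(x) - c                  # y - c
        lam = max(lam + step * excess, 1e-9) # raise price if overloaded, lower if idle
    return [w / lam for w in weights], lam


rates, price = dual_algorithm([1.0, 2.0], c=10.0)
```

For weights (1, 2) and c = 10 the price settles where the link is exactly full, $\lambda = 3/10$, and the rates split the capacity in proportion to the weights.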


RAC: Dual

• Then: The gradient descent algorithm described for dual of RAC

converges to optimal solution if the routing matrix M has full rank

◦ Proof [pg. 29, [SRIKANT]]

• Implication:

◦ By adjusting “prices” ($\lambda_\ell = p_\ell$) and rational user behavior, the desired

rate-allocation can be achieved

• What if both prices and rate are changed simultaneously?


RAC: Primal–Dual

• Primal-dual algorithm

◦ Primal algorithm at sources

$$\dot{x}_r = k_r(x_r)\,\big(u_r'(x_r) - q_r\big),$$

◦ Dual algorithm at links

$$\dot{p}_\ell = k_\ell(p_\ell)\,(y_\ell - c_\ell)^+_{p_\ell}.$$

• Then: The algorithm converges to the solution of RAC, starting from any initial condition

Proof [pg. 28, [SRIKANT]]


Relation to Current Algorithms

• Primal algorithm

$$\dot{x}_r = k_r(x_r)\,\big(u_r'(x_r) - q_r\big)$$

◦ Changing rate based on feedback

• Recall, TCP changes window-size

◦ $w_r(t)$ be window size

◦ $T_r$ be RTT

◦ Then, rate $x_r(t) = \frac{w_r(t)}{T_r}$

◦ Let $q_r(t)$ be drop probability

• Then, TCP-Reno

◦ increases window if no drop (prob. $1 - q_r(t)$)

◦ decreases window if drop (prob. $q_r(t)$)


TCP versus Primal Algorithm

• Specifically, TCP dynamics

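$$\dot{x}_r = \frac{(1 - q_r(t))\, x_r(t - T_r)}{T_r^2\, x_r(t)} - \beta\, q_r(t)\, x_r(t)\, x_r(t - T_r).$$

◦ Let $x_r(t - T_r) \approx x_r(t)$; then

$$\dot{x}_r = \frac{1 - q_r(t)}{T_r^2} - \beta x_r^2(t)\, q_r(t).$$

◦ If $q_r(t)$ is small, $1 - q_r(t) \sim 1$; then

$$\begin{aligned} \dot{x}_r &= \frac{1}{T_r^2} - \beta x_r^2(t)\, q_r(t) \\ &= \beta x_r^2(t) \left[\frac{1}{\beta T_r^2 x_r^2(t)} - q_r(t)\right] \\ &= k_r(x_r(t)) \left[u_r'(x_r(t)) - q_r(t)\right] \end{aligned}$$

Then, $u_r(x_r) = -\frac{1}{\beta T_r^2 x_r}$

That is: TCP ≡ weighted minimum-delay fair allocation!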
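As a consistency check on the identification above, one can read off $k_r$ and $u_r'$ and confirm that the primal optimality condition reproduces the TCP equilibrium rate. This is a short worked derivation under the same small-$q_r$ assumption, not an additional result from the slides.

```latex
k_r(x_r) = \beta x_r^2, \qquad
u_r'(x_r) = \frac{1}{\beta T_r^2 x_r^2}
\;\Longrightarrow\;
u_r(x_r) = -\frac{1}{\beta T_r^2 x_r}
\quad (\text{weight } w_r = \tfrac{1}{\beta T_r^2},\ \alpha_r = 2).

u_r'(x_r) = q_r
\;\Longrightarrow\;
x_r = \frac{1}{T_r \sqrt{\beta q_r}}
\;\approx\; \frac{1}{T_r}\sqrt{\frac{1 - q_r}{\beta q_r}}
\quad \text{for small } q_r .
```

The weight $1/(\beta T_r^2)$ shows how the round-trip time enters: flows with larger RTT are given a smaller weight in the allocation.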


What About Dual?

• TCP corresponds to primal algorithm

• What about dual algorithm?

◦ qr( · ): corresponds to drop probability

→ Queue management algorithm corresponds to dual algorithm

• Popular queue-management algorithm

◦ Random Early Detection (RED)

◦ Mark (or drop) packet with probability proportional to queue-size

◦ If buffer-size is B, then the probability of marking a packet when queue-size is Q is $\frac{Q}{B}$.

◦ Let y be arrival rate


Dual = Active Queue Management

• Now, queue-dynamics

$$\dot{Q} = (y - c)^+_Q$$

• Then, marking (drop) probability

$$p = \frac{Q}{B} \;\Rightarrow\; \dot{p} = \alpha\,(y - c)^+_p$$

→ This has exactly the same dynamics as prices or Lagrange multiplier in dual or primal-dual algorithm

• Note that

$$q_r \approx \sum_{\ell:\, \ell \in r} p_\ell;$$

because if drops at links are independent then

$$(1 - q_r) = \prod_{\ell:\, \ell \in r} (1 - p_\ell) \approx 1 - \sum_{\ell:\, \ell \in r} p_\ell,$$

when $p_\ell$ is small.
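The marking rule and the small-probability approximation can be checked with a few lines. The function names are hypothetical, the marking rule is the simplified proportional one described on the slide rather than a full RED implementation, and drops on different links are assumed independent.

```python
import math


def red_mark_probability(Q, B):
    """Marking probability proportional to queue size Q, clamped to [0, 1]."""
    return min(max(Q / B, 0.0), 1.0)


def route_drop_probability(link_probs):
    """Exact end-to-end drop probability when link drops are independent."""
    return 1.0 - math.prod(1.0 - p for p in link_probs)


link_probs = [0.01, 0.02]
exact = route_drop_probability(link_probs)
approx = sum(link_probs)
```

For these two links the exact route probability and the sum of link probabilities differ only in the fourth decimal place.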


Next

• We assumed that the RAC algorithm or TCP has immediate feedback.

◦ In practice, feedback is always delayed

◦ Questions:

(1) How does delayed feedback affect performance?

(2) If delay is very large, feedback is useless. How large

can the delay be such that the network can still operate?

• Next, we will see how ideas from control theory can be useful.

→ Professor Mitter’s lectures


Other Issues

• If everything works well, then

◦ TCP and RED seem good algorithms

• Question:

◦ What if users do not cooperate?

• Malicious users can lead to undesirable network performance.

→ Next, we study network security.

