congestion control - mitweb.mit.edu/6.976/www/notes/notes3.pdf · congestion control is essentially...
Congestion Control
• Topics
◦ Congestion control
− what & why?
◦ Current congestion control algorithms
− TCP and UDP
◦ Ideal congestion control
− Resource allocation
− Distributed algorithms
◦ Relation
− current algorithms and resource allocation
◦ Broad implications
6.976/ESD.937 1
Congestion Control: What and Why?
• Internet is used by many independent users
◦ Resources (link capacity) are finite
◦ If every user sends data at very high rate, it will cause
“congestion”
− packets will be dropped, i.e. unreliable transmission
− seemingly high utilization of resource may be actually very
low!
◦ If every user sends data at very low rate, resource will not be
well-utilized
→ Users need to send data at the “correct” rate so that
◦ Resources are well-utilized
◦ Users get “reliable” data transfer
◦ Resources should be shared “fairly”
→ This is the primary goal of congestion control
Congestion Control
• Main question:
◦ How to implement congestion control in the network
◦ With the help of only distributed protocols at the users
− is it even possible?
− if not, what is the best possible thing to do?
• An example:
[Figure: Sender → link → Receiver, where the link capacity is unknown to both sender and receiver]
Congestion Control
• If senders get acknowledgment of receipt of packets
◦ They know if packets are dropped or not
• Based on this, senders can infer the following
◦ If packets are not dropped
− sender is sending at rate lower than the capacity
◦ If packets are dropped
− sender is sending at rate higher than capacity
→ Use drops of packets as “signal of congestion”
→ Change rate as a reaction to packet drop so as to achieve
“fair-share”
• Congestion control is essentially “feedback control”
Congestion Control Mechanism
• To perform congestion control, we need two basic protocols
◦ Algorithm I: Drop detection
− ask receiver to acknowledge receipt of packets
− if a sent packet is not acknowledged fast enough the packet
is assumed to be dropped
◦ Algorithm II: Rate-control
− if packets are not dropped, there is no congestion; hence
increase the rate by a certain amount
− otherwise there is congestion; hence decrease the rate by a
certain amount
• Algorithms I and II are key ideas behind current congestion control
protocols
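The two algorithms above can be sketched as a small feedback loop. This is only an illustration: the slide does not say by how much the rate changes, so the additive step `increase`, the multiplicative factor `decrease`, and the rule "a drop occurs exactly when the rate exceeds capacity" are assumptions made here.

```python
def adapt_rate(rate, dropped, increase=1.0, decrease=0.5):
    """Algorithm II: raise the rate when nothing was dropped, cut it after a drop."""
    if dropped:
        return rate * decrease      # assumed multiplicative decrease
    return rate + increase          # assumed additive increase


def simulate(capacity, steps=50, rate=1.0):
    """Drive one sender against a link whose capacity it cannot observe directly."""
    history = []
    for _ in range(steps):
        # Stand-in for Algorithm I: assume a drop is seen iff the rate exceeds capacity.
        dropped = rate > capacity
        rate = adapt_rate(rate, dropped)
        history.append(rate)
    return history
```

Calling `simulate(capacity=10.0)` gives a rate that climbs toward the capacity, overshoots slightly, backs off, and climbs again, which is the feedback behavior the slide describes.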
Congestion Control
• Recall, Internet has “layered” architecture
◦ Congestion control is essentially required for
− “reliable” transmission at “fair-rate” with “high-resource
utilization”
→ Implemented in the “Transport Layer”
• The congestion control protocol is also called “transport protocol”
◦ Transmission Control Protocol (TCP) is a popular protocol
− provides reliable transmission
− when all users exhibit “good-citizen” behavior
− but has higher delay (cost of reliability)
◦ User Datagram Protocol (UDP) is another protocol
− unreliable and user selfish protocol
− but has lower delay
− useful for media communication
TCP
• Brief History
◦ TCP (some version) has been around since ARPANET
◦ Older version(s) required users to send “probes” to the network to
detect the level of congestion
→ led network to always operate under congestion
→ often led to “congestion collapse”
• Popular well-documented example
◦ Congestion collapse in October 1986
◦ Rate between Lawrence Berkeley Lab and UCB went down from
32 kbps → 40 bps!
TCP
• Post-mortem of “congestion collapse”
◦ Older TCP sent more and more packets without confirming their
delivery
− violation of ‘packet conservation’ principle was key reason
for failure
◦ Did not account for inhomogeneity in network bandwidth
◦ Lack of rate-control
• Current TCP
◦ Packet conservation: inject new packet when old packet wave has
reached destination
◦ Slow-start: search for capacity starting from zero
◦ Rate-control: control rate via packet drop feedback, and be “good
user”
TCP: Current Algorithm
• Algorithm has two components
◦ Sender algorithm and receiver algorithm
• Sender algorithm:
◦ Parameters:
− STHRESH: threshold for slow-start
− Wmax: maximum window-size
− CW: current window size
◦ Initially: number the data to be sent 1, 2, . . . , N .
TCP: Current Algorithm
• Slow-start:
◦ Set STHRESH = Wmax/2
◦ Start with CW= 1, packet sent P= 0
◦ Each time, send packets from P to P+CW
◦ When a packet P+1 is acknowledged, set P ← P+1
◦ For every acknowledgment, increase CW to CW+1
− while CW < STHRESH and no packet loss is detected
− if packet loss, then set STHRESH ← STHRESH/2 and CW
← CW/2
◦ Continue above until CW=STHRESH
− go to “congestion-avoidance” phase
◦ If P=N ever, transmission is over!
TCP: Current Algorithm
• Congestion-avoidance
◦ For acknowledgement of P+1,
− CW ← CW + 1/[CW]
◦ If packet-loss then CW is decreased
− different versions have different ways of dealing with the
window size decrease
− one version:
set STHRESH ← CW/2
CW ← 1
go to slow-start phase
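The per-acknowledgment window updates of the two phases can be sketched as follows. The function names are hypothetical, `[CW]` is read here as the integer part of CW (an assumption), and the loss reaction is the "one version" listed above rather than any particular TCP release.

```python
def on_ack(cw, ssthresh):
    """Window growth for one acknowledgment."""
    if cw < ssthresh:
        return cw + 1.0             # slow-start: +1 per ACK
    return cw + 1.0 / int(cw)       # congestion avoidance: +1/[CW] per ACK


def on_loss(cw):
    """Loss reaction in the 'one version' above: returns (new CW, new STHRESH)."""
    return 1.0, cw / 2.0
```

With CW = 8 in congestion avoidance, eight acknowledgments (one full window) raise CW to 9, so the window grows by about one packet per round trip, whereas slow-start adds one packet per acknowledgment.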
TCP: Current Algorithm
• Receiver algorithm:
◦ For every received packet, send acknowledgment
◦ With request for the next packet # required
− if packets 1-10 and 13 are received, then send request for
packet number 11 (and not 14)
• Packet-loss detection:
◦ Essentially, packet is dropped when acknowledgment is not
received
◦ Specific protocol
− if ack. not received within TIME-OUT, or
− 3 consecutive requests sent for the same packet # by
receiver
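A minimal sketch of the receiver rule and of the duplicate-request test, assuming packets are numbered from 1 and that the sender sees the receiver's requests as a simple list. Both function names are made up for illustration, and the time-out branch is not modeled.

```python
def next_expected(received):
    """Receiver side: smallest packet number not yet received (numbering starts at 1)."""
    n = 1
    while n in received:
        n += 1
    return n


def loss_by_duplicates(requests, threshold=3):
    """Sender side: packet number requested `threshold` times in a row, else None."""
    run_value, run_length = None, 0
    for req in requests:
        if req == run_value:
            run_length += 1
        else:
            run_value, run_length = req, 1
        if run_length >= threshold:
            return run_value
    return None
```

For the slide's example, `next_expected(set(range(1, 11)) | {13})` returns 11, and three consecutive requests for packet 11 would be flagged as a loss of packet 11.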
TCP: An Example
• Suppose, CW = 10 at some-time
[Figure: sender/receiver timeline with time marks T=0, 4, 6, 7, 14 and 20; packet bursts 1-10, 11-21 and 14-25; acknowledgments ACK 1, ACK 10 and ACK 11-13; a TIME-OUT event; and window-size annotations including CW = 11]
TCP: Performance
• The main performance metric
◦ Throughput of TCP
◦ That is, what is the net “equilibrium” rate of users
− as a fraction of the total capacity
• Evaluation of TCP throughput
◦ Model: TCP dynamics
− effect of slow-start vs. congestion-avoidance
◦ Characterization of equilibrium of TCP
• First, we’ll consider a model for TCP
TCP: Performance
• We’ll consider a very simple situation
◦ single link and single user
◦ ignore extra complications
• Given this model, we will identify
◦ Key system dynamics affecting its performance
• Specifically, we’ll try to compare effect of
◦ Slow-start vs. Congestion control
◦ And, use this insight to evaluate performance precisely
TCP: Performance
• Model:
◦ Single user accessing a link of capacity c
◦ Let T be the round-trip time
◦ Let B be the buffer size at the link
◦ Maximum window size $W_{\max} = cT + B$
→ $\text{STHRESH} = W_{\max}/2 = (cT + B)/2$
◦ User wants to transmit infinite data
• Goal: Evaluate average rate of transmission
TCP: Performance
• TCP dynamics
◦ Periodic between slow-start and congestion avoidance
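[Figure: timeline alternating slow-start and congestion-avoidance phases; slow-start runs from $t = 0$ to $T_{ss}$ and sends $N_{ss}$ packets, congestion avoidance runs from $T_{ss}$ to $T_{ss} + T_{ca}$ and sends $N_{ca}$ packets, then the pattern repeats]
$\text{Rate} = \dfrac{N_{ss} + N_{ca}}{T_{ss} + T_{ca}}$
• Next, we evaluate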
◦ Nss, Nca, Tss and Tca
◦ To obtain rate of TCP
TCP: Performance
• We first study the Slow-start phase
◦ Specifically, evolution of window size and queue-size
• Slow-start phase has cyclic behavior
◦ Divide time into cycles where window size doubles in each cycle:
− t = 0: first packet sent; W(0)= 1
− t = T: packet ACK; W(T)= 2
− t = 2T: both packets ACK; W(2T)= 3
...
→ each mini-cycle has length =T
◦ Window-size evolution: $W(nT) \approx 2^{n-1} + 1$
TCP: Performance
• Next, we study queue-length evolution
◦ In slow-start phase using window-evolution
• Again, queue-size has the same cyclic behavior
◦ Queue-size at the beginning of cycle ∼ 0
◦ During the $n$th cycle, $2^{n-2}$ packets are sent
− total packets sent (until the $n$th cycle)
$= 2^{n-2} + 2^{n-3} + \cdots + 2^0 = 2^{n-1} - 1$
− total packets acknowledged by the end of the $n$th cycle
$\approx 2^{n-2}$
→ Max-queue length in the $n$th cycle,
$Q(nT) \sim 2^{n-1} - 1 - 2^{n-2} \sim 2^{n-2} \sim W(nT)/2$.
TCP: Performance
• In slow-start phase, buffer-overflow happens if
◦ Q is larger than B
− But window can be at most STHRESH
◦ Now, $Q = W/2$
− which is at most $\text{STHRESH}/2 = (B + cT)/4$
◦ Hence, for overflow we need $(B + cT)/4 > B$
− That is, $cT > 3B \Leftrightarrow B < cT/3$
◦ Equivalently, no overflow when B ≥ cT/3
TCP: Performance
• First, we consider no overflow situation
◦ That is, B≥ cT/3
• When there is no overflow, we have the following
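◦ $T_{ss}$: time for slow-start
$W(T_{ss}) \sim \dfrac{B + cT}{2} \Leftrightarrow 2^{T_{ss}/T} \sim \dfrac{B + cT}{2} \Leftrightarrow T_{ss} = T \log_2\left(\dfrac{B + cT}{2}\right)$
◦ $N_{ss}$: # of packets transmitted
= # of packets acknowledged = window-size $= \dfrac{B + cT}{2}$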
TCP: Performance
• Now, if overflow happens in slow-start, i.e. $B < cT/3$
◦ There are two slow-start phases:
◦ Phase 1: at the overflow
− $Q = B = W/2$; but detection happens after time $T$, i.e.,
window has doubled
→ $W = 2B$ at $T_{ss1}$ (time to overflow)
→ $2^{T_{ss1}/T} \approx 2B \Leftrightarrow T_{ss1} = T \log_2 2B$
− $N_{ss1} = 2B$
− Set $\text{STHRESH} = 2B/2 = B$
TCP: Performance
◦ Phase 2: after overflow
− Now, the overflow does not happen as STHRESH is low enough!
− W = B at the end of this phase
→ $2^{T_{ss2}/T} \sim B \Rightarrow T_{ss2} \sim T \log_2 B$
− $N_{ss2} = B$
◦ In summary, when overflow happens
$T_{ss} = T_{ss1} + T_{ss2} = 2T \log_2 B + T$
$N_{ss} = 3B$
◦ And when overflow does not happen
$T_{ss} = T \log_2\left(\dfrac{B + cT}{2}\right)$
$N_{ss} = \dfrac{B + cT}{2}$
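The two slow-start cases can be collected into one helper. This sketch only evaluates the closed-form estimates above; the function name and the return convention are illustrative, and the overflow test is the condition B < cT/3 from the earlier slide.

```python
import math


def slow_start(B, c, T):
    """Return (T_ss, N_ss) for buffer size B, capacity c and round-trip time T."""
    if B < c * T / 3.0:
        # Overflow case: T log2(2B) + T log2(B) = 2 T log2(B) + T, and 2B + B packets.
        t_ss = 2.0 * T * math.log2(B) + T
        n_ss = 3.0 * B
    else:
        # No-overflow case: the window grows up to STHRESH = (B + cT)/2.
        half = (B + c * T) / 2.0
        t_ss = T * math.log2(half)
        n_ss = half
    return t_ss, n_ss
```

For example, with B = 8, c = 100 and T = 1 the buffer is small relative to cT, so the overflow branch applies and the estimate is 7 round-trip times for 24 packets.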
TCP: Performance
• Now, we study Congestion-avoidance phase
◦ Starting window-size
$W_0 = \begin{cases} W_{\max}/2 = (cT + B)/2 & \text{if } B \ge cT/3 \\ B & \text{if } B < cT/3 \end{cases}$
◦ Next, we study evolution of window-size $W(t)$ with time $t$
− Let $a(t)$ = # of acknowledgments received till time $t$
− Then, change in window-size is $\dfrac{dW}{dt} = \dfrac{dW}{da} \cdot \dfrac{da}{dt}$
− Rate of change of $a$ is $c$ if $W$ is large (server is busy) and $W/T$ if $W$ is smaller
− That is, $\dfrac{da}{dt} = \min\{W/T, c\}$
− Also: $\dfrac{dW}{da} = \dfrac{1}{W}$ (by definition of algorithm)
TCP: Performance
• Hence, $\dfrac{dW}{dt} = \begin{cases} 1/T & \text{if } W < cT \\ c/W & \text{if } W \ge cT \end{cases}$
◦ Congestion avoidance will end when W = Wmax
• Two-phases
◦ Phase 1: W0 ≤ W < cT
◦ Phase 2: cT ≤ W ≤ Wmax
• First, we consider Phase 1
TCP: Performance
• Phase 1: W0 ≤ W < cT
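◦ $T_{ca1} = T(cT - W_0)$, since $\dfrac{dW}{dt} = 1/T$
◦ $N_{ca1} = a(T_{ca1}) = \int_0^{T_{ca1}} da(t)$, where
$\int_0^{T_{ca1}} da(t) = \int_0^{T_{ca1}} \dfrac{da(t)}{dt}\,dt$
$= \int_0^{T_{ca1}} \dfrac{W(t)}{T}\,dt$
$= \int_0^{T_{ca1}} \dfrac{W_0 + t/T}{T}\,dt$
$= \dfrac{W_0\,T_{ca1}}{T} + \dfrac{T_{ca1}^2}{2T^2}$
$= (cT - W_0)\left[W_0 + \dfrac{cT - W_0}{2}\right]$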
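As a sanity check on the Phase 1 packet count, the integral of W(t)/T with W(t) = W0 + t/T can be evaluated numerically and compared with the value obtained by substituting Tca1 = T(cT − W0) into W0·Tca1/T + Tca1²/(2T²). The function names and the sample numbers below are illustrative only.

```python
def n_ca1_closed_form(w0, c, T):
    """W0*Tca1/T + Tca1^2/(2 T^2) with Tca1 = T (cT - W0) substituted in."""
    d = c * T - w0
    return w0 * d + d * d / 2.0


def n_ca1_numeric(w0, c, T, steps=10000):
    """Midpoint-rule integral of W(t)/T over [0, Tca1], where W(t) = W0 + t/T."""
    t_ca1 = T * (c * T - w0)
    dt = t_ca1 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        total += (w0 + t / T) / T * dt
    return total
```

With W0 = 12, c = 10 and T = 2 (so cT = 20), both routes give 128 packets.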
TCP: Performance
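• Phase 2: $W \ge cT$
◦ $\dfrac{dW}{dt} = \dfrac{c}{W(t)} \Leftrightarrow W(t)\,dW = c\,dt$
$\Leftrightarrow W^2(t) - W^2(0) = 2ct$ (where $W(0) = cT$)
$\Leftrightarrow W^2(t) = 2ct + c^2T^2$
◦ Phase ends when $W(T_{ca2}) = W_{\max}$
$\Rightarrow W_{\max}^2 = 2cT_{ca2} + c^2T^2$
$\Rightarrow 2cT_{ca2} = W_{\max}^2 - c^2T^2$
$\Rightarrow T_{ca2} = \dfrac{W_{\max}^2 - c^2T^2}{2c}$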
◦ Nca2 = cTca2
− because, node is running at capacity c
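The Phase 2 duration can be checked by integrating dW/dt = c/W directly. The sketch below uses a plain Euler step; the step size `dt` and the sample values are arbitrary choices for illustration, not part of the analysis.

```python
def t_ca2_closed_form(w_max, c, T):
    """Closed form (Wmax^2 - c^2 T^2) / (2c)."""
    return (w_max ** 2 - (c * T) ** 2) / (2.0 * c)


def t_ca2_numeric(w_max, c, T, dt=1e-3):
    """Euler integration of dW/dt = c/W from W = cT until W reaches Wmax."""
    w, t = c * T, 0.0
    while w < w_max:
        w += (c / w) * dt
        t += dt
    return t
```

For c = 10, T = 1 and B = 10 (so Wmax = cT + B = 20) the closed form gives 15 time units, and the numerical integration agrees to within the discretization error.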
TCP: Performance
• In summary, in congestion avoidance phase
◦ $T_{ca1} = T(cT - W_0)$,
◦ $N_{ca1} = (cT - W_0)\left[W_0 + \dfrac{cT - W_0}{2}\right]$,
◦ $T_{ca2} = \dfrac{W_{\max}^2 - c^2T^2}{2c}$, and
◦ $N_{ca2} = cT_{ca2}$
• Let us compare the contribution of slow-start and congestion avoidance
phases
◦ When $B = cT$, for large $c$
• We’ll find that congestion avoidance dominates
→ Only concentrate on TCP-dynamics for congestion-avoidance
→ We’ll use this insight to find throughput for simple model
TCP: Performance
• We consider only congestion-avoidance dynamics
◦ Single-link
◦ Many-sources
• Modeling
◦ Tr: RTT of user r
◦ Wr(t): window size of user r at time t
◦ $q_r(t)$: fraction of packets lost at time $t$ for user $r$
◦ $x_r(t) = W_r(t)/T_r$: rate of user $r$ at time $t$
◦ $a_r(t)$: acknowledgments received until time $t$ for user $r$
• Dynamics
◦ Acknowledgment: increase window by $1/W_r(t)$
◦ Drop: decrease window by $\beta W_r(t)$
TCP: Performance
• To study throughput of TCP
◦ We need to study evolution of Wr(t)
• Let a fraction $q_r(t)$ of packets be dropped at time $t$ for user $r$
◦ Then, the drop rate at time $t$ for user $r$ is $x_r(t - T_r)\,q_r(t)$
◦ Acknowledgment rate at $t$: $x_r(t - T_r)(1 - q_r(t))$
• Then, $W_r(t)$ changes as follows:
$\dfrac{dW_r(t)}{dt} = (1 - q_r(t))\left[\dfrac{1}{W_r(t)} \cdot x_r(t - T_r)\right] - q_r(t)\left[\beta W_r(t) \cdot x_r(t - T_r)\right]$
• Since $x_r(t) = W_r(t)/T_r$,
$\dfrac{dx_r(t)}{dt} = \dfrac{(1 - q_r(t))\,x_r(t - T_r)}{T_r^2\,x_r(t)} - \beta\,q_r(t)\,x_r(t)\,x_r(t - T_r)$
TCP: Performance
• To obtain long-term effective throughput
◦ We evaluate the equilibrium point:
dxr(t)
dt= 0
◦ Ignoring the delay $T_r$ in the equation
− usually, it cannot be ignored
− we’ll study when this is justified
• This gives us the following:
◦ In equilibrium
$0 = \dfrac{1 - q_r}{T_r^2} - \beta x_r^2 q_r$
◦ That is,
$x_r = \dfrac{1}{T_r}\sqrt{\dfrac{1 - q_r}{\beta q_r}}$
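The equilibrium rate is easy to evaluate numerically. In this sketch the function name is made up and the default β = 0.5 (halving the window on a drop) is an assumption; the slides leave β unspecified.

```python
import math


def tcp_rate(rtt, q, beta=0.5):
    """Equilibrium rate x = (1/T) * sqrt((1 - q) / (beta * q))."""
    return math.sqrt((1.0 - q) / (beta * q)) / rtt
```

For instance, `tcp_rate(0.1, 0.01)` is roughly 141 packets per unit time, and halving the round-trip time to 0.05 doubles the rate, which is the inverse dependence on RTT described in the text.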
TCP: Performance
• The analysis has following main message
◦ Throughput is
− mainly affected by congestion avoidance phase, and
− not by slow-start
◦ Qualitatively, throughput is
− inversely proportional to RTT (Tr)
− inversely proportional to the square root of the drop probability
TCP: Performance
• Questions:
◦ We assumed “deterministic” dynamics
− how valid is it?
− The Law-of-Large-Numbers or Fluid-models provide
justification
◦ We ignored the effect of other users
− implicit in qr( · )
◦ What happens when many links?
− naturally, hard to quantify exact relation
− however, it’s useful to ask the following basic question:
− what do we really want to achieve from TCP
− and has it achieved that?
◦ next, we address this basic question
Resource Allocation
• Suppose a piece of cake is to be shared between two people in a fair
manner:
◦ How should we do it?
◦ Well, divide it into half each, assuming both care for size only.
◦ What if one cares only for “cherry” but other cares about actual
“cake”?
→ Division scheme should care about utility of “cake” to the people.
→ Known literature of cake-cutting algorithm
− has inspired a lot of interesting work in Game Theory and
Algorithms
Resource Allocation
• Consider a single link with capacity C
◦ R users want to use it and let fr, 1 ≤ r ≤ R be rate at which user
r wants to send data
◦ If f1 + · · · + fR ≤ C
− allocate demanded rate to each user
◦ But, if f1 + · · · + fR > C, then
− we need an allocation mechanism to allocate rates
$x_1, \ldots, x_R$ to all users such that
$0 \le x_r \le f_r,\ 1 \le r \le R$, and $\sum_{r=1}^{R} x_r \le C$
− and, allocation should maximize the overall system utility
Resource Allocation
• Let ur(xr) be utility of rate xr to user r
◦ Then, allocation $(x_r)$ should be solution of
$\max \sum_{r=1}^{R} u_r(y_r)$
subject to
$0 \le y_r \le f_r,\ 1 \le r \le R$; $\sum_{r=1}^{R} y_r \le C$
◦ Now, a “bad” user can have ur very high
→ can’t rely on users utility for fair allocation
◦ Question:
− what should be ur( · ) so that allocation is “fair”?
Fair Allocation
• Let’s consider a simple example
◦ C = 10 ; f1 = 4, f2 = 20.
◦ x1 = 4 ; x2 = 6 “makes sense” because allocation is as much equal
as possible
→ Max-min fair: $(x_r)$ is max-min fair iff
◦ For any $(r, r')$: $x_r > x_{r'}$ only if $x_{r'} = f_{r'}$
• In the above example,
◦ $x_1 = 3$; $x_2 = 7$ is not max-min fair because $x_2 > x_1$ but $x_1 \ne f_1$.
• Philosophy of max-min fair:
◦ At the fair allocation, the only way to become “richer” is to make
a poor, “poorer.”
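For a single link, a max-min fair allocation can be computed by serving the smallest demands first and splitting what is left equally. The sketch below is one common way to do this ("water-filling"); the slides do not prescribe it, and the function name is illustrative.

```python
def max_min_single_link(capacity, demands):
    """Water-filling: satisfy the smallest demands first, split the rest equally."""
    alloc = [0.0] * len(demands)
    remaining = float(capacity)
    order = sorted(range(len(demands)), key=lambda r: demands[r])
    for i, r in enumerate(order):
        share = remaining / (len(order) - i)   # equal split among users not yet served
        alloc[r] = min(demands[r], share)
        remaining -= alloc[r]
    return alloc
```

With C = 10 and demands (4, 20) this returns (4, 6), the allocation the example above calls sensible.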
Max-Min Fair Allocation
• Network resource allocation
◦ Consider network with L links with capacities C1, . . . , CL
◦ Sources 1, . . . , R with demands f1, . . . , fR. Let S denote set of all
sources.
◦ Source r sends data from node sr to dr using some links
(according to routing algorithm)
◦ Rates (xr) are feasible if data transmission, according to the rates,
satisfies link capacities constraints
− that is, no link is over-subscribed
• Question:
◦ What about fair allocation?
→ Similar definition generalizes.
Max-Min Fair Allocation
• Definition (max-min fair):
(xr) is max-min fair for a given capacitated network if and only if
(1) (xr) is feasible
(2) for any other feasible (yr), if ys > xs for some source s, then
there exists source p s.t. xp ≤ xs and yp < xp
• Question:
◦ How to find such an allocation?
◦ Next, we will see a simple centralized algorithm.
◦ Later we will consider distributed algorithms and their relationship
to current TCP!
Max-Min Fair Allocation
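(1) Let $n_\ell$ be number of sources in S that use link $\ell$. For each $\ell$ s.t. $c_\ell \ne 0$,
define fair share $g_\ell$ as: $g_\ell = c_\ell / n_\ell$
(2) Define $z_r = \min_{\ell \in r} g_\ell$, which is the min rate over links that source $r$ is using.
(3) Define $z_{\min} = \min_r z_r$.
(4) Let $R = \{r : z_r = z_{\min}\}$
(5) For all $r \in R$: the max-min rate $x_r = \min(z_r, f_r)$
(6) Set $S \leftarrow S \setminus R$; $c_\ell \leftarrow c_\ell - \sum_{r \in R:\, \ell \in r} x_r$; $n_\ell \leftarrow n_\ell - \sum_{r \in R} \mathbf{1}\{\ell \in r\}$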
(7) Repeat (1)–(6) until S is empty.
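Steps (1)-(7) translate almost line by line into code. The data layout below (dictionaries keyed by link and source names, routes as sets of links) is an assumption made for this sketch, and every source is assumed to use at least one link.

```python
def max_min_fair(capacity, routes, demand):
    """capacity: {link: c}, routes: {source: set of links}, demand: {source: f}."""
    cap = dict(capacity)
    active = set(routes)
    x = {}
    while active:                                            # (7) repeat until S is empty
        # (1) fair share of each link still used by an active source
        count = {l: sum(1 for r in active if l in routes[r]) for l in cap}
        share = {l: cap[l] / count[l] for l in cap if count[l] > 0}
        # (2)-(3) bottleneck share of each source, and the smallest one
        z = {r: min(share[l] for l in routes[r]) for r in active}
        z_min = min(z.values())
        # (4)-(5) fix the rate of every source attaining the minimum
        fixed = {r for r in active if z[r] == z_min}
        for r in fixed:
            x[r] = min(z[r], demand[r])
        # (6) remove those sources and the capacity they consume
        for r in fixed:
            for l in routes[r]:
                cap[l] -= x[r]
        active -= fixed
    return x
```

As a usage example, take links a (capacity 10) and b (capacity 4), with source s1 on {a}, s2 on {a, b} and s3 on {b}, all with unbounded demand. Link b is the first bottleneck, so s2 and s3 get 2 each, and s1 then takes the remaining 8 on link a.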
Max-Min Fair Allocation
• Algorithm
◦ Not distributed
→ Not implementable
• Fairness
◦ There can be other “fairness” criteria
• Next, we will see
◦ A range of fairness criteria
− max-min fair as one of them
◦ Study distributed algorithm for allocation based on them.
α-Fair Allocation
• Let utility of user r be
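$u_r(x_r) = \begin{cases} \dfrac{w_r\,x_r^{1-\alpha_r}}{1 - \alpha_r} & \alpha_r > 0,\ \alpha_r \ne 1 \\ w_r \log x_r & \alpha_r = 1 \end{cases}$
• Also, $f_r = \infty$, i.e. everyone wants maximal rate.
• Fair allocation is solution to optimization problem
$\max_{(x_r)} \sum_r u_r(x_r)$
subject to $\sum_{r:\, \ell \in r} x_r \le c_\ell\ \ \forall \ell \in L$; $x_r \ge 0$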
• Next, we consider special examples of the above class of fair allocation.
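The utility family can be written as a small function. This sketch assumes the standard form w·x^(1−α)/(1−α) for α ≠ 1, which is increasing and concave in x; the function name and default weight are illustrative.

```python
import math


def alpha_fair_utility(x, alpha, w=1.0):
    """alpha-fair utility of rate x > 0 with weight w."""
    if alpha == 1:
        return w * math.log(x)
    return w * x ** (1.0 - alpha) / (1.0 - alpha)
```

For α = 2 this evaluates to −w/x (for example, −0.25 at x = 4 with w = 1), and for α = 1 it is the weighted logarithm.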
Examples
(I) Minimum delay fair
◦ αr = 2 ; ∀r
◦ $u_r(x_r) = \dfrac{w_r\,x_r^{1-\alpha_r}}{1 - \alpha_r} = -\dfrac{w_r}{x_r}$
(II) Proportional fair
◦ αr = 1 ; ∀r
◦ ur(xr) = +wr log xr
(III) Max-min fair
◦ αr = α ; ∀r ; α→∞
◦ An example
Resource Allocation
• Consider routing matrix $M = [M_{r\ell}]_{R \times L}$
◦ $M_{r\ell} = 1$ if data of user $r$ passes through link $\ell$
◦ $y = Mx$; $x = (x_r)$ rate vector
• Then, resource allocation problem
(RAC) $\max_{(x_r)} \sum_r u_r(x_r)$
subject to
$x_r \ge 0$; $y = Mx \le C$
• Instead of solving RAC, we will first solve
(RAC1) $\max_{x \ge 0} \sum_r u_r(x_r) - \sum_{\ell \in L} \int_0^{\sum_{s:\, \ell \in s} x_s} f_\ell(y)\,dy$.
• Here, constraint $y = Mx \le C$ is absent
• Instead, $f_\ell(\cdot)$ is penalty (price) function s.t.
◦ $\int_0^y f_\ell(x)\,dx \to \infty$ as $y \to \infty$
◦ non-decreasing, continuous, non-negative
RAC1
• Note that, for all choice of (αr)
◦ ur( · ) is strictly concave, increasing
• Hence, the new utility function of RAC1
◦ $V(x) = \sum_r u_r(x_r) - \sum_\ell \int_0^{\sum_{s:\, \ell \in s} x_s} f_\ell(y)\,dy$
− is strictly concave
− Proof [pg. 24, [SRIKANT]]
• We also assume that
◦ ur(xr)→ −∞ as xr → 0.
RAC1
• For maximizing strictly concave function V ( · )
◦ $V(x) \to -\infty$ as $\|x\| \to 0$
◦ $V(x) \to -\infty$ as $\|x\| \to \infty$
→ There is a unique solution of RAC1 that lies in the interior of set
$x \ge 0$.
• Hence, optimal solution must satisfy
$\dfrac{dV}{dx_r} = 0\ \ \forall r$
→ $u_r'(x_r) - \sum_{\ell:\, \ell \in r} f_\ell\left(\sum_{s:\, \ell \in s} x_s\right) = 0\ \ \forall r$
→ Solution of these equations leads to optimal rates
→ But, difficult to solve in decentralized manner
Decentralized Solution for RAC1
• Each node can compute traffic through outgoing link and corresponding
price at time t; i.e.
◦ $y_\ell(t) = \sum_{s:\, \ell \in s} x_s(t)$; $p_\ell(t) = f_\ell(y_\ell(t))$
◦ price on a route = sum of prices on its links, i.e.
$q_r(t) = \sum_{\ell:\, \ell \in r} p_\ell(t)$, or $q = Mp$
• Then, optimality condition is
$u_r'(x_r) - q_r = 0$
• A “natural” gradient algorithm is
$\dot{x}_r = k_r(x_r)\,(u_r'(x_r) - q_r)$,
with $k_r(\cdot)$ non-negative, non-decreasing continuous function.
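A discrete-time sketch of the gradient algorithm on a single shared link is given below. Everything specific here is an assumption chosen for illustration: logarithmic utilities u_r = w_r log x_r, gain k_r(x) = x, the penalty function f(y) = (y/c)^4, and a forward-Euler step of size `dt`.

```python
def primal_algorithm(weights, capacity, steps=20000, dt=0.01):
    """Euler steps of dx_r/dt = k_r(x_r) (u_r'(x_r) - q_r) on one shared link."""
    x = [1.0] * len(weights)
    for _ in range(steps):
        y = sum(x)                       # traffic on the link
        price = (y / capacity) ** 4      # hypothetical penalty function f(y)
        # With u_r = w_r log x_r and k_r(x) = x, the update is dx_r/dt = w_r - x_r * price.
        x = [xr + dt * (w - xr * price) for xr, w in zip(x, weights)]
    return x
```

With weights (1, 2) and capacity 10 the rates settle where w_r = x_r · price, so the second source ends up with twice the rate of the first. Each source only needs its own rate and the price fed back from the link, which is the decentralization the slide points to.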
RAC1
• Then: The gradient algorithm converges to optimal solution starting
from any initial condition
• Proof [pg. 26, [SRIKANT]]
→ Implication:
◦ To reach “modified” resource allocation, simple gradient algorithm
based on prices and utility function is sufficient
◦ Question: What prices/penalty lead to “correct” algorithm?
− well, penalty functions are approximations for capacity
constraints: $f_\ell(x) \to \infty$ as $x \to c_\ell^-$
− actual utilities are never exact
• Next, some exact penalty functions
Exact Penalty Function
• An explanation of penalty functions can be given in terms of “dual”
variables [explained later].
• Based on this idea, it is possible to introduce adaptive penalty function
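◦ Define $f_\ell(y_\ell, \tilde{c}_\ell) = \left(\dfrac{y_\ell}{\tilde{c}_\ell}\right)^{B_\ell}$; $\tilde{c}_\ell$ virtual capacity
◦ We will adapt $\tilde{c}_\ell$ during the course of algorithm as follows:
(A) $\dfrac{d\tilde{c}_\ell}{dt} = \alpha_\ell\,(c_\ell - y_\ell)^+_{\tilde{c}_\ell}$
◦ And, primal algorithm
$\dot{x}_r = k_r(x_r)\,(u_r'(x_r) - q_r)$;
$q_r = \sum_{\ell:\, \ell \in r} p_\ell = \sum_{\ell:\, \ell \in r} f_\ell(y_\ell, \tilde{c}_\ell)$.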
• The above combination provides a primal algorithm with adaptive
price-function
Exact Penalty Function
• Then: If routing matrix M is full rank, then algorithm solves RAC,
exactly.
• Proof [pg. 30, [SRIKANT]]
• Implications:
− It is possible to change rates and price function adaptively to
obtain solution of Resource allocation problem
− The “price” has natural optimization interpretation in terms of
dual variable
RAC: Dual
• RAC:
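$\max_{x \ge 0} \sum_r u_r(x_r)$
subject to
$Mx \le c$.
• Lagrangian of RAC: $x \ge 0$; $\lambda \ge 0$
$L(x; \lambda) = \sum_r u_r(x_r) - \lambda^T (Mx - c)$.
• Dual
$D(\lambda) = \max_{x \ge 0;\, Mx \le c} \left[\sum_r u_r(x_r) - \lambda^T (Mx - c)\right]$
$= \max_{x \ge 0;\, Mx \le c} \left[\sum_r u_r(x_r) - \sum_\ell \lambda_\ell y_\ell\right] + \sum_\ell \lambda_\ell c_\ell$,
where $y_\ell = \sum_{s:\, \ell \in s} x_s$.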
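Dual and Its Properties
• Dual optimal
$\inf_{\lambda \ge 0} D(\lambda)$
• By strong duality: $\inf_{\lambda \ge 0} D(\lambda) = \text{RAC}$
• For given $\lambda \ge 0$: maximizing $x$ is s.t. for all $r$
$\dfrac{\partial L}{\partial x_r} = 0 \Rightarrow \dfrac{\partial u_r(x_r)}{\partial x_r} - \sum_{\ell:\, \ell \in r} \lambda_\ell = 0$
• Further, by strong duality, for the optimizing $(\lambda, x)$ pair
$\lambda_\ell (y_\ell - c_\ell) = 0$.
• Based on above optimality conditions, a simple gradient algorithm: [pg.
28, [SRIKANT]]
◦ Set $x_r = (u_r')^{-1}\left(\sum_{\ell:\, \ell \in r} \lambda_\ell\right)$,
◦ $\dot{\lambda}_\ell = h_\ell(\lambda_\ell)\,(y_\ell - c_\ell)^+_{\lambda_\ell}$.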
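The dual iteration can be sketched for a single link. The choices below are assumptions made for illustration: logarithmic utilities u_r = w_r log x_r (so the inverse of u_r' maps a price λ to the rate w_r/λ), a constant gain h, a forward-Euler step, and a small positive floor on the price that stands in for the projection.

```python
def dual_algorithm(weights, capacity, steps=20000, dt=0.001, gain=1.0):
    """Price update d(lambda)/dt = h (y - c) with rates x_r = w_r / lambda."""
    lam = 1.0
    for _ in range(steps):
        x = [w / lam for w in weights]       # each user reacts to the current price
        excess = sum(x) - capacity           # y - c at the link
        # Keep the price strictly positive so that w / lam stays defined.
        lam = max(lam + dt * gain * excess, 1e-9)
    return lam, [w / lam for w in weights]
```

With weights (1, 2) and capacity 9 the price converges to 1/3 and the rates to (3, 6): the link is exactly full, and the capacity is split in proportion to the weights.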
RAC: Dual
• Then: The gradient descent algorithm described for dual of RAC
converges to optimal solution if the routing matrix M has full rank
◦ Proof [pg. 29, [SRIKANT]]
• Implication:
◦ By adjusting “prices” ($\lambda_\ell = p_\ell$) and rational user behavior, desired
rate-allocation can be achieved
• What if both prices and rate are changed simultaneously?
RAC: Primal–Dual
• Primal-dual algorithm
◦ Primal algorithm at sources
$\dot{x}_r = k_r(x_r)\,(u_r'(x_r) - q_r)$,
◦ Dual algorithm at links
$\dot{p}_\ell = k_\ell(p_\ell)\,(y_\ell - c_\ell)^+_{p_\ell}$.
• Then: The algorithm converges to the solution of RAC, starting from
any initial condition
Proof [pg. 28, [SRIKANT]]
Relation to Current Algorithms
• Primal algorithm
$\dot{x}_r = k_r(x_r)\,(u_r'(x_r) - q_r)$
◦ Changing rate based on feedback
• Recall, TCP changes window-size
◦ Let $w_r(t)$ be the window size
◦ Let $T_r$ be the RTT
◦ Then, rate $x_r(t) = w_r(t)/T_r$
◦ Let $q_r(t)$ be drop probability
• Then, TCP-Reno
◦ increases window if no drop (prob. $1 - q_r(t)$)
◦ decreases window if drop (prob. $q_r(t)$)
TCP versus Primal Algorithm
• Specifically, TCP dynamics
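$\dot{x}_r = \dfrac{(1 - q_r(t))\,x_r(t - T_r)}{T_r^2\,x_r(t)} - \beta\,q_r(t)\,x_r(t)\,x_r(t - T_r)$.
◦ Let $x_r(t - T_r) \approx x_r(t)$; then
$\dot{x}_r = \dfrac{1 - q_r(t)}{T_r^2} - \beta\,x_r^2(t)\,q_r(t)$.
◦ If $q_r(t)$ is small, $1 - q_r(t) \sim 1$; then
$\dot{x}_r = \dfrac{1}{T_r^2} - \beta\,x_r^2(t)\,q_r(t)$
$= \beta\,x_r^2(t)\left[\dfrac{1}{\beta\,T_r^2\,x_r^2(t)} - q_r(t)\right]$
$= k_r(x_r(t))\left[u_r'(x_r(t)) - q_r(t)\right]$
Then, $u_r(x_r) = -\dfrac{1}{\beta\,T_r^2\,x_r}$
That is: TCP ≡ weighted delay fair allocation!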
What About Dual?
• TCP corresponds to primal algorithm
• What about dual algorithm?
◦ qr( · ): corresponds to drop probability
→ Queue management algorithm corresponds to dual algorithm
• Popular queue-management algorithm
◦ Random Early Detection (RED)
◦ Mark (or drop) packet with probability proportional to queue-size
◦ If buffer-size is B, then probability of marking packet when
queue-size is Q is $Q/B$.
◦ Let y be arrival rate
Dual = Active Queue Management
• Now, queue-dynamics
$\dot{Q} = (y - c)^+_Q$
• Then, marking (drop) probability
$p = Q/B \Rightarrow \dot{p} = \alpha\,(y - c)^+_p$
→ This has exactly the same dynamics as prices or Lagrange
multiplier in dual or primal-dual algorithm
• Note that
$q_r \approx \sum_{\ell:\, \ell \in r} p_\ell$;
because if drops at links are independent then
$(1 - q_r) = \prod_{\ell:\, \ell \in r} (1 - p_\ell) \approx 1 - \sum_{\ell:\, \ell \in r} p_\ell$,
when $p_\ell$ is small.
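The quality of the small-probability approximation is easy to check numerically. The function name and the sample per-link probabilities below are made up for illustration.

```python
import math


def route_drop_probability(link_probs):
    """Exact end-to-end drop probability when link drops are independent."""
    return 1.0 - math.prod(1.0 - p for p in link_probs)
```

For per-link probabilities 0.01, 0.02 and 0.005 the exact value is about 0.03465, while the sum of the link probabilities is 0.035, so the sum slightly overestimates the route drop probability.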
Next
• We assumed that RAC algorithm or TCP has immediate feedback.
◦ In practice, feedback is always delayed
◦ Questions:
(1) How does delayed feedback affect performance?
(2) If delay is very large: feedback is useless. Then how large
can delay be so that the network can still operate?
• Next, we will see how ideas from control theory can be useful.
→ Professor Mitter’s lectures
Other Issues
• If everything works well, then
◦ TCP and RED seem good algorithms
• Question:
◦ What if users do not cooperate?
• Malicious users can lead to undesirable network performance.
→ Next, we study network security.