TCP on current network technologies
DEA DIF lecture
C. Pham
These slides borrow material from various sources, which are indicated below each slide when necessary.
The congestion phenomenon
From Computer Networks, A. Tanenbaum
When does congestion happen?
• Traffic from fast links to a bottleneck link, traffic aggregation
• Difference in router speeds
[Figure: links labeled 10 Mbps, 100 Mbps, and 1.5 Mbps]
Congestion: A Close-up View
[Figure: throughput vs. load and delay vs. load, marking the knee, the cliff, packet loss, and congestion collapse]
• knee – point after which
  – throughput increases very slowly
  – delay increases fast
• cliff – point after which
  – throughput starts to decrease very fast to zero (congestion collapse)
  – delay approaches infinity
• Note (in an M/M/1 queue)
  – delay = 1/(1 – utilization), in units of the mean service time
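To get a feel for the M/M/1 note, the short sketch below tabulates 1/(1 – utilization) for a few loads. It assumes the delay is expressed in units of the mean service time; the function name is illustrative and not from the slides.

```python
def mm1_normalized_delay(utilization: float) -> float:
    """Mean M/M/1 delay in units of the mean service time: 1 / (1 - utilization)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1) for a stable queue")
    return 1.0 / (1.0 - utilization)


if __name__ == "__main__":
    # Delay stays small for a long time, then explodes close to full load.
    for rho in (0.1, 0.5, 0.8, 0.9, 0.99):
        print(f"utilization {rho:.2f} -> delay {mm1_normalized_delay(rho):.1f} service times")
```

The steep growth near utilization 1 is the "delay approaches infinity" behaviour to the right of the cliff.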
From Shivkumar Kalyanaraman
Congestion Control vs. Congestion Avoidance
[Figure: throughput vs. load, marking the knee, the cliff, and congestion collapse]
• Congestion control goal
  – stay left of cliff
• Congestion avoidance goal
  – stay left of knee
• Right of cliff:
  – Congestion collapse
From Shivkumar Kalyanaraman
Basic Control Model
• Let’s assume window-based operation
• Reduce window when congestion is perceived
  – How is congestion signaled?
    • Either mark or drop packets
  – When is a router congested?
    • Drop tail queues – when queue is full
    • Average queue length – at some threshold
• Increase window otherwise
  – Probe for available bandwidth – how?
From Shivkumar Kalyanaraman
Simple linear control
• Many different possibilities for reaction to congestion and methods for probing
  – Examine simple linear controls
  – Window(t + 1) = a + b Window(t)
  – Different a_i/b_i for increase and a_d/b_d for decrease
• Supports various reactions to signals
  – Increase/decrease additively
  – Increase/decrease multiplicatively
  – Which of the four combinations is optimal?
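As a minimal sketch of the linear control Window(t + 1) = a + b Window(t), the snippet below applies one step of each of the four policies to the same window. The (a, b) values are illustrative assumptions chosen for the example, not values given in the slides.

```python
def linear_update(window: float, a: float, b: float) -> float:
    """One step of the linear control: Window(t + 1) = a + b * Window(t)."""
    return a + b * window


# Illustrative (a, b) pairs; these constants are assumptions, not from the slides.
POLICIES = {
    "additive increase": (1.0, 1.0),
    "additive decrease": (-1.0, 1.0),
    "multiplicative increase": (0.0, 2.0),
    "multiplicative decrease": (0.0, 0.5),
}

if __name__ == "__main__":
    for name, (a, b) in POLICIES.items():
        print(f"{name}: 10 -> {linear_update(10.0, a, b)}")
```

Additive policies use b = 1 and vary a; multiplicative policies use a = 0 and vary b.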
From Shivkumar Kalyanaraman
Phase plots
• Simple way to visualize behavior of competing flows over time
• Caveat: assumes 2 flows, synchronized feedback, equal RTT, discrete “rounds” of operation
[Figure: phase plot of User 1’s allocation x1 vs. User 2’s allocation x2, showing the efficiency line x1+x2=C, the fairness line x1=x2, the optimal point, and the overload and underutilization regions]
From Shivkumar Kalyanaraman
Additive Increase/Decrease
[Figure: phase plot of User 1’s allocation x1 vs. User 2’s allocation x2 with the efficiency and fairness lines, showing points T0 and T1]
• Both X1 and X2 increase/decrease by the same amount over time
  – Additive increase improves fairness & increases load
  – Additive decrease reduces fairness & decreases load
From Shivkumar Kalyanaraman
Multiplicative Increase/Decrease
• Both X1 and X2 increase/decrease by the same factor over time
  – Fairness unaffected (constant), but load increases (MI) or decreases (MD)
[Figure: phase plot of User 1’s allocation x1 vs. User 2’s allocation x2 with the efficiency and fairness lines, showing points T0 and T1]
From Shivkumar Kalyanaraman
Additive Increase/Multiplicative Decrease (AIMD)
• Assumption: decrease policy must (at minimum) reverse the load increase over-and-above efficiency line
• Implication: decrease factor should be conservatively set to account for any congestion detection lags etc.
[Figure: phase plot of User 1’s allocation x1 vs. User 2’s allocation x2 with the efficiency and fairness lines, showing the trajectory through points x0, x1, x2]
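The AIMD phase plot can be reproduced numerically. The sketch below is a toy model under the same caveats as the phase plots (2 flows, synchronized feedback, equal RTT, discrete rounds); the capacity, the increase step `alpha`, and the decrease factor `beta` are assumed values for illustration.

```python
def aimd_two_flows(x1, x2, capacity, alpha=1.0, beta=0.5, rounds=200):
    """Simulate two AIMD flows with synchronized feedback; return the (x1, x2) history."""
    history = [(x1, x2)]
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Overload: both flows decrease multiplicatively (gap between them shrinks).
            x1, x2 = x1 * beta, x2 * beta
        else:
            # Underutilization: both flows increase additively (gap is unchanged).
            x1, x2 = x1 + alpha, x2 + alpha
        history.append((x1, x2))
    return history


if __name__ == "__main__":
    trace = aimd_two_flows(5.0, 60.0, capacity=100.0)
    print("start:", trace[0], "end:", trace[-1])
```

Each multiplicative decrease halves the gap x2 - x1 while additive increase leaves it unchanged, so the trajectory drifts toward the fairness line while oscillating around the efficiency line.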
From Shivkumar Kalyanaraman
Further readings
• P. Gevros, J. Crowcroft, P. Kirstein, and S. Bhatti, "Congestion control mechanisms and the best effort service model," IEEE Network, vol. 15, pp. 16-26, May/June 2001.
Reliable communications on the Internet
• TCP (RFC 793) was proposed as early as 1974, without congestion control
• Successfully tested on the ARPANET until the first congestion collapse in 1986
• Jacobson introduces the TCP congestion control (CC) mechanism (AIMD) => TCP Tahoe
• Now, 90% of reliable transfers use TCP Tahoe, Reno or New Reno
TCP, adding reliability to IP
[Figure: Host A / Host B timing diagrams for three scenarios (lost ACK, early timeout, cumulated ACKs), with segments Seq=92 (8 bytes of data) and Seq=100 (20 bytes of data), acknowledgements ACK=100 and ACK=120, a lost packet marked X, and retransmission timeouts]
TCP (Tahoe) Congestion Control
• Maintains three variables:
  – cwnd – congestion window
  – rcv_win – receiver advertised window
  – ssthresh – threshold size (used to update cwnd)
    • Rough estimate of knee point…
• For sending use: win = min(rcv_win, cwnd)
From Shivkumar Kalyanaraman
TCP congestion control: the big picture
• CongW grows exponentially (slow start), then linearly (congestion avoidance)
• On loss, sets the threshold to half the current window (multiplicative decrease) and restarts with CongW = 1 packet
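The big picture can be sketched as a per-RTT trace of the congestion window. This is a simplified model, not an implementation of any real stack: it assumes one update per round trip, an initial threshold of 16 packets, a threshold set to half the window at the time of loss (with a floor of 2 packets), and loss rounds supplied by hand.

```python
def tahoe_trace(rounds, loss_rounds, ssthresh=16.0):
    """Return the congestion window (in packets) at the start of each round trip."""
    cwnd = 1.0
    trace = []
    for r in range(rounds):
        trace.append(cwnd)
        if r in loss_rounds:
            # Assumption: threshold becomes half the current window, floor of 2.
            ssthresh = max(cwnd / 2.0, 2.0)
            cwnd = 1.0                          # restart with one packet
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2.0, ssthresh)    # slow start: exponential growth
        else:
            cwnd += 1.0                         # congestion avoidance: linear growth
    return trace


if __name__ == "__main__":
    print(tahoe_trace(12, loss_rounds={7}))
```

With a loss in round 7, the window doubles up to the threshold, grows by one packet per round, then falls back to 1 and starts doubling again toward the new, lower threshold.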
From Computer Networks, A. Tanenbaum
TCP: Slow Start
• Goal: initialize system and discover congestion quickly
• How? Quickly increase cwnd until network congested => get a rough estimate of the optimal cwnd
• How do we know when network is congested?
  – packet loss (TCP)
    • over the cliff here => congestion control
  – congestion notification (e.g., DEC Bit, ECN)
    • over knee; before the cliff => congestion avoidance
• Implications of using loss as congestion indicator
  – Late congestion detection if the buffer sizes are larger
  – Higher speed links or large buffers => larger windows => higher probability of burst loss
  – Interactions with retransmission algorithm and timeouts
From Shivkumar Kalyanaraman
TCP: Slow Start
• Whenever starting traffic on a new connection, or whenever increasing traffic after congestion was experienced:
  • Set cwnd = 1
  • Each time a segment is acknowledged, increment cwnd by one (cwnd++).
• Does Slow Start increment slowly? Not really. In fact, the increase of cwnd is exponential!! Window increases to W in RTT * log2(W)
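The RTT * log2(W) claim can be checked with a few lines. This sketch assumes no loss, no delayed ACKs, and that every segment of the current window is acknowledged within one round trip; the function name is illustrative.

```python
import math


def slow_start_rounds(target_window: int) -> int:
    """Count round trips for cwnd to grow from 1 to at least target_window."""
    cwnd, rounds = 1, 0
    while cwnd < target_window:
        # cwnd segments are acknowledged this round, each ACK adding 1 to cwnd.
        cwnd += cwnd
        rounds += 1
    return rounds


if __name__ == "__main__":
    for w in (8, 64, 1000):
        print(w, slow_start_rounds(w), math.ceil(math.log2(w)))
```

The count matches ceil(log2(W)) round trips, which is why "slow" start is in fact exponential.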
From Shivkumar Kalyanaraman
Slow Start Example
• The congestion window increases quite rapidly in fact
[Figure: sender/receiver exchange. cwnd = 1: segment 1 is sent, then ACK for segment 1. cwnd = 2: segments 2 and 3 are sent, then ACK for segments 2 + 3. cwnd = 4: segments 4, 5, 6 and 7 are sent, then ACK for segments 4+5+6+7. cwnd = 8]
From Shivkumar Kalyanaraman
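[Figure: Round Trip Time. Slow start timeline marked at 0R, 1R, 2R and 3R, with "One RTT" and "One pkt time" indicated; packet 1 is sent in the first round, packets 2-3 in the second, 4-7 in the third, and 8-15 in the fourth]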
From Shivkumar Kalyanaraman
Slow Start Sequence Plot
[Figure: sequence number vs. time]
CongW doubles every RTT
From Shivkumar Kalyanaraman
Congestion Avoidance
• Goal: maintain operating point at the left of the cliff
• How?
  – additive increase: starting from the rough estimate (ssthresh), slowly increase cwnd to probe for additional available bandwidth
  – multiplicative decrease: cut congestion window size aggressively if a loss is detected.
From Shivkumar Kalyanaraman
Congestion Avoidance
Purpose: Slow down “Slow Start”
If cwnd > ssthresh then each time a segment is acknowledged, increment cwnd by 1/cwnd => cwnd += 1/cwnd.
(So cwnd is increased by one only once all segments of the current window have been acknowledged)
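A small sketch of the per-ACK rule makes the "one per window" effect visible. It assumes cwnd is counted in segments and that exactly one ACK arrives per segment of the current window in each round trip; the helper names are illustrative.

```python
def on_ack(cwnd: float, ssthresh: float) -> float:
    """Update cwnd (in segments) when one segment is acknowledged."""
    if cwnd > ssthresh:
        return cwnd + 1.0 / cwnd   # congestion avoidance: cwnd += 1/cwnd
    return cwnd + 1.0              # slow start: cwnd++


def one_round(cwnd: float, ssthresh: float) -> float:
    """Apply on_ack once per segment of the window held at the start of the round."""
    for _ in range(int(cwnd)):
        cwnd = on_ack(cwnd, ssthresh)
    return cwnd


if __name__ == "__main__":
    print(one_round(4.0, ssthresh=8.0))    # slow start: window doubles
    print(one_round(10.0, ssthresh=5.0))   # congestion avoidance: grows by about 1
```

In congestion avoidance the ten increments of roughly 1/10 add up to slightly less than one segment, because cwnd grows a little with every ACK.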
From Shivkumar Kalyanaraman
Congestion Avoidance Sequence Plot
[Figure: sequence number vs. time]
Window grows by 1 every round
From Shivkumar Kalyanaraman
From Guy Leduc, RHDM 2002
From Chadi Barakat, PhD defense, 2001
Putting it together
From Guy Leduc, RHDM 2002
TCP evolution
• 1974: TCP described by Vint Cerf and Bob Kahn in IEEE Trans Comm
• 1975: Three-way handshake, Raymond Tomlinson, in SIGCOMM 75
• 1982: TCP & IP, RFC 793 & 791
• 1983: BSD Unix 4.2 supports TCP/IP
• 1984: Nagle’s algorithm to reduce overhead of small packets; predicts congestion collapse
• 1986: Congestion collapse observed
• 1987: Karn’s algorithm to better estimate round-trip time
• 1988: Van Jacobson’s algorithms: congestion avoidance and congestion control (most implemented in 4.3BSD Tahoe)
• 1990: 4.3BSD Reno: fast retransmit, delayed ACKs
From Shivkumar Kalyanaraman
TCP in the 90s
• 1993: TCP Vegas (Brakmo et al), real congestion avoidance
• 1994: ECN (Floyd), Explicit Congestion Notification
• 1994: T/TCP (Braden), Transaction TCP
• 1996: SACK TCP (Floyd et al), Selective Acknowledgement
• 1996: Hoe, Improving TCP startup
• 1996: FACK TCP (Mathis et al), extension to SACK
From Shivkumar Kalyanaraman
Further readings
• V. Jacobson, "Congestion avoidance and control", ACM SIGCOMM, 1988.
• L. Brakmo and L. Peterson, "TCP Vegas: End to end congestion avoidance on a global internet", IEEE Journal on Selected Areas in Communications, 13(8), October 1995.
• J. Widmer, R. Denda, and M. Mauve, "A survey on TCP-friendly congestion control," IEEE Network, vol. 15, pp. 28-37, May/June 2001.
• http://www.cnaf.infn.it/~ferrari/tcp.html for a list of papers on TCP analysis
TCP Modeling
• Given the congestion behavior of TCP, can we predict what type of performance we should get?
• What are the important factors?
  – Loss rate
    • Affects how often window is reduced
  – RTT
    • Affects increase rate and relates BW to window
  – RTO
    • Affects performance during loss recovery
  – MSS
    • Affects increase rate
From Shivkumar Kalyanaraman
From Guy Leduc, RHDM 2002
[Figure annotations: (N/2)^2 + (1/2)(N/2)^2, from (N + N/2)/2]
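The annotations above look like the usual sawtooth argument behind the "square root" formula, so the following sketch works it through numerically. It is my reading of the figure, not the slide's own derivation: N is assumed to be the peak window, the window is assumed to climb from N/2 to N by one packet per round trip, and exactly one packet is assumed lost per cycle.

```python
import math


def sawtooth_cycle(n: int):
    """One steady-state cycle: window grows from n/2 to n by 1 per RTT, then one loss.

    Returns (average window in packets, loss rate p).
    """
    windows = list(range(n // 2, n + 1))
    packets = sum(windows)              # close to (N/2)^2 + (1/2)(N/2)^2 = (3/8) N^2
    avg_window = packets / len(windows) # equals (N + N/2)/2 = 3N/4
    loss_rate = 1.0 / packets           # assumption: one loss per cycle
    return avg_window, loss_rate


if __name__ == "__main__":
    avg, p = sawtooth_cycle(100)
    print(avg, p, avg * math.sqrt(p), math.sqrt(1.5))
```

Under these assumptions the average window times sqrt(p) stays close to sqrt(3/2), which gives the familiar form: throughput is roughly (MSS/RTT) * sqrt(3/2) / sqrt(p).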
Further readings
• The first "square root" TCP formula
  – Matthew Mathis, Jeffrey Semke, Jamshid Mahdavi, Teunis Ott, "The Macroscopic Behavior of the TCP Congestion Avoidance Algorithm", Computer Communications Review, Vol. 27(3), July 1997
• More accurate TCP modelling
  – Padhye et al., "Modeling TCP Throughput: A Simple Model and its Empirical Validation", SIGCOMM 98.
• Even more accurate TCP modelling
  – Chadi Barakat, "TCP modeling and validation", IEEE Network, vol. 15, no. 3, pp. 38-47, May 2001.
Some results
From Padhye, SIGCOMM98
Modeling means considering
• Timeouts and retransmit strategies
• How the window size increases
• How the RTTs are distributed
• How the losses are correlated
• How congestion events are distributed
• And… what has not been discovered yet!
TCP in the Internet jungle!
• High Speed Networks
  – Long distance, large bandwidth, low BER
  – Delay × bandwidth product is high (memory)
• Asymmetric networks
  – Bandwidth, latency (cable, xDSL)
• Wireless networks
  – High BER on lossy links
  – Satellite, high latencies
• How good is the old design of TCP?
TCP’s parameters on performance
• Transmission window size
• Congestion window size
• Buffer size
• Timers
• ?
High Speed Networks
Asymmetric networks
Conclusions
The most obvious problems
• Problem with TCP header
  – Rwnd is 16 bits only…
  – …solved by the Window Scale Option
  – SeqNum is 32 bits only…
  – …solved by Protect Against Wrapped Sequence Number (PAWS)
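Some back-of-the-envelope arithmetic shows why these two header fields matter on fast paths. The sketch assumes at most one receive window can be in flight per round trip, a 0.1 s RTT and a 1 Gbps link as example values, and the maximum Window Scale shift of 14; the helper names are illustrative.

```python
MAX_UNSCALED_WINDOW = 2**16 - 1   # bytes, largest value of the 16-bit Rwnd field
MAX_WINDOW_SHIFT = 14             # largest shift count of the Window Scale option
SEQ_SPACE = 2**32                 # bytes covered by the 32-bit sequence number


def max_throughput_bps(window_bytes: int, rtt_s: float) -> float:
    """Upper bound on throughput when at most one window is in flight per RTT."""
    return window_bytes * 8 / rtt_s


def seq_wrap_time_s(rate_bps: float) -> float:
    """Time needed to send 2**32 bytes, i.e. to wrap the sequence space."""
    return SEQ_SPACE * 8 / rate_bps


if __name__ == "__main__":
    print(max_throughput_bps(MAX_UNSCALED_WINDOW, 0.1) / 1e6, "Mbps without scaling")
    scaled = MAX_UNSCALED_WINDOW << MAX_WINDOW_SHIFT
    print(max_throughput_bps(scaled, 0.1) / 1e9, "Gbps with maximum scaling")
    print(seq_wrap_time_s(1e9), "seconds to wrap at 1 Gbps")
```

Without scaling, a 0.1 s path is capped near 5 Mbps regardless of link speed, and at 1 Gbps the sequence space wraps in about half a minute, which is why old duplicates must be told apart by other means.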
From 1st PFLDnet Workshop, Sally Floyd
Motivation (1)
• Basic assumption of TCP:
  – Packet loss is due to network congestion
• TCP thus reacts to packet loss with exponential backoff
• After backoff, transmission speed grows only linearly
From 1st PFLDnet Workshop, Steven Wallace
Motivation (2)
• What about high-speed research networks?
  – Packet loss is usually not due to congestion
  – Loss comes from equipment, cabling, etc.
  – This loss cannot necessarily be avoided
• TCP will collapse even though plenty of capacity is still available
From 1st PFLDnet Workshop, Steven Wallace
High Speed TCP (S. Floyd)
• Modify the response function to allow for more link utilization in current high-speed networks, where the loss rate is smaller than that of the networks TCP was designed for (at most 10^-2)
TCP Throughput (Mbps)   RTTs Between Losses   W         P
---------------------   -------------------   -------   ------------
1                       5.5                   8.3       0.02
10                      55.5                  83.3      0.0002
100                     555.5                 833.3     0.000002
1000                    5555.5                8333.3    0.00000002
10000                   55555.5               83333.3   0.0000000002

Table 1: RTTs Between Congestion Events for Standard TCP, for 1500-Byte Packets and a Round-Trip Time of 0.1 Seconds.
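The rows of Table 1 can be approximately recomputed from the throughput alone. The sketch below is my reconstruction of how the numbers relate, not a formula quoted from the draft: it assumes W is the average window needed to fill the pipe, that a sawtooth cycle lasts W/1.5 round trips, and that the loss rate follows P = 1.5/W^2.

```python
def standard_tcp_row(throughput_mbps: float, rtt_s: float = 0.1, packet_bytes: int = 1500):
    """Return (W, RTTs between losses, P) for a target throughput with Standard TCP."""
    # Average window in packets needed to sustain the throughput over one RTT.
    w = throughput_mbps * 1e6 * rtt_s / (8 * packet_bytes)
    # Assumption: average window W means a peak of 4W/3 and a cycle of 2W/3 RTTs.
    rtts_between_losses = w / 1.5
    # Assumption: standard response function W = sqrt(1.5 / P), solved for P.
    p = 1.5 / w**2
    return w, rtts_between_losses, p


if __name__ == "__main__":
    for mbps in (1, 10, 100, 1000, 10000):
        w, rtts, p = standard_tcp_row(mbps)
        print(f"{mbps:>6} Mbps  W={w:.1f}  RTTs between losses={rtts:.1f}  P={p:.1e}")
```

At 10 Gbps this gives more than 55,000 round trips, about an hour and a half at a 0.1 s RTT, between two losses, with a loss rate near 2 * 10^-10.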
From draft-ietf-tsvwg-highspeed-01.txt
Modifying the response

Packet Drop Rate P   Congestion Window W   RTTs Between Losses
------------------   -------------------   -------------------
10^-2                12                    8
10^-3                38                    25
10^-4                120                   80
10^-5                379                   252
10^-6                1200                  800
10^-7                3795                  2530
10^-8                12000                 8000
10^-9                37948                 25298
10^-10               120000                80000

Table 2: TCP Response Function for Standard TCP. The average congestion window W in MSS-sized segments is given as a function of the packet drop rate P.
To specify a modified response function for HighSpeed TCP, we use three parameters, Low_Window, High_Window, and High_P. To ensure TCP compatibility, the HighSpeed response function uses the same response function as Standard TCP when the current congestion window is at most Low_Window, and uses the HighSpeed response function when the current congestion window is greater than Low_Window. In this document we set Low_Window to 38 MSS-sized segments, corresponding to a packet drop rate of 10^-3 for TCP.
Packet Drop Rate P   Congestion Window W   RTTs Between Losses
------------------   -------------------   -------------------
10^-2                12                    8
10^-3                38                    25
10^-4                263                   38
10^-5                1795                  57
10^-6                12279                 83
10^-7                83981                 123
10^-8                574356                180
10^-9                3928088               264
10^-10               26864653              388

Table 3: TCP Response Function for HighSpeed TCP. The average congestion window W in MSS-sized segments is given as a function of the packet drop rate P.
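The window column of the two response-function tables can be compared with a short sketch. The closed forms used here are assumptions that the slide does not state: W = 1.2/sqrt(P) for Standard TCP, and W = 0.12/P^0.835 for HighSpeed TCP below a drop rate of 10^-3, which is the approximation I recall from the HighSpeed TCP specification and which matches the tabulated windows closely.

```python
def standard_tcp_window(p: float) -> float:
    """Assumed Standard TCP response function: W = 1.2 / sqrt(p)."""
    return 1.2 / p ** 0.5


def highspeed_tcp_window(p: float, low_p: float = 1e-3) -> float:
    """Assumed HighSpeed TCP response: standard for p >= low_p, else 0.12 / p**0.835."""
    if p >= low_p:
        return standard_tcp_window(p)
    return 0.12 / p ** 0.835


if __name__ == "__main__":
    for exp in range(2, 11):
        p = 10.0 ** -exp
        print(f"10^-{exp}: standard W={standard_tcp_window(p):.0f}  "
              f"HighSpeed W={highspeed_tcp_window(p):.0f}")
```

The two functions coincide down to a drop rate of 10^-3 and diverge below it, where the HighSpeed window grows much faster as the drop rate falls.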
From draft-ietf-tsvwg-highspeed-01.txt
See it in image
From 1st PFLDnet Workshop, Sally Floyd
Some simulations
From 1st PFLDnet Workshop, Sally Floyd
Some simulations
From 1st PFLDnet Workshop, Sally Floyd
Adding Limited Slow-Start
From 1st PFLDnet Workshop, Sally Floyd
Further readings
• http://www.ietf.org/internet-drafts/draft-ietf-tsvwg-highspeed-01.txt
• http://www.ietf.org/internet-drafts/draft-ietf-tsvwg-slowstart-00.txt
• http://www.icir.org/floyd/hstcp.html
Scalable TCP
• TCP’s overview (steady state)
  – On each ACK: cwnd = cwnd + 1/cwnd
  – On each loss: cwnd = cwnd - (1/2) cwnd
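To see why the steady-state rules of standard TCP scale poorly, the sketch below counts the round trips needed to climb back to the pre-loss window. It approximates the per-ACK rule as +1 packet per RTT. For contrast it also includes a Scalable TCP variant with a fixed per-ACK increment a = 0.01 and a decrease factor b = 0.125; those two constants are assumptions recalled from Tom Kelly's Scalable TCP proposal and do not appear in the slide text.

```python
def recovery_rtts_standard(w: float) -> int:
    """RTTs for standard TCP to return to window w after one loss."""
    cwnd, rtts = w - 0.5 * w, 0
    while cwnd < w:
        cwnd += 1.0          # cwnd ACKs per RTT, each adding 1/cwnd
        rtts += 1
    return rtts


def recovery_rtts_scalable(w: float, a: float = 0.01, b: float = 0.125) -> int:
    """RTTs for the assumed Scalable TCP rules to return to window w after one loss."""
    cwnd, rtts = w - b * w, 0
    while cwnd < w:
        cwnd += a * cwnd     # cwnd ACKs per RTT, each adding a
        rtts += 1
    return rtts


if __name__ == "__main__":
    for w in (100, 1000, 10000):
        print(w, recovery_rtts_standard(w), recovery_rtts_scalable(w))
```

The standard recovery time grows linearly with the window (W/2 round trips), while under the assumed Scalable TCP rules it stays constant whatever the window size.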
From 1st PFLDnet Workshop, Tom Kelly
STCP generalized algorithm
From 1st PFLDnet Workshop, Tom Kelly
STCP in images
From 1st PFLDnet Workshop, Tom Kelly
Fairness
From 1st PFLDnet Workshop, Tom Kelly
Some results
From 1st PFLDnet Workshop, Tom Kelly
Further readings
• http://www-lce.eng.cam.ac.uk/~ctk21/scalable/
• And much more…
  – Tsunami
  – E2E lightpath
  – Fast TCP
  – UDP Blast
  – XFTP
  – …
High Speed Networks
Asymmetric networks
Conclusions
Research on TCP means
• Several possible research objectives
  – Remain compatible with currently deployed versions of TCP (mainly Reno and New Reno)
    • On better understanding the behavior of TCP (self-similarity, bottleneck)
    • On proposing small variations on the current TCP (ECN, HSTCP…)
  – On proposing completely new solutions
    • For very high speed, lightpath…
    • For changing the congestion control part