10/19/2010 CSE 124 Networked Services Fall 2010
CSE 124 Networked Services Fall 2010
Lecture-8
Instructor: B. S. Manoj, Ph.D
http://cseweb.ucsd.edu/classes/fa10/cse124
Updates
Webboard and project partners
Project-2 idea development
Homework-2
Capture a sufficiently long TCP session and observe
Three-way Connection setup
Estimated RTT behavior
Details will be placed on the web
Deadline: 22nd October
Midterm
On November 2nd, 2010
Logical questions, some definitions, and some example pseudo code or code ideas (1-2 problems on TCP)
End-to-end Transport in networks
State transition diagram of TCP
[Figure: TCP state transition diagram; each transition is labeled Event/Action]
TCP Connection Management (cont…)
Closing a connection:
client closes socket: close();
Step 1: client end system sends TCP FIN control segment to server
Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.
[Figure: client/server timeline for steps 1 and 2: the client closes, the server closes, and the client enters timed wait before closed]
TCP Connection Management (contd...)
Step 3: client receives FIN, replies with ACK.
Enters “timed wait” - will respond with ACK to received FINs
Step 4: server receives ACK. Connection closed.
[Figure: client/server timeline for steps 3 and 4: both sides closing, the client in timed wait, then both closed]
TCP Connection Management (contd...)
[Figures: TCP client lifecycle; TCP server lifecycle]
Main close possibilities
• This side closes
– ESTABLISHED -> FIN_WAIT_1 -> FIN_WAIT_2 -> TIME_WAIT -> CLOSED
• Other side closes
– ESTABLISHED -> CLOSE_WAIT -> LAST_ACK -> CLOSED
• Both sides close at the same time
– ESTABLISHED -> FIN_WAIT_1 -> CLOSING -> TIME_WAIT -> CLOSED
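The three close paths can be traced with a small transition table. This is a minimal sketch: the state names are the ones on the slide, but the event names (`app_close`, `recv_fin`, `recv_ack`, `wait_expired`) are hypothetical labels chosen for illustration, and only the closing part of the state machine is modeled.

```python
# State names come from the slide; the event names are hypothetical labels.
TRANSITIONS = {
    ("ESTABLISHED", "app_close"): "FIN_WAIT_1",   # this side sends FIN
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",     # our FIN was ACKed
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",      # peer's FIN arrives, we ACK it
    ("FIN_WAIT_1", "recv_fin"): "CLOSING",        # peer's FIN crossed ours
    ("CLOSING", "recv_ack"): "TIME_WAIT",
    ("TIME_WAIT", "wait_expired"): "CLOSED",
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",    # other side closed first
    ("CLOSE_WAIT", "app_close"): "LAST_ACK",      # this side sends its FIN
    ("LAST_ACK", "recv_ack"): "CLOSED",
}


def close_path(events, state="ESTABLISHED"):
    """Return the list of states visited while applying the events in order."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]  # KeyError means an illegal event
        path.append(state)
    return path


if __name__ == "__main__":
    active = ["app_close", "recv_ack", "recv_fin", "wait_expired"]
    print(" -> ".join(close_path(active)))
```

Feeding the events `recv_fin, app_close, recv_ack` traces the "other side closes" path, and `app_close, recv_fin, recv_ack, wait_expired` traces the simultaneous close.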
Connection Established State
• Data transfer
– Sender transmits data in sequence
– Receiver acknowledges the received data packets
– Sequence number helps in detecting loss and in-order delivery
• Flow control
• Sliding window management
– Helps in in-order delivery, reliable delivery, and flow control
– Receiver window
• Reliable delivery
• Round trip time estimation
– Adaptive approach
• Congestion control
– Window management
– Slow start, congestion avoidance phase, congestion control phase
– Fast retransmit, fast recovery
– Many variants
Flow Control
Sliding Window Management
• LastByteAcked ≤ LastByteSent
• LastByteSent ≤ LastByteWritten
• LastByteRead < NextByteExpected
• NextByteExpected ≤ LastByteRcvd + 1 (equal if everything arrived in order; otherwise it points to the beginning of the gap)
Receiver
• LastByteRcvd – LastByteRead ≤ MaxRcvBuffer
• AdvertisedWindow = MaxRcvBuffer – ((NextByteExpected – 1) – LastByteRead)
Sender
• LastByteSent – LastByteAcked ≤ AdvertisedWindow
• EffectiveWindow = AdvertisedWindow – (LastByteSent – LastByteAcked)
– EffectiveWindow > 0 for transmitter to transmit
• LastByteWritten – LastByteAcked ≤ MaxSendBuffer
– If the application tries to write y bytes and (LastByteWritten – LastByteAcked) + y > MaxSendBuffer, the sender application is prevented from sending more
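The window bookkeeping on this slide can be written out as three small functions. This is a sketch under the assumption that all quantities are byte counts; the function and parameter names are illustrative, not taken from any real TCP stack.

```python
def advertised_window(max_rcv_buffer, next_byte_expected, last_byte_read):
    """Free receive-buffer space that the receiver advertises to the sender."""
    return max_rcv_buffer - ((next_byte_expected - 1) - last_byte_read)


def effective_window(advertised, last_byte_sent, last_byte_acked):
    """Bytes the sender may still put on the wire; it must be > 0 to transmit."""
    return advertised - (last_byte_sent - last_byte_acked)


def write_fits(y, last_byte_written, last_byte_acked, max_send_buffer):
    """True if a write of y bytes fits in the send buffer, False if it must block."""
    return (last_byte_written - last_byte_acked) + y <= max_send_buffer


if __name__ == "__main__":
    adv = advertised_window(max_rcv_buffer=4096, next_byte_expected=3001,
                            last_byte_read=1000)
    print("AdvertisedWindow:", adv)
    print("EffectiveWindow:", effective_window(adv, last_byte_sent=3500,
                                               last_byte_acked=3000))
    print("write of 2000 bytes fits:",
          write_fits(2000, last_byte_written=5000, last_byte_acked=3000,
                     max_send_buffer=4096))
```

With these example numbers the receiver holds 2000 unread bytes, so it advertises 2096; the sender already has 500 bytes in flight, which leaves an effective window of 1596.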
Linux TCP: Packet Reception (receiver), /proc/sys/net/ipv4/tcp_rmem
Linux TCP: Packet Transmission (sender), /proc/sys/net/ipv4/tcp_wmem
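On a Linux host the two files named on this slide, `tcp_rmem` and `tcp_wmem` under `/proc/sys/net/ipv4`, each hold three integers: the minimum, default, and maximum socket buffer size in bytes. A small sketch for reading them follows; the helper names `parse_tcp_mem` and `read_tcp_mem` are made up for this example, and the reader only works where `/proc` is available.

```python
from pathlib import Path


def parse_tcp_mem(text):
    """Parse the 'min default max' triple stored in tcp_rmem or tcp_wmem."""
    low, default, high = (int(field) for field in text.split())
    return {"min": low, "default": default, "max": high}


def read_tcp_mem(name):
    """Read /proc/sys/net/ipv4/<name>; name is 'tcp_rmem' or 'tcp_wmem'."""
    return parse_tcp_mem(Path("/proc/sys/net/ipv4", name).read_text())


if __name__ == "__main__":
    for name in ("tcp_rmem", "tcp_wmem"):
        try:
            print(name, read_tcp_mem(name))
        except OSError as err:  # not Linux, or /proc not mounted
            print(name, "unavailable:", err)
```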
Reliable Delivery
TCP Round Trip Time and Timeout
EstimatedRTT = (1-α)*EstimatedRTT + α*SampleRTT
Exponential weighted moving average
influence of past sample decreases exponentially fast
typical value: α = 0.125
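The moving average is a one-line update. The sketch below uses the typical weight α = 0.125; the sample values are made up for illustration and times are in milliseconds.

```python
ALPHA = 0.125  # typical weight given to the newest sample


def update_estimated_rtt(estimated_rtt, sample_rtt, alpha=ALPHA):
    """One EWMA step: blend the old estimate with the newest SampleRTT."""
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt


if __name__ == "__main__":
    estimate = 100.0
    for sample in (100.0, 200.0, 100.0, 100.0):  # made-up samples with one spike
        estimate = update_estimated_rtt(estimate, sample)
        print(f"sample={sample:6.1f}  estimate={estimate:8.3f}")
```

A single 200 ms spike moves the estimate by only 12.5 ms, and its contribution then shrinks by a factor of (1-α) with every later sample, which is the exponential decay the slide refers to.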
[Figure: RTT (milliseconds) versus time (seconds), gaia.cs.umass.edu to fantasia.eurecom.fr, showing SampleRTT and EstimatedRTT]
TCP Round Trip Time and Timeout
Setting the timeout
EstimatedRTT plus “safety margin”
large variation in EstimatedRTT -> larger safety margin
first estimate of how much SampleRTT deviates from EstimatedRTT:
DevRTT = (1-β)*DevRTT + β*|SampleRTT-EstimatedRTT|
(typically, β = 0.25)
Then set timeout interval:
TimeoutInterval = EstimatedRTT + 4*DevRTT
TimeoutInterval is exponentially increased with every retransmission
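The timeout computation can be sketched as a small class. Several details here are assumptions the slide does not state: DevRTT is updated before EstimatedRTT (the order used in RFC 6298), DevRTT starts at half of the first sample, the "exponential increase" is modeled as doubling per retransmission, and a fresh sample resets that backoff.

```python
class RtoEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval from them."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2  # assumption: RFC 6298-style start value
        self.backoff = 1

    def on_sample(self, sample_rtt):
        # Assumption: update the deviation first, using the old estimate.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        self.backoff = 1  # a valid sample ends the backoff

    def on_retransmission(self):
        self.backoff *= 2  # assumption: "exponentially increased" means doubling

    @property
    def timeout_interval(self):
        return self.backoff * (self.estimated_rtt + 4 * self.dev_rtt)


if __name__ == "__main__":
    rto = RtoEstimator(first_sample=100.0)  # milliseconds
    rto.on_sample(120.0)
    print("timeout after sample:", rto.timeout_interval)
    rto.on_retransmission()
    print("timeout after one retransmission:", rto.timeout_interval)
```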
TCP: retransmission scenarios
[Figure: two timelines between Host A and Host B, the lost ACK scenario (the ACK is lost, marked X) and the premature timeout scenario; labels: Seq=92 timeout, SendBase = 100, SendBase = 120]
TCP retransmission scenarios (more)
[Figure: cumulative ACK scenario between Host A and Host B; one ACK is lost (X) before the timeout; SendBase = 120]
TCP ACK generation [RFC 1122, RFC 2581]
Event at receiver -> TCP receiver action
• Arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed -> Delayed ACK: wait up to 500 ms for next segment; if no next segment, send ACK
• Arrival of in-order segment with expected seq #; one other segment has ACK pending -> Immediately send single cumulative ACK, ACKing both in-order segments
• Arrival of out-of-order segment with higher-than-expected seq #; gap detected -> Immediately send duplicate ACK, indicating seq # of next expected byte
• Arrival of segment that partially or completely fills gap -> Immediately send ACK, provided that segment starts at lower end of gap
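The four receiver rules can be expressed as one decision function. This is a simplified sketch: the function name, its parameters, and the returned strings are illustrative, and the last branch (a segment below the expected sequence number) is an assumption, since the slide does not cover that case.

```python
def ack_action(seg_seq, next_expected, ack_pending, gap_buffered):
    """Pick the receiver's ACK behavior for one arriving segment.

    seg_seq       -- sequence number of the first byte of the segment
    next_expected -- next in-order byte the receiver is waiting for
    ack_pending   -- True if an earlier in-order segment is still un-ACKed
    gap_buffered  -- True if out-of-order data is buffered beyond a gap
    """
    if seg_seq > next_expected:
        return "send duplicate ACK immediately"      # gap detected
    if seg_seq == next_expected and gap_buffered:
        return "send ACK immediately"                # fills lower end of gap
    if seg_seq == next_expected and ack_pending:
        return "send cumulative ACK immediately"     # ACKs both segments
    if seg_seq == next_expected:
        return "delay ACK up to 500 ms"
    # Assumption: old data below next_expected is answered with a duplicate ACK.
    return "send duplicate ACK immediately"


if __name__ == "__main__":
    print(ack_action(100, 100, ack_pending=False, gap_buffered=False))
    print(ack_action(120, 100, ack_pending=False, gap_buffered=False))
```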
Fast Retransmit
time-out period often relatively long:
long delay before resending lost packet
detect lost segments via duplicate ACKs.
sender often sends many segments back-to-back
if segment is lost, there will likely be many duplicate ACKs for that segment
If the sender receives 3 duplicate ACKs for the same data, it assumes that the segment after the ACKed data was lost:
fast retransmit: resend segment before timer expires
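Duplicate-ACK counting needs very little state on the sender. In this sketch the class name is hypothetical, and "3 ACKs for the same data" is read as three duplicates after the original ACK, matching the dupACKcount == 3 condition in the congestion control FSM.

```python
class DupAckCounter:
    """Signals a fast retransmit on the third duplicate ACK."""

    def __init__(self):
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack_no):
        """Return True exactly when ack_no is the third duplicate in a row."""
        if ack_no == self.last_ack:
            self.dup_count += 1
            return self.dup_count == 3
        self.last_ack = ack_no  # new cumulative ACK: start counting again
        self.dup_count = 0
        return False


if __name__ == "__main__":
    counter = DupAckCounter()
    for ack in (100, 100, 100, 100, 200):
        if counter.on_ack(ack):
            print("fast retransmit: resend segment starting at byte", ack)
```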
[Figure: Host A sends segments seq # x1 to seq # x5 and one segment is lost (X); Host B returns ACK x1 four times, and the triple duplicate ACKs arrive before the timeout]
Congestion Control
TCP congestion control:
TCP sender should transmit as fast as possible, but without congesting network
Q: how to find rate just below congestion level
decentralized: each TCP sender sets its own rate, based on implicit feedback:
ACK: segment received (a good thing!), network not congested, so increase sending rate
lost segment: assume loss due to congested network, so decrease sending rate
TCP congestion control: bandwidth probing
“probing for bandwidth”: increase transmission rate on receipt of ACK, until eventually loss occurs, then decrease transmission rate
continue to increase on ACK, decrease on loss (since available bandwidth is changing, depending on other connections in network)
[Figure: sending rate versus time, TCP’s “sawtooth” behavior; ACKs being received, so increase rate; X marks a loss, so decrease rate]
Q: how fast to increase/decrease?
details to follow
TCP Congestion Control: details
sender limits rate by limiting number of unACKed bytes “in pipeline”:
LastByteSent - LastByteAcked ≤ cwnd
cwnd: differs from rwnd (how, why?)
sender limited by min(cwnd, rwnd)
cwnd is dynamic, function of perceived network congestion
roughly, rate = cwnd/RTT bytes/sec
[Figure: the sender transmits cwnd bytes, then waits one RTT for the ACK(s)]
TCP Congestion Control: more details
segment loss event: reducing cwnd
timeout: no response from receiver, cut cwnd to 1 MSS
3 duplicate ACKs: at least some segments getting through (recall fast retransmit)
cut cwnd in half, less aggressively than on timeout
ACK received: increase cwnd
slowstart phase:
increase exponentially fast (despite name) at connection start, or following timeout
congestion avoidance:
increase linearly
TCP Slow Start
when connection begins, cwnd = 1 MSS
example: MSS = 500 bytes & RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be >> MSS/RTT
desirable to quickly ramp up to respectable rate
increase rate exponentially until first loss event or when threshold reached
double cwnd every RTT
done by incrementing cwnd by 1 MSS for every ACK received
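The slide's numbers and the doubling rule can be checked with a short sketch. It assumes one ACK per segment (no delayed ACKs), no loss, and cwnd measured in bytes; the function names are illustrative.

```python
def initial_rate_bps(mss_bytes, rtt_seconds):
    """Sending rate in bits per second when cwnd is a single MSS."""
    return mss_bytes * 8 / rtt_seconds


def slow_start_rounds(cwnd, ssthresh, mss):
    """cwnd at the start of each RTT until the threshold is reached."""
    history = [cwnd]
    while cwnd < ssthresh:
        acks = cwnd // mss   # one ACK comes back for every segment sent
        cwnd += acks * mss   # +1 MSS per ACK, so cwnd doubles each RTT
        history.append(cwnd)
    return history


if __name__ == "__main__":
    print("initial rate (bps):", initial_rate_bps(500, 0.2))
    print("cwnd per RTT:", slow_start_rounds(cwnd=500, ssthresh=8000, mss=500))
```

With MSS = 500 bytes and RTT = 200 msec the initial rate is 20 kbps, and four RTTs are enough to grow the window sixteen-fold.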
[Figure: slow-start timeline between Host A and Host B, one RTT per round]
TCP slow (exponential) start
Transitioning into/out of slowstart
ssthresh: cwnd threshold maintained by TCP
on loss event: set ssthresh to cwnd/2
remember (half of) TCP rate when congestion last occurred
when cwnd >= ssthresh: transition from slowstart to congestion avoidance phase
[Figure: FSM fragment covering slow start and the move to congestion avoidance]
initially: cwnd = 1 MSS, ssthresh = 64 KB, dupACKcount = 0, enter slow start
slow start, new ACK: cwnd = cwnd + MSS, dupACKcount = 0, transmit new segment(s), as allowed
slow start, duplicate ACK: dupACKcount++
slow start, timeout: ssthresh = cwnd/2, cwnd = 1 MSS, dupACKcount = 0, retransmit missing segment
slow start, cwnd >= ssthresh: move to congestion avoidance
congestion avoidance, timeout: ssthresh = cwnd/2, cwnd = 1 MSS, dupACKcount = 0, retransmit missing segment, return to slow start
TCP: congestion avoidance
when cwnd > ssthresh, grow cwnd linearly
increase cwnd by 1 MSS per RTT
approach possible congestion slower than in slowstart
implementation: cwnd = cwnd + MSS*(MSS/cwnd) for each ACK received (cwnd in bytes)
ACKs: increase cwnd by 1 MSS per RTT: additive increase
loss: cut cwnd in half (non-timeout-detected loss): multiplicative decrease
AIMD: Additive Increase, Multiplicative Decrease
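The per-ACK update can be checked numerically. The sketch below keeps cwnd in bytes, so the per-ACK increment is MSS*(MSS/cwnd); with cwnd counted in segments the same rule reads cwnd + 1/cwnd. It assumes one ACK per segment, and the function names are illustrative.

```python
def congestion_avoidance_round(cwnd, mss):
    """Apply the per-ACK additive increase for one RTT's worth of ACKs."""
    acks = int(cwnd // mss)          # one ACK per segment in the window
    for _ in range(acks):
        cwnd += mss * mss / cwnd     # adds up to roughly 1 MSS per RTT
    return cwnd


def multiplicative_decrease(cwnd):
    """Loss detected by duplicate ACKs: cut the window in half."""
    return cwnd / 2


if __name__ == "__main__":
    cwnd = 5000.0
    for rtt in range(3):
        cwnd = congestion_avoidance_round(cwnd, mss=500)
        print(f"after RTT {rtt + 1}: cwnd = {cwnd:.1f} bytes")
    print("after loss:", multiplicative_decrease(cwnd))
```

Because cwnd grows a little with every ACK inside the round, the total per-RTT gain comes out slightly below one full MSS.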
TCP congestion control FSM: details
[Figure: FSM with three states: slow start, congestion avoidance, fast recovery]
initially: cwnd = 1 MSS, ssthresh = 64 KB, dupACKcount = 0, enter slow start
slow start
– new ACK: cwnd = cwnd + MSS, dupACKcount = 0, transmit new segment(s), as allowed
– duplicate ACK: dupACKcount++
– cwnd >= ssthresh: move to congestion avoidance
– dupACKcount == 3: ssthresh = cwnd/2, cwnd = ssthresh + 3 MSS, retransmit missing segment, move to fast recovery
– timeout: ssthresh = cwnd/2, cwnd = 1 MSS, dupACKcount = 0, retransmit missing segment
congestion avoidance
– new ACK: cwnd = cwnd + MSS*(MSS/cwnd), dupACKcount = 0, transmit new segment(s), as allowed
– duplicate ACK: dupACKcount++
– dupACKcount == 3: ssthresh = cwnd/2, cwnd = ssthresh + 3 MSS, retransmit missing segment, move to fast recovery
– timeout: ssthresh = cwnd/2, cwnd = 1 MSS, dupACKcount = 0, retransmit missing segment, return to slow start
fast recovery
– duplicate ACK: cwnd = cwnd + MSS, transmit new segment(s), as allowed
– new ACK: cwnd = ssthresh, dupACKcount = 0, move to congestion avoidance
– timeout: ssthresh = cwnd/2, cwnd = 1 MSS, dupACKcount = 0, retransmit missing segment, return to slow start
Popular “flavors” of TCP
[Figure: cwnd window size (in segments) versus transmission round for TCP Tahoe and TCP Reno, with the ssthresh levels marked]
Summary: TCP Congestion Control
when cwnd < ssthresh, sender is in slow-start phase, window grows exponentially.
when cwnd >= ssthresh, sender is in congestion-avoidance phase, window grows linearly.
when triple duplicate ACK occurs, ssthresh set to cwnd/2, cwnd set to ~ ssthresh
when timeout occurs, ssthresh set to cwnd/2, cwnd set to 1 MSS.
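The summary rules fit into a compact Reno-style sender model. This is a sketch, not a real TCP implementation: the class and method names are hypothetical, cwnd is in bytes, only window bookkeeping is modeled, and on a triple duplicate ACK cwnd is set to ssthresh + 3 MSS as in the FSM slide (the "~ ssthresh" of the summary).

```python
class RenoSender:
    """Window bookkeeping for slow start, congestion avoidance, fast recovery."""

    def __init__(self, mss=1000, ssthresh=64 * 1024):
        self.mss = mss
        self.cwnd = mss              # start with 1 MSS
        self.ssthresh = ssthresh     # 64 KB initial threshold, as on the slide
        self.dup_acks = 0
        self.state = "slow_start"

    def on_new_ack(self):
        if self.state == "fast_recovery":
            self.cwnd = self.ssthresh                     # deflate the window
            self.state = "congestion_avoidance"
        elif self.state == "slow_start":
            self.cwnd += self.mss                         # exponential growth
            if self.cwnd >= self.ssthresh:
                self.state = "congestion_avoidance"
        else:
            self.cwnd += self.mss * self.mss / self.cwnd  # linear growth
        self.dup_acks = 0

    def on_duplicate_ack(self):
        """Return True when the missing segment should be retransmitted."""
        if self.state == "fast_recovery":
            self.cwnd += self.mss
            return False
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.ssthresh = self.cwnd / 2
            self.cwnd = self.ssthresh + 3 * self.mss
            self.state = "fast_recovery"
            return True
        return False

    def on_timeout(self):
        self.ssthresh = self.cwnd / 2
        self.cwnd = self.mss
        self.dup_acks = 0
        self.state = "slow_start"


if __name__ == "__main__":
    sender = RenoSender(mss=1000, ssthresh=4000)
    for _ in range(3):
        sender.on_new_ack()
    print(sender.state, sender.cwnd)
    for _ in range(3):
        sender.on_duplicate_ack()
    print(sender.state, sender.cwnd, sender.ssthresh)
```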
Simplified TCP throughput
Average throughput of TCP as function of window size, RTT?
ignoring slow start
let W be window size when loss occurs.
when window is W, throughput is W/RTT
just after loss, window drops to W/2, throughput to W/(2 RTT).
average throughput: 0.75 W/RTT
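The factor 0.75 is the mean of a window that ramps linearly from W/2 to W. A short worked derivation, assuming an idealized sawtooth in which each cycle lasts a time T (T is a symbol introduced here, not on the slide) and the window w(t) grows linearly within it:

```latex
\bar{w} \;=\; \frac{1}{T}\int_{0}^{T}\left(\frac{W}{2} + \frac{W}{2}\cdot\frac{t}{T}\right)dt
        \;=\; \frac{W}{2} + \frac{W}{4}
        \;=\; \frac{3}{4}\,W,
\qquad
\text{average throughput} \;=\; \frac{\bar{w}}{RTT} \;=\; 0.75\,\frac{W}{RTT}
```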
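TCP throughput as a function of loss rate
Assuming in a cycle, 1 packet is lost
In one cycle the window grows from W/2 to W by one segment per RTT, so about (3/8)W² + (3/4)W segments are sent (W in segments)
Therefore, the loss rate L is obtained as L = 1 / ((3/8)W² + (3/4)W)
Since (3/8)W² >> (3/4)W for large W, we can get W ≈ sqrt(8/(3L))
Throughput = 0.75 W/RTT ≈ 1.22 / (RTT * sqrt(L)) segments/sec, i.e. 1.22 MSS / (RTT * sqrt(L)) bytes/sec
[Figure: TCP’s “sawtooth” behavior, sending rate versus time; the window oscillates between W/2 and W around an average of 0.75W, and each X marks a loss, so decrease rate]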
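The relation between loss rate and throughput can be checked numerically. The sketch assumes the idealized sawtooth (window from W/2 to W, one segment added per RTT, exactly one loss per cycle) with W in segments, and it compares the per-cycle average against the standard approximation 1.22/(RTT*sqrt(L)), where 1.22 ≈ sqrt(3/2); the function names are illustrative.

```python
import math


def packets_per_cycle(w):
    """Segments sent while the window climbs from w/2 to w, one window per RTT."""
    return sum(range(w // 2, w + 1))


def loss_rate(w):
    """One lost packet per cycle."""
    return 1 / packets_per_cycle(w)


def throughput_from_window(w, rtt):
    """Average segments per second over one sawtooth cycle."""
    rounds = w - w // 2 + 1
    return packets_per_cycle(w) / (rounds * rtt)


def throughput_from_loss(loss, rtt):
    """Approximation sqrt(3/2) / (RTT * sqrt(L)), about 1.22 / (RTT * sqrt(L))."""
    return math.sqrt(3 / 2) / (rtt * math.sqrt(loss))


if __name__ == "__main__":
    w, rtt = 100, 0.1
    print("packets per cycle:", packets_per_cycle(w))
    print("from window:", throughput_from_window(w, rtt), "segments/sec")
    print("from loss rate:", throughput_from_loss(loss_rate(w), rtt), "segments/sec")
```

For W = 100 the two values differ by about one percent, the error introduced by dropping the (3/4)W term.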
Summary
Reading assignment
TCP from Chapter 3 in Kurose and Ross
TCP from Chapter 5 in Peterson and Davie
Homework:
Problems P43 (308) from Kurose and Ross
Deadline: See the website