network protocol design (itc8061) · mobile. 1-3 how do loss and delay occur? ... web site toll...
TRANSCRIPT
1-1
Network Protocol Design (ITC8061)
v These are modifications of the original slides by Kurose and Ross.
1-2
What’s the Internet: “Nuts and Bolts” Viewv protocols control sending,
receiving of msgso e.g., TCP, IP, HTTP, FTP, PPP
v Internet: “network of networks”
o loosely hierarchicalo public Internet versus private
intranetv Internet standards
o RFC: Request for commentso IETF: Internet Engineering
Task Force
Local ISP
CompanyNetwork
Regional ISP
router workstationserver
mobile
1-3
How do loss and delay occur?packets queue in router buffersv packet arrival rate to link exceeds output link capacityv packets queue, wait for turn
A
B
packet being transmitted (delay)
packets queueing (delay)free (available) buffers: arriving packets dropped (loss) if no free buffers
1-4
Four sources of packet delay
v 1. Nodal Processing:o check bit errorso determine output link
v 2. Queueingo time waiting at output
link for transmission o depends on congestion
level of router
A
Bpropagation
transmission
nodalprocessing queueing
1-5
Delay in packet-switched networks3. Transmission delay:v R=link bandwidth (bps)v L=packet length (bits)v time to send bits into
link = L/R
4. Propagation delay:v d = length of physical
linkv s = propagation speed in
medium (~2x108 m/sec)v propagation delay = d/s
A
Bpropagation
transmission
nodalprocessing queueing
Note: s and R are very different quantities!
1-6
Caravan analogy
v Cars “propagate” at 100 km/hr
v Toll booth takes 12 sec to service a car (transmission time)
v car~bit; caravan ~ packetv Q: How long until caravan is
lined up before 2nd toll booth?
v Time to “push” entire caravan through toll booth onto highway = 12*10 = 120 sec
v Time for last car to propagate from 1st to 2nd toll both: 100km/(100km/hr)= 1 hr
v A: 62 minutes
toll booth
toll booth
ten-car caravan
100 km 100 km
1-7
Caravan analogy (more)
v Cars now “propagate” at 1000 km/hr
v Toll booth now takes 1 min to service a car
v Q: Will cars arrive to 2nd booth before all cars serviced at 1st booth?
v Yes! After 7 min, 1st car at 2nd booth and 3 cars still at 1st booth.
v 1st bit of packet can arrive at 2nd router before packet is fully transmitted at 1st router!
o See Ethernet applet at AWL Web site
toll booth
toll booth
ten-car caravan
100 km 100 km
1-8
Nodal delay
v dproc = processing delayo typically a few microsecs or less
v dqueue = queuing delayo depends on congestion
v dtrans = transmission delayo = L/R, significant for low-speed links
v dprop = propagation delayo a few microsecs to hundreds of msecs
proptransqueueprocnodal ddddd +++=
1-9
Queueing delay (revisited)
v R=link bandwidth (bps)v L=packet length (bits)v a=average packet arrival
rate
traffic intensity = La/R
v La/R ~ 0: average queueing delay smallv La/R -> 1: delays become largev La/R > 1: more “work” arriving than can be
serviced, average delay infinite!
1-10
Packet loss
v queue (aka buffer) preceding link in buffer has finite capacity
v when packet arrives to full queue, packet is dropped (aka lost)
v lost packet may be retransmitted by previous node, by source end system, or not retransmitted at all
1-11
Transport LayerOur goals:v understand
principles behind transport layer services:o multiplexing/demultipl
exingo reliable data transfero flow controlo congestion control
v learn about transport layer protocols in the Internet:o UDP: connectionless
transporto TCP: connection-oriented
transporto TCP congestion control
1-12
outline
v 3.1 Transport-layer services
v 3.2 Multiplexing and demultiplexing
v 3.3 Connectionless transport: UDP
v 3.4 Principles of reliable data transfer
v 3.5 Connection-oriented transport: TCPo segment structureo reliable data transfero flow controlo connection management
v 3.6 Principles of congestion control
v 3.7 TCP congestion control
1-13
Transport services and protocolsv provide logical
communication between app processes running on different hosts
v transport protocols run in end systems o send side: breaks app
messages into segments, passes to network layer
o rcv side: reassembles segments into messages, passes to app layer
v more than one transport protocol available to appso Internet: TCP and UDP
network
link
physical
network
link
physical
network
link
physical
network
link
physical
logical end-end transport
application
transport
network
link
physical
application
transport
network
link
physical
1-14
Transport vs. network layer
v network layer: logical communication between hosts
v transport layer:logical communication between processes o relies on, enhances,
network layer services
Household analogy:12 kids sending letters to 12
kidsv processes = kidsv app messages = letters in
envelopesv hosts = housesv transport protocol = Ann
and Billv network-layer protocol =
postal service
1-15
Internet transport-layer protocols
v reliable, in-order delivery (TCP)o congestion control o flow controlo connection setup
v unreliable, unordered delivery: UDPo no-frills extension of
“best-effort” IPv services not available:
o delay guaranteeso bandwidth guarantees
network
link
physical
network
link
physical
network
link
physical
network
link
physical
logical end-end transport
application
transport
network
link
physical
application
transport
network
link
physical
1-16
Chapter 3 outline
v 3.1 Transport-layer services
v 3.2 Multiplexing and demultiplexing
v 3.3 Connectionless transport: UDP
v 3.4 Principles of reliable data transfer
v 3.5 Connection-oriented transport: TCPo segment structureo reliable data transfero flow controlo connection management
v 3.6 Principles of congestion control
v 3.7 TCP congestion control
1-17
Multiplexing/demultiplexing
application
transport
network
link
physical
P1 application
transport
network
link
physical
application
transport
network
link
physical
P2P3 P4P1
host 1 host 2 host 3
= process= socket
delivering received segmentsto correct socket
Demultiplexing at rcv hostgathering data from multiplesockets, enveloping data with header (later used for demultiplexing)
Multiplexing at send host
1-18
How demultiplexing worksv host receives IP datagrams
o each datagram has source IP address, destination IP address
o each datagram carries 1 transport-layer segment
o each segment has source, destination port number (recall: well-known port numbers for specific applications)
v host uses IP addresses & port numbers to direct segment to appropriate socket
source port # dest port #
32 bits
applicationdata
(message)
other header fields
TCP/UDP segment format
1-19
Connectionless demultiplexing
v Create sockets with port numbers:
DatagramSocket mySocket1 = new DatagramSocket(9911);
DatagramSocket mySocket2 = new DatagramSocket(9922);
v UDP socket identified by two-tuple:
(dest IP address, dest port number)
v When host receives UDP segment:
o checks destination port number in segment
o directs UDP segment to socket with that port number
v IP datagrams with different source IP addresses and/or source port numbers directed to same socket
1-20
Connectionless demux (cont)DatagramSocket serverSocket = new DatagramSocket(6428);
ClientIP:B
P2
clientIP: A
P1P1P3
serverIP: C
SP: 9157
SP provides “return address”
SP: 6428
DP: 9157
DP: 6428
1-21
Connection-oriented demux
v TCP socket identified by 4-tuple: o source IP addresso source port numbero dest IP addresso dest port number
v recv host uses all four values to direct segment to appropriate socket
v Server host may support many simultaneous TCP sockets:
o each socket identified by its own 4-tuple
v Web servers have different sockets for each connecting client
o non-persistent HTTP will have different socket for each request
1-22
Connection-oriented demux (cont)
P3
clientIP: A
P1P1P3
serverIP: C
SP: 80DP: 9157
SP: 9157DP: 80
SP: 80DP: 5775
SP: 5775DP: 80
P4
clientIP: B
1-23
Chapter 3 outline
v 3.1 Transport-layer services
v 3.2 Multiplexing and demultiplexing
v 3.3 Connectionless transport: UDP
v 3.4 Principles of reliable data transfer
v 3.5 Connection-oriented transport: TCPo segment structureo reliable data transfero flow controlo connection management
v 3.6 Principles of congestion control
v 3.7 TCP congestion control
1-24
UDP: User Datagram Protocol [RFC 768]
v “no frills,” “bare bones” Internet transport protocol
v “best effort” service, UDP segments may be:
o losto delivered out of order to
appv connectionless:
o no handshaking between UDP sender, receiver
o each UDP segment handled independently of others
Why is there a UDP?v no connection
establishment (which can add delay)
v simple: no connection state at sender, receiver
v small segment headerv no congestion control:
UDP can blast away as fast as desired
1-25
UDP: morev often used for streaming
multimedia appso loss toleranto rate sensitive
v other UDP useso DNSo SNMP
v reliable transfer over UDP: add reliability at application layero application-specific
error recovery!
source port # dest port #
32 bits
Applicationdata
(message)
UDP segment format
length checksumLength, in
bytes of UDPsegment,including
header
1-26
UDP checksum
Sender:v treat segment contents
as sequence of 16-bit integers
v checksum: addition (1’s complement sum) of segment contents
v sender puts checksum value into UDP checksum field
Receiver:v compute checksum of received
segmentv check if computed checksum
equals checksum field value:o NO - error detectedo YES - no error detected. But
maybe errors nonetheless?More later ….
Goal: detect “errors” (e.g., flipped bits) in transmitted segment
1-27
Chapter 3 outline
v 3.1 Transport-layer services
v 3.2 Multiplexing and demultiplexing
v 3.3 Connectionless transport: UDP
v 3.4 Principles of reliable data transfer
v 3.5 Connection-oriented transport: TCPo segment structureo reliable data transfero flow controlo connection management
v 3.6 Principles of congestion control
v 3.7 TCP congestion control
1-28
Principles of Reliable data transferv important in app., transport, link layersv top-10 list of important networking topics!
v characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)
1-29
Reliable data transfer: getting started
sendside
receiveside
rdt_send(): called from above, (e.g., by app.). Passed data to
deliver to receiver upper layer
udt_send(): called by rdt,to transfer packet over
unreliable channel to receiver
rdt_rcv(): called when packet arrives on rcv-side of channel
deliver_data(): called by rdt to deliver data to upper
1-30
Reliable data transfer: getting startedWe’ll:v incrementally develop sender, receiver sides
of reliable data transfer protocol (rdt)v consider only unidirectional data transfer
o but control info will flow on both directions!v use finite state machines (FSM) to specify
sender, receiver
state1
event causing state transitionactions taken on state transition
state: when in this “state” next state uniquely
determined by next eventeventactions
state2
1-31
Rdt1.0: reliable transfer over a reliable channel
v underlying channel perfectly reliableo no bit errorso no loss of packets
v separate FSMs for sender, receiver:o sender sends data into underlying channelo receiver read data from underlying channel
Wait for call from above packet = make_pkt(data)
udt_send(packet)
rdt_send(data)extract (packet,data)deliver_data(data)
Wait for call from
below
rdt_rcv(packet)
sender receiver
1-32
Rdt2.0: channel with bit errorsv underlying channel may flip bits in packet
o recall: UDP checksum to detect bit errorsv the question: how to recover from errors:
o acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
o negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
o sender retransmits pkt on receipt of NAKo human scenarios using ACKs, NAKs?
v new mechanisms in rdt2.0 (beyond rdt1.0):o error detectiono receiver feedback: control msgs (ACK,NAK) rcvr->sender
1-33
rdt2.0: FSM specification
udt_send(NAK)
rdt_rcv(rcvpkt) && corrupt(rcvpkt)
rdt_send(data)
Wait for call from above
sndpkt = make_pkt(data, checksum)udt_send(sndpkt)
extract(rcvpkt,data)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) && isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) &&isNAK(rcvpkt)Wait for
ACK or NAK
Wait for call from
belowsender
receiver
L
1-34
udt_send(NAK)
rdt_rcv(rcvpkt) && corrupt(rcvpkt)
rdt_send(data)
Wait for call from above
sndpkt = make_pkt(data, checksum)udt_send(sndpkt)
extract(rcvpkt,data)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) && isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) &&isNAK(rcvpkt)Wait for
ACK or NAK
Wait for call from
belowsender
receiver
L
rdt2.0: operation with no errors
1-35
udt_send(NAK)
rdt_rcv(rcvpkt) && corrupt(rcvpkt)
rdt_send(data)
Wait for call from above
sndpkt = make_pkt(data, checksum)udt_send(sndpkt)
extract(rcvpkt,data)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) && isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) &&isNAK(rcvpkt)Wait for
ACK or NAK
Wait for call from
belowsender
receiver
L
rdt2.0: error scenario
1-36
rdt2.0 has a fatal flaw!What happens if ACK/NAK
corrupted?v sender doesn’t know what
happened at receiver!v can’t just retransmit:
possible duplicate
What to do?v sender ACKs/NAKs
receiver’s ACK/NAK? What if sender ACK/NAK lost?
v retransmit, but this might cause retransmission of correctly received pkt!
Handling duplicates: v sender adds sequence
number to each pktv sender retransmits
current pkt if ACK/NAK garbled
v receiver discards (doesn’t deliver up) duplicate pkt
Sender sends one packet, then waits for receiver response
stop and wait
1-37
rdt2.1: sender, handles garbled ACK/NAKs
Wait for call 0 from
above
sndpkt = make_pkt(0, data, checksum)udt_send(sndpkt)
rdt_send(data)
Wait for ACK or NAK 0 udt_send(sndpkt)
rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
sndpkt = make_pkt(1, data, checksum)udt_send(sndpkt)
rdt_send(data)
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
Wait forcall 1 from
above
Wait for ACK or NAK 1
LL
1-38
rdt2.1: receiver, handles garbled ACK/NAKs
Wait for 0 from below
sndpkt = make_pkt(NAK, chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) && not corrupt(rcvpkt) &&has_seq0(rcvpkt)
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
extract(rcvpkt,data)deliver_data(data)sndpkt = make_pkt(ACK, chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
extract(rcvpkt,data)deliver_data(data)sndpkt = make_pkt(ACK, chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) && (corrupt(rcvpkt)
sndpkt = make_pkt(ACK, chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) && not corrupt(rcvpkt) &&has_seq1(rcvpkt)
rdt_rcv(rcvpkt) && (corrupt(rcvpkt)
sndpkt = make_pkt(ACK, chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK, chksum)udt_send(sndpkt)
1-39
rdt2.1: discussion
Sender:v seq # added to pktv two seq. #’s (0,1) will
suffice. Why?v must check if received
ACK/NAK corrupted v twice as many states
o state must “remember” whether “current” pkt has 0 or 1 seq. #
Receiver:v must check if
received packet is duplicateo state indicates whether
0 or 1 is expected pkt seq #
v note: receiver can not know if its last ACK/NAK received OK at sender
1-40
rdt2.2: a NAK-free protocol
v same functionality as rdt2.1, using ACKs onlyv instead of NAK, receiver sends ACK for last pkt
received OKo receiver must explicitly include seq # of pkt being ACKed
v duplicate ACK at sender results in same action as NAK: retransmit current pkt
1-41
rdt2.2: sender, receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0, data, checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) ||
isACK(rcvpkt,1) )
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from
below
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
extract(rcvpkt,data)deliver_data(data)sndpkt = make_pkt(ACK1, chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) && (corrupt(rcvpkt) ||
has_seq1(rcvpkt))udt_send(sndpkt) receiver FSM
fragment
L
1-42
rdt3.0: channels with errors and loss
New assumption: underlying channel can also lose packets (data or ACKs)
o checksum, seq. #, ACKs, retransmissions will be of help, but not enough
Q: how to deal with loss?o sender waits until certain
data or ACK lost, then retransmits
o yuck: drawbacks?
Approach: sender waits “reasonable” amount of time for ACK
v retransmits if no ACK received in this time
v if pkt (or ACK) just delayed (not lost):
o retransmission will be duplicate, but use of seq. #’s already handles this
o receiver must specify seq # of pkt being ACKed
v requires countdown timer
1-43
rdt3.0 sendersndpkt = make_pkt(0, data, checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) ||isACK(rcvpkt,1) )
Wait for call 1 from
above
sndpkt = make_pkt(1, data, checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) ||isACK(rcvpkt,0) )
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0 from
above
Wait for
ACK1
Lrdt_rcv(rcvpkt)
LL
L
1-44
rdt3.0 in action
1-45
rdt3.0 in action
1-46
Performance of rdt3.0v rdt3.0 works, but performance stinksv example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet:
T transmit= 8kb/pkt10**9 b/sec = 8 microsec
U sender: utilization – fraction of time sender busy sending1KB packet every 30 msec -> 33kB/sec throughput over 1 Gbps linknetwork protocol limits use of physical resources!
L (packet length in bits)R (transmission rate, bps)
=
1-47
rdt3.0: stop-and-wait operation
first packet bit transmitted, t = 0sender receive
r
RTT
last packet bit transmittedt = L / R
first packet bit arriveslast packet bit arrives, send ACK
ACK arrives, send next packet, t = RTT + L / R
U sender =
.008 30.008
= 0.00027 microseconds
L / R RTT + L / R
=
1-48
Pipelined protocolsPipelining: sender allows multiple, “in-flight”, yet-
to-be-acknowledged packetso range of sequence numbers must be increasedo buffering at sender and/or receiver
v Two generic forms of pipelined protocols:go-Back-N, selective repeat
1-49
Pipelining: increased utilization
first packet bit transmitted, t = 0
sender receiver
RTT
last bit transmitted, t = L / R
first packet bit arriveslast packet bit arrives, send ACK
ACK arrives, send next packet, t = RTT + L / R
last bit of 2nd packet arrives, send ACKlast bit of 3rd packet arrives, send ACK
U sender =
.024 30.008
= 0.0008 microseconds
3 * L / R RTT + L / R
=
Increase utilizationby a factor of 3!
1-50
Go-Back-NSender:v k-bit seq # in pkt headerv “window” of up to N, consecutive unack’ed pkts allowed
ACK(n): ACKs all pkts up to, including seq # n - “cumulative ACK”may deceive duplicate ACKs (see receiver)
timer for each in-flight pkttimeout(n): retransmit pkt n and all higher seq # pkts in window
1-51
GBN: sender extended FSM
Wait start_timerudt_send(sndpkt[base])udt_send(sndpkt[base+1])…udt_send(sndpkt[nextseqnum-1])
timeout
rdt_send(data)
if (nextseqnum < base+N) {sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)udt_send(sndpkt[nextseqnum])if (base == nextseqnum)
start_timernextseqnum++}
elserefuse_data(data)
base = getacknum(rcvpkt)+1If (base == nextseqnum)
stop_timerelse
start_timer
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
base=1nextseqnum=1
rdt_rcv(rcvpkt) && corrupt(rcvpkt)
L
1-52
GBN: receiver extended FSM
ACK-only: always send ACK for correctly-received pkt with highest in-order seq #
o may generate duplicate ACKso need only remember expectedseqnum
v out-of-order pkt: o discard (don’t buffer) -> no receiver buffering!o Re-ACK pkt with highest in-order seq #
Wait
udt_send(sndpkt)default
rdt_rcv(rcvpkt)&& notcurrupt(rcvpkt)&& hasseqnum(rcvpkt,expectedseqnum)
extract(rcvpkt,data)deliver_data(data)sndpkt = make_pkt(expectedseqnum,ACK,chksum)udt_send(sndpkt)expectedseqnum++
expectedseqnum=1sndpkt =
make_pkt(expectedseqnum,ACK,chksum)
L
1-53
GBN inaction
1-54
Selective Repeatv receiver individually acknowledges all
correctly received pktso buffers pkts, as needed, for eventual in-order delivery
to upper layerv sender only resends pkts for which ACK not
receivedo sender timer for each unACKed pkt
v sender windowo N consecutive seq #’so again limits seq #s of sent, unACKed pkts
1-55
Selective repeat: sender, receiver windows
1-56
Selective repeat
data from above :v if next available seq # in
window, send pkttimeout(n):v resend pkt n, restart timerACK(n) in [sendbase,sendbase+N]:
v mark pkt n as receivedv if n smallest unACKed pkt,
advance window base to next unACKed seq #
sender
pkt n in [rcvbase, rcvbase+N-1]send ACK(n)out-of-order: bufferin-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pktpkt n in [rcvbase-N,rcvbase-1]
ACK(n)otherwise:ignore
receiver
1-57
Selective repeat in action
1-58
Selective repeat:dilemma
Example: v seq #’s: 0, 1, 2, 3v window size=3
v receiver sees no difference in two scenarios!
v incorrectly passes duplicate data as new in (a)
Q: what relationship between seq # size and window size?