Transport Layer 3-1
Chapter 3 outline
3.1 transport-layer services
3.2 principles of reliable data transfer
3.3 connection-oriented transport: TCP
  - segment structure
  - reliable data transfer
  - flow control
  - connection management
3.4 principles of congestion control
3.5 TCP congestion control
Principles of congestion control
congestion:
- informally: "too many sources sending too much data too fast for network to handle"
- different from flow control
- manifestations:
  - lost packets (buffer overflow at routers)
  - long delays (queueing in router buffers)
- a top-10 problem
Causes/costs of congestion: scenario 1
- two senders, two receivers
- one router, infinite buffers
- output link capacity: R
- no retransmission
- maximum per-connection throughput: R/2
- large delays as arrival rate, λ_in, approaches capacity
[Figure: Host A and Host B each send original data at rate λ_in into a router with unlimited shared output link buffers; delivered throughput is λ_out. Plots: λ_out versus λ_in (both axes up to R/2) and delay versus λ_in (up to R/2).]
Causes/costs of congestion: scenario 2
- one router, finite buffers
- sender retransmission of timed-out packet
  - application-layer input = application-layer output: λ_in = λ_out
  - transport-layer input includes retransmissions: λ'_in ≥ λ_in
[Figure: Host A offers λ_in (original data) and λ'_in (original data plus retransmitted data) to a router with finite shared output link buffers; Host B receives λ_out.]
Causes/costs of congestion: scenario 2
idealization: perfect knowledge
- sender sends only when router buffers available
[Figure: Host A offers λ_in (original data) and λ'_in (original data plus retransmitted data) and keeps a copy of each packet; the router's finite shared output link buffers have free buffer space; Host B receives λ_out. Plot: λ_out versus λ_in, both axes up to R/2.]
Causes/costs of congestion: scenario 2
Idealization: known loss
packets can be lost, dropped at router due to full buffers
- sender only resends if packet known to be lost
[Figure: Host A offers λ_in (original data) and λ'_in (original data plus retransmitted data) and keeps a copy; the router has no buffer space; Host B receives λ_out.]
Causes/costs of congestion: scenario 2
Idealization: known loss
packets can be lost, dropped at router due to full buffers
- sender only resends if packet known to be lost
- when sending at R/2, some packets are retransmissions, but asymptotic goodput is still R/2 (why?)
[Figure: Host A offers λ_in (original data) and λ'_in (original data plus retransmitted data); the router has free buffer space; Host B receives λ_out. Plot: λ_out versus λ_in, both axes up to R/2.]
Causes/costs of congestion: scenario 2
Realistic: duplicates
- packets can be lost, dropped at router due to full buffers
- sender times out prematurely, sending two copies, both of which are delivered
- when sending at R/2, some packets are retransmissions, including duplicates that are delivered
[Figure: Host A keeps a copy and retransmits after a timeout while the router still has free buffer space; Host B receives λ_out. Plot: λ_out versus λ_in, both axes up to R/2.]
Causes/costs of congestion: scenario 2
Realistic: duplicates
- packets can be lost, dropped at router due to full buffers
- sender times out prematurely, sending two copies, both of which are delivered
- when sending at R/2, some packets are retransmissions, including duplicates that are delivered
"costs" of congestion:
- more work (retrans) for given "goodput"
- unneeded retransmissions: link carries multiple copies of pkt
  - decreasing goodput
[Plot: λ_out versus λ_in, both axes up to R/2.]
Causes/costs of congestion: scenario 3
- four senders
- multihop paths
- timeout/retransmit
Q: what happens as λ_in and λ'_in increase?
A: as red λ'_in increases, all arriving blue pkts at upper queue are dropped, blue throughput → 0
[Figure: Hosts A, B, C, and D send over multihop paths through routers with finite shared output link buffers; each sender offers λ_in (original data) and λ'_in (original data plus retransmitted data); delivered throughput is λ_out.]
Causes/costs of congestion: scenario 3
another "cost" of congestion:
- when packet dropped, any "upstream" transmission capacity used for that packet was wasted
[Plot: λ_out versus λ'_in, both axes up to C/2.]
Chapter 3 outline
3.1 transport-layer services
3.2 multiplexing and demultiplexing
3.3 connectionless transport: UDP
3.4 principles of reliable data transfer
3.5 connection-oriented transport: TCP
  - segment structure
  - reliable data transfer
  - flow control
  - connection management
3.6 principles of congestion control
3.7 TCP congestion control
TCP congestion control: additive increase, multiplicative decrease
- approach: sender increases transmission rate (window size), probing for usable bandwidth, until loss occurs
  - additive increase: increase cwnd by 1 MSS every RTT until loss detected
  - multiplicative decrease: cut cwnd in half after loss
AIMD saw tooth behavior: probing for bandwidth
additively increase window size ... until loss occurs (then cut window in half)
[Plot: cwnd (TCP sender congestion window size) versus time.]
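The saw tooth can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: cwnd is counted in MSS, one loop iteration stands for one RTT, and loss is assumed to occur whenever cwnd exceeds a fixed capacity (the names aimd_trace and capacity_mss are made up for illustration).

```python
def aimd_trace(capacity_mss, rounds, cwnd=1):
    """Return cwnd (in MSS) at the start of each RTT under idealized AIMD."""
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        if cwnd > capacity_mss:
            # Assumed loss signal: the window outgrew the path capacity.
            cwnd = max(1, cwnd // 2)  # multiplicative decrease
        else:
            cwnd += 1                 # additive increase: +1 MSS per RTT
    return trace


print(aimd_trace(capacity_mss=8, rounds=12))
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 4, 5, 6]
```

The window climbs linearly, is halved once it overshoots, and climbs again, which is the saw tooth shape sketched on the slide.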
TCP Congestion Control: details
- sender limits transmission:
    LastByteSent - LastByteAcked ≤ cwnd
- cwnd is dynamic, function of perceived network congestion
TCP sending rate:
- roughly: send cwnd bytes, wait RTT for ACKs, then send more bytes
    rate ≈ cwnd/RTT bytes/sec
[Figure: sender sequence number space, showing last byte ACKed, bytes sent but not yet ACKed ("in-flight", at most cwnd), and last byte sent.]
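The window constraint and the rate estimate can be written down directly. A minimal sketch, assuming byte counters and an RTT in seconds; the function names are illustrative and flow control (the receive window) is ignored.

```python
def bytes_sendable(last_byte_sent, last_byte_acked, cwnd):
    """How many more bytes may be sent while keeping in-flight data within cwnd."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, cwnd - in_flight)


def approx_rate(cwnd_bytes, rtt_s):
    """Rough sending rate in bytes/sec: one cwnd of data per RTT."""
    return cwnd_bytes / rtt_s


print(bytes_sendable(last_byte_sent=5000, last_byte_acked=3000, cwnd=4000))  # 2000
print(approx_rate(cwnd_bytes=10000, rtt_s=0.5))                              # 20000.0
```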
TCP Slow Start
- when connection begins, increase rate exponentially until first loss event:
  - initially cwnd = 1 MSS
  - double cwnd every RTT
  - done by incrementing cwnd for every ACK received
- summary: initial rate is slow but ramps up exponentially fast
[Figure: timeline between Host A and Host B; A sends one segment in the first RTT, two segments in the next, then four segments.]
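Why does a per-ACK increment double the window every RTT? The sketch below assumes no loss and one ACK per segment (no delayed ACKs), with cwnd counted in MSS; slow_start_rounds is a hypothetical helper name.

```python
def slow_start_rounds(rounds, mss=1):
    """Return the window used in each RTT when cwnd grows by 1 MSS per ACK."""
    cwnd = mss
    sizes = []
    for _ in range(rounds):
        sizes.append(cwnd)
        acks = cwnd // mss       # assumed: every segment of this round is ACKed
        for _ in range(acks):
            cwnd += mss          # increment cwnd for every ACK received
    return sizes


print(slow_start_rounds(4))  # [1, 2, 4, 8]
```

Each round sends cwnd segments and receives cwnd ACKs, so the window grows by its own size, matching the one, two, four segments in the timeline.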
TCP: detecting, reacting to loss
- loss indicated by timeout:
  - cwnd set to 1 MSS
  - window then grows exponentially (as in slow start) to threshold, then grows linearly
- loss indicated by 3 duplicate ACKs: TCP RENO
  - dup ACKs indicate network capable of delivering some segments
  - cwnd is cut in half, window then grows linearly
- TCP Tahoe always sets cwnd to 1 (timeout or 3 duplicate acks)
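The difference between Tahoe and Reno comes down to one branch. A simplified sketch, assuming cwnd in MSS, integer halving, and a threshold set to half of cwnd at the loss; the event and variant strings are made up for illustration, and fast recovery's temporary window inflation is left out.

```python
def on_loss(cwnd, event, variant):
    """Return (new_cwnd, threshold) in MSS after a loss event.

    event:   "timeout" or "3dupacks"
    variant: "tahoe" or "reno"
    """
    threshold = max(cwnd // 2, 1)
    if event == "timeout" or variant == "tahoe":
        return 1, threshold          # restart from 1 MSS, grow exponentially to threshold
    if event == "3dupacks" and variant == "reno":
        return threshold, threshold  # cut in half, then grow linearly
    raise ValueError("unknown event or variant")


print(on_loss(16, "timeout", "reno"))    # (1, 8)
print(on_loss(16, "3dupacks", "reno"))   # (8, 8)
print(on_loss(16, "3dupacks", "tahoe"))  # (1, 8)
```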
TCP: switching from slow start to CA
Q: when should the exponential increase switch to linear?
A: when cwnd gets to 1/2 of its value before timeout.
Implementation:
- variable ssthresh
- on loss event, ssthresh is set to 1/2 of cwnd just before loss event
Check out the online interactive exercises for more examples: http://gaia.cs.umass.edu/kurose_ross/interactive/
Summary: TCP Congestion Control
[FSM with three states: slow start, congestion avoidance, fast recovery. Transitions are written as event: actions.]
initial (Λ): cwnd = 1 MSS, ssthresh = 64 KB, dupACKcount = 0; enter slow start
slow start:
- new ACK: cwnd = cwnd + MSS, dupACKcount = 0, transmit new segment(s) as allowed
- duplicate ACK: dupACKcount++
- cwnd > ssthresh: Λ; go to congestion avoidance
- timeout: ssthresh = cwnd/2, cwnd = 1 MSS, dupACKcount = 0, retransmit missing segment; stay in slow start
- dupACKcount == 3: ssthresh = cwnd/2, cwnd = ssthresh + 3, retransmit missing segment; go to fast recovery
congestion avoidance:
- new ACK: cwnd = cwnd + MSS · (MSS/cwnd), dupACKcount = 0, transmit new segment(s) as allowed
- duplicate ACK: dupACKcount++
- timeout: ssthresh = cwnd/2, cwnd = 1 MSS, dupACKcount = 0, retransmit missing segment; go to slow start
- dupACKcount == 3: ssthresh = cwnd/2, cwnd = ssthresh + 3, retransmit missing segment; go to fast recovery
fast recovery:
- duplicate ACK: cwnd = cwnd + MSS, transmit new segment(s) as allowed
- new ACK: cwnd = ssthresh, dupACKcount = 0; go to congestion avoidance
- timeout: ssthresh = cwnd/2, cwnd = 1, dupACKcount = 0, retransmit missing segment; go to slow start
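The state machine can be sketched as a small Python class. This is an illustration under assumptions, not a real TCP implementation: windows are floats in units of MSS, the initial ssthresh of 64 MSS stands in for the 64 KB on the slide, and retransmission and segment transmission are left out. The class and method names are hypothetical.

```python
MSS = 1.0  # assumed unit: all windows are expressed in MSS


class RenoSender:
    """Tracks cwnd, ssthresh, and state for the three-state congestion FSM."""

    def __init__(self):
        self.state = "slow_start"
        self.cwnd = 1 * MSS
        self.ssthresh = 64 * MSS   # stand-in for the slide's 64 KB
        self.dup_acks = 0

    def on_new_ack(self):
        if self.state == "fast_recovery":
            self.cwnd = self.ssthresh              # deflate window, leave recovery
            self.state = "congestion_avoidance"
        elif self.state == "slow_start":
            self.cwnd += MSS                       # exponential growth per RTT
            if self.cwnd > self.ssthresh:
                self.state = "congestion_avoidance"
        else:
            self.cwnd += MSS * (MSS / self.cwnd)   # about +1 MSS per RTT
        self.dup_acks = 0

    def on_dup_ack(self):
        if self.state == "fast_recovery":
            self.cwnd += MSS
            return
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.ssthresh = self.cwnd / 2
            self.cwnd = self.ssthresh + 3 * MSS
            self.state = "fast_recovery"

    def on_timeout(self):
        self.ssthresh = self.cwnd / 2
        self.cwnd = 1 * MSS
        self.dup_acks = 0
        self.state = "slow_start"


s = RenoSender()
for _ in range(3):
    s.on_new_ack()
for _ in range(3):
    s.on_dup_ack()
print(s.state, s.cwnd, s.ssthresh)  # fast_recovery 5.0 2.0
```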
TCP throughput
- avg. TCP thruput as function of window size, RTT?
  - ignore slow start, assume always data to send
- W: window size (measured in bytes) where loss occurs
  - avg. window size (# in-flight bytes) is 3/4 W
  - avg. thruput is 3/4 W per RTT
    avg TCP thruput = (3/4) · W/RTT bytes/sec
[Plot: saw tooth window size oscillating between W/2 and W.]
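The 3/4 factor is just the mean of a window that ramps linearly from W/2 to W. A quick numeric check, assuming an idealized saw tooth sampled once per step; the function names are illustrative.

```python
def average_window(w, step=1):
    """Mean of a window that rises linearly from w/2 to w (inclusive)."""
    samples = list(range(w // 2, w + 1, step))
    return sum(samples) / len(samples)


def avg_throughput(w_bytes, rtt_s):
    """Average throughput in bytes/sec: (3/4) * W / RTT."""
    return 0.75 * w_bytes / rtt_s


print(average_window(100))            # 75.0, i.e. 3/4 of W
print(avg_throughput(100_000, 0.5))   # 150000.0
```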
TCP Futures: TCP over "long, fat pipes"
- example: 1500 byte segments, 100ms RTT, want 10 Gbps throughput
- requires W = 83,333 in-flight segments
- throughput in terms of segment loss probability, L [Mathis 1997]:
    TCP throughput = 1.22 · MSS / (RTT · sqrt(L))
  to achieve 10 Gbps throughput, need a loss rate of L = 2·10^-10 (a very small loss rate)
- new versions of TCP for high-speed
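Both numbers on this slide can be checked with a few lines of arithmetic. The sketch assumes throughput in bits/sec, MSS in bytes, and RTT in seconds; the helper names are made up for illustration.

```python
import math


def required_window_segments(rate_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to fill the pipe: rate * RTT / segment size."""
    return rate_bps * rtt_s / (mss_bytes * 8)


def mathis_throughput_bps(mss_bytes, rtt_s, loss):
    """Throughput = 1.22 * MSS / (RTT * sqrt(L)), converted to bits/sec."""
    return 1.22 * mss_bytes * 8 / (rtt_s * math.sqrt(loss))


def required_loss(rate_bps, rtt_s, mss_bytes):
    """Invert the formula above to get the loss rate that allows rate_bps."""
    return (1.22 * mss_bytes * 8 / (rtt_s * rate_bps)) ** 2


print(int(required_window_segments(10e9, 0.1, 1500)))  # 83333
print(required_loss(10e9, 0.1, 1500))                  # about 2.1e-10
```

A loss rate near 2·10^-10 means roughly one lost segment in every five billion, which is why the slide calls for new TCP versions on high-speed paths.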
TCP Fairness
fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K
[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R.]
Why is TCP fair?
two competing sessions:
- additive increase gives slope of 1, as throughput increases
- multiplicative decrease decreases throughput proportionally
[Plot: Connection 2 throughput versus Connection 1 throughput, both axes up to R, with the equal bandwidth share line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
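The convergence argument can be checked numerically. A sketch under simplifying assumptions: both sessions have the same RTT, both see a loss in the same round whenever their combined rate exceeds the capacity, and rates are in arbitrary units (two_flow_aimd is a hypothetical name).

```python
def two_flow_aimd(x1, x2, capacity, rounds):
    """Evolve two AIMD flows sharing one bottleneck; return their final rates."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Assumed synchronized loss: both flows halve (gap between them halves too).
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Additive increase: both grow by 1 (gap between them is unchanged).
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


a, b = two_flow_aimd(1, 41, capacity=50, rounds=200)
print(abs(a - b))  # close to 0: the flows end up with nearly equal shares
```

Additive increase keeps the difference between the two rates constant, while every multiplicative decrease halves it, so the trajectory drifts toward the equal bandwidth share line.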
Fairness (more)
Fairness and UDP
- multimedia apps often do not use TCP
  - do not want rate throttled by congestion control
- instead use UDP:
  - send audio/video at constant rate, tolerate packet loss
Fairness, parallel TCP connections
- application can open multiple parallel connections between two hosts
- web browsers do this
- e.g., link of rate R with 9 existing connections:
  - new app asks for 1 TCP, gets rate R/10
  - new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
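The parallel-connection example is simple per-connection sharing. A sketch assuming every connection gets an equal share of the link; app_share is an illustrative name.

```python
from fractions import Fraction


def app_share(new_conns, existing_conns):
    """Fraction of link rate R obtained by an app opening new_conns connections."""
    return Fraction(new_conns, new_conns + existing_conns)


print(app_share(1, 9))    # 1/10
print(app_share(11, 9))   # 11/20, slightly more than half of R
```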
Explicit Congestion Notification (ECN)
network-assisted congestion control:
- two bits in IP header (ToS field) marked by network router to indicate congestion
- congestion indication carried to receiving host
- receiver (seeing congestion indication in IP datagram) sets ECE bit on receiver-to-sender ACK segment to notify sender of congestion
[Figure: source and destination protocol stacks (application, transport, network, link, physical); the IP datagram leaves the source with ECN=00 and arrives with ECN=11 after being marked by a congested router; the destination returns a TCP ACK segment with ECE=1.]
Chapter 3: summary
- principles behind transport layer services:
  - multiplexing, demultiplexing
  - reliable data transfer
  - flow control
  - congestion control
- instantiation, implementation in the Internet
  - UDP
  - TCP
next:
- leaving the network "edge" (application, transport layers)
- into the network "core"
- two network layer chapters:
  - data plane
  - control plane
Quiz 2
Consider the following network. Host A wants to simultaneously send messages to hosts B and C. A is connected to B and C via a broadcast channel; a packet sent by A is carried by the channel to both B and C. Suppose that the broadcast channel connecting A, B, and C can independently lose and corrupt messages (and so, for example, a message sent from A might be correctly received by B, but not by C). A stop-and-wait-like error-control protocol is used for reliably transferring packets from A to B and C, such that A will not send a new packet from the upper layer until it knows that both B and C have correctly received the current packet. Packets from upper layers may be queued in a sufficiently large buffer. Give FSM descriptions of A and C. (Hint: The FSM for B should be the same as for C.)
Quiz 3
Consider the GBN protocol with a sender window size of 4 and a sequence number range of 1,024. Suppose that at time t, the next in-order packet that the receiver is expecting has a sequence number of k. Assume that the medium does not reorder messages. Answer the following questions:
a. What are the possible sets of sequence numbers inside the sender's window at time t?
b. What are all possible values of the ACK field in all possible messages currently propagating back to the sender at time t?
a. Here we have a window size of N = 4. Suppose the receiver has received packet k-1, and has ACKed that and all other preceding packets. If all of these ACKs have been received by the sender, then the sender's window is [k, k+N-1]. Suppose next that none of the ACKs have been received at the sender. In this second case, the sender's window contains k-1 and the N packets up to and including k-1. The sender's window is thus [k-N, k-1]. By these arguments, the sender's window is of size 4 and begins somewhere in the range [k-N, k].
b. If the receiver is waiting for packet k, then it has received (and ACKed) packet k-1 and the N-1 packets before that. If none of those N ACKs have yet been received by the sender, then ACK messages with values of [k-N, k-1] may still be propagating back. Because the sender has sent packets [k-N, k-1], it must be the case that the sender has already received an ACK for k-N-1. Once the receiver has sent an ACK for k-N-1, it will never send an ACK that is less than k-N-1. Thus the range of in-flight ACK values can range from k-N-1 to k-1.
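The two answers can be enumerated for a concrete k. A sketch assuming N = 4 and a sequence number space of 1,024 as in the quiz, with arithmetic taken modulo the sequence space so that small k wraps around; the function names are illustrative.

```python
def possible_sender_windows(k, n=4, seq_space=1024):
    """All windows [base, base+n-1] with base anywhere in [k-n, k]."""
    return [[(base + i) % seq_space for i in range(n)]
            for base in range(k - n, k + 1)]


def possible_inflight_acks(k, n=4, seq_space=1024):
    """ACK values that may still be propagating back: k-n-1 through k-1."""
    return [a % seq_space for a in range(k - n - 1, k)]


print(possible_sender_windows(10))
# [[6, 7, 8, 9], [7, 8, 9, 10], [8, 9, 10, 11], [9, 10, 11, 12], [10, 11, 12, 13]]
print(possible_inflight_acks(10))
# [5, 6, 7, 8, 9]
```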
Chapter 4: network layer
chapter goals:
- understand principles behind network layer services:
  - network layer service models
  - forwarding versus routing
  - how a router works
  - generalized forwarding
- instantiation, implementation in the Internet
4-30
Network Layer Data Plane
Network layer
- transport segment from sending to receiving host
- on sending side encapsulates segments into datagrams
- on receiving side, delivers segments to transport layer
- network layer protocols in every host, router
- router examines header fields in all IP datagrams passing through it
[Figure: end hosts run the full stack (application, transport, network, data link, physical); each router along the path runs only the network, data link, and physical layers.]
Two key network-layer functions
network-layer functions:
- forwarding: move packets from router's input to appropriate router output
- routing: determine route taken by packets from source to destination
  - routing algorithms
analogy: taking a trip
- forwarding: process of getting through single interchange
- routing: process of planning trip from source to destination
Network service model
Q: What service model for "channel" transporting datagrams from sender to receiver?
example services for individual datagrams:
- guaranteed delivery
- guaranteed delivery with less than 40 msec delay
example services for a flow of datagrams:
- in-order datagram delivery
- guaranteed minimum bandwidth to flow
- restrictions on changes in inter-packet spacing
Router architecture overview
- high-level view of generic router architecture:
  - routing, management control plane (software): operates in millisecond time frame
  - forwarding data plane (hardware): operates in nanosecond time frame
[Figure: router input ports and router output ports connected by a high-speed switching fabric, with a routing processor attached.]
Input port functions
[Figure: input port stages in order: line termination (physical layer: bit-level reception), link layer protocol, receive (data link layer: e.g., Ethernet, see chapter 5), lookup, forwarding, queueing, then the switch fabric.]
decentralized switching:
- using header field values, lookup output port using forwarding table in input port memory ("match plus action")
- goal: complete input port processing at 'line speed'
- queuing: if datagrams arrive faster than forwarding rate into switch fabric
Input port functions
[Figure: input port stages in order: line termination (physical layer: bit-level reception), link layer protocol, receive (data link layer: e.g., Ethernet, see chapter 5), lookup, forwarding, queueing, then the switch fabric.]
decentralized switching:
- using header field values, lookup output port using forwarding table in input port memory ("match plus action")
- destination-based forwarding: forward based only on destination IP address (traditional)
- generalized forwarding: forward based on any set of header field values
Destination-based forwarding
forwarding table:
Destination Address Range                                                         Link Interface
11001000 00010111 00010000 00000000 through 11001000 00010111 00010111 11111111   0
11001000 00010111 00011000 00000000 through 11001000 00010111 00011000 11111111   1
11001000 00010111 00011001 00000000 through 11001000 00010111 00011111 11111111   2
otherwise                                                                         3
Q: but what happens if ranges don't divide up so nicely?
Longest prefix matching
longest prefix matching: when looking for forwarding table entry for given destination address, use longest address prefix that matches destination address.
Destination Address Range             Link interface
11001000 00010111 00010*** ********   0
11001000 00010111 00011000 ********   1
11001000 00010111 00011*** ********   2
otherwise                             3
examples:
DA: 11001000 00010111 00010110 10100001   which interface?
DA: 11001000 00010111 00011000 10101010   which interface?
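The two example lookups can be worked through with a short linear-scan sketch. It assumes addresses and prefixes are given as bit strings (spaces ignored) and uses the table from this slide; the names TABLE, DEFAULT_INTERFACE, and lookup are illustrative, and a real router would use a trie or TCAM rather than a scan.

```python
# (prefix, link interface) pairs from the forwarding table on the slide.
TABLE = [
    ("11001000 00010111 00010", 0),
    ("11001000 00010111 00011000", 1),
    ("11001000 00010111 00011", 2),
]
DEFAULT_INTERFACE = 3  # the "otherwise" row


def lookup(dest_bits):
    """Return the interface of the longest prefix matching the 32-bit address."""
    dest = dest_bits.replace(" ", "")
    best_len, best_if = -1, DEFAULT_INTERFACE
    for prefix, interface in TABLE:
        bits = prefix.replace(" ", "")
        if dest.startswith(bits) and len(bits) > best_len:
            best_len, best_if = len(bits), interface
    return best_if


print(lookup("11001000 00010111 00010110 10100001"))  # 0
print(lookup("11001000 00010111 00011000 10101010"))  # 1
```

The second address matches both the 21-bit prefix (interface 2) and the 24-bit prefix (interface 1); the longer match wins.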
Longest prefix matching
- we'll see why longest prefix matching is used shortly, when we study addressing
- longest prefix matching: often performed using ternary content addressable memories (TCAMs)
  - content addressable: present address to TCAM: retrieve address in one clock cycle, regardless of table size
  - Cisco Catalyst: can hold up to ~1M routing table entries in TCAM
Switching fabrics
- transfer packet from input buffer to appropriate output buffer
- switching rate: rate at which packets can be transferred from inputs to outputs
  - often measured as multiple of input/output line rate
  - N inputs: switching rate N times line rate desirable
- three types of switching fabrics: memory, bus, crossbar
Switching via memory
first generation routers:
- traditional computers with switching under direct control of CPU
- packet copied to system's memory
- speed limited by memory bandwidth (2 bus crossings per datagram)
[Figure: input port (e.g., Ethernet), memory, and output port (e.g., Ethernet) connected over the system bus.]
Switching via a bus
- datagram from input port memory to output port memory via a shared bus
- bus contention: switching speed limited by bus bandwidth
- 32 Gbps bus, Cisco 5600: sufficient speed for access and enterprise routers
[Figure: input and output ports attached to a shared bus.]
Switching via interconnection network
- overcome bus bandwidth limitations
- banyan networks, crossbar, other interconnection nets initially developed to connect processors in multiprocessor
- advanced design: fragmenting datagram into fixed length cells, switch cells through the fabric
- Cisco 12000: switches 60 Gbps through the interconnection network
[Figure: crossbar connecting input ports to output ports.]
2
Transport Layer 3-7
lin original dataloutlin original data plus
retransmitted data
free buffer space
Causescosts of congestion scenario 2Idealization known loss
packets can be lost dropped at router due to full buffers
sect sender only resends if packet known to be lost
R2
R2lin
lout
when sending at R2 some packets are retransmissions but asymptotic goodput is still R2 (why)
A
Host BTransport Layer 3-8
A
linloutlincopy
free buffer space
timeout
R2
R2lin
lout
when sending at R2 some packets are retransmissions including duplicated that are delivered
Host B
Realistic duplicatessect packets can be lost dropped at
router due to full bufferssect sender times out prematurely
sending two copies both of which are delivered
Causescosts of congestion scenario 2
Transport Layer 3-9
R2
lout
when sending at R2 some packets are retransmissions including duplicated that are delivered
ldquocostsrdquo of congestionsect more work (retrans) for given ldquogoodputrdquosect unneeded retransmissions link carries multiple copies of pkt
bull decreasing goodput
R2lin
Causescosts of congestion scenario 2Realistic duplicatessect packets can be lost dropped at
router due to full bufferssect sender times out prematurely
sending two copies both of which are delivered
Transport Layer 3-10
sect four senderssect multihop pathssect timeoutretransmit
Q what happens as lin and linrsquo
increase
finite shared output link buffers
Host A lout
Causescosts of congestion scenario 3
Host B
Host CHost D
lin original datalin original data plus
retransmitted data
A as red linrsquo increases all arriving blue pkts at upper queue are dropped blue throughput g 0
Transport Layer 3-11
another ldquocostrdquo of congestionsect when packet dropped any ldquoupstream
transmission capacity used for that packet was wasted
Causescosts of congestion scenario 3
C2
C2
l out
linrsquo
Transport Layer 3-12
Chapter 3 outline
31 transport-layer services
32 multiplexing and demultiplexing
33 connectionless transport UDP
34 principles of reliable data transfer
35 connection-oriented transport TCPbull segment structurebull reliable data transferbull flow controlbull connection management
36 principles of congestion control
37 TCP congestion control
3
Transport Layer 3-13
TCP congestion control additive increase multiplicative decrease
sect approach sender increases transmission rate (window size) probing for usable bandwidth until loss occursbull additive increase increase cwnd by 1 MSS every
RTT until loss detectedbull multiplicative decrease cut cwnd in half after loss
cwnd
TC
P s
ende
r co
nges
tion
win
dow
siz
e
AIMD saw toothbehavior probing
for bandwidth
additively increase window size helliphellip until loss occurs (then cut window in half)
timeTransport Layer 3-14
TCP Congestion Control details
sect sender limits transmission
sect cwnd is dynamic function of perceived network congestion
TCP sending ratesect roughly send cwnd
bytes wait RTT for ACKS then send more bytes
last byteACKed sent not-
yet ACKed(ldquoin-flightrdquo)
last byte sent
cwnd
LastByteSent-LastByteAcked
lt cwnd
sender sequence number space
rate ~~cwndRTT
bytessec
Transport Layer 3-15
TCP Slow Start sect when connection begins
increase rate exponentially until first loss eventbull initially cwnd = 1 MSSbull double cwnd every RTTbull done by incrementing cwnd for every ACK received
sect summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-16
TCP detecting reacting to loss
sect loss indicated by timeoutbull cwnd set to 1 MSS bull window then grows exponentially (as in slow start)
to threshold then grows linearlysect loss indicated by 3 duplicate ACKs TCP RENObull dup ACKs indicate network capable of delivering
some segments bull cwnd is cut in half window then grows linearly
sect TCP Tahoe always sets cwnd to 1 (timeout or 3 duplicate acks)
Transport Layer 3-17
Q when should the exponential increase switch to linear
A when cwnd gets to 12 of its value before timeout
Implementationsect variable ssthreshsect on loss event ssthresh
is set to 12 of cwnd just before loss event
TCP switching from slow start to CA
Check out the online interactive exercises for more examples httpgaiacsumassedukurose_rossinteractive Transport Layer 3-18
Summary TCP Congestion Control
timeoutssthresh = cwnd2
cwnd = 1 MSSdupACKcount = 0
retransmit missing segment
Lcwnd gt ssthresh
congestionavoidance
cwnd = cwnd + MSS (MSScwnd)dupACKcount = 0
transmit new segment(s) as allowed
new ACK
dupACKcount++duplicate ACK
fastrecovery
cwnd = cwnd + MSStransmit new segment(s) as allowed
duplicate ACK
ssthresh= cwnd2cwnd = ssthresh + 3
retransmit missing segment
dupACKcount == 3
timeoutssthresh = cwnd2cwnd = 1 dupACKcount = 0retransmit missing segment
ssthresh= cwnd2cwnd = ssthresh + 3retransmit missing segment
dupACKcount == 3cwnd = ssthreshdupACKcount = 0
New ACK
slow start
timeoutssthresh = cwnd2
cwnd = 1 MSSdupACKcount = 0
retransmit missing segment
cwnd = cwnd+MSSdupACKcount = 0transmit new segment(s) as allowed
new ACKdupACKcount++duplicate ACK
Lcwnd = 1 MSS
ssthresh = 64 KBdupACKcount = 0
NewACK
NewACK
NewACK
4
Transport Layer 3-19
TCP throughputsect avg TCP thruput as function of window size RTTbull ignore slow start assume always data to send
sect W window size (measured in bytes) where loss occursbull avg window size ( in-flight bytes) is frac34 Wbull avg thruput is 34W per RTT
W
W2
avg TCP thruput = 34
WRTT bytessec
Transport Layer 3-20
TCP Futures TCP over ldquolong fat pipesrdquo
sect example 1500 byte segments 100ms RTT want 10 Gbps throughput
sect requires W = 83333 in-flight segmentssect throughput in terms of segment loss probability L
[Mathis 1997]
to achieve 10 Gbps throughput need a loss rate of L = 210-10 ndash a very small loss rate
sect new versions of TCP for high-speed
TCP throughput = 122 MSSRTT L
Transport Layer 3-21
fairness goal if K TCP sessions share same bottleneck link of bandwidth R each should have average rate of RK
TCP connection 1
bottleneckrouter
capacity R
TCP Fairness
TCP connection 2
Transport Layer 3-22
Why is TCP fairtwo competing sessionssect additive increase gives slope of 1 as throughout increasessect multiplicative decrease decreases throughput proportionally
R
R
equal bandwidth share
Connection 1 throughput
Con
nect
ion
2 th
roug
hput
congestion avoidance additive increaseloss decrease window by factor of 2
congestion avoidance additive increaseloss decrease window by factor of 2
Transport Layer 3-23
Fairness (more)Fairness and UDPsect multimedia apps often
do not use TCPbull do not want rate
throttled by congestion control
sect instead use UDPbull send audiovideo at
constant rate tolerate packet loss
Fairness parallel TCP connections
sect application can open multiple parallel connections between two hosts
sect web browsers do this sect eg link of rate R with 9
existing connectionsbull new app asks for 1 TCP gets
rate R10bull new app asks for 11 TCPs
gets R2
Transport Layer 3-24
network-assisted congestion controlsect two bits in IP header (ToS field) marked by network router
to indicate congestionsect congestion indication carried to receiving hostsect receiver (seeing congestion indication in IP datagram) )
sets ECE bit on receiver-to-sender ACK segment to notify sender of congestion
Explicit Congestion Notification (ECN)
sourceapplicationtransportnetworklink
physical
destinationapplicationtransportnetworklink
physical
ECN=00 ECN=11
ECE=1
IP datagram
TCP ACK segment
5
Transport Layer 3-25
Chapter 3 summarysect principles behind transport
layer servicesbull multiplexing
demultiplexingbull reliable data transferbull flow controlbull congestion control
sect instantiation implementation in the Internetbull UDPbull TCP
nextsect leaving the network ldquoedgerdquo (application transport layers)
sect into the network ldquocorerdquo
sect two network layer chaptersbull data planebull control plane
Quiz 2
Consider the following network Host A wants tosimultaneously send messages to hosts B and C A isconnected to B and C via a broadcast channelmdasha packet sentby A is carried by the channel to both B and C Suppose thatthe broadcast channel connecting A B and C canindependently lose and corrupt messages (and so forexample a message sent from A might be correctly receivedby B but not by C) The stop-and-wait-like error-controlprotocol is used for reliably transferring packets from A to Band C such that A will send a new packet from the upperlayer until it knows that both B and C have correctlyreceived the current packet Packets from upper layers maybe queued in a sufficiently large buffer Give FSM descriptionsof A and C (Hint The FSM for B should be the same as forC)
Transport Layer 3-26
Quiz 3
Consider the GBN protocol with a sender window size of 4and sequence number range of 1024 Suppose that at time tthe next in-order packet that the receiver is expecting has asequence number of k Assume that the medium does notreorder message Answer the following questionsaWhat are the possible sets of sequence numbers inside thesenderrsquos window at time tbWhat are all possible values of the ACK field in all possiblemessages currently propagating back to the sender at time t
Transport Layer 3-27 Transport Layer 3-28
aHere we have a window size of N=3 Suppose the receiver has received packet k-1 and has ACKed that and all other preceding packets If all of these ACKs have been received by sender then senders window is [k k+N-1] Suppose next that none of the ACKs have been received at the sender In this second case the senders window contains k-1 and the N packets up to and including k-1 The senders window is thus [k-Nk-1] By these arguments the senders window is of size 3 and begins somewhere in the range [k-Nk]bIf the receiver is waiting for packet k then it has received (and ACKed) packet k-1 and the N-1 packets before that If none of those N ACKs have been yet received by the sender then ACK messages with values of [k-Nk-1] may still be propagating backBecause the sender has sent packets [k-N k-1] it must be the case that the sender has already received an ACK for k-N-1 Once the receiver has sent an ACK for k-N-1 it will never send an ACK that is less that k-N-1 Thus the range of in-flight ACK values can range from k-N-1 to k-1
Transport Layer 3-29
Chapter 4 network layer
chapter goalssect understand principles behind network layer
services bull network layer service modelsbull forwarding versus routingbull how a router worksbull generalized forwarding
sect instantiation implementation in the Internet
4-30
Network Layer Data Plane
6
Network layersect transport segment from
sending to receiving host sect on sending side
encapsulates segments into datagrams
sect on receiving side delivers segments to transport layer
sect network layer protocols in every host router
sect router examines header fields in all IP datagrams passing through it
applicationtransportnetworkdata linkphysical
applicationtransportnetworkdata linkphysical
networkdata linkphysical network
data linkphysical
networkdata linkphysical
networkdata linkphysical
networkdata linkphysical
networkdata linkphysical
networkdata linkphysical
networkdata linkphysical
networkdata linkphysical
networkdata linkphysicalnetwork
data linkphysical
4-31
Network Layer Data Plane
Two key network-layer functions
network-layer functionssectforwarding move packets from routerrsquos input to appropriate router outputsectrouting determine route taken by packets from source to destinationbull routing algorithms
analogy taking a tripsect forwarding process of
getting through single interchange
sect routing process of planning trip from source to destination
4-32
Network Layer Data Plane
Network service modelQ What service model for ldquochannelrdquo transporting datagrams from sender to receiver
example services for individual datagrams
sect guaranteed deliverysect guaranteed delivery with
less than 40 msec delay
example services for a flow of datagrams
sect in-order datagram deliverysect guaranteed minimum
bandwidth to flowsect restrictions on changes in
inter-packet spacing
TCP congestion control: additive increase, multiplicative decrease (AIMD)
- approach: sender increases transmission rate (window size), probing for usable bandwidth, until loss occurs
  - additive increase: increase cwnd by 1 MSS every RTT until loss detected
  - multiplicative decrease: cut cwnd in half after loss
[Figure: AIMD sawtooth behavior, probing for bandwidth. x-axis: time; y-axis: cwnd, the TCP sender congestion window size. The window is additively increased until loss occurs, then cut in half.]

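
The sawtooth can be reproduced with a few lines of Python. This is only a toy sketch: the function name `aimd_trace` and the rule "a loss is detected whenever cwnd exceeds `loss_threshold`" are assumptions made for illustration and do not come from the slides.

```python
def aimd_trace(rounds, loss_threshold, mss=1, start=1):
    """Return cwnd (in MSS) at the start of each RTT under a toy AIMD rule.

    Assumption: a loss is detected in any RTT where cwnd exceeds
    loss_threshold; a real sender learns about loss from timeouts or ACKs.
    """
    cwnd = start
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        if cwnd > loss_threshold:
            cwnd = max(mss, cwnd // 2)  # multiplicative decrease: halve cwnd
        else:
            cwnd += mss                 # additive increase: +1 MSS per RTT
    return trace


print(aimd_trace(12, loss_threshold=8, start=5))
```

Plotting the returned list against the round index gives the sawtooth shape described in the figure.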
TCP congestion control: details
- sender limits transmission: LastByteSent - LastByteAcked ≤ cwnd
- cwnd is dynamic, a function of perceived network congestion
TCP sending rate:
- roughly: send cwnd bytes, wait RTT for ACKs, then send more bytes
- $\text{rate} \approx \frac{\text{cwnd}}{RTT}$ bytes/sec
[Figure: sender sequence number space, showing last byte ACKed, bytes sent but not yet ACKed ("in-flight"), and last byte sent; the in-flight span is bounded by cwnd.]

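
The window constraint and the rate approximation can be written out as two small helpers. The names `can_send` and `approx_rate` are hypothetical, and flow control (rwnd) is deliberately left out of this sketch.

```python
def can_send(last_byte_sent, last_byte_acked, cwnd, nbytes):
    """True if sending nbytes more keeps the in-flight data within cwnd.

    Assumption: only the congestion window limits the sender here;
    the receive window is ignored.
    """
    in_flight = last_byte_sent - last_byte_acked
    return in_flight + nbytes <= cwnd


def approx_rate(cwnd_bytes, rtt_s):
    """Rough sending rate in bytes/sec: one window of data per RTT."""
    return cwnd_bytes / rtt_s


print(can_send(3000, 1000, 4000, 1500))
print(approx_rate(10000, 0.25))
```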
TCP slow start
- when connection begins, increase rate exponentially until first loss event:
  - initially cwnd = 1 MSS
  - double cwnd every RTT
  - done by incrementing cwnd for every ACK received
- summary: initial rate is slow but ramps up exponentially fast
[Figure: timeline between Host A and Host B: one segment is sent in the first RTT, two segments in the second, four segments in the third.]

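
A short sketch shows why incrementing cwnd by one MSS per ACK doubles the window every RTT. It assumes no loss and that every segment sent in an RTT is ACKed within that RTT; `slow_start_rounds` is an illustrative name.

```python
def slow_start_rounds(rtts, mss=1):
    """Return the cwnd (in MSS) used in each RTT during loss-free slow start."""
    cwnd = mss
    sizes = []
    for _ in range(rtts):
        sizes.append(cwnd)
        acks = cwnd // mss        # one ACK comes back per segment sent
        for _ in range(acks):
            cwnd += mss           # +1 MSS per ACK, so cwnd doubles per RTT
    return sizes


print(slow_start_rounds(4))
```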
TCP: detecting, reacting to loss
- loss indicated by timeout:
  - cwnd set to 1 MSS
  - window then grows exponentially (as in slow start) to threshold, then grows linearly
- loss indicated by 3 duplicate ACKs: TCP Reno
  - dup ACKs indicate network capable of delivering some segments
  - cwnd is cut in half, window then grows linearly
- TCP Tahoe always sets cwnd to 1 MSS (timeout or 3 duplicate ACKs)

TCP: switching from slow start to congestion avoidance (CA)
Q: when should the exponential increase switch to linear?
A: when cwnd gets to 1/2 of its value before timeout.
Implementation:
- variable ssthresh
- on loss event, ssthresh is set to 1/2 of cwnd just before loss event
Check out the online interactive exercises for more examples: http://gaia.cs.umass.edu/kurose_ross/interactive/

Summary: TCP congestion control (sender FSM)
initial (Λ): cwnd = 1 MSS, ssthresh = 64 KB, dupACKcount = 0; enter slow start

slow start
- new ACK: cwnd = cwnd + MSS, dupACKcount = 0, transmit new segment(s) as allowed
- duplicate ACK: dupACKcount++
- timeout: ssthresh = cwnd/2, cwnd = 1 MSS, dupACKcount = 0, retransmit missing segment (stay in slow start)
- cwnd > ssthresh (Λ): go to congestion avoidance
- dupACKcount == 3: ssthresh = cwnd/2, cwnd = ssthresh + 3 MSS, retransmit missing segment; go to fast recovery

congestion avoidance
- new ACK: cwnd = cwnd + MSS · (MSS/cwnd), dupACKcount = 0, transmit new segment(s) as allowed
- duplicate ACK: dupACKcount++
- timeout: ssthresh = cwnd/2, cwnd = 1 MSS, dupACKcount = 0, retransmit missing segment; go to slow start
- dupACKcount == 3: ssthresh = cwnd/2, cwnd = ssthresh + 3 MSS, retransmit missing segment; go to fast recovery

fast recovery
- duplicate ACK: cwnd = cwnd + MSS, transmit new segment(s) as allowed
- new ACK: cwnd = ssthresh, dupACKcount = 0; go to congestion avoidance
- timeout: ssthresh = cwnd/2, cwnd = 1 MSS, dupACKcount = 0, retransmit missing segment; go to slow start

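
The three states and their transitions can be sketched as a small Python class. This is an illustrative model, not a TCP implementation: the class name `RenoSender` is hypothetical, cwnd and ssthresh are tracked in units of MSS (so the initial ssthresh is a plain number rather than 64 KB), and retransmission and segment transmission are not modeled.

```python
MSS = 1  # assumption: work in units of MSS for readability


class RenoSender:
    """Toy model of the slow start / congestion avoidance / fast recovery FSM."""

    def __init__(self, ssthresh=64):
        self.state = "slow_start"
        self.cwnd = 1 * MSS
        self.ssthresh = ssthresh
        self.dup_acks = 0

    def on_new_ack(self):
        if self.state == "fast_recovery":
            self.cwnd = self.ssthresh              # deflate the window
            self.state = "congestion_avoidance"
        elif self.state == "slow_start":
            self.cwnd += MSS                       # exponential growth per RTT
            if self.cwnd > self.ssthresh:
                self.state = "congestion_avoidance"
        else:
            self.cwnd += MSS * (MSS / self.cwnd)   # about +1 MSS per RTT
        self.dup_acks = 0

    def on_dup_ack(self):
        if self.state == "fast_recovery":
            self.cwnd += MSS                       # inflate per extra dup ACK
            return
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.ssthresh = self.cwnd / 2
            self.cwnd = self.ssthresh + 3 * MSS
            self.state = "fast_recovery"           # missing segment would be resent here

    def on_timeout(self):
        self.ssthresh = self.cwnd / 2
        self.cwnd = 1 * MSS
        self.dup_acks = 0
        self.state = "slow_start"                  # missing segment would be resent here


sender = RenoSender(ssthresh=4)
for _ in range(4):
    sender.on_new_ack()
print(sender.state, sender.cwnd)
```

Feeding the object a sequence of `on_new_ack`, `on_dup_ack`, and `on_timeout` calls lets you step through the transitions listed above by hand.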
TCP throughput
- avg. TCP throughput as function of window size, RTT?
  - ignore slow start, assume always data to send
- W: window size (measured in bytes) where loss occurs
  - avg. window size (number of in-flight bytes) is 3/4 W
  - avg. throughput is 3/4 W per RTT
$\text{avg TCP throughput} = \frac{3}{4} \cdot \frac{W}{RTT}$ bytes/sec
[Figure: sawtooth of window size oscillating between W/2 and W.]

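
The 3/4 factor follows from averaging a window that climbs linearly from W/2 to W. The sketch below checks this numerically; the function names and the idealized, perfectly linear sawtooth are assumptions for illustration.

```python
def sawtooth_average(w, steps=1000):
    """Average of a window that grows linearly from W/2 to W (idealized)."""
    samples = [w / 2 + (w / 2) * i / steps for i in range(steps + 1)]
    return sum(samples) / len(samples)


def avg_tcp_throughput(w_bytes, rtt_s):
    """Average throughput in bytes/sec: (3/4) * W / RTT."""
    return 0.75 * w_bytes / rtt_s


print(sawtooth_average(100.0))
print(avg_tcp_throughput(100000, 0.25))
```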
TCP futures: TCP over "long, fat pipes"
- example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
- requires W = 83,333 in-flight segments
- throughput in terms of segment loss probability, L [Mathis 1997]:
  $\text{TCP throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$
- to achieve 10 Gbps throughput, need a loss rate of $L = 2 \cdot 10^{-10}$, a very small loss rate
- new versions of TCP for high-speed

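
Both numbers on this slide can be recomputed from the stated example. The sketch below assumes MSS is converted to bits so that throughput comes out in bits/sec; the function names are illustrative.

```python
import math


def required_window_segments(rate_bps, rtt_s, mss_bytes):
    """In-flight segments needed to sustain rate_bps over one RTT."""
    return rate_bps * rtt_s / (mss_bytes * 8)


def mathis_throughput_bps(mss_bytes, rtt_s, loss):
    """Throughput = 1.22 * MSS / (RTT * sqrt(L)), with MSS in bits."""
    return 1.22 * mss_bytes * 8 / (rtt_s * math.sqrt(loss))


def required_loss(rate_bps, rtt_s, mss_bytes):
    """Solve the formula above for the loss probability L."""
    return (1.22 * mss_bytes * 8 / (rtt_s * rate_bps)) ** 2


print(required_window_segments(10e9, 0.1, 1500))  # about 83,333 segments
print(required_loss(10e9, 0.1, 1500))             # about 2e-10
```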
TCP fairness
fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K
[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R.]

Why is TCP fair?
two competing sessions:
- additive increase gives slope of 1, as throughput increases
- multiplicative decrease decreases throughput proportionally
[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes from 0 to R, with the equal bandwidth share line. The trajectory alternates between congestion avoidance (additive increase) and loss (decrease window by factor of 2), moving toward the equal-share line.]

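
The convergence argument can be checked with a toy simulation. It assumes both sessions see every loss at the same time and that loss occurs exactly when the combined rate exceeds the capacity; `aimd_two_flows` is a hypothetical name.

```python
def aimd_two_flows(x1, x2, capacity, rounds):
    """Evolve two AIMD flows sharing one bottleneck; return final rates.

    Additive increase keeps the gap x2 - x1 unchanged; each synchronized
    halving cuts the gap in half, so the rates drift toward an equal share.
    """
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2   # loss: both cut throughput in half
        else:
            x1, x2 = x1 + 1, x2 + 1   # no loss: both increase by one unit
    return x1, x2


a, b = aimd_two_flows(1, 60, capacity=100, rounds=500)
print(abs(a - b))
```

Starting from a very unequal split, the gap between the two rates shrinks toward zero, which is the movement toward the equal bandwidth share line in the figure.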
Fairness (more)
Fairness and UDP
- multimedia apps often do not use TCP
  - do not want rate throttled by congestion control
- instead use UDP:
  - send audio/video at constant rate, tolerate packet loss
Fairness, parallel TCP connections
- application can open multiple parallel connections between two hosts
- web browsers do this
- e.g., link of rate R with 9 existing connections:
  - new app asks for 1 TCP, gets rate R/10
  - new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections, i.e., 11R/20)

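
The parallel-connection arithmetic is a one-line fraction, assuming the bottleneck is shared equally per connection. The helper name `app_share` is illustrative.

```python
from fractions import Fraction


def app_share(existing, new_conns):
    """Fraction of link rate R obtained by an app opening new_conns connections.

    Assumption: every TCP connection gets an equal share of the link.
    """
    return Fraction(new_conns, existing + new_conns)


print(app_share(9, 1))
print(app_share(9, 11))
```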
Explicit Congestion Notification (ECN)
network-assisted congestion control:
- two bits in IP header (ToS field) marked by network router to indicate congestion
- congestion indication carried to receiving host
- receiver (seeing congestion indication in IP datagram) sets ECE bit on receiver-to-sender ACK segment to notify sender of congestion
[Figure: source and destination protocol stacks (application, transport, network, link, physical). The IP datagram leaves the source with ECN=00 and is marked ECN=11 by a congested router; the destination returns a TCP ACK segment with ECE=1.]

Chapter 3: summary
- principles behind transport layer services:
  - multiplexing, demultiplexing
  - reliable data transfer
  - flow control
  - congestion control
- instantiation, implementation in the Internet
  - UDP
  - TCP
next:
- leaving the network "edge" (application, transport layers)
- into the network "core"
- two network layer chapters:
  - data plane
  - control plane

Quiz 2
Consider the following network: Host A wants to simultaneously send messages to hosts B and C. A is connected to B and C via a broadcast channel: a packet sent by A is carried by the channel to both B and C. Suppose that the broadcast channel connecting A, B, and C can independently lose and corrupt messages (and so, for example, a message sent from A might be correctly received by B, but not by C). A stop-and-wait-like error-control protocol is used for reliably transferring packets from A to B and C, such that A will not send a new packet from the upper layer until it knows that both B and C have correctly received the current packet. Packets from upper layers may be queued in a sufficiently large buffer. Give FSM descriptions of A and C. (Hint: The FSM for B should be the same as for C.)

Quiz 3
Consider the GBN protocol with a sender window size of 4 and a sequence number range of 1,024. Suppose that at time t, the next in-order packet that the receiver is expecting has a sequence number of k. Assume that the medium does not reorder messages. Answer the following questions:
a. What are the possible sets of sequence numbers inside the sender's window at time t?
b. What are all possible values of the ACK field in all possible messages currently propagating back to the sender at time t?

a. Here we have a window size of N = 4. Suppose the receiver has received packet k-1 and has ACKed that and all other preceding packets. If all of these ACKs have been received by the sender, then the sender's window is [k, k+N-1]. Suppose next that none of the ACKs have been received at the sender. In this second case, the sender's window contains k-1 and the N packets up to and including k-1. The sender's window is thus [k-N, k-1]. By these arguments, the sender's window is of size 4 and begins somewhere in the range [k-N, k].
b. If the receiver is waiting for packet k, then it has received (and ACKed) packet k-1 and the N-1 packets before that. If none of those N ACKs have yet been received by the sender, then ACK messages with values of [k-N, k-1] may still be propagating back. Because the sender has sent packets [k-N, k-1], it must be the case that the sender has already received an ACK for k-N-1. Once the receiver has sent an ACK for k-N-1, it will never send an ACK that is less than k-N-1. Thus the range of in-flight ACK values can range from k-N-1 to k-1.

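
The two answers can be enumerated for a concrete k. The sketch takes N = 4 from the question's window size and reduces values modulo the sequence number range of 1,024; the function names are hypothetical.

```python
def possible_sender_windows(k, n=4, seq_space=1024):
    """All windows of size n whose base lies in [k-n, k], modulo seq_space."""
    return [[(base + i) % seq_space for i in range(n)]
            for base in range(k - n, k + 1)]


def possible_ack_values(k, n=4, seq_space=1024):
    """ACK values from k-n-1 through k-1 that may still be in flight."""
    return [v % seq_space for v in range(k - n - 1, k)]


for window in possible_sender_windows(10):
    print(window)
print(possible_ack_values(10))
```

For small k the modulo step wraps values around to the top of the sequence number space, for example k = 2 yields windows that start at 1022.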
Chapter 4: network layer
chapter goals:
- understand principles behind network layer services:
  - network layer service models
  - forwarding versus routing
  - how a router works
  - generalized forwarding
- instantiation, implementation in the Internet

Network layer
- transport segment from sending to receiving host
- on sending side, encapsulates segments into datagrams
- on receiving side, delivers segments to transport layer
- network layer protocols in every host, router
- router examines header fields in all IP datagrams passing through it
[Figure: end hosts run the full stack (application, transport, network, data link, physical); each router along the path runs only network, data link, and physical.]

Two key network-layer functions
network-layer functions:
- forwarding: move packets from router's input to appropriate router output
- routing: determine route taken by packets from source to destination
  - routing algorithms
analogy: taking a trip
- forwarding: process of getting through single interchange
- routing: process of planning trip from source to destination

Network service model
Q: What service model for "channel" transporting datagrams from sender to receiver?
example services for individual datagrams:
- guaranteed delivery
- guaranteed delivery with less than 40 msec delay
example services for a flow of datagrams:
- in-order datagram delivery
- guaranteed minimum bandwidth to flow
- restrictions on changes in inter-packet spacing

Router architecture overview
- high-level view of generic router architecture:
  - router input ports, high-speed switching fabric, router output ports
  - routing processor
- forwarding data plane (hardware) operates in nanosecond time frame
- routing, management control plane (software) operates in millisecond time frame

Input port functions
- physical layer: bit-level reception (line termination)
- data link layer: e.g., Ethernet; see chapter 5 (link layer protocol, receive)
- decentralized switching (lookup, forwarding, queueing):
  - using header field values, lookup output port using forwarding table in input port memory ("match plus action")
  - goal: complete input port processing at 'line speed'
  - queuing: if datagrams arrive faster than forwarding rate into switch fabric
[Figure: input port pipeline: line termination, link layer protocol (receive), lookup/forwarding/queueing, then the switch fabric.]

Input port functions
- decentralized switching:
  - using header field values, lookup output port using forwarding table in input port memory ("match plus action")
  - destination-based forwarding: forward based only on destination IP address (traditional)
  - generalized forwarding: forward based on any set of header field values
[Figure: same input port pipeline: physical layer (bit-level reception), data link layer (e.g., Ethernet; see chapter 5), lookup/forwarding/queueing, then the switch fabric.]

Destination-based forwarding
forwarding table:

Destination Address Range                                                        Link Interface
11001000 00010111 00010000 00000000 through 11001000 00010111 00010111 11111111  0
11001000 00010111 00011000 00000000 through 11001000 00010111 00011000 11111111  1
11001000 00010111 00011001 00000000 through 11001000 00010111 00011111 11111111  2
otherwise                                                                        3

Q: but what happens if ranges don't divide up so nicely?

Longest prefix matching
longest prefix matching: when looking for forwarding table entry for given destination address, use longest address prefix that matches destination address.

Destination Address Range              Link interface
11001000 00010111 00010*** ********    0
11001000 00010111 00011000 ********    1
11001000 00010111 00011*** ********    2
otherwise                              3

examples:
DA: 11001000 00010111 00010110 10100001    which interface?
DA: 11001000 00010111 00011000 10101010    which interface?

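
The lookup rule can be sketched as a linear scan over the three prefixes from the slide's table. The names `TABLE`, `DEFAULT_INTERFACE`, and `lookup` are illustrative, and a real router would use a trie or TCAM rather than a scan.

```python
# Prefixes and link interfaces from the slide's table, written as bit strings.
TABLE = [
    ("11001000 00010111 00010", 0),
    ("11001000 00010111 00011000", 1),
    ("11001000 00010111 00011", 2),
]
DEFAULT_INTERFACE = 3  # the "otherwise" row


def lookup(dest_bits):
    """Return the interface of the longest prefix matching dest_bits."""
    bits = dest_bits.replace(" ", "")
    best_len, best_iface = -1, DEFAULT_INTERFACE
    for prefix, iface in TABLE:
        p = prefix.replace(" ", "")
        if bits.startswith(p) and len(p) > best_len:
            best_len, best_iface = len(p), iface
    return best_iface


print(lookup("11001000 00010111 00010110 10100001"))
print(lookup("11001000 00010111 00011000 10101010"))
```

The second address matches both the 21-bit and the 24-bit prefix, so the longer one decides the interface.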
Longest prefix matching
- we'll see why longest prefix matching is used shortly, when we study addressing
- longest prefix matching: often performed using ternary content addressable memories (TCAMs)
  - content addressable: present address to TCAM, retrieve address in one clock cycle, regardless of table size
  - Cisco Catalyst: can hold up to ~1M routing table entries in TCAM

Switching fabrics
- transfer packet from input buffer to appropriate output buffer
- switching rate: rate at which packets can be transferred from inputs to outputs
  - often measured as multiple of input/output line rate
  - N inputs: switching rate N times line rate desirable
- three types of switching fabrics: memory, bus, crossbar

Switching via memory
first generation routers:
- traditional computers with switching under direct control of CPU
- packet copied to system's memory
- speed limited by memory bandwidth (2 bus crossings per datagram)
[Figure: input port (e.g., Ethernet) and output port (e.g., Ethernet) connected to memory over the system bus.]

Switching via a bus
- datagram from input port memory to output port memory via a shared bus
- bus contention: switching speed limited by bus bandwidth
- 32 Gbps bus, Cisco 5600: sufficient speed for access and enterprise routers

Switching via interconnection network
- overcome bus bandwidth limitations
- banyan networks, crossbar, other interconnection nets initially developed to connect processors in multiprocessor
- advanced design: fragmenting datagram into fixed length cells, switch cells through the fabric
- Cisco 12000: switches 60 Gbps through the interconnection network

5
Transport Layer 3-25
Chapter 3 summarysect principles behind transport
layer servicesbull multiplexing
demultiplexingbull reliable data transferbull flow controlbull congestion control
sect instantiation implementation in the Internetbull UDPbull TCP
nextsect leaving the network ldquoedgerdquo (application transport layers)
sect into the network ldquocorerdquo
sect two network layer chaptersbull data planebull control plane
Quiz 2
Consider the following network Host A wants tosimultaneously send messages to hosts B and C A isconnected to B and C via a broadcast channelmdasha packet sentby A is carried by the channel to both B and C Suppose thatthe broadcast channel connecting A B and C canindependently lose and corrupt messages (and so forexample a message sent from A might be correctly receivedby B but not by C) The stop-and-wait-like error-controlprotocol is used for reliably transferring packets from A to Band C such that A will send a new packet from the upperlayer until it knows that both B and C have correctlyreceived the current packet Packets from upper layers maybe queued in a sufficiently large buffer Give FSM descriptionsof A and C (Hint The FSM for B should be the same as forC)
Transport Layer 3-26
Quiz 3
Consider the GBN protocol with a sender window size of 4and sequence number range of 1024 Suppose that at time tthe next in-order packet that the receiver is expecting has asequence number of k Assume that the medium does notreorder message Answer the following questionsaWhat are the possible sets of sequence numbers inside thesenderrsquos window at time tbWhat are all possible values of the ACK field in all possiblemessages currently propagating back to the sender at time t
Transport Layer 3-27 Transport Layer 3-28
aHere we have a window size of N=3 Suppose the receiver has received packet k-1 and has ACKed that and all other preceding packets If all of these ACKs have been received by sender then senders window is [k k+N-1] Suppose next that none of the ACKs have been received at the sender In this second case the senders window contains k-1 and the N packets up to and including k-1 The senders window is thus [k-Nk-1] By these arguments the senders window is of size 3 and begins somewhere in the range [k-Nk]bIf the receiver is waiting for packet k then it has received (and ACKed) packet k-1 and the N-1 packets before that If none of those N ACKs have been yet received by the sender then ACK messages with values of [k-Nk-1] may still be propagating backBecause the sender has sent packets [k-N k-1] it must be the case that the sender has already received an ACK for k-N-1 Once the receiver has sent an ACK for k-N-1 it will never send an ACK that is less that k-N-1 Thus the range of in-flight ACK values can range from k-N-1 to k-1
Transport Layer 3-29
Chapter 4 network layer
chapter goalssect understand principles behind network layer
services bull network layer service modelsbull forwarding versus routingbull how a router worksbull generalized forwarding
sect instantiation implementation in the Internet
4-30
Network Layer Data Plane
6
Network layersect transport segment from
sending to receiving host sect on sending side
encapsulates segments into datagrams
sect on receiving side delivers segments to transport layer
sect network layer protocols in every host router
sect router examines header fields in all IP datagrams passing through it
applicationtransportnetworkdata linkphysical
applicationtransportnetworkdata linkphysical
networkdata linkphysical network
data linkphysical
networkdata linkphysical
networkdata linkphysical
networkdata linkphysical
networkdata linkphysical
networkdata linkphysical
networkdata linkphysical
networkdata linkphysical
networkdata linkphysicalnetwork
data linkphysical
4-31
Network Layer Data Plane
Two key network-layer functions
network-layer functionssectforwarding move packets from routerrsquos input to appropriate router outputsectrouting determine route taken by packets from source to destinationbull routing algorithms
analogy taking a tripsect forwarding process of
getting through single interchange
sect routing process of planning trip from source to destination
4-32
Network Layer Data Plane
Network service modelQ What service model for ldquochannelrdquo transporting datagrams from sender to receiver
example services for individual datagrams
sect guaranteed deliverysect guaranteed delivery with
less than 40 msec delay
example services for a flow of datagrams
sect in-order datagram deliverysect guaranteed minimum
bandwidth to flowsect restrictions on changes in
inter-packet spacing
4-33
Network Layer Data Plane
Router architecture overview
- high-level view of generic router architecture:
  - router input ports, high-speed switching fabric, router output ports
  - routing processor
- forwarding data plane (hardware): operates in nanosecond time frame
- routing, management control plane (software): operates in millisecond time frame
Input port functions
[Figure: input port pipeline: line termination -> link layer protocol (receive) -> lookup, forwarding, queueing -> switch fabric]
- physical layer: bit-level reception
- data link layer: e.g., Ethernet (see chapter 5)
- decentralized switching:
  - using header field values, lookup output port using forwarding table in input port memory ("match plus action")
  - goal: complete input port processing at 'line speed'
  - queuing: if datagrams arrive faster than forwarding rate into switch fabric
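The queuing bullet can be illustrated with a tiny discrete-time simulation. This is a sketch under simplifying assumptions: time advances in ticks, arrivals and the fabric's forwarding rate are constant whole numbers of datagrams per tick, and the buffer is unbounded. The function and parameter names are made up for the example.

```python
from collections import deque


def simulate_input_queue(arrivals_per_tick, fabric_rate, ticks):
    """Return the input-port queue length observed at the end of each tick."""
    queue = deque()
    history = []
    next_id = 0
    for _ in range(ticks):
        # Datagrams arrive from the line.
        for _ in range(arrivals_per_tick):
            queue.append(next_id)
            next_id += 1
        # The switch fabric accepts at most fabric_rate datagrams per tick.
        for _ in range(min(fabric_rate, len(queue))):
            queue.popleft()
        history.append(len(queue))
    return history


# Arrivals faster than the fabric can accept them: the queue keeps growing.
print(simulate_input_queue(arrivals_per_tick=3, fabric_rate=2, ticks=5))
# Arrivals matched by the fabric: no queue builds up.
print(simulate_input_queue(arrivals_per_tick=2, fabric_rate=2, ticks=5))
```

With three arrivals and two departures per tick the backlog grows by one datagram every tick; with matched rates it stays at zero, which is the 'line speed' goal stated above.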
Input port functions
- decentralized switching:
  - using header field values, lookup output port using forwarding table in input port memory ("match plus action")
  - destination-based forwarding: forward based only on destination IP address (traditional)
  - generalized forwarding: forward based on any set of header field values
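One way to picture "match plus action" is a table of (match, action) rules, where destination-based forwarding is the special case in which every rule matches on the destination address alone. The sketch below is an assumption-laden toy: header fields are a plain dictionary, matches are exact values rather than prefixes, the first matching rule wins, and the action strings and addresses are invented for the example.

```python
def matches(rule_match, headers):
    """True if every field named in the rule has the required value."""
    return all(headers.get(field) == value for field, value in rule_match.items())


def lookup(table, headers, default_action="drop"):
    """Return the action of the first matching rule (first-match priority assumed)."""
    for rule_match, action in table:
        if matches(rule_match, headers):
            return action
    return default_action


# Destination-based forwarding: rules look only at the destination address.
DESTINATION_TABLE = [
    ({"dst_ip": "10.0.0.5"}, "forward:1"),
    ({"dst_ip": "10.0.0.9"}, "forward:2"),
]

# Generalized forwarding: rules may match on any set of header fields.
GENERALIZED_TABLE = [
    ({"dst_port": 22}, "drop"),
    ({"dst_ip": "10.0.0.5", "protocol": "tcp"}, "forward:2"),
    ({"dst_ip": "10.0.0.5"}, "forward:1"),
]

web_packet = {"src_ip": "192.0.2.7", "dst_ip": "10.0.0.5", "protocol": "tcp", "dst_port": 80}
ssh_packet = {"src_ip": "192.0.2.7", "dst_ip": "10.0.0.5", "protocol": "tcp", "dst_port": 22}
```

Both packets have the same destination, so `DESTINATION_TABLE` treats them identically, while `GENERALIZED_TABLE` can send one on and drop the other.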
Destination-based forwarding
forwarding table:

Destination Address Range                                                        Link Interface
11001000 00010111 00010000 00000000 through 11001000 00010111 00010111 11111111   0
11001000 00010111 00011000 00000000 through 11001000 00010111 00011000 11111111   1
11001000 00010111 00011001 00000000 through 11001000 00010111 00011111 11111111   2
otherwise                                                                         3

Q: but what happens if ranges don't divide up so nicely?
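The range table can be read directly as a lookup procedure. The sketch below copies the three ranges from the table and treats each 32-bit address as an integer; the helper names are made up for the example, and a linear scan is used only for clarity, not because routers do it this way.

```python
def bits(text):
    """Parse a space-separated binary address into an integer."""
    return int(text.replace(" ", ""), 2)


# (low, high, link interface), copied from the forwarding table.
RANGES = [
    (bits("11001000 00010111 00010000 00000000"),
     bits("11001000 00010111 00010111 11111111"), 0),
    (bits("11001000 00010111 00011000 00000000"),
     bits("11001000 00010111 00011000 11111111"), 1),
    (bits("11001000 00010111 00011001 00000000"),
     bits("11001000 00010111 00011111 11111111"), 2),
]
OTHERWISE = 3


def interface_for(address):
    """Return the link interface whose range contains the address."""
    for low, high, interface in RANGES:
        if low <= address <= high:
            return interface
    return OTHERWISE
```

This works here because the ranges do not overlap. When ranges cannot be written this neatly, a single address may fall under several entries, which is the question the slide closes with.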
Longest prefix matching
longest prefix matching: when looking for forwarding table entry for given destination address, use longest address prefix that matches destination address.

Destination Address Range (prefix)    Link interface
11001000 00010111 00010               0
11001000 00010111 00011000            1
11001000 00010111 00011               2
otherwise                             3

examples:
DA: 11001000 00010111 00010110 10100001    which interface?
DA: 11001000 00010111 00011000 10101010    which interface?
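The rule above can be sketched as a short function that answers the two example questions. Prefixes and addresses are kept as bit strings so the code mirrors the table; the function names are made up for the example, and the linear scan is a teaching simplification rather than how a router would implement the lookup.

```python
def strip(text):
    """Remove the spaces used to group bits into octets."""
    return text.replace(" ", "")


# (prefix bits, link interface), copied from the table.
PREFIXES = [
    (strip("11001000 00010111 00010"), 0),
    (strip("11001000 00010111 00011000"), 1),
    (strip("11001000 00010111 00011"), 2),
]
OTHERWISE = 3


def longest_prefix_match(address_text):
    """Return the interface of the longest prefix that matches the address."""
    address = strip(address_text)
    best_length, best_interface = -1, OTHERWISE
    for prefix, interface in PREFIXES:
        if address.startswith(prefix) and len(prefix) > best_length:
            best_length, best_interface = len(prefix), interface
    return best_interface


print(longest_prefix_match("11001000 00010111 00010110 10100001"))
print(longest_prefix_match("11001000 00010111 00011000 10101010"))
```

The first address matches only the 21-bit prefix ending in 00010, so it goes to interface 0. The second matches both the 21-bit prefix ending in 00011 and the 24-bit prefix ending in 00011000; the longer one wins, so it goes to interface 1.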
Longest prefix matching
- we'll see why longest prefix matching is used shortly, when we study addressing
- longest prefix matching: often performed using ternary content addressable memories (TCAMs)
  - content addressable: present address to TCAM; retrieve address in one clock cycle, regardless of table size
  - Cisco Catalyst: can hold up to ~1M routing table entries in TCAM
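A software model can hint at what "ternary" means here. Each entry is stored as a value plus a mask, where masked-out bits are "don't care", and the entries are ordered so that longer prefixes come first. This is only a model under stated assumptions: real TCAM hardware compares the address against all entries at once, while the loop below checks them one by one, and the entry layout and names are invented for the example.

```python
WIDTH = 32  # address width in bits


def make_entry(prefix_bits, interface):
    """Build (value, mask, prefix length, interface) from a bit-string prefix."""
    length = len(prefix_bits)
    shift = WIDTH - length
    mask = ((1 << length) - 1) << shift  # 1 = bit must match, 0 = don't care
    value = int(prefix_bits, 2) << shift
    return value, mask, length, interface


def build_tcam(prefixes):
    """Order entries longest prefix first, so the first hit is the longest match."""
    entries = [make_entry(bits.replace(" ", ""), iface) for bits, iface in prefixes]
    return sorted(entries, key=lambda entry: entry[2], reverse=True)


def tcam_lookup(tcam, address, default_interface):
    """Return the interface of the first (highest-priority) matching entry."""
    for value, mask, _length, interface in tcam:
        if (address & mask) == value:
            return interface
    return default_interface


def parse(address_text):
    """Parse a space-separated binary address into an integer."""
    return int(address_text.replace(" ", ""), 2)


TCAM = build_tcam([
    ("11001000 00010111 00010", 0),
    ("11001000 00010111 00011000", 1),
    ("11001000 00010111 00011", 2),
])
```

Because the comparison is a mask-and-compare per entry, lookup time in hardware does not depend on how many entries are stored, which is the property the slide highlights.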
Switching fabrics
- transfer packet from input buffer to appropriate output buffer
- switching rate: rate at which packets can be transferred from inputs to outputs
  - often measured as multiple of input/output line rate
  - N inputs: switching rate N times line rate desirable
- three types of switching fabrics: memory, bus, crossbar
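The "N times line rate" guideline is simple arithmetic, shown here as a small helper. The port count and rates in the usage lines are hypothetical numbers chosen for the example, and the function names are not from the slides.

```python
def desirable_switching_rate(num_inputs, line_rate_gbps):
    """Rate at which the fabric can accept every input at full line rate."""
    return num_inputs * line_rate_gbps


def fabric_is_bottleneck(num_inputs, line_rate_gbps, switching_rate_gbps):
    """True if the fabric cannot keep up when all inputs are busy."""
    return switching_rate_gbps < desirable_switching_rate(num_inputs, line_rate_gbps)


# Hypothetical router: 8 input ports at 10 Gbps each.
print(desirable_switching_rate(8, 10))   # fabric rate needed, in Gbps
print(fabric_is_bottleneck(8, 10, 32))   # a 32 Gbps fabric falls short
print(fabric_is_bottleneck(2, 10, 32))   # but is enough for 2 such ports
```

When the fabric is slower than N times the line rate, the shortfall shows up as queuing at the input ports.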
Switching via memory
first generation routers:
- traditional computers with switching under direct control of CPU
- packet copied to system's memory
- speed limited by memory bandwidth (2 bus crossings per datagram)
[Figure: input port (e.g., Ethernet) -> memory -> output port (e.g., Ethernet), connected by the system bus]
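The effect of two bus crossings per datagram can be seen in a toy model. It assumes the system bus carries exactly one datagram transfer per tick, and that each datagram must cross once into memory and once out of it; the function name and the tick-based model are simplifications made for the example.

```python
def datagrams_forwarded(ticks, crossings_per_datagram=2):
    """Count datagrams fully forwarded when the bus does one transfer per tick."""
    forwarded = 0
    crossings_done = 0
    for _ in range(ticks):
        crossings_done += 1  # one transfer over the system bus this tick
        if crossings_done == crossings_per_datagram:
            forwarded += 1   # datagram has been written to memory and read back out
            crossings_done = 0
    return forwarded


print(datagrams_forwarded(10))                            # two crossings each
print(datagrams_forwarded(10, crossings_per_datagram=1))  # hypothetical single crossing
```

Under these assumptions, forwarding throughput is at most half of what the memory and bus could move in one direction.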
Switching via a bus
- datagram from input port memory to output port memory via a shared bus
- bus contention: switching speed limited by bus bandwidth
- 32 Gbps bus, Cisco 5600: sufficient speed for access and enterprise routers
Switching via interconnection network
- overcome bus bandwidth limitations
- banyan networks, crossbar, other interconnection nets initially developed to connect processors in multiprocessor
- advanced design: fragmenting datagram into fixed length cells, switch cells through the fabric
- Cisco 12000: switches 60 Gbps through the interconnection network
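The "advanced design" bullet, cutting a datagram into fixed-length cells and putting it back together at the output, can be sketched as follows. The 48-byte cell size, the zero padding, and the choice to carry the original length alongside the cells are all assumptions made for the example; they do not describe any specific router.

```python
CELL_SIZE = 48  # hypothetical fixed cell size in bytes


def to_cells(datagram, cell_size=CELL_SIZE):
    """Split a datagram into fixed-length cells, zero-padding the last one."""
    cells = []
    for offset in range(0, len(datagram), cell_size):
        chunk = datagram[offset:offset + cell_size]
        cells.append(chunk.ljust(cell_size, b"\x00"))
    return cells, len(datagram)


def reassemble(cells, original_length):
    """Join cells at the output port and strip the padding."""
    return b"".join(cells)[:original_length]


# A 100-byte datagram becomes three 48-byte cells.
data = bytes(range(100))
cells, length = to_cells(data)
print(len(cells), [len(cell) for cell in cells])
print(reassemble(cells, length) == data)
```

Fixed-length cells let the fabric work in equal time slots regardless of how long the original datagrams are.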