ip: network layer - university of california, berkeleyinst.eecs.berkeley.edu/~ee122/sp04/network...
TRANSCRIPT
IP: Network Layer
OverviewAddressingRouting
TOC – IP
OverviewGoals and Tasks RoutingSwitchingIssuesBasic ideas
TOC – IP – Overview
Goals and TasksGoals of Network Layer
Guide packets from source to destinationUse network links “efficiently” (e.g., prefer shorter and fasterroutes)
AddressingAgree on addressing scheme to identify nodesIP addresses are location-based (similar to telephone numbers)This structure reduces the information routers must keepDifferent types of addresses
RoutingRouters exchange information to “learn” network topology Routers then calculate good routes to the different destinationsRouters store the results of these calculations in “routing tables” Different routing algorithms
TOC – IP – Overview – Goals and Tasks
RoutingDefinition
Finding path from source to destinationTypes: Path based on
Flow“Type or Traffic”Source/DestinationDestination Internet
A (S D): 1, 2, 3B (S D): 1, 2, 3C (S D): 1, 4, 5, 3
A (S D): 1, 2, 3B (S D): 1, 2, 3C (S D): 1, 4, 5, 3
Voice (S D): 1, 2, 3Data (S D): 1, 4, 5, 3
Voice (S D): 1, 2, 3Data (S D): 1, 4, 5, 3
11 22
44 55
33SS DD
S’S’ D’D’
(S D): 1, 2, 3(S’ D’): 1, 4, 5, 3
(S D): 1, 2, 3(S’ D’): 1, 4, 5, 3
(S D): 1, 2, 3(S’ D): 1, 2, 3
(S D): 1, 2, 3(S’ D): 1, 2, 3
TOC – IP – Overview – Routing
SwitchingDefinition
Sending the bits along the pathApproaches
Circuit (Telephone; Lightwave)Packet
Virtual Circuit (ATM)Datagram (Ethernet, IP)
NotesA circuit or VC can be a link in an IP networkAn Ethernet LAN can be a link in an IP network
TOC – IP – Overview – Switching
Switching (cont.)Datagram v/s Virtual Circuit
Datagram routingEach packet to be forwarded independently
Virtual CircuitEach packet from same “flow” uses same routeMore state (pick the “right” granularity)
QoS sensitive networks use VC’s and signalingFind a route that has the resources available for the connection. “Reserve” the resources before sending data packets
TOC – IP – Overview – Switching
IssuesScalability [great in IP]
Millions of nodesRouting tables should remain “small”Updates should be manageable
Topology Changes [good in IP]Routers compute new routes as topology changesChanges should not affect most tables
Performance [poor in IP]Link utilization should be well-balanced [not in practice]Updates should be fast [not always]Ideally, some flows would have a guaranteed rate [no]Network should detect configuration errors or other errors [no]Network should protect itself against attacks [no]
TOC – IP – Overview – Issues
Basic IdeasAddressing
Layer 2: Local scheme, typically flat not scalableLayer 3: Location based and hierarchical scalableTemporary addresses for mobile nodesNetwork Address Translation to reuse addresses
Routing Route is based on destination only (roughly: shortest path)Network decomposed into domainsInterdomain routing: Uses a path-vector algorithmIntradomain routing: Uses a link state or a distance vector algorithm
VariationsMulticast; P2P; Ad Hoc; Sensors; Content Distribution Networks
AddressingExamplesClass-Based AddressingCIDR: Classless Interdomain RoutingAssigning AddressesDHCPNetwork Address Translation
TOC – IP – Addressing
ExamplesFlat AddressingHierarchical AddressingInternetworkingLayers 2 and 3
TOC – IP – Addressing – Examples
Flat Addressing
3311
22
5544
66
a
b
a
b
c a
b
c a
b
baba
2: b3: a4: a5: a6: a
2: b3: a4: a5: a6: a
1: a3: b4: b5: b6: b
1: a3: b4: b5: b6: b
1: a2: b4: c5: c6: c
1: a2: b4: c5: c6: c
1: a2: a3: a5: c6: b
1: a2: a3: a5: c6: b 1: a
2: a3: a4: a5: b
1: a2: a3: a4: a5: b
Address Ports 1: a2: a3: a4: a6: b
1: a2: a3: a4: a6: b
Routing Table: One per nodeDestination Exit Port
Addresses are arbitrary; not based on topology (e.g., Ethernet)N nodes N -1 entries in every routing table; not scalable
TOC – IP – Addressing – Examples – Flat
Hierachical Addresses
1.11.11.21.2
1.31.3
2.22.22.12.1
2.32.3
a
b
a
b
c a
b
c a
b
baba
1.3: bDefault: a
1.3: bDefault: a
1.2: a1.3: b
Default: c
1.2: a1.3: b
Default: c
2.2: c2.3: b
Default: a
2.2: c2.3: b
Default: a2.3: b
Default: a2.3: b
Default: a
1.2: aDefault: b
1.2: aDefault: b
2.2: bDefault: a
2.2: bDefault: a
Addresses are arranged based on topology (e.g., IP)Few entries in each routing table; scalable
TOC – IP – Addressing – Examples – Hierarchical
InternetworkingRecall the basic internetworking scheme of IP:
1.4x
1.7y
2.5z
2.4u
3.6v 3.8
w
1.2t
1.*: localDefault: y1.*: local
Default: y
1.2: t1.7: y
1.2: t1.7: y
IP
Local
x y | 1.4 3.8 | data
z u | 1.4 3.8 | data
v w | 1.4 3.8 | data
a
b
d
a
1.*: local4.*: bDefault: a
1.*: local4.*: bDefault: a
1.4: x1.7: y
1.4: x1.7: y
TOC – IP – Addressing – Examples – Internetworking
Layers 2 and 3EthernetSwitch
EthernetSwitch
Router
p
Phy PhyPhy PhyPhy
Transport
Application
Phy
Transport
Application
Phy Phy
Destination Address B Local to port pLocal address B Layer 2 address w
Phy Phy
Linky
NetworkC D
LinkvLinkLink
x
NetworkA
Linkw
NetworkB
Link
Destination Address B Next Hop CLocal address C Layer 2 address y
TOC – IP – Addressing – Examples – Layers 2/3
Class-Based AddressesAddressesScalability Problem
TOC – IP – Addressing – Class
Addresses
Addressing reflects internet hierarchy
32 bits divided into 2 parts:
Class A
Class B
Class C
network host 00
network host 1160
network host 1240
~2 million nets256 hosts
8
0
1 0
TOC – IP – Addressing – Class - Addresses
Scalability ProblemExample: an organization initially needs 100 addresses
Allocate it a class C addressOrganization grows to need 300 addressesClass B address is allocated. (~64K hosts) That’s overkill -a huge wasteOnly about 8200 class B addresses!Artificial Address crises
TOC – IP – Addressing – Class - Scalability
Classless Internet Domain Routing (CIDR)
CIDR allows networks to be assigned on arbitrary bit boundaries.
Address ranges can be assigned in chunks of 2k k=1…32 Idea - use aggregation - provide routing for a large number of customers by advertising one common prefix.
This is possible because nature of addressing is hierarchicalSummarization reduces the size of routing tables, but maintains connectivity. Aggregation
Scalability and survivability of the Internet
TOC – IP – Addressing – CIDR
CIDR (cont.)Suppose fifty computers in a network are assigned IP addresses 128.23.9.0 - 128.23.9.49
They share the prefix 128.23.9Is this the longest prefix?
Range is 01111111 00001111 00001001 00000000 to01111111 00001111 00001001 00110001
How to write 01111111 00001111 00001001 00X?Convention: 128.23.9.0/26There are 32-27=6 bits for the 50 computers
26 = 64 addresses
TOC – IP – Addressing – CIDR
CIDR (cont.)
Specify a range of addresses by a prefix: X/YThe prefix common to the entire range is the first Y bits of X.X: The first address in the range has prefix XY: 232-Y addresses in the range
Example 128.5.10/23Common prefix is 23 bits: 01000000 00000101 0000101Number of addresses: 29 = 512
Prefix aggregationCombine two address ranges128.5.10/24 and 128.5.11/24 gives 128.5.10/23
Routers match to longest prefix
TOC IP Addressing – – – CIDR
CIDR Longest prefix match routing
1100, 1101, 1111
TOC – IP – Addressing – CIDR
1110
1001, 1011, 1010
0100, 0001
111
1101111
0
10
b
da
c
Dest. a b c d1100 3 2 1 01001 1 1 2 01111 4 3 1 0
Length of longest prefixmatch for given port
CIDR (cont.)Example
128.32.134.12 128.32.134.27
128.32.112.15
128.32.12.54
R1
R4R3
128.32.112128.32.134
128.32Default
Default
128.32.134
128.32.32128.32.12128.32.112Default
128.32.12
TOC – IP – Addressing – CIDR
CIDR - Subnets
e1:
H1
R1
H2
H3e1
e2
e3
e4 e5
IP1
IP3
H1: IP1 Mask: 255.255.255.0
H2: IP2 Mask: 255.255.255.0
H3: IP3 Mask: 255.255.255.0
H1: Is H3 on same subnet as I am?
Yes if IP3/24 = IP1/24
R2IP2
TOC – IP – Addressing – CIDR
e2:
e1:
CIDR (cont.)Direct Delivery
H1
R1
H2
H3e1
e2
e3
e4 e5
IP1
IP2
IP3
R2
IP1|IP2|X
Who is IP2?all|e1
I am IP2e2:e1|e2
e2|e1
IP1 IP2 on same subnetIP1 IP2 on same subnet
Address Resolution Protocol = Layer 3 Address Layer 2 AddressAddress Resolution Protocol = Layer 3 Address Layer 2 Address
TOC – IP – Addressing – CIDR
CIDR (cont.)Indirect Delivery IP1 IP3 not on same subnetIP1 IP3 not on same subnet
H1
R1
H2
H3e1
e2
e3
e4 e5
IP1
IP3
R2
IP1|IP3|Xe4|e1
IP1|IP3|XSH
Who is IP3?all|e5
I am IP3e5|e3
IP1|IP3|Xe3|e5
IP2
Note: Fragmentation may be required at R1TOC – IP – Addressing – CIDR
Assigning IP address (Ideally)
A host gets its IP address from the IP address block of its organizationAn organization gets an IP address block from its ISP’s address blockAn ISP gets its address block from its own provider OR from one of the 3 routing registries:
ARIN: American Registry for Internet NumbersRIPE: Reseaux IP EuropeensAPNIC: Asia Pacific Network Information Center
Each Autonomous System (AS) is assigned a 16-bit number (65536 total)
Currently 14,000 AS’s in use
TOC – IP – Addressing – Assigning Addresses
DHCP – Dynamic Host Configuration Protocol
IdeaTemporary addresses assigned “on demand”
AdvantagesEnables to reuse addresses
You come to a classroom with a laptopDial-up users
Automates the assignment of addressesDisadvantage
Cannot be a server (how to find address?)
TOC – IP – Addressing – DHCP
DHCP (cont.)
OperationsDHCP server maintains list of available addressesClient requests an address
Client sends “DHCP discover message”(“me all” = [0…0 | 1…1])Server replies with “DHCP offer”Client asks for address; server provides one…
Client can extend/release the leaseServer and client can test address
TOC – IP – Addressing – DHCP
NAT
OverviewExampleHow NAT works
TOC – IP – Addressing – NAT
OverviewShortage of IP AddressesCIDR may not be enoughIPv6 may take a long time until deployedNAT enables reuse of addressesPrivate Addresses:
10.0.0.0 - 10.255.255.255172.16.0.0 - 172.31.255.255196.168.0.0 - 196.168.255.255
See IETF RFC 1631 (1994)
TOC – IP – Addressing – NAT – Overview
ExampleHome Network
One IP address (IPa) is visible outside
IPa (typically DHCP)
IPb(DHCP with NAT)
IPc(DHCP with NAT)
NAT
Note: Can be extended to a set of addresses instead of only one (IPa)In that case, some “static” addresses can be reserved for servers …
TOC – IP – Addressing – NAT – Example
How it worksTrick: Use TCP port to distinguish computersThere are 64k port numbers, the first 1k are reserved
IPa
IPc
NAT
IPx[IPb | IPx | TCPm | TCPn | …]
[IPa | IPx | TCPb | TCPn | …]
[TCPb IPb, TCPm]
[IPx | IPa | TCPn | TCPb | …]
[IPx | IPb | TCPn | TCPm | …]
IPbTOC IP Addressing NAT – – – – How
RoutingRouting Sub-FunctionsHierarchicalTypes of Protocol
TOC – IP – Routing
Routing Sub-Functions
Topology Update: Characterize and maintain connectivity
Discover neighborsMeasure “distance” (one or more metric)Disseminate
Route Computation:Kind of path: Multicast, UnicastCentralized or Distributed AlgorithmPolicyHierarchy
Switching: Forward the packets at each node
TOC – IP – Routing – Sub-Functions
Hierarchical Routing The internet has many Administrative Domains
A
B
C
31
2
12
10
13
11
6
7
8
5
4
TOC – IP – Routing – Hierachical
Hierarchical Routing Border Routers
6
4
3
2
13
A
B
C
2
4
3
6
13
7
8
5
1 12
1011
OSPF
RIP
IGRP
BGP
TOC – IP – Routing – Hierachical
Hierarchical Routing Interdomain & Intradomain
A
B
C
6
7
8
5
4
31
2
12
10
13
11
6
4
3
2
13
B
2
4
3
6
13
OSPF
RIP
IGRP
BGP
InterDomainInterDomain
IntraDomain
IntraDomain
IntraDomain
TOC – IP – Routing – Hierachical
Types of Routing ProtocolOverviewLink StateDistance VectorLink State vs. Distance VectorPath Vector: Interdomain Routing
TOC – IP – Routing – Types
Overview
Topology changes can be detected by nearby nodesThese changes must be reflected in the routes
Mechanisms for disseminating informationLink State: Communicate the names and costs of neighbors. Each node maintains the entire topology. E.g. used in OSPFDistance Vector: Communicate current distance estimates of node to every other node. E.g. used in RIPPath Vector: Communicate current estimates of preferred paths from node to every other node. E.g. used in BGP
TOC – IP – Routing – Types – Overview
Overview
AABB
CCDD
2
1
1
3
A: [B, 2], [C, 1]B: [A, 2], [D, 1]C: [A, 1], [D, 3]D: [B, 1], [C, 3]
1) Exchange Link States 2) Each node computesthe shortest paths tothe others
LINK STATE
AABB
CCDD
2
1
1
3
DISTANCE VECTOR
0
0AA
BB
CCDD
2
1
1
3
1
03
AABB
CCDD
2
1
1
3
AABB
CCDD
2
1
1
3
PATH VECTOR
D
DAA
BB
CCDD
2
1
1
3
B,D
C,D
AABB
CCDD
2
1
1
3
“Don’t like B”
TOC IP Routing Types – – – – Overview
Link State Protocols
OverviewLink State Advertisements Shortest Path Algorithm: Dijkstra
TOC – IP – Routing – Types – Link State
Overview
1. Every node learns the topology of the network
Flooding of Link State Packets (LSP)2. An efficient shortest path algorithm
computes routes to every other node3. Node updates Forwarding Table
TOC – IP – Routing – Types – Link State - Overview
Link State Advertisements
Link State PacketsFlooding ExampleSome Issues
TOC – IP – Routing – Types – Link State - LSA
Link State Packets
SourceSequence Number
AgeList of Neighbors
Every router sends Link State Packets (LSPs) to all of its neighborsLSPs arrive and wait in buffers to be “accepted”If node j receives a LSP from node k it compares the sequence numbers. If this is the most recent one from k, send to N(j)-{k}.
This way each router can send its LSP to all other routersAge starts out at 7. At any router, value is decremented every 8seconds. At 0 discard.As long as sequence don’t wrap this works
Otherwise things can get ugly
TOC – IP – Routing – Types – Link State – LSA – LSP
LSP - Example
6
7
8
5
4
31
2
12
10
13
11
TOC – IP – Routing – Types – Link State – LSA – Example
LSP - Example
6
7
8
5
4
31
2
12
10
13
11
TOC – IP – Routing – Types – Link State – LSA – Example
LSP - Example
6
7
8
5
4
31
2
12
10
13
11
TOC – IP – Routing – Types – Link State – LSA – Example
LSP - Example
6
7
8
5
4
31
2
12
10
13
11
TOC – IP – Routing – Types – Link State – LSA – Example
Some Issues
What happens if some routers are much faster at transmitting LSPs?What happens if sequence numbers wrap?What happens when a partitioned network is reconstituted?What about security?Etc., etc.Many lines of code
TOC – IP – Routing – Types – Link State – LSA – Issues
Dijkstra
Every node knows the graphAll link weights are >= 0
Goal at node 1: Find the shortest paths from 1 to all the other nodes.Each node computes the same shortest paths so they all agree on the routesStrategy at node 1: Find the shortest paths in order of increasing path length
1
3
4
6
2
5
1
4
11
41
3
2
1
TOC – IP – Routing – Types – Link State - Dijkstra
DijkstraIDEA: Given P(k) we can find P(k+1) efficiently:To get P(k+1), observe that
1. This node cannot be in P(k)2. It must be one hop away from some node
in P(k)Suppose 2 were false. We picked i
Node i has no edge into P(k)There must be a node x, not in P(k) such thatx is one hop away from P(k) andD(1,i)=D(1,x)+D(x,i)
But then, D(1,x) < D(1,i) and we would have picked x instead.
Pick node(s) that is one hop away from P(k) that is closest to 1.Keep iterating until all nodes are in P
Notationc(i,j) >=0 :cost of link from (I,j)D(1,i): Shortest path from 1 to i.D(1,x,i): Shortest path from 1 to i via xLet P(k) be the set of nodes k-closest to 1
P(2)={1,2}
1
3
4
6
2
5
1
4
11
41
3D(1,5)=2D(1,6,5)=5 2
1
TOC – IP – Routing – Types – Link State - Dijkstra
Dijkstra
13
4
6
2
5
1
41 1
41
2
1
13
6
2
51
1 4
4 2
P(2)={1,2}D(1,2)=1
13
4
6
2
51
1
3 2
3
5
P(4)={1,2,3,5,6}D(1,3)=3D(1,6)=3
13
4
6
2
51
1
3 2
3
6
P(3)={1,2,5}D(1,5)=2
TOC – IP – Routing – Types – Link State - Dijkstra
Dijkstra - Forwarding Table
At node 5
Outgoing Cost
1 2 2
2 2 1
3 3 1
4 3 3
6 6 1
1
3
4
6
2
5
1
4
11
41
3
2
1
TOC – IP – Routing – Types – Link State - Dijkstra
Distance Vector Protocol
Bellman – FordWhy does it work?Counting to InfinityBad News Travel SlowlyAsynchronous Bellman – FordOscillations
TOC – IP – Routing – Types – DV
Bellman-Ford
1
3
4
6
2
5
1
4
11
41
2
3
1
i Di
1 (0,1,∞,∞,∞,4)2 (1,0,3,∞, 1,∞)3 (∞,3,0,2, 1,∞)4 (∞,∞,2,0, 4,∞)5 (∞,1,1,4, 0,1)6 (4,∞,∞,∞,1,0)
C(3,4) = 2
Initially
Communicate current distance estimates of node to every other node
This is called its distance vector: Di = (D(i,1),D(i,2),…,D(i,n))Initially, assume that D(i,j) = c(i,j) if there is a link ij
= ∞ otherwiseThe nodes do not need to learn the entire topology
Just the distance estimates (vectors) of their neighbors
Periodically each node sends its distance vector to all of its neighbors
TOC – IP – Routing – Types – DV – Bellman-Ford
1
3
4
6
2
5
1
4
11
41
2
3
1
Bellman-FordUpdate: when receive estimatesD(i,d) = minjεN(i) {c(i,j) + D(j,d)}
3 gets updates from 2 and 5
i Di
1 (0,1,∞,∞,∞,4)
TOC – IP – Routing – Types – DV – Bellman-Ford
2 (1,0,3,∞, 1,∞)3 (∞,3,0,2, 1,∞)4 (∞,∞,2,0, 4,∞)5 (∞,1,1,4, 0,1)6 (4,∞,∞,∞,1,0)
D(3,1) = min{c(3,2) + D(2,1), c(3,5) + D(5,1)}
= min{ 3 + 1 , 1 + ∞ }= 4
1
3
4
6
2
5
1
4
11
41
2
3
1
Bellman-FordFocus on destination 1Here are the values of D(i,1):
i 1 2 3 4 5 6 71 0 0 0 0 0 0 02 ∞ 1 1 1 1 1 13 ∞ ∞ 4 3 3 3 3
4 ∞ ∞ ∞ 6 5 5 55 ∞ ∞ 2 2 2 2 26 ∞ 4 4 3 3 3 3
step
TOC – IP – Routing – Types – DV – Bellman-Ford
Why does this compute shortest paths?
Suppose in every tick each node sends its distance vector.
Assume that initial distances are ∞At time h, node i has as an estimate of the shortest path to node j that has <= h+1 hops!Dh+1(i,j) = minkεN(i) {Dh(k,j) + c(i,k)}
13
4
6
2
5
1
3 2
3
513
6
2
5
1 4
13
4
6
2
5
1
41 1
41
23
1
13
4
6
2
5
1
1
6
21 3
6
4 2 3 24
TOC – IP – Routing – Types – DV – Why
Counting to Infinity
A B C
012
A B C
0
All links cost 1
4 3
A B C
06 5
Ping-Pong to Eternity
TOC – IP – Routing – Types – DV – Counting to Infinity
Bad News Travels Slowly…4 3
2
1
1
11
M
1
D(2,1)=2, D(3,1)=1, D(4,1)=2
TOC – IP – Routing – Types – DV – Bad News
Bad News Travels Slowly…4 3
2
1
1
11
M
1Node 2 takes about M Iterations to figure out thatD(2,1)=M
Fundamental Cause: After a network change, think of the networkprotocol running from time 0. The initial conditions are arbitrary…
•Tricks exist to get around these problems but not fool proof
TOC – IP – Routing – Types – DV – Bad News
Asynchronous Bellman Ford
In general, nodes are using different and possibly inconsistent estimatesIf no link changes after some time t, the algorithm will eventually converge to the shortest pathNo synchronization required at all…
TOC – IP – Routing – Types – DV – Asynchronous
Oscillations
Link costs must reflect link speed AND congestionUnder both LSP and DV routing occurs over a tree
The costs of the links of this tree will increaseThe other links will not be congested
Their costs will dropRouting protocol will shift traffic and create a new treeThis process of shifting and reshifting can be severeWay out: Change congestion costs slowly (exponential averaging) – Route dampening
TOC – IP – Routing – Types – DV – Oscillations
Oscillations - ExampleHeavy Load High Delay
1
2
3
4
5 5
1 1
Traffic
Light Load Low Delay
1
2
3
4
1 5
5 1
Traffic
Light Load Low Delay
Heavy Load High Delay
TOC – IP – Routing – Types – DV – Oscillations
Link State vs. Distance VectorNo clear winner
LS is robust since it each node computes its own routes independently
Suffers from the weaknesses of the topology update protocol. Inconsistency etc.Excellent choice for a well engineered network within one administrative domainE. g. OSPF
DV works well when the network is large since it requires no synchronization and has a trivial topology update algorithm
Suffers from convergence delaysVery simple to implement at each nodeExcellent choice for large networksE.g. RIP
TOC – IP – Routing – Types – LS vs. DV