CNS 2640 Lecture 6/7
Assembled by M. Ryan Byrd

Internet Structure

[Figure: Recent Past. An NSFNET Backbone connects the Westnet, BARNET, and MidNet regional networks, which serve sites including UA, UNM, Stanford, Berkeley, PARC, NCAR, UNL, ISU, and KU.]

[Figure: Today. A Service Provided Backbone connects the same regional networks and sites.]

Numbers

www.icann.org: Internet Corporation for Assigned Names and Numbers

www.arin.net is our authority and has more details

Names and numbers have been privatized. The US government used to allocate them.

The big picture

Current

Destinations

Top Level Domains

Popular Interior Gateway Protocols

RIP: Routing Information Protocol
– developed at Berkeley
– distributed with Unix
– distance-vector algorithm (routes exchanged with neighbors)
– based on hop-count

OSPF: Open Shortest Path First
– recent Internet standard
– uses link-state algorithm (link state broadcast to all routers)
– supports load balancing
– supports authentication

IP Service Model

Packet Delivery Model
– Best Effort

Global Addressing Scheme
– IP Addresses

Packet Delivery Model

Connectionless (datagram-based)
Best-effort delivery (unreliable service)
– packets are lost
– packets are delivered out of order
– duplicate copies of a packet are delivered
– packets can be delayed for a long time

NSFNET

A Partial Internet Trunk Map Showing ISP Access Lines

US Main Internet connections

Visual Traceroute

IP Datagram format
– Version (4): currently 4
– Hlen (4): number of 32-bit words in header
– TOS (8): type of service (QoS; not widely used)
– Length (16): number of bytes in this datagram
– Ident (16): different for each datagram
– Flags/Offset (16): used by fragmentation
– TTL (8): number of hops this datagram may still travel (decremented at each router)
– Protocol (8): demux key (TCP=6, UDP=17)
– Checksum (16): of the header only
– DestAddr & SrcAddr (32 each)
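The field widths listed above can be checked by unpacking a header in Python. This is a minimal sketch, assuming a 20-byte header with no options; the function name `parse_ipv4_header` and the sample values are illustrative and not part of the lecture.

```python
import socket
import struct

def parse_ipv4_header(raw: bytes) -> dict:
    """Unpack the fixed 20-byte part of an IPv4 header (options are ignored)."""
    (ver_hlen, tos, length, ident, flags_off,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_hlen >> 4,            # upper 4 bits
        "hlen_words": ver_hlen & 0x0F,       # lower 4 bits, in 32-bit words
        "tos": tos,
        "length": length,
        "ident": ident,
        "flags": flags_off >> 13,            # upper 3 bits
        "offset_units": flags_off & 0x1FFF,  # lower 13 bits, in 8-byte units
        "ttl": ttl,
        "protocol": proto,
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }

# Hypothetical sample header: version 4, Hlen 5, TTL 64, protocol 6 (TCP).
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0, 64, 6, 0,
                     socket.inet_aton("128.187.171.2"),
                     socket.inet_aton("128.187.174.10"))
header = parse_ipv4_header(sample)
```

The checksum in the sample is left at zero for brevity, so it is not a valid header checksum.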

Fragmentation and Reassembly

Each network has some MTU

Strategy
– fragment when necessary (MTU < Datagram)
– try to avoid fragmentation at source host
– refragmentation is possible
– fragments are self-contained datagrams
– use CS-PDU (not cells) for ATM
– delay reassembly until destination host
– do not recover from lost fragments
– fragment on 8-byte boundaries
– the offset field counts 8-byte units, so the last 3 bits of the byte offset are dropped

Example
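One way to see the 8-byte rule at work is to compute the fragments for a given payload and MTU. This is a hedged sketch, assuming a 20-byte header on every fragment; the function name `fragment` and the numbers (1400 bytes of data, MTU of 532) are illustrative choices, not values from the slide.

```python
def fragment(payload_len: int, mtu: int, header_len: int = 20):
    """Return (offset_field, data_len, more_fragments) for each fragment."""
    # Data per fragment must be a multiple of 8 so the offset field stays exact.
    max_data = (mtu - header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    sent = 0
    while sent < payload_len:
        size = min(max_data, payload_len - sent)
        more = sent + size < payload_len
        fragments.append((sent // 8, size, more))  # offset is in 8-byte units
        sent += size
    return fragments

frags = fragment(1400, 532)
```

Each tuple holds the value stored in the offset field, the number of data bytes carried, and whether the more-fragments flag is set.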

Datagram Forwarding

Strategy
– every datagram contains destination's address
– if directly connected to destination network, then forward to host
– if not directly connected to destination network, then forward to some router
– forwarding table maps network number into next hop
– each host has a default router
– each router maintains a forwarding table
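The strategy above can be sketched as a table lookup with a default route. This is only an illustration: the table entries, the next-hop labels, and the name `next_hop` are hypothetical, and the /24 network sizes are an assumption.

```python
import ipaddress

# Hypothetical forwarding table: network number -> next hop.
FORWARDING_TABLE = {
    ipaddress.ip_network("128.187.171.0/24"): "direct (interface 171)",
    ipaddress.ip_network("128.187.174.0/24"): "router R2",
}
DEFAULT_ROUTER = "router R1"  # used when no network number matches

def next_hop(dest: str) -> str:
    """Map a destination address to a next hop, falling back to the default router."""
    addr = ipaddress.ip_address(dest)
    for network, hop in FORWARDING_TABLE.items():
        if addr in network:
            return hop
    return DEFAULT_ROUTER
```

The table here has no overlapping entries, so the first match is the only match; a table with overlapping prefixes would need a longest-prefix rule.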

Error Detection

Error Control Overview

Errors occur due to
– Noise or interference in the communication channel
– Congestion in the network where packets must be dropped

Error Control Strategies
– Error correcting codes (Forward Error Correction (FEC))
– Error detection and retransmission (Automatic Repeat Request (ARQ))

Cyclic Redundancy Check

Add k bits of redundant data to an n-bit message.

Represent n-bit message as an n-1 degree polynomial; e.g., MSG=10011010 corresponds to M(x) = x^7 + x^4 + x^3 + x^1.

Let k be the degree of some divisor polynomial C(x); e.g., C(x) = x^3 + x^2 + 1.

CRC

Transmit polynomial P(x) that is evenly divisible by C(x), and receive polynomial P(x) + E(x); E(x)=0 implies no errors.

Recipient divides (P(x) + E(x)) by C(x); the remainder will be zero in only two cases: E(x) was zero (i.e. there was no error), or E(x) is exactly divisible by C(x). Choose C(x) to make second case extremely rare.
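The division can be sketched in Python by treating bit strings as integers and using XOR as subtraction. This is a minimal sketch under that assumption; the names `mod2_remainder` and `crc_encode` are illustrative, while MSG and C(x) are the ones given earlier.

```python
def mod2_remainder(value: int, divisor: int) -> int:
    """Remainder of polynomial division over GF(2); bit i is the coefficient of x^i."""
    k = divisor.bit_length() - 1               # degree of the divisor
    while value.bit_length() > k:
        shift = value.bit_length() - divisor.bit_length()
        value ^= divisor << shift              # cancel the leading term
    return value

def crc_encode(msg: int, divisor: int) -> int:
    """Append k check bits so the result is evenly divisible by the divisor."""
    k = divisor.bit_length() - 1
    shifted = msg << k                         # M(x) * x^k
    return shifted ^ mod2_remainder(shifted, divisor)

MSG = 0b10011010   # M(x) = x^7 + x^4 + x^3 + x^1
C = 0b1101         # C(x) = x^3 + x^2 + 1
P = crc_encode(MSG, C)
```

Flipping a single bit of `P` corresponds to an E(x) with one term, which this C(x) cannot divide, so the receiver sees a nonzero remainder.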

Example

Make all legal messages divisible by 3

If you want to send 10
– First multiply by 4 to get 40
– Now add 2 to make it divisible by 3 = 42

When the data is received ...
– Divide by 3, if there is no remainder there is no error
– If no error, divide by 4 (discarding the remainder) to get the sent message

If we receive 43, 44, 41, or 40, the error is detected. 45 would not be recognized as an error.
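The same arithmetic can be written out as a short sketch. The function names `encode` and `decode` are illustrative; the scheme itself is the one described in the example.

```python
def encode(value: int) -> int:
    """Multiply by 4, then add the smallest amount that makes the result divisible by 3."""
    scaled = value * 4
    return scaled + (-scaled) % 3

def decode(received: int):
    """Return the sent value, or None when the divisibility check detects an error."""
    if received % 3 != 0:
        return None
    return received // 4  # integer division discards the added check amount
```

The added amount is at most 2, which is why integer division by 4 recovers the original value. A corrupted value that happens to be divisible by 3, such as 45, passes the check and decodes to the wrong number.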

TCP Congestion Control

Idea
– assumes best-effort network
– each source determines network capacity for itself
– uses implicit feedback
– ACKs pace transmission (self-clocking)

Algorithm:
– increment CongestionWindow by one packet per RTT (linear increase)
– divide CongestionWindow by two whenever a timeout occurs (multiplicative decrease)
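The two rules can be traced with a small simulation. This is a sketch only: it measures the window in packets, treats each list entry as one RTT, and assumes the window never drops below one packet, which the slide does not state.

```python
def aimd(events, cwnd=1.0):
    """Trace CongestionWindow over a series of RTTs.

    events: iterable of booleans, True meaning a timeout occurred in that RTT.
    """
    trace = []
    for timeout in events:
        if timeout:
            cwnd = max(1.0, cwnd / 2)   # multiplicative decrease (floor of 1 is an assumption)
        else:
            cwnd = cwnd + 1.0           # linear increase: one packet per RTT
        trace.append(cwnd)
    return trace

trace = aimd([False, False, False, True, False])
```

Plotting a longer trace gives the familiar sawtooth shape.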

Underlying best-effort network

drops messages
re-orders messages
delivers duplicate copies of a given message
limits messages to some finite size
delivers messages after an arbitrarily long delay

Common end-to-end services

guarantee message delivery
deliver messages in the same order they are sent
deliver at most one copy of each message
support arbitrarily large messages
support synchronization
allow the receiver to apply flow control to the sender
support multiple application processes on each host

Simple Demultiplexor (User Datagram Protocol, UDP)

Unreliable and unordered datagram service
Adds multiplexing
No flow control
Endpoints identified by ports
– servers have well-known ports
– see /etc/services on Unix
Optional checksum
– pseudo header + udp header + data

UDP Header

Bits 0 - 15        Bits 16 - 31
Source Port        Destination Port
Length             Checksum
Data :::
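The four 16-bit header fields can be packed and unpacked with `struct`. This is a minimal sketch that leaves the checksum at zero (allowed for UDP over IPv4, where the checksum is optional); the function names and port numbers are illustrative.

```python
import struct

def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Build a UDP datagram: an 8-byte header followed by the data."""
    length = 8 + len(payload)   # Length covers header plus data
    checksum = 0                # 0 means the optional checksum is not used
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload

def parse_udp_header(datagram: bytes):
    """Return (source port, destination port, length, checksum)."""
    return struct.unpack("!HHHH", datagram[:8])

datagram = build_udp_datagram(2000, 3000, b"hi")
```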

TCP Header (for contrast)

Bits 0 - 15                                  Bits 16 - 31
Source Port                                  Destination Port
Sequence Number (bits 0 - 31)
Acknowledgment Number (bits 0 - 31)
Data Offset, reserved, ECN, Control Bits     Window
Checksum                                     Urgent Pointer
Options and padding :::
Data :::

Demux Process (value of “port”)

[Figure: packets arrive at UDP and are demultiplexed by port into per-port queues (Port 2000, Port 3000, Port 3100), each read by an application process.]

Reliable Byte-Stream (TCP)

Overview

Connection-oriented
Byte-stream
– sending process writes some number of bytes
– TCP breaks into segments and sends via IP
– receiving process reads some number of bytes
– full duplex

Flow control: keep sender from overrunning receiver

Congestion control: keep sender from overrunning network

Read p 272-287 in Cisco book

TCP Segment Format

Each connection identified with 4-tuple:
– <SrcPort, SrcIPAddr, DstPort, DstIPAddr>

Sliding window + flow control
– Acknowledgment, SequenceNum, AdvertisedWindow

Flags: SYN, FIN, RESET, PUSH, URG, ACK

Checksum: pseudo header + tcp header + data

[Figure: TCP segment layout. SrcPort, DstPort; SequenceNum; Acknowledgement; HdrLen (4), 0 (6), Flags (6), AdvertisedWindow; CheckSum, UrgPtr; options (variable); data.]

TCP Connection Establishment and Termination

Three-Way Handshake
– starting sequence numbers are random so that packets from consecutive sessions are unique

1. Active participant to passive participant: SYN, SequenceNum = x
2. Passive participant to active participant: SYN + ACK, SequenceNum = y, Acknowledgement = x + 1
3. Active participant to passive participant: ACK, Acknowledgement = y + 1
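The three messages and how their numbers depend on each other can be sketched with plain dictionaries. This is an illustration only, with no real networking; the function name and the dictionary layout are assumptions, and sequence numbers are taken to wrap at 2^32.

```python
import random

def three_way_handshake(rng=random):
    """Return the SYN, SYN+ACK, and ACK segments of one handshake."""
    x = rng.randrange(2**32)   # active participant's random starting number
    y = rng.randrange(2**32)   # passive participant's random starting number
    syn = {"flags": {"SYN"}, "seq": x}
    syn_ack = {"flags": {"SYN", "ACK"}, "seq": y,
               "ack": (syn["seq"] + 1) % 2**32}
    ack = {"flags": {"ACK"}, "ack": (syn_ack["seq"] + 1) % 2**32}
    return syn, syn_ack, ack
```

Passing a seeded `random.Random` instance as `rng` makes the exchange repeatable for experiments.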

Sliding Window Revisited

Each byte has a sequence number
ACKs are cumulative

Sliding Window (details)

Sending side
– LastByteAcked ≤ LastByteSent
– LastByteSent ≤ LastByteWritten
– bytes between LastByteAcked and LastByteWritten must be buffered

Receiving side
– LastByteRead < NextByteExpected
– bytes between LastByteRead and LastByteRcvd must be buffered

ARP (again!)

[Figure: ARP across a switch. Hosts H1 to H12 sit on segments 171, 172, 173, and 174 joined by a switch. H10 = IP 128.187.174.10, Ethernet 44.fe.34.56.32.d5.]

ARP request (broadcast): D ff.ff.ff.ff.ff.ff, SP 128.187.171.2, SH fe.34.56.32.d5.29, DP 128.187.174.10, DH 0.0.0.0.0.0
ARP reply: D fe.34.56.32.d5.29, SP 128.187.174.10, SH 44.fe.34.56.32.d5, DP 128.187.171.2, DH fe.34.56.32.d5.29
Data frame: D 128.187.174.10, D 44.fe.34.56.32.d5, S 128.187.171.2, S fe.34.56.32.d5.29

[Figure: ARP across a router. Hosts H1 to H12 sit on networks 171, 172, 173, and 174 joined by a router whose interfaces have Ethernet addresses 56.47.ef.c6.34.78 and 55.7e.c6.11.78.99. H10 = IP 128.187.174.10, Ethernet 44.fe.34.56.32.d5.]

ARP request (broadcast): D ff.ff.ff.ff.ff.ff, SP 128.187.171.2, SH 55.7e.c6.11.78.99, DP 128.187.174.10, DH 0.0.0.0.0.0
ARP reply from H10: D 55.7e.c6.11.78.99, SP 128.187.174.10, SH 44.fe.34.56.32.d5, DP 128.187.171.2, DH 55.7e.c6.11.78.99
ARP reply delivered to the sender: D fe.34.56.32.d5.29, SP 128.187.174.10, SH 56.47.ef.c6.34.78, DP 128.187.171.2, DH fe.34.56.32.d5.29
Data frame from the sender to the router: D 128.187.174.10, D 56.47.ef.c6.34.78, S 128.187.171.2, S fe.34.56.32.d5.29
Data frame from the router to H10: D 128.187.174.10, D 44.fe.34.56.32.d5, S 128.187.171.2, S 55.7e.c6.11.78.99

IP Routing Definitions and Terminology

Routers are Layer 3 (Network Layer) devices
Traditionally routers were called gateways
Routers are used for information exchange within a group of networks under the same administrative authority and control (Autonomous Systems)
Routing can be both dynamic and static
Routing involves the determination of routing paths and the transport of information groups (packets) through an internetwork

RIP Routing Table

RIP Packet Format

RIP Packet Fields Description

Command:
– Indicates that the packet is a request or a response. The request command requests the responding system to send all or part of its routing table. Destinations for which a response is requested are listed later in the packet. The response command represents a reply to a request or, more frequently, an unsolicited regular routing update. In the response packet, a responding system includes all or part of its routing table. Regular routing update messages include the entire routing table.

Version number:
– Specifies the RIP version being implemented. With the potential for many RIP implementations in the Internet, this field can be used to signal different, potentially incompatible, implementations.

RIP Packet Fields Description

Address family identifier:
– Follows a 16-bit field of all zeros and specifies the particular address family being used. On the Internet, this address family is typically IP (value = 2), but other network types may also be represented.

Address:
– Follows another 16-bit field of zeros. In Internet RIP implementations, this field typically contains an IP address.

Metric:
– Follows two more 32-bit fields of zeros and specifies the hop count. The hop count indicates how many internetwork hops (routers) must be traversed before the destination can be reached.
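The field order described above can be sketched by packing a version 1 packet with `struct`. This is a hedged sketch: it assumes 8-bit command and version fields and a 32-bit metric, and the function name `build_rip_v1` and the sample route are illustrative.

```python
import socket
import struct

RIP_REQUEST, RIP_RESPONSE = 1, 2   # command values
AF_IP = 2                          # address family identifier for IP

def build_rip_v1(command: int, routes) -> bytes:
    """Pack a RIP version 1 packet; routes is a list of (address, metric) pairs."""
    packet = struct.pack("!BBH", command, 1, 0)       # command, version, 16 zero bits
    for address, metric in routes:
        packet += struct.pack(
            "!HH4sIII",
            AF_IP, 0,                                 # family, then 16 zero bits
            socket.inet_aton(address),                # IP address
            0, 0,                                     # two 32-bit fields of zeros
            metric,                                   # hop count
        )
    return packet

packet = build_rip_v1(RIP_RESPONSE, [("128.187.174.0", 2)])
```

Each route entry occupies 20 bytes after the 4-byte header.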

OSPF - Open Shortest Path First

OSPF is a relatively recent intra-domain, link state, hierarchical routing protocol developed for IP networks by the Internet Engineering Task Force (IETF)

OSPF was derived from an early version of OSI's IS-IS routing protocol

IGRP

IGRP is an intra-domain distance vector routing protocol developed in the mid-1980s by Cisco Systems, Inc. It is designed for use in large, complex IP networks.

IGRP uses a combination (vector) of metrics. Internetwork delay, bandwidth, reliability, MTU, and load are all factored into the routing decision.

What is a routing protocol?

Delivers information about networks this router knows to other routers

Receives and records information about other networks from other routers

Used to construct a virtual path through a series of routers

Provide end to end connectivity for a set of nodes or hosts

What is a distance vector routing protocol?

Based on the Bellman-Ford algorithm
Routes are advertised as a vector in the form (distance, direction)
– Distance is a metric (usually hop count)
– Direction is the next-hop router
Relies upon information learned from neighbors
Common distance vector protocols include RIP, IPX RIP (Novell), IGRP, RTMP (AppleTalk)

Review of RIP

RIP is an example of a DV routing protocol
Widely deployed in 1982, RFC’d in 1988
RIP uses hop count for its metric
Poor load balancing support
Not designed for unequally balanced bandwidth links
Let's look at an example…

Improved DV Protocols

IGRP, EIGRP
Cisco Proprietary
Developed in the mid 1980s (IGRP) and mid 1990s (EIGRP)
Features an advanced metric system
Unequal cost load sharing

(E)IGRP Metrics

Bandwidth – How big the channel is
Delay – Knob to adjust channel use
Load – Percent utilization on the channel
Reliability – Percent time the channel has been up
Provides constants which can be modified to make one variable more important than another

Load Balancing

Balances over equal cost paths by default
Balances over unequal cost paths with minor configuration
You can specify the variance of unequal cost paths to distribute over

Routing Protocol Types

Distance Vector
– Passes routes by next hop and path cost or metric
– Takes up little memory
– Ease of implementation

Link State
– Each router meets all others and passes information about the links that attach it to the network
– Each router contains complete information about the topology and from this information uses Dijkstra’s Algorithm to calculate forwarding decisions
– Faster convergence time
– Uses more memory, complex to implement

DHCP

Dynamic Host Configuration Protocol uses the same frame format and transport mechanism as BOOTP. It is supposed to provide a complete set of parameters to a host that queries the server. The neat new capability that DHCP adds is that it can assign addresses and reuse them.

BOOTP: assigns a host an address it can use “forever”

DHCP: loans an address to a host; the address becomes available again if the host does not renew.

NAT

A Network Address Translator sits on a network and translates IP addresses of multiple stations into one address viewable by the outside world.

[Figure: four workstations reach the WAN through a NAT.]

The Translator

The NAT has a set of one or more globally unique IP addresses that it can assign to nodes in the masked network.

If the NAT has a pool of globally unique IP addresses that is less than the number of nodes in the masked network, it can do Network Address Port Translation (NAPT). This translates between address and port pairs, allowing thousands of connections through the translator.

NAT and NAPT have helped delay the deployment of IPv6. These protocols can get ugly when encryption is used.
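The address and port pair translation can be sketched as two lookup tables. This is a simplified illustration: the class name `Napt`, the starting port 40000, and the example addresses are assumptions, and real translators also track protocol, timeouts, and port exhaustion.

```python
class Napt:
    """Toy NAPT: maps (private address, port) pairs onto ports of one public address."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out = {}    # (private_ip, private_port) -> public_port
        self.back = {}   # public_port -> (private_ip, private_port)

    def outbound(self, private_ip: str, private_port: int):
        """Translate an outgoing connection, allocating a public port on first use."""
        key = (private_ip, private_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.out[key]

    def inbound(self, public_port: int):
        """Translate a reply back to the private pair, or None if there is no mapping."""
        return self.back.get(public_port)
```

Two workstations using the same private port receive different public ports, which is what lets many connections share one address.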

Autonomous Systems

An Autonomous System (or AS) is a set of routers under a single technical administration, like an internet service provider. The administration of an AS appears to other ASs to have a single coherent interior routing plan and presents a consistent picture of what destinations are reachable through its network.

Examples of an AS are Sprint, Qwest, and MCIWorldCom

Gateway Protocols

There are 2 types of Gateway Protocols

1. Interior Gateway Protocols are used within Autonomous Systems

2. Exterior Gateway Protocols are used between Autonomous Systems

A Problem with Distance Vector Protocols

Counting to infinity: the LONG convergence time.

[Figure: routers A, B, and C connected in a line.]

Assume each link has a metric of one. A knows that its cost to get to C is 2, and B knows that its cost to get to C is 1. C crashes! B discards its distance vector from C and recalculates, using the advertisement of 2 from router A and incrementing it by one. Router A then receives a metric of 3 to C from router B and changes its distance vector to 4. This process continues until routers A and B determine that the metric to C is infinity. This problem is called counting to infinity.
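The exchange between A and B can be traced with a short loop. This is a sketch under simplifying assumptions: B and A update strictly in turn, and infinity is taken to be 16 as in RIP. The function name is illustrative.

```python
INFINITY = 16  # assumed value of "infinity", as used by RIP

def count_to_infinity(a_cost: int = 2):
    """Return the successive (B cost, A cost) pairs to reach C after C crashes."""
    history = []
    while True:
        b_cost = min(a_cost + 1, INFINITY)   # B believes A's stale route
        a_cost = min(b_cost + 1, INFINITY)   # A believes B's new route
        history.append((b_cost, a_cost))
        if a_cost >= INFINITY and b_cost >= INFINITY:
            return history

history = count_to_infinity()
```

With one exchange per update period, the number of rounds shows why convergence is slow.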

Solutions

Hold Down: When a route that is in use goes down, the connected router advertises that path as infinity to the rest of the network. Once a period of time passes, the connected router finds an alternative path. The issue here is that it slows down convergence time and doesn’t always work.

Report the entire path: this is expensive and you might as well use a link state algorithm.

Split Horizon: When Router C crashes, Router B stops advertising a route to C to Router A. Problem solved…somewhat (next slide)

Split Horizon

Split Horizon still does not solve the count to infinity problem in the topology below if the link to router D goes down. Router B will stop advertising the route to D, but routers A and C will continue to advertise routes to D, counting to infinity.

Counting to infinity doesn’t break the protocol, it just slows network convergence time.

[Figure: topology of routers A, B, C, and D.]

Poison Reversal

Poison Reversal is used with Split Horizon. Instead of not advertising the route, the router advertises the route with a metric of infinity. This breaks a loop between two routers immediately, although loops through three or more routers can still count to infinity.

“Classful” Inter Domain Routing

In the early days of the internet, the network masks were inferred from the class of IP address to the destination network. They were not passed around by the routing protocols. In this age network masks were not needed by routers, as they could implicitly determine the network mask by looking at the first byte of the IP address in question.

It was assumed that Class A netmasks were 255.0.0.0, Class B netmasks were 255.255.0.0 and Class C netmasks were 255.255.255.0.
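The inference from the first byte can be sketched as follows. The function name `classful_netmask` is illustrative; the class boundaries (first byte below 128, below 192, below 224) are the standard ones.

```python
def classful_netmask(address: str) -> str:
    """Infer the netmask from the first byte of a dotted-quad IPv4 address."""
    first = int(address.split(".")[0])
    if first < 128:
        return "255.0.0.0"        # Class A
    if first < 192:
        return "255.255.0.0"      # Class B
    if first < 224:
        return "255.255.255.0"    # Class C
    raise ValueError("class D and E addresses have no classful netmask")
```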

Classless Inter Domain Routing

CIDR routing protocols pass subnet information with their routing information, eliminating the need to infer an address class.

Subnets are thus no longer limited to the classes. They could now be any arbitrary length that the network manager configured (within reason).

More on CIDR

The concept of address “classes” goes away. A block with a 16-bit prefix is the same as an old class B network.
Modern networks can be arbitrarily grouped (multiple class C sized networks into a class B sized network), or divided (one class B sized network into any set of arbitrary smaller sized networks).
A single routing advertisement can cover a block of old style addresses. This makes the size of routing tables smaller.
Larger blocks of addresses can be divided and allocated, increasing the lifetime of IPv4.
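Grouping and dividing can both be tried with Python's standard `ipaddress` module. The address blocks below are illustrative choices, not taken from the slides.

```python
import ipaddress

# Grouping: four adjacent class C sized networks collapse into one advertisement.
class_c_blocks = [ipaddress.ip_network(f"192.168.{i}.0/24") for i in range(4)]
aggregate = list(ipaddress.collapse_addresses(class_c_blocks))

# Dividing: one class B sized network split into four smaller blocks.
subnets = list(ipaddress.ip_network("128.187.0.0/16").subnets(new_prefix=18))
```

The single aggregate entry is what lets one routing advertisement stand in for four.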

Routing Information Protocol

RIP is an Interior Gateway Protocol that uses the Distance Vector approach to routing. The most primitive version (1) was a class oriented routing protocol. RIP Version 2 adds support for subnet masks and authentication.

RIP works just as I have explained Distance Vector routing protocols.

RIP uses split horizon with poison reversal.

More on RIP

RIP is designed for smaller simple networks, as the infinity metric is 16 hops. Thus the protocol is limited to networks whose longest path (the network's diameter) is 15 hops with a metric of 1. RIP should not be used in larger networks.

Like other Distance Vector routing protocols, RIP counts to infinity to resolve unusual situations.

RIP is not appropriate for situations where routes need to be chosen based on real-time parameters such as measured delay, reliability, or load.

Why Use RIP?

Because of the simplicity of the protocol:
– There are many good, interoperable implementations.
– These implementations have a minimal number of bugs.
– There is minimal configuration.

Open Shortest Path First

OSPF is an Interior Gateway Protocol that uses the Link State approach to routing.
In OSPF, each router maintains a database describing the Autonomous System's topology.
This database is referred to as the link-state database.
Each participating router has an identical database.
Each individual piece of this database is a particular router's local state (e.g., the router's usable interfaces and reachable neighbors).
The router distributes its local state throughout the Autonomous System by flooding.

Hello Protocol

Routers discover other OSPF capable routers through the Hello Protocol.
Once 2 routers have detected each other, a partial adjacency has been formed. They can now share link state information through the exchange of database description packets.
During and after the Database Exchange Process, each router has a list of those LSAs (Link State Advertisements) for which the neighbor has more up-to-date instances. Requests are sent until the database is updated; the routers are then fully adjacent.

What happens when there is more than one router on the subnet?

In this situation, the Hello Protocol has the capability to elect a designated router.
The router that is first to initialize becomes designated router.
Or if 2 routers initialize at the same time, DR is determined by priority or router ID.
The designated router is the only router on that specific network that shares the database with other routers on that network.

The Calculation

OSPF uses Dijkstra’s Algorithm to construct a tree of shortest path routes across an autonomous system.

This is performed by all routers on the network in parallel.

The route costs or metrics are configured by the network administrator.

The tree that is calculated determines the entire path, but the router only uses this to determine forwarding of data packets to the next hop router.

Multicast Routing

Multicast hosts register with their local router through a protocol called Internet Group Management Protocol.

There are several routing protocols that can route IP multicast packets.

Distance Vector Multicast Routing Protocol (DVMRP) creates source-rooted tree structures.

Link-State Routing Algorithms

Net topology, link costs known to all nodes
– Link state distribution accomplished via “link state broadcast”
– all nodes have same info

Compute least cost paths from a node to all other nodes
– use Dijkstra’s algorithm

Dijkstra’s Algorithm

Initialization:
  N = {A}
  for all nodes v
    if v adjacent to A
      then D(v) = c(A,v)
      else D(v) = infty

Loop
  find w not in N such that D(w) is a minimum
  add w to N
  update D(v) for all v adjacent to w and not in N:
    D(v) = min( D(v), D(w) + c(w,v) )
  /* new cost to v is either old cost to v or known
     shortest path cost to w plus cost from w to v */
until all nodes in N

Notation:
– c(i,j): link cost from node i to j; cost infinite if not direct neighbors
– D(v): current value of cost of path from source to destination v
– N: set of nodes whose least cost path definitively known

Dijkstra’s Algorithm: Example

Step  start N   D(B),p(B)  D(C),p(C)  D(D),p(D)  D(E),p(E)  D(F),p(F)
0     A         2,A        5,A        1,A        infinity   infinity
1     AD        2,A        4,D                   2,D        infinity
2     ADE       2,A        3,E                              4,E
3     ADEB                 3,E                              4,E
4     ADEBC                                                 4,E
5     ADEBCF

[Figure: six-node graph with nodes A, B, C, D, E, and F and link costs of 1, 2, 3, and 5.]
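The algorithm can be run in Python on a six-node graph. This is a sketch, not the lecture's own code: the link costs in `GRAPH` are an assumption, since the figure does not survive in the text, chosen so that the final costs and predecessors agree with the example's D(v),p(v) values.

```python
import math

# Assumed link costs (the original figure is not recoverable from the text).
GRAPH = {
    "A": {"B": 2, "C": 5, "D": 1},
    "B": {"A": 2, "C": 3, "D": 2},
    "C": {"A": 5, "B": 3, "D": 3, "E": 1, "F": 5},
    "D": {"A": 1, "B": 2, "C": 3, "E": 1},
    "E": {"C": 1, "D": 1, "F": 2},
    "F": {"C": 5, "E": 2},
}

def dijkstra(graph, source):
    """Return (D, p): least cost to every node and the predecessor on that path."""
    dist = {v: math.inf for v in graph}
    prev = {v: None for v in graph}
    dist[source] = 0
    done = set()                                   # the set N
    while len(done) < len(graph):
        # find w not in N such that D(w) is a minimum
        w = min((v for v in graph if v not in done), key=dist.get)
        done.add(w)
        for v, cost in graph[w].items():
            if v not in done and dist[w] + cost < dist[v]:
                dist[v] = dist[w] + cost           # D(v) = min(D(v), D(w) + c(w,v))
                prev[v] = w
    return dist, prev

dist, prev = dijkstra(GRAPH, "A")
```

When two nodes tie for the minimum, the order in which they join N may differ from a hand trace, but the final costs are the same.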

Link State Broadcast

[Figure: a link-state update from node S is flooded through a network of nodes A to N. Legend: lines represent links; marked nodes have received the update. At each step, nodes that have received the update send it to their neighbors.]

To avoid forwarding the same update multiple times, each update has a sequence number. If an arrived update does not have a higher seq., discard!
– The packet received by E from C is discarded
– The packet received by C from E is discarded as well
– Node H receives packet from two neighbors, and will discard one of them

Summary of Link State Broadcast

Link updates are given sequence numbers
Each router maintains the highest seq. seen for each router
If the seq. of an arrived update is not higher than the stored seq., discard the update; otherwise, update seq. of the src, and forward the update to all the links except the incoming link
So that a corrupted seq. (or a router reboot) does not block all later updates, the state at each router has an age field
Updates are sent periodically
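The sequence-number rule can be sketched as a small flooding simulation. The four-node topology and the function name `flood` are hypothetical, and the age field is left out to keep the sketch short.

```python
from collections import deque

def flood(links, origin, seq):
    """Flood one update from origin; return (highest seq seen per node, copies discarded)."""
    highest_seen = {node: {} for node in links}   # node -> {origin: highest seq}
    discarded = 0
    queue = deque([(origin, None)])               # (receiving node, incoming link)
    while queue:
        node, came_from = queue.popleft()
        if highest_seen[node].get(origin, -1) >= seq:
            discarded += 1                        # not a higher seq.: discard
            continue
        highest_seen[node][origin] = seq
        for neighbor in links[node]:
            if neighbor != came_from:             # all links except the incoming one
                queue.append((neighbor, node))
    return highest_seen, discarded

# Hypothetical topology: S reaches C through both A and B.
LINKS = {"S": ["A", "B"], "A": ["S", "C"], "B": ["S", "C"], "C": ["A", "B"]}
seen, discarded = flood(LINKS, "S", seq=1)
```

Every node ends up with the update exactly once, and the duplicate copies that arrive over the second path are dropped.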

Routing in the Internet

The Global Internet consists of Autonomous Systems (AS) interconnected with each other
– An AS is identified by an AS Number (ASN), e.g. Yale ASN is 29
– Try %whois or
– http://www.cs-ipv6.lancs.ac.uk/ftp-archive/6Bone/Whois/internic-asn/asn.txt

Different Types of AS

Stub AS: single service provider, e.g. small corporation
– does not participate in inter-AS protocol
– has one default route and sends non-local traffic to service provider

Multihomed AS: large corporation (no transit)
– does not participate in inter-AS routing protocol
– has more than one service provider

Transit AS: provider

[Figure: Yale (132.130.0.0/16) connects to Qwest, with default routes 0.0.0.0/0 pointing to provider.]

Routing with AS

Intra-AS
– Routers in the same AS run the same routing protocol
– Routers in different AS’s can run different intra-AS routing protocols
– Such protocols are called Interior Gateway Protocols (IGP)
  RIP: Routing Information Protocol
  OSPF: Open Shortest Path First
  IS-IS: very similar to OSPF (or should we say OSPF is very similar to IS-IS?)
  IGRP: Interior Gateway Routing Protocol (Cisco)

Inter-AS
– A protocol that runs among AS’s is also called an Exterior Gateway Protocol (EGP)
  Unique standard in the current Internet: Border Gateway Protocol (BGP)

Intra-AS and Inter-AS Routing

[Figure: hosts h1 and h2 communicate across autonomous systems A, B, and C. Interior (gateway) routers carry intra-AS routing within AS A and within AS B; border (exterior gateway) routers such as A.a, A.c, B.a, and C.b carry inter-AS routing between A and B.]

Why different Intra- and Inter-AS Routing?

Policy:
– Inter-AS: admin wants control over how its traffic routed, who routes through its net
– Intra-AS: single admin, so no policy decisions needed

Scale:
– hierarchical routing saves table size and reduces update traffic

Performance:
– Intra-AS: can focus on performance
– Inter-AS: policy may dominate over performance

Many Routing Processes Can Run on a Single Router

[Figure: a router attached to an OSPF domain, a RIP domain, and BGP. The RIP, OSPF, and BGP processes each keep their own routing table and feed a Forwarding Table Manager, which maintains the Forwarding Table in the OS kernel.]

RIP (Routing Information Protocol)

Distance vector algorithm
Included in BSD-UNIX Distribution in 1982
Link cost: 1
Distance metric: # of hops (max = 15 hops)
– why?
Distance vectors
– exchanged every 30 sec via Response Message (also called advertisement) using UDP
– Each advertisement: route to up to 25 destination nets

RIP (Routing Information Protocol)

Routing table in D

Destination Network   Next Router   Num. of hops to dest.
w                     A             2
y                     B             2
z                     B             7
x                     --            1
….                    ….            ....

[Figure: networks w, x, y, and z connected by routers A, B, C, and D.]

RIP: Link Failure and Recovery

If no advertisement heard after 180 sec --> neighbor/link declared dead
– routes via neighbor invalidated
– new advertisements sent to neighbors
– neighbors in turn send out new advertisements (if tables changed)
– link failure info quickly propagates to entire net
– poison reverse used to prevent ping-pong loops (infinite distance = 16 hops)

OSPF (Open Shortest Path First)

“Open”: publicly available

Uses Link State algorithm
– Link state (LS) packet dissemination
– Topology map at each node
– Route computation using Dijkstra’s algorithm

OSPF “Advanced” Features (not in RIP)

Multiple same-cost paths allowed (only one path in RIP)

For each link, multiple cost metrics for different Type Of Service (e.g., satellite link cost set “low” for best effort; high for real time)

Security: all OSPF messages authenticated (to prevent malicious intrusion); OSPF messages are carried directly over IP (protocol 89) rather than over TCP or UDP

Hierarchical OSPF in large domains

Hierarchical OSPF

Two-level hierarchy: local area, backbone.
– Link-state advertisements only in area
– each node has detailed area topology; only knows direction (shortest path) to nets in other areas.

Area border routers: “summarize” distances to nets in own area, advertise to other Area Border routers.

Backbone routers: run OSPF routing limited to backbone.

Internet Inter-AS Routing: BGP

BGP (Border Gateway Protocol): the de facto standard
Path Vector protocol:
– Similar to Distance Vector protocol
– Each Border Gateway broadcasts to neighbors (peers) entire path (i.e., sequence of AS’s) to destination
– e.g., Gateway X may send its path to dest. Z:
  Path (X,Z) = X,Y1,Y2,Y3,…,Z

BGP: Policy Routing

Suppose: gateway X sends its path to peer gateway W
W may or may not select path offered by X
– cost, policy (e.g., don’t route via competitor’s AS), loop prevention reasons
If W selects path advertised by X, then:
  Path (W,Z) = W, Path (X,Z)
Note: X can control incoming traffic by controlling its route advertisements to peers:
– e.g., don’t want to route traffic to Z -> don’t advertise any routes to Z
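W's decision can be sketched as a function over the offered path. This is an illustration of the idea only, not of how BGP implementations are structured; the function name `consider_path` and the `banned` parameter are hypothetical, and cost comparison is left out.

```python
def consider_path(my_as, offered_path, banned=frozenset()):
    """Return the path W would use and advertise, or None if W rejects the offer."""
    if my_as in offered_path:            # loop prevention: W already appears in the path
        return None
    if banned & set(offered_path):       # policy: path crosses an AS that W avoids
        return None
    return [my_as] + list(offered_path)  # Path(W,Z) = W, Path(X,Z)
```

For example, `consider_path("W", ["X", "Y1", "Z"])` prepends W to the offered path, while the same call with `banned={"Y1"}` rejects it.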

Selective Transit

[Figure: NET A connects to NET B, NET C, and NET D and carries IP traffic between them.]

NET A provides transit between NET B and NET C and between NET D and NET C
NET A DOES NOT provide transit between NET D and NET B
Suppose Net C is a paying customer of Net A

NET A's advertisements:
– to NET B: advertise path to C, but not D
– to NET C: advertise path to B and D
– to NET D: advertise path to C, but not B

BGP: Policy Interactions Could Lead to Oscillations

[Figure: routers 1, 2, and 3 arranged in a ring around destination 0.]

Each router has a choice among two paths; the policy is to prefer its counter clock-wise neighbor.

Router   Preferred   Less preferred
1        1 3 0       1 0
2        2 1 0       2 0
3        3 2 0       3 0

• If each one chooses the first choice, not consistent;
• If one chooses the second choice, say 1 chooses 10, then 2 will choose 210, and the only valid choice for 3 is 30; however, the choice of 3 forces 1 to change to 130
• Have not seen oscillations in practice, but this is a hidden threat!
• Solution: check for dependency!

BGP Operations (Simplified)

[Figure: a BGP session between AS1 and AS2.]

Establish session on TCP port 179
Exchange all active routes
Exchange incremental updates
While connection is ALIVE exchange route UPDATE messages

IGRP (Interior Gateway Routing Protocol)

CISCO proprietary; successor of RIP (mid 80s)
Distance Vector, like RIP
Several cost metrics (delay, bandwidth, reliability, load etc)
Routing updates are carried directly over IP (protocol 9) rather than over TCP
Loop-free routing via the Diffusing Update Algorithm (DUAL), based on diffused computation, arrives with the successor EIGRP

BGP Messages

Four types of messages
– OPEN: opens TCP connection to peer and authenticates sender
– UPDATE: advertises new path (or withdraws old)
– KEEPALIVE: keeps connection alive in absence of UPDATES; also ACKs OPEN request
– NOTIFICATION: reports errors in previous msg; also used to close connection

Why is a routing protocol needed?

Early requirement to exchange data between computers over interconnected networks.

Routing entities had to make a judgement on which path to route traffic to destination.

Background to RIP

RIP dates back to 1969, the early networking days and the ARPANET, when Xerox and Berkeley’s Unix implemented broadly similar protocols.

RIP was distributed through the ‘routed’ application, included in early Unix O.S.

RIP uses a single class of routing algorithm known as distance vector - based on a simple hop count algorithm (derived from Bellman’s equation).

Although superseded by more complex algorithms, its simplicity means it is still found widely in smaller autonomous systems.

Purpose of Routing Protocol

The purpose of routing protocols is to supply information needed to do routing of datagrams from router to router.

RIP intended for use in IP based network environment.
Operating at layer 3 of OSI (Network)
RIP makes no formal distinction between networks and hosts.
Routers typically provide a gateway for datagrams to leave one network or AS and be forwarded onward to another network.
Routers therefore have to make decisions if there is a choice of forwarding path on offer.

Routing metrics

Routing entities keep a database (lookup table) of basic information, based on the numeric results (metrics) of an algorithm, used to forward a datagram onward to its next destination.

Each entity participating in routing decisions sends update messages to its neighbours.

In order to provide complete network routing information every router within the AS must participate in the protocol.

Each router has a lookup table which contains one entry for every destination that is reachable.

How does a metric work?

Metrics are the result of a formula based on a choice of measurement criteria.

Example, travel cost by taxi:

£10 to go by taxi from Edinburgh to Livingston (P1)
£25 to go from Livingston to Glasgow (P2)
£15 to go from Edinburgh to Falkirk (P3)
£30 to go from Falkirk to Glasgow (P4)

Cost (Edinburgh, Glasgow) = [P1+P2] = £35
also/or [P3+P4] = £45
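The taxi arithmetic can be written out as a small sketch. The dictionary layout and the `path_cost` helper are made up for this example; the prices are the ones given above.

```python
# Leg prices from the taxi example, in pounds.
legs = {
    ("Edinburgh", "Livingston"): 10,  # P1
    ("Livingston", "Glasgow"): 25,    # P2
    ("Edinburgh", "Falkirk"): 15,     # P3
    ("Falkirk", "Glasgow"): 30,       # P4
}

def path_cost(path):
    """Metric of a path: the sum of the prices of its consecutive legs."""
    return sum(legs[(a, b)] for a, b in zip(path, path[1:]))

candidates = [
    ["Edinburgh", "Livingston", "Glasgow"],
    ["Edinburgh", "Falkirk", "Glasgow"],
]
best = min(candidates, key=path_cost)  # the route a router would prefer
print(best, path_cost(best))
```

A router does the same thing with link costs: it totals the metric along each candidate path and keeps the cheapest.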

What is in a RIP routing table?

Address - IP address (IPv4) of host or network destination.

Router - First router along the route to the destination.

Interface - The physical network which must be used to reach the next router.

Metric - A number indicating the distance to the destination. This number is the sum of the 'costs' that have to be traversed to get to the destination.

Timers - Time since the entry was last updated, and others.

Flags - Various flags to indicate the status of various adjacent routers (for example).
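One way to picture such a table entry is as a small record type. This is a minimal sketch, not how any particular RIP daemon stores routes: the class name, field names and example addresses are all assumptions chosen to mirror the list above.

```python
from dataclasses import dataclass, field
import time

@dataclass
class RouteEntry:
    address: str      # IPv4 address of the destination host or network
    router: str       # first router along the route to the destination
    interface: str    # physical network used to reach that router
    metric: int       # summed cost to the destination
    last_update: float = field(default_factory=time.monotonic)  # timer
    flags: set = field(default_factory=set)                     # status flags

# Hypothetical lookup table keyed by destination address.
table = {}
entry = RouteEntry("192.0.2.0", "198.51.100.1", "eth0", 3)
table[entry.address] = entry
print(table["192.0.2.0"].metric)
```

Keying the table by destination matches the rule that each router holds one entry per reachable destination.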

Other entries in the routing table

The entries for directly connected networks typically have a value of 1 (a simple hop count).

Initially subnet masks were not included in RIP protocol implementations, but were included later to support feature extensions and to identify different subnets within local and distant networks.

Administrators may also add entries, for example static routes, which are outside the scope of the routing system.

The RIP datagram

RIP is a UDP-based protocol.

Small regular messages, no need for windowing, handshaking or re-transmission.

Messages are received and transmitted on UDP port number 520 (RIP 1 and 2).

Each message carries 1 to 25 RIP route entries (RTEs).

Gateway Hierarchy

[Figure: the Internet core connected to two autonomous systems (AS).]

Two levels of Routing Protocols

[Figure: three routing domains. An intra-domain routing protocol (IGP) runs inside each domain; an exterior routing protocol (EGP) runs between the domains.]

Routing Protocols

Intra-domain Gateway Protocols
– RIP
– RIP V2
– OSPF - open shortest path first
– IS-IS (similar to OSPF)

Exterior Gateway Protocols
– EGP
– BGP

RIP

Distance vector routing algorithm based on hops that communicates between routers using UDP

On initialization, the router determines all available interfaces and sends a REQUEST packet out each interface. There is a special request for "send everything".

On receipt of request,
– Either return everything
– Or, for each requested destination, return distance + 1

On response
– Update routing tables
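The "update routing tables" step can be sketched as below. This is a simplified reading of the usual distance-vector rule, not a full RIP implementation: timers and triggered updates are left out, the receiver adds the extra hop, and the router and network names (R1, R2, N1 to N3) are illustrative.

```python
INFINITY = 16  # RIP treats a metric of 16 as unreachable

def process_response(table, neighbor, advertised):
    """Merge a neighbor's advertised hop counts into our routing table.

    table maps destination -> (metric, next_hop).
    """
    for dest, metric in advertised.items():
        new_metric = min(metric + 1, INFINITY)  # one more hop to reach the neighbor
        current = table.get(dest)
        if current is None:
            if new_metric < INFINITY:           # learn a new reachable destination
                table[dest] = (new_metric, neighbor)
        elif new_metric < current[0] or current[1] == neighbor:
            # A shorter route, or fresh news from the router we already use
            table[dest] = (new_metric, neighbor)
    return table

# R1 is directly attached to N1 and N2; R2 reports N2 and N3 at 1 hop each.
r1_table = {"N1": (1, "direct"), "N2": (1, "direct")}
process_response(r1_table, "R2", {"N2": 1, "N3": 1})
print(r1_table["N3"])
```

The existing route to N2 is kept because the advertised alternative is longer, while N3 is learned at a hop count of 2 through R2.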

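RIP V1 Protocol

[Packet format, one 32-bit word per row; MBZ = must be zero]

Command (8 bits) | Version (8 bits) | MBZ (16 bits)

Address Family (16 bits) | MBZ (16 bits)

32-bit IP address

MBZ

MBZ

Metric (value of 1..16)

Up to 24 more routes in same format...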
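A minimal sketch of packing this layout with Python's `struct` module follows. The function name and the example networks are assumptions; the command code 2 (response) and address family 2 (IP) are the values normally used by RIP.

```python
import socket
import struct

RIP_RESPONSE = 2   # command code for a response
AF_INET_RIP = 2    # address family identifier for IP

def pack_rip_v1(routes):
    """Build a RIP V1 response from (ip_address, metric) pairs."""
    if not 1 <= len(routes) <= 25:
        raise ValueError("a RIP message carries 1 to 25 route entries")
    message = struct.pack("!BBH", RIP_RESPONSE, 1, 0)  # command, version, MBZ
    for ip, metric in routes:
        message += struct.pack(
            "!HH4sIII",
            AF_INET_RIP, 0,        # address family, MBZ
            socket.inet_aton(ip),  # 32-bit IP address
            0, 0,                  # two MBZ words
            metric,                # 1..16, where 16 means unreachable
        )
    return message

packet = pack_rip_v1([("192.0.2.0", 1), ("198.51.100.0", 2)])
print(len(packet))
```

Each entry occupies 20 bytes after the 4-byte header, so a full message of 25 entries is 504 bytes.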

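Metrics

[Figure: networks N1, N2 and N3 joined by routers R1 and R2. R1 advertises that N1 is 1 hop and N2 is 1 hop; R2 advertises that N2 is 1 hop and N3 is 1 hop. R1 therefore has a route to N3 via R2 with a hop count of 2.]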

Problems

Hop count limited to 15
– Can only be used within an AS whose maximum network diameter is 15 hops

It's based on hops, not on, e.g., latency or bandwidth

No notion of subnet addressing in RIP V1

RIP V2 Protocol

[Packet format, one 32-bit word per row]

Command (8 bits) | Version (8 bits) | Routing domain (16 bits)

Address Family (16 bits) | Route tag (16 bits)

32-bit IP address

32-bit subnet mask

32-bit next-hop IP address

Metric (value of 1..16)

Up to 24 more routes in same format...
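To show how the extra V2 fields fit into the same 20-byte entry, here is a small sketch that packs and unpacks one route entry. The `ENTRY` layout follows the field list above; the function name and sample addresses are illustrative.

```python
import socket
import struct

# family, route tag, IP address, subnet mask, next hop, metric
ENTRY = struct.Struct("!HH4s4s4sI")

def unpack_rip_v2_entry(raw):
    """Decode one 20-byte RIP V2 route entry into a dictionary."""
    family, tag, addr, mask, next_hop, metric = ENTRY.unpack(raw)
    return {
        "family": family,
        "route_tag": tag,
        "address": socket.inet_ntoa(addr),
        "mask": socket.inet_ntoa(mask),
        "next_hop": socket.inet_ntoa(next_hop),
        "metric": metric,
    }

raw = ENTRY.pack(
    2, 0,
    socket.inet_aton("192.0.2.0"),
    socket.inet_aton("255.255.255.0"),
    socket.inet_aton("0.0.0.0"),  # zero: send via the advertising router
    3,
)
entry = unpack_rip_v2_entry(raw)
print(entry["address"], entry["mask"], entry["metric"])
```

The entry is the same size as in V1 because the subnet mask and next hop reuse the words that V1 required to be zero.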

RIP V2

Routing domain is an identifier of the routing daemon
– Process ID in UNIX
– …So you can run multiple instances of RIP

Route tag carries an autonomous system number for EGP and BGP

Next-hop address is where packets for that (sub)network should be sent. A value of zero means send them to the system that sent the RIP information.

Simple authentication scheme with clear-text password

Distance Vector Routing

Also called Bellman-Ford or Ford-Fulkerson algorithms

Used by RIP

Each router is responsible for keeping track of its distance to each destination and informing its neighbors of it

The router computes its distance to a destination based on its neighbors' distances to the destination

Router must know its own ID and the cost of its links to each neighbor

Distance Vector Routing For Address "D"

[Figure, built up in four steps: router R has five links, numbered 1 to 5, each with a link cost. Each neighbor reports its own cost to destination D. R adds the link cost to the neighbor's cost, keeps the minimum, and advertises that value on every link.]

Link cost    Cost from neighbor to destination D    Cost for R to get to D via this link
17           81                                     98
2            97                                     99
35           62                                     97
5            118                                    123
41           29                                     70  (minimum cost route)

Cost from R to D: 70
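The computation in the figure is a single minimum over "link cost plus neighbor's cost". The sketch below reads the figure's numbers as five (link cost, neighbor cost) pairs; that pairing is an assumption, chosen because it is the only one that reproduces the per-link totals 98, 99, 97, 123 and 70.

```python
# (link cost, neighbor's advertised cost to D), one pair per link from R
links = [(17, 81), (2, 97), (35, 62), (5, 118), (41, 29)]

# Cost for R to reach D through each link
via = [link_cost + neighbor_cost for link_cost, neighbor_cost in links]

# R keeps the cheapest option and advertises it to its neighbors
best = min(via)
print(via, best)
```

Only the minimum, 70, is advertised onward; the other candidates stay local to R.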

Problems With Distance Vector

Slow convergence to the lowest cost route

Slow recovery time

Slow recovery leads to routing problems during recovery
– Routing loops
– Count to infinity

Routing Loops

[Figure: two diagrams of routers A, B, C and D.]

Count To Infinity (worst case loop)

[Figure: routers A, B and C. In successive steps the hop counts exchanged between A and B rise from 1 and 2, to 2 and 3, to 3 and 4, and so on.]
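A small simulation can make the counting visible. It assumes a chain A - B - C in which the B-C link has just failed, A still holds a stale route to C through B, and the two routers take turns trusting each other's advertisement. The starting value and the alternation order are simplifying assumptions, not details taken from the figure.

```python
INFINITY = 16  # RIP's "unreachable" metric stops the counting

def count_to_infinity(a_cost=2):
    """Return the (A, B) costs to C after each exchange of updates."""
    rounds = []
    while True:
        b_cost = min(a_cost + 1, INFINITY)  # B trusts A's stale advertisement
        a_cost = min(b_cost + 1, INFINITY)  # A routes via B, so it follows B
        rounds.append((a_cost, b_cost))
        if a_cost == INFINITY and b_cost == INFINITY:
            return rounds

for a, b in count_to_infinity():
    print(f"A thinks C is {a} hops away, B thinks {b}")
```

With the metric capped at 16 the loop ends after a handful of exchanges; without such a cap the two routers would keep incrementing forever.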

OSPF - Open Shortest Path First

OSPF uses IP directly (i.e., like ICMP)

Routes calculated based on TOS

Each interface is assigned a dimensionless cost for each TOS

If several equal-cost routes are available, traffic is load-balanced

Subnets are associated with each advertised route

Supports authentication

Uses multicast to distribute information

Link State Routing

Used by OSPF and IS-IS

Construct a Link State Packet that lists neighbors and costs to get to those neighbours

Use Dijkstra’s algorithm to compute global routes as a tree from the current router
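As a sketch of the last step, the code below runs Dijkstra's algorithm over a small link-state database and returns both the distances and the shortest-path tree rooted at the computing router. The four-router topology and its costs are invented for the example.

```python
import heapq

def dijkstra(graph, source):
    """Return (distance, parent) maps for the shortest-path tree from source."""
    dist = {source: 0}
    parent = {}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for neighbor, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                parent[neighbor] = node
                heapq.heappush(heap, (nd, neighbor))
    return dist, parent

# Each router's link state packet: its neighbors and the cost to reach them.
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
dist, parent = dijkstra(graph, "A")
print(dist, parent)
```

The `parent` map is the tree mentioned above: following it back from any destination gives the path, and the first hop on that path is what goes into the forwarding table.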

BGP

Uses TCP

Path vector protocol: like distance vector, but BGP enumerates the route to each destination (using a sequence of AS numbers)

Each AS is identified by a 16-bit number (later extended to 32 bits)
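Carrying the whole sequence of AS numbers is what lets BGP avoid loops: a router can refuse any route whose path already contains its own AS. The sketch below shows only that check; the function names are made up, and the AS numbers come from the private-use range.

```python
def accept_route(local_as, as_path):
    """Reject any advertisement whose AS path already contains our own AS."""
    return local_as not in as_path

def advertise(local_as, as_path):
    """Prepend our AS number before passing the route to a neighboring AS."""
    return [local_as] + as_path

path = advertise(64512, [64513, 64514])
print(path)
print(accept_route(64513, path))  # AS 64513 is already on the path
print(accept_route(64515, path))  # AS 64515 is not
```

Real BGP route selection involves many more attributes and local policy; this shows only the loop check that the AS path makes possible.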
