The Network Layer


Upload: walden

Post on 14-Jan-2016


Page 1: The Network Layer

1

The Network Layer

• Services:

– Deliver packets between any two hosts, reliably or unreliably.

• A network-wide concern:

– Transport layer (above): between two end hosts.

– Data link layer (below): between two physically connected hosts.

– Network layer: involves each and every host, router, and gateway in the network.

Page 2: The Network Layer

2

Architectural Approaches

• Connectionless - similar to postal system; endpoint puts data to send into a packet and hands to network for delivery

• Connection-oriented - similar to telephone system; endpoints establish and maintain a connection as long as they have data to exchange

Page 3: The Network Layer

3

Connectionless (Datagram) Service

• No connection established

• Source of data adds destination information to data and delivers to network

• Network delivers each data item individually

• No routes set up at connection establishment time - each packet may follow different route to destination (but typically won’t).

• No guarantee of reliable, or in-order delivery (although data link layer may still do link-by-link error control).

• Advantages:

– Robust with respect to node / link failures.

– Recovery at end to end (transport) level.

• Examples: IP

Page 4: The Network Layer

4

Connection-oriented Service

• One endpoint requests connection from network

• Other endpoint agrees to connection

• Computers exchange data through connection

• Typically uses a “stream” interface

• Source delivers stream of data to network

• Network breaks into packets for delivery

• Data transmission not necessarily continuous; like telephone, connection remains in place while no data transmitted

• One endpoint requests network to break connection when transmission is complete

• Examples: Asynchronous Transfer Mode (ATM), X.25

Page 5: The Network Layer

5

Connection duration and persistence

• Connections can be made on-demand or set up permanently

– Switched connection or switched virtual circuit

– Permanent connection or provisioned virtual circuit

• Permanent connections

– Originally hard-wired

– Now configured at system initialization

• Switched connections

– Computer maintains permanent connection to network

– Network makes connection on demand

Page 6: The Network Layer

6

Virtual circuits

• Virtual: acts like a circuit, but isn’t really one.

• “Reliable” delivery of packets between end hosts.

• All packets within connection follow the same route.

[Figure: nodes A, B, C, D, E and F; two VCs share link B-C]

Page 7: The Network Layer

7

Virtual circuits (2)

• At connection establishment time:

– Connection setup packet flows from sender to receiver.

– Routing tables updated at intermediate nodes to reflect new virtual circuit (VC).

– Fits well with quality of service (QoS) guarantees: reject call on path if QoS can’t be guaranteed.

– Potential difficulty: recovery from link or router failure.

Page 8: The Network Layer

8

Address and Connection Identifiers

• Asynchronous Transfer Mode (ATM) - 160-bit address, 28-bit connection identifier

– Connection identifier includes:

– 12-bit virtual path identifier (VPI)

– 16-bit virtual circuit identifier (VCI)

– Connection identifier local to each computer

– May be different in different parts of the ATM switch

• Address is a complete, unique identifier

• Connectionless delivery requires address on each packet

• Connection-oriented delivery can use a shorthand that identifies the connection rather than the destination
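The 28-bit connection identifier described above can be sketched as a simple bit-packing exercise. This is a minimal illustration, not an ATM implementation: the function names are hypothetical, and placing the VPI in the high-order bits is an assumption that follows the order in which the two fields are listed.

```python
VPI_BITS = 12  # virtual path identifier width
VCI_BITS = 16  # virtual circuit identifier width


def pack_connection_id(vpi, vci):
    """Combine a VPI and a VCI into one 28-bit connection identifier.

    Assumption: the VPI occupies the high-order 12 bits.
    """
    if not 0 <= vpi < (1 << VPI_BITS):
        raise ValueError("VPI must fit in 12 bits")
    if not 0 <= vci < (1 << VCI_BITS):
        raise ValueError("VCI must fit in 16 bits")
    return (vpi << VCI_BITS) | vci


def unpack_connection_id(conn_id):
    """Split a 28-bit connection identifier back into (VPI, VCI)."""
    return conn_id >> VCI_BITS, conn_id & ((1 << VCI_BITS) - 1)
```

The identifier is far shorter than a 160-bit address, which is why a connection-oriented network can carry it in every packet cheaply.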

Page 9: The Network Layer

9

Internetworking

• In the real world, computers are connected by many different technologies

• Internetworking is a scheme for interconnecting multiple networks of dissimilar technologies

• Uses both hardware and software

• Extra hardware positioned between networks

• Software on each attached computer

• System of interconnected networks is called an “internetwork” or an internet

Page 10: The Network Layer

10

Routers

• A router is a hardware component used to interconnect networks

• The router is the main layer 3 building block for large internets.

• A router has interfaces on multiple networks

• Networks can use different technologies

• Router forwards packets between networks

• Transforms packets as necessary to meet standards for each network

Page 11: The Network Layer

11

Internet Architecture

• An internetwork is composed of arbitrarily many networks interconnected by routers

• Routers can have more than two interfaces

Page 12: The Network Layer

12

A virtual network

[Figure: a virtual network spanning Net 1, Net 2 and Net 3]

• Internetworking software builds a single, seamless virtual network out of multiple physical networks

• Universal addressing scheme

• Universal service

• All details of physical networks hidden from users and application programs

Page 13: The Network Layer

13

A virtual network

[Figure: the underlying physical network, with routers joining Net 1, Net 2 and Net 3]


Page 14: The Network Layer

14

Internetworking Protocols

• TCP/IP is the most widely used internetworking protocol suite

– First internetworking protocol suite

– Initially funded through ARPA

– Picked up by NSF

• Others include IPX, VINES, AppleTalk

• TCP/IP is by far the most widely used

– Vendor and platform independent

Page 15: The Network Layer

15

Internet addresses

• One key aspect of virtual network is single, uniform address format

• Cannot use hardware addresses because different technologies have different address formats

• Address format must be independent of any particular hardware address format

• Sending host puts destination internet address in packet

• Destination address can be interpreted by any intermediate router

• Routers examine address and forward packet on to the destination

Page 16: The Network Layer

16

IP addresses

• Addressing in TCP/IP is specified by the Internet Protocol (IP)

• Each host is assigned a 32-bit number

• Called the IP address or Internet address

• Unique across entire Internet

• Each IP address is divided into a prefix and a suffix

• Prefix identifies network to which computer is attached

• Suffix identifies computer within that network

• Address format makes routing efficient

Page 17: The Network Layer

17

Network and Host Numbers

• Every network in a TCP/IP internet is assigned a network number.

• Each host on a specific network is assigned a host number or host address that is unique within that network.

• Host's IP address is the combination of the network number (prefix) and host address (suffix)

• Network numbers must be unique.

• Host addresses may be reused on different networks; combination of network number prefix and host address suffix will be unique.

• Assignment of network numbers must be coordinated globally; assignment of host addresses can be managed locally.

Page 18: The Network Layer

18

IP address format

• IP designers chose 32-bit addresses (see RFC 790)

• Allocate some bits for prefix, some for suffix

– Large prefix, small suffix - many networks, few hosts per network

– Small prefix, large suffix - few networks, many hosts per network

• Because of variety of technologies, need to allow for both large and small networks

• Designers chose a compromise - multiple address formats that allow both large and small prefixes

• Each format is called an address class

• Class of an address is identified by first four bits

Page 19: The Network Layer

19

Dotted Decimal Notation

• 32 bits divided into 4 octets

• Each octet is converted to decimal value

• Dots used to separate the 4 decimal values

• Examples:

32 bit binary number Dotted decimal

10000001 00110100 00000110 00000000 129.52.6.0

11000000 00000101 00110000 00000011 192.5.48.3

10000000 10000000 11111111 00000000 128.128.255.0
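The conversion in the table above can be reproduced with a few lines of code. This is a small sketch with hypothetical helper names; it assumes the binary form is written as four space-separated octets, as in the examples.

```python
def bits_to_dotted(bits):
    """Convert '10000001 00110100 ...' (four binary octets) to dotted decimal."""
    return ".".join(str(int(octet, 2)) for octet in bits.split())


def dotted_to_int(dotted):
    """Convert dotted decimal to the underlying 32-bit integer."""
    value = 0
    for part in dotted.split("."):
        value = (value << 8) | int(part)  # shift in one octet at a time
    return value
```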

Page 20: The Network Layer

20

IP addresses in C/C++

From /usr/include/netinet/in.h

/*
 * Internet address
 * This definition contains obsolete fields for
 * compatibility with SunOS 3.x and 4.2bsd. The
 * presence of subnets renders divisions into fixed
 * fields misleading at best. New code should use
 * only the s_addr field.
 */
struct in_addr {
    union {
        struct { u_char s_b1,s_b2,s_b3,s_b4; } S_un_b;
        struct { u_short s_w1,s_w2; } S_un_w;
        u_long S_addr;
    } S_un;
#define s_addr S_un.S_addr /* should be used for all code */
};

Page 21: The Network Layer

21

Useful function calls

unsigned long inet_addr( char* cp )

– Converts string with dotted address to 32 bit value

– Example: inet_addr("129.0.0.1")

socketAddress.sin_addr.s_addr = inet_addr( charIPAddress );

char* inet_ntoa(struct in_addr in)

– Converts 32 bit value of IP address to a string in dotted decimal format.
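For comparison, the same two conversions can be sketched in Python with the standard `socket` and `struct` modules. The wrapper names below are hypothetical, and unlike the C `inet_addr`, which returns the value in network byte order, this sketch returns an ordinary integer.

```python
import socket
import struct


def dotted_to_u32(dotted):
    """Parse a dotted decimal string into a 32-bit integer."""
    packed = socket.inet_aton(dotted)        # 4 octets, network byte order
    return struct.unpack("!I", packed)[0]


def u32_to_dotted(value):
    """Format a 32-bit integer as a dotted decimal string."""
    return socket.inet_ntoa(struct.pack("!I", value))
```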

Page 22: The Network Layer

22

IP Addresses in Java

• Class java.net.InetAddress

static InetAddress getByName(String host)

– Creates new instance of InetAddress based on a string address

– String can either be a dotted decimal IP address (e.g. “129.0.0.1”), or a host name

static InetAddress getByAddress(byte[] address)

– Creates new instance of InetAddress based on bytes containing the 4 values for the IP address

String getHostAddress( )

– Returns the IP address as a dotted decimal string

byte[] getAddress( )

– Returns the raw IP address as an array of bytes

Page 23: The Network Layer

23

IP Address Classes

Class | First bits | Rest of address | Address range

A | 0 | prefix in octet 1, suffix in octets 2-4 | 1.0.0.1 to 126.255.255.254

B | 10 | prefix in octets 1-2, suffix in octets 3-4 | 128.0.0.1 to 191.255.255.254

C | 110 | prefix in octets 1-3, suffix in octet 4 | 192.0.0.1 to 223.255.255.254

D | 1110 | multicast | 224.0.0.0 to 239.255.255.255

E | 1111 | reserved for future use | 240.0.0.0 to 254.255.255.255
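Because the class is encoded in the leading bits, it can be read off the first octet alone. The following is a minimal sketch with a hypothetical function name; it does no validation of the address string.

```python
def address_class(dotted):
    """Return the classful address class (A to E) from the leading bits."""
    first = int(dotted.split(".")[0])
    if first & 0x80 == 0x00:   # leading bit 0
        return "A"
    if first & 0xC0 == 0x80:   # leading bits 10
        return "B"
    if first & 0xE0 == 0xC0:   # leading bits 110
        return "C"
    if first & 0xF0 == 0xE0:   # leading bits 1110
        return "D"
    return "E"                 # leading bits 1111
```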

Page 24: The Network Layer

24

Special IP addresses

Prefix | Suffix | Type of address | Purpose

All 0s | All 0s | This computer | Used during rebooting

Network | All 0s | Network | Identifies a network

Network | All 1s | Directed broadcast | Broadcast on specified net

All 1s | All 1s | Limited broadcast | Broadcast on local net

127 | Any | Loopback | Testing

Page 25: The Network Layer

25

Allocation of IP address classes

Class | Bits in prefix | Maximum number of networks | Bits in suffix | Maximum number of hosts / network

A | 7 | 128 | 24 | 16777216

B | 14 | 16384 | 16 | 65536

C | 21 | 2097152 | 8 | 256

Page 26: The Network Layer

26

CIDR addresses

• CIDR = Classless Inter-Domain Routing

• Created to allow more flexibility in subnet sizes; in particular, different values between 256 and 65536

• Notation: IP address / # bits in prefix

• Usage:

– Set up 32 bit mask with indicated number of 1 bits followed by 0 bits

– Logical AND with mask and IP address to get network prefix
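The two usage steps above can be sketched directly with integer bit operations. The helper names are hypothetical and the addresses in the usage note are only sample values.

```python
def prefix_mask(prefix_len):
    """32-bit mask with prefix_len one bits followed by zero bits."""
    return (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF


def to_int(dotted):
    """Dotted decimal string to 32-bit integer."""
    value = 0
    for part in dotted.split("."):
        value = (value << 8) | int(part)
    return value


def to_dotted(value):
    """32-bit integer to dotted decimal string."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def network_prefix(dotted, prefix_len):
    """Logical AND of the address with the mask gives the network prefix."""
    return to_dotted(to_int(dotted) & prefix_mask(prefix_len))
```

For instance, `network_prefix("128.211.0.20", 28)` masks off the low four bits of the last octet and yields `"128.211.0.16"`.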

Page 27: The Network Layer

27

CIDR Example

• Example: allocate 2 sub-networks that can hold 14 hosts each

• Prefix calculated by logical AND:

• Network 1: 128.211.0.16 / 28 ← 28 bits in prefix

• Network 2: 128.211.0.32 / 28

• Mask is: 11111111 11111111 11111111 11110000

• Net 1: 10000000 11010011 00000000 0001––––

– Allows IP addresses 128.211.0.17 through 128.211.0.30, since suffix cannot be all 0s or all 1s.

• Net 2: 10000000 11010011 00000000 0010––––
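The usable host range of each /28 network can be checked with Python's standard `ipaddress` module. This is a small sketch; `usable_range` is a hypothetical helper name.

```python
import ipaddress


def usable_range(cidr):
    """Return (first host, last host, host count) for a CIDR network.

    hosts() skips the all-0s (network) and all-1s (broadcast) suffixes.
    """
    hosts = list(ipaddress.ip_network(cidr).hosts())
    return str(hosts[0]), str(hosts[-1]), len(hosts)
```

Each /28 leaves 4 suffix bits, so 16 addresses minus the two reserved suffixes gives the 14 hosts the example asks for.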

Page 28: The Network Layer

28

Routers and IP addressing

• IP address depends on network address

• What about routers - connected to two networks?

• IP address specifies an interface, or network attachment point, not a computer

• Router has multiple IP addresses - one for each interface

[Figure: routers joining Ethernet 131.108.0.0, Token Ring 223.240.129.0 and WAN 76.0.0.0; the router interface addresses shown are 131.108.99.5, 223.240.129.2, 223.240.129.17 and 76.0.0.17]

Page 29: The Network Layer

29

IP – Internet Protocol

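Bit positions: 0, 4, 8, 16, 19, 31

Word 1: Version (bits 0-3) | IHL (4-7) | Service type (8-15) | Total length (16-31)

Word 2: Identification (0-15) | Flags (16-18) | Fragment offset (19-31)

Word 3: Time to live (0-7) | Protocol (8-15) | Header checksum (16-31)

Word 4: Source address

Word 5: Destination address

Options (variable length, if any)

Data: up to 65,515 octets

Maximum packet size: 65,535 octets (the largest value of the 16-bit total length field)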

Page 30: The Network Layer

30

IP protocol fields

• Definition: RFC 791, plus subsequent additions

• Version: version number of protocol (currently 4; version 6 also standardized)

• Internet Header Length (IHL): number of 32-bit words in header

– Minimum value: 5 (which indicates no options)

– Larger values used when options are present.
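Version and IHL share the first octet of the header, four bits each, so both can be recovered with a shift and a mask. A minimal sketch, with a hypothetical function name:

```python
def split_version_ihl(first_octet):
    """Return (version, IHL in 32-bit words, header length in octets)."""
    version = first_octet >> 4        # high-order four bits
    ihl_words = first_octet & 0x0F    # low-order four bits
    return version, ihl_words, ihl_words * 4
```

A first octet of 0x45 therefore means IPv4 with the minimum five-word (20-octet) header and no options.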

Page 31: The Network Layer

31

IP protocol fields

• Type of service:

– Specifies precedence (bits 0-2), delay (bit 3), throughput (bit 4), reliability (bit 5) parameters

– 0 bit = normal, 1 bit = exceptional

• Total length: length of packet in octets

• Identification: sequence number

• Flags (3):

– More: indicates packet is a fragment, with more to come

– Don’t fragment: prohibits fragmentation

– (Reserved for future use)

Page 32: The Network Layer

32

IP Protocol Fields

• Fragment offset: indicates where in the original datagram this fragment belongs, measured in 64-bit units

– Note that this requires fragmentation to happen at 64-bit boundaries (except for the last fragment)

• Time to live: specifies, in seconds, time remaining before this packet expires

– Every router must decrease this value by at least one.

• Protocol: indicates protocol at next higher level

– Current list: http://www.iana.org/assignments/protocol-numbers

– Examples

– 1: ICMP Internet Control Message Protocol

– 6: TCP Transmission Control Protocol

– 17: UDP User Datagram Protocol

Page 33: The Network Layer

33

IP Protocol Fields

• Header checksum:

– 16-bit ones' complement of the ones'-complement sum of all 16-bit words in the header

– Checksum field is set to zero before computation

– Re-computed at each router

– Some fields, such as time-to-live, will change as message travels through network

• Source address: 32 bit IP address

• Destination address: 32 bit IP address
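The header checksum rule above can be sketched as follows. The function names are hypothetical, and the sketch assumes a 20-octet header without options, with the time-to-live in octet 8 and the checksum in octets 10-11 as in the header layout.

```python
import struct


def ones_complement_sum(data):
    """Ones'-complement sum of all 16-bit words in data."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


def header_checksum(header):
    """Checksum of a header whose checksum field is currently zero."""
    return ~ones_complement_sum(header) & 0xFFFF


def decrement_ttl(header):
    """What a router does: lower the TTL by one and recompute the checksum."""
    new = bytearray(header)
    new[8] -= 1                    # time to live
    new[10:12] = b"\x00\x00"       # zero the checksum before computing
    new[10:12] = struct.pack("!H", header_checksum(bytes(new)))
    return bytes(new)
```

A receiver can validate a header by summing all of its words, checksum included: the result is 0xFFFF when the header is intact.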

Page 34: The Network Layer

34

IP options

• Defined in RFC 791 and others

• Examples:

– Secure packet

– Routing information provided

– Record route

– Record time stamps

– Stream identifier

Page 35: The Network Layer

35

IP upper level interface

• Two service primitives: send and receive (recv)

Result = SEND(src,dst,prot,TOS,TTL,BufPTR,len,Id,DF,opt)

Result = RECV(BufPTR,prot,&src,&dst,&TOS,&len,&opt)

where:

– src = source address

– dst = destination address

– prot = protocol

– TOS = type of service

– TTL = time to live

– BufPTR = buffer pointer

– len = length of buffer

– Id = Identifier

– DF = Don't Fragment

– opt = option data

Page 36: The Network Layer

36

Internet Control Message Protocol (ICMP)

• Defined in RFC 792, plus updates

• Required for internet compliance

• Carried in IP packets

• ICMP messages often sent as a reply to IP packet

Bit positions: 0, 4, 8, 16, 31

Word 1: Type (bits 0-7) | Code (8-15) | Checksum (16-31)

Word 2: Parameters

Message content: variable length

Page 37: The Network Layer

37

ICMP message types

8: Echo

0: Echo reply

– Asks for return of this message for testing

– Parameters: identifier, sequence number

3: Destination unreachable

– Code indicates particular condition:

0: net unreachable

1: host unreachable

2: protocol unreachable

3: port unreachable

4: fragmentation required; don’t fragment flag set

5: source route failure

– Data: original IP header, plus first 64 bits of data

Page 38: The Network Layer

38

ICMP message types

4: Source quench

– Request to slow sending rate of IP packets

– Data: as in destination unreachable

5: Redirect

– Used to indicate a shorter routing path

– Parameters: IP address of suggested router

11: Time exceeded

– Time to live counter of IP packet reached zero

– Data: as in destination unreachable

12: Parameter problem

– Indicates problems with an IP message (usually bad option format)

– Data: as in destination unreachable

Page 39: The Network Layer

39

ICMP message types

13: Timestamp

– Sends message that records sending time, and asks for reply

– Data: sending time, reception time (to be filled in), reply sending time (to be filled in)

14: Timestamp reply

– Reply to timestamp request

– Data: values filled in from ICMP 13 message

17: Address mask request

– Host asks router on LAN for CIDR address mask (usually at reboot)

18: Address mask reply

– Reply to address mask request

– Data: the address mask

Page 40: The Network Layer

40

Network administration functions that use ICMP

• Ping: test if a host will respond

– Sends an ICMP echo message to designated host

– Host sends ICMP echo reply

– Used to test connectivity

– Many organizations have disabled ping to prevent denial-of-service attacks

• Traceroute: find route from source to destination

– Sends IP packet with time-to-live of 1

– First router will discard packet and send ICMP time exceeded message

– Next message sent has time-to-live of 2, and so on until destination is reached

– Each router en route will have sent an ICMP message
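The traceroute procedure above can be illustrated with a small simulation. Real traceroute needs raw sockets and elevated privileges, so this sketch models the network as a plain list of hop names; the functions, the path and the message strings are all hypothetical.

```python
def forward(path, ttl):
    """Send one packet along path (routers, then destination) with a given TTL.

    Returns (responder, message): the router that discarded the packet,
    or the destination if the packet got through.
    """
    for router in path[:-1]:
        ttl -= 1                       # every router decrements the TTL
        if ttl == 0:
            return router, "time exceeded"
    return path[-1], "reply from destination"


def traceroute(path):
    """Probe with TTL 1, 2, 3, ... and collect whoever answers each probe."""
    hops = []
    ttl = 1
    while True:
        responder, message = forward(path, ttl)
        hops.append(responder)
        if message != "time exceeded":
            return hops
        ttl += 1
```

Calling `traceroute(["R1", "R2", "R3", "host"])` reports each router in order and then the destination.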

Page 41: The Network Layer

41

Mapping IP addresses

• Problem: How to map IP addresses onto hardware?

– Address resolution

• Where this takes place: router attached to physical network.

• Three methods used to resolve addresses:

– Table lookup

– “Computation”

– Message exchange

Page 42: The Network Layer

42

Resolution using Table Lookup

• Router keeps table.

• The following could be a table for network 197.15.3.0 / 24

• To save space and time, only the host value of the IP address would be stored.

IP address (32 bits) Hardware address (48 bits)

197.15.3.2 0A:07:4B:12:82:36

197.15.3.3 0A:9C:28:71:32:8D

197.15.3.4 0A:11:C3:68:01:99

197.15.3.5 0A:74:59:32:CC:1F

197.15.3.6 0A:04:BC:00:03:28

197.15.3.7 0A:77:81:0E:52:FA
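The table above maps naturally onto a dictionary keyed by the host value only, as the space-saving bullet suggests. This is a sketch with hypothetical names; it assumes the /24 network shown, so the host value is simply the last octet.

```python
NETWORK_PREFIX = ["197", "15", "3"]   # network 197.15.3.0 / 24

# Host value (last octet) -> 48-bit hardware address
RESOLUTION_TABLE = {
    2: "0A:07:4B:12:82:36",
    3: "0A:9C:28:71:32:8D",
    4: "0A:11:C3:68:01:99",
    5: "0A:74:59:32:CC:1F",
    6: "0A:04:BC:00:03:28",
    7: "0A:77:81:0E:52:FA",
}


def resolve(ip_address):
    """Return the hardware address for ip_address, or None if unknown."""
    parts = ip_address.split(".")
    if parts[:3] != NETWORK_PREFIX:
        return None                    # not on this network
    return RESOLUTION_TABLE.get(int(parts[3]))
```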

Page 43: The Network Layer

43

Resolution using Computation

• If hardware addresses are configurable, they can be assigned to correspond with the host part of their IP address

– Example:

– host with IP address 229.123.1.1 is assigned hardware address 1;

– host with IP address 229.123.1.2 is assigned hardware address 2;

– … and so on.

• Computation: logical AND with value 000000FF.

hardware_address = ip_address & 0xff
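A minimal sketch of resolution by computation, assuming (as in the example) that each host's configurable hardware address equals the last octet of its IP address. The function names are hypothetical.

```python
def to_int(dotted):
    """Dotted decimal string to 32-bit integer."""
    value = 0
    for part in dotted.split("."):
        value = (value << 8) | int(part)
    return value


def hardware_address(dotted):
    """Keep only the low-order octet: ip_address & 0xff."""
    return to_int(dotted) & 0xFF
```

No table and no network traffic are needed, which is the appeal of this method when hardware addresses can be chosen freely.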

Page 44: The Network Layer

44

Resolution using Message Exchange

• Example: Ethernet Address Resolution Protocol (ARP)

– See RFC 826

• Router sends broadcast ARP message to LAN to query hosts as to who matches the IP address

– Only the host with the matching IP address replies directly to router

– Router then has hardware address

Page 45: The Network Layer

45

ARP message format

• There is a generic format in RFC 826

• The following is specific for Ethernet: 32 bit protocol (P) addresses and 48 bit hardware (H) addresses

Bit positions: 0, 8, 16, 31

Word 1: Hardware address type: 0001 (bits 0-15) | Protocol address type: 0800 (16-31)

Word 2: H. addr. length (0-7) | P. addr. length (8-15) | Operation (16-31)

Word 3: Sender’s hardware address, part 1

Word 4: Sender’s H. address pt. 2 (0-15) | Sender’s P. address pt. 1 (16-31)

Word 5: Sender’s P. address pt. 2 (0-15) | Target H. address pt. 1 (16-31)

Word 6: Target hardware address, part 2

Word 7: Target protocol address
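The Ethernet/IPv4 form of the ARP message can be packed with the standard `struct` module. This is a sketch rather than a working ARP client: the function names and sample addresses are hypothetical, the operation code 1 denotes a request per RFC 826, and the target hardware address is left as zeros because it is not yet known.

```python
import socket
import struct

# htype, ptype, hlen, plen, operation, sender H, sender P, target H, target P
ARP_FORMAT = "!HHBBH6s4s6s4s"


def mac_bytes(text):
    """'0A:07:4B:12:82:36' -> six raw octets."""
    return bytes.fromhex(text.replace(":", ""))


def build_arp_request(sender_mac, sender_ip, target_ip):
    """Build the 28-octet ARP request asking who has target_ip."""
    return struct.pack(
        ARP_FORMAT,
        0x0001,                        # hardware address type: Ethernet
        0x0800,                        # protocol address type: IP
        6,                             # hardware address length in octets
        4,                             # protocol address length in octets
        1,                             # operation: request
        mac_bytes(sender_mac),
        socket.inet_aton(sender_ip),
        b"\x00" * 6,                   # target hardware address: unknown
        socket.inet_aton(target_ip),
    )
```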

Page 46: The Network Layer

46

Transmission of ARP messages

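Ethernet frame carrying an ARP packet (field sizes in octets):

Preamble: 7 | SFD: 1 | Dest. Addr.: 6 | Source Addr.: 6 | Frame type: 2 (0806) | data: 46 – 1500 | CRC: 4

The data field holds the ARP packet (28 octets) followed by padding (18 octets)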

Page 47: The Network Layer

47

IP Fragmentation and Reassembly

• Construction of an IP packet requires obeying maximum frame sizes at each data link layer

– MTU: maximum transmission unit

– Example: IP packet carried inside an Ethernet frame (see next slide) can have, at most, 1480 octets of user data + 20 octets of IP header = 1500

• RFC 791 says any part of the internet must have an MTU of at least 68 octets

– Any host must be able to receive 576 octets (possibly in fragments)

• If the IP “don’t fragment” flag is set, and there is more data than the MTU allows, a router will trash the IP packet and send an ICMP message (more on this later).

• Otherwise, router has to separate user data into fragments of allowable size.

• Fragmentation can be done at any router; reassembly is only done at final destination.

Page 48: The Network Layer

48

Example of MTU: Ethernet frames

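Ethernet frame carrying an IP packet (field sizes in octets):

Preamble: 7 | SFD: 1 | Dest. Addr.: 6 | Source Addr.: 6 | Frame type: 2 (0800) | data: 46 – 1500 | CRC: 4

The data field holds the IP packet, at most 1500 octets ( = MTU):

IP: 12 | Source Addr.: 4 | Dest. Addr.: 4 | Layer 4 data: 24 – 1480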

Page 49: The Network Layer

49

Example of Fragmented Data

User data: 2276 octets

With an MTU of 1500, this could be sent as:

IP header (20 octets): TL=1500, FO=0, more=1 | User data: 1480 octets

IP header (20 octets): TL=816, FO=185, more=0 | User data: 796 octets

TL = total length, FO = frame offset (in 8-octet/64-bit units)

Page 50: The Network Layer

50

IP Fragmentation

• The frame offset is used instead of a “fragment sequence number” because this allows for further fragmentation at a subsequent router

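MTU = 820: the first fragment is split again and the second passes through unchanged

TL=1500, FO=0, more=1 (1480 octets of data) becomes:

TL=820, FO=0, more=1 (800 octets of data)

TL=700, FO=100, more=1 (680 octets of data)

TL=816, FO=185, more=0 (796 octets of data) is unchanged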
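The fragmentation arithmetic in these examples can be sketched as a single function. The function and its parameter names are hypothetical; it assumes a fixed 20-octet header and models only lengths and offsets, not the data itself.

```python
def fragment(data_len, mtu, header_len=20, base_offset=0, more_after=False):
    """Split data_len octets of user data into fragments that fit the MTU.

    base_offset and more_after describe the piece being split, so that an
    existing fragment can be fragmented again at a later router.
    Returns a list of dicts with TL (total length), FO (offset in 8-octet
    units), more (flag) and data (octets carried).
    """
    # Every fragment except the last must carry a multiple of 8 octets.
    max_data = (mtu - header_len) // 8 * 8
    fragments = []
    sent = 0
    while sent < data_len:
        size = min(max_data, data_len - sent)
        is_last = sent + size == data_len
        fragments.append({
            "TL": size + header_len,
            "FO": base_offset + sent // 8,
            "more": 0 if (is_last and not more_after) else 1,
            "data": size,
        })
        sent += size
    return fragments
```

Because each piece records its offset into the original datagram, calling `fragment(1480, 820, more_after=True)` on the first fragment produces pieces the destination can still place correctly.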

Page 51: The Network Layer

51

Reassembly

• Reassembly is only done at the destination

– i.e. host with IP address in destination field

• Fragments are reassembled based on matching source address, destination address, identification field (sequence number), and protocol

• A reassembly timer is often used as the holding time for resources while waiting for all fragments

– Timer started when first fragment arrives.

– Timer cancelled when contiguous data from frame offset 0, to a fragment where the ‘more’ flag is 0 has arrived.

– If timer expires, buffer is released and fragments are trashed (and ICMP “time exceeded” message returned).

• Alternative: use ‘Time to live’ field of first fragment

Page 52: The Network Layer

52

IP Version 6 (IPv6)

• Defined in RFC 2460 and others

• Enhancements:

– 128 bit addresses

– Revised (incompatible) base header format

– Extension headers used for additional information

– Support for Quality of Service specification

– Extensibility

– Modifications to accommodate faster routing

Page 53: The Network Layer

53

IPv6 addresses

• IPv4 addresses have first 96 bits as 0 in IPv6

• New shorthand notation: colon hexadecimal

105.220.136.100.255.255.255.255.0.0.18.128.140.10.255.255

becomes

69DC:8864:FFFF:FFFF:0:1280:8C0A:FFFF

FF0C:0:0:0:0:0:0:B1

becomes

FF0C::B1

• In IPv6, an IP address is assigned to an interface, not a node

– One device can have 2 or more IPv6 addresses on the same network

– Intended to speed routing of packets

– Example: one address could be the “higher priority” interface.
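The colon hexadecimal shorthand shown earlier on this slide can be reproduced with Python's standard `ipaddress` module. The helper names are hypothetical; the module prints lowercase hexadecimal, so the sketch upper-cases the result to match the slide's notation.

```python
import ipaddress


def colon_hex(dotted16):
    """Convert 16 dot-separated decimal octets to colon hexadecimal."""
    octets = bytes(int(part) for part in dotted16.split("."))
    return str(ipaddress.IPv6Address(octets)).upper()


def compress(colon_form):
    """Replace the longest run of zero groups with '::'."""
    return ipaddress.IPv6Address(colon_form).compressed.upper()
```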

Page 54: The Network Layer

54

IPv6 multiple headers

• Each extension header will identify its own length, as well as the type of extension header (“next header”) or data that follows.

[Figure: IPv6 base header (40 octets) | Extension 1 … Extension N (optional) | data]

Page 55: The Network Layer

55

IPv6 Base Header

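Bit positions: 0, 4, 12, 16, 24, 31

Word 1: Version (bits 0-3) | Traffic class (4-11) | Flow label (12-31)

Word 2: Payload length (0-15) | Next header (16-23) | Hop limit (24-31)

Source address (128 bits)

Destination address (128 bits)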

Page 56: The Network Layer

56

IPv6 base header fields (1)

• Version: 6

• Traffic class:

– Available for establishing classes or priorities for packet handling

– First 6 bits: differentiated services field

– Last 2 bits: reserved for congestion notification (not yet standardized)

• Flow label: identifier for a sequence of packets from a single source, and with similar transmission requirements

– Example: one flow could identify a specific video transmission

Page 57: The Network Layer

57

IPv6 base header fields (2)

• Payload length (in octets):

– Length of all extension headers plus upper layer data

– Does not include the fixed header.

• Next header: identifies type of header following this header

– Could indicate upper level protocol, or IPv6 extension header

– Values are the protocol numbers defined in: http://www.iana.org/assignments/protocol-numbers

Page 58: The Network Layer

58

IPv6 base header fields (3)

• Hop limit: after visiting this many routers, packet will be discarded.

• Source, destination addresses

– Destination address may not be packet’s ultimate destination

– Available modes:

– Unicast: single destination

– Anycast: choose one destination from a list

– Multicast: specific group of destinations

– Broadcast (to everyone): not a separate IPv6 address type; multicast to an all-nodes group serves this purpose

Page 59: The Network Layer

59

Extension headers

• Recommended order of appearance:

– IPv6 base (required)

– Hop-by-hop options (next header = 0)

– Destination options (next header = 60)

– To be processed by first destination in IPv6 header, plus destinations in routing header.

– Routing header (next header = 43)

– Fragmentation header (next header = 44)

– Authentication (next header = 51)

– Security / Encapsulation (next header = 50)

– Destination options (next header = 60)

– For packet’s final destination

– Upper layer protocol (next header = 6 for TCP, 17 for UDP, 58 for ICMPv6, 41 for IPv6 inside IPv6)

Page 60: The Network Layer

60

Hop-by-Hop Options

• “Jumbo payload”: packet is larger than 65,535 octets

– Payload length in fixed header must be zero

– No fragment header

• “Router alert”: information should be examined by each router along the way

– Example: using a protocol such as the Resource reSerVation Protocol (RSVP) to set up quality of service parameters.

Page 61: The Network Layer

61

Fragmentation in IPv6

• An extension header, the “fragment header” contains the fragmentation information not contained in the base header

• All fragmentation in IPv6 must be done by original sender

– This means that the sender has to discover the minimum MTU for the entire transmission.

– Find MTU by sending progressively smaller ICMP “echo” messages with “don’t fragment” set, until an ICMP “echo reply” is returned instead of “destination unreachable”

– IPv6 has the rule that networks must have an MTU of at least 1280 octets

Page 62: The Network Layer

62

Authentication Codes

• Message Authentication Code (MAC):

– carried in authentication header.

• Assume that sender A and receiver B have a shared secret key, KAB.

• MAC = f(KAB, M), where f is a mutually-agreed encryption function

• Receiving the correct MAC means:

– receiver knows that message is not altered.

– message is from correct sender

– sequence of message is correct
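The MAC computation above can be sketched with Python's standard `hmac` module. The choice of HMAC-SHA256 for the function f is an assumption made for illustration, as are the function names; the slide does not say which function the authentication header uses. The sequence guarantee holds only if a sequence number is part of the message M.

```python
import hashlib
import hmac


def make_mac(shared_key, message):
    """MAC = f(K_AB, M), with f assumed here to be HMAC-SHA256."""
    return hmac.new(shared_key, message, hashlib.sha256).digest()


def verify_mac(shared_key, message, received_mac):
    """Receiver recomputes the MAC and compares in constant time."""
    expected = make_mac(shared_key, message)
    return hmac.compare_digest(expected, received_mac)
```

A message altered in transit, or one produced by a party without the shared key, fails verification.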

Page 63: The Network Layer

63

Congestion

• Congestion occurs when the number of packets being transmitted through the network approaches the packet handling capacity of the network

• Congestion control aims to keep number of packets below level at which performance falls off dramatically

• Data network is a network of queues

• Generally 80% utilization is critical

• Finite queues mean data may be lost

Page 64: The Network Layer

64

Queues at a Node

[Figure: a node with input (in) and output (out) queues]

Page 65: The Network Layer

65

Router Packet Handling

• Packets arriving are stored at input buffers

• Routing decision made

• Packet moves to output buffer

• Packets queued for output transmitted as fast as possible

• If packets arrive too fast to be routed, or to be output, buffers will fill and overflow.

– Can discard packets

– Can use flow control

– Can propagate congestion through network

Page 66: The Network Layer

66

Congestion Principles

• Usually occurs at a point of transition to reduced throughput.

• Occurs when the higher capacity part of a system is currently carrying more traffic than the lower capacity part can handle.

• Difference from flow control:

– Flow control is one sender agreeing not to overflow one receiver at the endpoints of a transmission

– Congestion is usually caused by multiple senders, and occurs at an intermediate point in the network

– This makes congestion more difficult to detect, and to alleviate.

Page 67: The Network Layer

67

Implicit Congestion Detection

• What are the signs of congestion?

– Increased transmission time

– Packets spend more time in queues that are longer: delay increases

– Disappearance of packets

– On a fibre-based network (or ones with data link error control), disappearance of packets can be interpreted as a sign of congestion.

– Sending timers (at transport layer) start expiring.

Page 68: The Network Layer

68

Interaction of Queues

Page 69: The Network Layer

69

Idealized Performance

• Network can accept load up to its capacity

• Additional load will be delivered at capacity throughput rates.

– Packets are queued up at intermediate points

Page 70: The Network Layer

70

Idealized Performance: Throughput

[Plot: normalized throughput versus normalized load]

Page 71: The Network Layer

71

Idealized Performance: Delay

[Plot: delay versus normalized load]

Page 72: The Network Layer

72

Practical Performance

[Plots: delay versus load; normalized throughput versus load]

Page 73: The Network Layer

73

Practical Performance

• Ideal assumes infinite buffers and no overhead

• Buffers are finite

• Overheads occur in exchanging congestion control messages

Page 74: The Network Layer

74

The Congestion Control Paradox

• When congestion occurs, the problem is that there are too many packets in the network

• If packets are trashed, senders will likely resend them, along with new packets.

– Result: increased congestion

• If one node sends out messages to announce it is congested, then it increases the number of extra overhead packets in the network.

– Result: increased congestion

• If one node asks its neighbours to slow down, then the output queues of the neighbouring nodes will start filling up.

– Result: increased congestion

Page 75: The Network Layer

75

Congestion Control

• Implicit

– No action taken

– It is assumed senders will notice evidence of congestion and deal with it themselves.

– What can senders do?

– Slow rate of packet sending

– Increase timeout length for sent packets

• Explicit

– Various mechanisms to announce or alleviate congestion, taken by intermediate network nodes.

Page 76: The Network Layer

76

Implicit Congestion Signaling

• Transmission delay may increase with congestion

• Packet may be discarded

• Source can detect these as implicit indications of congestion

• Useful on connectionless (datagram) networks

– Example: IP leaves congestion (and flow) control to upper layer (normally TCP).

• Used in frame relay LAPF

Page 77: The Network Layer

77

Explicit Congestion Signaling

• Network alerts end systems of increasing congestion

• End systems take steps to reduce offered load

• Backwards

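– Congestion avoidance is required for traffic flowing in the opposite direction to the packet carrying the alert

• Forwards

– Congestion avoidance is required for traffic flowing in the same direction as the packet carrying the alert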

Page 78: The Network Layer

78

Backpressure

• If node becomes congested it can slow down or halt flow of packets from other nodes

• May mean that other nodes have to apply control on incoming packet rates

• Propagates back to source

• Can restrict to logical connections generating most traffic

• Used in connection-oriented networks that allow hop-by-hop congestion control (e.g. X.25)

• Not used in ATM nor frame relay

• Only recently developed for IP

Page 79: The Network Layer

79

Choke Packet

• Control packet

– Generated at congested node

– Sent to source node

– e.g. ICMP source quench

– From router or destination

– Source cuts back until no more source quench message

– Sent for every discarded packet, or in anticipation of congestion

• Rather crude mechanism

Page 80: The Network Layer

80

Categories of Explicit Signaling

• Binary

– A bit set in a packet indicates congestion

• Credit based

– Indicates how many packets source may send

– Common for end to end flow control

• Rate based

– Supply explicit data rate limit

– e.g. ATM

Page 81: The Network Layer

81

TCP Slow Start

[Figure: congestion window (bytes, axis from 0 to 45056) plotted against transmission number, with Threshold 1, Threshold 2, and Timeout marked]

Page 82: The Network Layer

82

Rate-based Congestion Control

• Regulate rate at which sender can inject packets into network:

• A packet must match up with (and remove) a token before entering network.

• Tokens added to bucket at rate r.

• At most b tokens can accumulate in bucket; tokens overflow and are lost after that

– Bucket size b controls “burstiness”

• Max. number of packets entering network in [ t, t + δ ] is b + δr

[Figure: tokens arrive at a fixed rate into a “bucket” with storage for up to b tokens; packets in the packet waiting area each take a token before passing to the network]
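The token bucket rules on this slide can be sketched in a few lines of Python. This is a minimal illustration, not a production shaper: the class name, the choice to start with a full bucket, and the use of one token per packet are assumptions made for the example.

```python
class TokenBucket:
    """Token bucket with fill rate r (tokens/s) and capacity b (tokens)."""

    def __init__(self, rate, capacity):
        self.rate = rate          # r: tokens added per second
        self.capacity = capacity  # b: maximum tokens the bucket can hold
        self.tokens = capacity    # assumption: the bucket starts full
        self.last = 0.0           # time of the last update, in seconds

    def allow(self, now):
        """Return True if a packet arriving at time `now` may enter the network."""
        elapsed = now - self.last
        self.last = now
        # Add tokens for the elapsed time; anything beyond b overflows and is lost.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1      # the packet matches up with (and removes) a token
            return True
        return False              # no token available: the packet must wait


bucket = TokenBucket(rate=2, capacity=3)
# A burst of 5 packets at t = 0: only b = 3 of them find a token.
burst = [bucket.allow(0.0) for _ in range(5)]
# One second later, r * 1 = 2 new tokens have arrived.
later = [bucket.allow(1.0) for _ in range(3)]
print(burst, later)
```

In this run five packets enter the network during the interval [0, 1], which equals the bound b + δr = 3 + 1 × 2.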

Page 83: The Network Layer

83

Congestion Control in Packet Switched Networks

• Send control packet to some or all source nodes

– Requires additional traffic during congestion

• Rely on routing information

– May react too quickly

• End to end probe packets

– Adds to overhead

• Add congestion info to packets as they cross nodes

– Either backwards or forwards

Page 84: The Network Layer

84

Traffic Management

• Fairness

• Quality of service

– May want different treatment for different connections

– What is more critical: delay or loss?

• Reservations

– e.g. ATM (Asynchronous Transfer Mode)

– Traffic contract between user and network

Page 85: The Network Layer

85

Case Study: ATM Traffic Management

• ATM standards specify several service categories

• Network traffic is managed to achieve Quality of Service (QoS) goals

• For each of the service categories (on subsequent slides):

– What is the highest priority for QoS?

– Delay

– Loss

– What would be a congestion control / avoidance strategy?

Page 86: The Network Layer

86

ATM Service Categories

• Real time

– Constant bit rate (CBR)

– Real time variable bit rate (rt-VBR)

• Non-real time

– Non-real time variable bit rate (nrt-VBR)

– Available bit rate (ABR)

– Unspecified bit rate (UBR)

Page 87: The Network Layer

87

Real Time Services

• QoS parameters:

– Amount of delay

– Variation of delay (jitter)

Page 88: The Network Layer

88

CBR: Constant Bit Rate

• Fixed data rate continuously available

• Tight upper bound on delay

• Uncompressed audio and video

– Video conferencing

– Interactive audio

– Audio / video distribution and retrieval

Page 89: The Network Layer

89

rt-VBR: Real-time Variable Bit Rate

• Time sensitive application

– Tightly constrained delay and delay variation

• rt-VBR applications transmit at a rate that varies with time

• Example: compressed video

– Produces varying sized image frames

– Original (uncompressed) frame rate constant

– So compressed data rate varies

• Can statistically multiplex connections

Page 90: The Network Layer

90

nrt-VBR: Non-real-time Variable Bit Rate

• May be able to characterize expected traffic flow

• Improve Quality of Service (QoS) in loss and delay

• End system specifies:

– Peak cell rate

– Sustainable or average rate

– Measure of how bursty traffic is

• e.g. Airline reservations, banking transactions

Page 91: The Network Layer

91

UBR: Unspecified Bit Rate

• May be additional capacity over and above that used by CBR and VBR traffic

– Not all resources dedicated

– Bursty nature of VBR

• For application that can tolerate some cell loss or variable delays

– e.g. TCP based traffic

• Cells forwarded on FIFO basis

• Best efforts service

Page 92: The Network Layer

92

ABR: Available Bit Rate

• Application specifies peak cell rate (PCR) and minimum cell rate (MCR)

• Resources allocated to give at least MCR

• Spare capacity shared among all ABR sources

• e.g. LAN interconnection

Page 93: The Network Layer

93

Asynchronous Transfer Mode (ATM)

• Properties of ATM:

– Small, fixed-sized packets, called “cells”

– ATM networks are connection-oriented: a connection must be set up at the start of a call

– Set up a “virtual channel” (VC) within a “virtual path” (VP)

– Subsequent cells will follow the same route to destination

– Control signaling on separate channel from user data

– Cell delivery is not guaranteed, but cell order is preserved

– Traffic management is taken into account when setting up a connection.

– High speed: data rates up to 622.08 Mbits / s

Page 94: The Network Layer

94

ATM Reference Model

[Figure: ATM reference model. The physical layer, ATM layer, and ATM adaptation layer sit beneath the upper layers of the control plane and the user plane, with layer management and plane management alongside]

• ATM layer is approximately equivalent to the OSI network layer

Page 95: The Network Layer

95

Reference Model Layers

• Physical layer:

– Handles equivalent of OSI physical and data link layers

• ATM layer

– Deals with cells, and cell transport

– Defines cell layout, and header fields

– Establishment and release of virtual circuits

– Congestion control

• AAL: ATM adaptation layer

– Provides for transmission of packets larger than a cell.

– Various AAL protocols deal with different ATM service categories (CBR, etc.)

Page 96: The Network Layer

96

Reference Model Planes

• User plane

– Provides for user information transfer

• Control plane

– Call and connection control

• Management plane

– Plane management

– Whole system functions

– Layer management

– Resources and parameters in protocol entities

Page 97: The Network Layer

97

ATM Connection Setup

• Performed in control plane: VP0, VC5

• ITU protocol Q.2931

[Figure: message sequence between the endpoints and intermediate switches: setup, call proceeding, connect, connect ack, then release and release complete]

Page 98: The Network Layer

98

ATM Cells

• Fixed size: 53 octets

– 5 octet header

– 48 octet information field

• Small cells reduce queuing delay for high priority cells

• Small cells can be switched more efficiently

• Easier to implement switching of small cells in hardware

Page 99: The Network Layer

99

ATM Cell Format

• Ordered transmission of 53 octet cells

• 5 octet header identifies virtual path and virtual channel, which together comprise a “connection identifier”

VPI: virtual path identifier - used for routing

VCI: virtual channel identifier - identifies transmissions within

PTI: payload type

CLP: cell loss priority

HEC: header error check

Field:  VPI | VCI | PTI | CLP | HEC | upper level data
Bits:   12  | 16  | 3   | 1   | 8   | 384 (= 48 octets)
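As a sketch of how the five header octets fit together, the following Python packs and unpacks the 12/16/3/1/8-bit fields listed on this slide. The function names are hypothetical, and the HEC is simply passed in as a value rather than computed.

```python
def pack_header(vpi, vci, pti, clp, hec=0):
    """Pack VPI(12) VCI(16) PTI(3) CLP(1) HEC(8) into 5 octets, most significant first."""
    assert 0 <= vpi < 2**12 and 0 <= vci < 2**16
    assert 0 <= pti < 2**3 and clp in (0, 1) and 0 <= hec < 2**8
    # 40 bits in total: each field is shifted past the fields that follow it.
    word = (vpi << 28) | (vci << 12) | (pti << 9) | (clp << 8) | hec
    return word.to_bytes(5, "big")


def unpack_header(header):
    """Recover the header fields from 5 octets."""
    word = int.from_bytes(header, "big")
    return {
        "vpi": word >> 28,
        "vci": (word >> 12) & 0xFFFF,
        "pti": (word >> 9) & 0x7,
        "clp": (word >> 8) & 0x1,
        "hec": word & 0xFF,
    }


header = pack_header(vpi=5, vci=33, pti=0b010, clp=1)
print(header.hex(), unpack_header(header))
```

A full cell would append the 48-octet information field to these 5 octets, giving the fixed 53-octet size.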

Page 100: The Network Layer

100

User – Network Interface (UNI) cell

• First 4 bits of virtual path identifier used as a flow control field for a cell entering the network

• Will be overwritten by first router

GFC: generic flow control

Field:  GFC | VPI | VCI | PTI | CLP | HEC | upper level data
Bits:   4   | 8   | 16  | 3   | 1   | 8   | 384 (= 48 octets)

Page 101: The Network Layer

101

ATM payload type field

• Three bits:

0 0 0: User data cell type 0, no congestion

0 0 1: User data cell type 1, no congestion

0 1 0: User data cell type 0, congestion

0 1 1: User data cell type 1, congestion

1 0 0: Operation / administration / maintenance (OAM) message, this hop

1 0 1: OAM message, end to end

1 1 0: Resource management cell

1 1 1: Reserved for future use

Page 102: The Network Layer

102

ATM Traffic Management

• High speed, small cell size, limited overhead bits

• Still evolving

• Requirements

– Majority of traffic not amenable to flow control

– Feedback slow due to reduced transmission time compared with propagation delay

– Wide range of application demands

– Different traffic patterns

– Different network services

– High speed switching and transmission increases volatility

Page 103: The Network Layer

103

Latency/Speed Effects

• ATM 622.08 Mbps

• ~6.8 × 10^-7 seconds to insert single cell

• Time to traverse network depends on propagation delay, switching delay

• Assume propagation at two-thirds speed of light

• If source and destination on opposite sides of Canada, propagation time ~ 2.75 × 10^-2 seconds

• Given implicit congestion control, by the time dropped cell notification has reached source, 1.7 × 10^7 bits have been transmitted

• So, this is not a good strategy for ATM
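The numbers on this slide can be reproduced with a short calculation. The path length of 5500 km is an assumption chosen to match the stated propagation time; it does not appear on the slide.

```python
CELL_BITS = 53 * 8           # one ATM cell is 53 octets
LINK_RATE = 622.08e6         # link data rate in bits per second
DISTANCE_M = 5.5e6           # assumed path length across Canada, in metres
PROPAGATION_SPEED = 2.0e8    # about two-thirds of the speed of light, in m/s

insertion_time = CELL_BITS / LINK_RATE              # time to put one cell on the link
propagation_time = DISTANCE_M / PROPAGATION_SPEED   # one-way delay
bits_in_flight = LINK_RATE * propagation_time       # bits sent during that delay
cells_in_flight = bits_in_flight / CELL_BITS

print(f"insertion time:   {insertion_time:.2e} s")
print(f"propagation time: {propagation_time:.2e} s")
print(f"bits sent before the source can react: {bits_in_flight:.2e}")
print(f"which is roughly {cells_in_flight:.0f} cells")
```

The propagation delay is tens of thousands of cell times, which is why waiting for a loss signal to reach the source is too slow.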

Page 104: The Network Layer

104

Cell Delay Variation

• For ATM voice/video, data is a stream of cells

• Delay across network must be short

• Rate of delivery must be constant

• There will always be some variation in transit

• Delay cell delivery to application so that constant bit rate can be maintained to application

Page 105: The Network Layer

105

Network Contribution to Cell Delay Variation

• Packet switched networks in general

– Queuing delays

– Routing decision time

• ATM

– ATM protocol designed to minimize processing overheads at switches

– ATM switches have very high throughput

– Only noticeable delay is from congestion

– Must not accept load that causes congestion

Page 106: The Network Layer

106

Cell Delay Variation At The User-Network Interface

• Application produces data at fixed rate

• Processing at three layers of ATM causes delay

– Interleaving cells from different connections

– Operation and maintenance cell interleaving

– If using synchronous digital hierarchy frames, these are inserted at physical layer

– Cannot predict these delays

Page 107: The Network Layer

107

Traffic and Congestion Control Framework

• ATM layer traffic and congestion control should support QoS classes for all foreseeable network services

• Should not rely on AAL protocols that are network specific, nor higher level application specific protocols

• Should minimize network and end to end system complexity

Page 108: The Network Layer

108

Timings Considered

• Cell insertion time

• Round trip propagation time

• Connection duration

• Long term

• Determine whether a given new connection can be accommodated

• Agree performance parameters with subscriber

Page 109: The Network Layer

109

Traffic Management and Congestion Control Techniques

• Resource management using virtual paths

• Connection admission control

• Usage parameter control

• Selective cell discard

• Traffic shaping

– Use the token bucket scheme for rate-based congestion control.

Page 110: The Network Layer

110

Resource Management Using Virtual Paths

• Separate traffic flow according to service characteristics

• User to user application

• User to network application

• Network to network application

• Concern with:

– Cell loss ratio

– Cell transfer delay

– Cell delay variation

Page 111: The Network Layer

111

Connection Admission Control

• First line of defense

• User specifies traffic characteristics for new connection by selecting a QoS

• Network accepts connection only if it can meet the demand

• Traffic contract

– Peak cell rate

– Cell delay variation

– Sustainable cell rate

– Burst tolerance

Page 112: The Network Layer

112

Usage Parameter Control

• Protection of network resources from overload by one connection

• Monitor connection to ensure traffic conforms to contract

– Monitor peak cell rate

– Measure cell delay variation

– Determine average cell rate

– Track burst sizes

• Discard cells that do not conform to traffic contract

– Called traffic policing

Page 113: The Network Layer

113

ATM-ABR Traffic Management

• Some applications (Web, file transfer) do not have well defined traffic characteristics

• Best efforts

– Allow these applications to share unused capacity

– If congestion builds, cells are dropped

• Closed loop control

– ABR connections share available capacity

– Share varies between minimum cell rate (MCR) and peak cell rate (PCR)

– ABR flow limited to available capacity by feedback

– Buffers absorb excess traffic during feedback delay

– Low cell loss

Page 114: The Network Layer

114

Feedback Mechanisms

• Transmission rate characteristics:

– Allowed cell rate

– Minimum cell rate

– Peak cell rate

– Initial cell rate

• Start with ACR=ICR

• Adjust ACR based on feedback from network

– Resource management cells

– Congestion indication bit

– No increase bit

– Explicit cell rate field
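One way a source might combine these feedback fields is sketched below. The rate increase factor (rif) and rate decrease factor (rdf) do not appear in the slides; they are assumed here, with illustrative values, as the step sizes for raising and lowering the allowed cell rate. Treat this as an outline of the idea rather than the standardized source behaviour.

```python
def adjust_acr(acr, ci, ni, er, mcr, pcr, rif=1 / 16, rdf=1 / 16):
    """Return the new allowed cell rate after one returned resource management cell.

    acr: current allowed cell rate     ci: congestion indication bit
    ni:  no increase bit               er: explicit cell rate field
    mcr, pcr: minimum and peak cell rates agreed for the connection
    rif, rdf: assumed increase / decrease factors (hypothetical values)
    """
    if ci:
        acr -= acr * rdf          # congestion reported: back off multiplicatively
    elif not ni:
        acr += rif * pcr          # no congestion and increase permitted
    acr = min(acr, er, pcr)       # never exceed the explicit rate or the peak rate
    return max(acr, mcr)          # never drop below the guaranteed minimum


acr = 100.0                        # start with ACR = ICR (illustrative value)
acr = adjust_acr(acr, ci=False, ni=False, er=1000, mcr=10, pcr=160)
print(acr)
```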

Page 115: The Network Layer

115

Routers

• The main function of a router is to decide how best to forward each packet, based on the packet's destination network address.

• Action: look up identifier in a routing table, and forward packets to appropriate outgoing link, or to upper layer if applicable.

[Figure: nodes A, B, C, and D]

Page 116: The Network Layer

116

Properties Desired for Routing

• Correctness: send packet “closer” to destination

• Simplicity: less error-prone, faster

• Robustness: ability to react to changes

• Stability: routing algorithms should converge to a stable state

• Fairness: guarantee that packets are not held up indefinitely

• Performance: speed, throughput

• Scalability: can deal with ever-increasing number of network nodes

• Security: filtering of malicious activity

Page 117: The Network Layer

117

Performance Criteria

• Used for selection of route

• Criterion is used to measure the “least cost” route

• Cost could be…

– Number of hops

– $ price of link

– Delay time

– Suitability for QoS requirements

Page 118: The Network Layer

118

Costing of Routes

[Figure: example network of six nodes, numbered 1 to 6, with a cost labelled on each link]

Page 119: The Network Layer

119

Routing Decision Time and Place

• Time

– Datagram service: on arrival of each packet

– Virtual circuit service: at connection setup

• Place

– Distributed

– Made by each node

– Centralized

– Source

– Initial sender specifies route (e.g. IP option)

Page 120: The Network Layer

120

Network Information Source and Update Timing

• Routing decisions usually (but not always!) based on knowledge of network

• Distributed routing

– Nodes use local knowledge

– May collect information from adjacent nodes

– May collect information from all nodes on a potential route

• Central routing

– Collect information from all nodes

• Update timing

– When is network info held by nodes updated?

– Fixed routing: requires human intervention

– Adaptive: regular updates

Page 121: The Network Layer

121

Routing Strategies

• Fixed

• Flooding

• Random

• Adaptive

Page 122: The Network Layer

122

Fixed Routing

• Single permanent route for each source to destination pair

• Determine routes using a least cost algorithm

• Route fixed, at least until a change in network topology

Page 123: The Network Layer

123

Our example again…

[Figure: the same example network of six nodes, numbered 1 to 6, with a cost labelled on each link]

Page 124: The Network Layer

124

Central Routing Table

Next node, by source node (rows) and destination node (columns):

From \ To    1    2    3    4    5    6
    1        –    2    3    4    4    4
    2        1    –    3    4    4    4
    3        1    5    –    5    5    5
    4        2    2    5    –    5    5
    5        4    2    3    4    –    6
    6        5    5    5    5    5    –
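To see how such a table is used, the sketch below transcribes the table from this slide (each row is one node's local table) and follows the next-node entries hop by hop. The function name and the loop guard are additions for illustration.

```python
# CENTRAL[source][destination] = next node, transcribed from the table on this slide.
CENTRAL = {
    1: {2: 2, 3: 3, 4: 4, 5: 4, 6: 4},
    2: {1: 1, 3: 3, 4: 4, 5: 4, 6: 4},
    3: {1: 1, 2: 5, 4: 5, 5: 5, 6: 5},
    4: {1: 2, 2: 2, 3: 5, 5: 5, 6: 5},
    5: {1: 4, 2: 2, 3: 3, 4: 4, 6: 6},
    6: {1: 5, 2: 5, 3: 5, 4: 5, 5: 5},
}


def route(source, destination):
    """Follow next-node entries from source to destination and return the path."""
    path = [source]
    while path[-1] != destination:
        # Each node consults only its own row, i.e. its local routing table.
        next_node = CENTRAL[path[-1]][destination]
        if next_node in path:
            raise ValueError("routing loop detected")
        path.append(next_node)
    return path


print(route(1, 6))
print(route(6, 1))
```

Because every node looks up only the destination, the full path never has to be stored anywhere; it emerges from the per-node tables.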

Page 125: The Network Layer

125

Local Routing Tables

[Figure: one local routing table per node (nodes 1 to 6), each listing the next node for every destination]

Page 126: The Network Layer

126

Flooding

• No network info required

• Packet sent by node to every neighbor

• Incoming packets retransmitted on every link except incoming link

• Eventually a number of copies will arrive at destination

• Each packet is uniquely numbered so duplicates can be discarded

• Nodes can remember packets already forwarded to keep network load in bounds

• Can include a hop count in packets
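The flooding rules above can be simulated as follows. The four-node topology is hypothetical (it is not the six-node network from the slides), and the function and variable names are chosen for the example.

```python
from collections import deque

# Hypothetical topology: each node maps to the list of its neighbours.
LINKS = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["B", "C"],
}


def flood(source, max_hops):
    """Flood one packet; return (copies transmitted, hop count of first arrival per node)."""
    seen = {source: 0}                   # nodes that have already handled this packet
    queue = deque([(source, None, 0)])   # (node, link it arrived on, hops so far)
    transmissions = 0
    while queue:
        node, came_from, hops = queue.popleft()
        if hops == max_hops:
            continue                     # hop count exhausted: do not forward
        for neighbour in LINKS[node]:
            if neighbour == came_from:
                continue                 # retransmit on every link except the incoming one
            transmissions += 1
            if neighbour not in seen:    # duplicate copies are discarded on arrival
                seen[neighbour] = hops + 1
                queue.append((neighbour, node, hops + 1))
    return transmissions, seen


print(flood("A", max_hops=3))
```

Even in this small network seven copies are transmitted to reach three nodes, which illustrates why remembering forwarded packets and limiting the hop count matter.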

Page 127: The Network Layer

127

Flooding Example

[Figure: flooding across the six-node example network; each copy of the packet is labelled with the list of nodes it has visited]

Page 128: The Network Layer

128

• Once more, but with routing tables…

– Assume packets carry a hop count for each node.

• Note: due to space limitations, the routing table for node 4 will not appear.

Page 129: The Network Layer

129

[Figure: the flooding example repeated, with the routing table built up at each node (the table for node 4 is omitted)]

Page 130: The Network Layer

130

Properties of Flooding

• All possible routes are tried

– Very robust

• At least one packet will have taken minimum hop count route

– Can be used to set up virtual circuit

• All nodes are visited

– Useful to distribute information

Page 131: The Network Layer

131

Random Routing

• Node selects one outgoing path for retransmission of incoming packet

• Selection can be random or round robin

• Can select outgoing path based on probability calculation

• No network info needed

• Route is typically not least cost nor minimum hop

Page 132: The Network Layer

132

Adaptive Routing

• Used by almost all packet switching networks

• Routing decisions change as conditions on the network change

– Failure

– Congestion

• Requires info about network

• Decisions more complex

• Tradeoff between quality of network info and overhead

– Reacting too quickly can cause oscillation

– Reacting too slowly to be relevant

Page 133: The Network Layer

133

Adaptive Routing

• Two factors used to make decision:

– Sending the packet in “generally” the right direction.

– Minimizing congestion

• Instead of having one entry in routing table for a destination, keep a list of alternative links.

• Each alternative has a bias factor Bi that indicates the preference for correct routing.

– Lowest bias factor implies “shortest” route to destination.

• Route packets based on the combination of the current outgoing queue length Qi for a particular link, and the bias factor.

– That is, minimize Qi + Bi over the set of alternatives.

Page 134: The Network Layer

134

Classification

• Based on information sources

– Local (isolated)

– Route to outgoing link with shortest queue

– Can include bias for each destination

– Rarely used - does not take advantage of easily available information about other nodes.

– Adjacent nodes

– All nodes

Page 135: The Network Layer

135

Local Adaptive Routing Example

Outgoing link:             To 1   To 2   To 3   To 5
Bias for destination 6:    9      6      3      0

Result: Choose link to 3, since sum of bias and queue length is 4
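The selection rule (minimize Qi + Bi) can be sketched as below. The bias values are the ones given on this slide; the queue lengths are assumed values, chosen only so that the result agrees with the slide's stated outcome.

```python
BIAS_TO_6 = {1: 9, 2: 6, 3: 3, 5: 0}   # bias per outgoing link, from the slide
QUEUE_LEN = {1: 2, 2: 3, 3: 1, 5: 5}   # assumed current queue lengths (hypothetical)


def choose_link(queue_len, bias):
    """Return the outgoing link that minimizes queue length plus bias."""
    return min(bias, key=lambda link: queue_len[link] + bias[link])


best = choose_link(QUEUE_LEN, BIAS_TO_6)
print(best, QUEUE_LEN[best] + BIAS_TO_6[best])
```

Link 5 has the lowest bias, so it lies on the shortest route, but its long queue makes link 3 the better choice at this moment.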

Page 136: The Network Layer

136

ARPANET Routing Strategies (1)

• First Generation (1969)

– Distributed adaptive

– Estimated delay as performance criterion (“cost”)

– Use modified Bellman-Ford algorithm (1962)

– Node exchanges delay vector with neighbors every 128 ms

– Update routing table based on incoming info

– Does not consider link speed, just queue length

– Queue length not a good measurement of delay

– Responds slowly to congestion

Page 137: The Network Layer

137

Bellman-Ford Algorithm

• Determines shortest paths from a source node s to all other nodes.

• For all nodes, keep the current best known shortest path

– Initialize to 0 for the source and +∞ for all other nodes

• Algorithm proceeds by hop count from source node

– Start with hop count of 0.

• Keep a set of edges E which have been examined.

– Start with an empty set

• Repeat until E includes all edges:

– Add one to current hop count

– Add all edges that can be reached in this hop count to E.

– For each edge added, if cost of edge to node is lower than current minimum, replace current minimum.

– Update the current best known shortest paths to all nodes, based on inclusion of this edge.
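A compact Python version of Bellman-Ford is sketched here. It uses the common textbook formulation (relax every edge once per pass, so pass k covers paths of up to k hops) rather than the explicit edge set E described above, and the small graph is a hypothetical example, not the network from the slides.

```python
import math

# Hypothetical directed graph as (from, to, cost) edges.
EDGES = [("s", "a", 4), ("s", "b", 1), ("b", "a", 2), ("a", "c", 1), ("b", "c", 5)]
NODES = ["s", "a", "b", "c"]


def bellman_ford(nodes, edges, source):
    """Return (cost, predecessor) dictionaries for shortest paths from source."""
    dist = {n: math.inf for n in nodes}   # +infinity for all other nodes
    prev = {n: None for n in nodes}
    dist[source] = 0                      # 0 for the source
    # After pass k, dist is correct for every shortest path using at most k hops.
    for _ in range(len(nodes) - 1):
        changed = False
        for u, v, cost in edges:
            if dist[u] + cost < dist[v]:  # a cheaper path to v through u
                dist[v] = dist[u] + cost
                prev[v] = u
                changed = True
        if not changed:
            break                         # no estimate improved: finished early
    return dist, prev


dist, prev = bellman_ford(NODES, EDGES, "s")
print(dist, prev)
```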

Page 138: The Network Layer

138

Example: Bellman-Ford Algorithm

[Figure: the six-node example network with link costs, shown beside the table of best known path costs from node 1 to nodes 2 to 6 as the algorithm proceeds, starting from ∞]

Page 139: The Network Layer

139

The result

[Figure: the resulting shortest-path tree from node 1 over the six-node example network, with the final path costs to nodes 2 to 6]

Page 140: The Network Layer

140

Distance (Cost) Vector Routing

• Localized version of Bellman-Ford algorithm

• Router receives information from neighbours, and chooses the best option from information received.

• Updates correspond to stages in global algorithm:

– As router finds out about more destinations, new entries added.

Routers R1 and R3 are neighbours of R2.

R1 advertises (destination - cost): A - 1, B - 2, C - 2, D - 6

R3 advertises (destination - cost): A - 3, B - 1, E - 1, F - 4

R2 computes (destination - cost via neighbour): A - 2 via R1, B - 2 via R3, C - 3 via R1, D - 7 via R1, E - 2 via R3, F - 5 via R3
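The table that R2 builds can be reproduced with a short merge over the vectors received from R1 and R3. The link cost of 1 from R2 to each neighbour is an assumption; it is the value consistent with the numbers on this slide.

```python
R1_VECTOR = {"A": 1, "B": 2, "C": 2, "D": 6}   # costs advertised by R1
R3_VECTOR = {"A": 3, "B": 1, "E": 1, "F": 4}   # costs advertised by R3
LINK_COST = {"R1": 1, "R3": 1}                 # assumed cost from R2 to each neighbour


def merge_vectors(link_cost, vectors):
    """Build {destination: (cost, next router)} from the neighbours' distance vectors."""
    table = {}
    for neighbour, vector in vectors.items():
        for destination, cost in vector.items():
            total = link_cost[neighbour] + cost
            # Keep the cheapest option seen so far for each destination.
            if destination not in table or total < table[destination][0]:
                table[destination] = (total, neighbour)
    return table


r2_table = merge_vectors(LINK_COST, {"R1": R1_VECTOR, "R3": R3_VECTOR})
for destination in sorted(r2_table):
    cost, via = r2_table[destination]
    print(f"{destination} - {cost} via {via}")
```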

Page 141: The Network Layer

141

ARPANET Routing Strategies (2)

• Second Generation (1979)

– Uses delay as performance criterion

– Delay measured directly

– Computed every 10 s by time-stamping packets.

– Significant changes passed on via flooding

– Uses Dijkstra’s algorithm (1959)

– Good under light and medium loads

– Under heavy loads, little correlation between reported delays and those experienced

– Why? Routers all recompute routing tables at same time, and could all switch from a heavily loaded link to a lightly loaded link – which just moves congestion elsewhere.

Page 142: The Network Layer

142

Dijkstra’s Algorithm

• Determines shortest paths from a source node s to all other nodes.

• For all nodes, keep the current best known shortest path

– Initialize to 0 for the source and +∞ for all other nodes

• Keep a set of nodes N for which the shortest path is known.

– Initialize this set to {s}.

• Repeat until N includes all nodes:

– For each node not in N, what would be the shortest path from s to the node by taking, as the last hop, an edge from a node in N?

– Whichever node results in the minimum shortest path, add that node to N.

– Update the current best known shortest paths to all nodes, based on inclusion of the new node.
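The steps above can be sketched in Python with a priority queue, which is a common way to find "whichever node results in the minimum shortest path" efficiently. The graph is a hypothetical example, not the network from the slides.

```python
import heapq
import math

# Hypothetical directed graph: node -> {neighbour: link cost}.
GRAPH = {
    "s": {"a": 4, "b": 1},
    "a": {"c": 1},
    "b": {"a": 2, "c": 5},
    "c": {},
}


def dijkstra(graph, source):
    """Return the least cost from source to every node."""
    dist = {node: math.inf for node in graph}   # +infinity for all other nodes
    dist[source] = 0
    known = set()                                # the set N of finished nodes
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)            # closest node not yet in N
        if node in known:
            continue                             # stale queue entry: skip it
        known.add(node)
        for neighbour, cost in graph[node].items():
            if d + cost < dist[neighbour]:       # better path using node as last hop
                dist[neighbour] = d + cost
                heapq.heappush(heap, (dist[neighbour], neighbour))
    return dist


shortest = dijkstra(GRAPH, "s")
print(shortest)
```

This formulation assumes all link costs are non-negative, which holds for delay-based costs.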

Page 143: The Network Layer

143

Example: Dijkstra’s Algorithm

[Figure: the six-node example network with link costs, shown beside the table of best known path costs from node 1 to nodes 2 to 6 after each step of the algorithm, starting from ∞]

Page 144: The Network Layer

144

The result

[Figure: the resulting shortest-path tree from node 1 over the six-node example network, with the final path costs to nodes 2 to 6]

Page 146: The Network Layer

146

ARPANET Routing Strategies (3)

• Third Generation (1987)

– Link cost calculations changed

– Measure average delay over last 10 seconds

– Convert to utilization (0 ≤ U ≤ 1):

U = 2(Ts – T) / (Ts – 2T)

where Ts is the “service time” and T is the measured delay.

– Service time is average packet size (600 bits often used) divided by the speed of the data link.

– Normalize average utilization AU based on current value U and previous average:

AU′ = 0.5 AU + 0.5 U

Page 147: The Network Layer

147

ARPANET Routing Strategies (3)

– Cost =

1, if AU ≤ 0.5

1 + 4(AU – 0.5), if AU > 0.5

– Special cost for satellite link =

2, if AU ≤ 0.75

2 + 4(AU – 0.75), if AU > 0.75

– Cost is in range 1 to 3.

– Maximum penalty for avoiding a congested link or node is 2 extra hops.
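The third-generation cost calculation can be sketched as three small functions. The utilization formula used here, U = 2(Ts – T) / (Ts – 2T), is assumed to be the one intended on the previous slide; the function names and the handling of a delay no larger than the service time are choices made for this example.

```python
def utilization(service_time, measured_delay):
    """Estimate link utilization U from service time Ts and measured delay T."""
    ts, t = service_time, measured_delay
    if t <= ts:
        return 0.0                        # no queueing delay observed: idle link
    u = 2 * (ts - t) / (ts - 2 * t)
    return min(1.0, max(0.0, u))          # keep U within 0 <= U <= 1


def smooth(previous_average, u):
    """AU' = 0.5 AU + 0.5 U."""
    return 0.5 * previous_average + 0.5 * u


def link_cost(au, satellite=False):
    """Piecewise-linear cost from the slide; flat until the link is well loaded."""
    if satellite:
        return 2.0 if au <= 0.75 else 2.0 + 4 * (au - 0.75)
    return 1.0 if au <= 0.5 else 1.0 + 4 * (au - 0.5)


u = utilization(service_time=1.0, measured_delay=2.0)
au = smooth(previous_average=u, u=u)
print(u, au, link_cost(au))
```

Because the cost stays flat at low utilization and is capped at 3, a single busy link cannot swing routes as violently as the raw delay metric of the second generation did.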

Page 148: The Network Layer

148

Routing Protocols

• Two types:

– Interior: used within an “autonomous system” (AS)

– Exterior: used between differing autonomous systems.

• An “autonomous system” (RFC 1930) consists of routers (and networks) that:

– Use a common routing protocol

– Are managed by the same organization

– Are connected (except when failures occur)

• Autonomous systems are identified by AS numbers

– Assigned by IANA (Internet Assigned Numbers Authority) (www.iana.org)

– In North America, IANA delegates to the American Registry for Internet Numbers (ARIN) (www.arin.net)

Page 149: The Network Layer

149

Internetworking of Autonomous Systems

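[Figure: two autonomous systems joined by a physical link. AS 1 contains routers R1 to R4 and networks N1.1 to N1.4; AS 2 contains routers R5 to R8 and networks N2.1 to N2.4. OSPF is used inside an AS and BGP between the two.]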

Page 150: The Network Layer

150

Interior versus Exterior Routing

• Interior routing

– Typical situation: corporate network, ISP

– Usual protocol: Open Shortest Path First (OSPF) version 2 [RFC 2328]

– Needs detailed picture of network

– Least cost is the important factor

• Exterior routing

– Typical situation: connections between ISPs

– Usual protocol: Border Gateway Protocol (BGP) version 4 [RFC 1771]

– Less detailed information exchanged

– Reachability is the important factor

Page 151: The Network Layer

151

Exterior Routing with BGP

• Messages sent via TCP connection (BGP inside TCP inside IP)

• Procedures:

1. Neighbour acquisition

– A neighbour is another router on the same (physical) network but is part of a different autonomous system

– Routers agree to regular exchange of information.

2. Neighbour reachability

– Maintaining the relationship with status updates

3. Network reachability

– Keeping a database of networks that can be reached, and the preferred route to reach each network.

Page 152: The Network Layer

152

BGP Messages

• Open

– Begin a neighbour relationship with a new router

• Update

– Announce a new single route, or the deletion of one or more routes

• Keepalive

– Sent periodically to confirm router is still active and maintains the neighbour relationship

– Also acknowledges an Open message

– If keepalive messages do not appear on time, connection is assumed to be broken.

• Notification

– Announces an error condition

Page 153: The Network Layer

153

Routing Tables for a BGP router

• RIB: routing information base

• Conceptually, 3 separate tables could be maintained

– Separate implementations are not required

1. Adjacent RIB inward

• Contains information learned from incoming BGP update messages

2. Local RIB

• Contains routing decisions made after applying local decision-making policies

• “The” routing table for this node

3. Adjacent RIB outward

• Contains information the router is willing to advertise via BGP

Page 154: The Network Layer

154

BGP message format

Field (size in octets): contents

Marker (16): authentication information, akin to a connection identifier

Length (2): number of octets in message

Type (1): {Open, Update, Keepalive, Notification}

Message-specific information (variable): not used for keepalive message
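The fixed part of the message (16 + 2 + 1 = 19 octets) maps directly onto Python's struct module. In this sketch the type codes 1 to 4 follow the BGP-4 specification, and the all-ones marker is the value used when no authentication information is carried; the helper names are hypothetical.

```python
import struct

# Type codes as defined for BGP-4.
MESSAGE_TYPES = {1: "Open", 2: "Update", 3: "Notification", 4: "Keepalive"}

# Network byte order: 16-octet marker, 2-octet length, 1-octet type.
HEADER = struct.Struct("!16sHB")


def build_message(msg_type, body=b"", marker=b"\xff" * 16):
    """Prefix a message body with the common header; length covers the header too."""
    return HEADER.pack(marker, HEADER.size + len(body), msg_type) + body


def parse_header(data):
    """Return (marker, length, type name) from the start of a message."""
    marker, length, msg_type = HEADER.unpack_from(data)
    return marker, length, MESSAGE_TYPES[msg_type]


keepalive = build_message(4)          # a keepalive has no message-specific part
print(len(keepalive), parse_header(keepalive)[1:])
```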

Page 155: The Network Layer

155

BGP Open, Notification

• Open message has fields for (not a complete list)

– BGP protocol version (4)

– Identification of AS to which router belongs

– Hold time (period for keepalive messages)

– IP address of router

– Information to authenticate an authorized router

• Notification message indicates the following conditions:

– BGP message error

– BGP procedure error

– Hold timer expired

– Close BGP connection

Page 156: The Network Layer

156

BGP Update (1)

• Two possible functions within one update message:

– Withdraw route set, listed by IP address / prefix

– Add new single route

• Information about a single new route:

– Origin:

– BGP (external), OSPF (internal), Unknown

– Autonomous system path: a list of AS traversed for this route

– Allows routers to implement policy decisions

– Use of preferred networks

– Avoidance of specific networks

Page 157: The Network Layer

157

BGP Update (2)

• Information about a single new route (continued):

– Next hop: IP address of border router to be used as next hop for IP address(es) listed below.

– Could be distinct from the BGP router, if more than one router in AS has external connections, but only one handles BGP information (example: R2 on slide 284)

– Network layer reachability information (NLRI)

– A list of IP addresses to which this route applies

– Could be address prefixes.

• Updates are passed on via flooding

Page 158: The Network Layer

158

Example BGP update

[Figure: AS 1 (routers R1 to R4, networks 1.1 to 1.4) sends an update to AS 2 (routers R5 to R8, networks 2.1 to 2.4)]

Update sent from AS 1 to AS 2:

NLRI: 1.1, 1.3, 1.4

AS Path: AS1

Next hop: R1

Page 159: The Network Layer

159

BGP update propagation

[Figure: AS 2 (routers R5 to R8, networks 2.1 to 2.4) passes the update on to router R9 in AS 3 (network 3.1, …)]

Update sent from AS 2 to AS 3:

NLRI: 1.1, 1.3, 1.4

AS Path: AS2, AS1

Next hop: R7
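The change between the two updates (AS 2 prepends itself to the AS path and names its own border router as next hop) can be sketched as follows. The dictionary layout and function name are hypothetical, and the loop check (refusing a route whose path already contains the local AS) is standard BGP behaviour that the slides do not spell out.

```python
def readvertise(update, own_as, own_border_router):
    """Return the update as passed on by own_as, or None if it would form a loop."""
    if own_as in update["as_path"]:
        return None                                   # our AS is already on the path
    return {
        "nlri": list(update["nlri"]),                 # reachable networks are unchanged
        "as_path": [own_as] + update["as_path"],      # prepend the local AS
        "next_hop": own_border_router,                # traffic should enter via this router
    }


from_as1 = {"nlri": ["1.1", "1.3", "1.4"], "as_path": ["AS1"], "next_hop": "R1"}
to_as3 = readvertise(from_as1, own_as="AS2", own_border_router="R7")
print(to_as3)
```

Carrying the full AS path is also what lets a router apply the policy decisions mentioned earlier, such as avoiding routes that cross a specific network.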

Page 160: The Network Layer

160

Interior Routing with OSPF

• OSPF: Open Shortest Path First protocol

• Version 2 specified in RFC 2328

• Computes least cost route based on configurable metric (“cost”)

• Each router keeps track of network topology of which it is aware, including:

– Routers

– Transit networks: can carry data that neither originates nor terminates within the network

– Stub networks: data must originate or terminate within that network

Page 161: The Network Layer

161

OSPF Graph Information

• Network topology stored as a directed graph, with 4 types of nodes and 2 types of edges

• Node types:

– Router

– Transit network

– Stub network

– Host connected directly to router

• Edge types:

– Point to point link joining routers: bi-directional

– Router to network connection

[Figure: example graph elements, showing networks N4 and N8, router R2, and host H1]

Page 162: The Network Layer

162

Example of Autonomous System

[Figure: example autonomous system, with labels for a stub network, a transit network, a router, a host attached to a router, and external network connections]

Page 163: The Network Layer

163

AS as a Directed Graph

Page 164: The Network Layer

164

Routing Information Base

[Table: link costs of the directed graph. Columns (From): R1 to R12, N3, N6, N8, N9. Rows (To): R1 to R12, N1 to N4, N6 to N11, H1.]

Page 165: The Network Layer

165

SPF Tree for R6

[Figure: shortest-path-first tree rooted at R6, reaching routers R1 to R12, networks N1 to N11, and host H1, with the cost of each link and branches to external networks N12 to N15]

Page 166: The Network Layer

166

Routing Table for R6

Destination   Next Hop   Distance
N1            R3         10
N2            R3         10
N3            R3         7
N4            R3         8
N6            R10        8
N7            R10        12
N8            R10        10
N9            R10        11
N10           R10        13
N11           R10        14
H1            R10        21
R5            R5         6     (external router)
R7            R10        8     (external router)

Page 167: The Network Layer

167

OSPF Messages

• Five types of messages

1. Hello: Protocol to discover new routers

– This is the only type of message exchanged between non-adjacent nodes.

2. Link state request: Request initial database

3. Database description: Reply to link state request

4. Link state update: Announce new information

5. Link state acknowledgement: Confirm receipt of update

• Messages sent in IP packets

– Acknowledgements add reliability to IP

• Routers are expected to treat OSPF messages with higher priority than regular data

Page 168: The Network Layer

168

Performance of Routing Algorithms

• Algorithms can be judged on:

– Speed.

– Computational complexity.

– Scalability.

– Speed of convergence after topological change.

– Ability to react to current traffic situation.

– Susceptibility to routing loops.

– Ability to include line characteristics in computing the cost.

Page 169: The Network Layer

169

Advanced Routing Features

• Type of service routing:

– Allows choice of path that takes into account link quality, data rate, etc.

• Load balancing:

– If there are multiple routes of equivalent cost to the destination, traffic can be distributed among different routes.

• Area routing:

– A large routing domain can be partitioned into areas to reduce the amount of routing information kept in each router.

• Authentication:

– Each router will only accept routing information from trusted routers, identified through authentication.

Page 170: The Network Layer

170

Integrated Services Architecture (1)

• Acronym: ISA

• Standards currently under development by IETF

– Base document in RFC 1633

• Categories of traffic:

– Inelastic: constraints on throughput, delay, jitter, and packet loss

– Elastic: can adjust to changes in network conditions

– Varying tolerances for changes in above factors

– E-mail: sensitive to loss, but not delay

– FTP file transfer: sensitive to throughput, but not jitter

Page 171: The Network Layer

171

ISA Services

• Guaranteed service

– Assured data rate

– Upper bound on queuing delay

– No queuing losses

• Controlled load

– Similar to guaranteed service, except that constraints are only expected to be met for a “high percentage” of packets instead of all packets.

• Best effort

– No quality of service parameters applied to traffic.

Page 172: The Network Layer

172

Elements of ISA

• Routing algorithm:

– As an alternative to delay, quality of service can be used to weight graph edges for OSPF

• Admission control

– For any service other than best effort, a reservation must be made using the RSVP protocol (RFC 2205)

• Queuing Discipline:

– Multiple output queues with fair selection for transmission

– Each flow of inelastic traffic can be queued separately

• Discard Policy

– Policy for which packets to discard when a queue is full.
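The queuing discipline and discard policy can be sketched together. The example below keeps one queue per flow, serves the flows in round-robin order, and drops an arriving packet when its flow's queue is full. Round robin and tail drop are assumptions chosen for brevity; the slide only asks for "fair selection" and some discard policy. The class name `FlowQueues` and the capacity value are hypothetical.

```python
from collections import OrderedDict, deque


class FlowQueues:
    """Per-flow output queues with round-robin service and tail drop."""

    def __init__(self, capacity):
        self.capacity = capacity          # maximum packets held per flow
        self.queues = OrderedDict()       # flow id -> deque of packets

    def enqueue(self, flow, packet):
        """Queue a packet; return False if it was discarded (queue full)."""
        queue = self.queues.setdefault(flow, deque())
        if len(queue) >= self.capacity:
            return False                  # discard policy: drop the new arrival
        queue.append(packet)
        return True

    def dequeue(self):
        """Serve the first non-empty flow, then move that flow to the back."""
        for flow in list(self.queues):
            queue = self.queues[flow]
            if queue:
                packet = queue.popleft()
                self.queues.move_to_end(flow)
                return flow, packet
        return None                       # nothing waiting


q = FlowQueues(capacity=2)
q.enqueue("video", "v1")
q.enqueue("video", "v2")
q.enqueue("voice", "a1")
```

With this state, successive calls to `q.dequeue()` alternate between the two flows for as long as both have packets, so a busy flow cannot starve a quiet one.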

Page 173: The Network Layer

ISA Router Architecture

[Figure: ISA router architecture. Components shown: routing protocols, routing database, classification and route selection, packet scheduler, QoS queues, best-effort queue, traffic database, reservation protocol, admission control, management agent.]

Page 174: The Network Layer

Protocol Configuration

• A software vendor wants to sell an identical copy of the protocol software to all customers.

• Each system running a protocol will have different parameters:

– IP address

– Hardware address

– Location of local router

– Location of local servers for Domain Name Service, printing, time of day, …

• The problem:

– How to “discover” the local custom values when the system is initialized?

Page 175: The Network Layer

Protocol Configuration Initialization

• Example: plugging your laptop into a data port at a SITE cafeteria table

• You do not want to have to configure your system; you want to start using the Internet right away

• Problem:

– What address do you use to find an address?

Page 176: The Network Layer

Types of Address Discovery

• Fixed:

– Host is assigned a permanent set of addresses for IP, hardware, etc.

– Protocol software needs to find these parameters during initialization, either locally or from a server.

– Required for “well-known” locations (e.g. web server)

• Dynamic

– Host uses a temporary IP address obtained from a server for a specified period of time.

– Addresses are allocated from an available pool

– Examples: ISP dial-up connection, cafeteria data ports

Page 177: The Network Layer

Protocol Initialization

• Local, fixed option: manual configuration of IP address.

• Reverse Address Resolution Protocol (RARP)

– ARP: Given IP address, find hardware address

– RARP: Given hardware address, obtain IP address

– Needs fixed hardware address in network interface card (e.g. Ethernet)

• RARP request for IP address is broadcast over network.

• After obtaining an IP address, the next step is to find a router.

– To do this, we need the subnet mask of the network, so that we can find a router on the same network.

– Broadcast ICMP “Address Mask Request” message

– Reply contains IP mask

– Broadcast ICMP “Gateway discovery” message
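To show why the subnet mask matters in this step, the sketch below checks whether a candidate router lies on the same network as the host once the host knows its own address and mask. It uses Python's standard `ipaddress` module; the function name `same_network` and the sample addresses are hypothetical.

```python
import ipaddress


def same_network(host_ip, mask, candidate_ip):
    """Return True if candidate_ip falls inside the host's own network."""
    # strict=False lets us pass a host address rather than the network address.
    network = ipaddress.ip_network(f"{host_ip}/{mask}", strict=False)
    return ipaddress.ip_address(candidate_ip) in network


# Hypothetical values: the host learned its mask from the address mask reply.
print(same_network("192.168.1.37", "255.255.255.0", "192.168.1.1"))
print(same_network("192.168.1.37", "255.255.255.0", "192.168.2.1"))
```

A router that fails this check cannot be reached directly on the local link, so it is not usable as the host's first hop.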

Page 178: The Network Layer

Dynamic Address Allocation

• Each host obtains a “lease” for an IP address assigned from a pool.

– Provisioning challenge: how large should the pool of IP addresses be for the customer base?

• Lease has expiry time

– Lease can be renewed before expiry

– On expiry, IP address is returned to the available pool.
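The lease life cycle described above (allocate from a pool, renew before expiry, return on expiry) can be sketched as a small bookkeeping class. This is a toy model, not DHCP server behaviour: the class name `LeasePool`, its methods, and the addresses are hypothetical, and time is passed in explicitly as a number of seconds so the example does not depend on a clock.

```python
class LeasePool:
    """Toy address pool that hands out leases with an expiry time."""

    def __init__(self, addresses, lease_seconds):
        self.free = list(addresses)       # addresses currently available
        self.lease_seconds = lease_seconds
        self.leases = {}                  # client id -> (address, expiry time)

    def expire(self, now):
        """Return every expired address to the available pool."""
        for client, (address, expiry) in list(self.leases.items()):
            if expiry <= now:
                del self.leases[client]
                self.free.append(address)

    def acquire(self, client, now):
        """Grant or renew a lease; return the address, or None if the pool is empty."""
        self.expire(now)
        if client in self.leases:         # renewal before expiry keeps the address
            address, _ = self.leases[client]
        elif self.free:
            address = self.free.pop(0)
        else:
            return None                   # pool exhausted: the provisioning problem
        self.leases[client] = (address, now + self.lease_seconds)
        return address


pool = LeasePool(["10.0.0.1"], lease_seconds=60)
print(pool.acquire("laptop-a", now=0))
print(pool.acquire("laptop-b", now=10))
```

With a single address in the pool, the second client is refused until the first client's lease expires, which is the sizing trade-off the provisioning bullet refers to.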

Page 179: The Network Layer

DHCP: Dynamic Host Configuration Protocol

• Defined in RFC 2131

• Protocol to automatically:

– Assign an IP address from a pool of available addresses

– Assignment can be permanent or temporary

– Temporary assignment (a “lease”) will have an expiry time.

– Locate a server

– Locate a router

– Get the name of a server

• Relies on special IP addresses:

– IP address 0.0.0.0: used as the source address of messages sent while the client is still obtaining its IP address

– IP address 255.255.255.255: local network broadcast

Page 180: The Network Layer

DHCP Message Format

Each row is 32 bits wide (bits 0 to 31) unless a length in octets is given.

Message type (8 bits) | HW addr. type (8 bits) | HW addr. length (8 bits) | Hops to server (8 bits)

Transaction ID (32 bits)

Seconds elapsed (16 bits) | Broadcast flag and 15 zeros (16 bits)

Client IP address (if renewing)

“Your new” IP address

Reboot Server IP address

Router IP address

Client Hardware address (16 octets)

Server host name (64 octets)

Reboot file name (128 octets)

Options (variable)
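As a sketch of how the fixed part of this message could be laid out in memory, the example below packs the fields in the order given by RFC 2131 using Python's `struct` module. The variable-length options are deliberately left out, so the result is not a complete DHCP message. The helper name `build_request_header` and the sample hardware address are hypothetical; the values 1 for a client request, 1 for Ethernet, and 0x8000 for the broadcast flag follow RFC 2131.

```python
import struct

# Network byte order: four 1-octet fields, transaction ID, seconds, flags,
# four IPv4 addresses, then the 16-, 64- and 128-octet fields.
DHCP_FIXED = struct.Struct("!BBBBIHH4s4s4s4s16s64s128s")


def build_request_header(mac, xid, broadcast=True):
    """Pack the fixed DHCP fields for a client request (options omitted)."""
    zero_ip = bytes(4)                    # 0.0.0.0: no address known yet
    flags = 0x8000 if broadcast else 0    # broadcast flag followed by 15 zero bits
    return DHCP_FIXED.pack(
        1,            # message type: request from client
        1,            # hardware address type: Ethernet
        len(mac),     # hardware address length
        0,            # hops
        xid,          # transaction ID
        0,            # seconds elapsed
        flags,
        zero_ip,      # client IP address (only set when renewing)
        zero_ip,      # "your new" IP address (filled in by the server)
        zero_ip,      # server IP address
        zero_ip,      # router (relay) IP address
        mac,          # client hardware address, zero-padded to 16 octets
        b"",          # server host name, zero-padded to 64 octets
        b"",          # file name, zero-padded to 128 octets
    )


header = build_request_header(bytes.fromhex("020000000001"), xid=0x12345678)
print(len(header))
```

`struct` pads the short byte strings with zero octets, which is how the 6-octet Ethernet address fits the 16-octet hardware address field.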

Page 181: The Network Layer

DHCP Message Types

• (not a complete list)

• Discover: request from client to find servers (broadcast)

• Offer: server reply to discover, with offer of configuration parameters (broadcast, possibly by more than one server)

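• Request: client accepts one offer; the message names the chosen server, and during initial allocation it is broadcast so that the other servers know their offers were declined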

• Acknowledgement: configuration parameters issued by server to client

• Release: client returns allocations to server and cancels lease
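The order in which a client uses these messages can be sketched as a small state machine. The state names INIT, SELECTING, REQUESTING and BOUND are borrowed from the client state diagram in RFC 2131; the table below is a simplification that covers only the five message types listed here and ignores timeouts, renewal, and negative acknowledgements. The names `TRANSITIONS` and `run` are hypothetical.

```python
# (current state, message sent or received) -> next state
TRANSITIONS = {
    ("INIT", "Discover"): "SELECTING",             # client broadcasts Discover
    ("SELECTING", "Offer"): "SELECTING",           # collect offers from servers
    ("SELECTING", "Request"): "REQUESTING",        # client picks one offer
    ("REQUESTING", "Acknowledgement"): "BOUND",    # server confirms the lease
    ("BOUND", "Release"): "INIT",                  # client gives the address back
}


def run(messages, state="INIT"):
    """Follow a sequence of messages and return the final client state."""
    for message in messages:
        key = (state, message)
        if key not in TRANSITIONS:
            raise ValueError(f"{message} is not expected in state {state}")
        state = TRANSITIONS[key]
    return state


exchange = ["Discover", "Offer", "Offer", "Request", "Acknowledgement"]
print(run(exchange))
```

Two Offer messages appear in the sample exchange because more than one server may answer a single Discover.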