multimedia communication & networks

142
13PIT101 Multimedia Communication & Networks UNIT II Dr.A.Kathirvel Professor & Head/IT - VCEW

Upload: ayyakathir

Post on 24-Jan-2015


DESCRIPTION

ADVANCED ROUTING

TRANSCRIPT

Page 1: MULTIMEDIA COMMUNICATION & NETWORKS

13PIT101

Multimedia Communication & Networks

UNIT – II

Dr.A.Kathirvel

Professor & Head/IT - VCEW

Page 2: MULTIMEDIA COMMUNICATION & NETWORKS

Unit - II

Intra AS routing – Inter AS routing – Router

Architecture – Switch Fabric – Active Queue

Management – Head of Line blocking –

Transition from IPv4 to IPv6 – Multicasting –

Abstraction of Multicast groups – Group

Management – IGMP – Group Shared

Multicast Tree – Source based Multicast Tree –

Multicast routing in Internet – DVMRP and

MOSPF – PIM – Sparse mode and Dense

mode

Page 3: MULTIMEDIA COMMUNICATION & NETWORKS

INTRA AS ROUTING

Page 4: MULTIMEDIA COMMUNICATION & NETWORKS

#4

The Internet Network layer

routing

table

Host, router network layer functions:

Routing protocols

•path selection

•RIP, OSPF, BGP

IP protocol

•addressing conventions

•datagram format

•packet handling conventions

ICMP protocol

•error reporting

•router “signaling”

Transport layer: TCP, UDP

Link layer

physical layer

Network

layer

Page 5: MULTIMEDIA COMMUNICATION & NETWORKS

#5

Hierarchical Routing

scale: with 50 million

destinations:

• can’t store all dest’s in routing tables!

• routing table exchange would

swamp links!

administrative autonomy

• internet = network of networks

• each network admin may want to

control routing in its own network

Our routing study thus far - idealization

all routers identical

network “flat”

… not true in practice

Page 6: MULTIMEDIA COMMUNICATION & NETWORKS

#6

Hierarchical Routing

• aggregate routers into regions, “autonomous systems” (AS)

• routers in same AS run same routing protocol

– “intra-AS” routing protocol

– routers in different AS can run different intra-AS routing protocol

• special routers in AS

• run intra-AS routing

protocol with all other

routers in AS

• also responsible for routing

to destinations outside AS

– run inter-AS routing

protocol with other

gateway routers

gateway routers

Page 7: MULTIMEDIA COMMUNICATION & NETWORKS

#7

Intra-AS and Inter-AS routing

Gateways: •perform inter-AS

routing amongst

themselves

•perform intra-AS

routing with other

routers in their AS

[Figure: protocol stack in gateway A.c (inter-AS and intra-AS routing in the network layer, above the link layer and physical layer); ASes A, B and C with routers a, b, c, d and gateways A.a, A.c, B.a, C.b]

Page 8: MULTIMEDIA COMMUNICATION & NETWORKS

#8

Intra-AS and Inter-AS routing

[Figure: Host h1 and Host h2 attached to ASes A, B and C (gateways A.a, A.c, B.a, C.b)]

• Intra-AS routing within AS A

• Inter-AS routing between A and B

• Intra-AS routing within AS B

We’ll examine specific inter-AS and intra-AS Internet

routing protocols shortly

Page 9: MULTIMEDIA COMMUNICATION & NETWORKS

#9

Routing: Example

[Figure: example topology with AS A (OSPF), AS B (OSPF intra routing), AS C, AS D and AS I; node labels a1, a2, b, d, i, i2, E, F; annotation: “No Export to F”]

Page 10: MULTIMEDIA COMMUNICATION & NETWORKS

#10

Routing: Example

[Figure: example topology with AS A (OSPF), AS B (OSPF intra routing), AS C, AS D and AS I; node labels a1, a2, b, d, d1, d2, i, E, F; question posed on the figure: “How to specify?”]

Page 11: MULTIMEDIA COMMUNICATION & NETWORKS

#11

IP Addressing Scheme

• We need an address to uniquely identify each destination

• Routing scalability needs flexibility in aggregation of destination addresses – we should be able to aggregate a set of

destinations as a single routing unit

• Preview: the unit of routing in the Internet is a network---the destinations in the routing protocols are networks

Page 12: MULTIMEDIA COMMUNICATION & NETWORKS

#12

IP Addressing: introduction

• IP address: 32-bit identifier for host, router interface

• interface: connection between host, router and physical link

– routers typically have multiple interfaces

– host may have multiple interfaces

– IP addresses associated with interface, not host, or router

223.1.1.1

223.1.1.2

223.1.1.3

223.1.1.4 223.1.2.9

223.1.2.2

223.1.2.1

223.1.3.2 223.1.3.1

223.1.3.27

223.1.1.1 = 11011111 00000001 00000001 00000001

223 1 1 1

Page 13: MULTIMEDIA COMMUNICATION & NETWORKS

#13

IP Addressing

• IP address:

– network part

• high order bits

– host part

• low order bits

• What’s a network ? (from

IP address perspective)

– device interfaces with

same network part of IP

address

– can physically reach each

other without intervening

router

223.1.1.1

223.1.1.2

223.1.1.3

223.1.1.4 223.1.2.9

223.1.2.2

223.1.2.1

223.1.3.2 223.1.3.1

223.1.3.27

network consisting of 3 IP networks

(for IP addresses starting with 223,

first 24 bits are network address)

LAN

Page 14: MULTIMEDIA COMMUNICATION & NETWORKS

#14

IP Addressing

How to find the networks?

• Detach each interface

from router, host

• create “islands” of isolated networks

223.1.1.1

223.1.1.3

223.1.1.4

223.1.2.2 223.1.2.1

223.1.2.6

223.1.3.2 223.1.3.1

223.1.3.27

223.1.1.2

223.1.7.0

223.1.7.1

223.1.8.0 223.1.8.1

223.1.9.1

223.1.9.2

Interconnected

system consisting

of six networks

Page 15: MULTIMEDIA COMMUNICATION & NETWORKS

#15

IP Addresses

given notion of “network”, let’s re-examine IP addresses:

“classful” addressing (32-bit addresses):

class  leading bits  remaining bits       address range
A      0             network, host        1.0.0.0 to 127.255.255.255
B      10            network, host        128.0.0.0 to 191.255.255.255
C      110           network, host        192.0.0.0 to 223.255.255.255
D      1110          multicast address    224.0.0.0 to 239.255.255.255

Page 16: MULTIMEDIA COMMUNICATION & NETWORKS

#16

IP addressing: CIDR

• classful addressing:

– inefficient use of address space, address space exhaustion

– e.g., class B net allocated enough addresses for 65K hosts, even if only 2K hosts in that network

• CIDR: Classless InterDomain Routing

– network portion of address of arbitrary length

– address format: a.b.c.d/x, where x is # bits in network portion of address

11001000 00010111 00010000 00000000

network

part

host

part

200.23.16.0/23
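The a.b.c.d/x split can be checked with a minimal sketch using Python's standard `ipaddress` module. The helper name `split_cidr` and the sample host address 200.23.17.42 are illustrative assumptions, not part of the slides.

```python
import ipaddress

def split_cidr(cidr: str):
    """Return (network bits, host bits) of a CIDR block as binary strings."""
    net = ipaddress.ip_network(cidr, strict=True)
    bits = format(int(net.network_address), "032b")
    # The first x bits are the network part, the remaining 32 - x the host part.
    return bits[:net.prefixlen], bits[net.prefixlen:]

network_part, host_part = split_cidr("200.23.16.0/23")
print(network_part, host_part)

# A hypothetical host address: it belongs to the block if its first 23 bits match.
print(ipaddress.ip_address("200.23.17.42") in ipaddress.ip_network("200.23.16.0/23"))
```

With x = 23, the network part is 23 bits long and the remaining 9 bits identify hosts inside the block.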

Page 17: MULTIMEDIA COMMUNICATION & NETWORKS

#17

CIDR Address Aggregation

[Figure: AS I, AS A (OSPF) and AS D; node labels i, a1, a2, d, d1]

• intradomain routing uses /24: 130.132.1/24, 130.132.2/24, 130.132.3/24

• i->a1: I can reach 130.132/16; my path: I

Page 18: MULTIMEDIA COMMUNICATION & NETWORKS

#18

CIDR Address Aggregation

x00/24: B

x01/24: C

x10/24: E

x11/24: F

A

B

C

E

F

G

Page 19: MULTIMEDIA COMMUNICATION & NETWORKS

#19

IP addresses: how to get one?

Hosts (host portion):

• hard-coded by system admin in a file

• DHCP: Dynamic Host Configuration Protocol: dynamically get address: “plug-and-play”

– host broadcasts “DHCP discover” msg

– DHCP server responds with “DHCP offer” msg

– host requests IP address: “DHCP request” msg

– DHCP server sends address: “DHCP ack” msg

– The common practice in LAN and home access (why?)

Page 20: MULTIMEDIA COMMUNICATION & NETWORKS

#20

IP addresses: how to get one?

Network (network portion):

• get allocated portion of ISP’s address space:

ISP's block     11001000 00010111 00010000 00000000   200.23.16.0/20

Organization 0  11001000 00010111 00010000 00000000   200.23.16.0/23
Organization 1  11001000 00010111 00010010 00000000   200.23.18.0/23
Organization 2  11001000 00010111 00010100 00000000   200.23.20.0/23
...
Organization 7  11001000 00010111 00011110 00000000   200.23.30.0/23
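The division of the ISP's /20 block into eight /23 blocks can be reproduced with a short sketch based on the standard `ipaddress` module. The variable names are illustrative.

```python
import ipaddress

# The ISP's block from the slide.
isp_block = ipaddress.ip_network("200.23.16.0/20")

# Lengthening the prefix by 3 bits (20 -> 23) yields 2**3 = 8 equal sub-blocks.
organizations = list(isp_block.subnets(new_prefix=23))

for index, block in enumerate(organizations):
    print(f"Organization {index}: {block}")
```

Each organization receives 2**9 = 512 addresses, and the ISP can still advertise the single prefix 200.23.16.0/20 to the rest of the Internet.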

Page 21: MULTIMEDIA COMMUNICATION & NETWORKS

#21

Hierarchical addressing: route aggregation

“Send me anything

with addresses

beginning

200.23.16.0/20”

200.23.16.0/23

200.23.18.0/23

200.23.30.0/23

Fly-By-Night-ISP

Organization 0

Organization 7 Internet

Organization 1

ISPs-R-Us “Send me anything

with addresses

beginning

199.31.0.0/16”

200.23.20.0/23

Organization 2

.

.

.

.

.

.

Hierarchical addressing allows efficient advertisement of routing

information:

Page 22: MULTIMEDIA COMMUNICATION & NETWORKS

#22

Hierarchical addressing: more specific routes

ISPs-R-Us has a more specific route to Organization 1

“Send me anything

with addresses

beginning

200.23.16.0/20”

200.23.16.0/23

200.23.18.0/23

200.23.30.0/23

Fly-By-Night-ISP

Organization 0

Organization 7 Internet

Organization 1

ISPs-R-Us “Send me anything

with addresses

beginning 199.31.0.0/16

or 200.23.18.0/23”

200.23.20.0/23

Organization 2

.

.

.

.

.

.
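Routers resolve overlapping advertisements like these by longest-prefix matching: the most specific matching prefix wins. A minimal sketch, assuming a table kept as a plain dictionary (the function name and the test addresses are illustrative, and real routers use tries or TCAMs rather than a linear scan):

```python
import ipaddress

def longest_prefix_match(table, destination):
    """Return the next hop of the most specific prefix containing destination."""
    dest = ipaddress.ip_address(destination)
    best = None  # (network, next_hop) of the longest match found so far
    for prefix, next_hop in table.items():
        network = ipaddress.ip_network(prefix)
        if dest in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, next_hop)
    return best[1] if best else None

# Advertisements as seen from the Internet side in the slide.
table = {
    "200.23.16.0/20": "Fly-By-Night-ISP",
    "199.31.0.0/16": "ISPs-R-Us",
    "200.23.18.0/23": "ISPs-R-Us",
}

print(longest_prefix_match(table, "200.23.18.5"))  # Organization 1's block
print(longest_prefix_match(table, "200.23.20.5"))  # Organization 2's block
```

An address in 200.23.18.0/23 matches both the /20 and the /23, and the /23 is chosen because it is longer.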

Page 23: MULTIMEDIA COMMUNICATION & NETWORKS

#23

Network Address Translation: Motivation

192.168.1.2

192.168.1.3

192.168.1.4

192.168.1.1

138.76.29.7

local network

(e.g., home network)

192.168.1.0/24

rest of

Internet

Datagrams with source or

destination in this network

have 192.168.1/24 address for

source, destination (as usual)

All datagrams leaving local

network have same single source NAT IP

address: 138.76.29.7,

different source port numbers

A local network uses just one public IP address as far as outside world is

concerned

Each device on the local network is assigned a private IP address

Page 24: MULTIMEDIA COMMUNICATION & NETWORKS

#24

NAT: Network Address Translation

Implementation: NAT router must:

– outgoing datagrams: replace (source IP address, port #) of every outgoing datagram to (NAT IP address, new port #)

. . . remote clients/servers will respond using (NAT IP address, new port #) as destination addr.

– remember (in NAT translation table) every (source IP address, port #) to (NAT IP address, new port #) translation pair

– incoming datagrams: replace (NAT IP address, new port #) in dest fields of every incoming datagram with corresponding (source IP address, port #) stored in NAT table
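The three duties listed above can be sketched as a small translation table. This is a toy model under stated assumptions: the class name `NatRouter`, the sequential port allocation starting at 5001, and the tuple representation of datagrams are all illustrative, and timeouts and per-protocol handling are ignored.

```python
class NatRouter:
    """Toy NAT translation table keyed by (IP address, port) pairs."""

    def __init__(self, public_ip, first_port=5001):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out_map = {}  # (LAN ip, LAN port) -> WAN port
        self.in_map = {}   # WAN port -> (LAN ip, LAN port)

    def outgoing(self, src, dst):
        """Rewrite the source of an outgoing datagram; remember the pair."""
        if src not in self.out_map:
            self.out_map[src] = self.next_port
            self.in_map[self.next_port] = src
            self.next_port += 1
        return (self.public_ip, self.out_map[src]), dst

    def incoming(self, src, dst):
        """Rewrite the destination of an incoming datagram, or drop it (None)."""
        ip, port = dst
        if ip != self.public_ip or port not in self.in_map:
            return None
        return src, self.in_map[port]

nat = NatRouter("138.76.29.7")
print(nat.outgoing(("192.168.1.2", 3345), ("128.119.40.186", 80)))
print(nat.incoming(("128.119.40.186", 80), ("138.76.29.7", 5001)))
```

Incoming datagrams with no matching table entry are dropped in this sketch, which is also why hosts behind the NAT are not directly addressable from outside.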

Page 25: MULTIMEDIA COMMUNICATION & NETWORKS

#25

NAT: Network Address Translation

192.168.1.2

S: 192.168.1.2, 3345

D: 128.119.40.186, 80

1

192.168.1.1

138.76.29.7

1: host 192.168.1.2

sends datagram to

128.119.40.186, 80

NAT translation table

WAN side addr LAN side addr

138.76.29.7, 5001 192.168.1.2, 3345

…… ……

S: 128.119.40.186, 80

D: 192.168.1.2, 3345

4

S: 138.76.29.7, 5001

D: 128.119.40.186, 80 2

2: NAT router

changes datagram

source addr from

192.168.1.2, 3345 to

138.76.29.7, 5001,

updates table

S: 128.119.40.186, 80

D: 138.76.29.7, 5001

3

3: Reply arrives

dest. address:

138.76.29.7, 5001

4: NAT router

changes datagram

dest addr from

138.76.29.7, 5001 to 192.168.1.2, 3345

192.168.1.3

192.168.1.4

Page 26: MULTIMEDIA COMMUNICATION & NETWORKS

#26

Network Address Translation: Advantages

• No need to be allocated range of addresses from ISP: just one public IP address is used for all devices

– 16-bit port-number field allows over 60,000 simultaneous connections through a single public (WAN-side) address!

– can change ISP without changing addresses of devices in local network

– can change addresses of devices in local network without notifying outside world

• Devices inside local net not explicitly addressable, visible by outside world (a security plus)

Page 27: MULTIMEDIA COMMUNICATION & NETWORKS

#27

NAT: Network Address Translation

• If two hosts are behind different NATs, they will have difficulty establishing a connection

• NAT is controversial:

– routers should process up to only layer 3

– violates end-to-end argument

• NAT possibility must be taken into account by app designers, e.g., P2P applications

– address shortage should instead be solved by having more addresses --- IPv6

Page 28: MULTIMEDIA COMMUNICATION & NETWORKS

#28

IP addressing: the last word...

Q: How does an ISP get block of addresses?

A: ICANN: Internet Corporation for Assigned

Names and Numbers

– allocates addresses

– manages DNS

– assigns domain names, resolves disputes

Page 29: MULTIMEDIA COMMUNICATION & NETWORKS

#29

Getting a datagram from source to dest.

223.1.1.1

223.1.1.2

223.1.1.3

223.1.1.4 223.1.2.9

223.1.2.2

223.1.2.1

223.1.3.2 223.1.3.1

223.1.3.27

A

B

E

IP datagram:

misc

fields

source

IP addr dest

IP addr data

datagram remains unchanged,

as it travels source to

destination

addr fields of interest here

mainly dest. IP addr

Dest. Net.   next router   Nhops
223.1.1      -             1
223.1.2      223.1.1.4     2
223.1.3      223.1.1.4     2

routing table in A

Page 30: MULTIMEDIA COMMUNICATION & NETWORKS

#30

Getting a datagram from source to dest.

223.1.1.1

223.1.1.2

223.1.1.3

223.1.1.4 223.1.2.9

223.1.2.2

223.1.2.1

223.1.3.2 223.1.3.1

223.1.3.27

A

B

E

Starting at A, given IP datagram

addressed to B:

look up net. address of B

find B is on same net. as A

link layer will send datagram directly

to B inside link-layer frame

B and A are directly connected

Dest. Net.   next router   Nhops
223.1.1      -             1
223.1.2      223.1.1.4     2
223.1.3      223.1.1.4     2

misc

fields 223.1.1.1 223.1.1.3 data

Page 31: MULTIMEDIA COMMUNICATION & NETWORKS

#31

Getting a datagram from source to dest.

223.1.1.1

223.1.1.2

223.1.1.3

223.1.1.4 223.1.2.9

223.1.2.2

223.1.2.1

223.1.3.2 223.1.3.1

223.1.3.27

A

B

E

Dest. Net.   next router   Nhops
223.1.1      -             1
223.1.2      223.1.1.4     2
223.1.3      223.1.1.4     2

Starting at A, dest. E:

look up network address of E

E on different network

A, E not directly attached

routing table: next hop router to E

is 223.1.1.4

link layer sends datagram to router

223.1.1.4 inside link-layer frame

datagram arrives at 223.1.1.4

continued…..

misc

fields 223.1.1.1 223.1.2.2 data

Page 32: MULTIMEDIA COMMUNICATION & NETWORKS

#32

Getting a datagram from source to dest.

223.1.1.1

223.1.1.2

223.1.1.3

223.1.1.4 223.1.2.9

223.1.2.2

223.1.2.1

223.1.3.2 223.1.3.1

223.1.3.27

A

B

E

Arriving at 223.1.1.4, destined for

223.1.2.2

look up network address of E

E on same network as router’s interface 223.1.2.9

router, E directly attached

link layer sends datagram to

223.1.2.2 inside link-layer frame

via interface 223.1.2.9

datagram arrives at 223.1.2.2!!!

(hooray!)

misc

fields 223.1.1.1 223.1.2.2 data

Dest. network   next router   Nhops   interface
223.1.1         -             1       223.1.1.4
223.1.2         -             1       223.1.2.9
223.1.3         -             1       223.1.3.27

Page 33: MULTIMEDIA COMMUNICATION & NETWORKS

#33

IP datagram format

[Figure: IPv4 datagram layout, 32 bits per row]

• ver: IP protocol version number

• head. len: header length (counted in 32-bit words)

• type of service: “type” of data

• length: total datagram length (bytes)

• 16-bit identifier, flgs, fragment offset: for fragmentation/reassembly

• time to live: max number remaining hops (decremented at each router)

• upper layer: upper layer protocol to deliver payload to

• Internet checksum

• 32 bit source IP address

• 32 bit destination IP address

• Options (if any): e.g., timestamp, record route taken, specify list of routers to visit

• data (variable length, typically a TCP or UDP segment)

Page 34: MULTIMEDIA COMMUNICATION & NETWORKS

4-34

IP Fragmentation & Reassembly

• network links have an MTU (maximum transmission unit): the largest amount of data a link-level frame can carry.

– different link types, different

MTUs

• large IP datagram divided

(“fragmented”) within net – one datagram becomes several

datagrams

– “reassembled” only at final destination

– IP header bits used to identify,

order related fragments

fragmentation:

in: one large datagram

out: 3 smaller datagrams

reassembly

Page 35: MULTIMEDIA COMMUNICATION & NETWORKS

Network Layer 4-35

IP Fragmentation and Reassembly

Example

• 4000 byte datagram

• MTU = 1500 bytes

One large datagram becomes several smaller datagrams:

datagram     length   ID   fragflag   offset
original     4000     x    0          0
fragment 1   1500     x    1          0
fragment 2   1500     x    1          185
fragment 3   1040     x    0          370

• 1480 bytes in data field of each full-size fragment

• offset = 1480/8 = 185
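The arithmetic of this example can be reproduced with a small sketch. It assumes a 20-byte header without options, and the function name `fragment` and the dictionary layout are illustrative. Offsets are expressed in 8-byte units, so every fragment except the last must carry a multiple of 8 data bytes.

```python
def fragment(total_length, mtu, header_len=20):
    """Split a datagram of total_length bytes into fragments that fit the MTU."""
    payload = total_length - header_len
    # Largest data size per fragment, rounded down to a multiple of 8 bytes.
    max_data = (mtu - header_len) // 8 * 8
    fragments = []
    sent = 0
    while sent < payload:
        data = min(max_data, payload - sent)
        more = 1 if sent + data < payload else 0  # fragflag: more fragments follow
        fragments.append({"length": data + header_len,
                          "offset": sent // 8,
                          "fragflag": more})
        sent += data
    return fragments

for frag in fragment(4000, 1500):
    print(frag)
```

For the 4000-byte datagram, the 3980 data bytes are split into 1480 + 1480 + 1020, so the last fragment is 1040 bytes long including its header.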

Page 36: MULTIMEDIA COMMUNICATION & NETWORKS

Lecture 6: Network Layer #36

Routing in the Internet

• The Global Internet consists of Autonomous Systems (AS)

interconnected with each other:

– Stub AS: small corporation

– Multihomed AS: large corporation (no transit)

– Transit AS: provider

• Two-level routing:

– Intra-AS: administrator is responsible for choice

– Inter-AS: unique standard

Page 37: MULTIMEDIA COMMUNICATION & NETWORKS

Lecture 6: Network Layer #37

Internet AS Hierarchy

Inter-AS border (exterior gateway) routers

Intra-AS interior (gateway) routers

Page 38: MULTIMEDIA COMMUNICATION & NETWORKS

Lecture 6: Network Layer #38

Intra-AS Routing

• Also known as Interior Gateway Protocols (IGP)

• Most common IGPs:

– RIP: Routing Information Protocol

– OSPF: Open Shortest Path First

– IGRP: Interior Gateway Routing Protocol (Cisco

propr.)

Page 39: MULTIMEDIA COMMUNICATION & NETWORKS

Lecture 6: Network Layer #39

RIP ( Routing Information Protocol)

• Distance vector algorithm

• Included in BSD-UNIX Distribution in 1982

• Distance metric: # of hops (max = 15 hops)

– why?

• Distance vectors: exchanged every 30 sec via Response

Message (also called advertisement)

• Each advertisement: route to up to 25 destination nets

Page 40: MULTIMEDIA COMMUNICATION & NETWORKS

Lecture 6: Network Layer #40

RIP (Routing Information Protocol)

Destination Network Next Router Num. of hops to dest.

w A 2

y B 2

z B 7

x -- 1 …. …. ....

w x y

z

A

C

D B

Routing table in D

Page 41: MULTIMEDIA COMMUNICATION & NETWORKS

Lecture 6: Network Layer #41

RIP: Link Failure and Recovery

If no advertisement heard after 180 sec --> neighbor/link declared

dead

– routes via neighbor invalidated

– new advertisements sent to neighbors

– neighbors in turn send out new advertisements (if

tables changed)

– link failure info quickly propagates to entire net

– poison reverse used to prevent ping-pong loops

(infinite distance = 16 hops)
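How a RIP router folds a neighbor's advertisement into its own table can be sketched as a distance-vector update. This is a simplified model, not the full protocol: the function name, the table layout (destination mapped to next hop and hop count) and the sample advertisement from A are assumptions, and timers and triggered updates are left out.

```python
INFINITY = 16  # RIP's "unreachable" distance

def merge_advertisement(table, neighbor, advertised):
    """Update table (dest -> (next_hop, hops)) from a neighbor's distance vector.

    Returns True if any route changed, which would trigger new advertisements.
    """
    changed = False
    for dest, hops in advertised.items():
        new_cost = min(hops + 1, INFINITY)  # one extra hop to reach the neighbor
        next_hop, old_cost = table.get(dest, (None, INFINITY))
        via_same_neighbor = dest in table and next_hop == neighbor
        # Adopt a shorter route, or follow any change reported by the current next hop.
        if new_cost < old_cost or (via_same_neighbor and new_cost != old_cost):
            table[dest] = (neighbor, new_cost)
            changed = True
    return changed

# Routing table in D from the earlier slide (None marks a directly attached net).
table_d = {"w": ("A", 2), "y": ("B", 2), "z": ("B", 7), "x": (None, 1)}

# Hypothetical advertisement from A: it can reach z in 4 hops.
print(merge_advertisement(table_d, "A", {"w": 1, "x": 1, "z": 4}))
print(table_d["z"])
```

A route advertised with distance 16 by the current next hop overwrites the old entry, which is how a poisoned route marks a destination as unreachable.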

Page 42: MULTIMEDIA COMMUNICATION & NETWORKS

Lecture 6: Network Layer #42

OSPF (Open Shortest Path First)

• “open”: publicly available

• Uses Link State algorithm

– LS packet dissemination

– Topology map at each node

– Route computation using Dijkstra’s algorithm

• OSPF advertisement carries one entry per neighbor router

• Advertisements disseminated to entire AS (via flooding)
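Once flooding has given every router the same topology map, each one runs Dijkstra's algorithm locally. A minimal sketch follows; the four-router topology and its link costs are made up for illustration, and the function name is an assumption.

```python
import heapq

def shortest_paths(graph, source):
    """Dijkstra's algorithm over graph[node] = {neighbor: link cost}."""
    dist = {source: 0}
    prev = {}  # predecessor of each node on its shortest path
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for neighbor, cost in graph[node].items():
            candidate = d + cost
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    return dist, prev

# Hypothetical link-state database with symmetric link costs.
topology = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}

dist, prev = shortest_paths(topology, "A")
print(dist)
print(prev)
```

The forwarding table follows from `prev`: walking back from a destination to the source gives the first hop to use.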

Page 43: MULTIMEDIA COMMUNICATION & NETWORKS

Lecture 6: Network Layer #43

OSPF “advanced” features (not in RIP)

• Security: all OSPF messages authenticated (to prevent malicious intrusion); OSPF messages are carried directly in IP (protocol 89), not over TCP or UDP

• Multiple same-cost paths allowed

– only one path in RIP

• For each link, multiple cost metrics for different ToS (e.g., satellite link cost set “low” for best effort; high for real time)

• Integrated uni- and multicast support:

– Multicast OSPF (MOSPF) uses same topology data base as OSPF

• Hierarchical OSPF in large domains.

Page 44: MULTIMEDIA COMMUNICATION & NETWORKS

Lecture 6: Network Layer #44

Hierarchical OSPF

Page 45: MULTIMEDIA COMMUNICATION & NETWORKS

Lecture 6: Network Layer #45

Hierarchical OSPF

• Two-level hierarchy: local area, backbone.

– Link-state advertisements only in area

– each node has detailed area topology; only knows direction (shortest path) to nets in other areas.

• Area border routers: “summarize” distances to nets in own area, advertise to other Area Border routers.

• Backbone routers: run OSPF routing limited to backbone.

• Boundary routers: connect to other ASs.

Page 46: MULTIMEDIA COMMUNICATION & NETWORKS

Lecture 6: Network Layer #46

IGRP (Interior Gateway Routing Protocol)

• CISCO proprietary; successor of RIP (mid 80s)

• Distance Vector, like RIP

• several cost metrics (delay, bandwidth, reliability, load etc)

• routing updates carried directly in IP (protocol 9), not over TCP

• Loop-free routing via the Diffusing Update Algorithm (DUAL), based on diffused computation, came with its successor EIGRP

Page 47: MULTIMEDIA COMMUNICATION & NETWORKS

Lecture 6: Network Layer #47

Inter-AS routing

Page 48: MULTIMEDIA COMMUNICATION & NETWORKS

Lecture 6: Network Layer #48

Internet inter-AS routing: BGP

• BGP (Border Gateway Protocol): the de facto standard

• Path Vector protocol:

– similar to Distance Vector protocol

– each Border Gateway advertises to its neighbors (peers) the entire path (i.e., sequence of ASs) to the destination

– E.g., Gateway X may send its path to dest. Z:

Path (X,Z) = X,Y1,Y2,Y3,…,Z

Page 49: MULTIMEDIA COMMUNICATION & NETWORKS

Lecture 6: Network Layer #49

Internet inter-AS routing: BGP

Suppose: gateway X sends its path to peer gateway W

• W may or may not select path offered by X

– cost, policy (don’t route via a competitor’s AS), loop prevention reasons.

• If W selects path advertised by X, then: Path (W,Z) = W, Path (X,Z)

• Note: X can control incoming traffic by controlling its route advertisements to peers:

– e.g., don’t want to route traffic to Z -> don’t advertise any routes to Z
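W's decision can be sketched as a filter over the advertised AS path. This is a simplified illustration, not BGP's actual decision process: the function name, the `avoid` policy set and the sample paths are assumptions.

```python
def accept_path(own_as, advertised_path, avoid=frozenset()):
    """Return the path own_as would use, or None if the offer is rejected."""
    # Loop prevention: reject any path that already contains our own AS.
    if own_as in advertised_path:
        return None
    # Policy: reject paths that cross an AS we refuse to route through.
    if any(asn in avoid for asn in advertised_path):
        return None
    # Path(W, Z) = W, Path(X, Z)
    return [own_as] + list(advertised_path)

print(accept_path("W", ["X", "Y1", "Y2", "Z"]))
print(accept_path("W", ["X", "W", "Z"]))
print(accept_path("W", ["X", "Y1", "Z"], avoid={"Y1"}))
```

Because the whole path is carried in the advertisement, the loop check is a simple membership test, which is the main practical difference from a plain distance vector.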

Page 50: MULTIMEDIA COMMUNICATION & NETWORKS

Lecture 6: Network Layer #50

Internet inter-AS routing: BGP

• BGP messages exchanged using TCP.

• BGP messages:

– OPEN: opens TCP connection to peer and

authenticates sender

– UPDATE: advertises new path (or withdraws old)

– KEEPALIVE keeps connection alive in absence of

UPDATES; also ACKs OPEN request

– NOTIFICATION: reports errors in previous msg;

also used to close connection

Page 51: MULTIMEDIA COMMUNICATION & NETWORKS

Lecture 6: Network Layer #51

Why different Intra- and Inter-AS routing ?

Policy:

• Inter-AS: admin wants control over how its traffic routed, who

routes through its net.

• Intra-AS: single admin, so no policy decisions needed

Scale:

• hierarchical routing saves table size, reduced update traffic

Performance:

• Intra-AS: can focus on performance

• Inter-AS: policy may dominate over performance

Page 52: MULTIMEDIA COMMUNICATION & NETWORKS

Extra

Lecture 6: Network Layer #52

Page 53: MULTIMEDIA COMMUNICATION & NETWORKS

Network Layer 4-53

ICMP: Internet Control Message Protocol

• used by hosts & routers to

communicate network-level

information

– error reporting: unreachable host,

network, port, protocol

– echo request/reply (used by ping)

• network-layer “above” IP:

– ICMP msgs carried in IP datagrams

• ICMP message: type, code, plus the IP header and first 8 bytes of payload of the IP datagram causing the error

Type  Code  description
0     0     echo reply (ping)
3     0     dest. network unreachable
3     1     dest host unreachable
3     2     dest protocol unreachable
3     3     dest port unreachable
3     6     dest network unknown
3     7     dest host unknown
4     0     source quench (congestion control - not used)
8     0     echo request (ping)
9     0     router advertisement
10    0     router solicitation (router discovery)
11    0     TTL expired
12    0     bad IP header

Page 54: MULTIMEDIA COMMUNICATION & NETWORKS

Network Layer 4-54

Traceroute and ICMP

• Source sends series of UDP

segments to dest

– First has TTL =1

– Second has TTL=2, etc.

– Unlikely port number

• When nth datagram arrives to nth

router:

– Router discards datagram

– And sends to source an ICMP

message (type 11, code 0)

– Message includes the router's IP address (traceroute looks up the name separately)

• When ICMP message arrives, source calculates RTT

• Traceroute does this 3 times

Stopping criterion

• UDP segment eventually arrives at destination host

• Destination returns ICMP “dest port unreachable” packet (type 3, code 3)

• When source gets this ICMP, stops.
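The probing logic and the stopping criterion can be illustrated with a pure simulation. No packets are sent; the path is a hypothetical list of node names, and a real traceroute needs raw sockets and measures RTTs, which this sketch leaves out.

```python
def simulate_traceroute(path, max_ttl=30):
    """Simulate TTL-limited probes along path (routers, then the destination).

    Returns a list of (ttl, responding node, (ICMP type, ICMP code)).
    """
    replies = []
    for ttl in range(1, max_ttl + 1):
        remaining = ttl
        for index, node in enumerate(path):
            if index == len(path) - 1:
                # Probe reached the destination host: the unlikely port is closed.
                replies.append((ttl, node, (3, 3)))  # dest port unreachable
                return replies  # stopping criterion
            remaining -= 1  # each router decrements the TTL
            if remaining == 0:
                replies.append((ttl, node, (11, 0)))  # TTL expired
                break
    return replies

for reply in simulate_traceroute(["r1", "r2", "dest"]):
    print(reply)
```

The nth probe expires at the nth router, and the first probe that survives all routers triggers the type 3, code 3 reply that ends the trace.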

Page 55: MULTIMEDIA COMMUNICATION & NETWORKS

Example: tracert www.yahoo.com

Tracing route to www-real.wa1.b.yahoo.com [69.147.76.15]

over a maximum of 30 hops:

1 <1 ms <1 ms <1 ms 132.67.250.1

2 <1 ms 1 ms <1 ms dmz-cc-gw.math.tau.ac.il [132.67.252.2]

3 <1 ms <1 ms <1 ms tel-aviv.tau.ac.il [132.66.4.1]

4 1 ms <1 ms <1 ms gp1-tau-ge.ilan.net.il [128.139.191.70]

5 1 ms * 1 ms gp0-gp1-te.ilan.net.il [128.139.188.2]

6 87 ms 86 ms 87 ms iucc.rt1.fra.de.geant2.net [62.40.125.121]

7 87 ms 87 ms 87 ms TenGigabitEthernet7-3.ar1.FRA4.gblx.net [207.138.144.45]

8 177 ms 177 ms 177 ms 204.245.39.226

9 180 ms 177 ms 265 ms ae1-p151.msr2.re1.yahoo.com [216.115.108.23]

10 177 ms 177 ms 177 ms te-9-4.bas-a2.re1.yahoo.com [66.196.112.203]

11 177 ms 177 ms 177 ms f1.www.vip.re1.yahoo.com [69.147.76.15]

Trace complete.

Page 56: MULTIMEDIA COMMUNICATION & NETWORKS

Network Layer 4-56

IPv6

• Initial motivation: 32-bit address space soon to

be completely allocated.

• Additional motivation:

– header format helps speed processing/forwarding

– header changes to facilitate QoS

IPv6 datagram format:

– fixed-length 40 byte header

– no fragmentation allowed at intermediate routers

Page 57: MULTIMEDIA COMMUNICATION & NETWORKS

Network Layer 4-57

IPv6 Header (Cont)

Priority: identify priority among datagrams in flow

Flow Label: identify datagrams in same “flow.” (concept of “flow” not well defined).

Next header: identify upper layer protocol for data

Page 58: MULTIMEDIA COMMUNICATION & NETWORKS

Network Layer 4-58

Other Changes from IPv4

• Checksum: removed entirely to reduce

processing time at each hop

• Options: allowed, but outside of header,

indicated by “Next Header” field

• ICMPv6: new version of ICMP

– additional message types, e.g. “Packet Too Big”

– multicast group management functions

Page 59: MULTIMEDIA COMMUNICATION & NETWORKS

Network Layer 4-59

Transition From IPv4 To IPv6

• Not all routers can be upgraded simultaneously

– no “flag days”

– How will the network operate with mixed IPv4 and

IPv6 routers?

• Tunneling: IPv6 carried as payload in IPv4

datagram among IPv4 routers

Page 60: MULTIMEDIA COMMUNICATION & NETWORKS

Network Layer 4-60

Tunneling

Logical view: A, B, E, F are IPv6 routers; B and E are joined by a tunnel

Physical view: A, B (IPv6), then C, D (IPv4), then E, F (IPv6)

• A-to-B: IPv6 (Flow: X, Src: A, Dest: F, data)

• B-to-C: IPv6 inside IPv4 (outer Src: B, Dest: E; inner Flow: X, Src: A, Dest: F, data)

• D-to-E: IPv6 inside IPv4 (outer Src: B, Dest: E; inner Flow: X, Src: A, Dest: F, data)

• E-to-F: IPv6 (Flow: X, Src: A, Dest: F, data)
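The encapsulation at B and decapsulation at E can be sketched with two small record types. The class and function names are illustrative assumptions; the only protocol fact relied on is that an IPv4 datagram carrying IPv6 is marked with protocol number 41.

```python
from dataclasses import dataclass

@dataclass
class IPv6Datagram:
    flow: str
    src: str
    dst: str
    data: bytes

@dataclass
class IPv4Datagram:
    src: str
    dst: str
    payload: IPv6Datagram
    protocol: int = 41  # IPv4 protocol number for encapsulated IPv6

def enter_tunnel(inner, tunnel_src, tunnel_dst):
    """At B: wrap the untouched IPv6 datagram as the payload of an IPv4 datagram."""
    return IPv4Datagram(tunnel_src, tunnel_dst, inner)

def leave_tunnel(outer):
    """At E: strip the IPv4 header and forward the original IPv6 datagram."""
    return outer.payload

original = IPv6Datagram(flow="X", src="A", dst="F", data=b"data")
in_tunnel = enter_tunnel(original, "B", "E")
print(in_tunnel)
print(leave_tunnel(in_tunnel) == original)
```

The IPv4 routers C and D only look at the outer header (Src: B, Dest: E), so the inner datagram arrives at E unchanged.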

Page 61: MULTIMEDIA COMMUNICATION & NETWORKS

IPv6 status report

• Operating systems

– wide support – early 2000

– Windows (2000, XP, Vista), BSD, Linux, Apple

• Networking infrastructure – Cisco

• Deployment – Slow

• Penetration – Host - minor (less than 1%)

– Used in 2008 in China Olympic games

• Motivation: CIDR & NAT

Lecture 7: Network Layer II #61

Page 62: MULTIMEDIA COMMUNICATION & NETWORKS

Active Queue Management

Page 63: MULTIMEDIA COMMUNICATION & NETWORKS

Queuing Disciplines

• Each router must implement some queuing

discipline

• Queuing allocates both bandwidth and buffer

space:

– Bandwidth: which packet to serve (transmit) next

– Buffer space: which packet to drop next (when

required)

• Queuing also affects latency

Page 64: MULTIMEDIA COMMUNICATION & NETWORKS

Typical Internet Queuing

• FIFO + drop-tail

– Simplest choice

– Used widely in the Internet

• FIFO (first-in-first-out)

– Implies single class of traffic

• Drop-tail

– Arriving packets get dropped when queue is full regardless of flow or importance

• Important distinction:

– FIFO: scheduling discipline

– Drop-tail: drop policy

Page 65: MULTIMEDIA COMMUNICATION & NETWORKS

FIFO + Drop-tail Problems

• Leaves responsibility of congestion control to

edges (e.g., TCP)

• Does not separate between different flows

• No policing: send more packets get more

service

• Synchronization: end hosts react to same

events

Page 66: MULTIMEDIA COMMUNICATION & NETWORKS

Active Queue Management

• Design active router queue management to aid

congestion control

• Why?

– Routers can distinguish between propagation and

persistent queuing delays

– Routers can decide on transient congestion, based

on workload

Page 67: MULTIMEDIA COMMUNICATION & NETWORKS

Active Queue Designs

• Modify both router and hosts

– DECbit – congestion bit in packet header

• Modify router, hosts use TCP

– Fair queuing

• Per-connection buffer allocation

– RED (Random Early Detection)

• Drop packet or set bit in packet header as soon as

congestion is starting

Page 68: MULTIMEDIA COMMUNICATION & NETWORKS

Internet Problems

• Full queues

– Routers are forced to have large queues to maintain high utilizations

– TCP detects congestion from loss

• Forces network to have long standing queues in steady-state

• Lock-out problem

– Drop-tail routers treat bursty traffic poorly

– Traffic gets synchronized easily, which allows a few flows to monopolize the queue space

Page 69: MULTIMEDIA COMMUNICATION & NETWORKS

Design Objectives

• Keep throughput high and delay low

• Accommodate bursts

• Queue size should reflect ability to accept

bursts rather than steady-state queuing

• Improve TCP performance with minimal

hardware changes

Page 70: MULTIMEDIA COMMUNICATION & NETWORKS

Lock-out Problem

• Random drop

– Packet arriving when queue is full causes some

random packet to be dropped

• Drop front

– On full queue, drop packet at head of queue

• Random drop and drop front solve the lock-out

problem but not the full-queues problem

Page 71: MULTIMEDIA COMMUNICATION & NETWORKS

Full Queues Problem

• Drop packets before queue becomes full (early

drop)

• Intuition: notify senders of incipient

congestion

– Example: early random drop (ERD):

• If qlen > drop level, drop each new packet with fixed

probability p

• Does not control misbehaving users

Page 72: MULTIMEDIA COMMUNICATION & NETWORKS

Random Early Detection (RED)

• Detect incipient congestion, allow bursts

• Keep power (throughput/delay) high

– Keep average queue size low

– Assume hosts respond to lost packets

• Avoid window synchronization

– Randomly mark packets

• Avoid bias against bursty traffic

• Some protection against ill-behaved users

Page 73: MULTIMEDIA COMMUNICATION & NETWORKS

RED Algorithm

• Maintain running average of queue length

• If avgq < minth do nothing

– Low queuing, send packets through

• If avgq > maxth, drop packet

– Protection from misbehaving sources

• Else mark packet in a manner proportional to

queue length

– Notify sources of incipient congestion

Page 74: MULTIMEDIA COMMUNICATION & NETWORKS

RED Operation

Min thresh Max thresh

Average Queue Length

minth maxth

maxP

1.0

Avg queue length

P(drop)

Page 75: MULTIMEDIA COMMUNICATION & NETWORKS

RED Algorithm

• Maintain running average of queue length

– Byte mode vs. packet mode – why?

• For each packet arrival

– Calculate average queue size (avgq)

– If minth ≤ avgq < maxth

• Calculate probability Pa

• With probability Pa

– Mark the arriving packet

• Else if maxth ≤ avgq

– Mark the arriving packet
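The per-arrival decision can be sketched as a function of the average queue length. This shows only the linear ramp between the two thresholds (the RED operation curve, with `maxp` standing for maxP); full RED additionally spreads marks out using the number of packets since the last mark, which is omitted here. The function names and the threshold values in the usage lines are illustrative assumptions.

```python
import random

def red_mark_probability(avgq, minth, maxth, maxp):
    """Marking probability as a function of the average queue length."""
    if avgq < minth:
        return 0.0  # low queuing: let the packet through
    if avgq >= maxth:
        return 1.0  # persistent congestion: mark or drop every arrival
    # Linear ramp from 0 at minth up to maxp at maxth.
    return maxp * (avgq - minth) / (maxth - minth)

def should_mark(avgq, minth, maxth, maxp, rng=random):
    """Randomized decision for one arriving packet."""
    return rng.random() < red_mark_probability(avgq, minth, maxth, maxp)

print(red_mark_probability(3, 5, 15, 0.1))
print(red_mark_probability(10, 5, 15, 0.1))
print(red_mark_probability(20, 5, 15, 0.1))
```

Because marking is randomized, flows with more packets in the queue are marked more often, and different flows tend to back off at different times.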

Page 76: MULTIMEDIA COMMUNICATION & NETWORKS

Queue Estimation

• Standard EWMA: avgq ← (1 − wq) × avgq + wq × qlen

– Special fix for idle periods – why?

• Upper bound on wq depends on minth

– Want to ignore transient congestion

– Can calculate the queue average if a burst arrives

• Set wq such that certain burst size does not exceed minth

• Lower bound on wq to detect congestion relatively quickly

• Typical wq = 0.002
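The effect of a small wq can be seen by feeding a burst into the average. A minimal sketch, assuming the queue starts empty and grows by one packet per arrival during the burst; the function names and the burst length of 50 are illustrative.

```python
def update_average(avgq, qlen, wq=0.002):
    """One EWMA step: avgq <- (1 - wq) * avgq + wq * qlen."""
    return (1 - wq) * avgq + wq * qlen

def average_after_burst(burst_len, wq=0.002):
    """Average queue estimate after a burst fills an initially empty queue."""
    avgq = 0.0
    for qlen in range(1, burst_len + 1):
        avgq = update_average(avgq, qlen, wq)
    return avgq

print(average_after_burst(50))            # slow average, wq = 0.002
print(average_after_burst(50, wq=0.2))    # fast average reacts to the burst
```

With wq = 0.002 the estimate stays far below the instantaneous queue length of 50, so a burst of that size would not cross a minth of, say, 5 packets, while a large wq tracks the burst almost immediately.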

Page 77: MULTIMEDIA COMMUNICATION & NETWORKS

Extending RED for Flow Isolation

• Problem: what to do with non-cooperative flows?

• Fair queuing achieves isolation using per-flow state

– expensive at backbone routers

– How can we isolate unresponsive flows without per-flow state?

• RED penalty box

– Monitor history for packet drops, identify flows that use disproportionate bandwidth

– Isolate and punish those flows

Page 78: MULTIMEDIA COMMUNICATION & NETWORKS

FRED

• Fair Random Early Drop (Sigcomm, 1997)

• Maintain per flow state only for active flows

(ones having packets in the buffer)

• minq and maxq: min and max number of buffers a flow is allowed to occupy

• avgcq = average buffers per flow

• Strike count of number of times flow has

exceeded maxq

Page 79: MULTIMEDIA COMMUNICATION & NETWORKS

FRED – Fragile Flows

• Flows that send little data and want to avoid

loss

• minq is meant to protect these

• What should minq be?

– When large number of flows: 2-4 packets

• Needed for TCP behavior

– When small number of flows: increase to avgcq

Page 80: MULTIMEDIA COMMUNICATION & NETWORKS

FRED

• Non-adaptive flows

– Flows with high strike count are not allowed more

than avgcq buffers

– Allows adaptive flows to occasionally burst to

maxq but repeated attempts incur penalty

Page 81: MULTIMEDIA COMMUNICATION & NETWORKS

Stochastic Fair Blue

• Same objective as RED Penalty Box

– Identify and penalize misbehaving flows

• Create L hashes with N bins each

– Each bin keeps track of separate marking rate (pm)

– Rate is updated using standard technique and a bin size

– Flow uses minimum pm of all L bins it belongs to

– Non-misbehaving flows hopefully belong to at least one bin without a bad flow

• Large numbers of bad flows may cause false positives
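The bin bookkeeping can be sketched as follows. The class name, the parameter defaults (4 levels, 16 bins, bin size 10, step 0.02) and the use of salted SHA-256 as the per-level hash are all assumptions for illustration.

```python
import hashlib


class SFB:
    def __init__(self, levels=4, bins=16, bin_size=10, delta=0.02):
        self.L, self.N = levels, bins
        self.bin_size, self.delta = bin_size, delta
        self.qlen = [[0] * bins for _ in range(levels)]   # packets per bin
        self.pm = [[0.0] * bins for _ in range(levels)]   # marking rate per bin

    def _bins(self, flow_id):
        # One hash per level, derived by salting with the level index.
        out = []
        for lvl in range(self.L):
            h = hashlib.sha256(f"{lvl}:{flow_id}".encode()).digest()
            out.append(int.from_bytes(h[:4], "big") % self.N)
        return out

    def enqueue(self, flow_id):
        """Account one arriving packet; return the flow's marking probability."""
        idx = self._bins(flow_id)
        for lvl, b in enumerate(idx):
            self.qlen[lvl][b] += 1
            if self.qlen[lvl][b] > self.bin_size:
                self.pm[lvl][b] = min(1.0, self.pm[lvl][b] + self.delta)
        # The flow is judged by its least-congested bin.
        return min(self.pm[lvl][b] for lvl, b in enumerate(idx))

    def dequeue(self, flow_id):
        """Account one departing packet; relax pm when a bin drains."""
        for lvl, b in enumerate(self._bins(flow_id)):
            self.qlen[lvl][b] -= 1
            if self.qlen[lvl][b] == 0:
                self.pm[lvl][b] = max(0.0, self.pm[lvl][b] - self.delta)
```

A flow that keeps filling its bins drives pm to 1 in every level, while a well-behaved flow escapes the penalty as long as at least one of its L bins is not shared with the bad flow.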

Page 82: MULTIMEDIA COMMUNICATION & NETWORKS

Stochastic Fair Blue

• False positives can continuously penalize the same flow

• Solution: change the hash function over time

– Bad flow no longer shares bin with same flows

– If history is reset, does a bad flow get to make trouble until detected again?

• No, can perform hash warmup in background

Page 83: MULTIMEDIA COMMUNICATION & NETWORKS

Head of Line Blocking

Page 84: MULTIMEDIA COMMUNICATION & NETWORKS

Buffer locations

• Input ports

• Output ports

• Inside fabric

• Shared Memory

• Combination of all

[Figure: fabric with buffers]

Page 85: MULTIMEDIA COMMUNICATION & NETWORKS

Input Queuing

[Figure: inputs, fabric, outputs]

Page 86: MULTIMEDIA COMMUNICATION & NETWORKS

Input Buffer: properties

• Input speed of queue: no more than the input line rate

• Need arbiter (running N times faster than input)

• FIFO queue

• Head of Line (HoL) blocking

• Utilization:

• Random destination

• 2 − √2 ≈ 58.6% utilization (for large N)

• due to HoL blocking
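The utilization limit of FIFO input queues can be checked empirically with a small simulation. This is a sketch under stated assumptions: every input is always backlogged, destinations are uniformly random, and the arbiter picks a random contender per output. The function name and parameters are hypothetical.

```python
import random


def hol_throughput(n_ports=32, slots=5000, seed=1):
    """Saturated FIFO input queues: fraction of output slots that carry a packet."""
    rng = random.Random(seed)
    # Under saturation only the head-of-line destination of each input matters.
    heads = [rng.randrange(n_ports) for _ in range(n_ports)]
    delivered = 0
    for _ in range(slots):
        contenders = {}
        for inp, out in enumerate(heads):
            contenders.setdefault(out, []).append(inp)
        for out, inputs in contenders.items():
            winner = rng.choice(inputs)               # one input wins each output
            heads[winner] = rng.randrange(n_ports)    # its next packet moves to the head
            delivered += 1
    return delivered / (slots * n_ports)
```

Inputs that lose the arbitration keep the same head packet for the next slot, which is exactly the blocking effect. For a few dozen ports the measured value should land close to the large-N limit of 2 − √2 ≈ 0.586.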

Page 87: MULTIMEDIA COMMUNICATION & NETWORKS

Head of Line Blocking

Page 90: MULTIMEDIA COMMUNICATION & NETWORKS

Head of Line Blocking

[Figure: analogy with the destinations Stadium and Kwiky Mart (Beer/Soda/Chips)]

Page 91: MULTIMEDIA COMMUNICATION & NETWORKS

Output Queuing

[Figure: the Stadium and Kwiky Mart (Beer/Soda/Chips) analogy]

Page 92: MULTIMEDIA COMMUNICATION & NETWORKS

Head of Line Blocking

[Figure: input queue B C A C B and outputs A, B, C]

Page 93: MULTIMEDIA COMMUNICATION & NETWORKS

Head of Line Blocking

[Figure: input queue B C A C B C A B and outputs A, B, C]

Page 94: MULTIMEDIA COMMUNICATION & NETWORKS

Head of Line Blocking

[Figure: input queue C B C B C A B C B A and outputs A, B, C]

Page 95: MULTIMEDIA COMMUNICATION & NETWORKS

VOQ: Virtual Output Queues

[Figure: arriving packets B C A C B, arbiter (ARB), and outputs A, B, C]

Page 96: MULTIMEDIA COMMUNICATION & NETWORKS

VOQ: Virtual Output Queues

[Figure: arriving packets C B C A B sorted into per-output queues A, B, C; arbiter (ARB); outputs A, B, C]

Page 97: MULTIMEDIA COMMUNICATION & NETWORKS

VOQ: Virtual Output Queues

[Figure: arriving packets B A C C B sorted into per-output queues A, B, C; arbiter (ARB); outputs A, B, C]

Page 98: MULTIMEDIA COMMUNICATION & NETWORKS

Performance Issue with Cross-Bars

[Figure from the source below; the marked value is 58.6%]

Source: M. J. Karol, M. G. Hluchyj, S. P. Morgan, “Input Versus Output Queueing on a Space-Division Packet Switch”, IEEE Transactions on Communications, Vol. COM-35, No. 12, December 1987, p. 1353

Page 99: MULTIMEDIA COMMUNICATION & NETWORKS

Overcoming HoL blocking: look-ahead

The fabric looks ahead into the input buffer for packets that could be transferred if they were not blocked by the head of line.

Improvement depends on the depth of the look-ahead.

In the limit this corresponds to virtual output queues, where each input port has a separate buffer for each output port.
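A minimal sketch of virtual output queues with a simple greedy arbiter shows how a packet for a free output is no longer stuck behind a blocked one. The class and function names are hypothetical, and real switches use more elaborate matching schemes than this one-pass greedy match.

```python
from collections import deque


class VOQInput:
    """One input port holding a separate FIFO per output port."""

    def __init__(self, outputs):
        self.q = {o: deque() for o in outputs}

    def push(self, out, pkt):
        self.q[out].append(pkt)


def arbitrate(inputs, outputs):
    """Greedy one-slot match: each output takes a packet from the first
    still-unmatched input that has something queued for it."""
    used_inputs, sent = set(), {}
    for out in outputs:
        for i, port in enumerate(inputs):
            if i not in used_inputs and port.q[out]:
                sent[out] = port.q[out].popleft()
                used_inputs.add(i)
                break
    return sent
```

If two inputs both hold a packet for output A, and the second input also holds one for output C, a single FIFO per input would leave C idle while the second input waits for A. With per-output queues the arbiter can serve A from the first input and C from the second in the same slot.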

Page 100: MULTIMEDIA COMMUNICATION & NETWORKS

Input Queuing: Virtual output queues

Page 101: MULTIMEDIA COMMUNICATION & NETWORKS

Overcoming HoL blocking: output expansion

Each output port is expanded to L output ports.

The fabric can transfer up to L packets to the same output in one slot, instead of one.

Karol, Hluchyj and Morgan, IEEE Transactions on Communications, 1987, pp. 1347-1356

Page 102: MULTIMEDIA COMMUNICATION & NETWORKS

Input Queuing: Output Expansion

[Figure: fabric with each output expanded to L ports]

Page 103: MULTIMEDIA COMMUNICATION & NETWORKS

Output Queuing: The “ideal”

[Figure: switch with packets labeled 1 and 2]

Page 104: MULTIMEDIA COMMUNICATION & NETWORKS

Output Buffer: properties

• No HoL problem

• Output queue needs to run faster than input lines

• Need to provide for N packets arriving to same queue

• Solution: limit the number of input lines that can send to the same output at once.

Page 105: MULTIMEDIA COMMUNICATION & NETWORKS

Shared Memory

a common pool of buffers divided into linked lists indexed by output port number

[Figure: FABRIC, MEMORY, FABRIC]

Page 106: MULTIMEDIA COMMUNICATION & NETWORKS

Shared Memory: properties

• Packets stored in memory as they arrive

• Resource sharing

• Easy to implement priorities

• Memory is accessed at speed equal to sum of the input or output speeds

• How to divide the space between the sessions?

Page 107: MULTIMEDIA COMMUNICATION & NETWORKS

Multicast: one sender to many receivers

• Multicast: one sender to many receivers

– analogy: one teacher to many students

• Question: how to achieve multicast

Page 108: MULTIMEDIA COMMUNICATION & NETWORKS

Internet Multicast Service Model

multicast group concept:

– hosts send IP datagram pkts to multicast group

– hosts that have “joined” that multicast group will receive pkts sent to that group

Page 109: MULTIMEDIA COMMUNICATION & NETWORKS

Multicast groups

• host group semantics:

– anyone can “join” (receive) multicast group

– anyone can send to multicast group

– no network-layer identification of members to hosts

• session/application-level mechanisms needed for membership identification, privacy

• needed: infrastructure to deliver mcast-addressed packets to all hosts that have joined that multicast group

Page 110: MULTIMEDIA COMMUNICATION & NETWORKS

Internet Multicast Addressing

• indirection: mcast address does not name a

destination, but host group to receive packet

• class D Internet addresses reserved for multicast:

packet addr: 226.17.30.197
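Class D addresses are those whose first four bits are 1110, that is, the range 224.0.0.0 to 239.255.255.255. A short check, with a hypothetical helper name:

```python
import ipaddress


def is_class_d(addr):
    """True if the first four bits of the IPv4 address are 1110 (224.0.0.0/4)."""
    first_octet = int(ipaddress.IPv4Address(addr)) >> 24
    return (first_octet >> 4) == 0b1110
```

The standard library exposes the same test as `ipaddress.ip_address(addr).is_multicast`.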

Page 111: MULTIMEDIA COMMUNICATION & NETWORKS

Joining a mcast group: a two-step process

• local: host informs local mcast router of desire to join group: IGMP

• wide area: local router interacts with other routers to receive mcast packet flow

– many protocols (e.g., DVMRP, MOSPF, PIM)

Page 112: MULTIMEDIA COMMUNICATION & NETWORKS

IGMP: Internet Group Management Protocol

• host: sends IGMP report when application

joins mcast group

– IP_ADD_MEMBERSHIP socket option

– host need not explicitly “unjoin” group when leaving

• router: sends IGMP query at regular intervals

– host belonging to a mcast group must reply to

query
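From the application's side, joining comes down to a single socket option; the host's IP stack then emits the IGMP report. A minimal sketch follows, in which the helper names are hypothetical and the interface address 0.0.0.0 is an assumption that leaves the choice of interface to the kernel.

```python
import socket


def make_mreq(group, iface="0.0.0.0"):
    """Pack an ip_mreq structure: group address followed by local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(iface)


def join_group(sock, group, iface="0.0.0.0"):
    """Ask the kernel to join `group` on a UDP socket."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    make_mreq(group, iface))
```

Typical use would be to create a `SOCK_DGRAM` socket, bind it to the group's port, and call `join_group(sock, "226.17.30.197")`. Closing the socket drops the membership.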

Page 113: MULTIMEDIA COMMUNICATION & NETWORKS

IGMP

IGMP version 1

• router: Host Membership Query msg broadcast on LAN to all hosts

• host: Host Membership Report msg to indicate group membership

– randomized delay before responding

– implicit leave via no reply to Query

• RFC 1112

IGMP v2: additions include

• group-specific Query

• Leave Group msg

– last host replying to Query can send explicit Leave Group msg

– router performs group-specific query to see if any hosts left in group

– RFC 2236

IGMP v3: adds source filtering (RFC 3376)

Page 114: MULTIMEDIA COMMUNICATION & NETWORKS

Multicast Issues

• Naming

• Membership Management

• Routing

Page 115: MULTIMEDIA COMMUNICATION & NETWORKS

IP Multicast Naming

• Class D address represents multicast group

– E.g. 226.17.30.197

• Datagram with destination address set to group delivered to all hosts in the group

– Indirection

– 226.17.30.197 => 65.30.1.2, 66.8.3.53, 128.32.75.60, …

– Sender may or may not be in the group

• No address hierarchy or subnets

– How is routing done?

Page 116: MULTIMEDIA COMMUNICATION & NETWORKS

Membership Management

• Some other questions:

– Who is part of the group?

– How does one join?

– How does one leave?

– Who decides if it’s OK?

• Membership management answers these

Page 117: MULTIMEDIA COMMUNICATION & NETWORKS

IGMP

• Internet Group Management Protocol

• Runs only between host and router

– Multicast routing takes care of communication

between routers

Page 118: MULTIMEDIA COMMUNICATION & NETWORKS

IGMP

hosts

routers

host-to-router protocol

(IGMP)

multicast routing protocols

(various)

Page 119: MULTIMEDIA COMMUNICATION & NETWORKS

IGMP query

• IGMP membership_query

– Router sends query

– Find out all groups a host belongs to

– Can query a specific group instead

– Sent to the “all systems group” (224.0.0.1) with

TTL=1

Page 120: MULTIMEDIA COMMUNICATION & NETWORKS

IGMP report

• IGMP membership_report

– Response from host to a query

– Can send report unsolicited

• Join group this way!

• IGMP leave_group

– Optional

– Router will clean up membership info on next

membership_query

Page 121: MULTIMEDIA COMMUNICATION & NETWORKS

IGMP properties

• Minimalist semantics

– Host controlled membership

• No decision about:

– Who controls membership

– Invitations

– How to find groups and join them

• Move these decisions to application layer

Page 122: MULTIMEDIA COMMUNICATION & NETWORKS

Soft state

• Host is authoritative on group membership

• Router maintains “soft state”

• A crashed router soon recovers

– Sends a new membership_query

– Misdelivers packets for a little while

• OK by IP service model!
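Soft state can be pictured as a table whose entries expire unless they are refreshed by reports. In this minimal sketch the class name is hypothetical and the 260-second default timeout is an assumption.

```python
class MembershipTable:
    """Router-side soft state: group -> time the last report was heard."""

    def __init__(self, timeout=260.0):
        self.timeout = timeout
        self.last_report = {}

    def on_report(self, group, now):
        # A report creates the entry or refreshes it.
        self.last_report[group] = now

    def active_groups(self, now):
        # Entries not refreshed within `timeout` simply disappear.
        self.last_report = {g: t for g, t in self.last_report.items()
                            if now - t <= self.timeout}
        return set(self.last_report)
```

A router that crashes starts with an empty table and rebuilds it from the replies to its next query, so no explicit recovery protocol is needed.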

Page 123: MULTIMEDIA COMMUNICATION & NETWORKS

Protocol types

• Dense mode protocols

– assumes dense group membership

– Source distribution tree and NACK type

– DVMRP (Distance Vector Multicast Routing Protocol)

– PIM-DM (Protocol Independent Multicast, Dense Mode)

– Example: Company-wide announcement

• Sparse mode protocol

– assumes sparse group membership

– Shared distribution tree and ACK type

– PIM-SM (Protocol Independent Multicast, Sparse Mode)

– Examples: a Shuttle Launch

Page 124: MULTIMEDIA COMMUNICATION & NETWORKS

Multicast Routing

• A number of routers have hosts that belong to

a multicast group

• How to connect them (and others) in a tree?

– Shared tree: single tree for all

– Source-based tree: many trees

Page 125: MULTIMEDIA COMMUNICATION & NETWORKS

Core-Based Tree

• Tree rooted at a core

• To join a group, send unicast message towards core

– Add all links traversed until an existing tree node is hit
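The join procedure can be sketched as a walk along the unicast path toward the core. The function name and the representation of the tree as a node set plus a link set are assumptions for illustration.

```python
def cbt_join(tree_nodes, tree_links, path_to_core):
    """Graft a new member onto the tree.

    path_to_core lists the routers on the unicast path from the joining
    router to the core; the core must already be in tree_nodes.
    """
    for here, nxt in zip(path_to_core, path_to_core[1:]):
        if here in tree_nodes:
            break                      # reached the existing tree: stop
        tree_nodes.add(here)
        tree_links.add((here, nxt))    # link traversed by the join message
```

Only the routers between the new member and the first on-tree router gain state; the rest of the tree is untouched.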

Page 126: MULTIMEDIA COMMUNICATION & NETWORKS

[Figure: core-based tree diagram with the core marked]

Page 127: MULTIMEDIA COMMUNICATION & NETWORKS

Choice of Core

• If core close to source, efficiency is good

• If core far from source, efficiency falls

– Delay up to twice optimal

• Optimal core placement is NP-hard

– Use heuristics

Page 128: MULTIMEDIA COMMUNICATION & NETWORKS

Source-based Trees

• Different tree for each possible source

– Why?

• Reverse path forwarding to figure out tree

• Pruning to leave out routers
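Reverse path forwarding accepts a packet only if it arrives on the interface the router itself would use to reach the source. A minimal sketch, in which the function name and the dictionary-based unicast table are assumptions:

```python
def rpf_forward(unicast_next_hop, source, arrival_iface, ifaces):
    """Return the interfaces to flood on, or [] if the RPF check fails.

    unicast_next_hop maps a source address to the interface used to reach it.
    """
    if unicast_next_hop.get(source) != arrival_iface:
        return []                      # not on the reverse shortest path: drop
    return [i for i in ifaces if i != arrival_iface]
```

Because the check depends on the source, each source induces its own tree, which is why a different tree exists per source.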

Page 129: MULTIMEDIA COMMUNICATION & NETWORKS

Pruning

• Prune when no attached members or downstream routers

• Propagate prune messages upstream

[Figure: source S and routers R1 to R7. Legend: router with attached group member; router with no attached group member; links with multicast forwarding; prune message (P)]
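The effect of pruning can be sketched as a bottom-up pass over the flooding tree. The function name and the small topology in the usage note are hypothetical and do not reproduce the figure.

```python
def prune(children, members, node):
    """Return the routers kept in the subtree under `node` after pruning.

    children maps a router to its downstream routers; members is the set
    of routers with attached group members.
    """
    kept = set()
    for child in children.get(node, []):
        kept |= prune(children, members, child)
    if kept or node in members:
        kept.add(node)     # stays: has members itself or an unpruned child
    return kept
```

For example, with `children = {"S": ["R1"], "R1": ["R2", "R3"], "R2": ["R4"], "R3": ["R5"]}` and members `{"R4"}`, the branch through R3 and R5 is pruned and the branch through R2 survives.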

Page 130: MULTIMEDIA COMMUNICATION & NETWORKS

DVMRP

• Distance Vector Multicast Routing Protocol

• DV + RPF + Pruning

• Distance vector carries distances to multicast sources

• Pruning carries a timeout

– Afterwards, traffic delivery is resumed

• Explicit graft message to reverse pruning

– Done upon join

Page 131: MULTIMEDIA COMMUNICATION & NETWORKS

MOSPF

• Multicast Extensions to OSPF

• Link-state advertisements include multicast group

membership

– Only report directly connected hosts

• Compute shortest-path spanning tree rooted at

source

– On demand, when receiving packet from source for the

first time

– Forward multicast traffic along tree
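Since every router holds the full link-state database, each can compute the same source-rooted tree on its own. A minimal Dijkstra sketch, in which the function name and the adjacency-dictionary format are assumptions:

```python
import heapq


def shortest_path_tree(graph, source):
    """Dijkstra: return {router: parent} for the shortest-path tree rooted at source.

    graph maps each router to a dict of {neighbor: link cost}.
    """
    dist = {source: 0}
    parent = {source: None}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                   # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                parent[v] = u
                heapq.heappush(heap, (d + w, v))
    return parent
```

In MOSPF the resulting tree would then be trimmed to the branches that lead to routers advertising group members.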

Page 132: MULTIMEDIA COMMUNICATION & NETWORKS

MOSPF performance

• Global state allows source-based trees to be

used

– Faster delivery of messages

• Overhead

– Joins and leaves flooded to all routers

– Any change may cause whole tree to be

recomputed

Page 133: MULTIMEDIA COMMUNICATION & NETWORKS

PIM

• Protocol Independent Multicast

– Uses routing tables, but agnostic of how they are built

• Two settings:

– Dense: most routers members of a group

• Use RPF flooding with pruning

– Sparse: most routers not members of a group

• Use shared tree or source-based tree based on data characteristics

• Uses soft-state

Page 134: MULTIMEDIA COMMUNICATION & NETWORKS

Sparse vs. Dense

Dense Mode

• Dense participants

• B/W plentiful

• Membership assumed

until pruned

• Data driven

Sparse Mode

• Sparse participants

• B/W overhead

significant

• Membership explicitly

requested

• Receiver driven

Page 135: MULTIMEDIA COMMUNICATION & NETWORKS

Shared v. Source-based Trees

• Shared trees used initially

– Tree rooted at rendezvous point (RP)

• Can switch to source-based trees when data rate is high

– RP sends a Join message to source

– Each router independently decides to switch to source-based tree, sends Join to source

Page 136: MULTIMEDIA COMMUNICATION & NETWORKS

Shared Tree Example

[Figure: rendezvous point RP, source S, and group members G]

Page 137: MULTIMEDIA COMMUNICATION & NETWORKS

PIM Receiver Join

[Figure: RP, source S, and group members G; a new member sends Report G and a Join (*,G) is sent. Callout: what if the join is here?]

Page 138: MULTIMEDIA COMMUNICATION & NETWORKS

PIM Shared Tree After Join

[Figure: RP, source S, and group members G, including the new member]

Page 139: MULTIMEDIA COMMUNICATION & NETWORKS

PIM Source Based Tree

[Figure: RP, source S, and group members G; a Join (s,g) is sent]

Page 140: MULTIMEDIA COMMUNICATION & NETWORKS

PIM Source Based Tree

[Figure: RP, source S, and group members G after the switch to the source-based tree]

Page 141: MULTIMEDIA COMMUNICATION & NETWORKS

PIM routing tables

• Routing entries of the form (s,g)

– s - source

– g - group

• Wildcard entries (*,g) for shared-group trees

• Packets are routed using best match
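Best-match lookup means the specific (s,g) entry wins over the (*,g) wildcard when both exist. A minimal sketch, in which the table layout and the function name are assumptions:

```python
def lookup(table, source, group):
    """Prefer the specific (s, g) entry; fall back to the (*, g) shared-tree entry.

    table maps (source, group) tuples to forwarding state; "*" stands for any source.
    """
    if (source, group) in table:
        return table[(source, group)]
    return table.get(("*", group))
```

Traffic from a source that has been switched to a source-based tree follows its own entry, while all other sources of the same group keep using the shared tree.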

Page 142: MULTIMEDIA COMMUNICATION & NETWORKS

Queries