
UDP, TCP/IP, and IP Multicast

COM S 414
Sunny Gleason, Vivek Uppal
Tuesday, October 23rd, 2001

In This Lecture

• We will build on understanding of IP (Internet Protocol)
  – UDP: User Datagram Protocol
    • Unreliable, packet-based protocol
  – TCP: Transmission Control Protocol
    • Reliable, connection-oriented, stream-based protocol
  – IP Multicast (if time allows…)
    • Facilities for delivering datagrams to multiple recipients
  – We won’t discuss ICMP (Internet Control Message Protocol), but you can look it up if you want

Where To Find More Info

• For More “Practical” Information
  – Network Programming in Java
    • The Java Custom Networking Trail
      http://java.sun.com/docs/books/tutorial/networking/sockets/
      http://java.sun.com/docs/books/tutorial/networking/datagrams/
  – Network Programming in C
    • Books by W. Richard Stevens [HIGHLY recommended!]
      – “TCP/IP Illustrated” Series
      – UNIX Network Programming, Vol. 1
  – Kernel Source – “Real” Protocol Stacks
    • Linux TCP/IP Stack
      – http://www.kernel.org/pub/linux/kernel/v2.4/
    • OpenBSD TCP/IP Stack
      – ftp://ftp.openbsd.org/pub/OpenBSD/src/sys/netinet/

Where to Find More Info

• Papers, Lecture Notes and RFCs
  – TCP Congestion Control
    • Van Jacobson, “Congestion Avoidance and Control”, 1988
    • Internet RFC Series: http://www.rfc-editor.org/
  – CS514 - Fall 2000 Lecture Notes
  – Birman, Kenneth. Building Secure and Reliable Network Applications. 1995.

First, some definitions…

• Keep the OSI Layers in mind!
• Address
  – An identifier, following an addressing convention, which allows a machine to be uniquely identified
• MAC Address, or Hardware Address
  – Numeric address used by Ethernet (data-link layer)
  – Might look like: “00:02:2D:08:68:F8”
• IP Address
  – Numeric address used by IP (network layer)
  – Might look like: “128.84.133.221”
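To make "numeric address" concrete, here is a minimal sketch that parses the two example addresses from the slide using Python's standard library. The variable names are only illustrative.

```python
import ipaddress

# The two example addresses from the slide.
ip = ipaddress.ip_address("128.84.133.221")
mac = bytes.fromhex("00:02:2D:08:68:F8".replace(":", ""))

print(int(ip))          # the IP address as a single 32-bit integer
print(ip.packed.hex())  # the same address as 4 raw bytes, in hex
print(len(mac))         # a MAC address is 6 bytes (48 bits)
```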

First, some definitions…

• Packet, or Datagram
  – self-contained unit of information
  – consists of a header and body
• Packet Header
  – For now, realize that it includes source address, destination address
  – With layered model, “nesting” of headers
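The "nesting" of headers can be sketched as one layer's whole packet becoming the body of the layer below. In this sketch the UDP header uses its real four-field layout, but `toy_ip_packet` is a deliberately simplified, hypothetical stand-in for the IP header that carries only the two addresses.

```python
import socket
import struct

def udp_segment(src_port, dst_port, payload):
    # UDP header layout: four 16-bit fields (ports, length, checksum).
    length = 8 + len(payload)
    checksum = 0  # 0 means "no checksum computed" for UDP over IPv4
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload

def toy_ip_packet(src_ip, dst_ip, body):
    # Simplified stand-in for the IP header: source and destination only.
    header = socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
    return header + body

packet = toy_ip_packet("171.64.14.203", "128.84.154.132",
                       udp_segment(5000, 53, b"hello"))

# Peel the layers in the order a receiver would.
src_ip = socket.inet_ntoa(packet[0:4])
dst_port = struct.unpack("!H", packet[10:12])[0]
```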

First, some definitions…

• Local Area Network (LAN)
  – Group of machines sharing a common communications medium (such as Ethernet)
  – High data rates, “private wires”, shorter distances
• Wide Area Network (WAN)
  – spans a greater geographic area, may depend on publicly available network structures (telephone system, leased lines, satellites…)

First, some definitions…

• Router
  – Machine that moves packets from one network to a network that is closer to the destination
  – (Based on a routing table, which may change)
• Bridge
  – A machine that “indiscriminately” replicates packets between two LANs
  – typically “not as smart” but faster than a router
• Gateway
  – A machine that routes packets from the LAN to the WAN (What is a Firewall?)
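A routing-table lookup might be sketched as follows. The slide does not say how an entry is chosen; this sketch assumes longest-prefix matching, which is the usual rule for IP routers. The table entries and next-hop names are hypothetical.

```python
import ipaddress

# Hypothetical routing table: (destination network, next hop).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "gateway"),
    (ipaddress.ip_network("128.84.0.0/16"), "router-1"),
    (ipaddress.ip_network("128.84.154.0/24"), "router-2"),
]

def next_hop(dst):
    addr = ipaddress.ip_address(dst)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    # The most specific (longest) matching prefix wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]
```

With this table, `next_hop("128.84.154.132")` selects `router-2`, while an address outside 128.84.0.0/16 falls through to the default route.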

First, some definitions…

• Port
  – In UDP and TCP, a number which the kernel uses to deliver datagrams to the appropriate application
  – For instance: HTTP is port 80, SMTP is port 25, Telnet is port 23, DNS is port 53, FTP is port 21

• In this model, receivers agree to wait for datagrams on a specified port

• Socket: {address, port}

The Internet

• A network based on the Internet Protocol (IP)

[Figure: network diagram; the legend marks routers]

The Internet

• Routes IP Datagrams from point A to point B … [unreliably]

[Figure: network diagram; the legend marks routers. Endpoints are A: 171.64.14.203 and B: 128.84.154.132]

Unreliably?

• What good is that?
• Packet loss rate is extremely low (<< 1%)
• Packets usually dropped by overloaded routers (as we’ll see later)
• This is good enough for us to build the User Datagram Protocol (UDP)

UDP

• For applications where IP’s best-effort delivery is good enough
  – Streaming multimedia, stock quotes…
• Extends IP packet with source port, destination port
• In addition, provides a length field and a checksum; oversized datagrams are fragmented by the IP layer

Fragmentation in UDP

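• Very simple: the IP layer splits a large UDP datagram into multiple IP fragments, each carrying the datagram’s identification value and its own fragment offset
• Every fragment except the last has the “more fragments” flag set in the IP header (the UDP header itself has no fragmentation field)
• If one fragment is lost, the whole UDP packet is discarded
• UDP datagrams are discarded if checksum fails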
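The checksum test can be sketched with the 16-bit one's-complement sum used across the Internet protocols. This is a simplified illustration: the real UDP checksum also covers a pseudo-header with the IP addresses, which is omitted here.

```python
def internet_checksum(data: bytes) -> int:
    if len(data) % 2:
        data += b"\x00"          # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:           # fold carries back in (one's-complement add)
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def passes_checksum(data: bytes, checksum: int) -> bool:
    # Receiver side: recompute and compare; on mismatch, discard.
    return internet_checksum(data) == checksum
```

A receiver would call `passes_checksum` on each arriving datagram and silently drop the ones that fail.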

The UDP API

• No-frills! Basically, you:
  – Create a socket {address, port}
  – Send data to a remote socket
  – Receive data on a given socket
• No guarantees about reliability, or even the ordering in which datagrams are received
• How can we get around this?
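The three steps of the UDP API might look like this in Python, run over the loopback interface. The payload is made up for illustration, and the timeout is there because nothing guarantees that a datagram arrives.

```python
import socket

# Receiver: create a socket and bind it to {address, port}.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))      # port 0: let the kernel pick a free port
receiver.settimeout(2.0)             # never block forever on a lost datagram
addr = receiver.getsockname()

# Sender: no connection setup, just send to the remote socket.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"stock quote: 42", addr)

# Receive one datagram, along with the socket it came from.
data, source = receiver.recvfrom(2048)
sender.close()
receiver.close()
```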

Adding Reliability to UDP

• Timeouts & Acknowledgements
  – Receiver sends acks of received datagrams
  – If sender does not receive ack within a certain time, retransmit the packet
• Sequence Numbers
  – Sender marks datagrams with sequence numbers
  – Receiver uses sequence numbers to restore order to the datagrams, and ignore duplicates
• What if we have 100 or more concurrent applications? Is this efficient?
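The combination of acks, timeouts and sequence numbers can be sketched as a stop-and-wait simulation. This is a toy model with no real sockets: the lossy channel is simulated with a seeded random generator, and `loss_rate` is an assumed parameter.

```python
import random

def send_reliably(messages, loss_rate=0.3, seed=1):
    rng = random.Random(seed)
    delivered = []          # what the receiving application sees
    expected = 0            # receiver: next sequence number it will accept
    transmissions = 0
    for seq, msg in enumerate(messages):
        acked = False
        while not acked:
            transmissions += 1
            if rng.random() < loss_rate:
                continue                    # datagram lost: sender times out
            if seq == expected:             # new datagram: deliver it
                delivered.append(msg)
                expected += 1
            # otherwise it is a duplicate of a delivered datagram: ignore it
            if rng.random() < loss_rate:
                continue                    # ack lost: sender times out
            acked = True
    return delivered, transmissions
```

Even with a third of all packets lost, the receiver sees every message exactly once and in order; the cost shows up in the transmission count.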

TCP

• A TCP connection is defined by:
  – { src_addr, src_port, dst_addr, dst_port }
  – Note symmetry at both ends of connection
  – Thus, sender is a receiver and vice-versa
• The goal: a reliable, stream-based, connection-oriented protocol
  – Reliable: data gets through [or connection breaks]
  – Stream-based: imagine reading a file in-order
  – Connection-oriented: point-to-point
• How is it all done?
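Before looking inside TCP, a small loopback sketch shows the two properties named above from the application's side: the four values that define the connection, and the stream abstraction. Everything runs in one process for simplicity.

```python
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
server.settimeout(2.0)

client = socket.create_connection(server.getsockname(), timeout=2.0)
conn, peer = server.accept()
conn.settimeout(2.0)

# The same four values identify the connection at both ends, mirrored.
client_view = client.getsockname() + client.getpeername()
server_view = conn.getpeername() + conn.getsockname()

# Stream-based: two writes may arrive as one read, or as several.
client.sendall(b"hello, ")
client.sendall(b"world")
client.close()                      # end of stream
received = b""
while True:
    chunk = conn.recv(1024)
    if not chunk:                   # empty read: the peer closed its side
        break
    received += chunk
conn.close()
server.close()
```

The receiver loops until end of stream because TCP preserves byte order but not the boundaries between individual writes.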

Vivek Presents …

• The inner workings of the TCP protocol…

• Any questions before we move on?

TCP

• TCP – Stream Protocol
• 3-way Handshake
• Closing a connection
• Acknowledgments
• Sliding Window
• Flow Control
• RED

TCP -- Stream Protocol

• Connection oriented
• like a telephone connection
• Needs set up before the transfer starts.
• Reliable, point to point communication.
• In order delivery
• No loss or duplication.
• Flow Control and error correction
• Duplex connections

3 Way Handshake

TCP is connection oriented

Connection initiated by a 3-way handshake

Takes 3 packets

Protection against duplicate SYN packets

A → B: SYN
B → A: SYN, ACK of SYN
A → B: ACK of SYN

Basic 3 Way Handshake

    TCP A              SEQ     ACK     CTL              TCP B
1.  CLOSED                                              LISTEN
2.  SYN-SENT      -->  <100>           <SYN>        --> SYN-RECV
3.  ESTABLISHED   <--  <300>   <101>   <SYN,ACK>    <-- SYN-RECV
4.  ESTABLISHED   -->  <101>   <301>   <ACK>        --> ESTABLISHED

Duplicate Recovery

    TCP A              SEQ     ACK     CTL              TCP B
1.  CLOSED                                              LISTEN
2.  SYN-SENT      -->  <100>           <SYN>        ...
3.  (duplicate)   ...  <90>            <SYN>        --> SYN-RECV
4.                <--  <300>   <91>    <SYN,ACK>    <-- (duplicate)
5.                -->  <91>            <RST>        --> LISTEN
6.                ...  <100>           <SYN>        --> SYN-RECV
7.  SYN-SENT      <--  <400>   <101>   <SYN,ACK>    <-- SYN-RECV
8.  ESTABLISHED   -->  <101>   <401>   <ACK>        --> ESTABLISHED

3 Way Handshake

It ensures that both sides are ready to transmit data, and that both ends know that the other end is ready before transmission actually starts.

It allows both sides to pick the initial sequence number to use.
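The sequence and acknowledgement numbers in the handshake follow a simple rule: each side acknowledges the other's initial sequence number plus one, because a SYN consumes one sequence number. A minimal sketch (the tuple layout is just for illustration):

```python
def three_way_handshake(isn_a, isn_b):
    # Each tuple is (sender, SEQ, ACK, CTL); ACK is None when not set.
    wrap = 2 ** 32                      # sequence numbers are 32 bits wide
    syn = ("A", isn_a, None, "SYN")
    syn_ack = ("B", isn_b, (isn_a + 1) % wrap, "SYN,ACK")
    ack = ("A", (isn_a + 1) % wrap, (isn_b + 1) % wrap, "ACK")
    return [syn, syn_ack, ack]
```

Calling `three_way_handshake(100, 300)` yields the SEQ and ACK values of steps 2 to 4 in the basic handshake trace.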

Closing a Connection

A → B: FIN, ACK
B → A: ACK of FIN

Send a FIN packet before tearing down the connection

Each process must send its own FIN packet to close the connection in its direction

Closing a Connection

    TCP A                   SEQ     ACK     CTL              TCP B
1.  ESTABLISHED                                              ESTABLISHED
2.  (Close) FIN-WAIT-1 -->  <100>   <300>   <FIN,ACK>
3.  FIN-WAIT-2         <--  <300>   <101>   <ACK>        <-- CLOSE-WAIT
4.                     <--  <300>   <101>   <FIN,ACK>    <-- (Close) LAST-ACK
5.                     -->  <101>   <301>   <ACK>        --> CLOSED

Acknowledgements

• Receiver acks only the last in-order packet received
• Out-of-order packets trigger repeated (duplicate) acks, which act as nacks; TCP has no explicit NACK message
• Sender resends the first unacknowledged packet
• Timeout typically set to 1.5 × the round-trip time
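The acknowledgement rule can be sketched in a few lines. Packets are numbered from `first_seq` here for simplicity (real TCP numbers bytes, not packets), and the 1.5 factor is the one quoted on the slide.

```python
def last_in_order(received, first_seq=0):
    # Highest sequence number such that everything up to it has arrived,
    # or None if even the first packet is still missing.
    seq = first_seq
    while seq in received:
        seq += 1
    return seq - 1 if seq > first_seq else None

def retransmit_timeout(rtt_seconds, factor=1.5):
    # Slide's rule of thumb: wait 1.5 round-trip times before resending.
    return factor * rtt_seconds
```

If packets 0, 1, 2, 4 and 5 have arrived, the receiver keeps acking 2, and the sender's first unacknowledged packet, 3, is the one to resend.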

Sliding Window

[Figure: sender window and receiver window, both initially empty]

The sender window has k segments (buffers)

Sliding Window

[Figure: “Send message m[i]”. The sender window holds m[i]; the receiver window is empty until m[i] arrives]

Sliding Window

[Figure: sender window holds m[i], m[i+1], …, m[i+k]; receiver window holds m[i], m[i+1]; an ack travels back to the sender]

Sliding Window

[Figure: sender window has slid to m[i+2], m[i+3], …, m[i+k+1]; receiver window holds m[i+2], m[i+3]; m[i] and m[i+1] have been acked]
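The sender's side of the sliding window can be sketched as a small class that tracks which messages may be in flight. Messages are identified by index, acks are assumed to be cumulative, and the class and method names are hypothetical.

```python
class SenderWindow:
    def __init__(self, k):
        self.k = k           # at most k unacknowledged messages in flight
        self.base = 0        # oldest unacknowledged message
        self.next_seq = 0    # next message that has never been sent

    def sendable(self):
        # Messages that may be sent now without exceeding the window.
        fresh = list(range(self.next_seq, self.base + self.k))
        self.next_seq = max(self.next_seq, self.base + self.k)
        return fresh

    def ack(self, seq):
        # Cumulative ack: everything up to and including seq is acked,
        # so the window slides forward.
        self.base = max(self.base, seq + 1)
```

With `k = 4`, the sender first emits messages 0 to 3 and then stalls; once message 1 is acked the window slides by two and messages 4 and 5 become sendable.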

TCP Congestion Control

• Dynamically adjust window size
• Sender should not swamp the receiver: both sides advertise maximum window size
• Linear increase -- When packets are getting through, increment the window size by 1.
• When a packet is dropped, halve the window size, and double the retransmission timeouts -- exponential backoff.
• Also called TCP fairness/friendliness

TCP Slow start

• Might take some time to get to the maximum possible window size

Optimization:
• Exponential increase to start with.
• Then follow the linear increase / exponential backoff scheme once the first packet is lost
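The window dynamics described on the last two slides can be traced with a short simulation. This follows the slides' simplified rules only (real TCP implementations add a slow-start threshold and other refinements); the event list and `max_window` are assumptions for illustration.

```python
def window_trace(events, max_window=64):
    # events: sequence of "ok" (a round trip with no loss) or "loss".
    window = 1
    slow_start = True
    trace = [window]
    for event in events:
        if event == "loss":
            window = max(1, window // 2)           # halve on a drop
            slow_start = False                     # from now on, grow linearly
        elif slow_start:
            window = min(window * 2, max_window)   # exponential increase
        else:
            window = min(window + 1, max_window)   # linear increase
        trace.append(window)
    return trace

print(window_trace(["ok"] * 4 + ["loss"] + ["ok"] * 3))
# prints [1, 2, 4, 8, 16, 8, 9, 10, 11]
```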

RED

• Random Early Detection
• Idea is very simple
• Router senses that load is increasing
• It simply notices that it has less available memory for buffering
• This is because packets are entering faster than they can be forwarded

RED …

• Picks a packet at random and discards it
• Even though perhaps it could be forwarded
• Receiver detects the loss and signals it (in TCP, duplicate acks play the role of a NACK)
• The network isn’t completely overloaded yet, so the NACK gets through
• Sender chokes back
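One way to sketch the router's decision is a drop probability that rises as the buffer fills. The thresholds `min_th` and `max_th` and the ceiling `max_p` are assumed parameters that do not come from the slides, and deployed RED works on an averaged queue length rather than the instantaneous one used here.

```python
import random

def red_should_drop(queue_len, min_th=5, max_th=15, max_p=0.1, rng=random):
    # Short queue: never drop. Queue at or past max_th: always drop.
    if queue_len < min_th:
        return False
    if queue_len >= max_th:
        return True
    # In between, drop with a probability that grows linearly with load.
    drop_p = max_p * (queue_len - min_th) / (max_th - min_th)
    return rng.random() < drop_p
```

The early, occasional drops are the point: senders are told to slow down while the router can still forward most traffic.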

Sunny Presents

• IP Multicast …
• Any questions before we move on?
• Note: Slides were stolen from CS514 FA2000 Web site

Unicast to multiple hosts

Multicast to multiple hosts

[Figure: the sender transmits “to group”]

Why do multicast?

• Send to a group, not to individual hosts
  – Reduces overhead in sender
  – Reduces bandwidth consumption in network
  – Reduces latency seen by receivers (all receive “at the same time”, in theory)

Logical addressing

• Multicast groups “handled by network”

• Senders, receivers do not need to know each other’s identities

• Group persists as long as it has at least one member

• a “rendezvous” mechanism

Applications

• Teleconferencing
• Distance learning
• Multimedia streaming
• Directory service lookup
• ...

Multicasting for resource location

• Expanding-ring search
• We want to find an instance of a resource (database, etc) which is close by
• Use multicast with IP time-to-live (TTL) values

Time-to-live and hop counts

• TTL is a counter in the packet header
  – Decrement at each “hop” through a router
  – When TTL reaches zero, the packet is dropped
  – special values for “global” and “regional” TTL (use with care!)

Expanding-ring search

“Find me a database”, TTL=1

Expanding-ring search

“Find me a database”, TTL=2

“I’m a database, what can I do for you?”
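Expanding-ring search can be sketched on a toy topology: retry the query with a growing TTL until some resource answers. The graph model is a simplification in which every edge costs one unit of TTL, and the node names in the usage example are hypothetical.

```python
from collections import deque

def reachable(graph, source, ttl):
    # Nodes a packet can reach from source within `ttl` hops.
    seen = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if seen[node] == ttl:
            continue                    # TTL exhausted: packet is dropped
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    return set(seen) - {source}

def expanding_ring_search(graph, source, resources, max_ttl=8):
    for ttl in range(1, max_ttl + 1):   # widen the ring one hop at a time
        found = reachable(graph, source, ttl) & resources
        if found:
            return ttl, found
    return None

GRAPH = {"me": ["r1"], "r1": ["me", "r2", "h1"], "r2": ["r1", "db"],
         "h1": ["r1"], "db": ["r2"]}
```

Because the ring grows one hop at a time, the first answer comes from the closest instance, here found at TTL 3.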

Multicast addresses

• Class D IP addresses for group
  – 224.0.0.0 to 239.255.255.255
• Used as a destination like any other IP address: can send to it or listen on it (it is never a source address)
• In practice, use UDP as well (more on this later)

Multicast at the LAN level

• Ethernet is a broadcast medium: all network cards see all packets

• Register the multicast address in the network card
  – only pass matching packets to OS
  – all other packets are ignored
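What gets registered in the network card is an Ethernet address derived from the group's IP address. The slides do not spell out the mapping; the sketch below uses the standard IPv4 rule of copying the low 23 bits of the group address into the fixed prefix 01:00:5e. The function name is hypothetical.

```python
import ipaddress

def multicast_mac(group):
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:          # 224.0.0.0 through 239.255.255.255
        raise ValueError(f"{group} is not a multicast address")
    low23 = int(addr) & 0x7FFFFF       # only the low 23 bits fit in the MAC
    mac = (0x01005E << 24) | low23
    return ":".join(f"{b:02x}" for b in mac.to_bytes(6, "big"))
```

Since only 23 of the 28 group bits survive, several groups share one Ethernet address, so the OS still has to filter on the full IP address.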

Multicast beyond the LAN

• We would like to multicast between hosts on different LANs
  – LANs are joined together directly by bridges
  – or can be connected through the Internet by a sequence of routers
  – need an inter-LAN (WAN) protocol
• (in fact, this is rarely enabled!)

A naive approach

• We want to send multicasts everywhere where there are group members
  – use flooding to send multicast between routers
  – when we get to a LAN, use regular (Ethernet) multicast

Multicast by flooding

[Figure, shown over two slides. Legend: non-member, group member, router]

Why simple flooding doesn’t work

[Figure, shown over two slides. Legend: non-member, group member, router. Some transmissions are marked “wasted!”]

Multicast flooding

• Not a scalable mechanism
  – every LAN sees every multicast
  – every WAN router sees every multicast: wastes bandwidth, CPU
• Requires a two-part solution
  – determining LAN group members
  – omitting WAN routers from multicast

Multicast trees

• Shortest-path tree to all multicast members, rooted at sender

• But must be computed independently by each router

• And must be dynamically adjusted for joins and leaves
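A shortest-path multicast tree can be sketched centrally with a breadth-first search from the sender, keeping only links that lead to a group member. This is an illustration only: as noted above, real routers each compute their part independently. The topology and node names are hypothetical.

```python
from collections import deque

def multicast_tree(graph, sender, members):
    # Breadth-first search gives shortest paths (in hops) from the sender.
    parent = {sender: None}
    queue = deque([sender])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    # Keep only the links that lie on a path to some group member.
    links = set()
    for member in members:
        node = member
        while node in parent and parent[node] is not None:
            links.add((parent[node], node))
            node = parent[node]
    return links

GRAPH = {"S": ["R1"], "R1": ["S", "R2", "R3"], "R2": ["R1", "M1"],
         "R3": ["R1", "N1"], "M1": ["R2"], "N1": ["R3"]}
```

For the single member M1 the tree uses three links, whereas flooding would push a copy over all five, including the branch through R3 that serves only a non-member.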

A multicast tree

A multicast tree

IGMP

• Internet Group Management Protocol (Deering and Cheriton)
• Developed from work in V distributed operating system
  – introduced notion of process groups (Cheriton and Zwaenepoel)
  – groups for services, e.g. name resolution, remote paging

IGMP

• Detects if a multicast group has any members within a LAN

• Query and report messages
  – router sends query of group membership periodically
  – hosts report groups they’re in

IGMP

[Figure: LAN attached to the Internet; query “Who is a member?”]

IGMP

[Figure: LAN attached to the Internet; replies “I am”, “I am”, “I am”]

Avoiding overloading

• Report packets may overload router
  – upon getting a query, each group member sets a timer
  – if it sees a report for its group before the timer expires, it suppresses its report
  – otherwise reports on expiration
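Report suppression can be sketched as a small simulation: every member draws a random timer, and only members whose timer fires before they could have heard an earlier report actually send one. `max_delay` and `propagation` (the time a report takes to be seen by the others) are assumed parameters.

```python
import random

def reports_sent(members, max_delay=10.0, propagation=0.5, seed=0):
    # Each member draws a random timer after hearing the query.
    rng = random.Random(seed)
    timers = {m: rng.uniform(0, max_delay) for m in members}
    first = min(timers.values())
    # The earliest timer always reports. Others report only if their
    # timer fires before the first report could have reached them.
    return sorted(m for m, t in timers.items()
                  if t == first or t < first + propagation)
```

With instantaneous propagation exactly one member reports per group, no matter how many members there are, which is what keeps the router from being overloaded.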

THE END!

• Any questions?
• Slides will be put up on the web
• If interested, check out the sources for more information