raw sockets (+ other) unpv1 - stevens, fenner, rudoff linux socket programming - walton
TRANSCRIPT
cs423 - cotter 2
Other –
• Readv ( ) and writev ( )– Read or write data into multiple buffers– Connection-oriented.
• Recvmsg ( ) and sendmsg ( )– Most general form of send and receive.
Supports multiple buffers and flags. – Connectionless
cs423 - cotter 3
readv( ) and writev( )• ssize_t writev ( int filedes, const struct iovec *iov, int iovcnt);• ssize_t readv ( int filedes, const struct iovec *iov, int iovcnt);
– filedes – socket identifier– iov – pointer to an array of iovec structures– iovcnt – number of iovec structures in the array (16 < iovcnt: Linux 1024)– Return value is # of bytes transferred or -1 or error
struct iovec {
void * iov_base; // starting address of buffer
size_t iov_len; // size of buffer
};
cs423 - cotter 4
writev( ) example:• Write a packet that contains multiple separate data elements
– 2 character opcode– 2 character block count– 512 character data block
char opcode[2] = “03”;char blkcnt[2] = “17”;char data[512] = “This RFC specifies a standard …”;struct iovec iov[3];iov[0].iov_base = opcode;iov[0].iov_len = 2;iov[1].iov_base = blkcnt;iov[1].iov_len = 2;iov[2].iov_base = data;iov[3].iov_len = 512;writev (sock, &iov, 3);
cs423 - cotter 5
writev( ) example:
iov iov[0].iov_base
iov[0].iov_len
iov[1].iov_base
iov[1].iov_len
iov[2].iov_base
iov[2].iov_len
2
2
512
03
17
This RFC specifies a standard ..
writev (sock, &iov, 3);
cs423 - cotter 6
recvmsg ( ) and sendmsg ( )
• ssize_t recvmsg (int sock, struct msghdr *msg, int flags);
• ssize_t sendvmsg (int sock, struct msghdr *msg, int flags);– Sock – Socket identifier– Msg – struct msghdr that includes message to be
sent as well as address and msg_flags– Flags – sockets level flags– Return value is # of bytes transferred or -1 or error
cs423 - cotter 7
Sockets level flags
Flag Description RECV
SEND
MSG_DONTROUTEMSG_DONTWAITMSG_PEEKMSG_WAITALLMSG_OOB
Bypass routing table lookupOnly this operation is
nonblockingPeek at incoming message
Wait for all the dataSend or receive out-of-band
data
****
**
*
cs423 - cotter 8
Struct msghdr
struct msghdr {
void *msg_name; // protocol address
socklen_t msg_namelen // size of protocol address
struct iovec *msg_iov // scatter / gather array
int msg_iovlen //# of elements in msg_iov
void *msg_cntrl // cmsghdr struct
socklen_t msg_cntrllen // length of msg_cntrl
int msg_flags // flags returned by recvmsg
};
cs423 - cotter 9
Msg_flags
Flag Returned byrecvmsg msg_flags
MSG_EORMSG OOB
**
MSG_BCASTMSG_MCASTMSG_TRUNCMSG_CTRUNCMSG_NOTIFICATION
*****
cs423 - cotter 10
recvmsg example (ICMP) char recvbuf[BUFSIZE]; char controlbuf[BUFSIZE]; struct msghdr msg; struct iovec iov; sockfd = socket (PF_INET, SOCK_RAW, pr->icmpproto); iov.iov_base = recvbuf; iov.iov_len = sizeof(recvbuf); msg.msg_name = sin; //sockaddr struct, for sender’s IP & port msg.msg_iov = &iov; msg.msg_iovlen = 1; msg.msg_control = controlbuf; for ( ; ; ) { msg.msg_namelen =sizeof(sin); msg.msg_controllen = sizeof(controlbuf); n = recvmsg(sockfd, &msg, 0);::
cs423 - cotter 12
What are Raw Sockets?
1.A way to pass information to network protocols other than TCP or UDP (e.g. ICMP and IGMP)
2.A way to implement new IPv4 protocols
3.A way to build our own packets (be careful here)
cs423 - cotter 13
Why Would We Use Them?
• Allows us to access packets sent over protocols other than TCP / UDP
• Allows us to process IPv4 protocols in user space– Control, speed, troubleshooting
• Allow us to implement new IPv4 protocols• Allows us to control the IP header
– Control option fields (beyond setsockopt() )– Test / control packet fragmentation
cs423 - cotter 14
Limitations?
• Reliability Loss• No Ports• Nonstandard communication• No Automatic ICMP• Raw TCP / UDP unlikely• Requires root / admin
cs423 - cotter 15
OS Involvement in Sockets
User Space Kernel Space
Socket App TCP/IP StackLinux
Socket ( AF_INET, SOCK_STREAM,IPPROTO_TCP)
Socket ( AF_INET, SOCK_RAW,IPPROTO_ICMP)
Socket ( AF_PACKET, SOCK_RAW,htons(ETH_P_IP))
Identify Socket Type
Identify Socket Type
Identify Socket Type
TCP
IP
Ethernet
cs423 - cotter 16
Normal Socket Operation (TCP)
• Create a socket – s = socket (PF_INET, SOCK_STREAM, IPPROTO_TCP)
• Bind to a port (optional)– Identify local IP and port desired and create data structure– bind (s, (struct sockaddr *) &sin, sizeof(sin))
• Establish a connection to server– Identify server IP and port– connect (s, (struct sockaddr *) &sin, sizeof(sin))
• Send / Receive data– Place data to be send into buffer– recv (s, buf, strlen(buf), 0);
cs423 - cotter 17
Normal Socket Operation (TCP)
User Space Kernel Space
Socket App ProtocolLinux
socket ( ) Create socket
TCP, IP, Internetconnect( )
Bind to local port:Connect to remote port
send( ) TCP, IP, InternetPass data thru localstack to remote port
OK
OK
OK
TCP
cs423 - cotter 18
Raw Sockets Operation (ICMP)
• Create a socket – s = socket (PF_INET, SOCK_RAW, IPPROTO_ICMP)
• Since there is no port, there is no bind *• There is no TCP, so no connection *• Send / Receive data
– Place data to be sent into buffer
– sendto (s, buf, strlen(buf), 0, addr, &len);
* More later
cs423 - cotter 19
Raw Sockets Operation (ICMP)
User Space Kernel Space
Socket App ProtocolLinux
socket ( ) Create socket
sendto( ) IP, InternetPass data thru localstack to remote host
OK
OK
ICMP
cs423 - cotter 20
Create a Raw Socket• s = socket (AF_INET, SOCK_RAW, protocol)
– IPPROTO_ICMP, IPPROTO_IP, etc.
• Can create our own IP header if we wish– const int on = 1;– setsockopt (s, IPPROTO_IP, IP_HDRINCL, &on, sizeof (on));
• Can “bind”– Since we have no port, the only effect is to associate a local IP
address with the raw socket. (useful if there are multiple local IP addrs and we want to use only 1).
• Can “connect”– Again, since we have no TCP, we have no connection. The only
effect is to associate a remote IP address with this socket.
cs423 - cotter 21
Raw Socket Output
• Normal output performed using sendto or sendmsg. – Write or send can be used if the socket has been connected
• If IP_HDRINCL not set, starting addr of the data (buf) specifies the first byte following the IP header that the kernel will build.– Size only includes the data above the IP header.
• If IP_HDRINCL is set, the starting addr of the data identifies the first byte of the IP header. – Size includes the IP header– Set IP id field to 0 (tells kernel to set this field)– Kernel will calculate IP checksum
• Kernel can fragment raw packets exceeding outgoing MTU
cs423 - cotter 22
Raw Socket Input
• Received TCP / UDP NEVER passed to a raw socket.• Most ICMP packets are passed to a raw socket
– (Some exceptions for Berkeley-derived implementations)
• All IGMP packets are passed to a raw socket• All IP datagrams with a protocol field that the kernel does
not understand (process) are passed to a raw socket. • If packet has been fragmented, packet is reassembled
before being passed to raw socket
cs423 - cotter 23
Conditions that include / exclude passing to specific raw sockets
• If a nonzero protocol is specified when raw socket is created, datagram protocol must match
• If raw socket is bound to a specific local IP, then destination IP must match
• If raw socket is “connected” to a foreign IP address, then the source IP address must match
cs423 - cotter 24
Ping – Overview
• This example modified from code by Walton (Ch 18)• Very simple program that uses ICMP to send a ping to
another machine over the Internet. • Provides the option to send a defined number of packets
(or will send a default 25).• We will build an ICMP packet (with a proper header,
including checksum) that will be updated each time we send a new packet.
• We will display the raw packet that is received back from our destination host and will interpret some of the data. – (Output format is different from standard ping)
cs423 - cotter 25
ICMP Packet headerstruct icmphdr {
u_int8_t type // ICMP message type (0)
u_int8_t code // ICMP type sub-code (0)
u_int16_t checksumE306, etc.
u_int16_t id // echo datagram id (use pid)
u_int16_t sequence // echo seq # 1, 2, 3, etc.
};
Packet body:
0 1 2 3 4 5 6 7 8 9 : ; < = > ? … B
cs423 - cotter 26
myNuPing.c (overview)• Global Declarations
– Struct packet { }, some variables
• unsigned short checksum (void *b, int len)– Calculate checksum for ICMP packet (header and data)
• void display (void *buf, int bytes)– Format a received packet for display.
• void listener (void)– Separate process to capture responses to pings
• void ping (struct sockaddr_in *addr)– Create socket and send out pings 1/sec to specified IP addr
• int main (int count, shar *strings[ ])– Test for valid instantiation, create addr structure
– Fork a separate process (listener) and use existing process for ping
cs423 - cotter 27
#defines and checksum calc#define PACKETSIZE 64struct packet { struct icmphdr hdr; char msg[PACKETSIZE-sizeof(struct icmphdr)];};
int pid=-1;int loops = 25;struct protoent *proto=NULL;
unsigned short checksum(void *b, int len) { unsigned short *buf = b;
unsigned int sum=0; unsigned short result; for ( sum = 0; len > 1; len -= 2 ) sum += *buf++; if ( len == 1 ) sum += *(unsigned char*)buf; sum = (sum >> 16) + (sum & 0xFFFF); sum += (sum >> 16); result = ~sum; return result;}
cs423 - cotter 28
display - present echo infovoid display(void *buf, int bytes) {
int i; struct iphdr *ip = buf; struct icmphdr *icmp = buf+ip->ihl*4;
printf("----------------\n"); for ( i = 0; i < bytes; i++ ) { if ( !(i & 15) ) printf("\n%04X: ", i); printf("%02X ", ((unsigned char*)buf)[i]); } printf("\n"); printf("IPv%d: hdr-size=%d pkt-size=%d protocol=%d TTL=%d src=%s ", ip->version, ip->ihl*4, ntohs(ip->tot_len), ip->protocol, ip->ttl, inet_ntoa(ip->saddr)); printf("dst=%s\n", inet_ntoa(ip->daddr)); if ( icmp->un.echo.id == pid ) { printf("ICMP: type[%d/%d] checksum[%d] id[%d] seq[%d]\n", icmp->type, icmp->code, ntohs(icmp->checksum), icmp->un.echo.id, icmp->un.echo.sequence); }}
cs423 - cotter 29
Listener - separate process to listen for and collect messages-
void listener(void) { int sd, i; struct sockaddr_in addr; unsigned char buf[1024]; sd = socket(PF_INET, SOCK_RAW, proto->p_proto); if ( sd < 0 ) { perror("socket"); exit(0); } for (i = 0; i < loops; i++) { int bytes, len=sizeof(addr); bzero(buf, sizeof(buf)); bytes = recvfrom(sd, buf, sizeof(buf), 0, (struct sockaddr *) &addr,
&len); if ( bytes > 0 ) display(buf, bytes); else perror("recvfrom"); } exit(0);}
cs423 - cotter 30
ping - Create message and send itvoid ping(struct sockaddr_in *addr){
const int val=255; int i, j, sd, cnt=1; struct packet pckt; struct sockaddr_in r_addr;
sd = socket(PF_INET, SOCK_RAW, proto->p_proto); if ( sd < 0 ) { perror("socket"); return; } if ( setsockopt(sd, SOL_IP, IP_TTL, &val, sizeof(val)) != 0) perror("Set TTL option"); if ( fcntl(sd, F_SETFL, O_NONBLOCK) != 0 ) perror("Request nonblocking I/O");
cs423 - cotter 31
ping (cont) for (j = 0; j < loops; j++) { // send pings 1 per second
int len=sizeof(r_addr); printf("Msg #%d\n", cnt); if ( recvfrom(sd, &pckt, sizeof(pckt), 0, (struct sockaddr *)&r_addr, &len) > 0 ) printf("***Got message!***\n"); bzero(&pckt, sizeof(pckt)); pckt.hdr.type = ICMP_ECHO; pckt.hdr.un.echo.id = pid; for ( i = 0; i < sizeof(pckt.msg)-1; i++ ) pckt.msg[i] = i+'0'; pckt.msg[i] = 0; pckt.hdr.un.echo.sequence = cnt++; pckt.hdr.checksum = checksum(&pckt, sizeof(pckt)); if (sendto(sd, &pckt, sizeof(pckt), 0, (struct sockaddr *) addr, sizeof(*addr)) <= 0) perror("sendto"); sleep(1); }}
cs423 - cotter 32
myNuPing.c – main()int main(int count, char *argv[]) { struct hostent *hname; struct sockaddr_in addr; loops = 0; if ( count != 3 ) { printf("usage: %s <addr> <loops> \n", argv[0]); exit(0); } if (count == 3) // WE HAVE SPECIFIED A MESSAGE COUNT loops = atoi(argv[2]); if ( count > 1 ) { pid = getpid(); proto = getprotobyname("ICMP"); hname = gethostbyname(argv[1]); bzero(&addr, sizeof(addr)); addr.sin_family = hname->h_addrtype; addr.sin_port = 0; addr.sin_addr.s_addr = *(long*)hname->h_addr; if ( fork() == 0 ) listener(); else ping(&addr); wait(0); } else printf("usage: myping <hostname>\n"); return 0;}
cs423 - cotter 33
“Ping” Output[root]# ./myNuPing 134.193.12.34 2Msg #1----------------
0000: 45 00 00 54 CC 38 40 00 80 01 1F BE 86 12 34 56 0010: 86 12 34 57 00 00 E4 06 DF 07 01 00 30 31 32 33 0020: 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F 40 41 42 43 0030: 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F 50 51 52 53 0040: 54 55 56 57 58 59 5A 5B 5C 5D 5E 5F 60 61 62 63 0050: 64 65 66 00 IPv4: hdr-size=20 pkt-size=84 protocol=1 TTL=128 src=134.193.12.35 dst=134.193.12.34ICMP: type[0/0] checksum[58374] id[2015] seq[1]Msg #2***Got message!***----------------
0000: 45 00 00 54 CC 39 40 00 80 01 1F BD 86 12 34 56 0010: 86 12 34 57 00 00 E3 06 DF 07 02 00 30 31 32 33 0020: 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F 40 41 42 43 0030: 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F 50 51 52 53 0040: 54 55 56 57 58 59 5A 5B 5C 5D 5E 5F 60 61 62 63 0050: 64 65 66 00 IPv4: hdr-size=20 pkt-size=84 protocol=1 TTL=128 src=134.193.12.35 dst=134.193.12.34ICMP: type[0/0] checksum[58118] id[2015] seq[2][root]#
cs423 - cotter 34
Summary• Raw Sockets allow access to Protocols other than
the standard TCP and UDP• Performance and capabilities may be OS
dependent.– Some OSs block the ability to send packets that
originate from raw sockets (although reception may be permitted).
• Raw sockets remove the burden of the complex TCP/IP protocol stack, but they also remove the safeguards and support that those protocols provide