stream socket programming

Stream Socket Programming

Idioms and pitfalls

Stream Socket Characteristics

• Transmissions across a stream socket are considered to be a continuous stream of bytes.

• Any other structure must be created by the applications participating in the communication.

• "Message" boundaries are not guaranteed to be preserved.

• Most applications DO want to communicate in terms of a series of separate messages.

Application Protocols

• Rules the processes involved in your application use to communicate.

• Includes:– Allowed message types (formats)– Rules about when each message type can be

sent.

• Programming language structures don't always map exactly onto your message formats

Banking Example

• Report total deposit amount (in pennies), total withdrawal amount (in pennies), number of deposits, number of withdrawals.

• Design decision #1: encoding scheme– Character strings of digits– Binary integer values

Character Encoding

• Advantages– No limit on size of values that can be encoded– No byte ordering issues

• Disadvantages– Inefficient– Easy to get buffer size wrong or waste space– Must be careful about delimiters

Binary numeric encoding

• Advantages– Uses fewer bits to represent a given value– Fields in message are a fixed number of bytes

• Disadvantages– Byte ordering is significant – must use

hton/ntoh– Building structs to represent messages isn't

always straightforward (alignment issues)

Alignment rules

• Compilers lay out structs to maximize alignment.• Fields are allocated in the order they are

declared• A data value is aligned if its address is a multiple

of its size (e.g. 32 bit ints – 4 bytes – at addresses divisible by 4)

• A struct is aligned if its address is a multiple of the size of the largest data type it contains.

• Unnamed padding bytes are added to keep struct members aligned.

Example

struct mixedData {

char Data1;

short Data2;

int Data3;

char Data4;

};

Total data bytes = 1 + 2 + 4 + 1 = 8

Examplestruct MixedData /* after compilation */{ char Data1; char Padding0[1]; /* So following 'short' is

on a 2 byte boundary */ short Data2; int Data3; char Data4; char Padding1[3];};

Total bytes = 1 + 1 + 2 + 4 + 1 + 3 = 12Size is a multiple of sizeof(int)

To avoid padding:

struct mixedData2 { int Data3; short Data2; char Data1; char Data4;};

Data items declared in order of decreasing size. Assuming actual space needed is a multiple of size of largest data item, no padding needed.

sizeof(mixedData2) = 8

Bank Example

struct bankMsg {

int depositAmt;

short depositCnt;

int withdrawAmt;

short withdrawCnt;

};• Total data size is 4 + 2 + 4 + 2 bytes = 12 bytes• On RHEL 5, using gcc, sizeof(bankMsg) = 16. Why?

No padding needed:

struct bankMsg {int depositAmt;

int withdrawAmt;

short depositCnt;

short withdrawCnt;

};

You can also add the padding to your definition so it is explicit. (Recall sockaddr_in)

Parsing received messages

• If the fields are fixed size, we can just send and receive the associated struct:struct bankMsg msg;void *buffer = (void *) &msg;msg.depositAmt = 2324234;msg.withdrawAmt = 2232344;msg.depositCnt = 50;msg.withdrawCnt = 42;

send(s, &msg, sizeof(bankMsg), 0);

Parsing received messages• In the receiving process:

struct bankMsg msg;void *buffer = (void *) &msg;int rbytes, rv;...for (rbytes = 0; rbytes < sizeof(msg);

rbytes += rv){if ((rv = recv(s, buffer + rbytes,

sizeof(msg) – rbytes, 0)) < 0) /* handle error */

}/* Fix byte order! */msg.depositAmt = ntohl(msg.depositAmt);...

Portability/Compatibility

• The C/C++ standards don't specify alignment rules – left up to compiler implementations

• Say you don't pay attention to padding when you define message format structs in your code. What can go wrong?

Delimited char strings

• Sending multiple messages consisting of variable-length delimited character strings is tricky.

• Doing a receive may give you bytes from more than one message.

• Depending on how you have structured your parsing routines, it may be complicated to track which bytes of a receive buffer you have parsed.

Delimited char strings

• One solution: for each delimited string you expect, receive one byte at a time until you receive the delimiter.

• Leaves subsequent strings waiting in the received stream for subsequent calls to recv().

• Downside: slower than multi-byte receives, but if you don't know how many characters to expect, you're kind of stuck.

Working with core dumps

• When a program crashes, it can create a file containing the image of the address space of the process at the time of the crash – a "core dump"

• Debuggers can examine these files and show you precisely where the error occurred. Good for tracking down segmentation faults.

Requirements

• You must compile your program with –g to preserve symbol table information for the debugger

• You need to be allowed to create core files in your account.

• Use the ulimit –c command to check. If return is 0, no core files will be created.

• Change with "ulimit –c unlimited"• Caution: you can only do this once per login

session, you can't switch back and forth.

Using core dumps

• If a core dump is created, you will see "(core dumped)" in the error message

• Linux creates a file called core.n, where n is a unique number

• To examine the core file, use gdb (ddd also works):

gdb executable-name corefile-name

What you see

• gdb reports where the error occurred, e.g.

#0 0x08048370 in a (p=0x0) at test_core.c:1111 int y = *p;

• a is the method name• p is the variable that caused the problem• test_core is the name of the executable, 11 is

the line number• you can use gdb to examine variable values,

etc.

Caveats

• Core files are big, and because of the Linux naming convention, you will create a separate one every time a program crashes.

• Pay attention to creation dates, and make sure you're examining the latest core dump.

• Periodically delete core files. Make core.* one of the things the "clean" target in your makefile cleans up

• Don't set ulimit to "unlimited" unless you need to examine a core dump.

A reasonable tutorial

• http://www.network-theory.co.uk/articles/gccdebug.html

stream socket programming

Documents

total data size

int rbytes

total data bytes

total bytes

fixed size

char data1 char padding01

buffer rbytes

size of values