CSCI 243
The Mechanics of Programming

Timothy Fossum (tvf@cs.rit.edu)

TVF / RIT 20195

More About I/O in C


Input and Output

• UNIX® and Linux® provide fundamental i/o services
    • Open/close, read, write
    • Accessed via system calls
• Standard i/o library built on top of system calls
    • More abstract view of i/o
    • Accessed via library function calls

UNIX® is a registered trademark of the Open Group Ltd.
Linux® is the registered trademark of Linus Torvalds in the United States and other countries.


UNIX/Linux System-Level I/O

• Processes do file operations using file descriptors (FDs)
• Non-negative integers
    • Originally, 0..19
    • Modern systems typically allow 0..255, some 0..4095
    • The limit is per-process and adjustable (e.g., ulimit -n); current Linux defaults are often 0..1023
• Several ways to get them
    • Open a file
    • Inherit when process is created
    • Create a pipe or socket connection
    • More about this later


File Descriptors

• Three FDs are usually pre-opened
    • 0: standard input
    • 1: standard output
    • 2: standard error output
• Defined in <unistd.h>:
    • STDIN_FILENO, STDOUT_FILENO, STDERR_FILENO
• Generally, inherited from parent process
    • More on this later
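The relationship between descriptors and stdio streams can be sketched in a few lines. This is a minimal illustration, assuming a POSIX system; the helper names fd_puts and stream_fd are hypothetical and chosen only for this example.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes fileno() in <stdio.h> */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write a string straight to a file descriptor with the write()
 * system call, bypassing stdio. Returns the number of bytes written,
 * or -1 on error. */
ssize_t fd_puts( int fd, const char *s ) {
    return write( fd, s, strlen(s) );
}

/* Report which descriptor sits underneath a stdio stream. */
int stream_fd( FILE *stream ) {
    return fileno( stream );
}
```

For instance, fd_puts( STDERR_FILENO, "oops\n" ) writes to the same place as stderr, and stream_fd( stdin ) yields STDIN_FILENO.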


Standard I/O Library

• Part of the C library
• Built on top of OS i/o routines
• Character-oriented i/o
    • Vs. byte-oriented at OS level
• Buffered within user space
    • In addition to OS i/o buffering at system level
    • Logical vs. physical i/o
• To use, include <stdio.h>


Standard I/O Library

• I/O connections called streams
• Two types of streams: binary and text
• Binary streams are byte-oriented
    • Raw bytes of data
    • No interpretation done by library
    • Files are unstructured
• Text streams are character-oriented
    • Each byte represents a character
    • Files are structured
        • Zero or more lines
        • Lines contain zero or more characters
        • Lines end with EOLN character sequence


Text Streams

• Character-oriented
• Text files have structure
    • Zero or more lines
    • Lines contain zero or more characters
• Each line terminated with EOLN sequence
    • UNIX/Linux: LF ('\n')
    • Windows: CR ('\r') and LF
    • Older MacOS: CR
• C standard says:
    • Each newline sequence translated to unique single char value
    • On text input, system's native newline sequence translated to '\n'
    • On text output, '\n' translated to system's native newline sequence

http://en.wikipedia.org/wiki/Newline#Representations
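On UNIX/Linux the native sequence is already '\n', so the library leaves the '\r' in place when it reads a file that was written on Windows. A minimal sketch of a helper that strips the end-of-line sequence by hand, assuming the line came from fgets() or a similar call; the name chomp is an illustrative choice, not part of the C library.

```c
#include <string.h>

/* Remove one trailing end-of-line sequence ("\n", "\r\n", or a lone
 * "\r") from s in place. Returns the new length of s. */
size_t chomp( char *s ) {
    size_t len = strlen( s );
    if( len > 0 && s[len-1] == '\n' ) {
        s[--len] = '\0';
    }
    if( len > 0 && s[len-1] == '\r' ) {
        s[--len] = '\0';
    }
    return len;
}
```

Calling chomp() on a line that has no terminator (for example, the last line of a file) leaves it unchanged.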


Basic Stdio Usage

• All streams associated with a FILE
    • Defined in <stdio.h>
    • Contains information about this stream
    • Flags, file descriptor, buffer information, etc.
• Specify stream to use via a FILE *
    • In traditional UNIX implementations, a pointer into a global array: FILE __iob[NFILE]
    • Must be pointer so that contents can be modified by i/o routines
• Three predefined names in <stdio.h>:
    • Macros in UNIX and POSIX®, actual global vars in Linux

    #define stdin   (&__iob[0])
    #define stdout  (&__iob[1])
    #define stderr  (&__iob[2])

IEEE® and POSIX® are registered trademarks of the IEEE in the United States.


Opening and Closing Streams

• To use:

    FILE *fopen( const char *path, const char *mode );

    • path - file to be opened
    • mode - how to open it
• Actions:
    • Opens file via system call
    • Allocates & initializes FILE structure
• Note: mode is a string
    • First character is access type: r, w, a
    • Following characters are modifiers: b, +
• Return value is a FILE *, or a null pointer on error
    • On error, global errno contains error code


Opening and Closing Streams

• When done:

    int fclose( FILE *stream );

    • Closes currently-open stream
    • Flushes any buffered output, calls close(), deallocates FILE structure
    • Returns 0 on success, EOF on error


Checking for Errors

• After a failed system call, errno contains an error code
    • errno is not reset on success, so check the call's return value first
• Can print interpretation of errno contents

    void perror( const char *message );

• Example:

    FILE *in;

    if( (in = fopen("myfile","r")) == NULL ) {
        perror( "file 'myfile'" );
        exit( EXIT_FAILURE );
    }

• Output:

    file 'myfile': No such file or directory


Checking for Errors

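• Related global variables (errno is declared in <errno.h>):

    const char *sys_errlist[];
    int sys_nerr;
    int errno;

    • sys_errlist and sys_nerr are deprecated, and recent versions of the GNU C library no longer declare them; prefer strerror()
• Related functions:

    char *strerror( int errnum );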
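When the message has to go somewhere other than standard error, strerror() gives the same text that perror() would print. A minimal sketch, assuming the caller supplies the message buffer; the function name open_for_reading is hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Try to open path for reading. On failure, build "path: reason" in
 * msg (capacity msgsize) using strerror() and return NULL. */
FILE *open_for_reading( const char *path, char *msg, size_t msgsize ) {
    FILE *fp = fopen( path, "r" );
    if( fp == NULL ) {
        int saved = errno;   /* save it before another call can change it */
        snprintf( msg, msgsize, "%s: %s", path, strerror(saved) );
    }
    return fp;
}
```

Saving errno in a local variable first is a common precaution, because later library calls are allowed to overwrite it.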


Binary Streams

• Raw bytes of data; no interpretation done by library
• Stream seen as a sequence of "items"

    size_t fread(  void *buffer, size_t length,
                   size_t count, FILE *stream );

    size_t fwrite( const void *buffer, size_t length,
                   size_t count, FILE *stream );

    • buffer - array of items in memory
    • length - size of each item (not size of array!)
    • count - number of items to read/write
    • stream - i/o stream to use
• Return value is number of items transferred
    • On error or end of file, returns fewer than count items (possibly 0)
    • Use ferror() and feof() to tell which; on error, errno contains the error code
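A small round trip shows how the item size and item count arguments fit together. This is a sketch under the assumption that fp is a binary stream opened for update (for example "w+b", or one returned by tmpfile()); the name round_trip is illustrative only.

```c
#include <stdio.h>

/* Write n ints to a binary update stream, rewind, and read them back
 * into out. Returns the number of items read back, or 0 if the write
 * fell short. */
size_t round_trip( FILE *fp, const int *in, int *out, size_t n ) {
    if( fwrite( in, sizeof(int), n, fp ) != n ) {
        return 0;
    }
    /* A positioning call is required between writing and reading
     * on an update stream. */
    rewind( fp );
    return fread( out, sizeof(int), n, fp );
}
```

Note that both calls pass sizeof(int) as the item size and n as the count, so the return values are counts of ints rather than bytes.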


Example: employees (1/4)

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    typedef
        struct person_st {
            char name[40];   // "John Q. Public"
            char addr[24];   // "123 Elm Street"
            int citycode;    // code for city
            char statecode;  // code for state (1..50)
            char mstatus;    // code for marital status
            short empid;     // employee ID (1..30000)
        } employee_t;

    employee_t people[40];


Example: employees (2/4)

    int main( int argc, char *argv[] ) {
        FILE *infp, *otfp;
        int ch;
        int data[40];

        if( argc != 3 ) {
            fprintf( stderr, "usage:  %s infile otfile\n",
                     argv[0] );
            exit( EXIT_FAILURE );
        }

        if( (infp = fopen(argv[1],"rb")) == NULL ) {
            perror( argv[1] );
            exit( EXIT_FAILURE );
        }


Example: employees (3/4)

        if( (otfp = fopen(argv[2],"wb")) == NULL ) {
            perror( argv[2] );
            fclose( infp );
            exit( EXIT_FAILURE );
        }

        size_t num;

        num = fread( people, sizeof(employee_t), 40, infp );
        if( num != 40 ) {
            if( ferror(infp) )      // a short count may also mean end of file
                perror( argv[1] );
            fprintf( stderr, "error - expected 40, got %zu\n",
                     num );
            fclose( infp );
            fclose( otfp );
            exit( EXIT_FAILURE );
        }


Example: employees (4/4)

        ...
        process( people );

        num = fwrite( people, sizeof(employee_t), 40, otfp );
        if( num != 40 ) {
            if( ferror(otfp) )
                perror( argv[2] );
            fprintf( stderr, "error - expected 40, got %zu\n",
                     num );
            fclose( infp );
            fclose( otfp );
            exit( EXIT_FAILURE );
        }

        ...
        fclose( infp );
        fclose( otfp );
        return( 0 );

    }


Review: Stdio Text I/O Functions

• Character i/o:
    • getchar(), getc(), fgetc()
    • putchar(), putc(), fputc()
• Line i/o:
    • gets(), fgets()
        • gets() cannot limit input length and was removed in C11; use fgets()
    • puts(), fputs()
• Formatted i/o:
    • printf(), fprintf(), sprintf()
    • scanf(), fscanf(), sscanf()


Line I/O

• Problem with fgets():
    • Reads until newline OR buffer is almost full
    • If line is longer than buffer, only get partial line
• Solution (originally a GNU C library extension, standardized in POSIX.1-2008):

    ssize_t getline( char **lineptr,
                     size_t *n, FILE *stream );

    • lineptr - address of a char* variable
    • n - address of a size_t variable
    • stream - input stream to read from
• Reads an entire input line, regardless of its length
    • Like fgets(), retains newline character, NUL-terminates buffer
    • Returns character count, or -1 on error or EOF


Behavior

• If *lineptr is not NULL:
    • Uses *lineptr as input buffer and *n as its size
    • Assumes buffer was allocated with malloc()
• If *lineptr is NULL:
    • Allocates buffer via malloc()
    • Stores new pointer into *lineptr and size into *n
• Reads into *lineptr buffer until entire line is read
• If buffer isn't big enough
    • Resizes buffer with realloc(), which may move it
    • Text read in so far is preserved; the old buffer must not be used afterwards
    • CONTINUES READING, using new buffer
• At return:
    • Stores final buffer pointer into *lineptr and buffer size into *n
    • Returns count of characters read
    • Caller must free() the buffer when done with it
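The buffer handling described above can be seen in a short loop that reuses one buffer for every line. This is a sketch, assuming a POSIX.1-2008 system; the function name longest_line is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes getline() in <stdio.h> */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Return the length (excluding the newline) of the longest line in fp,
 * or 0 if the stream has no lines. */
size_t longest_line( FILE *fp ) {
    char *buffer = NULL;    /* NULL asks getline() to allocate for us */
    size_t size = 0;        /* current capacity of buffer */
    size_t longest = 0;
    ssize_t len;

    while( (len = getline( &buffer, &size, fp )) != -1 ) {
        if( len > 0 && buffer[len-1] == '\n' ) {
            len--;          /* do not count the retained newline */
        }
        if( (size_t)len > longest ) {
            longest = (size_t)len;
        }
    }
    free( buffer );         /* the same buffer was reused and grown throughout */
    return longest;
}
```

Because getline() grows the buffer as needed, the loop never sees a partial line, and a single free() at the end releases whatever size the buffer reached.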


Example: mycp3 (1/3)

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main( int argc, char *argv[] ) {
        FILE *infp, *otfp;
        char *buffer;
        size_t len;

        buffer = NULL;
        len = 0;

        if( argc < 3 ) {
            fprintf( stderr, "usage:  %s infile otfile\n",
                     argv[0] );
            exit( EXIT_FAILURE );
        }


Example: mycp3 (2/3)

        if( (infp = fopen(argv[1],"r")) == NULL ) {
            perror( argv[1] );
            exit( EXIT_FAILURE );
        }

        if( (otfp = fopen(argv[2],"w")) == NULL ) {
            perror( argv[2] );
            fclose( infp );
            exit( EXIT_FAILURE );
        }

        while( getline(&buffer,&len,infp) != -1 ) {
            fputs( buffer, otfp );
        }

        free( buffer );

        fclose( infp );
        fclose( otfp );

        return( 0 );
    }


Example: mycp3 (3/3)

    $ ls -F
    datafile        mycp3.c

    $ gcc -Wall -std=c99 -o mycp3 mycp3.c

    $ cat datafile
    This is a simple input file.
    Isn't it pretty?

    $ ./mycp3 datafile moredata

    $ ls -F
    datafile        moredata        mycp3*         mycp3.c

    $ cat moredata
    This is a simple input file.
    Isn't it pretty?


Review: Formatted Output

• printf() family of routines

    int printf( const char *fmt, ... );
    int fprintf( FILE *stream, const char *fmt, ... );
    int sprintf( char *buf, const char *fmt, ... );

• All traverse format string:
    • Ordinary characters are printed literally
    • Format control sequences print next argument according to code
• Return: number of bytes transmitted, or a negative value on error
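sprintf() has no way to know how large buf is. C99 also provides snprintf(), which takes the buffer size and returns the number of characters the full result would need. A minimal sketch of using that return value to detect truncation; the function name format_label and the "emp-" label format are assumptions made for this example.

```c
#include <stdio.h>

/* Format an employee label such as "emp-0042" into buf (capacity size).
 * Returns 1 if the whole label fit, 0 if it was truncated or an error
 * occurred. */
int format_label( char *buf, size_t size, int empid ) {
    int needed = snprintf( buf, size, "emp-%04d", empid );
    return needed >= 0 && (size_t)needed < size;
}
```

With a 16-byte buffer, format_label( buf, 16, 42 ) stores "emp-0042" and returns 1; with a 5-byte buffer it stores only "emp-" and returns 0.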


Example: Output Formatting (1/2)

    $ cat print.c
    #include <stdio.h>

    int main( void ) {
        int n;
        float f;
        const char *msg = "This is only another message";

        n = 42;
        f = 12.0303125;

        printf( "'%d' / '%4d' / '%04d' / '%-4d' / '%+4d'\n",
                n, n, n, n, n );

        printf( "'%#o' / '%#x'\n", n, n );

        printf( "%f / %8f / %8.2f\n", f, f, f );

        printf( "'%s' / '%11.11s'\n", msg, msg );

        return( 0 );
    }


Example: Output Formatting (2/2)

    $ gcc -Wall -std=c99 -o print print.c

    $ ./print
    '42' / '  42' / '0042' / '42  ' / ' +42'
    '052' / '0x2a'
    12.030313 / 12.030313 /    12.03
    'This is only another message' / 'This is onl'


Formatted Input

• scanf() family of routines

    int scanf( const char *fmt, ... );
    int fscanf( FILE *stream, const char *fmt, ... );
    int sscanf( const char *buf, const char *fmt, ... );

• All traverse the format string:
    • Ordinary characters must match input exactly
    • Whitespace in the format matches any amount of input whitespace
    • Format control sequences cause input conversion
• Return: number of conversions successfully performed
    • EOF if input ends before the first conversion
• Arguments are modified
    • Thus, must pass pointers
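Checking the return value against the number of conversions requested is the usual way to validate input. A minimal sketch using sscanf() on a string of the form "a,b"; the function name parse_pair and the input format are assumptions made for this example.

```c
#include <stdio.h>

/* Parse two comma-separated integers from text into *a and *b.
 * Returns 1 only if both conversions succeeded, 0 otherwise. */
int parse_pair( const char *text, int *a, int *b ) {
    /* The ',' in the format must match the input exactly; each %d
     * skips leading whitespace on its own. */
    return sscanf( text, "%d,%d", a, b ) == 2;
}
```

Input such as "3;4" stops at the first mismatch, so sscanf() reports one conversion and parse_pair() returns 0.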


Example: Tokenizing Text Files (1/2)

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define BUFFER_SIZE 1024

    int main( int argc, char *argv[] ) {
        FILE *infp;
        char buffer[BUFFER_SIZE];

        if( argc < 2 ) {
            infp = stdin;
        } else if( (infp = fopen(argv[1],"r")) == NULL ) {
            perror( argv[1] );
            exit( EXIT_FAILURE );
        }

        // field width keeps a long token from overflowing buffer
        // (BUFFER_SIZE - 1, leaving room for the NUL)
        while( fscanf(infp,"%1023s",buffer) == 1 ) {
            printf( "Next token: '%s'\n", buffer );
        }

        fclose( infp );

        return( 0 );
    }


Example: Tokenizing Text Files (2/2)

    $ gcc -Wall -std=c99 -o scan scan.c

    $ cat datafile
    This      is     a simple    input file.

    Isn't it pretty?

    I think so.

    $ ./scan datafile
    Next token: 'This'
    Next token: 'is'
    Next token: 'a'
    Next token: 'simple'
    Next token: 'input'
    Next token: 'file.'
    Next token: 'Isn't'
    Next token: 'it'
    Next token: 'pretty?'
    Next token: 'I'
    Next token: 'think'
    Next token: 'so.'


Other Useful Functions

    int fflush( FILE *stream );

• Forces a system call to write all data buffered for stream
• If stream is NULL, flushes all open output streams
• Returns 0 on success, EOF on error

    int fseek( FILE *stream, long offset, int whence );

• Moves i/o offset according to offset and whence
    • whence is SEEK_SET (from beginning), SEEK_CUR (from current position), or SEEK_END (from end)
• Can use to move to beginning, end, or arbitrary position in file
• Returns 0 on success, -1 on error

    long ftell( FILE *stream );

• Returns current i/o offset for stream, or -1L on error
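Together, fseek() and ftell() give a common way to find the size of a file that is already open. A minimal sketch, assuming a binary stream on a regular file; the function name stream_size is hypothetical.

```c
#include <stdio.h>

/* Return the size in bytes of an open binary stream, or -1L on error.
 * The stream's original position is restored before returning. */
long stream_size( FILE *fp ) {
    long here = ftell( fp );                 /* remember where we were */
    if( here == -1L || fseek( fp, 0L, SEEK_END ) != 0 ) {
        return -1L;
    }
    long size = ftell( fp );                 /* offset of the end = size */
    if( fseek( fp, here, SEEK_SET ) != 0 ) { /* go back */
        return -1L;
    }
    return size;
}
```

The result is limited to what fits in a long, so this approach is a poor fit for very large files on systems where long is 32 bits.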


Buffering

• OS buffers all i/o
    • Efficiency: reduce amount of physical i/o
    • Buffer size depends on i/o device
        • Terminal i/o: one line
        • File i/o: one block
• Standard i/o routines also buffer input and output
• Why another level of buffering?
    • Reduce number of system calls
• First stdio read causes system call to read into buffer
    • Subsequent stdio reads pull data from buffer


Buffering

• Writes just copy data into output buffer
    • When full, an output system call is done
• Can cause problems:
    • E.g., debugging output to file isn't flushed until the buffer fills up
    • If program aborts, buffer may not actually be physically written out!
• Solution: fflush( stream )
    • Forces immediate system call
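One way to apply this to debugging output is to flush after every line. This is a sketch, assuming a POSIX system (for fileno() and fstat()); the names debug_line and os_visible_size are hypothetical helpers written for this example.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes fileno() in <stdio.h> */
#include <stdio.h>
#include <sys/stat.h>

/* Write one line of debugging output to fp and flush it immediately,
 * so the text reaches the OS even if the program aborts afterwards.
 * Returns 0 on success, EOF on error. */
int debug_line( FILE *fp, const char *msg ) {
    if( fputs( msg, fp ) == EOF || fputc( '\n', fp ) == EOF ) {
        return EOF;
    }
    return fflush( fp );
}

/* Size of the file behind fp as the OS currently sees it,
 * or -1L on error. */
long os_visible_size( FILE *fp ) {
    struct stat st;
    if( fstat( fileno(fp), &st ) != 0 ) {
        return -1L;
    }
    return (long)st.st_size;
}
```

After debug_line( fp, "checkpoint 1" ) on a new file, os_visible_size( fp ) reports 13 bytes: the 12 characters of the message plus the newline. The cost is one system call per line, which is the trade-off buffering normally avoids.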