Advanced UNIX

Upload: keziah

Post on 09-Jan-2016


Page 1: Advanced UNIX

240-491 Adv. UNIX:fp/10 1

Advanced UNIX

Objectives of these slides:
– a more detailed look at file processing in C

240-491 Special Topics in Comp. Eng. 1
Semester 2, 2000-2001

10. File Processing

Overview

1. Background
2. Text Files
3. Error Handling
4. Binary Files
5. Direct Access
6. Temporary Files
7. Renaming & Removing
8. Character Pushback
9. Buffering
10. Redirecting I/O


1. Background

Two types of file: text, binary

Two access methods: sequential, direct (also called random access)

UNIX I/O is line buffered (by default, when connected to a terminal)
– input is processed a line at a time
– output may not be written immediately; it can be held until a newline is output


2. Text Files

Standard I/O    File I/O
printf()        fprintf()
scanf()         fscanf()
gets()          fgets()
puts()          fputs()
getchar()       getc()
putchar()       putc()

most just add an 'f'


Function Prototypes

int fscanf(FILE *fp, char *format, ...);
int fprintf(FILE *fp, char *format, ...);
char *fgets(char *str, int max, FILE *fp);
int fputs(char *str, FILE *fp);
int getc(FILE *fp);
int putc(int ch, FILE *fp);

the new argument is the file pointer fp


2.1. Standard FILE* Constants

Name      Meaning
stdin     standard input
stdout    standard output
stderr    standard error

e.g.
if (len >= MAX_LEN)
  fprintf(stderr, "String is too long\n");


2.2. Opening / Closing

FILE *fopen(char *filename, char *mode);

int fclose(FILE *fp);

fopen() modes:

Mode    Meaning
"r"     read mode
"w"     write mode
"a"     append mode


Careful Opening

FILE *fp;   /* file pointer */
char *fname = "myfile.dat";

if ((fp = fopen(fname, "r")) == NULL) {
  fprintf(stderr, "Error opening %s\n", fname);
  exit(1);
}
...   /* file opened okay */


2.3. Text I/O

As with standard I/O:
– formatted I/O (fprintf, fscanf)
– line I/O (fgets, fputs)
– character I/O (getc, putc)


2.3.1. Formatted I/O

int fscanf(FILE *fp, char *format, ...);

int fprintf(FILE *fp, char *format, ...);

fscanf() returns EOF if an error or end-of-file occurs before any input is converted; fprintf() returns a negative value if an error occurs.

If okay, fscanf() returns the number of variables assigned, fprintf() returns the number of output characters.
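The return value of fscanf() is what normally drives a read loop. A minimal sketch, assuming a stream of whitespace-separated integers; the helper name sum_ints is hypothetical and not part of the slides:

```c
#include <stdio.h>

/* Sum whitespace-separated integers read from fp.
 * The loop continues only while fscanf() reports exactly one
 * assigned variable, so it stops at end-of-file, on a read error,
 * or at the first token that is not an integer.
 * Returns how many integers were read; the sum is stored in *total. */
int sum_ints(FILE *fp, long *total)
{
    int n, count = 0;

    *total = 0;
    while (fscanf(fp, "%d", &n) == 1) {
        *total += n;
        count++;
    }
    return count;
}
```

Comparing against 1 (the number of conversions requested) rather than against EOF also catches malformed input, which would otherwise loop forever.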


2.3.2. Line I/O

char *fgets(char *str, int max, FILE *fp);

int fputs(char *str, FILE *fp);

If an error or EOF occurs, fgets() returns NULL, fputs() returns EOF.

If okay, fgets() returns pointer to string, fputs() returns non-negative integer.


Differences between fgets() and gets()

Use of max argument: fgets() reads in at most max-1 chars (so there is room for '\0').

fgets() retains the input '\n'

Deleting the '\n':

len = strlen(line);
if (len > 0 && line[len-1] == '\n')   /* to be safe */
  line[len-1] = '\0';
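The read-then-strip step can be wrapped in one function. This is a sketch only: read_line is a hypothetical helper name, and it uses strcspn() as an alternative to the strlen() test on the slide:

```c
#include <stdio.h>
#include <string.h>

/* Read one line from fp into buf (at most size-1 chars) and remove
 * the trailing '\n' if fgets() stored one.
 * Returns buf, or NULL on error or end-of-file. */
char *read_line(char *buf, int size, FILE *fp)
{
    if (fgets(buf, size, fp) == NULL)
        return NULL;
    /* strcspn() gives the index of the first '\n', or of the
     * terminating '\0' when the line has no newline. */
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}
```

Because strcspn() falls back to the index of the terminator, the assignment is harmless for an empty string or for a last line that has no newline.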


Difference between fputs() and puts()

fputs() does not add a '\n' to the output.


Line-by-line Echo

#define MAX 100   /* max line length */
  :
void output_file(char *fname)
{
  FILE *fp;
  char line[MAX];

  if ((fp = fopen(fname, "r")) == NULL) {
    fprintf(stderr, "Error opening %s\n", fname);
    exit(1);
  }
  while (fgets(line, MAX, fp) != NULL)
    fputs(line, stdout);
  fclose(fp);
}


2.3.3. Character I/O

int getc(FILE *fp);

int putc(int ch, FILE *fp);

Both return EOF if an error or end-of-file occurs.

Can also use fgetc() and fputc().


Char-by-char Echo

#define MAX 100   /* max line length */
  :
void output_file(char *fname)
{
  FILE *fp;
  int ch;

  if ((fp = fopen(fname, "r")) == NULL) {
    fprintf(stderr, "Error opening %s\n", fname);
    exit(1);
  }
  while ((ch = getc(fp)) != EOF)
    putc(ch, stdout);
  fclose(fp);
}


Using feof()

Rewrite the previous while-loop as:

while (!feof(fp)) {
  ch = getc(fp);
  if (ch != EOF)   /* feof() is only true after a read has failed */
    putc(ch, stdout);
}

– not a common coding style.


3. Error Handling

int ferror(FILE *fp);

– check error status of file stream
– it returns non-zero if there is an error

void clearerr(FILE *fp);

– reset error status
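Since getc() returns EOF for both end-of-file and a read error, ferror() is what tells the two apart. A minimal sketch, with copy_stream as a hypothetical helper name:

```c
#include <stdio.h>

/* Copy everything from in to out, one character at a time.
 * Returns the number of characters copied, or -1 if a read or
 * write error occurred (as opposed to a normal end-of-file). */
long copy_stream(FILE *in, FILE *out)
{
    long count = 0;
    int ch;

    while ((ch = getc(in)) != EOF) {
        if (putc(ch, out) == EOF)
            return -1;          /* write error */
        count++;
    }
    if (ferror(in)) {           /* EOF was caused by an error */
        clearerr(in);           /* reset so the stream can be reused */
        return -1;
    }
    return count;               /* EOF was a real end-of-file */
}
```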


void perror(char *str);

– print str (usually a filename) followed by colon and a system-defined error message

...
fp = fopen(fname, "r");
if (fp == NULL) {
  perror(fname);
  exit(1);
}

common in advanced coding


errno

The system error message is based on a system error number (errno) which is set when a library function returns an error. errno is only meaningful after a function has reported failure.

#include <errno.h>
...
fp = fopen(fname, "r");
if (fp == NULL && errno == ...)
  ...


Many errno integer constants are defined in errno.h
– it is better style to use the constant name instead of the number
– Linux distributions usually put most errno constants in asm/errno.h

Example errno constants:

EPERM     operation not permitted
EACCES    permission denied
ENOENT    no such file / directory
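Testing errno against a named constant might look like the sketch below. It assumes a POSIX system, where fopen() sets errno on failure (ISO C alone does not promise this); open_or_report is a hypothetical helper name:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Try to open fname for reading.
 * On failure, print a message chosen from errno and return NULL;
 * errno is left holding the error from fopen(). */
FILE *open_or_report(const char *fname)
{
    FILE *fp = fopen(fname, "r");

    if (fp == NULL) {
        int err = errno;    /* save it: later calls may change errno */

        if (err == ENOENT)
            fprintf(stderr, "%s: does not exist\n", fname);
        else
            fprintf(stderr, "%s: %s\n", fname, strerror(err));
        errno = err;        /* restore for the caller */
    }
    return fp;
}
```

strerror() returns the same system-defined message that perror() prints, so the else branch is roughly what perror(fname) would produce.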


4. Binary Files

For storing non-character data
– arrays, structs, integers (as bytes), GIFs, compressed data

Not portable across different systems
– unless you have cross-platform reading/writing utilities, such as gzip

For portability, use text files


fopen() modes for Binary Files

Mode    Meaning
"rb"    read binary file
"wb"    write binary file
"ab"    append to binary file

add a "b" to the text file modes


Reading / Writing

size_t fread(void *buffer, size_t size, size_t num, FILE *fp);

size_t fwrite(void *buffer, size_t size, size_t num, FILE *fp);

Both return the number of items read/written. A count smaller than num means an error or end-of-file occurred (they do not return EOF); use feof() and ferror() to tell which.


Example

The code will write to a binary file containing employee records with the following type structure:

#define MAX_NAME_LEN 50

struct employee {
  int salary;
  char name[MAX_NAME_LEN + 1];
};


struct employee e1, emps[MAX];
  :

/* write the struct to fp */
fwrite(&e1, sizeof(struct employee), 1, fp);

/* write all of the array with 1 op */
fwrite(emps, sizeof(struct employee), MAX, fp);
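A write is only half of the story; the same records can be read back with fread(), and both calls should have their item counts checked. A minimal sketch using the struct from the slides, with save_emps and load_emps as hypothetical helper names:

```c
#include <stdio.h>

#define MAX_NAME_LEN 50

struct employee {
    int salary;
    char name[MAX_NAME_LEN + 1];
};

/* Write n records to fp with one fwrite() call.
 * Returns 1 if all n were written, 0 on a short write. */
int save_emps(FILE *fp, const struct employee *emps, size_t n)
{
    return fwrite(emps, sizeof(struct employee), n, fp) == n;
}

/* Read up to max records from fp into emps.
 * Returns the number of complete records actually read, which is
 * smaller than max when the file holds fewer records. */
size_t load_emps(FILE *fp, struct employee *emps, size_t max)
{
    return fread(emps, sizeof(struct employee), max, fp);
}
```

The file produced this way contains the struct's in-memory bytes, padding included, so it is only guaranteed to be readable by a program built for the same platform and compiler settings.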


5. Direct Access

Direct access: move to any record in the binary file and then read (you do not have to read the others before it).

e.g. a move to the 5th employee record would mean a move of size:

4 * sizeof(struct employee)


fopen() Modes for Direct Access (+)

Mode     Meaning
"rb+"    open binary file for read/write
"wb+"    create/clear binary file for read/write
"ab+"    open/create binary file for read/write at the end


Employees Example

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DF "employees.dat"
#define MAX_NAME_LEN 50

struct employee {
  int salary;
  char name[MAX_NAME_LEN + 1];
};

int num_emps = 0;   /* num of employees in DF */
FILE *fp;
  :

Poor style: global variables


Data Format

[Figure: layout of employees.dat. The file starts with the number of employees, followed by the records e1 e2 e3 e4 . . . ; annotation: "empty space of the right size"]

The basic coding technique is to store the number of employees currently in the file (e.g. 4)
– some functions will need this number in order to know where the end of the data occurs


Open the Data File

void open_file(void)
{
  if ((fp = fopen(DF, "rb+")) == NULL) {
    fp = fopen(DF, "wb+");   /* create file */
    num_emps = 0;            /* initial num. */
  }
  else   /* opened file, read in num. */
    fread(&num_emps, sizeof(num_emps), 1, fp);
}


Move with fseek()

int fseek(FILE *fp, long offset, int origin);

Movement is specified with a starting position and offset from there.

The current position in the file is indicated with the file position pointer (not the same as fp).


Origin and Offset

fseek() origin values:

Name        Value    Meaning
SEEK_SET    0        beginning of file
SEEK_CUR    1        current position
SEEK_END    2        end of file

Offset is a large (long) integer
– can be negative (i.e. move backwards)
– equals the number of bytes to move
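The origin-plus-offset idea can be tried on a simpler file than the employees one. A minimal sketch, assuming a binary file that holds nothing but int values; read_nth_int is a hypothetical helper name:

```c
#include <stdio.h>

/* Fetch the int at zero-based index idx from a binary file of ints.
 * The byte offset is measured from the start of the file (SEEK_SET).
 * Returns 1 on success, 0 if the seek or the read fails. */
int read_nth_int(FILE *fp, long idx, int *out)
{
    if (fseek(fp, idx * (long)sizeof(int), SEEK_SET) != 0)
        return 0;
    return fread(out, sizeof(int), 1, fp) == 1;
}
```

Seeking past the end of the file is not itself an error on UNIX, so it is the fread() count that detects an index that is too large.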


Employees Continued

void put_rec(int posn, struct employee *ep)
/* write an employee at position posn */
{
  long loc;

  loc = sizeof(num_emps) + ((posn-1)*sizeof(struct employee));
  fseek(fp, loc, SEEK_SET);
  fwrite(ep, sizeof(struct employee), 1, fp);
}

Can write anywhere

No checking to avoid over-writing.


Read in an Employee

void get_rec(int posn, struct employee *ep)
/* read in employee at position posn */
{
  long loc;

  loc = sizeof(num_emps) + ((posn-1)*sizeof(struct employee));
  fseek(fp, loc, SEEK_SET);
  fread(ep, sizeof(struct employee), 1, fp);
}

should really check if ep contains something
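One way to add the missing checks is sketched below. It is not the slides' code: get_rec_checked is a hypothetical variant that takes the stream and the employee count as parameters instead of using the globals, and assumes the same file layout (an int count followed by the records):

```c
#include <stdio.h>

#define MAX_NAME_LEN 50

struct employee {
    int salary;
    char name[MAX_NAME_LEN + 1];
};

/* Read the employee at 1-based position posn into *ep.
 * Returns 1 if a complete record was read, 0 if posn is out of
 * range or the seek/read failed (in which case *ep is not valid). */
int get_rec_checked(FILE *fp, int num_emps, int posn,
                    struct employee *ep)
{
    long loc;

    if (posn < 1 || posn > num_emps)
        return 0;
    /* skip the int count at the start, then posn-1 whole records */
    loc = (long)sizeof(int) + (posn - 1) * (long)sizeof(struct employee);
    if (fseek(fp, loc, SEEK_SET) != 0)
        return 0;
    return fread(ep, sizeof(struct employee), 1, fp) == 1;
}
```

The caller can then test the result before touching the record, e.g. if (get_rec_checked(fp, num_emps, 3, &e)) printf("%s\n", e.name);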


Close Employees File

void close_file(void)
{
  rewind(fp);   /* same as fseek(fp, 0, 0); */

  /* update num. of employees */
  fwrite(&num_emps, sizeof(num_emps), 1, fp);
  fclose(fp);
}


ftell()

Return current position of the file position pointer (i.e. its offset in bytes from the start of the file):

long ftell(FILE *fp);
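A common use of ftell() together with fseek() is finding the size of a file. The sketch below assumes a UNIX system, where seeking to SEEK_END on a binary stream is reliable (ISO C does not require it); file_size is a hypothetical helper name:

```c
#include <stdio.h>

/* Return the size of the open file in bytes, or -1L on failure.
 * The file position pointer is put back where it was. */
long file_size(FILE *fp)
{
    long here, size;

    here = ftell(fp);               /* remember current position */
    if (here == -1L)
        return -1L;
    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1L;
    size = ftell(fp);               /* offset of the end = size */
    if (fseek(fp, here, SEEK_SET) != 0)
        return -1L;
    return size;
}
```

For the employees file, (file_size(fp) - sizeof(int)) / sizeof(struct employee) would give the number of record slots currently stored.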


6. Temporary Files

FILE *tmpfile(void);          /* create a temp file */

char *tmpnam(char *name);     /* create a unique name */

tmpfile() opens file with "wb+" mode; removed when program exits


7. Renaming & Removing

int rename(char *old_name, char *new_name);

– like mv in UNIX

int remove(char *filename);

– like rm in UNIX
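Both calls return 0 on success and non-zero on failure. One place they appear together is the "write to a scratch file, then rename it into place" idiom, sketched here; write_then_rename and its parameter names are hypothetical:

```c
#include <stdio.h>

/* Write text to tmp_name, then rename it to final_name.
 * If any step fails, the scratch file is removed.
 * Returns 1 on success, 0 on failure. */
int write_then_rename(const char *tmp_name, const char *final_name,
                      const char *text)
{
    FILE *fp = fopen(tmp_name, "w");

    if (fp == NULL)
        return 0;
    if (fputs(text, fp) == EOF) {
        fclose(fp);
        remove(tmp_name);
        return 0;
    }
    if (fclose(fp) == EOF) {        /* buffered data failed to flush */
        remove(tmp_name);
        return 0;
    }
    if (rename(tmp_name, final_name) != 0) {
        remove(tmp_name);
        return 0;
    }
    return 1;
}
```

The point of the design is that readers of final_name never see a half-written file, provided both names are on the same file system.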


8. Character Pushback

int ungetc(int ch, FILE *fp);

Overcomes some problems with reading too much
– 1 character lookahead can be coded

ungetc() only works once between getc() calls

Cannot push back EOF
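One-character lookahead typically shows up when reading a token whose end is only known after reading one character too many. A minimal sketch, with read_digits as a hypothetical helper name (no overflow checking):

```c
#include <ctype.h>
#include <stdio.h>

/* Read a run of decimal digits from fp and store its value in *value.
 * The first non-digit character is pushed back so that the next
 * getc() on fp will see it.
 * Returns the number of digits read (0 if there were none). */
int read_digits(FILE *fp, long *value)
{
    int ch, ndigits = 0;

    *value = 0;
    while ((ch = getc(fp)) != EOF && isdigit(ch)) {
        *value = *value * 10 + (ch - '0');
        ndigits++;
    }
    if (ch != EOF)          /* EOF cannot be pushed back */
        ungetc(ch, fp);
    return ndigits;
}
```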


9. Buffering

int fflush(FILE *fp);

– e.g. fflush(stdout);

Flush partial lines
– overcomes output line buffering

stderr is not buffered.


setbuf()

void setbuf(FILE *fp, char *buffer);

Most common use is to switch off buffering:

setbuf(stdout, NULL);

– equivalent to fflush() after every output function call


10. Redirecting I/O

FILE *freopen(char *filename, char *mode, FILE *fp);

– opens the file with the mode and associates the stream with it

Most common use is to redirect stdin, stdout, stderr to mean the file

It is better style (usually) to use I/O redirection at the UNIX level.


FILE *in;
int n;

in = freopen("infile", "r", stdin);
if (in == NULL) {
  perror("infile");
  exit(1);
}
scanf("%d", &n);   /* read from infile */
  :

fclose(in);