03 unix files
-
7/30/2019 03 Unix Files
UNIX Files
by Armin R. Mikler
-
Overview
Files in UNIX
Directories and Paths
User vs. System Mode
UNIX I/O primitives
A simple example
The basic I/O system calls
Buffered vs. un-buffered I/O
File Locking
Ownership and Permissions
-
Files
UNIX input/output operations are based on the concept of files.
Files are an abstraction of specific I/O devices.
A very small set of system calls provides the primitives that give direct access to the I/O facilities of the UNIX kernel.
Most I/O operations rely on the use of these primitives.
We must remember that the basic I/O primitives are system calls, executed by the kernel. What does that mean to us as programmers?
-
User and System Space
[Figure: in user space, program code calls the library routine fread(), which in turn calls read(); read() then executes kernel code in kernel space.]
-
Different types of files
UNIX deals with two different classes of files:
Special files
Regular files
Regular files are just ordinary data files on disk, something you have used all along when you studied programming!
Special files are abstractions of devices. UNIX deals with devices as if they were regular files.
The interface between the file system and the device is implemented through a device driver, a program that hides the details of the actual device.
-
Special files
UNIX distinguishes two types of special files:
Block special files represent a device with characteristics similar to a disk. The device driver transfers chunks or blocks of data between the operating system and the device.
Character special files represent devices with characteristics similar to a keyboard. The device is abstracted by a stream of bytes that can only be accessed in sequential order.
-
Access Primitives
UNIX provides access to files and devices through a (very) small set of basic system calls (primitives):
creat()
open()
close()
read()
write()
ioctl()
-
The open() call
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int open(const char *path, int flags, [mode_t mode]);
char *path: a string that contains the pathname (absolute or relative) of the file to be opened.
int flags: specifies the method of access, i.e. read only (O_RDONLY), write only (O_WRONLY), or read and write (O_RDWR).
mode_t mode: optional parameter used to set the access permissions upon file creation.
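The two most common ways of calling open() might look like the sketch below. The function names open_existing and open_fresh are made up for illustration, and the mode 0644 is only an example value.

```c
#include <fcntl.h>      /* open() and the O_* flags */
#include <sys/stat.h>
#include <sys/types.h>

/* Open an existing file read-only.
 * Returns a file descriptor, or -1 on error. */
int open_existing(const char *path)
{
    return open(path, O_RDONLY);
}

/* Create the file if it does not exist, empty it if it does, and open
 * it for writing. The mode argument (0644 here) is only consulted when
 * the file is actually created. Returns a descriptor, or -1 on error. */
int open_fresh(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
}
```

A return value of -1 always means the call failed; errno then holds the reason.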
-
read() and write()
#include <unistd.h>
ssize_t read(int filedes, void *buffer, size_t n);
ssize_t write(int filedes, const void *buffer, size_t n);
int filedes: file descriptor that has been obtained through an open() or creat() call.
void *buffer: pointer to an array that will hold the data that is read or holds the data to be written.
size_t n: the number of bytes that are to be read or written from/to the file.
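Both calls return the number of bytes actually transferred, which may be smaller than n. A minimal sketch of a copy loop that copes with this follows; write_all and copy_fd are hypothetical helper names, and the 4096-byte buffer is an arbitrary choice.

```c
#include <errno.h>
#include <unistd.h>

/* Write exactly n bytes unless an error occurs.
 * Returns 0 on success, -1 on error. */
int write_all(int fd, const void *buffer, size_t n)
{
    const char *p = buffer;

    while (n > 0) {
        ssize_t written = write(fd, p, n);
        if (written == -1) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        p += written;               /* skip what has been written */
        n -= (size_t)written;
    }
    return 0;
}

/* Copy everything from src to dst until read() reports end of file.
 * Returns the number of bytes copied, or -1 on error. */
long copy_fd(int src, int dst)
{
    char buf[4096];
    long total = 0;
    ssize_t nread;

    while ((nread = read(src, buf, sizeof buf)) != 0) {
        if (nread == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (write_all(dst, buf, (size_t)nread) == -1)
            return -1;
        total += nread;
    }
    return total;
}
```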
-
A close() call
Although all open files are closed by the OS upon completion of the program, it is good programming style to clean up after you are done with any system resource.
Please make it a habit to close all files that your program has used as soon as you don't need them anymore!
#include <unistd.h>
int close(int filedes);
Remember, closing resources in a timely manner can improve system performance and prevent deadlocks from happening (more later).
-
A rudimentary example:
#include <fcntl.h>   /* controls file attributes */
#include <unistd.h>  /* defines symbolic constants */
int main(void)
{
    int fd;          /* a file descriptor */
    ssize_t nread;   /* number of bytes read */
    char buf[1024];  /* data buffer */
    /* open the file "data" for reading */
    fd = open("data", O_RDONLY);
    if (fd == -1)
        return 1;    /* the file could not be opened */
    /* read in the data */
    nread = read(fd, buf, sizeof buf);
    /* close the file */
    close(fd);
    return nread == -1;  /* non-zero exit status if the read failed */
}
-
Directories and Paths
At each point in time, every process has an associated working directory, which is used for pathname resolution.
If the pathname does not start with a /, the path is assumed to start in the current directory.
A pathname starting with ./ refers to the current directory.
A pathname starting with ../ refers to the parent directory.
These pathnames are referred to as relative pathnames.
The current directory associated with your shell at login is referred to as the home directory.
Question: What is the purpose of the search path?
-
Some useful functions
char *getcwd(char *buf, size_t size);
returns the pathname of the current working directory.
long sysconf(int name);
returns values of system-wide limits such as clock ticks per second and the number of processes allowed per user.
long pathconf(const char *path, int name);
long fpathconf(int filedes, int name);
these functions report limits that are associated with a particular file or directory, e.g., the maximum path length.
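As a rough illustration, the sketch below queries each of these functions once. print_limits is a made-up name, the 4096-byte buffer is assumed to be large enough for the demo, and the _SC_* and _PC_* constants are the standard POSIX names for the limits mentioned above.

```c
#include <stdio.h>
#include <unistd.h>

/* Print the working directory and a few configuration limits.
 * Returns 0 on success, -1 if the working directory cannot be read. */
int print_limits(void)
{
    char cwd[4096];   /* assumed large enough for this demo */

    if (getcwd(cwd, sizeof cwd) == NULL)
        return -1;
    printf("working directory:       %s\n", cwd);
    printf("clock ticks per second:  %ld\n", sysconf(_SC_CLK_TCK));
    printf("max processes per user:  %ld\n", sysconf(_SC_CHILD_MAX));
    printf("max path length here:    %ld\n", pathconf(".", _PC_PATH_MAX));
    printf("max file name length:    %ld\n", pathconf(".", _PC_NAME_MAX));
    return 0;
}
```

sysconf() and pathconf() return -1 when a limit is indeterminate, so a printed -1 does not necessarily indicate an error.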
-
Navigating through Directories
An important UNIX command is the find command:
find path ... [operand_expression]
Have a look at the manual pages; this command is rather complex!
There are a number of library routines that are related to directory navigation:
opendir()
readdir()
rewinddir()
closedir()
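A minimal sketch of how these routines fit together is shown below. list_directory is a hypothetical helper that prints the entries of one directory and skips the "." and ".." entries.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Print every entry of a directory except "." and "..".
 * Returns the number of entries printed, or -1 if the directory
 * cannot be opened. */
int list_directory(const char *path)
{
    DIR *dir = opendir(path);
    struct dirent *entry;
    int count = 0;

    if (dir == NULL)
        return -1;
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;
        printf("%s\n", entry->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```

Calling rewinddir(dir) before closedir() would reset the stream so that the same directory could be scanned a second time.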
-
The one who seeks shall find
The OS remembers the current position of the read-write pointer of each open file. The read-write pointer indicates which byte is the next to be read from (or written to) the file.
The lseek() system call enables the user to change the position of the read-write pointer.
off_t lseek(int filedes, off_t offset, int start_flag);
off_t offset: number of bytes to move from the start position.
int start_flag: indicates from where the offset is going to be applied.
-
start_flag
SEEK_SET (0): Measure the offset from the beginning of the file
SEEK_CUR (1): Measure the offset from the current position
SEEK_END (2): Measure the offset from the end of the file
Example:
newpos = lseek(fd, (off_t)-16, SEEK_END);
sets the read-write pointer 16 bytes before the end of the file!
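The three start flags can be combined into small utilities, as in the following sketch. file_size and read_tail are illustrative names and not part of any library.

```c
#include <sys/types.h>
#include <unistd.h>

/* Return the size of an open file by seeking to its end, then restore
 * the previous position. Returns -1 on error (e.g. fd is a pipe). */
off_t file_size(int fd)
{
    off_t here = lseek(fd, 0, SEEK_CUR);   /* remember where we are */
    off_t end;

    if (here == (off_t)-1)
        return -1;
    end = lseek(fd, 0, SEEK_END);          /* offset 0 from the end */
    if (end == (off_t)-1)
        return -1;
    if (lseek(fd, here, SEEK_SET) == (off_t)-1)
        return -1;
    return end;
}

/* Read the last n bytes of a file into buf.
 * Returns the number of bytes read, or -1 on error, for instance when
 * the file is shorter than n bytes. */
ssize_t read_tail(int fd, void *buf, size_t n)
{
    if (lseek(fd, -(off_t)n, SEEK_END) == (off_t)-1)
        return -1;
    return read(fd, buf, n);
}
```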
-
Buffered vs unbuffered I/O
The system can execute in user mode or kernel mode!
Memory is divided into user space and kernel space!
What happens when we write to a file?
the write call forces a context switch to the system. What??
the system copies the specified number of bytes from user space into kernel space (into mbufs).
the system wakes up the device driver to write these mbufs to the physical device (if the file system is in synchronous mode).
the system selects a new process to run.
finally, control is returned to the process that executed the write call.
Discuss the effects on the performance of your program!
-
Un-buffered I/O
Every read and write is executed by the kernel.
Hence, every read and write will cause a context switch in order for the system routines to execute.
Why do we suffer performance loss?
How can we reduce the loss of performance?
==> We could try to move as much data as possible with each system call.
How can we measure the performance?
-
Buffered I/O
explicit versus implicit buffering:
explicit - collect as many bytes as you can before writing to the file, and read more than a single byte at a time.
However, use the basic UNIX I/O primitives.
Careful!! Your program may behave differently on different systems.
Here, the programmer is explicitly controlling the buffer size.
implicit - use the stream facility provided by the standard I/O library:
FILE *fd, fopen, fprintf, fflush, fclose, etc.
a FILE structure contains a buffer (in user space) that is usually the size of the disk blocking factor (512 or 1024).
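Implicit buffering might be used as in the sketch below, where many small fprintf() calls typically end up as only a few write() system calls. write_numbers is an illustrative name; the actual buffer size is whatever the C library chooses.

```c
#include <stdio.h>

/* Write the numbers 1..count to a file, one per line, through a
 * stdio stream. Returns 0 on success, -1 on error. */
int write_numbers(const char *path, int count)
{
    FILE *fp = fopen(path, "w");
    int i;

    if (fp == NULL)
        return -1;
    for (i = 1; i <= count; i++) {
        /* usually this only appends to the user-space buffer */
        if (fprintf(fp, "%d\n", i) < 0) {
            fclose(fp);
            return -1;
        }
    }
    /* fflush() hands whatever is still buffered to the kernel */
    if (fflush(fp) == EOF) {
        fclose(fp);
        return -1;
    }
    return fclose(fp) == EOF ? -1 : 0;
}
```

fclose() flushes the buffer as well, so the explicit fflush() is shown only to make the step visible.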
-
File Locking
Consider the following problem:
Programs can obtain a unique integer by reading from a file. The file contains a single integer (at all times), which must be incremented by the program that executes a read. Since multiple programs can compete for the file (a unique integer), we must make sure that the file access is synchronized.
HOW??
What happens if we use buffered I/O?
-
lockf(): File & Record Locking
lockf() is a C library function for locking records of a file. Its prototype is
int lockf(int fd, int func, long size);
func parameters are:
F_ULOCK: 0 (unlock a locked section)
F_LOCK: 1 (lock a section)
F_TLOCK: 2 (test and lock a section)
F_TEST: 3 (test a section for locks)
See the UNIX manual pages!!
-
lockf() cont'd
If we rewind the file before locking AND use a size of 0L as the corresponding size parameter, the entire file is locked.
lseek(fd, 0L, 0) can be used to rewind the file (fd) to the beginning.
The lockf() function provides both the ability to lock and the ability to test whether a lock is set.
If we try to F_LOCK a region that has already been locked by another process, the calling process is put to sleep until the region becomes available.
-
lockf() example
consider the following code segment:
...
if (lockf(fd, F_TEST, size) == 0) {
    rc = lockf(fd, F_LOCK, size);
    ...
}
NOTE: it is possible that right after the test has succeeded, another process locks the file. Your process will then have to wait until the region becomes available.
We could use rc = lockf(fd, F_TLOCK, size); to avoid this situation!
When would you use a non-blocking locking call? DISCUSS!
-
flock(): a 4.3BSD advisory lock
flock() is a UNIX system call to apply or remove an advisory lock on an open file.
The locking is only on an advisory basis (not absolute)! What does that mean?
Prototype:
int flock(int fd, int operation);
The operations are:
LOCK_SH: Shared lock
LOCK_EX: Exclusive lock
LOCK_UN: Unlock
LOCK_NB: modifier to declare non-blocking, e.g. (LOCK_SH | LOCK_NB) or (LOCK_EX | LOCK_NB)
-
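A flock() example
#include <stdio.h>     /* printf() */
#include <stdlib.h>    /* exit() */
#include <sys/file.h>  /* flock() */
void my_lock(int fd)
{
    if (flock(fd, LOCK_EX) == -1)
    {
        printf("error locking file %d\n\n", fd);
        exit(-1);
    }
}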
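A non-blocking variant, using the LOCK_NB modifier, might look like the sketch below. try_lock and release_lock are hypothetical names. When the lock is held elsewhere, flock() fails with errno set to EWOULDBLOCK instead of putting the caller to sleep.

```c
#include <errno.h>
#include <sys/file.h>

/* Try to take an exclusive lock without blocking.
 * Returns 1 if the lock was acquired, 0 if it is currently held
 * through another open of the file, and -1 on any other error. */
int try_lock(int fd)
{
    if (flock(fd, LOCK_EX | LOCK_NB) == 0)
        return 1;
    return errno == EWOULDBLOCK ? 0 : -1;
}

/* Release a lock taken with try_lock().
 * Returns 0 on success, -1 on error. */
int release_lock(int fd)
{
    return flock(fd, LOCK_UN);
}
```

A caller that gets 0 back can do other useful work and try again later.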
-
File Ownership and Permissions
Every file in UNIX has an owner, a group, and a set of permissions.
You can use the ls command to view the permissions set for a file:
ls -l
Trwxrwxrwx n owner group size date name
The first field, T:
- for an ordinary file
d for a directory
l for a symbolic link
p for a FIFO special file
-
Permissions cont'd
Trwxrwxrwx n owner group size date name
3 sets of rwx represent the read, write and execute permission flags for the owner, the group, and others, respectively.
n represents the number of links to this file or directory
owner represents the current owner of this file
group represents the group associated with this file
size is the number of bytes in this file
date consists of the date and time when the file was last modified
name is the name of the file
-
chmod
OWNER   GROUP   OTHERS
R W X   R W X   R W X
4 2 1   4 2 1   4 2 1
chmod 754 myfile
OWNER: Read, Write and Execute (4+2+1 = 7)
GROUP: Read and Execute (4+1 = 5)
OTHERS: Read (4)
-
The file creation mask: umask
Upon creating a new file, the operating system will apply default permissions.
The open() and creat() system calls take a mode argument (optional for open()), which allows for the specification of permissions for the file that is created.
filedes = open("datafile", O_CREAT, 0644)
However, this process is governed by a mask, which represents the bits that will always be turned off on a newly created file.
Effectively, the above open() call is executed as:
filedes = open("datafile", O_CREAT, (~mask) & mode);
-
example
Let's say the mask is set to 04+02+01 = 07
The call filedes = open("datafile", O_CREAT, 0644) will create the file datafile with permissions 640. WHY??
The question is: How can we determine the value of the file creation mask?
The system provides a system call umask(), which can be used to change the default creation mask.
A umask command is also available at the shell level, so that the default creation mask can be changed for every file that is created.
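The arithmetic behind the example can be checked with a few lines of C. effective_mode is a made-up helper that applies the (~mask) & mode rule to the nine permission bits.

```c
#include <sys/types.h>

/* Permissions a new file receives when `mode` is requested while the
 * file creation mask is `mask`. */
mode_t effective_mode(mode_t mode, mode_t mask)
{
    return mode & ~mask & 0777;
}
```

With mode 0644 and mask 07, the three bits for others are cleared and the result is 0640. With the common mask 022, a requested mode of 0666 becomes 0644.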
-
umask()
#include <sys/types.h>
#include <sys/stat.h>
mode_t umask(mode_t newmask);
umask() returns the old mask value!!
This is a good way of determining the default mask setting!
(see homework)
Example:
mode_t oldmask;
.....
oldmask = umask(022); /* What does the mask of 022 accomplish??? */
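Since umask() can only report the mask by replacing it, a program that merely wants to look at the mask has to set a temporary value and restore the old one right away. The sketch below does that and also creates a file under a chosen mask. current_umask and create_with_mask are hypothetical names.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Report the current file creation mask without changing it for
 * longer than the two calls take. */
mode_t current_umask(void)
{
    mode_t old = umask(0);   /* returns the previous mask */
    umask(old);              /* put it back */
    return old;
}

/* Create `path` with the requested mode while `mask` is in effect.
 * Returns the permission bits the file actually received, or -1 on
 * error (including the case that the file already exists). */
int create_with_mask(const char *path, mode_t mode, mode_t mask)
{
    mode_t old = umask(mask);
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, mode);
    struct stat st;
    int result = -1;

    umask(old);              /* always restore the caller's mask */
    if (fd == -1)
        return -1;
    if (fstat(fd, &st) == 0)
        result = (int)(st.st_mode & 0777);
    close(fd);
    return result;
}
```

The mask is a per-process setting, so the brief change made by current_umask() could be observed by other threads of the same process.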
-
The UNIX file system!
Each UNIX file has a description that is stored in a structure called an inode. An inode includes:
file size
location
owner (uid)
permissions
access times
etc.
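Most of this information can be read from a program with the stat() system call, which fills in a struct stat from the file's inode. A small sketch, with print_inode_info as a made-up name:

```c
#include <stdio.h>
#include <sys/stat.h>

/* Print a few of the inode fields that stat() exposes.
 * Returns 0 on success, -1 if the file cannot be examined. */
int print_inode_info(const char *path)
{
    struct stat st;

    if (stat(path, &st) == -1)
        return -1;
    printf("inode number: %lu\n", (unsigned long)st.st_ino);
    printf("size:         %ld bytes\n", (long)st.st_size);
    printf("owner (uid):  %lu\n", (unsigned long)st.st_uid);
    printf("permissions:  %o\n", (unsigned)(st.st_mode & 0777));
    printf("hard links:   %lu\n", (unsigned long)st.st_nlink);
    printf("last access:  %ld\n", (long)st.st_atime);
    return 0;
}
```

The location of the data blocks is not exposed through stat(); it stays internal to the file system.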
-
Directories
A UNIX directory is a file containing a correspondence between inodes and filenames.
When referencing a file, the OS traverses the FS tree to find the inode/name in the appropriate directory.
Once the OS has determined the inode number, it can access the inode to get information about the file.
-
Links
A link is an association between a filename and an inode. We distinguish 2 types of links:
hard links
soft (or symbolic) links
Directory entries are hard links, as they directly link an inode to a filename.
Symbolic links use the file (contents) as a pointer to another filename.
-
More on links
each inode may be pointed to by a number of directory entries (hard links)
each inode keeps a counter, indicating how many hard links exist to that inode.
When a hard link is removed via the rm or unlink command, the OS removes the corresponding link but does not free the inode and the corresponding data blocks until the link count is 0.
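The link counter can be observed through the st_nlink field that stat() returns. The sketch below uses two hypothetical helpers: link_count reads the counter, and same_inode checks whether two names refer to the same inode.

```c
#include <sys/stat.h>

/* Number of hard links to the inode behind `path`, or -1 on error. */
long link_count(const char *path)
{
    struct stat st;

    return stat(path, &st) == -1 ? -1 : (long)st.st_nlink;
}

/* Return 1 if both names refer to the same inode on the same device,
 * 0 if they do not, and -1 on error. */
int same_inode(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) == -1 || stat(b, &sb) == -1)
        return -1;
    return sa.st_ino == sb.st_ino && sa.st_dev == sb.st_dev;
}
```

After link("a", "b") the count for either name is 2, and unlink("a") brings the count for "b" back to 1. Creating a symbolic link with symlink() leaves the count unchanged, because a symbolic link is a separate file with its own inode.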