SPSA (Unix Notes)


Upload: gaurav-leekha

Post on 14-Oct-2014


Page 1: SPSA(Unix Notes)

Unit 4

What is an Operating System?

An operating system (OS) is a resource manager. It takes the form of a set of software routines that allow users and application programs to access system resources (e.g. the CPU, memory, disks, modems, printers, network cards, etc.) in a safe, efficient and abstract way.

For example, an OS ensures safe access to a printer by allowing only one application program to send data directly to the printer at any one time. An OS encourages efficient use of the CPU by suspending programs that are waiting for I/O operations to complete to make way for programs that can use the CPU more productively. An OS also provides convenient abstractions (such as files rather than disk locations) which isolate application programmers and users from the details of the underlying hardware.

Fig. 1.1:  General operating system architecture

Fig. 1.1 presents the architecture of a typical operating system and shows how an OS succeeds in presenting users and application programs with a uniform interface without regard to the details of the underlying hardware. We see that:

The operating system kernel is in direct control of the underlying hardware. The kernel provides low-level device, memory and processor management functions (e.g. dealing with interrupts from hardware devices, sharing the processor among multiple programs, allocating memory for programs etc.)

Basic hardware-independent kernel services are exposed to higher-level programs through a library of system calls (e.g. services to create a file, begin execution of a program, or open a logical network connection to another computer).


Application programs (e.g. word processors, spreadsheets) and system utility programs (simple but useful application programs that come with the operating system, e.g. programs which find text inside a group of files) make use of system calls. Applications and system utilities are launched using a shell (a textual command line interface) or a graphical user interface that provides direct user interaction.

Basic features of UNIX OS

Operating systems (and different flavours of the same operating system) can be distinguished from one another by the system calls, system utilities and user interface they provide, as well as by the resource scheduling policies implemented by the kernel.

The UNIX operating system was designed to let a number of programmers access the computer at the same time and share its resources.

The operating system coordinates the use of the computer's resources, allowing one person, for example, to run a spell-check program while another creates a document, another edits a document, another creates graphics, and yet another formats a document -- all at the same time, with each user oblivious to the activities of the others.

The operating system controls all of the commands from all of the keyboards and all of the data being generated, and permits each user to believe he or she is the only person working on the computer.

This real-time sharing of resources makes UNIX one of the most powerful operating systems ever.

Although UNIX was developed by programmers for programmers, it provides an environment so powerful and flexible that it is found in businesses, sciences, academia, and industry. Many telecommunications switches and transmission systems also are controlled by administration and maintenance systems based on UNIX.

While initially designed for medium-sized minicomputers, the operating system was soon moved to larger, more powerful mainframe computers. As personal computers grew in popularity, versions of UNIX found their way into these boxes, and a number of companies produce UNIX-based machines for the scientific and programming communities.

The uniqueness of UNIX

The features that made UNIX a hit from the start are:

Multitasking capability

Multiuser capability

Portability

UNIX programs

Library of application software

Multitasking

Many computers do just one thing at a time, as anyone who uses a PC or laptop can attest. Try logging onto your company's network while opening your browser while opening a word processing program. Chances are the processor will freeze for a few seconds while it sorts out the multiple instructions.


UNIX, on the other hand, lets a computer do several things at once, such as printing out one file while the user edits another file. This is a major feature for users, since users don't have to wait for one application to end before starting another one.

Multiusers

The same design that permits multitasking permits multiple users to use the computer. The computer can take the commands of a number of users -- determined by the design of the computer -- to run programs, access files, and print documents at the same time.

The computer can't tell the printer to print all the requests at once, but it does prioritize the requests to keep everything orderly. It also lets several users access the same document by compartmentalizing the document so that the changes of one user don't override the changes of another user.

System portability

A major contribution of the UNIX system was its portability, permitting it to move from one brand of computer to another with a minimum of code changes. At a time when different computer lines of the same vendor didn't talk to each other -- let alone machines of multiple vendors -- this meant great savings in both hardware and software upgrades.

It also meant that the operating system could be upgraded without having to re-enter all the customer's data. And new versions of UNIX were backward compatible with older versions, making it easier for companies to upgrade in an orderly manner.

UNIX tools

UNIX comes with hundreds of programs that can be divided into two classes:

Integral utilities that are absolutely necessary for the operation of the computer, such as the command interpreter, and

Tools that aren't necessary for the operation of UNIX but provide the user with additional capabilities, such as typesetting capabilities and e-mail.

Architecture of the Linux Operating System

Linux has all of the components of a typical OS (at this point you might like to refer back to Fig 1.1):

Kernel

The Linux kernel includes device driver support for a large number of PC hardware devices (graphics cards, network cards, hard disks etc.), advanced processor and memory management features, and support for many different types of filesystems (including DOS floppies and the ISO9660 standard for CDROMs). In terms of the services that it provides to application programs and system utilities, the kernel implements most BSD and SYSV system calls, as well as the system calls described in the POSIX.1 specification.

The kernel (in raw binary form that is loaded directly into memory at system startup time) is typically found in the file /boot/vmlinuz, while the source files can usually be found in /usr/src/linux. The latest version of the Linux kernel sources can be downloaded from http://www.kernel.org/.


Shells and GUIs

Linux supports two forms of command input: through textual command line shells similar to those found on most UNIX systems (e.g. sh - the Bourne shell, bash - the Bourne again shell and csh - the C shell) and through graphical interfaces (GUIs) such as the KDE and GNOME window managers. If you are connecting remotely to a server your access will typically be through a command line shell.

System Utilities

Virtually every system utility that you would expect to find on standard implementations of UNIX (including every system utility described in the POSIX.2 specification) has been ported to Linux. This includes commands such as ls, cp, grep, awk, sed, bc, wc, more, and so on. These system utilities are designed to be powerful tools that do a single task extremely well (e.g. grep finds text inside files while wc counts the number of words, lines and bytes inside a file). Users can often solve problems by interconnecting these tools instead of writing a large monolithic application program.

Like other UNIX flavours, Linux's system utilities also include server programs called daemons which provide remote network and administration services (e.g. telnetd and sshd provide remote login facilities, lpd provides printing services, httpd serves web pages, crond runs regular system administration tasks automatically). A daemon (probably derived from the Greek word for a beneficent spirit who watches over someone, or perhaps short for "Disk And Execution MONitor") is usually spawned automatically at system startup and spends most of its time lying dormant (lurking?) waiting for some event to occur.

Application programs

Linux distributions typically come with several useful application programs as standard. Examples include the emacs editor, xv (an image viewer), gcc (a C compiler), g++ (a C++ compiler), xfig (a drawing package), latex (a powerful typesetting language) and soffice (StarOffice, which is an MS-Office style clone that can read and write Word, Excel and PowerPoint files).

Redhat Linux also comes with rpm, the Redhat Package Manager which makes it easy to install and uninstall application programs.

The UNIX File Structure

The UNIX operating system is built around the concept of a filesystem which is used to store all of the information that constitutes the long-term state of the system. This state includes the operating system kernel itself, the executable files for the commands supported by the operating system, configuration information, temporary workfiles, user data, and various special files that are used to give controlled access to system hardware and operating system functions.

Every item stored in a UNIX filesystem belongs to one of four types:

1. Ordinary files Ordinary files can contain text, data, or program information. Files cannot contain other files or directories. Unlike other operating systems, UNIX filenames are not broken into a name part and an extension part (although extensions are still frequently used as a means to classify files). Instead they can contain any keyboard character except for '/' and be up to 256 characters long (note however that characters such as *, ?, # and & have special meaning in most shells and should therefore not be used in filenames). Putting spaces in filenames also makes them difficult to manipulate - use the underscore '_' instead.

2. Directories Directories are containers or folders that hold files, and other directories.

3. Devices To provide applications with easy access to hardware devices, UNIX allows them to be used in much the same way as ordinary files. There are two types of devices in UNIX - block-oriented devices which transfer data in blocks (e.g. hard disks) and character-oriented devices that transfer data on a byte-by-byte basis (e.g. modems and dumb terminals).

4. Links A link is a pointer to another file. There are two types of links - a hard link to a file is indistinguishable from the file itself. A soft link (or symbolic link) provides an indirect pointer or shortcut to a file. A soft link is implemented as a directory file entry containing a pathname.

CPU Scheduling

Basic Concepts

Scheduling Criteria

Scheduling Algorithms

Multiple-Processor Scheduling

Real-Time Scheduling

Thread Scheduling

Operating Systems Examples

Java Thread Scheduling

Algorithm Evaluation

Basic concept

Maximum CPU utilization is obtained with multiprogramming

CPU–I/O Burst Cycle – process execution consists of a cycle of CPU execution and I/O wait

CPU burst distribution

Alternating Sequence of CPU And I/O Bursts


CPU Scheduler

Selects from among the processes in memory that are ready to execute, and allocates the CPU to one of them

CPU scheduling decisions may take place when a process:

1. Switches from running to waiting state

2. Switches from running to ready state

3. Switches from waiting to ready state

4. Terminates

Scheduling under 1 and 4 is nonpreemptive

All other scheduling is preemptive

Dispatcher

Dispatcher module gives control of the CPU to the process selected by the short-term scheduler; this involves:

o switching context

o switching to user mode

o jumping to the proper location in the user program to restart that program

Dispatch latency – time it takes for the dispatcher to stop one process and start another running

Scheduling Criteria

CPU utilization – keep the CPU as busy as possible

Throughput – # of processes that complete their execution per time unit


Turnaround time – amount of time to execute a particular process

Waiting time – amount of time a process has been waiting in the ready queue

Response time – amount of time it takes from when a request was submitted until the first response is produced, not output (for time-sharing environment)

Optimization Criteria

Max CPU utilization

Max throughput

Min turnaround time

Min waiting time

Min response time

First-Come, First-Served (FCFS) Scheduling

Process Burst Time

P1 24

P2 3

P3 3

Suppose that the processes arrive in the order: P1 , P2 , P3

The Gantt chart for the schedule is:

| P1 (0–24) | P2 (24–27) | P3 (27–30) |

Waiting time for P1 = 0; P2 = 24; P3 = 27

Average waiting time: (0 + 24 + 27)/3 = 17

Suppose that the processes arrive in the order

P2 , P3 , P1

The Gantt chart for the schedule is:

| P2 (0–3) | P3 (3–6) | P1 (6–30) |

Waiting time for P1 = 6; P2 = 0; P3 = 3

Average waiting time: (6 + 0 + 3)/3 = 3

Much better than previous case
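The two FCFS cases above can be checked with a short sketch (Python used for illustration; the function name is ours, not from the notes):

```python
def fcfs_waiting_times(bursts):
    """Waiting time of each process under FCFS, all arriving at time 0."""
    waits, elapsed = [], 0
    for burst in bursts:
        waits.append(elapsed)   # each process waits for everything queued before it
        elapsed += burst
    return waits

# Arrival order P1, P2, P3 (bursts 24, 3, 3):
print(fcfs_waiting_times([24, 3, 3]))   # [0, 24, 27], average 17
# Arrival order P2, P3, P1:
print(fcfs_waiting_times([3, 3, 24]))   # [0, 3, 6], average 3
```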


Convoy effect – short processes stuck behind a long process

Shortest-Job-First (SJF) Scheduling

Associate with each process the length of its next CPU burst. Use these lengths to schedule the process with the shortest time

Two schemes:

o nonpreemptive – once the CPU is given to a process it cannot be preempted until it completes its CPU burst

o preemptive – if a new process arrives with a CPU burst length less than the remaining time of the currently executing process, preempt. This scheme is known as Shortest-Remaining-Time-First (SRTF)

SJF is optimal – gives minimum average waiting time for a given set of processes

Example of Non-Preemptive SJF

Process Arrival Time Burst Time

P1 0.0 7

P2 2.0 4

P3 4.0 1

P4 5.0 4

SJF (non-preemptive) Gantt chart:

| P1 (0–7) | P3 (7–8) | P2 (8–12) | P4 (12–16) |

Average waiting time = (0 + 6 + 3 + 7)/4 = 4
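Non-preemptive SJF with arrival times can be sketched as follows (a minimal Python model; it breaks burst-length ties by list order, which matches the result in the notes):

```python
def sjf_waiting_times(procs):
    """Non-preemptive SJF. procs: list of (name, arrival, burst) tuples."""
    waiting, time, pending = {}, 0, list(procs)
    while pending:
        ready = [p for p in pending if p[1] <= time]
        if not ready:                          # CPU idle until the next arrival
            time = min(p[1] for p in pending)
            continue
        job = min(ready, key=lambda p: p[2])   # shortest next CPU burst wins
        waiting[job[0]] = time - job[1]
        time += job[2]
        pending.remove(job)
    return waiting

waits = sjf_waiting_times([("P1", 0.0, 7), ("P2", 2.0, 4), ("P3", 4.0, 1), ("P4", 5.0, 4)])
print(waits)                                   # P1: 0, P2: 6, P3: 3, P4: 7
print(sum(waits.values()) / 4)                 # 4.0
```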

Example of Preemptive SJF

Process Arrival Time Burst Time

P1 0.0 7

P2 2.0 4

P3 4.0 1


P4 5.0 4

SJF (preemptive) Gantt chart:

| P1 (0–2) | P2 (2–4) | P3 (4–5) | P2 (5–7) | P4 (7–11) | P1 (11–16) |

Average waiting time = (9 + 1 + 0 + 2)/4 = 3
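The preemptive variant (SRTF) can be simulated one time unit at a time; the sketch below (Python, names ours) reproduces the waiting times quoted above using waiting = completion − arrival − burst:

```python
def srtf_waiting_times(procs):
    """Preemptive SJF (SRTF), simulated one time unit at a time.
    procs: list of (name, arrival, burst)."""
    arrival = {n: a for n, a, b in procs}
    burst = {n: b for n, a, b in procs}
    remaining = dict(burst)
    completion, time = {}, 0
    while remaining:
        ready = [n for n in remaining if arrival[n] <= time]
        if not ready:
            time += 1
            continue
        run = min(ready, key=lambda n: remaining[n])   # shortest remaining time
        remaining[run] -= 1
        time += 1
        if remaining[run] == 0:
            completion[run] = time
            del remaining[run]
    return {n: completion[n] - arrival[n] - burst[n] for n in completion}

waits = srtf_waiting_times([("P1", 0, 7), ("P2", 2, 4), ("P3", 4, 1), ("P4", 5, 4)])
print(waits)                       # P1: 9, P2: 1, P3: 0, P4: 2
print(sum(waits.values()) / 4)     # 3.0
```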

Priority Scheduling

A priority number (integer) is associated with each process. The CPU is allocated to the process with the highest priority (smallest integer ≡ highest priority)

o Preemptive

o nonpreemptive

SJF is a priority scheduling where priority is the predicted next CPU burst time

Problem ≡ Starvation – low-priority processes may never execute

Solution ≡ Aging – as time progresses, increase the priority of the process
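Non-preemptive priority scheduling reduces to sorting the ready processes by priority number; a minimal sketch (the burst and priority values below are hypothetical, not from the notes):

```python
def priority_waiting_times(procs):
    """Non-preemptive priority scheduling, all processes arriving at time 0.
    procs: list of (name, priority, burst); smallest number = highest priority."""
    waits, elapsed = {}, 0
    for name, _, burst in sorted(procs, key=lambda p: p[1]):
        waits[name] = elapsed
        elapsed += burst
    return waits

# Hypothetical example: P2 (priority 1) runs first, then P4, P1, P3.
print(priority_waiting_times([("P1", 3, 10), ("P2", 1, 1), ("P3", 4, 2), ("P4", 2, 1)]))
# waiting times: P2 = 0, P4 = 1, P1 = 2, P3 = 12
```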

Round Robin (RR)

Each process gets a small unit of CPU time (time quantum), usually 10-100 milliseconds. After this time has elapsed, the process is preempted and added to the end of the ready queue.

If there are n processes in the ready queue and the time quantum is q, then each process gets 1/n of the CPU time in chunks of at most q time units at once. No process waits more than (n-1)q time units.

Performance

o q large ⇒ FIFO

o q small ⇒ q must be large with respect to context-switch time, otherwise overhead is too high

Example of RR with Time Quantum = 20

Process Burst Time

P1 53

P2 17

P3 68

P4 24

The Gantt chart is:

| P1 (0–20) | P2 (20–37) | P3 (37–57) | P4 (57–77) | P1 (77–97) | P3 (97–117) | P4 (117–121) | P1 (121–134) | P3 (134–154) | P3 (154–162) |

Typically, higher average turnaround than SJF, but better response
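The RR schedule for this example can be generated mechanically (a Python sketch; all four processes are assumed ready at time 0 in FIFO order):

```python
from collections import deque

def round_robin(bursts, quantum):
    """Round robin. bursts: dict name -> burst time (all ready at t=0).
    Returns the Gantt chart as a list of (name, start, end) slices."""
    queue = deque(bursts)
    remaining = dict(bursts)
    time, chart = 0, []
    while queue:
        name = queue.popleft()
        slice_len = min(quantum, remaining[name])
        chart.append((name, time, time + slice_len))
        time += slice_len
        remaining[name] -= slice_len
        if remaining[name] > 0:          # unfinished: rejoin the end of the queue
            queue.append(name)
    return chart

for entry in round_robin({"P1": 53, "P2": 17, "P3": 68, "P4": 24}, quantum=20):
    print(entry)
# P1 0-20, P2 20-37, P3 37-57, P4 57-77, P1 77-97,
# P3 97-117, P4 117-121, P1 121-134, P3 134-154, P3 154-162
```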

Multilevel Queue


Ready queue is partitioned into separate queues:

o foreground (interactive)

o background (batch)

Each queue has its own scheduling algorithm

o foreground – RR

o background – FCFS

Scheduling must be done between the queues

o Fixed priority scheduling; (i.e., serve all from foreground then from background). Possibility of starvation.

o Time slice – each queue gets a certain amount of CPU time which it can schedule amongst its processes; e.g., 80% to foreground in RR and 20% to background in FCFS

Multilevel Queue Scheduling

Multilevel Feedback Queue

A process can move between the various queues; aging can be implemented this way

A multilevel-feedback-queue scheduler is defined by the following parameters:

o number of queues

o scheduling algorithms for each queue

o method used to determine when to upgrade a process

o method used to determine when to demote a process

o method used to determine which queue a process will enter when that process needs service

Memory management: Swapping and Demand paging

Virtual memory – separation of user logical memory from physical memory.

o Only part of the program needs to be in memory for execution

o Logical address space can therefore be much larger than physical address space


o Allows address spaces to be shared by several processes

o Allows for more efficient process creation

Virtual memory can be implemented via:

o Demand paging

o Demand segmentation

Virtual Memory That is Larger Than Physical Memory

Page Table When Some Pages Are Not in Main Memory

Shared Library Using Virtual Memory


Demand Paging

Bring a page into memory only when it is needed:

o Less I/O needed

o Less memory needed

o Faster response

o More users

Page is needed ⇒ reference to it

o invalid reference ⇒ abort

o not-in-memory ⇒ bring to memory

Lazy swapper – never swaps a page into memory unless the page will be needed. A swapper deals with whole processes.

o A swapper that deals with pages is a pager

Transfer of a Paged Memory to Contiguous Disk Space


Valid-Invalid Bit

With each page table entry a valid–invalid bit is associated (v ⇒ in-memory, i ⇒ not-in-memory)

Initially valid–invalid bit is set to i on all entries

Example of a page table snapshot:

During address translation, if the valid–invalid bit in the page table entry is i ⇒ page fault

Page Table When Some Pages Are Not in Main Memory

Page Fault

The first reference to a page will trap to the operating system:

o page fault

Operating system looks at another table to decide:

o Invalid reference Þ abort

o Just not in memory

Get empty frame

Swap page into frame

Reset tables

Set validation bit = v

Restart the instruction that caused the page fault

As a worst-case example, consider a three-address instruction ADD A, B, C: fetch and decode the instruction, fetch A, fetch B, add A and B, and store the sum in C. If a page fault occurs when we try to store into C (because C is in a page not currently present in memory), we must get the desired page, bring it in, correct the page and process tables, and restart the instruction. Restarting requires fetching the instruction again, decoding it, fetching the two operands again, adding, and then storing.


Steps in Handling a Page Fault

Performance of Demand Paging

Page Fault Rate 0 ≤ p ≤ 1.0

o if p = 0, no page faults

o if p = 1, every reference is a fault

Effective Access Time (EAT)

EAT = (1 – p) × memory access
    + p × (page fault overhead
           + swap page out
           + swap page in
           + restart overhead)
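The EAT formula shows why even a tiny fault rate hurts; a quick Python check (the timing figures below are illustrative assumptions, not from the notes):

```python
def effective_access_time(p, memory_access, fault_service):
    """EAT = (1 - p) * memory access time + p * page-fault service time.
    fault_service lumps together fault overhead, swap out, swap in and restart."""
    return (1 - p) * memory_access + p * fault_service

# Assumed numbers: 200 ns memory access, 8 ms (8,000,000 ns) to service a fault.
print(effective_access_time(0.001, 200, 8_000_000))
# one fault per 1,000 accesses already slows memory down by a factor of ~40
```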

Page replacement – find some page in memory, but not really in use, and swap it out

o algorithm

o performance – want an algorithm which will result in minimum number of page faults

Same page may be brought into memory several times

Page Replacement

Prevent over-allocation of memory by modifying page-fault service routine to include page replacement

Use modify (dirty) bit to reduce overhead of page transfers – only modified pages are written to disk

Page replacement completes separation between logical memory and physical memory – large virtual memory can be provided on a smaller physical memory

Need For Page Replacement


Basic Page Replacement

Find the location of the desired page on disk Find a free frame:

- If there is a free frame, use it
- If there is no free frame, use a page replacement algorithm to select a victim frame

Bring the desired page into the (newly) free frame; update the page and frame tables

Restart the process

Page Replacement Algorithms

Want lowest page-fault rate


Evaluate algorithm by running it on a particular string of memory references (reference string) and computing the number of page faults on that string

In all our examples, the reference string is

1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5

First-In-First-Out (FIFO) Algorithm

FIFO Page Replacement

Optimal Algorithm


Optimal Page Replacement

Least Recently Used (LRU) Algorithm

LRU Page Replacement
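The three algorithms can be compared on the reference string above with a small simulator (a Python sketch assuming 3 frames; the generic `count_faults`/policy structure is our own illustration):

```python
def count_faults(refs, frames, victim):
    """Generic page-replacement simulator; `victim` chooses the page to evict."""
    memory, faults = [], 0
    for i, page in enumerate(refs):
        if page in memory:
            continue
        faults += 1
        if len(memory) == frames:
            memory.remove(victim(memory, refs, i))
        memory.append(page)              # resident pages kept in insertion order
    return faults

def fifo(memory, refs, i):
    return memory[0]                     # oldest resident page

def lru(memory, refs, i):                # least recently referenced before position i
    return min(memory, key=lambda p: max(j for j in range(i) if refs[j] == p))

def optimal(memory, refs, i):            # referenced farthest in the future (or never)
    def next_use(p):
        later = [j for j in range(i + 1, len(refs)) if refs[j] == p]
        return later[0] if later else float("inf")
    return max(memory, key=next_use)

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
for name, policy in (("FIFO", fifo), ("LRU", lru), ("Optimal", optimal)):
    print(name, count_faults(refs, 3, policy))
# With 3 frames: FIFO 9 faults, LRU 10, Optimal 7
```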


Stack implementation – keep a stack of page numbers in a doubly linked form:

o Page referenced: move it to the top (requires 6 pointers to be changed)

o No search for replacement

Use Of A Stack to Record The Most Recent Page References
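The stack idea can be sketched with Python's OrderedDict, whose backing structure is a doubly linked list, so "move to top" and "read the bottom" need no search (class name and reference string below are illustrative, not from the notes):

```python
from collections import OrderedDict

class LRUStack:
    """Stack of page numbers: most recently used on top, least recent at the
    bottom. OrderedDict gives O(1) move-to-top and O(1) access to the bottom."""
    def __init__(self):
        self.pages = OrderedDict()
    def reference(self, page):
        if page in self.pages:
            self.pages.move_to_end(page)   # move to the top on every reference
        else:
            self.pages[page] = True
    def victim(self):
        return next(iter(self.pages))      # bottom of the stack = LRU page

s = LRUStack()
for page in [4, 7, 0, 7, 1, 0, 1, 2, 1, 2, 7]:
    s.reference(page)
print(s.victim())    # 4: referenced longest ago
```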


Typical UNIX Directory Structure

Fig. 2.1: Part of a typical UNIX filesystem tree

The UNIX filesystem is laid out as a hierarchical tree structure which is anchored at a special top-level directory known as the root (designated by a slash '/'). Because of the tree structure, a directory can have many child directories, but only one parent directory. Fig. 2.1 illustrates this layout.

To specify a location in the directory hierarchy, we must specify a path through the tree. The path to a location can be defined by an absolute path from the root /, or as a relative path from the current working directory. To specify a path, each directory along the route from the source to the destination must be included in the path, with each directory in the sequence being separated by a slash. To help with the specification of relative paths, UNIX provides the shorthand "." for the current directory and ".." for the parent directory. For example, the absolute path to the directory "play" is /home/will/play, while the relative path to this directory from "zeb" is ../will/play.

Fig. 2.2 shows some typical directories you will find on UNIX systems and briefly describes their contents. Note that although these subdirectories appear as part of a seamless logical filesystem, they need not be present on the same hard disk device; some may even be located on a remote machine and accessed across a network.

Directory Typical Contents

/ The "root" directory

/bin Essential low-level system utilities

/usr/bin Higher-level system utilities and application programs

/sbin Superuser system utilities (for performing system administration tasks)

/lib Program libraries (collections of system calls that can be included in programs by a compiler) for low-level system utilities

/usr/lib Program libraries for higher-level user programs

/tmp Temporary file storage space (can be used by any user)

/home or /homes

User home directories containing personal file space for each user. Each directory is named after the login of the user.

/etc UNIX system configuration and information files

/dev Hardware devices

/proc A pseudo-filesystem which is used as an interface to the kernel.  Includes a sub-directory for each active program (or process).

Fig. 2.2: Typical UNIX directories

Unix file system: block, inodes

There are a large number of file systems derived from the original Unix file system. They all have a great many characteristics in common, so we can discuss them somewhat generically. This will be something of a lowest-common-denominator, simplified discussion; after we've given the basic idea we can go on to some of the details that make them practical.

Components

An inode-based file system divides a disk partition into five parts: the superblock, a set of "information nodes," or inodes, the data blocks, and two bitmaps used to record allocated and freed inodes and data blocks, respectively. Here's a picture showing the layout of the five regions on the disk.

We won't be describing these areas in the order shown above; we'll talk about the superblock, then the inodes, then the data blocks, and then come back to the bitmaps.

The Superblock

The first block of the filesystem is called the superblock. The superblock gives information regarding the tuneable parameters of the filesystem: the number of inodes, the number of data blocks, the size of the data blocks... It may also include information such as a volume name to identify the partition.

Inodes

Following the superblock is a set of data structures called inodes, with exactly one inode per disk file. So far as the filesystem data structures are concerned, the inodes represent the files.

A file's inode is quite small; typically 32 or 64 bytes. It includes all the information regarding the file it describes. Taking a look at the documentation for the Linux ext2 file system, this includes:

... pointers to the filesystem blocks which contain the data held in the object and all of the metadata about an object except its name. The metadata about an object includes the permissions, owner, group, flags, size, number of blocks used, access time, change time, modification time, deletion time, number of links, fragments, version (for NFS) and extended attributes (EAs) and/or Access Control Lists (ACLs).


So, it contains a lot of permission, ownership, and file type information, as well as pointers to all the data blocks containing the data making up the file (for a regular file or a directory). It looks something like this:

Everything towards the top of the inode is "metadata": the UID (User ID) and Group ID (GID) of the file owner, the modification time of the file, the file type (regular file, directory, character special, block special....), the length of the file (not relevant if it's a special file), and a bunch more (the etc entries). Exactly what all is in the inode varies with which flavor of Unix-like file system you're looking at.

The pointers, indirect pointer, and double indirect pointer will be described later, when we talk about regular files.

Data Blocks

The data blocks are just what the name implies: they contain the actual data for the filesystem object.

Bitmaps

The bitmaps keep track of allocated and free inodes (the inode bitmap) and data blocks (the data block bitmap). Every inode (and every data block) has a corresponding bit in the bitmaps. The state of that bit describes the state of its object.

Regular Files

A regular file is represented by an inode and a bunch of data blocks. As stated above, the inode contains almost all the metadata for the file, and pointers to the file data. These pointers are simply data block numbers, and tell where the data for the file is. So the first data pointer points to the first block with data for the file, the second points to the second block, and so forth.

In designing the inode, this gives us a tradeoff we have to make: to represent a large file, we need to have a data pointer for each block. But the vast majority of files are small, so this results in a lot of wasted space. In the case of Ext2fs, there are twelve data block pointers in the inode; with a 4K block size, this limits files to only 48K. That's unacceptable.

The solution is to create indirect, double indirect, and triple indirect pointers. When we run out of pointers, we point to a data block that is, itself, full of pointers to data blocks. With 32 bit pointers this gives 1024 pointers in the data block, so an indirect block increases the possible file size to 4 MB.

When 4MB is too small for the file, we go on to a double-indirect pointer. This points to a data block full of pointers to indirect blocks; this gives us 1048576 (1024 * 1024) blocks, for a maximum file size of 4 gigabytes.

There are even a few applications that need files bigger than this! For instance, movies will be stored in files bigger than this. The solution is to go another step, to a triple-indirect pointer: it points to a data block full of double-indirect pointers. At this point a file can be 4 terabytes. Note that some Unix-like file systems don't go as far as triple-indirect pointers (Ext2FS does).
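The sizes quoted in this section follow from simple arithmetic on the pointer layout; a quick check (assuming the ext2-style figures used above: 12 direct pointers, 4K blocks, 32-bit block pointers):

```python
def max_file_size(block_size=4096, pointer_size=4, direct_pointers=12):
    """Largest file addressable with direct, single-, double- and
    triple-indirect pointers."""
    per_block = block_size // pointer_size        # 1024 pointers per indirect block
    blocks = (direct_pointers                     # 12 blocks = 48K directly
              + per_block                         # + 4 MB via the indirect block
              + per_block ** 2                    # + 4 GB via the double indirect
              + per_block ** 3)                   # + 4 TB via the triple indirect
    return blocks * block_size

print(max_file_size() / 2 ** 40)   # a little over 4 (terabytes), matching the text
```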

Here's a picture showing the structure of a file in a Unix-like file system:

As you can see, the inode serves as the root of a tree. There are several blocks pointed to directly by the inode; there is an indirect block, and there is a double indirect block. The figure doesn't show a triple indirect block because it was getting too messy already!

User to User communication

Mail

The mail command enables the user to send and receive electronic mail messages to and from users on both the Unix system and remote users.

This is the basic mail command. Enhanced versions, such as programs that run under a windowing system (e.g. mailtool) or screen-based versions of mail (e.g. elm), may be available, and you will probably find them preferable to mail. If so, much of the following can safely be ignored. Remember however that some version of mail will definitely be available on any unix system that you use.

Sending mail

To send a message to a user on your system, type:

% mail username

The cursor will move to the next line, and you will get a Subject: prompt. You can now type in the subject of your message, and then press <RETURN>. The cursor will go to the start of the next line and there will be no prompt. You now type in the text of your message. Terminate each line with <RETURN>. When you have finished the text of the message, type an end-of-file character (usually ^D), or a full-stop character. You should now return to your normal shell prompt. If the message is dispatched successfully, you will hear no more about it. The following is an example of the mail command in action:

% mail lnp6ttld
Subject: UNIX course
I don't think I'll ever be able to get the students
in the UNIX course to understand how to use e-mail.
^D
%

Entering the text of the message by this method is a rather crude process. Errors on the line being typed can be erased with your delete key, but once you have pressed <RETURN>, a line cannot be edited. A message may be aborted by pressing ^C twice.

Subcommands while entering mail

There are several commands you can type while entering mail:

<CTRL/Z> cancels the message and leaves the text in a file named dead.letter.

~e invokes a text editor to edit your message.

~v invokes a screen editor to edit your message.

~f reads the contents of the message you have just read into your message text.

~r file reads the contents of file into your message text.

While this method is quick and easy to use, and quite adequate for short and simple messages, many users prefer to first create a file containing the text of the message, and then mail this file to the intended recipient. This enables you to use any system editor and formatter to create the message, and you do not need to send it immediately.

The following sequence shows how to send a file note containing the text of a message to another user.

% mail lnp6ttld < note

To understand fully how this works see the section on 'Re-direction of standard output' in Chapter 8 below.

In this example the message will not contain a subject heading, unless one has already been included as the first line of the file note. The mail command also has a -s option that can be used to include a subject header, as follows:

% mail -s UNIX lnp6ttld < note

The string following the -s is the subject; in this case, the subject is "UNIX".

Receiving mail

If new mail is waiting for you when you login, you will see the message:

You have new mail

To start the mail program type the command:

% mail


Each message is summarised on a numbered list. The current message is marked with a ">" character. The mail prompt character is "&". Type the number of the message you want to read, or just press <RETURN> to read through the list. The list of mail headers will look something like this:

% mail
Mail version SMI 4.0 Thu Oct 11 12:59:09 PDT 1990  Type ? for help.
"/usr/spool/mail/lnp5jb": 2 messages 2 new
>N  1 lnp5mw   Thu Jan  9 15:10  11/262  hello
 N  2 lnp5js   Thu Jan  9 15:11  10/287  party
&

This tells Jenny Brown that she has two messages, one from user lnp5mw, and one from lnp5js. The date and time at which the messages were received is also listed, and so is the subject header (the last item on each line - here 'hello' and 'party'). The following commands can be entered to the mail prompt:

d Mark the current message for deletion

d n Mark message number n for deletion

u n Undelete message number n

w file Save the current message in file with the mail header and mark it for deletion

s file Save the current message in file without the mail header and mark it for deletion

r Reply to the current message

q Quit mail, removing deleted messages from your system mailbox. Undeleted messages that have been read are normally stored in your personal mailbox (see below)

x Exit mail, leaving your mailbox untouched, i.e. messages deleted in this session are restored

h Show list of message headers

? List the useful mail commands

! command Execute specified shell command

- Re-read previous message.

m recipient Send mail to named recipient

Files used by mail

~/mbox Your personal mailbox, located in your home directory. This is where messages that you have saved are stored, unless you specified another location when you saved them. You can access this file by issuing the command:

% mail -f mbox

~/.mailrc A file that can hold commands for mail to obey when it starts up.

Sending mail to remote users

The following also applies to the elm mail program.


Sending mail to users on other computer systems is simple using mail. Just type the full address of the remote user where a local username was used above. For example:

% mail [email protected]
% mail -s Hello [email protected] < note

These correspond to the two ways of sending mail shown earlier.

It is also possible to use mail to look at folders of mail that you have already received. To do this type:

% mail -f folder_name

and it will treat the messages in the folder as incoming mail.

Sending on-line messages

As you have seen, messages sent using mail are received in a special buffer, and it is up to the recipient when to look at them and what to do with them. It is also possible to send a message that will simply appear on the screen of the recipient, if they are logged on. This is less useful than mail for the following reasons:

mail can be used irrespective of whether the recipient is logged on or not.

mail messages can be stored by the recipient. This means that files can be transferred by mail, and a record of transactions can be kept.

On-line messages can be confused with whatever the recipient has on screen and can easily disrupt what they are doing. They can be very annoying!

On the other hand, on-line messages do have the advantage of obtaining the immediate attention of another user, and it is possible to have an interactive conversation. Bearing these facts in mind, use the following command with caution!

write

The write command is used to send on-line messages to another user on the same machine.

The format of the write command is as follows:

% write username
text of message
^D

After typing the command, you enter your message, starting on the next line, terminating with the end-of-file character. The recipient will then hear a bleep, then receive your message on screen, with a short header attached. The following is a typical exchange. User lnp5jb types:

% write lnp8zz
Hi there - want to go to lunch?
^D
%

User lnp8zz will hear a beep and the following will appear on his/her screen:

Message from lnp5jb on sun050 at 12:42
Hi there - want to go to lunch?
EOF


If lnp8zz wasn't logged on, the sender would see the following:

% write lnp8zz
lnp8zz not logged in.

SunOS has the talk command. This has several advantages over write. Firstly, talk can call other machines on a network. Secondly, talk provides a clearer interface for the exchange of messages, dividing the screen into two windows for the interlocutors. Type

talk username@machine

You can stop messages being flashed up on your screen if you wish. To turn off direct communications type:

% mesg n

It will remain off for the remainder of your session, unless you type:

% mesg y

to turn the facility back on. Typing just mesg lets you know whether it is on or off.

Remote logins

It is possible to log on to another machine on a Unix network, provided that you have permission to do so. To do this use the rlogin command. Type:

rlogin machine

and you will be asked for your password. It may be necessary for you to do this to make on-line communications with another user easier.


Unit 5

User names and groups

Adding Users

useradd (in /usr/sbin):

useradd is a utility for adding new users to a UNIX system. It adds new user information to the /etc/passwd file and creates a new home directory for the user. When you add a new user, you should also set their password (using the -p option on useradd, or using the passwd utility):

    # useradd bob
    # passwd bob

Controlling User Groups

groupadd (in /usr/sbin):

groupadd creates a new user group and adds the new information to /etc/group:

  # groupadd groupname  

usermod (in /usr/sbin):

Every user belongs to a primary group and possibly also to a set of supplementary groups. To modify the group permissions of an existing user, use

# usermod -g initialgroup -G othergroups username

where othergroups is a list of supplementary group names separated by commas (with no intervening whitespace).  

groups

You can find out which groups a user belongs to by typing:

  # groups username


Logging in, changing password and format of UNIX commands

Logging into (and out of) UNIX Systems

Text-based (TTY) terminals:

When you connect to a UNIX computer remotely (using telnet) or when you log in locally using a text-only terminal, you will see the prompt:

    login:

At this prompt, type in your username and press the enter/return key. Remember that UNIX is case sensitive (i.e. Will, WILL and will are all different logins). You should then be prompted for your password:

    login: will
    password:

Type your password in at the prompt and press the enter/return key. Note that your password will not be displayed on the screen as you type it in.

If you mistype your username or password you will get an appropriate message from the computer and you will be presented with the login: prompt again. Otherwise you should be presented with a shell prompt which looks something like this:

    $

To log out of a text-based UNIX shell, type "exit" at the shell prompt (or if that doesn't work try "logout"; if that doesn't work press ctrl-d).

Graphical terminals:

If you're logging into a UNIX computer locally, or if you are using a remote login facility that supports graphics, you might instead be presented with a graphical prompt with login and password fields. Enter your user name and password in the same way as above (N.B. you may need to press the TAB key to move between fields).

Once you are logged in, you should be presented with a graphical window manager that looks similar to the Microsoft Windows interface. To bring up a window containing a shell prompt look for menus or icons which mention the words "shell", "xterm", "console" or "terminal emulator".

To log out of a graphical window manager, look for menu options similar to "Log out" or "Exit".

Changing your password

One of the things you should do when you log in for the first time is to change your password.

The UNIX command to change your password is passwd:


      $ passwd

The system will prompt you for your old password, then for your new password. To guard against typing errors, it will ask you to confirm your new password.

Remember the following points when choosing your password:

o Avoid characters which might not appear on all keyboards, e.g. '£'.

o The weakest link in most computer security is user passwords, so keep your password a secret, don't write it down and don't tell it to anyone else. Also avoid dictionary words or words related to your personal details (e.g. your boyfriend or girlfriend's name or your login).

o Make it at least 7 or 8 characters long and try to use a mix of letters, numbers and punctuation.

General format of UNIX commands

A UNIX command line consists of the name of a UNIX command (actually the "command" is the name of a built-in shell command, a system utility or an application program) followed by its "arguments" (options and the target filenames and/or expressions). The general syntax for a UNIX command is

    $ command -options targets

Here command can be thought of as a verb, options as an adverb and targets as the direct objects of the verb. If the user wishes to specify several options, these need not always be listed separately (the options can often be combined after a single dash).
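As a quick sketch of the two equivalent forms (the directory and file names here are invented for illustration):

```shell
# Create a small directory to list:
mkdir -p opts_demo
touch opts_demo/a opts_demo/b

# Options listed separately...
ls -l -t opts_demo

# ...or combined after a single dash - both produce the same listing:
ls -lt opts_demo
```

Most (though not all) commands accept their single-letter options combined in this way.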

Characters with special meaning

Quotes

Quoting a single character with the backslash

You can prevent the shell from interpreting a character by placing a backslash ("\") in front of it. Here is a shell script that can delete any files whose names contain an asterisk:

echo This script removes all files that
echo contain an asterisk in the name.
echo
echo Are you sure you want to remove these files\?
rm -i *\**

The backslash was also necessary before the question mark, which is also a shell meta-character. Without it, the shell would look for all files that match the pattern "files?". If you had the files "files1" and "files2" the script would print out

Are you sure you want to remove these files1 files2

which is not what you want.


The backslash is the "strongest" method of quotation. It works when every other method fails. If you want to place text on two or more lines for readability, but the program expects one line, you need a line continuation character. Just use the backslash as the last character on the line:

% echo This could be \
a very \
long line\!
This could be a very long line!
%

This escapes or quotes the end of line character, so it no longer has a special meaning. In the above example, I also put a backslash before the exclamation point. This is necessary if you are using the C shell, which treats the "!" as a special character. If you are using some other shell, it might not be necessary.
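A minimal sketch of line continuation in a Bourne-style shell (the words are arbitrary):

```shell
# The backslash at the end of each line quotes the newline,
# so the shell sees a single logical command:
echo This is \
really \
one line
# → This is really one line
```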

Strong Quoting with the Single Quotes

When you need to quote several characters at once, you could use several backslashes:

% echo a\ \ \ \ \ \ \ b

(There are 7 spaces between 'a' and 'b'.) This is ugly but works. It is easier to use pairs of quotation marks to indicate the start and end of the characters to be quoted:

% echo 'a       b'

Inside the single quotes, you can include almost all meta-characters:

% echo 'What the *heck* is a $ doing here???'
What the *heck* is a $ doing here???

The above example uses the asterisk, dollar sign, and question mark meta-characters. The single quotes should be used when you want the text left alone. If you are using the C shell, the "!" character may need a backslash before it. It depends on the characters next to it. If it is surrounded by spaces, you don't need to use a backslash.

Weak Quotes with the Double Quotes

Sometimes you want a weaker type of quoting: one that doesn't expand meta-characters like "*" or "?," but does expand variables and does command substitution. This can be done with the double quote characters:

% echo "Is your home directory $HOME?"
Is your home directory /home/kreskin/u0/barnett?
% echo "Your current directory is `pwd`"
Your current directory is /home/kreskin/u0/barnett

Once you learn the difference between single quotes and double quotes, you will have mastered a very useful skill. It's not hard. The single quotes are stronger than the double quotes. Got it? Okay. And the backslash is the strongest of all.
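The difference can be seen side by side in a short sketch (the variable name is invented):

```shell
greeting=world

echo 'hello $greeting'   # single quotes: the $ is left alone
echo "hello $greeting"   # double quotes: the variable is expanded
```

The first command prints the literal text hello $greeting, the second prints hello world.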


Directory related commands

Directory and File Handling Commands

This section describes some of the more important directory and file handling commands.

pwd (print [current] working directory)

pwd displays the full absolute path to your current location in the filesystem. So

    $ pwd
    /usr/bin

implies that /usr/bin is the current working directory.  

ls (list directory)

ls lists the contents of a directory. If no target directory is given, then the contents of the current working directory are displayed. So, if the current working directory is /,

    $ ls
    bin   dev  home  mnt   share  usr  var
    boot  etc  lib   proc  sbin   tmp  vol

Actually, ls doesn't show you all the entries in a directory - files and directories that begin with a dot (.) are hidden (this includes the directories '.' and '..' which are always present). The reason for this is that files that begin with a . usually contain important configuration information and should not be changed under normal circumstances. If you want to see all files, ls supports the -a option:

    $ ls -a

Even this listing is not that helpful - there are no hints to properties such as the size, type and ownership of files, just their names. To see more detailed information, use the -l option (long listing), which can be combined with the -a option as follows:

    $ ls -a -l

or, equivalently,

    $ ls -al

Each line of the output looks like this:

    type permissions links owner group size date name


where:

o type is a single character which is either 'd' (directory), '-' (ordinary file), 'l' (symbolic link), 'b' (block-oriented device) or 'c' (character-oriented device).

o permissions is a set of characters describing access rights. There are 9 permission characters, describing 3 access types given to 3 user categories. The three access types are read ('r'), write ('w') and execute ('x'), and the three user categories are the user who owns the file, users in the group that the file belongs to, and other users (the general public). An 'r', 'w' or 'x' character means the corresponding permission is present; a '-' means it is absent.

o links refers to the number of filesystem links pointing to the file/directory (see the discussion on hard/soft links in the next section).

o owner is usually the user who created the file or directory.

o group denotes a collection of users who are allowed to access the file according to the group access rights specified in the permissions field.

o size is the length of a file, or the number of bytes used by the operating system to store the list of files in a directory.

o date is the date when the file or directory was last modified (written to). The -u option displays the time when the file was last accessed (read).

o name is the name of the file or directory.

ls supports more options. To find out what they are, type:

    $ man ls

man is the online UNIX user manual, and you can use it to get help with commands and find out about what options are supported. It has quite a terse style which is often not that helpful, so some users prefer to use the (non-standard) info utility if it is installed:

    $ info ls  
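The hidden-file behaviour described above can be seen in a short sketch (the file names are invented):

```shell
# Work in a fresh directory with one ordinary and one hidden file:
mkdir -p lsdemo && cd lsdemo
touch visible .hidden

ls        # .hidden does not appear in the listing
ls -a     # now ., .., .hidden and visible are all shown
```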

cd (change [current working] directory)

    $ cd path

changes your current working directory to path (which can be an absolute or a relative path). One of the most common relative paths to use is '..' (i.e. the parent directory of the current directory).


Used without any target directory

    $ cd

resets your current working directory to your home directory (useful if you get lost). If you change into a directory and you subsequently want to return to your original directory, use

    $ cd -  

mkdir (make directory)

    $ mkdir directory

creates a subdirectory called directory in the current working directory. You can only create subdirectories in a directory if you have write permission on that directory.

rmdir (remove directory)

    $ rmdir directory

removes the subdirectory directory from the current working directory. You can only remove subdirectories if they are completely empty (i.e. of all entries besides the '.' and '..' directories).  
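A minimal round trip with mkdir and rmdir, including the empty-directory restriction (directory and file names invented):

```shell
mkdir demo              # create an empty subdirectory
rmdir demo              # succeeds: demo is empty

mkdir demo
touch demo/file
rmdir demo 2>/dev/null || echo "demo is not empty"   # rmdir refuses
rm -r demo              # rm -r removes a directory and its contents
```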

cp (copy)

cp is used to make copies of files or entire directories. To copy files, use:

    $ cp source-file(s) destination

where source-file(s) and destination specify the source and destination of the copy respectively. The behaviour of cp depends on whether the destination is a file or a directory. If the destination is a file, only one source file is allowed and cp makes a new file called destination that has the same contents as the source file. If the destination is a directory, many source files can be specified, each of which will be copied into the destination directory. Section 2.6 will discuss efficient specification of source files using wildcard characters.
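A sketch of both cases (the file and directory names are invented):

```shell
mkdir -p cpdemo && cd cpdemo
echo hello > a.txt

cp a.txt b.txt          # destination is a file: make a copy called b.txt
mkdir backup
cp a.txt b.txt backup   # destination is a directory: copy both files into it

ls backup               # backup now contains a.txt and b.txt
```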

To copy entire directories (including their contents), use a recursive copy:

     $ cp -rd source-directories destination-directory  

mv (move/rename)

mv is used to rename files/directories and/or move them from one directory into another. Exactly one source and one destination must be specified:


    $ mv source destination

If destination is an existing directory, the new name for source (whether it be a file or a directory) will  be destination/source. If source and destination are both files, source is renamed destination. N.B.: if destination is an existing file it will be destroyed and overwritten by source (you can use the -i option if you would like to be asked for confirmation before a file is overwritten in this way).  
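Both behaviours in one short sketch (file and directory names invented):

```shell
mkdir -p mvdemo && cd mvdemo
echo data > old.txt

mv old.txt new.txt    # source and destination are files: a simple rename
mkdir dir
mv new.txt dir        # destination is a directory: file becomes dir/new.txt
ls dir                # → new.txt
```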

rm (remove/delete)

    $ rm target-file(s)

removes the specified files. Unlike other operating systems, it is almost impossible to recover a deleted file unless you have a backup (there is no recycle bin!) so use this command with care. If you would like to be asked before files are deleted, use the -i option:

    $ rm -i myfile
    rm: remove 'myfile'?

rm can also be used to delete directories (along with all of their contents, including any subdirectories they contain). To do this, use the -r option. To stop rm from asking any questions or giving errors (e.g. if the file doesn't exist), use the -f (force) option. Extreme care needs to be taken when using this option - consider what would happen if a system administrator was trying to delete user will's home directory and accidentally typed:

    $ rm -rf / home/will

(instead of rm -rf /home/will).  

Looking at the file contents: cat (catenate/type)

    $ cat target-file(s)

displays the contents of target-file(s)  on the screen, one after the other. You can also use it to create files from keyboard input as follows (> is the output redirection operator, which will be discussed in the next chapter):

    $ cat > hello.txt
    hello world!
    [ctrl-d]
    $ ls hello.txt
    hello.txt
    $ cat hello.txt
    hello world!
    $

more and less (catenate with pause)


    $ more target-file(s)

displays the contents of target-file(s)  on the screen, pausing at the end of each screenful and asking the user to press a key (useful for long files). It also incorporates a searching facility (press '/' and then type a phrase that you want to look for).

You can also use more to break up the output of commands that produce more than one screenful of output as follows (| is the pipe operator, which will be discussed in the next chapter):

    $ ls -l | more

less is just like more, except that it has a few extra features (such as allowing users to scroll backwards and forwards through the displayed file). less is not a standard utility, however, and may not be present on all UNIX systems.

Absolute and relative path names

See inside the topic “Typical UNIX Directory Structure”

File permissions and changing permission modes

File and Directory Permissions

Permission  File                               Directory

Read        User can look at the contents      User can list the files in
            of the file                        the directory

Write       User can modify the contents       User can create new files and
            of the file                        remove existing files in the
                                               directory

Execute     User can use the filename as       User can change into the directory,
            a UNIX command                     but cannot list the files unless
                                               (s)he has read permission. User can
                                               read files if (s)he has read
                                               permission on them.

Fig 3.1: Interpretation of permissions for files and directories

As we have seen in the previous chapter, every file or directory on a UNIX system has three types of permissions, describing what operations can be performed on it by various categories of users. The permissions are read (r), write (w) and execute (x), and the three categories of users are user/owner (u), group (g) and others (o). Because files and directories are different entities, the interpretation of the permissions assigned to each differs slightly, as shown in Fig 3.1.

File and directory permissions can only be modified by their owners, or by the superuser (root), by using the chmod system utility.

chmod (change [file or directory] mode)

    $ chmod options files

chmod accepts options in two forms. Firstly, permissions may be specified as a sequence of 3 octal digits (octal is like decimal except that the digit range is 0 to 7 instead of 0 to 9). Each octal digit represents the access permissions for the user/owner, group and others respectively.


The mappings of permissions onto their corresponding octal digits is as follows:

    ---  0
    --x  1
    -w-  2
    -wx  3
    r--  4
    r-x  5
    rw-  6
    rwx  7

For example the command:

    $ chmod 600 private.txt

sets the permissions on private.txt to rw------- (i.e. only the owner can read and write to the file).

Secondly, permissions may be specified symbolically, using the symbols u (user), g (group), o (other), a (all), r (read), w (write), x (execute), + (add permission), - (take away permission) and = (assign permission). For example, the command:

    $ chmod ug=rw,o-rw,a-x *.txt

sets the permissions on all files ending in *.txt to rw-rw---- (i.e. the owner and users in the file's group can read and write to the file, while the general public do not have any sort of access).
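Both forms can be checked with ls -l; a minimal sketch (the filename is invented, and the symbolic clause u=rw,g=r,o= is just one possible combination):

```shell
touch private.txt

chmod 600 private.txt              # octal form
ls -l private.txt | cut -c1-10     # → -rw-------

chmod u=rw,g=r,o= private.txt      # symbolic form: rw for owner, r for group
ls -l private.txt | cut -c1-10     # → -rw-r-----
```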

chmod also supports a -R option which can be used to recursively modify file permissions, e.g.

    $ chmod -R go+r play

will grant group and other read rights to the directory play and all of the files and directories within play.  

chgrp (change group)

    $ chgrp group files

can be used to change the group that a file or directory belongs to. It also supports a -R option.


Standard input, output, and error files

The output from programs is usually written to the screen, while their input usually comes from the keyboard (if no file arguments are given). In technical terms, we say that processes usually write to standard output (the screen) and take their input from standard input (the keyboard). There is in fact another output channel called standard error, where processes write their error messages; by default error messages are also sent to the screen.

To redirect standard output to a file instead of the screen, we use the > operator:

    $ echo hello     hello     $ echo hello > output     $ cat output     hello

In this case, the contents of the file output will be destroyed if the file already exists. If instead we want to append the output of the echo command to the file, we can use the >> operator:

    $ echo bye >> output     $ cat output     hello     bye

To capture standard error, prefix the > operator with a 2 (in UNIX the file numbers 0, 1 and 2 are assigned to standard input, standard output and standard error respectively), e.g.:

    $ cat nonexistent 2>errors     $ cat errors     cat: nonexistent: No such file or directory     $

You can redirect standard error and standard output to two different files:

    $ find . -print 1>files 2>errors

or to the same file:

    $ find . -print 1>output 2>output
or
    $ find . -print >& output

Standard input can also be redirected using the < operator, so that input is read from a file instead of the keyboard:

    $ cat < output     hello     bye

You can combine input redirection with output redirection, but be careful not to use the same filename in both places. For example:


    $ cat < output > output

will destroy the contents of the file output. This is because the first thing the shell does when it sees the > operator is to create an empty file ready for the output.

One last point to note is that we can pass standard input to system utilities that require a filename by giving "-" as the filename:

    $ cat package.tar.gz | gzip -d | tar tvf -

Here the output of the gzip -d command is used as the input file to the tar command.

Filters and Pipelines

Filters (Finding Text in Files)

grep (global regular expression print)

    $ grep options pattern files

grep searches the named files (or standard input if no files are named) for lines that match a given pattern. The default behaviour of grep is to print out the matching lines. For example:

    $ grep hello *.txt

searches all text files in the current directory for lines containing "hello". Some of the more useful options that grep provides are: -c (print a count of the number of lines that match), -i (ignore case), -v (print out the lines that don't match the pattern) and -n (print the line number before printing the matching line). So

    $ grep -vi hello *.txt

searches all text files in the current directory for lines that do not contain any form of the word hello (e.g. Hello, HELLO, or hELlO).
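The options above can be tried on a small sample file (filename and contents invented):

```shell
printf 'hello world\nHELLO again\ngoodbye\n' > greet.txt

grep -c hello greet.txt     # → 1   (only the lowercase match counts)
grep -ci hello greet.txt    # → 2   (case-insensitive count)
grep -vi hello greet.txt    # → goodbye
grep -n hello greet.txt     # → 1:hello world
```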

If you want to search all files in an entire directory tree for a particular pattern, you can combine grep with find using backward single quotes to pass the output from find into grep. So

    $ grep hello `find . -name "*.txt" -print`

will search all text files in the directory tree rooted at the current directory for lines containing the word "hello".

The patterns that grep uses are actually a special type of pattern known as regular expressions. Just like arithmetic expressions, regular expressions are made up of basic subexpressions combined by operators.

The most fundamental expression is a regular expression that matches a single character. Most characters, including all letters and digits, are regular expressions that match themselves. Any other character with special meaning may be quoted by preceding it with a backslash (\). A list of characters enclosed by '[' and ']' matches any single character in that list; if the first character of the list is the caret `^', then it matches any character not in the list. A range of characters can be specified using a dash (-) between the first and last items in the list. So [0-9] matches any digit and [^a-z] matches any character that is not a lowercase letter.
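A small sketch of bracket expressions (the sample lines are invented; the patterns are single-quoted so the shell leaves them alone):

```shell
printf 'a1\nbb\nC9\n' > chars.txt

grep '[0-9]' chars.txt      # lines containing a digit: a1 and C9
grep '[^a-z]' chars.txt     # lines containing a character that is NOT a
                            # lowercase letter: again a1 and C9 ('1' and 'C')
```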

The caret `^' and the dollar sign `$' are special characters that match the beginning and end of a line respectively. The dot '.' matches any character. So

    $ grep '^..[l-z]$' hello.txt

matches any line in hello.txt that consists of exactly three characters and ends with a lowercase letter from l to z.

egrep (extended grep) is a variant of grep that supports more sophisticated regular expressions. Here two regular expressions may be joined by the operator `|'; the resulting regular expression matches any string matching either subexpression. Brackets '(' and ')' may be used for grouping regular expressions. In addition, a regular expression may be followed by one of several repetition operators:

`?'     means the preceding item is optional (matched at most once).
`*'     means the preceding item will be matched zero or more times.
`+'     means the preceding item will be matched one or more times.
`{N}'   means the preceding item is matched exactly N times.
`{N,}'  means the preceding item is matched N or more times.
`{N,M}' means the preceding item is matched at least N times, but not more than M times.

For example, if egrep was given the regular expression

    '(^[0-9]{1,5}[a-zA-Z ]+$)|none'

it would match any line that either:

o begins with a number up to five digits long, followed by a sequence of one or more letters or spaces, or

o contains the word none

You can read more about regular expressions on the grep and egrep manual pages.
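The example pattern above can be tried directly (the sample lines are invented; on systems where egrep is unavailable or deprecated, grep -E is equivalent):

```shell
printf '123 abc\nnone shall pass\n99999 x\nabc 123\n' > lines.txt

egrep '(^[0-9]{1,5}[a-zA-Z ]+$)|none' lines.txt
# matches "123 abc", "none shall pass" and "99999 x",
# but not "abc 123" (it does not begin with a number)
```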

Note that UNIX systems also usually support another grep variant called fgrep (fixed grep) which simply looks for a fixed string inside a file (but this facility is largely redundant).

Processes


A process is a program in execution. Every time you invoke a system utility or an application program from a shell, one or more "child" processes are created by the shell in response to your command. All UNIX processes are identified by a unique process identifier or PID. An important process that is always present is the init process. This is the first process to be created when a UNIX system starts up and usually has a PID of 1. All other processes are said to be "descendants" of init.

Pipes

The pipe ('|') operator is used to create concurrently executing processes that pass data directly to one another. It is useful for combining system utilities to perform more complex functions. For example:

    $ cat hello.txt | sort | uniq

creates three processes (corresponding to cat, sort and uniq) which execute concurrently. As they execute, the output of the cat process is passed on to the sort process, which in turn passes its output on to the uniq process. uniq displays its output on the screen (the sorted contents of hello.txt with duplicate lines removed). Similarly:

    $ cat hello.txt | grep "dog" | grep -v "cat"

finds all lines in hello.txt that contain the string "dog" but do not contain the string "cat".
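The second pipeline can be tried with a small sample file (the contents are invented):

```shell
printf 'the dog barks\nthe cat and dog\nbirds sing\n' > hello.txt

cat hello.txt | grep dog | grep -v cat    # → the dog barks
```

The second line contains both "dog" and "cat", so the final grep -v filters it out.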

Processes, finding out about processes, and stopping background processes

Controlling processes associated with the current shell

Most shells provide sophisticated job control facilities that let you control many running jobs (i.e. processes) at the same time. This is useful if, for example, you are editing a text file and want to interrupt your editing to do something else. With job control, you can suspend the editor, go back to the shell prompt, and start work on something else. When you are finished, you can switch back to the editor and continue as if you hadn't left.

Jobs can either be in the foreground or the background. There can be only one job in the foreground at any time. The foreground job has control of the shell with which you interact - it receives input from the keyboard and sends output to the screen. Jobs in the background do not receive input from the terminal, generally running along quietly without the need for interaction (and drawing it to your attention if they do).

The foreground job may be suspended, i.e. temporarily stopped, by pressing the Ctrl-Z key. A suspended job can be made to continue running in the foreground or background as needed by typing "fg" or "bg" respectively. Note that suspending a job is very different from interrupting a job (by pressing the interrupt key, usually Ctrl-C); interrupted jobs are killed off permanently and cannot be resumed.

Background jobs can also be run directly from the command line, by appending a '&' character to the command line. For example:

    $ find / -print 1>output 2>errors &
    [1] 27501
    $

Page 41: SPSA(Unix Notes)

Here the [1] returned by the shell represents the job number of the background process, and the 27501 is the PID of the process. To see a list of all the jobs associated with the current shell, type jobs:

    $ jobs
    [1]+  Running  find / -print 1>output 2>errors &
    $

Note that if you have more than one job you can refer to the job as %n where n is the job number. So for example fg %3 resumes job number 3 in the foreground.

To find out the process IDs of the underlying processes associated with the shell and its jobs, use ps (process status):

    $ ps
      PID TTY          TIME CMD
    17717 pts/10   00:00:00 bash
    27501 pts/10   00:00:01 find
    27502 pts/10   00:00:00 ps

So here the PID of the shell (bash) is 17717, the PID of find is 27501 and the PID of ps is 27502.

To terminate a process or job abruptly, use the kill command. kill allows jobs to be referred to in two ways - by their PID or by their job number. So either of

    $ kill %1
    $ kill 27501

would terminate the find process. Actually kill only sends the process a signal requesting that it shut down and exit gracefully (the SIGTERM signal), so this may not always work. To force a process to terminate abruptly (and with a higher probability of success), use the -9 option (the SIGKILL signal):

     $ kill -9 27501

kill can be used to send many other types of signals to running processes. For example a -19 option (SIGSTOP) will suspend a running process. To see a list of such signals, run kill -l.
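Putting this together, a short sketch of signalling a background process from a script (assuming a Bourne-compatible shell such as sh or bash):

```shell
# Launch a long-running background job; $! holds its PID.
sleep 60 &
pid=$!

kill "$pid"        # send SIGTERM (the default signal)
wait "$pid"        # reap the process; for a job killed by SIGTERM
                   # the exit status is 128 + 15 = 143
echo "exit status: $?"
# prints: exit status: 143
```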

Controlling other processesYou can also use ps to show all processes running on the machine (not just the processes in your current shell):

    $ ps -fae (or ps -aux on BSD machines)

 ps -aeH displays a full process hierarchy (including the init process).

Many UNIX versions have a system utility called top that provides an interactive way to monitor system activity. Detailed statistics about currently running processes are displayed and constantly refreshed. Processes are displayed in order of CPU utilization. Useful keys in top are:


    s - set update frequency
    k - kill process (by PID)
    u - display processes of one user
    q - quit

On some systems, the utility w is a non-interactive substitute for top.

One other useful process control utility that can be found on most UNIX systems is the pkill command. You can use pkill to kill processes by name instead of PID or job number. So another way to kill off our background find process (along with any other find processes we are running) would be:

    $ pkill find
    [1]+  Terminated  find / -print 1>output 2>errors
    $

Note that, for obvious security reasons, you can only kill processes that belong to you (unless you are the superuser).

Vi editor

Introduction to vi

vi (pronounced "vee-eye", short for visual, or perhaps vile) is a display-oriented text editor based on an underlying line editor called ex. Although beginners usually find vi somewhat awkward to use, it is useful to learn because it is universally available (being supplied with all UNIX systems). It also uses standard alphanumeric keys for commands, so it can be used on almost any terminal or workstation without having to worry about unusual keyboard mappings. System administrators like users to use vi because it uses very few system resources.

To start vi, enter:

    $ vi filename

where filename is the name of the file you want to edit. If the file doesn't exist, vi will create it for you.

Basic Text Input and Navigation in vi

The main feature that makes vi unique as an editor is its mode-based operation. vi has two modes: command mode and input mode. In command mode, characters you type perform actions (e.g. moving the cursor, cutting or copying text, etc.) In input mode, characters you type are inserted or overwrite existing text.

When you begin vi, it is in command mode. To put vi into input mode, press i (insert). You can then type text which is inserted at the current cursor location; you can correct mistakes with the backspace key as you type. To get back into command mode, press ESC (the escape key). Another way of inserting text, especially useful when you are at the end of a line, is to press a (append).

In command mode, you are able to move the cursor around your document. h, j, k and l move the cursor left, down, up and right respectively (if you are lucky the arrow keys may also work). Other useful keys are ^ and $ which move you to the beginning and end of a line respectively. w skips to the beginning of the next word and b skips back to the beginning of the previous word.


To go right to the top of the document, press 1 and then G. To go the bottom of the document, press G. To skip forward a page, press ^F, and to go back a page, press ^B. To go to a particular line number, type the line number and press G, e.g. 55G takes you to line 55.

To delete text, move the cursor over the first character of the group you want to delete and make sure you are in command mode. Press x to delete the current character, dw to delete the next word, d4w to delete the next 4 words, dd to delete the next line, 4dd to delete the next 4 lines, d$ to delete to the end of the line or even dG to delete to the end of the document. If you accidentally delete too much, pressing u will undo the last change.

Occasionally you will want to join two lines together. Press J to do this (trying to press backspace on the beginning of the second line does not have the intuitive effect!)

Moving and Copying Text in vi

vi uses buffers to store text that is deleted. There are nine numbered buffers (1-9) as well as the undo buffer. Usually buffer 1 contains the most recent deletion, buffer 2 the next most recent, etc.

To cut and paste in vi, delete the text (using e.g. 5dd to delete 5 lines). Then move to the line where you want the text to appear and press p. If you delete something else before you paste, you can still retrieve the deleted text by pasting the contents of the numbered delete buffers. You can do this by typing "1p, "2p, etc.

To copy and paste, "yank" the text (using e.g. 5yy to copy 5 lines). Then move to the line where you want the text to appear and press p.

Searching for and Replacing Text in vi

In command mode, you can search for text by specifying regular expressions. To search forward, type / followed by a regular expression and press Enter. To search backwards, begin with a ? instead of a /. To find the next text that matches your regular expression, press n.

To search and replace all occurrences of pattern1 with pattern2, type :%s/pattern1/pattern2/g . To be asked to confirm each replacement, add a c to this substitution command. Instead of the % you can also give a range of lines (e.g. 1,17) over which you want the substitution to apply.
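The same kind of global substitution can also be done non-interactively from the shell with the separate stream editor sed (a sketch; the file name and contents are illustrative):

```shell
# Sample file (illustrative data).
printf 'old line\nkeep this\nold again\n' > sample.txt

# Replace every occurrence of "old" with "new", like :%s/old/new/g in vi.
# sed writes the edited text to standard output; the file is unchanged.
sed 's/old/new/g' sample.txt
```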

Other Useful vi Commands

Programmers might like the :set number command which displays line numbers (:set nonumber turns them off).

To save a file, type :w . To save and quit, type :wq or press ZZ. To force a quit without saving type :q! .

To start editing another file, type :e filename .

To execute shell commands from within vi, and then return to vi afterwards, type :!shellcommand . You can use the letter % as a substitute for the name of the file that you are editing (so :!echo % prints the name of the current file).

. repeats the last command.

Quick reference for vi


Inserting and typing text:
    i          insert text (and enter input mode)
    $a         append text (to end of line)
    ESC        re-enter command mode
    J          join lines

Cursor movement:
    h          left
    j          down
    k          up
    l          right
    ^          beginning of line
    $          end of line
    1G         top of document
    G          end of document
    <n>G       go to line <n>
    ^F         page forward
    ^B         page backward
    w          word forwards
    b          word backwards

Deleting and moving text:
    Backspace  delete character before cursor (only works in insert mode)
    x          delete character under cursor
    dw         delete word
    dd         delete line (restore with p or P)
    <n>dd      delete <n> lines
    d$         delete to end of line
    dG         delete to end of file
    yy         yank/copy line (restore with p or P)
    <n>yy      yank/copy <n> lines

Search and replace:
    :%s/<search string>/<replace string>/g

Miscellaneous:
    u          undo
    :w         save file
    :wq        save file and quit
    ZZ         save file and quit
    :q!        quit without saving

Unit 6

Inspecting files


Besides cat there are several other useful utilities for investigating the contents of files:

file filename(s)

file analyzes a file's contents for you and reports a high-level description of what type of file it appears to be:

    $ file myprog.c letter.txt webpage.html
    myprog.c:      C program text
    letter.txt:    English text
    webpage.html:  HTML document text

file can identify a wide range of files but sometimes gets understandably confused (e.g. when trying to automatically detect the difference between C++ and Java code).  

head, tail filename

head and tail display the first and last few lines in a file respectively. You can specify the number of lines as an option, e.g.

    $ tail -20 messages.txt     $ head -5 messages.txt

tail includes a useful -f option that can be used to continuously monitor the last few lines of a (possibly changing) file. This can be used to monitor log files, for example:

    $ tail -f /var/log/messages

continuously outputs the latest additions to the system log file.  
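A concrete head/tail sketch on a throwaway file (the file name and contents are illustrative):

```shell
# A five-line sample file.
printf '%s\n' one two three four five > lines.txt

head -2 lines.txt    # prints the first two lines: one, two
tail -1 lines.txt    # prints the last line: five
```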

objdump options binaryfile

objdump can be used to disassemble binary files - that is, it can show the machine language instructions which make up compiled application programs and system utilities.

od options filename (octal dump)

od can be used to display the contents of a binary or text file in a variety of formats, e.g.

    $ cat hello.txt
    hello world
    $ od -c hello.txt
    0000000   h   e   l   l   o       w   o   r   l   d  \n
    0000014
    $ od -x hello.txt
    0000000 6865 6c6c 6f20 776f 726c 640a
    0000014

There are also several other useful content inspectors that are non-standard (in terms of availability on UNIX systems) but are nevertheless in widespread use. They are summarised in Fig. 3.2.

File type                  Typical extension   Content viewer
Portable Document Format   .pdf                acroread
Postscript Document        .ps                 ghostview
DVI Document               .dvi                xdvi
JPEG Image                 .jpg                xv
GIF Image                  .gif                xv
MPEG movie                 .mpg                mpeg_play
WAV sound file             .wav                realplayer
HTML document              .html               netscape

Fig 3.2: Other file types and appropriate content viewers.

Searching for patterns

Note: refer to the "grep" command discussed above.

Comparing files

diff command: diff compares two files and prints out the differences between them. Here I have two ASCII text files, fileone and filetwo. The contents of fileone are:

    This is first file
    this is second line
    this is third line
    this is different as;lkdjf
    this is not different

filetwo contains:

    This is first file
    this is second line
    this is third line
    this is different xxxxxxxas;lkdjf
    this is not different

diff fileone filetwo will give the following output:

    4c4
    < this is different as;lkdjf
    ---
    > this is different xxxxxxxas;lkdjf

cmp command: cmp compares two files. For example, I have two different files, fileone and filetwo. cmp fileone filetwo will give me:

fileone filetwo differ: char 80, line 4


If I run the cmp command on identical files, nothing is returned. The -s option can be used to return only exit codes: 0 if the files are identical, 1 if the files are different, 2 if the files are inaccessible. The following command prints the message 'no changes' if the files are the same:

    $ cmp -s fileone file1 && echo 'no changes'
    no changes
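The exit-status behaviour above can be used to branch in a script. A minimal sketch (the file names and contents are illustrative):

```shell
# Two identical files.
printf 'a\nb\n' > one.txt
cp one.txt two.txt

# cmp -s is silent; its exit status drives the if.
if cmp -s one.txt two.txt
then
    echo 'no changes'
else
    echo 'files differ'
fi
# prints: no changes
```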

dircmp command: dircmp compares two directories. If I have two directories in my home directory named dirone and dirtwo, and each has 5-10 files in it, then dircmp dirone dirtwo will return this:

    Dec 9 16:06 1997 dirone only and dirtwo only Page 1
    ./cal.txt           ./fourth.txt
    ./dohazaar.txt      ./rmt.txt
    ./four.txt          ./te.txt
    ./junk.txt          ./third.txt
    ./test.txt

Operations on files

Printing files

Printer Control

lpr -Pprintqueue filename

lpr adds a document to a print queue, so that the document is printed when the printer is available. Look at /etc/printcap to find out what printers are available.  

lpq -Pprintqueue

lpq checks the status of the specified print queue. Each job will have an associated job number.  

lprm -Pprintqueue jobnumber

lprm removes the given job from the specified print queue.

Note that lpr, lpq and lprm are BSD-style print management utilities. If you are using a strict SYSV UNIX, you may need to use the SYSV equivalents lp, lpstat and cancel.

Sorting files

There are two facilities that are useful for sorting files in UNIX:


sort filenames

sort sorts the lines contained in a group of files alphabetically (or numerically, if the -n option is specified). The sorted output is displayed on the screen, and may be stored in another file by redirecting the output. So

    $ sort input1.txt input2.txt > output.txt

outputs the sorted concatenation of the files input1.txt and input2.txt to the file output.txt.

uniq filename

uniq removes duplicate adjacent lines from a file. This facility is most useful when combined with sort:

    $ sort input.txt | uniq > output.txt
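uniq also has a standard -c option that prefixes each line with its number of occurrences; piping the result back through sort -rn gives a frequency table. A sketch (the file name and contents are illustrative):

```shell
# Sample input with duplicate lines.
printf 'pear\napple\npear\napple\napple\n' > fruit.txt

# Count occurrences of each distinct line, most frequent first:
sort fruit.txt | uniq -c | sort -rn
#       3 apple
#       2 pear
```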

Splitting files

Syntax:

split [-l line_count][-a suffix_length][file[name]]

split -b n[k|m][-a suffix_length][file[name]]

split [-line_count][-a suffix_length][file[name]]

DESCRIPTION

The split utility reads an input file and writes one or more output files. The default size of each output file is 1000 lines. The size of the output files can be modified by specification of the -b or -l options. Each output file is created with a unique suffix. The suffix consists of exactly suffix_length lower-case letters from the POSIX locale. The letters of the suffix are used as if they were a base-26 digit system, with the first suffix to be created consisting of all a characters, the second with a b replacing the last a, and so on, until a name of all z characters is created. By default, the names of the output files are x, followed by a two-character suffix from the character set as described above, starting with aa, ab, ac, and so on, and continuing until the suffix zz, for a maximum of 676 files.

If the number of files required exceeds the maximum allowed by the suffix length provided, such that the last allowable file would be larger than the requested size, the split utility will fail after creating the last file with a valid suffix; split will not delete the files it created with valid suffixes. If the file limit is not exceeded, the last file created will contain the remainder of the input file, and may be smaller than the requested size.

OPTIONS

The following options are supported:

-a suffix_length

Use suffix_length letters to form the suffix portion of the filenames of the split file. If -a is not specified, the default suffix length is two. If the sum of the name operand and the suffix_length option-argument would create a filename exceeding {NAME_MAX} bytes, an error will result; split will exit with a diagnostic message and no files will be created.


-b n

Split a file into pieces n bytes in size.

-b nk

Split a file into pieces n*1024 bytes in size.

-b nm

Split a file into pieces n*1048576 bytes in size.

-l line_count

-line_count

Specify the number of lines in each resulting file piece. The line_count argument is an unsigned decimal integer. The default is 1000. If the input does not end with a newline character, the partial line will be included in the last output file.

OPERANDS

The following operands are supported:

file

The pathname of the ordinary file to be split. If no input file is given or file is "-", the standard input will be used.

name

The prefix to be used for each of the files resulting from the split operation. If no name argument is given, x will be used as the prefix of the output files. The combined length of the basename of prefix and suffix_length cannot exceed {NAME_MAX} bytes; see the OPTIONS section.

STDIN

See the INPUT FILES section.

INPUT FILES

Any file can be used as input.

EXAMPLES

In the following examples foo is a text file that contains 5000 lines.

1. Create five files, xaa, xab, xac, xad and xae:

split foo

2. Create five files, but the suffixed portion of the created files consists of three letters, xaaa, xaab, xaac, xaad and xaae:

split -a 3 foo


3. Create three files with four-letter suffixes and a supplied prefix, bar_aaaa, bar_aaab and bar_aaac:

split -a 4 -l 2000 foo bar_

4. Create as many files as are necessary to contain at most 20*1024 bytes, each with the default prefix of x and a five-letter suffix:

split -a 5 -b 20k foo
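A split can always be reversed by concatenating the pieces in suffix order. A small round-trip sketch (the file names are illustrative):

```shell
# Make a 10-line file, split it into 4-line pieces with the prefix part_.
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > numbers.txt
split -l 4 numbers.txt part_      # creates part_aa (4 lines),
                                  # part_ab (4 lines), part_ac (2 lines)

# Reassemble in suffix order and verify the result matches the original.
cat part_aa part_ab part_ac > rebuilt.txt
cmp -s numbers.txt rebuilt.txt && echo 'round trip OK'
# prints: round trip OK
```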

Translating characters (awk utility)

awk (Aho, Weinberger and Kernighan)

awk is useful for manipulating files that contain columns of data on a line by line basis. Like sed, you can either pass awk statements directly on the command line, or you can write a script file and let awk read the commands from the script.

Say we have a file of cricket scores called cricket.dat containing columns for player number, name, runs and the way in which they were dismissed:

    1 atherton     0   bowled
    2 hussain     20   caught
    3 stewart     47   stumped
    4 thorpe      33   lbw
    5 gough        6   run-out

To print out only the first and second columns we can say:

    $ awk '{ print $1 " " $2 }' cricket.dat
    1 atherton
    2 hussain
    3 stewart
    4 thorpe
    5 gough
    $

Here $n stands for the nth field or column of each line in the data file. $0 can be used to denote the whole line.

We can do much more with awk. For example, we can write a script cricket.awk to calculate the team's batting average and to check if Mike Atherton got another duck:

    $ cat > cricket.awk
    BEGIN { players = 0; runs = 0 }
    { players++; runs += $3 }
    /atherton/ { if ($3 == 0) print "atherton duck!" }
    END { print "the batting average is " runs/players }
    (ctrl-d)
    $ awk -f cricket.awk cricket.dat


    atherton duck!
    the batting average is 21.2
    $

The BEGIN clause is executed once at the start of the script, the main clause once for every line, the /atherton/ clause only if the word atherton occurs in the line and the END clause once at the end of the script.
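As a further sketch, awk can accumulate across lines without a script file. This block recreates the cricket data so it is self-contained, then totals the runs column:

```shell
# Recreate the data file used above.
cat > cricket.dat <<'EOF'
1 atherton 0 bowled
2 hussain 20 caught
3 stewart 47 stumped
4 thorpe 33 lbw
5 gough 6 run-out
EOF

# Sum field 3 over every line; print the total in the END clause.
awk '{ total += $3 } END { print "total runs: " total }' cricket.dat
# prints: total runs: 106
```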

Unit 7

Shells and Shell Scripts

A shell is a program which reads and executes commands for the user. Shells also usually provide features such as job control, input and output redirection and a command language for writing shell scripts. A shell script is simply an ordinary text file containing a series of commands in a shell command language (just like a "batch file" under MS-DOS).

There are many different shells available on UNIX systems (e.g. sh, bash, csh, ksh, tcsh etc.), and they each support a different command language. Here we will discuss the command language for the Bourne shell sh since it is available on almost all UNIX systems (and is also supported under bash and ksh).

Shell Variables and the Environment

A shell lets you define variables (like most programming languages). A variable is a piece of data that is given a name. Once you have assigned a value to a variable, you access its value by prepending a $ to the name:

    $ bob='hello world'
    $ echo $bob
    hello world
    $

Variables created within a shell are local to that shell, so only that shell can access them. The set command will show you a list of all variables currently defined in a shell. If you wish a variable to be accessible to commands outside the shell, you can export it into the environment:

    $ export bob

(under csh you use setenv). The environment is the set of variables that are made available to commands (including shells) when they are executed. UNIX commands and programs can read the values of environment variables, and adjust their behaviour accordingly. For example, the environment variable PAGER is used by the man command (and others) to see what command should be used to display multiple pages. If you say:

    $ export PAGER=cat

and then try the man command (say man pwd), the page will go flying past without stopping. If you now say:

    $ export PAGER=more


normal service should be resumed (since now more will be used to display the pages one at a time). Another environment variable that is commonly used is the EDITOR variable which specifies the default editor to use (so you can set this to vi or emacs or which ever other editor you prefer). To find out which environment variables are used by a particular command, consult the man pages for that command.

Another interesting environment variable is PS1, the main shell prompt string which you can use to create your own custom prompt. For example:

    $ export PS1="(\h) \w> "
    (lumberjack) ~>

The shell often incorporates efficient mechanisms for specifying common parts of the shell prompt (e.g. in bash you can use \h for the current host, \w for the current working directory, \d for the date, \t for the time, \u for the current user and so on - see the bash man page).

Another important environment variable is PATH. PATH is a list of directories that the shell uses to locate executable files for commands. So if the PATH is set to:

    /bin:/usr/bin:/usr/local/bin:.

and you typed ls, the shell would look for /bin/ls, /usr/bin/ls etc. Note that this PATH contains '.', i.e. the current working directory. This allows you to create a shell script or program and run it as a command from your current directory without having to explicitly say "./filename".

Note that PATH has nothing to do with filenames that are specified as arguments to commands (e.g. cat myfile.txt would only look for ./myfile.txt, not for /bin/myfile.txt, /usr/bin/myfile.txt etc.)
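You can ask the shell directly which executable a command name resolves to; the standard command -v built-in searches the PATH directories from left to right:

```shell
# Print the full path of the executable the shell would run for "ls".
command -v ls
# output varies by system, e.g. /bin/ls or /usr/bin/ls
```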

Simple Shell Scripting / Interactive shell scripts

Consider the following simple shell script, which has been created (using an editor) in a text file called simple:

#!/bin/sh
# this is a comment
echo "The number of arguments is $#"
echo "The arguments are $*"
echo "The first is $1"
echo "My process number is $$"
echo "Enter a number from the keyboard: "
read number
echo "The number you entered was $number"

The shell script begins with the line "#!/bin/sh". Usually "#" denotes the start of a comment, but #! is a special combination that tells UNIX to use the Bourne shell (sh) to interpret this script. The #! must be the first two characters of the script. The arguments passed to the script can be accessed through $1, $2, $3 etc. $* stands for all the arguments, and $# for the number of arguments. The process number of the shell executing the script is given by $$. The read number statement assigns keyboard input to the variable number.

To execute this script, we first have to make the file simple executable:


    $ ls -l simple
    -rw-r--r--    1 will  finance  175  Dec 13  simple
    $ chmod +x simple
    $ ls -l simple
    -rwxr-xr-x    1 will  finance  175  Dec 13  simple
    $ ./simple hello world
    The number of arguments is 2
    The arguments are hello world
    The first is hello
    My process number is 2669
    Enter a number from the keyboard:
    5
    The number you entered was 5
    $

We can use input and output redirection in the normal way with scripts, so:

    $ echo 5 | ./simple hello world

would produce similar output but would not pause to read a number from the keyboard.

More Advanced Shell Scripting

if-then-else statements

Shell scripts are able to perform simple conditional branches:

if [ test ]
then
    commands-if-test-is-true
else
    commands-if-test-is-false
fi

The test condition may involve file characteristics or simple string or numerical comparisons. The [ used here is actually the name of a command (/bin/[) which performs the evaluation of the test condition. Therefore there must be spaces before and after it as well as before the closing bracket. Some common test conditions are:

-s file     true if file exists and is not empty
-f file     true if file is an ordinary file
-d file     true if file is a directory
-r file     true if file is readable
-w file     true if file is writable
-x file     true if file is executable


$X -eq $Y       true if X equals Y
$X -ne $Y       true if X not equal to Y
$X -lt $Y       true if X less than Y
$X -gt $Y       true if X greater than Y
$X -le $Y       true if X less than or equal to Y
$X -ge $Y       true if X greater than or equal to Y
"$A" = "$B"     true if string A equals string B
"$A" != "$B"    true if string A not equal to string B
$X ! -gt $Y     true if X is not greater than Y
$E -a $F        true if expressions E and F are both true
$E -o $F        true if either expression E or expression F is true
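A minimal sketch combining a file test with a numeric comparison using the -a (and) operator from the table above (the variable and messages are illustrative):

```shell
#!/bin/sh
# True when the current directory exists (always) AND count exceeds 2.
count=3
if [ -d . -a "$count" -gt 2 ]
then
    echo "directory exists and count is greater than 2"
else
    echo "test failed"
fi
# prints: directory exists and count is greater than 2
```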

for loops

Sometimes we want to loop through a list of files, executing some commands on each file. We can do this by using a for loop:

for variable in list
do
    statements  (referring to $variable)
done

The following script sorts each text file in the current directory:

#!/bin/sh
for f in *.txt
do
    echo sorting file $f
    cat $f | sort > $f.sorted
    echo sorted file has been output to $f.sorted
done

while loops

Another form of loop is the while loop:

while [ test ]
do
    statements  (to be executed while test is true)
done

The following script waits until a non-empty file input.txt has been created:

#!/bin/sh
while [ ! -s input.txt ]
do
    echo waiting...
    sleep 5
done
echo input.txt is ready

You can abort a shell script at any point using the exit statement, so the following script is equivalent:

#!/bin/sh
while true
do
    if [ -s input.txt ]
    then
        echo input.txt is ready
        exit
    fi
    echo waiting...
    sleep 5
done

case statements

case statements are a convenient way to perform multiway branches where one input pattern must be compared to several alternatives:

case variable in
    pattern1)
        statement    (executed if variable matches pattern1)
        ;;
    pattern2)
        statement
        ;;
    etc.
esac

The following script uses a case statement to have a guess at the type of non-directory non-executable files passed as arguments on the basis of their extensions (note how the "or" operator | can be used to denote multiple patterns, how "*" has been used as a catch-all, and the effect of the backquotes `):

#!/bin/sh
for f in $*
do
  if [ -f $f -a ! -x $f ]
  then
    case $f in
    core)
      echo "$f: a core dump file"
      ;;
    *.c)
      echo "$f: a C program"
      ;;
    *.cpp|*.cc|*.cxx)
      echo "$f: a C++ program"
      ;;
    *.txt)
      echo "$f: a text file"
      ;;
    *.pl)
      echo "$f: a PERL script"
      ;;
    *.html|*.htm)
      echo "$f: a web document"
      ;;
    *)
      echo "$f: appears to be "`file -b $f`
      ;;
    esac
  fi
done

capturing command output

Any UNIX command or program can be executed from a shell script just as you would execute it on the command line. You can also capture the output of a command and assign it to a variable by using backquotes ` `:

#!/bin/sh
lines=`wc -l $1`
echo "the file $1 has $lines lines"

This script outputs the number of lines in the file passed as the first parameter.  

arithmetic operations

The Bourne shell doesn't have any built-in ability to evaluate simple mathematical expressions. Fortunately the UNIX expr command is available to do this. It is frequently used in shell scripts with backquotes to update the value of a variable. For example:

    lines=`expr $lines + 1`


adds 1 to the variable lines. expr supports the operators +, -, *, /, % (remainder), <, <=, =, !=, >=, >, | (or) and & (and).
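Putting expr into a loop gives simple counting arithmetic. A sketch that sums the numbers 1 to 5 (the variable names are illustrative):

```shell
#!/bin/sh
# Sum the numbers 1..5, using expr for all arithmetic.
total=0
i=1
while [ "$i" -le 5 ]
do
    total=`expr $total + $i`    # running sum
    i=`expr $i + 1`             # loop counter
done
echo "total is $total"
# prints: total is 15
```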

Start-up Shell Scripts

When you first log in to a shell, your shell runs a systemwide start-up script (usually called /etc/profile under sh, bash and ksh and /etc/.login under csh). It then looks in your home directory and runs your personal start-up script (.profile under sh, bash and ksh and .cshrc under csh and tcsh). Your personal start-up script is therefore usually a good place to set up environment variables such as PATH, EDITOR etc. For example with bash, to add the directory ~/bin to your PATH, you can include the line:

    export PATH=$PATH:~/bin

in your .profile. If you subsequently modify your .profile and you wish to import the changes into your current shell, type:

    $ source .profile
or
    $ . ./.profile

The source command is built into the shell. It ensures that changes to the environment made in .profile affect the current shell, and not the shell that would otherwise be created to execute the .profile script.

With csh, to add the directory ~/bin to your PATH, you can include the line:

    set path = ( $PATH $HOME/bin )

in your .cshrc.

Some Common System Administration Tasks.

Performing backups

Adding and removing users

Adding and removing hardware

Restoring files from backups that users have accidentally deleted

Installing new software

Answering users' questions

Monitoring system activity, disc use and log files

Figuring out why a program has stopped working since yesterday, even though the user didn't change anything; honest!

Monitoring system security

Adding new systems to the network

Talking with vendors to determine if a problem is yours or theirs, and getting them to fix it when it is their problem


Figuring out why "the network" (or "Pine", or "the computer") is so slow

Trying to free up disc space

Rebooting the system after a crash (usually happens when you are at home or have to catch a bus)

Writing scripts to automate as many of the above tasks as possible

Types of Unix users.

root (a.k.a. "super user" -- guard this account!)

everybody else

Booting/Shutdown

Boot Phases

ROM monitor may run some simple hardware diagnostics and finds the boot device

ROM monitor loads/runs bootstrap loader

Bootstrap loader loads/runs kernel (PID 0)

Kernel may perform more elaborate diagnostics, then checks root file system

Kernel mounts root file system, then starts init (PID 1)

Operator may need to switch to multi-user mode

init runs initialization scripts (rc scripts)

getty/login produces prompts on terminals (or graphics system starts up) and system is ready to go

This example of the output of System V style ps will help identify these processes.

Shutdown and System Start-up

Shutdown: shutdown, halt, reboot (in /sbin)

/sbin/shutdown allows a UNIX system to shut down gracefully and securely.  All logged-in  users  are  notified  that  the system is going down, and new logins are blocked.  It is possible to shut the system down  immediately or after a specified delay and to specify what should happen after the system has been shut down:

# /sbin/shutdown -r now     (shut down now and reboot)
# /sbin/shutdown -h +5      (shut down in 5 minutes & halt)
# /sbin/shutdown -k 17:00   (fake a shutdown at 5pm)

halt and reboot are equivalent to shutdown -h and shutdown -r respectively.

If you have to shut a system down extremely urgently or for some reason cannot use shutdown, it is at least a good idea to first run the command:

    # sync

which flushes unwritten file system buffers out to disk, bringing the state of the file system up to date.


System startup:

At system startup, the operating system performs various low-level tasks, such as initialising the memory system, loading up device drivers to communicate with hardware devices, mounting filesystems and creating the init process (the parent of all processes). init's primary responsibility is to start up the system services as specified in /etc/inittab. Typically these services include gettys (i.e. virtual terminals where users can log in), and the scripts in the directory /etc/rc.d/init.d which usually spawn high-level daemons such as httpd (the web server). On most UNIX systems you can type dmesg to see system startup messages, or look in /var/log/messages.

If a mounted filesystem is not "clean" (e.g. the machine was turned off without shutting down properly), a system utility fsck is automatically run to repair it. The automatic run can only fix certain errors, however, and you may have to run it manually:

    # fsck filesys

where filesys is the name of a device (e.g. /dev/hda1) or a mount point (like /). "Lost" files recovered during this process end up in the lost+found directory. More modern "journaling" filesystems don't usually require fsck, since they keep extensive logs of filesystem events and are able to recover in a similar way to a transactional database.

Backup and Restoration

UNIX systems usually support a number of utilities for backing up and compressing files. The most useful are:

tar (tape archiver)

tar backs up entire directories and files onto a tape device or (more commonly) into a single disk file known as an archive. An archive is a file that contains other files plus information about them, such as their filename, owner, timestamps, and access permissions. tar does not perform any compression by default.

To create a disk file tar archive, use

    $ tar -cvf archivename filenames

where archivename will usually have a .tar extension. Here the c option means create, v means verbose (output filenames as they are archived), and f means file. To list the contents of a tar archive, use

    $ tar -tvf archivename

To restore files from a tar archive, use


    $ tar -xvf archivename  
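Putting the three operations together, a complete round trip might look like this (directory and file names are illustrative):

```shell
# Create some sample data, archive it, then restore it.
mkdir -p demo
echo "hello" > demo/a.txt
tar -cvf demo.tar demo     # c: create the archive demo.tar
tar -tvf demo.tar          # t: list the archive's contents
rm -r demo                 # simulate losing the original files
tar -xvf demo.tar          # x: restore demo/ from the archive
cat demo/a.txt             # prints "hello"
```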

cpio

cpio is another facility for creating and reading archives. Unlike tar, cpio doesn't automatically archive the contents of directories, so it's common to combine cpio with find when creating an archive:

$ find . -depth -print | cpio -ov -H tar > archivename

This will take all the files in the current directory and the directories below it and place them in an archive called archivename. The -depth option controls the order in which the filenames are produced and is recommended to prevent problems with directory permissions when doing a restore. The -o option creates the archive, the -v option prints the names of the files archived as they are added, and the -H option specifies an archive format type (in this case a tar archive). Another common archive type is crc, a portable format with a checksum for error control.

To list the contents of a cpio archive, use

    $ cpio -tv < archivename

To restore files, use:

    $ cpio -idv < archivename

Here the -d option will create directories as necessary. To force cpio to extract files  on top of files of the same name that already exist (and have the same or later modification time), use the -u option.
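cpio can also restore selectively: shell-style patterns given on the command line limit extraction to matching files. A small sketch with illustrative file names (note that in GNU cpio patterns, unlike in the shell, * also matches /):

```shell
# Build a small archive containing a .txt and a .log file...
mkdir d
echo "keep" > d/a.txt
echo "skip" > d/b.log
find d -depth -print | cpio -ov > archive.cpio
rm -r d
# ...then restore only the .txt files ('*' also matches '/',
# so "*.txt" matches d/a.txt).
cpio -idv "*.txt" < archive.cpio
```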

compress, gzip

compress and gzip are utilities for compressing and decompressing individual files (which may or may not be archive files). To compress files, use:

    $ compress filename
or
    $ gzip filename

In each case, filename will be deleted and replaced by a compressed file called filename.Z or filename.gz. To reverse the compression process, use:

    $ compress -d filename
or
    $ gzip -d filename
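Since tar itself does not compress, it is common to combine it with gzip in a pipeline. A minimal sketch (directory and archive names are illustrative):

```shell
# Archive a directory and compress the result in one pipeline;
# "-f -" tells tar to write the archive to standard output.
tar -cvf - mydir | gzip > mydir.tar.gz
# Decompress and extract in one step (-dc writes the
# decompressed data to standard output).
gzip -dc mydir.tar.gz | tar -xvf -
```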

Maintaining User Accounts


Refer to Unit 5, “user name and group”.

Comparison of UNIX and Windows

Mainframe computers have been used by the industry for a long time. They often run one of the various versions of UNIX provided by vendors. The various implementations of the UNIX operating system have served the industry well, as witnessed by the large base of installed systems and the number of large-scale applications installed on these systems. However, there are increasing signs of dissatisfaction with expensive, often proprietary solutions running on UNIX architecture. In many cases, the user experience of these systems involves text-based interaction with dumb terminals or a terminal-emulation session on a computer. Dissatisfied with these characteristics of UNIX, IT managers have often sought alternatives to the increasing expense of being tied to single-vendor software and hardware solutions. In addition, since the mid-1980s, corporate data centers have been moving away from mainframes, with their dedicated operating systems, to minicomputers. This has prompted more organizations to evaluate migrating to Intel-based architecture as an option.

One of the most extraordinary and unexpected successes of the Intel PC architecture is the flexibility to extend its framework to include very large server and data center environments. Large-scale hosting organizations are now offering enterprise-level services to multiple client organizations at an availability level of more than 99.99 percent on what are just racks of relatively inexpensive computers. These catalysts have led more organizations to migrate their enterprise-level applications from UNIX to Windows.
The key benefits (for example, reduction of costs and increased flexibility) of the Microsoft® Windows® operating system have fueled this interest, increasing the number of organizations that are considering migrating to alternative platforms. Migration to, or coexistence with, the Microsoft Windows operating system makes sense because it offers greater enterprise-readiness and a better-integrated future roadmap for application development and server support, along with quantifiable reductions in total cost of ownership (TCO). Before you migrate from UNIX to Windows, it is recommended that you learn about the evolution and architecture of the two operating systems. The following section provides this information.

Comparison of Windows and UNIX Environments

This section compares the Windows and UNIX architectures, emphasizing the areas that directly affect software development in a migration project.

Kernels and APIs

As with most operating systems, Windows and UNIX both have kernels. The kernel provides the base functionality of the operating system. The major functionality of the kernel includes process management, memory management, thread management, scheduling, I/O management, and power management.

In UNIX, the API functions are called system calls. System calls are a programming interface common to all implementations of UNIX. The kernel is a set of functions that are used by processes through system calls.


Windows has an API for programming calls to the executive. In addition to this, each subsystem provides a higher-level API. This approach allows Windows operating systems to provide different APIs, some of which mimic the APIs provided by the kernels of other operating systems. The standard subsystem APIs include the Windows API (the Windows native API) and the POSIX API (the standards-based UNIX API).

Windows Subsystems

A subsystem is a portion of the Windows operating system that provides a service to application programs through a callable API.

Subsystems come in two varieties, depending upon the application program that finally handles the request:

Environment subsystems. These subsystems run in user mode and provide functions through a published API. The Windows API subsystem provides an API for operating system services, GUI capabilities, and functions to control all user input and output. The Win32 subsystem and POSIX subsystem are part of the environment subsystems and are described as follows:

Win32 subsystem. The Win32 environment subsystem allows applications to benefit from the complete power of the Windows family of operating systems. The Win32 subsystem has a vast collection of functions, including those required for advanced operating systems, such as security, synchronization, virtual memory management, and threads. You can use Windows APIs to write 32-bit and 64-bit applications that run on all versions of Windows.

POSIX subsystem and Windows Services for UNIX. To provide more comprehensive support for UNIX programs, Windows uses the Interix subsystem. Interix is a multiuser UNIX environment for a Windows-based computer. Interix conforms to the POSIX.1 and POSIX.2 standards. It provides all features of a traditional UNIX operating system, including pipes, hard links, symbolic links, UNIX networking, and UNIX graphical support through the X Window System. It also includes case-sensitive file names, job control tools, compilation tools, and more than 300 UNIX commands and tools, such as KornShell, C Shell, awk, and vi.

Because it is layered on top of the Windows kernel, the Interix subsystem is not an emulation. Instead, it is a native environment subsystem that integrates with the Windows kernel, just as the Win32 subsystem does. Shell scripts and other scripted applications that use UNIX and POSIX.2 tools run under Interix.

Integral subsystems. These subsystems perform key operating system functions and run as a part of the executive or kernel. Examples are the user-mode subsystems, Local Security Authority subsystem (LSASS), and Remote Procedure Call subsystem (RPCSS).

Kernel Objects and Handles

Kernel objects are used to manage and manipulate resources—such as files, synchronization objects, and pipes—across the processes. These kernel objects are owned by the kernel and not by the process. Handles are the opaque numbers or the data type used to represent the objects and to uniquely identify the objects. To interact with an object, you must obtain a handle to the object.

In UNIX, kernel objects can be created using system calls, which return a small unsigned integer (a file descriptor). There is no exact equivalent of handles in UNIX.

In Windows, Windows APIs are used to create the kernel object and it returns a Windows-specific data type called HANDLE.
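The UNIX side of this can be seen even at the shell level: a file descriptor is just a small integer naming an open kernel object. A minimal sketch (the file name and descriptor number are illustrative):

```shell
# Open f.txt for reading on descriptor 3, read a line through
# the descriptor, then close it.
echo "data" > f.txt
exec 3< f.txt      # fd 3 now refers to f.txt
read line <&3      # read via the descriptor
echo "$line"       # prints "data"
exec 3<&-          # close fd 3
```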

Hardware Drivers

Hardware drivers are system software that allow hardware devices to interact with the operating system.

In UNIX, there are several different ways to manage drivers. Some UNIX implementations allow dynamic loading and unloading of drivers, whereas other implementations do not. The UNIX vendor usually provides drivers. The range of Intel hardware supported for UNIX is typically smaller than that for Windows.

In Windows, the Windows driver model provides a platform for developing drivers for industry-standard hardware devices attached to a Windows-based system. The key to developing a good driver package is to provide reliable setup and installation procedures and interactive GUI tools for configuring devices after the installation. In addition, the hardware must be compatible with Windows Plug and Play technology in order to provide a user-friendly hardware installation.

Process Management

A process is usually defined as an instance of a running program. Process management describes how an operating system manages the multiple processes that may be running at any given time. Multitasking operating systems such as Windows and UNIX must manage and control many processes simultaneously. Both the Windows and UNIX operating systems support processes and threads.

The following sections provide more details on process management in both UNIX and Windows:

Multitasking. UNIX is a multiprocessing, multiuser system. At any given point, you can have many processes running on UNIX. Consequently, UNIX is very efficient at creating processes.

Windows has evolved greatly from its predecessors, such as Digital Equipment Corporation’s VAX/VMS. It is now a preemptive multitasking operating system. As a result, Windows relies more heavily on threads than processes. (A thread is a construct that enables parallel processing within a single process.) Creating a new process is a relatively expensive operation, while creating a new thread is not as expensive in terms of system resources like memory and time. Hence, multiprocess-oriented applications on UNIX typically translate to multithreaded applications on the Windows platform, thus saving such system resources as memory and time.

Multiple users. One key difference between UNIX and Windows is in the creation of multiple user accounts on a single computer.

When you log on to a computer running UNIX, a shell process is started to service your commands. The UNIX operating system keeps track of the users and their processes and prevents processes from interfering with one another. The operating system does not come with any default interface for user interaction. However, the shell process on the computer running UNIX can connect to other computers to load third-party UIs.

When a user logs on interactively to a computer running Windows, the Win32 subsystem’s Graphical Identification and Authentication dynamic-link library (GINA) creates the initial process for that user, which is known as the user desktop, where all user interaction or activity takes place. The desktop on the user's computer is loaded from the server. Only the user who is logged on has access to the desktop. Other users are not allowed to log on to that computer at the same time. However, if a user employs Terminal Services or Citrix, Windows can operate in a server-centric mode just as UNIX does.

The Windows operating system supports multiple users simultaneously through the command line and a GUI; the latter requires the use of Windows Terminal Services. The UNIX operating system supports multiple simultaneous users through the command line and through a GUI; the latter requires the use of the X Window System. Windows comes with a default command shell (cmd.exe); UNIX typically includes several shells and encourages each user to choose a shell as the user’s default shell. Both operating systems provide complete isolation between simultaneous users. There are some key differences between the two: Windows comes with a “single user” version that allows one user at a time (Windows XP) as well as a multiuser server version (Windows Server), and it is rare for a Windows Server system to have multiple simultaneous command-line users.

Multithreading. Most new UNIX kernels are multithreaded to take advantage of symmetric multiprocessing (SMP) computers. Initially, UNIX did not expose threads to programmers. However, POSIX has user-programmable threads. There is a POSIX standard for threads (called Pthreads) that all current versions of UNIX support.

In Windows, creating a new thread is very efficient. Windows applications are capable of using threads to take advantage of SMP computers and to maintain interactive capabilities when some threads take a long time to execute.

Process hierarchy. When a UNIX-based application creates a new process, the new process becomes a child of the process that created it. This process hierarchy is often important, and there are system calls for manipulating child processes.

Unlike UNIX, Windows processes do not share a hierarchical relationship. The creating process receives the process handle and ID of the process it created, so a hierarchical relationship can be maintained or simulated if required by the application. However, the operating system treats all processes as if they belong to the same generation. Windows provides a feature called Job Objects, which allows disparate processes to be grouped together and to adhere to one set of rules.
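The UNIX process hierarchy is visible directly from the shell: every process records its parent's PID, which the shell exposes as $PPID. A minimal sketch:

```shell
# $$ is the current shell's PID; $PPID is its parent's PID.
echo "this shell: $$ (child of $PPID)"
# A child shell started from here reports its own, different PID,
# and names this shell as its parent.
sh -c 'echo "child shell: $$ (child of $PPID)"'
```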