Basic content on operating systems

OPERATING SYSTEMS

These notes have been prepared mainly from the following book:-
(1) Operating System Concepts by Silberschatz, Galvin, and Gagne.
Kindly refer to the above book for more details.

Processes

Process:- A process is a program in execution. A program by itself is not a process: a program is a passive entity, whereas a process is an active entity.

A process includes the following:-
(i) Text section:- It contains the program code
(ii) Current Activity:- It is represented by the value of the program counter and the contents of the processor's registers.
(iii) Stack:- Contains temporary data such as function parameters, return addresses, and local variables.
(iv) Data section:- Contains global variables.
(v) Heap:- Used for dynamic memory allocation during process runtime.

Process State:-
Each process may be in one of the following states during execution:-


(i) New:- The process is being created.
(ii) Running:- Instructions are being executed.
(iii) Waiting:- The process is waiting for some event to occur, such as an I/O completion or reception of a signal.
(iv) Ready:- The process is waiting to be assigned to a processor.
(v) Terminated:- The process has finished execution.

* Only one process can be running on any processor at any instant.
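The five states can be sketched as a C enumeration with a check for legal transitions. The names `proc_state` and `can_transition` are illustrative assumptions, not part of any real kernel, and the allowed transitions follow the usual textbook state diagram rather than anything stated explicitly in these notes.

```c
#include <stdbool.h>

/* The five process states listed above (identifier names are illustrative). */
typedef enum { ST_NEW, ST_READY, ST_RUNNING, ST_WAITING, ST_TERMINATED } proc_state;

/* Returns true if the textbook state diagram allows moving from `from` to `to`. */
bool can_transition(proc_state from, proc_state to)
{
    switch (from) {
    case ST_NEW:     return to == ST_READY;        /* admitted by the OS */
    case ST_READY:   return to == ST_RUNNING;      /* dispatched by the scheduler */
    case ST_RUNNING: return to == ST_READY         /* interrupted or preempted */
                         || to == ST_WAITING       /* waits for I/O or an event */
                         || to == ST_TERMINATED;   /* finishes execution */
    case ST_WAITING: return to == ST_READY;        /* I/O or event completed */
    default:         return false;                 /* terminated: no further moves */
    }
}
```

Note that a waiting process cannot go straight back to running; it must first re-enter the ready state and be selected again.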

Process Control Block:-

Each process is represented by a process control block (PCB), also called a task control block.

A PCB contains the following information:-
(1) Process State
(2) Program Counter
(3) CPU Registers
(4) CPU Scheduling Information
(5) Memory Management Information
(6) Accounting Information
(7) I/O Status
(8) Process Identifier (PID) & Parent Process Id (PPID)
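As a rough illustration, the PCB fields listed above could be laid out as a C structure. Every field name, the register count, and the size of the open-file table are assumptions made for this sketch; real kernels use much larger, system-specific structures.

```c
#include <stdint.h>
#include <string.h>

#define NUM_REGS  16   /* assumed register count, for illustration only */
#define MAX_FILES 8    /* assumed size of the open-file table */

typedef enum { P_NEW, P_READY, P_RUNNING, P_WAITING, P_TERMINATED } pcb_state;

typedef struct pcb {
    pcb_state     state;                 /* (1) process state */
    uintptr_t     program_counter;       /* (2) address of the next instruction */
    uintptr_t     regs[NUM_REGS];        /* (3) saved CPU registers */
    int           priority;              /* (4) CPU scheduling information */
    uintptr_t     mem_base, mem_limit;   /* (5) memory management information */
    unsigned long cpu_time_used;         /* (6) accounting information */
    int           open_files[MAX_FILES]; /* (7) I/O status; -1 marks a free slot */
    int           pid, ppid;             /* (8) process id and parent process id */
} pcb;

/* Initialises a PCB for a newly created process. */
void pcb_init(pcb *p, int pid, int ppid)
{
    memset(p, 0, sizeof *p);
    p->pid = pid;
    p->ppid = ppid;
    p->state = P_NEW;
    for (int i = 0; i < MAX_FILES; i++)
        p->open_files[i] = -1;
}
```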

Operations on Process:-

A process may create several new processes, via a create-process system call, during the course of execution.

The creating process is called a parent process, and the new processes are called the children of that process. Each of these processes may in turn create other processes, forming a tree of processes.

When a process creates a subprocess, the subprocess may be able to obtain its resources directly from the operating system, or it may be constrained to a subset of the resources of the parent process.

Restricting a child process to a subset of the parent's resources prevents any process from overloading the system by creating too many subprocesses.



Process Table :-

The process table is a data structure maintained by the operating system to facilitate context switching and scheduling.

Every process has an entry in the process table. These entries are known as process control blocks (PCBs).

Trap or Exception:-

A trap is a software-generated interrupt caused either by an error or by a specific request from a user program.

Dual-Mode Operation in an Operating System:-

(i) User Mode
(ii) Kernel Mode, also called Supervisor Mode, System Mode, or Privileged Mode.

Mode Bit:- It is used to distinguish between user mode and kernel mode.
(a) For user mode:- Mode bit = 1
(b) For kernel mode:- Mode bit = 0




- A timer is used to ensure that the operating system maintains control over the CPU.
- A timer can be set to interrupt the computer after a specified period.
- We can use the timer to prevent a user program from running too long.

Fork ( ):- When a process forks, it creates a copy of itself.

In the child process, the return value of fork is zero.

In the parent process, the return value is the PID of the newly created child process.

The fork operation creates a separate address space for the child.

Both the parent and child processes possess the same code segments, but execute independently of each other.

    for (int i = 0; i < n; i++)
        fork();

For this program, (2^n - 1) child processes will be created: every process that reaches a fork() call is doubled, so after n iterations there are 2^n processes, all but the original being children.
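The two return values of fork can be seen in a small POSIX sketch. The helper name `run_child` is hypothetical; the calls `fork`, `_exit`, and `waitpid` are standard POSIX, and the code assumes a Unix-like system.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks one child that exits immediately with `code` (0..255).
   The parent waits for it and returns the child's exit status,
   or -1 if fork or waitpid fails. */
int run_child(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;            /* fork failed, no child was created */
    if (pid == 0)
        _exit(code);          /* child: fork returned 0 */

    int status;               /* parent: fork returned the child's PID */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the child has its own address space, nothing it does to its variables is visible to the parent; the exit status is the only value passed back here.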




Problems with traditional processes:-

(i) Traditional processes have separate address spaces.
(ii) Process creation is expensive.
(iii) Process switch is expensive.
(iv) Sharing memory areas among processes is non-trivial.
(v) A process is a single thread of execution.

A thread is a basic unit of CPU utilization. It comprises:-
(i) Thread ID
(ii) Program Counter (PC)
(iii) Register set
(iv) Stack

It shares with other threads belonging to the same process its code section, data section, and other operating system resources, such as open files and signals.

* Many operating system kernels are now multithreaded.



Benefits of Multithreading:-

(i) Responsiveness:- Multithreading an interactive application may allow a program to continue running even if part of it is blocked or is performing a lengthy operation, thereby increasing responsiveness to the user.
(ii) Resource Sharing:- Threads share the address space and resources of the process to which they belong.
(iii) Economy:-
(a) Thread creation is much simpler than process creation.
(b) Thread switch is simple.
(c) Communication between threads is simple.
(iv) Utilization of multiprocessor architectures:- Threads can run in parallel on different processors.

Types of Threads:- Two types:-

(1) User-level threads:- Managed by a thread library; the kernel is not aware of the threads.
(2) Kernel-level threads:- All thread management is done in the kernel.

Multithreading Models:-

There must exist a relationship between user threads and kernel threads. There are three common ways of establishing this relationship:-

(i) Many-to-One Model:- It maps many user-level threads to one kernel thread.
It is efficient, but the entire process will block if a thread makes a blocking system call.
Because only one thread can access the kernel at a time, multiple threads are unable to run in parallel on multiprocessors.

(ii) One-to-One Model:- It maps each user thread to a kernel thread.
It provides more concurrency than the many-to-one model by allowing another thread to run when a thread makes a blocking system call.
It also allows multiple threads to run in parallel on multiprocessors.
The only drawback to this model is that creating a user thread requires creating the corresponding kernel thread.

(iii) Many-to-Many Model:- It multiplexes many user-level threads to a smaller or equal number of kernel threads.
It does not suffer from the shortcomings of the other two models.

Pop-Up Threads:-

In distributed systems, the arrival of a message causes the system to create a new thread to handle the message. Such a thread is called a pop-up thread.

A key advantage of pop-up threads is that since they are brand new, they do not have any history (registers, stack, etc.) that must be restored. Each one starts out fresh and each one is identical to all the others.

Process Scheduling

The objective of multiprogramming is to have some process running at all times, to maximize CPU utilization.

In a multiprogrammed batch system, the user cannot interact with a program while it is running.

Time Sharing:-

The objective of time sharing is to switch the CPU among processes so frequently that users can interact with each program while it is running.

Time-sharing systems are also called multitasking systems.

For a time-sharing system, the response time should be short.

A time-sharing system internally uses multiprogramming.



Scheduling Queues:-

(i) Job Queue:- As processes enter the system, they are put into a job queue, which consists of all processes in the system.
(ii) Ready Queue:- The processes that are residing in main memory and are ready and waiting to execute are kept on a list called the ready queue.
(iii) Device Queue:- The list of processes waiting for a particular I/O device is called a device queue. Each device has its own device queue.

Context Switch:-

Switching the CPU to another process requires performing a state save of the current process and a state restore of a different process. This task is known as a context switch.

Context-switch time is pure overhead, because the system does no useful work while switching.

Its speed varies from machine to machine, depending on the memory speed, the number of registers that must be copied, and the existence of special instructions (such as a single instruction to load or store all registers).

Dispatcher:-

The dispatcher is the module that gives control of the CPU to the process selected by the short-term scheduler.

This function involves the following:-
(i) Switching context
(ii) Switching to user mode
(iii) Jumping to the proper location in the user program to restart the program

The dispatcher should be as fast as possible.

The time taken by the dispatcher to stop one process and start another running is known as the dispatch latency.



Preemptive and Non-preemptive Scheduling:-

- Under non-preemptive scheduling, once the CPU has been allocated to a process, the process keeps the CPU until it releases the CPU either by terminating or by switching to the waiting state.
- Non-preemptive scheduling is also called cooperative scheduling.
- Whenever a process switches from the running state to the ready state, or from the waiting state to the ready state, and scheduling takes place in these cases, the scheduling is called preemptive scheduling.
- Scheduling is preemptive if the CPU, once given to a process, can be taken away from it.
- Preemptive scheduling incurs a cost associated with access to shared data. The shared data can become inconsistent if one process that is updating some data is preempted by another process, and that other process then tries to read this data.

Scheduling Criteria:-

(i) CPU Utilization:- The CPU should be kept as busy as possible.
(ii) Throughput:- The number of processes that are completed per unit time is called throughput.
(iii) Turnaround time:- The interval from the time of submission of a process to the time of completion is the turnaround time.
(iv) Waiting time:- Waiting time is the sum of the periods spent waiting in the ready queue.
(v) Response time:- The time from the submission of a request until the first response is produced is called response time.

* It is desirable to maximize CPU utilization and throughput and to minimize turnaround time, waiting time, and response time.
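Waiting time and turnaround time can be made concrete with a small calculation. The sketch below assumes that all processes arrive at time 0, that each has a single CPU burst, and that they run to completion in array order; the function names are illustrative.

```c
/* Average waiting time: each process waits for every burst scheduled before it. */
double avg_waiting_time(const int burst[], int n)
{
    long elapsed = 0, total_wait = 0;
    for (int i = 0; i < n; i++) {
        total_wait += elapsed;   /* time process i spent in the ready queue */
        elapsed += burst[i];     /* process i then runs to completion */
    }
    return n > 0 ? (double)total_wait / n : 0.0;
}

/* Average turnaround time: submission (time 0) to completion. */
double avg_turnaround_time(const int burst[], int n)
{
    long elapsed = 0, total = 0;
    for (int i = 0; i < n; i++) {
        elapsed += burst[i];
        total += elapsed;        /* completion time of process i */
    }
    return n > 0 ? (double)total / n : 0.0;
}
```

For bursts of 24, 3, and 3 ms run in that order, the average waiting time is (0 + 24 + 27) / 3 = 17 ms; running the same bursts in the order 3, 3, 24 gives (0 + 3 + 6) / 3 = 3 ms, which shows how strongly the order of service affects this criterion.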



Scheduling Algorithms:-

(1) First-Come, First-Served Scheduling:- The process that requests the CPU first is allocated the CPU first. It is implemented using a FIFO queue.
The FCFS scheduling algorithm is non-preemptive.
The average waiting time in FCFS is often quite long.

(2) Shortest-Job-First Scheduling:- The algorithm associates with each process the length of the process's next CPU burst. When the CPU is available, it is assigned to the process that has the smallest next CPU burst.
If the next CPU bursts of two processes are the same, FCFS scheduling is used to break the tie.
The SJF algorithm is optimal, in that it gives the minimum average waiting time for a given set of processes.
The SJF algorithm cannot be implemented at the level of short-term CPU scheduling, because there is no way to know the length of the next CPU burst.
The SJF algorithm can be either preemptive or non-preemptive. Preemptive SJF scheduling is sometimes called Shortest-Remaining-Time-First scheduling.

(3) Priority Scheduling:- A priority is associated with each process, and the CPU is allocated to the process with the highest priority. Equal-priority processes are scheduled in FCFS order.
Priority scheduling can be either preemptive or non-preemptive.
A preemptive priority scheduling algorithm will preempt the CPU if the priority of the newly arrived process is higher than the priority of the currently running process.
A major problem with priority scheduling algorithms is indefinite blocking, or starvation.
A solution to the problem of indefinite blockage of low-priority processes is aging.

Aging:-

Aging is a technique of gradually increasing the priority of processes that wait in the system for a long time.
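One possible way to express aging in code is shown below. This is only a sketch under stated assumptions: a smaller number means a higher priority (0 is the highest), priority improves by one step for every `ticks_per_step` ticks spent waiting, and the function names are made up for illustration.

```c
#include <assert.h>

/* Effective priority after aging; smaller number = higher priority (assumption). */
int aged_priority(int base_priority, int waited_ticks, int ticks_per_step)
{
    assert(ticks_per_step > 0);
    int p = base_priority - waited_ticks / ticks_per_step;  /* one step per interval waited */
    return p < 0 ? 0 : p;                                   /* never better than the top level */
}

/* Picks the process with the best aged priority; ties go to the lower index,
   which stands in for FCFS order here. Returns -1 when n == 0. */
int pick_next(const int base[], const int waited[], int n, int ticks_per_step)
{
    int best = -1, best_prio = 0;
    for (int i = 0; i < n; i++) {
        int p = aged_priority(base[i], waited[i], ticks_per_step);
        if (best < 0 || p < best_prio) {
            best = i;
            best_prio = p;
        }
    }
    return best;
}
```

With this rule a process that starts at a low priority eventually reaches the top level if it waits long enough, so it cannot be starved indefinitely by a stream of higher-priority arrivals.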



(4) Round-Robin Scheduling:- The round-robin scheduling algorithm is designed especially for time-sharing systems.
It is similar to FCFS scheduling, but preemption is added to switch between processes.
The ready queue is treated as a circular queue. The CPU scheduler goes around the ready queue, allocating the CPU to each process for a time interval of up to 1 time quantum.

(5) Multilevel Queue Scheduling:- A multilevel queue scheduling algorithm partitions the ready queue into several separate queues.
The processes are permanently assigned to one queue, generally based on some property of the process, such as memory size, process priority, or process type.
Each queue has its own scheduling algorithm. For example, separate queues might be used for foreground and background processes, and these processes might have separate scheduling requirements.
In addition, there must be scheduling among the queues, which is commonly implemented as fixed-priority preemptive scheduling.

[Figure: multilevel queue scheduling, with queues ordered from highest priority to lowest priority; batch processes form one of the queues.]

(6) Multilevel Feedback-Queue Scheduling:- The multilevel feedback-queue scheduling algorithm, in contrast, allows a process to move between queues.
The idea is to separate processes according to the characteristics of their CPU bursts. If a process uses too much CPU time, it will be moved to a lower-priority queue.
This scheme leaves I/O-bound and interactive processes in the higher-priority queues. In addition, a process that waits too long in a lower-priority queue may be moved to a higher-priority queue. This form of aging prevents starvation.

For example, consider three queues numbered 0 to 2. A process entering the ready queue is put in queue 0. A process in queue 0 is given a time quantum of 8 ms. If it does not finish within this time, it is moved to the tail of queue 1. If queue 0 is empty, the process at the head of queue 1 is given a quantum of 16 ms. If it does not complete, it is preempted and is put in queue 2. Processes in queue 2 are run on an FCFS basis but are run only when queues 0 and 1 are empty.
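The demotion rule in this three-queue example can be sketched as a single function. The quanta of 8 ms and 16 ms come from the example; the function name and the simplification to one uninterrupted CPU burst per process are assumptions of this sketch.

```c
#define Q0_QUANTUM 8    /* ms, quantum of queue 0 in the example */
#define Q1_QUANTUM 16   /* ms, quantum of queue 1 in the example */

/* Returns the queue (0, 1, or 2) in which a CPU burst of `burst_ms`
   finally completes under the three-queue scheme described above. */
int finishing_queue(int burst_ms)
{
    if (burst_ms <= Q0_QUANTUM)
        return 0;                         /* finishes within its first quantum */
    if (burst_ms <= Q0_QUANTUM + Q1_QUANTUM)
        return 1;                         /* needs the 16 ms quantum as well */
    return 2;                             /* demoted to the FCFS queue */
}
```

Short, interactive bursts therefore stay in queue 0, while any burst longer than 24 ms ends up in the lowest-priority queue.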



Multiple-Processor Scheduling:- There are two approaches to multiple-processor scheduling:-

(1) Asymmetric Multiprocessing:- All scheduling decisions, I/O processing, and other system activities are handled by a single processor, the master server.
(2) Symmetric Multiprocessing (SMP):- Here each processor is self-scheduling.

Processor Affinity:-

When a process migrates from one processor to another, the contents of the cache memory must be invalidated for the processor being migrated from, and the cache must be repopulated for the processor being migrated to.

Because of the high cost of invalidating and repopulating caches, most SMP systems try to avoid migration of processes and instead attempt to keep a process running on the same processor.

This is known as processor affinity.

Load Balancing:-

Load balancing attempts to keep the workload evenly distributed across all processors in an SMP system. Load balancing is typically only necessary on systems where each processor has its own private queue of eligible processes to execute.

There are two approaches to load balancing:-

(1) Push Migration:- A specific task periodically checks the load on each processor and, if it finds an imbalance, evenly distributes the load by moving processes from overloaded to idle or less busy processors.
(2) Pull Migration:- It occurs when an idle processor pulls a waiting task from a busy processor.

* Push and pull migration need not be mutually exclusive and are often implemented in parallel.



Deadlock:- A set of processes is in a deadlock state when every process in the set is waiting for an event that can be caused only by another process in the set.

Necessary Conditions for Deadlock:-

A deadlock situation can arise if the following four conditions hold simultaneously in a system:-

(1) Mutual Exclusion:- At least one resource must be held in a non-sharable mode; that is, only one process at a time can use the resource.
(2) Hold and Wait:- A process must be holding at least one resource and waiting to acquire additional resources that are currently being held by other processes.
(3) No Preemption:- Resources cannot be preempted; that is, a resource can be released only voluntarily by the process holding it, after that process has completed its task.
(4) Circular Wait:- A set {P0, P1, ..., Pn} of waiting processes must exist such that P0 is waiting for a resource held by P1, P1 is waiting for a resource held by P2, ..., Pn-1 is waiting for a resource held by Pn, and Pn is waiting for a resource held by P0.

These four conditions are not completely independent; for example, circular wait implies hold and wait.


Resource-Allocation Graph:-

- Used for describing deadlocks precisely.
- This graph consists of a set of vertices and a set of edges. The set of vertices is partitioned into two different types of nodes:-
(i) P = {P1, P2, ..., Pn}:- The set consisting of all the active processes in the system.
(ii) R = {R1, R2, ..., Rm}:- The set consisting of all resource types in the system.

Request Edge:-

A directed edge from process Pi to resource type Rj, denoted by Pi -> Rj, is called a request edge. It signifies that process Pi has requested an instance of resource type Rj and is currently waiting for that resource.

Assignment Edge:-

A directed edge from resource type Rj to process Pi, denoted by Rj -> Pi, is called an assignment edge. It signifies that an instance of resource type Rj has been allocated to process Pi.

* If the graph contains no cycles, then no process in the system is deadlocked.
* If the graph does contain a cycle, then a deadlock may exist.
* If the cycle involves only a set of resource types, each of which has only a single instance, then a deadlock has occurred.
* If each resource type has several instances, then a cycle does not necessarily imply that a deadlock has occurred.
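Checking a resource-allocation graph for a cycle can be done with a depth-first search. The sketch below is one standard way to do it, not a method given in these notes; it treats processes and resource types alike as numbered nodes in an adjacency matrix, and `MAX_NODES`, `dfs`, and `has_cycle` are illustrative names.

```c
#include <stdbool.h>

#define MAX_NODES 16   /* assumed upper bound on processes + resource types */

/* adj[u][v] is true when the graph has an edge u -> v
   (a request edge Pi -> Rj or an assignment edge Rj -> Pi). */
static bool dfs(int u, int n, bool adj[][MAX_NODES], int color[])
{
    color[u] = 1;                               /* u is on the current DFS path */
    for (int v = 0; v < n; v++) {
        if (!adj[u][v])
            continue;
        if (color[v] == 1)
            return true;                        /* edge back into the path: cycle */
        if (color[v] == 0 && dfs(v, n, adj, color))
            return true;
    }
    color[u] = 2;                               /* u fully explored, no cycle through it */
    return false;
}

/* Returns true if the directed graph on nodes 0..n-1 contains a cycle. */
bool has_cycle(int n, bool adj[][MAX_NODES])
{
    int color[MAX_NODES] = {0};                 /* 0 = unvisited */
    for (int u = 0; u < n; u++)
        if (color[u] == 0 && dfs(u, n, adj, color))
            return true;
    return false;
}
```

When every resource type has a single instance, a `true` result means a deadlock exists; with multiple instances per type it only means a deadlock is possible.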


Methods for Handling Deadlocks:-

There are three ways to deal with the deadlock problem:-

(i) We can use a protocol to prevent or avoid deadlocks, ensuring that the system will never enter a deadlock state.
(ii) We can allow the system to enter a deadlock state, detect it, and recover.
(iii) We can ignore the problem altogether and pretend that deadlocks never occur in the system.

Deadlock Prevention:-

By ensuring that at least one of these conditions cannot hold, we can prevent the occurrence of a deadlock:-
(i) Mutual exclusion
(ii) Hold and wait
(iii) No preemption
(iv) Circular wait

The drawback of this method is low device utilization and reduced system throughput.

Deadlock Avoidance:-

A deadlock-avoidance algorithm dynamically examines the resource-allocation state to ensure that a circular-wait condition can never exist.
The resource-allocation state is defined by the number of available and allocated resources and the maximum demands of the processes.
There are two algorithms for deadlock avoidance:-
(i) Resource-allocation-graph algorithm
(ii) Banker's algorithm

Safe State:-

A state is safe if the system can allocate resources to each process in some order and still avoid a deadlock. More formally, a system is in a safe state only if there exists a safe sequence.

A sequence of processes <P1, P2, ..., Pn> is a safe sequence for the current allocation state if, for each Pi, the resource requests that Pi can still make can be satisfied by the currently available resources plus the resources held by all Pj, with j < i.

* A safe state is not a deadlocked state. Conversely, a deadlocked state is an unsafe state.
* Not all unsafe states are deadlocks; however, an unsafe state may lead to a deadlock.



Initially, the system is in a safe state. Whenever a process requests a resource that is currently available, the system must decide whether the resource can be allocated immediately or whether the process must wait. The request is granted only if the allocation leaves the system in a safe state.

Resource-Allocation-Graph Algorithm:- This algorithm is used when each resource type has only one instance.

Here we use a new type of edge, called a claim edge.

A claim edge Pi -> Rj indicates that process Pi may request resource Rj at some time in the future. This edge resembles a request edge in direction but is represented in the graph by a dashed line.

When process Pi requests resource Rj, the claim edge Pi -> Rj is converted to a request edge.

Similarly, when a resource Rj is released by Pi, the assignment edge Rj -> Pi is reconverted to a claim edge Pi -> Rj.

Suppose that process Pi requests resource Rj. The request can be granted only if converting the request edge Pi -> Rj to an assignment edge Rj -> Pi does not result in the formation of a cycle in the resource-allocation graph.

Banker's Algorithm:-

This algorithm is used when a resource type may have multiple instances.

When a new process enters the system, it must declare the maximum number of instances of each resource type that it may need. This number may not exceed the total number of resources in the system.

When a user requests a set of resources, the system must determine whether the allocation of these resources will leave the system in a safe state. If it will, the resources are allocated; otherwise, the process must wait until some other process releases enough resources.

Let n be the number of processes in the system and m be the number of resource types. We need the following data structures:-

(i) Available:- A vector of length m indicates the number of available resources of each type. If Available[j] = k, there are k instances of resource type Rj available.
(ii) Max:- An n x m matrix defines the maximum demand of each process. If Max[i][j] = k, then process Pi may request at most k instances of resource type Rj.
(iii) Allocation:- An n x m matrix defines the number of resources of each type currently allocated to each process. If Allocation[i][j] = k, then process Pi is currently allocated k instances of resource type Rj.
(iv) Need:- An n x m matrix indicates the remaining resource need of each process. If Need[i][j] = k, then process Pi may need k more instances of resource type Rj to complete its task.

Need[i][j] = Max[i][j] - Allocation[i][j]

Example:- Consider a system with five processes P0 through P4 and three resource types A, B, and C. Resource type A has 10 instances, B has 5 instances, and C has 7 instances. Suppose that at time T0, the following snapshot of the system has been taken:-

        Allocation     Max        Available
        A  B  C      A  B  C      A  B  C
P0      0  1  0      7  5  3      3  3  2
P1      2  0  0      3  2  2
P2      3  0  2      9  0  2
P3      2  1  1      2  2  2
P4      0  0  2      4  3  3

Now check:-
(i) Whether the system is in a safe state or not?
(ii) Whether request (1,0,2) by P1 can be immediately granted?
(iii) Whether request (3,3,0) by P4 can be immediately granted?

Solution:- The contents of the matrix Need (Need = Max - Allocation) are:-

        Need
        A  B  C
P0      7  4  3
P1      1  2  2
P2      6  0  0
P3      0  1  1
P4      4  3  1

(i) Since the sequence <P1, P3, P4, P2, P0> satisfies the safety criteria, the system is currently in a safe state.

(ii) Request (1,0,2) by P1

To decide whether this request can be immediately granted, we first check that Request <= Available, that is, that (1,0,2) <= (3,3,2), which is true. We then pretend that this request has been fulfilled, and we arrive at the following new state:-

        Allocation     Need       Available
        A  B  C      A  B  C      A  B  C
P0      0  1  0      7  4  3      2  3  0
P1      3  0  2      0  2  0
P2      3  0  2      6  0  0
P3      2  1  1      0  1  1
P4      0  0  2      4  3  1

We must determine whether this new state is safe. The sequence <P1, P3, P4, P0, P2> satisfies the safety criteria, so the new state is safe. Hence, we can immediately grant the request of process P1.

(iii) Request (3,3,0) by P4

Taking the snapshot at time T0 again, Request <= Available holds, since (3,3,0) <= (3,3,2).

Now suppose this request has been fulfilled and we arrive at the following new state:-

        Allocation     Need       Available
        A  B  C      A  B  C      A  B  C
P0      0  1  0      7  4  3      0  0  2
P1      2  0  0      1  2  2
P2      3  0  2      6  0  0
P3      2  1  1      0  1  1
P4      3  3  2      1  0  1

With Available = (0,0,2), the remaining Need of no process can be satisfied, so no safe sequence exists. Since this new state is not safe, we cannot grant this request of process P4.
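The checks in this example can be reproduced with a compact sketch of the safety test and the request check. It follows the data structures Available, Allocation, and Need defined above; the fixed sizes `N` and `M`, the function names, and the roll-back on an unsafe request are choices made for this sketch.

```c
#include <stdbool.h>
#include <string.h>

#define N 5   /* number of processes in the example */
#define M 3   /* number of resource types in the example */

/* Safety test: repeatedly find an unfinished process whose Need fits in
   Work, let it finish, and add its Allocation back to Work. */
bool is_safe(const int avail[M], int alloc[N][M], int need[N][M])
{
    int work[M];
    bool finished[N] = {false};
    memcpy(work, avail, sizeof work);

    for (int done = 0; done < N; ) {
        bool progressed = false;
        for (int i = 0; i < N; i++) {
            if (finished[i])
                continue;
            bool fits = true;
            for (int j = 0; j < M; j++)
                if (need[i][j] > work[j]) { fits = false; break; }
            if (!fits)
                continue;
            for (int j = 0; j < M; j++)
                work[j] += alloc[i][j];        /* Pi finishes and releases everything */
            finished[i] = true;
            progressed = true;
            done++;
        }
        if (!progressed)
            return false;                      /* nobody can finish: unsafe */
    }
    return true;
}

/* Grants the request (updating the state) only if the result is safe;
   otherwise leaves the state unchanged and returns false. */
bool request_resources(int pid, const int req[M],
                       int avail[M], int alloc[N][M], int need[N][M])
{
    for (int j = 0; j < M; j++)
        if (req[j] > need[pid][j] || req[j] > avail[j])
            return false;                      /* exceeds claim, or must wait */
    for (int j = 0; j < M; j++) {              /* pretend to allocate */
        avail[j] -= req[j];
        alloc[pid][j] += req[j];
        need[pid][j] -= req[j];
    }
    if (is_safe(avail, alloc, need))
        return true;
    for (int j = 0; j < M; j++) {              /* unsafe: roll back */
        avail[j] += req[j];
        alloc[pid][j] -= req[j];
        need[pid][j] += req[j];
    }
    return false;
}
```

Called on the T0 snapshot, `is_safe` reports a safe state, the request (3,3,0) by P4 is refused, and the request (1,0,2) by P1 is granted, leaving Available at (2,3,0).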

Recovery from Deadlock:- When a deadlock detection algorithm determines that a deadlock exists, several alternatives are available:-

(1) Process Termination:- Two methods are available:-
(a) Abort all deadlocked processes.
(b) Abort one process at a time until the deadlock cycle is eliminated.

(2) Resource Preemption:- To eliminate deadlocks using resource preemption, we successively preempt some resources from processes and give these resources to other processes until the deadlock cycle is broken.

In this method, three issues need to be addressed:-
(1) Selecting a victim:- Which resources and which processes are to be preempted?
(2) Rollback:- If we preempt a resource from a process, we must roll back the process to some safe state and restart it from that state.
(3) Starvation:- How can we guarantee that resources will not always be preempted from the same process?



Interprocess Communication & Synchronization

Interprocess Communication:-

Processes executing concurrently in the operating system may be either independent processes or cooperating processes.

A process is independent if it cannot affect or be affected by the other processes executing in the system. Any process that does not share data with any other process is independent.

A process is cooperating if it can affect or be affected by the other processes executing in the system. Clearly, any process that shares data with other processes is a cooperating process.

There are several reasons for providing an environment that allows process cooperation:-

(i) Information Sharing
(ii) Computation speedup:- If we want a particular task to run faster, we must break it into subtasks, each of which will be executing in parallel with the others.
(iii) Modularity:- We may want to construct the system in a modular fashion, dividing the system functions into separate processes or threads.
(iv) Convenience:- Even an individual user may work on many tasks at the same time.



Cooperating processes require an interprocess communication (IPC) mechanism that will allow them to exchange data and information.

There are two fundamental models of interprocess communication:-
(1) Shared Memory
(2) Message Passing

Shared Memory Model:-

In this model, a region of memory that is shared by cooperating processes is established.

Typically, a shared-memory region resides in the address space of the process creating the shared-memory segment. Other processes that wish to communicate using this shared-memory segment must attach it to their address space.

The processes are also responsible for ensuring that they are not writing to the same location simultaneously.

Message-Passing System:-

Message passing provides a mechanism to allow processes to communicate and to synchronize their actions without sharing the same address space, and is particularly useful in a distributed environment where the communicating processes may reside on different computers connected by a network.

A message-passing facility provides at least two operations:-
(i) send (message)
(ii) receive (message)

The communication between different processes in a message-passing system can be of the following types:-
(a) Synchronous or Asynchronous Communication
(b) Direct or Indirect Communication

Synchronous Communication:-

In synchronous communication, processes synchronize at every message. Both send and receive are blocking operations.

Asynchronous Communication:-

The send operation is almost always non-blocking. The receive operation, however, can be blocking or non-blocking.



Blocking Send:-

The sending process is blocked until the message is received by the receiving process.

Non-blocking Send:-

The sending process sends the message and resumes operation.

Blocking Receive:-

The receiver blocks until a message is available.

Non-blocking Receive:-

The receiver retrieves either a valid message or a null.

Direct Communication:-

Each process must explicitly name the receiver or sender of a message (symmetric addressing).

- send (P, message): Send a message to process P.
- receive (Q, message): Receive a message from process Q.

In a client-server system, the server does not have to know the name of a specific client in order to receive a message. In this case, a variant of the receive operation can be used (asymmetric addressing).

- receive (ID, message): Receive a pending message from any process. When a message arrives, ID is set to the name of the sender.

In direct communication, the interconnection between the sender and receiver has the following characteristics:-
(i) A link is established automatically, but the processes need to know each other's identity.
(ii) A unique link is associated with the two processes.
(iii) The link is usually bi-directional, but it can be unidirectional.

Indirect Communication:- With indirect communication, the messages are sent to and received from mailboxes, or ports.

A mailbox can be viewed abstractly as an object into which messages can be placed by processes and from which messages can be removed.

The send ( ) and receive ( ) primitives are defined as follows:-

- send (A, message): Send a message to mailbox A.
- receive (A, message): Receive a message from mailbox A.

Generally, a mailbox is associated with many senders and receivers.

In some systems, only one receiver is associated with a particular mailbox; such a mailbox is often called a port.
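A mailbox with non-blocking send and receive can be sketched as a small fixed-size queue. This is a single-threaded illustration of the interface only: the type `mailbox`, the functions `mbox_send` and `mbox_receive`, and the capacity are all assumptions of the sketch, and a real mailbox shared between processes would also need synchronization.

```c
#include <stdbool.h>
#include <stddef.h>

#define MBOX_CAPACITY 4   /* assumed small fixed capacity */

typedef struct {
    const char *slots[MBOX_CAPACITY];   /* pending messages, oldest at `head` */
    int head;                           /* index of the oldest message */
    int count;                          /* number of pending messages */
} mailbox;

/* Non-blocking send: returns false instead of waiting when the mailbox is full. */
bool mbox_send(mailbox *mb, const char *msg)
{
    if (mb->count == MBOX_CAPACITY)
        return false;
    mb->slots[(mb->head + mb->count) % MBOX_CAPACITY] = msg;
    mb->count++;
    return true;
}

/* Non-blocking receive: returns the oldest message, or NULL if none is pending. */
const char *mbox_receive(mailbox *mb)
{
    if (mb->count == 0)
        return NULL;
    const char *msg = mb->slots[mb->head];
    mb->head = (mb->head + 1) % MBOX_CAPACITY;
    mb->count--;
    return msg;
}
```

Neither sender nor receiver names the other here; they only need to share the same `mailbox`, which is what distinguishes indirect from direct communication.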



In indirect communication, the interconnection between the sender and receiver has the following characteristics:-
(i) A link is established between two processes only if they share a mailbox.
(ii) A link may be associated with more than two processes.
(iii) Communicating processes may have different links between them, each corresponding to one mailbox.

Producer-Consumer Problem:-

A producer process produces information that is consumed by a consumer process.

One solution to the producer-consumer problem uses shared memory. To allow producer and consumer processes to run concurrently, we must have available a buffer of items that can be filled by the producer and emptied by the consumer.

The producer and consumer must be synchronized, so that the consumer does not try to consume an item that has not yet been produced.

Two types of buffers can be used:-

(i) Unbounded Buffer:- It places no practical limit on the size of the buffer. The producer can always produce new items, but the consumer may have to wait for new items.
(ii) Bounded Buffer:- It assumes a fixed buffer size. In this case, the consumer must wait if the buffer is empty, and the producer must wait if the buffer is full.



Solution of Producer - Consumer problem using bounded buffer:-

The following variables reside in a region of memory shared by the producer and consumer:-

    #define BUFFER_SIZE 10
    typedef struct {
        ...
    } item;
    item buffer[BUFFER_SIZE];
    int in = 0;
    int out = 0;

The shared buffer is implemented as a circular array with two logical pointers:- in and out.

The variable in points to the next free position in the buffer; out points to the first full position in the buffer.

The buffer is empty when in == out. It is full when ((in + 1) % BUFFER_SIZE) == out.

Code for producer:-

    item nextProduced;
    while (true) {
        /* produce an item in nextProduced */
        while (((in + 1) % BUFFER_SIZE) == out)
            ; /* do nothing */
        buffer[in] = nextProduced;
        in = (in + 1) % BUFFER_SIZE;
    }

Code for consumer:-

    item nextConsumed;
    while (true) {
        while (in == out)
            ; /* do nothing */
        nextConsumed = buffer[out];
        out = (out + 1) % BUFFER_SIZE;
        /* consume the item in nextConsumed */
    }

The producer process has a local variable nextProduced in which the new item to be produced is stored.

The consumer process has a local variable nextConsumed in which the item to be consumed is stored.

This scheme allows at most BUFFER_SIZE - 1 items in the buffer at the same time.
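
The BUFFER_SIZE - 1 limit can be checked with a small single-threaded sketch of the same circular-array logic. The helper names buffer_put, buffer_get and buffer_count, and the choice of int as the item type, are assumptions made here for illustration; the busy-wait loops are replaced by return codes so that the sketch runs without a second process.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BUFFER_SIZE 10

typedef int item;                 /* assumption: items are plain ints */

static item buffer[BUFFER_SIZE];
static int in = 0;                /* next free position */
static int out = 0;               /* first full position */

/* Returns false instead of spinning when the buffer is full. */
bool buffer_put(item next_produced) {
    if (((in + 1) % BUFFER_SIZE) == out)
        return false;             /* full: one slot always stays unused */
    buffer[in] = next_produced;
    in = (in + 1) % BUFFER_SIZE;
    return true;
}

/* Returns false instead of spinning when the buffer is empty. */
bool buffer_get(item *next_consumed) {
    assert(next_consumed != NULL);
    if (in == out)
        return false;             /* empty */
    *next_consumed = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    return true;
}

/* Number of items currently stored in the circular array. */
int buffer_count(void) {
    return (in - out + BUFFER_SIZE) % BUFFER_SIZE;
}
```

Calling buffer_put repeatedly on an empty buffer succeeds BUFFER_SIZE - 1 times and then fails, because the state in == out is reserved to mean "empty" and cannot also mean "full".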

Suppose we want to modify the algorithm to remedy this deficiency.

One possibility is to add an integer variable counter, initialized to 0. counter is incremented every time we add a new item to the buffer and is decremented every time we remove one item from the buffer.

Modified code for producer:-

    while (true) {
        /* produce an item in nextProduced */
        while (counter == BUFFER_SIZE)
            ; /* do nothing */
        buffer[in] = nextProduced;
        in = (in + 1) % BUFFER_SIZE;
        counter++;
    }

Modified code for consumer:-

    while (true) {
        while (counter == 0)
            ; /* do nothing */
        nextConsumed = buffer[out];
        out = (out + 1) % BUFFER_SIZE;
        counter--;
        /* consume the item in nextConsumed */
    }

Although both the producer and consumer routines are correct separately, they may not function correctly when executed concurrently.

Suppose that the value of the variable counter is currently 5 and that the producer and consumer processes execute the statements "counter++" and "counter--" concurrently.

Following the execution of these two statements, the value of the variable counter may be 4, 5, or 6.

The only correct result, though, is counter == 5, which is generated correctly if the producer and consumer execute separately.
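
The three possible outcomes arise because counter++ and counter-- are each typically carried out as a load, an arithmetic step, and a store. The following is a deterministic sketch under that assumption; run_schedule and the 'P' / 'C' schedule letters are hypothetical names introduced here, and no real threads are used.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Replays one interleaving of the producer's "counter++" and the
 * consumer's "counter--", each modelled as load / modify / store.
 * schedule is a string such as "PPCCPC": 'P' runs the producer's
 * next step, 'C' runs the consumer's next step.
 */
int run_schedule(int counter, const char *schedule) {
    int reg_p = 0, reg_c = 0;     /* private registers of each process */
    int step_p = 0, step_c = 0;   /* index of each process's next step */

    assert(schedule != NULL);
    for (size_t k = 0; schedule[k] != '\0'; k++) {
        if (schedule[k] == 'P') {
            if (step_p == 0) reg_p = counter;          /* load   */
            else if (step_p == 1) reg_p = reg_p + 1;   /* modify */
            else if (step_p == 2) counter = reg_p;     /* store  */
            step_p++;
        } else if (schedule[k] == 'C') {
            if (step_c == 0) reg_c = counter;          /* load   */
            else if (step_c == 1) reg_c = reg_c - 1;   /* modify */
            else if (step_c == 2) counter = reg_c;     /* store  */
            step_c++;
        }
    }
    return counter;
}
```

Starting from 5, the schedule "PPPCCC" (producer finishes first) yields 5, "PPCCPC" yields 4 because the consumer's store lands last, and "PPCCCP" yields 6 because the producer's store lands last.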

Race Condition:-

A situation where several processes access and manipulate the same data concurrently, and the outcome of the execution depends on the particular order in which the access takes place, is called a race condition.

Note:- To guard against the race condition above, we need to ensure that only one process at a time can be manipulating the variable counter. To make such a guarantee, we require that the processes be synchronized in some way.



The Critical-Section Problem:-

Consider a system consisting of n processes {P0, P1, ..., Pn-1}. Each process has a segment of code, called a critical section, in which the process may be changing common variables, updating a table, writing a file, and so on.

The important feature of the system is that, when one process is executing in its critical section, no other process is to be allowed to execute in its critical section.

The critical-section problem is to design a protocol that the processes can use to cooperate.

Each process must request permission to enter its critical section. The code that makes this request is the entry section, and the critical section may be followed by an exit section.

    do {
        entry section
            critical section
        exit section
            remainder section
    } while (true);

A solution to the critical-section problem must satisfy the following three requirements:-

(1) Mutual Exclusion:- If process Pi is executing in its critical section, then no other processes can be executing in their critical sections.
(2) Progress:- If no process is executing in its critical section and some processes wish to enter their critical sections, then only those processes that are not executing in their remainder sections can participate in the decision on which will enter its critical section next, and this selection cannot be postponed indefinitely.
(3) Bounded Waiting:- There exists a bound, or limit, on the number of times that other processes are allowed to enter their critical sections after a process has made a request to enter its critical section and before that request is granted.

Two general approaches are used to handle critical sections in operating systems:-

(i) Preemptive Kernels:- Allows a process to be preempted while it is running in kernel mode.
(ii) Non-preemptive Kernels:- Does not allow a process running in kernel mode to be preempted. It will run until it exits kernel mode, blocks, or voluntarily yields control of the CPU.

A non-preemptive kernel is essentially free from race conditions on kernel data structures, as only one process is active in the kernel at a time.



Peterson's Solution:-

Peterson's solution is restricted to two processes that alternate execution between their critical sections and remainder sections. The processes are numbered Pi and Pj, where j = 1 - i.

Peterson's solution requires two data items to be shared between the two processes:-

    int turn;
    boolean flag[2];

The variable turn indicates whose turn it is to enter its critical section. The flag array is used to indicate that a process is ready to enter its critical section.

The structure of process Pi in Peterson's Solution:-

    do {
        flag[i] = true;                  /* entry section */
        turn = j;
        while (flag[j] && turn == j)
            ; /* do nothing */
            critical section
        flag[i] = false;                 /* exit section */
            remainder section
    } while (true);

To enter the critical section, process Pi first sets flag[i] to be true and then sets turn to the value j, thereby asserting that if the other process wishes to enter the critical section, it can do so.

If both processes try to enter at the same time, turn will be set to both i and j at roughly the same time. Only one of these assignments will last; the other will occur but will be overwritten immediately. The eventual value of turn decides which of the two processes is allowed to enter its critical section first.

Note:- Peterson's solution is a software-based solution.



Synchronization Hardware:-

In general, we can state that any solution to the critical-section problem requires a simple tool - a lock. Race conditions are prevented by requiring that critical regions be protected by locks.

    do {
        acquire lock
            critical section
        release lock
            remainder section
    } while (true);

The critical-section problem could be solved simply in a uniprocessor environment if we could prevent interrupts from occurring while a shared variable was being modified. In this manner, we would be sure that the current sequence of instructions would be allowed to execute in order without preemption. This is the approach taken by non-preemptive kernels.

Unfortunately, this solution is not as feasible in a multiprocessor environment. Disabling interrupts on a multiprocessor can be time consuming.

Many modern computer systems therefore provide special hardware instructions that allow us either to test and modify the content of a word or to swap the contents of two words atomically.

(i) TestAndSet() instruction:- The important characteristic of this instruction is that it is executed atomically. Thus, if two TestAndSet() instructions are executed simultaneously (each on a different CPU), they will be executed sequentially in some arbitrary order.

    boolean TestAndSet(boolean *lock) {
        boolean rv = *lock;
        *lock = true;
        return rv;
    }

If the machine supports the TestAndSet() instruction, then we can implement mutual exclusion by declaring a boolean variable lock, initialized to false.

    do {
        while (TestAndSet(&lock))
            ; /* do nothing */
            critical section
        lock = false;
            remainder section
    } while (true);


(ii) Swap() instruction:- The Swap() instruction operates on the contents of two words. Like the TestAndSet() instruction, it is executed atomically.

    void Swap(boolean *lock, boolean *key) {
        boolean temp = *lock;
        *lock = *key;
        *key = temp;
    }

If the machine supports the Swap() instruction, then mutual exclusion can be provided as follows:-

    do {
        key = true;
        while (key == true)
            Swap(&lock, &key);
            critical section
        lock = false;
            remainder section
    } while (true);

A global boolean variable lock is declared and is initialized to false. Each process has a local boolean variable key.
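
On current compilers, the two lock loops above can be approximated with the C11 <stdatomic.h> operations atomic_flag_test_and_set and atomic_exchange. This is only a sketch: the function names (tas_try_acquire, swap_try_acquire, and so on) are assumptions introduced here, each function performs a single attempt rather than spinning, and atomic_exchange stands in for Swap() by storing true into the lock and handing back the old value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag tas_lock = ATOMIC_FLAG_INIT;  /* clear means unlocked */
static atomic_bool swap_lock;                    /* zero-initialized: false means unlocked */

/* One pass of the TestAndSet() entry loop; true means the lock was acquired. */
bool tas_try_acquire(void) {
    /* atomic_flag_test_and_set sets the flag and returns its previous value. */
    return !atomic_flag_test_and_set(&tas_lock);
}

void tas_release(void) {
    atomic_flag_clear(&tas_lock);
}

/* One pass of the Swap() entry loop with key = true. */
bool swap_try_acquire(void) {
    bool key = true;
    key = atomic_exchange(&swap_lock, key);      /* old lock value ends up in key */
    return key == false;                         /* lock was free before the swap */
}

void swap_release(void) {
    atomic_store(&swap_lock, false);
}

/* Single-threaded self-check of the acquire / release sequence. */
void lock_self_check(void) {
    assert(tas_try_acquire());     /* first caller gets the lock */
    assert(!tas_try_acquire());    /* a second caller would keep spinning */
    tas_release();
    assert(swap_try_acquire());
    assert(!swap_try_acquire());
    swap_release();
}
```

A real entry section would wrap the try functions in a loop, for example while (!tas_try_acquire()) ; which is the busy-wait form shown in the notes.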

Bounded-waiting mutual exclusion with TestAndSet() instruction:-

Although the previous algorithms satisfy the mutual-exclusion requirement, they do not satisfy the bounded-waiting requirement.

For the algorithm using the TestAndSet() instruction that satisfies all the critical-section requirements, the common data structures are:-

    boolean waiting[n];
    boolean lock;

These data structures are initialized to false.

    do {
        waiting[i] = true;
        key = true;
        while (waiting[i] && key)
            key = TestAndSet(&lock);
        waiting[i] = false;
            critical section
        j = (i + 1) % n;
        while ((j != i) && !waiting[j])
            j = (j + 1) % n;
        if (j == i)
            lock = false;
        else
            waiting[j] = false;
            remainder section
    } while (true);


Semaphores:-

The various hardware-based solutions to the critical-section problem are complicated for application programmers to use. To overcome this difficulty, we can use a synchronization tool called a semaphore.

A semaphore S is an integer variable that, apart from initialization, is accessed only through two standard atomic operations:- wait() and signal().

The wait() operation was originally termed P and signal() was originally called V.

The definition of wait() is as follows:-

    wait(S) {
        while (S <= 0)
            ; /* no operation */
        S--;
    }

The definition of signal() is as follows:-

    signal(S) {
        S++;
    }

Binary Semaphores:-

The value of a binary semaphore can range only between 0 and 1. On some systems, binary semaphores are known as mutex locks, as they are locks that provide mutual exclusion.

We can use a binary semaphore to deal with the critical-section problem for multiple processes. The n processes share a semaphore, mutex, initialized to 1.

    do {
        wait(mutex);
            critical section
        signal(mutex);
            remainder section
    } while (true);

Counting Semaphores:-

The value of a counting semaphore can range over an unrestricted domain.

Counting semaphores can be used to control access to a given resource consisting of a finite number of instances. The semaphore is initialized to the number of resources available. Each process that wishes to use a resource performs a wait() operation. When a process releases a resource, it performs a signal() operation.

When the count for the semaphore goes to 0, all resources are being used.


Semaphore Implementation:-

The main disadvantage of the semaphore definition given here is that it requires busy waiting. Busy waiting wastes CPU cycles that some other process might be able to use productively. This type of semaphore is also called a spinlock because the process "spins" while waiting for the lock.

To overcome the need for busy waiting, we can modify the definition of the wait() and signal() semaphore operations.

    typedef struct {
        int value;
        struct process *list;
    } semaphore;

In the wait() operation, the process can block itself instead of engaging in busy waiting. The block operation places a process into a waiting queue associated with the semaphore, and the state of the process is switched to the waiting state.

    wait(semaphore *S) {
        S->value--;
        if (S->value < 0) {
            add this process to S->list;
            block();
        }
    }

When some other process executes a signal() operation, the process is restarted by a wakeup() operation, which changes the process from the waiting state to the ready state. The process is then placed in the ready queue.

    signal(semaphore *S) {
        S->value++;
        if (S->value <= 0) {
            remove a process P from S->list;
            wakeup(P);
        }
    }

Note:- Although under the classical definition of semaphores with busy waiting the semaphore value is never negative, this implementation may have negative semaphore values. If the semaphore value is negative, its magnitude is the number of processes waiting on that semaphore.
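
The relationship between a negative value and the number of waiters can be traced with a single-threaded simulation of the blocking definition. This is a sketch, not a usable semaphore: process ids are plain integers, block() and wakeup() are replaced by return values, and the names sem_init_value, sem_wait_sim, sem_signal_sim and MAX_WAITERS are assumptions introduced here.

```c
#include <assert.h>

#define MAX_WAITERS 8             /* assumption: small fixed queue for the sketch */

typedef struct {
    int value;
    int list[MAX_WAITERS];        /* ids of blocked processes, in FIFO order */
    int head;                     /* index of the oldest waiter */
    int count;                    /* number of waiters in the list */
} semaphore;

void sem_init_value(semaphore *S, int initial) {
    S->value = initial;
    S->head = 0;
    S->count = 0;
}

/* Returns 1 if process pid may proceed, 0 if it was queued (blocked). */
int sem_wait_sim(semaphore *S, int pid) {
    S->value--;
    if (S->value < 0) {
        assert(S->count < MAX_WAITERS);
        S->list[(S->head + S->count) % MAX_WAITERS] = pid;
        S->count++;
        return 0;                 /* stands in for block() */
    }
    return 1;
}

/* Returns the pid that would be woken up, or -1 if nobody was waiting. */
int sem_signal_sim(semaphore *S) {
    S->value++;
    if (S->value <= 0) {
        int pid = S->list[S->head];
        S->head = (S->head + 1) % MAX_WAITERS;
        S->count--;
        return pid;               /* stands in for wakeup(P) */
    }
    return -1;
}
```

With the semaphore initialized to 1, three successive waits leave value at -2 with two processes queued; each later signal wakes the oldest waiter until value climbs back to 1. The FIFO queue is a design choice of this sketch; the notes do not fix a queueing order.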


The critical aspect of semaphores is that they be executed atomically. We must guarantee that no two processes can execute wait() and signal() operations on the same semaphore at the same time.

This is a critical-section problem, and in a single-processor environment, we can solve it by simply inhibiting interrupts during the time the wait() and signal() operations are executing.

In a multiprocessor environment, disabling interrupts on every processor can be a difficult task. Therefore SMP systems must provide alternative locking techniques - such as spinlocks - to ensure that wait() and signal() are performed atomically.

Note:- It is important to admit that we have not completely eliminated busy waiting with this definition of the wait() and signal() operations. Here we have limited the busy waiting to the critical sections of the wait() and signal() operations, and these sections are short.

Deadlocks and Starvation:-

The implementation of a semaphore with a waiting queue may result in a situation where two or more processes are waiting indefinitely for an event that can be caused only by one of the waiting processes. The event in question is the execution of a signal() operation.

When such a state is reached, these processes are said to be deadlocked.

        P0                P1
    wait(S);          wait(Q);
    wait(Q);          wait(S);
      ...               ...
    signal(S);        signal(Q);
    signal(Q);        signal(S);

Here S and Q are two semaphores initialized to 1. If P0 executes wait(S) and then P1 executes wait(Q), P0 blocks on wait(Q) until P1 signals Q, while P1 blocks on wait(S) until P0 signals S; neither signal can ever be executed.

Another problem related to deadlocks is indefinite blocking, or starvation, which may occur if we add and remove processes from the semaphore's waiting list in LIFO order.



File Systems

File System Structure:-

A file system is used to allow data to be stored, located, and retrieved easily.

The file system itself is generally composed of many different levels (listed from highest to lowest):-

    Application Programs
    Logical File System
    File-Organization Module
    Basic File System
    I/O Control

I/O Control:-

It consists of device drivers and interrupt handlers to transfer information between the main memory and the disk system.

Basic File System:-

It needs only to issue generic commands to the appropriate device driver to read and write physical blocks on the disk.

File-Organization Module:-

It knows about files and their logical blocks as well as physical blocks. It translates logical block addresses to physical block addresses.

Logical File System:-

It manages the metadata information. It manages the directory structure to provide the file-organization module with the information the latter needs, given a symbolic file name.

It maintains file structure via file-control blocks.

File Control Block (FCB):-

It contains information about the file, including ownership, permissions, and location of the file contents.

File systems used by various kinds of OS:-

(i) UNIX - UNIX File System (UFS), which is based on the Berkeley Fast File System (FFS).
(ii) Windows - FAT, FAT32, NTFS
(iii) Linux - Extended File System (ext2 and ext3)

Several on-disk and in-memory structures are used to implement a file system. These structures vary depending on the operating system and the file system, but general principles apply.

On disk, the file system may contain the following information:-

(i) Boot Control Block (per volume):- It contains information needed by the system to boot an OS from that volume. In UFS, it is called the boot block. In NTFS, it is called the partition boot sector.
(ii) Volume Control Block (per volume):- It contains volume details such as the number of blocks in the partition, size of the blocks, etc. In UFS, it is called a superblock. In NTFS, it is stored in the master file table.
(iii) Directory Structure (per file system):- It is used to organize the files. In UFS, this includes file names and associated inode numbers. In NTFS, it is stored in the master file table.
(iv) File Control Block - FCB (per file):- It contains details about the file, including file permissions, ownership, size, etc. In UFS, this is called the inode. In NTFS, this information is actually stored within the master file table.


The in-memory information is used for both file-system management and performance improvement via caching. The data are loaded at mount time and discarded at dismount.

The structures may include the ones described below:-

(i) In-memory mount table:- It contains information about each mounted volume.
(ii) In-memory directory-structure cache:- It holds the directory information of recently accessed directories.
(iii) System-wide open-file table:- It contains the FCB of each open file.
(iv) Per-process open-file table:- It contains a pointer to the appropriate entry in the system-wide open-file table.

Allocation Methods:-

Three major methods of allocating disk space are in wide use:-

(i) Contiguous Allocation:- It requires that each file occupy a set of contiguous blocks on the disk. It suffers from external fragmentation.

The IBM VM/CMS operating system uses contiguous allocation.

(ii) Linked Allocation:- With linked allocation, each file is a linked list of disk blocks; the disk blocks may be scattered anywhere on the disk. The directory contains a pointer to the first and last blocks of the file. There is no external fragmentation with linked allocation.

An important variation on linked allocation is the use of a file allocation table (FAT). This is used by the MS-DOS and OS/2 operating systems. A section of the disk at the beginning of each volume is set aside to contain the table. The table has one entry for each disk block and is indexed by block number. The directory entry contains the block number of the first block of the file. The table entry indexed by that block number contains the block number of the next block in the file. This chain continues until the last block, which has a special end-of-file value as its table entry. Unused blocks are indicated by a table value of 0.

(iii) Indexed Allocation:- Linked allocation solves the external-fragmentation and size-declaration problems of contiguous allocation. However, in the absence of a FAT, linked allocation cannot support efficient direct access, since the pointers to the blocks are scattered with the blocks themselves all over the disk and must be retrieved in order.

Indexed allocation solves this problem by bringing all the pointers together into one location: the index block.

Each file has its own index block, which is an array of disk-block addresses. The i-th entry in the index block points to the i-th block of the file. The directory contains the address of the index block.

Indexed allocation supports direct access without suffering from external fragmentation. However, it suffers from wasted space.

Since every file must have an index block, we want the index block to be as small as possible. If the index block is too small, it will not be able to hold enough pointers for a large file, and a mechanism will have to be available to deal with this issue.

Mechanisms for this purpose include the following:-

(a) Linked Scheme:- An index block is normally one disk block. Thus it can be read and written directly by itself. To allow for larger files, we can link together several index blocks.
(b) Multilevel Index:- Here we use a first-level index block to point to a set of second-level index blocks, which in turn point to the file blocks.
(c) Combined Scheme:- This scheme is used in the UNIX file system (UFS). In this scheme, we keep the first 15 pointers of the index block in the file's inode. The first 12 of these pointers point to direct blocks; that is, they contain addresses of blocks that contain data of the file. Thus, the data for small files do not need a separate index block. The next three pointers point to indirect blocks. The first points to a single indirect block. The second points to a double indirect block. The last pointer contains the address of a triple indirect block.


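Structure of UNIX File System:-

[Figure not recovered: the UNIX inode, with its 12 direct block pointers and its single, double, and triple indirect pointers.]

Note:- If the size of blocks is 1 KB and each address requires 4 B, then each index block holds 1 KB / 4 B = 256 addresses, and the total memory addressed by each entry of the inode table is:-

    12 KB + 256 KB + 256 * 256 KB + 256 * 256 * 256 KB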
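
The sum above can be evaluated for other block and address sizes with a short helper. The function name max_file_bytes is an assumption introduced here; the layout of 12 direct pointers plus one pointer each for single, double, and triple indirection follows the combined scheme described in these notes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes addressable through 12 direct pointers plus one single, one
 * double, and one triple indirect pointer.
 */
uint64_t max_file_bytes(uint64_t block_size, uint64_t address_size) {
    assert(address_size > 0 && block_size >= address_size);
    uint64_t ptrs = block_size / address_size;   /* addresses per index block */
    uint64_t blocks = 12                         /* direct blocks          */
                    + ptrs                       /* single indirect        */
                    + ptrs * ptrs                /* double indirect        */
                    + ptrs * ptrs * ptrs;        /* triple indirect        */
    return blocks * block_size;
}
```

For 1 KB blocks and 4-byte addresses, max_file_bytes(1024, 4) covers 16,843,020 blocks of 1 KB, which is a little over 16 GB; the triple indirect term dominates the total.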

Free Space Management:-

(1) Bit Map or Bit Vector:- The free-space list is implemented as a bit map or bit vector. Each block is represented by 1 bit. If the block is free, the bit is 1; if the block is allocated, the bit is 0.

Example:- Consider a disk where blocks 2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, and 27 are free and the rest of the blocks are allocated. The free-space bit map would be:-

    001111001111110001100000011100000...

(2) Linked List:- Link together all the free disk blocks, keeping a pointer to the first free block in a special location on the disk and caching it in memory.
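
A bit vector makes it cheap to find the first free block. The sketch below uses the same convention as the notes (1 = free, 0 = allocated); the function names and the choice to store block i in bit (i % 8) of byte (i / 8) are assumptions made here for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the number of the first free block, or -1 if none is free. */
int first_free_block(const unsigned char *bitmap, size_t nblocks) {
    assert(bitmap != NULL);
    for (size_t i = 0; i < nblocks; i++) {
        if (bitmap[i / 8] & (1u << (i % 8)))   /* bit set means block i is free */
            return (int)i;
    }
    return -1;
}

/* Sets the bit for a block that has been released. */
void mark_free(unsigned char *bitmap, size_t block) {
    bitmap[block / 8] |= (unsigned char)(1u << (block % 8));
}

/* Clears the bit for a block that has been handed out. */
void mark_allocated(unsigned char *bitmap, size_t block) {
    bitmap[block / 8] &= (unsigned char)~(1u << (block % 8));
}
```

With the free blocks of the example marked in a 32-block map, first_free_block returns 2; after blocks 2 to 5 are allocated it returns 8. A production allocator would scan a word at a time rather than bit by bit.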

Log-structured File Systems:-

In these file systems, log-based database recovery techniques are applied to file-system metadata updates. These file systems are also known as log-based transaction-oriented (or journaling) file systems.

These file systems help in keeping the file system consistent.



Secondary Storage Structure

Auxiliary Memory:-

Devices that provide backup storage are called auxiliary memory. The most common auxiliary memory devices are magnetic disks and tapes.

Magnetic Disks:-

Each disk platter has a flat circular shape, like a CD. We store information by recording it magnetically on the platters.

The read-write heads are attached to a disk arm that moves all the heads as a unit.

The surface of a platter is logically divided into circular tracks, which are subdivided into sectors. The set of tracks that are at one arm position makes up a cylinder.

Transfer Rate:-

The transfer rate is the rate at which data flow between the drive and the computer.

Seek Time:-

The time to move the disk arm to the desired cylinder.

Rotational Latency:-

The time for the desired sector to rotate to the disk head.

Transfer Time:-

Time required to transfer data to or from the device.

Access Time:-

The average time required to reach a storage location in memory and obtain its contents is called the access time.

A disk drive is attached to a computer by a set of wires called an I/O bus. The data transfers on a bus are carried out by special electronic processors called controllers:-

(i) Host Controller:- This is the controller at the computer end of the bus.
(ii) Disk Controller:- It is built into each disk drive.

Addressing in Magnetic Disks:-

Modern disk drives are addressed as large one-dimensional arrays of logical blocks.

The one-dimensional array of logical blocks is mapped onto the sectors of the disk sequentially.

Sector 0 is the first sector of the first track on the outermost cylinder. The mapping proceeds in order through that track, then through the rest of the tracks in that cylinder, and then through the rest of the cylinders from outermost to innermost.

Constant Linear Velocity (CLV):-

On media that use constant linear velocity (CLV), the density of bits per track is uniform. The drive increases its rotation speed as the head moves from the outer to the inner tracks to keep the same rate of data moving under the head.

This method is used in CD-ROM and DVD-ROM drives.

Constant Angular Velocity (CAV):-

The disk rotation speed remains constant, and the density of bits decreases from inner tracks to outer tracks to keep the data rate constant.

This method is used in hard disks.



Disk Scheduling:-

    Access time = Seek time + Rotational latency time

(i) FCFS Scheduling:- Requests are serviced in the order in which they arrive. The algorithm is intrinsically fair, but it generally does not provide the fastest service.
(ii) Shortest Seek Time First (SSTF) Scheduling:- The SSTF algorithm selects the request with the minimum seek time from the current head position. It may cause starvation of some requests. This algorithm is not optimal.
(iii) SCAN Scheduling:- The disk arm starts at one end of the disk and moves towards the other end, servicing requests as it reaches each cylinder. At the other end, the direction of head movement is reversed. It is sometimes called the elevator algorithm.
(iv) Circular SCAN (C-SCAN) Scheduling:- It is similar to SCAN scheduling; however, when the head reaches the other end, it immediately returns to the beginning of the disk, without servicing any request on the return trip.
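
The difference between FCFS and SSTF can be measured as total head movement in cylinders. The sketch below covers only those two policies; the function names, the MAX_REQUESTS bound, and the sample request queue in the usage note are assumptions introduced here, not data from these notes.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_REQUESTS 64           /* assumption: queue bound for the sketch */

/* Total cylinders travelled when requests are served in arrival order. */
int fcfs_head_movement(int head, const int *queue, int n) {
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += abs(queue[i] - head);
        head = queue[i];
    }
    return total;
}

/* Total cylinders travelled when the closest pending request is served next. */
int sstf_head_movement(int head, const int *queue, int n) {
    int done[MAX_REQUESTS] = {0};
    int total = 0;

    assert(n >= 0 && n <= MAX_REQUESTS);
    for (int served = 0; served < n; served++) {
        int best = -1;
        for (int i = 0; i < n; i++) {
            if (!done[i] && (best < 0 ||
                abs(queue[i] - head) < abs(queue[best] - head)))
                best = i;         /* closest unserved request so far */
        }
        total += abs(queue[best] - head);
        head = queue[best];
        done[best] = 1;
    }
    return total;
}
```

For a sample queue of cylinders 98, 183, 37, 122, 14, 124, 65, 67 with the head starting at cylinder 53, FCFS moves the head 640 cylinders in total and SSTF moves it 236 cylinders.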



Disk Formatting:-

A new magnetic disk is a blank slate: it is just a platter of a magnetic recording material.

(i) Physical Formatting:- Before a disk can store data, it must be divided into sectors that the disk controller can read and write. This process is called low-level formatting or physical formatting. It fills the disk with a special data structure for each sector.
(ii) Logical Formatting:- Here the operating system stores the initial file-system data structures onto the disk. These data structures may include maps of free and allocated space and an initial empty directory.

Raw Disk:-

Some operating systems give special programs the ability to use a disk partition as a large sequential array of logical blocks, without any file-system data structures. This array is sometimes called the raw disk, and I/O to this array is termed raw I/O.

Disk Block:-

A disk block is the unit of data transfer between disk and memory.

Block size must be a multiple of sector size. Block size cannot exceed the size of a track.



File Organization and Indexes

Spanned Records:-

If records can span more than one block on a disk, this organization is called spanned.

Whenever a record is larger than a block, we must use a spanned organization.

Unspanned Records:-

If records are not allowed to cross block boundaries, the organization is called unspanned. This is used with fixed-length records having B > R.

Blocking Factor:-

Suppose that the block size is B bytes. For a file of fixed-length records of size R bytes, with B > R, we can fit bfr = ⌊B/R⌋ records per block.

The value bfr, which is the maximum number of records that can fit in a disk block, is called the blocking factor.
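
The blocking factor and the quantities that follow from it are simple integer arithmetic. The helper names below, and the sample sizes in the usage note, are assumptions chosen for illustration.

```c
#include <assert.h>

/* bfr = floor(B / R) for unspanned fixed-length records. */
long blocking_factor(long block_size, long record_size) {
    assert(record_size > 0 && block_size >= record_size);
    return block_size / record_size;   /* integer division floors for positive values */
}

/* Bytes left unused in each block: B - (bfr * R). */
long unused_bytes_per_block(long block_size, long record_size) {
    return block_size - blocking_factor(block_size, record_size) * record_size;
}

/* Blocks needed to store `records` records: ceil(records / bfr). */
long blocks_needed(long records, long block_size, long record_size) {
    long bfr = blocking_factor(block_size, record_size);
    return (records + bfr - 1) / bfr;
}
```

With B = 512 bytes and R = 100 bytes, bfr is 5, each block wastes 12 bytes, and a file of 30,000 records occupies 6,000 blocks.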




Types of Organization of Files:-

There are three types of organization of files:-

(1) Files of Unordered Records (Heap Files):- The records are placed in the file in the order in which they are inserted, so new records are inserted at the end of the file. This organization is often used with additional access paths, such as secondary indexes.
(2) Files of Ordered Records (Sorted Files):- We can physically order the records of a file on disk based on the values of one of their fields, called the ordering field. If the ordering field is also a key field of the file, then the field is called the ordering key field.
(3) Hash Files:- The idea behind hashing is to provide a function h, called a hash function, that is applied to the hash field value of a record and yields the address of the disk block in which the record is stored. For most records, we need only a single-block access to retrieve that record. The search condition must be an equality condition on a single field.

Note:- All the above organizations of files are called primary organizations.

Indexes:-

The index structures typically provide secondary access paths to access the records without affecting the physical placement of records on disk.

Dense and Sparse Indexes:-

A dense index has an index entry for every search key value.

A sparse index has index entries for only some of the search values.

Types of Indexes:-

(1) Single-Level Ordered Indexes:-
    (a) Primary Indexes
    (b) Clustering Indexes
    (c) Secondary Indexes
(2) Multilevel Indexes



Single-Level Ordered Indexes:-

A primary index is specified on the ordering key field of an ordered file of records. If the ordering field is not a key field, then a clustering index is used.

A file can have at most one physical ordering field, so it can have at most one primary index or one clustering index, but not both.

A secondary index can be specified on any non-ordering field of a file.

Primary Indexes:-

A primary index is an ordered file whose records are of fixed length with two fields. The first field is of the same data type as the ordering key field - called the primary key - of the data file, and the second field is a pointer to a disk block.

There is one index entry in the index file for each block in the data file. It holds the value of the primary key field for the first record in the block and a pointer to that block.

The total number of entries in the index is the same as the number of disk blocks in the ordered data file.

A primary index is a non-dense (sparse) index.

A major problem with a primary index - as with any ordered file - is insertion and deletion of records.

Clustering Indexes:-

If records of a file are physically ordered on a non-key field, that field is called the clustering field.

A clustering index is also an ordered file with two fields: the first is of the same type as the clustering field, and the second is a block pointer.

There is one entry in the clustering index for each distinct value of the clustering field, containing the value and a pointer to the first block in the data file that has a record with that value.

A clustering index is also a non-dense (sparse) index, because it has an entry for every distinct value of the indexing field, rather than for every record in the file.



Secondary Indexes:-

It is also an ordered file with two fields. The first field is of the same data type as some non-ordering field of the data file that is an indexing field. The second field is either a block pointer or a record pointer.

There can be many secondary indexes for the same file.

If the secondary index is on a key field, then there is one index entry for each record in the data file. Hence such an index is dense.

If the secondary index is on a non-key field, then numerous records in the data file can have the same value for the indexing field. There are various options for implementing such an index.

A secondary index provides a logical ordering on the records by the indexing field.

Multilevel Indexes:-

A multilevel index considers the index file (first level) as an ordered file with a distinct value for each K(i). Hence we create a primary index for the first level.

This index to the first level is called the second level of the multilevel index, and so on.

We require a second level only if the first level needs more than one block of disk storage, and so on.

Each level reduces the number of entries at the previous level by a factor of fo, the index fan-out.

If the first level has r1 entries and the blocking factor - which is also the fan-out - for the index is bfr = fo, then the first level needs ⌈r1 / fo⌉ blocks, which is therefore the number of entries r2 needed at the second level of the index.
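
Repeating the ⌈r / fo⌉ step until a level fits in one block gives the number of index levels. The function names and the sample values r1 = 3000 and fo = 34 used in the usage note are assumptions made for illustration.

```c
#include <assert.h>

/* Blocks needed to hold `entries` index entries with fan-out fo: ceil(entries / fo). */
long index_blocks(long entries, long fo) {
    assert(entries > 0 && fo > 1);
    return (entries + fo - 1) / fo;
}

/* Number of index levels needed until the top level fits in a single block. */
int index_levels(long r1, long fo) {
    int levels = 1;
    long blocks = index_blocks(r1, fo);
    while (blocks > 1) {
        /* One entry per block of the level below becomes the next level's input. */
        blocks = index_blocks(blocks, fo);
        levels++;
    }
    return levels;
}
```

With r1 = 3000 first-level entries and fo = 34, the first level needs 89 blocks, the second level needs 3 blocks, and the third level fits in 1 block, so index_levels returns 3.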



B-Trees:-

The B-tree is a search tree with additional constraints that ensure that the tree is always balanced.

A B-tree of order p, when used as an access structure on a key field to search for records in a data file, can be defined as follows:-

(1) Each internal node in the B-tree is of the form:-

    <P1, <K1, Pr1>, P2, <K2, Pr2>, ..., <K(q-1), Pr(q-1)>, P(q)>

where q ≤ p. Each Pi is a tree pointer - a pointer to another node in the B-tree. Each Pri is a data pointer - a pointer to the record whose search key field value is equal to Ki.
(2) Within each node, K1 < K2 < ... < K(q-1).
(3) For all search key field values X in the subtree pointed at by Pi, we have:- K(i-1) < X < Ki for 1 < i < q; X < Ki for i = 1; and K(i-1) < X for i = q.
(4) Each node has at most p tree pointers.
(5) Each node, except the root and leaf nodes, has at least ⌈p/2⌉ tree pointers. The root node has at least two tree pointers unless it is the only node in the tree.
(6) A node with q tree pointers, q ≤ p, has q - 1 search key field values.
(7) All leaf nodes are at the same level. Leaf nodes have the same structure as internal nodes except that all of their tree pointers Pi are null.

Example:- Suppose the search field is V = 9 bytes long, the disk block size is B = 512 bytes, a record (data) pointer is Pr = 7 bytes, and a block pointer is P = 6 bytes. Then:-

    (p * P) + ((p - 1) * (Pr + V)) ≤ B
    (p * 6) + ((p - 1) * 16) ≤ 512
    22p ≤ 528
    p ≤ 24
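
The same inequality can be solved for any block, key, and pointer sizes. Rearranging (p * P) + ((p - 1) * (Pr + V)) ≤ B gives p * (P + Pr + V) ≤ B + Pr + V. The function names below are assumptions introduced for this sketch.

```c
#include <assert.h>

/* Largest order p satisfying (p * P) + ((p - 1) * (Pr + V)) <= B. */
int btree_max_order(int B, int V, int Pr, int P) {
    assert(B > 0 && V > 0 && Pr > 0 && P > 0);
    return (B + Pr + V) / (P + Pr + V);   /* integer division floors */
}

/* Returns 1 when a B-tree node of order p fits in one block of B bytes. */
int btree_node_fits(int p, int B, int V, int Pr, int P) {
    return (p * P) + ((p - 1) * (Pr + V)) <= B;
}
```

For B = 512, V = 9, Pr = 7, and P = 6, btree_max_order returns 24; a node of order 24 uses exactly 512 bytes, and order 25 would need 534 bytes.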


B? - Trees :-

In a B+ tree, data pointers are stored only at the leaf nodes of the tree; hence, the structure of the leaf nodes differs from the structure of the internal nodes.

The leaf nodes have an entry for every value of the search field, along with a data pointer to the record, if the search field is a key field. For a non-key search field, the pointer points to a block containing pointers to the data file records, creating an extra level of indirection.

The leaf nodes of the B+ tree are usually linked together to provide ordered access on the search field to the records. Some search field values from the leaf nodes are repeated in the internal nodes of the B+ tree to guide the search.

The structure of the internal nodes of a B+ tree of order p is as follows:-

(1) Each internal node is of the form:-

<P1, K1, P2, K2, ..., P(q-1), K(q-1), Pq>

where q ≤ p and each Pi is a tree pointer.

(2) Within each internal node, K1 < K2 < ... < K(q-1).

(3) Each internal node has at most p tree pointers.

(4) For all search field values X in the subtree pointed at by Pi, we have:- K(i-1) < X ≤ Ki for 1 < i < q, X ≤ Ki for i = 1, and K(i-1) < X for i = q.

(5) Each internal node, except the root, has at least ⌈(p/2)⌉ tree pointers. The root node has at least two tree pointers if it is an internal node.

(6) An internal node with q pointers, q ≤ p, has q-1 search field values.

The structure of the leaf nodes of a B+ tree of order p is as follows:-

(1) Each leaf node is of the form:-

<<K1, Pr1>, <K2, Pr2>, ..., <K(q-1), Pr(q-1)>, Pnext>

where q ≤ p, each Pri is a data pointer and Pnext points to the next leaf node of the B+ tree.

(2) Within each leaf node, K1 < K2 < ... < K(q-1).

(3) Each Pri is a data pointer that points to the record whose search field value is Ki.

(4) Each leaf node has at least ⌈(p-1)/2⌉ values.

(5) All leaf nodes are at the same level.

(6) The Pnext pointer provides ordered access to the data records on the indexing field.
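The two node types and the role of Pnext can be illustrated with a lookup and an ordered range scan. This is a minimal in-memory sketch under the ordering rule K(i-1) < X ≤ Ki stated above; the class names `Leaf` and `Internal`, the function names and the sample tree are assumptions made for the example.

```python
from bisect import bisect_left

class Leaf:
    def __init__(self, keys, ptrs):
        self.keys = keys        # K1 < K2 < ... < K(q-1)
        self.ptrs = ptrs        # Pr1 ... Pr(q-1), the data pointers
        self.next = None        # Pnext, the next leaf to the right

class Internal:
    def __init__(self, keys, children):
        self.keys = keys          # K1 < K2 < ... < K(q-1)
        self.children = children  # P1 ... Pq, tree pointers only

def find_leaf(node, key):
    """Descend to the leaf that would hold `key`."""
    while isinstance(node, Internal):
        # Subtree Pi holds values X with K(i-1) < X <= Ki, so follow the
        # pointer in front of the first key that is >= the search key.
        node = node.children[bisect_left(node.keys, key)]
    return node

def search(root, key):
    leaf = find_leaf(root, key)
    i = bisect_left(leaf.keys, key)
    if i < len(leaf.keys) and leaf.keys[i] == key:
        return leaf.ptrs[i]
    return None

def range_scan(root, low, high):
    """Collect (key, pointer) pairs with low <= key <= high via the Pnext chain."""
    leaf, out = find_leaf(root, low), []
    while leaf is not None:
        for k, p in zip(leaf.keys, leaf.ptrs):
            if k > high:
                return out
            if k >= low:
                out.append((k, p))
        leaf = leaf.next
    return out

# Hypothetical tree: three linked leaves under one internal root.
l1, l2, l3 = Leaf([1, 4], ["r1", "r4"]), Leaf([7, 10], ["r7", "r10"]), Leaf([17, 21], ["r17", "r21"])
l1.next, l2.next = l2, l3
root = Internal([4, 10], [l1, l2, l3])
print(search(root, 10), range_scan(root, 4, 17))
```

Unlike the B-tree, every search here ends at a leaf, and the range scan never returns to the internal nodes once the first leaf is found.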



Because entries in the internal nodes of a B+ tree include search values and tree pointers without any data pointers, more entries can be packed into an internal node of a B+ tree than for a similar B-tree.

Because the structures for internal nodes and for leaf nodes of a B+ tree are different, the order p can be different.

We use p to denote the order for internal nodes and pleaf to denote the order for leaf nodes.

Example :- For a B+ tree, the search key field is V = 9 bytes long, the block size is B = 512 bytes, a record pointer is Pr = 7 bytes, and a block pointer is P = 6 bytes. Calculate the order of internal and leaf nodes.

Solution :- For internal nodes :-

(p * P) + ((p-1) * V) ≤ B

=> 6p + 9 * (p-1) ≤ 512

=> 15p ≤ 521

=> p ≤ 34

=> p = 34

For leaf nodes :-

pleaf * (Pr + V) + P ≤ B

=> (7 + 9) * pleaf + 6 ≤ 512

=> 16 * pleaf ≤ 506

=> pleaf = 31
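Both inequalities can be solved in code to confirm the result. The function name `bplus_orders` is a hypothetical helper for this sketch; the arguments are the byte sizes given in the example.

```python
def bplus_orders(B, P, Pr, V):
    """Return (p, p_leaf) for a B+ tree whose nodes must fit in one block."""
    # Internal node: (p * P) + ((p - 1) * V) <= B  =>  p <= (B + V) / (P + V)
    p = (B + V) // (P + V)
    # Leaf node: p_leaf * (Pr + V) + P <= B  =>  p_leaf <= (B - P) / (Pr + V)
    p_leaf = (B - P) // (Pr + V)
    return p, p_leaf

print(bplus_orders(B=512, P=6, Pr=7, V=9))
```

With the same block and field sizes, the internal-node order (34) is larger than the B-tree order from the earlier example (24), which is the packing advantage described above.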

Insertion in B+ trees :-

Overflow :- When the number of search key values exceeds p-1.


Splitting a leaf node into two nodes :-

=> The 1st node contains ⌈p/2⌉ values.
=> The 2nd node contains the remaining values.
=> Copy the maximum search key value of the 1st node to the parent node.

Splitting a non-leaf node into two nodes :-

=> The 1st node contains ⌈(p-1)/2⌉ values.
=> Move the smallest of the remaining values, together with the pointer, to the parent.
=> The 2nd node contains the remaining values.
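The two splitting rules can be sketched as plain list operations. This is an illustration only: it assumes the node already holds the p values that caused the overflow, and that the pointer sent to the parent is a pointer to the new 2nd node. The function names `split_leaf` and `split_internal` are hypothetical.

```python
import math

def split_leaf(values, p):
    """Split an overflowing leaf (p sorted values) into two leaves.

    Returns (first, second, key_for_parent).
    """
    n_first = math.ceil(p / 2)
    first, second = values[:n_first], values[n_first:]
    # The maximum value of the 1st node is COPIED up; it also stays in the leaf.
    return first, second, first[-1]

def split_internal(keys, children, p):
    """Split an overflowing internal node (p keys, p + 1 tree pointers).

    Returns ((left_keys, left_children), key_for_parent, (right_keys, right_children)).
    """
    n_first = math.ceil((p - 1) / 2)
    left_keys = keys[:n_first]
    up_key = keys[n_first]             # smallest remaining value is MOVED up
    right_keys = keys[n_first + 1:]    # it does not stay in either half
    left_children = children[:n_first + 1]
    right_children = children[n_first + 1:]
    return (left_keys, left_children), up_key, (right_keys, right_children)

# Hypothetical overflowing nodes for p = 4.
print(split_leaf([1, 4, 7, 10], p=4))
print(split_internal([4, 10, 17, 21], ["c0", "c1", "c2", "c3", "c4"], p=4))
```

The difference between the two cases is that a leaf split copies a key into the parent, because every search value must remain in a leaf, while a non-leaf split moves the key out of the node.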



Example :- Construct a B+ tree for (1, 4, 7, 10, 17, 21, 31, 25, 19, 20, 28, 42) with b = 4.


Deletion in B+ tree :-

Underflow :-

(i) Leaf Node :-

(a) Redistribute to sibling :-

=> The right node should not hold fewer values than the left node.
=> Move the smallest value of the right node to the left node.
=> Replace the between value in the parent by the smallest value of the right node.

(b) Merge :-

=> Both left and right nodes have ⌈(p-1)/2⌉ values.
=> Remove the between value in the parent.



(ii) Non-leaf Node :-

(a) Redistribute to sibling :-

=> Through the parent.
=> The right node should not hold fewer values than the left node.

(b) Merge :-

=> Both left and right nodes have ⌈(p-1)/2⌉ values.
=> Move all values and pointers to the left node.
=> Delete the right node, and the pointers in the parent.
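The leaf-level underflow handling can be sketched with plain lists. This is one possible reading of the rules above, not a definitive implementation: leaves are modelled as sorted Python lists, the "between value" is taken to be the value that was moved (so it remains the maximum of the left node), and updating the Pnext chain after a merge is left out. All function names are hypothetical.

```python
import math

def redistribute_from_right(left, right, parent_keys, sep_index):
    """Borrow the smallest value of the right leaf for the underflowing left leaf."""
    moved = right.pop(0)              # smallest value of the right node
    left.append(moved)
    parent_keys[sep_index] = moved    # assumed reading: new between value = max of left

def merge_leaves(left, right, parent_keys, parent_children, sep_index):
    """Move every value of the right leaf into the left leaf and drop the right leaf."""
    left.extend(right)
    del parent_keys[sep_index]            # remove the between value in the parent
    del parent_children[sep_index + 1]    # remove the pointer to the right node

def fix_leaf_underflow(left, right, parent_keys, parent_children, sep_index, p):
    """Redistribute if the right sibling can spare a value, otherwise merge."""
    min_values = math.ceil((p - 1) / 2)
    if len(right) > min_values:
        redistribute_from_right(left, right, parent_keys, sep_index)
        return "redistribute"
    merge_leaves(left, right, parent_keys, parent_children, sep_index)
    return "merge"

# Hypothetical leaves for p = 4 (minimum of 2 values per leaf).
left, right = [17], [20, 21, 25]
keys, children = [19], [left, right]
print(fix_leaf_underflow(left, right, keys, children, 0, p=4), left, right, keys)
```

If the merge removes the last key of the parent, the parent itself underflows and the non-leaf rules apply one level up.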

Example :- Delete 28, 31, 21, 25, 19 from the given B+ tree.

