Advanced Operating Systems (Distributed Systems)


1. Differentiate between tightly coupled and loosely coupled systems

Computer architectures consisting of multiple interconnected processors are basically of two types:

In tightly coupled systems, there is a single system wide primary memory (address space) that is shared by all the processors (Fig. 1.1). If any processor writes, for example, the value 100 to the memory location x, any other processor subsequently reading from location x will get the value 100. Therefore, in these systems, any communication between the processors usually takes place through the shared memory.

In loosely coupled systems, the processors do not share memory, and each processor has its own local memory (Fig. 1.2). If a processor writes the value 100 to the memory location x, this write operation will only change the contents of its local memory and will not affect the contents of the memory of any other processor. Hence, if another processor reads the memory location x, it will get whatever value was there before in that location of its own local memory. In these systems, all physical communication between the processors is done by passing messages across the network that interconnects the processors.

Usually, tightly coupled systems are referred to as parallel processing systems, and loosely coupled systems are referred to as distributed computing systems, or simply distributed systems. In contrast to the tightly coupled systems, the processors of distributed computing systems can be located far from each other to cover a wider geographical area. Furthermore, in tightly coupled systems, the number of processors that can be usefully deployed is usually small and limited by the bandwidth of the shared memory. This is not the case with distributed computing systems that are more freely expandable and can have an almost unlimited number of processors.

Fig. 1.1: Tightly coupled multiprocessor system

M.C.A vth sem Page 2


Fig. 1.2: Loosely coupled multiprocessor system

Hence, a distributed computing system is basically a collection of processors interconnected by a communication network in which each processor has its own local memory and other peripherals, and the communication between any two processors of the system takes place by message passing over the communication network. For a particular processor, its own resources are local, whereas the other processors and their resources are remote. Together, a processor and its resources are usually referred to as a node or site or machine of the distributed computing system.
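The distinction above can be sketched in a few lines of code. This is a hypothetical illustration (the `Node` class and its methods are not from the text): each "processor" keeps a private local memory, so a write on one node is invisible elsewhere until a message is explicitly passed.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.memory = {}   # private local memory (loosely coupled)
        self.inbox = []    # messages arriving over the "network"

    def write(self, location, value):
        self.memory[location] = value

    def read(self, location):
        return self.memory.get(location)

    def send(self, other, message):
        # All inter-node communication is by message passing.
        other.inbox.append((self.name, message))


a, b = Node("A"), Node("B")
a.write("x", 100)
print(b.read("x"))        # None: B's local memory is unaffected by A's write
a.send(b, ("x", 100))     # the value must be communicated explicitly
sender, (loc, val) = b.inbox[0]
b.write(loc, val)
print(b.read("x"))        # 100, but only after the message arrived
```

In a tightly coupled system, by contrast, both processors would read and write the same `memory` dictionary, and no `send` step would be needed.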

2. Describe buffering. What are the four types of buffering strategies?

The transmission of messages from one process to another can be done by copying the body of the message from the sender’s address space to the receiver’s address space. In some cases, the receiving process may not be ready to receive the message but may want the operating system to save it for later reception. In such cases, the operating system relies on buffer space at the receiver, in which transmitted messages can be stored until the receiving process executes the code to receive them.

The synchronous and asynchronous modes of communication correspond to the two extremes of buffering: a null buffer (no buffering) and a buffer with unbounded capacity. Two other commonly used buffering strategies are single-message and finite-bound (multiple-message) buffers. These four types of buffering strategies are given below:

No buffering: In this case, the message remains in the sender’s address space until the receiver executes the corresponding receive.

Single-message buffer: A buffer that holds a single message is used at the receiver side. This suits synchronous communication, because in that case an application can have only one outstanding message at any given time.


Unbounded-capacity buffer: Convenient for supporting asynchronous communication. In practice, however, a buffer of truly unbounded capacity is impossible to provide.

Finite-bound buffer: Used for supporting asynchronous communication in practice. Because the buffer can hold only a bounded number of messages, the system also needs a policy for buffer overflow, such as indicating an unsuccessful send or applying flow control.
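A finite-bound buffer can be sketched with the standard library’s `queue.Queue`; the capacity of 2 is an arbitrary value for illustration.

```python
import queue

buf = queue.Queue(maxsize=2)   # receiver-side buffer holding at most 2 messages

buf.put_nowait("m1")
buf.put_nowait("m2")
try:
    buf.put_nowait("m3")       # buffer full: this send cannot be buffered
except queue.Full:
    print("buffer overflow: message m3 dropped or sender must retry")

print(buf.get_nowait())        # "m1" -- messages are received in FIFO order
```

The `queue.Full` branch is where an implementation would apply its overflow policy, e.g. report the send as unsuccessful or block the sender until space is available.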

3. Define DSM. Discuss any four design and implementation issues of DSM.

DSM is also called DSVM (Distributed Shared Virtual Memory). It is a loosely coupled distributed-memory system in which a software layer, implemented on top of the message passing system, provides a shared-memory abstraction to programmers. The software layer can be implemented in the OS kernel or in runtime library routines with proper kernel support. DSM is an abstraction that integrates the local memory of different machines in a network environment into a single logical entity shared by cooperating processes executing on multiple sites. The shared memory exists only virtually.

DSM systems: a comparison with message passing and tightly coupled multiprocessor systems. DSM provides a simpler abstraction than the message passing model and relieves the programmer of the burden of explicitly using communication primitives. In message passing systems, passing complex data structures between two processes is difficult, and passing data structures containing pointers is generally expensive. DSM takes advantage of the locality of reference exhibited by programs and thereby improves efficiency. DSM systems are also cheaper to build than tightly coupled multiprocessor systems.

The large physical memory available facilitates running programs requiring large memory efficiently.

DSM can scale well compared to tightly coupled multiprocessor systems. A message passing system allows processes to communicate while being protected from one another by private address spaces, whereas in DSM one process can cause another to fail by erroneously altering shared data. When message passing is used between heterogeneous computers, marshaling of data takes care of differences in data representation; with DSM, it is unclear how memory can be shared between computers with different integer representations. DSM can be made persistent, i.e., processes communicating via DSM may execute with non-overlapping lifetimes: a process can leave information in an agreed location for another process. Processes communicating via message passing, in contrast, must execute at the same time.

Which is better? Message passing or Distributed Shared Memory? Distributed Shared Memory appears to be a promising tool if it can be implemented efficiently.

Distributed Shared Memory Architecture


As shown in the above figure, DSM provides a virtual address space shared among processes on loosely coupled processors. DSM is basically an abstraction that integrates the local memory of different machines in a network environment into a single logical entity shared by cooperating processes executing on multiple sites. The shared memory itself exists only virtually. Application programs can use it in the same way as traditional virtual memory, except that processes using it can run on different machines in parallel.

DSM – Design and Implementation Issues

The important issues involved in the design and implementation of DSM systems are as follows:

Granularity: Refers to the block size of the DSM system, i.e., the unit of sharing and the unit of data transfer across the network when a network block fault occurs. Possible units are a few words, a page, or a few pages.

Structure of shared memory space: Refers to the layout of the shared data in memory. It depends on the type of applications the DSM system is intended to support.

Memory coherence and access synchronization: Coherence (consistency) refers to the memory coherence problem, which deals with the consistency of shared data that lies in the main memory of two or more nodes. Synchronization refers to synchronizing concurrent accesses to shared data using synchronization primitives such as semaphores.

Data location and access: A DSM system must implement mechanisms to locate data blocks in order to service network data block faults and meet the requirements of the memory coherence semantics being used.

Block replacement policy: If the local memory of a node is full, a cache miss at that node implies not only a fetch of the accessed data block from a remote node but also a replacement, i.e., a data block of the local memory must be replaced by the new data block. Therefore, a block replacement policy is also necessary in the design of a DSM system.

Thrashing: In a DSM system, data blocks migrate between nodes on demand. If two nodes compete for write access to a single data item, the corresponding data block may be transferred back and forth at such a high rate that no real work can get done. A DSM system must use a policy to avoid this situation (known as Thrashing).

Heterogeneity: DSM systems built for homogeneous systems need not address the heterogeneity issue. However, if the underlying system environment is heterogeneous, the DSM system must be designed to take care of heterogeneity so that it functions properly with machines having different architectures.
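Several of these issues can be seen together in a toy, single-threaded sketch of a page-based DSM (all names here are hypothetical, not from the text): a central directory records which node owns each block, a read fault fetches a copy over the "network", and a write invalidates copies cached elsewhere, crudely illustrating granularity, data location, and coherence.

```python
class ToyDSM:
    def __init__(self, node_ids):
        self.caches = {n: {} for n in node_ids}  # per-node local copies
        self.owner = {}                          # block -> owning node
        self.store = {}                          # block -> current value
        self.transfers = 0                       # network block transfers

    def write(self, node, block, value):
        # Write-invalidate: discard stale copies held by all other nodes.
        for n, cache in self.caches.items():
            if n != node:
                cache.pop(block, None)
        self.store[block] = value
        self.owner[block] = node
        self.caches[node][block] = value

    def read(self, node, block):
        if block not in self.caches[node]:       # read fault
            self.transfers += 1                  # fetch block over the network
            self.caches[node][block] = self.store[block]
        return self.caches[node][block]


dsm = ToyDSM(["A", "B"])
dsm.write("A", "x", 100)
print(dsm.read("B", "x"))   # 100: block fetched from A on B's read fault
print(dsm.transfers)        # 1
dsm.write("B", "x", 200)    # invalidates A's cached copy
print(dsm.read("A", "x"))   # 200: A faults again and refetches
```

A real DSM would transfer whole pages rather than single values, and the transfer counter hints at why thrashing arises when two nodes alternately write the same block.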

4. Discuss any five features of a good global scheduling algorithm

i) No a priori knowledge about the processes: A good process scheduling algorithm should operate with absolutely no a priori knowledge about the processes.

ii) Dynamic in Nature: It is intended that a good process-scheduling algorithm should be able to take care of the dynamically changing load at various nodes. The process assignment decisions should be based on the current load of the system and not on some fixed static policy.

iii) Quick Decision Making: A good process scheduling algorithm must be capable of taking quick decisions regarding node assignment for processes.

iv) Scheduling overhead: The general observation is that as overhead is increased in an attempt to obtain more information regarding the global state of the system, the usefulness of the information is decreased due to both the aging of the information gathered and the low scheduling frequency as a result of the cost of gathering and processing that information. Hence algorithms that provide near optimal system performance with a minimum of global state information gathering overhead are desirable.

v) Stability: The algorithm should be stable, i.e., the system should not enter a state in which nodes spend all their time migrating processes or exchanging control messages without doing any useful work.

vi) Scalability: The algorithm should be scalable, i.e., the system should be able to handle both small and large networked systems. A simple approach to making an algorithm scalable is to probe only m of the N nodes when selecting a host; the value of m can be adjusted dynamically depending on the value of N.

vii) Fault tolerance: The algorithm should not be affected by the crash of one or more nodes in the system. At any instant, it should continue functioning for the nodes that are up at that time. Algorithms that have decentralized decision-making capability and consider only available nodes in their decisions have better fault tolerance.


viii) Fairness of service: How fairly service is allocated is a common concern. For example, two users simultaneously initiating equivalent processes should receive the same quality of service. What is desirable is a fair strategy that improves the response time of heavily loaded nodes without unduly affecting lightly loaded ones. For this, the concept of load balancing has to be replaced by load sharing, i.e., a node shares some of its resources as long as its users are not significantly affected.
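The "probe only m of N nodes" idea mentioned under scalability can be sketched as follows; the function and variable names (`pick_host`, `loads`) are illustrative assumptions, not part of any standard algorithm.

```python
import random

def pick_host(loads, m, rng):
    """Probe m randomly chosen nodes and return the least loaded of them."""
    probed = rng.sample(list(loads), m)   # query only m of the N nodes
    return min(probed, key=loads.get)     # least loaded among those probed

loads = {"n1": 7, "n2": 2, "n3": 9, "n4": 4, "n5": 1}
host = pick_host(loads, m=3, rng=random.Random(42))
print(host, loads[host])
```

With m much smaller than N, the scheduling overhead stays low; setting m = N recovers the exhaustive (non-scalable) search for the globally least-loaded node.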

5. What is replication? Discuss the three replication approaches in DFS

The main approach to improving the performance and fault tolerance of a DFS is to replicate its content. A replicating DFS maintains multiple copies of files on different servers. This can prevent data loss, protect the system against downtime of a single server, and distribute the overall workload.

There are three approaches to replication in a DFS:

1. Explicit replication: The client explicitly writes files to multiple servers. This approach requires explicit support from the client and does not provide transparency.

2. Lazy file replication: The server automatically copies files to other servers after they are written; the replicas are brought up to date only when the new contents are propagated to them. How often this happens is up to the implementation and affects the consistency of the file state.

3. Group file replication: Write requests are simultaneously sent to a group of servers. This keeps all the replicas up to date and allows clients to read a consistent file state from any replica.
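Group file replication can be sketched in a few lines; the class and method names are illustrative, and real systems would send the writes over the network and handle partial failures.

```python
class ReplicaGroup:
    def __init__(self, n_replicas):
        self.replicas = [{} for _ in range(n_replicas)]  # each dict is one server

    def write(self, filename, data):
        for server in self.replicas:   # broadcast the write to the whole group
            server[filename] = data

    def read(self, filename, replica_index):
        # Because every write reached every server, any replica is up to date.
        return self.replicas[replica_index][filename]


group = ReplicaGroup(3)
group.write("notes.txt", "hello")
print([group.read("notes.txt", i) for i in range(3)])   # ['hello', 'hello', 'hello']
```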

6. List and explain the desirable features of a good naming system

A good naming system for a distributed system should have the following features:

i) Location transparency: The name of an object should not reveal any hint about the physical location of the object.

ii) Location independence: The name of an object should not need to change when the object’s location changes. Thus:

A location-independent naming system must support a dynamic mapping scheme.

An object at any node can be accessed without knowledge of its physical location.

An object at any node can issue an access request without knowledge of its own physical location.

iii) Scalability: The naming system should be able to handle the dynamically changing scale of a distributed system.

iv) Uniform naming convention: The same naming conventions should be used for all types of objects in the system.

v) Multiple user-defined names for the same object: The naming system should provide the flexibility to assign multiple user-defined names to the same object.

vi) Group naming: The naming system should allow many different objects to be identified by the same name.

vii) Meaningful names: A naming system should support at least two levels of object identifiers, one convenient for human users and the other convenient for machines.
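The dynamic mapping scheme required for location independence can be sketched as follows; the `NameServer` class and node names are hypothetical illustrations, not part of any real naming service.

```python
class NameServer:
    def __init__(self):
        self.table = {}              # object name -> current node

    def register(self, name, node):
        self.table[name] = node

    def migrate(self, name, new_node):
        self.table[name] = new_node  # only the mapping changes, not the name

    def resolve(self, name):
        return self.table[name]


ns = NameServer()
ns.register("report.txt", "node-3")
print(ns.resolve("report.txt"))      # node-3
ns.migrate("report.txt", "node-7")   # object moves to another node
print(ns.resolve("report.txt"))      # node-7: same name, new location
```

Because clients always go through `resolve`, the object’s name stays valid across migrations, which is exactly what location independence demands.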
