smd149 - operating systems - networking · distributed systems lab 2 distributed operating systems...


SMD149 - Operating Systems - Networking

Roland Parviainen

December 5, 2005

1 / 60



Overview

Networking

Distributed systems

Getting started with Lab 2

2 / 60



OSI

3 / 60



TCP/IP

TCP/IP protocol stack

4 layers

Application layer

Highest level

Provides protocols for applications to communicate

Transport layer

End-to-end communication

Relies on the network layer to determine the proper path from one end of the communication to the other

Network layer

Moving data between computers

Link layer

Interface between the network layer and the physical medium

4 / 60



Transport layer

TCP

Connection-oriented protocol

Segments sent will arrive

Undamaged

In order

Error control, congestion control, retransmission

UDP

Connectionless

Minimum overhead

Best effort

Datagrams can be reordered or lost

No congestion control

5 / 60



OSI

6 / 60



Communication Process Fundamentals

Socket: application programming interface (API) that interfaces the application and transport layers

Addressing

Name or address of an end system (IP address)

Identifier that specifies the receiving process in the destination (port number)

Examples: HTTP server: TCP port 80; SMTP: TCP port 25
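A minimal sketch of the socket idea in Java: a one-shot echo server and a client that reaches it through an (IP address, port number) pair. The class name `SocketEchoDemo`, the `roundTrip` helper and the "echo: " reply format are illustrative assumptions, and the server binds to an OS-chosen loopback port rather than a well-known port such as 80 or 25.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class SocketEchoDemo {
    /** Starts a one-shot echo server on loopback, sends msg to it, returns the reply. */
    static String roundTrip(String msg) {
        // Port 0 asks the OS for any free port (hypothetical service, no well-known port).
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread serverThread = new Thread(() -> serveOnce(server));
            serverThread.start();

            String reply;
            // The client names the receiving process by IP address and port number.
            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
                 PrintWriter out = writer(client);
                 BufferedReader in = reader(client)) {
                out.println(msg);
                reply = in.readLine();
            }
            serverThread.join();
            return reply;
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Accepts a single TCP connection and echoes one line back. */
    private static void serveOnce(ServerSocket server) {
        try (Socket conn = server.accept();
             BufferedReader in = reader(conn);
             PrintWriter out = writer(conn)) {
            out.println("echo: " + in.readLine());
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    private static BufferedReader reader(Socket s) throws IOException {
        return new BufferedReader(
                new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
    }

    private static PrintWriter writer(Socket s) throws IOException {
        // autoflush = true so println() pushes the line onto the connection immediately
        return new PrintWriter(
                new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello")); // prints "echo: hello"
    }
}
```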

7 / 60



Client/server model

8 / 60



Distributed systems

Remote computers cooperate via a network to appear as a local machine

Users are given the impression that they are interacting with just one machine

Spread computation and storage throughout a network of computers

Applications are able to execute code and to share data, files and other resources among local and remote machines

9 / 60



Attributes of distributed systems

Internet has made distributed systems common

Attributes of distributed systems:

Performance

Scalability

Connectivity

Security

Reliability

Fault tolerance

10 / 60



Performance and scalability

Centralized system

A single server handles all user requests

Distributed system

User requests can be sent to different servers working in parallel to increase performance

Scalability

Allows a distributed system to grow

11 / 60



Connectivity and security

Distributed systems

Susceptible to attacks by malicious users if they rely on insecure communications media

To improve security

Allow only authorized users to access resources

Ensure that information transmitted over the network is readable only by the intended recipients

Provide mechanisms to protect resources from attack

12 / 60



Reliability and fault tolerance

Fault tolerance

Implemented by providing replication of resources

Replication

Offers increased reliability and availability

Replicas must be kept consistent

13 / 60



Transparency

Access transparency

Location transparency

Failure transparency

Replication transparency

Persistence transparency

Migration and relocation transparency

Transaction transparency

14 / 60



Network operating system

Access resources on remote computers that run independent operating systems

Not responsible for resource management at remote locations

Distributed functions are explicit rather than transparent

Lack of transparency in network OSs

Does not provide some of the benefits of distributed OSs

Easier to implement

15 / 60



Distributed operating systems

Manage resources located in multiple networked computers

Many of the same communication methods, file system structures, etc. as network operating systems

Transparent communication

Objects are unaware of the separate computers

Not many “truly” distributed systems

16 / 60



Distributed operating systems

Access and manage resources located in multiple networked computers

Employ many of the same communication methods, file system structures and other protocols found in networked operating systems

Transparent communication

Objects in the system are unaware of the separate computers that provide the service (unlike network operating systems)

User has the illusion of working on a single local system

Implementation is generally much more complex than networked operating systems

17 / 60



Communication in distributed systems

Designers must establish interoperability between heterogeneous computers and applications

Interoperability

Permits software components to interact among different

Hardware and software platforms

Programming languages

Communication protocols

Standardized interface

Allows each client/server pair to communicate using a single, common interface that is understood by both sides

18 / 60



Middleware

Software in distributed systems that helps provide:

Portability

Transparency

Interoperability

Provides standard programming interfaces to enable interprocess communication between remote computers

19 / 60



Remote Procedure Call

Allows a process executing on one computer to invoke a procedure in a process executing on another computer

Goal of RPC

To simplify the process of writing distributed applications by preserving the syntax of a local procedure call while transparently initiating network communication

Based on the application-oriented design paradigm

Begin by designing a conventional program (that runs on a single machine) to solve a problem

Divide the program into two or more pieces, and add communication protocols that allow pieces to execute on separate computers

20 / 60



RPC

To issue an RPC:

Client process makes a call to the procedure in the client stub

Client stub marshals the data (procedure name, arguments) into a message for transmission over the network

Client stub passes the message to the server

Server transmits the message to the server stub

Server stub unmarshals the message

Server stub sends the parameters to the appropriate local procedure

Server stub marshals the result and sends it back to the client

Client stub unmarshals the result, notifies the process and passes it the result
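The marshalling steps can be sketched in Java with plain byte arrays standing in for the network. This is only an illustration under stated assumptions: the names `RpcStubDemo`, `remoteAdd` and `serverStub`, and the wire format (procedure name followed by two ints), are hypothetical and do not correspond to ONC RPC, RMI or any other real RPC system.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class RpcStubDemo {
    /** The "remote" procedure as it exists locally on the server. */
    static int add(int a, int b) {
        return a + b;
    }

    /** Client stub: looks like a local call, but marshals the call into a message. */
    static int remoteAdd(int a, int b) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeUTF("add");   // procedure name
            out.writeInt(a);       // arguments
            out.writeInt(b);
            byte[] request = buf.toByteArray();

            // Assumption: a direct method call stands in for network transmission.
            byte[] reply = serverStub(request);

            // Unmarshal the result and hand it back to the caller.
            return new DataInputStream(new ByteArrayInputStream(reply)).readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Server stub: unmarshals the message, calls the local procedure, marshals the result. */
    static byte[] serverStub(byte[] request) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(request));
        String procedure = in.readUTF();
        if (!procedure.equals("add")) {
            throw new IOException("unknown procedure: " + procedure);
        }
        int a = in.readInt();
        int b = in.readInt();
        int result = add(a, b);

        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        new DataOutputStream(buf).writeInt(result);
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(remoteAdd(2, 3)); // prints 5
    }
}
```

The caller of `remoteAdd` never sees the message format, which is the transparency the stub is meant to provide.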

21 / 60



RPC

22 / 60



RPC

ONC RPC - Open Network Computing Remote Procedure Call (Sun RPC)

Initially developed for the Network File System

ONC RPC supports procedure calls over both UDP and TCP.

The interface description language for ONC RPC is XDR (eXternal Data Representation)

Access to RPC services on a machine is provided via a port mapper that listens for queries on a well-known port, port 111, over UDP and TCP.

23 / 60



RMI

Java Remote Method Invocation

RPC for Java

Object serialization

Marshalling

Java objects can be used for parameters/return values

24 / 60



CORBA

Common Object Request Broker Architecture

Open standard

Objects as parameters or return values

Object Request Brokers (ORBs) marshal data

Programming language independence through Interface Definition Language (IDL)

GIOP (General Inter-ORB Protocol), IIOP (Internet Inter-ORB Protocol)

25 / 60



DCOM

Distributed Component Object Model

Objects are accessed via interfaces

Objects may have multiple interfaces

Uses DCE/RPC (Distributed Computing Environment/Remote Procedure Calls).

MSRPC

26 / 60



Web services

CORBA, DCOM, etc. have problems with network firewalls

Web services much more successful

Open standards

HTTP and XML

27 / 60



XML RPC

Uses XML to encode calls

HTTP as transport

Very simple, two-page standard

28 / 60



<?xml version="1.0"?>
<methodCall>
  <methodName>examples.getStateName</methodName>
  <params>
    <param>
      <value><i4>41</i4></value>
    </param>
  </params>
</methodCall>

<?xml version="1.0"?>
<methodResponse>
  <params>
    <param>
      <value><string>South Dakota</string></value>
    </param>
  </params>
</methodResponse>
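A hedged Java sketch of how a client might produce such a call and read such a response, using only the JDK's built-in DOM parser. `XmlRpcDemo`, `buildCall` and `parseStringResponse` are hypothetical names, the HTTP transport is left out, and the method name is assumed to need no XML escaping.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class XmlRpcDemo {
    /** Encodes a call with a single <i4> (32-bit integer) parameter. */
    static String buildCall(String method, int arg) {
        // Assumption: method contains no characters that need XML escaping.
        return "<?xml version=\"1.0\"?>"
                + "<methodCall><methodName>" + method + "</methodName>"
                + "<params><param><value><i4>" + arg + "</i4></value></param></params>"
                + "</methodCall>";
    }

    /** Extracts the first <string> value from a methodResponse document. */
    static String parseStringResponse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getElementsByTagName("string").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("malformed XML-RPC response", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildCall("examples.getStateName", 41));
        String response = "<?xml version=\"1.0\"?><methodResponse><params><param>"
                + "<value><string>South Dakota</string></value>"
                + "</param></params></methodResponse>";
        System.out.println(parseStringResponse(response)); // prints "South Dakota"
    }
}
```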

30 / 60



SOAP

SOAP

Originally: Simple Object Access Protocol

XML/HTTP

Lengthy syntax:

Easy to read

Complex

Slow processing times (10x slower than RMI/IIOP)

31 / 60



Synchronization

Determining the order in which events occur is difficult

Communication delays in a distributed network are unpredictable

Causal ordering

Ensures that all processes recognize that a causally dependent event must occur only after the event on which it is dependent

Implemented by the happens-before relation:

If events a and b belong to the same process, then a → b if a occurred before b

If event a is the sending of a message and event b is the receiving of that message, then a → b

This relation is transitive

32 / 60



Synchronization

Causal ordering only ensures a partial ordering

Events for which it cannot be determined which occurred earlier are said to be concurrent

Total ordering

Ensures that all events are ordered and that causality is preserved

Can be implemented through a logical clock that assigns a timestamp to each event that occurs in the system

Scalar logical clocks synchronize the logical clocks on remote hosts and ensure causality
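One common way to realize a scalar logical clock is a Lamport clock, sketched here in Java. The slides do not name a specific algorithm, so treat the choice as an assumption; the class and method names are illustrative, and breaking timestamp ties by process id is one conventional way to get a total order.

```java
class LamportClock {
    private long time = 0;

    /** Local event or message send: advance the clock and return the new timestamp. */
    long tick() {
        return ++time;
    }

    /** Message receipt: jump past the sender's timestamp so that send → receive holds. */
    long receive(long messageTimestamp) {
        time = Math.max(time, messageTimestamp) + 1;
        return time;
    }

    /**
     * Total order over (timestamp, processId) pairs: timestamps first,
     * process ids break ties between concurrent events.
     */
    static int compare(long t1, int p1, long t2, int p2) {
        return t1 != t2 ? Long.compare(t1, t2) : Integer.compare(p1, p2);
    }

    public static void main(String[] args) {
        LamportClock a = new LamportClock();
        LamportClock b = new LamportClock();
        long sent = a.tick();              // a sends a message with timestamp 1
        long received = b.receive(sent);   // b receives it at timestamp 2
        System.out.println(sent + " -> " + received);
    }
}
```

If a → b then the timestamp of a is smaller than that of b; the converse does not hold, which is why concurrent events still need the tie-break.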

33 / 60



Mutual exclusion

In environments with no shared memory, mutual exclusion must be implemented via message passing

Message passing relies on clock synchronization

FIFO broadcast

Guarantees that when two messages are sent from one process to another, the message that was sent first will arrive first

Causal broadcast

Ensures that when message M1 is causally dependent on message M2, then no process delivers M1 before delivering M2

Atomic broadcast (aka totally ordered or agreed broadcast)

Guarantees that all messages in a system are received in the same order at each process
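FIFO delivery is often built from per-sender sequence numbers plus a hold-back queue at the receiver. The Java sketch below assumes that approach (the slides do not prescribe one); `FifoReceiver` and its methods are hypothetical names, and sequence numbers are assumed to start at 0 for each sender.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FifoReceiver {
    // Next sequence number expected from each sender (assumed to start at 0).
    private final Map<String, Integer> nextSeq = new HashMap<>();
    // Messages that arrived early, held back until the gap is filled.
    private final Map<String, Map<Integer, String>> heldBack = new HashMap<>();
    private final List<String> delivered = new ArrayList<>();

    /** Called when a message arrives from the network, possibly out of order. */
    void receive(String sender, int seq, String payload) {
        Map<Integer, String> queue = heldBack.computeIfAbsent(sender, s -> new HashMap<>());
        queue.put(seq, payload);

        // Deliver as many consecutive messages from this sender as are available.
        int next = nextSeq.getOrDefault(sender, 0);
        while (queue.containsKey(next)) {
            delivered.add(queue.remove(next));
            next++;
        }
        nextSeq.put(sender, next);
    }

    /** Messages in the order they were handed to the application. */
    List<String> delivered() {
        return delivered;
    }

    public static void main(String[] args) {
        FifoReceiver r = new FifoReceiver();
        r.receive("p1", 1, "second"); // arrives early, held back
        r.receive("p1", 0, "first");  // fills the gap, both are delivered
        System.out.println(r.delivered()); // prints [first, second]
    }
}
```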

34 / 60



Deadlock

Three types of distributed deadlocks

Resource deadlock

Inter-process communication deadlock

Circular waiting for signals

Phantom deadlock

Due to delay

Deadlocks that do not exist are detected

35 / 60



Deadlock prevention

Wound-wait strategy

Breaks deadlock by denying the no-preemption condition

Recently created processes are forced to wait if they need a resource held by an earlier created process

Recently created processes are restarted if they hold a resource requested by an earlier created process

36 / 60



Wound-wait

37 / 60



Wait-die

Wait-die strategy

Breaks deadlock by denying the wait-for condition

Recently created processes “die” instead of waiting for a resource (currently being held by an earlier created process)

An earlier created process will wait if it needs a resource being held by recently created processes
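The two strategies differ only in what happens when a requester and a holder of different ages meet. A small Java sketch of both decision rules, assuming each process carries a unique creation timestamp (smaller means created earlier); the class, enum and method names are illustrative.

```java
class DeadlockPrevention {
    enum Action { REQUESTER_WAITS, REQUESTER_DIES, HOLDER_RESTARTS }

    /**
     * Wound-wait: an earlier created requester preempts ("wounds") a more recent holder,
     * which is restarted; a more recent requester waits.
     */
    static Action woundWait(long requesterCreated, long holderCreated) {
        return requesterCreated < holderCreated
                ? Action.HOLDER_RESTARTS
                : Action.REQUESTER_WAITS;
    }

    /**
     * Wait-die: an earlier created requester waits; a more recent requester "dies"
     * instead of waiting.
     */
    static Action waitDie(long requesterCreated, long holderCreated) {
        return requesterCreated < holderCreated
                ? Action.REQUESTER_WAITS
                : Action.REQUESTER_DIES;
    }

    public static void main(String[] args) {
        // Process created at time 1 requests a resource held by a process created at time 2.
        System.out.println(woundWait(1, 2)); // prints HOLDER_RESTARTS
        System.out.println(waitDie(1, 2));   // prints REQUESTER_WAITS
    }
}
```

In both schemes waiting only ever happens in one age direction, so a cycle of waiting processes cannot form.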

38 / 60



Wait-die

39 / 60



Deadlock detection

Central deadlock detection

Hierarchical deadlock detection

Distributed deadlock detection

40 / 60



Distributed file systems

Distributed file server

Stateful

The server keeps state information of the client requests so that subsequent access to the file is easier

Stateless

The client must specify which file to access in each request

Transparency is a key feature

Complete file location transparency means that the user is unaware of the physical location of a file within a distributed file system

The user sees only a global file system

41 / 60



Distributed file systems

Many distributed systems implement client caching to avoid the overhead of multiple RPCs

Clients keep a local copy of a file and flush modified copies of it to the server from time to time

Because there are multiple copies of the same file, files can become inconsistent

Distributed file systems are designed to share information among large groups of computers

New computers should be able to be added to the system easily

Distributed file systems should be scalable

Security concerns in distributed file systems

Ensuring secure communications

Access control

Trusted/untrusted client machines?

42 / 60



Network File System

43 / 60



Network File System

NFS versions 2 and 3

Assume a stateless server implementation

Fault tolerance easier to implement than with a stateful server

With stateless servers, if the server crashes, the client can simply retry its request until the server responds

NFS-4

Stateful

Allows faster access to files

However, if the server crashes, all the state information of the client is lost, so the client needs to rebuild its state on the server before retrying

44 / 60



Network File System

NFS-4 extends the client-caching scheme through delegation

Allows the server to temporarily transfer the control of a file to a client

When the server grants a read delegation of a particular file to the client, then no other client can write to that file

When the server grants a write delegation of a particular file to the client, then no other client can read or write to that file

If another client requests a file that has been delegated, the server will revoke the delegation and request that the original client flush the file to disk
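The conflict rules for delegations can be written down as a small decision function. This Java sketch only encodes the rules as stated on the slide and is not the NFS-4 protocol itself; `DelegationRules` and `mustRecall` are hypothetical names.

```java
class DelegationRules {
    enum Delegation { NONE, READ, WRITE }

    /**
     * Returns true if the server has to revoke the delegation held by one client
     * before it can serve a request from another client.
     */
    static boolean mustRecall(Delegation held, boolean otherClientWrites) {
        switch (held) {
            case READ:
                // Other readers are fine; a writer conflicts with a read delegation.
                return otherClientWrites;
            case WRITE:
                // A write delegation excludes both readers and writers.
                return true;
            default:
                // No delegation outstanding, nothing to revoke.
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(mustRecall(Delegation.READ, false));  // prints false
        System.out.println(mustRecall(Delegation.READ, true));   // prints true
        System.out.println(mustRecall(Delegation.WRITE, false)); // prints true
    }
}
```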

45 / 60



Multicomputer system architecture

Clustering

Takes advantage of distributed systems and parallel systems to build powerful computers

Peer-to-peer distributed computing model

Used to remove many central points of failure in applications like instant messengers

Grid computing

Exploits unused computer power to solve complex problems

46 / 60



Clustering

Clustering

Interconnecting nodes (single-processor computers or multiprocessor computers) within a high-speed LAN to function as a single parallel computer

Cluster: Set of nodes that forms the single parallel machine

Multiple computers working together to solve large and complex problems

47 / 60



Clustering types

High-performance clusters

All nodes in the cluster work to improve performance

High-availability clusters

Only some of the nodes in the cluster are active while others serve as backups

If working nodes fail, the backup nodes immediately start running and take over the jobs that were being executed on the failed nodes, without interrupting service

Load-balancing clusters

A particular node works as a load balancer to distribute the load to a set of nodes so that all hardware is utilized efficiently
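As an illustration of what the load-balancing node does, here is a minimal Java sketch that hands out requests in round-robin order. Round-robin is an assumed policy chosen for simplicity (the slides do not specify one), and `RoundRobinBalancer` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RoundRobinBalancer {
    private final List<String> nodes;
    private int next = 0;

    RoundRobinBalancer(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("need at least one node");
        }
        this.nodes = new ArrayList<>(nodes); // defensive copy
    }

    /** Picks the node that should handle the next request. */
    String pick() {
        String node = nodes.get(next);
        next = (next + 1) % nodes.size(); // wrap around after the last node
        return node;
    }

    public static void main(String[] args) {
        RoundRobinBalancer lb = new RoundRobinBalancer(Arrays.asList("node1", "node2"));
        System.out.println(lb.pick()); // prints node1
        System.out.println(lb.pick()); // prints node2
        System.out.println(lb.pick()); // prints node1
    }
}
```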

48 / 60



Clustering benefits

Economically interconnects relatively inexpensive components

Reduces the cost for building a clustered system compared to a single parallel computer with the same capability

High performance

Each node in the clustering system shares the workload

Fast communications among nodes in a cluster

Faster than those in unclustered distributed computing systems due to the high-speed LAN between nodes

49 / 60



Clustering benefits

Scalability

A cluster is able to add or remove nodes (or the components of nodes) to adjust its capabilities without affecting the existing nodes in the cluster

Better scalability than multiprocessors

Reliability and fault tolerance by providing backups or redundancy for the services and resources

The failure of any one computer will not affect the availability of the rest of the system’s resources

50 / 60



Grid computing

Links computational resources that are distributed over the wide area network (such as computers, data storage and scientific devices) to solve complex problems

Using unused resources (CPU cycles and/or disk storage) of large numbers of disparate, often desktop, computers

Treated as a virtual, distributed cluster

Emphasizes public collaboration

Individuals and research institutes coordinate resources that are not subject to a centralized control

Example: SETI@home

51 / 60



Grid computing

Grid computing has the same advantage as clustering

High performance, achieved by cost-effectively using spare computer power and collaborating resources

However, grid computing requires advanced software to manage distributed computing tasks efficiently and reliably

52 / 60



Fallacies of distributed computing

The Eight Fallacies of Distributed Computing - P. Deutsch

1 The network is reliable

2 Latency is zero

3 Bandwidth is infinite

4 The network is secure

5 Topology doesn’t change

6 There is one administrator

7 Transport cost is zero

8 The network is homogeneous

53 / 60



Lab 2 - getting started

Unpack and get Nachos running (make sure you have run “source .smd149rc”!)

If not, make sure to run gmake clean first

join:

Put thread to sleep

Wake up thread when other thread finishes

Sleep: See KThread.sleep()...

Mark thread as blocked

How do we know when to wake up?

Store a reference to the calling thread in the called thread

Wake thread up when called thread finishes (see KThread.finish())

KThread.ready()

54 / 60



Condition variables

Maintain a queue of waiting threads (LinkedList or Vector)

See Condition.java and Semaphore.java for hints

wakeAll: while wait queue is not empty, call wake

wake: remove first thread from wait queue, wake thread up (KThread.ready())

Protect access to queue with boolean intStatus = Machine.interrupt().disable() and Machine.interrupt().restore(intStatus);

sleep

disable interrupts

release condition lock

add current thread to queue

block current thread (KThread.sleep())

acquire condition lock

restore interrupts

55 / 60



Alarm

waitUntil:

disable/enable interrupts

add sleep time for current thread (add variable to KThread?)

add to sleep queue

block current thread

timerInterrupt:

Disable interrupts

Remove and wake up all threads where sleep time has expired (Machine.timer().getTime())

Enable interrupts

Yield
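The bookkeeping behind waitUntil and timerInterrupt can be tried out in isolation. The following Java sketch is a stand-alone simulation, not Nachos code: threads are represented by names, the current time is passed in explicitly instead of being read from Machine.timer().getTime(), and interrupt disabling, KThread.sleep() and KThread.ready() are left out. The class `AlarmSketch` and its method signatures are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

class AlarmSketch {
    /** A sleeping thread (represented by its name) and the time it should wake up. */
    private static final class Sleeper {
        final String thread;
        final long wakeTime;

        Sleeper(String thread, long wakeTime) {
            this.thread = thread;
            this.wakeTime = wakeTime;
        }
    }

    // Sleep queue ordered by wake-up time, earliest first.
    private final PriorityQueue<Sleeper> sleepQueue =
            new PriorityQueue<>((a, b) -> Long.compare(a.wakeTime, b.wakeTime));

    /** Records that the thread wants to sleep for at least the given number of ticks. */
    void waitUntil(String thread, long now, long ticks) {
        sleepQueue.add(new Sleeper(thread, now + ticks));
        // In Nachos the current thread would block here.
    }

    /** Removes and returns every thread whose sleep time has expired. */
    List<String> timerInterrupt(long now) {
        List<String> woken = new ArrayList<>();
        while (!sleepQueue.isEmpty() && sleepQueue.peek().wakeTime <= now) {
            woken.add(sleepQueue.poll().thread); // in Nachos: put the thread on the ready queue
        }
        return woken;
    }

    public static void main(String[] args) {
        AlarmSketch alarm = new AlarmSketch();
        alarm.waitUntil("t1", 0, 100);
        alarm.waitUntil("t2", 0, 50);
        System.out.println(alarm.timerInterrupt(60));  // prints [t2]
        System.out.println(alarm.timerInterrupt(100)); // prints [t1]
    }
}
```

Keeping the queue sorted by wake-up time means each timer interrupt only inspects the threads that are actually due.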

56 / 60



Part 2

Add debugging statements

Write a simple program to copy a file (to test the file system implementation)

Make sure the file is larger than one page

Remember to protect data structures, e.g. by disabling interrupts

Free page management

FileSystem fs

fs = Machine.stubFileSystem();

57 / 60



Example

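public int writeVirtualMemory(int vaddr, byte[] data, int offset, int length)

int read(int fd, char *buffer, int size)

case syscallRead:
    fd = a0;                                    // a0: file descriptor
    of = fdTable.getFile(fd);                   // look up the open file
    if (of == null) return -1;                  // unknown descriptor
    buf = new byte[a2];                         // a2: number of bytes requested
    rd1 = of.read(buf, 0, a2);                  // read from the file into a kernel buffer
    if (rd1 < 0) return -1;                     // read failed
    rd2 = writeVirtualMemory(a1, buf, 0, rd1);  // a1: user buffer address
    return rd2;                                 // bytes actually copied to user memory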

59 / 60



Summary

Next time: security

60 / 60
