eec-681/781 distributed computing systems
DESCRIPTION
EEC-681/781 Distributed Computing Systems. Lecture 8 Wenbing Zhao [email protected] Cleveland State University. Outline. Midterm#1 results Processes and threads Clients and Servers. Midterm#1 Results. Max: 98 Min: 72 Mean: 89 P1 mean: 37/40 P2 mean: 34/40 P3 mean: 18/20. Process. - PowerPoint PPT PresentationTRANSCRIPT
EEC-681/781EEC-681/781Distributed Computing Distributed Computing
SystemsSystems
Lecture 8Lecture 8
Wenbing ZhaoWenbing [email protected]@ieee.org
Cleveland State UniversityCleveland State University
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
22
OutlineOutline• Midterm#1 results
• Processes and threads
• Clients and Servers
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
33
Midterm#1 ResultsMidterm#1 Results
• Max: 98• Min: 72• Mean: 89
• P1 mean: 37/40• P2 mean: 34/40• P3 mean: 18/20
0
2
4
6
8
10
12
60-69 70-79 80-89 90-100
Grand Range
Nu
mb
er o
f S
tud
ents
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
44
ProcessProcess• Communication takes place between processes• Process is a program in execution• For an OS, process management and
scheduling are most important• For distributed systems, other issues are equally
or more important– Multithreading– Client-Server organization– Code migration– Software agent
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
55
ProcessProcess
• An operating system creates a number of virtual processors, each one for running a different program
• To keep track of these virtual processors, OS maintains a process table– CPU register values, memory maps, open files,
accounting info, privileges, etc.
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
66
ProcessProcess
• OS ensures concurrency transparency for different processes that share the same CPU and other hardware resources
• Each process has its own address space• Switch CPU between two processes is
expensive– CPU context, modify registers for memory
management unit (MMU), invalidate address translation caches such as in the translation lookaside buffer (TLB)
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
77
Motivation to Use a Finer GranularityMotivation to Use a Finer Granularity
• It is hard to program a single threaded process for efficient distributed computing– Difficult to use non-blocking system calls
• Could have used a pool of processes, but – Creation/deletion of a process is expensive– Inter-process communication (IPC) is expensive
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
88
Introduction to ThreadsIntroduction to Threads
• Thread: A minimal software processor in whose context a series of instructions can be executed
• Saving a thread context implies stopping the current execution and saving all the data needed to continue the execution at a later stage
• A process can have one or more threads• Threads share the same address space.
=> Thread context switching can be done entirely independent of the operating system
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
99
Context SwitchingContext Switching
• Creating and destroying threads is much cheaper than doing so for processes
• Process switching is generally more expensive as it involves getting the OS in the loop, i.e., trapping to the kernel
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
1010
Threads and Distributed SystemsThreads and Distributed Systems
• Multithreaded clients: – Hiding network latency
• Multithreaded servers: – Improved performance and – Better structure
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
1111
Multithreaded ClientsMultithreaded Clients
• Multithreaded clients: hiding network latency • Multithreaded Web client:
– Web browser scans an incoming HTML page, and finds that more files need to be fetched
– Each file is fetched by a separate thread, each doing a (blocking) HTTP request
– As files come in, the browser displays them
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
1212
Multithreaded ServersMultithreaded Servers
• Improve performance:– Starting a thread to handle an incoming request is
much cheaper than starting a new process– Multi-threaded server can scale well to a
multiprocessor system– Hide network latency by reacting to next request while
previous one is being replied
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
1313
Multithreaded ServersMultithreaded Servers
• Better server structure:– Using simple, well-understood blocking calls simplifies
the overall structure– Multithreaded programs can be smaller and easier to
understand due to simplified flow of control
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
1414
Multithreaded ServersMultithreaded Servers• Dispatcher/worker model
– Thread-per-object– Thread-per-request– Thread-per-client– Thread pool
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
1515
Multithreaded ServersMultithreaded Servers
• Three ways to construct a server:
Model Characteristics
Threads Parallelism, blocking system calls
Single-threaded process No parallelism, blocking system calls
Finite-state machine Parallelism, nonblocking system calls
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
1616
Client-Side SoftwareClient-Side Software
• User interface– X-window system– Model-View-Controller Pattern
• Providing distribution transparency
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
1717
The X-Window SystemThe X-Window System
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
1818
The X-Window SystemThe X-Window System
• X distinguishes two types of applications– Normal application
• Can request creation of a window• Mouse and keystroke events are captured when a window is
active
– X windows manager• Given special permission to manipulate the entire screen• Determines the look and feel
• X applications and X kernel interacts through an X protocol– Supports Unix and TCP/IP sockets– X terminals
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
1919
Model-View-Controller PatternModel-View-Controller Pattern
• Invented in a Smalltalk context for decoupling the graphical interface of an application from the code that actually does the work
• MVC was originally developed to map the traditional input, processing, output roles into the GUI realm: Input --> Processing --> Output
Controller --> Model --> View
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
2020
Model-View-ControllerModel-View-Controller
• Model - manages one or more data elements, responds to queries about its state, and responds to instructions to change state
• View - responsible for mapping graphics onto a device. Multiple views might be attached to the same model
• Controller - responsible for mapping end-user action to application response
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
2121
Model-View-ControllerModel-View-Controller
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
2222
Client-Side Software:Client-Side Software: Providing Distribution Transparency Providing Distribution Transparency
• Access transparency: client-side stubs for RPCs and RMIs• Location/migration transparency: let client-side software
keep track of actual location• Replication transparency: multiple invocations handled by
client stub• Failure transparency:
mask server and communication failures
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
2323
Server-Side SoftwareServer-Side Software
• Basic model: A server is a process that waits for incoming service requests at a specific transport address
• A server typically listens on a well-known port:ftp-data 20 File Transfer [Default Data]ftp 21 File Transfer [Control]
ssh 22 Secure Shelltelnet 23 Telnetsmtp 25 Simple Mail Transfer
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
2424
Server-Side SoftwareServer-Side Software
• Superservers: Servers that listen to several ports, i.e., provide several independent services– When a service request comes in, they start a
subprocess to handle the request
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
2525
Server-Side SoftwareServer-Side Software
• Iterative vs. concurrent servers: – Iterative servers can handle only one client at
a time– Concurrent servers can handle multiple
clients at the same time
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
2626
Servers and StateServers and State• Stateless servers: Never keep accurate
information about the status of a client after having handled a request
• Consequences:– Clients and servers are completely independent– State inconsistencies due to client or server crashes
are reduced– Possible loss of performance because, e.g., a server
cannot anticipate client behavior
• Question: Does connection-oriented communication fit into a stateless design?
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
2727
Servers and StateServers and State
• Stateful servers: Keeps track of the status of its clients
– Record that a file has been opened, so that prefetching can be done
– Knows which data a client has cached, and allows clients to keep local copies of shared data
• The performance of stateful servers can be extremely high (from a particular client’s point of view)
• Drawback– Crash recovery a lot more challenging– Less scalable
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
2828
Practical Implementation of ServersPractical Implementation of Servers
• Servers need to maintain clients state• Where to store such state? Database systems• Solution – three-tier architecture
– Application servers interface directly to clients and execute according to business logic
– Data (state) is stored in the data access tier so that the application servers can be made stateless
– E.g., Web-page personalization using cookies
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
2929
Reasons for Migrating CodeReasons for Migrating Code
• Load balancing– Migrate processes from heavy loaded machine to light
loaded machines
• Minimize communication– Move code from client to server– Move code from server to client
• Parallel execution– Web crawlers
• Flexibility– Dynamically configure distributed systems
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
3030
Strong and Weak MobilityStrong and Weak Mobility
• Process components:– Code segment: set of instructions that make up the
program– Resource/data segment: contains references to
external resources needed by the process, such as files, devices, other processes
– Execution segment: contains the current execution state of a process such as private data, stack, program counter
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
3131
Strong and Weak MobilityStrong and Weak Mobility
• Weak mobility: Move only code and data segment (and start execution from the beginning) after migration
• Strong mobility: Move component, including execution state– Migration: move entire process from one machine to
another– Cloning: start a clone, and set it in the same
execution state
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
3232
Process-to-Resource BindingProcess-to-Resource Binding
• By identifier: the process requires a specific instance of a resource – A specific web page or a remote file– local communication endpoint
• By value: the process requires the value of a resource – Shared library– Memory
• By type: the process requires that only a type of resource is available – A color monitor– A printer
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
3333
Resource-to-Machine BindingResource-to-Machine Binding
• Fixed: the resource cannot be migrated– Local devices– local communication endpoints
• Fastened: the resource can, in principle, be migrated but only at high cost– Local databases– complete web site
• Unattached: the resource can easily be moved along with the process – A cache– Files
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
3434
Migration in Heterogeneous SystemsMigration in Heterogeneous Systems
• Challenges:– The target machine may not be suitable to execute the
migrated code– The definition of process/thread/processor context is
highly dependent on local hardware, operating system and runtime system
• Solution: Make use of an abstract machine that is implemented on different platforms– Interpreted languages running on a virtual machine
(Java/JVM; scripting languages)– Virtual machine
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
3535
What’s an Agent?What’s an Agent?
• An agent is an autonomous process capable of reacting to, and initiating changes in its environment, possibly in collaboration with users and other agents– collaborative agent: collaborate with others in a
multiagent system– mobile agent: can move between machines– interface agent: assist users at user-interface level– information agent: manage information from
physically different sources
Fall Semester 2006Fall Semester 2006 EEC-681: Distributed Computing SystemsEEC-681: Distributed Computing Systems Wenbing ZhaoWenbing Zhao
3636
Agent TechnologyAgent Technology
• The general model of an agent platform
Management: Keeps track of where the agents on this platform are (mapping agent ID to port)Directory: Mapping of agent names & attributes to agent IDsACC: Agent Communication Channel, used to communicate with other platforms
Intra-platformcommunication