part 1 introduction to distributed computing
TRANSCRIPT
-
8/14/2019 Part 1 Introduction to Distributed Computing
1/42
EMM4086: DMS 1
Chapter 1:Part 1
Introduction to Distributed
Computing
-
8/14/2019 Part 1 Introduction to Distributed Computing
2/42
EMM4086: DMS 2
Contents
Definition
Examples of DS
Advantages and Disadvantages
Key characteristics
Challenges in designing DS
-
8/14/2019 Part 1 Introduction to Distributed Computing
3/42
EMM4086: DMS 3
Definition
Coulouris - A system
in which hardware orsoftware components
located at networked
computers
communicate and
coordinate their actions
only by message
passing
-
8/14/2019 Part 1 Introduction to Distributed Computing
4/42
EMM4086: DMS 4
Tanenbaum - A
distributed system is acollection of
independent
computers that appear
to the users of the
system as a single
computer
-
8/14/2019 Part 1 Introduction to Distributed Computing
5/42
EMM4086: DMS 5
A distributed system is a collection of autonomous
computers linked by a network with softwaredesigned to produce an integrated computingfacility
Sharing of resources is a main motivation forconstructing DS
Resources may be managed by servers and accessedby clients or they may be encapsulated as objects andaccessed by other client objects
The term resource extends from hardware such as
disks, printers, to software defined entities such asfiles, databases and data objects of all kinds.
Also includes video streams and audio streams
-
8/14/2019 Part 1 Introduction to Distributed Computing
6/42
EMM4086: DMS 6
Reasons for DS
DS consisting of collections of microcomputers may
have processing powers that no supercomputer willever achieve
10,000 CPUs, each running at 50 MIPS* yields500000 MIPS, then instructions to be executed in
0.002 nsec, equivalent to light distance of 0.6 mm -any processor chip of that size would meltimmediately.
Collection of microprocessors offer a better price/
performance ratio than main frames- main frames: 10times faster, 1000 times as expensive
(*) MIPS million instructions per second
-
8/14/2019 Part 1 Introduction to Distributed Computing
7/42
EMM4086: DMS 7
Why DS and not
isolated
hardware?
Enhance person to
personcommunication
Need to share data
and resourcesamongst users
Flexibility: different computers with different
capabilities can be shared among users
-
8/14/2019 Part 1 Introduction to Distributed Computing
8/42
EMM4086: DMS 8
Problems with distributed, connected systems
Software - how
to design and
manage it in a
DS
Dependency
on the
underlying
infrastructure
Easy access to
shared data
raises security
concerns
-
8/14/2019 Part 1 Introduction to Distributed Computing
9/42
-
8/14/2019 Part 1 Introduction to Distributed Computing
10/42
EMM4086: DMS 10
Example of DS - Internet
intranet
ISP
desktop computer:
backbone
satellite link
server:
network link:
Figure 1.1: A typical portion of the Internet - Coulouris
-
8/14/2019 Part 1 Introduction to Distributed Computing
11/42
EMM4086: DMS 11
Example of DS - Internet
Vast collection of interconnected computer networks
of many different types programs on various computers interact by passing
messages as a means of communication
the design and construction of the internetcommunication mechanisms (the internet protocols)
is a major technical achievement enabling a program
running anywhere to address messages programs
anywhere else.
-
8/14/2019 Part 1 Introduction to Distributed Computing
12/42
EMM4086: DMS 12
Example of DS - Internet
As shown in Figure 1.1, a collection of intranets,
subnets are operated by organizations. ISPs arecompanies that provide modem links and otherconnection to individual users and smallorganizations enabling the users to access internetservices.
Intranets are linked together by backbones. abackbone is a new link with a high transmissioncapacity, employing satellite connections, fiber opticcables and other high bandwidth circuits
Multimedia services are available in the internet, butquite limited as the internet does not have facilities toreserve network capacities for individual streams ofdata
-
8/14/2019 Part 1 Introduction to Distributed Computing
13/42
EMM4086: DMS 13
Example of DS - Intranet
the rest of
email server
Web server
Desktopcomputers
File server
router/firewall
print and other servers
other servers
print
Local area
network
email server
the Internet
Figure 1.2: A typical intranet - Coulouris
-
8/14/2019 Part 1 Introduction to Distributed Computing
14/42
EMM4086: DMS 14
Example of DS - Intranet
Portion of the internet is intranet, but separately
administered and has a boundary that can beconfigured to enforce local security policies.
Composed of several LANs linked by backboneconnections
intranet is connected to internet via routers. Manyorganizations need to protect their own services
Firewall is to protect an intranet by preventingunauthorized messages leaving or entering
-
8/14/2019 Part 1 Introduction to Distributed Computing
15/42
EMM4086: DMS 15
Example of DS - DMS
Often use internet infrastructure
characteristics: Heterogeneous data sources andsinks that need to be synchronized in real time
(Video, Audio, Text)
Often distribution services- Multicast
Examples:
1. Tele-teaching
2. Video-conferencing
3. Video and audio on demand
-
8/14/2019 Part 1 Introduction to Distributed Computing
16/42
EMM4086: DMS 16
Example of DS - Mobile and Ubiquitous systems
Laptop
Mobile
Printer
Camera
Internet
Host intranet Home intranetWAP
Wireless LAN
phone
gateway
Host site
Figure 1.3: Portable and handheld devices in a distributed system -
Coulouris
-
8/14/2019 Part 1 Introduction to Distributed Computing
17/42
EMM4086: DMS 17
Example of DS - Mobile and Ubiquitous systems
Integration of small and portable devices such as
Laptop computers
Handheld devices including PDAs, mobile phones,
pagers, cameras
wearable devices such as smart watches withfunctionality similar to a PDA
Devices embedded in appliances such as washing
machines, cars, and refrigerators are integrated into
DS
-
8/14/2019 Part 1 Introduction to Distributed Computing
18/42
EMM4086: DMS 18
Example of DS - Mobile and Ubiquitous systems
In mobile computing, users who are away from their
home intranet are still provided with access toresources via the devices they carry with them. They
can continue to access the Internet.
Ubiquitous computing:
It would be convenient for users to control their washing
machine etc from a universal remote control device in the
home.
Equally the washing machine could page the user via a
smart badge when the washing is done.
-
8/14/2019 Part 1 Introduction to Distributed Computing
19/42
EMM4086: DMS 19
Example of DS - Mobile and Ubiquitous systems
As shown in Figure 1.3, the user has access to three
forms of wireless connection1. their laptop has a means of connecting to the hosts
wireless LAN. This provides coverage for few hundred
meters
2. It connects to the rest of intranet through gateway
3. Mobile users are connected to the internet through WAP
via a gateway
users carries a digital camera, which can
communicate over a infrared link when pointed at
corresponding device such as printer.
-
8/14/2019 Part 1 Introduction to Distributed Computing
20/42
EMM4086: DMS 20
Other examples of DS
MMU campus IT network, where each faculty has a pool of computers linked together in
a LAN , sharing common resources, such as email server, web server and ftp server.
-
8/14/2019 Part 1 Introduction to Distributed Computing
21/42
-
8/14/2019 Part 1 Introduction to Distributed Computing
22/42
EMM4086: DMS 22
Advantages of DS
1. Extensibility: can be
extended as the demand for
service grows with out replacingany of the existing components
2. Resource sharing: Extensive
hardware can be shared among users.
Data files can be managed andmaintained centrally, but remains
accessible to all users
3. Replication: High reliability
achieved by maintaining several
copies of data in different
computers
4. Continued availability: the
failure of a single workstation does
not affect the service to all users
-
8/14/2019 Part 1 Introduction to Distributed Computing
23/42
EMM4086: DMS 23
Disadvantages of DS
1. Loss of flexibility in allocation of memory and processing resources: In a centralized
system, all processor and memory resources are available for allocation by the OS in any
manner required by the current workload. In DS, the processor and memory capacity of theworkstation determine the largest task that can be performed.
2. Dependence on network performance and reliability: failure of the local network can
cause the service to users to be interrupted. Overloading of the network can degrade theperformance and responsiveness to the users.
3. Security weakness: To achieve openness, many of the software interfaces are made readily
available to clients. This leads to security problems
-
8/14/2019 Part 1 Introduction to Distributed Computing
24/42
EMM4086: DMS 24
Challenges
1. HETEROGENEITY: Internet enables users to access services
and run applications over a heterogeneous collection ofcomputers and networks.
Heterogeneity applies to the following:
computer networks
computer hardware
OS
Programming languages
Implementation by different developers
Middleware: It applies to a software layer that provides a
programming abstraction as well as masking the heterogeneity of
the underlying networks, hardware, OS and PLs.
Example: CORBA, Java RMI
-
8/14/2019 Part 1 Introduction to Distributed Computing
25/42
EMM4086: DMS 25
2. OPENNESS: The openness is determined by the degree to
which the new resource sharing services can be added andbe made available for use by a variety of client programs.
Openness cannot be achieved unless the specification and
documentation of the key software interfaces are
published. Open DS constructed from heterogeneous hardware and
software possibly from different vendors.
-
8/14/2019 Part 1 Introduction to Distributed Computing
26/42
EMM4086: DMS 26
3. SECURITY: There are 3 components:
i. confidentiality (protection against disclosure to unauthorizedindividuals)
ii. Integrity (protection against alteration or corruption)
iii. Availability (protection against interference with the means to accessthe resources).
Two security challenges that is yet to be met:
1. Denial of service attacks: Here the users may disrupt a service forsome reason. Achieved by bombarding the service with such a largenumber of pointless requests that the serious users are unable to useit. This is denial of attack
2. Security of mobile code: Assume some one who receives the EXEfrom email as attachment: it may access local resources, or denial ofservice attack.
-
8/14/2019 Part 1 Introduction to Distributed Computing
27/42
EMM4086: DMS 27
4. SCALABILITY:
DS is scalable if it remains effective when there is asignificant increase in the number of resources and the
number of users.
Date Computers Web servers
1979, Dec. 188 0
1989, July 130,000 0
1999, July 56,218,000 5,560,866
2003, Jan. 171,638,297 35,424,956
Figure 1.5: Computers in the Internet - Coulouris
-
8/14/2019 Part 1 Introduction to Distributed Computing
28/42
EMM4086: DMS 28
Design of Scalable DS presents challenges given
below1. Controlling the cost of physical resources: as a demand
of the resource grows, it should be possible to extend the
system, at reasonable cost. In general, for a system with
n users to be scalable, then the quantity of the physicalresources required to support them should be at most
O(n)- that is proportional to n. Ex: 20 users- 1 server,
then 40 users- 2 server and so on.
2. Controlling the performance Loss: In managing thedata the time taken to access data is O(log n), where n is
the size of the data. For a system to be scalable, the
maximum performance loss should be no worse than this.
-
8/14/2019 Part 1 Introduction to Distributed Computing
29/42
EMM4086: DMS 29
3. Preventing software resources running out: An
example of lack of scalability is shown by the numbersused in the Internet. In late 1970s, it was decided to use
32-bit. It run out in early 2000s. New version use 128 bit
Internet address. The demand is difficult to predict. But
too large Internet address occupy extra space in messages
and in computer storage.
4. Avoiding bottleneck: Algorithms to be decentralized to
avoid having performance bottleneck. Example : Domain
Name System can be kept in a single master file and
downloaded to any computers when needed. But this isefficient only when the number of computers are less. If
more, then it is to be decentralized.
-
8/14/2019 Part 1 Introduction to Distributed Computing
30/42
EMM4086: DMS 30
Date Computers Web servers Percentage
1993, July 1,776,000 130 0.008
1995, July 6,642,000 23,500 0.4
1997, July 19,540,000 1,203,096 6
1999, July 56,218,000 6,598,697 12
2001, July 125,888,197 31,299,592 25
42,298,371
Figure 1.6: Computers vs. Web servers in the Internet - Coulouris
-
8/14/2019 Part 1 Introduction to Distributed Computing
31/42
EMM4086: DMS 31
5. FAILURE HANDLING: Failures in DS is partial - that is
some components fail while others continue to functions.Therefore the handling of failures is difficult.
Detecting failures: Some failures may be detected e.g. check sums to
be used to find the error in a message. But difficult to detect the
failures such as remote crashed server in the Internet. The challenge
of DS is to manage in the presence of failures that cannot be detected,
but may be suspected
Masking failures: some failures may be detected but can be hidden
or made less severe. Examples.
1. Messages can be retransmitted
2. File data can be written to a pair of disks so that if one is corrupted, the
other may still be correct
-
8/14/2019 Part 1 Introduction to Distributed Computing
32/42
EMM4086: DMS 32
Tolerating failures: example: when a web browser cannot
contact the web server, it does not make the user wait for ever
while it keeps on trying- it informs the user about the problem
Recovery from failures: Recovery involves the design of the
software so that the state of the permanent data can be recovered
or rolled back after the server has crashed.
Redundancy: services may be made to tolerate failures by the
use of redundant components. Examples:
1. Always two different routes between any two routers in Internet
2. In the DNS, every name table is replicated in at least two different
servers
3. A database may be replicated in several servers to ensure that data
remains accessible after the failure of the single server
-
8/14/2019 Part 1 Introduction to Distributed Computing
33/42
EMM4086: DMS 33
Concurrency: since there is a possibility that several clients will attempt
to access a shared resource at the same time, any object that represents a
shared resource in a DS must be responsible for ensuring that it operatescorrectly in a concurrent environment.
For an object to be safe in a concurrent environment, its operations must
be synchronized in such a way that its data remains consistent.
This can be achieved by standard techniques such as semaphores.
Example: IF TWO CONCURRENT BIDS ARE A: 15 and B: 56 AND
THE CORRESPONDING OPERATIONS ARE INTERLEAVED
WITHOUT ANY CONRTOL, THEN THEY MIGHT GET STORED AS
A:56 and B:15.
It is possible that several threads may be executing concurrently within
an object and in which case their operations on the object may conflict
with one another and produce inconsistent results as above.
T i
-
8/14/2019 Part 1 Introduction to Distributed Computing
34/42
EMM4086: DMS 34
Transparencies Access transparency: enables local and remote resources to be accessed
using identical operations.
Location transparency: enables resources to be accessed withoutknowledge of their physical or network location (for example, whichbuilding or IP address).
Concurrency transparency: enables several processes to operateconcurrently using shared resources without interference between them.
Replication transparency: enables multiple instances of resources to be
used to increase reliability and performance without knowledge of thereplicas by users or application programmers. Failure transparency: enables the concealment of faults, allowing users
and application programs to complete their tasks despite the failure ofhardware or software components.
Mobility transparency: allows the movement of resources and clients
within a system without affecting the operation of users or programs. Performance transparency: allows the system to be reconfigured toimprove performance as loads vary.
Scaling transparency: allows the system and applications to expand inscale without change to the system structure or the application algorithms.
C
-
8/14/2019 Part 1 Introduction to Distributed Computing
35/42
EMM4086: DMS 35
Consequences
1. Concurrency:
In a network of computers, concurrent program executionis the norm. (ex. Sharing of files from common server).
The coordination of concurrently executing programs thatshare the resources is an important one
1. No Global clock:
When programs need to cooperate, they coordinate theiractions by exchanging messages.
Close coordination often depends on a shared idea of thetime at which programs actions occur.
But it turns out that there are limits to the accuracy with
which the computers in a network can synchronize theirclocks.
There is no single global notion of the correct time.
-
8/14/2019 Part 1 Introduction to Distributed Computing
36/42
EMM4086: DMS 36
3. Independent Failures:
DS can fail in new ways. Faults in the network result inthe isolation of the computers that are connected to it, but
does not mean that they stop running.
In fact, the programs on them may not be able to detect
whether the network has failed or has become unusuallyslow
Each component can fail independently leaving the
others still running
Cl k h i ti
-
8/14/2019 Part 1 Introduction to Distributed Computing
37/42
EMM4086: DMS 37
Clock synchronization
With a single computer and a single clock, it does not matter
much if this clock is off by a small amount.
As soon as multiple CPUs are introduced, each with its ownclock, the situation changes. Although the frequency at whicha crystal oscillator runs is usually fairly stable, it is impossible
to guarantee that the crystals in different computers all run atexactly the same frequency. This leads to the clocks to get outsynchronization and different values when read out. It is calledclock skew.
As a consequence of this clock skew, programs that expect thetime associated with a file, object process or message to becorrect and independent of the machine on which it wasguaranteed can fail.
-
8/14/2019 Part 1 Introduction to Distributed Computing
38/42
EMM4086: DMS 38
1. Computer on which compiler runs
2. Computer on which editor runs
Output.ocreated
Output.ccreated
2144 2145 2146 2147
2142 2143 2144 2145
Figure 1.7: when each machine has its own clock, an event that
occurred after another event may nevertheless be assigned an earlier
time
-
8/14/2019 Part 1 Introduction to Distributed Computing
39/42
EMM4086: DMS 39
Let output.o has time 2144 and shortly thereafter output.c ismodified but is assigned a time 2143 because the clock of its
machine is slightly behind. Make will not call the compiler. The resulting exe binary
program will then contain a mixture of object files from theold sources and the new sources.
It will probably crash and the programmer will go crazy trying
to understand what is wrong with the code. Another case: programmer has finished changing all the
source files, he starts make, which examines the times atwhich the source and object files were last modified.
If the source file input.c has time 2151 and the corresponding
object file input.o has time 2150, make knows that input.c hasbeen changed since input.o was created, and thus input.c mustbe recompiled.
-
8/14/2019 Part 1 Introduction to Distributed Computing
40/42
EMM4086: DMS 40
For ordinary clocks based on quartz crystal, this drift is about 10 -6
seconds/ second- giving a difference of 1 second in every 11.6days.
High precision quartz clock: 10-7 to 10-8
More accurate physical clocks use atomic oscillators whose
drift rate is about one part in 1013
-
8/14/2019 Part 1 Introduction to Distributed Computing
41/42
EMM4086: DMS 41
2. Resource sharing
1. Openness3. Concurrency
4. Scalability
5. Fault tolerance
6. Transparency
Key
characteristics
C t N t k d DS
-
8/14/2019 Part 1 Introduction to Distributed Computing
42/42
Computer Networks and DS
Computer Network: The autonomous computers are
explicitly visible (have to be explicitly addressed) Distributed system: Existence of multiple
autonomous computers is transparent
However
many problems in common
in some sense networks are also DS
Normally every DS relies on services provided by acomputer network