lesson11 - introduction to distributed computing (v1b)
TRANSCRIPT
Grupo de Arquitectura de Computadores,
Comunicaciones y Sistemas
INTRODUCTION TO
DISTRIBUTED COMPUTING
Distributed Computing, Lesson 11, for the Alejandro Calderón Mateos work
Goals
Know the definition and characteristics of distributed computing.
Understand the main evolution of distributed computing.
Become familiar with advanced technology for data processing.
Contents
What a distributed system is.
What kind of elements are needed.
How elements can be organized.
What paradigms are used to build one.
Example of a distributed system.
Introduction
“We define a distributed system as one in
which hardware or software components
located at networked computers
communicate and coordinate their actions
only by passing messages”
Coulouris, Dollimore and Kindberg
“A distributed system is a collection of
independent computers that appears to its
users as a single coherent system”
Tanenbaum and van Steen
Introduction
“A distributed system is one in
which the failure of a computer
you didn't even know existed
can render your own computer
unusable”
Leslie Lamport
Introduction
http://www.nethistory.info/History%20of%20the%20Internet/origins.html#apps
~ 1960
Centralized systems
Multiple users share the resources of a centralized system at all times.
Costs are divided.
It has a single point of control: easy maintenance.
Centralized systems have non-autonomous components.
http://books.cs.luc.edu/distributedsystems/issues.html
Roadmap to connected computers
~1969: ARPANET
http://www.nethistory.info/History%20of%20the%20Internet/origins.html#apps
~1970: UCLA-BBN (by AT&T)
1972: TCP/IP
http://microchip.wdfiles.com/local--files/tcpip:tcp-ip-five-layer-model/TCPIP_5_layer_overview.JPG
~1974: Xerox PARC
~1975: UUCP, Mail, etc.
http://edc.tversu.ru/elib/inf/0091/tcpip/figs/tcp2_0303.gif
1981: IBM PC
https://en.wikipedia.org/wiki/IBM_Personal_Computer
1983: IEEE 802.3
http://www.thefoa.org/tech/ref/appln/OLAN.html
http://cdn.arstechnica.net/2011/09/23/hdd-capacity-scale-4e7ce6c-intro.png
Internet (network applications)
https://bitcointalk.org/index.php?topic=430357.0
<1994: Internet starts
Computer network (& net. apps.): computers are explicitly visible
Evolution…
https://bitcointalk.org/index.php?topic=430357.0
?
Centralized vs. distributed
https://www.dcsorg.com/images/image_centralized_management.jpg
Distributed system: existence of multiple elements is transparent
Centralized systems (recap)
Multiple users share the resources of a centralized system at all times.
Costs are divided.
It has a single point of control: easy maintenance.
Centralized systems have non-autonomous components.
http://books.cs.luc.edu/distributedsystems/issues.html
Distributed systems
Users share the resources.
Extensibility (scalability) with
better price/performance.
If redundant elements are
properly used: reliability.
But it has multiple points of
failure with multiple
autonomous components.
http://books.cs.luc.edu/distributedsystems/issues.html
Distributed systems
Speed through parallelism.
But difficult to design.
But multiple points of failure.
Diversification through heterogeneous technology and autonomous components.
But non-trivial integration & maintenance costs.
http://books.cs.luc.edu/distributedsystems/issues.html
Distributed application fundamentals
http://www.fearnleyeducation.com/files/PageImages/clients%20-%20server%20model.PNG
1960 - 2000: Distributed software
Roadmap to connected computers
1985: 1G
http://img.frbiz.com/news/145317_s/Mobile_communication_base_station_radio_equipment_greet_with_the_explosive_growth_mobile_communication_base_station_radio_equipment_3G_communication_industry.jpg
~1990: Internet explodes
https://bitcointalk.org/index.php?topic=430357.0
2007: Smartphones
http://www.videcom.com/Portals/0/iphone1.png & http://images.techtimes.com/data/images/full/127370/events-for-gmail.jpg?w=600
Internet
http://kburnett.net/business-case/technology/mobility-2/
Internet of Things (IoT)
http://www.mercurynews.com/business/ci_24836116/internet-things-seen-bonanza-bay-area-businesses
http://knowledgeblob.com/technology/a-brief-about-internet-of-things-iot/
From Big Data to Huge Data… living services
http://www.elandroidelibre.com/2015/10/living-services-la-tercera-revolucion-tras-la-web-y-los-smartphones.html
http://tarrysingh.com/2014/07/fog-computing-happens-when-big-data-analytics-marries-internet-of-things/
http://blog.atlasrfidstore.com/wp-content/uploads/2013/07/beecham_research_internet_of_things.jpg
Distributed application fundamentals
http://www.fearnleyeducation.com/files/PageImages/clients%20-%20server%20model.PNG
1960 - 2020: Distributed software
Elements in a distributed system
“We define a distributed system as
one in which hardware or software
components located at networked
computers communicate and
coordinate their actions only by
passing messages”
Elements in a distributed system
http://cdn.comsol.com/wordpress/2014/02/Speeding-up-communications-distributed-memory-computing-copy.jpg
Distributed Systems Challenges
http://cdn.comsol.com/wordpress/2014/02/Speeding-up-communications-distributed-memory-computing-copy.jpg
Common false assumptions (the "fallacies of distributed computing"):
The network is reliable
The network is secure
The network is homogeneous
The topology does not change
Latency is zero
Bandwidth is infinite
Transport cost is zero
There is one administrator
Distributed Systems Challenges
Heterogeneity
Openness
Security
Scalability
Failure Handling
Concurrency
Transparency
Distributed Systems Challenges: Heterogeneity
Different networks are connected thanks to standard Internet protocols.
Different computing hardware executes the same code thanks to virtualization (virtual machines).
Different software interacts thanks to middleware software layers.
Distributed Systems Challenges: Openness
Openness is determined by the degree to which new resource-sharing services can be added and made available to client programs (service publication).
A new resource-sharing service is described by the interfaces that software developers will use (interface notification).
IS473 at http://www.xpowerpoint.com/ppt/system-model-distributed-systems.html
Distributed Systems Challenges: Security
Security for information resources has three main components:
Confidentiality: avoid unauthorized access
Integrity: avoid unwanted alteration
Availability: avoid denial of service
Firewalls, encryption, and anti-virus software are used.
IS473 at http://www.xpowerpoint.com/ppt/system-model-distributed-systems.html
Distributed Systems Challenges: Scalability
A system is scalable if it remains effective when there is a significant increase in the number of resources and the number of users (represented by client programs).
Many challenges:
Controlling performance degradation
Controlling the cost of physical resources
Preventing resources from running out
Avoiding performance bottlenecks
IS473 at http://www.xpowerpoint.com/ppt/system-model-distributed-systems.html
Distributed Systems Challenges: Failure Handling
Prevent a single failing element from stopping the whole distributed system.
A service can be made more fault tolerant by using redundant components.
Recovery from failures involves n-versioning and/or checkpointing, so that the state of the system can be kept stable after recovery.
IS473 at http://www.xpowerpoint.com/ppt/system-model-distributed-systems.html
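The redundancy idea on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the replicas are plain Python functions, and the names `ReplicaDown`, `call_with_failover`, `broken_replica` and `healthy_replica` are hypothetical, not taken from the lesson.

```python
from typing import Callable, Sequence


class ReplicaDown(Exception):
    """Raised by a (hypothetical) replica that cannot serve a request."""


def call_with_failover(replicas: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each redundant replica in turn; fail only if all of them fail."""
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except ReplicaDown as exc:
            errors.append(exc)  # remember the failure and try the next replica
    raise ReplicaDown(f"all {len(replicas)} replicas failed: {errors}")


def broken_replica(request: str) -> str:
    raise ReplicaDown("replica 0 is unreachable")


def healthy_replica(request: str) -> str:
    return f"handled {request}"


result = call_with_failover([broken_replica, healthy_replica], "read /data")
print(result)
```

The caller never sees the first failure: a single failing element does not stop the service as long as one redundant replica answers.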
Distributed Systems Challenges: Concurrency
A shared-resource service implementation must ensure that it operates properly in a concurrent environment.
The implementation of interface operations must be synchronized in order to keep the resource consistent.
Concurrency mechanisms are used: locks, semaphores, monitors, etc.
IS473 at http://www.xpowerpoint.com/ppt/system-model-distributed-systems.html
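A minimal sketch of one of the mechanisms named above (a lock) protecting a shared resource. It uses threads inside a single process as a stand-in for concurrent clients; the class `SharedCounter` and the function `worker` are hypothetical names chosen for this illustration.

```python
import threading


class SharedCounter:
    """A shared resource whose operations are synchronized with a lock."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # The read-modify-write must be atomic to keep the resource consistent.
        with self._lock:
            self._value += 1

    @property
    def value(self) -> int:
        with self._lock:
            return self._value


counter = SharedCounter()


def worker(times: int) -> None:
    for _ in range(times):
        counter.increment()


threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(counter.value)  # 4 workers x 10_000 increments
```

Because every increment holds the lock, no update is lost regardless of how the threads interleave.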
Distributed Systems Challenges: Transparency
Access transparency: hide differences in how resources are accessed
Location transparency: hide where a resource is located
Migration transparency: hide that a resource may move to another place
Relocation transparency: hide migration while the resource is in use
Replication transparency: hide that a resource is replicated
Concurrency transparency: hide that a resource may be shared in parallel
Failure transparency: hide the failure and recovery of a resource
Distributed Systems Challenges (summary)
Heterogeneity
Openness
Security
Scalability
Failure Handling
Concurrency
Transparency
Extra software/hardware on a distributed system: the existence of multiple elements is transparent
http://www.pixempire.com/images/preview/orchestra-director-with-stick-icon.jpg
(Example of) services that help with distributed systems challenges
http://thenewstack.io/helix-a-linkedin-framework-for-distributed-systems-development/
(Example of) elements that help with distributed systems challenges
http://www.ukoln.ac.uk/distributed-systems/jisc-ie/arch/
Common elements of a distributed system
Distributed algorithms: Coordination and Agreement; Time and Global States
System services: Security and Fault Tolerance; Name Service; Networking + Internetworking
Distributed services: Distributed File System; Distributed Multimedia System; Mobile and Ubiquitous Computing
Shared data: Transactions and Concurrency Control; Distributed Transactions; Replication and Consistency
Middleware: Remote invocation; Distributed objects and components
Distributed models: Client/Server; Peer-to-Peer
Distributed System Software Stack
http://books.cs.luc.edu/distributedsystems/issues.html
Main paradigms
1. Message passing
2. Client/Server and Peer-to-Peer
3. Remote procedure/method call
4. Network services, Object Request Broker
and mobile agents
5. Object spaces and collaborative applications
http://www.arcos.inf.uc3m.es/~dsd/
What a paradigm is…
Abstraction: encapsulation or hiding details
Paradigm: a pattern, an example or model
Strategy: identify the basic pattern or basic model and
classify the details according to these models.
Two main characteristics: process communication and event synchronization
http://www.arcos.inf.uc3m.es/~dsd/
Distributed computation paradigms…
Ordered by abstraction level, from high to low:
Object space, collaborative applications
Network service, object request broker, mobile agents
Remote procedure, remote method
Client/Server, peer-to-peer
Message passing
http://www.arcos.inf.uc3m.es/~dsd/
Message passing
Basic (and classic) paradigm for distributed apps:
One process sends a request message.
Another process receives the request, processes it, and sends back a response message.
Diagram: Process A and Process B exchange messages m1, m2, m3 (message passing).
http://www.arcos.inf.uc3m.es/~dsd/
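The request/response exchange above can be sketched with a pair of connected sockets. This is only an illustration: two threads in one process stand in for Process A and Process B, and the function name `process_b` and the message text are assumptions made for the example.

```python
import socket
import threading


def process_b(conn: socket.socket) -> None:
    """Receive one request message, process it, and send back a response."""
    request = conn.recv(1024).decode()
    conn.sendall(f"response to {request}".encode())
    conn.close()


# A connected pair of sockets: one end for each (simulated) process.
sock_a, sock_b = socket.socketpair()
thread_b = threading.Thread(target=process_b, args=(sock_b,))
thread_b.start()

# Process A: send the request message m1 and wait for the response.
sock_a.sendall(b"m1")
reply = sock_a.recv(1024).decode()

thread_b.join()
sock_a.close()
print(reply)
```

Everything the two sides know about each other travels inside the messages; there is no shared variable between them.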
MOM paradigm
Message-oriented Middleware:
The message system helps processes to exchange messages: it receives a message, stores it in the proper queue, and sends it to the destination process.
Two types: point-to-point and publish/subscribe.
Examples: IBM MQSeries, Microsoft MSMQ, Java JMS
http://www.arcos.inf.uc3m.es/~dsd/
Diagram: Sender, message system (queues), Receiver.
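A toy point-to-point message system, sketched with in-memory queues. It is not the API of any of the products listed above; the class `MessageSystem`, its `send`/`receive` methods, and the queue name "orders" are hypothetical.

```python
import queue
import threading


class MessageSystem:
    """Toy message broker: one FIFO queue per destination name."""

    def __init__(self) -> None:
        self._queues = {}  # destination name -> queue.Queue
        self._lock = threading.Lock()

    def _queue_for(self, destination: str) -> queue.Queue:
        with self._lock:
            return self._queues.setdefault(destination, queue.Queue())

    def send(self, destination: str, message: str) -> None:
        """Store the message in the proper queue."""
        self._queue_for(destination).put(message)

    def receive(self, destination: str, timeout: float = 1.0) -> str:
        """Deliver the oldest stored message to the destination process."""
        return self._queue_for(destination).get(timeout=timeout)


broker = MessageSystem()

# The sender does not need the receiver to be running yet.
broker.send("orders", "order #1")
broker.send("orders", "order #2")

received = []


def receiver() -> None:
    for _ in range(2):
        received.append(broker.receive("orders"))


thread = threading.Thread(target=receiver)
thread.start()
thread.join()
print(received)
```

The point of the sketch is the decoupling: sender and receiver only share the queue name, and the messages wait in the message system until they are collected.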
Client/Server paradigm
Asymmetric role assignment:
Server: a process that waits for requests; it typically manages one or several shared resources.
Client: makes requests to servers and waits for their responses.
http://www.arcos.inf.uc3m.es/~dsd/
Diagram: Client 1 and Client 2 (client processes) send service requests to the Server (server process), which provides the service.
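A minimal sketch of the asymmetric roles using Python's standard `socketserver` module on the local machine. The service itself (upper-casing a line of text) and the names `UppercaseHandler` and `ask` are assumptions chosen for the example.

```python
import socket
import socketserver
import threading


class UppercaseHandler(socketserver.StreamRequestHandler):
    """Server role: wait for a request line and answer it."""

    def handle(self) -> None:
        request = self.rfile.readline().strip()
        self.wfile.write(request.upper() + b"\n")


def ask(address, text: str) -> str:
    """Client role: send one request and wait for the response."""
    with socket.create_connection(address, timeout=5) as conn:
        conn.sendall(text.encode() + b"\n")
        return conn.makefile("rb").readline().decode().strip()


# Port 0 lets the operating system pick a free port on localhost.
with socketserver.ThreadingTCPServer(("127.0.0.1", 0), UppercaseHandler) as server:
    threading.Thread(target=server.serve_forever, daemon=True).start()
    replies = [ask(server.server_address, f"hello from client {i}") for i in (1, 2)]
    server.shutdown()

print(replies)
```

The server never initiates communication; it only reacts to requests, while each client opens a connection, asks, and waits.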
Peer-to-Peer paradigm
Symmetric role assignment:
All participating processes have the same role.
A process can act both as a client and as a server.
Resources are shared among all computers.
Example: Gnutella
http://www.arcos.inf.uc3m.es/~dsd/
Diagram: Process 1 and Process 2 each send requests to, and receive responses from, the other.
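The symmetric roles can be sketched without any networking. In this illustration peers are ordinary objects in one process; the class `Peer` and its methods `connect`, `serve` and `fetch` are hypothetical and do not model the Gnutella protocol.

```python
class Peer:
    """A peer plays both roles: it serves its own files and requests others'."""

    def __init__(self, name: str, files: dict) -> None:
        self.name = name
        self.files = dict(files)
        self.neighbours = []

    def connect(self, other: "Peer") -> None:
        """Links are symmetric: each peer learns about the other."""
        self.neighbours.append(other)
        other.neighbours.append(self)

    def serve(self, filename: str):
        """Server role: answer a request coming from another peer."""
        return self.files.get(filename)

    def fetch(self, filename: str):
        """Client role: ask each neighbour until one has the file."""
        for peer in self.neighbours:
            content = peer.serve(filename)
            if content is not None:
                self.files[filename] = content  # now this peer can serve it too
                return content
        return None


alice = Peer("alice", {"a.txt": "from alice"})
bob = Peer("bob", {"b.txt": "from bob"})
alice.connect(bob)

from_bob = alice.fetch("b.txt")   # alice acts as client, bob as server
from_alice = bob.fetch("a.txt")   # roles reversed
print(from_bob, "|", from_alice)
```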
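Hybrid client/server + peer-to-peer
Hybrid: a central server handles discovery (the index), while peers exchange data directly.
Example: Napster
http://www.arcos.inf.uc3m.es/~dsd/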
Multitier Client/Server
Divide the application into two parts (client/server):
b) Client: user interface. Server: business logic + database access.
d) Client: user interface + business logic. Server: database access.
http://csis.pace.edu/~marchese/CS865/Lectures/Chap2/Chapter2a.htm
http://sce.uhcl.edu/helm/rationalunifiedprocess/process/workflow/ana_desi/co_dpatt.htm
Divide the application into two parts (client/server):
View-Broker
Broker-Model
http://www.svgopen.org/2004/papers/MakingSVGaWebServiceinaMessageBasedMVCArchitecture/
Remote Procedure Call paradigm
Goal: program distributed software in the same way as non-distributed software.
With RPC, communication looks the same as a local procedure call.
Examples: RPC, DCE, SOAP
http://www.arcos.inf.uc3m.es/~dsd/
Diagram: Process A calls proc1(arg1, arg2), proc2(arg1) and proc3(arg1, arg2, arg3) on Process B.
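A small sketch of a remote call that reads like a local one. It uses XML-RPC from the Python standard library, which is a different RPC system from the examples listed on the slide and is swapped in here only because it is easy to run. The procedure `proc1` borrows its name from the diagram; its body (adding two numbers) is an assumption.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def proc1(arg1: int, arg2: int) -> int:
    """Procedure exported by 'Process B' (body chosen for illustration)."""
    return arg1 + arg2


# "Process B": register the procedure and serve calls on a free local port.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(proc1)
threading.Thread(target=server.serve_forever, daemon=True).start()

# "Process A": the remote call is written like a local procedure call.
host, port = server.server_address
with ServerProxy(f"http://{host}:{port}/") as process_b:
    result = process_b.proc1(2, 3)

server.shutdown()
server.server_close()
print(result)
```

The marshalling of arguments, the network transfer and the unmarshalling of the result are all hidden behind `process_b.proc1(2, 3)`.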
Remote Method Invocation paradigm
Goal: program distributed software in the same way as non-distributed software.
Objects provide methods to request services; the objects are distributed across the networked system.
Example: Java RMI
http://www.arcos.inf.uc3m.es/~dsd/
Diagram: Process 1 invokes method1 and method2 on a remote object in Process 2 via RMI.
Network services paradigm
A directory service provides the reference (abstract location) to the available services.
Steps:
1. The requesting process asks the directory service for a service.
2. The directory service provides a reference to that service.
3. By using the reference, the requesting process interacts with the service.
Example: Java Jini
http://www.arcos.inf.uc3m.es/~dsd/
Diagram: Service requester, Directory service, Service object (steps 1, 2, 3).
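The three steps above can be sketched with an in-process registry. This is not the Jini API; the classes `DirectoryService` and `ClockService`, the service name "clock", and the fixed reply are all hypothetical.

```python
class DirectoryService:
    """Maps service names to references (here, plain Python objects)."""

    def __init__(self) -> None:
        self._services = {}

    def register(self, name: str, reference: object) -> None:
        self._services[name] = reference

    def lookup(self, name: str) -> object:
        if name not in self._services:
            raise LookupError(f"no service registered as {name!r}")
        return self._services[name]


class ClockService:
    """A toy service object; a real one would live on another computer."""

    def now(self) -> str:
        return "12:00"


directory = DirectoryService()
directory.register("clock", ClockService())

# Steps 1 and 2: ask the directory for the service and get a reference back.
reference = directory.lookup("clock")
# Step 3: interact with the service through the reference.
answer = reference.now()
print(answer)
```

The requester only needs to know the name "clock" and where the directory is; the location of the service itself stays abstract.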
Object Request Broker paradigm
The ORB works as a middleware layer.
The ORB redirects each request to the most appropriate object that provides the requested service.
It instantiates the classes/objects (heterogeneous objects).
Examples: CORBA, Microsoft DCOM, Java Beans
http://www.arcos.inf.uc3m.es/~dsd/
Diagram: Object requester, Object Request Broker, Object.
Mobile Agents paradigm
Mobile agent: an application or object that can be transported to other computers.
Idea:
1. The agent is started on one computer.
2. It automatically moves to other computers as needed.
3. It accesses the resources/services of each system it visits.
Examples: IBM Aglets, D'Agents
http://www.arcos.inf.uc3m.es/~dsd/
Diagram: an agent travels from Computer 1 to Computer 2, Computer 3 and Computer 4.
Object Space paradigm
Object spaces hide the details of locating resources and distributed objects. Technology based on components.
Object space: an entity that contains a set of objects.
Providers: place objects into the object space.
Requesters: subscribe to the object space and access its objects.
Example: JavaSpaces
http://www.arcos.inf.uc3m.es/~dsd/
Diagram: a Provider and Requesters read and update objects in the object space.
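A rough sketch of an object space held in memory. The operation names `write`, `read` and `take` follow the style of JavaSpaces operations, but the class `ObjectSpace`, the dictionary-based objects and the template matching rule are assumptions made for this illustration, not the JavaSpaces API.

```python
import threading


class ObjectSpace:
    """Shared container of objects, matched by template (a partial dict)."""

    def __init__(self) -> None:
        self._objects = []
        self._changed = threading.Condition()

    def _match(self, template: dict):
        for obj in self._objects:
            if all(obj.get(key) == value for key, value in template.items()):
                return obj
        return None

    def write(self, obj: dict) -> None:
        """Provider side: place an object into the space."""
        with self._changed:
            self._objects.append(obj)
            self._changed.notify_all()

    def read(self, template: dict, timeout: float = 1.0):
        """Requester side: return a matching object, leaving it in the space."""
        with self._changed:
            self._changed.wait_for(lambda: self._match(template) is not None, timeout)
            return self._match(template)

    def take(self, template: dict, timeout: float = 1.0):
        """Requester side: return a matching object and remove it."""
        with self._changed:
            self._changed.wait_for(lambda: self._match(template) is not None, timeout)
            obj = self._match(template)
            if obj is not None:
                self._objects.remove(obj)
            return obj


space = ObjectSpace()
space.write({"type": "task", "id": 1})           # provider

seen = space.read({"type": "task"})              # still in the space
taken = space.take({"type": "task"})             # removed from the space
gone = space.read({"type": "task"}, timeout=0.01)
print(seen, taken, gone)
```

Providers and requesters never address each other directly; they only agree on the shape of the objects they exchange through the space.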
Collaborative applications paradigm
Groupware: collaborative session where processes participate.
Each participant contributes through:
Multicast.
Virtual blackboard.
http://www.arcos.inf.uc3m.es/~dsd/
Diagram: message-based groupware (participants exchange messages) and blackboard-based groupware.
Distributed computation paradigms… (summary)
Ordered by abstraction level, from high to low:
Object space, collaborative applications
Network service, object request broker, mobile agents
Remote procedure, remote method
Client/Server, peer-to-peer
Message passing
http://www.arcos.inf.uc3m.es/~dsd/
What is best? Depends…
Each one has its advantages and disadvantages.
Key aspects:
Abstraction level vs overhead.
Scalability.
Fault-tolerance support, platform support, etc.
Example: Hadoop/HDFS
http://bradhedlund.com/2011/09/10/understanding-hadoop-clusters-and-the-network/
http://www.monitis.com/blog/2013/12/19/big-data-and-hadoop-whats-it-all-about/
https://www.simple-talk.com/cloud/data-science/analyze-big-data-with-apache-hadoop-on-windows-azure-preview-service-update-3/