lesson11 - introduction to distributed computing (v1b)

Grupo de Arquitectura de Computadores, Comunicaciones y Sistemas INTRODUCTION TO DISTRIBUTED COMPUTING Distributed Computing Lesson 11 for the Alejandro Calderón Mateos work

Upload: alejandro-calderon-mateos

Post on 27-Jan-2017


Goals

Knowing the definition and characteristics of distributed computing.

Knowing the main evolution of distributed computing.

Knowing advanced technologies for data processing.

Scope

Grid computing

Cluster computing

Shared-memory (SM) parallel computing

Contents

What a distributed system is.

What kind of elements are needed.

How elements can be organized.

What paradigms are used to build one.

Example of distributed system.

Introduction

“We define a distributed system as one in which hardware or software components located at networked computers communicate and coordinate their actions only by passing messages” (Coulouris, Dollimore and Kindberg)

“A distributed system is a collection of independent computers that appears to its users as a single coherent system” (Tanenbaum and van Steen)

Introduction

“A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable”

Leslie Lamport

Introduction

http://www.nethistory.info/History%20of%20the%20Internet/origins.html#apps

~ 1960

Centralized systems

Multiple users share the resources of a centralized system at all times.

Costs are divided.

It has a single point of control: easy maintenance.

Centralized systems have non-autonomous components.

http://books.cs.luc.edu/distributedsystems/issues.html

Roadmap to connected computers

~1969: ARPANET

~1970: UCLA-BBN, by AT&T

1972: TCP/IP

~1974: Xerox PARC

~1975: UUCP, Mail, etc.

1981: IBM PC

1983: IEEE 802.3

Figure sources:

http://www.nethistory.info/History%20of%20the%20Internet/origins.html#apps

http://microchip.wdfiles.com/local--files/tcpip:tcp-ip-five-layer-model/TCPIP_5_layer_overview.JPG

http://edc.tversu.ru/elib/inf/0091/tcpip/figs/tcp2_0303.gif

https://en.wikipedia.org/wiki/IBM_Personal_Computer

http://www.thefoa.org/tech/ref/appln/OLAN.html

http://cdn.arstechnica.net/2011/09/23/hdd-capacity-scale-4e7ce6c-intro.png

Internet (network applications)

<1994: Internet starts

Computer network (and network applications): computers are explicitly visible

https://bitcointalk.org/index.php?topic=430357.0

Evolution…

Centralized vs. distributed

https://www.dcsorg.com/images/image_centralized_management.jpg

Distributed system: the existence of multiple elements is transparent


Distributed systems

Users share the resources.

Extensibility (scalability) with better price/performance.

Reliability, if redundant elements are properly used.

But there are multiple points of failure, with multiple autonomous components.

http://books.cs.luc.edu/distributedsystems/issues.html

Distributed systems

Speed through parallelism.

But difficult to design.

But multiple points of failure.

Diversification through heterogeneous technology and autonomous components.

But non-trivial integration and maintenance costs.

http://books.cs.luc.edu/distributedsystems/issues.html

Distributed application fundamentals

1960 - 2000: distributed software (Lesson 11)

http://www.fearnleyeducation.com/files/PageImages/clients%20-%20server%20model.PNG

Roadmap to connected computers

1985: 1G

~1990: Internet explodes

2007: Smartphones

Figure sources:

https://bitcointalk.org/index.php?topic=430357.0

http://img.frbiz.com/news/145317_s/Mobile_communication_base_station_radio_equipment_greet_with_the_explosive_growth_mobile_communication_base_station_radio_equipment_3G_communication_industry.jpg

http://www.videcom.com/Portals/0/iphone1.png & http://images.techtimes.com/data/images/full/127370/events-for-gmail.jpg?w=600

http://kburnett.net/business-case/technology/mobility-2/

Internet

http://kburnett.net/business-case/technology/mobility-2/

Internet of Things (IoT)

http://www.mercurynews.com/business/ci_24836116/internet-things-seen-bonanza-bay-area-businesses

http://knowledgeblob.com/technology/a-brief-about-internet-of-things-iot/

From Big Data to Huge Data… living services

http://www.elandroidelibre.com/2015/10/living-services-la-tercera-revolucion-tras-la-web-y-los-smartphones.html

http://tarrysingh.com/2014/07/fog-computing-happens-when-big-data-analytics-marries-internet-of-things/

http://blog.atlasrfidstore.com/wp-content/uploads/2013/07/beecham_research_internet_of_things.jpg

Distributed application fundamentals

1960 - 2020: distributed software

http://www.fearnleyeducation.com/files/PageImages/clients%20-%20server%20model.PNG


Elements in a distributed system

“We define a distributed system as one in which hardware or software components located at networked computers communicate and coordinate their actions only by passing messages”

Elements in a distributed system

http://cdn.comsol.com/wordpress/2014/02/Speeding-up-communications-distributed-memory-computing-copy.jpg

Distributed Systems Challenges

http://cdn.comsol.com/wordpress/2014/02/Speeding-up-communications-distributed-memory-computing-copy.jpg

Common but false assumptions (the fallacies of distributed computing):

The network is reliable

The network is secure

The network is homogeneous

The topology does not change

Latency is zero

Bandwidth is infinite

Transport cost is zero

There is one administrator
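
The assumptions about latency and bandwidth listed above can be put into numbers. The following is a minimal sketch, assuming a very simple cost model (a fixed latency per message plus size divided by bandwidth); the function names and the figures used are illustrative assumptions, not measurements.

```python
def transfer_time_s(size_bytes: float, latency_s: float, bandwidth_bytes_per_s: float) -> float:
    """Estimated time to deliver one message: fixed latency plus transmission time."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return latency_s + size_bytes / bandwidth_bytes_per_s


def sequential_time_s(n_messages: int, size_bytes: float, latency_s: float,
                      bandwidth_bytes_per_s: float) -> float:
    """Estimated time for n messages sent one after another, each paying the latency."""
    return n_messages * transfer_time_s(size_bytes, latency_s, bandwidth_bytes_per_s)


# Assumed link: 50 ms latency, 10 MB/s bandwidth.
one_big_message = transfer_time_s(1_000_000, 0.05, 10_000_000)
many_small_messages = sequential_time_s(1000, 1000, 0.05, 10_000_000)
```

Under this model the same megabyte costs far more when split into a thousand sequential messages, because every message pays the latency again.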


Distributed Systems Challenges

Heterogeneity

Openness

Security

Scalability

Failure Handling

Concurrency

Transparency

Distributed Systems Challenges: heterogeneity

Different networks are connected thanks to standard Internet protocols.

Different computing hardware executes the same code thanks to virtualization (virtual machines).

Different software interacts thanks to middleware software layers.

Distributed Systems Challenges: openness

Openness is determined by the degree to which new resource-sharing services can be added and made available to client programs (service publication).

A new resource-sharing service is described by the interfaces that software developers will use (interface notification).

IS473 at http://www.xpowerpoint.com/ppt/system-model-distributed-systems.html

Distributed Systems Challenges: security

Security for information resources has three main components:

Confidentiality: avoid unauthorized access.

Integrity: avoid unwanted alteration.

Availability: avoid denial of service.

Firewalls, encryption, and anti-virus software are used.

IS473 at http://www.xpowerpoint.com/ppt/system-model-distributed-systems.html
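
As an illustration of the integrity component only, the sketch below attaches a keyed hash (HMAC) to a message so that the receiver can detect alteration. It assumes sender and receiver already share a secret key; the function names `sign` and `verify` are hypothetical, and confidentiality and availability are not addressed here.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag for the message using the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Return True only if the tag matches the message (constant-time comparison)."""
    return hmac.compare_digest(sign(key, message), tag)


shared_key = b"example-shared-key"      # assumption: distributed out of band
tag = sign(shared_key, b"transfer 10 EUR")
```

A receiver that gets an altered message with the original tag would see `verify` return `False`.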

Distributed Systems Challenges: scalability

A system is scalable if it remains effective when there is a significant increase in the number of resources and the number of users (represented by client programs).

Many challenges:

Controlling performance degradation.

Controlling the cost of physical resources.

Preventing resources from running out.

Avoiding performance bottlenecks.

IS473 at http://www.xpowerpoint.com/ppt/system-model-distributed-systems.html

Distributed Systems Challenges: failure handling

Avoid letting a single failing element stop the whole distributed system.

A service can be made more fault tolerant by using redundant components.

Recovery from failures involves n-versioning and/or checkpointing, so that the state of the system is stable after recovery.

IS473 at http://www.xpowerpoint.com/ppt/system-model-distributed-systems.html
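
The idea of tolerating faults through redundant components can be sketched as a client that tries each replica in turn. This is a minimal sketch under stated assumptions: replicas are modelled as plain Python callables, a failure is modelled as `ConnectionError`, and all names (`call_with_failover`, `broken_replica`, `healthy_replica`) are hypothetical.

```python
from typing import Callable, Sequence


class AllReplicasFailed(Exception):
    """Raised when no redundant replica could serve the request."""


def call_with_failover(replicas: Sequence[Callable[[str], str]], request: str) -> str:
    """Send the request to each replica in order and return the first successful reply."""
    failures = 0
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError:
            failures += 1          # this replica is down, try the next one
    raise AllReplicasFailed(f"{failures} replica(s) failed")


def broken_replica(request: str) -> str:
    raise ConnectionError("replica down")


def healthy_replica(request: str) -> str:
    return f"ok:{request}"
```

With at least one healthy replica in the list, the failure of the others stays hidden from the caller.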

Distributed Systems Challenges: concurrency

A shared-resource service implementation must ensure that it operates properly in a concurrent environment.

The implementation of interface operations must be synchronized in order to keep the resource consistent.

Concurrency mechanisms are used: locks, semaphores, monitors, etc.

IS473 at http://www.xpowerpoint.com/ppt/system-model-distributed-systems.html
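
A small sketch of a synchronized interface operation, using a lock from Python's standard `threading` module. The shared resource is just a counter, and the `Counter` class and `run_workers` helper are illustrative names, not part of the lesson.

```python
import threading


class Counter:
    """A shared resource whose operations are synchronized with a lock."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        with self._lock:            # only one thread updates the value at a time
            self._value += 1

    @property
    def value(self) -> int:
        with self._lock:
            return self._value


def run_workers(counter: Counter, n_threads: int, n_increments: int) -> None:
    """Start n_threads threads that each increment the counter n_increments times."""
    def work() -> None:
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
```

Because every update happens inside the lock, no increment is lost when several threads run at once.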

Distributed Systems Challenges: transparency

Access transparency: hide differences in how resources are accessed.

Location transparency: hide where a resource is located.

Migration transparency: hide that a resource may move to another place.

Relocation transparency: hide that a resource may move while it is in use.

Replication transparency: hide that a resource is replicated.

Concurrency transparency: hide that a resource may be shared in parallel.

Failure transparency: hide the failure and recovery of a resource.

Distributed Systems Challenges: summary

Heterogeneity, openness, security, scalability, failure handling, concurrency, transparency.

Extra software/hardware is needed in a distributed system so that the existence of multiple elements is transparent.

http://www.pixempire.com/images/preview/orchestra-director-with-stick-icon.jpg


(Example of) services that help with distributed systems challenges

http://thenewstack.io/helix-a-linkedin-framework-for-distributed-systems-development/

(Example of) elements that help with distributed systems challenges

http://www.ukoln.ac.uk/distributed-systems/jisc-ie/arch/

Common elements in distributed systems

Distributed algorithms: Coordination and Agreement; Time and Global States

System services: Security and Fault Tolerance; Name Service; Networking + Internetworking

Distributed services: Distributed File System; Distributed Multimedia System; Mobile and Ubiquitous Computing

Shared data: Transactions and Concurrency Control; Distributed Transactions; Replication and Consistency

Middleware: Remote invocation; Distributed objects and components

Distributed models: Client/Server; Peer-to-Peer

Distributed System Software Stack

http://books.cs.luc.edu/distributedsystems/issues.html


Main paradigms

1. Message passing

2. Client/Server and Peer-to-Peer

3. Remote procedure/method call

4. Network services, Object Request Broker and mobile agents

5. Object spaces and collaborative applications

http://www.arcos.inf.uc3m.es/~dsd/

What a paradigm is…

Abstraction: encapsulation or hiding of details.

Paradigm: a pattern, an example or a model.

Strategy: identify the basic pattern or model and classify the details according to it.

Two main characteristics: process communication and event synchronization.

http://www.arcos.inf.uc3m.es/~dsd/

Distributed computation paradigms…

Ordered by abstraction level, from high to low:

Object space, collaborative applications

Network service, object request broker, mobile agents

Remote procedure, remote method

Client/Server, peer-to-peer

Message passing

http://www.arcos.inf.uc3m.es/~dsd/

Message passing

Basic (and classic) paradigm for distributed applications:

One process sends a request message.

Another process receives the request, processes it, and sends back a response message.

Figure: Process A and Process B exchange messages m1, m2 and m3 by message passing.

http://www.arcos.inf.uc3m.es/~dsd/
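
The request/response exchange described for message passing can be sketched in a few lines. This is only an illustration: two threads stand in for Process A and Process B, two in-memory queues stand in for the network, and the message fields (`id`, `body`, `reply_to`) are assumptions made for the example.

```python
import queue
import threading


def process_b(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Receive request messages, process them, and send back response messages."""
    while True:
        message = inbox.get()
        if message is None:                 # sentinel: no more requests
            break
        outbox.put({"reply_to": message["id"], "body": message["body"].upper()})


def exchange(bodies: list) -> list:
    """Process A: send one request per body and collect the responses in order."""
    a_to_b = queue.Queue()
    b_to_a = queue.Queue()
    worker = threading.Thread(target=process_b, args=(a_to_b, b_to_a))
    worker.start()
    responses = []
    for i, body in enumerate(bodies):
        a_to_b.put({"id": i, "body": body})     # send request message
        responses.append(b_to_a.get())          # wait for the response message
    a_to_b.put(None)
    worker.join()
    return responses
```

Calling `exchange(["m1", "m2", "m3"])` sends three requests and returns three responses, one per message.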

MOM paradigm

Message-oriented middleware:

The message system helps processes exchange messages: it receives each message, stores it in the proper queue, and delivers it to the destination process.

Two types: point-to-point and publish/subscribe.

Examples: IBM MQSeries, Microsoft MSMQ, Java JMS

http://www.arcos.inf.uc3m.es/~dsd/

Figure: the sender hands messages to the message system, which queues them and delivers them to the receiver.
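
The two delivery styles of message-oriented middleware can be sketched with an in-process broker. This is a toy model under stated assumptions: the `MessageBroker` class and its method names are hypothetical and do not correspond to the API of MQSeries, MSMQ or JMS, and real products add persistence, networking and acknowledgements.

```python
from collections import defaultdict, deque


class MessageBroker:
    """Toy message system with point-to-point queues and publish/subscribe topics."""

    def __init__(self) -> None:
        self._queues = defaultdict(deque)        # queue name -> pending messages
        self._subscribers = defaultdict(list)    # topic -> subscriber callbacks

    # Point-to-point: each message is consumed by exactly one receiver.
    def send(self, queue_name: str, message: str) -> None:
        self._queues[queue_name].append(message)

    def receive(self, queue_name: str):
        pending = self._queues[queue_name]
        return pending.popleft() if pending else None

    # Publish/subscribe: each message is delivered to every subscriber of the topic.
    def subscribe(self, topic: str, callback) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: str) -> None:
        for callback in self._subscribers[topic]:
            callback(message)
```

The sender never addresses the receiver directly; it only names a queue or a topic, and the broker decides who gets the message.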


Client/Server paradigm

Asymmetric role assignment:

Server: a process that waits for requests; it typically manages one or several shared resources.

Client: a process that makes requests to servers and waits for their responses.

http://www.arcos.inf.uc3m.es/~dsd/

Figure: client processes (Client 1, Client 2) send service requests to the server process, which provides the service.
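
A minimal sketch of the asymmetric roles, assuming a line-based text protocol invented for this example. The server side manages a shared resource (a small inventory) and waits for requests; each client sends a request and waits for the reply. `socket.socketpair()` is used so the sketch runs without opening a network port; the names `handle_client`, `request` and `run_demo` are hypothetical.

```python
import socket
import threading


def handle_client(conn: socket.socket, inventory: dict, lock: threading.Lock) -> None:
    """Server role: wait for requests (one item name per line) and answer each one."""
    with conn, conn.makefile("r", encoding="utf-8") as reader:
        for line in reader:                      # ends when the client closes
            item = line.strip()
            with lock:                           # the inventory is the shared resource
                stock = inventory.get(item, 0)
                if stock > 0:
                    inventory[item] = stock - 1
            reply = "OK" if stock > 0 else "SOLD_OUT"
            conn.sendall((reply + "\n").encode("utf-8"))


def request(conn: socket.socket, reader, item: str) -> str:
    """Client role: send one request and wait for its response."""
    conn.sendall((item + "\n").encode("utf-8"))
    return reader.readline().strip()


def run_demo():
    """Two clients ask, one after the other, for the single ticket the server manages."""
    inventory = {"ticket": 1}
    lock = threading.Lock()
    replies, threads = [], []
    for _ in range(2):
        client_end, server_end = socket.socketpair()
        thread = threading.Thread(target=handle_client, args=(server_end, inventory, lock))
        thread.start()
        threads.append(thread)
        with client_end, client_end.makefile("r", encoding="utf-8") as reader:
            replies.append(request(client_end, reader, "ticket"))
    for thread in threads:
        thread.join()
    return replies, inventory
```

In a real deployment the server would listen on a network address and accept connections; the roles and the request/response flow stay the same.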

Peer-to-Peer paradigm

Symmetric role assignment:

All participating processes have the same role.

A process can act both as a client and as a server.

Resources are shared among all computers.

Example: Gnutella

http://www.arcos.inf.uc3m.es/~dsd/

Figure: Process 1 and Process 2 send requests and responses to each other.

Hybrid client/server + peer-to-peer

Example: Napster

http://www.arcos.inf.uc3m.es/~dsd/

Multitier Client/Server

Divide the application into two parts (client/server), for example:

b) User interface on one side; business logic + database access on the other.

d) User interface + business logic on one side; database access on the other.

http://csis.pace.edu/~marchese/CS865/Lectures/Chap2/Chapter2a.htm

http://sce.uhcl.edu/helm/rationalunifiedprocess/process/workflow/ana_desi/co_dpatt.htm

Multitier Client/Server

Divide the application into two parts (client/server):

View-Broker

Broker-Model

http://www.svgopen.org/2004/papers/MakingSVGaWebServiceinaMessageBasedMVCArchitecture/


Remote Procedure Call paradigm

Goal: program distributed software in the same way as non-distributed software.

With RPC, communication looks the same as a local procedure call.

Examples: RPC, DCE, SOAP

http://www.arcos.inf.uc3m.es/~dsd/

Figure: Process A calls proc1(arg1, arg2), proc2(arg1) and proc3(arg1, arg2, arg3) on Process B.
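
How a remote call can look like a local one is sketched below with a hand-written client stub and server dispatcher. This is not the API of any of the listed technologies: the JSON wire format, the `ClientStub` class, `server_dispatch` and the example procedures are assumptions for illustration, and the "network" is a plain function call.

```python
import json


def add(a, b):
    return a + b


def greet(name):
    return f"hello {name}"


# Procedures the server exposes (hypothetical examples).
PROCEDURES = {"add": add, "greet": greet}


def server_dispatch(request_bytes: bytes) -> bytes:
    """Server side: unmarshal the call, run the procedure, marshal the result."""
    call = json.loads(request_bytes.decode("utf-8"))
    try:
        result = PROCEDURES[call["proc"]](*call["args"])
        reply = {"ok": True, "result": result}
    except Exception as exc:
        reply = {"ok": False, "error": str(exc)}
    return json.dumps(reply).encode("utf-8")


class ClientStub:
    """Client side: turns attribute access into a marshalled remote call."""

    def __init__(self, transport) -> None:
        self._transport = transport      # callable bytes -> bytes; stands in for the network

    def __getattr__(self, proc: str):
        def remote_call(*args):
            request = json.dumps({"proc": proc, "args": list(args)}).encode("utf-8")
            reply = json.loads(self._transport(request).decode("utf-8"))
            if not reply["ok"]:
                raise RuntimeError(reply["error"])
            return reply["result"]
        return remote_call


stub = ClientStub(server_dispatch)
```

The caller writes `stub.add(2, 3)` as if `add` were local; marshalling, transport and dispatch stay hidden inside the stub.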

Remote Method Invocation paradigm

Goal: program distributed software in the same way as non-distributed software.

Objects provide methods to request services; the objects are distributed across the networked system.

Example: Java RMI

http://www.arcos.inf.uc3m.es/~dsd/

Figure: Process 1 invokes method1 and method2 on a remote object in Process 2 through RMI.


Network services paradigm

A directory service provides the reference (abstract location) to the available services.

Steps:

1. The requesting process asks the directory service for a service.

2. The directory service provides a reference to that service.

3. By using the reference, the requesting process interacts with the service.

Example: Java Jini

http://www.arcos.inf.uc3m.es/~dsd/

Figure: service requester, directory service and service object, with interactions 1, 2 and 3.
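
The three steps of the network services paradigm can be sketched with an in-process directory. This is a simplification under stated assumptions: the reference returned here is the service object itself, whereas a real directory would return a network address or proxy; `DirectoryService` and `ClockService` are hypothetical names and not part of Jini.

```python
class DirectoryService:
    """Maps service names to references to the available services."""

    def __init__(self) -> None:
        self._services = {}

    def register(self, name: str, reference) -> None:
        self._services[name] = reference

    def lookup(self, name: str):
        try:
            return self._services[name]
        except KeyError:
            raise LookupError(f"no service named {name!r}") from None


class ClockService:
    """Example service object (returns a fixed time to keep the sketch deterministic)."""

    def now(self) -> str:
        return "12:00"


directory = DirectoryService()
directory.register("clock", ClockService())

# Steps 1 and 2: the requester asks the directory and gets a reference back.
clock = directory.lookup("clock")
# Step 3: the requester interacts with the service through that reference.
current_time = clock.now()
```

The requester only needs to know the name "clock"; where the service lives is resolved by the directory.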

Object Request Broker paradigm

The ORB works as a middleware layer.

The ORB redirects requests to the most appropriate object that provides the requested service.

It instantiates the classes/objects (heterogeneous objects).

Examples: CORBA, Microsoft DCOM, Java Beans

http://www.arcos.inf.uc3m.es/~dsd/

Figure: the object requester reaches the object through the Object Request Broker.

Mobile Agents paradigm

Mobile agent: an application or object that can be transported to other computers.

Idea:

1. The agent is started on one computer.

2. It automatically moves to other computers as needed.

3. It accesses the resources/services of each system it visits.

Examples: IBM Aglets, D'Agents

http://www.arcos.inf.uc3m.es/~dsd/

Figure: an agent travels from Computer 1 to Computer 2, Computer 3 and Computer 4.


Object Space paradigm

Object spaces hide the details of locating resources and distributed objects. Component-based technology.

Object space: an entity that contains a set of objects.

Providers: put objects into the object space.

Requesters: subscribe to the object space and access its objects.

Example: JavaSpaces

http://www.arcos.inf.uc3m.es/~dsd/

Figure: a provider and several requesters read and update objects in the object space.
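
The provider/requester interaction with an object space can be sketched as a small in-memory, thread-safe store matched by template. This is an illustrative model, not the JavaSpaces API: the `ObjectSpace` class, its `write`/`read`/`take` methods and the dictionary-based templates are assumptions for the example.

```python
import threading


class ObjectSpace:
    """Shared space of objects (dicts); requesters find them by partial template."""

    def __init__(self) -> None:
        self._objects = []
        self._cond = threading.Condition()

    def write(self, obj: dict) -> None:
        """Provider: put an object into the space and wake up waiting requesters."""
        with self._cond:
            self._objects.append(obj)
            self._cond.notify_all()

    def _find(self, template: dict):
        for obj in self._objects:
            if all(key in obj and obj[key] == value for key, value in template.items()):
                return obj
        return None

    def read(self, template: dict, timeout=None):
        """Requester: return a matching object without removing it (None on timeout)."""
        with self._cond:
            self._cond.wait_for(lambda: self._find(template) is not None, timeout)
            return self._find(template)

    def take(self, template: dict, timeout=None):
        """Requester: remove and return a matching object (None on timeout)."""
        with self._cond:
            self._cond.wait_for(lambda: self._find(template) is not None, timeout)
            obj = self._find(template)
            if obj is not None:
                self._objects.remove(obj)
            return obj
```

Providers and requesters never address each other; they only agree on what the objects in the space look like.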

Collaborative applications paradigm

Groupware: collaborative session where processes participate.

Each participant contributes through:

Multicast.

Virtual blackboard.

http://www.arcos.inf.uc3m.es/~dsd/

Figure: message-based groupware (participants exchange messages) and blackboard-based groupware.


What is best? Depends…

Each one has its advantages and disadvantages.

Key aspects:

Abstraction level vs. overhead.

Scalability.

Fault-tolerance support, platform support, etc.


Example: Hadoop/HDFS

http://bradhedlund.com/2011/09/10/understanding-hadoop-clusters-and-the-network/

http://www.monitis.com/blog/2013/12/19/big-data-and-hadoop-whats-it-all-about/

https://www.simple-talk.com/cloud/data-science/analyze-big-data-with-apache-hadoop-on-windows-azure-preview-service-update-3/
