Lecture 1 (jnu.ac.in)
Roadmap
• Introduction
• Parallel and Distributed Computing Landscape
• Cloud vs. Grid
• Cloud Computing
  – Possibilities
  – Some Characteristics of Cloud Computing
  – SaaS and Cloud Computing
  – Supercomputing & Cloud Computing
• Cloud Examples
• Conclusions
• References
Introduction
• The landscape of parallel and distributed computing has significantly evolved over the last sixty years.
• It is forecast that between 20 and 50 billion devices will be added to the internet by 2020.
• 43 trillion gigabytes of data will be generated and will need to be processed in cloud data centers.
• How has the journey been so far?
1: http://www.gartner.com/newsroom/id/3165317
2: http://spectrum.ieee.org/tech-talk/telecom/internet/popular-internet-of-things-forecast-of-50-billion-devices-by-2020-is-outdated
Data Centers
• In an increasingly data-driven world, solutions that provide real-time information benefit everyone: the data center, its customers, and ultimately the consumers at the end of the chain.
(1) Drinking Water Distribution Systems: a water distribution network seamlessly delivering water as a utility to users
(2) Electricity Distribution Systems: seamless electricity delivery as a utility to users (figure label: kerosene lamp or candles)
(3) Computing Resource Distribution Systems: seamless computing delivery as a utility, 24/7 from anywhere (figure labels: UNIX OS Web Tier, Windows OS Web Tier, IDS)
Computing Trends
[Timeline figure. Year: 1960s. Labels: Internet, Conventional Computing]
Parallel computing and distributed computing help solve a single large problem by breaking it down into several tasks, each of which is computed on an individual processor of the system.
Mainframe
• Used primarily by large organizations: users submit jobs to the mainframe, which returns the results.
• Common uses include bulk data processing (e.g., census, enterprise resource planning, and transaction processing).
[Timeline figure. Year: 1970s, 1980s]
Distributed Computing (distributed memory)
[Figure: four CPUs, each with its own memory (Mem), connected by a network]
Parallel Computing (shared memory)
[Figure: four CPUs attached to one shared memory]
Client-Server
• Clients are diskless, and all file, print, HTTP, and even computation requests are sent to servers.
• Servers are minicomputers dedicated to one or more types of service.
• Communication is through RPC (Remote Procedure Call) or RMI (Remote Method Invocation).
• No process migration is involved.
[Figure: diskless clients connected to a server over a LAN (RPC/RMI)]
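The request/response style of client-server communication can be sketched with Python's standard-library XML-RPC modules. XML-RPC is only one concrete RPC mechanism, picked here for illustration; the slides do not name a specific one. The remote function `add`, the loopback address, and the helper `demo` are hypothetical.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(a, b):
    """Procedure that lives on the server and is called remotely."""
    return a + b


def demo():
    # Bind to port 0 so the operating system picks a free port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")
    port = server.server_address[1]

    # Serve requests in a background thread (stands in for the server machine).
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        # The client holds no logic of its own: it sends the call to the server.
        with ServerProxy(f"http://127.0.0.1:{port}") as proxy:
            return proxy.add(2, 3)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The client only marshals arguments and waits for the reply; the computation itself runs in the server process, which matches the "no process migration" point.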
Cluster Computing
[Figure: a cluster with a master node, SAN, and client; link labels: 1Gbps, 100Gbps LAN. Year: 2000]
• Tightly coupled distributed system
• A cluster consists of a master node and several slave nodes connected by a high-speed network.
• Provides high performance and high throughput (harnessing many idle resources), with requests served in parallel.
• Condor was a notable example.
Bahman Javadi, Jemal H. Abawajy, Mohammad K. Akbari: Analytical modeling of interconnection networks in heterogeneous multi-cluster systems. The Journal of Supercomputing 40(1): 29-47 (2007)
Grid Computing
[Figure: Cluster Computing (Bengaluru), Cluster Computing (Kolkata), and Supercomputer (New Delhi) linked into a grid; each cluster has a master, slaves, and clients. Year: 2000s]
• A grid consists of loosely coupled supercomputers and clusters spread across different administrative domains.
• Suitable for very large problems needing lots of CPU, memory, etc.
• Requires distributed resource management & scheduling
Z. Pooranian, M. Shojafar, J. H. Abawajy, A. Abraham, "An Efficient Meta-heuristic Algorithm for Grid Computing", Springer, Journal of Combinatorial Optimization (JOCO), ISSN: 1382-6905, Impact Factor: 0.939, Vol. 30, Iss. 3, pp. 413-434, October 2015
Data Grids
[Figure: a data grid linking compute sites, storage facilities, instruments, and scientists through an information and discovery service]
Jemal H. Abawajy, Mustafa Mat Deris: Data Replication Approach with Consistency Guarantee for Data Grid. IEEE Trans. Computers 63(12): 2975-2987 (2014)
Cloud Computing Motivation
[Figure: IT capacity versus time, showing the load forecast, the actual load, and the allocated IT capacities]
• In a non-cloud view, there are inefficiencies:
  – "Waste" of capacities
  – "Under-supply" of capacities
  – Fixed cost of IT capacities
  – Barrier for innovations
• Cloud computing is the on-demand delivery of computing as a service, where you pay only for what you use.
Virtualization Technology
• The hypervisor (VMM) abstracts the hardware (HW) from the OS.
• A VM is software (SW) that executes applications as if they were running on a physical machine (PM).
[Figure: consumer applications (App 1, App 2, App 3) run in a VM pool (UNIX OS Web Tier, Windows OS Web Tier) hosted on a remote physical machine pool]
M. Shojafar, C. Canali, R. Lancellotti, J. H. Abawajy, "Adaptive Computing-plus-Communication Optimization Framework for Multimedia Processing in Cloud Systems", IEEE Transactions on Cloud Computing, TCC, ISSN: 2168-7161, Impact Factor: 1.59, Vol. PP, Iss. 99, pp. 1-14, October 2017
Cloud Improves
• Elasticity
• Automated management
• Availability
• Multi-tenancy
[Figure: IT capacity versus time, with the allocated IT capacities following the actual load and the load forecast]
• Reduction of initial investments
• Reduction of "over-supply"
• No "under-supply"
• Possible reduction of IT capacities in case of reduced load
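The difference between a fixed allocation and an elastic one can be made concrete with a little arithmetic. The sketch below is illustrative only: the load series, the fixed capacity of 70, and the allocation unit of 10 are hypothetical numbers, and the elastic policy assumes capacity can be resized instantly to match each period's load.

```python
import math


def waste_and_shortfall(load, allocated):
    """Return (over-supply, under-supply), each summed over all periods."""
    waste = sum(max(a - l, 0) for l, a in zip(load, allocated))
    shortfall = sum(max(l - a, 0) for l, a in zip(load, allocated))
    return waste, shortfall


def elastic_allocation(load, unit=10):
    """Allocate whole units of capacity, just enough to cover each period."""
    return [math.ceil(l / unit) * unit for l in load]


load = [20, 35, 80, 120, 60, 25]   # hypothetical actual load per period
fixed = [70] * len(load)           # capacity bought up front from a forecast
elastic = elastic_allocation(load)

print(waste_and_shortfall(load, fixed))    # (140, 60)
print(waste_and_shortfall(load, elastic))  # (10, 0)
```

With these numbers the fixed allocation wastes 140 units and falls 60 units short, while the elastic allocation wastes 10 units (rounding up to whole units) and is never under-supplied.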
Cloud Service Models
• Private Cloud (On-Premise): owner manages everything
• Public Cloud (IaaS): vendor manages the infrastructure; owner manages the rest
• Public Cloud (PaaS): vendor manages the infrastructure and platform; owner manages the rest
• Public Cloud (SaaS): vendor manages everything
[Figure: for each model, the stack of Storage, Networking, Servers, Databases, Virtualization, Runtimes, Applications, and Security & Integration, grouped into Infrastructure and Platform and marked as managed by owner or managed by vendor]
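The split of responsibilities across the service models can be written down as a small lookup table. This is a sketch under an assumption: the slides list the layers but the extracted text does not show which layer belongs to "Infrastructure" and which to "Platform", so the grouping below is a common convention, not the author's stated one. The names `VENDOR_MANAGED` and `managed_by` are hypothetical.

```python
LAYERS = [
    "Storage", "Networking", "Servers", "Virtualization",
    "Databases", "Runtimes", "Security & Integration", "Applications",
]

# Assumed grouping of layers (not given layer by layer in the slides).
INFRASTRUCTURE = {"Storage", "Networking", "Servers", "Virtualization"}
PLATFORM = {"Databases", "Runtimes", "Security & Integration"}

# Layers the vendor manages under each model; the owner manages the rest.
VENDOR_MANAGED = {
    "Private (On-Premise)": set(),
    "IaaS": INFRASTRUCTURE,
    "PaaS": INFRASTRUCTURE | PLATFORM,
    "SaaS": set(LAYERS),
}


def managed_by(model, layer):
    """Return 'vendor' or 'owner' for a layer under a service model."""
    return "vendor" if layer in VENDOR_MANAGED[model] else "owner"


for model in VENDOR_MANAGED:
    owner_layers = [l for l in LAYERS if managed_by(model, l) == "owner"]
    print(f"{model}: owner manages {owner_layers or 'nothing'}")
```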
Cloud Types
• Public cloud: third-party, multi-tenant cloud infrastructure and services, available on a subscription basis
• Private cloud: cloud model run within a company's own data center or infrastructure, for internal and/or partner use
• Hybrid cloud: mixed usage of private and public clouds, leasing public cloud services when private cloud capacity is insufficient
Bahman Javadi, Jemal H. Abawajy, Richard O. Sinnott: Hybrid Cloud resource provisioning policy in the presence of resource failures. CloudCom 2012: 10-17
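The hybrid rule, leasing public capacity only when private capacity runs out, can be sketched in a few lines. This is an illustration and not the provisioning policy of the cited paper; the function name, the capacity units, and the price are hypothetical.

```python
def place_request(demand, private_free, public_price_per_unit):
    """Serve a demand from private capacity first, then lease the remainder."""
    on_private = min(demand, private_free)
    on_public = demand - on_private
    return {
        "private": on_private,
        "public": on_public,
        "public_cost": on_public * public_price_per_unit,
    }


# Demand fits in the private cloud: nothing is leased.
print(place_request(demand=30, private_free=50, public_price_per_unit=2))
# Demand exceeds private capacity: the overflow of 30 units is leased.
print(place_request(demand=80, private_free=50, public_price_per_unit=2))
```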
• A plethora of energy-limited devices (smartphones, tablets, and wearables) are increasingly becoming a mainstream element of our lives.
[Figure: smart things (wearables, TV, meter, phone) connect to a cloud layer (UNIX OS Web Tier, Windows OS Web Tier) serving applications: smart house, smart healthcare, smart cities, smart grid, smart cars]
Sara Ghanavati, Jemal Abawajy and Davood Izadi (2017), Opportunities & Challenges of Integration of IoT and Cloud Computing with WBANs, The Internet of Things: Foundation for Smart City, E-health and Ubiquitous Computing, edited by Armentano, Bhadoria, Chatterjee, and Deka, CRC Press, Taylor and Francis
• These devices are expected to exceed 50 billion by 2020.
• Cloud as a centralised server will soon become an untenable computing model.
• Edge nodes: e.g., routers, mobile base stations, and switches that route network traffic
Paola G. V. Naranjo, Zahra Pooranian, Shahaboddin Shamshirband, Jemal H. Abawajy and Mauro Conti, Fog over Virtualized IoT: New Opportunity for Context-Aware Networked Applications and a Case Study, Appl. Sci. 2017, 7(12), 1325; doi:10.3390/app7121325
• Harnessing computational capabilities of resources at the edge of the network
Enzo Baccarelli; Paola G. Vinueza Naranjo; Michele Scarpiniti; Mohammad Shojafar; Jemal H. Abawajy, Fog of Everything: Energy-Efficient Networked Computing Architectures, Research Challenges, and a Case Study, IEEE Access Year: 2017, Volume: 5, Pages: 9882 - 9910
• Integrated Fog computing (FC) and Internet of Everything (IoE)
• Virtualization technology is key
  – Allows a single physical machine to run many independent virtual machines, thus increasing the utilization of physical servers
  – Enables multiple types of OSs to run in isolation from one another
  – Separates applications from the underlying infrastructure
  – Enables portability of virtual machines between physical machines
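How packing several VMs onto one physical machine raises utilization can be shown with a simple first-fit placement. First-fit is a textbook heuristic used here only as an illustration; the slides do not prescribe a placement algorithm. The VM sizes and the host capacity of 8 cores are hypothetical.

```python
def first_fit(vm_demands, host_capacity):
    """Place each VM on the first host that still has room for it."""
    hosts = []  # each entry lists the VM demands placed on that host
    for demand in vm_demands:
        if demand > host_capacity:
            raise ValueError(f"VM needing {demand} exceeds host capacity")
        for host in hosts:
            if sum(host) + demand <= host_capacity:
                host.append(demand)
                break
        else:
            # No existing host has room, so power on a new one.
            hosts.append([demand])
    return hosts


vms = [4, 2, 6, 3, 1, 5, 2]          # hypothetical CPU cores per VM
hosts = first_fit(vms, host_capacity=8)
print(hosts)                          # [[4, 2, 1], [6, 2], [3, 5]]
print(sum(vms) / (len(hosts) * 8))    # fraction of host capacity in use
```

Seven VMs fit on three hosts instead of seven, which is the utilization gain the first sub-bullet describes.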
Virtual Data Center
Baker Alrubaiey, Jemal H. Abawajy: Virtual networks dependability assessment framework. IJHPCN 10(1/2): 3-12 (2017)
Workload Types
• Divisible jobs (an application can be arbitrarily partitioned into smaller tasks)
  – Example: matrix multiplication application
• Indivisible processes (the entire process must be assigned to a single processor)
[Figure: structure of a matrix multiplication application, split into Task 1, Task 2, Task 3, ..., Task n]
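Matrix multiplication is divisible because each block of rows of the result can be computed independently. The sketch below splits the rows of A into tasks and hands them to a thread pool. It shows the decomposition only: in CPython, threads do not run this pure-Python arithmetic in parallel because of the global interpreter lock. The function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def multiply_rows(a_rows, b):
    """One task: multiply a block of rows of A by the whole of B."""
    columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns]
            for row in a_rows]


def partitioned_matmul(a, b, n_tasks=2):
    """Split A into row blocks, compute each block as a task, then merge."""
    size = max(1, -(-len(a) // n_tasks))  # ceiling division, at least 1
    blocks = [a[i:i + size] for i in range(0, len(a), size)]
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        parts = pool.map(multiply_rows, blocks, [b] * len(blocks))
    return [row for part in parts for row in part]


a = [[1, 2], [3, 4], [5, 6]]
b = [[7, 8], [9, 10]]
print(partitioned_matmul(a, b))  # [[25, 28], [57, 64], [89, 100]]
```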
Workflow Applications
• Workflow applications are represented by a directed acyclic task graph.
[Figure: structure of divide-and-conquer programs, drawn as a task graph over tasks T1 to T8]
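A directed acyclic task graph tells a scheduler which tasks may run together. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later) to group tasks into levels. The edges are an assumption: the extracted slide keeps only the node names T1 to T8, so the divide-and-conquer shape used here is hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies: each task maps to the tasks it must wait for.
dependencies = {
    "T2": {"T1"}, "T3": {"T1"},
    "T4": {"T2"}, "T5": {"T2"},
    "T6": {"T3"}, "T7": {"T3"},
    "T8": {"T4", "T5", "T6", "T7"},
}


def schedule_levels(deps):
    """Group tasks into levels; tasks in one level can run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    levels = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        levels.append(ready)
        sorter.done(*ready)
    return levels


print(schedule_levels(dependencies))
# [['T1'], ['T2', 'T3'], ['T4', 'T5', 'T6', 'T7'], ['T8']]
```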
MapReduce Computation Workflow
• A programming framework suited to processing large data sets (e.g., signal processing, image processing) in a distributed environment
• The computation of a MapReduce application is organized as a workflow of map and reduce tasks.
  – Reduce tasks execute after the map tasks have completed.
  – The output of the map tasks (key-value pairs) is the input to the reduce tasks.
  – A reduce task aggregates intermediate data tuples into a smaller set of tuples or key-value pairs.
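The map-then-reduce workflow can be sketched in a single process with word counting, the usual introductory example. This is a minimal illustration of the programming model, not a distributed implementation; all function names are hypothetical.

```python
from collections import defaultdict


def map_task(document):
    """Map: emit one (word, 1) key-value pair per word."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group intermediate values by key before the reduce tasks run."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_task(key, values):
    """Reduce: aggregate all values of one key into a single pair."""
    return key, sum(values)


def word_count(documents):
    # All map tasks finish before any reduce task starts.
    intermediate = [pair for doc in documents for pair in map_task(doc)]
    return dict(reduce_task(k, v) for k, v in shuffle(intermediate).items())


print(word_count(["cloud grid cloud", "grid cluster"]))
# {'cloud': 2, 'grid': 2, 'cluster': 1}
```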
Cloud Workload Types
Workload | Description and Examples | Key Quality-of-Service Metrics
Server Centric
Web sites | Freely available web sites for social networking, informational web sites; large number of users | Large amounts of storage, high network bandwidth
Scientific computing | Bioinformatics, atmospheric modeling, other numerical computations | Computing capacity
Enterprise software | Email servers, SAP, enterprise content management | Security, high availability, customer support
Performance testing | Simulation of large workloads to test the performance characteristics of software under development | Computing capacity
Online financial services | Online banking, insurance | Security, high availability, Internet accessibility
E-commerce | Retail shopping | Variable computing load, especially at holiday times
Core financial services | Banking and insurance systems | Security, high availability
Storage and backup services | General data storage and backup | Large amounts of reliable storage
Client Centric
Productivity applications | Users logging on interactively for email, word processing, and so on | Network bandwidth and latency, data backup, security
Development and testing | Software development of web applications with Rational Software Architect, Microsoft® Visual Studio, and so on | User self-service, flexibility, rich set of infrastructure services
Graphics intensive | Animation and visualization software applications | Network bandwidth and latency, data backup
Rich Internet applications | Web applications with a large amount of JavaScript |
Mobile Centric
Mobile services | Servers to support rich mobile applications | High availability
Workload Patterns Optimal for Cloud
[Figures: compute versus time for each pattern, with average usage marked]
• On & off workloads (e.g., batch jobs), with periods of inactivity
  – Over-provisioned capacity is wasted
  – Time to market can be cumbersome
• Unexpected/unplanned peak in demand
  – Sudden spike impacts performance
  – Can't over-provision for extreme cases
• Successful services need to grow/scale
  – Keeping up with growth is a big IT challenge
  – Complex lead time for deployment
• Services with micro-seasonality trends
  – Peaks due to periodic increased demand
  – IT complexity and wasted capacity
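Google MapReduce Infrastructure Overview
[Figure: input files (Split 0 to Split 4) feed workers in the map phase, which produce intermediate files; workers in the reduce phase produce the output files (Output File 0, Output File 1)]
(1) The user program forks the master and the workers.
(2) The master assigns map tasks and reduce tasks to workers.
(3) Map workers read their input splits.
(4) Map workers write intermediate files to local disk.
(5) Reduce workers read the intermediate files remotely.
(6) Reduce workers write the output files.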
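The path from input splits to output files can be imitated in one process to show where partitioning happens. In this sketch each map task writes one bucket per reduce task (the "local write"), and reduce task r collects bucket r from every map task (the "remote read"). The CRC32-based partition function, the word-count workload, and all names are assumptions made for illustration.

```python
import zlib
from collections import defaultdict


def partition(key, n_reduce):
    """Pick the reduce task for a key (stable across runs, unlike hash())."""
    return zlib.crc32(key.encode("utf-8")) % n_reduce


def run_job(splits, n_reduce=2):
    # Map phase: one map task per split, each writing n_reduce local buckets.
    local_files = []
    for split in splits:
        buckets = [[] for _ in range(n_reduce)]
        for word in split.split():
            buckets[partition(word, n_reduce)].append((word, 1))
        local_files.append(buckets)

    # Reduce phase: reduce task r reads bucket r from every map task.
    outputs = []
    for r in range(n_reduce):
        totals = defaultdict(int)
        for buckets in local_files:
            for key, value in buckets[r]:
                totals[key] += value
        outputs.append(dict(totals))  # one output file per reduce task
    return outputs


splits = ["cloud grid cloud", "grid cluster", "cloud"]
for r, output in enumerate(run_job(splits)):
    print(f"Output File {r}: {output}")
```

Because every occurrence of a key goes to the same partition, each key ends up in exactly one output file.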
Cloud Architecture
Market-Oriented Cloud Architecture: QoS negotiation and SLA-based Resource Allocation
[Figure: users/brokers submit service requests to the SLA resource allocator, which sits above the virtual machines (VMs) and the physical machines]
Components of the SLA resource allocator:
• Service Request Examiner and Admission Control
  – Customer-driven Service Management
  – Computational Risk Management
  – Autonomic Resource Management
• Pricing
• Accounting
• VM Monitor
• Dispatcher
• Service Request Monitor
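How admission control, pricing, dispatching, and accounting might fit together can be sketched as a toy allocator. Everything here is hypothetical: the class and field names, the flat price per VM-hour, and the two admission checks are assumptions for illustration, not the architecture's actual interfaces.

```python
from dataclasses import dataclass


@dataclass
class ServiceRequest:
    user: str
    vms: int        # number of VMs requested
    hours: int      # how long they are needed
    budget: float   # the most the user is willing to pay


class SLAResourceAllocator:
    def __init__(self, free_vms, price_per_vm_hour):
        self.free_vms = free_vms
        self.price_per_vm_hour = price_per_vm_hour
        self.accounts = {}  # user -> total amount charged

    def quote(self, request):
        """Pricing: flat rate per VM-hour (an assumed pricing model)."""
        return request.vms * request.hours * self.price_per_vm_hour

    def admit(self, request):
        """Admission control: accept only if capacity and budget allow."""
        cost = self.quote(request)
        if request.vms > self.free_vms:
            return False, "insufficient capacity"
        if cost > request.budget:
            return False, "price exceeds budget"
        self.free_vms -= request.vms  # dispatch VMs to the request
        self.accounts[request.user] = self.accounts.get(request.user, 0.0) + cost
        return True, "accepted"


allocator = SLAResourceAllocator(free_vms=10, price_per_vm_hour=0.5)
print(allocator.admit(ServiceRequest("alice", vms=4, hours=10, budget=25.0)))
print(allocator.admit(ServiceRequest("bob", vms=8, hours=1, budget=100.0)))
print(allocator.admit(ServiceRequest("carol", vms=2, hours=10, budget=5.0)))
```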
Thank you.
Questions, Comments, …?