TRANSCRIPT
CLOUD COMPUTING: FEATURES, ISSUES AND CHALLENGES
Demetris Delgeris
INTRODUCTION
Cloud Computing
One definition: cloud computing is any situation in which computing is done at a remote location ("out in the clouds") rather than on your desktop or portable device.
You tap into that computing power over an Internet connection.
“The cloud is a smart, complex, powerful computing system in the sky that people can just plug into."
--Web browser pioneer Marc Andreessen
Refers to both:
the applications delivered as services over the Internet and
the hardware and systems software in the datacenters that provide those services.
The services themselves have long been referred to as Software as a Service (SaaS).
The datacenter hardware and software is what we will call a Cloud.
Public Cloud: offered as pay-as-you-go to the general public
Private Cloud: internal datacenters of a business
CLOUD TYPES
Two different but related types of clouds are:
those that provide computing instances on demand
and those that provide computing capacity
The first is designed to scale out by providing additional computing instances
The second is designed to support data- or compute-intensive applications via scaling capacity.
FEATURES
WHAT’S NEW
Scaling: company infrastructures scale across several (or more) datacenters.
Pricing: a pay-as-you-go model in which you pay only for the services you need. No capital expenditure is required.
Simplicity: writing code for high-performance and distributed computing used to be relatively complicated (explicitly passing messages between nodes and other specialized methods).
Cloud-based storage service APIs and MapReduce-style (parallel programming) APIs are relatively simple compared to previous methods.
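The MapReduce style referred to above can be sketched in a few lines. This is a local, single-process illustration only; the function names are ours, not the API of any real framework, which would distribute the map and reduce phases across cluster nodes:

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit (word, 1) pairs from each document."""
    for doc in documents:
        for word in doc.split():
            yield word.lower(), 1

def reduce_phase(pairs):
    """Reduce: sum the counts emitted for each word."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

docs = ["the cloud", "the cloud is elastic"]
result = reduce_phase(map_phase(docs))
```

The programmer writes only the two small functions; the framework handles distribution, which is the simplicity the slide points at.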
HARDWARE POINT OF VIEW
3 new aspects:
The illusion of infinite computing resources available on demand
The elimination of an up-front commitment by Cloud users
The ability to pay for use of computing resources on a short-term basis as needed
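The short-term pay-per-use point can be made concrete with back-of-the-envelope arithmetic. All prices below are hypothetical, chosen only to illustrate the comparison:

```python
# Illustrative arithmetic only: the figures are hypothetical, not quotes
# from any provider. It contrasts an up-front purchase with pay-as-you-go
# for a short-lived workload.

server_purchase = 8000.0      # hypothetical up-front cost of one server (USD)
hourly_rate = 0.50            # hypothetical per-hour cloud instance rate (USD)
hours_needed = 3 * 24 * 30    # a three-month project, running around the clock

cloud_cost = hourly_rate * hours_needed
# For short-term needs the pay-per-use total stays below the capital outlay.
short_term_wins = cloud_cost < server_purchase
```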
CLOUD COMPUTING SERVICE MODELS
Software As A Service: Gmail
Defined as service-on-demand, where a provider will license software tailored to user needs
Customers can utilize greater computing power while saving on cost, space, and power consumption
End users don’t have control over cloud infrastructure
CLOUD COMPUTING SERVICE MODELS
Platform As A Service: Google App Engine
Provides all the facilities necessary to support the complete process of building and delivering web applications and services, all available over the internet
Has to possess development infrastructure including programming environment, tools and configuration management
CLOUD COMPUTING SERVICE MODELS
Infrastructure As A Service: Amazon EC2
Defined as the delivery of computer infrastructure as a service
A fully outsourced service, so businesses do not have to purchase servers, software or equipment
Infrastructure providers can dynamically allocate resources for service providers
LAYERS OF CLOUD COMPUTING
A cloud client consists of computer hardware and computer software that relies on cloud computing for application delivery, or that is specifically designed for delivery of cloud services
Cloud application services deliver software as a service over the Internet, eliminating the need to install and run the application on the customer’s own PC and simplifying maintenance and support.
Cloud platform services deliver a computing platform as a service, often consuming cloud infrastructure and sustaining cloud applications.
LAYERS OF CLOUD COMPUTING
Cloud infrastructure services deliver computer infrastructure as a service
The servers layer consists of computer hardware and software products that are specifically designed for the delivery of cloud services.
CLOUD APPLICATION CHARACTERISTICS
These represent ideals that people want for applications in the cloud
Incremental Scalability
Agility: the cloud provides flexible, automated management to distribute computing resources among the cloud’s users
Availability: Cloud environments take advantage of the large numbers of servers by enabling high levels of availability
SLA-driven: Clouds are managed dynamically based on service-level agreements that define policies like delivery parameters, costs, and other factors
APIs: Because clouds virtualize resources as a service they must have an application programming interface
ISSUES AND CHALLENGES
CLOUD STORAGE ISSUE
Cloud storage is a model of networked computer data storage where data is stored on multiple virtual servers, generally hosted by third parties, rather than on dedicated servers.
Hosting companies operate large data centers; people who need their data hosted buy or lease storage capacity from them and use it for their storage needs.
Requires an ability to keep data synchronized even though it is stored in two or more distinct geographies.
CLOUD STORAGE ISSUE
That requires attacking three key issues:
The efficient transfer of large data blocks or files over long distances,
Caching technologies that can help to overcome some of the distance delays, and
Synchronization and coordination among the storage sites
The second and third items on the list are being addressed by a number of storage technology vendors
The issue of efficient high-speed transfer remains elusive.
PERFORMANCE ISSUE: HIGH-SPEED TRANSFERS
The delays in accessing storage at a distance can be orders of magnitude greater than for local storage.
Much of that delay arises in the low-level details of transmitting large blocks of data over long-distance links using traditional networking technologies.
Goal: to raise the performance of remote storage to a level where the delta between it and local storage can be concealed with caching.
Moving data between remote sites depends on a reliable transport such as TCP.
TCP uses a sliding-window protocol with a congestion window to react to failures and congestion within the network.
Such failures and congestion result in dropped packets and can degrade network performance.
The congestion window determines the number of bytes that the sender can transmit before it must stop and wait for an acknowledgement from the receiver.
Performance metric is the bandwidth-delay product of the interconnect.
It is a measure of the amount of data that can be stored ‘in transit’ on the wire.
If TCP’s max congestion window is much smaller than the bandwidth-delay product for the wire connection, the transport will not be able to keep the wire continuously full.
For LANs, the bandwidth-delay product of a reasonable wire length is on the same order of magnitude as a typical TCP congestion window size.
As the bandwidth-delay product of the wire increases, either due to increasing length or increasing bandwidth, so too must the congestion window size.
A larger window size means more bytes are in flight at any given point, so:
There is greater risk of loss due to congestion drops or errors.
As the congestion window size increases, so too does the overhead of recovering from a lost byte.
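The bandwidth-delay argument is easy to check numerically. The link speeds and round-trip times below are illustrative assumptions, not measurements:

```python
# Back-of-the-envelope bandwidth-delay product (BDP) calculation.

def bdp_bytes(bandwidth_bps, rtt_seconds):
    """Bytes that can be 'in transit' on the wire: bandwidth x delay."""
    return bandwidth_bps / 8 * rtt_seconds

lan_bdp = bdp_bytes(1e9, 0.0005)   # 1 Gb/s LAN, 0.5 ms round trip
wan_bdp = bdp_bytes(1e9, 0.100)    # same bandwidth across a 100 ms WAN

classic_window = 64 * 1024          # classic 64 KiB TCP window limit

# On the LAN the window covers the BDP; on the WAN it falls far short,
# so TCP cannot keep the long wire continuously full.
lan_ok = classic_window >= lan_bdp
wan_ok = classic_window >= wan_bdp
```

The same link speed that a 64 KiB window saturates on a LAN leaves a long-distance wire mostly idle, which is exactly why the window must grow with the bandwidth-delay product.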
RDMA OVER WANS
RDMA is a message-passing service:
Applications exchange messages with each other directly using shared memory buffers.
Efficiently implementing the RDMA access method over a network depends on a network transport mechanism suitable for transporting memory buffers between servers.
InfiniBand Architecture instead of TCP:
Extremely low end-to-end latencies
The ability to reduce the memory bandwidth burden on end nodes
A packet congestion window instead of a byte congestion window
DATA MOBILITY ISSUE
Cloud data may reside at a location geographically far away from the organization that owns the data.
Cloud providers may decide to keep moving data from one location to another.
There are several reasons for this, including:
Reducing the cost of storing data
Efficient retrieval of data
Efficient linking of data resident at different locations
Resource optimization
High levels of data mobility have negative implications for:
Data security and data protection
Data availability
DATA LOCATION ISSUE
Applications have little or no information regarding the location of their data in the network.
Without this information, applications cannot optimize their execution by moving computation closer to data, data closer to users, or related data closer to each other.
The current state-of-the-art solution involves guesswork: the cloud determines data placement by predicting the application's future access patterns from its past history, treating the application as a black box.
This is counter-productive, since the application typically has more accurate information than the cloud about its own future behaviour.
Another idea is to expose the location of data to applications and allow them to optimize their own execution. We want applications to be able to estimate the time taken to update or retrieve data from different network locations.
DATA LOCATION ISSUE
The Contour system uses replication topologies to monitor network interactions.
The basic functionality provided by Contour is data-access latency estimation: applications can estimate the time taken to read or write data from any compute node in the network.
On top of this it supports closest-node discovery and constraint satisfaction.
This is useful if the application wants to choose an existing compute node to run a particular task based on the data it accesses.
It is also useful for cloud allocation (moving data closer to given network locations, or requesting new resources near existing data).
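A sketch of how an application might use latency estimates of this kind to pick a compute node. The estimate table and function names here are hypothetical illustrations, not the actual Contour API:

```python
# Estimated read latency (ms) from each candidate compute node to the data.
# These numbers are made up for the example.
estimated_latency_ms = {
    "node-a": 42.0,
    "node-b": 3.5,
    "node-c": 17.0,
}

def closest_node(estimates):
    """Closest-node discovery: pick the node with the lowest estimated latency."""
    return min(estimates, key=estimates.get)

def nodes_within(estimates, budget_ms):
    """Constraint satisfaction: all nodes meeting a latency budget."""
    return sorted(n for n, ms in estimates.items() if ms <= budget_ms)

best = closest_node(estimated_latency_ms)          # where to run the task
candidates = nodes_within(estimated_latency_ms, 20.0)
```

Exposing the estimates to the application, rather than having the cloud guess, is the design choice the slides argue for.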
RELIABILITY ISSUE
Cloud computing is more service-oriented than resource-oriented.
The reliability of cloud computing is critical but hard to analyze due to its characteristics: massive-scale service sharing, wide-area networking, heterogeneous software/hardware components and the complicated interactions among them.
Cloud Service Reliability: the probability that a cloud service under consideration can be
successfully completed for a user in a specified period of time.
Possible Failures: overflow, timeout, data/computing resource missing,
software/hardware/network failure
Request-stage failures: overflow, timeout
The due time for a specific service is the allowed time from the submission of the job request to the completion of the job.
If a job request is not served by a scheduler before its due time, it is dropped. The dropping rate is denoted by μd.
Arrivals of job-request submissions follow a Poisson process with arrival rate λd.
State n (n=0,1,…,N) represents the number of requests in the queue.
At state N, the arrival of a new request makes the request queue overflow, so that request is dropped and the queue stays at state N.
The service rate of a request by a schedule server is μr.
If n ≤ S, then all n requests can be served immediately by the S schedule servers, so the total departure rate is nμr.
If n > S, only S requests are being served simultaneously, so the departure rate is Sμr.
The total rate at which requests in the queue reach their due time and are dropped is nμd (n=1,2,…,N).
qn is the steady-state probability of the system being at state n (n=0,1,…,N). It is easy to derive qn by solving the following Chapman-Kolmogorov equations:
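Under the birth-death dynamics just described (arrivals at rate λd; total leave rate min(n,S)μr + nμd at state n), one consistent form of the balance equations is the following reconstruction:

```latex
% Balance (Chapman-Kolmogorov) equations for the birth-death chain:
\lambda_d\, q_{n-1} \;=\; \bigl(\min(n, S)\,\mu_r + n\,\mu_d\bigr)\, q_n,
\qquad n = 1, 2, \ldots, N,
\qquad \sum_{n=0}^{N} q_n = 1 .
```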
To study the timeout failure, suppose the current length of the request queue is n (n=0,1,…,N-1) when the new service request under consideration arrives. The probability density function of waiting time to complete the n requests by S schedule servers is
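One consistent form, under the simplifying assumption that all S servers remain busy while the n queued requests complete (so completions occur at rate Sμr), is the Erlang density:

```latex
% Waiting time W_n to complete n queued requests, completions at rate S\mu_r:
f_{W_n}(t) \;=\; \frac{(S\mu_r)^{n}\, t^{\,n-1}}{(n-1)!}\; e^{-S\mu_r t},
\qquad t \ge 0 .
```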
If the waiting time is longer than the due time Td, a timeout failure occurs.
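This model is simple to evaluate numerically. The rate values below are illustrative assumptions; the steady-state recursion follows the birth-death description above, and the timeout probability uses the Erlang waiting time (all S servers assumed busy):

```python
import math

lam = 4.0     # arrival rate (lambda_d), requests per unit time (assumed)
mu_r = 1.0    # service rate per schedule server (assumed)
mu_d = 0.1    # per-request due-time (dropping) rate (assumed)
S, N = 3, 10  # schedule servers, queue capacity

# Solve the balance equations lam*q[n-1] = (min(n,S)*mu_r + n*mu_d)*q[n].
q = [1.0]
for n in range(1, N + 1):
    q.append(q[-1] * lam / (min(n, S) * mu_r + n * mu_d))
total = sum(q)
q = [x / total for x in q]        # normalize so the q[n] sum to 1

overflow_prob = q[N]              # a new arrival at state N is dropped

def timeout_prob(n, due_time):
    """P(waiting time for n queued requests exceeds due_time):
    survival function of the Erlang(n, S*mu_r) distribution."""
    rate = S * mu_r
    x = rate * due_time
    return sum(math.exp(-x) * x**k / math.factorial(k) for k in range(n))

p_late = timeout_prob(5, due_time=3.0)
```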
NETWORKING ISSUE
The network's mission in cloud computing: connecting the servers into a resource pool and then connecting users to the correct resources
Public Networking Issues
Not all cloud computing providers will support encrypted tunnels, so your information may be sent in the open on the Internet.
Where encryption is available, using it will certainly increase delay and may impact performance.
The only way to reduce delay without compromising security is by minimizing transit “hops”
reaching a given cloud computing service may involve transiting several provider networks
The best ISP combination in terms of delay will almost always be one with the smallest number of hops.
NETWORKING ISSUE
Private Networking Issues
Enterprises will access their own private clouds using the same technology they employed for access to their data centers. (Internet VPN)
All cloud computing implementations will rely on intra-cloud networking to link users with resources
The performance of those connections will then impact cloud computing performance overall.
Security Principles: CIA (Confidentiality, Integrity, Availability)
Provider Security
Threats
The provider controls servers, network, etc.
The customer must trust the provider's security
Failures may violate the CIA principles
Countermeasures
Verify and monitor the provider's security
SECURITY ISSUE
Attacks from other customers
Threats
Provider resources are shared with untrusted parties
Customer data and applications must be separated
Failures will violate the CIA principles
Countermeasures
VPNs, VLANs and firewalls for network separation
Strong cryptography
SUMMARY AND CONCLUSIONS
SUMMARY
Pros:
Reduced hardware equipment/maintenance cost for end users
Improved performance
Accessibility
Flexibility
Cons:
Performance limited by Internet connection speed
Availability
Security
3 major services:
Infrastructure as a Service
Platform as a Service
Software as a Service
CONCLUSIONS
With cloud computing, the "unit of computing" has moved from a single computer or rack of computers to a data center of computers, resulting in increased complexity.
It has also introduced software, systems, and programming models that significantly reduce the complexity of accessing and using these resources.
Cloud computing provides supercomputing-class power.
The applications and data served by the cloud are available to a broad group of users.
REFERENCES
M. Armbrust, A. Fox, R. Griffith, A. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, and I. Stoica, "Above the Clouds: A Berkeley View of Cloud Computing", EECS Department, University of California, Berkeley, Tech. Rep. UCB/EECS-2009-28, 2009.
Mladen A. Vouk “Cloud Computing – Issues, Research and Implementations “, Department of Computer Science, North Carolina State University, Journal of Computing and Information Technology - CIT 16, 2008, 4, 235–246
Tharam Dillon, Chen Wu and Elizabeth Chang, “Cloud Computing: Issues and Challenges”, Curtin University of Technology Perth, Australia, 2010 24th IEEE International Conference on Advanced Information Networking and Applications
Robert L. Grossman, "The Case for Cloud Computing", University of Illinois at Chicago and Open Data Group, IT Professional magazine.
Paul Grun, Storage at a Distance; Using RoCE as a WAN Transport, System Fabric Works, Inc.
Paul T. Jaeger, Jimmy Lin, Justin M. Grimes “Cloud Computing and Information Policy: Computing in a Policy Cloud?”, Journal of Information Technology & Politics
Yuan-Shun Dai, Bo Yang, Jack Dongarra, Gewei Zhang “Cloud Service Reliability: Modeling and Analysis”, Innovative Computing Laboratory, Department of Electrical Engineering & Computer Science, University of Tennessee, Knoxville, TN, USA
Birjodh Tiwana, Mahesh Balakrishnan, Marcos K. Aguilera, Hitesh Ballani, Z. Morley Mao, "Location, Location, Location! Modeling Data Proximity in the Cloud".
QUESTIONS