cloud computing lec7 v1...– bulk synchronous parallel model [valiant, 1990][jakovits et al, hpcs...

34
Basics of Cloud Computing – Lecture 7 More AWS, Serverless Computing and Cloud Research Satish Srirama

Upload: others

Post on 22-May-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

Basics of Cloud Computing – Lecture 7

More AWS, Serverless Computing

and Cloud Research

Satish Srirama

Page 2: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

Outline

• More Amazon Web Services

• More on serverless computing

• Cloud based Research @ Mobile & Cloud Lab

3/27/2018 Satish Srirama 2/34

Page 3: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

Cloud Providers and Services – we

already discussed• Amazon Web Services

– Amazon EC2

– Amazon S3

– Amazon EBS

– Amazon Elastic Load Balancing

– Amazon Auto Scale

– Amazon CloudWatch

• Eucalyptus

• OpenStack

• SciCloud

• Management providers– AWS Management Console

– OpenStack Horizon

– RightScale

• PaaS– Google AppEngine

– Windows Azure

3/27/2018 Satish Srirama 3/34

Page 4: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

MORE AWS

3/27/2018 Satish Srirama 4

Page 5: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

AWS we discuss

• AWS Management Console

• AWS Identity and Access Management

• AWS CloudFormation

• Amazon Elastic MapReduce

3/27/2018 Satish Srirama 5/34

Page 6: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

AWS Management Console

• Hope some of you have started using Amazon accounts

• You can manage your complete Amazon account with management console– AMI Management

– Instance Management

– Security Group Management

– Elastic IP Management

– Elastic Block Store

– Key Pair management etc.

• Have different panes for different services

3/27/2018 Satish Srirama 6/34

Page 7: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

AWS Management Console - screenshot

https://console.aws.amazon.com/

Page 8: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

AWS Identity and Access Management

(IAM)

• How can an enterprise or group of people use a single credit card?

• Manage IAM users

– Create new users and manage them

– Create groups

• Manage permissions

– Creating policies

• Manage credentials

– Create and assign temporary security credentials

3/27/2018 Satish Srirama 8/34

Page 9: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

IAM policy

• Example policy giving access to complete EC2

http://aws.amazon.com/iam/

3/27/2018 Satish Srirama 9/34

Page 10: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

AWS CloudFormation

• Provides an easy way to create and manage a collection of related AWS resources, provisioning and updating them in an orderly and predictable fashion

• It is based on templates model– Templates describe the AWS resources, the associated

dependencies, and runtime parameters to run an app.

– The templates describe stacks, which are set of software and hardware resources.

– Something similar to CloudML and RightScale server templates

• Hides several details– How the AWS services need to be provisioned

– Subtleties of how to make those dependencies work.

3/27/2018 Satish Srirama 10/34

Page 11: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

AWS CloudFormation

• Amazon provides several pre-built templates to start common apps as:

– WordPress (blog)

– LAMP stack

– Gollum (wiki used by GitHub)

– …

• There is no additional charge for AWS CloudFormation. You pay for AWS resources (e.g. EC2 instances, Elastic Load Balancers, etc.) http://aws.amazon.com/cloudformation/

3/27/2018 Satish Srirama 11/34

Page 12: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

Amazon Elastic MapReduce

• Last lecture already discussed it as “Data Processing as a Service”

• Web interface and command-line tools for running Hadoop jobs on EC2

• Data stored in Amazon S3

• Monitors job and shuts machines after use

• Running a job– Upload job jar & input data to S3

– Create the cluster

– Create a Job Flow as steps

– Wait for the completion and examine the results

3/27/2018

http://aws.amazon.com/elasticmapreduce/

Satish Srirama 12/34

Page 13: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

Other interesting AWS

• Amazon Relational Database Service

– Provides access to the capabilities of familiar database engines

– MySQL, Oracle or Microsoft SQL Server

• NoSQL databases

– Simple DB

– DynamoDB

• AWS Snowball & Snowmobile

– Exabyte-scale data transfer services used to move extremely large amounts of data to AWS

3/27/2018 Satish Srirama 13/34

Page 14: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

A BIT MORE ON SERVERLESS

COMPUTING

3/27/2018 Satish Srirama 14

Page 15: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

3/27/2018 Satish Srirama 15

Page 16: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

Serverless computing - continued

• Newer workloads are a better fit for event driven programming– Execute application logic in response to database triggers

– Execute app logic in response to sensor data

– Execute app logic in response to scheduled tasks etc.

• Applications are charged by compute time (millisecond) rather than by reserved resources

• Greater linkage between cloud resources used and business operations executed

• Serverless in a nutshell– Event-action platforms to execute code in response to

events

3/27/2018 Satish Srirama 16/34

Page 17: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

Current platforms for serverless

3/27/2018 Satish Srirama 17/34

Page 18: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

Apache OpenWhisk

• Initiated by IBM but now an Apache project

• Open source cloud platform

• Serverless deployment and operations model

• Optimal utilization and granular pricing

• Scales on a per-request basis

• Supports JS, Swift, Python, Java, Docker

3/27/2018 Satish Srirama 18/34

Page 19: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

Triggers, actions, rules (and packages)

• Services or data sources define the events

they emit as triggers

• Developers associate the actions to handle

the events via rules

• Packages are a shared

collection of Triggers and

Actions

3/27/2018 Satish Srirama 19/34

Page 20: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

Triggers and Rules

• Trigger examples

– changes to database records

– IoT sensor readings that exceed a certain

temperature

– new code commits to a GitHub repository

– simple HTTP requests from web or mobile apps

• Rule is an association of a trigger to an action

– Many to many mapping

3/27/2018 Satish Srirama 20/34

Page 21: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

Actions

• They can be

– small snippets of JavaScript or Swift code

– custom binary code embedded in a Docker container

• Instantly deployed and executed whenever a trigger fires

• It is also possible to directly invoke an action by using the OpenWhisk API, CLI, or iOS SDK

• A set of actions can also be chained

3/27/2018 Satish Srirama

CLI – Command line interface

21/34

Page 22: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

Actions - continued

• Create an action : wsk action create firstactionHello-Python.py

• Invoke an action : wsk action invoke firstaction --result

• Update an action : wsk action update firstactionHello-Python.py

3/27/2018 Satish Srirama 22/34

Page 23: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

System overview

3/27/2018 Satish Srirama

https://github.com/apache/incubator-openwhisk/blob/master/docs/about.md

23/34

Page 24: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

The internal flow of processing

For more information read

https://github.com/apache/incubator-openwhisk/blob/master/docs/about.md

Page 25: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

CLOUD BASED RESEARCH @

MOBILE & CLOUD LAB

3/27/2018 Satish Srirama 25

Page 26: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

Scientific Computing on the Cloud

• Public clouds provide very convenient access to computing resources– On-demand and in real-time

– As long as you can afford them

• High performance computing (HPC) on cloud– Virtualization and communication latencies are major

hindrances [Srirama et al, SPJ 2011; Batrashev et al, HPCS 2011]

• Things have improved significantly over the years

• Containerization is a good alternative

– Research at scale• Cost-to-value of experiments

3/27/2018 Satish Srirama 26/34

Page 27: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

Communication pattern of Cluster vs

Cloud

• Cloud has huge troubles with communication/transmission latencies– Virtualization technology is the culprit

• Performance Comparison of virtual machines and Linux containers (e.g. Docker)

3/27/2018 Satish Srirama

[Srirama et al, SPJ 2011]

27/34

Page 28: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

Migrating Scientific Workflows to the

Cloud

• Workflow can be represented as weighted directed acyclic graph (DAG)

• Partitioning the workflow into groups with graph partitioning techniques [Srirama and Viil, HPCC 2014]

– Such that the sum of the weights of the edges connecting to vertices in different groups is minimized

– Utilized Metis’ multilevel k-way partitioning

• Scheduling the workflows with tools like Pegasus– Considered peer-to-peer file manager (Mule) for Pegasus

• A framework is developed to help the scientist with the procedure [Viil and Srirama, JSC 2018]

Page 29: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

Adapting Computing Problems to

Cloud• Reducing the algorithms to cloud computing frameworks

such as MapReduce [Srirama et al, FGCS 2012]

• Designed a classification on how the algorithms can be adapted to MR– Algorithm � single MapReduce job

• Monte Carlo, RSA breaking

– Algorithm � n MapReduce jobs• CLARA (Clustering), Matrix Multiplication

– Each iteration in algorithm � single MapReduce job• PAM (Clustering)

– Each iteration in algorithm � n MapReduce jobs • Conjugate Gradient

• Applicable especially for Hadoop MapReduce

3/27/2018 Satish Srirama 29/34

Page 30: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

Issues with Hadoop MapReduce

• It is designed and suitable for:

– Data processing tasks

– Embarrassingly parallel tasks

• Has serious issues with iterative algorithms

– Long „start up“ and „clean up“ times ~17 seconds

– No way to keep important data in memory between MapReduce job executions

– At each iteration, all data is read again from HDFS and written back there at the end

– Results in a significant overhead in every iteration

3/27/2018 Satish Srirama 30/34

Page 31: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

Alternative Approaches

• Restructuring algorithms into non-iterative versions

– CLARA instead of PAM [Jakovits & Srirama, Nordicloud 2013]

• Alternative MapReduce implementations that are designed to handle iterative algorithms [Jakovits and Srirama,

HPCS 2014]

– E.g. Twister, HaLoop, Spark [Distributed Data Processing on the Cloud - LTAT.06.005

(Fall 2018)]

• Alternative distributed computing models

– Bulk Synchronous Parallel model [Valiant, 1990] [Jakovits et al, HPCS

2013]

– Building a fault-tolerant BSP framework (NEWT) [Kromonov et

al, HPCS 2014]

3/27/2018 Satish Srirama 31/34

Page 32: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

This week in lab

• Continue working with Google AppEngine

3/27/2018 Satish Srirama 32/34

Page 33: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

Next Lecture

• Continue with Research on cloud

3/27/2018 Satish Srirama 33/34

Page 34: Cloud Computing Lec7 V1...– Bulk Synchronous Parallel model [Valiant, 1990][Jakovits et al, HPCS 2013] – Building a fault-tolerant BSP framework (NEWT) [Kromonov et al, HPCS 2014]

References

• Check Amazon videos and webinars at http://aws.amazon.com/resources/webinars/

• Mike Roberts, “Serverless Architectures”, https://martinfowler.com/articles/serverless.html

• Abel Avram, “FaaS, PaaS, and the Benefits of the Serverless Architecture”, https://www.infoq.com/news/2016/06/faas-serverless-architecture

• Apache OpenWhisk - https://github.com/apache/incubator-openwhisk/blob/master/docs/about.md

• List of Publications - Satish Narayana Srirama - http://math.ut.ee/~srirama/publications.html

• [Srirama et al, FGCS 2012] S. N. Srirama, P. Jakovits, E. Vainikko: Adapting Scientific Computing Problems to Clouds using MapReduce, Future Generation Computer Systems Journal, 28(1):184-192, 2012. Elsevier press. DOI 10.1016/j.future.2011.05.025.

• [Jakovits and Srirama, HPCS 2014] P. Jakovits, S. N. Srirama: Evaluating MapReduce Frameworks for Iterative Scientific Computing Applications, The 2014 (12th) International Conference on High Performance Computing & Simulation (HPCS 2014), July 21-25, 2014, pp. 226-233. IEEE.

• [Kromonov et al, HPCS 2014] I. Kromonov, P. Jakovits, S. N. Srirama: NEWT - A resilient BSP framework for iterative algorithms on Hadoop YARN, The 2014 (12th) International Conference on High Performance Computing & Simulation (HPCS 2014), July 21-25, 2014, pp. 251-259. IEEE.

• [Srirama and Viil, HPCC 2014] S. N. Srirama, J. Viil: Migrating Scientific Workflows to the Cloud: Through Graph-partitioning, Scheduling and Peer-to-Peer Data Sharing, 16th Int. Conf. on High Performance and Communications (HPCC 2014) workshops, August 20-22, 2014, pp. 1137-1144. IEEE.

• [Viil and Srirama, JSC 2018] J. Viil, S. N. Srirama: Framework for Automated Partitioning and Execution of Scientific Workflows in the Cloud, The Journal of Supercomputing, ISSN: 0920-8542, 2018. Springer. DOI: 10.1007/s11227-018-2296-7 (In Print)

• [Jakovits and Srirama, Nordicloud 2013] P. Jakovits, S. N. Srirama: Clustering on the Cloud: Reducing CLARA to MapReduce, 2nd Nordic Symposium on Cloud Computing & Internet Technologies (NordiCloud 2013), September 02-03, 2013, pp. 64-71. ACM.

• [Jakovits et al, HPCS 2013] P. Jakovits, S. N. Srirama, I. Kromonov: Viability of the Bulk Synchronous Parallel Model for Science on Cloud, The 2013 (11th) International Conference on High Performance Computing & Simulation (HPCS 2013), July 01-05, 2013, pp. 41-48. IEEE.

• [Srirama et al, SPJ 2011] S. N. Srirama, O. Batrashev, P. Jakovits, E. Vainikko: Scalability of Parallel Scientific Applications on the Cloud, Scientific Programming Journal, Special Issue on Science-driven Cloud Computing, 19(2-3):91-105, 2011. IOS Press. DOI 10.3233/SPR-2011-0320.

• [Batrashev et al, HPCS 2011] O. Batrashev, S. N. Srirama, E. Vainikko: Benchmarking DOUG on the Cloud, The 2011 International Conference on High Performance Computing & Simulation (HPCS 2011), 2011, pp. 677-685. IEEE.

• [Valiant, 1990] L. G. Valiant: A bridging model for parallel computation, Commun. ACM, vol. 33, no. 8, pp. 103–111, Aug. 1990.