Cloud Computing Open source cloud infrastructures Keke Chen

Post on 19-Jan-2016

Page 1: Cloud Computing Open source cloud infrastructures Keke Chen

Cloud Computing

Open source cloud infrastructures

Keke Chen

Page 2

Outline
• Project 3
• Eucalyptus
• OpenStack

Page 3

Project 3: using AWS

Tasks (work from nimbus17 or your own PC)
• Create an AWS account and set up the environment
• Try basic EC2 commands
• Start a Hadoop cluster on EC2, using the hadoopEC2 tool
• Read the code of hadoopEC2 to understand how to interact with EC2 in shell scripts

Page 4

Starting a Hadoop cluster on EC2
• Read http://wiki.apache.org/hadoop/AmazonEC2
• Setup
- Check src/contrib/ec2/bin/hadoop-ec2-env.sh; you don’t need to change anything there
- Set your own environment variables in .profile, .login, or .bashrc: AWS_ACCOUNT_ID, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY
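
Before running the EC2 scripts, it can help to confirm that the three variables are actually visible to child processes. The following is a minimal sketch; the variable names come from the slide, while the helper name missing_aws_vars is hypothetical.

```python
import os

# Variable names are taken from the slide; the helper itself is illustrative.
REQUIRED_VARS = ("AWS_ACCOUNT_ID", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_vars(environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_aws_vars()
if missing:
    print("Set these in .profile, .login, or .bashrc:", ", ".join(missing))
else:
    print("AWS environment variables are set.")
```

Passing a plain dict instead of os.environ makes the check easy to try without touching the real shell environment.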

Page 5

Starting Hadoop on EC2
• Copy $HADOOP_HOME/src/contrib/ec2 to your own directory
- % bin/hadoop-ec2 launch-cluster your-cluster-name #ofslaves
- % bin/hadoop-ec2 login your-cluster-name
• Test your cluster
- /usr/local/hadoop-*
- hadoop fsck /
• Diagnose problems (understand the Hadoop setup)
- http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
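
If you prefer to drive the tool from a script, the two command lines above can be assembled as argument lists. This is only a sketch: the subcommands launch-cluster and login are the ones shown on the slide, while the function name and the cluster name "my-test-cluster" are made up for illustration.

```python
import shlex


def hadoop_ec2_commands(cluster_name, num_slaves, tool="bin/hadoop-ec2"):
    """Build the launch and login command lines shown on the slide."""
    if num_slaves < 1:
        raise ValueError("need at least one slave")
    launch = [tool, "launch-cluster", cluster_name, str(num_slaves)]
    login = [tool, "login", cluster_name]
    return launch, login


# Hypothetical cluster name and size, for illustration only.
launch, login = hadoop_ec2_commands("my-test-cluster", 2)
print(" ".join(shlex.quote(part) for part in launch))
print(" ".join(shlex.quote(part) for part in login))
```

Each list can be handed to subprocess.run from the directory that holds your copy of the ec2 scripts.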

Page 6

Read the source of the EC2 tool
• Check the script hadoop-ec2 and learn how to
- Automatically launch instances
- Pass initialization scripts to instances
- Change the Hadoop configuration
• Answer some questions

Page 7

Make your own AMI
• Install a recent Hadoop version (e.g., 1.0.x) in the AMI
• HadoopEC2 provides some scripts, but they need to be revised to work with the current setting

Page 8

Experiment with HDFS and S3
• Hadoop can use either HDFS or S3 as the storage for MapReduce
• You need to learn the performance difference between these two options
• How to configure Hadoop to use S3: https://wiki.apache.org/hadoop/AmazonS3
• Conduct a simple experiment to compare the performance of the two storage options
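
As a starting point for the S3 side of the experiment, the sketch below generates a core-site.xml fragment. It assumes the Hadoop 1.x s3n property names (fs.default.name, fs.s3n.awsAccessKeyId, fs.s3n.awsSecretAccessKey); check them against the wiki page above for your Hadoop version. The function name and the bucket name are placeholders.

```python
import xml.etree.ElementTree as ET


def s3_core_site(bucket, access_key, secret_key):
    """Return core-site.xml text that points Hadoop's default FS at an S3 bucket.

    Property names are assumed from Hadoop 1.x (s3n); verify for your version.
    """
    props = {
        "fs.default.name": "s3n://" + bucket,
        "fs.s3n.awsAccessKeyId": access_key,
        "fs.s3n.awsSecretAccessKey": secret_key,
    }
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder bucket and credentials, for illustration only.
print(s3_core_site("my-bucket", "MY_ACCESS_KEY", "MY_SECRET_KEY"))
```

Running the same MapReduce job once with this configuration and once with the HDFS default, and timing both, gives the comparison the task asks for.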

Page 9

Most popular open-source AWS equivalents
• Eucalyptus
- Started by UCSB researchers, now a company
• OpenStack
- Started by NASA, now an open source platform

Page 10

Eucalyptus
• Compatible with the AWS APIs (mainly EC2 and S3)
- Thus, the Boto library can be used, too
• A good example for understanding how AWS works
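
Because the API is EC2-compatible, a Boto script mostly needs a different endpoint. The sketch below assumes boto 2.x and its connect_euca helper; the port 8773 and the path /services/Eucalyptus are assumed defaults for a Eucalyptus front end and may differ on your installation. The function names are hypothetical.

```python
def euca_connection_args(host, access_key, secret_key):
    """Collect keyword arguments for boto.connect_euca (boto 2.x)."""
    return {
        "host": host,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        "port": 8773,                    # assumed default Eucalyptus port
        "path": "/services/Eucalyptus",  # assumed default service path
    }


def list_image_ids(host, access_key, secret_key):
    """Connect to a Eucalyptus front end and return its image IDs."""
    import boto  # imported lazily so the helper above works without boto installed

    conn = boto.connect_euca(**euca_connection_args(host, access_key, secret_key))
    return [image.id for image in conn.get_all_images()]
```

Apart from the connection step, the calls are the same ones you would make against EC2.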

Page 11

Paper: “The Eucalyptus Open-source Cloud-computing System”
• How VM instances are managed
• How to provide a virtual network (like elastic IP)
• How to provide data storage (like S3)
• A very brief description, but we can still learn something from it

Page 12

System Design

Data center
• CLC: cloud controller
• Walrus: storage controller, similar to S3
• CC: cluster controller
• NC: node controller

Page 13

Components: Node Controller
• Make queries to discover physical resources
- Number of cores
- Size of memory
- Available disk space
- State of VM instances
• Propagate the information to the Cluster Controller
- DescribeResource
- DescribeInstances
• Run/terminate instances
- CLC -> CC -> NC -> hypervisor (Xen)
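
To make the resource query concrete, here is a rough sketch of what a DescribeResource-style answer could contain. It inspects the local host with the Python standard library rather than asking a hypervisor, so it is only an analogy for what a real NC does; the function name and the dictionary keys are invented for illustration.

```python
import os
import shutil


def describe_resource(path="/"):
    """Report cores, physical memory, and free disk space of the local host."""
    cores = os.cpu_count() or 1
    try:
        # sysconf is available on Linux and most Unix systems only.
        memory = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        memory = None  # unknown on this platform
    disk_free = shutil.disk_usage(path).free
    return {"cores": cores, "memory_bytes": memory, "disk_free_bytes": disk_free}


print(describe_resource())
```

A Cluster Controller would collect one such record per node and forward a summary upward.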

Page 14

Node controller
• Start an instance
- Copy the instance image from Walrus or the local cache
- Create an endpoint in the virtual network overlay
- Instruct the hypervisor to boot the instance
• Stop an instance
- Instruct the hypervisor to terminate the VM
- Tear down the virtual network endpoint
- Clean up the files associated with the instance
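
The two sequences above can be sketched as a small simulation. Nothing here talks to a real hypervisor or to Walrus: the class name, the log messages, and the dictionary standing in for Walrus are all assumptions made for illustration.

```python
class NodeControllerSketch:
    """Toy model of the start/stop sequences; no real VMs are involved."""

    def __init__(self, walrus):
        self.walrus = walrus   # image_id -> image bytes, a stand-in for Walrus
        self.cache = {}        # local image cache
        self.running = {}      # instance_id -> image_id
        self.log = []          # ordered record of the steps taken

    def start_instance(self, instance_id, image_id):
        if image_id not in self.cache:
            # Cache miss: copy the image from the (simulated) Walrus.
            self.cache[image_id] = self.walrus[image_id]
            self.log.append("fetch " + image_id)
        self.log.append("create endpoint " + instance_id)
        self.log.append("boot " + instance_id)
        self.running[instance_id] = image_id

    def stop_instance(self, instance_id):
        del self.running[instance_id]
        self.log.append("terminate " + instance_id)
        self.log.append("remove endpoint " + instance_id)
        self.log.append("clean files " + instance_id)


nc = NodeControllerSketch({"emi-demo": b"image-bytes"})
nc.start_instance("i-1", "emi-demo")
nc.start_instance("i-2", "emi-demo")  # second start is served from the cache
nc.stop_instance("i-1")
print(nc.log)
```

The second start skips the fetch step, which is the benefit of keeping a local image cache on each node.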

Page 15

Cluster Controller
• Gather/report information about NCs
- Through the interface provided by NCs
- Report the summary to the CLC
• Schedule incoming instance “run” requests to specific NCs
• Control the virtual network overlay
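
The slide does not say which placement policy the CC uses, so the sketch below simply assumes first-fit on free cores to show the shape of the scheduling and summary steps. The function names and the node names are hypothetical.

```python
def schedule_first_fit(free_cores, cores_needed):
    """Place a request on the first NC with enough free cores.

    free_cores maps NC name -> free cores and is updated in place.
    Returns the chosen NC name, or None if no NC can host the request.
    First-fit is an assumed policy, chosen only for illustration.
    """
    for nc_name, free in free_cores.items():
        if free >= cores_needed:
            free_cores[nc_name] = free - cores_needed
            return nc_name
    return None


def summarize(free_cores):
    """Build the kind of summary a CC might report to the CLC."""
    return {"nodes": len(free_cores), "free_cores": sum(free_cores.values())}


nodes = {"nc1": 1, "nc2": 4}
print(schedule_first_fit(nodes, 2))
print(summarize(nodes))
```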

Page 16

Virtual network overlay
• VM instance interconnectivity (between different nodes/networks)
- Not well addressed by Xen itself
• Connectivity, isolation, and performance
- At least one VM in each set must be exposed externally
- Map the public IP to that instance
• Restricted communication
- VMs in the same set can talk to each other
- VMs from different sets should be isolated

Page 17

Virtual network overlay
• Each VM has a private IP; one VM in the set also has a public IP
• A VLAN tag defines the subnet, to isolate sets of VMs
• The Cluster Controller serves as the router between VM subnets
- The CC uses Linux iptables to control traffic
- iptables Network Address Translation (NAT) defines the mapping from public IP to private IP
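
The public-to-private mapping can be illustrated with generic iptables NAT rules. The sketch below only builds the command strings; they use standard iptables syntax (DNAT in PREROUTING, SNAT in POSTROUTING) and are not necessarily the exact rules Eucalyptus installs. The function names and the addresses are made up for illustration.

```python
def nat_rules(public_ip, private_ip):
    """Return iptables commands that map a public IP to a VM's private IP."""
    inbound = (
        "iptables -t nat -A PREROUTING -d {pub} -j DNAT --to-destination {priv}"
    ).format(pub=public_ip, priv=private_ip)
    outbound = (
        "iptables -t nat -A POSTROUTING -s {priv} -j SNAT --to-source {pub}"
    ).format(pub=public_ip, priv=private_ip)
    return [inbound, outbound]


def same_subnet(vm_a, vm_b, vlan_of):
    """VMs may talk directly only if they carry the same VLAN tag."""
    return vlan_of[vm_a] == vlan_of[vm_b]


# Example addresses and VLAN tags are placeholders.
for rule in nat_rules("203.0.113.10", "10.0.1.5"):
    print(rule)
print(same_subnet("vm1", "vm2", {"vm1": 10, "vm2": 10, "vm3": 20}))
```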

Page 18

Storage Controller (Walrus)
• Provides SOAP/REST interfaces
- Compatible with S3, so you can use S3 tools
• Use Walrus to stream data in/out of the cloud
• Stores VM images (same as AMI)
- Root file system, kernel image, ramdisk image
• No locking for object writes
- Conflicting writes: the later write overwrites the earlier one
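
The no-locking behavior amounts to last-writer-wins. A minimal in-memory sketch, with a hypothetical class name and no relation to the real Walrus code, shows the consequence:

```python
class ObjectStoreSketch:
    """In-memory stand-in for an object store without write locking."""

    def __init__(self):
        self._objects = {}

    def put(self, bucket, key, data):
        # No lock and no version check: whichever put runs last wins.
        self._objects[(bucket, key)] = data

    def get(self, bucket, key):
        return self._objects[(bucket, key)]


store = ObjectStoreSketch()
store.put("images", "kernel", b"written by client A")
store.put("images", "kernel", b"written by client B")
print(store.get("images", "kernel"))
```

Client A's data is silently replaced, so applications that need coordination have to provide it themselves.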

Page 19

• Provides the same tools Amazon uses
- Generate AMI
• Maintains a cache of images
• Authentication is applied when an NC accesses images

Page 20

Cloud Controller
• A collection of web services
- Resource services
- Data services
- Interface services

Page 21

Cloud Controller: resource services
• Receive user requests
• Interact with CCs to allocate/deallocate resources
• System Resource State (SRS) is maintained by querying CCs
- CCs collect information from NCs
• Follows a “transactional” operation
- Reservation, VM creation -> commit
- Or errors -> rollback
• Realizing SLAs
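
The reserve, create, then commit-or-rollback pattern can be sketched as follows. This is an assumption-laden toy: the Capacity class, the run_instances function, and the create_vm callback are invented names, and a fuller version would also terminate any VMs created before the failure.

```python
class Capacity:
    """Tracks free VM slots, standing in for the System Resource State."""

    def __init__(self, free_slots):
        self.free_slots = free_slots


def run_instances(capacity, count, create_vm):
    """Reserve capacity, create the VMs, and roll back the reservation on error."""
    if count > capacity.free_slots:
        raise RuntimeError("not enough capacity")
    capacity.free_slots -= count        # reservation
    created = []
    try:
        for index in range(count):
            created.append(create_vm(index))
    except Exception:
        capacity.free_slots += count    # rollback: release the reservation
        raise
    return created                      # commit: the reservation stays consumed


cap = Capacity(3)
print(run_instances(cap, 2, lambda index: "i-%d" % index), cap.free_slots)
```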

Page 22

Cloud Controller: data services
• Handles the creation, modification, interrogation, and storage of stateful system and user data
- There is a system database…
• Users can query the services
- Discover resource info (images, clusters)
- Manipulate abstract parameters (keypairs, security groups, network definitions)
- Recall some of the AWS interfaces…

Page 23

Cloud Controller: interface services
• User-visible interfaces
- Programmatic interfaces (SOAP/REST)
- Web interface
• Handling authentication
• Provide system management tools

Page 24

OpenStack

Page 25

OpenStack
• Originated at NASA, with Rackspace
• Driven by an open community process
• Multiple hypervisors: Xen, KVM, ESXi, Hyper-V
• First release: Oct 2010

Page 26

Page 27

Components
• Nova: compute (equivalent to EC2)
• Swift: object storage (S3)
• Image service (AMI)
• Networking (virtual network)
• Block storage (Elastic Block Store)
• Identity
• Dashboard (AWS web console)

-- mostly implemented in Python

Page 28

Fastest Growing Global Open Source Community (as of July 2013)
• Companies: 231
• Total contributors: 1,036
• Average monthly contributors: 238
• Code contributions: 70,137
• Individual members: 10,149
• Countries: 121

Page 29

Global Community
[Figure: countries with members]

Page 30

Developer Growth
[Figure: contributors per month (ohloh)]

Page 31

1 Million+ Lines of Code
[Figure: lines of code (ohloh)]

Page 32

Ecosystem Growth
[Figure: participating companies]