© 2012 Eucalyptus Systems, Inc. Eucalyptus Internals Release 3.2 Rich Wolski CTO Eucalyptus Systems


Eucalyptus Internals, Release 3.2

Rich Wolski, CTO, Eucalyptus Systems


Outline

• Functionality

• Architecture and Implementation

• Instances

• Volumes

• Walrus (S3)

• EUARE (IAM)

• Reporting

• User Console

• HA


Functionality

• AWS APIs
– EC2 (VMs, network, volumes, IAM)
– S3 (buckets, objects, images)

• Linux deployment neutral
– Distro independent
– Packaged for Linux installation

• Limited network topological requirements
– Does not require full network connectivity at L2 or L3

• Limited storage requirements
– Can use only direct attach storage and various SAN/JBOD configurations

• Platform high availability

• Open source implementation
– All dependencies must be co-licensable
– Strive for license compatibility


Current Status

• AWS API

– Baseline API
  • EC2 Instances (Linux and Windows), Volumes, VM network, IAM, S3

– Not present but in the works
  • SDKs
  • ELB, Auto Scaling, CloudWatch
  • User-controlled networking
  • CloudFormation (templating)

– Not present, not currently scheduled, but eventually
  • SQS
  • SNS
  • VPC (tiered networking)

– Not present and waiting for adoption
  • RDS
  • SimpleDB
  • Elastic Beanstalk
  • Elastic MapReduce
  • VPC (VPN)


Linux Distro Compatibility

• CentOS/RHEL, Fedora, Ubuntu, Debian
– Dropping all but CentOS/RHEL 6 at 3.2 (Dec.)
– Community support and existing customer support for the others

• Repo-based package install
– yum/apt

• Build-from-source

• Non-Java configuration options are config-file specified
– “Java-style” runtime configuration for Java components


Two “Styles” of Deployment

• Linux-only
– Fully open source (Eucalyptus and dependencies)
– All dependencies carried either by Eucalyptus or as part of a supported distro
– Eucalyptus is fully self-contained (it controls all configurations)

• Linux + Pre-existing installation
– VMWare => attach to a pre-existing VMWare installation
– SAN => use pre-installed and separately configured SAN devices


Architecture

[Figure: architecture diagram. Labels: Client-side API Translator; Cloud Controller; Walrus; CC (Cluster Controller); SC (Storage Controller); NC (Node Controller); VMWare Broker; vSphere; ESX and ESXi hosts; RHEV (planned); Java Object Persistence]


Eucalyptus Components

• Cloud Controller (CLC)
– User request processing (except for Walrus), credentials management, VM (instance) state management

• Walrus (S3)
– S3 user request processing, append-only, put/get object storage

• Cluster Controller (CC)
– VM inventory, network provisioning/security group implementation

• Storage Controller (SC)
– Block level, network attached storage (SAN and Linux)

• Node Controller (NC)
– Hypervisor interface and control, VM launch/decommissioning

• VMWare Broker
– Gateway between CC and ESX and/or vSphere for VMWare

• User Console Proxy
– Service side of browser-based GUI for users


Web Services

• Java Components
– CLC, Walrus, SC, VMWare Broker
– Enterprise Service Bus
– Hibernate + TreeCache for object persistence

• C Components
– CC, NC
– Apache + Axis2C + Rampart
– Linux files for persistence (when needed)


Java Component Stack

[Figure: Java component stack. Layers shown:]

• Local Bootstrap
• Component Discovery
• Service Naming & Registration
• Protocol Multiplexing (Netty/JiBX)
• Staged Execution (Mule)
• Distributed Service Configuration
• Persistence (Hibernate) & Caching (TreeCache)
• Message Routing & Transport (Netty)
• Service State Management
• Group Membership for HA (JGroups)
• Service (CLC, Walrus, SC, VMb)

[Group labels in the figure: Message Stack, Service Accounting, Object Persistence, Service Bootstrap]


Object Persistence Stack

[Figure: object persistence stack. Labels shown:]

• Component Service
• Message routing & web services
• Distributed bootstrap
• L2 Distributed Object Caching (TreeCache)
• L1 Object & Query Caching (Hibernate)
• Object Relational Mapping (Hibernate)
• Persistence APIs (JPA)
• DB Connection Pooling (Proxool)
• Distributed DB Connection (HA-JDBC)
• JDBC Connection (Postgres JDBC)
• Relational Database (Postgres)


C Component Stack

[Figure: C component stack. Layers shown:]

• Local Bootstrap (Apache)
• SOAP message handling (Axis2/c)
• Service Naming, Registration & Lifecycle (Axis2/c + Rampart/C)
• Staged Execution (Axis2/c)
• Service (CC, NC)


Web Service Stack

[Figure: web service stack. Layers shown:]

• Service Context Dispatch
• Service State Checks
• Binding
• WS-Security
• SOAP Envelope Processing
• SOAP Unmarshalling/Marshalling
• Protocol Selection
• HTTP Chunking
• HTTP Decoding/Encoding
• SSL Decoding/Encoding
• Request Routing
• Reply Handling
• Component Service

[Bracket label in the figure: Protocol Multiplicity]

Java: Netty, JiBX
C: Apache, Axis2c, Rampart


Components Service Architecture

[Figure: component service architecture, captioned “Eucalyptus Message Processing”. Labels shown: User Request; Authorization; Validation; Reservation; Fulfilled; In Progress; Resource Allocation; Resource State Update; Request Processing; State Management; Resource Control; Resource]


Request Walkthrough

• All requests are parsed and syntax checked in the CLC

• All requests have authorization checked in the CLC

• All requests are validated against IAM policies in CLC

• Requests arrive from the network via euca2ools/ec2 tools or from the User Console Proxy
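
A minimal sketch of the three front-end checks listed above, written in Python for brevity. Every name here (RequestRejected, handle_request, and the dictionary shapes used for requests, credentials, and policies) is hypothetical and chosen for illustration; the real CLC is a Java service whose interfaces are not shown in these slides.

```python
class RequestRejected(Exception):
    """Raised when a request fails one of the front-end checks."""


def parse(raw):
    # Check 1: syntax. A request must name an action and an access key.
    if not isinstance(raw, dict) or "Action" not in raw or "AccessKey" not in raw:
        raise RequestRejected("malformed request")
    return raw


def authorize(request, known_keys):
    # Check 2: credentials. known_keys maps access key -> user name (assumed shape).
    user = known_keys.get(request["AccessKey"])
    if user is None:
        raise RequestRejected("unknown credentials")
    return user


def validate_policy(user, request, allowed_actions):
    # Check 3: policy. allowed_actions maps user -> set of permitted actions
    # (a stand-in for real IAM policy evaluation).
    if request["Action"] not in allowed_actions.get(user, set()):
        raise RequestRejected("policy denies " + request["Action"])


def handle_request(raw, known_keys, allowed_actions):
    # Run the checks in the order the slide lists them.
    request = parse(raw)
    user = authorize(request, known_keys)
    validate_policy(user, request, allowed_actions)
    return user, request["Action"]
```

A request that fails any check is rejected before it reaches resource allocation, which is the point of doing all three in one place.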


Instances

• Create and destroy virtual machines
– User credentialed

• Image, kernel, and ramdisk

• Virtual machines are attached to credentialed and isolated virtual networks
– Security groups

• Virtualization target
– Open source hypervisors via libvirt
– ESXi through HTTP interface
– vSphere via web interface

• Object-backed
– Reset on run

• Volume-backed
– Persistent


Run Instance

• Determine if request can be satisfied
– CLC maintains capacity accounting per AZ

• Send image, kernel, and ramdisk to hypervisor
– Pull from Walrus if not cached locally

• Inject user public ssh key into instance

• Attach running instance to isolated virtual network
– Proper security group


Run Instance Message Overview

1. Run instance request processed at CLC

2. StartNetwork sent to CC (VLAN index for sec. group)

3. Run sent to CC (image, kernel, ramdisk, user key)

4. CLC polls CC continuously for instance status

5. CC queries metadata service for instance network info

6. CC instantiates DHCP information for instance

7. CC installs iptables rules for instance

8. CC sends start network to NC

9. CC sends instance start to NC

10. CC continuously polls NC for instance status

11. NC builds VM file for libvirt (Linux DM)

12. NC launches VM using libvirt
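
Steps 4 and 10 describe two levels of polling: the CC polls the NCs, and the CLC polls the CC. The sketch below is a toy Python model of that arrangement, not Eucalyptus code; the class and method names are hypothetical. It shows that a state change on an NC becomes visible at the CLC only after both polling levels have run.

```python
class NodeController:
    def __init__(self):
        self.instances = {}

    def start(self, instance_id):
        # Simplification: the instance is "running" as soon as it is started.
        self.instances[instance_id] = "running"

    def describe(self):
        return dict(self.instances)


class ClusterController:
    def __init__(self, node_controllers):
        self.node_controllers = node_controllers
        self.cache = {}

    def poll(self):
        # Step 10: CC polls every NC and caches what it learns.
        for nc in self.node_controllers:
            self.cache.update(nc.describe())

    def describe(self):
        return dict(self.cache)


class CloudController:
    def __init__(self, cluster_controller):
        self.cluster_controller = cluster_controller
        self.view = {}

    def poll(self):
        # Step 4: CLC polls the CC, never the NCs directly.
        self.view = self.cluster_controller.describe()


nc = NodeController()
cc = ClusterController([nc])
clc = CloudController(cc)

nc.start("i-0001")
clc.poll()   # CC has not polled yet, so the CLC still sees nothing
cc.poll()    # CC learns about the instance
clc.poll()   # now the CLC sees it too
```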


Run Instance Message Sequence


Virtual Networking

• Full network implementation: MANAGED mode
– Other modes supported but at the cost of some functionality
– Requires:
  • NC <--> CC L2 and L3 connectivity
  • VLAN trunking enabled at the switch ports

• Each security group gets its own subnet
– Instances ARP for private addresses directly
– Machine hosting CC acts as a gateway for security group
  • Firewall rules
  • Public IP NAT for each instance in sec. group

• CC
– Installs iptables rules for NAT and firewall
– Attaches public IP for instance to its machine’s interface
– CC bridges its internal interface to the VLAN for the sec. group
– Sets default route for sec. group in DHCP state for instance


Virtual Networking Continued

• NC instantiates a bridge on the VLAN associated with sec. group

• NC gives VM MAC address determined by CC so that DHCP will find the right network configuration

• NC launches VM attached to that bridge

• VM DHCPs for its network configuration when it starts
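
The subnet-per-security-group and MAC-to-DHCP ideas can be sketched with Python's standard ipaddress module. This is an illustration only: the function names, the base network, the subnet size, and the MAC scheme (a locally administered "02:00" prefix followed by the IP octets) are all assumptions, not the addressing Eucalyptus actually uses.

```python
import ipaddress
import itertools


def security_group_subnet(base_network, new_prefix, group_index):
    # Carve the group_index-th subnet out of the base network, one per security group.
    net = ipaddress.ip_network(base_network)
    return next(itertools.islice(net.subnets(new_prefix=new_prefix), group_index, None))


def mac_for_ip(ip):
    # Hypothetical scheme: derive a MAC from the private IP so that the
    # DHCP server can map the MAC back to the intended address.
    return "02:00:%02x:%02x:%02x:%02x" % tuple(ip.packed)


def dhcp_entries(subnet, instance_count):
    hosts = subnet.hosts()
    gateway = next(hosts)  # first host address: the CC acting as gateway
    entries = []
    for _ in range(instance_count):
        ip = next(hosts)
        entries.append({"mac": mac_for_ip(ip), "ip": str(ip), "router": str(gateway)})
    return entries


subnet = security_group_subnet("10.1.0.0/16", 27, 2)
entries = dhcp_entries(subnet, 2)
```

Because the NC launches the VM with the MAC chosen by the CC, the VM's DHCP request matches exactly one entry and receives the address and default route for its security group.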


Walrus-backed Images on Linux

• Stored internally as a root file system (i.e. a separate partition) in Walrus

– Compatible with AWS AMI format used for Xen

• Kernels and ramdisks also stored in Walrus

• Encrypted with the key of the user that does the image install

– Can be made readable publicly

• NC “stitches together” the kernel, ramdisk, image, and other partitions (ephemeral and swap) in a format suitable for the local hypervisor


The Blob Store

• For Linux using libvirt and local hypervisor
– NC fetches image from Walrus on first run
– Image components are cached locally

• The NC maintains a catalog of cached components and uses the Linux device mapper to build an instance file for libvirt at launch time
– Creates a block device for libvirt composed of separate block devices for the components (using /dev/loop)
– Components are copy-on-write for the instance
– If local key injection is enabled, writes the user key copy-on-write into the image

• Theory: the blob store can stitch together any set of block devices, including iSCSI targets

• Practice: the blob store works with local storage to maintain as few copies as possible of cached data
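
The fetch-on-first-run behavior can be sketched as a small catalog keyed by component ID. This Python sketch covers only the caching idea; it does not model the device mapper or copy-on-write layers, and ComponentCatalog, fetch_from_walrus, and the component IDs are hypothetical names.

```python
class ComponentCatalog:
    """Caches image components so each is fetched from the object store at most once."""

    def __init__(self, fetch):
        self.fetch = fetch      # callable: component ID -> component data
        self.cached = {}

    def get(self, component_id):
        if component_id not in self.cached:
            self.cached[component_id] = self.fetch(component_id)
        return self.cached[component_id]


def assemble_instance(catalog, component_ids):
    # Stand-in for "stitching": returns the ordered list of components
    # that would back the instance's block device.
    return [catalog.get(cid) for cid in component_ids]


fetch_log = []


def fetch_from_walrus(component_id):
    # Placeholder for a real download; records each fetch for inspection.
    fetch_log.append(component_id)
    return "data:" + component_id


catalog = ComponentCatalog(fetch_from_walrus)
first = assemble_instance(catalog, ["eki-1", "eri-1", "emi-1"])
second = assemble_instance(catalog, ["eki-1", "eri-1", "emi-2"])
```

The second launch reuses the cached kernel and ramdisk and fetches only the new image.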


VMWare Broker

• The VMWare broker presents itself to the CC as a collection of NCs in an AZ

– The CC uses the same URL for all requests into the AZ

• The VMWare Broker either
– Translates NC commands into web service requests and sends them to different ESXi hypervisors
– Translates NC requests and sends them to a single vSphere installation

• The VMWare Broker also implements an image cache
– Image, Raw Disk, VMDK cached at the VMWare Broker
– Image template cached inside VMWare
– Template is cloned if it is available


Volumes (EBS)

• Credentialed persistent storage accessed by instances as block devices

• Local to an AZ

• Operations: create/destroy, attach/detach, describe, snapshot/“unsnapshot”
– Unsnapshot is a subset of the create logic


Volume Attach Overview

1. CLC receives attach as an EBS request

2. CLC requests volume from SC and receives Ack with IQN and encrypted password for the volume

3. CLC sends attach request with encrypted password to CC (IQN for the volume to be attached via iSCSI)

4. CC forwards request to the NC

5. SC instructs storage platform to export iSCSI target with specific IQN

6. NC decrypts password and begins to attempt attach of iSCSI device

7. CC polls NC continuously to determine status of attach

8. CLC polls CC to determine status of the attach


Volume Attach Message Flow


Three Implementations of Volumes (Plug-ins)

• Linux + file system backed
– Linux machine hosting SC is the storage platform
– tgt exports iSCSI targets
– LVM + /dev/loop manages the storage volumes

• Linux + JBOD
– Linux machine hosting SC is the storage platform
– tgt exports iSCSI targets
– LVM + JBOD to manage volumes

• SAN-based
– SAN must be able to export iSCSI targets
– SC programs SAN remotely
  • SSH or WS
– Plug-ins for different SANs


Snapshot and Unsnapshot

• Snapshot
– Volume is “bundled” and stored in a set of Walrus buckets
– Requires explicit “puts” to Walrus
– Euca-bundle-volume

• Unsnapshot
– Reverses the process
– Euca-create-volume-from-snapshot


Volume-backed Instances

• Called “Boot from EBS” in Eucalyptus parlance

• Instance is generated from a snapshot
– Create an instance file bootable by libvirt or VMWare in a volume
– Snapshot the volume into Walrus

• Register image takes snapshot and installs EMI

• Run instance takes EMI registered from snapshot
– Eucalyptus uses create-vol-from-snapshot logic to create a volume on the fly
– NC uses volume attach protocol to gain access to newly created volume
– Boots instance from iSCSI on the hypervisor


Walrus

• Implements S3 object store
– Create/destroy buckets
– Put/get objects
– Object metadata (e.g. MD5 etag)
– Append-only, eventually consistent


Walrus Put Overview

1. Request from user sent to Walrus end-point

2. Put request header parsed for object metadata
   1. AWS name converted to bucket and object name
   2. Bucket is directory name
   3. Object is file named with hash tag mapped to object name in metadata DB

3. Data streams directly to Walrus back-end for temporary file write

4. When write completes, file rename replaces old object with latest put

5. Transaction for new metadata is committed to DB via persistent Walrus object
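
The write-to-temporary-file, then rename, then commit-metadata pattern in steps 3 to 5 can be sketched in Python. This is an assumption-laden illustration rather than Walrus code: put_object, the SHA-1 file naming, and the plain dictionary standing in for the metadata DB are all hypothetical.

```python
import hashlib
import os
import tempfile


def put_object(root, bucket, key, chunks, metadata_db):
    # Step 2: the bucket is a directory; the object is a file whose name is a
    # hash, with the mapping back to the object name kept in the metadata DB.
    bucket_dir = os.path.join(root, bucket)
    os.makedirs(bucket_dir, exist_ok=True)
    file_name = hashlib.sha1(key.encode("utf-8")).hexdigest()
    final_path = os.path.join(bucket_dir, file_name)

    # Step 3: stream the data into a temporary file in the same directory.
    md5 = hashlib.md5()
    fd, tmp_path = tempfile.mkstemp(dir=bucket_dir)
    with os.fdopen(fd, "wb") as tmp:
        for chunk in chunks:
            tmp.write(chunk)
            md5.update(chunk)

    # Step 4: the rename replaces any older object with the latest put.
    os.replace(tmp_path, final_path)

    # Step 5: record the new metadata (a dict stands in for the DB transaction).
    metadata_db[(bucket, key)] = {"file": file_name, "etag": md5.hexdigest()}
    return final_path
```

Writing to a temporary file in the same directory keeps the final rename on one file system, so readers see either the old object or the new one, never a partial write.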


Walrus Put Message Flow

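[Figure: message flow diagram for an HTTP “put” request, with arrows numbered 1 to 5 matching the steps in the Walrus Put Overview; the diagram includes a “linux” label]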


Other Walrus Functionality

• Access control lists (ACL)
– Buckets and objects

• Bucket logging
– Access logs for one bucket are stored in another

• Versioning
– Objects can be read by version number

• Snapshot
– Bucket containing single “volume” object
– System generated (bypasses user authentication checks)

• Virtual bucket hosting
– DNS lookup of object as a host in cloud-local domain

• Bucket policies (IAM – planned)

• Multi-part upload (planned)


IAM and EUARE

• IAM
– Users, accounts, and delegation
– JSON predicates evaluated on each request
  • Predicates controlling operations
  • Bucket policies

• EUARE
– Eucalyptus User Accounting and Resource Environment
– Superset of IAM
  • Adds predicates describing quotas
  • Adds “administrator” rights


IAM Policy Overview

1. User uploads IAM/EUARE policy via separate endpoint

2. Policy is stored with a group in persistent object
   1. User’s access is determined by which group the user belongs to

3. Policy is carried with user and account object when a request is made to the CLC or to Walrus

4. Policy is checked just before resource allocation step in “front-end” CLC, EUARE, or Walrus
   1. Non-quota checks as part of permissions processing
   2. Quotas as part of admission control
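
Step 4 separates permission checks from quota checks. The Python sketch below illustrates that two-stage evaluation under loud assumptions: the statement shape is a simplified, hypothetical stand-in (in particular the "Limit" effect with a "Max" field is invented here for illustration and is not the real EUARE policy syntax), and the function names are made up.

```python
from fnmatch import fnmatchcase


def permitted(statements, action, resource):
    # Permissions processing: explicit Deny wins, otherwise an Allow is required.
    allowed = False
    for st in statements:
        if st["Effect"] not in ("Allow", "Deny"):
            continue
        if fnmatchcase(action, st["Action"]) and fnmatchcase(resource, st["Resource"]):
            if st["Effect"] == "Deny":
                return False
            allowed = True
    return allowed


def within_quota(statements, action, usage_after):
    # Admission control: every matching quota statement must be satisfied.
    for st in statements:
        if st["Effect"] == "Limit" and fnmatchcase(action, st["Action"]):
            if usage_after > st["Max"]:
                return False
    return True


def admit(statements, action, resource, usage_after):
    if not permitted(statements, action, resource):
        return "denied"
    if not within_quota(statements, action, usage_after):
        return "over quota"
    return "admitted"


policy = [
    {"Effect": "Allow", "Action": "ec2:*", "Resource": "*"},
    {"Effect": "Deny", "Action": "ec2:TerminateInstances", "Resource": "*"},
    {"Effect": "Limit", "Action": "ec2:RunInstances", "Max": 2},  # hypothetical quota form
]
```

Here usage_after is the resource count the user would hold if the request were granted, so a third instance is refused even though the action itself is allowed.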


IAM Message Flow

[Figure: IAM message flow diagram with arrows numbered 1 to 4, matching the steps in the IAM Policy Overview]


Reporting

• Usage report generation (e.g. for charge-back)
– Generated from the CLC DB
– Generated from an externally configured data warehouse

• Usage “sensors” built into each Eucalyptus component


Reporting Overview

1. CLC picks up raw usage information via its polling duty cycle
   – NC data communicated through the CC

2. CLC stores the data in DB as cloud metadata via persistence API

3. Data warehouse fetches raw data from web service interface to CLC

4. Data warehouse pushes raw and report-processed data to local DB


Reporting Message Flow

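[Figure: reporting message flow diagram with arrows numbered 1 to 4, matching the steps in the Reporting Overview]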


Notes on Reporting

• With Release 3.2, two possible configurations:
– CLC metadata DB is used for report generation
– External data warehouse used for report generation

• Report generation tools can be loaded either with the CLC to use the internal DB or with the data warehouse

• Using the internal CLC DB removes the requirement to configure a second external DB for the data warehouse
– Report load may slow CLC response
– Recovery from fail-over in HA may be long due to DB sync of reporting data

• Data warehouse must be configured separately and “attached” to Eucalyptus

• Transfer to data warehouse is a manual export-and-purge process


User Console

• Browser-based management dashboard

• JavaScript + Python-based User Console Proxy

• Stateless, one-to-one translation of API requests
– Compatible with current HA architecture
– Multiple console proxies can be configured into the same system for load balancing


High Availability for the Eucalyptus Platform

• Requirement: remove single point of failure from Eucalyptus installation
– Machine failure
– Process failure
– Network connectivity failure

• Requirement: avoid data corruption
– Fail-stop whenever data integrity cannot be ensured

• Goal: tolerate as many concurrent failures as possible while meeting these requirements


HA Architecture Specification

• HA implemented at the component level
– Topology independent

• Primary-secondary redundancy for all Eucalyptus components except the NC
– Automatic fail-over from primary to secondary
– Automatic recovery of secondary while primary functions alone after a failure

• VM connectivity must be preserved during fail-over and recovery
– Network “shoot-down” of VM-to-CC network state
– Primary and secondary CC must be on the same network


Deployment Architecture

[Figure: HA deployment diagram. Labels: two “linux” hosts; HA-JDBC + JGroups; DRBD; SAN]


HA Cloud State Management

• Primary CLC maintains primary-backup state for all components
– Synchronized DB access between primary and secondary CLC, each with its own DB via HA-JDBC
– Primary CLC controls primary-secondary status of all components

• Object persistence layer for all Java components uses HA-JDBC
– TreeCache replicates persistence state in memory for read efficiency

• JGroups determines causal ordering of primary-backup status at both CLCs
– Secondary CLC can start with DB state and availability status causally synchronized


HA Storage Replication and Management

• Walrus uses DRBD in synchronous mode as disk storage for ext4 file system

– DRBD configuration and primary-backup status controlled by Walrus

• SC expects HA and volume redundancy in the SAN


Network Failover

• CC represented by Java proxy on ESB
– CC itself is stateless, so DB persistence is not an issue
– Heartbeat determines primary-backup status

• On failover, new primary CC gets state from CLC and from NCs necessary to rebuild network state on CC machine

– Secondary CC, if active, clears local network state

• VMs and external connections rediscover network connectivity

– Connection loss may depend on arp and tcp timeouts
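
A heartbeat-driven primary/backup decision can be sketched as follows. This Python model is illustrative only: FailoverMonitor, the timeout value, and the use of explicit timestamps are assumptions, and the real mechanism (a Java proxy on the service bus) is not shown in the slides. When neither side has a recent heartbeat the sketch returns None, in the spirit of the fail-stop requirement.

```python
class FailoverMonitor:
    def __init__(self, primary, secondary, timeout):
        self.primary = primary
        self.secondary = secondary
        self.timeout = timeout
        self.last_seen = {}

    def heartbeat(self, name, now):
        # Record the time of the most recent heartbeat from a component.
        self.last_seen[name] = now

    def _alive(self, name, now):
        seen = self.last_seen.get(name)
        return seen is not None and now - seen <= self.timeout

    def current_primary(self, now):
        if self._alive(self.primary, now):
            return self.primary
        if self._alive(self.secondary, now):
            # Fail over: the secondary is promoted and the roles swap.
            self.primary, self.secondary = self.secondary, self.primary
            return self.primary
        return None  # no live component: stop rather than guess


monitor = FailoverMonitor("cc-a", "cc-b", timeout=5)
monitor.heartbeat("cc-a", now=0)
monitor.heartbeat("cc-b", now=0)
before = monitor.current_primary(now=3)
monitor.heartbeat("cc-b", now=8)      # cc-a has gone silent
after = monitor.current_primary(now=9)
```

After promotion, the new primary would still need to rebuild network state from the CLC and the NCs, which this sketch does not model.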


VMWare Broker failover

• VMWare Broker and CC are paired
– CLC forwards group membership state to CC
– Waits for response or failure before processing next group membership multicast
– CC determines if paired VMb is part of active group
– VMb waits for CLC state transition messages


Glossary of Acronyms

• AZ: Availability Zone
– Part of AWS API specification that denotes a partition of resources with similar non-independent failure probabilities. In Eucalyptus, a partition of resources having similar general QoS properties (including failure probabilities)

• DHCP: Dynamic Host Configuration Protocol
– Standard Layer 2 network protocol for delivering IP network configuration information to a machine or device

• DRBD: Distributed Replicated Block Device
– Linux-portable mechanism and protocol for creating replicated disks attached to machines interconnected by a network

• EBS: Elastic Block Store
– Part of AWS specification that denotes block storage devices (volumes) accessible across the network.


Glossary (2)

• ELB: Elastic Load Balancing
– AWS API for provisioning a dynamic, user-controlled load balancer

• JDBC and HA-JDBC: Java Database Connectivity
– Open source software for implementing database access in Java programs. HA is a version for implementing high availability.

• iSCSI: Internet Small Computer Systems Interface
– Internet standard protocol for transporting the SCSI disk access protocol over IP networks

• NAT: Network Address Translation
– Method for translating IP addresses in the network in a way that is transparent to the end points

• VLAN: Virtual Local Area Network
– Standard network protocol for creating Layer 2 (L2) isolated broadcast domains that share a common L2 network fabric