anynines - cloud foundry on openstack - an experience report

Post on 25-Jan-2015

Cloud Foundry on OpenStack: An Experience Report

Introduction

about.me/fischerjulian

The anynines Stack

Hardware

OpenStack

Cloud Foundry

VMware

We migrated from a rented VMware environment to a self-hosted OpenStack.

For more details on this: http://rh.gd/a9vmw2sos

Things we had to think about

OpenStack Upgrades

Before Grizzly, OpenStack was not ready for production

• The upgrade process included a lot of manual work

• No script driven upgrades

• Manual DB schema migrations

• Manual configuration file changes, etc.

"I scheduled a week of total downtime with all instances offline." - jon@jonproulx, http://rh.gd/1sNhiiz

Upcoming Upgrade: Havana > Icehouse

• Chef is used to roll out Icehouse (incl. configuration changes)

• The upgrade is well tested on a separate multi-server OpenStack staging system

Goal: <30 min downtime.

Let’s keep our fingers crossed :)

Looking forward to rolling upgrades with OpenStack Icehouse http://rh.gd/1ymhViL

• No need to shut down VMs during the upgrade

• No downtime of the entire cloud

VM availability

What kills VMs?

• Random kernel panics (kernel bug) http://rh.gd/1oBUeCc

• Hardware outages (hw & power failures)

• …

Availability Zones

• Build disjoint networks, racks, etc.

• Each disjoint zone = availability zone

• Tell OpenStack about availability zones

• When provisioning, you can choose the AZ

• Build BOSH releases accordingly
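
The placement idea behind these bullets can be sketched in a few lines of Ruby. This is only an illustration, not how BOSH or OpenStack assign zones: the helper name `spread_across_zones` and the zone names `az1` and `az2` are assumptions made for the example.

```ruby
# Hypothetical helper: distribute `count` instances of a job round-robin
# across the given availability zones.
def spread_across_zones(job, count, zones)
  raise ArgumentError, "at least one zone is required" if zones.empty?

  (0...count).map do |index|
    { job: job, index: index, zone: zones[index % zones.length] }
  end
end

# Example: four DEAs over two (assumed) zones.
spread_across_zones("dea", 4, %w[az1 az2]).each do |instance|
  puts "#{instance[:job]}/#{instance[:index]} -> #{instance[:zone]}"
end
```

With two zones, consecutive instances of the same job alternate between zones, so losing one zone removes at most about half of them.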

Aggregates

• Similar to AZ

• Not about failover

• Select hosts with certain attributes

• E.g. SSD-aggregate

• When provisioning, choose a host with SSD disks
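
Attribute-based host selection can be illustrated as a simple filter. In OpenStack the scheduler performs this matching itself; the sketch below only mirrors the idea, and the `Host` struct, the `hosts_matching` helper and the `ssd` key are hypothetical names.

```ruby
# Hypothetical model of a compute host with aggregate-style metadata.
Host = Struct.new(:name, :metadata)

# Returns the hosts whose metadata contains every required key/value pair.
def hosts_matching(hosts, required)
  hosts.select do |host|
    required.all? { |key, value| host.metadata[key] == value }
  end
end

hosts = [
  Host.new("compute-1", { "ssd" => "true" }),
  Host.new("compute-2", { "ssd" => "false" }),
  Host.new("compute-3", { "ssd" => "true" })
]

# Only hosts in the assumed SSD aggregate remain as candidates.
puts hosts_matching(hosts, { "ssd" => "true" }).map(&:name).join(", ")
```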

Load Balancing

• Not inherently clustered

• LBaaS failover can be realized using

• Pacemaker/corosync

• GlusterFS (share LB configuration)

VM Failover Strategies

Resurrect

• Monitor VM

• Re-Build VMs automatically

• e.g. using Cloud Foundry BOSH

• + Easy

• - Takes a long time (minutes, not seconds)

• - OpenStack doesn’t release persistent disks automatically
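
A minimal sketch of the resurrect strategy: monitor each VM and rebuild it after several consecutive failed checks. This is not BOSH's implementation; the `Resurrector` class, its callbacks and the `max_failures` threshold are assumptions made for illustration.

```ruby
# Hypothetical resurrector: the health check and the rebuild action are
# injected as callables so the logic stays independent of any cloud API.
class Resurrector
  def initialize(healthy:, rebuild:, max_failures: 3)
    @healthy = healthy
    @rebuild = rebuild
    @max_failures = max_failures
    @failures = Hash.new(0)
  end

  # Runs one monitoring pass and returns the VMs that were rebuilt.
  def check(vms)
    vms.select do |vm|
      if @healthy.call(vm)
        @failures[vm] = 0
        false
      else
        @failures[vm] += 1
        if @failures[vm] >= @max_failures
          @rebuild.call(vm)
          @failures[vm] = 0
          true
        else
          false
        end
      end
    end
  end
end
```

Requiring several failed checks before rebuilding avoids recreating a VM because of a single missed probe, at the cost of a longer time to recovery.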

Failover to Standby VM

• Provide stand-by VM

• Monitor VM and perform failover

• e.g. using Pacemaker

• + Fast failover (seconds)

• - Pacemaker is not easy to use

• - Increased resource usage by standby VM(s)
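
The standby approach boils down to a heartbeat with a timeout. Pacemaker does far more than this; the sketch below only shows the core decision, and the `StandbyFailover` class, its parameters and the VM names are hypothetical.

```ruby
# Hypothetical heartbeat monitor: if the active VM has not reported within
# `timeout` seconds, the standby takes over the active role.
class StandbyFailover
  attr_reader :active, :standby

  def initialize(active:, standby:, timeout:, now:)
    @active = active
    @standby = standby
    @timeout = timeout
    @last_heartbeat = now
  end

  # Called whenever the active VM reports in.
  def heartbeat(now)
    @last_heartbeat = now
  end

  # Called periodically; returns true if a failover was performed.
  def tick(now)
    return false if now - @last_heartbeat <= @timeout

    @active, @standby = @standby, @active
    @last_heartbeat = now
    true
  end
end
```

Times are passed in explicitly (for example from `Time.now.to_f`) so the behaviour can be checked without waiting for real seconds to pass.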

IP Failover

Three ways to fail over IPs

Load Balancer

• + Fast

• + Easy (use lb weights)

• - LB becomes a bottleneck

• Once OpenStack supports HAProxy (L3), this drawback can be eliminated
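
Failover through load balancer weights can be sketched as a weighted pool in which a failed backend is drained by setting its weight to zero. The `BackendPool` class and the backend names are assumptions for this example, not part of any OpenStack API.

```ruby
# Hypothetical weighted pool: traffic is distributed in proportion to the
# weights, and a backend with weight 0 receives no traffic.
class BackendPool
  def initialize(weights)
    @weights = weights.dup
  end

  # Takes a backend out of rotation.
  def drain(backend)
    @weights[backend] = 0
  end

  # Picks a backend for a value r in [0, 1); defaults to a random value.
  def pick(r = rand)
    total = @weights.values.inject(0, :+)
    raise "no backend available" if total.zero?

    threshold = r * total
    @weights.each do |backend, weight|
      threshold -= weight
      return backend if threshold < 0
    end
    # Floating-point rounding fallback: last backend that still has weight.
    @weights.select { |_backend, weight| weight > 0 }.keys.last
  end
end

pool = BackendPool.new("vm-a" => 1, "vm-b" => 1)
pool.drain("vm-a") # after this call, every pick returns "vm-b"
```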

Floating-IPs

• + Easy

• + Fast

• - Only for public networking

NIC Re-attachment

• + No network bottleneck

• + No dependencies on other services

• - Slightly higher failover time (several seconds)

Implications for Cloud Foundry

Accept that VMs are ephemeral

Distribute CF components across OpenStack availability zones

• 2 * UAA

• 2 * CC

• 2 * n * DEAs

• 2 * Health Manager

• …
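
Whether a given layout actually survives the loss of one zone can be checked mechanically. The following is a sketch under assumed data: the `components_at_risk` helper, the component keys and the zone names are hypothetical.

```ruby
# Returns the components that would disappear entirely if a single
# availability zone failed, i.e. those placed in fewer than two zones.
def components_at_risk(placement)
  placement.select { |_component, zones| zones.uniq.length < 2 }.keys
end

placement = {
  "uaa"            => %w[az1 az2],
  "cc"             => %w[az1 az2],
  "dea"            => %w[az1 az2 az1 az2],
  "health_manager" => %w[az1 az1] # both instances in the same zone
}

puts components_at_risk(placement).inspect
```

Running two instances is not enough on its own: in the example both health manager instances sit in the same zone, so the check reports that component.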

UAA & CC DB = SPOF

HA Postgres

• UAA and Cloud Controller database

• Single point of failure for Cloud Foundry

• Postgres is not inherently clusterable > failover with a standby VM

• Master/slave replication

• Pacemaker/corosync

• IP-Failover using NIC-reattachment

That’s halfway towards a PostgreSQL CF Service

• Add a V2 Service Broker

• Add provisioning logic

• Provision a 2-node DB cluster on cf create-service postgres medium-cluster
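
A rough sketch of the broker side, assuming the V2 Service Broker API shape (a catalog of services with plans, and a provision request carrying a `plan_id`). All ids, the `NODES_PER_PLAN` table and the `provision` helper are hypothetical, and the actual cluster creation (for example via BOSH) is left out.

```ruby
require "json"

# Catalog as a V2 broker would return it from GET /v2/catalog.
# The ids are placeholders invented for this example.
CATALOG = {
  services: [
    {
      id: "postgres-service-id",
      name: "postgres",
      description: "PostgreSQL with automatic failover",
      bindable: true,
      plans: [
        {
          id: "medium-cluster-plan-id",
          name: "medium-cluster",
          description: "Two-node master/standby cluster"
        }
      ]
    }
  ]
}.freeze

# Assumed mapping from plan id to cluster size.
NODES_PER_PLAN = { "medium-cluster-plan-id" => 2 }.freeze

# Handles the body of PUT /v2/service_instances/:instance_id and returns a
# description of the cluster that would have to be deployed.
def provision(instance_id, request_body)
  plan_id = JSON.parse(request_body)["plan_id"]
  node_count = NODES_PER_PLAN[plan_id]
  raise ArgumentError, "unknown plan #{plan_id.inspect}" unless node_count

  { instance_id: instance_id, nodes: node_count }
end
```

Keeping the plan-to-topology mapping in one table means a new plan only needs a catalog entry and one additional row.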

CF Service Design

• Use clusterable services if possible

• Implement automatic failover if not

• Autoprovisioning using BOSH

• Organize self-healing

• (Semi-)Automatic recovery from degraded mode

Summary

• VMware’s high availability options are nice

• OpenStack helped us cut costs by 50%

• OpenStack is stable enough to run Cloud Foundry on top

• OS hardening is required and feasible

Open Source OpenStack and Open Source Cloud Foundry are SME’s best friends!

Questions?

Thank you!

Preparing for disaster recovery

• Cinder Volume Snapshots

OpenStack Backups

OpenStack Swift

• Open source Amazon S3 replacement

• Object store with RESTful interface

• Scales horizontally to petabyte dimensions

• Fully redundant, highly available

• CF service > App Asset Storage
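
The RESTful interface can be illustrated by building an object upload request with Ruby's standard library. In the Swift API an object lives at `<storage URL>/<container>/<object>` and requests carry an `X-Auth-Token` header; the storage URL, container, token and the helper name `swift_put_request` below are placeholders. The request is only constructed, not sent.

```ruby
require "net/http"
require "uri"

# Builds (but does not send) a PUT request that would store `body` as an
# object. Object names with special characters would need URL escaping,
# which this sketch omits.
def swift_put_request(storage_url, container, object_name, body, token)
  uri = URI("#{storage_url}/#{container}/#{object_name}")
  request = Net::HTTP::Put.new(uri)
  request["X-Auth-Token"] = token
  request.body = body
  request
end

request = swift_put_request(
  "https://swift.example.com/v1/AUTH_demo", # placeholder storage URL
  "app-assets",
  "logo.png",
  "binary data",
  "example-token"
)
puts "#{request.method} #{request.path}"
```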

Code

    require "fileutils"
    require "find"
    require "fog"

    class Blobstore
      def initialize(connection_config, directory_key, cdn=nil, root_dir=nil)
        @root_dir = root_dir
        @connection_config = connection_config
        @directory_key = directory_key
        @cdn = cdn
      end

      # True when the blobstore is backed by the local file system.
      def local?
        @connection_config[:provider].downcase == "local"
      end

      # True when an object is stored under the given key.
      def exists?(key)
        !file(key).nil?
      end

      # (the excerpt on the slide ends here)
    end
