orchestration panel at cloud connect 2010

Orchestration: The Next Frontier for Cloud Applications

Alex Honor, ControlTier Project

Damon Edwards, DTO Solutions Inc.

John Willis, Opscode Inc.

Mark Hinkle, Zenoss Inc.

Duncan Johnston-Watt, CloudSoft Corporation

You worried about…

Now you worry about…

How do you make all of that work together in the cloud?

Orchestration!

The Path to Orchestration


1. Bring “Dev”, “Ops”, and “Biz” points-of-view and practices into alignment



See also: #DevOps



2. Fully automated infrastructure

See also: #DevOps




See also: “Infrastructure as Code”

Alex Honor (ControlTier)

Damon Edwards (DTO Solutions)

John Willis (Opscode)

Mark Hinkle (Zenoss)

Duncan Johnston-Watt (CloudSoft)

Q&A with Panel

Moderator

Agenda

John Willis
VP of Services - Opscode, Inc.

Orchestration and System Administration

Infrastructure is Hard

Fully Automated Infrastructure is Hard!

1999: Inventory, packaged file transfers and desktops

2005: Unattended bare metal servers “very very” hard. 7k Nodes took 5 days w/90 success

2007: Unattended bare metal in under 10 minutes. Fully configured in under 3 mins

2008: Unattended server in 2 minutes. 5000 servers in a week

2010: 10k Nodes in under 5 minutes

Managing Infrastructure Is Hard
Has Always Been

1980

1989

1999

2001

Proprietary Solutions

• Solve very little of the problem...

• Reach just a handful of large, enterprise customers

• Require custom implementations with large professional services

• Deployed exclusively on-premise

• Acquired by companies with large consulting organizations (IBM, HP, CA)

Open Source Solutions

Cfengine
Started in 1993 by Mark Burgess. He created a scientific approach to model systems and set a new paradigm for CM. DSL-based, declarative, abstract, convergent and self-documenting configuration management.

Puppet
Founded in 2005 by Luke Kanies. Frustrated with Cfengine syntax and its ability to adapt to real-world configuration management, he made a quantum leap in making a DSL easier to use for declarative, abstract, convergent and self-documenting configuration management.

Chef
Founded in 2008 by Adam Jacob. A community leader working with Puppet on massively scalable, fully automated infrastructures, he saw the problem as a “systems integration” problem first and configuration management as a subcomponent.

Industry Shifts

Infrastructure is changing

• Easier to get (good!) ...but harder to manage (bad!)

• Demand is dynamic

• Developers are crucial to Operations

• Web / Cloud services are proliferating ...and Enterprise is following along

• Manual configuration no longer a crutch

• Few tools to solve a ubiquitous problem

Core Principles

• System Integration

• Infrastructure as Code

• Infrastructure API

• Community involvement

• Zero touch

Infrastructure as Code

Nodes -- Where recipes are applied

Roles -- Allow you to group together nodes

Cookbooks -- Recipes, Definitions, Attributes, Libraries, Files and Templates

Resources -- The basic unit of work in Chef - a resource might be a package, file or service

Providers -- A provider takes actions on resources. A node decides what provider should be used by default.

Metadata -- Defines cookbook dependencies and additional parts.
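To make the vocabulary above concrete, here is a minimal sketch in Python of how resources, providers, recipes, roles, and nodes might relate. It is not Chef's Ruby DSL or API: every class and function name is hypothetical, the "haproxy" recipe is an invented example, and the "system" is an in-memory dict so the sketch is safe to run anywhere.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    """A unit of work: the desired state of one package, file, or service."""
    kind: str    # e.g. "package" or "service"
    name: str
    state: str   # desired state, e.g. "installed" or "running"


class Provider:
    """Takes action on a resource; here it only mutates an in-memory dict."""

    def converge(self, resource, system):
        key = (resource.kind, resource.name)
        if system.get(key) == resource.state:
            return False              # already in the desired state
        system[key] = resource.state  # "act" to reach the desired state
        return True


@dataclass
class Node:
    """Where recipes are applied. `system` stands in for the real machine."""
    name: str
    roles: list
    system: dict = field(default_factory=dict)


# Hypothetical data: recipes are ordered lists of resources, roles name recipes.
RECIPES = {
    "haproxy": [
        Resource("package", "haproxy", "installed"),
        Resource("service", "haproxy", "running"),
    ],
}
ROLES = {"load_balancer": ["haproxy"]}


def converge_node(node, provider=None):
    """Apply every recipe of every role; return the resources that changed."""
    provider = provider or Provider()
    changed = []
    for role in node.roles:
        for recipe in ROLES[role]:
            for resource in RECIPES[recipe]:
                if provider.converge(resource, node.system):
                    changed.append(resource)
    return changed


lb = Node("lb1", roles=["load_balancer"])
first_run = converge_node(lb)
second_run = converge_node(lb)
print(len(first_run), len(second_run))
```

The second run changes nothing because the node already matches the declared state, which is the "convergent" property the slides attribute to this family of tools.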

Cookbooks

Distribution

Recipes, Attributes

Assets

Definitions, LWRP, Libraries

Metadata

Load Balancer Example

Alex Honor
Project Leader, ControlTier Open Source Project

Orchestration and Application Administration

Classic Application Administration Problem




Complexity!
Changing procedures!
Environment differences!
Lack of repeatability!

Clouds Make it Worse




Complexity!
Changing procedures!
Environment differences!
Lack of repeatability!
+
Transient infrastructure!
Dynamic scale!
Multiple providers!

Command Dispatcher Provides

• Abstraction at several levels
– Nodes
– Services
– Management Procedures

• Sequenced or parallel execution

• Plug-in control modules
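As a rough illustration of these three properties, the sketch below selects nodes by tag (abstraction), runs a command in sequence or in parallel, and delegates the actual work to a swappable function (plug-in control module). It does not reproduce the API of any dispatcher named in this talk: the inventory, the `echo_module` plug-in, and the `dispatch` signature are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical inventory: node name -> set of tags used for selection.
NODES = {
    "web1": {"web"},
    "web2": {"web"},
    "db1": {"db"},
}


def echo_module(node, command):
    """Plug-in control module. A real one might use SSH; this one only echoes."""
    return f"{node}: {command}"


def dispatch(command, tag, module=echo_module, parallel=False):
    """Run `command` on every node carrying `tag`, in sequence or in parallel."""
    targets = sorted(name for name, tags in NODES.items() if tag in tags)
    if parallel:
        with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
            # map() yields results in target order even though calls overlap
            return list(pool.map(lambda n: module(n, command), targets))
    return [module(n, command) for n in targets]


sequential = dispatch("restart", tag="web")
in_parallel = dispatch("restart", tag="web", parallel=True)
print(sequential)
```

Passing a different `module` function is how this sketch stands in for plug-in control modules; the caller never needs to know how a command reaches a node.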


Example: Cluster Management

• Coordinate actions within a larger procedure

• Roll sets of tasks across sets of nodes

• Manage as whole or logical slices
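One way to picture "rolling sets of tasks across sets of nodes" is a loop that finishes a whole procedure on one logical slice of the cluster before touching the next. The sketch below assumes hypothetical step names (`drain`, `deploy`, `enable`) and no-op actions; it illustrates the idea and is not ControlTier's workflow engine.

```python
def slices(nodes, size):
    """Split the node list into consecutive logical slices of at most `size`."""
    return [nodes[i:i + size] for i in range(0, len(nodes), size)]


def rolling_procedure(nodes, steps, slice_size):
    """Run all steps on one slice before touching the next slice.

    Returns a log of (node, step) pairs in execution order and stops at the
    first failing step so the remaining slices are left untouched.
    """
    log = []
    for batch in slices(nodes, slice_size):
        for step_name, action in steps:
            for node in batch:
                if not action(node):
                    raise RuntimeError(f"{step_name} failed on {node}")
                log.append((node, step_name))
    return log


# Hypothetical steps; each action returns True on success.
STEPS = [
    ("drain", lambda node: True),
    ("deploy", lambda node: True),
    ("enable", lambda node: True),
]

log = rolling_procedure(["app1", "app2", "app3", "app4"], STEPS, slice_size=2)
print(log[:3])
```

Setting `slice_size` to the cluster size manages the cluster as a whole; a smaller value keeps part of the cluster in service while the rest is being changed.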

Example: Scale Differences

Wednesday 04:00


Wednesday 11:00

Example: Self-Service

Command Dispatcher Projects

• Capistrano (capify.org)

• Fabric (fabfile.org)

• Func (fedorahosted.org/func)

Example command dispatchers…

Command Dispatcher Projects

• ControlTier (controltier.org)
– Workflow system on top of dispatcher
– Web-based GUI and command line tools
– Fine-grain access controls
– Logging and reporting framework
– Integrated with CMDB

Example command dispatchers (cont’d)…

Mark Hinkle
VP of Community, Zenoss Inc.

Orchestration and Monitoring

Legacy IT

Cartoon originally copyrighted by the authors; G. Renee Guzlas, artist

Different perspective, lack of coordination

Legacy Monitoring Perspective

Types of Monitoring

• Availability Monitoring – Binary, Moment in Time

• Performance Monitoring – Two Dimensions, Time and State

• Change Management – Comparisons of states in Time

• Event Management – Normalizing Randomness

• Synthetic Transactions – Simulated Experiences

• Business Service Management (BSM) – $$$ Consequences of IT Performance

Data Collection

• SNMP

• SSH

• WMI

• Syslog

• Proprietary Agents

The Myth of the Nines

Availability           Downtime per Year   Downtime per Month   Downtime per Week
99.9% (three nines)    8.76 hours          43.2 minutes         10.1 minutes
99.95%                 4.38 hours          21.56 minutes        5.04 minutes
99.99% (four nines)    52.6 minutes        4.32 minutes         1.01 minutes
99.999% (five nines)   5.26 minutes        25.9 seconds         6.05 seconds
99.9999% (six nines)   31.5 seconds        2.59 seconds         0.605 seconds

• Average polling interval for monitoring? 5 minutes?

• Even superhuman operations people can’t be alerted and take action in under 5 minutes.

• One outage per year could drop service level to three nines.
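The downtime budgets behind "the nines" are simple arithmetic, sketched below in Python. The 365-day year and 30-day month are assumptions chosen to match the rounded figures on the slide, and the 10-minute response time in the last line is a made-up number for illustration.

```python
SECONDS = {
    "year": 365 * 24 * 3600,
    "month": 30 * 24 * 3600,   # assumes a 30-day month
    "week": 7 * 24 * 3600,
}


def downtime_seconds(availability_pct, period):
    """Allowed downtime in seconds for an availability target over a period."""
    return SECONDS[period] * (1 - availability_pct / 100)


def availability_after_outage(outage_seconds, period="year"):
    """Availability (%) if a single outage of the given length occurs."""
    return 100 * (1 - outage_seconds / SECONDS[period])


for pct in (99.9, 99.99, 99.999):
    minutes = downtime_seconds(pct, "year") / 60
    print(pct, round(minutes, 1), "minutes per year")

# One 5-minute polling interval plus a hypothetical 10 minutes to respond
print(round(availability_after_outage(15 * 60), 4))
```

With a five-minute polling interval, detection alone can use up nearly the entire yearly budget for five nines, which is the point the bullets above are making.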

Legacy Systems Management: Fragmented Awareness

[Diagram: separate Provisioning, Performance & Availability Management, and Configuration Management products, each with its own database, process server, and agents (plus analytics servers and a configuration database), sitting on Physical/Virtual/Cloud Infrastructure that hosts virtual machines, operating systems, and applications]

Global dashboard is a difficult mash-up of disparate systems or doesn’t exist. No communication, no automation.

Multiple data models across disciplines with no common object model

Each management discipline has its own separate product (UI, process, database, and domain-specific language)

Multiple agents required for each discipline and platform

Unlegacy Systems Management: Integrated Model, Interactive, Automated

Example – Broadcast Company

• Servers are automatically built using configuration management software

• As servers are brought into service, configuration management inserts hosts into the CMDB used by the monitoring database

• One-way interaction between configuration management and monitoring system

• Reports are generated to determine which systems are compliant

Large premium television content provider serves national cable network with content served from Linux servers.

Example - Geeknet

• Servers are automatically built using configuration management software

• Discovery tool finds infrastructure and populates a CMDB then spits out information to scripts that translate information to BIND configurations for DNS

• Monitoring tool adds hosts to polling tool to start monitoring servers for availability

• As infrastructure changes, systems are updated automatically

• Servers can be spun up and managed automatically in minutes, not hours, with little or no human interaction

Hundreds of servers, serving web, databases, and other infrastructure for some of the world’s most highly trafficked websites – over 40 million visitors per month.
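The discovery-to-DNS step in this example could look roughly like the sketch below: host records exported from a CMDB are rendered as BIND zone-file A records and also handed to the monitoring side as polling targets. The record layout, host names, addresses, and function names are assumptions for illustration; the slides do not describe Geeknet's actual scripts.

```python
import ipaddress

# Hypothetical CMDB export: one dict per discovered host.
CMDB_HOSTS = [
    {"name": "web1", "ip": "10.0.0.11"},
    {"name": "db1", "ip": "10.0.0.21"},
]


def to_a_records(hosts, ttl=300):
    """Render discovered hosts as BIND zone-file A records, sorted by name."""
    lines = []
    for host in sorted(hosts, key=lambda h: h["name"]):
        ip = ipaddress.ip_address(host["ip"])  # raises ValueError on bad input
        if ip.version != 4:
            raise ValueError(f"{host['name']}: A records need an IPv4 address")
        lines.append(f"{host['name']}\t{ttl}\tIN\tA\t{ip}")
    return lines


def polling_targets(hosts):
    """Addresses the monitoring tool should start polling for availability."""
    return sorted(host["ip"] for host in hosts)


records = to_a_records(CMDB_HOSTS)
print("\n".join(records))
```

Because both outputs derive from the same CMDB records, a host added by discovery shows up in DNS and in monitoring without a separate manual step.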

Unlegacy Future: DevOps

Operations

Development

Duncan Johnston-Watt
CEO, CloudSoft Corporation

Orchestral Manoeuvres in the Cloud

The Application Mobility Manifesto

• Application mobility is the ability to …
– Dynamically change some or all of the infrastructure that an application is using without any disruption of service
– Optimize the location of application components in the cloud
– Bridge the gap between your private cloud and trusted third party cloud services providers

• Application mobility is achieved by orchestrating the cloud

• Application mobility is the “Missing Link” in Cloud Computing

Demo: Application Mobility in Action

• EzBrokerage is implemented using CloudSoft’s Monterey middleware platform

• EzBrokerage benefits from two complementary policies

– Workload policy: ensures the service is adequately resourced based on server demand by managing the size of a pool and distribution of workload across it

– Geolocation policy: ensures the service is hosted in the right region based on client demand by managing the overall distribution of workload across multiple resource pools or clouds
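The two policies can be pictured as two small decisions: how many servers the pool needs, and how those servers are spread across regions. The sketch below is a simplified stand-in, not Monterey's implementation or API; the capacity figure, region names, demand numbers, and proportional-split rule are all assumptions.

```python
import math


def workload_policy(requests_per_sec, capacity_per_server, min_servers=1):
    """Pool size needed so that no server exceeds its assumed capacity."""
    needed = math.ceil(requests_per_sec / capacity_per_server)
    return max(min_servers, needed)


def geolocation_policy(demand_by_region, total_servers):
    """Split servers across regions in proportion to client demand.

    Uses largest-remainder rounding so the shares add up to total_servers.
    """
    total = sum(demand_by_region.values())
    if total == 0:
        return {region: 0 for region in demand_by_region}
    exact = {r: total_servers * d / total for r, d in demand_by_region.items()}
    shares = {r: math.floor(x) for r, x in exact.items()}
    leftover = total_servers - sum(shares.values())
    by_remainder = sorted(exact, key=lambda r: exact[r] - shares[r], reverse=True)
    for region in by_remainder[:leftover]:
        shares[region] += 1
    return shares


# Hypothetical client demand in requests per second, keyed by region.
demand = {"us-east": 600, "eu-west": 300, "ap-south": 100}
pool = workload_policy(sum(demand.values()), capacity_per_server=150)
placement = geolocation_policy(demand, pool)
print(pool, placement)
```

Re-running both functions as demand shifts is what would drive workload from one pool or cloud to another in this simplified picture.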

Alex Honor (controltier.org) @alexhonor

Damon Edwards (dtosolutions.com) @damonedwards

John Willis (opscode.com) @botchagalupe

Mark Hinkle (zenoss.com) @mrhinkle

Duncan Johnston-Watt (cloudsoftcorp.com) @duncanjw
