austin agile conf 2012 infrastructure automation-gmiranda

Post on 27-Jan-2015

Agility Through Infrastructure Automation

George Miranda

gmiranda@opscode.com

Austin Agile Conference 2012

November 16, 2012

Friday, November 16, 12

Introductions

# finger $(whoami)
Login: gmiranda                 Name: George Miranda
Directory: /home/gmiranda       Shell: /bin/bash
On since Mon 14 Apr 1997 18:01 (GMT) on tty1 from :0
No mail on gmiranda@opscode.com
Plan:
twitter:    gmiranda23
github:     gmiranda23
irc:        gmiranda23 (irc.freenode.net - #chef)
community:  gmiranda23 (community.opscode.com)
role:       consultant, evangelist, trainer, *:*

Scope

Automation + Culture = Agility

• Infrastructure Automation Approaches

• Infrastructure & Automation Best Practices

• Cultural Pitfalls

• Making more awesome

What this talk is not

• Chef vs. Puppet

• Cloud All The Things!!!

• How to structure your Organization

• Which Development Model to adopt

System Build Approaches

http://www.flickr.com/photos/dancedaoc/3083836988/sizes/z/in/photostream/

Complications

• “That one host” you know you can’t rebuild

• Untracked configuration change

• Collections of Bash, PERL, Python, ???

• Rebuild from: wiki, cheatsheets, folklore

http://www.flickr.com/photos/humblog/4996661110/sizes/l/in/photostream/

Unprecedented Growth

[Chart: Physical Hardware and Virtual Nodes over time, 1980 to 2015. Eras marked: 1980 Mainframe, 1990 Client/Server, 2000 Datacenter, 2010+ Cloud.]

The things that got us here…

…must change to get us here!

The Rise of Configuration Management

http://www.flickr.com/photos/24375810@N06/6611017007

We have a problem at scale

Here’s a hint

Cabling?

Close...

http://www.flickr.com/photos/michaelheiss/3090102907/

Complexity

Infrastructure

Items we manipulate

• Routes

• Users

• Groups

• Tasks

• Packages

• Software

• Services

• Nodes

• Networking

• Files

• Directories

• Symlinks

• Mounts

• Ruby Gems

• Python Modules

• Java Artifacts

• Disks

• Volumes

• Filesystems

• Firewall Rules

See Node

[Diagram: Application]

See Nodes

[Diagram: Application, Application Database]

See Nodes Grow

[Diagram: Application, Application Databases]

Grow Nodes

[Diagram: App Servers, Application Databases]

...Grow

[Diagram: Load Balancer, App Servers, Application Databases]

Grow Nodes, Grow

[Diagram: Load Balancers, App Servers, Application Databases]

Grow Nodes, Grow

[Diagram: Load Balancers, App Servers, App DB Cache, Application Databases]

Infrastructure has a Topology

[Diagram: Load Balancers, App Servers, App DB Cache, Application Databases]

Infrastructure IS a Snowflake

[Diagram: Load Balancers, App Servers, App DB Cache, Application Databases, Floating IP?]

Complexity Increases Quickly

[Diagram: App LBs, App Servers, NoSQL, DB slaves, Cache, DB Cache, DBs]

... Increases Very Quickly

[Diagram: DC1, DC2, DC3]

Configuration Management

http://www.flickr.com/photos/philliecasablanca/3354734116/

Sysadmins

The Past

Manual Configuration

• Labor intensive

• Error prone

• Hard to reproduce

• Unsustainable

http://www.flickr.com/photos/pureimaginations/4805330106/

Scripting

• Typically very brittle

• Throwaway, one-off scripts

• grep sed awk perl

• curl | bash

http://www.flickr.com/photos/40389360@N00/2428706650/

File Distribution

• NFS mounts

• rdist

• scp-on-a-for-loop

• rsync on cron

http://www.flickr.com/photos/walkadog/4317655660

This used to be awesome.

for i in `cat servers.txt` ; do scp ntp.conf root@$i:/etc/ntpd.conf ; done

for i in `cat servers.txt` ; do ssh root@$i /etc/init.d/ntpd restart ; done

for i in `cat servers.txt` ; do ssh root@$i chkconfig ntpd on ; done

• ^ does not scale

http://www.flickr.com/photos/alexerde/3479006495

Execution Management

• Cluster SSH

• ISConf

• Golden Images

Typical Boring Infrastructure

[Diagram: Jboss App, Memcache, Postgres Slaves, Postgres Master, Nagios, Graphite; callouts numbered 1 to 6]

• Move SSH off port 22

• Let's put it on 2022

• Edit /etc/ssh/sshd_config
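A minimal sketch of turning that hand edit into a repeatable step, so the same change is not retyped on each numbered host. The function name set_sshd_port and the sample config text are illustrative assumptions, not from the talk. It only rewrites the text of the file; it ignores Match blocks and setups that listen on several ports, and it does not restart sshd.

```python
import re

# Matches an active or commented-out "Port N" line in sshd_config.
PORT_LINE = re.compile(r"^\s*#?\s*Port\s+\d+\s*$")


def set_sshd_port(config_text: str, port: int) -> str:
    """Return config text with exactly one active Port directive."""
    wanted = f"Port {port}"
    out = []
    replaced = False
    for line in config_text.splitlines():
        if PORT_LINE.match(line):
            if not replaced:
                out.append(wanted)
                replaced = True
            # Any further Port lines are dropped so only one remains.
            continue
        out.append(line)
    if not replaced:
        out.append(wanted)
    return "\n".join(out) + "\n"


# Hypothetical starting config, as shipped by many distributions.
original = "#Port 22\nPermitRootLogin no\n"
once = set_sshd_port(original, 2022)
twice = set_sshd_port(once, 2022)
print(once)
```

Running the function on its own output changes nothing, which is what makes it safe to apply to every host on every run.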

Maintenance Window

[Diagram: Jboss App, Memcache, Postgres Slaves, Postgres Master, Nagios, Graphite; callouts numbered 1 to 12]

• Launch, Delete

• Repeat

• Typically manually

• Don’t break anything!

• Bob just got fired =(

Different IP Addresses?

[Diagram: Jboss App, Memcache, Postgres Slaves, Postgres Master, Nagios, Graphite]

• Invalid configs!

Systems Integration

• Keep a list of current resources

• Collect vast amounts of data on those resources

• Quickly search through stacks of current resource data

• Generate your Infrastructure Topology from a current source of truth

http://www.flickr.com/photos/fotos_medem/3399096196/
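The bullets above can be sketched in a few lines: keep a list of nodes, search it, and generate a config from whatever the search returns. Everything named here (NODES, search, render_backends, the hostnames, IP addresses, and the "server" line format) is a hypothetical stand-in, not an API from Chef or any other tool.

```python
# Hypothetical inventory. In practice this would be a live source of truth,
# not a list typed into a script.
NODES = [
    {"name": "app1", "role": "app", "ip": "10.0.1.11"},
    {"name": "app2", "role": "app", "ip": "10.0.1.12"},
    {"name": "db1", "role": "postgres_master", "ip": "10.0.2.21"},
]


def search(nodes, role):
    """Return every node that currently holds the given role."""
    return [node for node in nodes if node["role"] == role]


def render_backends(nodes, port=8080):
    """Build load balancer backend lines from the app nodes that exist now."""
    app_nodes = sorted(search(nodes, "app"), key=lambda node: node["name"])
    return "\n".join(f"server {n['name']} {n['ip']}:{port}" for n in app_nodes)


print(render_backends(NODES))
```

When an app server is added, removed, or comes back with a different IP address, regenerating from the inventory yields a valid config without anyone editing it by hand.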

So when this...

[Diagram: Jboss App, Memcache, Postgres Slaves, Postgres Master, Nagios, Graphite]

... becomes this...

[Diagram: Jboss App, Memcache, Postgres Slaves, Postgres Master, Nagios, Graphite]

That can happen automatically

[Diagram: Jboss App, Memcache, Postgres Slaves, Postgres Master, Nagios, Graphite]

Copyright © 2010 Opscode, Inc - All Rights Reserved

Managing Complexity Today

How Do We Manage This at Cloud Scale?

• Thousands of infrastructure dependencies and configurations needed for each change.

• Huge Amounts of Time

• Increased Cost of Correction of Manual Errors

• Huge Need for Talent

• Risk of Critical Skills Shortage

Google, Amazon, Microsoft, Yahoo built their own tools

but it was “secret sauce”

everyone else was here

... inexperienced & poorly equipped for the world they must now operate in.

Infrastructure

“It is common to think in terms of individual machines rather than view an entire infrastructure as a combined whole”

“A good infrastructure, whether departmental, divisional, or enterprise-wide, is a single loosely-coupled virtual machine, with hundreds or thousands of hard drives and CPU's.”

-- Bootstrapping an Infrastructure, USENIX LISA ’98

Infrastructure as Code

• Programmatically provision and configure

• Treat like any other code base

• Gives you tools to manage complexity while being flexible enough to evolve with your Infrastructure

• Reconstruct the business from code repository, data backup, and baremetal resources

Declarative Syntax

• Define policy

• Say what, not how

• Abstraction between platforms

• Many positive side effects
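A rough illustration of "say what, not how": the desired state is plain data, and a separate function works out which actions, if any, are needed. The Service class, the plan function, and the action strings are assumptions made for this sketch; they are not Chef's resource model.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Service:
    """State of one service on one node (hypothetical model)."""
    name: str
    installed: bool
    running: bool


def plan(current: Service, desired: Service) -> List[str]:
    """Compare current state with policy and list the actions needed."""
    actions = []
    if desired.installed and not current.installed:
        actions.append(f"install {desired.name}")
    if desired.running and not current.running:
        actions.append(f"start {desired.name}")
    if not desired.running and current.running:
        actions.append(f"stop {desired.name}")
    return actions


# Policy: ntpd should be installed and running. How to get there is not stated.
policy = Service("ntpd", installed=True, running=True)
fresh_node = Service("ntpd", installed=False, running=False)
print(plan(fresh_node, policy))
print(plan(policy, policy))
```

The policy never mentions a package manager or an init script, which is where the abstraction between platforms would live in a real tool.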

Idempotence

• You’ll hear this a lot

• Property of declarative interface

• Eliminates brittleness of scripting

• Applying it twice equals applying it once: f(f(x)) = f(x)

• Safe to repeat
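A small sketch of "safe to repeat", contrasting a script-style append with an idempotent version of the same change. The function names and the hosts-file entry are made up for illustration.

```python
def append_line(lines, entry):
    """Script style: always append, whether or not the entry is there."""
    return lines + [entry]


def ensure_line(lines, entry):
    """Idempotent style: add the entry only if it is missing."""
    return lines if entry in lines else lines + [entry]


# Hypothetical /etc/hosts content and entry.
hosts = ["127.0.0.1 localhost"]
entry = "10.0.0.5 db-master"

# Run each operation twice, as a repeated automation run would.
scripted = append_line(append_line(hosts, entry), entry)
ensured = ensure_line(ensure_line(hosts, entry), entry)

print(len(scripted))  # duplicate entry: 3 lines
print(len(ensured))   # single entry: 2 lines
```

Running ensure_line a second time leaves the result unchanged, while the append version drifts further from the intended state on every run.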

Chef is a Tool

http://www.flickr.com/photos/wessexandy/7690486884/sizes/c/in/pool-96164123@N00/

Wax Philosophical

• We are artists & masters of our craft

• Everyone needs great tools

• Nobody remembers Picasso’s paintbrush

http://www.flickr.com/photos/vgm8383/2686128924/sizes/l/

The core ideas in this talk:

Automation + Culture = AGILITY!

Pitfalls

http://www.flickr.com/photos/nesposit/2787559303/sizes/o/in/photostream/

This should sound familiar...

Traditional thinking

Dev’s job is to add new features

Ops’ job is to keep the site stable and fast

http://www.flickr.com/photos/stewart/461099066/

Slide Courtesy of John Allspaw - http://www.slideshare.net/jallspaw/10-deploys-per-day-dev-and-ops-cooperation-at-flickr

Ops’ job is NOT to keep the site stable and fast

Dev’s job is NOT to add new features

OUR job is to ENABLE our business

Our business REQUIRES change

BUT CHANGE IS THE CAUSE OF MOST OUTAGES!

Choose:

Discourage change in the interests of stability

OR

Allow change to happen as often as it needs to

http://www.flickr.com/photos/gsfc/6795048198/sizes/o/in/photostream/

The Great Abyss

The right culture is a requirement for survival & success at web scale.

Lessons Learned: Every Post-mortem

Ever...

Root Cause: “Bad Luck... it was a perfect storm of impossible events”

Lesson #1: “We have a bunch of manual processes which we need to automate”

Lesson #2: “We introduced too many changes at once”

Slide Courtesy of John Allspaw - http://www.slideshare.net/jallspaw/ops-metametrics-the-currency-you-pay-for-change

Image Courtesy of John Allspaw - http://www.slideshare.net/jallspaw/ops-metametrics-the-currency-you-pay-for-change

RAAAWR!!! I’m SCARY!

Images Courtesy of John Allspaw - http://www.slideshare.net/jallspaw/ops-metametrics-the-currency-you-pay-for-change

I can haz cuddle?

MAKE MORE AWESOME!!!

Continuous Delivery

• Faster Time to Value

• Higher Availability

• Happier Teams

• More Cool Stuff

Stuff Suits Care About

• Visibility & Accountability

• Reduce Risk

• Business Agility

Stuff Engineers Care About

• Change when we need it

• Innovate Faster

• Constant Improvements

• Application & Site Resiliency

Recap

• Step 1) Automate your Infrastructure

• Step 2) Bridge the Cultural Divide

• Step 3) Profit!

• Automation + Culture = Agility

Try it out!

• Hosted Chef is a SaaS product hosted by Opscode

• http://manage.opscode.com

• Our wiki: http://wiki.opscode.com

• Fast start guide:

• http://wiki.opscode.com/display/chef/Fast+Start+Guide

• Our Community site: http://community.opscode.com

• Cookbooks in our Github account: http://github.com/opscode/cookbooks

• The materials for our 3-day Chef Fundamentals class are online:

• https://github.com/opscode/chef-fundamentals

Supported Platforms

• Ubuntu (10.04, 10.10, 11.04, 11.10, 12.04)

• Debian (5.x, 6.x)

• RHEL & CentOS (5.x, 6.x)

• Fedora 10+

• SUSE Enterprise (11.2)

• openSUSE (12.1)

• Solaris (5.9, 5.10, 5.11 -- x86 and SPARC)

• Mac OS X (10.4, 10.5, 10.6, 10.7)

• Windows 7

• Windows Server 2003 R2, 2008, 2008 R2

Additional Resources

• Opscode YouTube Channel:

• http://www.youtube.com/opscode

• Jesse Robbins, Changing Culture & Being a force for Awesome

• http://www.youtube.com/watch?v=OU8ihx3nT6I

• Matt Ray on Automating Continuous Deployment

• http://www.opscode.com/blog/2012/11/13/automating-continuous-deployment-wchef/

• Continuous Delivery by Jez Humble & David Farley

• http://continuousdelivery.com/

Thanks!

• George Miranda

• gmiranda@opscode.com

• @gmiranda23

Questions?

• On freenode: #chef and #chef-hacking

• http://lists.opscode.com

• http://tickets.opscode.com

• http://help.opscode.com

• @opscode and @opscode_status on Twitter
