feature flagging your infrastructure for fun and profit
DESCRIPTION
At Etsy we run our production stack exclusively on physical hardware. Including developer and CI VMs, we have about 1000 nodes managed by a single Chef server, which runs the same cookbooks across development and production nodes. We built the knife-spork plugin to support our workflow, and 30 engineers make changes about 20 times a day. While this setup keeps configuration changes and package installations consistent across our network, it makes it harder to gradually roll out and test changes on a per-node or per-role basis. The talk first introduces our general workflow and explains how we enable engineers to frequently roll out small changes. It then covers our workflow for testing changes and the library we wrote to enable config flags and gradual rollouts for our infrastructure within our Chef recipes.
TRANSCRIPT
Feature Flagging your Infrastructure for Fun and Profit
Daniel Schauenberg
[email protected]
@mrtazz
Tuesday, September 10, 13
LAMMP
Item by TheBackPackShoppe
Item by FrankelPhotos
Monolithic App
Etsy Infrastructure
• ~1400 nodes
• ~30 dev & ops engineers making changes regularly
• Open Source Chef server + GitHub Enterprise
• Default environment setup (production, development, testing)
jonlives/knife-spork
knife-spork
• Workflow to manage cookbook and environment changes
• Versioned cookbooks and pinned environments
• Specific workflow, different ways of using plugins
% chef-shell
chef > recipe_mode
chef:recipe > echo off
chef:recipe > include_recipe "apache"
chef:recipe > run_chef
% review -r jcowie --cc ops
% knife spork check apache
% knife spork bump apache
% git commit
% git push
% knife spork upload apache
Staging Deploy
jonlives/knife-flip
% knife node flip node.etsy.com testing
% knife role flip testRole testing
% knife spork promote apache
% git commit
% git push
% knife spork promote apache --remote
Production Deploy
Monitoring
19:18:06 irccat | CHEF: Daniel Schauenberg promoted [email protected] to development https://github.etsycorp.com/gist/12345
etsy/chef-handlers
19:20:00 irccat | Chef run failed on test.etsy.com
19:20:00 irccat | https://github.etsycorp.com/gist/12347
jgoulah/knife-lastrun
% knife node lastrun test.etsy.com
Graphs!
Downsides
• Longer testing blocks others
• Staged cookbooks can accidentally be promoted
• Testing environment affects more than one cookbook
• Used “upgrade” environments to circumvent
Feature Flags
Branching in Code
• Well established pattern for “dark launches”
• Used in the Etsy Web stack
• Allows for restricted roll outs
• http://code.flickr.net/2009/12/02/flipping-out/
etsy/chef-whitelist
chef-whitelist
• Data bag driven whitelist
• Library to include in cookbooks
• Easy access to feature flags
whitelist data bag
{
  "id": "my_whitelist",
  "patterns": [
    "host.example.com",
    "*.subdomain.example.com",
    "prefix*.example.com"
  ]
}
whitelist data bag
{
  "id": "my_whitelist",
  "patterns": [
    "host.example.com",
    "*.subdomain.example.com",
    "prefix*.example.com"
  ],
  "roles": [
    "Webserver",
    "DatabaseServer"
  ]
}
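The data bag pairs host name patterns with roles. A minimal Ruby sketch of how such a whitelist could be evaluated is given here; it is not the chef-whitelist implementation. The helper name `whitelisted?`, the use of `File.fnmatch` for the wildcard patterns, and the rule that either a host match or a role match is enough are all assumptions made for illustration.

```ruby
# Hypothetical helper, NOT the chef-whitelist code.
# Assumption: a node is whitelisted when its host name matches any glob
# pattern OR it carries at least one of the listed roles.
def whitelisted?(whitelist, fqdn, node_roles = [])
  patterns = whitelist.fetch("patterns", [])
  roles    = whitelist.fetch("roles", [])

  # File.fnmatch gives shell-style glob matching for the "*" wildcards.
  host_match = patterns.any? { |pattern| File.fnmatch(pattern, fqdn) }
  role_match = (roles & node_roles).any?

  host_match || role_match
end

# Same content as the data bag on the slide, as a plain Ruby hash.
MY_WHITELIST = {
  "id" => "my_whitelist",
  "patterns" => [
    "host.example.com",
    "*.subdomain.example.com",
    "prefix*.example.com"
  ],
  "roles" => ["Webserver", "DatabaseServer"]
}

puts whitelisted?(MY_WHITELIST, "web01.subdomain.example.com")
puts whitelisted?(MY_WHITELIST, "other.example.com", ["Webserver"])
puts whitelisted?(MY_WHITELIST, "other.example.com", ["Cache"])
```

The first call matches through a host pattern, the second through a role, and the third matches neither.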
feature flags in recipe
if node.is_in_whitelist? "my_whitelist"
  # new hawtness
else
  # old stuff
end
Customizable
node.is_in_whitelist? "my_whitelist", "acl", "hosts"
real world example
{
  "id": "php-5-4-19",
  "patterns": [
    "dschauenberg.vm.dev.etsy.com",
    "web0270.etsy.com",
    "api04.etsy.com",
    "imgcache01.etsy.com",
    "imgwriter01.etsy.com",
    "worker01.etsy.com",
    "beacon01.etsy.com",
    "paymentsweb01.etsy.com"
  ],
  "roles": [
  ]
}
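To show how a whitelist like this could drive a gradual upgrade, here is a small Ruby sketch that runs outside of Chef. `FakeNode` is a hypothetical stand-in for a Chef node; in a real recipe the `is_in_whitelist?` method comes from chef-whitelist. The version strings and the `php_version_for` helper are placeholders for illustration, and only two of the listed hosts are copied in to keep it short.

```ruby
# Hypothetical stand-in for a Chef node so the sketch runs without Chef.
# It only mimics the node.is_in_whitelist? call shown on the slides.
class FakeNode
  def initialize(fqdn, whitelists)
    @fqdn = fqdn
    @whitelists = whitelists
  end

  # Assumption: host patterns are matched as shell-style globs.
  def is_in_whitelist?(name)
    patterns = @whitelists.fetch(name, {}).fetch("patterns", [])
    patterns.any? { |pattern| File.fnmatch(pattern, @fqdn) }
  end
end

# Subset of the real world whitelist, keyed by data bag item id.
WHITELISTS = {
  "php-5-4-19" => {
    "patterns" => ["web0270.etsy.com", "api04.etsy.com"],
    "roles" => []
  }
}

# Placeholder version strings: whitelisted nodes get the new package,
# every other node stays on whatever was installed before.
def php_version_for(node)
  if node.is_in_whitelist?("php-5-4-19")
    "5.4.19"
  else
    "previous"
  end
end

puts php_version_for(FakeNode.new("web0270.etsy.com", WHITELISTS))
puts php_version_for(FakeNode.new("web0001.etsy.com", WHITELISTS))
```

Widening the rollout then means adding hosts or roles to the data bag item; the recipe itself stays unchanged until the flag is cleaned up.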
Advantages
• Easy to access list of what gets upgraded
• Upgrades don’t need the spork workflow
• Pattern already known by all engineers
Downsides
• Changes outside the regular workflow
• No graphs (yet)
• Less visible cleanup required
Summary
• GitHub Enterprise, Dev VMs, chef-shell as development environment
• Chef Server and knife-spork as Deployment System
• Feature flagging with chef-whitelist
• Monitoring, Notifications, Graphs
Further information
• http://codeascraft.etsy.com/
• http://codeascraft.com/2013/08/02/infrastructure-upgrades-with-chef/
• http://www.slideshare.net/jonlives/michelin-starred-cooking-with-chef
• http://www.slideshare.net/mcdonnps/lessons-from-etsy-avoiding-kitchen-nightmares-chefconf-2012
• https://github.com/etsy
Thank you! Questions?