Inside the Chef Push Jobs Service - ChefConf 2015
TRANSCRIPT
Chef Push in 2015
Mark Anderson (Engineer, Chef), 2015-04-01
The basics of Chef Push
If you want to run a command on a set of nodes, `knife ssh` can be problematic:
• Key distribution/revocation
• Access control/user accounts
• Difficult to audit
• Extra work required if the node is behind a firewall
• Doesn't really scale very far past tens of nodes
• None of the alternative systems suited our needs
Why Chef Push?
• We wanted a remote execution system that:
  • Is robust under network and client failure
  • Gates execution on a quorum being available
  • Provides presence information
  • Scales to hundreds if not thousands of nodes
  • Is integrated with the Chef authentication and authorization system
  • Works behind firewalls and NAT
Why Chef Push?
• `knife job start --quorum 90% 'chef-client' --search 'role:webapp'`
  • Finds all nodes with role webapp
  • Submits a job to the push server
  • Checks quorum; 90% of the listed nodes must be available
  • Starts the chef-client job on the available nodes
  • Gathers successes and failures
  • And will do this for ten nodes... or a thousand
Push jobs in a command line
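The quorum gate in that flow boils down to an availability-ratio check. A minimal Ruby sketch (illustrative names, not the actual Push implementation):

```ruby
# Illustrative quorum gate: run the job only if enough of the targeted
# nodes are currently available. Integer math avoids float rounding.
def quorum_met?(available, targets, quorum_pct)
  return false if targets.empty?
  available.size * 100 >= targets.size * quorum_pct
end

targets   = (1..10).map { |i| "web#{i}" }
available = targets - ["web7"]                   # one node is down

quorum_met?(available, targets, 90)              # => true  (9/10 = 90%)
quorum_met?(available - ["web3"], targets, 90)   # => false (8/10 = 80%)
```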
The lifecycle of a job (server ↔ client sequence):
• Job accepted by the server
• Server sends the command to the clients
• Clients ACK
• Server waits for quorum, then signals start of execution
• Clients execute
• Server collects results
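The lifecycle above can be viewed as a small state machine; a sketch in Ruby (state names are inferred from the slide, not the server's actual states):

```ruby
# Job states inferred from the lifecycle slide; transitions not listed
# here are treated as illegal.
TRANSITIONS = {
  new:      [:accepted],
  accepted: [:voting],                  # command sent, waiting for ACKs
  voting:   [:running, :quorum_failed], # quorum reached or not
  running:  [:complete],
}.freeze

def advance(state, to)
  allowed = TRANSITIONS.fetch(state, [])
  raise ArgumentError, "illegal transition #{state} -> #{to}" unless allowed.include?(to)
  to
end

state = :new
%i[accepted voting running complete].each { |nxt| state = advance(state, nxt) }
state  # => :complete
```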
• Erlang service
• Extends the Chef REST API
  • Job creation and tracking
  • Push client configuration
• Controls the clients via ZeroMQ
  • Heartbeats to track node availability
  • Command execution
  • All ZeroMQ packets are signed
Chef Push Server
• Simple Ruby client
  • Receives heartbeats from the server
  • Sends heartbeats back to the server
  • Executes commands
• Configuration requirements are minimal
  • The client initiates all connections to the server
  • Most configuration comes from a Chef API call to the config endpoint
  • Using that info, the client opens ZeroMQ connections to the server
Chef Push Client
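Presence from heartbeats amounts to tracking when each node was last heard from; a hypothetical sketch (the interval and miss count are illustrative, not the real defaults):

```ruby
# Illustrative presence tracking: a node counts as available if a
# heartbeat arrived within `allowed_misses` heartbeat intervals.
class PresenceTracker
  def initialize(interval: 10, allowed_misses: 3)
    @deadline  = interval * allowed_misses
    @last_seen = {}
  end

  def heartbeat(node, now = Time.now.to_i)
    @last_seen[node] = now
  end

  def available?(node, now = Time.now.to_i)
    seen = @last_seen[node]
    !seen.nil? && (now - seen) <= @deadline
  end
end

tracker = PresenceTracker.new(interval: 10, allowed_misses: 3)
tracker.heartbeat("web1", 1_000)
tracker.available?("web1", 1_010)  # => true  (10s since last beat)
tracker.available?("web1", 1_031)  # => false (31s > 3 * 10s)
```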
Chef Push Networking
[Diagram: the client reaches the server's REST API over HTTPS, subscribes to the heartbeat generator over a ZeroMQ PUB/SUB socket, and exchanges messages with the message switch over a DEALER/ROUTER socket pair.]
• All control for Push is via extensions to the Chef API
  • Node status
  • Job control
    • start
    • stop
    • status
  • Job listing
Chef Push knife extension
• Access rights controlled by groups
  • The 'push_job_writers' group controls job creation and deletion
  • The 'push_job_readers' group controls read access to job status and results
• Whitelist for commands
  • The client rejects commands that aren't on the whitelist
• We'd like to do finer-grained access control in the future
Access control
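A whitelist check of this sort is essentially a name-to-command lookup that refuses anything unlisted; a sketch (the entries here are made-up examples, not a real configuration):

```ruby
# Illustrative whitelist: job names map to the commands the client is
# allowed to run; anything else is rejected.
WHITELIST = {
  "chef-client" => "chef-client",
  "ntp-sync"    => "ntpdate -u pool.ntp.org",
}.freeze

def resolve_command(job_name)
  WHITELIST.fetch(job_name) do
    raise ArgumentError, "command not whitelisted: #{job_name}"
  end
end

resolve_command("chef-client")  # => "chef-client"
# resolve_command("rm -rf /")   # raises ArgumentError
```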
• Version 1.0 scales to 2k nodes
  • Works with Chef 12
  • Open source since fall 2014
• We've been working on new features since last spring
  • But Chef 12 had to ship first
  • The new features required functionality from Enterprise Chef
  • Open sourcing Chef Push would have been pretty meaningless without an open source server
Status:
New Features in Chef Push 2.0
• Breaking change to the protocol
• End-to-end encryption of every packet
  • Required for us to implement the parameter-passing and output-return features
• Built on the ZeroMQ 4 implementation of CurveCP
  • CurveCP provides a framework that is fast, crypto-hardened against modern attacks, and offers forward secrecy
• We still bootstrap the authentication using the Chef client key
End to End Encryption
Enhanced control over the job execution environment:
• A config file of up to 100k
• Effective user
• Working directory
• Environment variables
  • User-defined variables
  • Special variables for
    • job id
    • job file location
Command environment and config files
• New flag for a job
  • capture_output: boolean
• Capture is all or nothing
  • All nodes in the job
  • Both stdout and stderr
• Stored on the server with the job description
• No streaming output... yet
Command output capture
Two event feeds
• Per-org feed
  • Job start
  • Job completion summary
  • Runs forever
• Per-job feed with fine-grained execution data
  • Job voting start
  • Quorum votes by node
  • Job start
  • Completion state by node
  • Job completion
Server Sent Event Feeds
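These feeds are delivered as Server-Sent Events; a minimal Ruby sketch of splitting such a stream into events (the event names shown are illustrative, not Push's actual schema):

```ruby
# Minimal SSE parsing sketch: events are separated by blank lines and
# carry `event:` and `data:` fields (per the SSE wire format).
def parse_sse(stream)
  stream.split("\n\n").reject(&:empty?).map do |chunk|
    fields = Hash.new { |h, k| h[k] = [] }
    chunk.each_line do |line|
      name, _, value = line.chomp.partition(": ")
      fields[name] << value
    end
    { event: fields["event"].first, data: fields["data"].join("\n") }
  end
end

feed = "event: start\ndata: job J1\n\nevent: quorum_vote\ndata: web1 yes\n\n"
parse_sse(feed)
# => [{:event=>"start", :data=>"job J1"},
#     {:event=>"quorum_vote", :data=>"web1 yes"}]
```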
• Previously we've been advertising around 2k as the limit
• 10k connected nodes demonstrated
  • 10-second heartbeats
  • c3.2xlarge Chef server in standalone mode
  • The push server consumes 2 cores and about 2 GB
• Up to 1k nodes in a single job
  • Around 1.5-2k nodes we start seeing some stampede problems
• Not done scaling; there are a few tweaks left to do
Stable at 10k connected nodes
Demo: some improvements
• That test was done with real push clients
  • 20 m3.2xlarge nodes
  • Each running 500 Docker containers
• But we also do a lot of testing using a simulator
  • Understanding the limits of our current system
  • SystemTap is amazing for this kind of work
Current work: Scalability and Stability drive
Axes of scaling tested:
• Number of active clients
• Heartbeat rate per client
• Number of clients in a single job
Below 10k clients there is a pretty linear trade between heartbeat rate and the number of connected clients; heartbeats/sec was a useful metric.
Care must be taken to avoid stampedes in job execution.
Scaling and Tuning
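The heartbeats/sec metric above is just the client count divided by the heartbeat interval, which is why the trade-off is roughly linear:

```ruby
# Aggregate heartbeat load on the server: clients / interval.
def heartbeats_per_sec(clients, interval_sec)
  clients.to_f / interval_sec
end

heartbeats_per_sec(10_000, 10)  # => 1000.0 (the 10k-node test above)
heartbeats_per_sec(2_000, 2)    # => 1000.0 (same load, fewer clients)
```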
• A port in ZeroMQ is bound to a single thread
• All communications go through a single 'command switch'
  • Client heartbeats and all command messages go through the switch
  • The switch ended up being a bottleneck at around 2k messages/sec
• Experiment: multiple command switches
  • Exercises some weaknesses in the ZeroMQ-Erlang interface
  • Not as big a win as hoped, and ended up being more complex than we'd like
Lessons from scaling
Nearly feature complete, but:
• Remaining work for new features
  • knife push extensions for everything
  • Documentation
• Windows testing and stability
  • Committed to making Windows a first-class citizen
• CentOS 7
• Polish around installation and cookbooks
• Upgrade tooling for 1.0 -> 2.0
• Bug fixes
  • Please file bugs
Remaining work for 2.0
Roadmap for 2.1 and beyond
• Currently we support
  • Ubuntu 10.04, 12.04, and 14.04 LTS
  • CentOS 5 and 6, with 7 coming soon
  • Windows (client only)
• Investigating client support for
  • AIX
  • Solaris
Platform Support
• Key rotation support
  • Multiple keys break some assumptions about how we authenticate in Push
  • Needs fixes on the Chef server as well as Push
• Better access control
  • Controlling access on a node-by-node basis
  • Examining persistent jobs as a first-class object with their own ACLs - look for the RFC
Features for 2.x releases
• Integration into the Chef client package
  • We delayed joining the two because of the breaking protocol changes in 2.0
  • Future server versions will be backward compatible
Features for 2.x releases
Scaling
• Rate-limited job execution
  • Prevents the stampede effect
  • Protects both Push and the Chef server
  • Starting 1k chef-client runs at once is a bad idea anyway
  • Per-job and server-global limits
• Multiple-socket command switch
  • Biggest scaling bottleneck
• Infrastructure for a distributed server
Features for 2.x releases
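Rate-limiting job starts can be as simple as batching: only so many node runs begin per tick, and the rest queue. A sketch (the batch size is illustrative, not a real Push setting):

```ruby
# Illustrative stampede avoidance: split the job's nodes into batches
# so at most `per_tick` chef-client runs start in any one tick.
def start_batches(nodes, per_tick)
  nodes.each_slice(per_tick).to_a
end

nodes = (1..10).map { |i| "node#{i}" }
start_batches(nodes, 4).map(&:size)  # => [4, 4, 2]
```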
• Move push connections to the front ends in tiered Chef
  • Push will run on all of the front-end nodes
  • Expected to improve scaling
• Better HA support
  • Move to a true active-active model on the back end
• Scaling
  • Our goal is to scale with the Chef server
Future major releases - 3.x and beyond
Protocol changes required
• Complex networks are difficult; proxies are hard
• ZeroMQ was helpful at first, but we're hitting its limitations
  • Stability problems at scale
  • Erlang doesn't need a lot of what ZeroMQ brings
• Backward compatibility will be a priority
Future major releases - 3.x and beyond
• Office hours
  • Currently Monday and Wednesday, 12:00 PST
• chef-push is the master repository
  • github.com/chef/chef-push
  • File issues here
  • Specific issues and PRs are fine to file against the individual repos
• Pull requests always welcome
• RFCs for major new features