Orchestrating Infrastructure Change Using Puppet Rake, MCollective, LM and Jenkins
Presented by Anton Gurov and Chaminda Delpagodage, Paydiant, at Puppet Camp Boston 2014
Application Deployment Orchestration with Puppet and Jenkins
Anton Gurov, Chaminda Delpagodage
August 20, 2014
About Us
Chaminda Delpagodage
Paydiant Technical Operations Team
Release Engineering, Systems Administration, Automation
linkedin.com/in/chamindad
Anton Gurov
Paydiant Technical Operations Team
Infrastructure, Systems Administration, Security
linkedin.com/in/antongurov
Cloud-based mobile wallet solution
Open ecosystem for mobile payments, offers and loyalty
Completely white-label
“Bank grade” platform of shared services
↘ SaaS
↘ Secure SDKs for iPhone and Android
Top tier investors and well capitalized
Paydiant Puppet Use
Puppet Enterprise (PE) users since day one
100% PE coverage of Paydiant platform
↘ PE handles everything after instance bootstrap
Multiple environments actively managed by PE
↘ 4 Puppet Masters in multiple datacenters and security zones
↘ 8 Environments
Licensed node count doubling every year
[Chart: nodes under management (hosts, axis 0 to 900) by year: 2011, 2012, 2013 and 2014 (estimated by year-end)]
Paydiant Puppet Use
‘11-12 – Bi-annual production platform releases
↘ Waterfall – major platform change
↘ Big outage – 1-2 days on the weekend
‘13-14 – Transition to daily/weekly non-production and monthly production releases
↘ Agile – smaller platform changes
↘ Zero-downtime deployment
↘ 100% Production release success rate since inception
Heavy usage of Puppet Dashboard, Puppet APIs and Jenkins
Puppet Dashboard as data repository
Why Dashboard?
↘ Visual, flexible, powerful (if used right)
↘ Allows for business data edits by teams unfamiliar with Puppet
↘ Hiera not available at the time
Decided early on to keep Puppet code and data separate
Came up with our own Dashboard pattern – “Classes, Parameters and Supergroups”
[Diagram labels: Puppet Module (code); Puppet Dashboard (business data); Puppet Module (parameters)]
Puppet Dashboard as data repository
Classes, Parameters and Supergroups pattern overview
[Diagram: nodes 1 to X belong to supergroup_type_A, which includes the groups class_A, class_B, class_C, … and parameters_X, parameters_Y, parameters_Z, …]
Puppet Dashboard as data repository
Classes, Parameters and Supergroups pattern overview
[Diagram: the same layout for supergroup_type_B, which includes the groups class_A, class_B, class_C, … and parameters_X, parameters_Y, parameters_Z, …, with its nodes below]
Puppet Dashboard as data repository
Class building block
Group name prefixed with class_
Contains Puppet class and some default variables/parameters for the class
[Diagram: groups class_A, class_B and class_C each define default params and include the matching Puppet class (class A, class B, class C)]
Puppet Dashboard as data repository
Class building block - example
Puppet Dashboard as data repository
Parameters building block
Group name prefixed with parameters_
Only contains data and data overrides
Arbitrary hierarchy levels
Allows for inheritance and reuse
[Diagram: parameters_X defines default params; parameters_X_1 and parameters_X_2 each include parameters_X and add params overrides and additional params; supergroup_A, supergroup_B and supergroup_C include these parameter groups]
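The inheritance idea can be sketched in shell, assuming (as the "params overrides" labels suggest) that a value defined in a child group replaces the one inherited from its parent. This is only an illustration of the lookup order, not how Dashboard resolves parameters internally; `merge_params` and the sample keys are hypothetical names.

```shell
# merge_params: read key=value lines on stdin, parent groups first, child groups last.
# Assumption: a later definition of a key replaces an earlier one.
# The first-seen order of keys is preserved in the output.
merge_params() {
    awk -F= '
        index($0, "=") > 0 {
            key = $1
            if (!(key in value)) order[++n] = key
            value[key] = substr($0, length(key) + 2)
        }
        END {
            for (i = 1; i <= n; i++) print order[i] "=" value[order[i]]
        }
    '
}
```

For example, `printf '%s\n' 'db_host=SET ME!' 'pool_size=10' 'db_host=db1.staging' | merge_params` prints `db_host=db1.staging` followed by `pool_size=10`: the parent's placeholder is replaced by the child's override.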
Puppet Dashboard as data repository
Parameters building block – inheritance example
Puppet Dashboard as data repository
Supergroup building block == server “role”
Group name prefixed with supergroup_
Contains all the “ingredients” for the node to configure and define itself
Node can belong to only one supergroup (many-to-one)
[Diagram: supergroup_type_A includes class_A, class_B, parameters_X and parameters_Z, and defines params overrides and additional params (if any); node 1 and node 2 belong to it]
Puppet Dashboard as data repository
Supergroup building block - example
2-3 pages condensed
Classes, Parameters and Supergroups pattern Pros
All parameters and classes are visible on the Supergroup page
↘ See missing parameters (for example, a “SET ME!” placeholder inherited from the parent)
↘ See parameter clashes (Dashboard will warn if parameter is defined in 2 places)
↘ See exactly where parameter is defined
Allows teams unfamiliar with Puppet to make changes via Dashboard
Arbitrary data hierarchy/inheritance
Data reuse
Classes, Parameters and Supergroups pattern Cons
Version control is difficult
↘ Have to resort to group cloning/export/import (custom Rake copy/clone command from Puppet support)
↘ Puppet roadmap to fix this
Dashboard UI could use some help
↘ Too much data on the screen sometimes
↘ Lack of sorting/grouping
Can’t store complex multi-line variables like text blobs
Zero-Downtime Deployment architecture …
High-level platform representation
[Diagram: Frontend Load Balancer in front of FE-A (v.1) and FE-B (v.1); Backend Load Balancer in front of BE-A (v.1) and BE-B (v.1); version label: v.1]
parameters_deployment-staging-FE-BankA: paydiant_deployment_bank=STAGING-FRONTEND-A, paydiant_app_operation_mode=LIVE, paydiant_app_version=1
parameters_deployment-staging-BE-BankA: paydiant_deployment_bank=STAGING-BACKEND-A, paydiant_app_operation_mode=LIVE, paydiant_app_version=1
parameters_deployment-staging-FE-BankB: paydiant_deployment_bank=STAGING-FRONTEND-B, paydiant_app_operation_mode=LIVE, paydiant_app_version=1
parameters_deployment-staging-BE-BankB: paydiant_deployment_bank=STAGING-BACKEND-B, paydiant_app_operation_mode=LIVE, paydiant_app_version=1
Disable B (FE+BE)
[Diagram: FE-A, FE-B, BE-A and BE-B all at v.1; version labels: v.1, v.1]
parameters_deployment-staging-FE-BankB: paydiant_deployment_bank=STAGING-FRONTEND-B, paydiant_app_operation_mode=MAINTENANCE, paydiant_app_version=1
parameters_deployment-staging-BE-BankB: paydiant_deployment_bank=STAGING-BACKEND-B, paydiant_app_operation_mode=MAINTENANCE, paydiant_app_version=1
Run first phase of database changes (i.e. adds new stuff & migrate data)
DB changes Phase 1 (v.2a)
[Diagram: FE-A, FE-B, BE-A and BE-B all at v.1; version label: v.2a]
Upgrade B (FE+BE)
[Diagram: FE-A and BE-A at v.1; FE-B and BE-B at v.2; version labels: v.2a, v.2a]
parameters_deployment-staging-FE-BankB: paydiant_deployment_bank=STAGING-FRONTEND-B, paydiant_app_operation_mode=MAINTENANCE, paydiant_app_version=2
parameters_deployment-staging-BE-BankB: paydiant_deployment_bank=STAGING-BACKEND-B, paydiant_app_operation_mode=MAINTENANCE, paydiant_app_version=2
Re-enable B (FE+BE)
[Diagram: FE-A and BE-A at v.1; FE-B and BE-B at v.2; version labels: v.2a, v.2a]
parameters_deployment-staging-FE-BankB: paydiant_deployment_bank=STAGING-FRONTEND-B, paydiant_app_operation_mode=LIVE, paydiant_app_version=2
parameters_deployment-staging-BE-BankB: paydiant_deployment_bank=STAGING-BACKEND-B, paydiant_app_operation_mode=LIVE, paydiant_app_version=2
Disable A (FE+BE)
[Diagram: FE-A and BE-A at v.1; FE-B and BE-B at v.2; version labels: v.2a, v.2a]
parameters_deployment-staging-FE-BankA: paydiant_deployment_bank=STAGING-FRONTEND-A, paydiant_app_operation_mode=MAINTENANCE, paydiant_app_version=1
parameters_deployment-staging-BE-BankA: paydiant_deployment_bank=STAGING-BACKEND-A, paydiant_app_operation_mode=MAINTENANCE, paydiant_app_version=1
Upgrade A (FE+BE)
[Diagram: FE-A, FE-B, BE-A and BE-B all at v.2; version labels: v.2a, v.2a]
parameters_deployment-staging-FE-BankA: paydiant_deployment_bank=STAGING-FRONTEND-A, paydiant_app_operation_mode=MAINTENANCE, paydiant_app_version=2
parameters_deployment-staging-BE-BankA: paydiant_deployment_bank=STAGING-BACKEND-A, paydiant_app_operation_mode=MAINTENANCE, paydiant_app_version=2
Re-enable A (FE+BE)
[Diagram: FE-A, FE-B, BE-A and BE-B all at v.2; version labels: v.2a, v.2a]
parameters_deployment-staging-FE-BankA: paydiant_deployment_bank=STAGING-FRONTEND-A, paydiant_app_operation_mode=LIVE, paydiant_app_version=2
parameters_deployment-staging-BE-BankA: paydiant_deployment_bank=STAGING-BACKEND-A, paydiant_app_operation_mode=LIVE, paydiant_app_version=2
Run second phase of database changes (Cleanup old v.1 data)
DB changes Phase 2 (v.2)
[Diagram: FE-A, FE-B, BE-A and BE-B all at v.2; version label: v.2]
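The order of the steps shown on these slides can be sketched as a small shell driver. This is a dry-run illustration only: `set_mode`, `set_version`, `db_phase` and `rolling_upgrade` are hypothetical names, and the default step functions just print what a real implementation would do through the Rake API, MCO and the database tooling.

```shell
# Hypothetical step functions; these defaults only print the step (dry run).
set_mode()    { echo "mode $1 $2"; }     # would update paydiant_app_operation_mode
set_version() { echo "version $1 $2"; }  # would update paydiant_app_version
db_phase()    { echo "db phase $1"; }    # would run the database changes for that phase

# rolling_upgrade VERSION: walk the two banks in the order shown on the slides.
# Each step runs only if the previous one succeeded.
rolling_upgrade() {
    new_version=$1
    set_mode B MAINTENANCE &&
    db_phase 1 &&
    set_version B "$new_version" &&
    set_mode B LIVE &&
    set_mode A MAINTENANCE &&
    set_version A "$new_version" &&
    set_mode A LIVE &&
    db_phase 2
}
```

Calling `rolling_upgrade 2` with the defaults prints the eight steps in order. Chaining with `&&` means a failed step leaves the remaining bank untouched and still serving traffic.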
Details of the upgrade sequence …
Putting a set of nodes into maintenance mode
[Diagram: FE-A, FE-B, BE-A and BE-B all at v.1; version label: v.1]
Putting nodes into maintenance mode
Using LB node health check – http://nodeX:8080/healthcheck.jsp
Puppet ERB template for healthcheck.jsp content
………
Pseudo code:
Check if “maintenance mode”
    throw exception
else
    If “module A” present
        Check if module A is up
    If “module B” present
        Check if module B is up
    …
Throw 503 if any exception caught
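The same decision can be sketched in shell. The real page is a JSP rendered from an ERB template, so this only illustrates the logic; `healthcheck_status` is a hypothetical name, and the module checks are passed in as arbitrary commands that succeed when the module is up.

```shell
# healthcheck_status MODE [CHECK_COMMAND...]
# Prints 503 when the node is in maintenance mode or any module check fails,
# otherwise 200. MODE mirrors paydiant_app_operation_mode (LIVE or MAINTENANCE).
healthcheck_status() {
    mode=$1
    shift
    if [ "$mode" = "MAINTENANCE" ]; then
        echo 503
        return
    fi
    for check in "$@"; do
        if ! "$check"; then
            echo 503
            return
        fi
    done
    echo 200
}
```

With the standard `true` and `false` commands standing in for module checks, `healthcheck_status LIVE true false` prints 503, so the load balancer would stop routing to that node.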
Putting nodes into maintenance mode cont.
A parameter group controls the maintenance mode
E.g. Parameter group “parameters_deployment-staging-BankB” controls “paydiant_app_operation_mode” for the nodes in set FE-B of the Staging environment
Putting nodes into maintenance mode cont.
Update group parameter using Rake API (as ‘puppet-dashboard’ user)
RACK_ENV=production /opt/puppet/bin/rake -s -X -f /opt/puppet/share/puppet-dashboard/Rakefile 'nodegroup:variables[parameters_deployment-staging-BankB,paydiant_app_operation_mode=MAINTENANCE]'
Puppet run-once using MCO (as ‘peadmin’ user)
mco puppet runonce --with-fact fact_paydiant_deployment_bank=STAGING-FRONTEND-B
While loop… check the health check page till all nodes return 503 (i.e. in maintenance) status
mco shellcmd --with-fact fact_paydiant_deployment_bank=STAGING-FRONTEND-B --cmd='curl --silent http://localhost:8080/healthcheck/healthcheck.jsp'
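The three steps above (update the group parameter, trigger a Puppet run, poll the health check) could be wrapped in one function along these lines. This is a sketch under assumptions: `switch_mode`, `node_statuses`, `all_match` and `NODE_URLS` are hypothetical names, and the status codes are collected with plain curl against a list of URLs, because the output format of the custom `mco shellcmd` plugin is not shown in the slides.

```shell
RAKE=${RAKE:-/opt/puppet/bin/rake}
RAKEFILE=${RAKEFILE:-/opt/puppet/share/puppet-dashboard/Rakefile}
MCO=${MCO:-mco}

# Set one variable on a Dashboard group (run as the 'puppet-dashboard' user).
set_group_variable() {
    RACK_ENV=production "$RAKE" -s -X -f "$RAKEFILE" "nodegroup:variables[$1,$2]"
}

# Hypothetical helper: print one HTTP status code per node, one per line.
# Assumption: NODE_URLS is a space-separated list of health check URLs.
node_statuses() {
    for url in ${NODE_URLS:-}; do
        curl --silent --output /dev/null --write-out '%{http_code}\n' "$url"
    done
}

# Succeeds only if stdin has at least one line and every line equals $1.
all_match() {
    want=$1
    seen=0
    while read -r code; do
        seen=1
        [ "$code" = "$want" ] || return 1
    done
    [ "$seen" -eq 1 ]
}

# switch_mode GROUP BANK MODE EXPECTED_STATUS [ATTEMPTS] [DELAY_SECONDS]
switch_mode() {
    group=$1 bank=$2 mode=$3 expected=$4 attempts=${5:-30} delay=${6:-10}
    set_group_variable "$group" "paydiant_app_operation_mode=$mode" || return 1
    "$MCO" puppet runonce --with-fact "fact_paydiant_deployment_bank=$bank" || return 1
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if node_statuses | all_match "$expected"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1   # gave up: not every node reached the expected status
}
```

Entering maintenance would then be `switch_mode parameters_deployment-staging-BankB STAGING-FRONTEND-B MAINTENANCE 503`, and the reverse transition uses `LIVE` and `200`. Bounding the loop with an attempt count keeps a stuck node from hanging the job forever.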
Upgrading applications on a set of nodes
[Diagram: FE-A and BE-A at v.1; FE-B and BE-B at v.2; version label: v.2a]
Upgrading Application Version
Disable Puppet agent
mco puppet disable --with-fact fact_paydiant_deployment_bank=STAGING-FRONTEND-B
Stop Tomcat service
mco service tomcat stop --with-fact fact_paydiant_deployment_bank=STAGING-FRONTEND-B
Cleanup exploded Tomcat webapps directory (for sanity)
mco shellcmd --with-fact fact_paydiant_deployment_bank=STAGING-FRONTEND-B --cmd='find $TOMCAT_HOME/webapps/ -maxdepth 1 -mindepth 1 -type d -exec rm -rf {} \;'
Upgrading Application Version Cont.
Upgrade the application version
RACK_ENV=production /opt/puppet/bin/rake -s -X -f /opt/puppet/share/puppet-dashboard/Rakefile 'nodegroup:variables[parameters_deployment-staging-BankB,paydiant_app_version=2]'
Re-enable Puppet
mco puppet enable --with-fact fact_paydiant_deployment_bank=STAGING-FRONTEND-B
Puppet run-once
mco puppet runonce --with-fact fact_paydiant_deployment_bank=STAGING-FRONTEND-B
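Put together, the upgrade steps from these two slides might look like the function below. It is a sketch, not the presenters' script: `upgrade_bank` and `set_app_version` are hypothetical names, the commands are the ones quoted on the slides, and `shellcmd` is the custom MCO plugin the deck mentions, so it is not available on a stock install.

```shell
RAKE=${RAKE:-/opt/puppet/bin/rake}
RAKEFILE=${RAKEFILE:-/opt/puppet/share/puppet-dashboard/Rakefile}
MCO=${MCO:-mco}

# Set paydiant_app_version on a Dashboard group (run as 'puppet-dashboard').
set_app_version() {
    RACK_ENV=production "$RAKE" -s -X -f "$RAKEFILE" \
        "nodegroup:variables[$1,paydiant_app_version=$2]"
}

# upgrade_bank BANK GROUP VERSION
# Runs the slide's steps in order and stops at the first failure.
upgrade_bank() {
    bank=$1 group=$2 version=$3
    fact="fact_paydiant_deployment_bank=$bank"
    # Keep Puppet from restarting Tomcat while we work on the nodes.
    "$MCO" puppet disable --with-fact "$fact" &&
    "$MCO" service tomcat stop --with-fact "$fact" &&
    # Remove exploded webapp directories; the single quotes keep
    # $TOMCAT_HOME unexpanded until the command reaches the node.
    "$MCO" shellcmd --with-fact "$fact" \
        --cmd='find $TOMCAT_HOME/webapps/ -maxdepth 1 -mindepth 1 -type d -exec rm -rf {} \;' &&
    set_app_version "$group" "$version" &&
    "$MCO" puppet enable --with-fact "$fact" &&
    "$MCO" puppet runonce --with-fact "$fact"
}
```

A call such as `upgrade_bank STAGING-FRONTEND-B parameters_deployment-staging-BankB 2` covers one bank. Note that a failure after `puppet disable` leaves the agents disabled, so a real script would likely want an explicit recovery step.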
Taking a set of nodes out of maintenance mode
[Diagram: FE-A and BE-A at v.1; FE-B and BE-B at v.2; version label: v.2a]
Taking nodes out of maintenance mode
Update parameter using Rake API (as ‘puppet-dashboard’ user)
RACK_ENV=production /opt/puppet/bin/rake -s -X -f /opt/puppet/share/puppet-dashboard/Rakefile 'nodegroup:variables[parameters_deployment-staging-BankB,paydiant_app_operation_mode=LIVE]'
Puppet run-once using MCO (as ‘peadmin’ user)
mco puppet runonce --with-fact fact_paydiant_deployment_bank=STAGING-FRONTEND-B
While loop… check the health check page till all nodes return 200 (i.e. live) status
mco shellcmd --with-fact fact_paydiant_deployment_bank=STAGING-FRONTEND-B --cmd='curl --silent http://localhost:8080/healthcheck/healthcheck.jsp'
Switching traffic to upgraded stack
[Diagram: FE-A and BE-A at v.1; FE-B and BE-B at v.2; version label: v.2a]
Viewing transition in Splunk across multiple datacenters
Version Transition
Jenkins …
What is Jenkins
Tool to schedule and monitor the execution of repeated jobs
Why Jenkins?
Configurability
↘ Different types of input parameters
↘ Invoke shell scripts
↘ Post-build actions (automatic/manual)
Why Jenkins? cont.
Plugin support
↘ More than 600 plugins (https://wiki.jenkins-ci.org/display/JENKINS/Plugins)
↘ E.g. vSphere plugin (stop/start, snapshots, rollbacks…)
↘ Build pipeline plugin
↘ Parameterized remote trigger plugin
Why Jenkins? cont.
Keeps all your console logs in a single place
↘ No need to hunt for 10 log files on 5 different machines
↘ Visual representation of passed/failed/in-progress status, based on downstream shell scripts or other jobs
Why Jenkins? cont.
And it’s…
[Diagram labels: MCO; Rake API; DB; FE-*; BE-*; source code, Liquibase change sets]
Jenkins – Puppet Integration
Jenkins – Puppet Integration cont.
Jenkins invokes local bash scripts, which in turn use SSH to call:
↘ MCO (as ‘peadmin’ user on Puppet Master)
↘ Rake API (as ‘puppet-dashboard’ user on Puppet Master)
SSH login as ‘peadmin’ and ‘puppet-dashboard’ is password-less, using PKI
↘ Generate RSA keypair for the local Jenkins user, using ssh-keygen command
↘ Append the public key to ~/.ssh/authorized_keys file of ‘peadmin’ and ‘puppet-dashboard’ users, on Puppet Master
MCO special purpose sub commands we use:
↘ puppet
↘ service
↘ shellcmd* (ask your Puppet Enterprise Support for this custom MCO plugin)
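A Jenkins-side wrapper for the SSH calls described above might look like this. It is a minimal sketch: `remote_mco`, `remote_rake` and the host name `puppetmaster.example.com` are assumptions for illustration, and the key setup is shown only as comments.

```shell
PUPPET_MASTER=${PUPPET_MASTER:-puppetmaster.example.com}   # hypothetical host name
SSH=${SSH:-ssh}

# One-time setup as the local Jenkins user (for reference, not run here):
#   ssh-keygen -t rsa -f ~/.ssh/id_rsa
#   then append ~/.ssh/id_rsa.pub to ~/.ssh/authorized_keys of the
#   'peadmin' and 'puppet-dashboard' users on the Puppet Master.

# Run an MCO sub command on the Puppet Master as 'peadmin'.
# BatchMode makes ssh fail instead of prompting if key login is broken.
remote_mco() {
    "$SSH" -o BatchMode=yes "peadmin@$PUPPET_MASTER" mco "$@"
}

# Run a Dashboard Rake task on the Puppet Master as 'puppet-dashboard'.
# The task is wrapped in single quotes so the remote shell does not
# treat the square brackets as a glob pattern.
remote_rake() {
    "$SSH" -o BatchMode=yes "puppet-dashboard@$PUPPET_MASTER" \
        RACK_ENV=production /opt/puppet/bin/rake -s -X \
        -f /opt/puppet/share/puppet-dashboard/Rakefile "'$1'"
}
```

For instance, `remote_mco puppet runonce --with-fact fact_paydiant_deployment_bank=STAGING-FRONTEND-B` triggers a run from a Jenkins job. Because ssh joins its arguments into one string for the remote shell, arguments that contain spaces or quotes need an extra level of quoting.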
Links
Rake API: https://docs.puppetlabs.com/pe/latest/console_rake_api.html
MCO: https://docs.puppetlabs.com/mcollective/reference/basic/basic_cli_usage.html
Jenkins: http://jenkins-ci.org/
Liquibase: http://www.liquibase.org/documentation/index.html
Recap/Takeaways…
Use Puppet Enterprise
↘ Support is awesome (Celia Cottle, Jay Wallace, Ken Johnson, Zachary Stern – you guys rock!)
↘ Got help and features from James Turnbull and Nigel Kersten with some early versions of PE
↘ Live management and MCollective are essential for any self-respecting enterprise
Zero-downtime upgrades
↘ To Dashboard or not to Dashboard?
↘ Database update phases
↘ Managing LB health check monitors dynamically using Puppet
Automation baby steps – don’t boil the ocean
↘ Understand what you are doing before automating it - develop runbooks
↘ Identify manual steps and script some of them
↘ Add scripts to orchestration tool (Jenkins, ServiceNow, whatever else you use in-house)
Thank you.