Distributed Release Management

Post on 08-May-2015

DESCRIPTION

Full Stack Engineering Meetup in NYC, May 27, 2014.

TRANSCRIPT

Distributed Release Management Deploying etsy.com 40+ times per day

Mike Brittain

Engineering Director, Etsy

@mikebrittain mikebrittain.com/talks

1st Day Assignment Put your face on etsy.com/about

What I’m showing you tonight is the result of four years of iteration.

Small incremental changes to the application

“Dark” features: new classes, methods, controllers

Graphics, stylesheets, templates

Copy/content changes

App deploys

Turning flags on, off, or % ramp up

Config deploys
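A percentage ramp-up of a config flag can be sketched roughly as below. This is only an illustration, not Etsy's actual config system: the flag dictionary layout, the `rampup` state name, and the helpers `bucket` and `is_enabled` are all hypothetical. The idea is that each user hashes to a stable bucket, so raising the percentage only ever adds users to the feature.

```python
import hashlib


def bucket(flag_name: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    key = f"{flag_name}:{user_id}".encode("utf-8")
    return int(hashlib.sha256(key).hexdigest(), 16) % 100


def is_enabled(flag: dict, flag_name: str, user_id: str) -> bool:
    """Decide whether a flag is on for one user.

    `flag` uses a hypothetical layout:
      {"enabled": "on"}                      -> everyone
      {"enabled": "off"}                     -> no one
      {"enabled": "rampup", "percent": 10}   -> roughly 10% of users
    """
    state = flag.get("enabled", "off")
    if state == "on":
        return True
    if state == "rampup":
        # Users whose bucket falls below the percentage see the feature.
        return bucket(flag_name, user_id) < flag.get("percent", 0)
    return False
```

With a scheme like this, a config deploy that changes `percent` from 10 to 50 widens the audience without touching application code.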

Latent bugs and security holes

Traffic management, load shedding

Adding and removing infrastructure

Tweaking config flags or releasing patches.

“Operating” the site

IRC, #push

/topic mbrittain | jgoulah | rsnyder | ekastner

/topic mbrittain, jgoulah, rsnyder | ekastner

Keep real people in the loop

Queue, with max batch size of seven.
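The channel topic acts as the push queue. A minimal sketch of how such a topic could be parsed and updated is below. It assumes one reading of the topics shown above: `|` separates deploy batches and `,` separates people within the same batch. The function names and the joining rule in `enqueue` are hypothetical and are not taken from pushbot itself; only the batch limit of seven comes from the slide.

```python
from typing import List

MAX_BATCH_SIZE = 7  # from the slide: "max batch size of seven"


def parse_topic(topic: str) -> List[List[str]]:
    """Split a topic such as 'a, b | c' into batches: [['a', 'b'], ['c']]."""
    batches = []
    for group in topic.split("|"):
        names = [name.strip() for name in group.split(",") if name.strip()]
        if names:
            batches.append(names)
    return batches


def format_topic(batches: List[List[str]]) -> str:
    """Render batches back into the topic string."""
    return " | ".join(", ".join(batch) for batch in batches)


def enqueue(batches: List[List[str]], user: str) -> List[List[str]]:
    """Add a user to the last batch, or start a new one if it is full.

    Assumed policy for illustration only; real joining rules may differ.
    """
    if batches and len(batches[-1]) < MAX_BATCH_SIZE:
        batches[-1].append(user)
    else:
        batches.append([user])
    return batches
```

For example, `parse_topic("mbrittain, jgoulah, rsnyder | ekastner")` yields one batch of three people followed by a batch of one.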

Automated deployment run by humans

4 people in this deploy.

“I’ve pushed my changes to master.”

“Everyone has checked in.”

Build QA and Pre-prod

Build progress

Status in #push

Git SHA1 for each env.

Date, username, deploy log, changeset, link to dashboard from time of deploy

Reporting what’s going on in Deployinator, and who triggered it

Status from build cluster

Pre-prod (“princess”) has been deployed.

SHA1 of the change

Time it took to deploy

Link to changeset in GitHub

Log of the deploy script

Btw, there are three bots talking in channel at this point. O_o

Queuing for next deploy

Humans talk to other humans from time to time.

Talking to pushbot.

Pushbot knows some Spanish… because, ya know, why not?

Link to test results for CI environment, along with how long the tests took. Alerting by name.

8 minutes have elapsed… We’ve built and tested our release in the CI environment (“QA”).

QA build failed our 5 min. SLA for tests.

“Try” is our pre-commit testing cluster.

Bots help reinforce our values. This is especially helpful for new people on the team.

Still 8 minutes elapsed… Pre-prod has been deployed and tested.

This ran in parallel with our QA build and tests.

Cross-traffic: In a separate channel (#config), our app config files were deployed to pre-prod.

Cross-traffic: Ops team deployed a configuration change.

And, yes… another non-human.

Code is live. Link to dashboard.

13 minutes elapsed… Code is now in production with public traffic.

Who committed code in the last deploy? And how many lines did each of them change?
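One way to answer that question from Git alone is to total the `--numstat` output per author between two deployed SHA1s. The sketch below is an assumption about how such a report could be built, not the tool shown in the talk. The function names and the `author:` line prefix are illustrative; `git log --numstat` and the `%an` format placeholder are standard Git.

```python
import subprocess
from collections import Counter


def parse_numstat(log_text: str) -> Counter:
    """Total added plus deleted lines per author.

    Expects output of: git log --numstat --format=author:%an OLD..NEW
    """
    totals = Counter()
    author = None
    for line in log_text.splitlines():
        if line.startswith("author:"):
            author = line[len("author:"):].strip()
            totals[author] += 0  # list the author even with no counted lines
        elif author is not None and line.strip():
            parts = line.split("\t")
            # Binary files are reported as "-\t-\tpath" and are skipped.
            if len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit():
                totals[author] += int(parts[0]) + int(parts[1])
    return totals


def lines_changed_between(old_sha: str, new_sha: str) -> Counter:
    """Run git in the current repository and summarize the changeset."""
    result = subprocess.run(
        ["git", "log", "--numstat", "--format=author:%an", f"{old_sha}..{new_sha}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_numstat(result.stdout)
```

Calling `lines_changed_between` with the previous and current production SHA1s gives a per-committer count that a bot could post in channel.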

Handoff for the next deploy.

Entire app deploy took 15 minutes.

4 people running the deployment

8 committers

Config deploy and Chef change deployed in parallel.

Optimal queue size

Normalized communication

Improved visibility

Historical record is ideal for post-mortems

Organic evolution

Hold up the queue (.hold)

Work the issue with the people available in #push

Additional help always available in #sysops

Buddy-system for off-hours deploys

Ops-on-call, dev-on-call

When something goes wrong?

25 Million Items listed

60+ Million Monthly unique visitors

200 Countries with annual transactions

175+ Committers, everyone deploys

Items by anjaysdesigns, betwixxt, OneStarLeatherGoods, mediumcontrol, TheDesignPallet

[Chart: Deployments per day, app code vs. config files]

Start small. (We did.)

Automated tests and production monitoring.

Have a story around maintaining quality.

“We can always go back to the old way.”

Demonstrate value to leadership.

Go write your own story.

Thank you.

Mike Brittain

Engineering Director, Etsy

@mikebrittain mikebrittain.com/talks
