Continuously Deploying Culture 2.0 - Agile Ísland
Continuously Deploying Culture 2.0
Rich Smith
Director of Security, Etsy
@iodboi Agile Iceland 2014
A Story Of Scaling Engineering Culture at Etsy
Who Am I?
• Rich Smith, Brooklyn NYC
• Director of Security Engineering at Etsy
• Cover the AppSec, NetSec, Risk Engineering teams
• Also building the security organisation & security culture
• Co-Founder of here in Iceland !
WTF is Etsy!
• Online community for handcrafted & vintage items
• Human scale manufacturing
• Marketplace for small businesses
Now (FY 2013)
• Gross Marketplace Sales (GMS) $1.35 Billion
• 40 million members, 1 million active sellers
• 26 million active listings
• 200+ Countries Performing Transactions
• >615 Employees
• Offices in 8 countries
Core Engineering Principles
• Empower the edges
• Trustworthy not trusted
• Every engineer can push to prod at any time
• ‘Just Ship’ - Get things done
• ‘If it moves graph it’ - Let the data lead you
[Chart: Pushes Per Day, very end of 2009 vs. today]
But How Did We Get There?
What’s Etsy’s Story of Engineering Culture?
Continuously Deploying Culture v1
• Mike Rembetsy & Patrick McDonnell
• Gave a talk at the Velocity conferences in 2011/12
• Etsy’s engineering culture evolution 2006-2011/12
• Slides here: http://slidesha.re/1xYxZrG Watch it here: http://vimeo.com/51310058
• Today we are extending those lessons up to the
Disclaimer: A + B != Culture
2006 - 2008 Silos and Barriers
• Etsy grows from a 4 person startup to employ 30 - 35 FTEs
• Around 15 engineers
• A very siloed culture creates barriers to engineering collaboration
• Bred initiatives like Sprouter - ‘Middleware of distrust!’
• Project dedicated to stopping engineers touching databases
Management Changes
• Maria Thomas from NPR promoted to CEO
• Brings a clear understanding that community is very important
• Prioritises a culture that supports community
• Chad Dickerson brought on as CTO
• Brings a clearer focus to the engineering team
• ‘This Silo’d culture cannot work, we need to start over’
2006-2008 Takeaways
• Downtime was an accepted fact of life
• It was even expected to a degree!
• Engineering projects were often low impact
• Community needs to be a technical focus
• Survived the holiday season … just!
2008 End of Year Snapshot
GMS: $87.5 Million
Employees: 35
Engineers: 15
2009 Internal Improvements
• As teams grow, big efforts in good communication
• Daily standups begin
• Much better cross-team collaboration between Ops & others
• Network solidified and provided basis for future growth
• Moved from Downtown Brooklyn to DUMBO
2009 Takeaways
• Big growth year
• Built solid foundations:
• Infrastructure
• Invested in human capital
• ‘DevOps’ culture begins in earnest …..
• A lot of reflection and finding an Engineering identity
2009 End of Year Snapshot
GMS: $180.6 Million
2010 Renewed Energy
2010 Standardisation & Graphs
• Moved to PHP & MySQL for *
• It almost doesn’t matter what you choose, just stick to it
• ‘If it moves Graph it’
• Graphite, Ganglia, FITB, Weathermap, Nagios, Naglite …..
• Starting to use this data for work/life balance as well as technical/systems reasons
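As a rough illustration of "If it moves Graph it", the sketch below formats and sends a single data point using Graphite's plaintext protocol (one "path value timestamp" line per data point, by default on Carbon port 2003). The metric name, host and helper function names are hypothetical and are not taken from the talk.

```python
import socket
import time


def format_graphite_line(path: str, value: float, timestamp: int) -> str:
    """Build one plaintext-protocol line: path, value, timestamp, then a newline."""
    return f"{path} {value} {timestamp}\n"


def send_metric(path: str, value: float, host: str = "localhost", port: int = 2003) -> None:
    """Send a single data point to a Carbon listener over TCP (host is an assumption)."""
    line = format_graphite_line(path, value, int(time.time()))
    with socket.create_connection((host, port), timeout=2) as sock:
        sock.sendall(line.encode("ascii"))


if __name__ == "__main__":
    # Hypothetical metric name; printed rather than sent so no server is required.
    print(format_graphite_line("deploys.web.count", 1, 1400000000), end="")
```

Keeping the formatting separate from the network call means the line format can be checked without a running Graphite instance.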
2010 Following our ideals …..
Management Ideals
• Blameless PostMortems
• 1:1’s as a core mgmt tool
• Eng career planning (Reverb)
• Accept failures, but not low standards
Engineering Ideals
• Developer on-call
• Use of A/B testing
• Lots of Prototypes
• FeatureFlags & Ramp Up
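One common way to implement feature flags with a percentage ramp up is to hash each user into a stable bucket and compare it with the current ramp percentage. The sketch below assumes that approach; the function names and flag names are hypothetical, and this is not a description of Etsy's actual flag system.

```python
import hashlib


def bucket(flag: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag: str, user_id: str, ramp_percent: int) -> bool:
    """Return True if this user falls inside the current ramp-up percentage."""
    return bucket(flag, user_id) < ramp_percent


if __name__ == "__main__":
    # Hypothetical flag ramped to 10% of users.
    for user in ("alice", "bob", "carol"):
        print(user, is_enabled("new_checkout", user, 10))
```

Because the bucket is derived from the flag and user rather than chosen at random on each request, a user who sees the feature at 10% keeps seeing it as the ramp moves to 50% and then 100%.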
2010 Takeaways
• Reduce number of technologies used in development
• Focus on technical visibility throughout the org
• Developers responsible for code release (start of DevOps)
• Member support rotations for all
• Work hard at work/life balance & have data to support
2010 End of Year Snapshot
GMS: $314.3 Million
2011 Tech highlights
• End of long tail legacy silo holdovers (Sprouter gone!)
• Non-Standard technologies removed from production
• Engineers receive 3 annual goals:
• Speak at a conference
• Write a blog post
• Release open source software
2011 - Organisational Changes
• Snr. management to become more Engineering focused
• Chad to CEO
• Kellan to CTO
• Allspaw to SVP of Operations
• Consolidates importance of engineering culture to the very top of Etsy and increases stability
2011 Takeaways
• Year of the Open Sourced tool
• Statsd, Logster, Deployinator, Supergrep, Schemanator ….
• Overall maturing of engineering - platform & people
• Automation & config management solidified (Chef everything)
• Security starts becoming a 1st class citizen
• ‘Security Culture at Etsy’ begins to be chased & discussed
2011 End of Year Snapshot
GMS: $525.6 Million
Let’s catch our breath!
2011-2012 - A Focus on Security
• Security alongside Dev & Ops as being integral to culture
• Applying our ‘DevOps’ principles & learnings to security
• Emphasis on security being a facilitator not a blocker
• Security often ‘enforced’ with terrible cultural impact
• Build a human and effective security organization
2012 - Growth + Foster Our Values
• Explosive growth in hiring, allow easy transfers
• Some major changes around product
• Increased focus on community
• Internationalisation
• High impact products (Shipping Labels, Gift Cards)
• Became a certified B-Corp - not just the bottom line
What’s a B-Corp?
• Aim to use the power of business to solve social & environmental issues
• Impacts engineering in new and interesting ways:
• Waste, Recycling, Compost, Flushes (Yes we graph them!)
• Efficiency of our tech, data centre usage & partners
• ‘Make the world more like Etsy’ - Extending the culture
2012 - Technical Achievements
• Create wholly separate payments environment
• Allows PCI compliance without disrupting the culture
• Interface with the webstack via a restricted Internet facing API
• Get serious on Data Science
• Dedicated Hadoop cluster for full time data scientists
• Taking some chances and broadening of our engineers
2012 Takeaways
• Do what’s needed to sustain long term & not just keep the lights on
• More headcount than required allows us to take chances
• Focus on info exchange, internally & externally with communities
• Open source all of the things
2012 End of Year Snapshot
GMS: $895.1 Million
2012 Action Items
• Security is integral to the DevOps lifecycle and culture
• Know when to flick sights from short to longer term goals
• Pursue dynamic engineering resource allocation
• Do not allow increasing org size to dictate culture
2013 - An Interesting Year!
• Had many of the hard engineering wins taken care of …
• Time to focus internally
• No engineer can know everything any longer
• Need to maintain the culture of transparency & trust
• Really was the year of internal tooling to achieve this
2013 - Technical
• Morgue tool created to capture and aid postmortems
• Moved to Vertica for BI data & metrics
• Superbit allowed simple querying of Vertica & big data by anyone who knows SQL
• Catapult launched to relate metrics to experiments
• Begin a refocus on a Mobile/API First product vision
2013 Takeaways
• Democratisation of data is made easier with tooling that levels access and allows interrogation by ALL
• Conscious effort on internal tooling to minimize the pain of large & complex stack
• Engineering invested in transparency & trust
• The world doesn’t wait, mobile is the future
2013 End of Year Snapshot
GMS: $1.35 Billion
Employees: 615
Engineers: >33%
2013 Action Items
• As datasets grow, evaluate how they can be accessed, evaluated and contextualized
• Have you reached a point where no one can know everything?
• While tooling can’t create culture it can help you support it
• Be free to apply your culture in new ways
• Inward focus cannot lead to outward blindness, tech changes fast
2014 - Organisational changes
• Everyone pushes on their 1st day, not just Engineers
• Yearly planning is restructured
• Take account of a growing Etsy
• San Francisco opens as 1st non-Brooklyn Eng hub
• Acquire & integrate A Little Market with Etsy
Cultural Acquisition
• As part of growth, Paris based A Little Market acquired
• Integrating another engineering culture can be tough
• Etsy’s culture is ‘different’ & this can be a big step
• Language, timezone and human cultural differences
• Can be very successful, but don’t underestimate
2014 - Technical
• Move away from Splunk and to ElasticSearch/Logstash/Kibana (ELK)
• Mobile CI infrastructure embedded & ramped up
• API First a huge effort and development push
• Mobile First as an increasing product focus
• Technical work for quality of life - on-call sleep tracking
Logstash Lessons
• Replacing Splunk with Logstash taught many lessons
• Changing a core tool requires a huge comms investment
• Without it enclaves & silos can form to resist change
• Explain the whys not just in terms of technicals or $$
• Fully understand all use cases, not just the main ones
• Don’t settle for a half complete end goal, go the distance
API First
• Supporting the Mobile First push & diversity of clients
• No longer assume LAMP, decoupling required
• Adds security & agility
• Embeds fundamental future resilience
• Capacity planning becomes more challenging
Mobile First
• Applying your principles and culture to the changing tech landscape is key
• Continuous Deployment hard in the ‘App Store world’
• Continuous Integration still applies of course
• Continuous Deployment becomes Continuous Delivery
• Still use API to enable feature flag driven native apps
Practice                                        Continuous Deployment   Continuous Delivery
Frequent checkins directly to mainline          ✓                       ✓
Automated build & test cycle                    ✓                       ✓
Keep the build green, always ready to release   ✓                       ✓
One button deploys                              ✓                       ✓
Business dictates when to deploy                -                       ✓
Every passing build deployed to prod            ✓                       -
All enhancements gated by feature flag          ✓                       ?
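The difference between the two models comes down to who decides when a passing build reaches production. A minimal sketch of that decision, with hypothetical names and no claim to match any real deploy tool:

```python
from enum import Enum


class Mode(Enum):
    CONTINUOUS_DEPLOYMENT = "continuous deployment"
    CONTINUOUS_DELIVERY = "continuous delivery"


def should_release(mode: Mode, build_green: bool, business_approved: bool = False) -> bool:
    """Decide whether a build goes to production under each model."""
    if not build_green:
        return False  # both models only ever release a passing build
    if mode is Mode.CONTINUOUS_DEPLOYMENT:
        return True  # every passing build is deployed to prod
    return business_approved  # delivery: always releasable, business picks the moment


if __name__ == "__main__":
    print(should_release(Mode.CONTINUOUS_DEPLOYMENT, build_green=True))
    print(should_release(Mode.CONTINUOUS_DELIVERY, build_green=True))
```

In the app store setting the slide describes, the "business approved" step stands in for the decision to submit a release, while the build itself stays green and ready throughout.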
Why This Approach?
• Continuous integration, Continuous Delivery
• Build your apps in a reproducible way after each push to git
• Identify bugs, missing dependencies early & often
• Integrate security testing throughout lifecycle
• Improve Mean Time To Recovery
• Stop stressing about releases!
Imagine that we’ll write 50K LOC/month
Single release: few opportunities for failure, wide surface area (50,000 LOC), high MTTR! All of the bugs we’ve written.
Many releases: more opportunities for failure, narrow surface area (< 100 LOC), low MTTR! A fraction of the bugs we’ve written per release.
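The arithmetic behind the comparison can be sketched as follows. The 50,000 LOC/month figure and the roughly 100 LOC release size come from the slide; the function name is hypothetical.

```python
import math


def releases_needed(loc_per_month: int, max_loc_per_release: int) -> int:
    """Minimum number of releases needed to keep each one at or under the size cap."""
    return math.ceil(loc_per_month / max_loc_per_release)


if __name__ == "__main__":
    monthly = 50_000  # figure from the slide
    print(releases_needed(monthly, monthly))  # one big release -> 1
    print(releases_needed(monthly, 100))      # 100-line releases -> 500
```

Shipping the same month of work in 100-line pieces means at least 500 releases, so there are more chances for something to break, but each failure points at a change set that is hundreds of times smaller to search and revert.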
Sleep Tracking
• Experiment with fitbands & Ops
• Collect sleep data for on-call
• Analyse in a variety of manners
• Sleep lost when on-call/pagerduty
• Alert on VPN/SSH logins while asleep
• Focus on data for quality of life
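The "alert on logins while asleep" idea amounts to checking login timestamps against sleep intervals reported by the tracker. A minimal sketch, assuming both are already available as Python datetimes; the data shapes and names are hypothetical and not taken from the talk's tooling.

```python
from datetime import datetime
from typing import List, Tuple

SleepInterval = Tuple[datetime, datetime]


def logins_during_sleep(sleep: List[SleepInterval], logins: List[datetime]) -> List[datetime]:
    """Return login times that fall inside any recorded sleep interval."""
    return [t for t in logins if any(start <= t < end for start, end in sleep)]


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sleep = [(datetime(2014, 5, 1, 23, 30), datetime(2014, 5, 2, 6, 45))]
    logins = [datetime(2014, 5, 1, 22, 0), datetime(2014, 5, 2, 3, 15)]
    for t in logins_during_sleep(sleep, logins):
        print(f"ALERT: login at {t.isoformat()} while tracker reports sleep")
```

A hit can mean either an engineer woken by a page, which is useful quality-of-life data, or a login that was not made by the sleeping account owner, which is useful security data.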
2014 Takeaways
• Another year of big growth, also now M&A
• Integrating other engineering cultures inside your own is a challenge you should prepare for
• Core tooling changes require great thought & comms
• Mobile focus does not mean the end of always pushing
• Tooling for happiness & W/L balance is a win for all
2014 Action Items
• Culture is still king despite growth or M&A activity
• However, it takes effort to keep it so
• Ensure your API is up to the job of supporting Mobile 1st
• Ensure core tooling changes are understood & embraced by all
• Communicate your Eng culture & history to new hires
Credit for the awesome graph: Patrick Koch
Conclusions
• Culture doesn’t come for free, it takes continuous work
• Iterate & improve - Even when you think you have ‘it’
• Don’t give in to potential disruptors like growth & security and let them destroy your culture
• Get smart and use them to test, support and improve it
Never. Stop. Pushing.
Questions?
Links / References
Continuously Deploying Culture (Mike Rembetsy, Patrick McDonnell)
Slides: http://slidesha.re/1xYxZrG Video: http://vimeo.com/51310058
Scaling Etsy, what went wrong, what went right (Ross Snyder)
Slides & Video: http://bit.ly/po8zIj
Etsy’s journey to continuous integration for mobile apps (Nassim Kammah)
Blog post: http://bit.ly/1yiGWwc
Mean time to sleep (Ryan Frantz, Laurie Denness)
Slides, Blog post, code: http://ryanfrantz.com/mtts/