lookout scaling growing pains
DESCRIPTION
Lookout! Growing Pains was originally presented at Lookout's Scaling for Mobile event on July 25, 2013. Kyle Barton is a Senior Software Engineer at Lookout, Inc. Kyle's talk focused on the growing pains that Lookout has faced as we've scaled and engaged more users. Lookout has grown immensely in the last year. We've doubled the size of the company, added more than 80 engineers to the team, and now support 45+ million users, have over 1000 machines in production, and see over 125,000 QPS and more than 2.6 billion requests/month. Our analysts use Hadoop, Hive, and MySQL to interactively manipulate multibillion-row tables. With that, there are bound to be some growing pains and lessons learned.
TRANSCRIPT
LOOKOUT! GROWING PAINS
KYLE BARTON, LOOKOUT, INC.
THEN AND NOW
• Getting a solid product off of the ground
• Dealing with a growing user base
• Where we are today
GETTING OFF THE GROUND
MVP
HOW WE DEALT
• Try to reproduce locally with a test
• Production debugging
• Eyeballing changesets
• Synclogs
WE STARTED GROWING
• More Users
• More Traffic
• More Developers
• More Problems
PROBLEMS
• Ran out of integer space for primary keys
• Load spikes
• Clients checked in at the same time
• No way to tell clients to back off
PROBLEMS CONT...
• More code going out in each deploy
• Difficult to diagnose issues after deploy
• Pretty much guaranteed rollbacks on deploy
LOAD SPIKES
• Implemented “load shedding”
• Under high load, respond to clients with a “come back later” message
• Automate with Ganglia checks
• Randomized client check-in
• Picture Backup disabled by default
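The load-shedding and randomized check-in ideas can be sketched roughly as follows. This is a minimal illustration, not Lookout's implementation: the load threshold, the HTTP 503 status with a Retry-After header, and the function names are all assumptions, since the talk does not say how the "come back later" message was encoded or what the Ganglia checks measured.

```python
import random

# Hypothetical threshold; the slides do not state what the Ganglia checks measured.
LOAD_THRESHOLD = 0.85


def handle_checkin(current_load, threshold=LOAD_THRESHOLD, retry_after_s=300):
    """Server side: return (status, headers, body) for a client check-in.

    Under high load, shed the request with a "come back later" response
    instead of doing the work. Using 503 + Retry-After is an assumption.
    """
    if current_load >= threshold:
        return 503, {"Retry-After": str(retry_after_s)}, "come back later"
    return 200, {}, "ok"


def next_checkin_delay(base_interval_s, jitter_fraction=0.5, rng=random):
    """Client side: add random jitter so clients do not all check in together."""
    jitter = rng.uniform(0, base_interval_s * jitter_fraction)
    return base_interval_s + jitter
```

A client would sleep for next_checkin_delay(3600) seconds between check-ins and, on a shed response, wait at least the number of seconds in Retry-After before trying again.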
INTEGER OVERFLOW
• Created new tables
• Big int primary and foreign keys
• Application code to use new and old tables
• New data goes to new table
• Old data moved on login
• Background Process to update untouched records
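The migration steps above can be sketched with two in-memory dictionaries standing in for the old and new tables. This is only an illustration under stated assumptions: the class and method names are hypothetical, the old keys are assumed to be signed 32-bit integers, and a real migration would run against the database rather than Python dicts.

```python
INT32_MAX = 2**31 - 1  # largest signed 32-bit key (assumed type of the old keys)


class DualTableStore:
    """Hypothetical store that reads the new bigint-keyed table first and
    falls back to the old table, moving a row across when it is touched."""

    def __init__(self, old_rows):
        self.old = dict(old_rows)  # stands in for the legacy int-keyed table
        self.new = {}              # stands in for the new bigint-keyed table
        self.next_id = max(self.old, default=0) + 1

    def insert(self, value):
        # New data only ever goes to the new table.
        row_id = self.next_id
        self.next_id += 1
        self.new[row_id] = value
        return row_id

    def get(self, row_id):
        if row_id in self.new:
            return self.new[row_id]
        if row_id in self.old:
            # Lazy migration, e.g. triggered when the owning user logs in.
            self.new[row_id] = self.old.pop(row_id)
            return self.new[row_id]
        raise KeyError(row_id)

    def backfill(self, batch_size=1000):
        """Background pass: move up to batch_size untouched rows; return count."""
        ids = list(self.old)[:batch_size]
        for row_id in ids:
            self.new[row_id] = self.old.pop(row_id)
        return len(ids)
```

Once backfill() returns 0, the old table is empty and the fallback branch in get() can be removed.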
GIANT DEPLOYS
• Feature rollouts
• Slowly enable new features
• Configuration
• Ability to turn off new code without rolling back
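One common way to get gradual rollouts plus an off switch is a percentage-based feature flag held in configuration. The sketch below is an assumption about how such a flag could work, not a description of Lookout's system; FeatureFlags, set_rollout, and the feature name used in the usage note are hypothetical.

```python
import hashlib


class FeatureFlags:
    """Hypothetical percentage rollout: each (feature, user) pair hashes to a
    stable bucket in 0..99, and the feature is on if bucket < percent."""

    def __init__(self):
        self._percent = {}  # feature name -> rollout percentage

    def set_rollout(self, feature, percent):
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self._percent[feature] = percent

    def is_enabled(self, feature, user_id):
        percent = self._percent.get(feature, 0)  # unknown features stay off
        digest = hashlib.sha256(f"{feature}:{user_id}".encode()).digest()
        bucket = int.from_bytes(digest[:4], "big") % 100
        return bucket < percent
```

Raising the percentage only ever adds users, because each user's bucket is fixed. Calling set_rollout("new_sync", 0) turns the new code path off for everyone without a rollback.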
WHERE WE ARE TODAY
PROBLEMS
• SyncML
• One endpoint to service all types of requests
• Bloated request and responses
• Team Structure
• Communication
• Too many meetings
DEPRECATE SYNCML
• Versioned RESTful APIs on existing architecture
• Eventually end-of-life old clients that haven’t upgraded
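As a rough illustration of versioned endpoints with an end-of-life path for old clients, a router might dispatch on a version prefix and answer retired versions with an upgrade message. Everything here is assumed for the sketch: the /v1/ style paths, the resource names, and the choice of HTTP 410 for retired versions do not come from the talk.

```python
def make_router(handlers, retired_versions=()):
    """Build a route(path) function.

    handlers maps (version, resource) -> zero-argument callable.
    Requests for a retired version get a 410 and an upgrade message.
    """
    retired = set(retired_versions)

    def route(path):
        parts = path.strip("/").split("/", 1)
        if len(parts) != 2:
            return 404, "not found"
        version, resource = parts
        if version in retired:
            return 410, "please upgrade your client"
        handler = handlers.get((version, resource))
        if handler is None:
            return 404, "not found"
        return 200, handler()

    return route
```

Because each resource has its own small handler per version, a new version can ship alongside the old one and the old one can be retired later by adding it to retired_versions.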
TEAM STRUCTURE AND COMMUNICATION
• Persisted chat
• Smaller teams
• Better documentation
Keep in touch with
@lookout
/mylookout
blog.lookout.com
http://bit.ly/scaling-for-mobile