Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Upload: sean-mccullough

Post on 06-May-2015

DESCRIPTION

This is a story about how Groupon's business was changing and our technology couldn't keep up. We rewrote the website in Node.js and, along the way, changed how our company and culture work.

TRANSCRIPT

Page 1: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Riding the N(ode) Train: Dismantling the Monoliths

Tuesday, December 3, 2013

Sean McCullough – Engineer at Groupon @mcculloughsean

Page 2: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Part I

Broken Architecture and a Changing Business

Page 3: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Business in Early 2012

Page 4: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Architecture in 2012

Page 5: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Leading the Mobile Commerce Revolution

[Chart: Mobile Transaction Mix, monthly, January 2011 to September 2013 (% of transactions, axis from 0% to 100%)]

Page 6: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Product Engineering was Stuck

We couldn’t build features fast enough

We wanted to build features world-wide

Mobile and Web weren’t at feature parity

Page 7: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Part II

The Rewrite

Page 8: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

The Rewrite

Page 9: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

The Rewrite

Should ...

• be built on APIs for consistent contract with mobile

• make it easy to hire developers

• allow for teams to work at their own pace

• allow teams to deploy their own code

• allow for global design changes

• have out-of-the-box I18n/L10n support

• be optimized for our read-heavy traffic pattern

• be small

Page 10: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

How do we…?

• Deploy

• Authorize Users

• Share Sessions

• Route to different applications

• Manage distributed ops

• QA the whole site

Page 11: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

We Tried This Before and Failed

• Rolled out a new site design in our monolith

• Too many things changed all at once

• Hard to evaluate performance of each feature

Page 12: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

New Platform Evaluation

We evaluated:

• Node

• MRI Ruby/Rails, MRI Ruby/Sinatra

• JRuby/Rails, JRuby/Sinatra

• MRI Ruby + Sinatra + EventMachine

• Java/Play, Java/Vert.x

• Python + Twisted

• PHP

Page 13: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Why Node?

• Vibrant community

• NPM!

• Easy to hire JavaScript developers

• Met our minimum viable performance requirements

• Easy scaling (process model)

Page 14: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

The First App

Page 15: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Growing Pains

Page 16: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Poking Holes in our Infrastructure

• Longevity Test over two days

• Try to root out memory leaks

• Talking only to non-production systems

Page 17: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Poking Holes in our Infrastructure

Within 2 hours we had a major site outage

Page 18: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Poking Holes in our Infrastructure

• SSL termination on our hardware load balancer caused CPU to max out at 100%

• Production systems were using the same load balancer as test and development systems

Page 19: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Lessons Learned

• You will run into problems with Node

• You will find problems with your infrastructure

• Don’t panic!

Page 20: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

The Second App

• Looking for the next page

• Chose the “Browse” page

• Recently Built

• Built using mostly Backbone

• Experienced team of JS developers

Page 21: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

The Second App

Page 22: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

The Second App

New Problems:

• User authentication

• More service calls

• Complicated routing

• More traffic

• Needed to share look and feel

Page 23: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

The Second App

• Cultural problems

• Change of workflow

• Feedback loop fell apart

3 rewrites

6 months to launch

Page 24: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Shared Layout

Maintain consistent look and feel across site:

• Distribute layout as library

• Use ESIs for top/bottom of page

• Apps are called through a “chrome service”

• Fetch templates from service

Page 25: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Groupon Interface Guidelines

Page 26: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Layout Service

• Uses semantic versioning

• Roll forward with bug fixes

• Stay locked on a specific version

• Enable Site-Wide Experiments
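
The choice between "roll forward with bug fixes" and "stay locked on a specific version" can be illustrated with a small version resolver. This is only a sketch under assumptions: the slides do not show the layout service's API, so the function names, the `1.2.x` request syntax, and the version numbers below are all hypothetical. It treats bug fixes as patch releases, as semantic versioning does.

```typescript
// Hypothetical sketch: names, request syntax, and versions are illustrative only.
type Version = [number, number, number];

// Parse a strict MAJOR.MINOR.PATCH string into numbers.
function parseVersion(text: string): Version {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(text);
  if (!match) {
    throw new Error(`not a MAJOR.MINOR.PATCH version: ${text}`);
  }
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Resolve an app's layout request against the versions the service offers.
//   "1.2.3" stays locked on exactly that version.
//   "1.2.x" rolls forward to the highest available patch of 1.2.
// Returns undefined when nothing satisfies the request.
function resolveLayoutVersion(available: string[], request: string): string | undefined {
  const floating = /^(\d+)\.(\d+)\.x$/.exec(request);
  if (!floating) {
    parseVersion(request); // throws on malformed input
    return available.includes(request) ? request : undefined;
  }
  const major = Number(floating[1]);
  const minor = Number(floating[2]);
  let best: Version | undefined;
  for (const candidate of available) {
    const version = parseVersion(candidate);
    const sameLine = version[0] === major && version[1] === minor;
    if (sameLine && (best === undefined || version[2] > best[2])) {
      best = version;
    }
  }
  return best ? best.join(".") : undefined;
}
```

An app that asks for `1.2.x` picks up patch releases as they are published, while an app that asks for `1.2.3` sees no change until its owners move it. Patches are compared numerically, so `1.2.10` ranks above `1.2.3`.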

Page 27: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Layout Service

Page 28: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Layout Service

Page 29: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Routing Service
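
The transcript keeps only the title of this slide, so how the routing service picked an application is not recorded. As a hedged illustration of the "route to different applications" problem raised earlier, the sketch below uses longest-prefix matching on the URL path and falls back to a default app when nothing matches. The route table, the app names, and the fallback behavior are assumptions, not Groupon's actual configuration.

```typescript
// Hypothetical sketch: route table, app names, and fallback are assumptions.
interface Route {
  prefix: string; // URL path prefix, e.g. "/browse"
  app: string;    // name of the application that owns that prefix
}

// Pick the app whose prefix is the longest match on whole path segments.
function pickApp(routes: Route[], path: string, fallbackApp: string): string {
  let best: Route | undefined;
  for (const route of routes) {
    // Match "/deals" against "/deals" and "/deals/123" but not "/dealsfoo".
    const segmentPrefix = route.prefix.endsWith("/") ? route.prefix : route.prefix + "/";
    const matches = path === route.prefix || path.startsWith(segmentPrefix);
    if (matches && (best === undefined || route.prefix.length > best.prefix.length)) {
      best = route;
    }
  }
  return best ? best.app : fallbackApp;
}

const exampleRoutes: Route[] = [
  { prefix: "/browse", app: "browse-app" },
  { prefix: "/deals", app: "deal-app" },
  { prefix: "/deals/checkout", app: "checkout-app" },
];
```

With a table like this, pages can move to new apps one prefix at a time, and anything not yet migrated keeps going to the fallback.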

Page 30: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

The Big Push… or There’s No Going Back

• Decided to get the whole company to move at once

• Supporting two platforms is hard – Rip off the band aid!

• End of June 2012 - move to I-Tier by September 1st

Page 31: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

The Big Push… or There’s No Going Back

• ~150 developers

• Global effort

• Feature freeze – A/B testing against mostly the same features

Page 32: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Part III

It Worked!

Page 33: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

95% Consumer Traffic On Node

Page 34: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Sustained US Traffic Over 120k RPM

Page 35: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Our Pages Got Faster

Page 36: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

It Worked!

Page 37: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Success?

• Moving to a new platform is not a straight line

• Solving for old problems

• Solving for new problems

• Culture shift

Page 38: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Next Steps

• Streaming responses for better performance

• Better resiliency to outages… circuit breakers, brownouts

• Distributed Tracing

• International

• Open Source

New I-Tier apps as we build new teams, products, ideas.

Latest technologies to help us drive our business.
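
The circuit breakers named among the next steps are a general resiliency pattern, and the slides give no detail on how Groupon planned to build them. The sketch below shows one minimal form of the pattern: after a run of failures the breaker opens and callers skip the service, and after a cool-down it lets requests through again as a probe. The class name, thresholds, and the `callWithBreaker` helper are all hypothetical.

```typescript
// Hypothetical sketch of the circuit breaker pattern; not Groupon's implementation.
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;
  private readonly now: () => number;

  // The clock is injectable so the timing can be exercised without waiting.
  constructor(failureThreshold: number, resetTimeoutMs: number, now: () => number = () => Date.now()) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;
  }

  // True if a call may go out. An open breaker turns half-open after the cool-down.
  allowRequest(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.resetTimeoutMs) {
      this.state = "half-open";
    }
    return this.state !== "open";
  }

  // Any success closes the breaker and clears the failure count.
  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  // A failed probe, or too many consecutive failures, opens the breaker.
  recordFailure(): void {
    this.failures += 1;
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }

  currentState(): BreakerState {
    return this.state;
  }
}

// Wrap a service call: serve the fallback while the breaker is open or the call fails.
async function callWithBreaker<T>(breaker: CircuitBreaker, action: () => Promise<T>, fallback: T): Promise<T> {
  if (!breaker.allowRequest()) {
    return fallback;
  }
  try {
    const result = await action();
    breaker.recordSuccess();
    return result;
  } catch {
    breaker.recordFailure();
    return fallback;
  }
}
```

For simplicity this version lets every request through while half-open. A production breaker would usually admit a single probe and would likely track failures per downstream service.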

Page 39: Riding The N Train: How we dismantled Groupon's Ruby on Rails Monolith

Q&A