rails infrastructure

Upload: qureshiomar

Post on 19-Jun-2015

Category:

Technology


TRANSCRIPT

Page 1: Rails infrastructure

http://omarqureshi.net @omarqureshi

Rails Infrastructure

1

Page 7: Rails infrastructure

Topics Covered

• Lots of facepalm
• Rackspace
• Linux distribution choices
• Automation and Orchestration
• Logging

2

Page 15: Rails infrastructure

Edison Nation

• Distributed team (US/Canada/UK)
• Old (2009) and poorly maintained application
• Rails 2.3 app
• (Previous) focus on churn
• 3 Rails developers (+ 1 designer and an intern)
• >100,000 members
• Little in-house sysadmin experience

3

Page 20: Rails infrastructure

Additional Quirks

• Used Ruby 1.8.7, since God does not play nicely with Ruby Enterprise Edition and we couldn’t use 1.9 because of Rails 2.3
• Provisioning process was terribly slow
• Very little caching
• Quite a lot of server-generated JS

4

Page 21: Rails infrastructure

SURPRISE!

5

Page 25: Rails infrastructure

Featured on Nightline

• No warning (announced pretty late EST)
• No preparation time (engineers already signed off for the night)
• Couldn’t provision servers to deal with the traffic spike in time (and we would have needed a lot of them)

6

Page 26: Rails infrastructure

7

Page 27: Rails infrastructure

Load balancer recorded 3000 concurrent requests including assets, or around 300 excluding assets

8

Page 28: Rails infrastructure

The Stack

9

Page 29: Rails infrastructure

Figuring out the bottlenecks

10

Page 30: Rails infrastructure

Nginx kept serving - though these were 502 errors

11

Page 31: Rails infrastructure

Post-mortem of the requests that did make it through made it look like the application servers were to blame

12

Page 32: Rails infrastructure

Database was under heavy load but by no means the bottleneck

13

Page 33: Rails infrastructure

Make better use of the application server pool

14

Page 34: Rails infrastructure

Got some quick wins in the code by caching more and moving jQuery to Google's CDN

15

Page 35: Rails infrastructure

<script src="//ajax.googleapis.com/ajax/libs/jquery/1.6.2/jquery.min.js"></script>

16

Page 36: Rails infrastructure

Get rid of any server generated JS

17

Page 37: Rails infrastructure

Pretty much re-trained myself to be a systems administrator

18

Page 38: Rails infrastructure

Completely re-think the way we do Operations

19

Page 39: Rails infrastructure

What components make up a solid multi-server setup?

20

Page 40: Rails infrastructure

Load balancing

21

Page 41: Rails infrastructure

TLS SNI Extension

22

Page 42: Rails infrastructure

Theoretically, we only need two load balancers for ALL domains

23

Page 43: Rails infrastructure

Simplified SSL Nginx config

server {
  listen 443;
  server_name www.edisonnation.com;
  ssl on;
  ssl_certificate /path/to/cert/en.com.cert;
  ssl_certificate_key /path/to/cert/en.com.key;
}

server {
  listen 443;
  server_name www.edisonnation.vn;
  ssl on;
  ssl_certificate /path/to/cert/en.vn.cert;
  ssl_certificate_key /path/to/cert/en.vn.key;
}

24

Page 44: Rails infrastructure

Windows XP + Internet Explorer

25

Page 45: Rails infrastructure

Windows XP

• Internet Explorer 6-8 on Windows XP does not support SNI, unlike modern OS + browser combinations
• Ignores the server name for HTTPS
• Will give you an invalid SSL certificate error when browsing
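
The failure mode above can be sketched in plain Ruby. This is only an illustration of the lookup logic, not how Nginx implements it: `CERTS`, `DEFAULT_HOST`, `certificate_for` and `certificate_matches?` are hypothetical names, the certificate paths are the placeholders from the config slide, and treating the first server block as the default is an assumption.

```ruby
# Hypothetical lookup table: one certificate per virtual host.
CERTS = {
  "www.edisonnation.com" => "/path/to/cert/en.com.cert",
  "www.edisonnation.vn"  => "/path/to/cert/en.vn.cert"
}.freeze

# Assumption: the first server block acts as the default.
DEFAULT_HOST = "www.edisonnation.com"

# sni_name is the server name sent in the TLS handshake, or nil when the
# client does not support SNI (for example IE on Windows XP).
def certificate_for(sni_name)
  CERTS.fetch(sni_name) { CERTS.fetch(DEFAULT_HOST) }
end

# True when the certificate handed out belongs to the host being visited.
def certificate_matches?(requested_host, sni_name)
  certificate_for(sni_name) == CERTS[requested_host]
end

puts certificate_matches?("www.edisonnation.vn", "www.edisonnation.vn")
puts certificate_matches?("www.edisonnation.vn", nil)
```

Without a server name the server can only guess, so a visitor to the second domain is handed the first domain's certificate and the browser reports it as invalid.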

26

Page 46: Rails infrastructure

Rackspace (v2) Load Balancer

27

Page 47: Rails infrastructure

Rackspace Load Balancer

• SSL termination at the Load Balancer
• No need to serve HTTPS traffic from Nginx any more - X-Forwarded-Proto tells Rails whether the request arrived over HTTPS
• Less processing required here
• Less complexity managing certificates and Nginx configs
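
A minimal sketch of how an app behind an SSL-terminating load balancer might read that header, written against a plain Rack-style environment hash rather than Rails itself. `forwarded_https?` and `https_redirect_for` are hypothetical helper names, and the sketch assumes the app servers are reachable only through the load balancer, since the header is otherwise trivially spoofed.

```ruby
# env is a Rack-style environment hash. Rack exposes the
# X-Forwarded-Proto request header as HTTP_X_FORWARDED_PROTO.
def forwarded_https?(env)
  proto = env["HTTP_X_FORWARDED_PROTO"].to_s
  # Chained proxies may append values; the first is the client's scheme.
  proto.split(",").first.to_s.strip.downcase == "https"
end

# Hypothetical helper: where to redirect a plain-HTTP request for a page
# that must be encrypted, or nil if the request was already HTTPS.
def https_redirect_for(env)
  return nil if forwarded_https?(env)
  "https://#{env['HTTP_HOST']}#{env['PATH_INFO']}"
end

secure = { "HTTP_X_FORWARDED_PROTO" => "https", "HTTP_HOST" => "www.edisonnation.com", "PATH_INFO" => "/login" }
plain  = { "HTTP_HOST" => "www.edisonnation.com", "PATH_INFO" => "/login" }

p https_redirect_for(secure)
p https_redirect_for(plain)
```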

28

Page 48: Rails infrastructure

Split up the application servers

29

Page 49: Rails infrastructure

Move Nginx to its own machine and reverse proxy back to Unicorn app servers

30

Page 50: Rails infrastructure

New stack

31

Page 51: Rails infrastructure

Switch Unicorn to use TCP sockets rather than Unix sockets

32

Page 52: Rails infrastructure

Linux

33

Page 53: Rails infrastructure

Debian Squeeze

34

Page 57: Rails infrastructure

Why Debian?

• Pick the most stable distribution
• Debian is pretty stable, plus you can use Lucid Lynx packages for anything that you need which is cutting edge
• However, God requires you to use a custom kernel before it will work properly: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=609004

35

Page 58: Rails infrastructure

Ubuntu LTS is also a viable choice, as is any RHEL

36

Page 59: Rails infrastructure

Basically, anything where the packages aren’t crazy and support is still there (not Arch/Fedora/Ubuntu)

37

Page 60: Rails infrastructure

Packaging

38

Page 61: Rails infrastructure

We don’t image servers (but may start doing so)

39

Page 62: Rails infrastructure

Provisioning tools should be able to build a server on any hardware

40

Page 66: Rails infrastructure

Never build from source

• Either package yourself or get from a reliable source

• Ditch RVM (though they now have binary rubies - anyone tried?)

• Check out Brightbox Next Generation Ubuntu packages

http://wiki.brightbox.co.uk/docs:ruby-ng

41

Page 67: Rails infrastructure

Pin everything else

Package: *
Pin: release a=squeeze-backports
Pin-Priority: 200

Package: puppet
Pin: release a=squeeze-backports
Pin-Priority: 900

Package: puppet-common
Pin: release a=squeeze-backports
Pin-Priority: 900

42

Page 68: Rails infrastructure

Server build time decreased from 45 minutes to < 15 minutes

43

Page 69: Rails infrastructure

How do we provision servers?

44

Page 70: Rails infrastructure

A small bash script + Puppet

45

Page 71: Rails infrastructure

Bash script does basic pinning and installs essential packages (Ruby + Emacs + Puppet + puppet-el)

46

Page 72: Rails infrastructure

Works very well since we use Hetzner EX4S’s for non-critical systems

47

Page 73: Rails infrastructure

Hetzner + (Xen/OpenVZ) == FANTASTIC

48

Page 74: Rails infrastructure

(See me at the end if you want to talk about provisioning some more)

49

Page 75: Rails infrastructure

Managing Puppet

50

Page 76: Rails infrastructure

Puppet is always running, rather than being run on demand

51

Page 77: Rails infrastructure

Encourage developers to document infrastructure changes

52

Page 78: Rails infrastructure

Still unsure about how to go about Puppet testing

53

Page 79: Rails infrastructure

Campfire reporting

54

Page 80: Rails infrastructure

Orchestration

55

Page 81: Rails infrastructure

MCollective

56

Page 82: Rails infrastructure

STOMP server connects all of our servers together

57

Page 83: Rails infrastructure

MCollective executes Remote Procedure Calls

58

Page 84: Rails infrastructure

Great for pushing out urgent Puppet updates

59

Page 85: Rails infrastructure

Also great for Munin

#!/bin/bash
str="includedir /etc/munin/munin-conf.d"
for addr in `/usr/bin/mco facts ipaddress | awk '{gsub("found", ""); print $1}' | grep "^[0-9]"`
do
  fqdn=`/usr/bin/mco facts fqdn -F ipaddress=$addr | grep "^\W" | awk '{print $1}'`
  str="$str

[$fqdn]
  address $addr
  use_node_name yes"
done

echo "$str" > /etc/munin/munin.conf
/usr/sbin/service munin-node restart
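
Since the rest of the tooling is Ruby, the config-building half of that script could also be sketched as a small Ruby function. This is a rough equivalent under stated assumptions, not the script actually in use: `munin_conf` is a hypothetical name, the node list is assumed to have been collected already (for example from MCollective facts), and the example hostnames and addresses are made up.

```ruby
# nodes: array of [fqdn, ip_address] pairs, e.g. gathered from MCollective facts.
# Returns the full text of munin.conf as a string.
def munin_conf(nodes)
  lines = ["includedir /etc/munin/munin-conf.d"]
  nodes.sort.each do |fqdn, addr|
    lines << ""
    lines << "[#{fqdn}]"
    lines << "  address #{addr}"
    lines << "  use_node_name yes"
  end
  lines.join("\n") + "\n"
end

# Made-up nodes, for illustration only.
puts munin_conf([
  ["app1.example.com", "10.0.0.11"],
  ["db1.example.com",  "10.0.0.21"]
])
```

Keeping the text generation separate from fact collection and from writing the file makes it easy to check the output before restarting anything.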

60

Page 86: Rails infrastructure

No longer have to manually maintain Munin

61

Page 87: Rails infrastructure

Can be used for other painful tasks - such as making sure packages are up to date on all the servers

62

Page 88: Rails infrastructure

RPC libraries are written in Ruby

63

Page 89: Rails infrastructure

Service management

64

Page 90: Rails infrastructure

M/Monit

65

Page 91: Rails infrastructure

Not free - however, extremely worthwhile. Can hook into shell scripts

66

Page 92: Rails infrastructure

Log management

67

Page 93: Rails infrastructure

Graylog2

68

Page 94: Rails infrastructure

Java JAR with a Rails frontend and Elasticsearch + Mongo backend

69

Page 95: Rails infrastructure

Deals with exception management

70

Page 96: Rails infrastructure

Can do analytics on logs

71

Page 97: Rails infrastructure

Specify streams of logs (e.g. 404 errors)

72

Page 98: Rails infrastructure

No longer have to juggle lots of files which exist on different machines

73

Page 99: Rails infrastructure

A little tricky to set up

74

Page 100: Rails infrastructure

Use the gelf-rb gem sparingly in your Rails app and NOT as your main logger

75

Page 101: Rails infrastructure

Found out that the log requests were not threaded

76

Page 102: Rails infrastructure

For us, gelf-rb ONLY sends exception notifications
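
The "exceptions only" approach can be sketched without committing to any particular client library. In the sketch below, `gelf_payload_for` and `report_exceptions` are hypothetical helpers, the field names are assumed to follow the GELF payload format (`version`, `host`, `short_message`, `full_message`, `timestamp`, `level`), and the transport is anything that responds to `call`; in a real app that would hand the hash to whatever sends it to Graylog2.

```ruby
require "socket"

# Assumption: field names follow the GELF payload format.
def gelf_payload_for(exception, host = Socket.gethostname, now = Time.now)
  {
    "version"       => "1.1",
    "host"          => host,
    "short_message" => "#{exception.class}: #{exception.message}",
    "full_message"  => Array(exception.backtrace).join("\n"),
    "timestamp"     => now.to_f,
    "level"         => 3 # syslog severity "error"
  }
end

# Hypothetical wrapper: runs the block, reports only exceptions through
# `transport`, then re-raises so normal error handling still applies.
def report_exceptions(transport)
  yield
rescue StandardError => e
  transport.call(gelf_payload_for(e))
  raise
end

# Stand-in transport that just collects payloads in memory.
sent = []
begin
  report_exceptions(lambda { |payload| sent << payload }) { raise ArgumentError, "boom" }
rescue ArgumentError
  puts sent.first["short_message"]
end
```

Ordinary request logging never touches this path, so a slow or blocking send only costs time on requests that have already failed.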

77

Page 103: Rails infrastructure

Introducing Logstashd

78

Page 104: Rails infrastructure

Written by the awesome Jordan Sissel (FPM)

79

Page 105: Rails infrastructure

Nginx doesn’t support sending to Graylog straight out

80

Page 106: Rails infrastructure

Logstashd acts as a log tailing and transporting mechanism

81

Page 107: Rails infrastructure

Runs in its own process - so threading doesn’t matter so much

82

Page 108: Rails infrastructure

What’s left?

83

Page 109: Rails infrastructure

Upgrade to Rails 3

84

Page 110: Rails infrastructure

Great benefits with Rails 3 such as Dalli for memcached failovers and Lograge

85

Page 111: Rails infrastructure

Oh yeah - asset pipeline!

86

Page 112: Rails infrastructure

Implement read slaves for backups

87

Page 113: Rails infrastructure

Make Jenkins do our deployment

88

Page 114: Rails infrastructure

Better caching solutions - maybe Varnish / conditional GET
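
For the conditional GET option, the idea can be sketched in a few lines of plain Ruby. This is an illustration of the mechanism rather than what the app does today: `etag_for` and `conditional_get` are hypothetical names, MD5 of the body is just one possible validator, and in a Rails app the built-in `stale?` / `fresh_when` helpers would normally do this for you.

```ruby
require "digest/md5"

# Derive a validator from the response body (one possible choice).
def etag_for(body)
  %("#{Digest::MD5.hexdigest(body)}")
end

# if_none_match is the value of the client's If-None-Match header, or nil.
# Returns [status, headers, body] in the style of a Rack response.
def conditional_get(if_none_match, body)
  etag = etag_for(body)
  if if_none_match == etag
    [304, { "ETag" => etag }, ""]   # client copy is still fresh, send no body
  else
    [200, { "ETag" => etag }, body] # first visit or content changed
  end
end

first  = conditional_get(nil, "<h1>Hello</h1>")
second = conditional_get(first[1]["ETag"], "<h1>Hello</h1>")
puts first[0]
puts second[0]
```

The saving here is bandwidth; to also save render time, the validator would have to come from something cheaper than the rendered body, such as a record's updated-at timestamp.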

89

Page 115: Rails infrastructure

Re-implement TLS SNI once Windows XP security updates stop

90

Page 116: Rails infrastructure

Handle large spikes better

91

Page 117: Rails infrastructure

Autoscaling?

92

Page 118: Rails infrastructure

Using AWS as an additional cloud failover

93

Page 119: Rails infrastructure

Hybrid Dedicated and Cloud for production

94