inside picnik: how we built picnik and what we learned along the way mike harrington justin huff web...

52
Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Upload: veronica-atkinson

Post on 13-Jan-2016

215 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Inside Picnik:How we built Picnik and

what we learned along the way

Mike HarringtonJustin Huff

Web 2 Expo, April 3rd 2009

Page 2: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009
Page 3: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Thanks! We couldn’t have done this without help!

Page 4: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Traffic, show scale of our problem

Our happy problem

Page 5: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

StorageRender

MySQL

Apache

Python

Flex HTML

DataCenter

Client

Amazon

Page 6: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Fun with Flex

Page 7: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Flex pros and cons

We Flash

• Browser and OS independence• Secure and Trusted• Huge installed base• Graphics and animation in your browser

Page 8: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Flex pros and cons

We Flash

• Decent tool set• Decent image support• ActionScript

Page 9: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Flex pros and cons

Flash isn’t Perfect

• Not Windows, Not Visual Studio• ActionScript is not C++• SWF Size/Modularity• Hardware isn’t accessible• Adobe is our Frienemy• Missing support for jpg Compression

Page 10: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Flex pros and cons

Flash is a control freak

Page 11: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Forecast: Cloudy with a chance of CPU

Page 12: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

User Cloud

Your users are your free cloud!

• Free CPU & Storage!• Cuts down on server round trips• More responsive (usually)• 3rd Party APIs Directly from your client• Flash 10 enables direct load

Page 13: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

User Cloud

• Your user is a crappy sysadmin• Hard to debug problems you can’t see• Heterogeneous Hardware• Some OS Differences

You get what you pay for.

Page 14: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

User Cloud

Client Side Challenges

• Hard to debug problems you can’t see• Every 3rd party API has downtime• You get blamed for everything • Caching madness

Page 15: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Dealing with nasty problems

• Logging• Ask customers directly• Find local customers• Test, test, test

Page 16: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Would we use Flash again?

Absolutely.

Page 17: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

LAMP

Page 18: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Linux

YUP.

Page 19: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Apache

Check.(static files)

Page 20: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

MySQL

Yeah, that too.

Page 21: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

P

Is for Python

Page 22: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Python

90% of Picnik is API.

Page 23: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

CherryPy

• We chose CherryPy• RESTish API• Cheetah for a small amount of templating

Page 24: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Ok, that was boring.

Page 25: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Fun Stuff

Page 26: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Virtualization

• XEN• Dual Quad core machines• 32GB RAM• We virtualize everything

– Except DB servers

Page 27: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Virtualization Pros

• Partition functions• Matching CPU bound and IO bound VMs• Hardware standardization (in theory)

Page 28: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Virtualization Cons

• More complexity• Uses more RAM• IO/CPU inefficiencies

Page 29: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Storage

• We started with one server– Files stored locally– Not scalable

Page 30: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Storage• Switched to MogileFS

– Working great– No dedicated storage

machines!

Page 31: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Storage• Added S3 into the mix

Page 32: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Storage – S3

• S3 is dead simple to implement.• For our usage patterns, it's expensive• Picnik generates lots of temp files• Problems keeping up with deletes• Made a decision to focus on other things

before fixing deletes

Page 33: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Load Balancers

• Started using Perlbal• Bought two BigIPs

– Expensive, but good.• Outgrew the BigIPs• Went with A10 Networks

– A little bumpy– They've had great support.

Page 34: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

CDNs Rock

• We went with Level3• Cut outbound traffic in half:

Page 35: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Rendering

• We render images when a user saves• Render jobs go into a queue• Workers pull from the queue• Workers in both the DC and EC2

Page 36: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Rendering• Manager process maintain enough idle

workers• Workers in DC are at ~100% utilization

%$@#!!!

Page 37: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

EC2• Our processing is elastic between EC2 and our

own servers• We buy servers in batches

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

0

10

20

30

40

50

60

70

80

Render - EC2

Render - Local

Web

Time

Page 38: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Monitoring

If a rack falls in the DC, does it make any sound?

Page 39: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Monitoring• Nagios for Up/Down

– Integrated with automatic tools– ~120 hosts– ~1100 service checks

• Cacti for general graphing/trending– ~3700 data sources– ~2800 Graphs

• Smokeping– Network latency

Page 40: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Redundancy

• I like sleeping at night• Makes maintenance tasks easier• It takes some work to build• We built early (thanks Flickr!)

Page 41: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

War Stories

Page 42: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

ISP Issues

• We're happy with both of our providers.• But...

– Denial of service attacks– Router problems– Power

• Have several providers• Monitor & trend!

Page 43: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

ISP Issues• High latency on a transit link:

• Cause: High CPU usage on their aggregation router

• Solution: Clear and reconfig our interface

Page 44: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Amazon Web Services

• AWS is pretty solid• But, it's not 100%• When AWS breaks, we're down

– The worst type of outage because you can't do anything.

• Watch out for issues that neither party controls– The Internet isn't as reliable point to point

Page 45: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

EC2 Safety Net

• Bug caused render times to increase

Page 46: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

EC2 Safety net

• Thats OK!

Page 47: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Flickr Launch• Very slow Flickr API calls post-launch• Spent an entire day on phone/IM with Flickr

Ops• Finally discovered an issue with NAT and TCP

timestamps• Lessons:

– Have tools to be able to dive deep– Granular monitoring

Page 48: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Firewalls• We had a pair of Watchguard x1250e

firewalls.– Specs: “1.5Gbps of firewall throughput”– Actual at 100Mbps:

Page 49: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

authorize.net

• Connectivity issues• Support was useless• Eventually got to their NOC• We rerouted via my home DSL.• Lessons:

– Access to technical people– Handle failure of services gracefully

Page 50: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

The end.

Page 51: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Thanks!

[email protected]@picnik.com

Page 52: Inside Picnik: How we built Picnik and what we learned along the way Mike Harrington Justin Huff Web 2 Expo, April 3 rd 2009

Credits/ResourcesTalks that influenced us:http://tinyurl.com/scalingwebsites

Images:http://www.flickr.com/photos/kimberlyfaye/2785383504/http://www.flickr.com/photos/piez/574804017/sizes/o/http://www.flickr.com/photos/unclejojo/3224365412/sizes/o/http://www.flickr.com/photos/phoenixdailyphoto/1467681879/sizes/

l/http://abductionlamp.com/http://www.flickr.com/photos/msig/2716038191/http://www.flickr.com/photos/touringcyclist/303577344/http://www.flickr.com/photos/davidt2006/1407837008/