![Page 1: HTTP at your local BigCo](https://reader033.vdocument.in/reader033/viewer/2022060202/559c1ebe1a28abc2298b4706/html5/thumbnails/1.jpg)
HTTP at your local BigCo: How the internet sausage gets made
Peter Griess
@pgriess
![Page 2: HTTP at your local BigCo](https://reader033.vdocument.in/reader033/viewer/2022060202/559c1ebe1a28abc2298b4706/html5/thumbnails/2.jpg)
Goals and non-goals
• Basics of TCP/IP, DNS and HTTP and how they work together; pitfalls and optimizations
• A 1,000 foot view of scaling out HTTP infrastructure
– All manner of load balancing / traffic shaping
– Living on the edge
• Not: how to make a fast application (database access, rendering performance, etc)
![Page 3: HTTP at your local BigCo](https://reader033.vdocument.in/reader033/viewer/2022060202/559c1ebe1a28abc2298b4706/html5/thumbnails/3.jpg)
Background: DNS
• Map hostnames to IP(s)
– www.facebook.com → 69.171.229.12, 69.171.228.40
• Resolution process
– Recursion (and what does the DNS server see?)
– Caching
• Latencies: on-host, cached in LAN, cached at ISP, miss
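The hostname-to-IP mapping can be observed with Python's standard `socket` module. This is a minimal sketch: `resolve_all` is a hypothetical helper name, and what it returns (and how fast) depends on the resolver and caches sitting in front of you.

```python
import socket


def resolve_all(hostname):
    """Return the distinct IPv4 addresses for a hostname, in resolver order."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]  # sockaddr is (ip, port) for AF_INET
        if ip not in addresses:
            addresses.append(ip)
    return addresses
```

Calling `resolve_all("www.facebook.com")` may return one or several addresses, and they will likely differ from the ones on the slide. An IP literal such as `"127.0.0.1"` resolves to itself without touching the network.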
![Page 4: HTTP at your local BigCo](https://reader033.vdocument.in/reader033/viewer/2022060202/559c1ebe1a28abc2298b4706/html5/thumbnails/4.jpg)
Background: TCP
• Stateful protocol
• Negotiated by a synchronous 3-way handshake:
– 1xRTT before the client can send its first byte; at least 2xRTT before the first response byte arrives!
– e.g. USA => South America ~250ms RTT
• Seamless failover is hard (but not impossible)
• Load balancing must be aware of flows
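The round-trip arithmetic can be sketched as below. It assumes one RTT for the handshake (the final ACK can travel with the request) and one RTT for the request and the first response byte. The function name and the `server_time_ms` parameter are hypothetical.

```python
def time_to_first_response_byte_ms(rtt_ms, server_time_ms=0.0):
    """Estimate the delay on a brand-new TCP connection, ignoring DNS and TLS."""
    handshake = rtt_ms         # SYN out, SYN/ACK back
    request_response = rtt_ms  # request out, first response byte back
    return handshake + request_response + server_time_ms


# With the ~250ms RTT from the slide, a new connection costs about half a second.
print(time_to_first_response_byte_ms(250))
```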
![Page 5: HTTP at your local BigCo](https://reader033.vdocument.in/reader033/viewer/2022060202/559c1ebe1a28abc2298b4706/html5/thumbnails/5.jpg)
Background: HTTP
• Layered on top of TCP/TLS
• Has some useful bits
– Compression
– Connection re-use
– Pipelining
– Caching
• Kind of sucks
– Headers on all requests/responses
– Compression on bodies only
– Pipelining has to be disabled most of the time
– Pipelining suffers from head-of-line blocking
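To make "compression on bodies only" concrete, here is a small sketch using Python's `gzip` module. The header and body contents are made up for illustration. The point is that the body shrinks while the headers go over the wire as-is on every request and response.

```python
import gzip

# Hypothetical response: repetitive HTML compresses very well.
body = ("<li>cute kitten</li>\n" * 200).encode()
compressed_body = gzip.compress(body)

# HTTP/1.x headers are plain text and are not covered by Content-Encoding.
headers = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Content-Encoding: gzip\r\n"
    f"Content-Length: {len(compressed_body)}\r\n"
    "\r\n"
).encode()

print(len(body), "body bytes ->", len(compressed_body), "compressed")
print(len(headers), "header bytes, sent uncompressed every time")
```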
![Page 6: HTTP at your local BigCo](https://reader033.vdocument.in/reader033/viewer/2022060202/559c1ebe1a28abc2298b4706/html5/thumbnails/6.jpg)
mycutekittens.tv
[Diagram: the big bad internet talking HTTP to a single server at 68.193.17.4]
![Page 7: HTTP at your local BigCo](https://reader033.vdocument.in/reader033/viewer/2022060202/559c1ebe1a28abc2298b4706/html5/thumbnails/7.jpg)
Problem?
![Page 8: HTTP at your local BigCo](https://reader033.vdocument.in/reader033/viewer/2022060202/559c1ebe1a28abc2298b4706/html5/thumbnails/8.jpg)
Problem
• Availability
– Server goes down (kernel panic?)
– Network goes down (cable cut?)
– Datacenter goes down (EC2?)
• Overload
– Shed load (good, can be transparent)
– Get infinitely slow (not good)
![Page 9: HTTP at your local BigCo](https://reader033.vdocument.in/reader033/viewer/2022060202/559c1ebe1a28abc2298b4706/html5/thumbnails/9.jpg)
mycutekittens.tv: multi-server
Big bad internet
???
![Page 10: HTTP at your local BigCo](https://reader033.vdocument.in/reader033/viewer/2022060202/559c1ebe1a28abc2298b4706/html5/thumbnails/10.jpg)
We have options
• DNS load balancing
• IP load balancing
• HTTP load balancing
![Page 11: HTTP at your local BigCo](https://reader033.vdocument.in/reader033/viewer/2022060202/559c1ebe1a28abc2298b4706/html5/thumbnails/11.jpg)
DNS load balancing
• mycutekittens.tv resolves to IPs: A, B, C, D
– Add new IPs to scale out
– Remove IPs when hosts go down
• Benefits
– Don’t need extra hardware to do load balancing
– Can span datacenters
– DNS servers are cheap / fast
• Drawbacks
– Hotspots due to caching
– Hotspots due to ordering in result list
– Hotspots due to resolver size
– TTL / flexibility trade-off
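A toy model of DNS round robin and of the caching hotspot. `RoundRobinZone`, `simulate` and `cached_hits` are hypothetical names, and the model assumes every client connects to the first IP in the answer.

```python
from collections import Counter, deque


class RoundRobinZone:
    """Hands out the same IP list, rotated by one on every query."""

    def __init__(self, ips):
        self._ips = deque(ips)

    def answer(self):
        result = list(self._ips)
        self._ips.rotate(-1)  # next query sees a different first entry
        return result

    def remove(self, ip):
        """Drop a host that went down."""
        self._ips.remove(ip)


def simulate(zone, queries):
    """Each query comes from a client that uses the first IP returned."""
    hits = Counter()
    for _ in range(queries):
        hits[zone.answer()[0]] += 1
    return hits


def cached_hits(zone, clients):
    """One big resolver caches a single answer and serves it to all clients."""
    first_ip = zone.answer()[0]
    return Counter({first_ip: clients})
```

With four IPs, `simulate(zone, 8)` spreads evenly at two hits each, while `cached_hits(zone, 1000)` sends all 1,000 clients to one host until the TTL expires.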
![Page 12: HTTP at your local BigCo](https://reader033.vdocument.in/reader033/viewer/2022060202/559c1ebe1a28abc2298b4706/html5/thumbnails/12.jpg)
mycutekittens.tv: DNS
[Diagram: the big bad internet, a DNS server answering DNS queries, and three servers at 68.193.17.4, 68.193.17.5 and 68.193.17.6]
![Page 13: HTTP at your local BigCo](https://reader033.vdocument.in/reader033/viewer/2022060202/559c1ebe1a28abc2298b4706/html5/thumbnails/13.jpg)
IP load balancing (1)
• mycutekittens.tv resolves to 1 public IP owned by an IP load balancer
– Add new backend hosts w/ private IPs to scale out
– Load balancer health-checks hosts actively or passively to avoid dead hosts
• Scheduling policies vs. failover
• DSR
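One way to keep a load balancer flow-aware is to hash the connection's 4-tuple so every packet of a flow lands on the same backend. The sketch below assumes a simple modulo-hash policy and a `healthy` set maintained by some health checker. `pick_backend` and its parameters are hypothetical, not how any particular product works.

```python
import hashlib


def pick_backend(flow, backends, healthy):
    """Map a flow (src_ip, src_port, dst_ip, dst_port) to one healthy backend."""
    candidates = [b for b in backends if b in healthy]
    if not candidates:
        raise RuntimeError("no healthy backends")
    key = "|".join(str(part) for part in flow).encode()
    # hashlib gives a stable hash; Python's built-in hash() varies per process.
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(candidates)
    return candidates[index]


backends = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
flow = ("203.0.113.7", 51515, "68.193.17.4", 80)
print(pick_backend(flow, backends, healthy=set(backends)))
```

When a host drops out of `healthy`, the modulo changes and some established flows get remapped to a backend that has no state for them, which is one reason seamless failover is hard.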
![Page 14: HTTP at your local BigCo](https://reader033.vdocument.in/reader033/viewer/2022060202/559c1ebe1a28abc2298b4706/html5/thumbnails/14.jpg)
IP load balancing (2)
• Benefits
– Only 1 public IP (high DNS TTL)
– Backend network capacity/membership transparent to the internet
– Cheap-ish
– Failover is possible, not insanely difficult
• Drawbacks
– Can’t do what you can with HTTP
![Page 15: HTTP at your local BigCo](https://reader033.vdocument.in/reader033/viewer/2022060202/559c1ebe1a28abc2298b4706/html5/thumbnails/15.jpg)
mycutekittens.tv: IP
[Diagram: the big bad internet, a gateway (GW), a load balancer (LB), public IP 68.193.17.4, and backends at 10.0.0.1, 10.0.0.2 and 10.0.0.3]
![Page 16: HTTP at your local BigCo](https://reader033.vdocument.in/reader033/viewer/2022060202/559c1ebe1a28abc2298b4706/html5/thumbnails/16.jpg)
HTTP load balancing (1)
• mycutekittens.tv resolves to 1 public IP owned by an HTTP load balancer
– Largely same as IP load balancing
– Terminates TCP connections (sees all bytes)
– Can make routing decisions based on HTTP
– Can autonomously serve requests (caching, access control, etc)
• Examples:
– Send requests for /foo/* to pool A
– Respond 401 to requests without cookie Q
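The two example rules could be expressed like this. It is only a sketch: the `route` function, the `pool-default` name and the order in which the rules are checked are assumptions, not something the slides specify.

```python
def route(path, cookies):
    """Decide what an HTTP load balancer does with a request.

    Returns ("reject", status_code) or ("forward", pool_name).
    """
    # Access control first: the balancer can answer without touching a backend.
    if "Q" not in cookies:
        return ("reject", 401)
    # Path-based routing: /foo/* goes to pool A.
    if path.startswith("/foo/"):
        return ("forward", "pool-A")
    return ("forward", "pool-default")  # hypothetical catch-all pool


print(route("/foo/bar", {"Q": "abc"}))
print(route("/foo/bar", {}))
```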
![Page 17: HTTP at your local BigCo](https://reader033.vdocument.in/reader033/viewer/2022060202/559c1ebe1a28abc2298b4706/html5/thumbnails/17.jpg)
HTTP load balancing (2)
• Benefits
– Largely the same as IP
– More flexible rules
– Can terminate TLS (security+, cost+)
• Drawbacks
– No DSR
– Failover difficult
– Not as performant as IP
![Page 18: HTTP at your local BigCo](https://reader033.vdocument.in/reader033/viewer/2022060202/559c1ebe1a28abc2298b4706/html5/thumbnails/18.jpg)
mycutekittens.tv: HTTP
[Diagram: the big bad internet talking HTTP(S) to a load balancer (LB) at 68.193.17.4, with backends at 10.0.0.1, 10.0.0.2 and 10.0.0.3]
![Page 19: HTTP at your local BigCo](https://reader033.vdocument.in/reader033/viewer/2022060202/559c1ebe1a28abc2298b4706/html5/thumbnails/19.jpg)
mycutekittens.tv: MOAR
• Eventually a single LB is going to be a problem
– Not enough capacity
– Availability
• Turtles all the way down
– LB of LBs!
– DNS load balancing between datacenters
– …
![Page 20: HTTP at your local BigCo](https://reader033.vdocument.in/reader033/viewer/2022060202/559c1ebe1a28abc2298b4706/html5/thumbnails/20.jpg)
HTTPS: myths and reality
• Too computationally expensive
– Only a few percent (imperialviolet.org); is your webserver actually CPU bound? doubt it
– SSL acceleration cards, GPUs, etc
• Too much latency
– Handshaking is 5-7xRTT
• Session resume
• False start
• Snap start
– Caching breaks
![Page 21: HTTP at your local BigCo](https://reader033.vdocument.in/reader033/viewer/2022060202/559c1ebe1a28abc2298b4706/html5/thumbnails/21.jpg)
My latency is huge in Japan
• RTT to the USA (or any single DC) can be huge
– Re-use connections (connection: keep-alive)
– Send work in parallel (pipelining)
– Use compression (content-encoding)
– Lots of tricks for static resources (bundling, CDNs, caching, etc)
– Pre-fetch data
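Connection re-use is the cheapest of these tricks to demonstrate. The sketch below starts a throwaway local server and sends two requests over one `http.client.HTTPConnection`. The handler, `fetch_all` and `demo` are hypothetical names. The server records each request's client port, so identical ports mean the same TCP connection was used and the handshake was paid only once.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # HTTP/1.1 connections are keep-alive by default
    client_ports = []

    def do_GET(self):
        Handler.client_ports.append(self.client_address[1])
        body = f"hello {self.path}".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))  # needed for keep-alive
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_all(host, port, paths):
    """Fetch several paths over a single TCP connection."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    results = []
    try:
        for path in paths:
            conn.request("GET", path)
            resp = conn.getresponse()
            results.append((resp.status, resp.read().decode()))
    finally:
        conn.close()
    return results


def demo():
    Handler.client_ports = []
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        results = fetch_all("127.0.0.1", server.server_port, ["/a", "/b"])
    finally:
        server.shutdown()
        server.server_close()
    return results, list(Handler.client_ports)
```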
![Page 22: HTTP at your local BigCo](https://reader033.vdocument.in/reader033/viewer/2022060202/559c1ebe1a28abc2298b4706/html5/thumbnails/22.jpg)
Let’s get crazy: SPDY
• Don’t limit yourself to HTTP; use a different protocol
– SPDY developed by Google, supported by Chrome, google.com (and soon facebook.com)
– Connection re-use w/o head-of-line blocking
– Headers always compressed
– Always SSL (but breaks caching)
![Page 23: HTTP at your local BigCo](https://reader033.vdocument.in/reader033/viewer/2022060202/559c1ebe1a28abc2298b4706/html5/thumbnails/23.jpg)
Let’s get crazy: TCP termination
• Synchronous RTTs: the silent killer
– Opening new TCP connections is very costly
• Run proxies close to users and proxy traffic back to core using optimized protocol
– Low RTT to proxy
– Do SPDY-like tricks between edge + core
– Potentially faster network to core than public internet
• Advertise these proxies via DNS
– Geo-targeting
– AS-adjacency
• Akamai CDN does this, sort of
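A rough latency comparison shows why terminating TCP at the edge helps. The sketch assumes the edge proxy already holds a warm connection to the core, so only the short user-to-edge leg pays for a handshake. The function names and the example RTTs (other than the ~250ms figure used earlier) are made up.

```python
def direct_ms(rtt_user_to_core_ms):
    """New TCP connection straight to the core: handshake + request/response."""
    return 2 * rtt_user_to_core_ms


def via_edge_ms(rtt_user_to_edge_ms, rtt_edge_to_core_ms):
    """Handshake with a nearby proxy; the request rides a warm edge-to-core link."""
    handshake = rtt_user_to_edge_ms
    request_response = rtt_user_to_edge_ms + rtt_edge_to_core_ms
    return handshake + request_response


print(direct_ms(250))        # user far from the datacenter
print(via_edge_ms(20, 230))  # same user, proxy 20ms away
```

Under these assumptions the edge path is roughly 270ms instead of 500ms, before counting any gains from a faster private network to the core.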
![Page 24: HTTP at your local BigCo](https://reader033.vdocument.in/reader033/viewer/2022060202/559c1ebe1a28abc2298b4706/html5/thumbnails/24.jpg)
Let’s get crazy: DNS anycast
• Remember how DNS resolutions were slow?
– DNS servers could be far away from a user
• Advertise multiple network routes for the same DNS IP, let the IP stack pick the closest one
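A toy view of anycast: several sites announce the same DNS IP, and the network delivers each user's packets to whichever announcement looks closest. The sketch below picks the shortest AS path, which is a simplification since real route selection weighs several other criteria. The site names and AS numbers are invented.

```python
def closest_site(announcements):
    """Pick the site whose route to the shared IP has the shortest AS path."""
    return min(announcements, key=lambda a: len(a["as_path"]))["site"]


# Routes to the same anycast IP as seen from one hypothetical user's ISP.
seen_from_tokyo = [
    {"site": "us-east", "as_path": [64496, 64497, 64498, 64499]},
    {"site": "tokyo", "as_path": [64496, 64500]},
    {"site": "europe", "as_path": [64496, 64497, 64501]},
]
print(closest_site(seen_from_tokyo))
```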