usenix lisa 2012 - choosing a proxy
DESCRIPTION
My talk on choosing an HTTP proxy cache at Usenix LISA 2012.
TRANSCRIPT
Choosing a Proxy - Don’t roll the D20!
Leif Hedstrom, Cisco WebEx
Who am I?
• Unix developer since 1985
• Yeah, I’m really that old, I learned Unix on BSD 2.9
• Long time SunOS/Solaris/Linux user
• Mozilla committer (but not active now)
• VP of Apache Traffic Server PMC
• ASF member
• Overall hacker, geek and technology addict
[email protected]
@zwoop
+lhedstrom
So which proxy cache should you choose?
Plenty of Proxy Servers
PerlBal
And plenty of “reliable” sources…
Answer: the one that solves your problem!
http://mihaelasharkova.files.wordpress.com/2011/05/5steploop2.jpg
But first…
• While you are still awake, and the coffee is fresh:
My crash course in HTTP proxy and caching!
Forward Proxy
Reverse Proxy
Intercepting Proxy
Why Cache is King
• The fastest content to serve is the data the user already has locally in their computer/browser cache
  – This is near zero cost and zero latency!
• The speed of light is still a limiting factor
  – Reduce the latency -> faster page loads
• Serving out of cache is computationally cheap
  – At least compared to e.g. PHP or any other higher level page generation system
  – It’s easy to scale caches horizontally
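A minimal sketch of why a cache hit is cheap, assuming a toy in-memory store with a fixed TTL. TinyCache and fake_origin are hypothetical names for illustration only and do not come from any of the proxies discussed here.

```python
import time
from typing import Callable, Dict, Tuple


class TinyCache:
    """Hypothetical in-memory cache: URL -> (expiry timestamp, body)."""

    def __init__(self, fetch: Callable[[str], bytes], ttl: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.fetch = fetch      # slow path: go to the origin
        self.ttl = ttl          # seconds an object stays fresh
        self.clock = clock      # injectable so expiry can be tested
        self.store: Dict[str, Tuple[float, bytes]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        now = self.clock()
        entry = self.store.get(url)
        if entry is not None and entry[0] > now:
            self.hits += 1      # fresh copy: no origin round trip
            return entry[1]
        self.misses += 1        # missing or stale: ask the origin
        body = self.fetch(url)
        self.store[url] = (now + self.ttl, body)
        return body


def fake_origin(url: str) -> bytes:
    # Stand-in for an expensive page generation (e.g. PHP/Drupal).
    return ("page for " + url).encode()


cache = TinyCache(fake_origin, ttl=900)
for _ in range(3):
    cache.get("/index.html")
print(cache.hits, cache.misses)
```

Only the first request reaches the origin; the other two are answered from a dictionary lookup, which is the whole argument of this slide in miniature.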
Choosing an intermediary
Plenty of Proxy Servers
PerlBal
Plenty of Free Proxy Servers
PerlBal
Plenty of Free Caching Proxy Servers
Choosing an intermediary
The problem
• You basically cannot buy a computer today with fewer than 2 CPUs or cores
• Things will only get “worse”!
  – Well, really, it’s getting better
• Typical server deployments today have at least 8 – 16 cores
  – How many of those can you actually use??
  – And are you using them efficiently??
• NUMA turns out to be kind of a bitch…
Solution 1: Multi-threading
Problems with multi-threading
• It’s a wee bit difficult to get it right!
http://www.flickr.com/photos/stuartpilbrow/3345896050/
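One small illustration of why threads are hard to get right, as a sketch only: a shared counter updated by several threads. HitCounter and hammer are hypothetical names. The unlocked increment is a read-modify-write that can lose updates depending on interpreter and timing; the locked one cannot.

```python
import threading
from typing import Callable


class HitCounter:
    """Hypothetical shared statistic, updated by many worker threads."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def unsafe_incr(self) -> None:
        # Read-modify-write without a lock: two threads can read the
        # same old value, and one of the updates is then lost.
        self.value += 1

    def safe_incr(self) -> None:
        # The lock makes the read-modify-write atomic.
        with self._lock:
            self.value += 1


def hammer(incr: Callable[[], None], n_threads: int = 4,
           n_iter: int = 50_000) -> None:
    """Call incr() n_iter times from each of n_threads threads."""
    def worker() -> None:
        for _ in range(n_iter):
            incr()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


counter = HitCounter()
hammer(counter.safe_incr)
print(counter.value)  # 4 threads x 50,000 increments = 200000
```

Swapping in counter.unsafe_incr may or may not print 200000, which is exactly the problem: the bug is timing dependent and easy to miss in testing.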
Problems with multi-threading
Solution 2: Event Processing
Problems with Event Processing
• It hates blocking APIs and calls!
  – Hating it back doesn’t help :/
• Still somewhat complicated
• It doesn’t scale on SMP by itself
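A rough sketch of the usual workaround for blocking calls in an event loop, using Python's asyncio as a stand-in (none of the proxies here are written this way). blocking_lookup is a hypothetical blocking API; it is handed to a thread pool so the loop itself never stalls.

```python
import asyncio
import time
from typing import List


def blocking_lookup(key: str) -> str:
    # Stand-in for a blocking API (disk read, DNS, legacy library call).
    time.sleep(0.2)
    return key.upper()


async def handle(key: str) -> str:
    loop = asyncio.get_running_loop()
    # Run the blocking call in the default thread pool so the event
    # loop stays free to service other connections in the meantime.
    return await loop.run_in_executor(None, blocking_lookup, key)


async def main() -> List[str]:
    # Three "requests" in flight at once on a single event loop.
    return await asyncio.gather(*(handle(k) for k in ("a", "b", "c")))


results = asyncio.run(main())
print(results)
```

Calling blocking_lookup directly inside handle would still return the same results, but every other connection on the loop would wait for each sleep to finish.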
Where are we at?
            Apache TS        Nginx      Squid      Varnish
Processes   1                1 - <n>    1 - <n>    1
Threads     Based on cores   1          1          Lots
Evented     Yes              Yes        Yes        Yes *)
*) Can use blocking calls, with (large) thread pool
Proxy Cache test setup
• AWS Large instances, 2 CPUs
• All on RFC 1918 network (“internal” net)
• 8GB RAM
• Access logging enabled to disk (except on Varnish)
• Software versions
  – Linux v3.2.0
  – Traffic Server v3.3.1
  – Nginx v1.3.9
  – Squid v3.2.5
  – Varnish v3.0.3
• Minimal configuration changes
• Cache a real (Drupal) site
ATS configuration
• etc/trafficserver/remap.config:
    map / http://10.118.154.58
• etc/trafficserver/records.config:
    CONFIG proxy.config.http.server_ports STRING 80
Nginx configuration try 1, basically defaults (broken, don’t use)
worker_processes 2;
access_log logs/access.log main;

proxy_cache_path /mnt/nginx_cache levels=1:2 keys_zone=my-cache:8m
                 max_size=16384m inactive=600m;
proxy_temp_path /mnt/nginx_temp;

server {
    listen 80;

    location / {
        proxy_pass http://10.83.145.47/;
        proxy_cache my-cache;
    }
}
Nginx configuration try 2 (works but really slow, 10x slower)
worker_processes 2;
access_log logs/access.log main;

proxy_cache_path /mnt/nginx_cache levels=1:2 keys_zone=my-cache:8m
                 max_size=16384m inactive=600m;
proxy_temp_path /mnt/nginx_temp;

gzip on;
server {
    listen 80;

    location / {
        proxy_pass http://10.83.145.47/;
        proxy_cache my-cache;
        # Always fetch uncompressed from the origin; nginx gzips on the way out
        proxy_set_header Accept-Encoding "";
    }
}
Nginx configuration try 3 (works and reasonably fast, but WTF!)
worker_processes 2;
access_log logs/access.log main;

proxy_cache_path /mnt/nginx_cache levels=1:2 keys_zone=my-cache:8m
                 max_size=16384m inactive=600m;
proxy_temp_path /mnt/nginx_temp;

server {
    listen 80;

    # Normalize Accept-Encoding to "" or "gzip" and make it part of the cache key
    set $ae "";
    if ($http_accept_encoding ~* gzip) {
        set $ae "gzip";
    }

    location / {
        proxy_pass http://10.83.145.47/;
        proxy_cache my-cache;
        proxy_set_header If-None-Match "";
        proxy_set_header If-Modified-Since "";
        proxy_set_header Accept-Encoding $ae;
        proxy_cache_key $uri$is_args$args$ae;
    }

    # Needs a cache purge module; zone and key must match proxy_cache and proxy_cache_key above
    location ~ /purge_it(/.*) {
        proxy_cache_purge my-cache $1$is_args$args$ae;
    }
}
Thanks to Chris Ueland at NetDNA for the snippet
Squid configuration
http_port 80 accel
http_access allow all
cache_mem 4096 MB
workers 2
memory_cache_shared on
cache_dir ufs /mnt/squid 100 16 256
cache_peer 10.83.145.47 parent 80 0 no-query originserver
Varnish configuration
backend default {
    .host = "10.83.145.47";
    .port = "80";
}
Performance AWS 8KB HTML (gzip)
[Chart: throughput (QPS, axis 0 to 10,000) and time to first response (ms, axis 0 to 25) per proxy; only the latency bars carry value labels]
                               Time to first response (ms)
ATS 3.3.1                      7.40
Nginx 1.3.9 hack               7.92
Squid 3.2.5                    12.16
Varnish 3.0.3                  9.20
Varnish 3.0.3 varnishlog -w    22.81
Performance AWS 8KB HTML (gzip)
[Chart: throughput (QPS, axis 0 to 10,000) and CPU used (dual core, axis 0% to 100%) per proxy; only the CPU bars carry value labels]
                               CPU used (dual core)
ATS 3.3.1                      63%
Nginx 1.3.9 hack               60%
Squid 3.2.5                    82%
Varnish 3.0.3                  73%
Varnish 3.0.3 varnishlog -w    83%
Performance AWS 500 bytes JPG
[Chart: throughput (QPS, axis 0 to 16,000) and time to first response (ms, axis 0 to 18) per proxy; only the latency bars carry value labels]
                               Time to first response (ms)
ATS 3.3.1                      4.95
Nginx 1.3.9 hack               5.93
Squid 3.2.5                    9.10
Varnish 3.0.3                  7.27
Varnish 3.0.3 varnishlog -w    16.41
Performance AWS 500 bytes JPG
[Chart: throughput (QPS, axis 0 to 16,000) and CPU used (dual core, axis 0% to 100%) per proxy; only the CPU bars carry value labels]
                               CPU used (dual core)
ATS 3.3.1                      78%
Nginx 1.3.9 hack               77%
Squid 3.2.5                    84%
Varnish 3.0.3                  77%
Varnish 3.0.3 varnishlog -w    76%
Choosing an intermediary
RFC 2616 is not optional!
• Neither is the new BIS revision!
• Understanding HTTP and how it relates to Proxy and Caching is important
  – Or you will get it wrong! I promise.
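One example of a rule that is easy to get wrong: when a response carries both Cache-Control: max-age and Expires, max-age wins (RFC 2616, section 13.2.4). The sketch below is a simplified reading of that rule, not a full implementation: freshness_lifetime is a hypothetical helper, it ignores s-maxage and heuristic freshness, and the header values mimic what a Drupal origin typically sends.

```python
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def freshness_lifetime(headers: Mapping[str, str]) -> Optional[float]:
    """Seconds a cache may treat the response as fresh.

    Simplified: s-maxage and heuristic freshness are ignored, and
    max-age takes precedence over Expires.
    """
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return float(value)
    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        # An Expires in the past means "already stale", not negative time.
        return max(0.0, (expires - date).total_seconds())
    return None  # no explicit freshness information


drupal_like = {
    "Date": "Wed, 12 Dec 2012 18:00:48 GMT",
    "Cache-Control": "public, max-age=900",
    "Expires": "Sun, 19 Nov 1978 05:00:00 GMT",
}
print(freshness_lifetime(drupal_like))  # 900.0: max-age overrides the 1978 Expires
```

A proxy that looked only at Expires would treat this page as permanently stale and never serve it from cache.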
How things can go wrong: Vary!
$ curl -D - -o /dev/null -s --compressed http://10.118.73.168/
HTTP/1.1 200 OK
Server: nginx/1.3.9
Date: Wed, 12 Dec 2012 18:00:48 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 8051
Connection: keep-alive
X-Powered-By: PHP/5.4.9
X-Drupal-Cache: HIT
Etag: "1355334762-0-gzip"
Content-Language: en
X-Generator: Drupal 7 (http://drupal.org)
Cache-Control: public, max-age=900
Last-Modified: Wed, 12 Dec 2012 17:52:42 +0000
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Vary: Cookie,Accept-Encoding
Content-Encoding: gzip
How things can go wrong: Vary!
$ curl -D - -o /dev/null -s http://10.118.73.168/
HTTP/1.1 200 OK
Server: nginx/1.3.9
Date: Wed, 12 Dec 2012 18:00:57 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 8051
Connection: keep-alive
X-Powered-By: PHP/5.4.9
X-Drupal-Cache: HIT
Etag: "1355334762-0-gzip"
Content-Language: en
X-Generator: Drupal 7 (http://drupal.org)
Cache-Control: public, max-age=900
Last-Modified: Wed, 12 Dec 2012 17:52:42 +0000
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Vary: Cookie,Accept-Encoding
Content-Encoding: gzip
EPIC FAIL!
Note: this client did not advertise gzip support, yet the response is still gzip-encoded
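What honoring Vary roughly means for a cache, as a sketch under assumptions: the cache key has to include the request headers the origin names in Vary, so a gzip variant is never handed to a client that did not ask for it. VaryAwareCache is a hypothetical class, header names are assumed to be lowercased already, and variants are matched by exact string (real proxies normalize more carefully).

```python
from typing import Dict, Mapping, Optional, Tuple

Variant = Tuple[Tuple[str, str], ...]
Key = Tuple[str, Variant]


class VaryAwareCache:
    """Hypothetical store keyed on URL plus the request headers named in Vary."""

    def __init__(self) -> None:
        self.vary_for_url: Dict[str, Tuple[str, ...]] = {}
        self.objects: Dict[Key, bytes] = {}

    @staticmethod
    def _key(url: str, req: Mapping[str, str],
             names: Tuple[str, ...]) -> Key:
        # One (header, value) pair per header listed in Vary.
        variant = tuple((n, req.get(n, "").strip().lower()) for n in names)
        return (url, variant)

    def store(self, url: str, req: Mapping[str, str],
              resp: Mapping[str, str], body: bytes) -> None:
        names = tuple(sorted(
            h.strip().lower() for h in resp.get("vary", "").split(",")
            if h.strip()))
        self.vary_for_url[url] = names
        self.objects[self._key(url, req, names)] = body

    def lookup(self, url: str, req: Mapping[str, str]) -> Optional[bytes]:
        names = self.vary_for_url.get(url)
        if names is None:
            return None  # nothing cached for this URL yet
        return self.objects.get(self._key(url, req, names))


cache = VaryAwareCache()
cache.store("/", {"accept-encoding": "gzip"},
            {"vary": "Cookie,Accept-Encoding", "content-encoding": "gzip"},
            b"<gzip bytes>")
gz = cache.lookup("/", {"accept-encoding": "gzip"})
plain = cache.lookup("/", {})
print(gz is not None, plain is None)  # True True
```

The second lookup misses instead of returning gzip bytes, which forces a fetch of the uncompressed variant from the origin.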
What type of proxy do you need?
• Of our candidates, only two fully support all proxy modes!
CoAdvisor HTTP protocol quality tests for reverse proxies
[Bar chart: Failures / Violations / Success per proxy, axis 0 to 600, for ATS 3.3.1, Nginx 1.3.9, Squid 3.2.5 and Varnish 3.0.3. Percentage labels on the chart: 49%, 81%, 51%, 68%]
CoAdvisor HTTP protocol quality tests for reverse proxies
[Bar chart: Failures / Violations / Success per proxy, axis 0 to 600, for ATS 3.3.1, Nginx 1.3.9, Squid 3.2.5 and Varnish 3.0.3. Percentage labels on the chart: 25%, 6%, 27%, 15%]
Choosing an intermediary
My subjective opinions
ATS – The good
• Good HTTP/1.1 support, including SSL
• Tunes itself very well to the system / hardware at hand
• Excellent cache features and performance
  – Raw disk cache is fast and resilient
• Extensible plugin APIs, quite a few plugins
• Used and developed by some of the largest Web companies in the world
ATS – The bad
• Load balancing is incredibly lame
• Seen as difficult to set up (I obviously disagree)
• Developer community is still too small
• Code is complicated
  – By necessity? Maybe …
ATS – The ugly
• Too many configuration files!
• There’s still legacy code that has to be replaced or removed
• Not a whole lot of commercial support
  – But there’s hope (e.g. OmniTI recently announced packaged support)
Nginx – The good
• Easy to understand the code base, and software architecture
  – Lots of plugins available, including SPDY
• Excellent Web and Application server
  – E.g. Nginx + fpm (fcgi) + PHP is the awesome, according to a very reputable source
• Commercial support available from the people who wrote and know it best. Huge!
Nginx – The bad
• Adding extensions implies rebuilding the binary
• By far the most configurations required “out of the box” to even do anything remotely useful
• It does not make good attempts to tune itself to the system
• No good support for conditional requests
Nginx – The ugly
• The cache is a joke! Really
• The protocol support as an HTTP proxy is rather poor. It fares the worst in the tests, and can be outright wrong if you are not very careful
• From docs: “nginx does not handle "Vary" headers when caching.” Seriously?
Squid – The Good
• Has by far the most HTTP features of the bunch. I mean, by far, nothing comes even close
• It also is the best HTTP conformant proxy today. It has the best scores in the CoAdvisor tests, by a wide margin
• The features are mature, and used pretty much everywhere
• Works pretty well out of the box
Squid – The Bad
• Old code base
• Cache is not particularly efficient
• Has traditionally been prone to instability
• Complex configurations
  – At least IMO, I hate it
Squid – The Ugly
• SMP is quite an afterthought
  – Duct tape
• Why spend so many years rewriting from v2.x to v3.x without actually addressing some of the real problems? Feels like a boat has been missed…
• Not very extensible
  – Typically you write external “helper” processes, similar to fcgi. This is not particularly flexible, nor powerful (you cannot do everything you’d want as a helper, so you might have to rewrite the Squid core)
Varnish – The Good
• VCL
• And did I mention VCL? Pure genius!
• Very clever logging mechanism
• ESI is cool, even with its limited subset
  – Not unique to Varnish though
• Support from several good commercial entities
Varnish – The Bad
• Letting the kernel do the hard work might seem like a good idea on paper, but perhaps not so great in the real world. But let’s not go into a BSD vs Linux kernel war …
• Persistent caching seems like an afterthought at best
• No good support for conditional requests
• What impact does “real” logging have on performance?
Varnish – The Ugly
• There are a lot of threads in this puppy!
• No SSL. And presumably, there never will be?
  – So what happens with SPDY / HTTP2 ?
• Protocol support is weak, without a massive amount of VCL.
• And, you probably will need a PhD in VCL!
  – There’s a lot of VCL hacking to do to get it to behave well
Summary
• Please understand your problem
  – Don’t listen to @zwoop on twitter…
• Performance in itself is rarely a key differentiator; latency, features and correctness are
• But most important, use a proxy, preferably a good one, if you run a serious web server
Performance AWS 8KB HTML (gzip)
If it ain’t broken, don’t fix it
But by all means, make it less sucky!
However, when all you have is a hammer…
http://www.flickr.com/photos/aai/6936657289/