Varnish, The Good, The Awesome, and the Downright Crazy
By Mike Willbanks
Software Engineering Manager
CaringBridge
MinneBar April 7, 2012
•Talk
Slides will be online later!
•Me
Software Engineering Manager at CaringBridge
MNPHP Organizer
Open Source Contributor (Zend Framework and various others)
Where you can find me:
• Twitter: mwillbanks
• G+: Mike Willbanks
• IRC (freenode): mwillbanks
• Blog: http://blog.digitalstruct.com
• GitHub: https://github.com/mwillbanks
Housekeeping…
• What the … is Varnish?
• The Good
“I don’t want to listen to you but only want to play on my laptop.”
The quick, easy and uninformed way... You listening over there?
• The Awesome
VCL’s, Directors and more… now we’re going places.
• The Crazy
ESI, Purging, VCL C, and VMOD… my head hurts, stop!
• Varnish Command Line Apps
varnishtop, varnishstat, etc.
• Questions
Although you can bring them up at any time!
Agenda
What the… is Varnish?
Official Statement
What the hell it means
Graphs, oh my!
“Varnish is a web application accelerator. You install it in
front of your web application and it will speed it up
significantly.”
Official Statement
•Varnish allows you to accelerate your website
By using memory and keeping in mind cookies, request headers
and more…
• It caches pages so that your web server can RELAX!
What about my apache, tomcat, uhhh… (mongrel|thin|goliath….)
Generally caching by TTL + HTTP Headers (cookies too!)
•A load balancer, proxy and more…
What? …. Yes, it can do that! #winning
What The Hell? Tell me!
•CaringBridge Status Server
We need to get a message to our mobile users!
The system is down, or we want to be able to communicate a
message to them about some subject… maybe a campaign.
The apps and mobile site rely on an API
• Trouble in paradise? Few and far between.
• We want to talk to our users? Of course!
Let an API talk to a server…
A story on crashing and burning
A story on a bad thing making a huge success!
A General Use Case
The Graph - AWS
[Four bar charts comparing Small, X-Large and Small Varnish instances: Requests, Time, Req/s, Peak Load]
The Raw Data
              Small    X-Large               Small Varnish
Concurrency   10       150                   150
Requests      5000     55558                 75000
Time          438      347                   36
Req/s         11.42    58                    585
Peak Load     11.91    8.44                  0.35
Comments               19,442 failed reqs
The Good – Listen Up!
Installation
Documentation
Finding Existing VCL’s
•RTM : http://goo.gl/hl4Tt
Debian: sudo apt-get install varnish
EPEL: yum install varnish
• EPEL 6.x only; otherwise you’ll be out of date!
WOOT Compiling #winning #git
• git clone git://git.varnish-cache.org/varnish-cache
• cd varnish-cache
• sh autogen.sh
• ./configure
• make && make install
Installation
•Reference Manual
https://www.varnish-cache.org/docs/3.0/reference/index.html
•Tutorial – more like a book version of the reference manual
https://www.varnish-cache.org/docs/3.0/tutorial/index.html
•Knock yourselves out! There is a ton of documentation
• Yes, this makes happy developers.
Documentation is very accurate, read carefully.
Focus heavily on VCL’s, that is generally what you need.
I’m attempting to show you some of how this works but you will
require the documentation to assist you.
Documentation
•VCL’s are available for common open source projects
Hi wordpress and drupal!
• https://www.varnish-cache.org/trac/wiki/VarnishAndWordpress
• https://www.varnish-cache.org/trac/wiki/VarnishAndDrupal
Examples of all sorts of crazy
• https://www.varnish-cache.org/trac/wiki/VCLExamples
Existing VCL’s – The truly lazy…
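backend default {
    .host = "127.0.0.1";
    .port = "8080";
}
sub vcl_recv {
    # Drop request cookies everywhere except the login and admin pages
    if (!(req.url ~ "wp-(login|admin)")) {
        unset req.http.cookie;
    }
}
sub vcl_fetch {
    # Drop Set-Cookie from backend responses for the same URLs
    if (!(req.url ~ "wp-(login|admin)")) {
        unset beresp.http.set-cookie;
    }
}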
Wordpress = Bad Slashdot Bad!!!
The Awesome – Going Places
VCL
Directors
A Few Examples
•VCL State Engine
Each Request is Processed Separately & Independently
States are Isolated but are Related
Return statements exit one state and start another
VCL defaults are ALWAYS appended below your own VCL
•VCL can be complex, but…
Two main subroutines: vcl_recv and vcl_fetch
Common actions: pass, hit_for_pass, lookup, pipe, deliver
Common variables: req, beresp and obj
More subroutines, functions and complexity can arise depending
on conditions.
VCL – Varnish Configuration Language
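To make the state engine a little more concrete, here is a minimal sketch in Varnish 3.0 syntax. It is a fragment, not a complete configuration: a backend is assumed to be declared elsewhere, and the one-hour TTL for static assets is a hypothetical policy rather than something from the talk.

```vcl
sub vcl_recv {
    # Anything other than GET or HEAD is handed to the backend uncached.
    # The return statement leaves vcl_recv and moves on to vcl_pass.
    if (req.request != "GET" && req.request != "HEAD") {
        return (pass);
    }
    # No return here: the default vcl_recv appended below this code
    # decides between lookup, pass and pipe.
}

sub vcl_fetch {
    # Hypothetical policy: keep static assets for one hour.
    if (req.url ~ "\.(css|js|png|gif|jpg)$") {
        set beresp.ttl = 1h;
    }
    # Again no return: the default vcl_fetch finishes the job.
}
```

Leaving out the final return is a deliberate choice here, since it lets the appended defaults handle every case the custom code does not.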
• vcl_init – VCL is loaded, no request yet; VMOD initialization
• vcl_recv – Beginning of request, req is in scope
• vcl_pipe – Client & backend data passed unaltered
• vcl_pass – Request goes to backend and not cached
• vcl_hash – call hash_data to add to the hash
• vcl_hit – called on request found in the cache
• vcl_miss – called on request not found in the cache
• vcl_fetch – called on document retrieved from backend
• vcl_deliver – called prior to delivery of cached object
• vcl_error – called on errors
• vcl_fini – all requests have exited VCL, cleanup of VMOD’s
VCL - Subroutines
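As a small illustration of one of these subroutines, the sketch below uses vcl_deliver to mark responses as cache hits or misses. It assumes Varnish 3.0, where obj.hits is readable in vcl_deliver; the X-Cache header name is only a common convention, not something Varnish defines.

```vcl
sub vcl_deliver {
    # obj.hits counts how many times this object has been served from cache.
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }
}
```

Checking the header with a browser or curl is a quick way to see whether a URL is being cached at all.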
• Always Available
now – epoch time
• Backend Declarations
.host – hostname / IP
.port – port number
• Request Processing
client – ip & identity
server – ip & port
req – request information
• Backend Req Preparation
bereq – backend request
• Retrieved Backend Request
beresp – backend response
• Cached Object
obj – Cached object, can only change .ttl
• Response Preparation
resp – http stuff
VCL - Variables
• hash_data(string) – adds a string to the hash input.
Request host and URL are the defaults from the default VCL.
• regsub(string, regex, sub) – substitution on first occurrence
sub can contain \0 through \n to inject matches from the regex (\0 is the whole match).
• regsuball(string, regex, sub) – substitution on all occurrences
• ban(expression) – Ban all objects in cache that match
• ban_url(regex) – Ban all objects in cache that have a URL match
VCL - Functions
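A rough sketch of these functions in use, assuming Varnish 3.0 syntax. The /old/ and /new/ paths, the X-Device header, the BAN request method and the localhost-only ACL are all hypothetical and only there for illustration.

```vcl
# Hypothetical ACL: only localhost may issue bans.
acl banners {
    "127.0.0.1";
}

sub vcl_recv {
    # regsub: rewrite the first match only; \1 injects the captured group.
    set req.url = regsub(req.url, "^/old/(.*)", "/new/\1");

    # regsuball: rewrite every match; strip all __utm* analytics cookies.
    if (req.http.Cookie) {
        set req.http.Cookie = regsuball(req.http.Cookie, "__utm[a-z]=[^;]+(; )?", "");
    }

    # ban: invalidate every cached object matching the expression.
    if (req.request == "BAN") {
        if (!client.ip ~ banners) {
            error 405 "Not allowed.";
        }
        ban("req.http.host == " + req.http.host + " && req.url == " + req.url);
        error 200 "Ban added";
    }
}

sub vcl_hash {
    # hash_data: add to the hash input; the default vcl_hash appended
    # below still adds the URL and host.
    if (req.http.X-Device) {
        hash_data(req.http.X-Device);
    }
}
```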
• Directors allow you to talk to the backend servers
• Directors are a glorified reverse proxy
Allows for certain types of load balancing
Allows for talking to a cluster
“A director is a logical group of backend servers
clustered together for redundancy. The basic role of
the director is to let Varnish choose a backend server
amongst several so if one is down another can be
used.”
Directors
•Random Director – picks a backend by random number
•Client Director – picks a backend by client identity
•Hash Director – picks a backend by URL hash value
•Round-Robin Director – picks a backend in order
•DNS Director – picks a backend by means of DNS
Random OR Round-Robin
•Fallback – picks the first “healthy” backend
Directors – The Types
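For a feel of what a director declaration looks like, here is a minimal sketch of a weighted random director in Varnish 3.0 syntax. The backend names, addresses and weights are placeholders, not values from the talk.

```vcl
backend web1 {
    .host = "192.0.2.10";
    .port = "8080";
}

backend web2 {
    .host = "192.0.2.11";
    .port = "8080";
}

# web1 receives roughly three times as much traffic as web2.
director pool random {
    { .backend = web1; .weight = 3; }
    { .backend = web2; .weight = 1; }
}

sub vcl_recv {
    # Send every request through the director instead of a single backend.
    set req.backend = pool;
}
```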
•To ensure healthy backends, you need to use probing.
It really sounds like a colonoscopy for servers.
•Variables
.url
.request
.window
.threshold
.initial
.expected_response
.interval
.timeout
Director - Probing
Example VCL Configuration
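A minimal sketch of a director with health probes, assuming Varnish 3.0 syntax. The hosts, ports, probe URL and thresholds below are placeholders chosen for illustration, not the configuration shown in the talk.

```vcl
# A named probe that several backends can share.
probe healthcheck {
    .url = "/";               # what to request
    .interval = 5s;           # how often to poll
    .timeout = 1s;            # how long to wait for an answer
    .window = 5;              # look at the last 5 polls...
    .threshold = 3;           # ...at least 3 must succeed to be healthy
    .expected_response = 200; # HTTP status that counts as success
}

backend app1 {
    .host = "192.0.2.20";
    .port = "8080";
    .probe = healthcheck;
}

backend app2 {
    .host = "192.0.2.21";
    .port = "8080";
    .probe = healthcheck;
}

# Backends that fail the probe are skipped until they recover.
director cluster round-robin {
    { .backend = app1; }
    { .backend = app2; }
}

sub vcl_recv {
    set req.backend = cluster;
}
```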
The Crazy
ESI – Edge-Side Includes
Purging
VMOD
•ESI is a small markup language much like SSI (server side
includes) to include fragments (or dynamic content for that
matter).
• If you don’t think you can use Varnish because you have, say,
user information displayed on every page; think again!
•Think of it as replacing regions inside of a page as if you
were using XHR (AJAX).
•Three Statements are Implemented
esi:include – Include a page
esi:remove – Remove content
<!--esi ... --> - Varnish strips the markers and processes the contents; without ESI it is just a comment
ESI – Edge Side Includes
ESI – By Diagram
• In vcl_fetch, you must set ESI to be on
set beresp.do_esi = true;
By default, ESI will still cache, so add an exclusion if you need it
• if (req.url == "/show_username.php") {
    return (pass);
}
Varnish refuses to parse content for ESI if it does not look like XML
• This is by default; so check varnishstat and varnishlog
Using ESI
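Put together, the two pieces above might look like the following sketch in Varnish 3.0 syntax. It assumes the per-user fragment lives at /user_header.php and that only the front page contains ESI tags; both URLs are hypothetical, and a backend is assumed to be declared elsewhere.

```vcl
sub vcl_recv {
    # The per-user fragment must never come from the cache.
    # ESI sub-requests pass through vcl_recv like any other request.
    if (req.url == "/user_header.php") {
        return (pass);
    }
}

sub vcl_fetch {
    # Only parse pages that are known to contain ESI tags.
    if (req.url == "/") {
        set beresp.do_esi = true;
    }
}
```

Turning ESI on per URL rather than globally keeps Varnish from parsing responses that have nothing to include.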
<html>
  <head><title>Rock it with ESI</title></head>
  <body>
    <header>
      <esi:include src="/user_header.php" />
      <!-- Don't do this as you'd lose the advantage of varnish -->
      <!--esi
      <?php include 'user_header.php'; ?>
      -->
    </header>
    <section id="main"></section>
    <footer></footer>
  </body>
</html>
ESI – By Example
•Sometimes, you just need to purge.
Don’t drink too much tonight, ok?
• If you do… find a DD or a cab.
•The various ways of purging
varnishadm – command line utility
• It’s the ole finger in the back of the throat
Sockets (port 6082) – everyone likes a good socket wrench
• Sure, Ipecac is likely overkill.
HTTP – now that is the sexiness
• See, now we’re not forcing the point!
Purging – Beer anyone?
varnishadm -T 127.0.0.1:6082 purge req.url == "/foo/bar"
telnet localhost 6082
purge req.url == "/foo/bar"
telnet localhost 80
Response:
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
PURGE /foo/bar HTTP/1.0
Host: bacon.org
Purging Examples
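The HTTP variant only works if the VCL handles the PURGE method. A minimal sketch of that handling in Varnish 3.0 syntax follows; the ACL name and the localhost-only restriction are assumptions, so adjust them to wherever your purge requests actually come from.

```vcl
# Hypothetical ACL: who is allowed to purge.
acl purgers {
    "127.0.0.1";
}

sub vcl_recv {
    if (req.request == "PURGE") {
        if (!client.ip ~ purgers) {
            error 405 "Not allowed.";
        }
        # Look the object up so vcl_hit or vcl_miss can purge it.
        return (lookup);
    }
}

sub vcl_hit {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}

sub vcl_miss {
    if (req.request == "PURGE") {
        # Purge on a miss too, so other variants of the object are removed.
        purge;
        error 200 "Purged.";
    }
}
```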
•Distributed Purging… now that’s a punch line!
Use a message queue (or gearman job server)
Have a worker that knows about the varnish servers
Submit the request to clear the cache asynchronously or
synchronously, depending on your use case.
• Have enough workers to make this effective at purging the cache
quickly.
Sexy Purging
•Before getting into VMOD; did you know you can embed C
into the VCL for varnish?
•Want to do something crazy fast or leverage a C library for
pre or post processing?
• I know… you’re thinking that’s useless..
On to the example; and a good one from the Varnish WIKI!
Embedding C in VCL – you must be crazy
C{
    #include <syslog.h>
}C
sub vcl_something {
    C{
        syslog(LOG_INFO, "Something happened at VCL line XX.");
    }C
}
# Example with using varnish variables
C{
    syslog(LOG_ERR,
        "Spurious response from backend: xid %s request %s %s "
        "\"%s\" %d \"%s\" \"%s\"",
        VRT_r_req_xid(sp), VRT_r_req_request(sp),
        VRT_GetHdr(sp, HDR_REQ, "\005host:"), VRT_r_req_url(sp),
        VRT_r_obj_status(sp), VRT_r_obj_response(sp),
        VRT_GetHdr(sp, HDR_OBJ, "\011Location:"));
}C
VCL - Embedded C for syslog – uber sexy
•Taking VCL embedded C to the next level
•Allows you to extend varnish and create new functions
•Now, if you are writing modules for varnish you have a
specialty use case!
Go read up on it!
https://www.varnish-cache.org/docs/trunk/reference/vmod.html
VMOD – Varnish Modules / Extensions
•The VMOD std is shipped with Varnish; it provides some
useful functions
toupper
tolower
set_ip_tos
random
log
syslog
fileread
duration
integer
collect
VMOD - std
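A short sketch of the std VMOD in use, assuming Varnish 3.0 syntax. The log message text is made up for illustration, and a backend is assumed to be declared elsewhere.

```vcl
import std;

sub vcl_recv {
    # Normalize the Host header so "Example.COM" and "example.com"
    # end up as the same cache entry.
    if (req.http.Host) {
        set req.http.Host = std.tolower(req.http.Host);
    }

    # Write a line to the shared memory log, visible in varnishlog.
    std.log("recv: " + req.url);
}
```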
Varnish Command Line Apps
varnish varnishadm varnishhist
varnishlog varnishncsa varnishreplay
varnishsizes varnishstat varnishtest
varnishtop
•What is varnish doing right now?
•How do I debug what is happening?
varnishtop
What is Varnish doing…
•Many times people want to log the requests to a file
By default Varnish only stores these in shared memory.
Apache Style Logs
• varnishncsa -D -a -w log.txt
Logging
•Need to warm up your cache before putting a server in the
queue or load test an environment?
varnishreplay -r log.txt
•Replaying logs can allow you to do this. This is great for
when you are going to be deploying code to check for
performance issues.
Cache Warmup
•How to see your cache hit ratios…
varnishstat
•Want to parse them from XML?
varnishstat -x
Cache Hit Ratios? No Problem
Questions?
These slides will be posted to SlideShare & SpeakerDeck.
Slideshare: http://www.slideshare.net/mwillbanks
SpeakerDeck: http://speakerdeck.com/u/mwillbanks
Twitter: mwillbanks
G+: Mike Willbanks
IRC (freenode): mwillbanks
Blog: http://blog.digitalstruct.com
GitHub: https://github.com/mwillbanks