varnish web accelerator
Varnish
Rahul Ghose
What is it?
● HTTP Proxy
● Key/Value store
● Designed for 64-bit
● VCL → C
Fast!
Architecture
Detailed architecture
● 2 main processes
– Parent is management
– Child is actual caching process
● Workers (one for each connection)
● Thread pool manager (2 is default)
● Startup thread
● Health check
● Acceptor
● Reaper
Getting started
● After installation, Varnish listens on port 6081 by default, not 8080 as the manual states.
● Startup options (the "magic") → /etc/sysconfig/varnish (/etc/default/varnish on Debian)
● Backend config : /etc/varnish/default.vcl
First run
Cache storage
● Hash
– HTTP Host header and the URL
– The options can be changed
– Multiple objects can be mapped to the same key
● Default storage backend in CentOS 6 is 'file'.
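To make the hash key concrete, here is a minimal sketch of a vcl_hash that mirrors, as far as I recall, the built-in default in Varnish 3.x: it hashes the URL, then the Host header, and falls back to the server IP when no Host header is sent. The Varnish version is an assumption (the slides do not state it), and this is an illustration rather than the exact shipped code.

```vcl
sub vcl_hash {
    # The URL is always part of the key.
    hash_data(req.url);

    # Add the Host header, or the server IP when the client sent none.
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }
    return (hash);
}
```

Adding further hash_data() calls (for example on a header) is how "the options can be changed"; every extra input splits the cache into more variants.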
Storage backends
● File
– Single file
– Not persistent across restart
– Mmap
● Malloc
– It does a direct malloc()
– Overhead 1kB per object
● Persistent
– Experimental (no disk space = fail!)
The basic tools of the trade
● varnishd – The actual binary
● varnishstat – Display stats
● varnishlog – Display logs
● varnishtop – Display most used log entries
– varnishtop -i RxURL
● varnishadm – Send command to varnishd
● varnishncsa – Display apache/NCSA style logs
● varnishhist – Histogram display
Logging in varnish
● Done in shared memory
● Overwrites once memory is full
● No physical logs
● Formats available are:
– Apache format (NCSA)
– Custom
– Real-time
The log format
● First column is the request ID
● Second is the tag
● Third is one of: b for backend, c for client, - for misc.
● Then the data.
The varnishstat utility
● First column: Total of the type
● Second: Total per second data
● Third: Average rate since beginning of collection
Some Parameters from CLI
● Thread pools (thread_pools)
● Minimum no. of threads (thread_pool_min)
● Maximum no. of threads (thread_pool_max)
● No. of queued work requests (n_wrk_queued)
● Timeout for idle and extra threads (thread_pool_timeout)
● Wait time if new thread creation failed (thread_pool_fail_delay)
● Timeout waiting for server's response (first_byte_timeout)
● Network latency (connect_timeout)
● Number of deleted cache entries (n_lru_nuked)
Changing startup options
● Add the options as command line parameters in the config file.
DAEMON_OPTS="-a :80 \
             -T localhost:6082 \
             -f /etc/varnish/default.vcl \
             -S /etc/varnish/secret \
             -s malloc,256m \
             -p first_byte_timeout=1s"
● Edit the default vcl file and add the options as:
backend www {
    .host = "127.0.0.1";
    .port = "8080";
    .first_byte_timeout = 1s;
    ...
}
Hot-changes with varnishadm
● Connect to varnishadm
● vcl.load me /etc/varnish/default.vcl
● vcl.use me
● vcl.discard unused ones.
How is it done?
● The vcl is converted to C code
● Code compile variable:
– varnishadm param.show cc_command
● The ".so" created is loaded with dlopen
● The shared libraries can be found at:
– /var/lib/varnish/$(hostname)/
The flow
[Flow diagram. Nodes: Start, vcl_recv, vcl_pipe, vcl_pass, vcl_hash, vcl_hit, vcl_miss, vcl_fetch, vcl_deliver, Move bytes, Done]
Sailing in the vcl
● vcl_recv()
– Called when a request is received
– Data → req
● vcl_fetch()
– Called when the response has been fetched
– Data → req and beresp
– Try alternate backends, trigger ESI
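The two hooks above can be sketched as follows, assuming Varnish 3.x syntax. The backend addresses are placeholders, and the retry-once policy and the Content-Type test are illustrative choices, not something the slides prescribe.

```vcl
# Placeholder backends, used only for illustration.
backend b1 { .host = "192.168.0.1"; }
backend b2 { .host = "192.168.0.3"; }

sub vcl_recv {
    # Only req is available here.
    # First attempt goes to b1; a restarted request goes to b2.
    if (req.restarts == 0) {
        set req.backend = b1;
    } else {
        set req.backend = b2;
    }
}

sub vcl_fetch {
    # Both req and beresp are available here.
    # On a server error, restart once so vcl_recv can pick the alternate backend.
    if (beresp.status >= 500 && req.restarts < 1) {
        return (restart);
    }

    # Trigger ESI processing for HTML responses.
    if (beresp.http.Content-Type ~ "text/html") {
        set beresp.do_esi = true;
    }
}
```

Neither subroutine ends with an explicit return on the normal path, so control falls through to the default VCL logic.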
What to do?
● pass – No caching done
● hit_for_pass – Cache decision to pass
● lookup – Look the object up in the cache
● pipe – Varnish goes blind
● deliver – Deliver cached object
VCL is C
● In-line C code.
– C{
      printf("Hey Jude!\n");
  }C
● The compiled code:
– varnishd -d -f foo.vcl -C
Prints to syslog
VCL objects
● req
– The requested object
● beresp
– Back-end response
● obj
– The cached object
– Only the TTL is writable
Operators
● ==
● =
● ~ (supports regex)
● !
● ||
● &&
Backends
● The real server
● Corresponding vcl:
backend b1 { .host = "192.168.0.1"; }
backend b2 { .host = "192.168.0.3"; }
sub vcl_recv {
    set req.backend = b2;
}
Directors
backend b1 { .host = "192.168.0.1"; }
backend b2 { .host = "192.168.0.3"; }
director b3 random {
    { .backend = b1; .weight = 2; }
    { .backend = b2; .weight = 8; }
}
director b4 round-robin {
    { .backend = b1; }
    { .backend = { .host = "192.168.0.2"; .port = "8080"; } }
}
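A director is selected the same way as a single backend, by assigning it to req.backend. The sketch below assumes Varnish 3.x syntax; the definitions repeat the slide's addresses so the fragment stands alone, and the "static." host name test is a hypothetical routing rule added only for illustration.

```vcl
backend b1 { .host = "192.168.0.1"; }
backend b2 { .host = "192.168.0.3"; }

# Weighted random: roughly 20% of requests to b1, 80% to b2.
director b3 random {
    { .backend = b1; .weight = 2; }
    { .backend = b2; .weight = 8; }
}

# Round-robin: alternate between the two backends.
director b4 round-robin {
    { .backend = b1; }
    { .backend = b2; }
}

sub vcl_recv {
    # Hypothetical rule: static hosts use round-robin, the rest weighted random.
    if (req.http.host ~ "^static\.") {
        set req.backend = b4;
    } else {
        set req.backend = b3;
    }
}
```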
Access Control
acl internal {
    "192.168.1.1";
    "192.168.0.0"/8;
    ! "192.168.0.123";
    include "list_of_ip.txt";   // include inserts the file inline
}
acl bad {
    "209.99.45.119";
}
sub vcl_recv {
    if (client.ip ~ internal) {
        return (pass);
    }
    if (client.ip ~ bad) {
        error 420 "Go to the corner.";
    }
    // Nothing specified, so continue to default vcl_recv()
}
Some HTTP Headers
● Etag
● Cache-control: TTL
● Authorization: pass through
● Hostname (www.a.com, a.com)
● Cookies (does not cache)
● Vary (encoding, different caches)
● User-Agent (different caches)
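Several of these headers split or bypass the cache, so a common remedy is to normalize them in vcl_recv. The following is a sketch assuming Varnish 3.x syntax; the file-extension list and the decision to drop cookies on static files are illustrative assumptions, not recommendations from the slides.

```vcl
sub vcl_recv {
    # Hostname: fold www.a.com and a.com into one cache entry.
    set req.http.host = regsub(req.http.host, "^www\.", "");

    # Cookies: requests with cookies are not cached by default,
    # so drop them where they are unlikely to matter (assumed file types).
    if (req.url ~ "\.(png|gif|jpg|css|js)$") {
        unset req.http.Cookie;
    }

    # Vary on encoding: reduce Accept-Encoding to a few canonical values
    # so that equivalent clients share one cached variant.
    if (req.http.Accept-Encoding) {
        if (req.http.Accept-Encoding ~ "gzip") {
            set req.http.Accept-Encoding = "gzip";
        } elsif (req.http.Accept-Encoding ~ "deflate") {
            set req.http.Accept-Encoding = "deflate";
        } else {
            unset req.http.Accept-Encoding;
        }
    }
}
```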
Purge & Ban
● PURGE
– Removes items from the cache
● BAN
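A minimal sketch of both mechanisms, following the pattern I recall from the Varnish 3.x documentation. The ACL name purgers, the localhost-only restriction, and the use of PURGE and BAN as HTTP request methods are assumptions made for this example.

```vcl
# Hypothetical ACL: only these clients may invalidate content.
acl purgers {
    "127.0.0.1";
}

sub vcl_recv {
    if (req.request == "PURGE" || req.request == "BAN") {
        if (!client.ip ~ purgers) {
            error 405 "Not allowed.";
        }
    }
    if (req.request == "PURGE") {
        # Look the object up so vcl_hit / vcl_miss can purge it.
        return (lookup);
    }
    if (req.request == "BAN") {
        # Add a ban expression matching this host and URL.
        ban("req.http.host == " + req.http.host +
            " && req.url == " + req.url);
        error 200 "Ban added.";
    }
}

sub vcl_hit {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}

sub vcl_miss {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}
```

A purge removes one object and its variants immediately; a ban adds a filter that cached objects are checked against, so it can invalidate many objects with a single expression.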
Grace
● It can serve stale cache data during a grace period
● When does it do that?
– Too many connections pile up
– A back-end is down
● Back-end failure is detected by probes
backend server1 {
    .host = "server1.example.com";
    .probe = {
        .url = "/";
        .interval = 5s;
        .timeout = 1s;
        .window = 5;
        .threshold = 3;
    }
}
● Set both "beresp" grace and "req" grace for serving stale data.
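Setting both grace values could look like the sketch below, assuming Varnish 3.x syntax and a backend with a health probe already defined. The durations of 30 seconds and 1 hour are placeholder values chosen for illustration.

```vcl
sub vcl_recv {
    # req.grace: how stale an object this request is willing to accept.
    if (req.backend.healthy) {
        set req.grace = 30s;
    } else {
        # Backend is sick according to its probe: accept much older objects.
        set req.grace = 1h;
    }
}

sub vcl_fetch {
    # beresp.grace: how long past its TTL the object is kept in the cache.
    set beresp.grace = 1h;
}
```

The object must still be in the cache to be served stale, which is why beresp.grace should be at least as long as the largest req.grace.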
VMOD
● Used to extend the functionality of basic inline C allowed in a vcl.
● vmod.cc
– Generated file to be included with custom source
● Custom locking for shared resources
Some competition
● Squid (separate memory/disk manager, FTP)
● AiCache
● LotServer
● Nginx
● Polipo
References
● NCSA Format - http://publib.boulder.ibm.com/tividd/td/ITWSA/ITWSA_info45/en_US/HTML/guide/c-logs.html#common
● The Varnish book - https://www.varnish-software.com/static/book/
Thanks