varnish web accelerator

Varnish

Rahul Ghose

What is it?

● HTTP proxy
● Key/value store
● Designed for 64-bit
● VCL → C

Fast!

Architecture

Detailed architecture

● 2 main processes
– Parent is the management process
– Child is the actual caching process
  ● Workers (one for each connection)
  ● Thread pool manager (2 pools is the default)
  ● Startup thread
  ● Health check
  ● Acceptor
  ● Reaper

Getting started

● After installation, Varnish listens on port 6081 by default, not 8080 as the manual states.
● Startup options ("magic"): /etc/sysconfig/varnish (/etc/default/varnish on Debian)
● Backend config: /etc/varnish/default.vcl

First run

Cache storage

● Hash
– Computed from the HTTP Host header and the URL
– The hash inputs can be changed
– Multiple objects can be mapped to the same key
● Default storage backend in CentOS 6 is 'file'.
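
The hash inputs named above can be sketched in VCL. This is a minimal sketch modeled on the built-in vcl_hash of Varnish 3.x. The commented-out X-Device line is a hypothetical extra input, shown only to illustrate that the key can be changed.

```vcl
# Varnish 3.x syntax. Mirrors the default behaviour: URL plus Host header.
sub vcl_hash {
    hash_data(req.url);
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        # No Host header: fall back to the address the request arrived on.
        hash_data(server.ip);
    }
    # Hypothetical extra input: one cache entry per device class.
    # hash_data(req.http.X-Device);
    return (hash);
}
```

Every extra hash_data() call splits the cache further, so added inputs are usually normalized to a small set of values first.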

Storage backends

● File
– Single file
– Not persistent across restarts
– Uses mmap
● Malloc
– Does a direct malloc()
– Overhead: 1 kB per object
● Persistent
– Experimental (no disk space = fail!)

The basic tools of the trade

● varnishd – The actual binary
● varnishstat – Display stats
● varnishlog – Display logs
● varnishtop – Display the most frequent log entries
– varnishtop -i RxURL
● varnishadm – Send commands to varnishd
● varnishncsa – Display Apache/NCSA-style logs
● varnishhist – Histogram display

Logging in varnish

● Done in shared memory
● Oldest entries are overwritten once the memory is full
● No physical log files
● Available formats:
– Apache format (NCSA)
– Custom
– Real-time

The log format

● First column is the request ID
● Second is the tag
● Third is the type:
– b for backend
– c for client
– - for misc.
● Then the data

The varnishstat utility

● First column: total for the counter
● Second: current rate per second
● Third: average rate since the beginning of collection

Some Parameters from CLI

● Thread pools (thread_pools)
● Minimum no. of threads (thread_pool_min)
● Maximum no. of threads (thread_pool_max)
● No. of queued work requests (n_wrk_queued, a varnishstat counter)
● Timeout for idle and extra threads (thread_pool_timeout)
● Wait time if new thread creation failed (thread_pool_fail_delay)
● Timeout waiting for the first byte of the backend's response (first_byte_timeout)
● Timeout for connecting to the backend (connect_timeout)
● Number of objects evicted from the cache to make room (n_lru_nuked, a varnishstat counter)

Changing startup options

● Add the options as command-line parameters in the config file:

DAEMON_OPTS="-a :80 \
             -T localhost:6082 \
             -f /etc/varnish/default.vcl \
             -S /etc/varnish/secret \
             -s malloc,256m \
             -p first_byte_timeout=1s"

● Or edit the default VCL file and set the option per backend:

backend www {
    .host = "127.0.0.1";
    .port = "8080";
    .first_byte_timeout = 1s;
    ...
}

Hot-changes with varnishadm

● Connect with varnishadm
● vcl.load me /etc/varnish/default.vcl
● vcl.use me
● vcl.discard the configurations that are no longer used

How is it done?

● The VCL is converted to C code
● The compiler command is held in a parameter:
– varnishadm param.show cc_command
● The resulting ".so" is loaded with dlopen()
● The shared libraries can be found at:
– /var/lib/varnish/$(hostname)/

The flow

[Figure: request flow diagram with the nodes Start, vcl_recv, vcl_pipe, vcl_pass, vcl_hash, vcl_hit, vcl_miss, vcl_fetch, vcl_deliver, Done and "Move bytes"]

Sailing in the vcl

● vcl_recv()
– Called when a request is received
– Data → req
● vcl_fetch()
– Called when the response has been fetched from the backend
– Data → req and beresp
– Can try alternate backends or trigger ESI
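
The two subroutines above might be filled in as follows. This is a minimal sketch in Varnish 3.x syntax. The file-extension pattern, the one-hour TTL and the /index.html path are illustrative assumptions, not values from the slides.

```vcl
sub vcl_recv {
    # Request data lives in "req": drop cookies on static files
    # so that those requests become cacheable.
    if (req.url ~ "\.(png|gif|jpg|css|js)$") {
        unset req.http.Cookie;
    }
    # No return statement: control falls through to the built-in vcl_recv.
}

sub vcl_fetch {
    # The backend response lives in "beresp"; "req" is still readable here.
    if (req.url ~ "\.(png|gif|jpg|css|js)$") {
        unset beresp.http.Set-Cookie;
        set beresp.ttl = 1h;
    }
    # Ask Varnish to process ESI tags in this (hypothetical) page.
    if (req.url == "/index.html") {
        set beresp.do_esi = true;
    }
}
```

Leaving out the return statements keeps the built-in logic in play, which is usually the safer starting point.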

What to do?

● pass – No caching done
● hit_for_pass – Cache the decision to pass
● lookup – Look the object up in the cache
● pipe – Varnish goes blind and only shuffles bytes
● deliver – Deliver the object to the client
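
To show where each of these actions is returned, here is a minimal sketch in Varnish 3.x syntax. The conditions are simplified from the built-in VCL and are meant as an illustration rather than a recommended policy.

```vcl
sub vcl_recv {
    if (req.request == "CONNECT") {
        return (pipe);      # Varnish stops inspecting the traffic
    }
    if (req.request != "GET" && req.request != "HEAD") {
        return (pass);      # go to the backend, do not cache
    }
    # An explicit return skips the built-in vcl_recv checks
    # (for example the ones on Cookie and Authorization).
    return (lookup);        # look the object up in the cache
}

sub vcl_fetch {
    if (beresp.http.Set-Cookie) {
        # Remember for two minutes that this object must be passed.
        set beresp.ttl = 120s;
        return (hit_for_pass);
    }
    return (deliver);       # hand the object on, caching it if the TTL allows
}
```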

VCL is C

● In-line C code
– C{
      printf("Hey Jude!\n");
  }C
– Prints to syslog
● To see the compiled code:
– varnishd -d -f foo.vcl -C

VCL objects

● req
– The requested object
● beresp
– Back-end response
● obj
– The cached object
– Only the TTL is writable
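
One common use of obj is to report whether a response came from the cache. The sketch below uses Varnish 3.x syntax. The resp object (the response sent to the client) and the X-Cache header name are not mentioned on the slide and are used here as illustrative assumptions.

```vcl
sub vcl_deliver {
    # obj.hits counts how many times the cached object has been delivered.
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }
}
```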

Operators

● == (comparison)
● = (assignment)
● ~ (match, supports regex)
● ! (negation)
● || (logical or)
● && (logical and)
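
The operators can be seen together in a short sketch (Varnish 3.x syntax). The URL patterns and the "unknown" placeholder value are hypothetical.

```vcl
sub vcl_recv {
    # == compares, ~ matches a regex, && and || combine conditions.
    if (req.request == "GET" && (req.url ~ "\.css$" || req.url ~ "\.js$")) {
        unset req.http.Cookie;
    }
    # ! negates; = assigns inside a "set" statement.
    if (!req.http.User-Agent) {
        set req.http.User-Agent = "unknown";
    }
}
```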

Backends

● The real server
● Corresponding VCL:

backend b1 { .host = "192.168.0.1"; }
backend b2 { .host = "192.168.0.3"; }

sub vcl_recv {
    set req.backend = b2;
}

Directors

backend b1 { .host = "192.168.0.1"; }
backend b2 { .host = "192.168.0.3"; }

director b3 random {
    { .backend = b1; .weight = 2; }
    { .backend = b2; .weight = 8; }
}

director b4 round-robin {
    { .backend = b1; }
    { .backend = { .host = "192.168.0.2"; .port = "8080"; } }
}

Access Control

acl internal {
    "192.168.1.1";
    "192.168.0.0"/8;
    ! "192.168.0.123";
    include "list_of_ip.txt";   // inserts the file's contents inline
}

acl bad {
    "209.99.45.119";
}

sub vcl_recv {
    if (client.ip ~ internal) {
        return (pass);
    }
    if (client.ip ~ bad) {
        error 420 "Go to the corner.";
    }
    // Nothing specified, so continue to default vcl_recv()
}

Some HTTP Headers

● ETag
● Cache-Control: sets the TTL
● Authorization: passed through
● Host name (www.a.com, a.com)
● Cookies (not cached)
● Vary (encoding, different caches)
● User-Agent (different caches)
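
Because Host and Accept-Encoding variants each produce their own cache entries, they are often normalized in vcl_recv. This is a minimal sketch in Varnish 3.x syntax, assuming that www.a.com is the canonical host name.

```vcl
sub vcl_recv {
    # Treat a.com and www.a.com as one site so they share cache entries.
    if (req.http.host ~ "(?i)^(www\.)?a\.com$") {
        set req.http.host = "www.a.com";
    }

    # Collapse Accept-Encoding to a few canonical values so that
    # "Vary: Accept-Encoding" does not multiply the cached copies.
    if (req.http.Accept-Encoding) {
        if (req.http.Accept-Encoding ~ "gzip") {
            set req.http.Accept-Encoding = "gzip";
        } elsif (req.http.Accept-Encoding ~ "deflate") {
            set req.http.Accept-Encoding = "deflate";
        } else {
            unset req.http.Accept-Encoding;
        }
    }
}
```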

Purge & Ban

● PURGE
– Removes items from the cache

● BAN
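
Both mechanisms can be wired to HTTP methods in VCL. The sketch below follows the pattern commonly shown for Varnish 3.x. The ACL name purgers, its single address and the use of a BAN request method are assumptions made for illustration.

```vcl
acl purgers {
    "127.0.0.1";
}

sub vcl_recv {
    if (req.request == "PURGE") {
        if (!client.ip ~ purgers) {
            error 405 "Not allowed.";
        }
        return (lookup);    # handled in vcl_hit / vcl_miss below
    }
    if (req.request == "BAN") {
        if (!client.ip ~ purgers) {
            error 405 "Not allowed.";
        }
        # Add a ban matching this host and URL to the ban list.
        ban("req.http.host == " + req.http.host +
            " && req.url == " + req.url);
        error 200 "Ban added.";
    }
}

sub vcl_hit {
    if (req.request == "PURGE") {
        purge;              # remove the object and its variants
        error 200 "Purged.";
    }
}

sub vcl_miss {
    if (req.request == "PURGE") {
        purge;
        error 404 "Not in cache.";
    }
}
```

A purge removes the object immediately, while a ban is a filter that cached objects are checked against when they are next looked up.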

Grace

● Varnish can serve stale cache data during a grace period
● When does it do that?
– Too many connections pile up
– A back-end is down
  ● Detected by probes

backend server1 {
    .host = "server1.example.com";
    .probe = {
        .url = "/";
        .interval = 5s;
        .timeout = 1s;
        .window = 5;
        .threshold = 3;
    }
}

● Set both "beresp" grace and "req" grace to serve stale data.
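
Setting the two grace values might look like this in Varnish 3.x syntax. It is a sketch that assumes the request is routed to a backend with a probe, such as server1 above. The 30-second and one-hour values are illustrative.

```vcl
sub vcl_recv {
    if (req.backend.healthy) {
        # Healthy backend: tolerate only slightly stale objects.
        set req.grace = 30s;
    } else {
        # Sick backend: accept objects up to an hour past their TTL.
        set req.grace = 1h;
    }
}

sub vcl_fetch {
    # Keep objects for an hour beyond their TTL so they can be served stale.
    set beresp.grace = 1h;
}
```

beresp.grace decides how long an expired object is kept, and req.grace decides how stale an object a given request will accept, so both need a value for stale delivery to happen.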

VMOD

● Extends VCL beyond what the basic inline C allows.
● vmod.cc
– Generated file to be included with the custom source

● Custom locking for shared resources

Some competition

● Squid (separate memory/disk manager, FTP)
● AiCache
● LotServer
● Nginx
● Polipo

References

● NCSA Format - http://publib.boulder.ibm.com/tividd/td/ITWSA/ITWSA_info45/en_US/HTML/guide/c-logs.html#common

● The Varnish book - https://www.varnish-software.com/static/book/

Thanks
