Varnish Tutorial Part 1: HTTP Caching With Varnish

Varnish is a powerful HTTP cache that has become ubiquitous in environments that require proxy-based HTTP caching. In this tutorial series we'll take a look at setting up Varnish, starting with basic caching and gradually developing more advanced configurations.

Requirements

For this tutorial series you'll need to install Varnish. Steps differ by platform, so read the official documentation to figure out how to do that.[1] You'll also need to be able to run commands from a POSIX shell, like bash or zsh.

HTTP Caching With Varnish

Varnish is typically placed in front of a source HTTP server. It inspects requests it receives, and then either forwards them to the backend source or returns a cached result.

Most Varnish configuration is contained within the default.vcl file. The exact location of this file depends on your particular installation. Personally, I use the official Varnish Docker image, and so it is located at /etc/varnish/default.vcl within the image.[2]

A minimal configuration looks like this:

default.vcl

vcl 4.1;

backend default {
  .host = "beakerstudio-splash.s3-website.us-east-2.amazonaws.com:80";
}

Save your configuration, then restart Varnish.
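How you restart depends on your installation. As a rough sketch for the Docker setup mentioned above, the helpers below start the official image with a local default.vcl mounted read-only and restart it after edits. The function names, the container name varnish-tutorial, and the host port 8080 are assumptions made for illustration; the image itself listens on port 80 inside the container.

```shell
# Start the official Varnish image with a local default.vcl mounted read-only.
# Assumptions: Docker is installed, the VCL file exists on the host, and the
# container name "varnish-tutorial" (hypothetical) is not already in use.
start_varnish() {
  vcl_file="${1:-$PWD/default.vcl}"
  docker run --detach \
    --name varnish-tutorial \
    --publish 8080:80 \
    --volume "$vcl_file:/etc/varnish/default.vcl:ro" \
    varnish
}

# Restart the container so Varnish re-reads default.vcl.
# Restarting also empties the cache.
restart_varnish() {
  docker restart varnish-tutorial
}
```

Call start_varnish once from the directory that holds default.vcl, then restart_varnish after each change to the file.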

There's not a ton going on here, but let's break it down. On the first line we're just specifying the VCL version. VCL stands for Varnish Configuration Language. Then we specify a backend. For this tutorial we'll just be using a splash page I created as the backend.
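Varnish will not start if default.vcl fails to compile, so it can be worth checking the file before a restart. The sketch below relies on varnishd's -C flag, which compiles the VCL given with -f to C and exits. The helper name check_vcl is hypothetical, and the default path assumes the Docker image layout described earlier (in that setup you would run it inside the container, for example via docker exec).

```shell
# Compile-check a VCL file without starting the cache.
# varnishd -C prints the generated C code, which we discard; compiler
# errors still go to stderr so you can read them.
check_vcl() {
  vcl_file="${1:-/etc/varnish/default.vcl}"
  if varnishd -C -f "$vcl_file" >/dev/null; then
    echo "OK: $vcl_file compiles"
  else
    echo "ERROR: $vcl_file failed to compile" >&2
    return 1
  fi
}
```

Note that compiling resolves the backend hostname, so the check needs working DNS.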

Assuming that Varnish is running on your local machine on port 8080 (again, this differs based on installation), we can validate that it is working with curl:

bash

curl -v http://localhost:8080/ -H "Host: beakerstudio-splash.s3-website.us-east-2.amazonaws.com"

The -v flag prints the request and response headers. The -H flag sets the Host header, which is necessary because the Amazon S3 website endpoint uses it to decide which bucket to serve.

The output should look something like this:

*   Trying ::1:8080...
* Connected to localhost (::1) port 8080 (#0)
> GET / HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.77.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Type: text/html
< Server: AmazonS3
< Content-Length: 1881
< X-Varnish: 5 3
< Age: 4
< Via: 1.1 varnish (Varnish/7.0)
< Accept-Ranges: bytes
< Connection: keep-alive
<
<!DOCTYPE html>
<html lang='en'>
...

You should see mostly the same headers as if you queried the splash page directly, but Varnish adds a few of its own, such as X-Varnish, Age, and Via. If the value of the Age header is 0, the response you see did not come from the cache. If the value is a positive number, it is the number of seconds the response has been in the cache. To validate that Varnish is caching, look for this header on subsequent requests.
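If you'd rather not scan the verbose output by eye, the Age check can be scripted. This is a minimal sketch: extract_age and classify_age are hypothetical helper names, and the classification simply follows the rule of thumb above (0 or missing means not cached, a positive value means cached).

```shell
# Print the value of the Age header from raw HTTP response headers on stdin.
# Prints nothing if the header is absent.
extract_age() {
  tr -d '\r' | awk -F': *' 'tolower($1) == "age" { print $2; exit }'
}

# Turn an Age value into a short verdict, following the tutorial's rule:
# a positive Age suggests a cached response, 0 or no value suggests not.
classify_age() {
  if [ "${1:-0}" -gt 0 ] 2>/dev/null; then
    echo "cached (age ${1}s)"
  else
    echo "not cached"
  fi
}

# Usage (assumes Varnish listens on localhost:8080, as in this tutorial):
#   age=$(curl -s -o /dev/null -D - http://localhost:8080/ \
#     -H "Host: beakerstudio-splash.s3-website.us-east-2.amazonaws.com" | extract_age)
#   classify_age "$age"
```

Run the usage lines twice in a row. The second run should report a cached response if Varnish is doing its job.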

Let's modify our Varnish configuration so that the Host header is set automatically. This will also allow us to view the splash page at http://localhost:8080/ in a web browser:

default.vcl

vcl 4.1;

backend default {
  .host = "beakerstudio-splash.s3-website.us-east-2.amazonaws.com:80";
}

sub vcl_recv {
  set req.http.host = "beakerstudio-splash.s3-website.us-east-2.amazonaws.com";
}

Restart Varnish. Now we can omit the Host header from the curl command:

bash

curl -v http://localhost:8080/

Note that restarting Varnish will clear the cache. In the next tutorial we'll look at purging the cache without the downtime caused by restarting the server.
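As an aside, a changed default.vcl can usually be applied without a restart, which leaves the cache intact. Here is a minimal sketch using varnishadm's vcl.load and vcl.use commands. It assumes varnishadm can reach the running instance (with Docker, run it inside the container, for example via docker exec); reload_vcl and the timestamp-based label are hypothetical choices made for illustration.

```shell
# Load a VCL file under a fresh label and make it the active configuration,
# without restarting varnishd. Each loaded VCL needs a unique label, so we
# derive one from the current time.
reload_vcl() {
  vcl_file="${1:-/etc/varnish/default.vcl}"
  vcl_name="reload_$(date +%Y%m%d_%H%M%S)"
  varnishadm vcl.load "$vcl_name" "$vcl_file" &&
    varnishadm vcl.use "$vcl_name"
}
```

If vcl.load fails to compile the file, vcl.use is skipped and the previous configuration stays active.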
