Caddy Behind Cloudflare - Get Your Logs Right

Published: 17 Jul, 2018

Running a Caddy server on your origin? Here's how to configure your log format to capture all the interesting fields.

This post talks about Caddy, an HTTP server that’s easy to get up and running, is lightweight and has a module for exposing metrics in the native Prometheus format, so we like it a lot. In this case we’re using Caddy to host a small static gallery generated by Lightroom, an image post-processing suite.

Caddy runs in a FreeBSD jail (OS-level virtualisation), which is hosted on a fairly powerful physical machine. This machine hosts a bunch of different jails and also serves as a NAS.

The host has a nullfs mount into the Caddy jail, so the images appear to be on the local file system. This makes the photographer’s workflow easy - an exported gallery is automatically present on the webserver, and therefore on the public Internet, without any additional action. Great, everyone can now browse the gallery and it all just works.

However, since the audience for the gallery is distributed across the world, I also set up the Cloudflare CDN in front of the site. The idea is to cache content on the CDN so that it’s closer to the visitors, no matter where they are located geographically.

Now, if you have ever run an HTTP server behind a proxy, you know that the HTTP client making requests to your webserver is the proxy and not the end user. This is also reflected in your webserver logs, where suddenly all requests appear to have been made by the proxy.

Squid cache, one of the early open source HTTP proxies, introduced a new HTTP request header to address this problem: X-Forwarded-For. The format of the header is simple. It is intended to preserve the IP address of not only the client but also of any intermediate proxies, each of which appends its address to the list:

X-Forwarded-For: client, proxy1, proxy2

(Actually, there’s a newer standard header aimed at replacing X-Forwarded-For. It is called Forwarded and is described in RFC 7239, but Caddy doesn’t support it yet, so we won’t talk about it any more in this post.)
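To make the header format concrete, here is a minimal Python sketch that splits an X-Forwarded-For value into the client and the list of proxies. The function name and the sample addresses (taken from the documentation ranges) are made up for illustration. Keep in mind that the header is supplied by whoever sends the request, so the values are only as trustworthy as the proxies that wrote them.

```python
def split_forwarded_for(header_value):
    """Split an X-Forwarded-For value into (client, proxies)."""
    # Entries are comma-separated; surrounding whitespace is not significant.
    hops = [hop.strip() for hop in header_value.split(",") if hop.strip()]
    if not hops:
        return None, []
    # The left-most entry is the original client; the rest are the proxies,
    # in the order the request passed through them.
    return hops[0], hops[1:]


# Hypothetical header value: one client followed by two proxies.
client, proxies = split_forwarded_for("203.0.113.7, 198.51.100.10, 198.51.100.20")
print(client)   # 203.0.113.7
print(proxies)  # ['198.51.100.10', '198.51.100.20']
```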

So, back to the problem at hand. We can simply update our log format to use the address in X-Forwarded-For instead of the client IP. An alternative solution available with Caddy is to use the third-party middleware http.realip, which, when enabled, restores the original client IP.

Make sure your Caddy is built with this extension and then enable it with a single line in the Caddyfile:

realip cloudflare

With this in place, the value of the {remote} placeholder in the logs is replaced with the value of X-Forwarded-For, and the problem is solved.
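The general idea behind this kind of middleware can be sketched in a few lines of Python. This is not the plugin's actual code, just an illustration of the technique under stated assumptions: the function name is hypothetical, and the trusted range below is a placeholder from the documentation address space rather than a real Cloudflare range. The key point is that the header is only honoured when the connection really comes from a trusted proxy.

```python
import ipaddress

# Placeholder range standing in for the proxy's published address list.
TRUSTED_PROXIES = [ipaddress.ip_network("198.51.100.0/24")]


def effective_client_ip(peer_ip, forwarded_for):
    """Return the address that should be logged as the client."""
    peer = ipaddress.ip_address(peer_ip)
    # Only honour the header when the TCP peer is a proxy we trust;
    # otherwise anyone could forge their address by sending the header.
    if not any(peer in net for net in TRUSTED_PROXIES):
        return peer_ip
    if not forwarded_for:
        return peer_ip
    # Take the left-most entry, which the header format reserves for the client.
    candidate = forwarded_for.split(",")[0].strip()
    try:
        ipaddress.ip_address(candidate)
    except ValueError:
        return peer_ip  # malformed header, fall back to the peer address
    return candidate


print(effective_client_ip("198.51.100.5", "203.0.113.7"))  # 203.0.113.7
print(effective_client_ip("192.0.2.1", "203.0.113.7"))     # 192.0.2.1
```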

This is all well and good, but I didn’t like losing the Cloudflare proxy addresses from the logs. The Cloudflare IPs carry useful information: they show which Cloudflare edge locations are connecting to Caddy, and they let me measure latency back to those locations when I feel like running some probes.

So what can we do to have both? Build a custom log format of course! Here’s what we came up with, which also includes the request headers that Cloudflare sends us, like Cf-Ipcountry and Cf-Visitor:

log / access.log "{>X-Forwarded-For} - {user} [{when}] \"{method} {uri} {proto}\"
    {status} {size} \"{>Referer}\" \"{>User-Agent}\" \"{tls_version}\"
    \"{tls_cipher}\" \"{>Cf-Ipcountry}\" \"{>Cf-Ray}\" \"{>Cf-Visitor}\"
    \"{>X-Forwarded-Proto}\" {remote}"

Note that this needs to be specified as a single line. I’ve split it into several lines only to make it more readable.

Put that in your Caddyfile, reload Caddy with killall -SIGUSR1 caddy to make it pick up the changes, and you’re ready to go.
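Once the new format is in use, the extra fields are easy to pull back out. The following Python sketch parses one line of the custom format with a regular expression. It rests on a few assumptions: the sample line is invented, the group names are my own, and header values are assumed to be written verbatim, so the JSON quotes inside Cf-Visitor appear unescaped. A User-Agent or Referer that itself contains a double quote would not match this simple pattern.

```python
import re

# One named group per field, in the same order as the log format.
LOG_LINE = re.compile(
    r'(?P<forwarded_for>.*?) - (?P<user>\S+) \[(?P<when>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)" '
    r'"(?P<tls_version>[^"]*)" "(?P<tls_cipher>[^"]*)" '
    r'"(?P<country>[^"]*)" "(?P<ray>[^"]*)" '
    # Cf-Visitor is JSON, so it may contain quotes; match it lazily.
    r'"(?P<visitor>.*?)" "(?P<forwarded_proto>[^"]*)" '
    r'(?P<remote>\S+)$'
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


# Invented sample line, for illustration only.
SAMPLE = (
    '203.0.113.7 - - [17/Jul/2018:10:00:00 +0000] '
    '"GET /gallery/index.html HTTP/1.1" 200 5120 "-" "ExampleBrowser/1.0" '
    '"tls1.2" "ECDHE-RSA-AES128-GCM-SHA256" "NZ" "0000000000000000-AKL" '
    '"{"scheme":"https"}" "https" 198.51.100.5'
)

entry = parse_line(SAMPLE)
print(entry["forwarded_for"])  # 203.0.113.7 (the visitor)
print(entry["remote"])         # 198.51.100.5 (the connecting proxy)
```

With both addresses in the same record, the visitor and the connecting edge can be analysed side by side.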

At this point you might be thinking that the X-Forwarded-For header can contain multiple addresses if the request passed through several proxies (e.g. I could be running a Varnish caching server in front of Caddy). If that were the case and we cared (for example, because our log processing pipeline expects each log line to start with a single IP address), we could configure Varnish to preserve the original connecting IP in a separate, custom header and use that in our Caddy logs. Implementation of this, as always, is left as an exercise for the reader.