Varnish

Thu Jul 22 15:36:20 UTC 2010

So we've finally hit that tipping point in mod_cache where it's not quite
behaving correctly.  So I've been looking at alternatives.  For those not
familiar with the current setup (in order of processes) it goes:

httpd(proxy) -> haproxy(proxy) -> httpd(app)

The first two processes both run on the proxy servers; haproxy is our
load balancer, which sends each request on to the app httpd.

I've been looking for a better proxy solution.  I initially pushed back
against varnish because it would complicate the environment, and it will,
but since apache isn't cutting it I figure a slow, incremental change is
the best approach.  So what I'm proposing is this:

httpd(proxy) -> varnish(proxy) -> haproxy(proxy) -> httpd(app)

There are a few reasons why I'm choosing this design, especially since, in
theory, varnish could completely replace both httpd and haproxy in that
picture.

First, trying to be incremental allows this to be a very non-intrusive
change.  We're literally installing the varnish package, deploying a
single varnish config file and altering port settings in the httpd
configs.  This will be easy to revert and troubleshoot.

Second, replacing haproxy.  Varnish's load balancing is pretty primitive
right now.  It can do health checks, but only at the host level.  This
means we'd have to create a check definition for every host * every
service, which is pretty nasty.
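
To get a feel for how fast "every host * every service" grows, here is a
rough Python sketch.  The host and service names are made up for
illustration, and the function only counts definitions; it doesn't
generate real varnish config.

```python
from itertools import product


def probe_definitions(hosts, services):
    """Return one (hypothetical) check name per host/service pair."""
    return [f"{host}_{service}" for host, service in product(hosts, services)]


# Hypothetical inventory, not our real one.
hosts = ["app1", "app2", "app3"]
services = ["wiki", "smolt"]

definitions = probe_definitions(hosts, services)
# The count is always len(hosts) * len(services), so adding one service
# means adding one more definition for every single host.
```

With haproxy in the picture, varnish only needs to know about haproxy and
the per-service checks stay where they already are.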

Third, replacing httpd is also a bit complex.  We use a lot of features in
apache to do things like redirects, compression, etagging, static file
serving (like fedoraproject.org), and the big one is ssl.

So anyway, varnish's caching abilities are FAR superior to httpd's, and
not just in terms of speed.  The people who have suggested this in the
past (warren and daMaestro come to mind) were right, it can do a lot.  So
for interested parties: go over the tech docs, let's learn it and find out
what features we may want.

For now, though, I'm working on getting it into puppet in staging.  We can
run it there for a little while and then move it to production.  The nice
thing here is that we can very slowly integrate things, like starting with
smolt or the wiki, then add others since it's just a port change.  Haproxy
listens on a different port for each farm, but varnish only listens on
one.  Since all of our applications are in their own namespace (/wiki vs
/smolt-wiki for example) it makes this transition smooth and easy.  Hurray
for good architecture!
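
The one-port-in, many-ports-out idea can be sketched in a few lines of
Python.  This is only an illustration of routing by path namespace; the
port numbers are invented, and in practice the mapping would live in the
varnish config rather than in a script.

```python
# Hypothetical mapping of URL namespace -> haproxy farm port.
FARM_PORTS = {
    "/wiki": 10001,
    "/smolt-wiki": 10002,
}


def pick_farm_port(path, farm_ports=FARM_PORTS):
    """Return the haproxy port for a request path, or None if no farm matches."""
    # Match only on whole path segments, so "/wikipedia" does not hit "/wiki".
    matches = [
        prefix for prefix in farm_ports
        if path == prefix or path.startswith(prefix + "/")
    ]
    if not matches:
        return None
    # Prefer the longest matching namespace in case prefixes ever nest.
    return farm_ports[max(matches, key=len)]
```

Moving one more application behind varnish is then just one more entry in
that mapping, which is what makes the slow rollout practical.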

So, questions, comments, concerns?  Anyone think this is a bad idea?
Speak up!

	-Mike