In theory, once all of the caching stuff is fixed, the backends will spend most of their time revalidating objects.
But for some weird reason I'm seeing TCP_REFRESH_MISS on my Lusca edge nodes and generally poor performance during this release. I look at the logs and find this:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:18.104.22.168) Gecko/2009042316 Firefox/3.0.10\r\n
If-Modified-Since: Wed, 03 Jun 2009 15:09:39 GMT\r\n
[HTTP/1.0 200 OK\r\n
Last-Modified: Wed, 03 Jun 2009 15:09:39 GMT\r\n
Date: Fri, 12 Jun 2009 04:25:40 GMT\r\n
X-Cache: MISS from mirror1.jp.cacheboy.net\r\n
Via: 1.0 mirror1.jp.cacheboy.net:80 (Lusca/LUSCA_HEAD)\r\n
Notice the different ETags? Hm! I wonder whats going on. On a hunch I checked the Etags from both backends. master1 for that object gives "1721454571"; master2 gives "1687308715". They both have the same size and same timestamp. I wonder what is different?
Time to go digging into the depths of the lighttpd code.
EDIT: the etag generation is configurable. By default it uses the mtime, inode and filesize. Disabling inode and inode/mtime didn't help. I then found that earlier lighttpd versions have different etag generation behaviour based on 32 or 64 bit platforms. I'll build a local lighttpd package and see if I can replicate the behaviour on my 32/64 bit systems. Grr.
Meanwhile, Cacheboy isn't really serving any of the mozilla updates. :(
EDIT: so it turns out the bug is in the ETag generation code. They create an unsigned 32-bit integer hash value from the etag contents, then shovel it into a signed long for the ETag header. Unfortunately for FreeBSD-i386, "long" is a signed 32 bit type, and thus things go airy from time to time. Grrrrrr.
EDIT: fixed in a newly-built local lighttpd package; both backend servers are now doing the right thing. I'm going back to serving content.