Sunday, March 28, 2010

Today's fun bug: invalid swap metadata

One of the Lusca users has issues with swap file contents being invalid. It's linked to the length of the object URL - and fixing this is unfortunately more difficult than first thought.

In essence - the size of the URL, the size of the metadata and the size of the buffers used for reading data don't line up. The TLV encoding code doesn't put an upper limit on the size of the metadata. The TLV decoding code doesn't enforce a maximum buffer size - it keeps reading the metadata until it finds the end of said metadata. All of this unfortunately results in stupid behaviour when overly long URLs are stored.

The current maximum URL length is MAX_URL, which is 4096 bytes. Squid-3 may have redefined this to be longer. The reason I haven't done this is that the URL is included in the swap metadata - and this is only read in SM_PAGE_SIZE chunks - again, 4096 bytes. So if the URL is, say, 4000 or so bytes long, the total length of the encoded metadata is > 4096 bytes. This is happily written out to the swapfile.
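To make the arithmetic concrete, here's a rough sketch of a write-side size check. The layout constants (META_HDR_SIZE, TLV_HDR_SIZE) and the function names are made up for illustration and are not the actual Lusca definitions; only SM_PAGE_SIZE and its value come from the above.

```c
#include <stddef.h>

#define SM_PAGE_SIZE 4096   /* read chunk size, as described above */

/* Hypothetical layout, for illustration only: a 1-byte marker plus a
 * 4-byte total length up front, then entries made of a 1-byte type,
 * a 4-byte value length and the value itself. */
#define META_HDR_SIZE 5
#define TLV_HDR_SIZE  5

/* Encoded metadata size for a URL of url_len bytes (stored with its
 * trailing NUL), plus n_other further entries whose values add up to
 * other_bytes. */
static size_t
swap_meta_size(size_t url_len, size_t n_other, size_t other_bytes)
{
    return META_HDR_SIZE
        + n_other * TLV_HDR_SIZE + other_bytes
        + TLV_HDR_SIZE + url_len + 1;
}

/* Non-zero if the whole metadata fits in a single SM_PAGE_SIZE read. */
static int
swap_meta_fits(size_t url_len, size_t n_other, size_t other_bytes)
{
    return swap_meta_size(url_len, n_other, other_bytes) <= SM_PAGE_SIZE;
}
```

With a check like this the writer could refuse (or at least log) an object whose metadata won't fit in one page, rather than writing something that can't be read back.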

Reading the data back in, however, is another story. An SM_PAGE_SIZE sized buffer is created and a read is issued, so only the first part of the metadata is read in. There's unfortunately no check to ensure the passed in buffer actually contains all of the metadata - so the code happily tramples on into potentially uninitialised memory. The end result is at the very least an invalid object which is eventually deleted; it could be worse, but I haven't yet investigated.
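The missing read-side check could look something like the sketch below. Again, this is an assumption-laden illustration rather than the real decoder: the header layout, the host-byte-order lengths and the name swap_meta_check are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define META_HDR_SIZE 5   /* hypothetical: 1-byte marker + 4-byte total length */
#define TLV_HDR_SIZE  5   /* hypothetical: 1-byte type + 4-byte value length */

/* Read a 32-bit length stored in host byte order (an assumption). */
static uint32_t
get_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/*
 * Check that buf[0..nread) holds the complete metadata and that every
 * TLV entry lies inside it. Returns the number of entries, or -1 if
 * the metadata is truncated or malformed.
 */
static int
swap_meta_check(const unsigned char *buf, size_t nread)
{
    size_t total, off = META_HDR_SIZE;
    int count = 0;

    if (nread < META_HDR_SIZE)
        return -1;
    total = get_u32(buf + 1);
    if (total < META_HDR_SIZE || total > nread)
        return -1;              /* metadata runs past what was read */

    while (off < total) {
        size_t vlen;

        if (total - off < TLV_HDR_SIZE)
            return -1;          /* entry header is cut off */
        vlen = get_u32(buf + off + 1);
        if (vlen > total - off - TLV_HDR_SIZE)
            return -1;          /* entry value is cut off */
        off += TLV_HDR_SIZE + vlen;
        count++;
    }
    return count;
}
```

The point is simply that nothing is dereferenced until it's known to sit inside the bytes actually read; a caller seeing -1 can drop the object straight away instead of parsing garbage.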

In any case - I'm now going to have to somehow enforce some sensible behaviour. I'd much prefer to make the code handle arbitrarily long URLs (ie, up to a configurable limit) and read in the metadata as needed - but that's a bigger issue which will take some further refactoring and redesigning to solve.

This is all another example of how code works "by magic" rather than "by intent". :)
