My long-term goal is to finally separate all of this mess out so there are "generic" routines to act as an HTTP client and server, create requests and replies, and parse responses. But for now, tidying up some of the messy code to improve performance (and thus give people motivation to migrate their busy sites to Cacheboy) is on my short-term TODO list.
I spent some time about 18 months ago tidying up all of the client-side code so that parsing the request line and request headers didn't require half a dozen copies of various things just to complete. That was quite successful. The code structure is still horrible, but it works, and for now that is absolutely the most important part.
Now I'm doing something similar to the server-side code. The HTTP server code (src/http.c) combines reply buffer appending, parsing, 100-continue response handling (well, "handling") and the various header checks for caching and connection in one enormous puddle of code. I'm trying to tease these apart so each part is done separately and the reply data isn't copied twice - once into the reply buffer, then again via storeAppend() into the memory store.
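To make the double copy concrete, here is a minimal sketch of the two data paths. It is not the code in src/http.c: `mem_store_t`, `store_append()`, `reply_double_copy()`, `reply_single_copy()` and the `bytes_copied` counter are all hypothetical names invented for illustration, and `store_append()` only loosely stands in for storeAppend().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the memory store; not the real structure. */
typedef struct {
    char data[4096];
    size_t len;
} mem_store_t;

/* Running total of bytes pushed through memcpy(), for comparison only. */
static size_t bytes_copied = 0;

static void counted_copy(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
    bytes_copied += n;
}

/* Hypothetical analogue of storeAppend(): copies the data into the store. */
static void store_append(mem_store_t *st, const char *buf, size_t n)
{
    assert(n <= sizeof(st->data) - st->len);
    counted_copy(st->data + st->len, buf, n);
    st->len += n;
}

/* Existing shape: network data -> reply buffer -> memory store (two copies). */
static void reply_double_copy(mem_store_t *st, const char *net, size_t n)
{
    char reply_buf[4096];

    assert(n <= sizeof(reply_buf));
    counted_copy(reply_buf, net, n);   /* copy 1: into the reply buffer */
    /* ... header parsing would look at reply_buf here ... */
    store_append(st, reply_buf, n);    /* copy 2: into the memory store */
}

/* Tidied shape: parse the data where it already is, then copy it once. */
static void reply_single_copy(mem_store_t *st, const char *net, size_t n)
{
    /* ... header parsing would look at net in place here ... */
    store_append(st, net, n);          /* the only copy */
}
```

Feeding the same reply through both functions leaves identical data in the store; the only difference is that the first path moves every byte through memcpy() twice.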
The CPU time spent doing this copying isn't all that high on current systems but it is definitely noticeable (~30% of all CPU time spent in memcpy()) on slower systems talking to LAN-connected servers. So I'm going to do it - primarily to fix performance on slower hardware, but it also forces me to tidy up the existing code somewhat.
The next step is avoiding the copy into the memory store entirely, removing another 65% or so of memcpy() CPU time.
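One way that could look, purely as an assumption on my part about the general technique rather than a description of any planned Cacheboy code: the store hands out a pointer to its own free space, the network read lands there directly, and the store then just accounts for the new bytes. Every name below (`mem_store_t`, `store_reserve()`, `store_commit()`, `fake_read()`) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the memory store; not the real structure. */
typedef struct {
    char data[4096];
    size_t len;
} mem_store_t;

/* Return a pointer to the store's free space and report how much is left. */
static char *store_reserve(mem_store_t *st, size_t *avail)
{
    *avail = sizeof(st->data) - st->len;
    return st->data + st->len;
}

/* Mark n bytes, already written into the reserved space, as stored. */
static void store_commit(mem_store_t *st, size_t n)
{
    assert(n <= sizeof(st->data) - st->len);
    st->len += n;
}

/*
 * Hypothetical stand-in for read(2) on a socket. The memcpy() here models
 * the kernel delivering data into the caller's buffer; it is not an extra
 * user-space copy.
 */
static size_t fake_read(const char *src, size_t src_len,
                        char *dst, size_t dst_len)
{
    size_t n = src_len < dst_len ? src_len : dst_len;

    memcpy(dst, src, n);
    return n;
}
```

A caller would reserve space, read straight into it, and commit however many bytes arrived, so the reply body never passes through an intermediate buffer on its way into the store.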