Thursday, June 5, 2008

More reorganisation..

I've moved cbdata, mempools, fd and legacy disk (file_*) routines out of src/. I also shuffled the comm related -definitions- out but not the code. I hit a big of a snag - the comm code path used for connecting a socket to a remote site actually uses the DNS code. Fixing this will involve divorcing the DNS lookup stuff so sockets can be connected to a remote IP directly - and the DNS lookup path will just be in another module.

This however is more intrusive than "code reorganisation" so its going to have to wait a while. Unfortunately, this means that my grand plans for 1.1 will have to be put on hold a little until I've thought this out a little more and implemented it in a seperate branch.

Thus, things will change a little. 1.1 will be released shortly, with the current set of changes included. I'll then concentrate on planning out the next set of changes required to properly divorce the core event/disk code from src/.

Why do this? Well, the biggest reason is to be able to build "other" bits of code which reuse the Squid core. I can write unit tests for a lot of stuff, sure, but it also means I can write simple network and disk applications which reuse the Squid core and find exactly how hard I can push them. I can also break off a branch and hack up the code to see what impact changes make without worrying that said changes expose strange stuff in the rest of the Squid codebase.

The four main things that I'd like to finally sort out are:

  • IPv6 socket support - support v4/v6 in the base core, and make sure that it works properly
  • Sort out the messy disk related code and reintegrate async IO as a top-level disk producer (like it was in Squid-2.2 and it almost is in Squid-3) so it can be again used for things like logfile writing!
  • Begin looking at scatter/gather disk and network IO - gather disk IO should work out great for writing logfile buffers and objects to disk, for example
  • Design a parallelism model which allows multiple threads to cooperate on tasks - worker threads implementing callback type stuff for some work; entire seperate network event threads (look at memcached as an example.) "Squid" as it stands will simply run as one thread, but some CPU intensive stuff can be pushed into worker threads for some cheap parallelism gains (as on the roadmap, ACLs and rewriting/content manipulation are two easy targets.)
So there's a lot of work to do, and not a lot of time to do it in.

No comments:

Post a Comment