Showing posts with label cacheboy. Show all posts

Monday, July 12, 2010

More IPv6 hackery..

I've been spending a little time fleshing out some more of the IPv6 preparation work in Lusca. Since they're rather intrusive patches, I'm doing the work in a separate branch (/playpen/LUSCA_HEAD_ipv6) and will merge back bits and pieces as needed.

I've migrated the client db, external URL rewriters, access logging and the client-facing connection management code over to be IPv6 aware. I still have the request_t state, ACL lookups (which are luckily done - but sitting in a branch!), further DNS work and the protocol-facing stuff (HTTP, FTP) to do.

There isn't much more work involved in getting LUSCA_HEAD ready enough to serve IPv6-facing clients. That'll let me push out Cacheboy content over IPv6.

But for now, it's back to hacking on commercial, customer code.

Tuesday, June 22, 2010

Cacheboy: Firefox Release 3.6.4

Cacheboy is currently pushing a good 800-1200mbit of firefox 3.6.4 updates. It has about 6% of the total mozilla mirror weight so I predict that there's currently around 16 gigabits of total mozilla updates going out.
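The back-of-envelope estimate above can be written out as a few lines of Python. This is only a sketch: it assumes traffic is split across mirrors in proportion to their weight, and the function name is made up for illustration.

```python
def estimate_total_mbit(observed_mbit, weight_share):
    """Scale one mirror's observed traffic up to the whole mirror pool.

    Assumes traffic is distributed in proportion to mirror weight, which
    is an assumption of this sketch and not something measured.
    """
    if not 0 < weight_share <= 1:
        raise ValueError("weight_share must be in (0, 1]")
    return observed_mbit / weight_share


if __name__ == "__main__":
    # 800-1200 mbit observed at roughly 6% of the total mirror weight.
    for observed in (800, 1000, 1200):
        total_gbit = estimate_total_mbit(observed, 0.06) / 1000
        print(f"{observed} mbit observed -> ~{total_gbit:.1f} gbit total")
    # Prints roughly 13.3, 16.7 and 20.0 gbit respectively.
```

So "around 16 gigabits" corresponds to the middle of the observed range; the low and high ends give roughly 13 and 20 gigabits.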

Lusca is holding up fine - a single process happily pushed ~ 400mbit on some hosts during the initial peak. I'd obviously like to be able to push a lot more than that but I'm still doing baby steps in the Lusca performance department.

Tuesday, November 10, 2009

More issues with Lighttpd

So occasionally Lighttpd on FreeBSD-7.x+ZFS gets all upset. I -think- there's something weird going on where I hit mbuf exhaustion somehow when ZFS starts taking a long time to complete IO requests; then all socket IO fails in Lighttpd until it is restarted.

More investigation is required. Well, more statistics are needed so I can make better judgements. Well, actually, more functional backends are needed so I can take one out of production when something like this occurs, properly debug what is going on and try to fix it.

Cacheboy Update / October/November 2009

Howdy,

Just a few updates this time around!
  • Cacheboy was pushing around 800-1200mbit during the Firefox 3.5.4 release cycle. I started to hit issues with the backend server not keeping up with revalidating requests and so I'll have to improve the edge caching logic a little more.
  • Lusca seems quite happy serving up 300-400mbit from a single node though; which is a big plus.
  • I've found some quite horrible memory leaks in Quagga on only one of the edge nodes. I'll have to find some time to log in and debug this a little more.
  • The second backend server is now officially toast. I need to acquire another 1ru server with 2 SATA slots to magically appear in downtown Manhattan, NY.

Thursday, October 8, 2009

Cacheboy downtime - hardware failures

Howdy,

I've had both backend servers fail today. One is throwing undervolt errors on one PSU line and is having disk issues (most likely related to the undervoltage); the other has just crashed.

I'm waiting for remote hands to prod the other box into life.

This is why I'd like some more donated equipment and hosting - I can make things much more fault tolerant. Hint hint.

Monday, September 21, 2009

My current wishlist

I'm going to put this on the website at some point, but I'm currently chasing a few things for Cacheboy:

  • More US nodes. I'll take anything from 50mbit to 5gbit at this point. I need more US nodes to be able to handle enough aggregate traffic to make optimising the CDN content selection methods worthwhile.
  • Some donations to cover my upcoming APNIC membership for ASN and IPv4/IPv6 space. This will run to about AUD $3500 this year and then around AUD $2500 a year after that.
  • Some 1ru/2ru server hardware in the San Francisco area.
  • Another site or two willing to run a relatively low bandwidth "master" mirror site. I have one site in New York but I'd prefer to run a couple of others spread around Europe and the United States.
I'm sure more will come to mind as I build things out a little more.

New project - sugar labs!

I've just put the finishing touches on the basic sugar labs software repository. I'll hopefully be serving part or all of their software downloads shortly.

Sugar is the software behind the OLPC environment. It works on normal intel based PCs as far as I can tell. More information can be found at http://www.sugarlabs.org/

Monday, August 31, 2009

Cacheboy presentation at AUSNOG

I've just presented on Cacheboy at AUSNOG in Sydney. The feedback so far has been reasonably positive.

There's more information available at http://www.creative.net.au/talks/.

Monday, August 17, 2009

Cacheboy status update

So by and large, the pushing of bits is working quite well. I have a bunch of things to tidy up and a DNS backend to rewrite in C or C++ but that won't stop the bits from being pushed.

Unfortunately what I'm now lacking is US hosts to send traffic from. I still have more European and Asian connectivity than North American - and North America is absolutely where I need connectivity the most. Right now I'm only able to push 350-450 megabits of content from North America - and this puts a big, big limit on how much content I can serve overall.

Please contact me as soon as possible if you're interested in hosting a node in North America. I ideally need enough nodes to push between a gigabit and ten gigabits of traffic.

I will be able to start pushing noticeable amounts of content out of regional areas once I've sorted out North America. This includes places like Australia, Africa, South America and Eastern Europe. I'd love to be pushing more open source bits out of those locations to keep the transit use low but I just can't do so at the moment.

Canada node online and pushing bits!

The Canada/TORIX node is online thanks to John Nistor at prioritycolo in Toronto, Canada.

Thanks John!

Cacheboy is on WAIX!

Yesterday's traffic from mirror1.au into WAIX:
ASN          MBytes  Requests  % of overall  AS name
AS7545     17946.77      7437         29.85  TPG-INTERNET-AP TPG Internet Pty Ltd
AS4802     12973.47      4476         21.58  ASN-IINET iiNet Limited
AS4739      8497.92      2947         14.13  CIX-ADELAIDE-AS Internode Systems Pty Ltd
AS9543      2524.57      1241          4.20  WESTNET-AS-AP Westnet Internet Services
AS4854      2097.32       941          3.49  NETSPACE-AS-AP Netspace Online Systems
AS17746     1881.17      1050          3.13  ORCONINTERNET-NZ-AP Orcon Internet
AS9822      1425.44       456          2.37  AMNET-AU-AP Amnet IT Services Pty Ltd
AS17435     1161.01       411          1.93  WXC-AS-NZ WorldxChange Communications LTD
AS9443      1140.62       701          1.90  INTERNETPRIMUS-AS-AP Primus Telecommunications
AS7657       891.93      1187          1.48  VODAFONE-NZ-NGN-AS Vodafone NZ Ltd.
AS7718       740.74       272          1.23  TRANSACT-SDN-AS TransACT IP Service Provider
AS7543       732.11       423          1.22  PI-AU Pacific Internet (Australia) Pty Ltd
AS24313      527.38       252          0.88  NSW-DET-AS NSW Department of Education and Training
AS9790       436.80       389          0.73  CALLPLUS-NZ-AP CallPlus Services Limited
AS17412      365.13       228          0.61  WOOSHWIRELESSNZ Woosh Wireless
AS17486      349.27       116          0.58  SWIFTEL1-AP People Telecom Pty. Ltd.
AS17808      311.65       248          0.52  VODAFONE-NZ-AP AS number for Vodafone NZ IP Networks
AS24093      303.40       114          0.50  BIGAIR-AP BIGAIR. Multihoming ASN
AS9889       288.85       197          0.48  MAXNET-NZ-AP Auckland
AS17705      282.49        84          0.47  INSPIRENET-AS-AP InSPire Net Ltd

Query content served: 54878.07 mbytes; 23170 requests.
Total content served: 60123.25 mbytes; 28037 requests.

BGP aware DNS

I've just written up the first "test" hack of BGP aware DNS.

The basic logic is simple but evil. I'm simply mapping BGP next-hop to a set of weighted servers. A server is then randomly chosen from this pool.
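As a rough illustration of that logic (not the actual DNS backend code), here's a minimal Python sketch. The next-hop addresses, most of the server names and the weights are made up for the example, and the step that resolves a client prefix to its BGP next-hop is assumed to happen elsewhere.

```python
import random

# Hypothetical next-hop -> weighted server pool mapping. The addresses are
# documentation-range placeholders and the weights are invented.
POOLS = {
    "198.51.100.1": [("mirror1.au", 3), ("mirror2.au", 1)],
    "198.51.100.2": [("mirror1.ca", 1)],
}

# Hypothetical fallback pool for next-hops with no specific mapping.
DEFAULT_POOL = [("mirror1.us", 1), ("mirror1.uk", 1)]


def pick_server(next_hop, pools=POOLS, default=DEFAULT_POOL, rng=random):
    """Pick one server at random, by weight, from the pool for a next-hop."""
    pool = pools.get(next_hop, default)
    names = [name for name, _ in pool]
    weights = [weight for _, weight in pool]
    # random.choices does the weighted draw; k=1 returns a one-element list.
    return rng.choices(names, weights=weights, k=1)[0]


if __name__ == "__main__":
    print(pick_server("198.51.100.1"))  # mirror1.au about 3 times in 4
    print(pick_server("203.0.113.7"))   # unknown next-hop uses the default pool
```

Keeping a default pool means an unmapped next-hop still gets an answer, which matters when only a couple of POPs are being handled this way.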

I'm not doing this for -all- prefixes and POPs - it is only being used for two specific POPs where there is a lot of peering and almost no transit. There are a few issues regarding split horizon BGP/DNS and request routing which I'd like to fully sort out before I enable it for everything. I don't want a quirk to temporarily redirect -all- requests to -one- server cluster!

In any case, the test is working well. I'm serving ~10mbit to WAIX (Western Australia) and ~ 30mbit to TORIX (Toronto, Canada.)

All of the DNS based redirection caveats apply - most certainly that not all client requests to the caches will also be over peering. I'll have to craft some method(s) of tracking this.

Sunday, August 9, 2009

Updates - or why I've not been doing very much

G'day! Cacheboy has been running on autopilot for the last couple of months whilst I've been focusing on paid work and growing my little company. So far (mostly) so good there.

The main issue scaling traffic has been the range request handling in Squid/Lusca, so I've been working on fixing things up "just enough" to make it work in the firefox update environment. I think I've finally figured it out - and figured out the bugs in the range request handling in Squid too! - so I'll push out some updates to the network next week and throw it some more traffic.

I really am hoping to ramp traffic up past the gigabit mark once this is done. We'll just have to see!

Wednesday, July 8, 2009

VLC 1.0 released

VLC-1.0 has been released. The CDN is pushing out between 550 and 700mbit of VLC downloads. I'm sure it can do more but as I'm busy working elsewhere, I'm going to be overly conservative and leave the mirror weighting where it is.

Graphs to follow!

Monday, June 29, 2009

Current Downtime/issues

There's a current issue with content not being served correctly. It stemmed from a ZFS related panic on one of the backend servers (note to self - update to the very latest FreeBSD-7-stable code; these are all fixed!) which then came up with lighttpd but no ZFS mounts. Lighttpd then started returning 404's.

I'm now watching the backend(s) throw random connection failures and the Lusca caches then cache an error rather than the object.

I've fixed the backend giving trouble so it won't start up in that failed mode again and I've set the negative caching in the Lusca cache nodes to 30 seconds instead of the default 5 minutes. Hopefully the traffic levels now pick up to where they're supposed to be.

EDIT: The problem is again related to the Firefox range requests and Squid/Lusca's inability to cache range request fragments.

The backend failure(s) removed the objects from the cache. The problem now is that the objects aren't re-entering the cache because they are all range requests.

I'm going to wind down the Firefox content serving for now until I get some time to hack up Lusca "enough" to cache the range request objects. I may just do something dodgy with the URL rewriter to force a full object request to occur in the background. Hm, actually..

Saturday, June 27, 2009

New mirror node - italy

I've just turned on a new mirror node in Italy thanks to New Media Labs. They've provided some transit services and (I believe) 100mbit access to the local internet exchange.

Thanks guys!

Wednesday, June 17, 2009

And the GeoIP summary..

And the geoip summary:


From Sun Jun 7 00:00:00 2009 to Sun Jun 14 00:00:00 2009

Country       MBytes   Requests
us        5163783.09    6533162
de        1514664.22    2307222
ca        1152095.00     917777
fr         948433.27    1451105
uk         945640.71    1136455
it         818161.03     770164
br         542497.79    1426306
se         482932.15     229559
es         445444.34     647321
pl         397755.30    1021083
nl         373185.13     306023
ru         368124.64     749924
tr         293627.27     484965
mx         276775.12     463252
be         249088.62     213460
ch         201782.33     209530
ro         190059.45     274216
fi         172399.75     204630
ar         170421.77     374071
no         169351.46     155258

Tuesday, June 16, 2009

A quick snapshot of Cacheboy destinations..

The following is a snapshot of the per destination AS traffic information I'm keeping.


If you're peering with any of these ASes and are willing to sponsor a cacheboy node or two then please let me know. How well I can scale things at this point is rapidly becoming limited to where I can push traffic from, rather than anything intrinsic to the software.


From Sun Jun 7 00:00:00 2009 to Sun Jun 14 00:00:00 2009

ASN          MBytes  Requests  % of overall  AS name
AS3320    602465.01   1021975          3.26  DTAG Deutsche Telekom AG
AS7132    583164.05    778259          3.16  SBIS-AS - AT&T Internet Services
AS19262   459322.30    603127          2.49  VZGNI-TRANSIT - Verizon Internet Services Inc.
AS3215    330962.95    553299          1.79  AS3215 France Telecom - Orange
AS3269    317534.06    333114          1.72  ASN-IBSNAZ TELECOM ITALIA
AS9121    259768.32    434932          1.41  TTNET TTnet Autonomous System
AS22773   244573.65    283427          1.32  ASN-CXA-ALL-CCI-22773-RDC - Cox Communications Inc.
AS12322   224708.25    343686          1.22  PROXAD AS for Proxad/Free ISP
AS3352    206093.84    305183          1.12  TELEFONICADATA-ESPANA Internet Access Network of TDE
AS812     204120.74    166633          1.10  ROGERS-CABLE - Rogers Cable Communications Inc.
AS8151    198918.22    328632          1.08  Uninet S.A. de C.V.
AS6327    197906.53    152861          1.07  SHAW - Shaw Communications Inc.
AS3209    191429.18    303787          1.04  ARCOR-AS Arcor IP-Network
AS20115   182407.09    225151          0.99  CHARTER-NET-HKY-NC - Charter Communications
AS2119    181719.20    117656          0.98  TELENOR-NEXTEL T.net
AS577     181167.02    152383          0.98  BACOM - Bell Canada
AS12874   172973.42    108429          0.94  FASTWEB Fastweb Autonomous System
AS6389    165445.73    236133          0.90  BELLSOUTH-NET-BLK - BellSouth.net Inc.
AS6128    165183.07    210300          0.89  CABLE-NET-1 - Cablevision Systems Corp.
AS2856    164332.96    219267          0.89  BT-UK-AS BTnet UK Regional network

Query content served: 5234195.61 mbytes; 6878234 requests (ie, what was displayed in the table.)


Total content served: 18473721.25 mbytes; 26272660 requests (ie, the total amount of content served over the time period.)

Saturday, June 13, 2009

Seeking a few more US / Canada hosts

G'day everyone!

I'm now actively looking for some more Cacheboy CDN nodes in the United States and Canada. I've got around 3gbit of available bandwidth in Europe, 1gbit of available bandwidth in Japan but only 300mbit of available bandwidth in North America.

I'd really, really appreciate a couple of well-connected North American nodes so I can properly test the platform and software that I'm building. The majority of traffic is still North American in destination; I'm having to serve a fraction of it from Sweden and the United Kingdom at the moment. Erk.

Please drop me a line if you're interested. The node requirements are at http://www.cacheboy.net/node_requirements.html . Thank you!

Friday, June 12, 2009

Another day, another firefox release done..

The June Firefox 3.0.11 release rush is all but over and Cacheboy worked without much of a problem.

The changes I've made to the Lusca load shedding code (ie, being able to disable it :) work well for this workload. Migrating the backend to lighttpd (and fixing up the ETag generation to be properly consistent between 32 bit and 64 bit platforms) fixed the initial issues I was seeing.

The network pushed out around 850mbit at peak. Not a lot (heck, I can do that on one CPU of a mid-range server without a problem!) but it was a good enough test to show that things are working.

I need to teach Lusca a couple of new tricks, namely:


  • It needs to be taught to download at the fastest client speed, not the slowest; and

  • Some better range request caching needs to be added.



The former isn't too difficult - that is a weekend 5 line patch. The latter is more difficult. I don't really want to shoehorn range request caching into the current storage layer. It would look a lot like how Vary and ETag are currently handled (ie, with "magical" store entries acting as indexes to the real backend objects.) I'd rather put in a dirtier hack that is easy to undo now and use the opportunity to tidy up the whole storage layer a whole lot. But the "tidying up" rant is not for this blog entry, it's for the Lusca development blog.

The hack will most likely be a little logic to start downloading full objects that aren't in the cache when their first range request comes in - so subsequent range requests for those objects will be "glued" to the current request. It means that subsequent requests will "stall" until enough of the object is transferred to start satisfying their range request. The alternative is to pass each range request through to a backend until the full object is transferred. This would improve initial performance, but there's a point where the backend could be overloaded with too many range requests for highly popular objects, and that starts affecting how fast full objects are transferred.
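To make the "gluing" idea a bit more concrete, here's a toy Python model of it. This is purely a sketch and not Lusca code: the class and method names are invented, the object is assumed to arrive in order from byte 0, and a range is only released once all of its bytes have arrived (a real implementation could start replying as soon as the first byte of the range is available).

```python
class FullObjectFetch:
    """One in-progress full-object download plus the ranges glued to it."""

    def __init__(self, total_length):
        self.total_length = total_length
        self.received = 0   # bytes fetched so far, contiguous from offset 0
        self.waiting = []   # (first_byte, last_byte) ranges still stalled

    def attach(self, first, last):
        """Glue a range request to this fetch; True if it can be served now."""
        if last < self.received:
            return True
        self.waiting.append((first, last))
        return False

    def on_data(self, nbytes):
        """Backend delivered nbytes more; return the ranges now satisfiable."""
        self.received = min(self.total_length, self.received + nbytes)
        ready = [r for r in self.waiting if r[1] < self.received]
        self.waiting = [r for r in self.waiting if r[1] >= self.received]
        return ready


class RangeCoalescer:
    """Starts at most one full-object fetch per URL, whatever ranges arrive."""

    def __init__(self):
        self.fetches = {}
        self.backend_requests = 0

    def request(self, url, first, last, total_length):
        fetch = self.fetches.get(url)
        if fetch is None:
            # First range request for an uncached object: fetch all of it.
            fetch = FullObjectFetch(total_length)
            self.fetches[url] = fetch
            self.backend_requests += 1
        return fetch.attach(first, last)
```

In this model, two range requests for the same URL cost one backend request; the second one simply stalls until the single full fetch has passed the end of its range. The trade-off described above shows up directly: later ranges wait longer, but the backend only ever sees one request per object.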

As a side note, I should probably do up some math on a whiteboard here and see if I can model some of the potential behaviour(s). It would certainly be a good excuse to brush up on higher math clue. Hm..!