Sunday, March 18, 2012

Concurrency in the TX path and when it all falls down..

I'm still (yes, still) hacking on the FreeBSD 802.11n ath(4) TX path. I'm trying to find and fix issues that creep up before I can flip on the 11n code by default.

I've become increasingly aware of the lack of locking in net80211, including some of the A-MPDU TX session management code. I know that I'll eventually have to plan out and implement some locking, but for now I'm just trying to squish whatever issues are showstoppers.

A user approached the freebsd-wireless list about two weeks ago and noticed that his 802.11n session was hanging after a bit of use. If he tried 802.11g, things worked fine. He tried SMP - and things worked fine. The symptoms? A number of frames seemed to be "stuck" and sitting in a software TX queue, not being transmitted. But other frames would be TXed fine, so it wasn't as simple as a totally confused TX BAW (Block-ack Window.)

After I managed to reproduce the issue locally, I discovered what was going on.

It was concurrency.

Specifically, that there's multiple places where ath_start() is being called, thus multiple concurrent TX is occuring.

Now, in the non-aggregate method, net80211 is doing all the sequence number assignment. I'm not so sure that in the normal case, the sequence numbers are being allocated in a consistent, sequential way - if the net80211 TX code is able to be called concurrently from multiple threads, sequence numbers can and will be occasionally "raced" and allocated in the reverse order that they're submitted to the driver. But I'm not here to fix that (however I'll eventually have to.)

In the aggregate method, the driver is doing the aggregation and sequence number assignment. For now, the driver is also doing the TX BAW tracking and frame queuing. So imagine this sequence of events:

* a frame is submitted via ath_start();
* since aggregation is enabled, it's allocated a sequence number;
* it's then thrown into the software queue;
* then at some later stage, the software queue is checked, the frame is popped out of the list and if it's inside the BAW, it's added to the BAW and TX;ed
* adding the frame to the TX BAW slides the left hand edge of the BAW to be at that sequence number. Any frames TXed with a seqno _less_ than this will be treated as outside of the BAW and put back onto the software queue for now.

Now, the locking only occurs at:

* the time the frame has the sequence number allocated;
* then when the queue is checked and the frame is popped off.

If two or more threads are allocating sequence numbers and doing work, it's quite possible that thread A will allocate (say) seqno 5, thread B will allocate seqno 6 and then add it to the BAW before thread A can. Then when thread A tries to do some work, it finds the queue has a frame with seqno 5 in it - and since it's before the left hand edge of the BAW (which is at 6, as it was successfully pushed to the hardware), it won't be transmitted and will stay in the software queue until the BAW sequence numbers wrap around to 0 and catch up.

Now, linux ath9k/mac80211 doesn't have this problem. The TX pathway is totally serialised, which means that even if multiple threads are trying to TX, only one thread will be able to enter the TX code at any time. The other threads get blocked.

So how can I solve this? The easy solution would be to serialise FreeBSD's net80211 and ath driver TX code. That way all of this nonsense will go away. For the net80211 side of things this may work - the legacy TX path, where sequence numbers are allocated by the stack, could benefit from serialisation. The throughput isn't ever going to be that great, so we wouldn't really hurt from it. But the trouble is making absolutely sure that the driver also does the same - even if I push the frames into the queue in order and ensure that they have sequential sequence numbers, there's no guarantee that a driver with concurrent entry paths into XXX_start() will de-queue the frames and push them into the hardware in the same order.

So what I've chosen to do instead is to ignore the legacy part for now, and not serialise anything. Instead, I'm doing the sequence number allocation (for aggregation, remember) -at the time I'm about to add the frame to the BAW and TX it to the hardware-. Ie, until the frame actually is able to be added to the BAW, it won't _have_ a sequence number. Since this action is done behind a lock, it's guaranteed to be sequential. The trick here is to only allocate the sequence numbers once it's known for certain that the frame _will_ be going out to the hardware.

For the legacy path, it's also likely worthwhile delaying the sequence number allocation until it's just about to go to the ifnet queue. That way the frames on the ifnet queue have sequence numbers that are in order. Then I need to fix each driver (ugh) to make sure they're dequeued fine.

I've written up the aggregation change and this so far works quite well. I don't want to tackle the legacy path yet or fix other drivers, not until I've verified this works. What I also should do is write some test cases to verify that indeed sequence numbers are being presented to the driver in order, so I can identify this happening in the wild.

Monday, February 13, 2012

.. and the price of packaging up software? Billions.

A friend of mine from Perth recently wrote an article outlining the "cost" of Debian Wheezy:

http://blog.james.rcpt.to/2012/02/13/debian-wheezy-us19-billion-your-price-free/

To quote:

"In my analysis the projected cost of producing Debian Wheezy in February 2012 is US$19,070,177,727 (AU$17.7B, EUR€14.4B, GBP£12.11B), making each package’s upstream source code wrth an average of US$1,112,547.56 (AU$837K) to produce. Impressively, this is all free (of cost)."

Now this has apparently caused a bit of a stir among Linux and IT news sites. It's a large number, right? It's all free, right?

However - Debian for the most part is a package repository. Sure, a lot of effort goes into building and maintaining that - and I think _that_ should be assigned a cost - but I think counting all of those package upstreams as part of Debian is hiding the true nature of software development.

According to Ohloh, the cost of producing FreeBSD, at $72,000 a year per programmer, would be $ 243,777,135, _before_ all of the packages in the FreeBSD ports repository. FreeBSD has over 23,000 packages too, just like Debian - if those were also counted, it may start to push that figure far up from millions to billions.

Does it mean Debian Wheezy is equivalent of $20B of effort? Maybe. But then a tiny (comparatively) more effort and you end up with FreeBSD. Or, with different effort - Redhat. Or Gentoo. Or Mandrake. Or NetBSD. Or OpenBSD. Or MacPorts/Fink, which packages this similar software for MacOS X.

What I guess I'm trying to say is this. You get cool stuff from programmers for free. Including that in your project "cost" just seems silly.

Saturday, November 26, 2011

FreeBSD on the TP-Link TL-WR1043nd!

I've now included (almost) all of the support needed to run FreeBSD natively on the TP-Link TL-WR1043nd. It's a 3-antenna, 2x2 stream 2.4ghz 802.11n AP which you can get for under AUD $100.

It supports hostap mode (which is what I bet most of you want to use) and I'm currently using it at home alongside my Ubiquiti Routerstation Pro based hostap (which is what I use to test out all the other pre-11n and 11n NICs that I currently own.)

I currently get around 50mbit TCP throughput - but I leave full FreeBSD-HEAD debugging on. I'm sure I can push the unit closer to 100mbit. (Compare to the Routerstation Pro + AR9160 hostap - where I routinely get 160mbit of TCP throughput.)

What works (read: what I've tested):
  • Ethernet (at least the WAN port);
  • Wireless - 802.11bgn - 20/40mhz operation as well as legacy operation (and both, if that's what you need);
  • Serial console - if you've soldered in one.

The firmware image stores the configuration in a 64k flash partition which is read upon boot. You can modify files in /etc and then save these to flash via "cfg_save".

What isn't supported:
  • The onboard switch - so I believe the only port available at the present moment is the WLAN port;
  • The GPIO lines aren't being configured, so the WLAN, status and USB/QSS buttons don't function.
I haven't tested out Multi-SSID mode yet. The earlier AR9130 revisions have some issues with multi-SSID mode and handling block-ack tracking, so I _think_ I'll need to somehow disable aggregation on the second and later VAP interfaces. Just keep that in mind if you're tinkering.

Further details about the hardware and how to build the software for yourself can be found here in my FreeBSD wifi development project wiki.

No, I won't (yet) be putting up firmware images for people to test. Things are changing quite rapidly and there's no easy way to reflash a unit once you've placed FreeBSD on it - you'll need to have added a serial console to the device.

FreeBSD 802.11n update: 27 November 2011.

I've merged in most of the reset related fixes from my git tree into FreeBSD-HEAD. This means that normal resets (eg stuck beacon, calibration resets, etc) shouldn't drop frames any longer.

Frames are still dropped during things like channel/operation mode changes and channel scanning (which does do a channel change.) I'll have to look into that at a later stage. If you're using this in station mode you will likely need to disable background scanning or your aggregation sessions may occasionally drop. You'll have random messages logged when frames are dropping during a flush or reset, so just check your system dmesg log for anything from the ath driver.

I'll be next working on correctly handling failed/filtered frames and then adding some transition stuff to net80211 so the TIM/ATIM bitmaps can be kept correctly up to date. This should fix some of the power saving issues that I'm sure exist.

Unfortunately transmitting BAR frames is still quite a bit off. There's a lot more tidying up that I'd like to do before I start down the path of handling BAR TX, including trying to figure out how to better handle packet transmission and reception when the NIC is off-channel (eg when doing a background channel scan.)

I also have a long list of things I'd like to do to the rate control code and all the surrounding code which sets up rates and creates aggregates. The code I ported/wrote is a little too verbose and duplicate-y for me. That likely will occur after the christmas break.

Enjoy!

Wednesday, November 16, 2011

FreeBSD is now doing (even more) 802.11n..

I held off merging in my 802.11n work as much as possible but I decided that I'd like to get it done before the end of the year. Even though 9.0-RELEASE is still around the corner, I decided that it would be better to merge in what I have into -HEAD and then tidy that up then wait for what could be a few more months.

So, it's in there, bugs and all, supporting both station and hostap mode. No, wds, adhoc, mesh and TDMA aren't currently supported (I have enough bugs to worry about for the time being, without trying to debug the other operating modes. But I'd like to.)

What works:
  • TX and RX aggregation!
  • The rest of the 802.11n negotiation stuff, mostly thanks to Bernhard Schmidt who fixed up a lot of the net80211 quirks.
  • Lots of ANI changes which hopefully make noisy environments more stable.

What doesn't yet work:

  • Interface resets cause frames to be dropped from the RX and TX queues. This messes up aggregation and causes sessions to hang. I'm fixing that up in a git branch at the moment.
  • BAR TX - I'll implement BAR TX soon - it's just tricky to get right.
  • Filtered frames - ie, TX failed frames from the hardware. Instead of the current method of "always try", the hardware supports failing the current and subsequent frames in a set. That way a hostap seeing a station going into power saving mode can quickly abort all TX frames to said station and then only retransmit them when the station indicates it's again awake. If I don't do this then the hardware will constantly fail a lot of frames, causing BAR frames to be TXed when they likely shouldn't be.

But it's enough to try. So if you have an AR5416, AR9160, AR9220, AR9280, AR9285, AR9227 or AR9287, give it a whirl. If you have a pre-11n NIC then please, give it a go too. I'd like to ensure that the hardware support for earlier chipsets hasn't broken.

If you'd like to use this in production on a hostap then please keep in mind that power saving support isn't entirely functional and featured, so stations which go into frequent power saving mode may have some performance issues. I'll tinker with this some more soon.

Finally, thank you very much to Hobnob, Inc. for sponsoring this work and Qualcomm Atheros for providing me source code, documentation and assistance in understanding how all of this works.

Saturday, October 29, 2011

Sitting in a hotel, getting stuck beacons..

I'm sitting in a hotel here in Sydney, wondering if Qantas will allow my flight to happen on Monday, when I discover my 11bg AP interface here is going a bit bezerk-o with the stuck beacons.

This isn't terribly good - I thought I had fixed all the stuck beacon issues! Apparently not.

It turns out that in this hotel room, overlooking St Martins Place, I can see a whole lot of random crap. And this includes a very persistent little wireless client:

[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "Crime Prevention"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "vodafone8A1F"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "Boingo Hotspot"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "Belkin.44FD"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "mycloud"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "101-108"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "Willkommen bei La Maison du Pain"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "WLAN-E36697"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "FRITZ!BoxWLAN7240 HB"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "Telekom_ICE"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "minershotspot"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "GANAG"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "WLAN-001C4A075A8D"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "iJumeirah"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "bunny"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "Vineyard Square Hotel"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "Belkin.44FD"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "mycloud"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "101-108"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "Willkommen bei La Maison du Pain"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "FRITZ!BoxWLAN7240 HB"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "Telekom_ICE"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "minershotspot"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "GANAG"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "WLAN-001C4A075A8D"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "iJumeirah"
[e8:06:88:8b:23:57] discard probe_req frame, ssid mismatch: "bunny"

Sunday, September 25, 2011

I know this is old..

From if_wi.c in FreeBSD:

/*
* Lucent WaveLAN/IEEE 802.11 PCMCIA driver.
*
* Original FreeBSD driver written by Bill Paul
* Electrical Engineering Department
* Columbia University, New York City
*/

From if_wi.c in OpenBSD:

/*
* Lucent WaveLAN/IEEE 802.11 driver for OpenBSD.
*
* Originally written by Bill Paul
* Electrical Engineering Department
* Columbia University, New York City
*/