Friday, September 27, 2019

Fixing up KA9Q-unix, or "neck deep in 30 year old codebases.."

I'll preface this by saying - yes, I'm still neck deep in FreeBSD's wifi stack and 802.11ac support, but it turns out it's slow work to fix 15 year old locking related issues that worked fine on 11abg cards, kinda worked ok on 11n cards, and are terrible for these 11ac cards. I'll .. get there.

Anyhoo, I've finally been mucking around with AX.25 packet radio. I've been wanting to do this since I was a teenager and found out about its existence, but back in high school and .. well, until a few years ago really .. I didn't have my amateur radio licence. But, now I do, and I've done a bunch of other stuff with a bunch of other radios. The main stumbling block? All my devices are either Apple products or run FreeBSD - and none of them have useful AX.25 stacks. The main stacks of choice these days run on Linux, Windows or are a full hardware TNC.

So yes, I was avoiding hacking on AX.25 stuff because there wasn't a BSD compatible AX.25 stack. I'm 40 now, leave me be.

But! A few weeks ago I found that someone was still running a packet BBS out of San Francisco. And amazingly, his local node ran on FreeBSD! It turns out Jeremy (KE6JJJ) ported both an old copy of KA9Q and N0ARY-BBS to run on FreeBSD! Cool!

I grabbed my 2m radio (which is already cabled up for digital modes), compiled up his KA9Q port, figured out how to get it to speak to Direwolf, and .. ok. Well, it worked. Kinda.

Here's my config:

ax25 mycall CALLSIGN-1
ax25 version 2
ax25 maxframe 7
attach asy kissui tnc0 65535 256 1200

.. and it worked. But it wasn't fast. I mean, sure, it's 1200 bps data, but after digging I found some very bad stack behaviour on both KA9Q and N0ARY. So, off I went to learn about AX.25.

And holy hell, there are some amusing bugs. I'll list the big showstoppers first and then what I think needs to happen next.

Let's look at the stack behaviour first. So, when doing LAPB over AX.25, there's a bunch of frames with sequence numbers that go out, and then the receiver ACKs the sequence numbers four ways:
  • RR - "roger roger" - yes, I ack everything up to N-1
  • RNR - I ack everything up to N-1 but I'm full; please stop sending until I send something back to start transmission up again
  • REJ - I received an invalid or missing sequence number, ACK everything to N-1 and retransmit the rest please
  • I - this is a data frame which includes both the send and receive sequence numbers. Thus, transmitted data can implicitly ACK received data.
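
To make the four frame types above a bit more concrete, here's a minimal Python sketch that classifies a modulo-8 AX.25 control byte. The bit layout is my reading of the AX.25 control field, and classify_control() is a made-up helper for illustration - it isn't code from KA9Q or N0ARY-BBS. It ignores the poll/final bit, and SREJ only exists in AX.25 v2.2.

```python
# Sketch only: decode a modulo-8 AX.25 control byte into its LAPB frame type.
# classify_control() is a hypothetical helper, not code from KA9Q or N0ARY-BBS.

S_FRAME_TYPES = {0b00: "RR", 0b01: "RNR", 0b10: "REJ", 0b11: "SREJ"}

def classify_control(ctrl):
    """Return (frame_type, nr, ns); nr/ns are None where they don't apply."""
    nr = (ctrl >> 5) & 0x07          # receive sequence number N(R)
    if ctrl & 0x01 == 0:             # bit 0 clear: I (information) frame
        ns = (ctrl >> 1) & 0x07      # send sequence number N(S)
        return ("I", nr, ns)
    if ctrl & 0x03 == 0x01:          # low two bits are 01: supervisory frame
        return (S_FRAME_TYPES[(ctrl >> 2) & 0x03], nr, None)
    return ("U", None, None)         # low two bits are 11: unnumbered frame

if __name__ == "__main__":
    for byte in (0x61, 0x65, 0x69, 0x64):
        print(hex(byte), classify_control(byte))
```

Note how the I frame carries both N(S) and N(R), which is what lets data flowing one way implicitly ACK data flowing the other way.
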
I'd see bursts like this:
  • N0ARY would send 7 frames
  • I'd receive them, and send individual RR's for each of them
  • N0ARY would then send another 7 frames
  • I'd receive a few, maybe I'd get a CRC error on one or miss something, and send a REJ, followed by getting what I wanted or not and maybe send an RR
  • N0ARY would then start sending many, many more copies of the same frame window, in a loop
  • I'd individually ACK/REJ each of these appropriately
  • .. and it'd stay like this until things eventually caught up.

So, two things were going wrong here.

Firstly - KA9Q didn't implement the T2 timer in the AX.25 v2.0 spec. T2 is an optional timer which a TNC can use to delay sending responses until it expires, allowing it to batch up sending responses instead of responding (eg RR'ing) each individual frame. Now, since the KISS TNC only sends data and not signaling up to the applications, all the applications can do is queue frames in response to other frames or fire off timers to do things. The KA9Q stack doesn't know that the air is busy receiving more data - only that it received a frame. So, T2 could be used to buffer sending status updates until it expires.
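
Here's a rough Python sketch of the difference T2 makes. It's a toy model, not the KA9Q code: I'm assuming T2 is restarted by every received I frame and that a single RR covering everything received so far goes out when it finally expires. rr_times() and the 3 second T2 value are made up for illustration.

```python
# Sketch only: count RR responses for a burst, with and without a T2-style delay.
# Assumption: T2 is restarted by every received I frame, and a single RR
# acknowledging everything so far goes out when it finally expires.

def rr_times(arrivals, t2=None):
    """Return the times at which an RR would be sent for the given arrival times."""
    if t2 is None:
        return list(arrivals)            # no T2: RR each frame immediately
    sent = []
    deadline = None
    for t in arrivals:
        if deadline is not None and t >= deadline:
            sent.append(deadline)        # T2 expired before this frame arrived
        deadline = t + t2                # (re)start T2
    if deadline is not None:
        sent.append(deadline)            # final expiry after the last frame
    return sent

if __name__ == "__main__":
    burst = [round(2.2 * i, 1) for i in range(1, 8)]   # 7 frames, 2.2 s apart
    print(len(rr_times(burst)))          # without T2: one RR per frame
    print(len(rr_times(burst, t2=3.0)))  # with T2: one RR after the burst
```

With frames arriving 2.2 seconds apart and T2 longer than that, the whole 7 frame burst gets a single RR instead of seven.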

N0ARY-BBS implements T2 for RR/RNR responses, but not for REJ responses.

Then, both KA9Q and N0ARY-BBS don't delay sending LAPB frames upon status notifications. Thus, every RR, RNR and REJ that is received may trigger sending whatever is left in the transmit window. Importantly, receiving a REJ will clear the "unack" (unacknowledged) window and force retransmission of everything. If you get a couple of REJ's in a row then it'll try to send multiple sets of the same window out, over and over. If you get an RR and REJ and RR, it may send more data, then the whole window, then more data. It's crazy.
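
As a toy illustration of why that hurts, here's a Python sketch comparing "retransmit on every REJ" against "remember the latest REJ and act once when a timer expires". The function names are made up, the window is assumed to be 7 frames numbered 0 to 6, and sequence number wrap-around is ignored to keep it short.

```python
# Sketch only: how many I frames get queued in response to a run of REJs.
# Assumption: a 7-frame window (frames 0..6) is outstanding and every REJ
# rewinds the sender to N(R), so the rest of the window is queued again.

WINDOW = 7

def naive_frames_queued(rejects):
    """Queue a retransmission immediately for every REJ received."""
    queued = 0
    for nr in rejects:
        queued += WINDOW - nr            # resend everything from N(R) onwards
    return queued

def deferred_frames_queued(rejects):
    """Only remember the latest REJ; retransmit once when the timer expires."""
    if not rejects:
        return 0
    return WINDOW - rejects[-1]          # later REJs replace earlier ones

if __name__ == "__main__":
    rejects = [2, 2, 3]                  # three REJs in a row: N(R) = 2, 2, 3
    print(naive_frames_queued(rejects))
    print(deferred_frames_queued(rejects))
```

Three REJs in a row queue 14 frames in the naive case, but only 4 if the response is deferred and coalesced.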

Finally, there's T1. T1 is the retransmission timer. Now, transmitting a burst of 7 frames of full length at 1200 baud takes around 2.2 seconds a frame, so it's around 15.4 seconds for the full burst. If T1 is less than that, then because there's no feedback about when the frames went out - only that you sent them to the TNC - you'll start to try retransmitting things. Now, luckily one can poll the other end using a RR poll frame to ask the other end to respond with its current sequence number - that way each end can re-establish what each other's send/receive sequence numbers are. However, these can also be batched up - so whilst you're sending your frames, T1 fires generating another batch of RR's to poke the other side. This in itself isn't such a bad thing, but it does mean the receiver sees a big, long burst of frames followed by a handful of RR polls. Strictly speaking this isn't ideal - you're only supposed to send a single poll and then not poll until you get a response or another timeout.
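
The arithmetic above as a quick Python sketch. The 2.2 seconds per frame is my observed figure from the text; the raw payload estimate only counts the 256 data bytes and ignores headers, bit stuffing and TX delay, which is why it comes out lower. The function names are just for illustration.

```python
# Sketch only: airtime for a burst at 1200 bps.
# The raw estimate ignores AX.25 headers, bit stuffing and TX delay.

def raw_payload_seconds(payload_bytes, bps):
    """Time to clock out just the payload bits."""
    return payload_bytes * 8 / bps

def burst_seconds(frames, seconds_per_frame):
    """Total airtime for a back-to-back burst of frames."""
    return frames * seconds_per_frame

print(round(raw_payload_seconds(256, 1200), 2))   # 1.71
print(round(burst_seconds(7, 2.2), 1))            # 15.4
```

So any T1 shorter than about 15.4 seconds can fire while the TNC is still busy sending the original burst.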

So what have I done?

I'm doing what JNOS2 (another KA9Q port) is doing - I am using T2 for data transmission, RR, RNR and REJ transmission. It's not a pretty solution, but it at least stops the completely pointless retransmission of a lot of wasted data. I've patched both N0ARY and KA9Q to do this, so once KE6JJJ gets a chance to update his BBS and the N0ARY BBS I am hoping that the bad behaviour stops and the BBS becomes useful again for multiple people.

Ok, so what needs to happen?

Firstly, we've learnt a lot about networking since the 80s and 90s. AX.25 is kinda part TCP, part IP, so you'd think it should be fine as a timer based protocol. But alas no - it's slow, it's mostly half duplex, and overflowing transmit queues or resending data incorrectly has a real cost. It's better to look at it as a reliable wireless protocol like what 802.11 does, and /not/ as TCP/IP. 802.11 has timers, 802.11 has sequence numbers, and 802.11 tries to provide as reliable a stream as it can. But it doesn't guarantee traffic; if traffic takes too long it'll just time it out and let the upper layer (like TCP) handle the actual guarantees. Now, you kind of want the AX.25 LAPB sessions to be as reliable as possible, but this gets to the second point.

You need to figure out how to be fair between sessions. The KA9Q stacks right now don't schedule packets based on any kind of fairness between LAPB or AX.25 queues. The LAPB code will try to transmit stuff based only on its local window and I guess they treat retransmits as something that signals they need to back off. That's all fine and dandy in theory but in practice at 1200 bps a 7 packet window at 256 bytes a packet is 7*2.2 seconds, or 15.4 seconds. So after 15.4 seconds if the remote side immediately ACKs and you then send another 7 packet burst, no one is going to really get a chance to come in and talk on the BBS.
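
One possible shape for that fairness, as a Python sketch: a plain round-robin over the sessions that have frames queued, with a small per-turn limit instead of a full window. This is not what KA9Q does today; round_robin() and the per_turn value are assumptions for illustration.

```python
# Sketch only: round-robin between LAPB sessions instead of letting one
# session send its whole window. Names and the per_turn limit are made up.
from collections import deque

def round_robin(sessions, per_turn=2):
    """sessions maps a session name to its queued frames; yield (name, frame)."""
    queues = deque((name, deque(frames)) for name, frames in sessions.items())
    while queues:
        name, q = queues.popleft()
        for _ in range(min(per_turn, len(q))):
            yield name, q.popleft()      # send at most per_turn frames
        if q:
            queues.append((name, q))     # still has data: back of the line

if __name__ == "__main__":
    pending = {"download": [1, 2, 3, 4, 5], "chat": [10]}
    print(list(round_robin(pending)))
```

Here the single "chat" frame goes out after two "download" frames rather than waiting behind the whole download.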

So, this needs a couple things.

Firstly, just because you can transmit a maximum window of 7 doesn't mean you should. If you see the air being busy, maybe you want to back that off a bit to let others get in and talk. Yes, it does mean the channel is being used less efficiently in total for a single session, but now you're allowing other sessions to get airtime and actually interact. Bugs aside, I've managed to keep the N0ARY BBS tied up for minutes at a time squeezing me tens of kilobytes of article content. That's all fine and dandy, but I don't mind if it takes a little longer when there are other users trying to also do stuff.

Next, the scheduling for LAPB shouldn't just be timers kicking off packet generation into a queue, and then best effort transmission at the KISS TNC. If the AX.25 stack was told about the data link status transitions - ie, between idle, receiving, transmitting and such - then when the air was free the TNC could actually schedule which LAPB session(s) to service next. I've watched T1 (retransmission) and T2 kick over multiple times during someone else downloading data, and when the air is eventually free an AX.25 node sends multiple copies of both I data payload and S status frames (RR, RNR, REJ, probes, etc.) It's insane. The only reason it's doing this is because it doesn't know the TNC is busy receiving or transmitting and thus those timers don't need to run and traffic doesn't need to be generated. This is how the MAC and PHY layers in 802.11 interoperate. The MAC doesn't queue lots of packets to be sent out when the PHY is ready - the MAC has the work there, and when the PHY signals the air is free and the contention window timer is expired, the MAC signals to get the air and sends its frame. It sends what it can in its given time window and then it stops.
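
A minimal Python sketch of the "timers don't need to run while the air is busy" idea. It assumes the stack is somehow told the channel state, which a plain KISS TNC doesn't do today; Channel and PausableTimer are made-up names for illustration.

```python
# Sketch only: a T1/T2-style timer that only counts down while the channel
# is idle. Assumes the stack gets channel state from the TNC, which plain
# KISS does not provide. Channel and PausableTimer are hypothetical names.
from enum import Enum

class Channel(Enum):
    IDLE = 0
    RECEIVING = 1
    TRANSMITTING = 2

class PausableTimer:
    def __init__(self, duration):
        self.remaining = duration

    def tick(self, dt, channel):
        """Advance by dt seconds; return True once the timer has expired."""
        if channel is Channel.IDLE:
            self.remaining -= dt         # only idle airtime counts
        return self.remaining <= 0

if __name__ == "__main__":
    t1 = PausableTimer(10.0)
    print(t1.tick(15.4, Channel.RECEIVING))  # busy for a full burst: no expiry
    print(t1.tick(10.0, Channel.IDLE))       # 10 s of idle air: expires
```

With something like this, someone else's 15 second download wouldn't cause a pile of retransmissions and polls to queue up behind it.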

This means that yes, the KISS TNC popularity is part of the reason AX.25 is inefficient these days. KISS TNCs make it easy to do AX.25 packets, but they make it super easy to do it inefficiently. The direwolf author wrote a paper on this where he compared these techniques to just using the AX.25 stack (and AX.25 2.2 features) which have knowledge of the direwolf physical/radio layer. If these hooks were made available over the KISS TNC interface - and honestly, they'd just be a two byte status notification saying that the TNC is in the { idle, receiving, decoding, transmitting } states - then AX.25 stacks could make much, much smarter decisions about what to transmit and when.
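
To show how small that notification could be, here's a Python sketch of a two byte status message wrapped in normal KISS framing. The FEND/FESC framing bytes are standard KISS; the command code 0x0A and the state numbering are entirely hypothetical - nothing like this exists in the KISS spec or in direwolf as far as I know, and a real extension would need to pick a code that doesn't clash with existing KISS extensions.

```python
# Sketch only: a hypothetical two-byte "channel state" notification in KISS
# framing. FEND/FESC/TFEND/TFESC are standard KISS; CMD_STATUS and STATES
# are invented for illustration and are not part of any spec.

FEND, FESC, TFEND, TFESC = 0xC0, 0xDB, 0xDC, 0xDD
CMD_STATUS = 0x0A                        # hypothetical command code
STATES = ("idle", "receiving", "decoding", "transmitting")

def kiss_escape(data):
    out = bytearray()
    for b in data:
        if b == FEND:
            out += bytes([FESC, TFEND])
        elif b == FESC:
            out += bytes([FESC, TFESC])
        else:
            out.append(b)
    return bytes(out)

def kiss_unescape(data):
    out = bytearray()
    it = iter(data)
    for b in it:
        if b == FESC:
            nxt = next(it)
            out.append(FEND if nxt == TFEND else FESC)
        else:
            out.append(b)
    return bytes(out)

def encode_status(port, state):
    """Build FEND, (port << 4 | command), state index, FEND."""
    payload = bytes([(port << 4) | CMD_STATUS, STATES.index(state)])
    return bytes([FEND]) + kiss_escape(payload) + bytes([FEND])

def decode_status(frame):
    """Return (port, state) for a status frame, or None for anything else."""
    body = kiss_unescape(frame.strip(bytes([FEND])))
    if len(body) != 2 or body[0] & 0x0F != CMD_STATUS or body[1] >= len(STATES):
        return None
    return (body[0] >> 4, STATES[body[1]])

if __name__ == "__main__":
    frame = encode_status(0, "receiving")
    print(frame.hex(), decode_status(frame))
```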

Finally - wow this whole thing needs per packet compression already. AX.25 version 2.2 introduces a way of negotiating parameters with remote TNCs for supported extensions and so one of my medium term KA9Q/N0ARY goals is to introduce enough version 2.2 support to negotiate SREJ (selective rejection/retransmission) and maybe the window size options, but primarily to add compression. I think SREJ + per packet compression would be the biggest benefits over 1200 and 9600 bps links.
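
Here's a rough Python sketch of what I mean by per packet compression: each payload is compressed on its own, and sent as-is if compression doesn't help. The one byte flag in front is my own placeholder and not something from the AX.25 2.2 spec, and zlib is just a convenient stand-in for whatever algorithm would actually get negotiated.

```python
# Sketch only: compress each packet payload independently, falling back to
# the raw payload when compression doesn't shrink it. The leading flag byte
# is a placeholder format, not anything defined by AX.25.
import zlib

def pack_payload(payload):
    packed = zlib.compress(payload, 9)
    if len(packed) < len(payload):
        return b"\x01" + packed          # 0x01: zlib-compressed
    return b"\x00" + payload             # 0x00: stored as-is

def unpack_payload(data):
    if data[:1] == b"\x01":
        return zlib.decompress(data[1:])
    return data[1:]

if __name__ == "__main__":
    article = b"Subject: packet radio on FreeBSD\r\n" * 7
    wire = pack_payload(article)
    print(len(article), len(wire), unpack_payload(wire) == article)
```

Because each packet stands alone, a lost or retransmitted frame doesn't break decompression of the ones around it, which matters on a lossy 1200 bps link.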

If you're interested, software repositories are located below. I encourage people to contribute to the KE6JJJ work; I'm just forked off of it (github username erikarn) and I'll be pushing improvements there.

Oh, and these compile on FreeBSD. KA9Q and direwolf both compile and run on MacOSX but N0ARY-BBS doesn't yet do so. Yes, this does mean you can now do packet radio on FreeBSD and MacOSX.

Wednesday, June 19, 2019

wow, it's been a while

Well, it has been a while. I've been busy with a new job (Facebook/Oculus) which has taken some adjustment to get to. My new baby girl Alice is now 1 and a half years old and a redhead handful of gremlin energy. Nora (who is almost 5 now) is now inquisitive, playful and talkative. And daddy needs some sleep.

But, hey, I have been playing with radios and slowly getting back into FreeBSD wifi hackery. So hopefully I will write a few posts here over the next few months to catch up on what I've been doing.

Field Day 2019 is this weekend. I hope to bring Nora upstairs for a few hours to work some stations with me from my home setup. I'm hoping we can work a handful of modes on different bands during the day.

Saturday, August 4, 2018

Aligning a TS-430S, or "wait, how am I supposed to check FM again?"

I'm fixing another (I know I know) TS-430S for a friend. Yes, this means I'm returning it back to them. After all of the repairs I had to do to get the thing up and going reliably I did an RX carrier calibration. It was a little bit off - a combination of using WWV at 10MHz and the scope to calibrate CW, USB and LSB.

However, the AM and FM carriers didn't at all meet the expectations of the service manual. Notably, the AM carrier is seemingly the same as the USB carrier on transmit and there isn't one on receive. The FM carrier just didn't appear during receive or transmit. But .. it's transmitting FM.

Now, I need to go get the TS-430Ses I've fixed and compare the carrier behaviour to the other rigs, but .. well, they work on AM/FM receive and transmit. So ok, let's figure it out.

The AM carrier matches the USB carrier. It's weird because the circuit has an AM/FM carrier crystal however.. yeah, AM carrier here is linked to the USB carrier. I need to figure that out. And the FM transmit has no power control - it's 100W carrier only. So the only way to do it without dumping 100W out into the finals whilst adjusting it is to remove the RF drive output on the RF board (which feeds the finals with RF), attach a 50 ohm resistor across it and check the final RF carrier signal on the scope. This worked mostly OK but since there's no ALC feedback, the output is .. very distorted. Now, I don't know if these rigs were supposed to output a clean sine wave at all carrier output settings but .. well, they're very loud signals on lower bands, sometimes more than 8V peak-to-peak, which is almost triple what you need to feed the finals to get 100W out. So I got it in the ballpark - because well, the thing is not outputting a true sine wave here because the carrier output is way too high - and then had to resort to checking using a directional coupler and the scope.

Now, this isn't too bad - I was in the rough right spot for the FM carrier frequency anyway, and I can key down for a few seconds at a time without making things sad. But, this step was delayed until I verified the finals were working and that took a lot of work to get right. It turns out it was on the nose anyway after all of that and FM modulation now works great.

So - if you're aligning a TS-430S, the AM/FM carrier bit in the service manual may not be entirely correct.

Sunday, July 15, 2018

Restoring a TS-430S, or "dry joints and stray RF: a tutorial"

I recently acquired a TS-430S HF transceiver. The seller claimed the FM board and full complement of filters worked, but no display, buttons/LEDs or sound. He said it worked until he sent it in to have filters added. I figured it was going to be something simple. Boy was I both right and wrong.

These rigs have a habit of dry joints everywhere. So, I powered it up to see - yes, no display. Ok - step 1 - check power rails. I discovered there was no 5v line. The IF board has the 7805 regulator, so it is time to check for dry joints.

Oh look! Some very dry joints. I bet these were marginal until the tech installing the filters jostled it about. I fixed these and anything else I could find on the IF board.

I then powered it up. One digit showed up - the optional 10Hz digit - but all the lights and buttons worked.

Now, this rig has a separate PLL board for the main VFO which exports a signal that blanks the VFO output and the display. Amusingly it doesn't blank the final digit though. Ok, so it's likely PLL unlock. The PLL board was getting power, but ... no stable 36MHz base oscillator. That's on the control board. I pulled that out to find more dry joints around that circuit and its connector - so, fixed that.

I fired it up again. The PLL board was still unlocked even though the 36MHz oscillator was now working. I spun the dial and measured the other VFO feeding the PLL board - this is the fine grain frequency selection that gets mixed in to the PLL board's four VCOs to output the final VFO signal. It was moving OK - so the control board and the other PLLs were OK. Next - check the four VCO selection lines - nothing.

The PLL board has four varicap diode based VCOs and a PLL loop. The control board outputs the band select data to the RF board which decodes it and drives the PLL VCO, the relay based LPFs and the receiver HPFs. There were multiple issues - the control board bandpass lines were wrong and the VCO select lines were wrong.

Next - the RF board. Dry joints everywhere. Here is one of many that linked ground planes together.

And this one was on the VCO output connector.

I removed the TTL IC that did the BCD to output line demuxing because it was dead and fixed the dry joints. But the control board was still outputting the wrong band info. It turns out the IO expander IC that drives those four lines had two dead IO lines. So, that needed replacing too.

At this stage the control board was OK, the band select lines and VCO select lines are OK, but no PLL lock. Time to diagnose the PLL board.

First up - the varicap VCO was working. Wrong frequency but working. The circuit takes the output of that, buffers it through a transistor amplifier, shapes it into a square wave and divides it down via a pair of TTL chips and feeds it into the PLL control IC.

Next - the 5v line on the PLL board was ... suspiciously low. 5v was coming in OK, but something was dragging it down to 3.8v in places. That is too low for TTL. I checked each chip and... the 74S112N flip flop chip was running hot. Ok, so that needed replacing. Note it is 74S and not 74LS - the PLL loop runs from 45 to 75MHz, so it needs speed. With that chip replaced the 5v rail was again at 5v. But, no PLL lock.

So I then traced the PLL loop. VCO was OK. VCO through the buffer amp wasn't. I pulled out the transistor there and it was open circuit. I didn't have an equivalent so I found a close enough one for now and ordered a replacement. But then it was still not working right - the signal level into the TTL NAND chip was super low. I figured either the transistor I replaced it with wasn't biased right or the TTL chip was pulling its input low. Indeed it was the latter - the input side was shorted to ground. I replaced that chip and the rig sprung to life!

I recalibrated the four VCOs now that I had replaced some parts. It was locking OK on all bands.

But - the receive signal was low. I checked the attenuator switch - no go. I disconnected the attenuator control cable to the RF board - RX sprung to life! A little solder reflow on the switch board and that fixed that.

After that I just did the obligatory filter and finals board check and reflow.

One LPF relay clean procedure and finals alignment later and it's all ready to go. The SWR foldback protection needs fixing and I need a 150 ohm dry load to do that, so that's my next week project.

As to how those parts all failed, likely at once? My guess is stray RF fried a path somehow. I'm glad this was the extent of the part damage!

Monday, March 12, 2018

Not merging stuff from FreeBSD-HEAD into production branches, or "hey FreeBSD-HEAD should just be production"

I get asked all the time why I don't backport my patches into stable FreeBSD release branches. It's a good question, so let me explain it here.

I don't get paid to do it.

Ok, so now you ask "but wait, surely the users matter?" Yes, of course they do! But, I also have other things going on in my life, and the stuff I do for fun is .. well, it's the stuff I do for fun. I'm not paid to do FreeBSD work, let alone open source wireless stuff in general.

So then I see posts like this:

I understand his point of view, I really do. I'm also that user when it comes to a variety of other open source software and I ask why features aren't implemented that seem easy, or why they're not in a stable release. But then I remember that I'm also doing this for fun and it's totally up to me to spend my time however I want.

Now, why am I like this?

Well, the short-hand version is - I used to bend over backwards to try and get stuff in to stable releases of the open source software I once worked on. And that was taken advantage of by a lot of people and companies who turned around to incorporate that work into successful commercial software releases without any useful financial contribution to either myself or the project as a whole. After enough time of this, you realise that hey, maybe my spare time should just be my spare time.

My hope is that if people wish to backport my FreeBSD work to a stable release then they'll either pay me to do it, pay someone else to do it, or see if a company will sponsor that work for their own benefit. I don't want to get into the game of trying to backport things to one and potentially two stable releases and deal with all the ABI changes and support fallout that happens when you are porting things into a mostly ABI stable release. And yes, my spare time is my own.

Monday, December 25, 2017

More TS-440S hijinx, or "ok, what if you wanna homebrew a digital hookup?"

I've been homebrewing digital hookups between my amateur radios (HF, VHF, UHF) and a FreeBSD PC. It all ... mostly works. There's one or two FreeBSD hiccups though, which are summarised thusly:

The default package selection for audio paths is .. suboptimal. Some can be configured to use OSS and that is nice. Some provide ALSA but FreeBSD's "ALSA" implementation doesn't provide full ALSA device emulation so we don't get a list of ALSA devices by default. You have to put your devices into asound.conf with names for things. However ..

.. FreeBSD doesn't currently make it easy to hard-code say, USB device paths to serial port names or sound devices to something predictable. So every time I reboot or mess with the setup it goes pear shaped.

Then there's a bug where I can run three USB audio devices, but I can't do mic input on the last one. Output works fine. That's going to be amusing to diagnose.

So I do have my TS-440S, TS-711A and TS-811E all doing digital modes. They're just all .. subtly different.

The TS-711A and TS-811E have an accessory jack (ACC2) that has input and output. There's a PTT control line and a mic mute line. The line levels of those signals is a couple hundred millivolts, so it's good enough to build a little resistor divider with a potentiometer to get the computer output down to the right level. I'm using mic input on the USB audio devices, so that also works fine at a couple hundred millivolts.
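
The resistor divider maths, as a quick Python sketch. The 10k/2k values and the 1.5V peak-to-peak source level are example numbers I've picked for illustration, not the exact parts in my cables, and this ignores loading from whatever the divider feeds.

```python
# Sketch only: unloaded resistor divider output level.
# The resistor values and source level below are example assumptions.

def divider_out(vin, r_top, r_bottom):
    """Voltage across r_bottom for vin applied across r_top + r_bottom."""
    return vin * r_bottom / (r_top + r_bottom)

# 1.5 V peak-to-peak in, 10k over 2k: about a quarter of a volt out
print(round(divider_out(1.5, 10_000, 2_000), 3))   # 0.25
```

Swapping the bottom resistor for a potentiometer is what lets me trim the level for each radio.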

The TS-440S also has a similar accessory jack, however the audio input in that jack seems to be quite a bit higher than a couple hundred millivolts. It looks like it needs to be around 4-5v peak-to-peak for it to be at the right level internally on the IF board. The ACC2 path has a couple of resistor attenuators so it looks like this was intentional - after asking around it looks like it's expecting professional line audio output levels (~4v peak-to-peak) instead of consumer grade levels (~1.5v peak-to-peak.) I'll go dig into it some more. This path bypasses the microphone pre-amplifier entirely and goes straight into the Mic Gain control pot.

The TS-440S also has AFSK input/output RCA jacks on the back. The audio output is at the same level as the ACC2 jack, however the audio input side is routed via the microphone input side so it gets preamp'ed and processed appropriately. That's what I've been using for digital modes - I can divide down the input side to a couple hundred millivolts to keep it all kosher. However - and here's the really annoying part - the mic mute input on ACC2 also mutes the AFSK input line.

Then there's what happens if you leave the microphone connected. If you do leave it connected, even in a quiet room, it seems to present some load that requires a lot more signal on AFSK input to do its thing. If you tune it all up to the right signal levels and then disconnect the microphone, you'll be really overdriving the RF section. Ugh.

So - I don't have to do any of this for the TS-711 and TS-811 - their input values are a lot lower and grounding the mic line actually just quietens the mic input.

If I can score a slightly different radio - like a TS-680S for example - then I can do this stuff with a much lower level line input value. It'll be tricky to get it down to the TS-680S level (it wants it at 10mV!) but at least I can do that with passive, well shielded bits.

On the plus side - yes, this means I at least can do digital modes on my TS-440S. I just have to keep unplugging the microphone line for now. What I may end up doing for now though is adding another switch to the desktop microphone I have to turn /its/ microphone input off so it is fully disconnected. Hopefully that'll be enough to do digital modes without constantly screwing and unscrewing things.

If you're at all curious -

Tuesday, October 10, 2017

FreeBSD and APRS, or "hm what happens when none of this is well documented.."

Here's another point along my quest for amateur radio on FreeBSD - bring up basic APRS support. Yes, someone else has done the work, but in the normal open source way it was .. inconsistently documented.

First is figuring out the hardware platform. I chose the following:

  • A Baofeng UV5R2, since they're cheap, plentiful, and do both VHF and UHF;
  • A cable to do sound level conversion and isolation (and yes, I really should post a circuit diagram and picture..);
  • A USB sound device, primarily so I can whack it into FreeBSD/Linux devices to get a separate sound card for doing radio work;
  • FreeBSD laptop (it'll become a raspberry pi + GPS + sensor + LCD thingy later, but this'll do to start with.)
The Baofeng is easy - set it to the right frequency (VHF APRS sits on 144.390MHz), turn on VOX so I don't have to make up a PTT cable, done/done.

The PTT bit isn't that hard - one of the microphone jack pins is actually PTT (if you ground it, it engages PTT) so when you make the cable just ensure you expose a ground pin and PTT pin so you can upgrade it later.

The cable itself isn't that hard either - I had a baofeng handmic lying around (they're like $5) so I pulled it apart for the cable. I'll try to remember to take pictures of that.

Here's a picture I found on the internet that shows the pinout:

Now, I went a bit further. I bought a bunch of 600 ohm isolation transformers for audio work, so I wired it up as follows:

  • From the audio output of the USB sound card, I wired up a little attenuator - input is 2k to ground, then 10k to the input side of the transformer; then the output side of the transformer has a 0.01uF greencap capacitor to the microphone input of the baofeng;
  • From the baofeng I just wired it up to the transformer, then the output side of that went into a 0.01uF greencap capacitor in series to the microphone input of the sound card.
In both instances those capacitors are there as DC blockers.

(I'd draw up a circuit diagram but for some reason there's no easy tool here in blogger to do that in-line! Sigh.)

Ok, so that bit is easy.

Then on to the software side.

The normal way people do this stuff is "direwolf" on Linux. So, "pkg install direwolf" installed it. That was easy.

Configuring it up was a bit less easy. I found this guide to be helpful:

FreeBSD has the example direwolf config in /usr/local/share/doc/direwolf/examples/direwolf.conf . Now, direwolf will run as a normal user (there's no rc.d script for it yet!) and by default runs out of the current directory. So:

$ cd ~
$ cp /usr/local/share/doc/direwolf/examples/direwolf.conf .
$ (edit it)
$ direwolf

Editing it isn't that hard - you need to change your callsign and the audio device.

OK, here is the main undocumented bit for FreeBSD - the sound device can just be /dev/dsp . It isn't an ALSA name! Don't waste time trying to use ALSA names. Instead, just find the device you want and reference it. For me the USB sound card shows up as /dev/dsp3 (which is very non specific as USB sound devices come and go, but that's a later problem!) but it's enough to bring it up.

So yes, following the above guide, using the right sound device name resulted in a working APRS modem.

Next up - something to talk to it. This is called 'xastir'. It's .. well, when you run it, you'll find exactly how old an X application it is. It's very nostalgically old. But, it is enough to get APRS positioning up and test both the TCP/IP side of APRS and the actual radio side.

Here's the guide I followed:

So, that was it! So far so good. It actually works well enough to decode and watch APRS traffic around me. I managed to get out position information to the APRS network over both TCP/IP and relayed via VHF radio.