Planet Linux Australia

David Rowe: Codec 2 HF Data Modes Part 2

Over the past few months I’ve been working on HF data modes, in particular building up a new burst acquisition system for our OFDM modem. As usual, what seemed like a small project turned out to be a lot of work! I’ve now integrated all the changes into the FreeDV API and started testing over the air, sending frames of data from a Tx at my home to remote SDRs all over Australia.

Features:

  • Importantly – this work is open source – filling a gap in the HF data world. HF is used for Ham radio, emergency communications and in the developing world where no other infrastructure exists. It needs to be open.
  • High performance waveforms designed for fast fading channels with modern FEC (thanks Bill, VK5DSP).
  • Implemented as a C library that can be cross compiled on many machines, and called from other programs (C and Python examples). You don’t need to be tied to one operating system or expensive, proprietary hardware.
  • Further development is supported by a suite of automated tests.

I’m not aiming to build a full blown TNC myself, just the layer that can move data frames over HF Radio channels. This seems to be where the real need lies, and the best use of my skills. I have however been working with TNC developers like Simon, DJ2LS. Together we have written a set of use cases that we have been developing against. This has been very useful, and a fun learning experience for both of us.

I’ve documented the Codec 2 HF data modes in README_data, which includes simple examples of how to use the API, and simulated/real world results.

Further work:

  • Automated testing over real world channels
  • Tuning performance
  • Port a higher bit rate QAM16 mode to C
  • Working with TNC developers
  • Prototype very simple low cost HF Data links using RTLSDRs and RpiTx transmitters

Reading Further

HF Acquisition Pull Request – journal of the recent development
README_data – Codec 2 data mode documentation (HF OFDM raw data section)
Codec2 HF Data Modes Part 1

Francois Marier: Deleting non-decryptable restic snapshots

Due to what I suspect is disk corruption caused by a faulty RAM module or network interface on my GnuBee, my restic backup failed with the following error:

$ restic check
using temporary cache in /var/tmp/restic-tmp/restic-check-cache-854484247
repository b0b0516c opened successfully, password is correct
created new cache in /var/tmp/restic-tmp/restic-check-cache-854484247
create exclusive lock for repository
load indexes
check all packs
check snapshots, trees and blobs
error for tree 4645312b:
  decrypting blob 4645312b443338d57295550f2f4c135c34bda7b17865c4153c9b99d634ae641c failed: ciphertext verification failed
error for tree 2c3248ce:
  decrypting blob 2c3248ce5dc7a4bc77f03f7475936041b6b03e0202439154a249cd28ef4018b6 failed: ciphertext verification failed
Fatal: repository contains errors

I started by locating the snapshots which make use of these corrupt trees:

$ restic find --tree 4645312b
repository b0b0516c opened successfully, password is correct
Found tree 4645312b443338d57295550f2f4c135c34bda7b17865c4153c9b99d634ae641c
 ... path /usr/include/boost/spirit/home/support/auxiliary
 ... in snapshot 41e138c8 (2021-01-31 08:35:16)
Found tree 4645312b443338d57295550f2f4c135c34bda7b17865c4153c9b99d634ae641c
 ... path /usr/include/boost/spirit/home/support/auxiliary
 ... in snapshot e75876ed (2021-02-28 08:35:29)

$ restic find --tree 2c3248ce
repository b0b0516c opened successfully, password is correct
Found tree 2c3248ce5dc7a4bc77f03f7475936041b6b03e0202439154a249cd28ef4018b6
 ... path /usr/include/boost/spirit/home/support/char_encoding
 ... in snapshot 41e138c8 (2021-01-31 08:35:16)
Found tree 2c3248ce5dc7a4bc77f03f7475936041b6b03e0202439154a249cd28ef4018b6
 ... path /usr/include/boost/spirit/home/support/char_encoding
 ... in snapshot e75876ed (2021-02-28 08:35:29)

and then deleted them:

$ restic forget 41e138c8 e75876ed
repository b0b0516c opened successfully, password is correct
[0:00] 100.00%  2 / 2 files deleted

$ restic prune 
repository b0b0516c opened successfully, password is correct
counting files in repo
building new index for repo
[13:23] 100.00%  58964 / 58964 packs
repository contains 58964 packs (1417910 blobs) with 278.913 GiB
processed 1417910 blobs: 0 duplicate blobs, 0 B duplicate
load all snapshots
find data that is still in use for 20 snapshots
[1:15] 100.00%  20 / 20 snapshots
found 1364852 of 1417910 data blobs still in use, removing 53058 blobs
will remove 0 invalid files
will delete 942 packs and rewrite 1358 packs, this frees 6.741 GiB
[10:50] 31.96%  434 / 1358 packs rewritten
hash does not match id: want 9ec955794534be06356655cfee6abe73cb181f88bb86b0cd769cf8699f9f9e57, got 95d90aa48ffb18e6d149731a8542acd6eb0e4c26449a4d4c8266009697fd1904
github.com/restic/restic/internal/repository.Repack
    github.com/restic/restic/internal/repository/repack.go:37
main.pruneRepository
    github.com/restic/restic/cmd/restic/cmd_prune.go:242
main.runPrune
    github.com/restic/restic/cmd/restic/cmd_prune.go:62
main.glob..func19
    github.com/restic/restic/cmd/restic/cmd_prune.go:27
github.com/spf13/cobra.(*Command).execute
    github.com/spf13/cobra/command.go:852
github.com/spf13/cobra.(*Command).ExecuteC
    github.com/spf13/cobra/command.go:960
github.com/spf13/cobra.(*Command).Execute
    github.com/spf13/cobra/command.go:897
main.main
    github.com/restic/restic/cmd/restic/main.go:98
runtime.main
    runtime/proc.go:204
runtime.goexit
    runtime/asm_amd64.s:1374

As you can see above, the prune command failed due to a corrupt pack, so I followed the process I previously wrote about and identified the affected snapshots using:

$ restic find --pack 9ec955794534be06356655cfee6abe73cb181f88bb86b0cd769cf8699f9f9e57

before deleting them with:

$ restic forget 031ab8f1 1672a9e1 1f23fb5b 2c58ea3a 331c7231 5e0e1936 735c6744 94f74bdb b11df023 dfa17ba8 e3f78133 eefbd0b0 fe88aeb5 
repository b0b0516c opened successfully, password is correct
[0:00] 100.00%  13 / 13 files deleted

$ restic prune
repository b0b0516c opened successfully, password is correct
counting files in repo
building new index for repo
[13:37] 100.00%  60020 / 60020 packs
repository contains 60020 packs (1548315 blobs) with 283.466 GiB
processed 1548315 blobs: 129812 duplicate blobs, 4.331 GiB duplicate
load all snapshots
find data that is still in use for 8 snapshots
[0:53] 100.00%  8 / 8 snapshots
found 1219895 of 1548315 data blobs still in use, removing 328420 blobs
will remove 0 invalid files
will delete 6232 packs and rewrite 1275 packs, this frees 36.302 GiB
[23:37] 100.00%  1275 / 1275 packs rewritten
counting files in repo
[11:45] 100.00%  52822 / 52822 packs
finding old index files
saved new indexes as [a31b0fc3 9f5aa9b5 db19be6f 4fd9f1d8 941e710b 528489d9 fb46b04a 6662cd78 4b3f5aad 0f6f3e07 26ae96b2 2de7b89f 78222bea 47e1a063 5abf5c2d d4b1d1c3 f8616415 3b0ebbaa]
remove 23 old index files
[0:00] 100.00%  23 / 23 files deleted
remove 7507 old packs
[0:08] 100.00%  7507 / 7507 files deleted
done

And with 13 of my 21 snapshots deleted, the checks now pass:

$ restic check
using temporary cache in /var/tmp/restic-tmp/restic-check-cache-407999210
repository b0b0516c opened successfully, password is correct
created new cache in /var/tmp/restic-tmp/restic-check-cache-407999210
create exclusive lock for repository
load indexes
check all packs
check snapshots, trees and blobs
no errors were found

This represents a significant amount of lost backup history, but at least it's not all of it.
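
For extra assurance that the remaining snapshots are fully readable, restic can also verify the actual pack contents (much slower, since it reads every pack back):

$ restic check --read-data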

Russell Coker: Yama

I’ve just set up the Yama LSM module on some of my Linux systems. Yama controls ptrace, which is the debugging and tracing API for Unix systems. The aim is to prevent a compromised process from using ptrace to compromise other processes and cause more damage. In most cases a process which can ptrace another process (which usually means having capability SYS_PTRACE, i.e. being root, or having the same UID as the target process) can interfere with that process in other ways, such as modifying its configuration and data files. But even so I think it has the potential to make things more difficult for attackers without making the system more difficult to use.

If you put “kernel.yama.ptrace_scope = 1” in sysctl.conf (or write “1” to /proc/sys/kernel/yama/ptrace_scope) then a user process can only trace its child processes. This means that “strace -p” and “gdb -p” will fail when run as non-root, but apart from that everything else will work. Generally “strace -p” (tracing the system calls of another process) is of most use to the sysadmin, who can do it as root. The command “gdb -p” and variants of it are commonly used by developers, so Yama wouldn’t be a good thing on a system that is primarily used for software development.
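
For example, to apply mode 1 both persistently and immediately (the file name below is just an example):

echo "kernel.yama.ptrace_scope = 1" > /etc/sysctl.d/10-yama.conf
sysctl -w kernel.yama.ptrace_scope=1
cat /proc/sys/kernel/yama/ptrace_scope   # should print 1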

Another option is “kernel.yama.ptrace_scope = 3” which means no one can use ptrace and the setting can’t be changed without a reboot. This could be a good option for production servers that have no need for software development. It wouldn’t work well for a small server where the sysadmin needs to debug everything, but when dozens or hundreds of servers have their configuration rolled out via a provisioning tool this would be a good setting to include.

See Documentation/admin-guide/LSM/Yama.rst in the kernel source for the details.

When running with capability SYS_PTRACE (i.e. a root shell) you can ptrace anything else, and if necessary disable Yama by writing “0” to /proc/sys/kernel/yama/ptrace_scope.

I am enabling mode 1 on all my systems because I think it will make things harder for attackers while not making things more difficult for me.

Also note that SE Linux restricts SYS_PTRACE and also restricts cross-domain ptrace access, so the combination with Yama makes things extra difficult for an attacker.

Yama is enabled in the Debian kernels by default so it’s very easy to setup for Debian users, just edit /etc/sysctl.d/whatever.conf and it will be enabled on boot.

Russell Coker: Riverdale

I’ve been watching the show Riverdale on Netflix recently. It’s an interesting modern take on the Archie comics. Having watched Josie and the Pussycats in Outer Space when I was younger I was anticipating something aimed towards a similar audience. As solving mysteries and crimes was apparently a major theme of the show I anticipated something along similar lines to Scooby Doo, some suspense and some spooky things, but then a happy ending where criminals get arrested and no-one gets hurt or killed while the vast majority of people are nice. Instead the first episode has a teen being murdered and Ms Grundy being obsessed with 15yo boys and sleeping with Archie (who’s supposed to be 15 but played by a 20yo actor).

Everyone in the show has some dark secret. The filming has a dark theme, the sky is usually overcast and it’s generally gloomy. This is a significant contrast to Veronica Mars which has some similarities in having a young cast, a sassy female sleuth, and some similar plot elements. Veronica Mars has a bright theme and a significant comedy element in spite of dealing with some dark issues (murder, rape, child sex abuse, and more). But Riverdale is just dark. Anyone who watches this with their kids expecting something like Scooby Doo is in for a big surprise.

There are lots of interesting stylistic elements in the show, such as clothing and uniform designs that seem to date from the 1940s. It seems like some alternate universe where kids have smartphones and laptops while dressing in the style of the 1940s. One thing that annoyed me was construction workers using tools like sledge-hammers instead of excavators. A society that has smart phones but no earth-moving equipment isn’t plausible.

On the upside there is a racial mix in the show that more accurately reflects American society than the original Archie comics and homophobia is much less common than in most parts of our society. For both race issues and gay/lesbian issues the show treats them in an accurate way (portraying some bigotry) while the main characters aren’t racist or homophobic.

I think it’s generally an OK show and recommend it to people who want a dark show. It’s a good show to watch while doing something on a laptop so you can check Wikipedia for the references to 1940s stuff (like when bikinis were invented). I’m halfway through season 3, which isn’t as good as the first two; I don’t know whether it will get better later in the season or whether I should have stopped after season 2.

I don’t usually review fiction, but the interesting aesthetics of the show made it deserve a review.

Russell Coker: Storage Trends 2021

The Viability of Small Disks

Less than a year ago I wrote a blog post about storage trends [1]. My main point in that post was that disks smaller than 2TB weren’t viable then and 2TB disks wouldn’t be economically viable in the near future.

Now MSY has 2TB disks for $72 and 2TB SSDs for $245, saving $173 if you get a hard drive (compared to saving $240 10 months ago). Given the difference in performance and noise, 2TB hard drives won’t be worth using for most applications nowadays.

NVMe vs SSD

Last year NVMe prices were very comparable to SSD prices, and I was hoping that trend would continue and SATA SSDs would go away. Now for sizes 1TB and smaller NVMe and SSD prices are very similar, but for 2TB the NVMe prices are twice those of SSDs – presumably partly due to poor demand for 2TB NVMe. There are also no NVMe devices larger than 2TB on sale at MSY (a store which caters to home users, not special server equipment) but SSDs go up to 8TB.

It seems that NVMe is only really suitable for workstation storage and for cache etc on a server. So SATA SSDs will be around for a while.

Small Servers

There are a range of low end servers which support a limited number of disks. Dell has 2 disk servers and 4 disk servers. If one of those had 8TB SSDs you could have 8TB of RAID-1 or 24TB of RAID-Z storage in a low end server. That covers the vast majority of servers (small business or workgroup servers tend to have less than 8TB of storage).

Larger Servers

Anandtech has an article on Seagate’s roadmap to 120TB disks [2]. They currently sell 20TB disks using HAMR technology.

Currently the biggest disks that MSY sells are 10TB for $395, which was also the biggest disk they were selling last year. Last year MSY only sold SSDs up to 2TB in size (larger ones were available from other companies at much higher prices), now they sell 8TB SSDs for $949 (4* capacity increase in less than a year). Seagate is planning 30TB disks for 2023, if SSDs continue to increase in capacity by 4* per year we could have 128TB SSDs in 2023. If you needed a server with 100TB of storage then having 2 or 3 SSDs in a RAID array would be much easier to manage and faster than 4*30TB disks in an array.

When you have a server with many disks you can expect to have more disk failures due to vibration. One time I built a server with 18 disks and took disks from 2 smaller servers that had 4 and 5 disks. The 9 disks which had been working reliably for years started having problems within weeks of running in the bigger server. This is one of the many reasons for paying extra for SSD storage.

Seagate is apparently planning 50TB disks for 2026 and 100TB disks for 2030. If that’s the best they can do then SSD vendors should be able to sell larger products sooner at prices that are competitive. Matching hard drive prices is not required, getting to less than 4* the price should be enough for most customers.

The Anandtech article is worth reading, it mentions some interesting features that Seagate are developing such as having 2 actuators (which they call Mach.2) so the drive can access 2 different tracks at the same time. That can double the performance of a disk, but that doesn’t change things much when SSDs are more than 100* faster. Presumably the Mach.2 disks will be SAS and incredibly expensive while providing significantly less performance than affordable SATA SSDs.

Computer Cases

In my last post I speculated on the appearance of smaller cases designed to not have DVD drives or 3.5″ hard drives. Such cases still haven’t appeared apart from special purpose machines like the NUC that were available last year.

It would be nice if we could get a new industry standard for smaller power supplies. Currently power supplies are expected to be almost 5 inches wide (due to the expectation of a 5.25″ DVD drive mounted horizontally). We need some industry standards for smaller PCs that aren’t like the NUC; the NUC is very nice, but most people who build their own PC need more space than that. I still think that planning on USB DVD drives is the right way to go. I’ve got 4 PCs in my home that are regularly used, and CDs and DVDs are used so rarely that sharing a single DVD drive among all 4 wouldn’t be a problem.

Conclusion

I’m tempted to get a couple of 4TB SSDs for my home server which cost $487 each, it currently has 2*500G SSDs and 3*4TB disks. I would have to remove some unused files but that’s probably not too hard to do as I have lots of old backups etc on there. Another possibility is to use 2*4TB SSDs for most stuff and 2*4TB disks for backups.

I’m recommending that all my clients only use SSDs for their storage. I only have one client with enough storage that disks are the only option (100TB of storage) but they moved all the functions of that server to AWS and use S3 for the storage. Now I don’t have any clients doing anything with storage that can’t be done in a better way on SSD for a price difference that’s easy for them to afford.

Affordable SSD also makes RAID-1 in workstations more viable. 2 disks in a PC is noisy if you have an office full of them and produces enough waste heat to be a reliability issue (most people don’t cool their offices adequately on weekends). 2 SSDs in a PC is no problem at all. As 500G SSDs are available for $73 it’s not a significant cost to install 2 of them in every PC in the office (more cost for my time than hardware). I generally won’t recommend that hard drives be replaced with SSDs in systems that are working well. But if a machine runs out of space then replacing it with SSDs in a RAID-1 is a good choice.

Moore’s law might cover SSDs, but it definitely doesn’t cover hard drives. Hard drives have fallen way behind developments of most other parts of computers over the last 30 years, hopefully they will go away soon.

Jan Schmidt: Rift CV1 – Getting close now…

It’s been a while since my last post about tracking support for the Oculus Rift in February. There have been big improvements since then – it’s working really well a lot of the time. It’s gone from “If I don’t make any sudden moves, I can finish an easy Beat Saber level” to “You can’t hide from me!” quality.

Equally, there are still enough glitches and corner cases that I think I’ll still be at this a while.

Here’s a video from 3 weeks ago of (not me) playing Beat Saber on Expert+ setting showing just how good things can be now:

Beat Saber – Skunkynator playing Expert+, Mar 16 2021

Strap in. Here’s what I’ve worked on in the last 6 weeks:

Pose Matching improvements

Most of the biggest improvements have come from improving the computer vision algorithm that’s matching the observed LEDs (blobs) in the camera frames to the 3D models of the devices.

I split the brute-force search algorithm into 2 phases. It now does a first pass looking for ‘obvious’ matches. In that pass, it does a shallow graph search of blobs and their nearest few neighbours against LEDs and their nearest neighbours, looking for a match using a “Strong” match metric. A match is considered strong if expected LEDs match observed blobs to within 1.5 pixels.

Coupled with checks on the expected orientation (matching the Gravity vector detected by the IMU) and the pose prior (expected position and orientation are within predicted error bounds) this short-circuit on the search is hit a lot of the time, and often completes within 1 frame duration.

In the remaining tricky cases, where a deeper graph search is required in order to recover the pose, the initial search reduces the number of LEDs and blobs under consideration, speeding up the remaining search.

I also added an LED size model to the mix – for a candidate pose, it tries to work out how large (in pixels) each LED should appear, and use that as a bound on matching blobs to LEDs. This helps reduce mismatches as devices move further from the camera.

LED labelling

When a brute-force search for pose recovery completes, the system now knows the identity of various blobs in the camera image. One way it avoids a search next time is to transfer the labels into future camera observations using optical-flow tracking on the visible blobs.

The problem is that, even sped up, the search can still take a few frame-durations to complete. Previously LED labels would be transferred from frame to frame as they arrived, but there’s now a unique ID associated with each blob that allows the labels to be transferred even several frames later, once their identity is known.

IMU Gyro scale

One of the problems with reverse engineering is the guesswork around exactly what different values mean. I was looking into why the controller movement felt “swimmy” under fast motions, and one thing I found was that the interpretation of the gyroscope readings from the IMU was incorrect.

The touch controllers report IMU angular velocity readings directly as a 16-bit signed integer. Previously the code would take the reading and divide by 1024 and use the value as radians/second.

From teardowns of the controller, I know the IMU is an Invensense MPU-6500. From the datasheet, the reported value is actually in degrees per second and appears to be configured for the +/- 2000 °/s range. That yields a conversion of Gyro-rad/s = Gyro-reading * (2000 / 32768) * (π/180) – or a divisor of 938.734.
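
A quick way to double-check that divisor from a shell (just arithmetic, not part of the driver code):

echo "32768 / 2000 * 180 / (4 * a(1))" | bc -l
# prints ~938.73; with bc -l, a(1) is arctan(1) = π/4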

The 1024 divisor was under-estimating rotation speed by about 10% – close enough to work until you start moving quickly.

Limited interpolation

If we don’t find a device in the camera views, the fusion filter predicts motion using the IMU readings – but that quickly becomes inaccurate. In the worst case, the controllers fly off into the distance. To avoid that, I added a limit of 500ms for ‘coasting’. If we haven’t recovered the device pose by then, the position is frozen in place and only rotation is updated until the cameras find it again.

Exponential filtering

I implemented a 1-Euro exponential smoothing filter on the output poses for each device. This is an idea from the Project Esky driver for Project North Star/Deck-X AR headsets, and almost completely eliminates jitter in the headset view and hand controllers shown to the user. The tradeoff is against introducing lag when the user moves quickly – but there are some tunables in the exponential filter to play with for minimising that. For now I’ve picked some values that seem to work reasonably.

Non-blocking radio

Communications with the touch controllers happens through USB radio command packets sent to the headset. The main use of radio commands in OpenHMD is to read the JSON configuration block for each controller that is programmed in at the factory. The configuration block provides the 3D model of LED positions as well as initial IMU bias values.

Unfortunately, reading the configuration block takes a couple of seconds on startup, and blocks everything while it’s happening. Oculus saw that problem and added a checksum in the controller firmware. You can read the checksum first and if it hasn’t changed use a local cache of the configuration block. Eventually, I’ll implement that caching mechanism for OpenHMD but in the meantime it still reads the configuration blocks on each startup.

As an interim improvement I rewrote the radio communication logic to use a state machine that is checked in the update loop – allowing radio communications to be interleaved without blocking the regular processing of events. It still interferes a bit, but no longer causes a full multi-second stall as each hand controller turns on.

Haptic feedback

The hand controllers have haptic feedback ‘rumble’ motors that really add to the immersiveness of VR by letting you sense collisions with objects. Until now, OpenHMD hasn’t had any support for applications to trigger haptic events. I spent a bit of time looking at USB packet traces with Philipp Zabel and we figured out the radio commands to turn the rumble motors on and off.

In the Rift CV1, the haptic motors have a mode where you schedule feedback events into a ringbuffer – effectively they operate like a low frequency audio device. However, that mode was removed for the Rift S (and presumably in the Quest devices) – and deprecated for the CV1.

With that in mind, I aimed for implementing the unbuffered mode, with explicit ‘motor on + frequency + amplitude’ and ‘motor off’ commands sent as needed. Thanks to already having rewritten the radio communications to use a state machine, adding haptic commands was fairly easy.

The big question mark is around what API OpenHMD should provide for haptic feedback. I’ve implemented something simple for now, to get some discussion going. It works really well and adds hugely to the experience. That code is in the https://github.com/thaytan/OpenHMD/tree/rift-haptics branch, with a SteamVR-OpenHMD branch that uses it in https://github.com/thaytan/SteamVR-OpenHMD/tree/controller-haptics-wip

Problem areas

Unexpected tracking losses

I’d say the biggest problem right now is unexpected tracking loss and incorrect pose extractions when I’m not expecting them. Especially my right controller will suddenly glitch and start jumping around. Looking at a video of the debug feed, it’s not obvious why that’s happening:

To fix cases like those, I plan to add code to log the raw video feed and the IMU information together so that I can replay the video analysis frame-by-frame and investigate glitches systematically. Those recordings will also work as a regression suite to test future changes.

Sensor fusion efficiency

The Kalman filter I have implemented works really nicely – it does the latency compensation, predicts motion and extracts sensor biases all in one place… but it has a big downside of being quite expensive in CPU. The Unscented Kalman Filter CPU cost grows at O(n^3) with the size of the state, and the state in this case is 43 dimensional – 22 base dimensions, and 7 per latency-compensation slot. Running 1000 updates per second for the HMD and 500 for each of the hand controllers adds up quickly.

At some point, I want to find a better / cheaper approach to the problem that still provides low-latency motion predictions for the user while still providing the same benefits around latency compensation and bias extraction.

Lens Distortion

To generate a convincing illusion of objects at a distance in a headset that’s only a few centimetres deep, VR headsets use some interesting optics. The output of the LCD/OLED display panels gets distorted heavily before it hits the user’s eyes. What the software generates needs to compensate by applying the right inverse distortion to the output video.

Everyone that tests the CV1 notices that the distortion is not quite correct. As you look around, the world warps and shifts annoyingly. Sooner or later that needs fixing. That’s done by taking photos of calibration patterns through the headset lenses and generating a distortion model.

Camera / USB failures

The camera feeds are captured using a custom user-space UVC driver implementation that knows how to set up the special synchronisation settings of the CV1 and DK2 cameras, and then repeatedly schedules isochronous USB packet transfers to receive the video.

Occasionally, some people experience failure to re-schedule those transfers. The kernel rejects them with an out-of-memory error failing to set aside DMA memory (even though it may have been running fine for quite some time). It’s not clear why that happens – but the end result at the moment is that the USB traffic for that camera dies completely and there’ll be no more tracking from that camera until the application is restarted.

Often once it starts happening, it will keep happening until the PC is rebooted and the kernel memory state is reset.

Occluded cases

Tracking generally works well when the cameras get a clear shot of each device, but there are cases like sighting down the barrel of a gun where we expect that the user will line up the controllers in front of one another, and in front of the headset. In that case, even though we probably have a good idea where each device is, it can be hard to figure out which LEDs belong to which device.

If we already have a good tracking lock on the devices, I think it should be possible to keep tracking even down to 1 or 2 LEDs being visible – but the pose assessment code will have to be aware that’s what is happening.

Upstreaming

April 14th marks 2 years since I first branched off OpenHMD master to start working on CV1 tracking. How hard can it be, I thought? I’ll knock this over in a few months.

Since then I’ve accumulated over 300 commits on top of OpenHMD master that eventually all need upstreaming in some way.

One thing people have expressed as a prerequisite for upstreaming is to try and remove the OpenCV dependency. The tracking relies on OpenCV to do camera distortion calculations, and for their PnP implementation. It should be possible to reimplement both of those directly in OpenHMD with a bit of work – possibly using the fast LambdaTwist P3P algorithm that Philipp Zabel wrote, that I’m already using for pose extraction in the brute-force search.

Others

I’ve picked the top issues to highlight here. https://github.com/thaytan/OpenHMD/issues has a list of all the other things that are still on the radar for fixing eventually.

Other Headsets

At some point soon, I plan to put a pin in the CV1 tracking and look at adapting it to more recent inside-out headsets like the Rift S and WMR headsets. I implemented 3DOF support for the Rift S last year, but getting to full positional tracking for that and other inside-out headsets means implementing a SLAM/VIO tracking algorithm to track the headset position.

Once the headset is tracking, the code I’m developing here for CV1 to find and track controllers will hopefully transfer across – the difference with inside-out tracking is that the cameras move around with the headset. Finding the controllers in the actual video feed should work much the same.

Sponsorship

This development happens mostly in my spare time and partly as open source contribution time at work at Centricular. I am accepting funding through Github Sponsorships to help me spend more time on it – I’d really like to keep helping Linux have top-notch support for VR/AR applications. Big thanks to the people that have helped get this far.

Francois Marier: Removing a corrupted data pack in a Restic backup

I recently ran into a corrupted data pack in a Restic backup on my GnuBee. It led to consistent failures during the prune operation:

incomplete pack file (will be removed): b45afb51749c0778de6a54942d62d361acf87b513c02c27fd2d32b730e174f2e
incomplete pack file (will be removed): c71452fa91413b49ea67e228c1afdc8d9343164d3c989ab48f3dd868641db113
incomplete pack file (will be removed): 10bf128be565a5dc4a46fc2fc5c18b12ed2e77899e7043b28ce6604e575d1463
incomplete pack file (will be removed): df282c9e64b225c2664dc6d89d1859af94f35936e87e5941cee99b8fbefd7620
incomplete pack file (will be removed): 1de20e74aac7ac239489e6767ec29822ffe52e1f2d7f61c3ec86e64e31984919
hash does not match id: want 8fac6efe99f2a103b0c9c57293a245f25aeac4146d0e07c2ab540d91f23d3bb5, got 2818331716e8a5dd64a610d1a4f85c970fd8ae92f891d64625beaaa6072e1b84
github.com/restic/restic/internal/repository.Repack
        github.com/restic/restic/internal/repository/repack.go:37
main.pruneRepository
        github.com/restic/restic/cmd/restic/cmd_prune.go:242
main.runPrune
        github.com/restic/restic/cmd/restic/cmd_prune.go:62
main.glob..func19
        github.com/restic/restic/cmd/restic/cmd_prune.go:27
github.com/spf13/cobra.(*Command).execute
        github.com/spf13/cobra/command.go:838
github.com/spf13/cobra.(*Command).ExecuteC
        github.com/spf13/cobra/command.go:943
github.com/spf13/cobra.(*Command).Execute
        github.com/spf13/cobra/command.go:883
main.main
        github.com/restic/restic/cmd/restic/main.go:86
runtime.main
        runtime/proc.go:204
runtime.goexit
        runtime/asm_amd64.s:1374

Thanks to the excellent support forum, I was able to resolve this issue by dropping a single snapshot.

First, I identified the snapshot which contained the offending pack:

$ restic -r sftp:hostname.local: find --pack 8fac6efe99f2a103b0c9c57293a245f25aeac4146d0e07c2ab540d91f23d3bb5
repository b0b0516c opened successfully, password is correct
Found blob 2beffa460d4e8ca4ee6bf56df279d1a858824f5cf6edc41a394499510aa5af9e
 ... in file /home/francois/.local/share/akregator/Archive/http___udd.debian.org_dmd_feed_
     (tree 602b373abedca01f0b007fea17aa5ad2c8f4d11f1786dd06574068bf41e32020)
 ... in snapshot 5535dc9d (2020-06-30 08:34:41)

Then, I could simply drop that snapshot:

$ restic -r sftp:hostname.local: forget 5535dc9d
repository b0b0516c opened successfully, password is correct
[0:00] 100.00%  1 / 1 files deleted

and run the prune command to remove the snapshot, as well as the incomplete packs that were also mentioned in the above output but could never be removed due to the other error:

$ restic -r sftp:hostname.local: prune
repository b0b0516c opened successfully, password is correct
counting files in repo
building new index for repo
[20:11] 100.00%  77439 / 77439 packs
incomplete pack file (will be removed): b45afb51749c0778de6a54942d62d361acf87b513c02c27fd2d32b730e174f2e
incomplete pack file (will be removed): c71452fa91413b49ea67e228c1afdc8d9343164d3c989ab48f3dd868641db113
incomplete pack file (will be removed): 10bf128be565a5dc4a46fc2fc5c18b12ed2e77899e7043b28ce6604e575d1463
incomplete pack file (will be removed): df282c9e64b225c2664dc6d89d1859af94f35936e87e5941cee99b8fbefd7620
incomplete pack file (will be removed): 1de20e74aac7ac239489e6767ec29822ffe52e1f2d7f61c3ec86e64e31984919
repository contains 77434 packs (2384522 blobs) with 367.648 GiB
processed 2384522 blobs: 1165510 duplicate blobs, 47.331 GiB duplicate
load all snapshots
find data that is still in use for 15 snapshots
[1:11] 100.00%  15 / 15 snapshots
found 1006062 of 2384522 data blobs still in use, removing 1378460 blobs
will remove 5 invalid files
will delete 13728 packs and rewrite 15140 packs, this frees 142.285 GiB
[4:58:20] 100.00%  15140 / 15140 packs rewritten
counting files in repo
[18:58] 100.00%  50164 / 50164 packs
finding old index files
saved new indexes as [340cb68f 91ff77ef ee21a086 3e5fa853 084b5d4b 3b8d5b7a d5c385b4 5eff0be3 2cebb212 5e0d9244 29a36849 8251dcee 85db6fa2 29ed23f6 fb306aba 6ee289eb 0a74829d]
remove 190 old index files
[0:00] 100.00%  190 / 190 files deleted
remove 28868 old packs
[1:23] 100.00%  28868 / 28868 files deleted
done

Recovering from a corrupt pack

If dropping a single snapshot is not an option (for example, if the corrupt pack is used in every snapshot!) and you still have the original file, then you can try a different approach:

restic rebuild-index
restic backup

If you are using an older version of restic which doesn't automatically heal repos, you may need to use the --force option to the backup command.
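
For example, with the same repository as above (the backup path here is hypothetical):

$ restic -r sftp:hostname.local: rebuild-index
$ restic -r sftp:hostname.local: backup /path/to/original/files   # add --force on older restic versions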

Stewart Smith: libeatmydata v129

Every so often, I release a new libeatmydata. This has not happened for a long time. This release is just some bug fixes, most of which have been in the Debian package for some time; I’ve just been lazy and not sat down and merged them.

git clone https://github.com/stewartsmith/libeatmydata.git

Download the source tarball from here: libeatmydata-129.tar.gz and GPG signature: libeatmydata-129.tar.gz.asc from my GPG key.

Or, feel free to grab some Fedora RPMs:

Releases published also in the usual places:

Simon Lyall: Audiobooks – March 2021

The Dream Machine: The Untold History of the Notorious V-22 Osprey by Richard Whittle

The story of tilt-rotor aircraft & the long history of the V-22’s development. Covers defense politics and technical matters equally well. 4/5

Broad Band: The Untold Story of the Women Who Made the Internet by Claire L. Evans

A series of stories about individuals, not just about the Internet but about women and early computing, hypertext, etc. Interesting and well written. 3/5

The Fifth Risk by Michael Lewis

Lewis interviews people involved in the Obama to Trump transition at 3 major government agencies. He profiles the people, their jobs and in most cases how the Trump people underestimated the Dept’s importance. 3/5

OK Boomer, Let’s Talk: How My Generation Got Left Behind by Jill Filipovic

Mostly a stats dump with a few profiles and accounts of struggling millennials sprinkled in. With a weird tone shift to boomer-love in the last chapter. Okay I guess 3/5

Six Days of Impossible: Navy SEAL Hell Week – A Doctor Looks Back by Robert Adams

A first-hand account of a training class in 1974/75 where only 11 of the 71 starters graduated. Fun read although some interviews with non-graduates would have provided a contrast. 3/5

Three Laws of Nature: A Little Book on Thermodynamics by R Stephen Berry

Science mixed in with some history, designed for those with minimal science. The equations were simple but numerous & didn’t work in audiobook format. Try the printed version. 2/5

Space Odyssey: Stanley Kubrick Arthur C Clarke and the Making of a Masterpiece by Michael Benson

A detailed account of the film’s making from pre-production though to the bad reviews of the first release. Covers most aspects of the film and people involved. 4/5

The Soul of a New Machine by Tracy Kidder

Pulitzer Prize winning story of a team creating a new model of minicomputer in the late-1970s. Good portraits of the team members and aspects of the tech. 4/5

My Scoring System

  • 5/5 = Brilliant, top 5 book of the year
  • 4/5 = Above average, strongly recommend
  • 3/5 = Average, in the middle 70% of books I read
  • 2/5 = Disappointing
  • 1/5 = Did not like at all

Share

Russell Coker: Censoring Images

A client asked me to develop a system for “censoring” images from an automatic camera. The situation is that we have a camera taking regular photos from a fixed location which includes part of someone else’s property. So my client made a JPEG with some black rectangles in the sections that need to be covered. The first thing I needed to do was convert the JPEG to a PNG with transparency for the sections that aren’t to be covered.

To convert it I loaded the JPEG in the GIMP and went to the Layer->Transparency->Add Alpha Channel menu to enable the Alpha channel. Then I selected the “Bucket Fill tool” and used “Mode Erase” and “Fill by Composite” and then clicked on the background (the part of the JPEG that was white) to make it transparent. Then I exported it to PNG.

If anyone knows of an easy way to convert the file then please let me know. It would be nice if there was a command-line program I could run to convert a specified color (default white) to transparent. I say this because I can imagine my client going through a dozen iterations of an overlay file that doesn’t quite fit.
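
One possibility I haven’t tested yet is ImageMagick’s -transparent operator, which turns a given colour (with some fuzz to cope with JPEG compression artifacts) into transparency:

convert overlay.jpg -fuzz 10% -transparent white overlay.png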

To censor the image I ran the “composite” command from imagemagick. The command I used was “composite -gravity center overlay.png in.jpg out.jpg“. If anyone knows a better way of doing this then please let me know.

The platform I’m using is an ARM926EJ-S rev 5 (v5l) which takes 8 minutes of CPU time to convert a single JPEG at full DSLR resolution (4 megapixels). It also required enabling swap on an SD card to avoid running out of RAM and running “systemctl disable tmp.mount” to stop using tmpfs for /tmp as the system only has 256M of RAM.

Paul Wise: FLOSS Activities March 2021

Focus

This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Debugging

Review

Administration

  • Debian packages: migrate flower git repo from alioth-archive to salsa
  • Debian: restart bacula-director after PostgreSQL restart
  • Debian wiki: block spammer, clean up spam, approve accounts

Communication

Sponsors

The librecaptcha/libpst/flower/marco work was sponsored by my employers. All other work was done on a volunteer basis.

BlueHackers: World bipolar day 2021

Today, 30 March, is World Bipolar Day.

Vincent van Gogh - Worn Out

Why that particular date? It’s Vincent van Gogh’s birthday (1853), and there is a fairly strong argument that the Dutch painter suffered from bipolar (among other things).

The image on the side is Vincent’s drawing “Worn Out” (from 1882), and it seems to capture the feeling rather well – whether (hypo)manic, depressed, or mixed. It’s exhausting.

Bipolar is complicated, often undiagnosed or misdiagnosed, and when only treated with anti-depressants, it can trigger the (hypo)mania – essentially dragging that person into that state near-permanently.

Have you heard of Bipolar II?

Hypo-mania is the “lesser” form of mania that distinguishes Bipolar I (the classic “manic depressive” syndrome) from Bipolar II. It’s “lesser” only in the sense that rather than someone going so hyper they may think they can fly (Bipolar I is often identified when someone in a manic state gets admitted to hospital – good catch!), with Bipolar II the hypo-mania may actually exhibit as anger. Anger in general, against nothing in particular but potentially everyone and everything around them. Or, if it’s a mixed episode, anger combined with strong negative thoughts. Either way, it does not look like classic mania. It is, however, exhausting and can be very debilitating.

Bipolar II people often present to a doctor while in a depressed state, and GPs (not being psychiatrists) may not do a full diagnosis. Note that D.A.S. and similar test sheets are screening tools; they are not diagnostic. A proper diagnosis is more complex than filling in a form with some questions (who would have thought!)

Call to action

If you have a diagnosis of depression, only from a GP, and are on medication for this, I would strongly recommend you also get a referral to a psychiatrist to confirm that diagnosis.

Our friends at the awesome Black Dog Institute have excellent information on bipolar, as well as a quick self-test – if that shows some likelihood of bipolar, go get that referral and follow up ASAP.

I will be writing more about the topic in the coming time.

The post World bipolar day 2021 first appeared on BlueHackers.org.

Francois Marier: Using a Let's Encrypt TLS certificate with Asterisk 16.2

In order to fix the following error after setting up SIP TLS in Asterisk 16.2:

asterisk[8691]: ERROR[8691]: tcptls.c:966 in __ssl_setup: TLS/SSL error loading cert file. <asterisk.pem>

I created a Let's Encrypt certificate using certbot:

apt install certbot
certbot certonly --standalone -d hostname.example.com

To enable the asterisk user to load the certificate successfully (it doesn't have permission to access the certificates under /etc/letsencrypt/), I copied them to the right directory:

cp /etc/letsencrypt/live/hostname.example.com/privkey.pem /etc/asterisk/asterisk.key
cp /etc/letsencrypt/live/hostname.example.com/fullchain.pem /etc/asterisk/asterisk.cert
chown asterisk:asterisk /etc/asterisk/asterisk.cert /etc/asterisk/asterisk.key
chmod go-rwx /etc/asterisk/asterisk.cert /etc/asterisk/asterisk.key

Then I set the following variables in /etc/asterisk/sip.conf:

tlscertfile=/etc/asterisk/asterisk.cert
tlsprivatekey=/etc/asterisk/asterisk.key

Automatic renewal

The machine on which I run asterisk has a tricky Apache setup:

  • a webserver is running on port 80
  • port 80 is restricted to the local network

This meant that the certbot domain ownership checks would get blocked by the firewall, and I couldn't open that port without exposing the private webserver to the Internet.

So I ended up disabling the built-in certbot renewal mechanism:

systemctl disable certbot.timer certbot.service
systemctl stop certbot.timer certbot.service

and then writing my own script in /etc/cron.daily/certbot-francois:

#!/bin/bash
TEMPFILE=`mktemp`

# Stop Apache and backup firewall.
/bin/systemctl stop apache2.service
/usr/sbin/iptables-save > $TEMPFILE

# Open up port 80 to the whole world.
/usr/sbin/iptables -D INPUT -j LOGDROP
/usr/sbin/iptables -A INPUT -p tcp --dport 80 -j ACCEPT
/usr/sbin/iptables -A INPUT -j LOGDROP

# Renew all certs.
/usr/bin/certbot renew --quiet

# Restore firewall and restart Apache.
/usr/sbin/iptables -D INPUT -p tcp --dport 80 -j ACCEPT
/usr/sbin/iptables-restore < $TEMPFILE
/bin/systemctl start apache2.service

# Copy certificate into asterisk.
cp /etc/letsencrypt/live/hostname.example.com/privkey.pem /etc/asterisk/asterisk.key
cp /etc/letsencrypt/live/hostname.example.com/fullchain.pem /etc/asterisk/asterisk.cert
chown asterisk:asterisk /etc/asterisk/asterisk.cert /etc/asterisk/asterisk.key
chmod go-rwx /etc/asterisk/asterisk.cert /etc/asterisk/asterisk.key
/bin/systemctl restart asterisk.service

# Commit changes to etckeeper.
pushd /etc/ > /dev/null
/usr/bin/git add letsencrypt asterisk
DIFFSTAT="$(/usr/bin/git diff --cached --stat)"
if [ -n "$DIFFSTAT" ] ; then
    /usr/bin/git commit --quiet -m "Renewed letsencrypt certs."
    echo "$DIFFSTAT"
fi
popd > /dev/null
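
An alternative I haven’t tried would be to keep certbot’s own timer and move the firewall and service juggling into renewal hooks. The two hook scripts below are hypothetical wrappers around the same iptables/apache2/asterisk steps as the script above:

certbot renew --quiet \
    --pre-hook "/usr/local/bin/open-port-80" \
    --post-hook "/usr/local/bin/close-port-80-and-restart-services"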

Stewart Smith: The Apple Power Macintosh 7200/120 PC Compatible (Part 1)

So, I learned something recently: if you pick up your iPhone with eBay open on an auction bid screen in just the right way, you may accidentally click the bid button and end up buying an old computer. Totally not the worst thing ever, and certainly a creative way to make a decision.

So, not too long later, a box arrives!

In the 1990s, Apple created some pretty “interesting” computers and product lines. One thing you could get was a DOS Compatibility (or PC Compatibility) card. This was a card that went into one of the expansion slots on a Mac and had something really curious on it: most of the guts of a PC.

Others have written on these cards too: https://www.engadget.com/2009-12-10-before-there-was-boot-camp-there-were-dos-compatibility-cards.html and http://www.edibleapple.com/2009/12/09/blast-from-the-past-a-look-back-at-apples-dos-compatibility-cards/. There’s also the Service Manual https://tim.id.au/laptops/apple/misc/pc_compatibility_card.pdf with some interesting details.

The machine I’d bought was an Apple Power Macintosh 7200/120 with the PC Compatible card added afterwards (so it doesn’t have the PC Compatible label on the front like some models ended up getting).

The Apple Power Macintosh 7200/120

Wikipedia has a good article on the line, noting that it was first released in August 1995, and fitting for the era, was sold as about 14 million other model numbers (okay not quite that bad, it was only a total of four model numbers for essentially the same machine). This specific model, the 7200/120 was introduced on April 22nd, 1996, and the original web page describing it from Apple is on the wayback machine.

For older Macs, Low End Mac is a good resource, and there’s a page on the 7200, and amazingly Apple still has the tech specs on their web site!

The 7200 series replaced the 7100, which was one of the original PowerPC based Macs. The big changes are using the industry standard PCI bus for its three expansion slots rather than NuBus. Rather surprisingly, NuBus was not Apple specific, but you could not call it widely adopted by successful manufacturers. Apple first used NuBus in the 1987 Macintosh II.

The PCI bus was standardized in 1992, and it’s almost certain that a successor to it is in the computer you’re using to read this. It really quite caught on as an industry standard.

The processor of the machine is a PowerPC 601. The PowerPC was an effort of IBM, Apple, and Motorola (the AIM Alliance) to create a class of processors for personal computers based on IBM’s POWER Architecture. The PowerPC 601 was the first of these processors, initially used by Apple in its Power Macintosh range. The machine I have has one running at a whopping 120MHz. There continued to be PowerPC chips for a number of years, and IBM continued making POWER processors even after that. However, you are almost certainly not using a PowerPC-derived processor in the computer you’re using to read this.

The PC Compatibility card has on it a full on legit Pentium 100 processor, and hardware for doing VGA graphics, a Sound Blaster 16 and the other things you’d usually expect of a PC from 1996. Since it’s on a PCI card though, it’s a bit different than a PC of the era. It doesn’t have any expansion slots of its own, and in fact uses up one of the three PCI slots in the Mac. It also doesn’t have its own floppy drive, or hard drive. There’s software on the Mac that will let the PC card use the Mac’s floppy drive, and part of the Mac’s hard drive for the PC!

The Pentium 100 was the first mass produced superscalar processor. You are quite likely to be using a computer with a processor related to the Pentium to read this, unless you’re using a phone or tablet, or one of the very latest Macs; in which case you’re using an ARM based processor. You likely have more ARM processors in your life than you have socks.

Basically, this computer is a bit of a hodge-podge of historical technology, some of which ended up being successful, and other things less so.

Let’s have a look inside!

So, one of the PCI slots has a Vertex Twin Turbo 128M8A video card in it. There is not much about this card on the internet. There’s a photo of one on Wikimedia Commons though. I’ll have to investigate more.

Does it work though? Yes! Here it is on my desk:

The powered on Power Mac 7200/120

Even with Microsoft Internet Explorer 4.0 that came with MacOS 8.6, you can find some places on the internet you can fetch files from, at a not too bad speed even!

More fun times with this machine to come!

Francois Marier: Creating a Kodi media PC using a Raspberry Pi 4

Here's how I set up a media PC using Kodi (formerly XBMC) and a Raspberry Pi 4.

Hardware

The hardware is fairly straightforward, but here's what I ended up getting:

You'll probably want to add a remote control to that setup. I used an old Streamzap I had lying around.

Installing the OS on the SD-card

Plug the SD card into a computer using a USB adapter.

Download the imager and use it to install Raspbian on the SD card.

Then you can simply plug the SD card into the Pi and boot.

System configuration

Using sudo raspi-config, I changed the following:

  • Set hostname (System Options)
  • Wait for network at boot (System Options): needed for NFS
  • Disable screen blanking (Display Options)
  • Enable ssh (Interface Options)
  • Configure locale, timezone and keyboard (Localisation Options)
  • Set WiFi country (Localisation Options)

Then I enabled automatic updates:

apt install unattended-upgrades anacron

echo 'Unattended-Upgrade::Origins-Pattern {
        "origin=Debian,codename=${distro_codename},label=Debian";
        "origin=Debian,codename=${distro_codename},label=Debian-Security";
        "origin=Raspbian,codename=${distro_codename},label=Raspbian";
        "origin=Raspberry Pi Foundation,codename=${distro_codename},label=Raspberry Pi Foundation";
};' | sudo tee /etc/apt/apt.conf.d/51unattended-upgrades-raspbian
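
To check that this configuration is being picked up, you can simulate a run (assuming the unattended-upgrade binary shipped by the package above):

sudo unattended-upgrade --dry-run --debug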

Headless setup

Should you need to do the setup without a monitor, you can enable ssh by inserting the SD card into a computer and then creating an empty file called ssh in the boot partition.

Plug it into your router and boot it up. Check the IP that it received by looking at the active DHCP leases in your router's admin panel.

Then login:

ssh -o PreferredAuthentications=password -o PubkeyAuthentication=no pi@192.168.1.xxx

using the default password of raspberry.

Hardening

In order to secure the Pi, I followed most of the steps I usually take when setting up a new Linux server.

I created a new user account for admin and ssh access:

adduser francois
addgroup sshuser
adduser francois sshuser
adduser francois sudo

and changed the pi user password to a random one:

pwgen -sy 32
sudo passwd pi

before removing its admin permissions:

deluser pi adm
deluser pi sudo
deluser pi dialout
deluser pi cdrom
deluser pi lpadmin

Finally, I enabled the Uncomplicated Firewall by installing its package:

apt install ufw

and only allowing ssh connections.
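
The rule for that is a one-liner (assuming ssh on the default port 22):

ufw allow 22/tcp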

After starting ufw using systemctl start ufw.service, you can check that it's configured as expected using ufw status. It should display the following:

Status: active

To                         Action      From
--                         ------      ----
22/tcp                     ALLOW       Anywhere
22/tcp (v6)                ALLOW       Anywhere (v6)

Installing Kodi

Kodi is very straightforward to install since it's now part of the Raspbian repositories:

apt install kodi

To make it start at boot/login, while still being able to exit and use other apps if needed:

cp /etc/xdg/lxsession/LXDE-pi/autostart ~/.config/lxsession/LXDE-pi/
echo "@kodi" >> ~/.config/lxsession/LXDE-pi/autostart

In order to improve privacy while fetching metadata, I also installed Tor:

apt install tor

and then set a proxy in the Kodi System | Internet access settings:

  • Proxy type: SOCKS5 with remote DNS resolving
  • Server: localhost
  • Port: 9050

Network File System

In order to avoid having to have all media storage connected directly to the Pi via USB, I setup an NFS share over my local network.

First, give static IP allocations to the server and the Pi in your DHCP server, then add it to the /etc/hosts file on your NFS server:

192.168.1.3    pi

Install the NFS server package:

apt install nfs-kernel-server

Setup the directories to share in /etc/exports:

/pub/movies    pi(ro,insecure,all_squash,subtree_check)
/pub/tv_shows  pi(ro,insecure,all_squash,subtree_check)

Open the right ports on your firewall by putting this in /etc/network/iptables.up.rules:

-A INPUT -s 192.168.1.3 -p udp -j ACCEPT
-A INPUT -s 192.168.1.0/24 -p tcp --dport 111 -j ACCEPT
-A INPUT -s 192.168.1.0/24 -p udp --dport 111 -j ACCEPT
-A INPUT -s 192.168.1.0/24 -p udp --dport 123 -j ACCEPT
-A INPUT -s 192.168.1.0/24 -p tcp --dport 600:1124 -j ACCEPT
-A INPUT -s 192.168.1.0/24 -p udp --dport 600:1124 -j ACCEPT
-A INPUT -s 192.168.1.0/24 -p tcp --dport 2049 -j ACCEPT
-A INPUT -s 192.168.1.0/24 -p udp --dport 2049 -j ACCEPT

Finally, apply all of these changes:

iptables-apply
systemctl restart nfs-kernel-server.service

On the Pi, put the server's static IP in /etc/hosts:

192.168.1.2    fileserver

and this in /etc/fstab:

fileserver:/data/movies  /kodi/movies  nfs  ro,bg,hard,noatime,async,nolock  0  0
fileserver:/data/tv      /kodi/tv      nfs  ro,bg,hard,noatime,async,nolock  0  0

Then create the mount points and mount everything:

mkdir -p /kodi/movies
mkdir /kodi/tv
mount /kodi/movies
mount /kodi/tv
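
To verify that the shares are exported and mounted as expected (using the same names as above):

showmount -e fileserver
df -h /kodi/movies /kodi/tv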

Paul Wise: FLOSS Activities February 2021

Focus

This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Review

Administration

  • Debian: fix permissions for XMPP anti-spam git
  • Debian wiki: workaround moin bug with deleting deprecated pages, unblock IP addresses, approve accounts

Communication

  • Respond to queries from Debian users and contributors on the mailing lists and IRC
  • Edited and sent Debian DevNews #54

Sponsors

The purple-discord/harmony/librecaptcha/libemail-outlook-message-perl work was sponsored by my employer. All other work was done on a volunteer basis.

Tim Riley: Open source status update, February 2021

Well hey there, Ruby open source fans! February for me was all about consolidating the dry-system breakthroughs I made last month.

I started off by testing the work on a real app I made, and happily, all was fine! Things all looking good, I wrote myself a list and shared it with my Hanami colleagues:

[timriley] So what’s left for me to do here:

  • Release dry-configurable with the new cloneable values support
  • Apply @Nikita Shilnikov’s excellent feedback to https://github.com/dry-rb/dry-system/pull/157
  • Merge https://github.com/dry-rb/dry-system/pull/155
  • Merge https://github.com/hanami/controller/pull/341
  • Merge https://github.com/hanami/hanami/pull/1093
  • In a new PR, configure Zeitwerk for Hanami, and enable the autoloading loader for Hanami’s container component_dirs

Everything started well: by the 15th of Feb I released dry-configurable 0.12.1 with the new cloneable option for custom setting values. And a mere hour later, I merged the two dry-system PRs! Woo, we’re on the home straight!

At that point, I gave Nikita the go-ahead to test all the dry-system changes on some of the apps that he manages: he’s absolutely brazen about running our bleeding edge code, and I love it. In this case, it was very helpful, because it revealed a little wrinkle in my heretofore best laid plans: if you configure a dry-system container with a component dir and a default namespace, and some of the component files sit outside that namespace, then they would fail to load. This was a valid use case missing from our test suite, and something I must’ve broken in my major changes last month.

Hacking in a fix for this turned out to be relatively simple, but at the same time I noticed an opportunity for yet another improvement: spread across multiple parts of dry-system was a bunch of (often repeated) string manipulation code working on container identifiers, doing things like removing a leading namespace, or converting a delimited identifier to a file path. It felt like there was a Dry::System::Identifier abstraction just waiting to be let out.

And so I did it! In this omnibus PR, I introduced Dry::System::Identifier, refactored component building once more, and, not to be forgotten, fixed the bug Nikita found.

I’m really happy with both of the refactorings. Let’s start with Identifier: now we have just a single place for all the logic dealing with identifier string manipulations, but we also get to provide a new, rich aspect of our component API for users of dry-system. For example, as of my work last month, it’s now possible to configure per-component behaviour around auto-registration, and we can now use the component’s identifier like so:

config.component_dirs.add "lib" do |dir|
  dir.default_namespace = "my_app"

  dir.auto_register = lambda do |component|
    !component.identifier.start_with?("entities")
  end
end

Isn’t that neat? The Identifier began life as an internal-only improvement, but here we get to make our user’s life easier too, with namespace-aware methods like #start_with? (which will return true only if "entities" is a complete leading namespace, like "entities.user", and not "entities_abc.user"). I’d like to add a range of similar conveniences to Identifier before we release 1.0. Please let me know what you’d like to see!

The other benefit of Identifier is that it’s vastly simplified how we load components. Check out how we use it in ComponentDir#component_for_path, which is used when the container is finalizing, and the auto-registrar crawls a directory to register a corresponding component for each file (comments added below for the sake of explanation):

def component_for_path(path)
  separator = container.config.namespace_separator

  # 1. Convert a path, like "my_app/articles/operations/create.rb"
  #    to an identifier key, "my_app.articles.operations.create"
  key = Pathname(path).relative_path_from(full_path).to_s
    .sub(RB_EXT, EMPTY_STRING)
    .scan(WORD_REGEX)
    .join(separator)

  # 2. Create the Identifier using the key, but without any namespace attached
  identifier = Identifier.new(key, separator: separator)

  # 3. If the identifier is part of the component dir's configured default
  #    namespace, then strip the namespace from the front of the key and
  #    rebuild the identifier with the namespace attached
  if identifier.start_with?(default_namespace)
    identifier = identifier.dequalified(default_namespace, namespace: default_namespace)
  end

  # 4. By this point, the identifier will be appropriate for both default
  #    namespaced components as well as non-namespaced components, so we can go
  #    ahead and use it to build our component!
  build_component(identifier, path)
end

Even without the comments, this method is concise and easy to follow thanks to the high-level Identifier API. What you’re also witnessing above is the very fix for the bug Nikita found! Getting to this point was a perfect example of Kent Beck’s “for each desired change, make the change easy (warning: this may be hard), then make the easy change” process. And you bet, it felt good!

In this change I actually introduced a pair of methods on ComponentDir: #component_for_path (as we saw above), which is used when finalizing a container, as well as #component_for_identifier, which is used when lazy-loading components on a non-finalized container. Previously, these two methods were both class-level “factory” methods on Component itself. By moving them to ComponentDir, not only are they much closer to the details that are important for building the component, they provide a nice symmetry which will help ensure we don’t miss either case when making changes to component loading in future. Component winds up being a much simpler class, too, which is nice.

After all of this, Nikita gave me a happy thumbs up and we were good to merge this PR and resume preparation for a major dry-system release!

But not so fast, I also:

It was in writing the CHANGELOG entry that I realised I needed to make one last change before I can really consider this release done: I want to create a way to configure default values for all component dirs. This will be helpful for when we eventually add a one-liner use :zeitwerk plugin to dry-system, which will need to ensure that dir.loader = Dry::System::Loader::Autoloading and dir.add_to_load_path = false are set for all subsequent user-configured component dirs. Given the amount of breaking changes we’ll be making with this release, I’d hate to see any unnecessary extra churn arise from this work. So that’s my first task for the month of March.

In the meantime, I have Hanami ready and waiting for this new dry-system release! As soon as our ducks are finally in a row, I’ll be able to merge this and begin the long-anticipated work on configuring Zeitwerk within Hanami.

Looking back on this month, I spent most of it feeling frustrated that I was still working on dry-system after all this time, when I just wanted to get back and make that very last change in Hanami before we could ship the next alpha release. This was exacerbated by a series of late nights that I pulled trying to get that bugfix and related Identifier changes working in a way I was happy with. I finished the month feeling pretty drained and the slightest bit cranky.

I’m indeed happy with the outcome and glad I put in the work. And yes, I got to the point where I could laugh about it. From the PR description:

As part of this work, I also continued my dry-system refactoring journey, because this is more or less my life now 😆.

What I’m taking away from this experience is a reminder that I need to be patient. When working on nights-and-weekends open source, things will only happen when they can happen, and the best we can do is the best we can do. I’ll keep on pushing hard, but February has reminded me to make sure I take care of my (physical and mental) health too.

We’re already a good week into March by the time I’m writing this, and later this month I expect to be moving apartments (yay!), so March may be a slightly less full month from me on the OSS front. If I can squeeze in that last dry-system fix and get myself in a position to begin some early experiments of Zeitwerk in Hanami, I’ll be happy. That’ll put us in a good place to ship the next Hanami alpha early in April.

Thank you to my sponsors

If you want to give me a boost in all of these efforts, I’d love for you to sponsor me on GitHub. Thank you to my sponsors for your ongoing support!

See you next month!

,

Leon Brooksmac on mac

There is an ancient PowerPC-based Macintosh (still working!) about 18km West of here.

The chap who began using it about two decades ago has a need to access some of the CAD files on it, and to print some of those out.

So I installed QEMU on an hp laptop I own (based on Lubuntu), then a PowerPC emulator module for it, Mac OS 9.2 on that (retired 17 years ago), the free CAD package, and a print-to-PDF plugin.

After copying the CAD files into the emulated file space (from the internal hard-disk drive borrowed from the original Mac), they can be opened, viewed, edited and/or PDFed.

The next step was to install QEMU on his MacBook Pro under OS X, and copy the virtual workspaces to it. Now the owner can edit drawings of machinery (some of it older than I am), print them out on a modern printer, and back up the entire system by copying two files.

If his MacBook ever quits the game, QEMU can be installed on whatever replaces it, those two files copied into it, then away he goes again.

Leon Brooksnot so technical... or so mindless...

Posit a human being who claims to be Christian.

They might also claim to be Atheist, or Hindu, or almost anything else.

To this particular individual, the most important person is their grandiose — false — self-image. Everyone else (everything else) comes secondary to that.

Regardless of their claims, the belief system they follow is Virtual Idolatry.

“Virtual?”

Yes. There is zero tangible evidence of the physical existence of the principal object of their affection.

“Idolatry?”

Yes.  What they effectively worship was created by mankind (in this case, themselves, although they may have been conditioned to act that way by an authority figure early in their lives).

Their actions reveal them to not follow the best-documented human in history (Christ) in any practical way, so they are evidently not Christian.

That they regard themselves (by projection of that image) to be better than any other human being clashes with the basic Monism upon which Hinduism is built, so they are evidently not Hindu (the greeting ‘namasté’ has no place in their life).

A being with no physical existence dictates their attitude and actions, which clashes with a foundational dogma of Atheism (either the ‘lack-a-belief’ kind, or the ‘does-not-exist’ kind) that everything in existence is physical, so they are not Atheist.

Extending that list of belief systems could go on indefinitely, however the practical aspects for someone who has to deal with such a person have some serious consequences: you will experience Gaslighting from them, and other two-faced manipulative behaviours. Empathy — for them — is not a real concept.

Yet if you (in any way) express distrust of them (even fail to admire them enough or to give them enough attention), you may experience a very destructive form of rage (which relates to mere anger as a lightning bolt relates to a phone battery) in which not even their own life is important to them: only that image.

In an attempt to fit the rest of the universe into their view of things, they will attempt to control (sometimes subtly) everyone around them.

The only safe way to proceed is to isolate yourselves from them, do not cross paths with them — which must be done very carefully and if possible subtly, as they are likely to regard it as an insult and slam into rage mode.

Not every person will consider this an advantage, yet nevertheless it can be: if an object is not alive, it has no feelings. This can make dealing with them very much simpler, however a (literally) mindless object cannot do anything creative, of itself.

,

Paul WiseFLOSS Activities January 2021

Focus

This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Review

Administration

  • Debian ports: fix header on incoming.ports.d.o/buildd
  • Debian wiki: unblock IP addresses, approve accounts

Communication

  • Initiate discussions about aptly, Linux USB gadgets maintenance
  • Respond to queries from Debian users and contributors on the mailing lists and IRC
  • Invite organisations to post on FOSSjobs

Sponsors

The samba work, apt-listchanges work, pytest-rerunfailures upload, python-testfixtures/python-scrapy bugs and python-scrapy related backports were sponsored by my employer. All other work was done on a volunteer basis.

Paul WiseFLOSS Activities December 2020

Focus

This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Review

Administration

  • Debian: restart bacula director, ping some people about disk usage
  • Debian wiki: unblock IP addresses, approve accounts, update email for accounts with bouncing email

Communication

  • Respond to queries from Debian users and contributors on the mailing lists and IRC

Sponsors

All work was done on a volunteer basis.

Paul WiseFLOSS Activities November 2020

Focus

This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Review

Administration

  • Debian wiki: disable attachments due to security issue, approve accounts

Communication

  • Respond to queries from Debian users and contributors on the mailing lists and IRC

Sponsors

The visdom, apt-listchanges work and lintian-brush bug report were sponsored by my employer. All other work was done on a volunteer basis.

Paul WiseFLOSS Activities October 2020

Focus

This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Review

  • Spam: reported 2 Debian bug reports and 147 Debian mailing list posts
  • Patches: merged libicns patches
  • Debian packages: sponsored iotop-c
  • Debian wiki: RecentChanges for the month
  • Debian screenshots:

Administration

  • Debian: get us removed from an RBL
  • Debian wiki: reset email addresses, approve accounts

Communication

Sponsors

The pytest-rerunfailures/pyemd/morfessor work was sponsored by my employer. All other work was done on a volunteer basis.

Paul WiseFLOSS Activities September 2020

Focus

This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Review

Administration

  • Debian wiki: unblock IP addresses, approve accounts

Communication

Sponsors

The gensim, cython-blis, python-preshed, pytest-rerunfailures, morfessor, nmslib, visdom and pyemd work was sponsored by my employer. All other work was done on a volunteer basis.

Paul WiseFLOSS Activities August 2020

Focus

This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Review

Administration

  • Debian: restarted RAM eating service
  • Debian wiki: unblock IP addresses, approve accounts

Sponsors

The cython-blis/preshed/thinc/theano bugs and smart-open/python-importlib-metadata/python-pyfakefs/python-zipp/python-threadpoolctl backports were sponsored by my employer. All other work was done on a volunteer basis.

,

sthbrx - a POWER technical blogFuzzing grub: part 1

Recently a set of 8 vulnerabilities were disclosed for the grub bootloader. I found 2 of them (CVE-2021-20225 and CVE-2021-20233), and contributed a number of other fixes for crashing bugs which we don't believe are exploitable. I found them by applying fuzz testing to grub. Here's how.

This is a multi-part series: I think it will end up being 4 posts. I'm hoping to cover:

  • Part 1 (this post): getting started with fuzzing grub
  • Part 2: going faster by doing lots more work
  • Part 3: fuzzing filesystems and more
  • Part 4: potential next steps and avenues for further work

Fuzz testing

Let's begin with part one: getting started with fuzzing grub.

One of my all-time favourite techniques for testing programs, especially programs that handle untrusted input, and especially-especially programs written in C that parse untrusted input, is fuzz testing. Fuzz testing (or fuzzing) is the process of repeatedly throwing randomised data at your program under test and seeing what it does.

(For the secure boot threat model, untrusted input is anything not validated by a cryptographic signature - so config files are untrusted for our purposes, but grub modules can only be loaded if they are signed, so they are trusted.)

Fuzzing has a long history and has recently received a new lease on life with coverage-guided fuzzing tools like AFL and more recently AFL++.

Building grub for AFL++

AFL++ is extremely easy to use ... if your program:

  1. is built as a single binary with a regular tool-chain
  2. runs as a regular user-space program on Linux
  3. reads a small input file from disk and then exits
  4. doesn't do anything fancy with threads or signals

Beyond that, it gets a bit more complex.

On the face of it, grub fails 3 of these 4 criteria:

  • grub is a highly modular program: it loads almost all of its functionality as modules which are linked as separate ELF relocatable files. (Not runnable programs, but not shared libraries either.)

  • grub usually runs as a bootloader, not as a regular app.

  • grub reads all sorts of things, ranging in size from small files to full disks. After loading most things, it returns to a command prompt rather than exiting.

Fortunately, these problems are not insurmountable.

We'll start with the 'running as a bootloader' problem. Here, grub helps us out a bit, because it provides an 'emulator' target, which runs most of grub's functionality as a userspace program. It doesn't support actually booting anything (unsurprisingly) but it does support most other modules, including things like the config file parser.

We can configure grub to build the emulator. We disable the graphical frontend for now.

./bootstrap
./configure --with-platform=emu --disable-grub-emu-sdl

At this point in building a fuzzing target, we'd normally try to configure with afl-cc to get the instrumentation that makes AFL(++) so powerful. However, the grub configure script is not a fan:

./configure --with-platform=emu --disable-grub-emu-sdl CC=$AFL_PATH/afl-cc
...
checking whether target compiler is working... no
configure: error: cannot compile for the target

It also doesn't work with afl-gcc.

Hmm, ok, so what if we just... lie a bit?

./configure --with-platform=emu --disable-grub-emu-sdl
make CC="$AFL_PATH/afl-gcc" 

(Normally I'd use CC=clang and afl-cc, but clang support is slightly broken upstream at the moment.)

After a small fix for gcc-10 compatibility, we get the userspace tools (potentially handy!) but a bunch of link errors for grub-emu:

/usr/bin/ld: disk.module:(.bss+0x20): multiple definition of `__afl_global_area_ptr'; kernel.exec:(.bss+0xe078): first defined here
/usr/bin/ld: regexp.module:(.bss+0x70): multiple definition of `__afl_global_area_ptr'; kernel.exec:(.bss+0xe078): first defined here
/usr/bin/ld: blocklist.module:(.bss+0x28): multiple definition of `__afl_global_area_ptr'; kernel.exec:(.bss+0xe078): first defined here

The problem is the module linkage that I talked about earlier: because there is a link stage of sorts for each module, some AFL support code gets linked in to both the grub kernel (kernel.exec) and each module (here disk.module, regexp.module, ...). The linker doesn't like it being in both, which is fair enough.

To get started, let's instead take advantage of the smarts of AFL++ using Qemu mode instead. This builds a specially instrumented qemu user-mode emulator that's capable of doing coverage-guided fuzzing on uninstrumented binaries at the cost of a significant performance penalty.
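
If your AFL++ checkout doesn't already have QEMU mode built, building it is roughly the following (this assumes $AFL_PATH points at the AFL++ source tree; see the AFL++ documentation for the exact dependencies):

cd "$AFL_PATH"/qemu_mode
./build_qemu_support.sh
cd -

With that available, we rebuild grub without any AFL instrumentation: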

make clean
make

Now we have a grub-emu binary. If you run it directly, you'll pick up your system boot configuration, but the -d option can point it to a directory of your choosing. Let's set up one for fuzzing:

mkdir stage
echo "echo Hello sthbrx readers" > stage/grub.cfg
cd stage
../grub-core/grub-emu -d .

You probably won't see the message because the screen gets blanked at the end of running the config file, but if you pipe it through less or something you'll see it.
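
For example, still from the stage directory:

../grub-core/grub-emu -d . | less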

Running the fuzzer

So, that seems to work - let's create a test input and try fuzzing:

cd ..
mkdir in
echo "echo hi" > in/echo-hi

cd stage
# -Q qemu mode
# -M main fuzzer
# -d don't do deterministic steps (too slow for a text format)
# -f create file grub.cfg
$AFL_PATH/afl-fuzz -Q -i ../in -o ../out -M main -d -- ../grub-core/grub-emu -d .

Sadly:

[-] The program took more than 1000 ms to process one of the initial test cases.
    This is bad news; raising the limit with the -t option is possible, but
    will probably make the fuzzing process extremely slow.

    If this test case is just a fluke, the other option is to just avoid it
    altogether, and find one that is less of a CPU hog.

[-] PROGRAM ABORT : Test case 'id:000000,time:0,orig:echo-hi' results in a timeout
         Location : perform_dry_run(), src/afl-fuzz-init.c:866

What we're seeing here (and indeed what you can observe if you run grub-emu directly) is that grub-emu isn't exiting when it's done. It's waiting for more input, and will keep waiting for input until it's killed by afl-fuzz.

We need to patch grub to sort that out. It's on my GitHub.

Apply that, rebuild with FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION, and voila:

cd ..
make CFLAGS="-DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION"
cd stage
$AFL_PATH/afl-fuzz -Q -i ../in -o ../out -M main -d -f grub.cfg -- ../grub-core/grub-emu -d .

And fuzzing is happening!

afl-fuzz fuzzing grub, showing fuzzing happening

This is enough to find some of the (now-fixed) bugs in the grub config file parsing!

Fuzzing beyond the config file

You can also extend this to fuzzing other things that don't require the graphical UI, such as grub's transparent decompression support:

cd ..
rm -rf in out stage
mkdir in stage
echo hi > in/hi
gzip in/hi
cd stage
echo "cat thefile" > grub.cfg
$AFL_PATH/afl-fuzz -Q -i ../in -o ../out -M main -f thefile -- ../grub-core/grub-emu -d .

You should be able to find a hang pretty quickly with this: an as-yet-unfixed bug where grub will print output forever when fed a corrupt file. (Your mileage may vary, as will the paths.)

cp ../out/main/hangs/id:000000,src:000000,time:43383,op:havoc,rep:16 thefile
../grub-core/grub-emu -d . | less # observe this going on forever

zcat, on the other hand, reports it as simply corrupt:

$ zcat thefile

gzip: thefile: invalid compressed data--format violated

(Feel free to fix that and send a patch to the list!)

That wraps up part 1. Eventually I'll be back with part 2, where I explain the hoops to jump through to go faster with the afl-cc instrumentation.

,

Francois MarierUsing a Streamzap remote control with Kodi

After installing Kodi on a Raspberry Pi 4, I found that my Streamzap remote control worked for everything except the Ok and Exit buttons (which are supposed to get mapped to Enter and Back respectively).

A very old set of instructions for this is archived on the Kodi wiki but here's a more modern version of it.

Root cause

I finally tracked down the problem by enabling debug logging in Kodi settings. I saw the following in ~/.kodi/temp/kodi.log when pressing the OK button:

DEBUG: Keyboard: scancode: 0x00, sym: 0x0000, unicode: 0x0000, modifier: 0x0
DEBUG: GetActionCode: Trying Hardy keycode for 0xf200
DEBUG: Previous line repeats 3 times.
DEBUG: HandleKey: long-0 (0x100f200, obc-16838913) pressed, action is
DEBUG: Keyboard: scancode: 0x00, sym: 0x0000, unicode: 0x0000, modifier: 0x0

and this when pressing the Down button:

DEBUG: CLibInputKeyboard::ProcessKey - using delay: 500ms repeat: 125ms
DEBUG: Thread Timer start, auto delete: false
DEBUG: Keyboard: scancode: 0x6c, sym: 0x0112, unicode: 0x0000, modifier: 0x0
DEBUG: HandleKey: down (0xf081) pressed, action is Down
DEBUG: Thread Timer 2502349008 terminating
DEBUG: Keyboard: scancode: 0x6c, sym: 0x0112, unicode: 0x0000, modifier: 0x0

This suggests that my Streamzap remote is recognized as a keyboard, which I can confirm using:

$ cat /proc/bus/input/devices 
I: Bus=0003 Vendor=0e9c Product=0000 Version=0100
N: Name="Streamzap PC Remote Infrared Receiver (0e9c:0000)"
P: Phys=usb-0000:01:00.0-1.2/input0
S: Sysfs=/devices/platform/scb/fd500000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0/usb1/1-1/1-1.2/1-1.2:1.0/rc/rc0/input4
U: Uniq=
H: Handlers=kbd event0 
B: PROP=20
B: EV=100017
B: KEY=3ff 0 0 0 fc000 1 0 0 0 0 18000 4180 c0000801 9e1680 0 0 0
B: REL=3
B: MSC=10

Installing LIRC

The fix I found is to put the following in /etc/X11/xorg.conf.d/90-streamzap-disable.conf:

Section "InputClass"
    Identifier "Ignore Streamzap IR"
    MatchProduct "Streamzap"
    MatchIsKeyboard "true"
    Option "Ignore" "true"
EndSection

to prevent the remote from being used as a keyboard and to instead use it via LIRC, which can be installed like this:

apt install lirc lirc-compat-modules

Put the following in /etc/lirc/lirc_options:

driver=default
device=/dev/lirc0

and add the remote config for the Streamzap:

cd /etc/lirc/lircd.conf.d/
ln -s /usr/share/lirc/remotes/streamzap/lircd.conf.streamzap streamzap.conf

Testing

Now you should be able to test the remote using:

mode2

to see the undecoded infra-red signal, and:

irw

to display the decoded key presses.

Kodi configuration

Finally, as the pi user, put the following config in ~/.kodi/userdata/Lircmap.xml:

<lircmap>
  <remote device="Streamzap_PC_Remote">
    <power>KEY_POWER</power>
    <play>KEY_PLAY</play>
    <pause>KEY_PAUSE</pause>
    <stop>KEY_STOP</stop>
    <forward>KEY_FORWARD</forward>
    <reverse>KEY_REWIND</reverse>
    <left>KEY_LEFT</left>
    <right>KEY_RIGHT</right>
    <up>KEY_UP</up>
    <down>KEY_DOWN</down>
    <pageplus>KEY_CHANNELUP</pageplus>
    <pageminus>KEY_CHANNELDOWN</pageminus>
    <select>KEY_OK</select>
    <back>KEY_EXIT</back>
    <menu>KEY_MENU</menu>
    <red>KEY_RED</red>
    <green>KEY_GREEN</green>
    <yellow>KEY_YELLOW</yellow>
    <blue>KEY_BLUE</blue>
    <skipplus>KEY_NEXT</skipplus>
    <skipminus>KEY_PREVIOUS</skipminus>
    <record>KEY_RECORD</record>
    <volumeplus>KEY_VOLUMEUP</volumeplus>
    <volumeminus>KEY_VOLUMEDOWN</volumeminus>
    <mute>KEY_MUTE</mute>
    <one>KEY_1</one>
    <two>KEY_2</two>
    <three>KEY_3</three>
    <four>KEY_4</four>
    <five>KEY_5</five>
    <six>KEY_6</six>
    <seven>KEY_7</seven>
    <eight>KEY_8</eight>
    <nine>KEY_9</nine>
    <zero>KEY_0</zero>
  </remote>
</lircmap>

In order for all of this to take effect, I simply rebooted the Pi:

sudo systemctl reboot

,

Simon LyallMoving my backups to restic

I’ve recently moved my home backups over to restic. I’m using restic to back up the /etc and /home folders on all machines, plus my website files and databases. Media files are backed up separately.

I have around 220 Gigabytes of data, about half of that is photos.

My Home setup

I currently have 4 regularly-used physical machines at home: two workstations, one laptop and a server. I also have a VPS hosted at Linode and a VM running on the home server. Everything is running Linux.

Existing Backup Setup

For at least 15 years I’ve been using rsnapshot for backup. rsnapshot works by keeping a local copy of the folders to be backed up. To update the local copy it uses rsync over ssh to pull down a copy from the remote machine. It then keeps multiple old versions of files by making a series of copies.

I’d end up with around 12 older versions of the filesystem (something like 5 daily, 4 weekly and 3 monthly) so I could recover files that had been deleted. To save space rsnapshot uses hard links so only one copy of a file is kept if the contents didn’t change.

I also backed up a copy to external hard drives regularly and kept one copy offsite.

The main problem with rsnapshot was that it was a little clunky. It took a long time to run because it copied and deleted a lot of files every time it ran. It is also difficult to exclude folders from being backed up, it is not compatible with any cloud-based filesystems, and it requires ssh keys to log in to remote machines as root.

Getting started with restic

I started playing around with restic after seeing some recommendations online. As a single binary with a few commands it seemed a little simpler than other solutions. It has a push model, so it needs to be installed on each machine, and each machine uploads from there to the repository.

Restic supports around a dozen storage backends for repositories. These include local file system, sftp and Amazon S3. When you create an archive via “restic init” it creates a simple file structure for the repository in most backends:

You can then use simple commands like “restic backup /etc” to backup files to there. The restic documentation site makes things pretty easy to follow.
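
As a rough sketch of that basic workflow (the repository path and password here are just placeholders; my actual B2 and sftp setups are further below):

export RESTIC_REPOSITORY=/srv/restic-repo
export RESTIC_PASSWORD=some-long-passphrase
restic init
restic backup /etc
restic snapshots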

Restic automatically encrypts backups and each server needs a key to read/write to its backups. However, any key can see all files in a repository, even those belonging to other hosts.

Backup Strategy with Restic

I decided on the following strategy for my backups:

  • Make a daily copy of /etc and other files for each server
  • Keep 5 daily and 3 weekly copies
  • Have one copy of data on Backblaze B2
  • Have another copy on my home server
  • Export the copies on the home server to external disk regularly

Backblaze B2 is very similar to Amazon S3 and is supported directly by restic. It is however cheaper. Storage is 0.5 cents per gigabyte/month and downloads are 1 cent per gigabyte. In comparison, AWS S3 One Zone Infrequent Access charges 1 cent per gigabyte/month for storage and 9 cents per gigabyte for downloads.

What                     Backblaze B2    AWS S3
Store 250 GB per month   $1.25           $2.50
Download 250 GB          $2.50           $22.50

AWS S3 Glacier is cheaper for storage but hard to work with and retrieval costs would be even higher.

Backblaze B2 is less reliable than S3 (they had an outage when I was testing) but this isn’t a big problem when I’m using them just for backups.

Setting up Backblaze B2

To setup B2 I went to the website and created an account. I would advise putting in your credit card once you finish initial testing as it will not let you add more than 10GB of data without one.

I then created a private bucket and changed the bucket’s lifecycle settings to only keep the last version.

I decided that for security I would have each server use a separate restic repository. This means that I use a bit of extra space, since restic can only de-duplicate identical files within a single repository rather than across machines. I ended up using around 15% more.

For each machine I created a B2 application key and set it to have a namePrefix with the name of the machine. This means that each application key can only see files in its own folder.

On each machine I installed restic and then created an /etc/restic folder. I then added the file b2_env:

export B2_ACCOUNT_ID=000xxxx
export B2_ACCOUNT_KEY=K000yyyy
export RESTIC_PASSWORD=abcdefghi
export RESTIC_REPOSITORY=b2:restic-bucket:/hostname

You can now just run “restic init” and it should create an empty repository; check via the B2 web interface to see.

I then had a simple script that runs:

source /etc/restic/b2_env

restic --limit-upload 2000 backup /home/simon --exclude-file /etc/restic/home_exclude

restic --limit-upload 2000 backup /etc /usr/local /var/lib /var/backups

restic --verbose --keep-last 5 --keep-daily 6 --keep-weekly 3 forget

The “source” command loads in the api key and passwords.

The restic backup lines do the actual backup. I have restricted my upload speed to 20 Megabits/second. The /etc/restic/home_exclude lists folders that shouldn’t be backed up. For this I have:

/home/simon/.cache
/home/simon/.config/Slack
/home/simon/.local/share/Trash
/home/simon/.dropbox-dist
/home/simon/Syncthing/audiobooks

as these are folders with regularly changing contents that I don’t need to backup.

The “restic forget” command removes older snapshots. I’m telling it to keep 6 daily copies and 3 weekly copies of my data, plus at least the most recent 5 no matter how old they are.

This command doesn’t actually free up the space taken up by the removed snapshots. I need to run the “restic prune” command for that. However according to this analysis the prune operation generates so many API calls and data transfers that the payback time on disk space saved can be months(!). So for now I’m planning to run the command only occasionally (probably every few months, depending on testing).
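
When a clean-up run does happen, it is just the prune command against the same repository, something along the lines of:

source /etc/restic/b2_env
restic prune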

Setting up sftp

As well as backing up to B2 I wanted to backup my data to my home server. In this case I decided to have a single repository shared by all the servers.

First of all I created a “restic” account on my server with a home of /home/restic. I then created a folder /media/backups/restic owned by the restic user.

I then followed this guide for sftp-only accounts to restrict the restic user. Relevant lines I changed were “Match User restic” and “ChrootDirectory /media/backups/restic “

On each host I also needed to run “cp /etc/ssh/ssh_host_rsa_key /root/.ssh/id_rsa” and add the host’s public ssh key to /home/restic/.ssh/authorized_keys on the server.

Then it is just a case of creating a sftp_env file like in the b2 example above. Except this is a little shorter:

export RESTIC_REPOSITORY=sftp:restic@server.darkmere.gen.nz:shared
export RESTIC_PASSWORD=abcdefgh

For backing up my VPS I had to do another step since this couldn’t push files to my home. What I did instead was add a script that ran on the home server and used rsync to copy down folders from my VPS to local. I used rrsync to restrict this script.
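
A minimal sketch of the pull side of that script, run on the home server, might look like this (the hostname and paths are placeholders, and the VPS end is locked down with rrsync):

rsync -a vps.example.com:/etc/ /home/restic/vps-name/etc/
rsync -a vps.example.com:/home/simon/ /home/restic/vps-name/home-simon/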

Once I had a local folder I ran “restic --host vps-name backup /copy-of-folder” to back it up over sftp. The --host option made sure the backups were listed for the right machine.

Since the restic folder is just a bunch of files, I’m copying up it directly to external disk which I keep outside the house.

Parting Thoughts

I’m fairly happy with restic so far. I haven’t run into too many problems or gotchas yet, although if you are starting out I’d suggest testing with a small repository to get used to the commands, etc.

I have copies of keys in my password manager for recovery.

There are a few things I still have to do including setup up some monitoring and also decide how often to run the prune operation.


,

Simon LyallAudiobooks – February 2021

Lost and Founder: A Painfully Honest Field Guide to the Startup World by Rand Fishkin

Advice for prospective founders mixed in with stories from the author’s company. Open about the missteps he made so they can be avoided. 4/5

The Victorian Internet: The Remarkable Story of the Telegraph and the Nineteenth Century’s On-line Pioneers by Tom Standage

A short book on the rise of the telegraph and how it changed the world. Peppered with amusing stories and analogies to the Internet. 3/5

Dreams from My Father: A Story of Race and Inheritance by Barack Obama

A memoir of the author growing up and into his mid-20s. Well written and interesting. Audiobook is read by the author but he’s okay. 3/5

The Age of Benjamin Franklin by Robert J. Allison

24 Lectures about various aspects of Franklin and his life. Each lecture is on a theme so they are not chronological. I hadn’t read any biographies previously but this might help. 4/5

The Relentless Moon by Mary Robinette Kowal

3rd book in the Lady Astronaut series. Mostly concerned with trying to find and stop agents sabotaging the Moonbase. Works well and held my interest. 3/5

Business Adventures: Twelve Classic Tales from the World of Wall Street by John Brooks

A collection of long New Yorker articles from the 1960s. One on a stock corner even has parallels with Gamestop in 2021. Interesting and well told even when dated. 3/5

Live and Let Die by Ian Fleming

James Bond takes on Gangster/Agent/Voodoo leader ‘Mr Big’ in Harlem, Florida and Jamaica. The racial stereotypes are dated but could be worse. The story held my interest. 3/5

My Scoring System

  • 5/5 = Brilliant, top 5 book of the year
  • 4/5 = Above average, strongly recommend
  • 3/5 = Average. in the middle 70% of books I read
  • 2/5 = Disappointing
  • 1/5 = Did not like at all


Russell CokerLinks February 2021

Elasticsearch gets a new license to deal with AWS not paying them [1]. Of course AWS will fork the products in question. We need some anti-trust action against Amazon.

Big Think has an interesting article about what appears to be ritualistic behaviour in chimpanzees [2]. The next issue is that if they are developing a stone-age culture does that mean we should treat them differently from other less developed animals?

Last Week in AWS has an informative article about Parler’s new serverless architecture [3]. They explain why it’s not easy to move away from a cloud platform even for a service that’s designed to not be dependent on it. The moral of the story is that running a service so horrible that none of the major cloud providers will touch it doesn’t scale.

Patheos has an insightful article about people who spread the most easily disproved lies for their religion [4]. A lot of political commentary nowadays is like that.

Indi Samarajiva wrote an insightful article comparing terrorism in Sri Lanka with the right-wing terrorism in the US [5]. The conclusion is that it’s only just starting in the US.

Belling Cat has an interesting article about the FSB attempt to murder Russian presidential candidate Alexey Navalny [6].

Russ Allbery wrote an interesting review of Anti-Social, a book about the work of an anti-social behavior officer in the UK [7]. The book (and Russ’s review) has some good insights into how crime can be reduced. Of course a large part of that is allowing people who want to use drugs to do so in an affordable way.

Informative post from Electrical Engineering Materials about the difference between KVA and KW [8]. KVA is bigger than KW, sometimes a lot bigger.

Arstechnica has an interesting but not surprising article about a “supply chain” attack on software development [9]. Exploiting the way npm and similar tools resolve dependencies to make them download hostile code. There is no possibility of automatic downloads being OK for security unless they are from known good sites that don’t allow random people to upload. Any sort of system that allows automatic download from sites like the Node or Python repositories, Github, etc is ripe for abuse. I think the correct solution is to have dependencies installed manually or automatically from a distribution like Debian, Ubuntu, Fedora, etc where there have been checks on the source of the source.

Devon Price wrote an insightful Medium article “Laziness Does Not Exist” about the psychological factors which can lead to poor results that many people interpret as “laziness” [10]. Everyone who supervises other people’s work should read this.

,

Rusty RussellA Model for Bitcoin Soft Fork Activation

TL;DR: There should be an option, taproot=lockintrue, which allows users to set lockin-on-timeout to true. It should not be the default, though.

As stated in my previous post, we need actual consensus, not simply the appearance of consensus. I’m pretty sure we have that for taproot, but I would like a template we can use in future without endless debate each time.

  • Giving every group a chance to openly signal for (or against!) gives us the most robust assurance that we actually have consensus. Being able to signal opposition is vital, since everyone can lie anyway; making opposition difficult just reduces the reliability of the signal.
  • Developers should not activate. They’ve tried to assure themselves that there’s broad approval of the change, but that’s not really a transferable proof. We should be concerned about future corruption, insanity, or groupthink. Moreover, even the perception that developers can set the rules will lead to attempts to influence them as Bitcoin becomes more important. As a (non-Bitcoin-core) developer I can’t think of a worse hell myself, nor do we want to attract developers who want to be influenced!
  • Miner activation is actually brilliant. It’s easy for everyone to count, and majority miner enforcement is sufficient to rely on the new rules. But its real genius is that miners are most directly vulnerable to the economic majority of users: in a fork they have to pick sides continuously knowing that if they are wrong, they will immediately suffer economically through missed opportunity cost.
  • Of course, economic users are ultimately in control. Any system which doesn’t explicitly encode that is fragile; nobody would argue that fair elections are unnecessary because if people were really dissatisfied they could always overthrow the government themselves! We should make it as easy for them to exercise this power as possible: this means not requiring them to run unvetted or home-brew modifications which will place them at more risk, so developers need to supply this option (setting it should also change the default User-Agent string, for signalling purposes). It shouldn’t be an upgrade either (which inevitably comes with other changes). Such a default-off option provides both a simple method, and a Schelling point for the lockinontimeout parameters. It also means much less chance of this power being required: “Si vis pacem, para bellum“.

This triumvirate model may seem familiar, being widely used in various different governance systems. It seems the most robust to me, and is very close to what we have evolved into already. Formalizing it reduces uncertainty for any future changes, as well.

,

Dave HallParameter Store vs Secrets Manager

Which AWS managed service is best for storing and managing your secrets?

,

Francois MarierUpgrading from Ubuntu 18.04 bionic to 20.04 focal

I recently upgraded from Ubuntu 18.04.5 (bionic) to 20.04.1 (focal) and it was one of the roughest Ubuntu upgrades I've gone through in a while. Here are the notes I took on avoiding or fixing the problems I ran into.

Preparation

Before going through the upgrade, I disabled the configurations which I know interfere with the process:

  • Enable etckeeper auto-commits before install by putting the following in /etc/etckeeper/etckeeper.conf:

      AVOID_COMMIT_BEFORE_INSTALL=0
    
  • Remount /tmp as executable:

      mount -o remount,exec /tmp
    

Another step I should have taken but didn't, was to temporarily remove safe-rm since it caused some problems related to a Perl upgrade happening at the same time:

apt remove safe-rm
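
Once the upgrade has finished, it can simply be put back:

apt install safe-rm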

Network problems

After the upgrade, my network settings weren't really working properly and so I started by switching from ifupdown to netplan.io which seems to be the preferred way of configuring the network on Ubuntu now.

Then I found out that netplan.io does not automatically enable the systemd-resolved handling of .local hostnames.

I would be able to resolve a hostname using avahi:

$ avahi-resolve --name machine.local
machine.local   192.168.1.5

but not with systemd:

$ systemd-resolve machine.local
machine.local: resolve call failed: 'machine.local' not found

$ resolvectl mdns
Global: no
Link 2 (enp4s0): no

The best solution I found involves keeping systemd-resolved and its /etc/resolv.conf symlink to /run/systemd/resolve/stub-resolv.conf.

I added the following in a new /etc/NetworkManager/conf.d/mdns.conf file:

[connection]
connection.mdns=1

which instructs NetworkManager to resolve mDNS on all network interfaces it manages but not register a hostname since that's done by avahi-daemon.

Then I enabled mDNS globally in systemd-resolved by setting the following in /etc/systemd/resolved.conf:

MulticastDNS=yes

before restarting both services:

systemctl restart NetworkManager.service systemd-resolved.service

With that in place, .local hostnames are resolved properly and I can see that mDNS is fully enabled:

$ resolvectl mdns
Global: yes
Link 2 (enp4s0): yes

Boot problems

For some reason I was able to boot with the kernel I got as part of the focal update, but a later kernel update rendered my machine unbootable.

Adding some missing RAID-related modules to /etc/initramfs-tools/modules:

raid1
dmraid
md-raid1

and then re-creating all initramfs:

update-initramfs -u -k all

seemed to do the trick.
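
To double-check that those modules actually made it into the rebuilt initramfs, lsinitramfs (shipped with initramfs-tools) can be used, for example (assuming the default /boot layout):

lsinitramfs /boot/initrd.img-$(uname -r) | grep raid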

Chris SmartRootless podman containers under system accounts, managed and enabled at boot with systemd

While you can run containers as root on the host, or run rootless containers as your regular user (either as uid 0 or any other uid), sometimes it’s nice to create specific users to run one or more containers. This provides neat separation and can also improve security posture.

We also want those containers to act as regular system services; managed with systemd to auto-restart and be enabled on boot.

This assumes you’ve just installed Fedora (or RHEL/CentOS 8+) server and have a local user with sudo privileges. First, let’s also install some SELinux tools.

sudo dnf install -y /usr/sbin/semanage

Setting up the system user

Let’s create our system user, placing their home dir under /var/lib. For the purposes of this example I’m using a service account of busybox but this can be anything unique on the box. Note, if you prefer to have a real shell, then swap /bin/false with /bin/bash or other.

export SERVICE="busybox"

sudo useradd -r -m -d "/var/lib/${SERVICE}" -s /bin/false "${SERVICE}"

In order for our user to run containers automatically on boot, we need to enable systemd linger support. This will ensure that a user manager is run for the user at boot and kept around after logouts.

sudo loginctl enable-linger "${SERVICE}"

Configure homedir for containers

Next, we create a data directory to be passed in as a volume to the container (some containers may require more, but this is a good start).

sudo -H -u "${SERVICE}" bash -c "mkdir ~/data"

We need to set some SELinux context on the home directory, otherwise rootless containers won’t run. This will change the service account’s directory under /var/lib from var_lib_t to user_home_dir_t. It also sets the data directory to be of type container_file_t so that containers will be able to access it (technically this isn’t necessary if you use the :z or :Z flag for the volume when running the container, but I’m keeping it in for broader context).

sudo semanage fcontext -a -t user_home_dir_t \
"/var/lib/${SERVICE}(/.+)?"

sudo semanage fcontext -a -t container_file_t \
"/var/lib/${SERVICE}/data(/.+)?"

sudo restorecon -Frv /var/lib/"${SERVICE}"
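
To verify the new contexts took effect, a quick look with ls -Z should show user_home_dir_t on the home directory and container_file_t on the data directory:

ls -dZ /var/lib/"${SERVICE}" /var/lib/"${SERVICE}"/data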

Enable rootless containers

By default, system users do not get any subuid ranges, which means they will not be able to run rootless containers. Setting this up is done manually with a little bit of bash magic.

NEW_SUBUID=$(($(tail -1 /etc/subuid |awk -F ":" '{print $2}')+65536))
NEW_SUBGID=$(($(tail -1 /etc/subgid |awk -F ":" '{print $2}')+65536))

sudo usermod \
--add-subuids ${NEW_SUBUID}-$((${NEW_SUBUID}+65535)) \
--add-subgids ${NEW_SUBGID}-$((${NEW_SUBGID}+65535)) \
  "${SERVICE}"

Great! Now we have our system user ready to go.

Switch to system user

Let’s switch to our system user (note this is slightly more complicated as we have /bin/false as the shell, so this puts us in the right homedir).

sudo -H -u "${SERVICE}" bash -c 'cd; bash'

Running a rootless container

We have a dedicated user which can run rootless containers, so when we start a container, we can tell it to run as root with the --user 0:0 option (or -u 0:0 for short). This way the process in the container will actually run as our system user on the host.

OK, now let’s run a container! Note we are running this in --detach (-d for short) mode so that it runs in the background. We’re also enabling interactive mode with --interactive (-i for short) and allocating a pseudo terminal with --tty (-t for short) which is required for busybox to work. You may recall from earlier posts that the :z option after the volume sets an SELinux context on the data directory, explicit to this container via MCS labels.

podman run -u 0:0 -dit --name busybox -v ~/data:/data:z busybox

Do a simple test to make sure we can connect to the running container.

podman exec busybox sh -c 'echo -n "In this container, I am ";id -un'

You should see that you are root

In this container, I am root
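
You can also confirm from the host side that the containerised processes are running as the unprivileged busybox account rather than as root; one way to check (from any shell on the host) is:

ps -u busybox -o user,pid,comm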

Managing and enabling the container with systemd

OK, so we can create a dedicated user on the host, and we can run a container, great! But how do we get that non-root user to automatically start their container on boot? Enter systemd.

In order to interact with systemd, we must ensure XDG_RUNTIME_DIR is set (this is because we switched user; if we had sshed in instead it would be set up for us, but our system user has no shell so that’s not possible).

export XDG_RUNTIME_DIR=/run/user/"$(id -u)"

You should be able to connect to systemd now, let’s test it.

systemctl --user

Remember when we created the account we enabled linger support? That’s critical when running containers without an actual login.

Let’s make the user systemd directory.

mkdir -p ~/.config/systemd/user/

Use podman to generate a systemd service file.

podman generate systemd --restart-policy always --name busybox > \
~/.config/systemd/user/container-busybox.service

sed -i s/^KillMode=.*/KillMode=control-group/ \
~/.config/systemd/user/container-busybox.service

Next, we reload systemd so that it can see the new service.

systemctl --user daemon-reload

Now we are able to interact with the container using systemd. Let’s enable it on boot and check the status!

systemctl --user enable --now container-busybox
systemctl --user status container-busybox

On boot, this service should auto-start and can be managed via systemd.

So the final step is to reboot the host, switch back to the user and ensure the container is running.
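
A minimal way to do that check, reusing the commands from earlier (remember to set SERVICE again in the new shell, or substitute the account name directly):

sudo -H -u "${SERVICE}" bash -c 'cd; bash'
export XDG_RUNTIME_DIR=/run/user/"$(id -u)"
systemctl --user status container-busybox
podman ps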

,

Rusty RussellBitcoin Consensus and Solidarity

Bitcoin’s consensus rules define what is valid, but this isn’t helpful when we’re looking at changing the rules themselves. The trend in Bitcoin has been to make such changes in an increasingly inclusive and conservative manner, but we are still feeling our way through this, and appreciating more nuance each time we do so.

To use Bitcoin, you need to remain in the supermajority of consensus on what the rules are. But you can never truly know if you are. Everyone can signal, but everyone can lie. You can’t know what software other nodes or miners are running: even expensive testing of miners by creating an invalid block only tests one possible difference, may still give a false negative, and doesn’t mean they can’t change a moment later.

This risk of being left out is heightened greatly when the rules change. This is why we need to rely on multiple mechanisms to reassure ourselves that consensus will be maintained:

  1. Developers assure themselves that the change is technically valid, positive and has broad support. The main tools for this are open communication, and time. Developers signal support by implementing the change.
  2. Users signal their support by upgrading their nodes.
  3. Miners signal their support by actually tagging their blocks.

We need actual consensus, not simply the appearance of consensus. Thus it is vital that all groups know they can express their approval or rejection, in a way they know will be heard by others. In the end, the economic supermajority of Bitcoin users can set the rules, but no other group or subgroup should have inordinate influence, nor should they appear to have such control.

The Goodwill Dividend

A Bitcoin community which has consensus and knows it is not only safest from a technical perspective: the goodwill and confidence gives us all assurance that we can make (or resist!) changes in future.

It will also help us defend against the inevitable attacks and challenges we are going to face, which may be a more important effect than any particular soft-fork feature.

,

Lev LafayetteInteractive HPC Computation with Open OnDemand and FastX

As dataset size and complexity requirements grow, researchers increasingly need to find additional computational power for processing. A preferred choice is high performance computing (HPC) which, due to its physical architecture, operating system, and optimised application installations, is best suited for such processing. However, HPC systems have historically been less effective at visual display, least of all in an interactive manner, leading to the general truism of "compute on the HPC, visualise locally".

This is primarily due to the tyranny of distance, but also to the additional latency introduced by contemporary graphics when remote display instructions are sent to a local X-server. With a demand for both HPC computational power and interactive graphics, the University of Melbourne has implemented two technologies, FastX and Open OnDemand, on their general-purpose HPC system "Spartan". This allows users to run graphical applications on Spartan by submitting a job to the batch system which executes an XFCE graphical environment. In the Spartan environment, FastX has been coupled with Open OnDemand which provides web-enabled applications (e.g., RStudio, Jupyter Notebooks). In showing how this environment operates on the Spartan HPC system, the presentation will also illustrate recent research case studies from the University of Melbourne that have utilised this technology.

A presentation to eResearchNZ 2021

,

Dave HallA Lost Parcel Results in a New Website

When Australia Post lost a parcel, we found a lot of problems with one of their websites.

,

Dave HallWe Have a New Website (Finally)

After 15 years we rebuilt our website. Learn more about the new site.

,

Ben MartinFileSender UI overhaul to using Bootstrap

 I have recently made a branch available which updates the user interface for FileSender to use Bootstrap. In this post I will discuss some of the changes and also mention some of the features of FileSender which might be useful to people. FileSender allows you to send and receive large files with people. You can upload a file to FileSender and send somebody a link to be able to download it. You can invite a person to send you a file on a FileSender server that you use and have the option to restrict them to only send files to you. This can be very handy if you meet somebody at a conference who wants to offer you a data set and you would like to only let them send to you. 

FileSender also offers End-to-End encryption so you can be comfortable that your data is only used by those you intend to have access. FileSender is open source and browser based.

The following screen shots are taken from the above cited branch with the updated user interface. The upload screen is shown below. I have resisted adding animations to the main "Drag or choose file(s)" panel at this stage. I think the secondary buttons to select a File or select a Directory are the main interesting items, as perhaps people are not so used to dropping files into the browser, and a single "add" action does not explicitly inform a user that a whole directory (recursively) can be added if desired.

Once you have added a file you see the manifest and the clear all button has been moved to above the option to remove single files. This listing lets you see the files that will be transferred and also can show why a file might not be ok. For example, if a file is 0 bytes or can not be read, or is a type such as an executable which is not allowed by the system.

If you choose the File Encryption option then you can either set a passphrase or have one generated for you. When FileSender is using a generated password it knows how much entropy was used for the password. So to go back to a user nominated password you have to deselect "generated password" which allows the password field to be edited again. Using a generated password will be more secure than anything a human is likely to enter.


After clicking "Continue" you move to stage 2 of the new upload process. The main choice here is if you would like to get a link to the file or would like to email the link directly to the recipient. Many of the other options on this stage can be left at the default and you could just click send from here to start the upload process.


If you choose the "Or send via email" button the page alters to what is shown below. Again, once you click on Send then the transfer will begin.

The upload is happening in the below screen shot. I have paused the upload here to make it easier to capture the screenshot. FileSender can upload using multiple web workers which can work on one or more files at the same time depending on the server configuration. The grey bar below "Admin" is a developer option I have enabled which shows the progress of each of the web workers and how long it has been since their last progress message. It is grey as I have paused the upload.

The resume button will continue the upload and if a stall is detected will automatically attempt to reconnect and resume the upload (not from the start). The "Reconnect and Continue" button will tell FileSender to close the connections and remake then before resuming the upload. This can be useful if you move a laptop to another network. Even if you IP address and network have changed you should be able to resume the upload with "Reconnect and Continue".

The automatic resumes and the like on this page are all set up to try to complete the upload if possible. When things stall, multiple attempts are made to try to complete the transfer without needing any interaction from you. It is only if all that fails that an error is presented to that effect. Even at that stage you can set up the transfer again and resume without losing all the data that has already been transferred.

The plan is that after a while you will see the below screen informing you of a successful upload. In this case I have chosen to get a link to the uploaded file. This is the link I can follow to download the file and I can share that link with anybody I wish to be able to download the file(s).


The My Transfers page shows all the files you have uploaded. The files which have not expired are listed in the main (default) tab. The action icons all have hints on mouse over to let you know what they are for. As I am an admin user I have an extra option to extend the expire time for any transfer using the red clock. I have used a red overlay for admin privileged actions. In this case, as an admin I can extend the expire time as many times as I like.

When you expand a transfer with the + icon you see the details and can access the download link or see an audit log for the file. The ip address information might not be shown depending on how the FileSender instance is configured.

The My Transfers page can still be used even on smaller displays, which is very handy if people want to take a quick look at some data on a tablet or cast it from the tablet to a large screen at a meeting.



The guests section has the primary action allowing you to invite a guest to the FileSender server. I may combine the "can only send to me" and "get a link" options into a single drop down, as these two options are mutually exclusive, which is currently only indicated by the colour and a message when you try to select both at once. This is a hold over from the old UI code which I have been looking to update for a long time. Other options, like seeing the current guests or which transfers a guest has sent (optional, depending on server configuration and settings), are now shown as tabs at the top of the page. In the old UI these were below the invite form and might not even be known to exist at first.

The download page allows you to select individual files to download, or a subset of the files to download as either a tar or zip file. When a file is encrypted you can only download the subset as a zip64 file. I have used zip64 even for small archives in order to not surprise users when they start downloading larger files and find that some downloads might not be supported natively by the tools that come with their operating system. There is a nice free tool which is recommended if you visit this page on macOS.

The download process is a little tricky here as the files have each been encrypted in the browser. The server never sees the passphrase needed to decrypt the files. Once you start to download some files as a zip, they are sent in encrypted form from the server and decrypted in the browser before being added to a virtual zip64 file which is then streamed to your disk. The result is a zip64 file containing the decrypted files you have selected.

For real world use the files will likely be more interesting than just random test files I am using here :)

If you are new to FileSender and this looks interesting you can set up your own server using apache/nginx and php to serve it, mariadb/postgresql for database storage, and a big disk of your choice ;)

https://docs.filesender.org/v2.0/install/



,

Simon LyallAudiobooks – January 2021

The Esperanza Fire: Arson, Murder and the Agony of Engine 57 by John N. Maclean

An account of the fire that killed a five-person firefighter crew. Minute by minute of the fire itself, plus the investigation and the trial of the arsonist. 4/5

Range: Why Generalists Triumph in a Specialized World by David Epstein

An argument against early-specialisation and over-specialisation. How it fails against open non-predictable problems and environments. 4/5

The Vikings: A New History by Neil Oliver

A vaguely chronological introduction to the Vikings. Lots of first person descriptions of artifacts by the author. 3/5

Ultralearning: Master Hard Skills, Outsmart the Competition, and Accelerate Your Career by Scott Young

Examples and advice on how to learn a skill very quickly, usually via an intense method. Good practical advice mixed with some stories 3/5

81 Days Below Zero: The Incredible Survival Story of a World War II Pilot in Alaska’s Frozen Wilderness by Brian Murphy

An interesting survival story. The pilot survives a crash in a remote area & manages to walk out with minimal gear during winter. 3/5

Messy: How to Be Creative and Resilient in a Tidy-Minded World by Tim Harford

The unexpected connections between creativity and mess. Lots of examples although as one commentator noticed most of them were from people already masters not beginners. 3/5

Outliers: The Story of Success by Malcolm Gladwell

A book on how the most famous and successful are often there because their upbringing, practice or chance events pushed them to the top, rather than just raw talent.

The Book of Humans: The Story of How We Became Us by Adam Rutherford

How the latest research reveals the extent to which behaviors once thought exclusively human are also found in other species. Spoiler: except Culture. 3/5

Tank Action: An Armoured Troop Commander’s War 1944-45 by David Render and Stuart Tootal

The author is thrown into the war as a 19 year old officer in command of 4 tanks 5 days after D-Day. Very well written and lots of detail of the good and the bad. 4/5

My Scoring System

  • 5/5 = Brilliant, top 5 book of the year
  • 4/5 = Above average, strongly recommend
  • 3/5 = Average, in the middle 70% of books I read
  • 2/5 = Disappointing
  • 1/5 = Did not like at all


,

Jan SchmidtRift CV1 – Testing SteamVR

I’ve had a few people ask how to test my OpenHMD development branch of Rift CV1 positional tracking in SteamVR. Here’s what I do:

  • Make sure Steam + SteamVR are already installed.
  • Clone the SteamVR-OpenHMD repository:
git clone --recursive https://github.com/ChristophHaag/SteamVR-OpenHMD.git
  • Switch the internal copy of OpenHMD to the right branch:
cd subprojects/openhmd
git remote add thaytan-github https://github.com/thaytan/OpenHMD.git
git fetch thaytan-github
git checkout -b rift-kalman-filter thaytan-github/rift-kalman-filter
cd ../../
  • Use meson to build and register the SteamVR-OpenHMD binaries. You may need to install meson first (see below):
meson -Dbuildtype=release build
ninja -C build
./install_files_to_build.sh
./register.sh
  • It is important to configure in release mode, as the Kalman filtering code is generally too slow for real-time in debug mode (it has to run 2000 times per second).
  • Make sure your USB devices are accessible to your user account by configuring udev. See the OpenHMD guide here: https://github.com/OpenHMD/OpenHMD/wiki/Udev-rules-list
  • Please note – only Rift sensors on USB 3.0 ports will work right now. Supporting cameras on USB 2.0 requires someone implementing JPEG format streaming and decoding.
  • It can be helpful to test OpenHMD is working by running the simple example. Check that it’s finding camera sensors at startup, and that the position seems to change when you move the headset:
./build/subprojects/openhmd/openhmd_simple_example
  • Calibrate your expectations for how well tracking is working right now! Hint: It’s very experimental 🙂
  • Start SteamVR. Hopefully it should detect your headset and the light(s) on your Rift Sensor(s) should power on.

Meson

I prefer the Meson build system here. There’s also a cmake build for SteamVR-OpenHMD you can use instead, but I haven’t tested it in a while and it sometimes breaks as I work on my development branch.

If you need to install meson, there are instructions here – https://mesonbuild.com/Getting-meson.html summarising the various methods.

I use a copy in my home directory, but you need to make sure ~/.local/bin is in your PATH:

pip3 install --user meson
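
If ~/.local/bin isn't already on your PATH, a minimal sketch of adding it (assuming bash; adjust the file for your shell):

echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc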

,

Lev LafayetteAPA Style vs IEEE Style: Fight!

There is much that irks me in academia. The way that disciplines are almost randomly assigned to artium, scientiae, or legum, without any reference to their means of verification or falsification. Or, for that matter, the Dewey (or Universal) Decimal Classification for libraries, which, in its insanity, places computer applications in the same category as "Fundamentals of knowledge and culture" and "Propaedeutics". One could also ask why the value "Dead languages of unknown affiliation" also belongs with "Caucasian languages". I suppose most of them are "near dead", right?

Then there is the eye-watering level of digital illiteracy among academics, researchers, students, and professional staff. It is little wonder that closed-knowledge academic journals and proprietary software companies fleece the university sheep and make out like bandits. They don't even realise that they've been robbed, such is the practical ignorance of the lofty principles that they espouse. Ever received a document in a proprietary format explaining how important it is to make content accessible for the visually impaired? Yeah, it's like that all the time, a combination of hypocrisy combined with willful ignorance.

But I reserve a special spot in my hell for referencing systems. On the very basic level, a reference is a relationship where one object designates or links to another. A reference in an essay should link to some more elaborate reference in the text (whether by footnote, endnote, bibliography, hyperlink, etc), and that reference should refer to the actual publication. Really, that's all that should be necessary. When I first offered an opinion on this subject, a computer science graduate noted that in their discipline it was not unusual for academics not to care about the referencing style, "as long as I can find the source". Certainly, consistency in a single paper is better than inconsistency in that regard (following the principles of simplicity, clarity and frugality), but on the other end of the scale (pun not intended), there are poisonous snakes in many universities.

Minimal requirements and poisonous snakes aside, the good people at the Online Writing Lab at Purdue University have provided a summary of different referencing styles, even with an engine to generate the appropriate reference for APA and MLA, as much as that is possible. Using these suggestions and original sources I will engage in a comparison between the APA and IEEE styles. Giving away the conclusion in advance, despite my background in the social sciences, I prefer the IEEE style for reasons that will become evident.

Effectiveness

Effectiveness is a signal-to-noise ratio, the ability to provide content with frugality. "Provide" is an important verb here because it suggests the human actor, which of course psychologists would be remarkably aware of. It is a pity that they haven't taken this into account. Perhaps this is more representative of their own service fees. It serves little purpose seeking to provide a comprehensive taxonomy (as improbable as that is) if most of the text is based on developing special rules for unusual corner-cases, rather than providing basic principles from which effective elaborations can occur.

This is a problem with APA. The seventh edition weighs in at 427 pages, a terrifying increase from the 272 pages of the sixth edition, which was bad enough. To suggest that researchers must be familiar with this mighty tome in addition to their domain specialisation is, frankly, bordering on the offensively insane in their expectations that they wish to lump onto others. Have they even thought about what it must be like being an APA style guide expert? They'd be a real hit at parties! They probably have special parties of their own where they engage in fights over whether in-text citations should be author (date) or (author, date). For what it's worth, this actually is an on-going debate in APA.

In comparison, the IEEE reference guide is a mere twenty pages, with the same again for abbreviations and a list of publishers. In addition, to make the comparison equivalent, there is an IEEE style manual which is twenty-three pages. Combined, these cover pretty much the same content as the APA's Publication Manual, except they do so with a vastly reduced page count.

The APA seems to be infected by some version of Parkinson's Law, based on the discovery that the Colonial Office of the British Empire expanded whilst the Empire itself declined; leading to a situation where it had the greatest number of staff when it was folded into the Foreign Office due to a lack of colonies to administer. One would like to think that the increase in page count corresponds to an equivalent change in the material that needs to be covered. Even at a glance, however, this is not the case. Using an observation from Parkinson, it would seem that the page count of the APA Manual depends entirely on the number of bureaucrats promoting the Manual, which will expand by approximately 5% per annum, regardless of the work that needs to be done.

Bias

There absolutely can be no doubt of the work that the APA has done in reducing bias in language. Their Guidelines for nonsexist language in APA journals remains one of the most significant publications of its sort, and of course that original set of guidelines has been updated to become an entire chapter in the 7th edition and the scope expanded to include age, disability, gender, participation in research, race and ethnicity, sexual orientation, socioeconomic status, and intersectionality, etc.

Despite these good intentions, the citation method of the APA is biased. Psychologists, of all people, would know quite well that the vast majority of people will assign agreement or disagreement from a source, viscerally, rather than form a rational opinion based on the content. By default, our beliefs come from emotional responses, not the critical reasoning faculties of the recently evolved parts of the brain. Given that social norms are usually based on herd-loyalty towards people rather than facts, it is understandable, rather than acceptable, that APA requires an (author, date) inline citation method. For their own part, IEEE doesn't care who wrote a fact, as long as you can check it. Their citation method simply uses a numerical reference (e.g., [1]).

The problem with the APA method is that it does lead to biased judgments, especially on normative considerations. Consider the following, first in IEEE style and then in APA style:

"An absolute and permanent ban on vivisection is not only a necessary law to protect animals and to show sympathy with their pain, but it is also a law for humanity itself."[1]

Well, that sounds humane enough, doesn't it? Does your opinion change when the author is cited?

"An absolute and permanent ban on vivisection is not only a necessary law to protect animals and to show sympathy with their pain, but it is also a law for humanity itself." (Göring, 1933).

Does it make a difference? Rationally, it shouldn't. The quote is an accurate statement of a remark that was made. But on a visceral level, it probably does make a difference. Most people with a bit of moral cognition probably don't want to be associated with Göring regardless of their view on vivisection and, of course, people will play on that. Quoting a Nazi on a normative position generates an emotional response because of other normative positions carried out by the Nazis.

The use of inline author citation encourages such emotional reactions. Removing the named author citation reduces bias and asks the reader to evaluate a statement on its own merit. The IEEE probably didn't do that on purpose. But the effect is that propositions are not initially coloured by the author's name.

Posthumous Publications

The inline citation method of IEEE provides no information about the author, publication, or date. That means if a bold conjecture is asserted then it is up to the reader to check the quality of the reference. In contrast, the APA provides some of those details in the inline citation. But the combination is just weird: author and publisher's publication date. The combination means that the reader is simultaneously unaware of when the author actually made the remark, thus losing the historical context of the words, and of who determined that the author's remarks were worthy of publication. To be fair, I don't really care about the latter, especially not in these days where my favourite publications are electronic media and my favourite "publisher" is Library Genesis. But that combination; every time I see something like (Plato, 2014) part of my soul dies, pun intended. Poor Plato died over two thousand years ago.

Again, I find that the IEEE method is more sound. There is really no need to know inline who is being referenced or when the publication occurred, and certainly not in the combination required by APA. Where the author is relevant the author can be mentioned in the body of the text, and likewise for publications, such as a comparison between two translations.

Words, Words, Words: A Parting Shot

Most sound academics don't really care what referencing style one uses, as long as one is consistent. After all, it really is only the pettiest individuals, the "poisonous snakes", who slavishly adhere to someone else's style over their own lack of substance. Being a question of style, the problems I have noted with APA are really a matter of personal taste. The complexity of APA is something I dislike, and others might actually delight in the minutiae; sick, perverted, deviants, I tell you. It is possible as well to overlook the potential of name-based bias and perhaps even force one's self to overcome any unconscious biases that might otherwise creep in.

All this aside, it is perhaps worth noting that almost nobody follows the rules for style guides, including academics who have their own arbitrary insistence on using a particular system. I refer in particular to word count. As a veteran of several universities (Murdoch, Chifley, Salford, LSE-UoL, Otago, Melbourne, etc), I have had to note that everyone has their own way of conducting a word count in APA, which is fairly important because academics get grumpy if you exceed or fail to meet a word count by a certain percentage. Some include in-text citations, others don't. Some include the bibliography, others don't, and so forth. Yet the official rule from the APA is everything counts in the word count. A rule that is not used by any academic I've encountered and, for those who know how such systems work, will vary by the application's own method of counting words.

In summary, APA is poison, and whilst I am engaging in obvious hyperbole to say its advocates ought to be executed (I generally oppose the death penalty), it probably should be avoided in favour of a more rational referencing style. I do wonder, in moments of self-reflection, that my own journey from normative to factual academic content has influenced this view. So to conclude on a normative position, whilst the image meme is funny, the person himself is a pretty awful human being. This is the one and only time I'll be giving him any oxygen.

,

Francois MarierRecovering from a corrupt MariaDB index page

I ran into a corrupt MariaDB index page the other day and had to restore my MythTV database from the automatic backups I make as part of my regular maintenance tasks.

Signs of trouble

My troubles started when my daily backup failed on this line:

mysqldump --opt mythconverg -umythtv -pPASSWORD > mythconverg-20200923T1117.sql

with this error message:

mysqldump: Error 1034: Index for table 'recordedseek' is corrupt; try to repair it when dumping table `recordedseek` at row: 4059895

Comparing the dump that was just created to the database dumps in /var/backups/mythtv/, it was clear that it was incomplete since it was about 100 MB smaller.

I first tried a gentle OPTIMIZE TABLE recordedseek as suggested in this StackExchange answer but that caused the database to segfault:

mysqld[9141]: 2020-09-23 15:02:46 0 [ERROR] InnoDB: Database page corruption on disk or a failed file read of tablespace mythconverg/recordedseek page [page id: space=115871, page number=11373]. You may have to recover from a backup.
mysqld[9141]: 2020-09-23 15:02:46 0 [Note] InnoDB: Page dump in ascii and hex (16384 bytes):
mysqld[9141]:  len 16384; hex 06177fa70000...
mysqld[9141]:  C     K     c      {\;
mysqld[9141]: InnoDB: End of page dump
mysqld[9141]: 2020-09-23 15:02:46 0 [Note] InnoDB: Uncompressed page, stored checksum in field1 102203303, calculated checksums for field1: crc32 806650270, innodb 1139779342,  page type 17855 == INDEX.none 3735928559, stored checksum in field2 102203303, calculated checksums for field2: crc32 806650270, innodb 3322209073, none 3735928559,  page LSN 148 2450029404, low 4 bytes of LSN at page end 2450029404, page number (if stored to page already) 11373, space id (if created with >= MySQL-4.1.1 and stored already) 115871
mysqld[9141]: 2020-09-23 15:02:46 0 [Note] InnoDB: Page may be an index page where index id is 697207
mysqld[9141]: 2020-09-23 15:02:46 0 [Note] InnoDB: Index 697207 is `PRIMARY` in table `mythconverg`.`recordedseek`
mysqld[9141]: 2020-09-23 15:02:46 0 [Note] InnoDB: It is also possible that your operating system has corrupted its own file cache and rebooting your computer removes the error. If the corrupt page is an index page. You can also try to fix the corruption by dumping, dropping, and reimporting the corrupt table. You can use CHECK TABLE to scan your table for corruption. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.
mysqld[9141]: 200923 15:02:46 2020-09-23 15:02:46 0 [ERROR] InnoDB: Failed to read file './mythconverg/recordedseek.ibd' at offset 11373: Page read from tablespace is corrupted.
mysqld[9141]: [ERROR] mysqld got signal 11 ;
mysqld[9141]: Core pattern: |/lib/systemd/systemd-coredump %P %u %g %s %t 9223372036854775808 %h ...
kernel: [820233.893658] mysqld[9186]: segfault at 90 ip 0000557a229f6d90 sp 00007f69e82e2dc0 error 4 in mysqld[557a224ef000+803000]
kernel: [820233.893665] Code: c4 20 83 bd e4 eb ff ff 44 48 89 ...
systemd[1]: mariadb.service: Main process exited, code=killed, status=11/SEGV
systemd[1]: mariadb.service: Failed with result 'signal'.
systemd-coredump[9240]: Process 9141 (mysqld) of user 107 dumped core.#012#012Stack trace of thread 9186: ...
systemd[1]: mariadb.service: Service RestartSec=5s expired, scheduling restart.
systemd[1]: mariadb.service: Scheduled restart job, restart counter is at 1.
mysqld[9260]: 2020-09-23 15:02:52 0 [Warning] Could not increase number of max_open_files to more than 16364 (request: 32186)
mysqld[9260]: 2020-09-23 15:02:53 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=638234502026
...
mysqld[9260]: 2020-09-23 15:02:53 0 [Note] InnoDB: Recovered page [page id: space=115875, page number=5363] from the doublewrite buffer.
mysqld[9260]: 2020-09-23 15:02:53 0 [Note] InnoDB: Starting final batch to recover 2 pages from redo log.
mysqld[9260]: 2020-09-23 15:02:53 0 [Note] InnoDB: Waiting for purge to start
mysqld[9260]: 2020-09-23 15:02:53 0 [Note] Recovering after a crash using tc.log
mysqld[9260]: 2020-09-23 15:02:53 0 [Note] Starting crash recovery...
mysqld[9260]: 2020-09-23 15:02:53 0 [Note] Crash recovery finished.

and so I went with the nuclear option of dropping the MythTV database and restoring from backup.

Dropping the corrupt database

First of all, I shut down MythTV as root:

killall mythfrontend
systemctl stop mythtv-status.service
systemctl stop mythtv-backend.service

and took a full copy of my MariaDB databases just in case:

systemctl stop mariadb.service
cd /var/lib
apack /root/var-lib-mysql-20200923T1215.tgz mysql/
systemctl start mariadb.service

before dropping the MythTV database (mythconverg):

$ mysql -pPASSWORD

MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| mythconverg        |
| performance_schema |
+--------------------+
4 rows in set (0.000 sec)

MariaDB [(none)]> drop database mythconverg;
Query OK, 114 rows affected (25.564 sec)

MariaDB [(none)]> quit
Bye

Restoring from backup

Then I re-created an empty database:

mysql -pPASSWORD < /usr/share/mythtv/sql/mc.sql

and restored the last DB dump prior to the detection of the corruption:

sudo -i -u mythtv
/usr/share/mythtv/mythconverg_restore.pl --directory /var/backups/mythtv --filename mythconverg-1350-20200923010502.sql.gz
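
Optionally, the restored tables can be given a quick sanity check at this point. This is an extra step beyond the original procedure, a hedged sketch using mysqlcheck with the same credentials as above:

mysqlcheck -umythtv -pPASSWORD --check mythconverg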

In order to restart everything properly, I simply rebooted the machine:

systemctl reboot

,

Tim RileyOpen source status update, January 2021

I had a very satisfying January in Ruby OSS work! This month was all about overhauling the dry-system internals. That I’ve written about this in both November and December just goes to show (a) how long things actually take when you’re doing this on the side (and I’m not lazing about, I spend at least 3 nights a week working on OSS), and (b) just how much there was going on inside of dry-system.

So to set the scene, here’s the circuitous path I took in adding rich component directory configuration to dry-system. This is the commit history before I tidied it:

2020-11-24 Add some early WIP
2020-11-25 Accept pre-configured component dirs
2020-11-25 Configure with block as part of initialize
2020-11-25 Provide path when initializing ComponentDir
2020-12-01 Add Rubocop rule
2020-12-01 Allow ComponentDirs to be cloned
2020-12-01 Clarify names
2020-12-01 Fix wording of spec
2020-12-01 Fixup naming
2020-12-01 Start getting component_dirs in place
2020-12-01 Update auto-registrar to use component_dirs
2020-12-01 Update specs for component_dirs
2020-12-03 Total messy WIP
2020-12-23 Get some WIP laid down on Booter#find_component
2020-12-23 Remove some WIP comments
2020-12-23 Tidy
2020-12-23 Update file_exists? behavior
2021-01-04 Add error
2021-01-04 Get things closer
2021-01-04 Provide custom dry-configurable cloneable value
2021-01-04 Remove unused settings
2021-01-04 Use a Concurrent::Map
2021-01-05 Add FIXME about avoiding config.default_namespace
2021-01-05 Introduce ComponentDir with behaviour separate to config
2021-01-05 Remove note, now that Loader#call is doing a require!, we’re fine
2021-01-05 Remove top-level default_namespace config
2021-01-05 Tidy Component
2021-01-05 Update FIXME
2021-01-11 Add docs
2021-01-11 Add docs for file exists
2021-01-11 Document Booter#boot_files method as public
2021-01-11 Don’t preserve file_path when namespacing component
2021-01-11 Expand docs
2021-01-11 Fix
2021-01-11 Flesh out ComponentDir
2021-01-11 Remove TODO
2021-01-11 Remove unused method
2021-01-11 Rip out the Component cache
2021-01-11 Stop setting file attribute on Component
2021-01-11 Tidy up Component file_path attr
2021-01-11 Tweak names
2021-01-11 Use a faster way of building namespaces path
2021-01-12 Use cloneable option for component_dirs setting
2021-01-14 Do not load components with auto_register: false
2021-01-14 Initialize components directly from AutoRegistrar
2021-01-14 Remove stale comment
2021-01-14 Remove unneeded requiring of component
2021-01-14 Scan file for magic comment options when locating
2021-01-14 Use dry-configurable master
2021-01-15 Add spec (and adjust approach) for skipping lazy loading of auto-register-disabled components
2021-01-15 Add unit tests for ComponentDir
2021-01-15 Tidy AutoRegistrar
2021-01-16 Add extra attributes to component equalizer
2021-01-16 Add unit tests for Component.locate
2021-01-16 Make load_component easier to understand
2021-01-18 Use base Dry::Container missing component error

Yep. 56 commits and just under two calendar months of work, with the break in early December being my stint doing Advent of Code. Luckily, that “Total messy WIP” left me with a passing test suite after some heavy refactoring, but it did take a day or two to figure out just what I was doing again! Note to self: leave more notes to self.

Rich, independent component directory configuration for dry-system

The (tidied) pull request for this change has a lengthy description, focused on implementation. If you’re interested in the details, please have a read!

Here’s the long and the short of it, though: previously, dry-system would let you configure a top-level auto_register setting, which would contain an array of string paths within the container root, which the system would use to populate the container. This would often be used alongside another top-level setting, default_namespace, which would strip away a common namespace prefix from the container identifiers, and a call to .load_paths! for each directory being auto-registered, to ensure the source files within those directories could be properly required:

class MyApp::Container < Dry::System::Container
  configure do |config|
    config.root = __dir__
    config.auto_register = ["lib"]
    config.default_namespace = "my_app"
  end

  load_paths! "lib"
end

Those are three different things you would need to know how to use just right in order to set up a properly working dry-system container. Luckily, most users could copy a working example and then tweak from there. Also, users would typically only set up a single directory for auto-registration, so those three elements would only need to apply to that one directory. If you ever tried to do more (for example, now that we have an autoloading loader, configure one directory to use the autoloader and another not to), things would fall apart.

This brings us to the rich component directory configuration, and indeed the introduction of a “Component Directory” as a first-class concept within dry-system. Here’s how a container setup would look now:

class MyApp::Container < Dry::System::Container
  configure do |config|
    config.root = __dir__

    config.component_dirs.add "lib" do |dir|
      dir.auto_register = proc do |component|
        !component.path.match?(%r{/entities/})
      end
      dir.add_to_load_path = false
      dir.loader = Dry::System::Loader::Autoloading
      dir.default_namespace = "my_app"

      # Also available, `dir.memoize`, accepting a boolean or proc
    end
  end
end

Now the behavior for handling a given component directory can be configured on that directory and that directory alone. In the above example, another component directory could be added with diametrically opposed settings to the first, and everything will still be dandy!

As you can also see, the degree of configurability has also increased greatly over the released versions of dry-system. Now you can opt into or out of auto-registration for specific components by passing a proc to the auto_register setting. Memoization of registered components can also be enabled, disabled, or configured specifically with the memoize setting.

(While you’re here, also check out the dry-configurable change I made to allow cloneable setting values, without which we couldn’t have provided this rich nested API for configuring particular directories)

Consistent component loading behavior, including magic comments!

With the changes above in place, I could remove the .auto_register! container class method (done in this pull request, also with its own lengthy description), which leaves the component_dirs setting as the only way to tell dry-system how to load components from source files.

Not only does this make for an easier to configure container, it also supports a more consistent component loading experience! Now, every configurable aspect of component loading is respected in the container’s two methods of auto-registering components: either via finalizing the container (which loads everything up front and freezes the container) or via lazy-loading (which loads components just in time, and is useful for keeping container load time down when running unit tests or using an interactive console, among other things).

It also means that magic comments within source files are respected in all cases, where previously, only a subset of comments were considered, and only when finalizing a container, not during lazy-loading.

This means you can now have a source file like this:

# auto_register: false

class MyEntity
end

And MyEntity will never find its way into your container.

Or you can have a source file like this:

# memoize: true

class MySpecialComponent
end

And when the component is registered, it will be memoized automatically.

Magic comments for dry-system are great, I use them all the time, and now they’re even more powerful!

More consistent, easier to understand dry-system internals

I’ve worked in the dry-system codebase quite regularly over the last few years, and certain parts have always felt a little too complicated, often leaving me confused, or at least afraid to change them. This is no discredit to everyone who worked on dry-system previously! It’s an amazing innovation, and its features just grew organically over the years to make it the capable, powerful system it is today!

However, given I was going to be deep in the code again to implement the changes I wanted, I took the chance to refactor as much as I could. And I’m just delighted in the outcome! For example, check out how .load_component and .load_local_component (which are used for lazy-loading components) used to look:

def load_component(key, &block)
  return self if registered?(key)

  component(key).tap do |component|
    if component.bootable?
      booter.start(component)
    else
      root_key = component.root_key

      if (root_bootable = component(root_key)).bootable?
        booter.start(root_bootable)
      elsif importer.key?(root_key)
        load_imported_component(component.namespaced(root_key))
      end

      load_local_component(component, &block) unless registered?(key)
    end
  end

  self
end

def load_local_component(component, default_namespace_fallback = false, &block)
  if booter.bootable?(component) || component.file_exists?(component_paths)
    booter.boot_dependency(component) unless finalized?

    require_component(component) do
      register(component.identifier) { component.instance }
    end
  elsif !default_namespace_fallback
    load_local_component(component.prepend(config.default_namespace), true, &block)
  elsif manual_registrar.file_exists?(component)
    manual_registrar.(component)
  elsif block_given?
    yield
  else
    raise ComponentLoadError, component
  end
end

And here’s how they look now:

def load_component(key)
  return self if registered?(key)

  component = component(key)

  if component.bootable?
    booter.start(component)
    return self
  end

  booter.boot_dependency(component)
  return self if registered?(key)

  if component.file_exists?
    load_local_component(component)
  elsif manual_registrar.file_exists?(component)
    manual_registrar.(component)
  elsif importer.key?(component.root_key)
    load_imported_component(component.namespaced(component.root_key))
  end

  self
end

def load_local_component(component)
  if component.auto_register?
    register(component.identifier, memoize: component.memoize?) { component.instance }
  end
end

Just look at that improvement! We went from a pair of methods that always confused me (with their mixed responsibilities, multiple conditionals and levels of nesting) to a simple top-to-bottom flow in .load_component, and .load_local_component reduced to a simple 3-liner with just a single job.

Weeks later, I’m still marvelling at this. I think it’s one of the best refactorings I’ve ever done.

These improvements didn’t come on their own. As you might notice there, component is carrying a lot more of its own weight. This includes a new set of methods for finding and loading components from within component directories (namely Dry::System::Component.locate and .new_from_component_dir), and indeed the new Dry::System::ComponentDir abstraction itself, which together provide the consistent component loading behavior I described above.

Dry::System::Loader converted to a class interface

One thing I noticed during the work on component loading is that a new Dry::System::Loader would be instantiated for every component, even though it carried no other state apart from the component itself, so I turned it into a stateless, class-level interface, and hey presto, we save an object allocation for every component we load.

This is a breaking change, but hey, so is everything else I’ve described so far! I figure this is the right time to sort these things out before we look to a dry-system 1.0 release sometime in the next few months (which is seeming much more attractive after this round of work!).

I appreciated being appreciated 🥺

Given how significant my plans were for all these changes, I made sure to keep Piotr and the other maintainers in the loop over those couple of months of work.

Then, when Piotr reviewed my first finished pull request for this work, he left me the most amazing comment. I want to repeat it here in full (that is, to take it straight in the pool room):

@timriley thanks for this very detailed description, it made much more sense for me to carefully read it and understand the changes rather than to examine the diff. Since what you did here, conceptually, makes perfect sense to me, AND it resulted in simplified implementation which at the same time makes the library much more powerful, I have nothing to complain about 😆 FWIW - I’ve read the diff and nothing stands out as potentially problematic. I reckon seeing it work in real-world apps will be a much better verification process, it’s a huge change after all.

This is clearly a huge milestone and I honestly didn’t expect that the lib will be so greatly improved prior 1.0.0, so thank you for this huge effort, really

One thing I’ll probably experiment with would be a backward-compatibility shim so that previous methods (that you removed) could still work. This should make it simpler to upgrade, but please tell me if this is a stupid idea.

I will also upgrade dry-rails!

Tim, seriously, again, this is an epic refactoring, I’m so happy. You’re taking dry-system to the next level. Thank you! 🚀 🙇

After a full year of labouring away at this stuff, often to uncertain or, frankly, even unknowable ends, a comment like this has just given me the fuel to go another year more. Thank you, Piotr ♥

If there’s someone out there in OSS land whose work you appreciate, please take the time to tell them! It might mean more than you think.

Next steps with dry-system, Hanami, and Zeitwerk, and the alpha2 release

Now that the bulk of the dry-system work is done, here’s what I’m looking to get done next:

  • Run some final tests using the dry-system branches within a real application
  • Work with Piotr to coordinate related changes to dry-rails (which configures dry-system auto-registration)
  • Update Hanami to configure component_dirs within its own dry-system containers
  • Then work out how to enable Zeitwerk within Hanami and use the autoloading loader for its component directories by default, while still providing a clean way for application authors to opt out if they’d rather use traditional require-based code loading

I expect this will take most of February. Once this is done, we’ll finally be in the clear for a 2.0.0.alpha2 release of Hanami. Focusing on Zeitwerk and dry-system has pushed it back a couple of months, but I hope everyone will agree it was worth the wait!

Thank you to my sponsors! 🙌

Thanks to my GitHub sponsors for your continuing support! If you’re reading this and would like to support my work in making Hanami 2.0 a reality, I’d really appreciate your support.

See you all next month!

,

Lev LafayetteEazy-Photoz EasyBuild Script and Test Cases

"EAZY is a photometric redshift code designed to produce high-quality redshifts for situations where complete spectroscopic calibration samples are not available", which is a pretty excellent project. However, it has a few issues that are illustrative of typical problems when developers aren't thinking in terms of operations. I like this software, it carries out a valuable scientific task with ease, which will be awesome in an HPC environment, especially when run as a job array. However, the developers have assumed a single-user, single-machine environment and haven't been very flexible with their design. Hopefully, these notes will make life easier for others in a similar situation.

1. There are no releases
It is perfectly acceptable to have a continuous process of code improvement that can be saved as commits to a repository. It is, in fact, very good practice to do so. However, it makes life a lot easier for people if one picks a point in time and commits a release version. Why? Because when software changes, results can change. Following the instructions provided (download latest repository, make, and run) may mean that researchers will get different results using the same software, on the same machine, with the same datasets because the software has changed, and the researchers don't realise it. This is not conducive to the basic principles of science. Releasing versions makes a world of difference in that regard. The suggested process of updating the checked-out repository is not particularly helpful, as it is important that researchers be able to check against prior versions as well.

As a work around for this issue the following is offered as an EasyBuild script. It is far from ideal; the version number is a date, based on the last commit to the repository. The source tarball has to be created manually. But it is better than the alternative, as it specifies a version (of sorts), a compiler, and dependencies used. The EasyBuild script will generate an LMod environment module.

easyblock = 'MakeCp'
name = 'eazy-photoz'
version = '20201002'

homepage = 'https://github.com/gbrammer/eazy-photoz/ '
description = """EAZY is a photometric redshift code designed to produce high-quality redshifts for situations where complete spectroscopic calibration samples are not available."""

toolchain = {'name': 'GCC', 'version': '8.3.0'}

# source tarball needs to be created manually,
# because of lack of proper releases
# git clone https://github.com/gbrammer/eazy-photoz
# tar cfvz eazy-photoz-20201002.tar.gz eazy-photoz
sources = [SOURCELOWER_TAR_GZ]

builddependencies = [('binutils', '2.32')]

files_to_copy = ["*" ]

start_dir = 'src'

sanity_check_paths = {
        'files': ["eazy"],
        'dirs': [""]
}

moduleclass = 'phys'
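
With the script saved, a hedged sketch of building it and loading the resulting module follows; the easyconfig filename assumes EasyBuild's usual name-version-toolchain convention, and the exact module name will depend on the site's module naming scheme:

eb eazy-photoz-20201002-GCC-8.3.0.eb --robot
module load eazy-photoz/20201002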

2. Hard-coded Symlinks
It is typical in many environments to have an installation in a significantly different location to the source-code. In multi-user environments it is pretty normal to separate access to the code, the binaries, and user datasets. In this case the hard-coded symlinks used in EAZY will cause problems as the expected files will not be present.

To work around this I have copied the symlinked files FILTER.RES.latest and FILTER.RES.latest.info from the filters directory. The source specifies the following symlinks in the input directory.

FILTER.RES.latest -> ../filters/FILTER.RES.latest
FILTER.RES.latest.info -> ../filters/FILTER.RES.latest.info
templates -> ../templates
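
As a hedged sketch of that work-around (paths relative to the checkout are assumed, and copying the templates directory as well is my assumption for completeness), the symlinks can be replaced with real copies so they travel with the test case:

cd eazy-photoz/inputs
rm FILTER.RES.latest FILTER.RES.latest.info templates
cp ../filters/FILTER.RES.latest .
cp ../filters/FILTER.RES.latest.info .
cp -r ../templates templates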

A test-case is created with the inputs directory and the symlinked content as included files. The suggested example, using the HDF-N catalog (Fernandez-Soto et al. 1999), will work with the following Slurm script. The names of modules may vary according to the system.

#!/bin/bash
#SBATCH --job-name=eazy-photoz-test.slurm
#SBATCH --ntasks=1
#SBATCH -t 0:15:00

# Load the environment variables
module purge
module load eazy-photoz/20201002

cd inputs
mkdir OUTPUT
eazy # generates param file
eazy -p zphot.param.default

,

Jan SchmidtRift CV1 – Pose rejection

I spent some time this weekend implementing a couple of my ideas for improving the way the tracking code in OpenHMD filters and rejects (or accepts) possible poses when trying to match visible LEDs to the 3D models for each device.

In general, the tracking proceeds in several steps (in parallel for each of the 3 devices being tracked):

  1. Do a brute-force search to match LEDs to 3D models, then (if matched)
    1. Assign labels to each LED blob in the video frame saying what LED they are.
    2. Send an update to the fusion filter about the position / orientation of the device
  2. Then, as each video frame arrives:
    1. Use motion flow between video frames to track the movement of each visible LED
    2. Use the IMU + vision fusion filter to predict the position/orientation (pose) of each device, and calculate which LEDs are expected to be visible and where.
  3. Try and match up and refine the poses using the predicted pose prior and labelled LEDs. In the best case, the LEDs are exactly where the fusion predicts they’ll be. More often, the orientation is mostly correct, but the position has drifted and needs correcting. In the worst case, we send the frame back to step 1 and do a brute-force search to reacquire an object.

The goal is to always assign the correct LEDs to the correct device (so you don’t end up with the right controller in your left hand), and to avoid going back to the expensive brute-force search to re-acquire devices as much as possible.

What I’ve been working on this week is steps 1 and 3 – initial acquisition of correct poses, and fast validation / refinement of the pose in each video frame, and I’ve implemented two new strategies for that.

Gravity Vector matching

The first new strategy is to reject candidate poses that don’t closely match the known direction of gravity for each device. I had a previous implementation of that idea which turned out to be wrong, so I’ve re-worked it and it helps a lot with device acquisition.

The IMU accelerometer and gyro can usually tell us which way up the device is (roll and pitch) but not which way they are facing (yaw). The measure for ‘known gravity’ comes from the fusion Kalman filter covariance matrix – how certain the filter is about the orientation of the device. If that variance is small this new strategy is used to reject possible poses that don’t have the same idea of gravity (while permitting rotations around the Y axis), with the filter variance as a tolerance.

Partial tracking matches

The 2nd strategy is based around tracking with fewer LED correspondences once a tracking lock is acquired. Initial acquisition of the device pose relies on some heuristics for how many LEDs must match the 3D model. The general heuristic threshold I settled on for now is that 2/3rds of the expected LEDs must be visible to acquire a cold lock.

With the new strategy, if the pose prior has a good idea where the device is and which way it’s facing, it allows matching on far fewer LED correspondences. The idea is to keep tracking a device even down to just a couple of LEDs, and hope that more become visible soon.

While this definitely seems to help, I think the approach can use more work.

Status

With these two new approaches, tracking is improved but still quite erratic. Tracking of the headset itself is quite good now and for me rarely loses tracking lock. The controllers are better, but have a tendency to “fly off my hands” unexpectedly, especially after fast motions.

I have ideas for more tracking heuristics to implement, and I expect a continuous cycle of refinement on the existing strategies and new ones for some time to come.

For now, here’s a video of me playing Beat Saber using tonight’s code. The video shows the debug stream that OpenHMD can generate via Pipewire, showing the camera feed plus overlays of device predictions, LED device assignments and tracked device positions. Red is the headset, Green is the right controller, Blue is the left controller.

Initial tracking is completely wrong – I see some things to fix there. When the controllers go offline due to inactivity, for example, the code keeps trying to match LEDs to them, and then there are some things wrong with how it’s relabelling LEDs when they get incorrect assignments.

After that, there are periods of good tracking with random tracking losses on the controllers – those show the problem cases to concentrate on.

,

Chris SmartPodman volumes and SELinux

In my previous post on volumes with podman, I touched on SELinux but I think it’s worthy of its own post to work through some details.

If your host has SELinux enabled, then container processes are confined to the system_u:system_r:container_t:s0 domain. Volumes you pass to podman will need to have appropriate labels, otherwise the container won’t be able to access the volume, no matter what the filesystem permissions are.

When running rootless containers (as your non-root user), files in your homedir will probably have a context like unconfined_u:object_r:user_home_t:s0, however the context that is required for container volumes is system_u:object_r:container_file_t:s0.

Fortunately, container volumes which podman creates at runtime will have the appropriate context set automatically. However, for host-dir volumes podman will not change the context by default; you will need to tell it to.

Let’s spin up a busybox container without setting the SELinux context and note that the container is not able to access the host-dir volume.

$ mkdir src

$ ls -dZ src/
unconfined_u:object_r:user_home_t:s0 src/

$ podman run -dit --name busybox -v ./src:/dest busybox
395924189c95d05dae65d5616fbdd7054095b1c3318603642e03047af9c893c7

$ podman exec -it busybox touch /dest/file
touch: /dest/file: Permission denied

Shared labels

OK, let’s delete that container and spin it up again but with the :z SELinux volume option.

$ podman rm -f busybox
395924189c95d05dae65d5616fbdd7054095b1c3318603642e03047af9c893c7

$ podman run -dit --volume ~/src:/dest:z --name busybox busybox
e5176a7acee86e17b42d417e7d2c570b2ecc51cccf6fd938688998d176e60df1

$ podman exec -it busybox touch /dest/file

OK, it didn’t error, that’s good! Let’s have a look on the host.

$ ls -Z ./src/file
system_u:object_r:container_file_t:s0 ./src/file

Great! We were able to write the file and we can see on the host that it exists with the correct SELinux context.

Thus, the :z option is critical as it tells podman to at least set the context to system_u:object_r:container_file_t:s0.
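
The same label can also be applied manually with standard SELinux tooling before handing the directory to podman. A minimal sketch, assuming the ./src directory from above:

$ chcon -R -t container_file_t ./src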

Note however, that with this context, SELinux will not stop any other container from being able to access that same directory. Yes, that can introduce a security risk if applied incorrectly (or perhaps through a vulnerability), but it’s also how you would share the same volume between multiple containers.
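
To see that sharing in action, a hedged sketch (the second container name is an assumption): attach the same host directory to another container with :z and it is able to write to it as well.

$ podman run -dit --volume ~/src:/dest:z --name busybox2 busybox

$ podman exec -it busybox2 touch /dest/another-file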

Private labels

So what if you wanted to restrict a volume to a specific container only? Well, that’s what the UPPERCASE :Z option is all about. It not only tells podman to set the context on the volume, like lowercase :z, but it also ensures that other containers are not able to access it.

How does it do this? Each container process also has unique MCS (Multi-Category Security) categories. This is what podman uses for the private label, setting the SELinux context on the volume to match those of the process. For example, if the process runs in the confined domain with unique MCS categories c123,c456 then the volume context will be set to match, e.g. system_u:object_r:container_file_t:s0:c123,c456.

Let’s try a real world example with busybox running the top command (note the UPPERCASE :Z option).

$ mkdir -p src

$ podman run -d --name busybox-top -v ./src:/dest:Z busybox top
eceb13fee0e79cd328fde948e4dd9f1b39cddfc04f147e5eebcbfd2ee2d6abae

Looking for the container’s top process running on the host we can see that its MCS label is c260,c602.

$ ps -eZ | grep container_t |grep top
system_u:system_r:container_t:s0:c260,c602 26474 pts/0 00:00:00 top

Now let’s look at the host directory for the volume and note that it has a matching MCS label of c260,c602.

$ ls -Zd ./src
system_u:object_r:container_file_t:s0:c260,c602 src

Note that if you attach that same host-dir volume to multiple containers, only the last container with that volume attached will be able to access it as the context is updated each time.

Proving protection with private labels

Let’s spin up a second busybox container running iostat command this time, using the same host dir volume.

$ podman run -d --name busybox-iostat -v ./src:/dest:Z busybox iostat 1
1ad1ee6413f0b222468a2f540c4316a6a504c25245435fb51ee125f465514576

Let’s grab the MCS label for the iostat process, which we can see is c327,c995.

$ ps -eZ | grep container_t |grep iostat
system_u:system_r:container_t:s0:c327,c995 26876 pts/0 00:00:00 iostat

Now we can see that the label for the host dir has changed to match this new container (it used to be c260,c602).

$ ls -Zd ./src
system_u:object_r:container_file_t:s0:c327,c995 src

Finally, if we try to touch a file inside each of the containers, we’ll see the original busybox container now fails.

$ podman exec busybox-top touch /dest/file
touch: /dest/file: Permission denied

However, the second container running iostat works, and the file has a matching label.

$ podman exec busybox-iostat touch /dest/file

$ ls -Z src/file
system_u:object_r:container_file_t:s0:c327,c995 src/file

This shows how uppercase :Z is the more secure of the two SELinux volume options.

,

Chris SmartVolumes and rootless Podman

Containers generally run from an image and have no access to the host file system. This is fine for stand-alone ephemeral containers, but others require persistent data. While we can pass in environment variables to configure the app inside the container, sometimes a place to read and write more complex files is needed. When you update or replace such a container (perhaps with a new release) you want the new container to have access to the same data as the previous one.

The tricky thing with rootless containers is that you’re not root on the host and, as per my previous post, containers can run as any user id. If the container runs as root (uid 0) then that is fine as it actually maps to your non-root user on the host (e.g. 1000) and management of the data is therefore easy. However, containers running as other users (e.g. 123) will map to a uid on the host based on the subuid offset range (e.g. 100122) which will not match your non-root user and therefore data management is harder.
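
A quick way to see the subuid range your user has been allocated (a minimal sketch; the file location is standard):

$ grep $USER /etc/subuid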

Providing persistent storage to a container is done by setting up a bind mount using the --volume (or -v) option with podman. Note that volumes are not restricted to just one per container, simply repeat the -v option for as many volumes as required!

The volume option uses a colon separated format specifying the source on the host and then the destination directory where the volume is mounted to inside the container (e.g. src:/dest).

The source is generally one of two types; either a container volume (like a container image) or a regular host directory.

Container volumes

The simplest and easiest form of persistent data is to use a container volume, created at the time of container launch, as podman will configure it with the right permissions. These volumes are actually just a directory on the host, but managed nicely with the podman volume set of commands. You can create them manually if you prefer.

If the source of the volume is just a name and not a path then podman expects a volume. If the volume does not already exist when podman run is executed, it will be created automatically and have the appropriate SELinux context set for you.

$ podman run -dit --volume src:/dest busybox

You will then be able to see the src volume on the host (along with any others).

$ podman volume ls

Host dir volumes

If the source is either a relative or absolute path, then podman expects this to be a directory on the host (which must already exist on the host).

Host dir volumes effectively work the same way as container volumes, in that they are a bind mount of a directory; it’s just that you have to specify the path and it has to already exist. Furthermore, the permissions need to be correct, else the container will not be able to access the directory. It must also have the correct SELinux context, although podman can set that for us at runtime if we use the :z option (more on that in the next section).

This mounts the directory /src on the host to /dest inside the container.

$ podman run -dit --volume /src:/dest:z busybox

Volume options

Volumes support a number of comma separated options after their source and destination declaration (separated by another colon), such as z (or Z) for setting the SELinux context, ro for making the volume read only, and others such as noexec and nosuid which enforce those restrictions.

It’s easy to test, let’s try to write something to a volume that’s read only (:ro option).

$ podman run -dit --rm --name busybox -v src:/dest:ro docker.io/busybox
48a99f8e0454cb035adfd95097e5f92f6704a96153aeebcdfa63747637212d1f

$ podman exec busybox touch /dest/file
touch: /dest/file: Read-only file system
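
Multiple options can also be combined with commas on the same volume; for example, something like this should mount the volume read-only with noexec and nosuid applied as well:

$ podman run -dit --rm -v src:/dest:ro,noexec,nosuid docker.io/busybox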

A note about SELinux

I have a follow-up post about SELinux, but in short, if your host has SELinux enabled then make sure you at least include the :z option after declaring your volumes (or :Z if you want the volume to be restricted to that one container). This will make sure that the appropriate context is applied to the volume, otherwise the container will not be able to access it.
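
For example (the host paths here are just placeholders and would need to exist first), a directory used by a single container could be passed with :Z, while :z suits a directory shared between containers:

$ podman run -dit -v ~/private:/data:Z --name app1 busybox
$ podman run -dit -v ~/shared:/data:z --name app2 busybox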

Volumes and uids

As per my previous post, we know that images run their process inside the container as a specific user. For example, by default busybox runs as root (uid 0) while grafana runs as grafana (uid 472).

If you run a rootless container with an image that is configured to run its process as root (uid 0), then it actually runs on the host as your non-root user (e.g. 1000). For volumes in this case, passing in any host directory owned by your non-root host user (for example, a directory in your home dir) will work just fine. The root user in the container ultimately maps to the same uid as your host user and so files and dirs have the same permissions.

Thanks to subuid ranges (see previous post), we can actually run multiple rootless containers, each with a different user ID, as a form of separation and security. This means the process from the container runs on the host with a unique uid, but unlike root, that uid will be different to our non-root user id. This can make persistent data more tricky, as your host user’s uid won’t match the subuids used by the container, and thus directories and files will have different permissions. This is not a problem if the data is only managed inside the container. We’ll investigate this a bit further in a future post; for now let’s understand how volumes work.

Volumes and rootless containers, running as root

Let’s take a look at a rootless busybox container (which runs internally as root) and see what permissions the source (src volume) gets mounted with inside the container (at /dest).

$ podman run -dit --volume src:/dest --name busybox busybox
4ff42120b65c79ed42a48f6532c5cd259b79ec6e80351fbe30404496542a99ec

$  podman exec busybox ls -ld /dest
drwxr-xr-x    1 root     root             0 Jan 29 23:07 /dest

Great! We can see that the permissions are correctly set to root inside the container.

We now also have a brand new volume called src which podman has created for us, which we can inspect.

$ podman volume inspect src
[
    {
        "Name": "src",
        "Driver": "local",
        "Mountpoint": "/home/csmart/.local/share/containers/storage/volumes/src/_data",
        "CreatedAt": "2021-01-30T10:07:18.340278469+11:00",
        "Labels": {},
        "Scope": "local",
        "Options": {},
        "UID": 0,
        "GID": 0,
        "Anonymous": false
    }
]

Note that the volume is really a directory at the location /home/csmart/.local/share/containers/storage/volumes/src/_data. If we look at that directory on the host, we can see the permissions match our non-root user (remember, root in a container maps to our non-root user on the host).

$ id
uid=1000(csmart) gid=1000(csmart) groups=1000(csmart)

$ ls -ldZ /home/csmart/.local/share/containers/storage/volumes/src/_data
drwxr-xr-x. 1 csmart csmart system_u:object_r:container_file_t:s0 0 Jan 30 10:07 _data

Also note that the SELinux context of the _data directory is set to container_file_t (the parent directory will instead be data_home_t, as per regular home dirs). This is set for you automatically here, but it matters: without that context, the container will not be able to operate on the directory. It becomes particularly important when using host dirs instead of volumes, which we’ll see later.

OK, now let’s create a file on the host and see that it appears inside the container.

$ touch /home/csmart/.local/share/containers/storage/volumes/src/_data/file

$ podman exec -it busybox ls -l /dest/file
-rw-rw-r--    1 root     root             0 Jan 29 23:10 file

Similarly, we can touch a file in the container and have it appear on the host.

$ podman exec busybox touch /dest/file2

$ ls -laZ /home/csmart/.local/share/containers/storage/volumes/src/_data/file*
-rw-rw-r--. 1 csmart csmart unconfined_u:object_r:container_file_t:s0 0 Jan 30 10:10 file
-rw-r--r--. 1 csmart csmart system_u:object_r:container_file_t:s0     0 Jan 30 10:11 file2

OK, so that hopefully shows how volumes and containers interact. It’s easy when the container runs as root, as it matches your host user.

Let’s clean up that container and volume.

$ podman rm -f busybox
4ff42120b65c79ed42a48f6532c5cd259b79ec6e80351fbe30404496542a99ec

$ podman volume remove src
src

Host-dir volumes and rootless containers, running as root

Using a host-dir volume is easy when running a rootless container as root because the uids match. Just make the directory and use the full path when passing in the volume. Note that if you’re running SELinux you must specify either :z or :Z directly after the destination argument for the volume, in order to have podman set the right SELinux context on the directory.

$ mkdir ~/src

$ podman run -dit --volume ~/src:/dest:z --name busybox busybox
57181b4ab70173f8796fda62509082e56ce334a51805c9689fee09e868cc9753

$ podman exec busybox ls -ld /dest
drwxr-xr-x    1 root     root             0 Jan 29 23:07 /dest

This is effectively the same as the rootless container running as root that we ran above with a container volume, except that the path is different (e.g. ~/src instead of /home/csmart/.local/share/containers/storage/volumes/src/_data). Everything covered there also applies here.

Volumes and rootless containers, running as non-root

Let’s create a new container running as a different user (123) and we can see that inside the container it uses 123 but on the host it uses 100122 (remembering that according to our subuid map, uid 1 in a container maps to user 100000 on the host).

$ podman run -dit --volume src:/dest --user 123:123 --name busybox busybox
d4babfb54b2d162c4edc6061d0a2f28f666bec38433243923f418ac2d80dfa5f

$ podman exec busybox ls -ld /dest
drwxr-xr-x    1 123      123              0 Jan 29 23:14 /dest

Again, we can see that the container volume was re-created and mounted with the appropriate permissions for user 123 inside the container.

Looking at the directory on the host, we can see that it has been set up with the uid of 100122. This is correct because the container is not running as root, and so the subuid offset is being used.

$ ls -ldZ /home/csmart/.local/share/containers/storage/volumes/src/_data
drwxr-xr-x. 1 100122 100122 system_u:object_r:container_file_t:s0 0 Jan 30 10:14 _data

Of course, if we try to write to that directly now as our non-root user on the host, it will fail.

$ touch /home/csmart/.local/share/containers/storage/volumes/src/_data/file
touch: cannot touch 'file': Permission denied

Great! So, when it comes to running lots of rootless containers with different uids, the easiest way to do this with persistent data is with container volumes.

Host-dir volumes and rootless containers, running as non-root

OK, so this is where things start to get a little more interesting: the uid in the container is NOT the same as our user (and we’re not root on the host), and podman will not help us out here (as it does with container volumes above).

We cannot just create a host directory as our non-root host user and pass it through, as inside the container the directory will appear to be owned by root, not the container’s user.

$ mkdir src

$ ls -lZd src
drwxrwxr-x. 2 csmart csmart system_u:object_r:container_file_t:s0 6 Jan 31 19:15 src

$ podman run -dit --volume ./src:/dest:z --user 123:123 --name busybox busybox
bd9b8e7685acaa3b03380a10724bd2c06ddaa48dbfb67b49339068e16f51d57c

$ podman exec busybox id
uid=123(123) gid=123(123)

$ podman exec busybox ls -ld /dest
drwxrwxr-x    2 root     root             6 Jan 31 08:15 /dest

$ podman exec busybox touch /dest/file
touch: /dest/file: Permission denied

Obviously the container user is not able to write to the volume. So what do we do? Well, we need to change the permissions so that they match the user (similar to what podman does for us automatically when using a container volume).

If you have root on the box, that’s pretty easy.

$ chown 100122:100122 src

$ ls -lZd src
drwxrwxr-x. 2 100122 100122 system_u:object_r:container_file_t:s0 6 Jan 31 19:15 src

$ podman exec busybox touch /dest/file

$ ls -lZ src/file
-rw-r--r--. 1 100122 100122 system_u:object_r:container_file_t:s0 0 Jan 31 19:20 src/file

OK, but remember we’re running rootless containers, so how do you do that as your regular old non-root user if you don’t have root on the host?

$ chown 100122:100122 src/
chown: changing ownership of 'src/': Operation not permitted

Remember that podman unshare command (see previous post), which runs under a new user namespace? We can use this to set the permissions, but remember that we don’t actually set it to 100122 as it will appear on the host; we use the container uid of 123:123.

$ podman unshare chown 123:123 ./src

$ ls -lZd src
drwxrwxr-x. 2 100122 100122 system_u:object_r:container_file_t:s0 18 Jan 31 19:20 src

Now the host directory has the right permissions and the container user will be able to write just fine!

$ podman exec busybox touch /dest/file

$ ls -lZ src/file
-rw-r--r--. 1 100122 100122 system_u:object_r:container_file_t:s0 0 Jan 31 19:23 src/file

Accessing data as non-root user and non-root container user

Although the two approaches are very similar, if you want to be able to access the data both as your non-root user on the host and as the user inside the container, you’re probably better off using host dir volumes over container volumes, as they are easier to see and manage.

In a future post I’ll talk about how we can modify these host directories so that we can access them both as our non-root host user and also as the user in the container. I think this post has gotten long enough for now…

,

Jan SchmidtHitting a milestone – Beat Saber!

I hit an important OpenHMD milestone tonight – I completed a Beat Saber level using my Oculus Rift CV1!

I’ve been continuing to work on integrating Kalman filtering into OpenHMD, and on improving the computer vision that matches and tracks device LEDs. While I suspect no one will be completing Expert levels just yet, it’s working well enough that I was able to play through a complete level of Beat Saber. For a long time this has been my mental benchmark for tracking performance, and I’m really happy 🙂

Check it out:

I should admit at this point that completing this level took me multiple attempts. The tracking still has quite a tendency to lose track of controllers, or to get them confused and swap hands suddenly.

I have a list of more things to work on. See you at the next update!

Chris SmartUser IDs and (rootless) containers with Podman

When a Linux container image is created, like any system it can have specific users, or maybe it only has root (uid 0). Containers have a specific entry point which runs the program the image was created for, and this might run as any user in the image; it’s up to whoever created the image.

You can use podman (a daemonless container engine) to easily see what uid an image will use, by getting the container to run the id command instead of the default entry point.

For example, here we can see that busybox wants to run as root (uid 0).

# podman run --rm --entrypoint '' docker.io/busybox id
uid=0(root) gid=0(root) groups=0(root)

However, grafana wants to run as the grafana user with uid 472.

# podman run --rm --entrypoint '' docker.io/grafana/grafana id
uid=472(grafana) gid=0(root) groups=0(root)

OK, so inside the containers we are running as different users, but because we’re running podman as root, those same uids are also used on the host system.

Running containers as root

Let’s run a grafana container as root and see that the actual grafana-server process on the host is running under uid 472.

# podman run -d docker.io/grafana/grafana
ee275572c8dcb0922722b7d6c3e5ff0bda8c9af3682018c4fc53675d8e189e59
# ps -o user $(pidof grafana-server)
USER
472

Now, remember how busybox wanted to run as uid 0? Let’s run top in the busybox container and see that the process does indeed run as root on the host.

# podman run -d docker.io/busybox top
478a50a9054b36fc1e1c0f0dc005ae4393d60ecbbd6ba2bf5021b255c5d3d133
# ps -o user $(pidof top)
USER
root

So, running a container as root will use whatever uid is inside the container to run its process on the host. This might “conflict” with other users already on the system, for example if uid 472 already exists. Furthermore, as with any process on a host, it’s probably not ideal to run it as root.

We can, however, override the uid that’s used in the container with the --user option. For example, here I’m telling the container to run as uid 1000, which means the top process will actually run as my non-root csmart user on the host.

# podman run --user 1000:1000 -d docker.io/busybox top
699449a882b0a6402728176f2773bc87d55ada8115ef55eeca2cba465a70a018
# ps -o user $(pidof top)
USER
csmart

While we can run containers as root and have their processes execute as a non-root user on the host (which is good), there are still a few downsides. For example, it requires root access in the first place, parts of the container stack (such as conmon) are still running as root, and a vulnerability somewhere in the stack might render the user protection useless.

Running rootless containers

As with any Linux process, it’s safer if we can run a container as a non-root user.

When running the container as a non-root user, however, how can the container run as root (uid 0) when you aren’t root on the host? We need a way to allow the container to be root on the inside, but not on the actual host system running it.

Fortunately this is possible and managed with rootless containers via /etc/subuid and /etc/subgid config files. This sets different uid and gid range offsets for each user, so while multiple users might run the same container with the same internal uid, it will get translated to a different uid on the host, thus avoiding conflicts.

Running rootless containers as root, as you

One of the neat things this does is to map root (uid 0) in the container to your non-root user on the host. This way when a container runs as root, it’s actually running as you!

Let’s take a look. My current non-root user on my host is csmart which has a uid of 1000.

$ id -u
1000

Looking at the subuid file we can see that my user has a range of 65536 subuids starting from 100000.

$ cat /etc/subuid
csmart:100000:65536
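
The /etc/subgid file works the same way for group IDs and will typically have a matching entry, something like:

$ cat /etc/subgid
csmart:100000:65536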

Using the podman unshare command (which runs under a new user namespace) we can confirm the user range is applied for our user.

$ podman unshare cat /proc/self/uid_map
0 1000 1
1 100000 65536

OK, but what does that mean? Well, here we can see that for my user, the root account with uid 0 in a container actually maps to the 1000 uid of our non-root user on the host.

Then, the uid of 1 in a container for my user would map to 100000 on the host, 2 would be 100001 and so on. To use grafana as an example, running it in a rootless container with uid of 472 would map to 100471 on the host for my user.

We can also use the id command, along with podman unshare again, to compare our uid outside a container (as non-root user) and inside a container (as root).

$ id
uid=1000(csmart) gid=1000(csmart) groups=1000(csmart)
$ podman unshare id
uid=0(root) gid=0(root) groups=0(root)

Great. So running a container with user root (uid 0) will translate to our non-root user on the host (uid 1000 in this case).

So, let’s see what happens when my non-root user runs the top command in a busybox container (remember busybox runs as root inside the container, but I’m running it as my non-root user 1000 on the host).

$ podman run -d --name busybox docker.io/busybox top
72a40c16be71020b0f4be6af447c55b32a6fd406a97d4861bbf2794a50b04a5d
$ podman exec -it busybox id
uid=0(root) gid=0(root)
$ ps -o user $(pidof top)
USER
csmart

So, if you have a container that wants to run as root, this will automatically be translated to your regular non-root user on the host.

However, if the container you’re running uses a different uid (such as grafana with uid 472), then you can always pass in the option --user 0:0 to make it run as root inside the container (which then runs as your non-root user on the host).

$ podman run --user 0:0 -d --name grafana docker.io/grafana/grafana
d44b5e61d856e585c57ab0922d8f19f7d7eeed6f9a7fbabb149bb344fe20f955
$ podman exec -it grafana id
uid=0(root) gid=0(root)
$ ps -o user $(pidof grafana-server)
USER
csmart

Running rootless containers as other uids, as you

Remember though, you don’t have to run the container as root and have it translate to your own user; you have a full 65536 subuids you can run as! This way you can also achieve some isolation between your own containers, while still taking advantage of rootless containers.

OK, so what if you have a container that does not use root, what happens? Easy! The uid gets mapped to the offsets in the subuid file. Remember that grafana wants to run as uid 472? Well this will translate to 100471 on the host for my user (uid 1000).

$ podman run -d docker.io/grafana/grafana
541bef2aea52fa2a45e440bc5bf1d13797b83750405a794c2153af3e61580040
$ ps -o user $(pidof grafana-server)
USER
100471

So, we can run lots of containers as different users and keep them separate both from each other and from any other users’ containers, all without needing root on the host. Great!

Again, you can manage which user each container runs as by passing in the --user option to podman and having it map into your own subuid space.
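
As a quick sketch (the container name here is just an example), explicitly asking for uid 472 should behave the same way as the grafana default above, with the process showing up as 100471 on the host:

$ podman run --user 472:472 -d --name busybox-472 docker.io/busybox top
$ ps -o user $(pidof top)
USER
100471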

Running multiple containers works well, but it does get more tricky if you need to pass in directories on the host for persistent storage. That will probably be the topic of another post…

,

Sam WatkinsDeveloping CZ, a dialect of C that looks like Python

In my experience, the C programming language is still hard to beat, even 50 years after it was first developed (and I feel the same way about UNIX). When it comes to general-purpose utility, low-level systems programming, performance, and portability (even to tiny embedded systems), I would choose C over most modern or fashionable alternatives. In some cases, it is almost the only choice.

Many developers believe that it is difficult to write secure and reliable software in C, due to its free pointers, the lack of enforced memory integrity, and the lack of automatic memory management; however in my opinion it is possible to overcome these risks with discipline and a more secure system of libraries constructed on top of C and libc. Daniel J. Bernstein and Wietse Venema are two developers who have been able to write highly secure, stable, reliable software in C.

My other favourite language is Python. Although Python has numerous desirable features, my favourite is the light-weight syntax: in Python, block structure is indicated by indentation, and braces and semicolons are not required. Apart from the pleasure and relief of reading and writing such light and clear code, which almost appears to be executable pseudo-code, there are many other benefits. In C or JavaScript, if you omit a trailing brace somewhere in the code, or insert an extra brace somewhere, the compiler may tell you that there is a syntax error at the end of the file. These errors can be annoying to track down, and cannot occur in Python. Python not only looks better, the clear syntax helps to avoid errors.

The obvious disadvantage of Python, and other dynamic interpreted languages, is that most programs run far slower than C programs. This limits the scope and generality of Python. No AAA or performance-oriented video game engines are programmed in Python. The language is not suitable for low-level systems programming, such as operating system development, device drivers, filesystems, performance-critical networking servers, or real-time systems.

C is a great all-purpose language, but the code is uglier than Python code. Once upon a time, when I was experimenting with the Plan 9 operating system (which is built on C, but lacks Python), I missed Python’s syntax, so I decided to do something about it and write a little preprocessor for C. This converts from a “Pythonesque” indented syntax to regular C with the braces and semicolons. Having forked a little dialect of my own, I continued from there adding other modules and features (which might have been a mistake, but it has been fun and rewarding).

At first I called this translator Brace, because it added in the braces for me. I now call the language CZ. It sounds like “C-easy”. Ease-of-use for developers (DX) is the primary goal. CZ has all of the features of C, and translates cleanly into C, which is then compiled to machine code as normal (using any C compiler; I didn’t write one); and so CZ has the same features and performance as C, but enjoys a more pleasing syntax.

CZ is now self-hosted, in that the translator is written in the language CZ. I confess that originally I wrote most of it in Perl; I’m proficient at Perl, but I consider it to be a fairly ugly language, and overly complicated.

I intend for CZ’s new syntax to be “optional”: ideally a developer will be able to choose to use the normal C syntax when editing CZ, if they prefer it. For this, I need a tool to convert C back to CZ, which I have not fully implemented yet. I am aware that, in addition to traditionalists, some vision-impaired developers prefer to use braces and semicolons, as screen readers might not clearly indicate indentation. A C to CZ translator would of course also be valuable when porting an existing C program to CZ.

CZ has a number of useful features that are not found in standard C, but I did not go so far as C++, which language has been described as “an octopus made by nailing extra legs onto a dog”. I do not consider C to be a dog, at least not in a negative sense; but I think that C++ is not an improvement over plain C. I am creating CZ because I think that it is possible to improve on C, without losing any of its advantages or making it too complex.

One of the most interesting features I added is a simple syntax for fast, light coroutines. I based this on Simon Tatham’s approach to Coroutines in C, which may seem hacky at first glance, but is very efficient and can work very well in practice. I implemented a very fast web server with very clean code using these coroutines. The cost of switching coroutines with this method is little more than the cost of a function call.

CZ has hygienic macros. The regular cpp (C preprocessor) macros are not hygienic, and many people consider them hacky and unsafe to use. My CZ macros are safe, and somewhat more powerful than standard C macros. They can be used to neatly add new program control structures. I have plans to further develop the macro system in interesting ways.

I added automatic prototype and header generation, as I do not like having to repeat myself when copying prototypes to separate header files. I added support for the UNIX #! scripting syntax, and for cached executables, which means that CZ can be used like a scripting language without having to use a separate compile or make command, but the programs are only recompiled when something has been changed.

For CZ, I invented a neat approach to portability without conditional compilation directives. Platform-specific library fragments are automatically included from directories having the name of that platform or platform-category. This can work very well in practice, and helps to avoid the nightmare of conditional compilation, feature detection, and Autotools. Using this method, I was able easily to implement portable interfaces to features such as asynchronous IO multiplexing (aka select / poll).

The CZ library includes flexible error handling wrappers, inspired by W. Richard Stevens’ wrappers in his books on Unix Network Programming. If these wrappers are used, there is no need to check return values for error codes, and this makes the code much safer, as an error cannot accidentally be ignored.

CZ has several major faults, which I intend to correct at some point. Some of the syntax is poorly thought out, and I need to revisit it. I developed a fairly rich library to go with the language, including safer data structures, IO, networking, graphics, and sound. There are many nice features, but my CZ library is more a prototype than a finished product; there are major omissions, and some features are misconceived or poorly implemented. The misfeatures should be weeded out for the time being, or moved to an experimental section of the library.

I think that a good software library should come in two parts, the essential low-level APIs with the minimum necessary functionality, and a rich set of high-level convenience functions built on top of the minimal API. I need to clearly separate these two parts in order to avoid polluting the namespaces with all sorts of nonsense!

CZ is lacking a good modern system of symbol namespaces. I can look to Python for a great example. I need to maintain compatibility with C, and avoid ugly symbol encodings. I think I can come up with something that will alleviate the need to type anything like gtk_window_set_default_size, and yet maintain compatibility with the library in question. I want all the power of C, but it should be easy to use, even for children. It should be as easy as BASIC or Processing, a child should be able to write short graphical demos and the like, without stumbling over tricky syntax or obscure compile errors.

Here is an example of a simple CZ program which plots the Mandelbrot set fractal. I think that the program is fairly clear and easy to understand, although there is still some potential to improve and clarify the code.

#!/usr/local/bin/cz --
use b
use ccomplex

Main:
	num outside = 16, ox = -0.5, oy = 0, r = 1.5
	long i, max_i = 50, rb_i = 30
	space()
	uint32_t *px = pixel()  # CONFIGURE!
	num d = 2*r/h, x0 = ox-d*w_2, y0 = oy+d*h_2
	for(y, 0, h):
		cmplx c = x0 + (y0-d*y)*I
		repeat(w):
			cmplx w = c
			for i=0; i < max_i && cabs(w) < outside; ++i
				w = w*w + c
			*px++ = i < max_i ? rainbow(i*359 / rb_i % 360) : black
			c += d

I wrote a more elaborate variant of this program, which generates images like the one shown below. There are a few tricks used: continuous colouring, rainbow colours, and plotting the logarithm of the iteration count, which makes the plot appear less busy close to the black fractal proper. I sell some T-shirts and other products with these fractal designs online.

An image from the Mandelbrot set, generated by a fairly simple CZ program.

I am interested in graph programming, and have been for three decades since I was a teenager. By graph programming, I mean programming and modelling based on mathematical graphs or diagrams. I avoid the term visual programming, because there is no necessary reason that vision impaired folks could not use a graph programming language; a graph or diagram may be perceived, understood, and manipulated without having to see it.

Mathematics is something that naturally exists, outside time and independent of our universe. We humans discover mathematics, we do not invent or create it. One of my main ideas for graph programming is to represent a mathematical (or software) model in the simplest and most natural way, using relational operators. Elementary mathematics can be reduced to just a few such operators:

  • + : add, subtract, disjoint union, zero
  • × : multiply, divide, cartesian product, one
  • ^ : power, root, logarithm
  • sin, cos, sin⁻¹, cos⁻¹, hypot, atan2
  • δ : differential, integral

(a set of minimal relational operators for elementary math)

I think that a language and notation based on these few operators (and similar) can be considerably simpler and more expressive than conventional math or programming languages.

CZ is for me a stepping-stone toward this goal of an expressive relational graph language. It is more pleasant for me to develop software tools in CZ than in C or another language.

Thanks for reading. I wrote this article during the process of applying to join Toptal, which appears to be a freelancing portal for top developers; and in response to this article on toptal: After All These Years, the World is Still Powered by C Programming.

My CZ project has been stalled for quite some time. I foolishly became discouraged after receiving some negative feedback. I now know that honest negative feedback should be valued as an opportunity to improve, and I intend to continue the project until it lacks glaring faults, and is useful for other people. If this project or this article interests you, please contact me and let me know. It is much more enjoyable to work on a project when other people are actively interested in it!

Russell CokerLinks January 2021

Krebs on Security has an informative article about web notifications and how they are being used for spamming and promoting malware [1]. He also includes links for how to permanently disable them. If nothing else clicking “no” on each new site that wants to send notifications is annoying.

Michael Stapelberg wrote an insightful post about inefficiencies in the Debian development processes [2]. While I agree with most of his assessment of Debian issues I am not going to decrease my involvement in Debian. Of the issues he mentions, the two that seem to have the best effort-to-reward ratio are improvements to mailing list archives (to ideally make it practical to post to lists without subscribing and read responses in the archives) and the issue of forgetting all the complexities of the development process, which can be alleviated by better Wiki pages. In my Debian work I’ve contributed more to the Wiki in recent times but not nearly as much as I should.

Jacobin has an insightful article “Ending Poverty in the United States Would Actually Be Pretty Easy” [3].

Mark Brown wrote an interesting blog post about the Rust programming language [4]. He links to a couple of longer blog posts about it. Rust has some great features and I’ve been meaning to learn it.

Scientific American has an informative article about research on the spread of fake news and memes [5]. Something to consider when using social media.

Bruce Schneier wrote an insightful blog post on whether there should be limits on persuasive technology [6].

Jonathan Dowland wrote an interesting blog post about git rebasing and lab books [7]. I think it’s an interesting thought experiment to compare the process of developing code worthy of being committed to a master branch of a VCS to the process of developing a Ph.D thesis.

CBS has a disturbing article about the effect of Covid19 on people’s lungs [8]. Apparently it usually does more lung damage than long-term smoking and even 70%+ of people who don’t have symptoms of the disease get significant lung damage. People who live in heavily affected countries like the US now have to worry that they might have had the disease and got lung damage without knowing it.

Russ Allbery wrote an interesting review of the book “Because Internet” about modern linguistics [9]. The topic is interesting and I might read that book at some future time (I have many good books I want to read).

Jonathan Carter wrote an interesting blog post about CentOS Streams and why using a totally free OS like Debian is going to be a better option for most users [10].

Linus has slammed Intel for using ECC support as a way of segmenting the market between server and desktop to maximise profits [11]. It would be nice if a company made a line of Ryzen systems with ECC RAM support, but most manufacturers seem to be in on the market segmentation scam.

Russ Allbery wrote an interesting review of the book “Can’t Even” about millennials as the burnout generation and the blame that the corporate culture deserves for this [12].

Gary PendergastWordPress Importers: Free (as in Speech)

Back at the start of this series, I listed four problems within the scope of the WordPress Importers that we needed to address. Three of them are largely technical problems, which I covered in previous posts. In wrapping up this series, I want to focus exclusively on the fourth problem, which has a philosophical side as well as a technical one — but that does not mean we cannot tackle it!

Problem Number 4

Some services work against their customers, and actively prevent site owners from controlling their own content.

Some services are merely inconvenient: they provide exports, but it often involves downloading a bunch of different files. Your CMS content is in one export, your store products are in another, your orders are in another, and your mailing list is in yet another. It’s not ideal, but they at least let you get a copy of your data.

However, there’s another class of services that actively work against their customers. It’s these services I want to focus on: the services that don’t provide any ability to export your content — effectively locking people in to using their platform. We could offer these folks an escape! The aim isn’t to necessarily make them use WordPress, it’s to give them a way out, if they want it. Whether they choose to use WordPress or not after that is immaterial (though I certainly hope they would, of course). The important part is freedom of choice.

It’s worth acknowledging that this is a different approach to how WordPress has historically operated in relation to other CMSes. We provide importers for many CMSes, but we previously haven’t written exporters. However, I don’t think this is a particularly large step: for CMSes that already provide exports, we’d continue to use those export files. This is focussed on the few services that try to lock their customers in.

Why Should WordPress Take This On?

There are several aspects to why we should focus on this.

First of all, it’s the WordPress mission. Underpinning every part of WordPress is the simplest of statements:

Democratise Publishing

The freedom to build. The freedom to change. The freedom to share.

These freedoms are the pillars of a Free and Open Web, but they’re not invulnerable: at times, they need to be defended, and that needs people with the time and resources to offer a defence.

Which brings me to my second point: WordPress has the people who can offer that defence! The WordPress project has so many individuals working on it, from such a wide variety of backgrounds, we’re able to take on a vast array of projects that a smaller CMS just wouldn’t have the bandwidth for. That’s not to say that we can do everything, but when there’s a need to defend the entire ecosystem, we’re able to devote people to the cause.

Finally, it’s important to remember that WordPress doesn’t exist in a vacuum, we’re part of a broad ecosystem which can only exist through the web remaining open and free. By encouraging all CMSes to provide proper exports, and implementing them for those that don’t, we help keep our ecosystem healthy.

We have the ability to take on these challenges, but we have a responsibility that goes alongside. We can’t do it solely to benefit WordPress, we need to make that benefit available to the entire ecosystem. This is why it’s important to define a WordPress export schema, so that any CMS can make use of the export we produce, not just WordPress. If you’ll excuse the imagery for a moment, we can be the knight in shining armour that frees people — then gives them the choice of what they do with that freedom, without obligation.

How Can We Do It?

Moving on to the technical side of this problem, I can give you some good news: the answer is definitely not screen scraping. 😄 Scraping a site is fragile, impossible to transform into the full content, and provides an incomplete export of the site: anything that’s only available in the site dashboard can’t be obtained through scraping.

I’ve recently been experimenting with an alternative approach to solving this problem. Rather than trying to create something resembling a traditional exporter, it turns out that modern CMSes provide the tools we need, in the form of REST APIs. All we need to do is call the appropriate APIs, and collate the results. The fun part is that we can authenticate with these APIs as the site owner, by calling them from a browser extension! So, that’s what I’ve been experimenting with, and it’s showing a lot of promise.

If you’re interested in playing around with it, the experimental code is living in this repository. It’s a simple proof of concept, capable of exporting the text content of a blog on a Wix site, showing that we can make a smooth, comprehensive, easy-to-use exporter for any Wix site owner.

Screenshot of the "Free (as in Speech)" browser extension UI.

Clicking the export button starts a background script, which calls Wix’s REST APIs as the site owner, to get the original copy of the content. It then packages it up, and presents it as a WXR file to download.

Screenshot of a Firefox download dialog, showing a Wix site packaged up as a WXR file.

I’m really excited about how promising this experiment is. It can ultimately provide a full export of any Wix site, and we can add support for other CMS services that choose to artificially lock their customers in.

Where Can I Help?

If you’re a designer or developer who’s excited about working on something new, head on over to the repository and check out the open issues: if there’s something that isn’t already covered, feel free to open a new issue.

Since this is new ground for a WordPress project, both technically and philosophically, I’d love to hear more points of view. It’s being discussed in the WordPress Core Dev Chat this week, and you can also let me know what you think in the comments!

This post is part of a series, talking about the WordPress Importers, their history, where they are now, and where they could go in the future.

,

Gary PendergastWordPress Importers: Defining a Schema

While schemata are usually implemented using language-specific tools (eg, XML uses XML Schema, JSON uses JSON Schema), they largely use the same concepts when talking about data. This is rather helpful: we don’t need to make a decision on data formats before we can start thinking about how the data should be arranged.

Note: Since these concepts apply equally to all data formats, I’m using “WXR” in this post as shorthand for “the structured data section of whichever file format we ultimately use”, rather than specifically referring to the existing WXR format. 🙂

Why is a Schema Important?

It’s fair to ask: if the WordPress Importers have survived this entire time without a formal schema, why would we need one now?

There are two major reasons why we haven’t needed one in the past:

  • WXR has remained largely unchanged in the last 10 years: there have been small additions or tweaks, but nothing significant. There’s been no need to keep track of changes.
  • WXR is currently very simple, with just a handful of basic elements. In a recent experiment, I was able to implement a JavaScript-based WXR generator in just a few days, entirely by referencing the Core implementation.

These reasons are also why it would help to implement a schema for the future:

  • As work on WXR proceeds, there will likely need to be substantial changes to what data is included: adding new fields, modifying existing fields, and removing redundant fields. Tracking these changes helps ensure any WXR implementations can stay in sync.
  • These changes will result in a more complex schema: relying on the source to re-implement it will become increasingly difficult and error-prone. Following Gutenberg’s lead, it’s likely that we’d want to provide official libraries in both PHP and JavaScript: keeping them in sync is best done from a source schema, rather than having one implementation copy the other.

Taking the time to plan out a schema now gives us a solid base to work from, and it allows for future changes to happen in a reliable fashion.

WXR for all of WordPress

With a well defined schema, we can start to expand what data will be included in a WXR file.

Media

Interestingly, many of the challenges around media files are less to do with WXR, and more to do with importer capabilities. The biggest headache is retrieving the actual files, which the importer currently handles by trying to retrieve the file from the remote server, as defined in the wp:attachment_url node. In context, this behaviour is understandable: 10+ years ago, personal internet connections were too slow to be moving media around, it was better to have the servers talk to each other. It’s a useful mechanism that we should keep as a fallback, but the more reliable solution is to include the media file with the export.

Plugins and Themes

There are two parts to plugins and themes: the code, and the content. Modern WordPress sites require plugins to function, and most are customised to suit their particular theme.

For exporting the code, I wonder if a tiered solution could be applied:

  • Anything from WordPress.org would just need its slug, since it can be re-downloaded during import. Particularly as WordPress continues to move towards an auto-updated future, modified versions of plugins and themes are explicitly not supported.
  • Third party plugins and themes would be given a filter to use, where they can provide a download URL that can be included in the export file.
  • Third party plugins/themes that don’t provide a download URL would either need to be skipped, or zipped up and included in the export file.

For exporting the content, WXR already includes custom post types, but doesn’t include custom settings, or custom tables. The former should be included automatically, and the latter would likely be handled by an appropriate action for the plugin to hook into.

Settings

There are currently a handful of special settings that are exported, but (as I just noted, particularly with plugins and themes being exported) this would likely need to be expanded to include most items in wp_options.

Users

Currently, the bare minimum information about users who’ve authored a post is included in the export. This would need to be expanded to include more user information, as well as users who aren’t post authors.

WXR for parts of WordPress

The modern use case for importers isn’t just to handle a full site, but to handle keeping sites in sync. For example, most news organisations will have a staging site (or even several layers of staging!) which is synchronised to production.

While it’s well outside the scope of this project to directly handle every one of these use cases, we should be able to provide the framework for organisations to build reliable platforms on. Exports should be repeatable, objects in the export should have unique identifiers, and the importer should be able to handle any subset of WXR.

WXR Beyond WordPress

Up until this point, we’ve really been talking about WordPress→WordPress migrations, but I think WXR is a useful format beyond that. Instead of just containing direct exports of the data from particular plugins, we could also allow it to contain “types” of data. This turns WXR into an intermediary language, exports can be created from any source, and imported into WordPress.

Let’s consider an example. Say we create a tool that can export a Shopify, Wix, or GoDaddy site to WXR, how would we represent an online store in the WXR file? We don’t want to export in the format that any particular plugin would use, since a WordPress Core tool shouldn’t be advantaging one plugin over others.

Instead, it would be better if we could format the data in a platform-agnostic way, which plugins could then implement support for. As luck would have it, Schema.org provides exactly the kind of data structure we could use here. It’s been actively maintained for nearly nine years, it supports a wide variety of data types, and is intentionally platform-agnostic.

Gazing into my crystal ball for a moment, I can certainly imagine a future where plugins could implement and declare support for importing certain data types. When handling such an import (assuming one of those plugins wasn’t already installed), the WordPress Importer could offer them as options during the import process. This kind of seamless integration allows WordPress to show that it offers the same kind of fully-featured site building experience that modern CMS services do.

Of course, reality is never quite as simple as crystal balls and magic wands make them out to be. We have to contend with services that provide incomplete or fragmented exports, and there are even services that deliberately don’t provide exports at all. In the next post, I’ll be writing about why we should address this problem, and how we might be able to go about it.

This post is part of a series, talking about the WordPress Importers, their history, where they are now, and where they could go in the future.

,

Gary PendergastWordPress Importers: Getting Our House in Order

The previous post talked about the broad problems we need to tackle to bring our importers up to speed, making them available for everyone to use.

In this post, I’m going to focus on what we could do with the existing technology, in order to give us the best possible framework going forward.

A Reliable Base

Importers are an interesting technical problem. Much like you’d expect from any backup/restore code, importers need to be extremely reliable. They need to comfortably handle all sorts of unusual data, and they need to keep it all safe. Particularly considering their age, the WordPress Importers do a remarkably good job of handling most content you can throw at them.

However, modern development practices have evolved and improved since the importers were first written, and we should certainly be making use of such practices, when they fit with our requirements.

For building reliable software that we expect to largely run by itself, a variety of comprehensive automated testing is critical. This ensures we can confidently take on the broader issues, safe in the knowledge that we have a reliable base to work from.

Testing must be the first item on this list. A variety of automated testing gives us confidence that changes are safe, and that the code can continue to be maintained in the future.

Data formats must be well defined. While this is useful for ensuring data can be handled in a predictable fashion, it’s also a very clear demonstration of our commitment to data freedom.

APIs for creating or extending importers should be straightforward to hook into.

Performance Isn’t an Optional Extra

With sites constantly growing in size (and with the export files potentially gaining a heap of extra data), we need to care about the performance of the importers.

Luckily, there’s already been some substantial work done on this front:

There are other groups in the WordPress world who’ve made performance improvements in their own tools: gathering all of that experience is a relatively quick way to bring in production-tested improvements.

The WXR Format

It’s worth talking about the WXR format itself, and determining whether it’s the best option for handling exports into the future. XML-based formats are largely viewed as a relic of days gone past, so (if we were to completely ignore backwards compatibility for a moment) is there a modern data format that would work better?

The short answer… kind of. 🙂

XML is actually well suited to this use case, and (particularly when looking at performance improvements) is the only data format for which PHP comes with a built-in streaming parser.

That said, WXR is basically an extension of the RSS format: as we add more data to the file that clearly doesn’t belong in RSS, there is likely an argument for defining an entirely WordPress-focused schema.

Alternative Formats

It’s important to consider what the priorities are for our export format, which will help guide any decision we make. So, I’d like to suggest the following priorities (in approximate priority order):

  • PHP Support: The format should be natively supported in PHP, though it is still workable if we need to ship an additional library.
  • Performant: Particularly when looking at very large exports, it should be processed as quickly as possible, using minimal RAM.
  • Supports Binary Files: The first comments on my previous post asked about media support; we clearly should be treating it as a first-class citizen.
  • Standards Based: Is the format based on a documented standard? (Another way to ask this: are there multiple different implementations of the format? Do those implementations all function the same?)
  • Backward Compatible: Can the format be used by existing tools with no changes, or minimal changes?
  • Self Descriptive: Does the format include information about what data you’re currently looking at, or do you need to refer to a schema?
  • Human Readable: Can the file be opened and read in a text editor?

Given these priorities, what are some options?

WXR (XML-based)

Either the RSS-based schema that we already use, or a custom-defined XML schema, the arguments for this format are pretty well known.

One argument that hasn’t been well covered is how there’s a definite trade-off when it comes to supporting binary files. Currently, the importer tries to scrape the media file from the original source, which is not particularly reliable. So, if we were to look at including media files in the WXR file, the best option for storing them is to base64 encode them. Unfortunately, that would have a serious effect on performance, as well as readability: adding huge base64 strings would make even the smallest exports impossible to read.

Either way, this option would be mostly backwards compatible, though some tools may require a bit of reworking if we were to substantially change the schema.

WXR (ZIP-based)

To address the issues with media files, an alternative option might be to follow the path that Microsoft Word and OpenOffice use: put the text content in an XML file, put the binary content into folders, and compress the whole thing.

This addresses the performance and binary support problems, but is initially worse for readability: if you don’t know that it’s a ZIP file, you can’t read it in a text editor. Once you unzip it, however, it does become quite readable, and has the same level of backwards compatibility as the XML-based format.

JSON

JSON could work as a replacement for XML in both of the above formats, with one additional caveat: there is no streaming JSON parser built in to PHP. There are 3rd party libraries available, but given the documented differences between JSON parsers, I would be wary about using one library to produce the JSON, and another to parse it.

This format largely wouldn’t be backwards compatible, though tools which rely on the export file being plain text (eg, command line tools to do broad search-and-replaces on the file) can be modified relatively easily.

There are additional subjective arguments (both for and against) the readability of JSON vs XML, but I’m not sure there’s anything to them beyond personal preference.

SQLite

The SQLite team wrote an interesting (indirect) argument on this topic: OpenOffice uses a ZIP-based format for storing documents, and the SQLite team argued that there would be benefits (particularly around performance and reliability) for OpenOffice to switch to SQLite.

The key issues that I see are:

  • SQLite is included in PHP, but not enabled by default on Windows.
  • While the SQLite team have a strong commitment to providing long-term support, SQLite is not a standard, and the only implementation is the one provided by the SQLite team.
  • This option is not backwards compatible at all.

FlatBuffers

FlatBuffers is an interesting comparison, since it’s a data format focussed entirely on speed. The down side of this focus is that it requires a defined schema to read the data. Much like SQLite, the only standard for FlatBuffers is the implementation. Unlike SQLite, FlatBuffers has made no commitments to providing long-term support.

  • WXR (XML-based): works in PHP ✅, performant ⚠, supports binary files ⚠, standards based ✅, backwards compatible ⚠, self descriptive ✅, readable ✅
  • WXR (ZIP-based): works in PHP ✅, performant ✅, supports binary files ✅, standards based ✅, backwards compatible ⚠, self descriptive ✅, readable ⚠ / ❌
  • JSON: works in PHP ⚠, performant ⚠, supports binary files ⚠, standards based ✅, backwards compatible ❌, self descriptive ✅, readable ✅
  • SQLite: works in PHP ⚠, performant ✅, supports binary files ✅, standards based ⚠ / ❌, backwards compatible ❌, self descriptive ✅, readable ❌
  • FlatBuffers: works in PHP ⚠, performant ✅, supports binary files ✅, standards based ❌, backwards compatible ❌, self descriptive ❌, readable ❌

As with any decision, this is a matter of trade-offs. I’m certainly interested in hearing additional perspectives on these options, or thoughts on options that I haven’t considered.

Regardless of which particular format we choose for storing WordPress exports, every format should have (or in the case of FlatBuffers, requires) a schema. We can talk about schemata without going into implementation details, so I’ll be writing about that in the next post.

This post is part of a series, talking about the WordPress Importers, their history, where they are now, and where they could go in the future.

Gary PendergastWordPress Importers: Stating the Problem

It’s time to focus on the WordPress Importers.

I’m not talking about tidying them up, improving performance, or fixing some bugs, though these are certainly things that should happen. Instead, we need to consider their purpose, how they fit as a driver of WordPress’ commitment to Open Source, and how they can be a key element in helping to keep the Internet Open and Free.

The History

The WordPress Importers are arguably the key driver of WordPress’ early success. Before the importer plugins existed (before WordPress even supported plugins!) there were a handful of import-*.php scripts in the wp-admin directory that could be used to import blogs from other blogging platforms. When other platforms fell out of favour, WordPress already had an importer ready for people to move their site over. One of the most notable instances was in 2004, when Movable Type changed their license and prices, suddenly requiring personal blog authors to pay for something that had previously been free. WordPress was fortunate enough to be in the right place at the right time: many of WordPress’ earliest users came from Movable Type.

As time went on, WordPress became well known in its own right. Growth relied less on people wanting to switch from another provider, and more on people choosing to start their site with WordPress. For practical reasons, the importers were moved out of WordPress Core, and into their own plugins. Since then, they’ve largely been in maintenance mode: bugs are fixed when they come up, but since export formats rarely change, they’ve just continued to work for all these years.

An unfortunate side effect of this, however, is that new importers are rarely written. While a new breed of services have sprung up over the years, the WordPress importers haven’t kept up.

The New Services

There are many new CMS services that have cropped up in recent years, and we don’t have importers for any of them. WordPress.com has a few extra ones written, but they’ve been built on the WordPress.com infrastructure out of necessity.

You see, we’ve always assumed that other CMSes will provide some sort of export file that we can use to import into WordPress. That isn’t always the case, however. Some services (notably, Wix and GoDaddy Website Builder) deliberately don’t allow you to export your own content. Other services provide incomplete or fragmented exports, needlessly forcing stress upon site owners who want to use their own content outside of that service.

To work around this, WordPress.com has implemented importers that effectively scrape the site: while this has worked to some degree, it does require regular maintenance, and the importer has to do a lot of guessing about how the content should be transformed. This is clearly not a solution that would be maintainable as a plugin.

Problem Number 4

Some services work against their customers, and actively prevent site owners from controlling their own content.

This strikes at the heart of the WordPress Bill of Rights. WordPress is built with fundamental freedoms in mind: all of those freedoms point to owning your content, and being able to make use of it in any form you like. When a CMS actively works against providing such freedom to their community, I would argue that we have an obligation to help that community out.

A Variety of Content

It’s worth discussing how, when starting a modern CMS service, the bar for success is very high. You can’t get away with just providing a basic CMS: you need to provide all the options. Blogs, eCommerce, mailing lists, forums, themes, polls, statistics, contact forms, integrations, embeds, the list goes on. The closest comparison to modern CMS services is… the entire WordPress ecosystem: built on WordPress core, but with the myriad of plugins and themes available, along with the variety of services offered by a huge array of companies.

So, when we talk about the importers, we need to consider how they’ll be used.

Problem Number 3

To import from a modern CMS service into WordPress, your importer needs to map from service features to WordPress plugins.

Getting Our Own House In Order

Some of these problems don’t just apply to new services, however.

Out of the box, WordPress exports to WXR (WordPress eXtended RSS) files: an XML file that contains the content of the site. Back when WXR was first created, this was all you really needed, but much like the rest of the WordPress importers, it hasn’t kept up with the times. A modern WordPress site isn’t just the sum of its content: a WordPress site has plugins and themes. It has various options configured, it has huge quantities of media, it has masses of text content, far more than the first WordPress sites ever had.

Problem Number 2

WXR doesn’t contain a full export of a WordPress site.

In my view, WXR is a solid format for handling exports. An XML-based system is quite capable of containing all forms of content, so it’s reasonable that we could expand the WXR format to contain the entire site.

Built for the Future

If there’s one thing we can learn from the history of the WordPress importers, it’s that maintenance will potentially be sporadic. Importers are unlikely to receive the same attention that the broader WordPress Core project does; owners may come and go. An importer will get attention if it breaks, of course, but it otherwise may go months or years without changing.

Problem Number 1

We can’t depend on regular importer maintenance in the future.

It’s quite possible to build code that will be running in 10+ years: we see examples all across the WordPress ecosystem. Doing it in a reliable fashion needs to be a deliberate choice, however.

What’s Next?

Having worked our way down from the larger philosophical reasons for the importers to some of the more technically-oriented implementation problems, I’d like to work our way back out again, focusing on each problem individually. In the following posts, I’ll start laying out how I think we can bring our importers up to speed, prepare them for the future, and make them available for everyone.

This post is part of a series, talking about the WordPress Importers, their history, where they are now, and where they could go in the future.

,

Dave HallPrivacy Policy

Skwashd Services Pty is committed to providing quality services to you and this policy outlines our ongoing obligations to you in respect of how we manage your Personal Information. We have adopted the Australian Privacy Principles (APPs) contained in the Privacy Act 1988 (Cth) (the Privacy Act). The APPs govern the way in which we collect, use, disclose, store, secure and dispose of your Personal Information. A copy of the Australian Privacy Principles may be obtained from the website of The Office of the Australian Information Commissioner at www.

,

Craige McWhirterSober Living for the Revolution

by Gabriel Kuhn

Sober Living for the Revolution: Hardcore Punk, Straight Edge, and Radical Politics

This is not a new book, having been published in 2010 but it's a fairly recent discovery for me.

I was never part of the straight edge scene here in Australia but was certainly aware of some of the more prominent bands and music in the punk scene in general. I've always had an ear for music with a political edge.

When it came to the straight edge scene I knew sweet FA. So that aspect of this book was pure curiosity. What attracted me to this work was the subject of radical sobriety and its lived experience amongst politically active people.

In life, if you decide to forgo something that everybody else does, it gives you a perspective on society that you wouldn't have if you were just engaging. It teaches you a lot about the world.

-- Ian MacKaye

This was one of the first parts of the book to really pop out at me. This rang true for my lived experience in other parts of my life where I'd forgone things that everyone else does. There were costs in not engaging but Ian is otherwise correct.

While I am entirely clear-eyed about the problems of inebriation amongst Australian activists and in wider society as a whole, the titular concept of sober living for the revolution had not previously resonated with me.

But then I realised that if you do not speak that language, you recognise that they are not talking to you... In short, if you don't speak the language of violence, you are released from violence. This was a very profound discovery for me.

-- Ian MacKaye

While my quotes are pretty heavily centered on one individual, there are about 20 contributors from Europe, the Middle East and both North and South America, providing a reasonably diverse perspective on the music but, more importantly, on the inspiration and positive impacts of radical sobriety on their communities.

As someone who was reading primarily for the sober living insights, I found the book's focus on the straight edge scene quite heavy to wade through, but the insights gained were worth the musical history lessons.

The only strategy for sharing good ideas that succeeds unfailingly... is the power of example — if you put “ecstatic sobriety” into action in your life, and it works, those who sincerely want similar things will join in.

-- Crimethinc

Overall this book pulled together a number of threads I'd been pulling on myself over my adult life and brought them into one comical phrase: lucid bacchanalism.

I was also particularly embarrassed to have not previously identified alcohol consumption as not merely a recreation but yet another insidious form of consumerism.

Well worth a read.

Russell CokerPSI and Cgroup2

In the comments on my post about Load Average Monitoring [1] an anonymous person recommended that I investigate PSI. As an aside, why do I get so many great comments anonymously? Don’t people want to get credit for having good ideas and learning about new technology before others?

PSI is the Pressure Stall Information subsystem for Linux that is included in kernels 4.20 and above. If you want to use it in Debian then you need a kernel from Testing or Unstable (Buster has kernel 4.19). The place to start reading about PSI is the main Facebook page about it, as it was originally developed at Facebook [2].

I am a little confused by the actual numbers I get out of PSI. While for the load average I can often see where they come from (EG have 2 processes each taking 100% of a core and the load average will be about 2), it’s difficult to work out where the PSI numbers come from. For my own use I decided to treat them as unscaled numbers that just indicate problems (a higher number is worse) and not worry too much about what the number really means.
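For reference, the system-wide PSI files live under /proc/pressure/ (cpu, io and memory), and each cgroup2 directory has matching cpu.pressure, io.pressure and memory.pressure files in the same format. Below is a minimal Python sketch (my own illustration, not how etbemon reads them) that parses one of these files if you want to play with the raw numbers:

def read_psi(path):
    """Parse a PSI file into {'some': {...}, 'full': {...}} dictionaries."""
    result = {}
    with open(path) as f:
        for line in f:
            kind, *fields = line.split()  # "some" or "full", then key=value pairs
            values = dict(field.split("=") for field in fields)
            result[kind] = {k: float(v) for k, v in values.items() if k != "total"}
            result[kind]["total"] = int(values["total"])  # total stall time in microseconds
    return result

if __name__ == "__main__":
    cpu = read_psi("/proc/pressure/cpu")
    print("CPU some avg10:", cpu["some"]["avg10"])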

With the cgroup2 interface which is supported by the version of systemd in Testing (and which has been included in Debian backports for Buster) you get PSI files for each cgroup. I’ve just uploaded version 1.3.5-2 of etbemon (package mon) to Debian/Unstable which displays the cgroups with PSI numbers greater than 0.5% when the load average test fails.

System CPU Pressure: avg10=0.87 avg60=0.99 avg300=1.00 total=20556310510
/system.slice avg10=0.86 avg60=0.92 avg300=0.97 total=18238772699
/system.slice/system-tor.slice avg10=0.85 avg60=0.69 avg300=0.60 total=11996599996
/system.slice/system-tor.slice/tor@default.service avg10=0.83 avg60=0.69 avg300=0.59 total=5358485146

System IO Pressure: avg10=18.30 avg60=35.85 avg300=42.85 total=310383148314
 full avg10=13.95 avg60=27.72 avg300=33.60 total=216001337513
/system.slice avg10=2.78 avg60=3.86 avg300=5.74 total=51574347007
/system.slice full avg10=1.87 avg60=2.87 avg300=4.36 total=35513103577
/system.slice/mariadb.service avg10=1.33 avg60=3.07 avg300=3.68 total=2559016514
/system.slice/mariadb.service full avg10=1.29 avg60=3.01 avg300=3.61 total=2508485595
/system.slice/matrix-synapse.service avg10=2.74 avg60=3.92 avg300=4.95 total=20466738903
/system.slice/matrix-synapse.service full avg10=2.74 avg60=3.92 avg300=4.95 total=20435187166

Above is an extract from the output of the loadaverage check. It shows that tor is a major user of CPU time (the VM runs a Tor relay node and has close to 100% of one core devoted to that task). It also shows that Mariadb and Matrix are the main users of disk IO. When I installed Matrix the Debian package told me that using SQLite would give lower performance than MySQL, but that didn’t seem like a big deal as the server only has a few users. Maybe I should move Matrix to the Mariadb instance to improve overall system performance.

So far I have not written any code to display the memory PSI files. I don’t have a lack of RAM on systems I run at the moment and don’t have a good test case for this. I welcome patches from people who have the ability to test this and get some benefit from it.

We are probably about 6 months away from a new release of Debian and this is probably the last thing I need to do to make etbemon ready for that.

Russell CokerRISC-V and Qemu

RISC-V is the latest RISC architecture that’s become popular. It is the 5th RISC architecture from the University of California Berkeley. It seems to be a competitor to ARM due to not having license fees or restrictions on alterations to the architecture (something you have to pay extra for when using ARM). RISC-V seems the most popular architecture to implement in FPGA.

When I first tried to run RISC-V under QEMU it didn’t work, which was probably due to running Debian/Unstable on my QEMU/KVM system and there being QEMU bugs in Unstable at the time. I have just tried it again and got it working.

The Debian Wiki page about RISC-V is pretty good [1]. The instructions there got it going for me. One thing I wasted some time on before reading that page was trying to get a netinst CD image, which is what I usually do for setting up a VM. Apparently there isn’t RISC-V hardware that boots from a CD/DVD so there isn’t a Debian netinst CD image. But debootstrap can install directly from the Debian web server (something I’ve never wanted to do in the past) and that gave me a successful installation.

Here are the commands I used to setup the base image:

apt-get install debootstrap qemu-user-static binfmt-support debian-ports-archive-keyring

debootstrap --arch=riscv64 --keyring /usr/share/keyrings/debian-ports-archive-keyring.gpg --include=debian-ports-archive-keyring unstable /mnt/tmp http://deb.debian.org/debian-ports

I first tried running RISC-V Qemu on Buster, but even ls didn’t work properly and the installation failed.

chroot /mnt/tmp bin/bash
# ls -ld .
/usr/bin/ls: cannot access '.': Function not implemented

When I ran it on Unstable ls works but strace doesn’t work in a chroot, this gave enough functionality to complete the installation.

chroot /mnt/tmp bin/bash
# strace ls -l
/usr/bin/strace: test_ptrace_get_syscall_info: PTRACE_TRACEME: Function not implemented
/usr/bin/strace: ptrace(PTRACE_TRACEME, ...): Function not implemented
/usr/bin/strace: PTRACE_SETOPTIONS: Function not implemented
/usr/bin/strace: detach: waitpid(1602629): No child processes
/usr/bin/strace: Process 1602629 detached

When running the VM the operation was noticeably slower than the emulation of PPC64 and S/390x, both of which ran at an apparently normal speed. When running on a server with an equivalent speed CPU, an ssh login was obviously slower due to the CPU time taken for encryption: an ssh connection from a system on the same LAN took 6 seconds to connect. I presume that because RISC-V is a newer architecture there hasn’t been as much effort made on optimising the Qemu emulation, and that a future version of Qemu will be faster. But I don’t think that Debian/Bullseye will give good Qemu performance for RISC-V; probably more changes are needed than can happen before the freeze. Maybe a version of Qemu with better RISC-V performance can be uploaded to backports some time after Bullseye is released.

Here’s the Qemu command I use to run RISC-V emulation:

qemu-system-riscv64 -machine virt \
  -device virtio-blk-device,drive=hd0 -drive file=/vmstore/riscv,format=raw,id=hd0 \
  -device virtio-blk-device,drive=hd1 -drive file=/vmswap/riscv,format=raw,id=hd1 \
  -m 1024 \
  -kernel /boot/riscv/vmlinux-5.10.0-1-riscv64 \
  -initrd /boot/riscv/initrd.img-5.10.0-1-riscv64 \
  -nographic -append "net.ifnames=0 noresume security=selinux root=/dev/vda ro" \
  -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-device,rng=rng0 \
  -device virtio-net-device,netdev=net0,mac=02:02:00:00:01:03 \
  -netdev tap,id=net0,helper=/usr/lib/qemu/qemu-bridge-helper

Currently the program /usr/sbin/sefcontext_compile from the selinux-utils package needs execmem access on RISC-V while it doesn’t on any other architecture I have tested. I don’t know why and support for debugging such things seems to be in early stages of development, for example the execstack program doesn’t work on RISC-V now.

RISC-V emulation in Unstable seems adequate for people who are serious about RISC-V development. But if you want to just try a different architecture then PPC64 and S/390 will work better.

,

Linux AustraliaCouncil Meeting Tuesday 12th January 2021 – Minutes

1. Meeting overview and key information

Present

Sae Ra Germaine

Jonathan Woithe

Julien Goodwin

Russell Stuart

Lisa Sands

Joel Addison

 

Apologies 

Benno Rice

 

Meeting opened at 1931 AEDT by Sae Ra and quorum was achieved.

Minutes taken by Julien.

2. Log of correspondence

  • From Anchor to council@ on 16 Dec 2020: opensource.org.au is due for renewal by 15 Feb 2021.
  • From Big Orange Heart via web form: request for sponsorship of WordFest (https://wordfest.live) on 22 Jan 2021.
    [Response has been sent indicating Council in caretaker mode, suggesting a longer lead time in future and outlining LA’s funding priorities.]
  • From Anchor to council@ on 27 Dec 2020: lca2018.org requires renewal by 26 Feb 2021.
    [As per meeting on 15 Dec 2020, this will be allowed to expire if the few remaining services dependent on the domain have been moved by 26 Feb 2021.]
  • From <a member> via website form on 27 Dec 2020: will LA write a submission into the Productivity Commission “Right to Repair” enquiry? A response was sent:
    • The tight timing (closing date in early Feb) and existing commitments of the Council members will almost certainly prevent this happening unless a member steps up to do it.
    • If a member wished LA to sign a submission they prepared and it was consistent with LA’s values, Council would consider signing it.

<The member> is writing a submission.  He has subsequently indicated he won’t request LA sign it as it contains a lot of personal perspective and he doesn’t feel he can speak for others. He may share it on linux-aus for comments.

  • From Binh Hguyen on Grants list, asking about out-of-cycle funding for a project idea (a public information aggregator). Response sent indicating Council is in caretaker mode and suggesting that they wait for the 2021 Grant program if it eventuates.

 

3. Items for discussion

  • AGM –  Are we ready, who is doing what etc. 
    • Roles required: 
      • Chat Monitor
      • Votes/Poll Monitor
      • Hands Up – Reaction Monitor
      • Timer
  • Bringing together the nomination sheets, these need to be collated, distributed and a seconder needs to be found for each of the nominations.
  • Annual Report – what is missing what do we need to send out etc.
    • Hoping for reports from web team, pycon.
    • Once those are received the report can be sent out.
  • Rusty Wrench
    • Decision is to award to <see announcement>, Jon or one of the other formers will award
    • AI: Sae Ra to liaise with Jon
  • JW: Russell Coker has set up a replacement for planet.linux.org.au at https://planet.luv.asn.au/. It has been suggested that planet.linux.org.au either point to this URL or contain a link to it.
    • AI: Jonathan & Julien to action link on current static page
  • JW: <A member> suggests via council@ on 5 Jan 2021 that the “Should a digest be dispatched daily when the size threshold isn’t reached” setting for la-announce be set to “Yes”. Due to the low traffic volume on this list, users with digest mode set may only receive messages after a delay of many months.
    • All agree, however it turns out the setting has already been enabled.

4. Items for noting

  • None

5. Other business

  • None

6. In camera

  • No items were discussed in camera

Meeting closed at 2024

The post Council Meeting Tuesday 12th January 2021 – Minutes appeared first on Linux Australia.

Pia AndrewsReflections on public sector transformation and COVID

Public sectors around the world are facing unprecedented challenges as the speed, scale and complexity of modern life grows exponentially. The 21st century is a large, complex, globalised and digital age unlike anything in the history of humans, but our systems of governance were largely forged in the industrial age. The 20th century alone saw enough change to merit a rethink: global population rose from 1.6 billion to 6 billion, two world wars spurred the creation of global economic and power structures, the number of nations rose from 77 to almost 200, and of course we entered the age of electronics and the internet, changing forever the experience, connectivity, access to knowledge, and increased individual empowerment of people everywhere. Between Climate Change, COVID-19, and globalism, nations worldwide are also now preparing for the likelihood of rolling emergencies, whether health, environmental, economic or social.

“Traditional” approaches to policy, service delivery and regulation are too slow, increasingly ineffective and result in increasingly hard to predict outcomes, making most public sectors and governments increasingly unable to meet the changing needs of the communities we serve.

Decades of austerity, hollowing out expertise, fragmentation of interdependent functions that are forced to compete, outsourcing and the inevitable ensuing existential crises have all left public sectors less prepared than ever, at the time when people most need us. Trust is declining and yet public sectors often feel unable to be authoritative sources of facts or information, independent of political or ideological influence, which exacerbates the trust and confidence deficit. Public sectors have become too reactive, too “business” focused, constantly pivoting all efforts on the latest emergency, cost efficiency, media release or whim of the Minister, whilst not investing in baseline systems, transformation, programs or services that are needed to be proactive and resilient. A values-based public sector that is engaged with, responsive to and serving the needs of (1) the Government, (2) the Parliament AND (3) the people – a difficult balancing act to be sure! – is critical, both to maintaining the trust of all three masters, and to being genuinely effective over time :)

Whether it is regulation, services or financial management, public sectors everywhere also need to embrace change as the new norm, which means our systems, processes and structures need to be engaged in continuously measuring, monitoring and responding to change, throughout the entire policy-delivery lifecycle. This means policy and delivery folk should be hand in hand throughout the entire process, so the baton passing between functionally segmented teams can end.

Faux transformation

Sadly today, most “transformation programs” appear to fall into one of three types:

  • Iteration or automation – iterative improvements, automation or new tech just thrown at existing processes and services, which doesn’t address the actual needs, systemic problems, or the gaping policy-delivery continuum chasm that has widened significantly in recent decades; or
  • Efficiency restructures - well marketed austerity measures to reduce the cost of government without actually improving the performance, policy outcomes or impact of government; or
  • Experimentation at the periphery - real transformation skills or units that are kept at the fringe and unable to drive or affect systemic change across any given public sector.

Most “transformation programs” I see are simply not particularly transformative, particularly when you scratch the surface to find how they would change things in future. If your answer is “we’ll have a new system” or “an x% improvement”, then it probably isn’t transformation; it is probably an iteration. Transformation should result in exponential solutions to exponential problems and a test-driven and high-confidence policy-delivery continuum that takes days not months for implementation, with the effects of new policies clearly seen through consistently measured, monitored and continuously improved delivery. You should have a clear and clearly understood future state in mind to transform towards; otherwise it is certainly iteration on the status quo.

There are good exceptions to this normative pattern. Estonia, Taiwan, South Korea, Canada and several nations across South East Asia have and are investing in genuine and systemic transformation programs, often focused on improving the citizen experience as well as the quality of life of their citizens and communities. My favourite quote from 2020 was from Dr Sania Nishtar (Special Assistant to the Prime Minister of Pakistan on Poverty Alleviation and Social Protection) when she said ‘it is neither feasible nor desirable to return to the pre-COVID status’. It was part of a major UNDP summit on NextGenGov, where all attendees reflected the same sentiment that COVID exposed significant gaps in our public sectors, and we all need significant reform to be effective and responsive to rolling emergencies moving forward.

So what does good transformation look like?

I would categorise true transformation efforts in three types, with all three needed:

  1. Policy and service transformation means addressing and reimagining the policy-delivery continuum in the 21st century, and bringing policy and implementation people together in the same process and indeed, the same (virtual) room. This would mean new policies are better informed, able to be tested from inception through to implementation, are able to be immediately or at least swiftly implemented upon enactment in Parliament and are then continuously measured, monitored and iterated in accordance with the intended policy outcome. The exact same infrastructure used for delivery should be used for policy, and vice versa, to ensure there is no gap between the two, and to ensure policy outcomes are best realised whilst also responding to ongoing change. After all, when policy outcomes are not realised, regardless of whose fault it was, it is everyone’s failure. This kind of transformation is possible within any one department or agency, but ideally needs leadership across all of government to ensure consistency of policy impact and benefits realisation.
  2. Organizational transformation would mean getting back to basics and having a clear vision of the purpose and intended impact of the department as a whole, with clear overarching measurement of those goals, and clear line of sight for how all programs contribute to those goals, and with all staff clear in how their work supports the goals. This type of transformation requires structural cultural transformation that builds on the shared values and goals of the department, but gains a consistency of behaviours that are constructive and empathetic. This kind of transformation is entirely possible within the domain of any one department or agency, if the leadership support and participate in it.
  3. Systemic transformation means the addressing and reimagining of the public sector as a whole, including its role in society, the structures, incentive systems, assurance processes, budget management, 21st century levers (like open government), staff support and relationship to other sectors. It also means having a clear vision for what it means to be a proud, empowered and skilled public servant today, which necessarily includes system and design thinking, participatory governance skills and digital literacy (not just skills). This can’t be done in any one department and requires all of public sector investment, coordination and cross government mandate. This level of transformation has started to happen in some countries but it is early days and needs prioritization if public sectors are to truly and systemically transform. Such transformation efforts often focus on structure, but need to include scope for transformation of policy, services, workforce, funding and more across government.

As we enter the age of Artificial Intelligence, public sectors should also be planning what an augmented public sector looks like, one that keeps values, trust and accountability at the heart of what we do, whilst using machines to support better responsiveness, modelling, service delivery and to maintain diligent and proactive protection of the people and communities we serve. Most AI projects seem to be about iterative efforts, automation or cost savings, which misses the opportunity to design a modern public service that gets the best of humans and machines working together for the best public outcomes.

COVID-19

COVID has been a dramatic reminder of the ineffectiveness of government systems to respond to changing needs in at least three distinct ways:

  • heavy use of emergency powers have been relied upon to get anything of substance done, demonstrating key systemic barriers, but rather than changing the problematic business as usual processes, many are reverting to usual practice as soon as practical;
  • superhuman efforts have barely scratched the surface of the problems. The usual resourcing response to pressure is to just increase resources rather than to change how we respond to the problem, but exponential resources are not available; and
  • inequities have been compounded by governments pressing on the same old levers with the same old processes without being able to measure, monitor and iterative or pivot in real time in response to the impacts of change.

Sadly, the pressure for ‘good news stories’ often drives a self-congratulatory tone and an increase to an already siloed mindset, as public servants struggle to respond to increased and often diametrically opposed expectations and needs from the public and political domains. Many have also mistaken teleworking for transformation, potentially missing a critical opportunity to transform towards a 21st century public sector.

Last word

I’m planning to do a bit more writing about this, so please leave your comments and thoughts below. I’d be keen to hear how you differentiate transformation from iterative efforts, and how to ensure we are doing both. There is, of course, value to be found in some iterative efforts. It is when 100% of our time and effort is focused on iteration that we see public sectors simply revert to playing whack-a-mole against an exponentially growing problem space, hence the need to have SOME proportion of our resource on genuine transformation efforts. Proportional planning is critical so we address both the important and the urgent, not one without the other.

,

Jan SchmidtRift CV1 – Adventures in Kalman filtering Part 2

In the last post I had started implementing an Unscented Kalman Filter for position and orientation tracking in OpenHMD. Over the Christmas break, I continued that work.

A Quick Recap

When reading below, keep in mind that the goal of the filtering code I’m writing is to combine 2 sources of information for tracking the headset and controllers.

The first piece of information is acceleration and rotation data from the IMU on each device, and the second is observations of the device position and orientation from 1 or more camera sensors.

The IMU motion data drifts quickly (at least for position tracking) and can’t tell which way the device is facing (yaw), although it can detect gravity and get pitch/roll.

The camera observations can tell exactly where each device is, but arrive at a much lower rate (52Hz vs 500/1000Hz) and can take a long time (hundreds of milliseconds) to analyse when acquiring or re-acquiring a lock on the tracked device(s).

The goal is to acquire tracking lock, then use the motion data to predict the motion closely enough that we always hit the ‘fast path’ of vision analysis. The key here is closely enough – the more closely the filter can track and predict the motion of devices between camera frames, the better.

Integration in OpenHMD

When I wrote the last post, I had the filter running as a standalone application, processing motion trace data collected by instrumenting a running OpenHMD app and moving my headset and controllers around. That’s a really good way to work, because it lets me run modifications on the same data set and see what changed.

However, the motion traces were captured using the current fusion/prediction code, which frequently loses tracking lock when the devices move – leading to big gaps in the camera observations and more interpolation for the filter.

By integrating the Kalman filter into OpenHMD, the predictions are improved, leading to generally much better results. Here’s one trace of me moving the headset around reasonably vigorously with no tracking loss at all.

Headset motion capture trace

If it worked this well all the time, I’d be ecstatic! The predicted position matched the observed position closely enough for every frame for the computer vision to match poses and track perfectly. Unfortunately, this doesn’t happen every time yet, and definitely not with the controllers – although I think the latter largely comes down to the current computer vision having more trouble matching controller poses. They have fewer LEDs to match against compared to the headset, and the LEDs are generally more side-on to a front-facing camera.

Taking a closer look at a portion of that trace, the drift between camera frames when the position is interpolated using the IMU readings is clear.

Headset motion capture – zoomed in view

This is really good. Most of the time, the drift between frames is within 1-2mm. The computer vision can only match the pose of the devices to within a pixel or two – so the observed jitter can also come from the pose extraction, not the filtering.

The worst tracking is again on the Z axis – distance from the camera in this case. Again, that makes sense – with a single camera matching LED blobs, distance is the most uncertain part of the extracted pose.

Losing Track

The trace above is good – the computer vision spots the headset and then the filtering + computer vision track it at all times. That isn’t always the case – the prediction goes wrong, or the computer vision fails to match (it’s definitely still far from perfect). When that happens, it needs to do a full pose search to reacquire the device, and there’s a big gap until the next pose report is available.

That looks more like this

Headset motion capture trace with tracking errors

This trace has 2 kinds of errors – gaps in the observed position timeline during full pose searches and erroneous position reports where the computer vision matched things incorrectly.

Fixing the errors in position reports will require improving the computer vision algorithm and would fix most of the plot above. Outlier rejection is one approach to investigate on that front.

Latency Compensation

There is inherent delay involved in processing of the camera observations. Every 19.2ms, the headset emits a radio signal that triggers each camera to capture a frame. At the same time, the headset and controller IR LEDs light up brightly to create the light constellation being tracked. After the frame is captured, it is delivered over USB over the next 18ms or so and then submitted for vision analysis. In the fast case where we’re already tracking the device, the computer vision is complete in a millisecond or so. In the slow case, it’s much longer.

Overall, that means that there’s at least a 20ms offset between when the devices are observed and when the position information is available for use. In the plot above, this delay is ignored and position reports are fed into the filter when they are available. In the worst case, that means the filter is being told where the headset was hundreds of milliseconds earlier.

To compensate for that delay, I implemented a mechanism in the filter where it keeps extra position and orientation entries in the state that can be used to retroactively apply the position observations.

The way that works is to make a prediction of the position and orientation of the device at the moment the camera frame is captured and copy that prediction into the extra state variable. After that, it continues integrating IMU data as it becomes available while keeping the auxiliary state constant.

When the camera frame analysis is complete, that delayed measurement is matched against the stored position and orientation prediction in the state and the error is used to correct the overall filter. The cool thing is that in the intervening time, the filter covariance matrix has been building up the right correction terms to adjust the current position and orientation.

Here’s a good example of the difference:

Before: Position filtering with no latency compensation
After: Latency-compensated position reports

Notice how most of the disconnected segments have now slotted back into position in the timeline. The ones that haven’t can either be attributed to incorrect pose extraction in the computer vision, or to not having enough auxiliary state slots for all the concurrent frames.

At any given moment, there can be a camera frame being analysed, one arriving over USB, and one awaiting “long term” analysis. The filter needs to track an auxiliary state variable for each frame that we expect to get pose information from later, so I implemented a slot allocation system with multiple slots.
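To make the bookkeeping concrete, here is a rough Python sketch of the slot idea. This is my own illustration of the concept, not the OpenHMD code (which is C, and where the slots live inside the filter state itself): a pose prediction is parked in a slot when the frame is captured, and the residual against the delayed measurement is what eventually drives the filter correction.

import numpy as np

class DelayedObservationSlots:
    """Toy bookkeeping for delayed camera observations (illustration only)."""

    def __init__(self, max_slots=3):
        self.max_slots = max_slots
        self.slots = {}  # frame_id -> pose predicted at capture time

    def on_frame_captured(self, frame_id, predicted_pose):
        # Park the filter's current pose prediction in a slot, if one is free.
        if len(self.slots) >= self.max_slots:
            return False  # no free slot: this frame can't be latency-compensated
        self.slots[frame_id] = np.array(predicted_pose, dtype=float)
        return True

    def on_vision_result(self, frame_id, measured_pose):
        # Residual between what we predicted back then and what the camera saw;
        # this is what feeds the (delayed) Kalman measurement update.
        predicted = self.slots.pop(frame_id, None)
        if predicted is None:
            return None
        return np.asarray(measured_pose, dtype=float) - predicted

In the real filter each occupied slot also has its own rows and columns in the covariance matrix, which is where the cost described below comes from.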

The downside is that each slot adds 6 variables (3 position and 3 orientation) to the covariance matrix on top of the 18 base variables. Because the covariance matrix is square, the size grows quadratically with new variables. 5 new slots means 30 new variables – leading to a 48 x 48 covariance matrix instead of 18 x 18. That is a 7-fold increase in the size of the matrix (48 x 48 = 2304 vs 18 x 18 = 324) and unfortunately about a 10x slow-down in the filter run-time.

At that point, even after some optimisation and vectorisation on the matrix operations, the filter can only run about 3x real-time, which is too slow. Using fewer slots is quicker, but allows for fewer outstanding frames. With 3 slots, the slow-down is only about 2x.

There are some other possible approaches to this problem:

  • Running the filtering delayed, only integrating IMU reports once the camera report is available. This has the disadvantage of not reporting the most up-to-date estimate of the user pose, which isn’t great for an interactive VR system.
  • Keeping around IMU reports and rewinding / replaying the filter for late camera observations. This limits the overall increase in filter CPU usage to double (since we at most replay every observation twice), but potentially with large bursts when hundreds of IMU readings need replaying.
  • It might be possible to only keep 2 “full” delayed measurement slots with both position and orientation, and to keep some position-only slots for others. The orientation of the headset tends to drift much more slowly than position does, so when there’s a big gap in the tracking it would be more important to be able to correct the position estimate. Orientation is likely to still be close to correct.
  • Further optimisation in the filter implementation. I was hoping to keep everything dependency-free, so the filter implementation uses my own naive 2D matrix code, which only implements the features needed for the filter. A more sophisticated matrix library might perform better – but it’s hard to say without doing some testing on that front.

Controllers

So far in this post, I’ve only talked about the headset tracking and not mentioned controllers. The controllers are considerably harder to track right now, but most of the blame for that is in the computer vision part. Each controller has fewer LEDs than the headset, fewer are visible at any given moment, and they often aren’t pointing at the camera front-on.

Oculus Camera view of headset and left controller.

This screenshot is a prime example. The controller is the cluster of lights at the top of the image, and the headset is lower left. The computer vision has gotten confused and thinks the controller is the ring of random blue crosses near the headset. It corrected itself a moment later, but those false readings make life very hard for the filtering.

Position tracking of left controller with lots of tracking loss.

Here’s a typical example of the controller tracking right now. There are some very promising portions of good tracking, but they are interspersed with bursts of tracking losses, and wild drifting from the computer vision giving wrong poses – leading to the filter predicting incorrect acceleration and hence cascaded tracking losses. Particularly (again) on the Z axis.

Timing Improvements

One of the problems I was looking at in my last post is variability in the arrival timing of the various USB streams (Headset reports, Controller reports, camera frames). I improved things in OpenHMD on that front, to use timestamps from the devices everywhere (removing USB timing jitter from the inter-sample time).

There are still potential problems in when IMU reports from controllers get updated in the filters vs the camera frames. That can be on the order of 2-4ms jitter. Time will tell how big a problem that will be – after the other bigger tracking problems are resolved.

Sponsorships

All the work that I’m doing implementing this positional tracking is a combination of my free time, hours contributed by my employer Centricular and contributions from people via Github Sponsorships. If you’d like to help me spend more hours on this and fewer on other paying work, I appreciate any contributions immensely!

Next Steps

The next things on my todo list are:

  • Integrate the delayed-observation processing into OpenHMD (at the moment it is only in my standalone simulator).
  • Improve the filter code structure – this is my first kalman filter and there are some implementation decisions I’d like to revisit.
  • Publish the UKF branch for other people to try.
  • Circle back to the computer vision and look at ways to improve the pose extraction and better reject outlying / erroneous poses, especially for the controllers.
  • Think more about how to best handle / schedule analysis of frames from multiple cameras. At the moment each camera operates as a separate entity, capturing frames and analysing them in threads without considering what is happening in other cameras. That means any camera that can’t see a particular device starts doing full pose searches – which might be unnecessary if another camera still has a good view of the device. Coordinating those analyses across cameras could reduce CPU consumption, and let the filter retain fewer delayed observation slots.

,

Tim RileyOpen source status update, December 2020

Happy new year! Before we get too far through January, here’s the recap of my December in OSS.

Advent of Code 2020 (in Go!)

This month started off a little differently to usual. After spending some time book-learning about Go, I decided to try the Advent of Code for the first time as a way to build some muscle memory for a new language. And gosh, it was a lot of fun! Turns out I like programming and problem-solving, go figure. After ~11 days straight, however, I decided to put the effort on hold. I could tell the pace wasn’t going to be sustainable for me (it was a lot of late nights), and I’d already begun to feel pretty comfortable with various aspects of Go, so that’s where I left it for now.

Rich dry-system component_dir configuration (and cleanups!)

Returning to my regular Ruby business, December was a good month for dry-system. After the work in November to prepare the way for Zeitwerk, I moved on to introducing a new component_dirs setting, which permits the addition of any number of component directories (i.e. where dry-system should look for your Ruby class files), each with their own specific configurations:

class MyApp::Container < Dry::System::Container
  configure do |config|
    config.root = __dir__

    config.component_dirs.add "lib" do |dir|
      dir.auto_register = true    # defaults to true
      dir.add_to_load_path = true # defaults to true
      dir.default_namespace = "my_app"
    end
  end
end

Along with this, I’m removing the following from Dry::System::Container:

  • The top-level default_namespace and auto_register settings
  • The .add_to_load_path! and .auto_register! methods

Together, this means there’ll be only a single place to configure the behaviour related to the loading of components from directories: the singular component_dirs setting.

This has been a rationalization I’ve been wanting to make for a long time, and happily, it’s proving to be a positive one: as I’ve been working through the changes, it’s allowed me to simplify some of the gnarlier parts of the gem.

What all of this provides is the right set of hooks for Hanami to specify the component directories for your app, as well as configure each one to work nicely with Zeitwerk. That’s the end goal, and I suspect we’ll arrive there in late January or February, but in the meantime, I’ve enjoyed the chance to tidy up the internals of this critical part of the Hanami 2.0 underpinnings.

You can follow my work in progress over in this PR.

Helpers for hanami-view 2.0

Towards the end of the month I had a call with Luca (the second in as many months, what a treat!), in which we discussed how we might bring about full support for view helpers in hanami-view 2.0.

Of course, these won’t be “helpers” in quite the same shape you’d expect from Rails or any of the Ruby static site generators, because if you’ve ever heard me talk about dry-view or hanami-view 2.0 (here’s a refresher), one of its main goals is to help move you from a gross, global soup of unrelated helpers towards view behaviour modelled as focused, testable, well-factored object oriented code.

In this case, we finished the discussion with a plan, and Luca turned it around within a matter of days, with a quickfire set of PRs!

First he introduced the concept of custom anonymous scopes for any view. A scope in dry-view/hanami-view parlance is the object that provides the total set of methods available to use within the template. For a while we’ve supported defining custom scope classes to add behavior for a view that doesn’t belong on any one of its particular exposures, but this requires a fair bit of boilerplate, especially if it’s just for a method or two:

class ArticleViewScope < Hanami::View::Scope
  def custom_method
    # Custom behavior here, can access all scope facilities, e.g. `locals` or `context`
  end
end

class ArticleView < Hanami::View
  config.scope = ArticleViewScope

  expose :article do |slug:|
    # logic to load article here
  end
end

So to make this easier, we now have this new class-level block:

class ArticleView < Hanami::View
  expose :article do |slug:|
    # logic to load article here
  end

  # New scope block!
  scope do
    def custom_method
      # Custom behavior here, can access all scope facilities, e.g. `locals` or `context`
    end
  end
end

So nice! Also nice? That it was a literal 3-line change to the hanami-view code. Also, also nice? You can still “upgrade” to a fully fledged class if the code ever requires it.

Along with this, Luca also began adapting the existing range of global helpers for use in hanami-view 2.0. I may dislike the idea of helpers in general, but truly stateless things like HTML builders I’m generally happy to see around, and with the improvements to template rendering we have over hanami-view 1.x, we’ll be able to make these a lot more expressive for Hanami view developers. This PR is just the first step, but I expect we’ll be able to make some quick strides once this is in place.

Thank you to my sponsors! 🙌

Thank you to my six GitHub sponsors for your continuing support! If you’re reading this and would like to chip in and help push forward the Ruby web application ecosystem for 2021, I’d really appreciate your support.

See you all next month!

,

Francois MarierProgramming a DMR radio with its CPS

Here are some notes I took around programming my AnyTone AT-D878UV radio to operate on DMR using the CPS software that comes with it.

Note that you can always tune in to a VFO channel by hand if you haven't had time to add it to your codeplug yet.

DMR terminology

First of all, the terminology of DMR is quite different from that of the regular analog FM world.

Here are the basic terms:

  • Frequency: same meaning as in the analog world
  • Repeater: same meaning as in the analog world
  • Timeslot: Each frequency is split into two timeslots (1 and 2) and what that means that there can be two simultaneous transmissions on each frequency.
  • Color code: This is the digital equivalent of a CTCSS tone (sometimes called privacy tone) in that using the incorrect code means that you will tie up one of the timeslots on the frequency, but nobody else will hear you. These are not actually named after colors, but are instead just numerical IDs from 0 to 15.

There are two different identification mechanisms (both are required):

  • Callsign: This is the same identifier issued to you by your country's amateur radio authority. Mine is VA7GPL.
  • Radio ID: This is a unique numerical ID tied to your callsign which you must register for ahead of time and program into your radio. Mine is 3027260.

The following is where this digital mode becomes most interesting:

  • Talkgroup: a "chat room" where everything you say will be heard by anybody listening to that talkgroup
  • Network: a group of repeaters connected together over the Internet (typically) and sharing a common list of talkgroups
  • Hotspot: a personal simplex device which allows you to connect to a network with your handheld and access all of the talkgroups available on that network

The most active network these days is Brandmeister, but there are several others.

  • Access: This can either be Always on which means that a talkgroup will be permanently broadcasting on a timeslot and frequency, or PTT which means a talkgroup will not be broadcast until it is first "woken up" by pressing the push-to-talk button and then will broadcast for a certain amount of time before going to sleep again.
  • Channel: As in the analog world, this is what you select on your radio when you want to talk to a group of people. In the digital world however, it is tied not only to a frequency (and timeslot) and tone (color code), but also to a specific talkgroup.

Ultimately what you want to do when you program your radio is to find the talkgroups you are interested in (from the list offered by your local repeater) and then assign them to specific channel numbers on your radio. More on that later.

Callsign and Radio IDs

Before we get to talkgroups, let's set your callsign and Radio ID:

Then you need to download the latest list of Radio IDs so that your radio can display people's names and callsigns instead of just their numerical IDs.

One approach is to only download the list of users who recently talked on talkgroups you are interested in. For example, I used to download the contacts for the following talkgroups: 91,93,95,913,937,3026,3027,302,30271,30272,530,5301,5302,5303,5304,3100,3153,31330 but these days, what I normally do is to just download the entire worldwide database (user.csv) since my radio still has enough storage (200k entries) for it.

In order for the user.csv file to work with the AnyTone CPS, it needs to have particular columns and use the DOS end-of-line characters (apt install dos2unix if you want to do it manually). I wrote a script to do all of the work for me.
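If you would rather script the conversion yourself, something like the following Python sketch does the job: it keeps only the columns the CPS wants and writes the file with DOS line endings. Note that the column names below are placeholders of my own choosing rather than the actual AnyTone headings, so check them against the CPS import dialog:

import csv

# Placeholder column layout -- adjust to match what your CPS import expects.
WANTED = ["Radio ID", "Callsign", "Name", "City", "State", "Country"]

with open("user.csv", newline="", encoding="utf-8") as src, \
        open("user-cps.csv", "w", newline="", encoding="utf-8") as dst:
    reader = csv.DictReader(src)
    # csv.writer's default line terminator is "\r\n", which gives the DOS
    # line endings the CPS wants, so no separate dos2unix step is needed.
    writer = csv.writer(dst)
    writer.writerow(WANTED)
    for row in reader:
        writer.writerow([row.get(col, "") for col in WANTED])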

If you use dmrconfig to program this radio instead, then the conversion is unnecessary. The user.csv file can be used directly, however it will be truncated due to an incorrect limit hard-coded in the software.

Talkgroups

Next, you need to pick the talkgroups you would like to allocate to specific channels on your radio.

Start by looking at the documentation for your local repeaters (e.g. VE7RAG and VE7NWR in the Vancouver area).

In addition to telling you the listen and transmit frequencies of the repeater (again, this works the same way as with analog FM), these will tell you which talkgroups are available and what timeslots and color codes they have been set to. It will also tell you the type of access for each of these talkgroups.

This is how I programmed a channel:

and a talkgroup on the VE7RAG repeater in my radio:

If you don't have a local repeater with DMR capability, or if you want to access talkgroups available on a different network, then you will need to get a DMR hotspot such as one that's compatible with the Pi-Star software.

This is an excerpt from the programming I created for the talkgroups I made available through my hotspot:

One of the unfortunate limitations of the CPS software for the AnyTone 878 is that talkgroup numbers are globally unique identifiers. This means that if TG1234 (hypothetical example) is Ragchew 3000 on DMR-MARC but Iceland-wide chat on Brandmeister, then you can't have two copies of it with different names. The solution I found for this was to give that talkgroup the name "TG1234" instead of "Ragchew3k" or "Iceland". I use a more memorable name for non-conflicting talkgroups, but for the problematic ones, I simply repeat the talkgroup number.

Simplex

Talkgroups are not required to operate on DMR. Just like analog FM, you can talk to another person point-to-point using a simplex channel.

The convention for all simplex channels is the following:

  • Talkgroup: 99
  • Color code: 1
  • Timeslot: 1
  • Admit criteria: Always
  • In Call Criteria: TX or Always

After talking to the British Columbia Amateur Radio Coordination Council, I found that the following frequency ranges are most suitable for DMR simplex:

  • 145.710-145.790 MHz (simplex digital transmissions)
  • 446.000-446.975 MHz (all simplex modes)

The VECTOR list identifies two frequencies in particular:

  • 446.075 MHz
  • 446.500 MHz

Learn more

If you'd like to learn more about DMR, I would suggest you start with this excellent guide (also mirrored here).

,

Tim Riley2020 in review

2020, hey? What a time to finally get back on my “year in review” horse. It was a calamitous year for many of us, but there’s a lot I’m thankful for from 2020.

Work

In January I started a new job with Culture Amp, as part of the Icelab closing and the whole team moving across. I couldn’t have found a better place to work: the people are inspiring, I’ve wound up with a great mentor/manager, and the projects are nourishing. There’s so much I can contribute to here, and I know I’m still only scratching the surface.

A new workplace with its own tech ecosystem meant I did a lot of learning for work this year. Among other things, I started working with event sourcing, distributed systems, AWS CDK, and even a little Go as the year came to an end.

I’m full-time remote with Culture Amp. Under ordinary circumstances (which went out the window before long), this would mean semi-regular visits to the Melbourne office, and I was lucky enough to do that a couple of times in January and February before travel became unworkable. Aside from that, I’ve enjoyed many hours over Zoom working with all my new colleagues.

And just to tie a bow on a big year of work-related things, Michael, Max, and I worked through to December putting the finishing touches on the very last Icelab project, Navigate Senate Committees.

I’m deeply grateful to be where I am now, to have smoothly closed one chapter of work while opening another that I’m excited to inhabit for many years to come.

OSS

This year in OSS has been all about Hanami 2.0 development. By the end of 2019, I’d already made the broad strokes required to make dry-system the new core of the framework. 2020 was all about smoothing around the edges. I worked pretty consistently at this throughout the year, focusing on a new technique for frictionless, zero-boilerplate integrated components, view/action integration, a revised approach for configuration, and lately, support for code loading with Zeitwerk.

Towards the beginning of the year, I decided to follow Piotr’s good example and write my own monthly OSS status updates. I published these monthly since, making for 9 in the year (and 10 once I do December’s, once this post is done!). I’m really glad I established this habit. It captures so much of the thinking I put into my work that would otherwise be lost with time, and in the case of the Hanami project, it’s a way for the community to follow along with progress. And I won’t lie, the thought of the upcoming post motivates me to squeeze just a little bit more into each month!

This year I shared my Hanami 2 application template as a way to try the framework while it’s still in development. We’re using it for three production services at work and it’s running well.

Helping to rewrite Hanami for 2.0 has been the biggest OSS project I’ve undertaken, and work on this was a slog at times, but I’m happy I managed to keep a steady pace. I also rounded out the year by being able to catch up with Luca and Piotr for a couple of face to face discussions, which was a delight after so many months of text-only collaboration.

On the conference side of things, given the travel restrictions, there was a lot less than normal, but I did have a great time at the one event I did attend, RubyConf AU back in February (which seems so long ago now). Sandwiched between the Australian summer bushfires and the onset of the coronavirus, the community here was amazingly lucky with the timing. Aside from this, the increase of virtual events meant I got to share a talk with Philly.rb and appear on GitHub’s open source Friday livestream series.

I also joined the GitHub sponsors program in May. I only have a small group of sponsors (I’d love more!), but receiving notice of each one was a true joy.

And in a first for me, I spent some time working on this year’s Advent of Code! I used it as an opportunity to develop some familiarity with Go. It was great fun! Code is here if you’re interested.

Home & family

The “stay at home” theme of 2020 was a blessing in many ways, because it meant more time with all the people I love. This year I got to join in on Clover learning to read, write, ride a bike, so many things! Iris is as gregarious as ever and definitely keen to begin her school journey this coming year.

As part of making the most of being home, I also started working out at home in March (thanks, PE with Joe!), which I managed to keep up at 5 days/week ever since! I haven’t felt this good in years.

And to close out, a few other notables:

  • Started visiting the local library more, I enjoyed reading paper books again
  • Some particularly good reads from the year: Stephen Baxter’s Northland trilogy, Cixin Liu’s The Supernova Era, Stephen Baxter’s The Medusa Chronicles, Roger Levy’s The Rig, Kim Stanley Robinson’s Red Moon, Mary Robinette Kowal’s The Relentless Moon, Kylie Maslen’s Show Me Where it Hurts, and Hugh Howey’s Sand
  • Ted Lasso on Apple TV+ was just great
  • Got a Nintendo Switch and played through Breath of the Wild, a truly spectacular experience
  • I think that’s all for now, see you next year!

Lev LafayetteImage Watermarks in Batch

A common need among those who engage in large scale image processing is to assign a watermark of some description to their images. Further, so I have been told, it is preferable to have multiple watermarks with slightly different kerning depending on whether the image is portrait or landscape. Thus there are two functions in this script: one to separate the mass of images in a directory into portrait and landscape, and a second to apply the appropriate watermark. The script is therefore structured as follows; witness the neatness and advantages of structured coding, even in shell scripts. I learned a lot from first-year Pascal programming.

#!/bin/bash
separate() { # Separate original files in portrait and landscape directories
# content here
}

apply() { # Apply correct watermark to each directory
# content here
}

main() {
    separate
    apply
}

main
exit

The first function simply compares the width to the height of the image to make its determination. It assumes that the images are correctly orientated in the first place. Directories are created for the two types of files, and the script iterates over each file in the directory, determines whether it is portrait or landscape using the identify utility from ImageMagick, and copies it (you want to keep the originals) into the appropriate directory. The content of the separate function therefore ends up like this:

separate() { # Separate original files in portrait and landscape directories
mkdir portraits; mkdir landscapes
for item in ./*.jpg
do
  orient=$(identify -format '%[fx:(h>w)]' "$item")
  if [ $orient -eq 1 ] ;
  then
      cp "$item" ./portraits
  else
      cp "$item" ./landscapes
  fi
done
}

The second function goes into each directory and applies the watermark by making use of ImageMagick's composite function, looping over each file in the directory. The files in the directories are overwritten (the $item is both input and output), but remember the originals have not been altered. The size of the watermark varies according to the size of the image, and each is placed in the "southeast" (bottom-right) corner of the file. The function assumes a watermark offset 20 pixels by 20 pixels from that corner and scaled to 0.1 of the height of the image. These can be changed as desired.

apply() { # Apply correct watermark to each directory
cd portraits
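# scale the watermark to 10% of the main image's height, then composite it 20px in from the bottom-right (southeast) corner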
for item in ./*.jpg; do convert "$item" ../watermarkp.png +distort affine "0,0 0,0 %[w],%[h] %[fx:t?v.w*(u.h/v.h*0.1):s.w],%[fx:t?v.h*(u.h/v.h*0.1):s.h]" -shave 1 -gravity southeast -geometry +20+20 -composite "$item" ; done
cd ../landscapes
for item in ./*.jpg; do convert "$item" ../watermarkl.png +distort affine "0,0 0,0 %[w],%[h] %[fx:t?v.w*(u.h/v.h*0.1):s.w],%[fx:t?v.h*(u.h/v.h*0.1):s.h]" -shave 1 -gravity southeast -geometry +20+20 -composite "$item" ; done
}

This script can also be combined with existing scripts for Batch Image Processing. In particular, jpegtran/exiftran (suggested by Michael Deegan) can be run first to normalise EXIF rotation flags, as sketched below.
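As a rough sketch (assuming exiftran is installed and the source JPEGs carry EXIF orientation tags), the rotation can be applied in place before running separate(), so the width/height comparison matches how the images actually display:

exiftran -a -i ./*.jpg    # -a: rotate according to the EXIF orientation tag, -i: modify the files in place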

For what it's worth this script took around an hour to put together (mostly research, about 5 minutes coding, and 10 minutes testing and debugging). I suspect it will save readers who use it a great deal of time.

Ben MartinNew home for the Astrolabe, pocket day calc, and coin of sentimental value

I turned a slice of a tree trunk into a matching pair of holders for sentimental objects over the break. This has a few coats of polyurethane and a deeper coating on the bark. Having some layers on the bark takes away the sharper edges. I need to measure the thickness of the poly on the front and inside the pockets, as it is certainly measurable. What was a nice fit without finish becomes a rather tight fit with the poly.

 

Behind the two instruments is the key chain which is tucked away into a deeper pocket. The pockets at the side of each object are to allow fingers to free the object for inspection or frustrating use in the case of the astrolabe.

I was going to go down the well trodden path of making a small coffee table top from the timber, but I like this idea as it frees these objects from their boxes, and the darker red timber really complements the objects embedded within it.

Tim SerongScope Creep

On December 22, I decided to brew an oatmeal stout (5kg Gladfield ale malt, 250g dark chocolate malt, 250g light chocolate malt, 250g dark crystal malt, 500g rolled oats, 150g rice hulls to stop the mash sticking, 25g Pride of Ringwood hops, Safale US-05 yeast). This all takes a good few hours to do the mash and the boil and everything, so while that was underway I thought it’d be a good opportunity to remove a crappy old cupboard from the laundry, so I could put our nice Miele upright freezer in there, where it’d be closer to the kitchen (the freezer is presently in a room at the other end of the house).

The cupboard was reasonably easy to rip out, but behind it was a mouldy and unexpectedly bright yellow wall with an ugly gap at the top where whoever installed it had removed the existing cornice.

Underneath the bottom half of the cupboard, I discovered not the cork tiles which cover the rest of the floor, but a layer of horrific faux-tile linoleum. Plus, more mould. No way was I going to put the freezer on top of that.

So, up came the floor covering, back to nice hardwood boards.

Of course, the sink had to come out too, to remove the flooring from under its cabinet, and that meant pulling the splashback tiles (they had ugly screw holes in them anyway from a shelf that had been bracketed up on top of them previously).

Removing the tiles meant replacing a couple of sections of wall.

Also, we still needed to be able to use the washing machine through all this, so I knocked up a temporary sink support.

New cornice went in.

The rest of the plastering was completed and a ceiling fan installed.

Waterproofing membrane was applied where new tiles will go around a new sink.

I removed the hideous old aluminium backed weather stripping from around the exterior door and plastered up the exposed groove.

We still need to paint everything, get the new sink installed, do the tiling work and install new taps.

As for the oatmeal stout, I bottled that on January 2. From a sample taken at the time, it should be excellent, but right now still needs to carbonate and mature.

Stewart SmithPhotos from Taiwan

A few years ago we went to Taiwan. I managed to capture some random bits of the city on film (and also some shots on my then phone, a Google Pixel). I find the different style of art on the streets around the world to be fascinating, and Taiwan had some good examples.

I’ve really enjoyed shooting Kodak E100VS film over the years, and some of my last rolls were shot in Taiwan. It’s a film that unfortunately is not made anymore, but at least we have a new Ektachrome to have fun with now.

Words for our time: “Where there is democracy, equality and freedom can exist; without democracy, equality and freedom are merely empty words”.

This is, of course, only a small number of the total photos I took there. I’d really recommend a trip to Taiwan, and I look forward to going back there some day.

,

Simon LyallAudiobooks – December 2020

The Perils of Perception: Why We’re Wrong About Nearly Everything by Bobby Duffy

Lots of examples of how people are wrong about things, usually crime rates or levels of immigration. Divided into topics, with some comments on why and how to fix it. 3/5

The Knowledge: How to Rebuild our World from Scratch
by Lewis Dartnell

A how-to on rebooting civilization following a worldwide disaster. The tone is addressed to a present-day person rather than someone from the future which makes it more readable. 4/5

The Story of Silver: How the White Metal Shaped America and the Modern World by William L. Silber

Almost solely devoted to America, it covers the major events around the metal, including its demonetization, government and private price manipulation, and speculation including the Hunt Brothers. 3/5

The First Four Years by Laura Ingalls Wilder

About half the length of the other books in the series and published posthumously. Laura and Almanzo try to make a success of farming for 4 years. Things don’t go well. The book is a bit more adult than some of the others. 3/5

Casino Royale by Ian Fleming

Interesting how close it is to the 2006 Movie. Also since it is set in ~1951, World War 2 looms large in many places & most characters are veterans. Very good and fairly quick read. 4/5

A Bridge too far: The Classic History of the Greatest Battle of World War II by Cornelius Ryan

An account of the failed airborne operation. Mostly a day-by-day account, with sources including interviews with participants. A little confusing without maps. 4/5

The Bomb: Presidents, Generals, and the Secret History of Nuclear War by Fred Kaplan

“The definitive history of American policy on nuclear war”. Lots of “War Plans” and “Targeting Policy” with back and forth between service factions. 3/5

The Sirens of Mars: Searching for Life on Another World
by Sarah Stewart Johnson

“Combines elements of memoir from Johnson with the history and science of attempts to discover life on Mars”. I liked this book a lot, very nicely written and inspiring. 4/5

My Scoring System

  • 5/5 = Brilliant, top 5 book of the year
  • 4/5 = Above average, strongly recommend
  • 3/5 = Average, in the middle 70% of books I read
  • 2/5 = Disappointing
  • 1/5 = Did not like at all


,

Simon LyallDonations 2020

Each year I do the majority of my Charity donations in early December (just after my birthday) spread over a few days (so as not to get my credit card suspended).

I also blog about it to hopefully inspire others. See: 2019, 2018, 2017, 2016, 2015

All amounts this year are in $US unless otherwise stated

My main donation was $750 to GiveWell (to allocate to projects as they prioritize). Once again I’m happy that GiveWell make efficient use of money donated. I decided this year to give a higher proportion of my giving to them than last year.

Software and Internet Infrastructure Projects

€20 to Syncthing which I’ve started to use instead of Dropbox.

$50 each to the Software Freedom Conservancy and Software in the Public Interest. The money is not attached to any specific project.

$51 to the Internet Archive

$25 to Let’s Encrypt

Advocacy Organisations

$50 to the Electronic Frontier Foundation

Others including content creators

I donated $103 to Signum University to cover Corey Olsen’s Exploring the Lord of the Rings series plus other stuff I listen to that they put out.

I paid $100 to be a supporter of NZ News site The Spinoff

I also supported a number of creators on Patreon:


,

Adrian ChaddRepairing and bootstrapping an IBM PC/AT 5170, Part 3

So! In Parts 1 and 2 I covered getting this old thing cleaned up, getting it booting, hacking together a boot floppy disc and getting a working file transfer onto a work floppy disc.

Today I'm going to cover what it took to get it going off of a "hard disk", which in 2020 can look quite a bit different than it did in 1990.

First up - what's the hard disk? Well, in the 90s we could still get plenty of MFM and early IDE hard disks to throw into systems like this. In 2020, well, we can get Very Large IDE disks from NOS (like multi hundred gigabyte parallel ATA interface devices), but BIOSes tend to not like them. The MFM disks are .. well, kinda dead. It's understandable - they didn't exactly build them for a 40 year shelf life.

The IBM PC, like most computers at the time, allowed peripherals to include software support in ROM for the hardware you're trying to use. For example, my PC/AT BIOS doesn't know anything about IDE hardware - it only knows about ye olde ST-412/ST-506 Winchester/MFM drive controllers. But contemporary IDE hardware would include the driver code for the BIOS in an on-board ROM, which the BIOS would enumerate and use. Other drive interconnects such as SCSI did the same thing.

By the time the 80386's were out, basic dumb IDE was pretty well supported in BIOSes as, well, IDE is really code for "let's expose some of the 16 bit ISA bus on a 40 pin ribbon cable to the drives". But, more about that later.

Luckily some electronics minded folk have gone and implemented alternatives that we can use. Notably:

  • There's now an open-source "Universal IDE BIOS" available for computers that don't have IDE support in their BIOS - notably the PC/XT and PC/AT, and
  • There are plenty of projects out there which break out the IDE bus on an XT or AT ISA bus - I'm using XT-IDE.
Now, I bought a little XT-IDE + compact flash card board off of ebay. They're cheap, they come with the Universal IDE BIOS on a flash device, and ...

... well, I plugged it in and it didn't work. So, I wondered if I broke it. I bought a second one, as I don't have other ISA bus computers yet, and ...

IT didn't work. Ok, so I know that there's something up with my system, not these cards. I did the 90s thing of "remove all IO cards until it works" in case there was an IO port conflict and ...

.. wham! The ethernet card. Both wanted 0x300. I'd have to reflash the Universal IDE BIOS to get it to look at any other address, so off I went to get the Intel Etherexpress 8/16 card configuration utility.

Here's an inside shot of the PC/AT with the XT-IDE installed, and a big gaping hole where the Intel EtherExpress 8/16 NIC should be.





No wait. What I SHOULD do first is get the XT-IDE CF card booting and running.

Ok, so - first things first. I had to configure the BIOS drive as NONE, because the BIOS isn't servicing the drive - the IDE BIOS is. Unfortunately, the IDE BIOS is coming in AFTER the system BIOS disks, so I currently can't run MFM + IDE whilst booting from IDE. I'm sure I can figure out how at some point, but that point is not today.

Success! It boots!

To DOS 6.22!

And only the boot sector, and COMMAND.COM! Nooooo!

Ok so - I don't have a working 3.5" drive installed, I don't have DOS 6.22 media on 1.2MB, but I can copy my transfer program (DSZ) - and Alley Cat - onto the CF card. But - now I need the DOS 6.22 install media.

On the plus side - it's 2020 and this install media is everywhere. On the minus side - it's disk images that I can't easily use. On the double minus side - the common DOS raw disk read/write tools - RAWREAD/RAWRITE - don't know about 5.25" drives! Ugh!

However! Here's where a bit of hilarious old knowledge is helpful - although the normal DOS installers want to be run from floppy, there's a thing called the "DOS 6.22 Upgrade" - and this CAN be run from the hard disk. However! You need a blank floppy for it to write the "uninstallation" data to, so keep one of those handy.

I extracted the files from the disk images using MTOOLS - "MCOPY -i DISK.IMG ::*.* ." to get everything out  - used PKZIP and DSZ to get it over to the CF card, and then ran the upgrader.
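For anyone repeating that step, here is a rough sketch of the extraction loop (the disk image names are hypothetical, and mtools needs to be installed on the modern machine):

for img in DISK1.IMG DISK2.IMG DISK3.IMG; do
    mcopy -i "$img" "::*.*" .    # copy every file out of the FAT disk image into the current directory
done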


Hello DOS 6.22 Upgrade Setup!


Ah yes! Here's the uninstall disc step! Which indeed I had on hand for this very moment!


I wonder if I should fill out the registration card for this install and send it to Microsoft.

Ok, so that's done and now I have a working full DOS 6.22 installation. Now I can do all the fun things like making a DOS boot disk and recovery things. (For reference - you do that using FORMAT /S A: to format a SYSTEM disk that you can boot from; then you add things to it using COPY.)

Finally, I made a boot disk with the Intel EtherExpress 8/16 config program on it, and reconfigured my NIC to somewhere other than 0x300. Now, I had to open up the PC/AT, remove the XT-IDE and install the EtherExpress NIC to do this - so yes, I had to boot from floppy disc.





Once that was done, I added a bunch of basic things like Turbo C 2.0, Turbo Assembler and mTCP. Now, mTCP is a package that really needed to exist in the 90s. However, this and the RAM upgrade (which I don't think I've talked about yet!) will come in the next installment of "Adrian's remembering old knowledge from his teenage years!".

,

David RoweFreeDV 700E and Compression

FreeDV 700D [9] is built around an OFDM modem [6] and powerful LDPC codes, and was released in mid 2018. Since then our real world experience has shown that it struggles with fast fading channels. Much to my surprise, the earlier FreeDV 700C mode actually works better on fast fading channels. This is surprising as 700C doesn’t have any FEC, but instead uses a simple transmit diversity scheme – the signal is sent twice at two different frequencies.

So I decided to develop a new FreeDV 700E waveform [8] with the following features:

  1. The ability to handle fast fading through an increased pilot symbol rate, but also with FEC which is useful for static crashes, interference, and mopping up random bit errors.
  2. Uses a shorter frame (80ms), rather than the 160ms frame of 700D. This will reduce latency and makes sync faster.
  3. The faster pilot symbol rate will mean 700E can handle frequency offsets better, as well as fast fading.
  4. Increasing the cyclic prefix from 2 to 6ms, allowing the modem to handle up to 6ms of multipath delay spread.
  5. A wider RF bandwidth than 700D, which can help mitigate frequency selective fading. If one part of the spectrum is notched out, we can use FEC to recover data from other parts of the spectrum. On the flip side, narrower signals are more robust to some interference, and use less spectrum.
  6. Compression of the OFDM waveform, to increase the average power (and hence received SNR) for a given peak power.
  7. Trade off low SNR performance for fast fading channel performance. A higher pilot symbol rate and longer cyclic prefix mean less energy is available for data symbols, so low SNR performance won’t be as good as 700D.
  8. It uses the same Codec 2 700C voice codec, so speech quality will be the same as 700C and D when SNR is high.

Over the course of 2020, we’ve refactored the OFDM modem and FreeDV API to make implementing new modem waveforms much easier. This really helped – I designed, simulated, and released the FreeDV 700E mode in just one week of part time work. It’s already being used all over the world in the development version of FreeDV 1.5.0.

My bench tests indicate 700C/D/E are equivalent on moderate fading channels (1Hz Doppler/2ms spread). As the fading speeds up to 2Hz 700D falls over, but 700C/E perform well. On very fast fading (4Hz/4ms) 700E does better as 700C stops working. 700D works better at lower SNRs on slow fading channels (1Hz Doppler/2ms and slower).

The second innovation is compression of the 700C/D/E waveforms, to increase average power significantly (around 6dB from FreeDV 1.4.3). Please be careful adjusting the Tx drive and especially enabling the Tools – Options – Clipping. It can drive your PA quite hard. I have managed 40W RMS out of my 75W PEP transmitter. Make sure your transmitter can handle long periods of high average power.

I’ve also been testing against compressed SSB, which is pretty hard to beat as it’s so robust to fading. However 700E is hanging on quite well with fast fading, and unlike SSB becomes noise free as the SNR increases. At the same equivalent peak power – 700D is doing well when compressed SSB is -5dB SNR and rather hard on the ears.

SSB Compression

To make an “apples to apples” comparison between FreeDV and SSB at low SNRs I need SSB compressor software that I can run on the (virtual) bench. So I’ve developed a speech compressor using some online wisdom [1][2]. Turns out the “Hilbert Clipper” in [2] is very similar to how I am compressing my OFDM signals to improve their PAPR. This appeals to me – using the same compression algorithm on SSB and FreeDV.

The Hilbert transformer takes the “real” speech signal and converts it to a “complex” signal. It’s the same signal, but now it’s represented by in phase and quadrature signals, or alternatively a vector spinning around the origin. Turns out you can do a much better job at compression by limiting the magnitude of that vector, than by clipping the input speech signal. Any clipping tends to spread the signal in frequency, so we have a SSB filter at the output to limit the bandwidth. Good compressors can get down to about 6dB PAPR for SSB, mine is not too shabby at 7-8dB.

It certainly makes a difference to noisy speech, as you can see in this plot (in a low SNR channel), and the samples below:

Compression  SNR   Sample
Off          High  Listen
On           High  Listen
Off          Low   Listen
On           Low   Listen

FreeDV Performance

Here are some simulated samples. They are all normalised to the same peak power, and all waveforms (SSB and FreeDV) are compressed. The noise power N is in dB but there is some arbitrary scaling (for historical reasons). A more negative N means less noise. For a given noise power N, the SNRs vary as different waveforms have different peak to average power ratio. I’m adopting the convention of comparing signals at the same (Peak Power)/(Noise Power) ratio. This matches real world transmitters – we need to do the best we can with a given PEP (peak power). So the idea below is to compare samples at the same noise power N, and channel type, as peak power is known to be constant. An AWGN channel is just plain noise, MPP is 1Hz Doppler, 2ms delay spread; and MPD is 2Hz Doppler, 4ms delay spread.

Test  Mode  Channel  N      SNR   Sample
1     SSB   AWGN     -12.5  -5    Listen
1     700D  AWGN     -12.5  -1.8  Listen
2     SSB   MPP      -17.0   0    Listen
2     700E  MPP      -17.0   3    Listen
3     SSB   MPD      -23.0   8    Listen
3     700E  MPD      -23.0   9    Listen

Here’s a sample of the simulated off air 700E modem signal at 8dB SNR in a MPP channel. It actually works up to 4 Hz Doppler and 6ms delay spread – which sounds like a UFO landing.

Comments:

  1. With digital, when you’re in a fade, you’re in a fade! You lose that chunk of speech. A short FEC code (less than twice the fade duration) isn’t going to help you much. We can’t extend the code length because of latency (this is PTT speech). Sigh.
  2. This explains why 700C (with no FEC) does so well – we lose speech during deep fades (where FEC breaks anyway) but it “hangs on” as the Doppler whirls around and sounds fine in the “anti-fades”. The voice codec is robust to a few % BER all by itself, which helps.
  3. Analog SSB is nearly impervious to fading, no matter how fast. It’s taken a lot of work to develop modems that “hang on” in fast fading channels.
  4. Analog SSB degrades slowly with decreasing SNR, but also improves slowly with increasing SNR. It’s still noisy at high SNRs. DSP noise reduction can help.

Let’s take a look at the effect of compression. Here is a screen shot from my spectrum analyser in zero-span mode. It’s displaying power against time from my FT-817 being driven by two waveforms. Yellow is the previous, uncompressed 700D waveform, purple is the latest 700D with compression. You can really get a feel for how much higher the average power is. On my radio I jumped from 5-10W RMS to 40W RMS.

Jose’s demo

Jose, LU5DKI sent me a wave file sample of him “walking through the modes” over a 12,500km path between Argentina and New Zealand. The SSB is at the 2:30 mark:

This example shows how well 700E can handle fast fading over a path that includes Antarctica:

A few caveats:

  • Jose’s TK-80 radio is 40 years old and doesn’t have any compression available for SSB.
  • FreeDV attenuates the “pass through” off air radio noise by about 6dB, so the level of the SSB will be lower than the FreeDV audio. However that might be a good idea with all that noise.
  • Some noise reduction DSP might help, although it tends to fall over at low SNRs. I don’t have a convenient command line tool for that. If you do – here is Jose’s sample. Please share the output with us.

I’m interested in objective comparisons of FreeDV and SSB using off air samples. I’m rather less interested in subjective opinions. Show me the samples …….

Conclusions and Further Work

I’m pleased with our recent modem waveform development and especially the compression. It’s also good fun to develop new waveforms and getting easier as the FreeDV API software matures. We’re getting pretty good performance over a range of channels now. Learning how to make modems for digital voice play nicely over HF channels. I feel our SSB versus FreeDV comparisons are maturing too.

The main limitation is the Codec 2 700C vocoder – while usable in practice it’s not exactly HiFi. Unfortunately speech coding is hard – much harder than modems. More R&D than engineering, which means a lot more effort – with no guarantee of a useful result. Anyhoo, lets see if I can make some progress on speech quality at low SNRs in 2021!

Links

[1] Compression – good introduction from AB4OJ.
[2] DSP Speech Processor Experimentation 2012-2020 – Sophisticated speech processor.
[3] Playing with PAPR – my initial simulations from earlier in 2020.
[4] Jim Ahlstrom N2ADR has done some fine work on FreeDV filter C code – very useful once again for this project. Thanks Jim!
[5] Modems for HF Digital Voice Part 1 and Part 2 – gentle introduction in to modems for HF.
[6] Steve Ports an OFDM modem from Octave to C – the OFDM modem Steve and I built – it keeps getting better!
[7] FreeDV 700E uses one of Bill’s fine LDPC codes.
[8] Modem waveform design spreadsheet.
[9] FreeDV 700D Released

Command Lines

Writing these down so I can cut and paste them to repeat these tests in the future….

Typical FreeDV 700E simulation, wave file output:

./src/freedv_tx 700E ../raw/ve9qrp_10s.raw - --clip 1 | ./src/cohpsk_ch - - -23 --mpp --raw_dir ../raw --Fs 8000 | sox -t .s16 -r 8000 -c 1 - ~/drowe/blog/ve9qrp_10s_700e_23_mpd_rx.wav

Looking at the PDF (histogram) of signal magnitude is interesting. Let’s generate some compressed FreeDV 700D:

./src/freedv_tx 700D ../raw/ve9qrp.raw - --clip 1 | ./src/cohpsk_ch - - -100  --Fs 8000 --complexout > ve9qrp_700d_clip1.iq16

Now take the complex valued output signal and plot the PDF and CDF of the magnitude (and the time domain and spectrum):

octave:1> s=load_raw("../build_linux/ve9qrp_700d_clip1.iq16"); s=s(1:2:end)+j*s(2:2:end); figure(1); plot(abs(s)); S=abs(fft(s)); figure(2); clf; plot(20*log10(S)); figure(3); clf; [hh nn] = hist(abs(s),25,1); cdf = empirical_cdf(1:max(abs(s)),abs(s)); plotyy(nn,hh,1:max(abs(s)),cdf);


This is after clipping, so 100% of the samples have a magnitude less than 16384. Also see [3].

When testing with real radios it’s useful to play a sine wave at the same PEP level as the modem signals under test. I could get 75W RMS (and PEP) out of my IC-7200 using this test (13.8VDC power supply):

./misc/mksine - 1000 160 16384 | aplay -f S16_LE

We can measure the PAPR of the sine wave with the cohpsk_ch tool:

./misc/mksine - 1000 10 | ./src/cohpsk_ch - /dev/null -100 --Fs 8000
cohpsk_ch: Fs: 8000 NodB: -100.00 foff: 0.00 Hz fading: 0 nhfdelay: 0 clip: 32767.00 ssbfilt: 1 complexout: 0
cohpsk_ch: SNR3k(dB):    85.23  C/No....:   120.00
cohpsk_ch: peak.....: 10597.72  RMS.....:  9993.49   CPAPR.....:  0.51 
cohpsk_ch: Nsamples.:    80000  clipped.:     0.00%  OutClipped:  0.00%

CPAPR = 0.5dB, should be 0dB, but I think there’s a transient as the Hilbert Transformer FIR filter memory fills up. Close enough.

By chaining cohpsk_ch together in various ways we can build a SSB compressor, and simulate the channel by injecting noise and fading:

./src/cohpsk_ch ../raw/ve9qrp_10s.raw - -100 --Fs 8000 | ./src/cohpsk_ch - - -100 --Fs 8000 --clip 16384 --gain 10 | ./src/cohpsk_ch - - -100 --Fs 8000 --clip 16384 | ./src/cohpsk_ch - - -17 --raw_dir ../raw --mpd --Fs 8000 --gain 0.8 | aplay -f S16_LE

cohpsk_ch: peak.....: 16371.51  RMS.....:  7128.40   CPAPR.....:  7.22

A PAPR of 7.2 dB is pretty good for a few hours’ work – the cool kids get around 6dB [1][2].

,

Jan SchmidtRift CV1 – Adventures in Kalman filtering

In my last post I wrote about changes in my OpenHMD positional tracking branch to split analysis of the tracking frames from the camera sensors across multiple threads. In the 2 months since then, the only real change in the repository was to add some filtering to the pose search that rejects bad poses by checking if they align with the gravity vector observed by the IMU. That is in itself a nice improvement, but there is other work I’ve been doing that isn’t published yet.

The remaining big challenge (I think) to a usable positional tracking solution is fusing together the motion information that comes from the inertial tracking sensors (IMU) in the devices (headset, controllers) with the observations that come from the camera sensors (video frames). There are some high level goals for that fusion, and lots of fiddly details about why it’s hard.

At the high level, the IMUs provide partial information about the motion of each device at a high rate, while the cameras provide observations about the actual position in the room – but at a lower rate, and with sometimes large delays.

In the Oculus CV1, the IMU provides Accelerometer and Gyroscope readings at 1000Hz (500Hz for controllers), and from those it’s possible to compute the orientation of the device relative to the Earth (but not the compass direction it’s facing), and also to integrate acceleration readings to get velocity and position – but the position tracking from an IMU is only useful in the short term (a few seconds) as it drifts rapidly due to that double integration.

The accelerometers measure (surprise) the acceleration of the device, but are always also sensing the Earth’s gravity field. If a device is at rest, it will ideally report 9.81 m/s², give or take noise and bias errors. When the device is in motion, the acceleration measured is the sum of the gravity field, bias errors and actual linear acceleration. To interpolate the position with any accuracy at all, you need to separate those 3 components with tight tolerance.

That’s about the point where the position observations from the cameras come into play. You can use those snapshots of the device position to determine the real direction that the devices are facing, and to correct for any errors in the tracked position and device orientation from the IMU integration – by teasing out the bias errors and gravity offset.

The current code uses some simple hacks to do the positional tracking – using the existing OpenHMD 3DOF complementary filter to compute the orientation, and some hacks to update the position when a camera finds the pose of a device.

The simple hacks work surprisingly well when devices don’t move too fast. The reason is (as previously discussed) that the video analysis takes a variable amount of time – if we can predict where a device is with a high accuracy and maintain “tracking lock”, then the video analysis is fast and runs in a few milliseconds. If tracking lock is lost, then a full search is needed to recover the tracking, and that can take hundreds of milliseconds to complete… by which time the device has likely moved a long way and requires another full pose search, which takes hundreds of milliseconds..

So, the goal of my current development is to write a single unified fusion filter that combines IMU and camera observations to better track and predict the motion of devices between camera frames. Better motion prediction means hitting the ‘fast analysis’ path more often, which leads to more frequent corrections of the unknowns in the IMU data, and (circularly) better motion predictions.

To do that, I am working on an Unscented Kalman Filter that tracks the position, velocity, acceleration, orientation and IMU accelerometer and gyroscope biases – with promising initial results.

Graph of position error (m) between predicted position and position from camera observations
Graph of orientation error (degrees) between predicted orientation and camera observed pose.

In the above graphs, the filter is predicting the position of the headset at each camera frame to within 1cm most of the time and the pose to within a few degrees, but with some significant spikes that still need fixing. The explanation for the spikes lies in the data sets that I’m testing against, and points to the next work that needs doing.

To develop the filter, I’ve modified OpenHMD to record traces as I move devices around. It saves out a JSON file for each device with a log of each IMU reading and each camera frame. The idea is to have a baseline data set that can be used to test each change in the filter – but there is a catch. The current data was captured using the upstream positional tracking code – complete with tracking losses and long analysis delays.

The spikes in the filter graph correspond with when the OpenHMD traces have big delays between when a camera frame was captured and when the analysis completes.

Delay (ms) between camera frame and analysis results.

What this means is that when the filter makes bad predictions, it’s because it’s trying to predict the position of the device at the time the sensor result became available, instead of when the camera frame was captured – hundreds of milliseconds earlier.

So, my next step is to integrate the Kalman filter code into OpenHMD itself, and hopefully capture a new set of motion data with fewer tracking losses to prove the filter’s accuracy more clearly.

Second – I need to extend the filter to compensate for that delay between when a camera frame is captured and when the results are available for use, by using an augmented state matrix and lagged covariances. More on that next time.

To finish up, here’s a taste of another challenge hidden in the data – variability in the arrival time of IMU updates. The IMU updates at 1000Hz – ideally we’d receive those IMU updates 1 per millisecond, but transfer across the USB and variability in scheduling on the host computer make that much noisier. Sometimes further apart, sometimes bunched together – and in one part there’s a 1.2 second gap.

IMU reading timing variability (nanoseconds)

,

Tim SerongI Have No Idea How To Debug This

On my desktop system, I’m running XFCE on openSUSE Tumbleweed. When I leave my desk, I hit the “lock screen” button, the screen goes black, and the monitors go into standby. So far so good. When I come back and mash the keyboard, everything lights up again, the screens go white, and it says:

blank: Shows nothing but a black screen
Name: tserong@HOSTNAME
Password:
Enter password to unlock; select icon to lock

So I type my password, hit ENTER, and I’m back in action. So far so good again. Except… Several times recently, when I’ve come back and mashed the keyboard, the white overlay is gone. I can see all my open windows, my mail client, web browser, terminals, everything, but the screen is still locked. If I type my password and hit ENTER, it unlocks and I can interact again, but this is where it gets really weird. All the windows have moved down a bit on the screen. For example, a terminal that was previously neatly positioned towards the bottom of the screen is now partially off the screen. So “something” crashed – whatever overlay the lock thingy put there is gone? And somehow this affected the position of all my application windows? What in the name of all that is good and holy is going on here?

Update 2020-12-21: I’ve opened boo#1180241 to track this.

,

Adrian ChaddRepairing and bootstrapping an IBM 5170 PC/AT, part 2

Ok, so now it runs. But, what did it take to get here?

First up - I'm chasing down a replacement fusible PROM and I'll likely have to build a programmer for it. The programmer will need to run a bit at a time, which is very different to what the EPROM programmers available today support. It works for now, but I don't like it.

I've uploaded a dump of the PROM here - https://erikarn.github.io/pcat/notes.html .

Here's how the repair looks so far:



Next - getting files onto the device. Now, remember the hard disk is unstable, and even setting that aside, it's only DOS 5.0, which didn't really ship with any useful file transfer stuff. Everyone expected you'd have floppies available. But, no, I don't have DOS available on floppy media! And, amusingly, I don't have a second 1.2MB drive installed anywhere to transfer files.

I have some USB 3.5" drives that work, and I have a 3.5" drive and Gotek drive to install in the PC/AT. However, until yesterday I didn't have a suitable floppy cable - the 3.5" drive and Gotek USB floppy thingy both use IDC pin connectors, and this PC/AT uses 34 pin edge connectors. So, whatever I had to do, I had to do with what I had.

There are a few options available:

  • You can write files in DOS COMMAND.COM shell using COPY CON <file> - it either has to be all ascii, or you use ALT-<3 numbers> to write ALT CODES. For MS-DOS, this would just input that value into the keyboard buffer. For more information, Wikipedia has a nice write-up here: https://en.wikipedia.org/wiki/Alt_code .
  • You can type in an ASCII-only assembly listing as above: a popular one was TCOM.COM, which I kept here: https://erikarn.github.io/pcat/tcomtxt.asm
  • If you have MODE.COM, you could try setting up the serial port (COM1, COM2, etc) to a useful baud rate, turn on flow control, etc - and then COPY COM1 <file>. I didn't try this because I couldn't figure out how to enable hardware flow control, but now that I have it all (mostly) working I may give it a go.
  • If you have QBASIC, you can write some QBASIC!
I tried TCOM.COM, both at 300 and 2400 baud. Neither was reliable, and there's a reason why - writing to the floppy is too slow! Far, far too slow! And it wasn't enforcing hardware flow control, which was very problematic for reliable transfers.

So, I wrote some QBASIC. It's pretty easy to open a serial port and read/write to it, but it's not AS easy to have it work for binary file transfer. There are a few fun issues:

  • Remember, DOS (and Windows too, yay!) has a difference between files open for text reading/writing and files open for binary reading/writing.
  • QBASIC has sequential file access or random file access. For sequential, you use INPUT/PRINT, for random you use GET and PUT.
  • There's no byte type - you define it as a STRING type of a certain size.
  • This is an 8MHz 80286, and .. well, let's just say QBASIC isn't the fastest thing on the planet here.
I could do some basic IO fine, but I couldn't actually transfer and write out the file contents quickly and reliably. Even going from 1200 to 4800 and 9600 baud didn't increase the transfer rate! So even given an inner loop of reading/writing a single byte at a time with nothing else, it still couldn't keep up.

The other amusingly annoying thing is what to use on the remote side to send binary files. Now, you can use minicom and such on FreeBSD/Linux, but it doesn't have a "raw" transfer type - it has xmodem, ymodem, zmodem and ascii transfers. I wanted to transfer a ~ 50KB binary to let me do ZMODEM transfers, and .. well, this presents a bootstrapping problem.

After a LOT of trial and error, I ended up with the following:

  • I used tip on FreeBSD to talk to the serial port
  • I had to put "hf=true" into .tiprc to force hardware handshaking; it didn't seem to work when I set it after I started tip (~s to set a variable)
  • On the QBASIC side I had to open it up with hardware flow control to get reliable transfers;
  • And I had to 128 byte records - not 1 byte records - to get decent transfer performance!
  • On tip, to send the file I would ask it to fork 'dd' to do the transfer (using ~C), asking it to pad to the 128 byte boundary:
    • dd if=file bs=128 conv=sync
The binary I chose (DSZ.COM) didn't mind the extra padding, as it wasn't checksumming itself.

Here's the hacky QBASIC program I hacked up to do the transfer:

OPEN "RB", #2, "MYFILE.TXT", 128

' Note: LEN = 128 is part of the OPEN line, not a separate line!
OPEN "COM1:9600,N,8,1,CD0,CS500,DS500,OP0,BIN,TB2048,RB32768" FOR RANDOM AS #1 LEN = 128

size# = 413 '413 * 128 byte transfer
DIM c AS STRING * 128 ' 128 byte record
FOR i = 1 TO size#
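  ' read a 128 byte record from COM1 (#1) and write it to the output file (#2)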
  GET #1, , c
  PUT #2, , c
NEXT i
CLOSE #2
CLOSE #1

Now, this is hackish, but specifically:
  • 9600 baud, 8N1, hardware flow control, 32K receive buffer.
  • 128 byte record size for both the file and UART transfers.
  • the DSZ.COM file size, padded to a 128 byte boundary, was 413 blocks. So, 413 block transfers.
  • Don't forget to CLOSE the file once you've written, or DOS won't finalise the file and you'll end up with a 0 byte file.
This plus tip configured for 9600 and hardware flow control did the right thing. I then used DSZ to use ZMODEM to transfer a fresh copy of itself, and CAT.EXE (Alley Cat!)
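For reference, the 413 above is just the DSZ.COM file size rounded up to a whole number of 128 byte records; a quick sketch of that arithmetic on the FreeBSD side (same file name as above):

# number of 128-byte records dd produces once conv=sync pads the final one
echo $(( ( $(wc -c < DSZ.COM) + 127 ) / 128 ))
# use that value for size# in the QBASIC receiver, then from tip (~C) run:
# dd if=DSZ.COM bs=128 conv=sync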

Ok, so that bootstrapped enough of things to get a ZMODEM transfer binary onto a bootable floppy disc containing a half-baked DOS 5.0 installation. I can write software with QBASIC and I can transfer files on/off using ZMODEM.

Next up, getting XT-IDE going in this PC/AT and why it isn't ... well, complete.



,

Adrian ChaddRepairing and bootstrapping an IBM 5170 PC/AT, part 1

 I bought an IBM PC/AT 5170 a few years ago for a Hackerdojo project that didn't end up going anywhere.

So, I have a PC/AT with:

  • 8MHz 80286 (type 3 board)
  • 512K on board
  • 128K expansion board (with space for 512K extended RAM, 41256 RAM chip style)
  • ST4038 30MB MFM drive with some gunk on head or platter 3 (random head 3 read failures, sigh)
  • 1.2MB floppy drive
  • CGA card
  • Intel 8/16 ethernet card



Ok, so the bad disk was a pain in the ass. It's 2020, DOS on 1.2MB floppy disks isn't exactly the easiest thing to come across. But, it DOES occasionally boot.

But, first up - the BIOS battery had leaked. Everywhere. So I replaced that, and had to type in a BASIC program into ROM BASIC to reprogram the BIOS NVRAM area with a default enough configuration to boot from floppy or hard disk.



Luckily someone had done that:

http://www.minuszerodegrees.net/5170/setup/5170_setup.htm

So, I got through that.

Then, I had to buy some double high density 5.25" discs. Ok, well, that took a bit, but they're still available as new old stock (no-one's making floppy discs anymore, sigh.) I booted the hard disk and after enough attempts at it, it booted to the command prompt. At which point I promptly created a bootable system disc and copied as much of DOS 5.0 off of it as I could.




Then, since I am a child of the 80s and remember floppy discs, I promptly DISKCOPY'ed it to a second disc that I'm leaving as a backup.

And, for funsies, DOSSHELL.



Ok, so what's next?

I decided to buy an alternate BIOS - the Quadtel 286 image that's floating about - because quite frankly having to type in that BASIC program into ROM BASIC every time was a pain in the ass. So, in it went. Which was good, because...

Well, then it stopped working. It turns out that my clean-up of the battery leakage wasn't enough. The system booted with three short beeps and "0E" on the screen.

Now we get into deep, deep PC history.

Luckily, the Quadtel BIOS codes are available here:

http://www.bioscentral.com/postcodes/quadtelbios.htm

.. but with the Intel BIOS, it didn't beep, it didn't do anything. Just a black screen.

What gives?

So, starting with the PC/AT and clone machines, the BIOS would write status updates during boot to a fixed IO port. Then, if you have a diagnostic card that monitors that IO port, you'd get updates on where the system got to during boot before it hit a problem. These are called POST (power on self test) codes.

Here's a write-up of it and some POST cards:

http://www.minuszerodegrees.net/misc/post_cards.htm

Luckily, the Quadtel BIOS just spat it out on the screen for me. Phew.

So! 0xE says the 8254 interval timer wasn't working. I looked on the board and ... voila! It definitely had a lot of rusty looking crap on it. U115, a 32 byte PROM used for some address line decoding, was also unhappy.

Here's how it looked before I had cleaned it up - this is circa July:




I had cleaned all of this out and used some vinegar on a Q-tip to neutralise the leaked battery gunk, but I didn't get underneath the ICs.

So, out they both came. I cleaned up the board, repaired some track damage and whacked in sockets.

Then in went the chips - same issue. Then I was sad.

Then! Into the boxes of ICs I went - where I found an 8254-2 that was spare! I lifted it from a dead PC clone controller board a while ago. In IT went, and the PC/AT came alive again.

(At this point I'd like to note that I was super afraid that the motherboard was really dead, as repairing PC/AT motherboards is not something I really wanted to do. Well, it's done and it works.)

Rightio! So, the PC boots, CGA monitor and all, from floppy disc. Now comes the fun stuff - how do I bootstrap said PC/AT with software, given no software on physical media? Aha, that's in part 2.

Dave HallContact

Skwashd Services Pty Ltd T/A Dave Hall Consulting. Address: PO Box 7306, Kaleen ACT 2617, Australia. Phone: +61 (0) 2 8294 4747. ABN: 99 127 791 539.

,

Linux AustraliaCouncil Meeting Tuesday 15th December 2020 – Minutes

1. Meeting overview and key information

Present

Sae Ra Germaine

Jonathan Woithe

Julien Goodwin

Russell Stuart

Lisa Sands

Joel Addison

Benno Rice

Apologies 

None

 

Meeting opened at 1930 AEDT by Sae Ra and quorum was achieved.

Minutes taken by Julien.

2. Log of correspondence

  • From Anchor to council@ on 2 Dec 2020: lca2018.org domain renewal is due by 1 Mar 2021.
    • Still a few services that need to be moved off.
    • AI: Joel to deal with.
  • From ASIC to council@ on 8 Dec 2020: “Open Source Australia” business registration due by 24 Nov 2020. This was dealt with on the day; an acknowledgement of renewal was received.
    • (Already done)

3. Items for discussion

  • Rusty Wrench
    • AI: Julien to send out to previous winners, without a suggestion from council
  • Code of Conduct wording clarification (background sent to council@)
    • Also, the GitHub copy turns out to be out of date. PyCon would like this refresh to happen early enough to be in place for the next PyCon.
    • Change will be announced.
    • AI: Jonathan to write up an announcement for policies and linux-aus.
  • YouTube Partner Program
    • Setup more formal escalation paths for LCA (and other LA-affiliated events) in future
    • Register for YouTube Partner Programme to get additional support avenues.
    • Need to create an AdSense account for this
    • Do not need to enable monetisation now or future, but do need to decide whether existing videos should have monetisation enabled when we join the program.
    • Sae Ra moves motion that we join the partner program, but do not enable monetisation for existing videos.
      • Passed, one abstention.
    • AI: Joel to register, and update Ryan.

4. Items for noting

  • Election/AGM announcement sent.
    • Need some volunteers for AGM meeting wrangling.
    • AGM is set for: 11am-midday on Friday the 15th of January (AEDT)
      • 8am Perth
    • AI: Julien to send call for AGM items ASAP.
  • LCA update <details redacted>

5. Other business

  • None

6. In camera

  • No items were discussed in camera

Meeting closed at 2020

The post Council Meeting Tuesday 15th December 2020 – Minutes appeared first on Linux Australia.

Stewart SmithTwo Photos from Healesville Sanctuary

If you’re near Melbourne, you should go to Healesville Sanctuary and enjoy the Australian native animals. I’ve been a number of times over the years, and here’s a couple of photos from a relatively recent (as in, the last couple of years) trip.

Leah trying to photograph a much too close bird
Koalas seem to always look like they’ve just woken up. I’m pretty convinced this one just had.

Stewart SmithPhotos from Adelaide

Some shots on Kodak Portra 400 from Adelaide. These would have been shot with my Nikon F80 35mm body, I think all with the 50mm lens. These are all pre-pandemic, and I haven’t gone and looked up when exactly. I’m just catching up on scanning some negatives.

,

Maxim ZakharovSafari; wss; OSStatus -9829

You may google up several explanations for the error "WebSocket network error: The operation couldn't be completed (OSStatus error -9829)" which appears when you attempt to connect to a secure websocket using the Safari web browser on macOS 11.0.1 (Big Sur). One of them points out the stricter requirements for trusted website certificates on macOS and iOS, which I didn't know about.

However, in my case the error was caused by a drawback in the Safari browser - it turns out it simply does not send the user (client) certificate when performing a secure websocket connection (wss:// URL scheme).
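One way to check the server side independently of the browser (a rough sketch - the host name and certificate file names here are hypothetical) is to complete the TLS handshake with openssl s_client while explicitly presenting the client certificate:

openssl s_client -connect example.com:443 -cert user-cert.pem -key user-key.pem -CAfile ca-chain.pem
# if the handshake and client certificate request succeed here, the server side is fine
# and the failure is the browser withholding the client certificate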

Fortunately, the Firefox browser doesn't have this drawback and everything works fine when you use it, though you will need to install all the CA certificates and the user certificate into Firefox's own store, as it turns out not to use the system one.

Maxim ZakharovAsynchronous Consensus Algorithm

,

Tim RileyOpen source status update, November 2020

Hello again, dear OSS enthusiasts. November was quite a fun month for me. Not only did I merge all the PRs I outlined in October’s status update, I also got to begin work on an area I’d been dreaming about for months: integrating Hanami/dry-system with Zeitwerk!

Added an autoloading loader to dry-system

Zeitwerk is a configurable autoloader for Ruby applications and gems. The “auto” in autoloader means that, once configured, you should never have to manually require before referring to the classes defined in the directories managed by Zeitwerk.

dry-system, on the other hand, was requiring literally every file it encountered, by design! The challenge here was to allow it to work with or without an auto-loader, making either mode a configurable option, ideally without major disruption to the library.

Fortunately, many of the core Dry::System::Container behaviours are already separated into individually configurable components, and in the end, all we needed was a new Loader subclass implementing a 2-line method:

module Dry
  module System
    class Loader
      # Component loader for autoloading-enabled applications
      #
      # This behaves like the default loader, except instead of requiring the given path,
      # it loads the respective constant, allowing the autoloader to load the
      # corresponding file per its own configuration.
      #
      # @see Loader
      # @api public
      class Autoloading < Loader
        def require!
          constant
          self
        end
      end
    end
  end
end

This can be enabled for your container like so:

require "dry/system/loader/autoloading"

class MyContainer < Dry::System::Container
  configure do |config|
    config.loader = Dry::System::Loader::Autoloading
    # ...
  end
end

Truth is, it did take a fair bit of doing to arrive at this simple outcome. Check out the pull request for more detail. The biggest underlying change was moving the responsibility for requiring files out of Container itself and into the Loader (which is called via each Component in the container). While I was in there, I took the chance to tweak a few other things too:

  • Clarified the Container.load_paths! method by renaming it to add_to_load_path! (since it is modifying Ruby’s $LOAD_PATH)
  • Stopped automatically adding the system_dir to the load path, since with Zeitwerk support, it’s now reasonable to run dry-system without any of its managed directories being on the load path
  • Added a new component_dirs setting, defaulting to ["lib"], which is used to verify whether a given component is “local” to the container. This check was previously done using the directories passed to load_paths!, which we can’t rely upon now that we’re supporting autoloaders
  • Added a new add_component_dirs_to_load_path setting, defaulting to true, which will automatically add the configured component_dirs to the load path in an after-configure hook. This will help ease the transition from the previous behaviour, and make dry-system still work nicely when not using an autoloader

With all of this in place, a full working example with Zeitwerk looks like this. First, the container:

require "dry/system/container"
require "dry/system/loader/autoloading"

module Test
  class Container < Dry::System::Container
    config.root = Pathname(__dir__).join("..").realpath
    config.add_component_dirs_to_load_path = false
    config.loader = Dry::System::Loader::Autoloading
    config.default_namespace = "test"
  end
end

Then Zeitwerk setup:

loader = Zeitwerk::Loader.new
loader.push_dir Test::Container.config.root.join("lib").realpath
loader.setup

Then, given a component “foo_builder”, at lib/test/foo_builder.rb:

module Test
  class FooBuilder
    def call
      # We can now reference this constant without a require!
      Entities::Foo.new
    end
  end
end

With this in place, we can resolve Test::Container["foo_builder"], receive an instance of Test::FooBuilder as expected, then .call it to receive our Entities::Foo instance. Tada!

I’m very happy with how all this came together.

Next steps with dry-system

Apart from cracking the Zeitwerk nut, this project also gave me the chance to dive into the guts of dry-system after quite a while. There’s quite a bit of tidying up I’d still like to do, which is my focus for the next month or so. I plan to:

  • Make it possible to configure all aspects of each component_dir via a single block passed to the container’s config
  • Remove the default_namespace top-level container setting (since this will now be configured per-component_dir)
  • Remove the .auto_register! method, since our component-loading behaviour requires component dirs to be configured, and this method bypasses that step (until now, it’s only really worked by happenstance)
  • Make Zeitwerk usable without additional config by providing a plugin that can be activated by a simple use :zeitwerk

Once these are done, I’ll hop up into the Hanami framework layer and get to work on passing the necessary configuration through to its own dry-system container so that it can also work with Zeitwerk out of the box.

Hanami core team meeting

This month I also had the (rare!) pleasure of catching up with Luca and Piotr in person to discuss our next steps for Hanami 2 development. Read my notes to learn more. If you’re at all interested in Hanami development (and if you’ve reached this point in my 9th straight monthly update, I assume you are), then this is well worth a read!

Of particular relevance to the topics above, we’ve decided to defer the next Hanami 2 alpha release until the Zeitwerk integration is in place. This will ensure we have a smooth transition across releases in terms of code loading behaviour (if we released sooner, we’d need to document a particular set of rules for alpha2 but then throw half of those out the window for alpha3, which is just too disruptive).

Thank you to my sponsors!

After all this time, I’m still so appreciative of my tiny band of GitHub sponsors. This stuff is hard work, so I’d really appreciate your support.

See you all again next month, by which point we’ll all have a Ruby 3.0 release!

,

Simon LyallAudiobooks – November 2020

The Geography of Nowhere: The Rise and Decline of America’s Man-made Landscape by James Howard Kunstler

A classic in urban planning, covering the downside of post-war American urban design. It dates from 1993 but is still 90% relevant. 3/5

A Year in Paris: Season by Season in the City of Light
by John Baxter

A series of short chapters arranged in seasonal sections on Paris, People, the Author’s life and the French Revolutionary Calendar. Plenty of Interest. 3/5

These Happy Golden Years: Little House Series, Book 8 by Laura Ingalls Wilder

Covering Laura’s short time as a schoolteacher (aged 15!) and her courting with husband-to-be Almanzo. Most action is in the first half of the book though. 3/5

Pure Invention: How Japan’s Pop Culture Conquered the World by Matt Alt

In-depth chapters on things like the Walkman, Game Boy and Hello Kitty trace Japan’s rise, first in hardware and then in cultural influence. An excellent story. 4/5

On All Fronts: The Education of a Journalist by Clarissa Ward

A conflict reporter’s memoir of her life and career. Based mainly in Moscow, Baghdad, and Beirut, she goes into particular detail about her missions into Syria during its civil war. 3/5

My Scoring System

  • 5/5 = Brilliant, top 5 book of the year
  • 4/5 = Above average, strongly recommend
  • 3/5 = Average, in the middle 70% of books I read
  • 2/5 = Disappointing
  • 1/5 = Did not like at all


,

David RoweOpen IP over VHF/UHF 4

For the last few weeks I’ve been building up some automated test software for my fledgling IP over radio system.

Long term automated tests can help you thrash out a lot of issues. Part of my cautious approach in taking small steps to build a complex system. I’ve built up a frame repeater system where one terminal “pings” another terminal – and repeats this thousands of times. I enjoyed writing service scripts [3] to wrap up the complex command lines, and bring the system “up” and “down” cleanly. The services also provide some useful debug options like local loopback testing of the radio hardware on each terminal.

I started testing the system “over the bench” with two terminals pinging and ponging frames back and forth via cables. After a few hours I hit a bug where the RpiTx RF would stop. A few repeats showed this sometimes happened in a few minutes, and other times after a few hours.

This led to an interesting bug hunt. I quite enjoy this sort of thing, peeling off the layers of a complex system, getting closer and closer to the actual problem. It was fun to learn about the RpiTx [2] internals. A very clever system of a circular DMA buffer feeding PLL fractional divider values to the PLLC registers on the Pi. The software application chases that DMA read pointer around, trying to keep the buffer full.

By dumping the clock tree I eventually worked out some other process was messing with the PLLC register. Evariste on the RpiTx forum then suggested I try “force_turbo=1” [4]. That fixed it! My theory is the CPU freq driver (wherever that lives) was scaling all the PLLs when the CPU shifted clock speed. To avoid being caught again I added some logic to check PLLC and bomb out if it appears to have been changed.

A few other interesting things I noticed:

  1. I’m running 10 kbit/s for these tests, with a 10kHz shift between the two FSK tones and a carrier frequency of 144.5MHz. I use an FT-817 SSB Rx (with a bandwidth of around 3 kHz) to monitor the transmission. A lot of the time an FSK burst sounds like broadband noise, as the FT-817 is just hearing a part of the FSK spectrum. However if you tune to the high or low tone frequency (just under 144.500 or 144.510) you can hear the FSK tones. A nice audio illustration of FSK in action.
  2. On start up RPiTx uses ntp to calibrate the frequency, which leads to slight shifts in the frequency each time it starts. Enough to be heard by the human ear, although I haven’t measured them.

I’ve just finished a 24 hour test where the system sent 8600 bursts (about 6 Mbyte in each direction) over the link, and everything is running nicely (100% of packets were received). This gives me a lot of confidence in the system. I’d rather know if there are any stability issues now than when the device under test is deployed remotely.

I feel quite happy with that result – there’s quite a lot of signal processing software and hardware that must be playing nicely together to make that happen. Very satisfying.

Next Steps

Now it’s time to put the Pi in a box, connect a real antenna and try some over the air tests. My plan is:

  1. Set up the two terminals several km apart, and see if we can get a viable link at 10 kbit/s, although even 1 kbit/s would be fine for initial tests. Enough margin for 100 kbit/s would be even better, but happy to work on that milestone later.
  2. I’m anticipating some fine tuning of the FSK_LDPC waveforms will be required.
  3. I’m also anticipating problems with urban EMI, which will raise the noise floor and set the SNR of the link. I’ve instrumented the system to measure the noise power at both ends of the link, so I can measure this over time. I can also measure received signal power, and estimate path loss. Knowing the gain of the RTLSDR, we can measure signal power in dBm, and estimate noise power in dBm/Hz (see the sketch just after this list).
  4. There might be some EMI from the Pi, let’s see what happens when the antenna is close.
  5. I’ll run the frame repeater system over several weeks, debug any stability issues, and collect data on S, N, SNR, and Packet Error Rate.
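
To make item 3 concrete, here is a back-of-envelope Python sketch of the link budget maths involved. The numbers are hypothetical placeholders, not measurements from this system.

import math

def snr_db(signal_dbm, noise_dbm_per_hz, bandwidth_hz):
    # SNR given received signal power, noise density and noise bandwidth
    noise_dbm = noise_dbm_per_hz + 10 * math.log10(bandwidth_hz)
    return signal_dbm - noise_dbm

def path_loss_db(tx_power_dbm, tx_gain_dbi, rx_gain_dbi, rx_signal_dbm):
    # Simple link budget: everything the Tx put in, minus what arrived
    return tx_power_dbm + tx_gain_dbi + rx_gain_dbi - rx_signal_dbm

# Hypothetical example: 20 dBm Tx, unity gain antennas, -110 dBm received,
# -160 dBm/Hz noise density, 10 kHz noise bandwidth
print(snr_db(-110, -160, 10e3))      # 10 dB SNR
print(path_loss_db(20, 0, 0, -110))  # 130 dB path loss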

Reading Further

[1] Open IP over VHF/UHF Part 1 Part 2 Part 3
[2] RpiTx – Radio transmitter software for Raspberry Pis
[3] GitHub repo for this project with build scripts, a project plan and a bunch of command lines I use to run various tests. The latest work in progress will be an open pull request.
[4] RpiTx Group discussion of the PLLC bug discussed above

,

Linux AustraliaCouncil Meeting Tuesday 1st December 2020 – Minutes

1. Meeting overview and key information

Present

Sae Ra Germaine

Jonathan Woithe

Julien Goodwin

Russell Stuart

Lisa Sands

Joel Addison

Benno Rice

Apologies

None

 

Meeting opened at 1930 AEDT by Sae Ra and quorum was achieved.

Minutes taken by Julien.

2. Event Review

Drupal

Admin Team

Pycon

LCA 2020

LCA 2021

LCA 2022

3. Log of correspondence

  • 23 Nov 2020: Mailchimp policy update notification sent to council@.
    • N/A as the account has been closed
    • Data export saved into our Google Drive
  • From: Xero; Date: Mon 30 Nov 2020, Subject: Xero pricing changes, Summary: Price is going up by $2/mo.

4. Items for discussion

  • AGM timing
    • We’ll continue as planned, Julien & Sae Ra to ensure the announcements go out.
  • Do we have a meeting on the 29th of December, if not 12th Jan is normal (just before AGM)
    • No to the end of December, yes to the January
  • LCA YouTube account (Joel)
    • Setup more formal escalation paths for LCA (and other LA-affiliated events) in future
    • Register for YouTube Partner Programme to get additional support avenues.
    • Need to create an AdSense account for this
    • Do not need to enable monetisation now or future, but do need to decide whether existing videos should have monetisation enabled when we join the program.
    • AI: Sae Ra will move this on list so people have time to review

5. Items for noting

  • Rusty wrench nom period ongoing
    • 2 proper nominations received
    • 1 resubmission of a nomination from last year requested
    • 1 apparently coming
  • Jonathan reached out to <a member> re Code of Conduct concerns, haven’t gotten a response
  • Grant application for Software Freedom Day, no response from them, so grant has lapsed.
  • <our contact on the CovidSafe analysis team> has yet to provide information about the FOI costs associated with his group’s work on the COVIDsafe app.

6. Other business 

  • None

7. In camera

  • No items were discussed in camera

2038 AEDT close

The post Council Meeting Tuesday 1st December 2020 – Minutes appeared first on Linux Australia.

,

Stewart SmithWhy you should use `nproc` and not grep /proc/cpuinfo

There’s something really quite subtle about how the nproc utility from GNU coreutils works. If you look at the man page, it’s even the very first sentence:

Print the number of processing units available to the current process, which may be less than the number of online processors.

So, what does that actually mean? Well, just because the computer some code is running on has a certain number of CPUs (and here I mean “number of hardware threads”) doesn’t necessarily mean that you can spawn a process that uses that many. What’s a simple example? Containers! Did you know that when you invoke docker to run a container, you can easily limit how much CPU the container can use? In this case, we’re looking at the --cpuset-cpus parameter, as the --cpus one works differently.

$ nproc
8

$ docker run --cpuset-cpus=0-1 --rm=true -it  amazonlinux:2
bash-4.2# nproc
2
bash-4.2# exit

$ docker run --cpuset-cpus=0-2 --rm=true -it  amazonlinux:2
bash-4.2# nproc
3

As you can see, nproc here gets the right bit of information, so if you’re wanting to do a calculation such as “Please use up to the maximum available CPUs” as a parameter to the configuration of a piece of software (such as how many threads to run), you get the right number.

But what if you use some of the other common methods?

$ /usr/bin/lscpu -p | grep -c "^[0-9]"
8
$ grep -c 'processor' /proc/cpuinfo 
8

$ docker run --cpuset-cpus=0-1 --rm=true -it  amazonlinux:2
bash-4.2# yum install -y /usr/bin/lscpu
......
bash-4.2# /usr/bin/lscpu -p | grep -c "^[0-9]"
8
bash-4.2# grep -c 'processor' /proc/cpuinfo 
8
bash-4.2# nproc
2

In this case, if you base your number of threads on grepping lscpu you take another dependency (on the util-linux package), which isn’t needed. You also get the wrong answer, as you do by grepping /proc/cpuinfo. So, what this will end up doing is just increase the number of context switches, possibly also degrading performance. It’s not just in docker containers where this could be an issue of course; you can use the same mechanism that docker uses anywhere you want to control the resources of a process.

Another subtle thing to watch out for is differences in /proc/cpuinfo content depending on CPU architecture. You may not think it’s an issue today, but who wants to needlessly debug something?
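
If you need the same answer from inside a program rather than a shell pipeline, the underlying affinity information is available directly. Here is a minimal Python sketch (Linux-only, since os.sched_getaffinity() isn’t available everywhere) contrasting the two numbers; this is roughly the same distinction nproc is making for you.

import os

# CPUs this process is actually allowed to run on (respects cpusets/affinity)
available = len(os.sched_getaffinity(0))

# All online CPUs in the system, which is what /proc/cpuinfo reflects
total = os.cpu_count()

print(f"{available} of {total} CPUs available to this process")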

tl;dr: for determining “how many processes to run”: use nproc, don’t grep lscpu or /proc/cpuinfo

,

Stewart SmithPhotos from Tasmania (2017)

On the random old photos train, there’s some from spending time in Tasmania post linux.conf.au 2017 in Hobart.

All of these are Kodak E100VS film, which was no doubt a bit out of date by the time I shot it (and when they stopped making Ektachrome for a while). It was a nice surprise to be reminded of a truly wonderful Tassie trip, taken with friends, and after the excellent linux.conf.au.

Linux AustraliaCouncil Meeting Tuesday 17th November 2020 – Minutes

1. Meeting overview and key information

Present

Sae Ra Germaine

Jonathan Woithe

Julien Goodwin

Russell Stuart

Lisa Sands

Benno Rice

Joel Addison

Apologies 

None

 

Meeting opened at 1932 AEDT by Sae Ra and quorum was achieved.

Minutes taken by Julien.

2. Log of correspondence

  • None

3. Items for discussion

  • Rusty Wrench timing.
    • Call for nominations draft looks good, (AI) Julien to send ASAP
  • Vala Tech Camp diversity scholarship
    • See mail to council@ 2020-11-09 from Sae Ra, “[LACTTE] For Next Meeting – VALA Tech Camp Diversity Scholarship”
    • Motion by Julien for Linux Australia to sponsor VALA Tech Camp A$2,750, seconded by Jonathan.
    • Passed, one abstention.

4. Items for noting

  • LCA21 <details redacted>

5. Other business

  • Quick discussion re covidsafe research and freedom of information requests, nothing for now, possibly next time
  • Moving the returning officer video into a doc would be nice, but no short term volunteers. Jonathan may look into it early 2021.

6. In camera

  • No items were discussed in camera

 

Meeting closed at 2006

The post Council Meeting Tuesday 17th November 2020 – Minutes appeared first on Linux Australia.

,

David RoweOpen IP over VHF/UHF 3

The goal of this project is to develop a “100 kbit/s IP link” for VHF/UHF using just a Pi and RTLSDR hardware, and open source signal processing software [1]. Since the last post, I’ve integrated a bunch of components and now have a half duplex radio data system running over the bench.

Recent progress:

  1. The Tx and Rx signal processing is now operating happily together on a Pi, CPU load is fine.
  2. The FSK_LDPC modem and FEC [2] have been integrated, so we can now send and receive coded frames. The Tx and Rx command line programs have been modified to send and receive bursts of frames.
  3. I’ve added a PIN diode Transmit/Receive switch, which I developed for the SM2000 project [3]. This is controlled by a GPIO from the Pi. There is also logic to start and stop the Pi Tx carrier at the beginning and end of bursts – so it doesn’t interfere with the Rx side.
  4. I’ve written a “frame repeater” application that takes packets received from the Rx and re-transmits them using the Tx. This will let me run “ping” tests over the air. A neat feature is it injects the received Signal and Noise power into the frame it re-transmits. This will let me measure the received power, the noise floor, and SNR at the remote station.
  5. The receiver in each terminal is very sensitive, and inconveniently picks up frames transmitted by that terminal. After trying a few approaches I settled on a “source filtering” design. When a packet is transmitted, the Tx places a “source byte” in the frame that is unique to that terminal. A one byte MAC address I guess. The local receiver then ignores (filters) any packets with that source address, and only outputs frames from other terminals (a minimal sketch of this follows the list).
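
Here is that minimal sketch of the source filtering idea from item 5. The one-byte source ID at the start of each frame follows the description above, but the exact frame layout and values are hypothetical.

MY_SOURCE_ID = 0x42  # this terminal's one-byte "MAC address"

def filter_own_frames(frames, my_id=MY_SOURCE_ID):
    # Drop frames we transmitted ourselves; pass everything else up the stack
    for frame in frames:
        if frame[0] != my_id:
            yield frame

# Hypothetical received frames: one of ours, one from terminal 0x07
rx = [bytes([0x42]) + b"ping", bytes([0x07]) + b"pong"]
print(list(filter_own_frames(rx)))  # only the frame from terminal 0x07 survives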

Here is a block diagram of the Pi based terminal, showing hardware and software components:

When I build my next terminal, I will try separate Tx and Rx antennas, as a “minimalist” alternative to the TR switch. The next figure shows the transmit control signals in action. Either side of a burst we need to switch the TR switch and turn the Tx carrier on and off:

Here’s the current half duplex setup on the bench:

Terminal2, on the left, is comprised of the Pi, RTLSDR, and TR switch. Terminal1 (right) is the HackRF/RTLSDR connected to my laptop. Instead of a TR switch I’m using a hybrid combiner (a 3dB loss, but not an issue for these tests). This also shows how different SDR Tx/Rx hardware can be used with this system.

I’m using 10,000 bit/s for the current work, although that’s software configurable. When I start testing over the air I’ll include options for a range of bit rates, eventually shooting for 100 kbits/s.

Here’s a demo video of the system:

Next Steps

The command lines to run everything are getting unwieldy so I’ll encapsulate them in some “service” scripts to start and stop the system neatly. Then box everything up, try a local RF link, and check for stability over a few days. Once I’m happy I will deploy a terminal and start working through the real world issues. The key to getting complex systems going is taking tiny steps. Test and debug carefully at each step.

It’s coming together quite nicely, and I’m enjoying a few hours of work on the project every weekend. It’s very satisfying to build the layers up one by one, and a pleasant surprise when the pieces start playing nicely together and packets move magically across the system. I’m getting to play with RF, radios, modems, packets, and even building up small parts of a protocol. Good fun!

Reading Further

[1] Open IP over UHF/VHF Part 1 and Part 2.
[2] FSK LDPC Data Mode – open source data mode using a FSK modem and powerful LDPC codes.
[3] SM2000 Part 3 – PIN TR Switch and VHF PA
[4] GitHub repo for this project with build scripts, a project plan and a bunch of command lines I use to run various tests. The latest work in progress will be an open pull request.

Stewart SmithPhotos from Melbourne

I recently got around to scanning some film that took an awful long time to make its way back to me after being developed. There’s some pictures from home.

The rest of this roll of 35mm Fuji Velvia 50 is from Tasmania, which would place this all around December 2016.

,

Glen TurnerBlocking a USB device

udev can be used to block a USB device (or even an entire class of devices, such as USB storage). Add a file /etc/udev/rules.d/99-local-blacklist.rules containing:

SUBSYSTEM=="usb", ATTRS{idVendor}=="0123", ATTRS{idProduct}=="4567", ATTR{authorized}="0"



,

Lev LafayetteeResearchAustralasia 2020

With annual conferences since 2007, eResearchAustralasia was hosted online this year due to the impacts of SARS-CoV-2. Typically conferences are held along the eastern seaboard of Australia, which does bring into question the "-asia" part of the suffix. Even the conference logo highlights Australia and New Zealand, to the exclusion of the rest of the world. I am not sure how eResearch NZ feels about this encroachment on their territory. To be fair, however, eResearchAustralasia did have some "-asian" content, primarily in the keynote address on data analytics for COVID-19 tracking in Indonesia, and a presentation on data sharing, also with an Indonesian focus.

The conference had 582 attendees, up 119 from last year, and ran for five days from Monday, October 19 to Friday, October 23. Presentations and timetabling were quite varied, with a combination of single-session keynotes, three or four concurrent streams of oral presentations and "birds-of-a-feather" sessions, lightning talks, "solution showcases", a poster session (read, "download the PDF"), online exhibitions, and a rather poorly-thought-out "speed networking"; overall more than 120 presentations. The conference itself was conducted through a product called "OnAir", which from some accounts has a good event management suite (EventsAIR), but the user interface could certainly do with some significant improvement. One notable advantage of having an online conference with pre-recorded presentations is that the presenters could engage in a live Q&A and elaboration with attendees, which actually meant that more content could be derived.

It was, of course, impossible to attend all the events so any review is orientated to those sessions that could be visited, which is around fifty in my case. The conference has promised to make videos available at a later date on a public platform (e.g., their Youtube channel) as there was no means to download videos at the conference itself, although sometimes slidedecks were provided. Unsurprisingly, a good proportion of the sessions were orientated around the new work and research environments due to the pandemic (including a topic stream), ranging from data management services, DNA sequencing using Galaxy, and multiple sessions on moving training online.

Training, in fact, received its own topic stream which is good to see after many years of alerts on the growing gap between researcher skills, practices, and requirements. This became particularly evident in one presentation from Intersect, which highlighted the need for educational scaffolding. Another training feature of note came from the University of Otago who, with an interest in replication and reproducibility, reported on their training in containers. This provided for a very interesting comparison with the report on The Australian Research Container Orchestration Service (ARCOS) which is establishing an Australian Kubernetes Core Service to support the use and orchestration of containers.

It will be interesting to see how ARCOS interfaces not just with the Australian Research Data Commons (ARDC), but also with the newly announced Australian Research Environment (ARE), a partnership between Pawsey, AARnet, and NCI "to provide a streamlined, nationally integrated workspace connected by high speed links" in 2021, especially when looking at the presentation on The Future of Community Cloud and State Based Infrastructure Organisations and AARNet's presentation on Delivering sustainable Research Infrastructure, emphasizing the National Research Data Infrastructure and expansions of the CloudStor Active Research Data Storage and Analysis services.

Research Computing Services from the University of Melbourne and friends were well-represented at the Conference. Steve Manos, for example, gave the presentation on ARCOS. Bernard Meade was a speaker on the panel for Governance Models for Research Compute, whereas yours truly inflicted two presentations on a willing audience: one on Spartan: From Experimental Hybrid towards a Petascale Future, and another on contributing to the international HPC Certification Forum; I made sure I provided slidedecks and a transcript, and I don't think anyone else did that. There was also one presentation on law and Natural Language Processing which garnered an additional mention of Spartan, albeit only to the extent that they said they hadn't gotten around to using the service yet! Something also of note was the multiple presentations on Galaxy which is, of course, prominent at Melbourne Bioinformatics.

This is, of course, only a taste of the presentations both in terms of what was available at the conference and what your reviewer attended, but it does give some highlights of what were seen as significant contributions. Despite some situational difficulties in hosting the event eResearchAustralasia have done quite well indeed in making the conference happen and deserve strong and sincere congratulations for that success and the impressive number of registrations. Whilst the conference made it clear that people are adapting to the current circumstances and eResearch has made enormous contributions to the scientific landscape on this matter, there are also clear indications of some national-level organisational initiatives as well. Whilst disruption and change are inevitable, especially in these circumstances, it is hoped that the scientific objectives remain the highest priority throughout.

Lev LafayetteThe Willsmere Cup

Like an increasing number of Australians I'm not going to partake in the Melbourne Cup "the race that stops a nation", at least not in a traditional manner. Such events cause trauma and death on the track, often enough, and the last race is always to the knackery. The horse racing industry is unnecessary and cruel.

I'll just leave this here for those who want to learn more:

https://www.abc.net.au/news/2019-10-18/slaughter-abuse-of-racehorses-und...

But as a game designer I thought to myself, why not run a simulation? So here it is; "The Willsmere Cup"

The Melbourne Cup was originally 2 miles (16 furlongs), so I've modified it in scale to 32" because I'm using the nostalgic old measuring system here.

Each horse has a base move per thirty-second turn of 4", but with 2 FUDGE dice; I would reduce base move and increase FUDGE dice for poor weather, track quality, etc.

This means an average of 8 rolls (4 minutes) to complete the track, and potentially winnable in just 5 die rolls (2 minutes 30) with normal speeds. The actual record, again using the old length, is 3 minutes 19 or 7 rolls.

The horses in this race are:

Qilin: From China and blessed with Taoist magics, that runs a sagacious and even-paced race (base move unchanged throughout).

Ziggy: The local favourite hails from a red dust plain. Some say it's from Mars, but really it was outback Australia. Has a bit of a sparkle to their personality which can lead to erratic running (3 FUDGE dice).

Cupid: The darling of the race, a real show pony. Completely average in all racing respects (base move unchanged), but who doesn't love the unicorn-pegasus of love?

Atargatis: An Assyrian breed, known for flying starts and flowing like water, but has trouble with endurance (base move of 5 for first 2 minutes, then base move 3 after that).

Twilight: The dark horse in the race with a brooding (one could say, 'Gothic') personality from the New England region of the United States. Known for starting slow (base move 3 for first 2 minutes), but really picks up speed after that (base move 5 after that). A real stalking horse.

Race at Willsmere at 11am. Unicorn-pegasus horsies provided by Erica Hoehn.

At the starting gate. A lovely day here for a miniature unicorn-pegasus race.

And they're off! Atargatis has taken an early lead (5"), then neck-and-neck for Qilin, Cupid, and Ziggy (4"), and with Twilight (3") bringing up the rear.

At the one-minute mark we see that Atargatis (5"+6"=11") has really opened up, and Twilight (3"+5"=8") has caught up and is now equal second with Qilin (4"+4"=8"). Cupid however is dragging (4"+2"=6") and something has really spooked Ziggy (4"+1"=5") who has fallen back to the last place.

One-and-a-half minute mark, Atargatis continues in the lead (5"+6"+4"=15) a very impressive time, almost half-way through the track. In second place, some three lengths behind is Qilin (4"+4"+4"=12") and in equal place is Cupid, having made an amazing burst (4"+2"+6"=12"). One length behind this pack Twilight (3"+5"+3"=11"), and bringing up the rear, but also with a burst of energy is Ziggy (4"+1"+5"=10")

Two-minute mark, traditionally the half-way point of the race. Wait! Is that a giant cat that has entered the grounds? Yes, Manannán mac Lir is rolling in the sun and enjoying the spectacle as well.

Atargatis continues to lead the field at pace (5"+6"+4"+4"=19"), but there has been a burst from Twilight (3"+5"+3"+5"=16") who is making their move into second place! The steady Qilin is now equal-second (4"+4"+4"+4"=16"), followed by Cupid (4"+2"+6"+3"=15"), slowing down a little, and Ziggy has really been distracted by the cat and falls further behind (4"+1"+5"+2"=12"). When you're a feisty runner, you can't make two mistakes like that in a race.

Two-and-a-half-minute mark, almost three-quarters done, and Twilight continues their burst (3"+5"+3"+5"+6"=22"), this is a great recovery, and into first place! Atargatis has really slowed down (5"+6"+4"+4"+2"=21"). A trip on Qilin! The normally steady unicorn-pegasus has slipped and is now in equal third with Cupid (4"+2"+6"+3"+3"=18", and Ziggy brings up the rear (4"+1"+5"+2"+3"=15").

Three-minute mark, and Twilight really has a remarkable pace going now and is pulling clearly ahead (+6"=28"), Atargatis is making a real effort as well (+5"=26"), but is three lengths behind. Qilin is back to a steady gait (+4"=22"), with Cupid (+3"=21") and Ziggy (+4"=19") back in the field.

But it's all Twilight! At the 3.15 mark, that's a new record, Twilight crosses the finish line (+6"=35"). Atargatis will make some twenty seconds late (+5"+2"=33), with Cupid (+4"+5"=36") just getting a nose in front of Qilin (+5"+4"=35"), and finally at the rear is the unfortunate Ziggy (+3"+4"+4"=34")


,

Simon LyallAudiobooks – October 2020

Protocol: The Power of Diplomacy and How to Make It Work for You by Capricia Penavic Marshall

A mix of White House stories and tips about how to enhance your career through skills she has learnt. The stories are the best bit of the book. 3/5

Little Town on the Prairie: Little House Series, Book 7 by Laura Ingalls Wilder

Various incidents with 15 year old Laura now studying to become a school teacher while being courted. The family farm progresses and town grows. 3/5

Bold They Rise: The Space Shuttle Early Years (1972-1986) by David Hitt and Heather R. Smith

Covering up to and including the Challenger Disaster. Largely quotes from astronauts and people involved. Interesting to see how missions quickly became routine. 3/5

The X-15 Rocket Plane: Flying the First Wings into Space by Michelle Evans

A detailed look at the rocketplane programme. Structured around each of the pilots. Covers all the important flights and events. 4/5

The Time Traveller’s Almanac – Part III – Mazes & Traps
by Multiple Authors

Around 18 short Sci-Fi stories about Time, the oldest from 1881. Not all stories strictly time travel. Plenty of hits among the collection. 3/5

My Scoring System

  • 5/5 = Brilliant, top 5 book of the year
  • 4/5 = Above average, strongly recommend
  • 3/5 = Average; in the middle 70% of books I read
  • 2/5 = Disappointing
  • 1/5 = Did not like at all


,

Tim RileyOpen source status update, October 2020

October was the month! I finally got through the remaining tasks standing between me and an Hanami 2.0.0.alpha2 release. Let’s work through the list now!

Application views configured with application inflector

When you subclass Hanami::View inside an Hanami app, it will now use the application’s configured inflector automatically. This is important because hanami-view uses the inflector to determine the class names for your view parts, and it’s just plain table stakes for a framework to apply inflections consistently (especially if you’ve configured custom inflection rules).

The implementation within hanami-view was quite interesting, because it was the first time I had to adjust an ApplicationConfiguration (this one being exposed as config.views on the Hanami::Application subclass) to hide one of its base settings. In this case, it hides the inflector setting because we know it will be configured with the application’s inflector as part of the ApplicationView behaviour (to refresh your memory, ApplicationView is a module that’s mixed in whenever Hanami::View is subclassed within a namespace managed by a full Hanami application).

Ordinarily, I’m all in favour of exposing as many settings as possible, but in this case, it didn’t make sense for a view-specific inflector to be independently configurable right alongside the application inflector itself.

Rest assured, you don’t lose access to this setting entirely, so if you ever have reason to give your views a different inflector, you can go right ahead and directly assign it in a view class:

module Main
  class View < Hanami::View
    # By default, the application inflector is configured

    # But you can also override it:
    config.inflector = MyCustomInflector
  end
end

There was a counterpart hanami PR for this change, and I was quite happy to see it all done, because it means we now have consistent handling of both action and view settings: each gem provides their own ApplicationConfiguration class, which is made accessible via config.actions and config.views respectively. This consistency should make it easier to maintain both of these imported configurations going forward (and, one day, to devise a system for any third party gem to register application-level settings).

Application views have their template configured always

One aspect of the ApplicationView behaviour is to automatically configure a template name on each view class. For example, a Main::Views::Articles::Index would have its template configured as "articles/index".

This is great, but there was a missing piece from the implementation. It assumed that your view hierarchy would always include an abstract base class defined within the application:

module Main
  # Abstract base view
  class View < Hanami::View
  end

  module Views
    module Articles
      # Concrete view
      class Index < View
      end
    end
  end
end

Under this assumption, the base view would never have its template automatically configured. That makes sense in the above arrangement, but if you ever wanted to directly inherit from Hanami::View for a single concrete view (and I can imagine cases where this would make sense), you’d lose the nice template name inference!

With this PR, this limitation is no more: every ApplicationView has a template configured in all circumstances.

Application views are configured with a Part namespace

Keeping with the theme of improving hanami-view integration, another gap I’d noticed was that application views are not automatically configured with a part namespace. This meant another wart if you wanted to use this feature:

require "main/views/parts"

module Main
  class View < Hanami::View
    # Ugh, I have to _type_ all of this out, _by hand?_
    config.part_namespace = Views::Parts
  end
end

Not any more! As of this PR, we now have a config.views.parts_path application-level setting, with a default value of "views/parts". When an ApplicationView is activated, it will take this value, convert it into a module (relative to the view’s application or slice namespace), and assign it as the view’s part_namespace. This would see any view defined in Main having Main::Views::Parts automatically set as its part namespace. Slick!

Security-related default headers restored

Sticking with configuration, but moving over to hanami-controller, Hanami::Action subclasses within an Hanami app (that is, any ApplicationAction) now have these security-related headers configured out of the box:

These are set on the config.actions.default_headers application-level setting, which you can also tweak to suit your requirements.

Previously, these were part of a bespoke one-setting-per-header arrangement in the config.security application-level setting namespace, but I think this new arrangement is both easier to understand and much more maintainable, so I was happy to drop that whole class from hanami as part of rounding out this work.

Automatic cookie support based on configuration

The last change I made to hanami-controller was to move the config.cookies application-level setting, which was defined in the hanami gem, directly into the config.actions namespace, which is defined inside hanami-controller, much closer to the related behaviour.

We now also automatically include the Hanami::Action::Cookies module into any ApplicationAction if cookies are enabled. This removes yet another implementation detail and piece of boilerplate that users would otherwise need to consider when building their actions. I’m really happy with how the ApplicationAction idea is enabling this kind of integration in such a clean way.

Check out the finer details in the PR to hanami-controller and witness the corresponding code removal from hanami itself.

Released a minimal application template

It’s been a while now since I released my original Hanami 2 application template, which still serves as a helpful base for traditional all-in-one web applications.

But this isn’t the only good use for Hanami 2! I think it can serve as a helpful base for any kind of application. When I had a colleague ask me about the viability of Hanami to manage a long-running system service, I wanted to demonstrate how it could look, so I’ve now released an Hanami 2 minimal application template. This one is fully stripped back: nothing webby at all, just a good old lib/ and a bin/app to demonstrate an entry point. I think it really underscores the kind of versatility I want to achieve with Hanami 2. Go check it out!

Gave dry-types a nice require-time performance boost

Last but not least, one evening I was investigating just how many files were required as one of my applications booted. I noticed an unusually high number of concurrent-ruby files being required. Turns out this was an unintended consequence of requiring dry-types. One single-line PR later and now a require "dry/types" will load 242 fewer files!

Savouring this moment

It’s taken quite some doing to get to this moment, where an Hanami 2.0.0.alpha2 release finally feels feasible. As you’d detect from my previous posts, it’s felt tantalisingly close for every one of the last few months. As you’d also detect from this post, the final stretch has involved a lot of focused, fiddly, and let’s face it, not all that exciting work. But these are just the kind of details we need to get right for an excellent framework experience, and I’m glad I could continue for long enough to get these done.

I’m keenly aware that there’ll be much, much more of this kind of work ahead of us, but for the time being, I’m savouring this interstice.

In fact, I’ve even given myself a treat: I’ve already started some early explorations of how we could adapt dry-system to fit with zeitwerk so that we can make reliable autoloading a part of the core Hanami 2 experience. But more on that later ;)

Thank you to my sponsors!

I now have a sponsors page on this here site, which contains a small list of people to whom I am very thankful. I’d really love for you to join their numbers and sustain my open source work.

As for the next month, new horizons await: I’ll start working out some alpha2 release notes (can you believe it’s been nearly 2 years of work?), as well as continuing on the zeitwerk experiment.

See you all again, same place, same time!

,

David RoweSpeech Spectral Quantisation using VQ-VAE

As an exercise to learn more about machine learning, I’ve been experimenting with Vector Quantiser Variational AutoEncoders (VQ VAE) [2]. Sounds scary but is basically embedding a vector quantiser in a Neural Network so they train together. I’ve come up with a simple network that quantises 80ms (8 x 10ms frames) of spectral magnitudes in 88 bits (about 1100 bits/s).

I arrived at my current model through trial and error, using this example [1] as a starting point. Each 10ms frame is a vector of energies from 14 mel-spaced filters, derived from LPCNet [6]. The network uses conv1D stages to downsample and upsample the vectors, with a two stage VQ (11 bits per stage) in the Autoencoder “bottleneck”. The VQ is also encoding total frame energy, so the remaining parameters for a vocoder would be pitch and (maybe) voicing.

This work (spectral quantisation) is applicable to “old school” vocoders like Codec 2 and is also being used with newer Neural Vocoders in some research papers.

I haven’t used it to synthesise any speech yet but it sure does make nice plots. This one is a 2D histogram of the encoder space; the white dots are the stage 1 VQ entries. The 16 dimensional data has been reduced to 2 dimensions using PCA.

If the VQ is working, we should expect more dots in the brighter colour areas, and less in the darker areas.

Here is a sample input (green) output (red) of 8 frames:

This is a transition region, going from voiced to unvoiced speech. It seems to handle it OK. The numbers are (frame_number, SD), where SD is the Spectral Distortion in dB*dB. When we get a high SD frame, quite often it’s not crazy wrong, more an educated guess that will probably sound OK, e.g. a different interpolation profile for the frame energy across a transition. Formants are mostly preserved.

The VQ seems to be doing something sensible, after 20 epochs I can see most VQ entries are being used, and the SD gets better with more bits. The NN part trains much faster than the VQ.

Here is a histogram of the SDs for each frame:

The average SD is around 7.5 dB*dB, similar to some of the Codec 2 quantisers. However this is measured on every 10ms frame in an 8 frame sequence, so it’s a measure of how well it interpolates/decimates in time as well. As I mentioned above – some of the “misses” that push the mean SD higher are inconsequential.
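
For reference, here is how I would compute the SD numbers: a minimal numpy sketch, assuming the input and output spectral vectors are already in dB (hence the dB*dB units) and using synthetic stand-in data.

import numpy as np

def spectral_distortion(frame_in_db, frame_out_db):
    # Mean squared difference across the vector, giving units of dB*dB
    diff = np.asarray(frame_in_db) - np.asarray(frame_out_db)
    return float(np.mean(diff ** 2))

# Synthetic stand-in for an 8 frame x 14 filter-energy sequence
frames_in = np.random.randn(8, 14) * 10
frames_out = frames_in + np.random.randn(8, 14)
sd = [spectral_distortion(a, b) for a, b in zip(frames_in, frames_out)]
print(np.mean(sd))  # around 1 dB*dB for this synthetic data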

Possible Bug in Codec 2 700C

I use similar spectral magnitude vectors for Codec 2 700C [5] – however when I tried that data the SD was about double. Hmmm. I looked into it and found some bugs/weaknesses in my approach for Codec 2 700C (for that codec the spectral magnitudes are dependent on the pitch estimator, which occasionally loses it). So that was a nice outcome – trying to get the same result two different ways can be a pretty useful test.

Further Work

Some ideas for further work:

  1. Use kmeans for training.
  2. Inject bit errors when training to make it robust to channel errors.
  3. Include filtered training material to make it robust to recording conditions.
  4. Integrate into a codec and listen to it.
  5. Try other networks – I’m still learning how to engineer an optimal network.
  6. Make it work with relu activations, I can only get it to work with tanh.

Reading Further

[1] VQ VAE Keras MNIST Example – my starting point for the VQ-VAE work
[2] Neural Discrete Representation Learning
[3] My Github repo for this work
[4] Good introduction to PCA
[5] Codec 2 700C – also uses VQ-ed mel-spaced vectors
[6] LPCNet: DSP-Boosted Neural Speech Synthesis

Linux AustraliaCouncil Meeting Tuesday 3rd November 2020 – Minutes

1. Meeting overview and key information

Present

Sae Ra Germaine

Jonathan Woithe

Julien Goodwin

Russell Stuart

Lisa Sands

Joel Addison

Benno Rice

Apologies

None

 

Meeting opened at 1930 AEDT by Sae Ra and quorum was achieved.

Minutes taken by Julien.

2. Event Review

Drupal

Admin Team

Pycon

LCA 2020

LCA 2021

LCA 2022

3. Log of correspondence

  • From: ASIC; Date: Sun, 25 Oct 2020 19:16:44 +1100; Subject: Renewal: OPEN SOURCE AUSTRALIA
    • MOTION: Russell Stuart moves we pay ASIC AUD$87 to renew “OPEN SOURCE AUSTRALIA” for 3 years.
    • Seconder: Jonathan
    • Outcome: Passed
  • From Google to council@ on 27 Oct 2020. Use of Google App Engine beyond 31 Jan 2021 requires payment information be added to the linked account before 31 Jan 2021. Affected project is “sydney-linux-users-group-hr”.
    • Current SLUG site is hosted on LA infra, we’ll leave it and see if anything breaks.
  • From MailChimp to council@ on 28 Oct 2020. Policies have been updated (Standard Terms of Use, Data Processing Addendum).
    • AI: Sae Ra to close account, was used in migration onto CiviCRM
  • From AgileWare; Subject: New Invoice, due 30/11/2020; Date: Sat, 31 Oct 2020 10:00:27 +1100.  Summary: $330 renewal for 6 months hosting.
    • MOTION: Sae Ra moves LA pays AgileWare AUD$330 for 6 months of web site hosting in advance, and up to AUD$3000 for support renewal.
    • Seconder: Russell
    • Outcome:  Passed

4. Items for discussion

  • None

5. Items for noting

  • Stewart Smith has agreed to be returning officer.
  • Sae Ra needs photo & bio for council members for annual report.
  • Audit need bank statements, some are only just through as of this meeting.
  • Approached by Vala Tech Camp to sponsor for next year.

6. Other business 

  • Call for nominations for Rusty Wrench
    • Announce out by late November, 2-3 week nomination period, close mid-December
    • AI: Julien to update draft

7. In camera

  • No items were discussed in camera

2011 AEDT close

The post Council Meeting Tuesday 3rd November 2020 – Minutes appeared first on Linux Australia.

,

Lev LafayetteContributing To the International HPC Certification Forum

As datasets grow in size and complexity faster than personal computational devices are able to process them, more researchers seek HPC systems as a solution to their computational problems. However, many researchers lack familiarity with the HPC environment and require training. The formal education curriculum has not yet responded sufficiently to this pressure, leaving HPC centres to provide basic training.

One proposed solution to this issue has been the international HPC Certification Forum, established in 2018, and developed from the Performance Conscious HPC (PeCoH) project in 2017 with the Hamburg HPC Competence Center (HHCC), which had the explicit goal of creating the broad standards for an “HPC driving license”. Since its establishment, the Forum has developed a detailed skill-tree across multiple branches (e.g., HPC Knowledge, HPC Use, Performance Engineering etc.) and levels of competencies (basic, intermediate, expert), where very specific skills have particular competencies. In addition, the Forum has developed a summative examination system and a PGP-signed certificate.

Whilst the Forum separates the examination and certification from curriculum development and content delivery, it also requires a feedback mechanism from HPC education providers. Review of learning objectives and specific competencies, development of branches in depth and breadth all contribute to building a community ecosystem for the development of the Forum and its success. The availability of “HPC CF Endorsed Training”, with certifiable content is a clear avenue for HPC centres to contribute to the Forum which will be elaborated in this presentation with examples from current work.

A presentation to eResearchAustralasia 2020; slidedeck and transcript available.

,

Lev LafayetteSpartan: From Experimental Hybrid towards a Petascale Future

Previous presentations to eResearch Australasia described the implementation of Spartan, the University of Melbourne’s general-purpose HPC system. Initially, this system was small but innovative, arguably even experimental. Features included making extensive use of cloud infrastructure for compute nodes, OpenStack for deployment, Ceph for the file system, ROCE for network, Slurm as the workload manager, EasyBuild and LMod, etc.

Based on consideration of job workload and basic principles of sunk, prospective, and opportunity costs, this combination maximised throughput on a low budget, and attracted international attention as a result. Flexibility in design also allowed the introduction of a large LIEF-supported GPGPU partition, the inclusion of older systems from Melbourne Bioinformatics, and departmental contributions. Early design decisions meant that Spartan has been able to provide performance and flexibility, and as a result continues to show high utilisation and job completion (close to 20 million), with overall metrics well within what would be a “top 500” system. The inclusion of an extensive training programme based on andragogical principles has also helped significantly.

Very recently Spartan has undergone some significant architecture modifications, which will be of interest to other institutions. The adoption of the Spectrum Scale file system has further improved scalability, performance, and reliability, along with a move to a pure HPC environment with a significant increase in core count, designed for workload changes and especially queue times. Overall, these new developments in Spartan are designed to be integrated into the University’s Petascale Campus Initiative (PCI).

Presentation to eResearchAustralasia 2020

Slidedeck and transcript available.

David RoweFSK LDPC Data Mode

I’m developing an open source data mode using a FSK modem and powerful LDPC codes. The initial use case is the Open IP over UHF/VHF project, but it’s available in the FreeDV API as a general purpose mode for sending data over radio channels.

It uses 2FSK or 4FSK, has a variety of LDPC codes available, works with bursts or streaming frames, and the sample rate and symbol rate can be set at init time.

The FSK modem has been around for some time, and is used for several applications such as balloon telemetry and FreeDV digital voice modes. Bill, VK5DSP, has recently done some fine work to tightly integrate the LDPC codes with the modem. The FSK_LDPC system has been tested over the air in Octave simulation form, been ported to C, and bundled up into the FreeDV API to make using it straight forward from the command line, C or Python.

We’re not using a “black box” chipset here – this is ground up development of the physical layer using open source, careful simulation, automated testing, and verification of our work on real RF signals. As it’s open source the modem is not buried in proprietary silicon so we can look inside, debug issues and integrate powerful FEC codes. Using a standard RTLSDR with a 6dB noise figure, FSK_LDPC is roughly 10dB ahead of the receiver in a sample chipset. That’s a factor of 10 in power efficiency or bit rate – your choice!

Performance

The performance is pretty close to what is theoretically possible for coded FSK [6]. This is about Eb/No=8dB (2FSK) and Eb/No=6dB (4FSK) for error free transmission of coded data. You can work out what that means for your application using:

  MDS = Eb/No + 10*log10(Rb) + NF - 174
  SNR = Eb/No + 10*log10(Rb/B)

So if you were using 4FSK at 100 bits/s, with a 6dB Noise figure, the Minimum Detectable Signal (MDS) would be:

  MDS = 6 + 10*log10(100) + 6 - 174
      = -142dBm

Given a 3kHz noise bandwidth, the SNR would be:

  SNR = 6 + 10*log10(100/3000)
      = -8.8 dB
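
Here is the worked example as a tiny Python calculator, using the two formulas above (the -174 term is the thermal noise floor in dBm/Hz at room temperature):

import math

def mds_dbm(ebno_db, bit_rate, nf_db):
    # Minimum Detectable Signal from Eb/No, bit rate and receiver noise figure
    return ebno_db + 10 * math.log10(bit_rate) + nf_db - 174

def snr_db(ebno_db, bit_rate, noise_bw_hz):
    # Equivalent SNR in a given noise bandwidth
    return ebno_db + 10 * math.log10(bit_rate / noise_bw_hz)

print(mds_dbm(6, 100, 6))    # -142.0 dBm: 4FSK, 100 bit/s, 6 dB noise figure
print(snr_db(6, 100, 3000))  # about -8.8 dB in a 3 kHz noise bandwidth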

How it Works

Here is the FSK_LDPC frame design:

At the start of a burst we transmit a preamble to allow the modem to synchronise. Only one preamble is transmitted for each data burst, which can contain as many frames as you like. Each frame starts with a 32 bit Unique Word (UW), then the FEC codeword consisting of the data and parity bits. At the end of the data bits, we reserve 16 bits for a CRC.

This figure shows the processing steps for the receive side:

Unique Word Selection

The Unique Word (UW) is a known sequence of bits we use to obtain “frame sync”, or identify the start of the frame. We need this information so we can feed the received symbols into the LDPC decoder in the correct order.

To find the UW we slide it against the incoming bit stream and count the number of errors at each position. If the number of errors is beneath a certain threshold – we declare a valid frame and try to decode it with the LDPC decoder.

Even with pure noise (no signal) a random sequence of bits will occasionally get a partial match (better than our threshold) with the UW. That means the occasional dud frame detection. However if we dial up the threshold too far, we might miss good frames that just happen to have a few too many errors in the UW.

So how do we select the length of the UW and threshold? Well for the last few decades I’ve been guessing. However despite being allergic to probability theory I have recently started using the Binomial Distribution to answer this question.

Let’s say we have a 32 bit UW; let’s plot the Binomial PDF and CDF:


The x-axis is the number of errors. On each graph I’ve plotted two cases:

  1. A 50% Bit Error Rate (BER). This is what we get when no valid signal is present, just random bits from the demodulator.
  2. A 10% bit error rate. This is the worst case where we need to get frame sync – a valid, but low SNR signal. The rate half LDPC codes fall over at about 10% BER.

The CDF tells us “what is the chance of this many or less errors”. We can use it to pick the UW length and thresholds.

In this example, say we select a “valid UW threshold” of 6 bit errors out of 32. Imagine we are sliding the UW over random bits. Looking at the 50% BER CDF curve, we have a probability of 2.6E-4 (0.026%) of getting 6 or fewer errors. Looking at the 10% curve, we have a probability of 0.96 (96%) of detecting a valid frame – or in other words we will miss 100 – 96 = 4% of the valid frames that just happen to have 7 or more errors in the unique word.

So there is a trade off between false detection on random noise, and missing valid frames. A longer UW helps separate the two cases, but adds some overhead – as UW bits don’t carry any payload data. A lower threshold means you are less likely to trigger on noise, but more likely to miss a valid frame that has a few errors in the UW.

Continuing our example, let’s say we try to match the UW on a stream of random bits from off air noise. Because we don’t know where the frame starts, we need to test every single bit position. So at a bit rate of 1000 bits/s we attempt a match 1000 times a second. The probability of a random match in 1000 bits (1 second) is 1000*2.6E-4 = 0.26, or about 1 chance in 4. So every 4 seconds, on average, we will get an accidental UW match on random data. That’s not great, as we don’t want to output garbage frames to higher layers of our system. So a CRC on the decoded data is performed as a final check to determine if the frame is indeed valid.
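
If you would rather not read these numbers off a plot, the same values drop out of a few lines of Python using scipy’s binomial CDF:

from scipy.stats import binom

n_uw, threshold = 32, 6

p_false = binom.cdf(threshold, n_uw, 0.5)   # random bits: about 2.6E-4
p_detect = binom.cdf(threshold, n_uw, 0.1)  # 10% BER signal: about 0.96

bit_rate = 1000
print(p_false, p_detect)
print(p_false * bit_rate)  # about 0.26 false UW matches per second on noise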

Putting it all together

We prototyped the system in GNU Octave first, then ported the individual components to stand alone C programs that we can string together using stdin/stdout pipes:

$ cd codec2/build_linux
$ cd src/
$ ./ldpc_enc /dev/zero - --code H_256_512_4 --testframes 200 |
  ./framer - - 512 5186 | ./fsk_mod 4 8000 5 1000 100 - - |
  ./cohpsk_ch - - -10.5 --Fs 8000  |
  ./fsk_demod --mask 100 -s 4 8000 5 - - |
  ./deframer - - 512 5186  |
  ./ldpc_dec - /dev/null --code H_256_512_4 --testframes
--snip--
Raw   Tbits: 101888 Terr:   8767 BER: 0.086
Coded Tbits:  50944 Terr:    970 BER: 0.019
      Tpkts:    199 Tper:     23 PER: 0.116

The example above runs 4FSK at 5 symbols/second (10 bits/s), at a sample rate of 8000 Hz. It uses a rate 0.5 LDPC code, so the throughput is 5 bit/s and it works down to -24dB SNR (at around 10% PER). This is what it sounds like on a SSB receiver:

Yeah I know. But it’s in there. Trust me.

The command line programs above are great for development, but unwieldy for real world use. So they’ve been combined into single FreeDV API functions. These functions take data bytes, convert them to samples you send through your radio, then at the receiver back to bytes again. Here’s a simple example of sending some text using the FreeDV raw data API test programs:

$ cd codec2/build_linux/src
$ echo 'Hello World                    ' |
  ./freedv_data_raw_tx FSK_LDPC - - 2>/dev/null |
  ./freedv_data_raw_rx FSK_LDPC - - 2>/dev/null |
  hexdump -C
48 65 6c 6c 6f 20 57 6f  72 6c 64 20 20 20 20 20  |Hello World     |
20 20 20 20 20 20 20 20  20 20 20 20 20 20 11 c6  |              ..|

The “2>/dev/null” hides some of the verbose debug information, to make this example quieter. The 0x11c6 at the end is the 16 bit CRC. This particular example uses frames of 32 bytes, so I’ve padded the input data with spaces.

My current radio for real world testing is a Raspberry Pi Tx and RTLSDR Rx, but FSK_LDPC could be used over regular SSB radios (just pipe the audio into and out of your radio with a sound card), or other SDRs. FSK chips could be used as the Tx (although their receivers are often sub-optimal as we shall see). You could even try it on HF, and receive the signal remotely with a KiwiSDR.

I’ve used a HackRF as a Tx for low level testing. After a few days of tuning and tweaking it works as advertised – I’m getting within 1dB of theory when tested over the bench at rates between 500 and 20000 bits/s. In the table below Minimum Detectable Signal (MDS) is defined as 10% PER, measured over 100 packets. I send the packets arranged as 10 “bursts” of 10 packets each, with a gap between bursts. This gives the acquisition a bit of a work out (burst operation is typically tougher than streaming):

Info bit rate (bits/s)  Mode  NF (dB)  Expected MDS (dBm)  Measured MDS (dBm)  Si4464 MDS (dBm)
1000                    4FSK  6        -132                -131                -123
10000                   4FSK  6        -122                -120                -110
5000                    2FSK  6        -123                -123                -113

The Si4464 is used as an example of a chipset implementation. The Rx sensitivity figures were extrapolated from the nearest bit rate on Table 3 of the Si4464 data sheet. It’s hard to compare exactly as the Si4464 doesn’t have FEC. In fact it’s not possible to fully utilise the performance of high performance FEC codes on chipsets as they generally don’t have soft decision outputs.

FSK_LDPC can scale to any bit rate you like. The ratio of the sample rate to symbol rate Fs/Rs = 8000/1000 (8kHz, 1000 bits/s) is the same as Fs/Rs = 800000/100000 (800kHz, 100k bits/s), so it’s the same thing to the modem. I’ve tried FSK_LDPC between 5 and 40k bit/s so far.

With a decent LNA in front of the RTLSDR, I measured MDS figures about 4dB lower at each bit rate. I used a rate 0.5 code for the tests to date, but other codes are available (thanks to Bill and the CML library!).

There are a few improvements I’d like to make. In some tests I’m not seeing the 2dB advantage 4FSK should be delivering. Synchronisation is trickier for 4FSK, as we have 4 tones, and the raw modem operating point is 2dB further down the Eb/No curve than 2FSK. I’d also like to add some GFSK style pulse shaping to make the Tx spectrum cleaner. I’m sure some testing over real world links will also show up a few issues.

It’s fun building, then testing, tuning and pushing through one bug after another to build your very own physical layer! It’s a special sort of magic when the real world results start to approach what the theory says is possible.

Reading Further

[1] Open IP over UHF/VHF Part 1 and Part 2 – my first use case for the FSK_LDPC protocol described in this post.
[2] README_FSK – recently updated documentation on the Codec 2 FSK modem, including lots of examples.
[3] README_data – new documentation on Codec 2 data modes, including the FSK_LDPC mode described in this post.
[4] 4FSK on 25 Microwatts – Bill and I sending 4FSK signals across Adelaide, using an early GNU Octave simulation version of the FSK_LDPC mode described in this post.
[5] Bill’s LowSNR blog.
[6] Coded Modulation Library Overview – CML is a wonderful library that we are using in Codec 2 for our LDPC work. Slide 56 tells us the theoretical minimum Eb/No for coded FSK (about 8dB for 2FSK and 6dB for 4FSK).
[7] 4FSK LLR Estimation Part 2 – GitHub PR used for development of the FSK_LDPC mode.

Linux AustraliaCouncil Meeting Tuesday 20th October 2020 – Minutes

1. Meeting overview and key information

Present

Sae Ra Germaine

Jonathan Woithe

Julien Goodwin

Russell Stuart

Lisa Sands

Joel Addison

Apologies 

Benno Rice

 

Meeting opened at 1930 AEDT by Sae Ra and quorum was achieved.

Minutes taken by Julien.

2. Log of correspondence

  • 1 Oct 2020 to @council: Thanks received from NetThing for LA’s support of their 2020 event.

3. Items for discussion

  • Grant application received from Pauline Clague: Online indigenous women’s possum skin cloak making workshop. Application received on 25 Sep 2020, community consultation closed Fri 9 Oct 2020. To be considered by Council on 20 Oct 2020.
    • MOTION BY Sae Ra That Linux Australia Accepts the Grant Proposal Online indigenous women’s possum skin cloak making workshop submitted by Pauline Clague.
    • Seconded: Julien
    • Motion failed.
    • AI: Jonathan to follow up
  • AGM Discussion
    • Agreement on the 15th.
    • Suggestion for Stewart Smith for returning officer
      • AI: Julien to approach

4. Items for noting

  • LCA 2021 <details redacted>
  • Jonathan talked to the Rockhampton Art Gallery folk about their proposal, resources created will be open, may come back.

5. Other business 

  • AI: Julien to provide redacted minutes for audit
  • AI: Julien to create static copy of planet site so admin team can just switch off

6. In camera

  • No items were discussed in camera

 

Meeting closed at 2000

The post Council Meeting Tuesday 20th October 2020 – Minutes appeared first on Linux Australia.

,

David RoweRTLSDR Strong Signals and Noise Figure

I’ve been exploring the strong and weak signal performance of the RTLSDR. This all came about after Bill and I performed some Over the Air tests using the RTLSDR. We found that if we turned the gain all the way up, lots of birdies and distortion appeared. However if we wind the gain back to improve strong signal performance, the Noise Figure (NF) increases, messing with our modem link budgets.

Fortunately, there’s been a lot of work on the RTLSDR internals from the open source community. So I’ve had a fun couple of weeks drilling down into the RTLSDR drivers and experimenting. It’s been really interesting, and I’ve learnt a lot about the trade offs in SDR. For USD$22, the RTLSDR is a fine teaching/learning tool – and a pretty good radio.

Strong and Weak Signals

Here is a block diagram of the RTLSDR as I understand it:

It’s a superhet, with an IF bandwidth of 2MHz or less. The IF is sampled by an 8-bit ADC that runs at 28 MHz. The down sampling (decimation) from 28MHz to 2MHz provides some “processing gain” which results in a respectable performance. One part I don’t really understand is the tracking BPF, but I gather it’s pretty broad, and has no impact on strong signals a few MHz away.

There are a few ways for strong signals to cause spurious signals to appear:

  1. A -30dBm signal several MHz away will block the LNA/mixer analog stages at the input. For example a pager signal at 148.6MHz while you are listening to 144.5 MHz. This causes a few birdies to gradually appear, and some compression of your wanted signal.
  2. A strong signal a few MHz away can be aliased into your passband, as the IF filter stop band attenuation is not particularly deep.
  3. A -68dBm signal inside the IF bandwidth will overload the ADC, and the whole radio will fall in a heap.

The levels quoted above are for maximum gain (-g 49), and are consistent with [1] and [5]. If you reduce the gain, the overload levels get higher, but so does your noise figure. You can sometimes work around the first two issues, e.g. if the birdies don’t happen to fall right on top of your signal they can be ignored. So the first two effects – while unfortunate – tend to be more benign than ADC overload.

At the weak signal end of the operating range, we are concerned about noise. Here is how I model the various noise contributions:

The idea of a radio is to use the tuner to remove the narrow band unwanted signals, leaving just your wanted signal. However noise tends to be evenly distributed in frequency, so we are stuck with any noise that is located in the bandwidth of our wanted signal. A common technique is to have enough gain before the ADC such that the signal being sampled is large compared to the ADC quantisation noise. That way we can ignore the noise contribution from the ADC.

However with strong signals, we need to back off the gain to prevent overload. Now the ADC noise becomes significant, and the overall NF of the radio increases.

Another way of looking at this – if the gain ahead of the ADC is small, we have a smaller signal hitting the ADC, which will toggle fewer bits of the ADC, resulting in coarse quantisation (more quantisation noise) from the ADC.

The final decimation stage reduces ADC quantisation noise. This figure shows our received signal (a single frequency spike), and the ADC quantisation noise (continuous line, with energy at every frequency):

The noise power is the sum of all the noise in our bandwidth of interest. The decimation filter limits this bandwidth, removing most of the noise except for the small band near our wanted signal (shaded blue area). So the total noise power is summed over a smaller bandwidth, noise power is reduced and our SNR goes up. This means that despite just being 8 bits, the ADC performs reasonably well.
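
To get a feel for the effect of decimation, here is a small Python sketch of my own (it is not part of the RTLSDR driver – the sample rates, tone frequency and amplitude are just assumptions for illustration). It quantises a small tone to 8 bits at the ADC rate, decimates, and compares the signal to quantisation noise ratio before and after:

import numpy as np
from scipy import signal

fs_in, fs_out = 28e6, 2e6              # assumed ADC rate and decimated output rate
decim = int(fs_in // fs_out)           # 14
n = 2 ** 18

t = np.arange(n) / fs_in
x = 0.25 * np.cos(2 * np.pi * 455e3 * t)     # small tone, only a few ADC bits toggled
dither = (np.random.rand(n) - 0.5) / 127     # ~0.5 LSB of dither
xq = np.round((x + dither) * 127) / 127      # 8 bit quantisation
e = xq - x                                   # quantisation (plus dither) noise

pwr = lambda v: np.mean(v ** 2)
print("SNR at ADC output   : %4.1f dB" % (10 * np.log10(pwr(x) / pwr(e))))

# pass the signal and the noise through the same decimating low pass filter
x_dec = signal.decimate(x, decim, ftype="fir")
e_dec = signal.decimate(e, decim, ftype="fir")
print("SNR after decimation: %4.1f dB" % (10 * np.log10(pwr(x_dec) / pwr(e_dec))))
# expect an improvement of roughly 10*log10(28/2), or about 11.5dB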

Off Air Strong Signals

Here is what my spectrum analyser sees when connected to my antenna. It’s 10MHz wide, centred on 147MHz. There are some strong pager signals that bounce around the -30dBm level, plus some weaker 2 metre band Ham signals between -80 and -100dBm.

When I tune to 144.5MHz, that pager signal is outside the IF bandwidth, so I don’t get a complete breakdown of the radio. However it does cause some strong signal compression, and some birdies/aliased signals pop up. Here is a screen shot from gqrx when the pager signal is active:

In this plot, the signal right on 144.5 is my wanted signal, a weak signal I injected on the bench. The hump at 144.7 is an artefact of the pager signal, which due to strong signal compression just happens to appear close to (but not on top of) my wanted signal. The wider hump is the IF filter. Here’s a closer look at the IF filter, set to 100kHz using the gqrx “Bandwidth” field:

To see the IF filter shape I connected a terminated LNA to the input of the RTLSDR. This generates wideband noise that is amplified and can be used to visualise the filter shape.

There has been some really cool recent work exploring the IF filtering capabilities of the R820T2 [2]. I tried a few of the newly available IF filter configurations. My experience was the shape factor isn’t great (the filters roll off slowly), so the IF filters don’t have a huge impact on close-in strong signal performance. They do help with attenuating aliased signals.

It was however a great learning and exploring experience, a real deep dive into SDR.

Gqrx is really cool for this sort of work. For example if you tune a strong signal from a signal generator either side of the IF passband, you can see aliases popping up and zooming across the screen. It’s also possible to link gqrx with different RTLSDR driver libraries, to explore their performance. I use the UDP output feature to send samples to my noise figure measuring tool.

Airspy and RTLSDR

The Airspy uses the same tuner chip so I tested it and found about the same strong signal results, i.e. it overloads at the same signal levels as the RTLSDR at max gain. Guess this is no surprise if it’s the same tuner. I once again [5] measured the Airspy noise figure at about 7dB, slightly higher than the 6dB of the RTLSDR. This is consistent with other measurements [1], but quite a way from Airspy’s quoted figure (3.5dB).

This is a good example of the high sample rate/decimation filter architecture in action – the 8-bit RTLSDR delivers a similar noise figure to the 12-bit Airspy.

But the RTLSDR NF probably doesn’t matter

I’ve spent a few weeks peering into driver code, messing with inter-stage gains, hammering the little RTLSDR with strong RF signals, and optimising noise figures. However it might all be for naught! At both my home and Bill’s, external noise appears to dominate. On the 2M band (144.5MHz) we are measuring noise from our (omni) antennas about 20dB above thermal, e.g. -150dBm/Hz compared to the ideal thermal noise level of -174dBm/Hz. Dreaded EMI from all the high speed electronics in urban environments.

EMI can look a lot like strong signal overload – at high gains all these lines pop up, just like ADC overload. If you reduce the gain a bit they drop down into the noise, although it’s a smooth reduction in level unlike ADC overload which is very abrupt. I guess we are seeing harmonics of switching power supply signals or other nearby digital devices. Rather than an artefact of Rx overload, we are seeing a sensitive receiver detecting weak EMI signals right down near the noise floor.

So this changes the equation – rather than optimising internal NF we need to ensure enough link margin to get over the ambient noise. An external LNA won’t help, and even some loss in the coax run might not matter much, as the SNR is more or less set at the antenna.
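
To put some rough numbers on that, here is a back-of-the-envelope sketch of my own, using the -150dBm/Hz antenna noise figure above and a couple of hypothetical receiver noise figures:

import math

def db_sum(*levels_dbm_hz):
    # power-sum of noise densities expressed in dBm/Hz
    return 10 * math.log10(sum(10 ** (l / 10) for l in levels_dbm_hz))

thermal = -174.0      # dBm/Hz, ideal thermal noise floor
external = -150.0     # dBm/Hz, urban EMI measured at the antenna

for nf in (6.0, 3.0):                 # RTLSDR-like NF versus a hypothetical low NF front end
    internal = thermal + nf           # receiver noise referred to the input
    total = db_sum(internal, external)
    print("NF %3.1f dB: internal %6.1f dBm/Hz, total %6.1f dBm/Hz" % (nf, internal, total))

# both cases land within about 0.1dB of -150dBm/Hz - the external noise sets the
# floor, so halving the receiver NF buys almost nothing at this location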

I’ve settled on the librtlsdr RTLSDR driver/library for my FSK modem experiments as it supports IF filter and inter-stage gain control. I also have a mash up called rtl_fsk.c where I integrate an FSK modem and some powerful LDPC codes, but that’s another story!

Reading Further

[1] Evaluation of SDR Boards V1.0 – A fantastic report on the performance of several SDRs.
[2] RTLSDR: driver extensions – a very interesting set of conference slides discussing recent work with the R820T2 by Hayati Aygun. Many useful links.
[3] librtlsdr fork of rtlsdr driver library that includes support for the IF filter configuration and individual gain stage control discussed in [2].
[4] My fork of librtlsdr – used for NF measurements and rtl_fsk mash up development.
[5] Some Measurements on E4000 and R820T tuners – Detailed look at the tuner by HB9AJG. Fig 5 & 6 bucket curves show overload levels consistent with [1] and my measurements.
[6] Measuring SDR Noise Figure in Real Time

,

Jan Schmidt: Rift CV1 – multi-threaded tracking

This video shows the completion of work to split the tracking code into 3 threads – video capture, fast analysis and long analysis.

If the projected pose of an object doesn’t line up with the LEDs where we expect it to be, the frame is sent off for more expensive analysis in another thread. That way, it doesn’t block tracking of other objects – the fast analysis thread can continue with the next frame.

As a new blob is detected in a video frame, it is assigned an ID, and tracked between frames using motion flow. When the analysis results are available at some point in the future, the ID lets us find blobs that still exist in that most recent video frame. If the blobs are still unknowns in the new frame, the code labels them with the LED ID it found – and then hopefully in the next frame, the fast analysis is locked onto the object again.
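
As a rough illustration of that hand-off, here is a toy Python sketch of my own (it is not the OpenHMD C code, and all the names in it are invented). The blob ID is what lets a slow analysis result, computed against an older frame, still label the right blobs in the newest frame:

UNKNOWN = None

class Blob:
    def __init__(self, blob_id, position, led_id=UNKNOWN):
        self.blob_id = blob_id      # assigned when the blob is first detected
        self.position = position    # updated each frame by motion flow
        self.led_id = led_id        # which LED we currently believe this blob is

def apply_slow_analysis(current_blobs, analysis_result):
    # analysis_result maps blob_id -> led_id, computed from an *older* frame.
    # Motion flow has kept the same blob_id attached to each blob in the
    # meantime, so we can still label the blobs that survive into this frame.
    by_id = {b.blob_id: b for b in current_blobs}
    for blob_id, led_id in analysis_result.items():
        blob = by_id.get(blob_id)
        if blob is not None and blob.led_id is UNKNOWN:
            blob.led_id = led_id    # fast analysis can lock on from the next frame

# example: the long analysis thread identified blob 7 as LED 3
frame = [Blob(7, (120, 80)), Blob(9, (300, 40))]
apply_slow_analysis(frame, {7: 3})
print([(b.blob_id, b.led_id) for b in frame])   # [(7, 3), (9, None)]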

There are some obvious next things to work on:

  • It’s hard to decide what constitutes a ‘good’ pose match, especially around partially visible LEDs at the edges. More experimentation and refinement needed
  • The IMU dead-reckoning between frames is bad – the accelerometer biases, especially for the controllers, tend to make them zoom off very quickly and lose tracking. More filtering, bias extraction and investigation should improve that, and help with staying locked onto fast-moving objects.
  • The code that decides whether an LED is expected to be visible in a given pose could use some improvement.
  • Often the orientation of a device is good, but the position is wrong – a matching mode that only searches for translational matches could be good.
  • Taking the gravity vector of the device into account can help reject invalid poses, as could some tests against plausible location based on human movements limits.

Code is at https://github.com/thaytan/OpenHMD/tree/rift-correspondence-search

,

Tim Riley: Open source status update, September 2020

Well, didn’t September just fly by? Last month I predicted I’d get through the remaining tasks standing in the way of an Hanami 2.0.0.alpha2 release, and while I made some inroads, I didn’t quite get there. At this point I’ve realised that after many consecutive months of really strong productivity on OSS work (which for me right now is done entirely on nights and weekends), a downtick of a couple of months was inevitable.

Anyway, let’s take a look at what I did manage to achieve!

Reintroduced CSRF protection module to hanami-controller

Sometime during the upheaval that was hanami and hanami-controller’s initial rewrite for 2.0.0, we lost the important CSRFProtection module. I’ve brought it back now, this time locating it within hanami-controller instead of hanami, so it can live alongside the action classes that are meant to include it.

For now, you can manually include it in your action classes:

require "hanami/action"
require "hanami/action/csrf_protection"

class MyAction < Hanami::Action
  include Hanami::Action::CSRFProtection
end

And if you need to manually opt out of the protections for any reason, you can implement this method in any one of your action classes:

def verify_csrf_token?(req, res)
  false
end

Either way, I encourage you to check out the code; it’s a simple module and very readable.

Started on automatic enabling of CSRF protection

For a batteries included experience, having to manually include the CSRFProtection module isn’t ideal. So I’m currently working to make it so the module is automatically included when the Hanami application has sessions enabled. This is close to being done already, in this hanami-controller PR and this counterpart hanami PR. I’m also taking this as an opportunity to move all session-related config away from hanami and into hanami-controller, which I think is a more rational location both in terms of end-user understandability and future maintainability.

We’ll see this one fully wrapped up in next month’s update :)

Improving preservation of state in dry/hanami-view context objects

This one was a doozy. It started with my fixing a bug in my site to do with missing page titles, and then realising that it only partially fixed the problem. I wasn’t doing anything particularly strange in my site, just following a pattern of setting page-specific titles in individual templates:

- page_title "Writing"

h1 Writing
  / ... rest of page

And then rendering the title within the layout:

html
  head
    title = page_title

Both of these page_title invocations called a single method on my view context object:

def page_title(new_title = Undefined)
  if new_title == Undefined
    [@page_title, settings.site_title].compact.join(" | ")
  else
    @page_title = new_title
  end
end

Pretty straightforward, right? However, because the context is reinitialized from a base object for each different rendering environment (first the template, and then the layout), that @page_title we set in the template never goes anywhere else, so it’s not available afterwards in the layout.

This baffled me for quite a while, because I’ve written similar content_for-style helpers in context classes and they’ve always worked without a hitch. Well, it turns out I got kinda lucky in those cases, because I was using a hash (instead of a direct instance variable) to hold the provided pieces of content, and since hashes (like most objects in Ruby) are passed by reference, that just so happened to permit the same bits of content to be seen from all view context instances.

Once I made this realisation, I first committed this egregious hack just to get my site properly showing titles again, and then I mulled over a couple of options for properly fixing this inside hanami-view.

One option would be to acknowledge this particular use case and adjust the underlying gem to support it, ensuring that the template context is used to initialize the layout context. This works, and it’s certainly the smallest possible fix, but I think it papers over the fundamental issue here: that the creation of multiple context instances is a low-level implementation detail and should not be something the user needs to think about. I think a user should feel free to set an ivar in a context instance and reasonably expect that it’ll be available at all points of the rendering cycle.

So how do we fix this? The obvious way would be to ensure we create only a single context object, and have it work as required for rendering the both the template and the layout. The challenge here is that we require a different RenderEnvironment for each of those, so the correct partials can be looked up, whether they’re called from within templates, or within part or scope classes. This is why we took the approach of creating those multiple context objects in the first place, so each one could have an appropriate RenderEnvironment provided.

So how do we keep a single context instance but somehow swap around the underlying environment? Well, as a matter of fact, there’s a gem for that. After discovering this bug, I was inspired and stayed up to midnight spiking on an approach that relies upon dry-effects and a reader effect to provide the differing render_environment to a single context object.

(The other effect I felt was the extreme tiredness the next day, I’m not the spritely youth I used to be!)

Anyway, if you haven’t checked out dry-effects, I encourage you to do so: it may help you to discover some novel approaches to certain design challenges. In this case, all we need to do is include the effect module in our context class:

module Hanami
  class View
    class Context
      # Instance methods can now expect a `render_env` to be available
      include Dry::Effects.Reader(:render_env)
    end
  end
end

And ensure we’re wrapping a handler around any code expected to throw the effect:

module Hanami
  class View
    module StandaloneView
      # This provides `with_render_env`, used below
      include Dry::Effects::Handler.Reader(:render_env)

      def call(format: config.default_format, context: config.default_context, **input)
        # ...

        render_env = self.class.render_env(format: format, context: context)
        template_env = render_env.chdir(config.template)

        # Anything including Dry::Effects.Reader(:render_env) will have access to the
        # provided `template_env` inside this handler block
        output = with_render_env(template_env) {
          render_env.template(config.template, template_env.scope(config.scope, locals))
        }

        # ...
      end
    end
  end
end

With this in place, we have a design that allows us to use a single context object for the entirety of the render lifecycle. For the simplicity it brings to the user, I think this is a very worthwhile change, and I plan to spend time assessing it in detail this coming month. As Nikita (the author of dry-effects) points out, there’s a performance aspect to consider: although we’re saving ourselves some object allocations here, we now have to dispatch to the handler every time we throw the reader effect for the render_env. Still, it feels like a very promising direction.

Filed issues arising from production Hanami 2 applications

Over the month at work, we put the finishing touches on two brand new services built with Hanami 2. This helped us to identify a bunch of rough edges that will need addressing before we’re done with the release. I filed them on our public Trello board.

This goes to show how critical it is for frameworks like Hanami to have real-world testing, even at these very early stages of new release development. I’m glad I can also serve in this role, and grateful for the keenness and patience of our teams in working with cutting edge software!

Fixed accidental memoization of dry-configurable setting values

Last but not least, I fixed this bug in dry-configurable that arose from an earlier change I made to have it evaluate settings immediately if a value was provided.

This was a wonderful little bug to fix, and the perfect encapsulation of why I love programming: we started off with two potentially conflicting use cases, represented as two different test cases (one failing), and had to find a way to satisfy them both while still upholding the integrity of the gem’s overall design. I’m really happy with how this one turned out.

🙌 Thanks to my sponsors!

This month I was honoured to have a new sponsor come on board. Thank you Sven Schwyn for your support! If you’d like to give a boost to my open source work, please consider sponsoring me on GitHub.

See you all next month!

Linux Australia: Saying Farewell to Planet Linux Australia

Planet Linux Australia (planet.linux.org.au) was started more than 15 years ago by Michael Davies. In the time since (and particularly before the rise of social media), it has provided a valuable service by encouraging the sharing of information and opinions within our Open Source community. However, due to the many diverse communication options now available over the internet, sites such as Planet Linux Australia are no longer used as heavily as they once were. With many other channels now available, the resources required to maintain Planet Linux Australia are becoming difficult to justify.

With this in mind and following the recommendation of Michael Davies, the Linux Australia Council has decided that it is time to close Planet Linux Australia. Linux Australia would like to express its profound appreciation for the work Michael and others have done to initiate and maintain this service. Our community has greatly benefited from this service over the years.

The post Saying Farewell to Planet Linux Australia appeared first on Linux Australia.

,

Jan Schmidt: Rift CV1 update

This is another in my series of updates on developing positional tracking for the Oculus Rift CV1 in OpenHMD

In the last post I ended with a TODO list. Since then I’ve crossed off a few things from that, and fixed a handful of very important bugs that were messing things up. I took last week off work, which gave me some extra hacking hours and enthusiasm too, and really helped push things forward.

Here’s the updated list:

  • The full model search for re-acquiring lock when we start, or when we lose tracking takes a long time. More work will mean avoiding that expensive path as much as possible.
  • Multiple cameras interfere with each other.
    • Capturing frames from all cameras and analysing them happens on a single thread, and any delay in processing causes USB packets to be missed.
    • I plan to split this into 1 thread per camera doing capture and analysis of the ‘quick’ case with good tracking lock, and a 2nd thread that does the more expensive analysis when it’s needed. Partially Fixed
  • At the moment the full model search also happens on the video capture thread, stalling all video input for hundreds of milliseconds – by which time any fast motion means the devices are no longer where we expect them to be.
    • This means that by the next frame, it has often lost tracking again, requiring a new full search… making it late for the next frame, etc.
    • The latency of position observations after a full model search is not accounted for at all in the current fusion algorithm, leading to incorrect reporting. Partially Fixed
  • More validation is needed on the camera pose transformations. For the controllers, the results are definitely wrong – I suspect because the controller LED models are supplied (in the firmware) in a different orientation to the HMD and I used the HMD as the primary test. Much Improved
  • Need to take the position and orientation of the IMU within each device into account. This information is in the firmware information but ignored right now. Fixed
  • Filtering! This is a big ticket item. The quality of the tracking depends on many pieces – how well the pose of devices is extracted from the computer vision and how quickly, and then very much on how well the information from the device IMU is combined with those observations. I have read so many papers on this topic, and started work on a complex Kalman filter for it.
  • Improve the model to LED matching. I’ve done quite a bit of work on refining the model matching algorithm, and it works very well for the HMD. It struggles more with the controllers, where there are fewer LEDs and the 2 controllers are harder to disambiguate. I have some things to try out for improving that – using the IMU orientation information to disambiguate controllers, and using better models for what size/brightness we expect an LED to be for a given pose.
  • Initial calibration / setup. Rather than assuming the position of the headset when it is first sighted, I’d like to have a room calibration step and a calibration file that remembers the position of the cameras.
  • Detecting when cameras have been moved. When cameras observe the same device simultaneously (or nearly so), it should be possible to detect if cameras are giving inconsistent information and do some correction.
  • hot-plug detection of cameras and re-starting them when they go offline or encounter spurious USB protocol errors. The latter happens often enough to be annoying during testing.
  • Other things I can’t think of right now.

As you can see, a few of the top-level items have been fixed, or mostly so. I split the computer vision for the tracking into several threads:

  • 1 thread shared between all sensors to capture USB packets and assemble them into frames
  • 1 thread per sensor to analyse the frame and update poses

The goal with that initial split was to prevent the processing of multiple sensors from interfering with each other, but I found that it also has a strong benefit even with a single sensor. I realised something in the last week that I probably should have noted earlier: The Rift sensors capture a video frame every 19.2ms, but that frame then takes a full 17ms to deliver across the USB – this means that when everything was in one thread, even with 1 sensor there was only about 2.2ms for the full analysis to take place or else we’d miss a packet of the next frame and have to throw it away. With the analysis now happening in a separate thread and a ping-pong double buffer in place, the analysis can take quite a bit longer without losing any video frames.
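
Here is a toy Python sketch of my own of the ping-pong double buffer idea (it is not the OpenHMD C code – only the 19.2ms frame period and the old 2.2ms analysis budget come from the paragraph above). The capture thread never waits on the analyser, and the analyser always picks up the most recently completed frame, even when its processing takes much longer than 2.2ms:

import threading
import time

class PingPongFrames:
    # two frame buffers: capture writes one while analysis reads the other
    def __init__(self):
        self.bufs = [None, None]
        self.write_idx = 0
        self.seq = 0                      # count of completed frames
        self.cond = threading.Condition()

    def publish(self, frame):
        # capture thread: store the frame and flip buffers, never blocking
        with self.cond:
            self.bufs[self.write_idx] = frame
            self.write_idx ^= 1
            self.seq += 1
            self.cond.notify()

    def wait_newer(self, last_seq):
        # analysis thread: block until a frame newer than last_seq exists
        with self.cond:
            while self.seq <= last_seq:
                self.cond.wait()
            return self.seq, self.bufs[self.write_idx ^ 1]   # newest frame

def capture(pp, frames=5, period=0.0192):        # ~19.2ms per camera frame
    for i in range(frames):
        time.sleep(period)
        pp.publish("frame %d" % i)

def analyse(pp, frames=5, cost=0.010):           # slower than the old 2.2ms budget
    seen = 0
    while seen < frames:
        seen, frame = pp.wait_newer(seen)
        time.sleep(cost)
        print("analysed", frame)

pp = PingPongFrames()
threads = [threading.Thread(target=capture, args=(pp,)),
           threading.Thread(target=analyse, args=(pp,))]
for t in threads: t.start()
for t in threads: t.join()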

I plan to add a 2nd per-sensor thread that will divide the analysis further. The current thread will do only fast pass validation of any existing tracking lock, and will defer any longer term analysis to the other thread. That means that if we have a good lock on the HMD, but can’t (for example) find one of the controllers, searching for the controller will be deferred and the fast pass thread will move onto the next frame and keep tracking lock on the headset.

I fixed some bugs in the calculations that move between frames of reference – converting to/from the global position and orientation in the world to the position and orientation relative to each camera sensor when predicting what the appearance of the LEDs should be. I also added in the IMU offset and orientation of the LED models from the firmware, to make the predictions more accurate when devices move in the time between camera exposures.

Yaw Correction: when a device is observed by a sensor, the orientation is incorporated into what the IMU is measuring. The IMU can sense gravity and knows which way is up or down, but not which way is forward. The observation from the camera now corrects for that yaw drift, to keep things pointing the way you expect them to.

Some other bits:

  • Fixing numerical overflow issues in the OpenHMD maths routines
  • Capturing the IMU orientation and prediction that most closely corresponds to the moment each camera image is recorded, instead of when the camera image finishes transferring to the PC (which is 17ms later)
  • Improving the annotated debug view, to help understand what’s happening in the tracking computer vision steps
  • A 1st order estimate of device velocity to help improve the next predicted position

I posted a longer form video walkthrough of the current code in action, and discussing some of the remaining shortcomings.

As previously, the code is available at https://github.com/thaytan/OpenHMD/tree/rift-correspondence-search

Linux Australia: Council Meeting Tuesday 6th October 2020 – Minutes

1. Meeting overview and key information

Present

Sae Ra Germaine

Jonathan Woithe

Julien Goodwin

Russell Stuart

Lisa Sands

Benno Rice

Apologies

Joel Addison

 

Meeting opened at 1931 AEDT by Sae Ra and quorum was achieved.

Minutes taken by Julien.

2. Event Review

Drupal

Admin Team

Pycon

LCA 2020

LCA 2021

LCA 2022

3. Log of correspondence

  • NSW Government Small Business Survey: reminder received via council@ on 23 Sep 2020. Original message received on 21 Sep 2020.
    • We’re not going to complete this survey.
  • Ampache grant progress report 2: received via council@ on 23 Sep 2020. 
    • Not fully final, but we do expect one more.
  • Grant application received from Pauline Clague: Online indigenous women’s possum skin cloak making workshop. Application received on 25 Sep 2020, community consultation closes Fri 9 Oct 2020. To be considered by Council on 20 Oct 2020.

4. Items for discussion

  • AGM
    • AI: Sae Ra will set up time with Julien to plan AGM, possibly next week
    • Annual report has been started
      • Will need photo / bio from people
      • AI: Julien to get minutes posted

5. Items for noting

  • newCardigan AGM (glam tech group), running using our Zoom account
  • Netthing happened
    • We got praised!
  • Software Freedom Day

6. Other business 

  • None

7. In camera

  • One item was discussed in camera.

2042 AEDT close

The post Council Meeting Tuesday 6th October 2020 – Minutes appeared first on Linux Australia.

,


Dave Hall: DevSecOps

Agile, scrum, kanban, iterate, cloud, continuous integration, continuous delivery, continuous deployment, DevOps, infrastructure as code, X as a service, machine learning, shift left, zero trust … Some days it feels like software development has turned into buzzword bingo. One of the latest additions to the card is DevSecOps. For the last decade organisations have been breaking down the wall between developers and operations. Teams that adopt DevOps culture, practices and tools deliver better solutions faster.

,

Hamish Taylor: Wattlebird feeding

While I hope to update this site again soon, here’s a photo I captured over the weekend in my back yard. The red flowering plant is attracting wattlebirds and honey-eaters. This wattlebird stayed still long enough for me to take this shot. After a little bit of editing, I think it has turned out rather well.

Photo taken with: Canon 7D Mark II & Canon 55-250mm lens.

Edited in Lightroom and Photoshop (to remove a sun glare spot off the eye).

Wattlebird feeding

Gary Pendergast: More than 280 characters

It’s hard to be nuanced in 280 characters.

The Twitter character limit is a major factor of what can make it so much fun to use: you can read, publish, and interact, in extremely short, digestible chunks. But, it doesn’t fit every topic, every time. Sometimes you want to talk about complex topics, having honest, thoughtful discussions. In an environment that encourages hot takes, however, it’s often easier to just avoid having those discussions. I can’t blame people for doing that, either: I find myself taking extended breaks from Twitter, as it can easily become overwhelming.

For me, the exception is Twitter threads.

Twitter threads encourage nuance and creativity.

Creative masterpieces like this Choose Your Own Adventure are not just possible, they rely on Twitter threads being the way they are.

Publishing a short essay about your experiences in your job can bring attention to inequality.

And Tumblr screenshot threads are always fun to read, even when they take a turn for the epic (over 4000 tweets in this thread, and it isn’t slowing down!)

Everyone can think of threads that they’ve loved reading.

My point is, threads are wildly underused on Twitter. I think a big part of that is the UI for writing threads: while it’s suited to writing a thread as a series of related tweet-sized chunks, it doesn’t lend itself to writing, revising, and editing anything more complex.

To help make this easier, I’ve been working on a tool that will help you publish an entire post to Twitter from your WordPress site, as a thread. It takes care of transforming your post into Twitter-friendly content, so you can just… write. 🙂

It doesn’t just handle the tweet embeds from earlier in the thread: it handles uploading and attaching any images and videos you’ve included in your post.

All sorts of embeds work, too. 😉

It’ll be coming in Jetpack 9.0 (due out October 6), but you can try it now in the latest Jetpack Beta! Check it out and tell me what you think. 🙂

This might not fix all of Twitter’s problems, but I hope it’ll help you enjoy reading and writing on Twitter a little more. 💖

,

David Rowe: Playing with PAPR

The average power of a FreeDV signal is surprisingly hard to measure as the parallel carriers produce a waveform that has many peaks and troughs as the various carriers come in and out of phase with each other. Peter, VK3RV has been working on some interesting experiments to measure FreeDV power using calorimeters. His work got me thinking about FreeDV power and in particular ways to improve the Peak to Average Power Ratio (PAPR).

I’ve messed with a simple clipper for FreeDV 700C in the past, but decided to take a more scientific approach and use some simulations to measure the effect of clipping on FreeDV PAPR and BER. As usual, asking a few questions blew up into a several week long project. There were the usual bugs and strange, too-good-to-be-true initial results until I started to get numbers that felt sensible. I’ve tested some of the ideas over the air (blowing up an attenuator along the way), and learnt a lot about PAPR and related subjects like Peak Envelope Power (PEP).

The goal of this work is to explore the effect of a clipper on the average power and ultimately the BER of a received FreeDV signal, given a transmitter with a fixed peak output power.

Clipping to reduce PAPR

In normal operation we adjust our Tx drive so the peaks just trigger the ALC. This sets the average power at Ppeak minus the PAPR (working in dB), for example Pav = 100W PEP – 10dB = 10W average.

The idea of the clipper is to chop the tops off the FreeDV waveform so the PAPR is decreased. We can then increase the Tx drive, and get a higher average power. For example if PAPR is reduced from 10 to 4dB, we get Pav = 100W PEP – 4dB = 40W. That’s 4x the average power output of the 10dB PAPR case – Woohoo!

In the example below the 16 carrier waveform was clipped and the PAPR reduced from 10.5 to 4.5dB. The filtering applied after the clipper smooths out the transitions (and limits the bandwidth to something reasonable).

However it gets complicated. Clipping actually reduces the average power, as we’ve removed the high energy parts of the waveform. It also distorts the signal. Here is a scatter diagram of the signal before and after clipping:


The effect looks like additive noise. Hmmm, and what happens on multipath channels, does the modem perform the same as for AWGN with clipped signals? Another question – how much clipping should we apply?

So I set about writing a simulation (papr_test.m) and doing some experiments to increase my understanding of clippers, PAPR, and OFDM modem performance using typical FreeDV waveforms. I started out trying a few different compression methods such as different compander curves, but found that clipping plus a bandpass filter gives about the same result. So for simplicity I settled on clipping. Throughout this post many graphs are presented in terms of Eb/No – for the purpose of comparison just consider this the same thing as SNR. If the Eb/No goes up by 1dB, so does the SNR.

Here’s a plot of PAPR versus the number of carriers, showing PAPR getting worse with the number of carriers used:

Random data was used for each symbol. As the number of carriers increases, you start to get phases in carriers cancelling due to random alignment, reducing the big peaks. Behaviour with real world data may be different; if there are instances where the phases of all carriers are aligned there may be larger peaks.

To define the amount of clipping I used an estimate of the PDF and CDF:

The PDF (or histogram) shows how likely a certain level is, and the CDF shows the cumulative PDF. High level samples are quite unlikely. The CDF shows us what proportion of samples are above and below a certain level. This CDF shows us that 80% of the samples have a level of less than 4, so only 20% of the samples are above 4. So a clip level of 0.8 means the clipper hard limits at a level of 4, which would affect the top 20% of the samples. A clip value of 0.6 would mean samples with a level of 2.7 and above are clipped.
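
To make the clip level idea concrete, here is a rough Python sketch of my own (it is not the Octave papr_test.m used for the results below – the carrier count, symbol count and clip fraction are just illustrative). It builds a 16 carrier signal with random QPSK symbols, picks the clip threshold from the CDF of the sample magnitudes, and compares the PAPR before and after clipping:

import numpy as np

rng = np.random.default_rng(1)
nc, nsym, m = 16, 100, 8                      # carriers, symbols, oversampling
phases = rng.integers(0, 4, (nsym, nc)) * np.pi / 2 + np.pi / 4
tx = np.fft.ifft(np.exp(1j * phases), nc * m, axis=1).flatten()   # oversampled OFDM

def papr_db(x):
    return 10 * np.log10(np.max(np.abs(x) ** 2) / np.mean(np.abs(x) ** 2))

clip_fraction = 0.8                           # clip the top 20% of samples
threshold = np.quantile(np.abs(tx), clip_fraction)
clipped = np.where(np.abs(tx) > threshold,
                   threshold * tx / np.abs(tx),   # keep the phase, limit the magnitude
                   tx)

print("PAPR before clipping: %4.1f dB" % papr_db(tx))
print("PAPR after  clipping: %4.1f dB" % papr_db(clipped))
# a band pass filter would follow the clipper in practice, smoothing the
# transitions and bringing the PAPR back up a little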

Effect of clipping on BER

Here are a bunch of curves that show the effect of clipping on and AWGN and multipath channel (roughly CCIR poor). A 16 carrier signal was used – typical of FreeDV waveforms. The clipping level and resulting PAPR is shown in the legend. I also threw in a Tx diversity curve – sending each symbol twice on double the carriers. This is the approach used on FreeDV 700C and tends to help a lot on multipath channels.

As we clip the signal more and more, the BER performance gets worse (Eb/No x-axis) – but the PAPR is reduced so we can increase the average power, which improves the BER. I’ve tried to show the combined effect on the (peak Eb/No x-axis) curves, which scales each curve according to its PAPR requirements – effectively shifting each curve by its PAPR, since for a fixed peak power the available average power (and hence Eb/No) is the peak minus the PAPR. This shows the peak power required for a given BER. Lower is better.




Take aways:

  1. The 0.8 and 0.6 clip levels work best on the peak Eb/No scale, ie when we combine effect of the hit on BER performance (bad) and PAPR improvement (good).
  2. There is about 4dB improvement across a range of operating points. This is pretty significant – similar to gains we get from Tx diversity or a good FEC code.
  3. AWGN and Multipath improvements are similar – good. Sometimes you get an algorithm that works well on AWGN but falls in a heap on multipath channels, which are typically much tougher to push bits through.
  4. I also tried 8 carrier waveforms, which produced results about 1dB better, as I guess fewer carriers have a lower PAPR to start with.
  5. Non-linear techniques like clipping spread the energy in frequency.
  6. Filtering to constrain the frequency spread brings the PAPR up again. We can trade off PAPR with bandwidth: lower PAPR, more bandwidth.
  7. Non-linear technqiques will mess with QAM more. So we may hit a wall at high data rates.

Testing on a Real PA

All these simulations are great, but how do they compare with operation on a real HF radio? I designed an experiment to find out.

First, some definitions.

The same FreeDV OFDM signal is represented in different ways as it winds its way through the FreeDV system:

  1. Complex valued samples are used for much of the internal signal processing.
  2. Real valued samples at the interfaces, e.g. for getting samples in and out of a sound card and standard HF radio.
  3. Analog baseband signals, e.g. voltage inside your radio.
  4. Analog RF signals, e.g. at the output of your PA, and input to your receiver terminals.
  5. An electromagnetic wave.

It’s the same signal, as we can convert freely between the representations with no loss of fidelity, but its representation can change the way measures like PAPR work. This caused me some confusion – for example the PAPR of the real signal is about 3dB higher than the complex valued version! I’m still a bit fuzzy on this one, but have satisfied myself that the PAPR of the complex signal is the same as the PAPR of the RF signal – which is what we really care about.
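
A quick numerical check of that +3dB observation, using my own sketch of the 800/1200Hz two tone test signal that appears later in this post (the complex envelope of two tones has a known 3dB PAPR):

import numpy as np

fs = 8000.0
t = np.arange(0, 5, 1 / fs)                   # 5 seconds of samples
w1, w2 = 2 * np.pi * 800, 2 * np.pi * 1200    # the 800/1200 Hz two tone test

complex_tones = np.exp(1j * w1 * t) + np.exp(1j * w2 * t)   # complex envelope
real_tones = np.cos(w1 * t) + np.cos(w2 * t)                # real valued audio

papr = lambda x: 10 * np.log10(np.max(np.abs(x) ** 2) / np.mean(np.abs(x) ** 2))
print("complex two tone PAPR: %3.1f dB" % papr(complex_tones))  # ~3 dB
print("real two tone PAPR   : %3.1f dB" % papr(real_tones))     # ~6 dB, 3dB higher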

Another definition that I had to (re)study was Peak Envelope Power (PEP) – which is the peak power averaged over one or more carrier cycles. This is the RF equivalent to our “peak” in PAPR. When driven by any baseband input signal, it’s the maximum RF power of the radio, averaged over one or more carrier cycles. Signals such as speech and FreeDV waveforms will have occasional peaks that hit the PEP. A baseband sine wave driving the radio would generate a RF signal that sits at the PEP power continuously.

Here is the experimental setup:

The idea is to play canned files through the radio, and measure the average Tx power. It took me several attempts before my experiment gave sensible results. A key improvement was to make the peak power of each sampled signal the same. This means I don’t have to keep messing with the audio drive levels to ensure I have the same peak power. The samples are 16 bits, so I normalised each file such that the peak was at +/- 10000.

Here is the RF power sampler:

It works pretty well on signals from my FT-817 and IC-7200, and will help prevent any more damage to RF test equipment. I used my RF sampler after my first attempt using an SMA barrel attenuator resulted in its destruction when I accidentally put 5W into it! Suddenly it went from 30dB to 42dB attenuation. Oops.

For all the experiments I am tuned to 7.175 MHz and have the FT-817 on its lowest power level of 0.5W.

For my first experiment I played a 1000 Hz sine wave into the system, and measured the average power. I like to start with simple signals, something known that lets me check all the fiddly RF kit is actually working. After a few hours of messing about – I did indeed see 27dBm (0.5W) on my spec-an. So, for a signal with 0dB PAPR, we measure average power = PEP. Check.

In my next experiment, I measured the effect of ALC on TX power. With the FT-817 on its lowest power setting (0.5W), I increased the drive until just before the ALC bars came on. Here is the relationship I found between the number of ALC bars and output power:

Bars | Tx Power (dBm)
0    | 26.7
1    | 26.4
2    | 26.7
3    | 27.0

So the ALC really does clamp the power at the peak value.

On to more complex FreeDV signals.

Measuring the average power of OFDM/parallel tone signals proved much harder on the spec-an. The power bounces around over a period of several seconds as the OFDM waveform evolves, which can derail many power measurement techniques. The time constant, or measurement window, is important – we want to capture the total power over a few seconds and average the value.

After several attempts and lots of head scratching I settled on the following spec-an settings:

  1. 10s sweep time so the RBW filter is averaging a lot of time varying power at each point in the sweep.
  2. 100kHz span.
  3. RBW/VBW of 10 kHz so we capture all of the 1kHz wide OFDM signal in the RBW filter peak when averaging.
  4. Power averaging over 5 samples.

The two-tone signal was included to help me debug my spec-an settings, as it has a known (3dB) PAPR.

Here is a table showing the results for several test signals, all of which have the same peak power:

Sample        | Description                        | PAPR Theory/Sim (dB) | PAPR Measured (dB)
sine1000      | sine wave at 1000 Hz               | 0                    | 0
sine_800_1200 | two tones at 800 and 1200Hz        | 3                    | 4
vanilla       | 700D test frames unclipped         | 7.1                  | 7
clip0.8       | 700D test frames clipped at 0.8    | 3.4                  | 4
ve9qrp        | 700D with real speech payload data | 11                   | 10.5

Click on the file name to listen to a 5 second sample of each signal. The lower PAPR (higher average power) signals sound louder – I guess our ears work on average power too! I kept the drive constant and the PEP/peak just happened to hit 26dBm. It’s not critical, as long as the drive (and hence peak level) is the same across all waveforms tested.

Note the two tone “control” is 1dB off (4dB measured on a known 3dB PAPR signal) – I’m not happy about that. This suggests a spec-an set up issue or limitation of my spec-an (e.g. the way it averages power).

However the other signals line up OK to the simulated values, within about +/- 0.5dB, which suggests I’m on the right track with my simulations.

The modulated 700D test frame signals were generated by the Octave ofdm_tx.m script, which reports the PAPR of the complex signal. The same test frame repeats continuously, which makes BER measurements convenient, but is slightly unrealistic. The PAPR was lower than the ve9qrp signal which has real speech payload data. Perhaps this is because the more random, real world payload data leads to occasional frames where the phases of the carriers align, leading to large peaks.

Another source of discrepancy is the non-flat frequency response of the baseband audio/crystal filter path the signal has to flow through before it emerges as RF.

The zero-span spec-an setting plots power over time, and is very useful for visualising PAPR. The first plot shows the power of our 1000 Hz sine signal (yellow), and the two tone test signal (purple):

You can see how mixing just two signals modulates the power over time, the effect on PAPR, and how the average power is reduced. Next we have the ve9qrp signal (yellow), and our clip 0.8 signal (purple):

It’s clear the clipped signal has a much higher average power. Note the random way the waveform power peaks and dips, as the various carriers come into phase. Note very few high power peaks in the ve9qrp signal – in this sample we don’t have any that hits +26dBm, as they are fairly rare.

I found eye-balling the zero-span plots gave me similar values to non-zero span results in the table above, a good cross check.

Take aways:

  1. Clipping is indeed improving our measured average power, but there are some discrepancies between the measured PAPR values and those estimated from theory/simulation.
  2. Using an SDR to receive the signal and measure PAPR using my own maths might be easier than fiddling with the spec-an and guessing at its internal algorithms.
  3. PAPR is worse for real world signals (e.g. ve9qrp) than my canned test frames due to relatively rare alignments of the carrier phases. This might only happen once every few seconds, but significantly raises the PAPR, and hurts our average power. These occasional peaks might be triggering the ALC, pushing the average power down every time they occur. As they are rare, these peaks can be clipped with no impact on perceived speech quality. This is why I like the CDF/PDF method of setting thresholds, it lets us discard rare (low probability) outliers that might be hurting our average power.

Conclusions and Further work

The simulations suggest we can improve FreeDV by 4dB using the right clipper/filter combination. Initial tests over a real PA show we can indeed reduce PAPR in line with our simulations.

This project has lead me down an interesting rabbit hole that has kept me busy for a few weeks! Just in case I haven’t had enough, some ideas for further work:

  1. Align these clipping levels and filtering to FreeDV 700D (and possibly 2020). There is existing clipper and filter code but the thresholds were set by educated guess several years ago for 700C.
  2. Currently each FreeDV waveform is scaled to have the same average power. This is the signal fed via the sound card to your Tx. Should the levels of each FreeDV waveform be adjusted to be the same peak value instead?
  3. Design an experiment to prove BER performance at a given SNR is improved by 4dB as suggested by these simulations. Currently all we have measured is the average power and PAPR – we haven’t actually verified the expected 4dB increase in performance (suggested by the BER simulations above) which is the real goal.
  4. Try the experiments on a SDR Tx – they tend to get results closer to theory due to no crystal filters/baseband audio filtering.
  5. Try the experiments on a 100WPEP Tx – I have ordered a dummy load to do that relatively safely.
  6. Explore the effect of ALC on FreeDV signals and why we set the signals to “just tickle” the ALC. This is something I don’t really understand, but have just assumed is good practice based on other people’s experiences with parallel tone/OFDM modems and on-air FreeDV use. I can see how ALC would compress the amplitude of the OFDM waveform – which this blog post suggests might be a good thing! Perhaps it does so in an uncontrolled manner – as the curves above show the amount of compression is pretty important. “Just tickling the ALC” guarantees us a linear PA – so we can handle any needed compression/clipping carefully in the DSP.
  7. Explore other ways of reducing PAPR.

To peel away the layers of a complex problem is very satisfying. It always takes me several goes, improvements come as the bugs fall out one by one. Writing these blog posts often makes me sit back and say “huh?”, as I discover things that don’t make sense when I write them up. I guess that’s the review process in action.

Links

Design for an RF Sampler I built – mine has a 46dB loss.

Peak to Average Power Ratio for OFDM – Nice discussion of PAPR for OFDM signals from DSPlog.

,

Glen Turner: Converting MPEG-TS to, well, MPEG

Digital TV uses MPEG Transport Stream, which is a container for video designed for lossy transmission, such as radio. To save CPU cycles, Personal Video Recorders often save the MPEG-TS stream directly to disk. The more usual MPEG is technically MPEG Program Stream, which is designed for lossless transmission, such as storage on a disk.

Since these are container formats, it should be possible to losslessly and quickly re-code from MPEG-TS to MPEG-PS.

ffmpeg -ss "${STARTTIME}" -to "${DURATION}" -i "${FILENAME}" -ignore_unknown -map 0 -map -0:2 -c copy "${FILENAME}.mpeg"


comment count unavailable comments

,

Chris Neugebauer: Talk Notes: Practicality Beats Purity: The Zen Of Python’s Escape Hatch?

I gave the talk Practicality Beats Purity: The Zen of Python’s Escape Hatch as part of PyConline AU 2020, the very online replacement for PyCon AU this year. In that talk, I included a few interesting links and code samples which you may be interested in:

@apply

def apply(transform):

    def __decorator__(using_this):
        return transform(using_this)

    return __decorator__


numbers = [1, 2, 3, 4, 5]

@apply(lambda f: list(map(f, numbers)))
def squares(i):
  return i * i

print(list(squares))

# prints: [1, 4, 9, 16, 25]

Init.java

public class Init {
  public static void main(String[] args) {
    System.out.println("Hello, World!");
  }
}

@switch and @case

__NOT_A_MATCHER__ = object()
__MATCHER_SORT_KEY__ = 0

def switch(cls):

    inst = cls()
    methods = []

    for attr in dir(inst):
        method = getattr(inst, attr)
        matcher = getattr(method, "__matcher__", __NOT_A_MATCHER__)

        if matcher == __NOT_A_MATCHER__:
            continue

        methods.append(method)

    methods.sort(key = lambda i: i.__matcher_sort_key__)

    for method in methods:
        matches = method.__matcher__()
        if matches:
            return method()

    raise ValueError(f"No matcher matches value {test_value}")

def case(matcher):

    def __decorator__(f):
        global __MATCHER_SORT_KEY__

        f.__matcher__ = matcher
        f.__matcher_sort_key__ = __MATCHER_SORT_KEY__
        __MATCHER_SORT_KEY__ += 1
        return f

    return __decorator__



if __name__ == "__main__":
    for i in range(100):

        @switch
        class FizzBuzz:

            @case(lambda: i % 15 == 0)
            def fizzbuzz(self):
                return "fizzbuzz"

            @case(lambda: i % 3 == 0)
            def fizz(self):
                return "fizz"

            @case(lambda: i % 5 == 0)
            def buzz(self):
                return "buzz"

            @case(lambda: True)
            def default(self):
                return "-"

        print(f"{i} {FizzBuzz}")

,

Tim Riley: Open source status update, August 2020

Oh, hello there, has it been another month already? After my bumper month in July, August was a little more subdued (I had to devote more energy towards a work project), but I still managed to get a few nice things done.

Hanami session configuration back in action

In a nice little surprise, I realised that all the building blocks had fallen into place for Hanami’s standard session configuration to begin working again.

So with a couple of lines of config uncommented, Luca’s “soundeck” demo app has working cookie sessions again. Anyone pulling from my Hanami 2 application template will see the same config enabled after this commit, too.

Container auto-registration respects application inflector

Another small config-related change I made was to [pass the Hanami 2 application inflector](https://github.com/hanami/hanami/pull/1069) through to the dry-system container handling component auto-registration.

With this in place, if you configure a custom inflection for your app, e.g.

module MyApp
  class Application < Hanami::Application
    config.inflector do |inflections|
      inflections.acronym "NBA"
    end
  end
end

Then it will be respected when your components are auto-registered, so you can use your custom inflections as part of your module namespacing.

With the setup above, if I had a file called lib/my_app/nba_jam/cheat_codes.rb, the container would rightly expect it to define MyApp::NBAJam::CheatCodes.

I’m delighted to see this in place. Having to deal with awkward namespaces (e.g. SomeApi instead of SomeAPI) purely because the framework wasn’t up to the task of handling it has long been an annoyance to me (these details matter!), and I’m really glad that Hanami 2 will make this a piece of cake.

This outcome is also a testament to the design approach we’ve taken for all the underpinning dry-rb gems. By ensuring important elements like an inflector were represented by a dedicated abstraction - and a configurable one at that - it was so easy for Hanami to provide its own inflector and see it used wherever necessary.

Customisable standard application components

Every Hanami 2 application will come with a few standard components, like a logger, inflector, and your settings. These are made available as registrations in your application container, e.g. Hanami.application["logger"], to make them easy to auto-inject into your other application components as required.

While it was my intention for these standard components to be replaceable by your own custom versions, what we learnt this month is that this was practically impossible! There was just no way to register your own replacements early enough for them to be seen during the application boot process.

After spending a morning trying to get this to work, I decided that this situation was in fact pointing to a missing feature in dry-system. So I went ahead and added support for multiple boot file directories in dry-system. Now you can configure an array of directories on this new bootable_dirs setting:

class MyContainer < Dry::System::Container
  config.bootable_dirs = [
    "config/boot/custom_components",
    "config/boot/standard_components"
  ]
end

When the container locates a bootable component, it will work with these bootable_dirs just like you’d expect your shell to work with its $PATH: it will search the directories, in order, and the first found instance of your component will be used.

With this in place, I updated Hanami to to configure its own bootable_dirs and use its own directory for defining its standard components. The default directory is secondary to the directory specified for the application’s own bootable components, so this means if you want to replace Hanami’s standard logger, you can just create a config/boot/logger.rb and you’ll be golden!

Started rationalising flash

Last month when I was digging into some session-related details of the framework, I realised that the flash we inherited from Hanami 1 was pretty hard to work with. It didn’t seem to behave in the same way we expect a flash to work, e.g. to automatically preserve added messages and make them available to the next request. The code was also too complex. This is a solved problem, so I looked around and started rationalising the Hanami 2 flash system based on code from Roda’s flash plugin. I haven’t had the chance to finish this yet, but it’ll be first cab off the rank in September.

Plans for September

With a concerted effort, I think I could make September the month I knock off all my remaining tasks for a 2.0.0.alpha2 release. It’s been tantalisingly close for a while, but I think it could really happen!

Time to get stuck into it.

🙌 Thanks to my sponsors!

Lastly, my continued thanks to my little posse of GitHub sponsors for your continued support, especially Benjamin Klotz.

I’d really love for you to join the gang. If you care about a healthy, diverse future for Ruby application developers, please consider sponsoring my open source work!

,

Chris Smart: How to create bridges on bonds (with and without VLANs) using NetworkManager

Some production systems you face might make use of bonded network connections that you need to bridge in order to get VMs onto them. That bond may or may not have a native VLAN (in which case you bridge the bond), or it might have VLANs on top (in which case you want to bridge the VLANs), or perhaps you need to do both.

Let’s walk through an example where we have a bond that has a native VLAN, that also has the tagged VLAN 123 on top (and maybe a second VLAN 456), all of which need to be separately bridged. This means we will have the bond (bond0) with a matching bridge (br-bond0), plus a VLAN on the bond (bond0.123) with its matching bridge (br-vlan123). It should look something like this.

+------+   +---------+                           +---------------+
| eth0 |---|         |          +------------+   |  Network one  |
+------+   |         |----------|  br-bond0  |---| (native VLAN) |
           |  bond0  |          +------------+   +---------------+
+------+   |         |                                            
| eth1 |---|         |                                            
+------+   +---------+                           +---------------+
            | |   +---------+   +------------+   |  Network two  |
            | +---| vlan123 |---| br-vlan123 |---| (tagged VLAN) |
            |     +---------+   +------------+   +---------------+
            |                                                     
            |     +---------+   +------------+   +---------------+
            +-----| vlan456 |---| br-vlan456 |---| Network three |
                  +---------+   +------------+   | (tagged VLAN) |
                                                 +---------------+

To make it more complicated, let’s say that the native VLAN on the bond needs a static IP and to operate at an MTU of 1500 while the other uses DHCP and needs MTU of 9000.

OK, so how do we do that?

Start by creating the bridge, then later we create the interface that attaches to that bridge. When creating VLANs, they are created on the bond, but then attached as a slave to the bridge.

Create the bridge for the bond

First, let’s create the bridge for our bond. We’ll export some variables to make scripting easier, including the name, the value for spanning tree protocol (STP) and the MTU. Note that in this example the bridge will have an MTU of 1500 (but the bond itself will be 9000 to support other VLANs at that MTU size).

BRIDGE=br-bond0
BRIDGE_STP=yes
BRIDGE_MTU=1500

OK so let’s create the bridge for the native VLAN on the bond (which doesn’t exist yet).

nmcli con add ifname "${BRIDGE}" type bridge con-name "${BRIDGE}"
nmcli con modify "${BRIDGE}" bridge.stp "${BRIDGE_STP}"
nmcli con modify "${BRIDGE}" 802-3-ethernet.mtu "${BRIDGE_MTU}"

By default this will look for an address with DHCP. If you don’t want that you can either set it manually:

nmcli con modify "${BRIDGE}" ipv4.method static ipv4.address 192.168.0.123/24 ipv6.method ignore

Or disable IP addressing:

nmcli con modify "${BRIDGE}" ipv4.method disabled ipv6.method ignore

Finally, bring up the bridge. Yes, we don’t have anything attached to it yet, but that’s OK.

nmcli con up "${BRIDGE}"

You should be able to see it with nmcli and brctl tools (if available on your distro), although note that there is no device attached to this bridge yet.

nmcli con
brctl show

Next, we create the bond to attach to the bridge.

Create the bond and attach to the bridge

Let’s create the bond. In my example I’m using active-backup (mode 1) but your bond may use balance-rr (round robin, mode 0) or, depending on your switching, perhaps something like link aggregation control protocol (LACP) which is 802.3ad (mode 4).

Let’s say that your bond (we’re going to call bond0) has two interfaces, which are eth0 and eth1 respectively. Note that in this example, although the native interface on this bond wants an MTU of 1500, the VLANs which sit on top of the bond need a higher MTU of 9000. Thus, we set the bridge to 1500 in the previous step, but we need to set the bond and its interfaces to 9000. Let’s export those now to make scripting easier.

BOND=bond0
BOND_SLAVE0=eth0
BOND_SLAVE1=eth1
BOND_MODE=active-backup
BOND_MTU=9000
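
If your switch ports are configured for LACP instead (an assumption about your environment), the only change is the mode you export here; everything else below stays the same.

BOND_MODE=802.3ad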

Now we can go ahead and create the bond, setting the options and the slave devices.

nmcli con add type bond ifname "${BOND}" con-name "${BOND}"
nmcli con modify "${BOND}" bond.options mode="${BOND_MODE}"
nmcli con modify "${BOND}" 802-3-ethernet.mtu "${BOND_MTU}"
nmcli con add type ethernet con-name "${BOND}-slave-${BOND_SLAVE0}" ifname "${BOND_SLAVE0}" master "${BOND}"
nmcli con add type ethernet con-name "${BOND}-slave-${BOND_SLAVE1}" ifname "${BOND_SLAVE1}" master "${BOND}"
nmcli con modify "${BOND}-slave-${BOND_SLAVE0}" 802-3-ethernet.mtu "${BOND_MTU}"
nmcli con modify "${BOND}-slave-${BOND_SLAVE1}" 802-3-ethernet.mtu "${BOND_MTU}"

OK at this point you have a bond specified, great! But now we need to attach it to the bridge, which is what will make the bridge actually work.

nmcli con modify "${BOND}" master "${BRIDGE}" slave-type bridge

Note that before we bring up the bond (or afterwards) we need to disable or delete any existing network connections for the individual interfaces. Check this with nmcli con and delete or disable those connections. Note that this may disconnect you, so make sure you have a console to the machine.
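
For example, assuming the old profiles were called “Wired connection 1” and “Wired connection 2” (hypothetical names – check the output of nmcli con for yours):

nmcli con
nmcli con delete "Wired connection 1"
nmcli con delete "Wired connection 2"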

Now, we can bring the bond up which will also activate our interfaces.

nmcli con up "${BOND}"

We can check that the bond came up OK.

cat /proc/net/bonding/bond0

And this bond should also now be on the network, via the bridge which has an IP set.

Now if you look at the bridge you can see there is an interface (bond0) attached to it (your distro might not have brctl).

nmcli con
ls /sys/class/net/br-bond0/brif/
brctl show

Bridging a VLAN on a bond

Now that we have our bond, we can create the bridges for our tagged VLANs (remember that the bridge connected to the bond carries the native VLAN, so it didn’t need a VLAN interface).

Create the bridge for the VLAN on the bond

Create the new bridge, which in our example is for VLAN 123 and will use an MTU of 9000.

VLAN=123
BOND=bond0
BRIDGE=br-vlan${VLAN}
BRIDGE_STP=yes
BRIDGE_MTU=9000

OK let’s go! (This is the same as the first bridge we created.)

nmcli con add ifname "${BRIDGE}" type bridge con-name "${BRIDGE}"
nmcli con modify "${BRIDGE}" bridge.stp "${BRIDGE_STP}"
nmcli con modify "${BRIDGE}" 802-3-ethernet.mtu "${BRIDGE_MTU}"

Again, this will look for an address with DHCP, so if you don’t want that, then disable it or set an address manually (as per the first example).
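
For example, mirroring the first bridge (the static address here is just a placeholder):

nmcli con modify "${BRIDGE}" ipv4.method disabled ipv6.method ignore

Or:

nmcli con modify "${BRIDGE}" ipv4.method static ipv4.address 192.168.0.124/24 ipv6.method ignore

Then you can bring the device up.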

nmcli con up "${BRIDGE}"

Create the VLAN on the bond and attach to bridge

OK, now we have the bridge, we create the VLAN on top of bond0 and then attach it to the bridge we just created.

nmcli con add type vlan con-name "${BOND}.${VLAN}" ifname "${BOND}.${VLAN}" dev "${BOND}" id "${VLAN}"
nmcli con modify "${BOND}.${VLAN}" master "${BRIDGE}" slave-type bridge
nmcli con modify "${BOND}.${VLAN}" 802-3-ethernet.mtu "${BRIDGE_MTU}"

If you look at bridges now, you should see the one you just created, attached to a VLAN device (note, your distro might not have brctl).

nmcli con
brctl show

And that’s about it! Now you can attach VMs to those bridges and have them on those networks. Repeat the process for any other VLANs you need on top of the bond, as sketched below for VLAN 456.
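
For example, a condensed sketch for VLAN 456 from the diagram above, assuming it should behave like VLAN 123 (DHCP and an MTU of 9000):

VLAN=456
BRIDGE=br-vlan${VLAN}
nmcli con add ifname "${BRIDGE}" type bridge con-name "${BRIDGE}"
nmcli con modify "${BRIDGE}" bridge.stp yes
nmcli con modify "${BRIDGE}" 802-3-ethernet.mtu 9000
nmcli con up "${BRIDGE}"
nmcli con add type vlan con-name "${BOND}.${VLAN}" ifname "${BOND}.${VLAN}" dev "${BOND}" id "${VLAN}"
nmcli con modify "${BOND}.${VLAN}" master "${BRIDGE}" slave-type bridge
nmcli con modify "${BOND}.${VLAN}" 802-3-ethernet.mtu 9000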

,

Ben MartinSmall 1/4 inch socket set into a nicer walnut tray

I was recently thinking about how I could make a selection of 1/4 inch drive bits easier to use. It seems I am not alone in the crowd of people who leave the bits in the case they came in; some folks do that for many decades. Apart from being trapped into what "was in the set", this also creates an issue when you have some 1/4 inch parts in a case that includes many more 3/8 inch drive bits. I originally marked the smaller drive parts and thought about leaving them in the blow molded case, as is commonly done.

The CNC fiend in me eventually got the better of me and the below is the result. I cut a prototype in pine first, knowing that the chances of getting it all as I wanted on the first try were not impossible, but not probable either. Version 1 is shown below.

 

The advantage is that now I have the design in Fusion 360, I can cut this design in about an hour. So if I want to add a bunch of deep sockets to the set I can do that for the time cost mostly of gluing up a panel, fixturing it and a little sanding and shellac. Not a trivial endeavour, but the result I think justifies the means.

Below is the board still fixtured in the cnc machine. I think I will make a jig with some sliding toggle clamps so I can fix panels to the jig and then bolt the jig into the cnc instead of directly using hold down clamps.

I plan to use a bandsaw to cut a profile around the tools and may end up with some handle(s) on the tray. That part is something I have to think more about. Thinking about how I want the tools to be stored and accessed is an interesting side project.



 

 

,

Chris SmartHow to create Linux bridges and Open vSwitch bridges with NetworkManager

My virtual infrastructure Ansible role supports connecting VMs to both Linux and Open vSwitch bridges, but they must already exist on the KVM host.

Here is how to convert an existing Ethernet device into a bridge. Be careful if doing this on a remote machine with only one connection! Make sure you have some other way to log in (e.g. console), or maybe add additional interfaces instead.

Export interfaces and existing connections

First, export the device you want to convert so we can easily reference it later (e.g. eth1).

export NET_DEV="eth1"

Now list the current NetworkManager connections for your device exported above, so we know what to disable later.

sudo nmcli con |egrep -w "${NET_DEV}"

This might be something like System eth1 or Wired connection 1, let’s export it too for later reference.

export NM_NAME="Wired connection 1"

Create a Linux bridge

Here is an example of creating a persistent Linux bridge with NetworkManager. It will take a device such as eth1 (substitute as appropriate) and convert it into a bridge. Note that we will be specifically giving it the device name of br0 as that’s the standard convention and what things like libvirt will look for.

Make sure you have exported your device as NET_DEV and its existing NetworkManager connection name as NM_NAME from above, you will use them below.

sudo nmcli con add ifname br0 type bridge con-name br0
sudo nmcli con add type bridge-slave ifname "${NET_DEV}" master br0 con-name br0-slave-"${NET_DEV}"

Note that br0 probably has a different MAC address to your physical interface. If so, make sure you update any DHCP reservations (or be able to find the new IP once the bridge is brought up).

sudo ip link show dev br0
sudo ip link show dev "${NET_DEV}"

Configure the bridge

As mentioned above, by default the Linux bridge will get an address via DHCP. If you don’t want it to be on the network (you might have another dedicated interface) then disable DHCP on it.

sudo nmcli con modify br0 ipv4.method disabled ipv6.method disabled

Or, if you need set a static IP you can do that too.

sudo nmcli con modify br0 ipv4.method static ipv4.address 192.168.123.100/24

If you need to set a specific MTU like 9000 (defaults to 1500), you can do that.

sudo nmcli con modify br0-slave-"${NET_DEV}" 802-3-ethernet.mtu 9000

Finally, spanning tree protocol is on by default, so disable it if you need to.

sudo nmcli con modify br0 bridge.stp no

Bring up the bridge

Now you can either simply reboot, or stop the current interface and bring up the bridge (do it in one command in case you’re using the one interface, else you’ll get disconnected). Note that your IP might change once the bridge comes up, if you didn’t check the MAC address and update any static DHCP leases.

sudo nmcli con down "${NM_NAME}" ; \
sudo nmcli con up br0
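
To check that the physical interface ended up attached to the bridge (your distro might not have brctl):

sudo nmcli con
ls /sys/class/net/br0/brif/
sudo brctl show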

Create an Open vSwitch (OVS) bridge

OVS bridges are often used for plumbing into libvirt for use with VLANs.

We can create an OVS bridge which will consist of the bridge itself and multiple ports and interfaces which connect everything together, including the physical device itself (so we can talk on the network) and virtual ports for VLANs and VMs. By default the physical port on the bridge will use the untagged (native) VLAN, but if all your traffic needs to be tagged then we can add a tagged interface.

Here is an example of creating a persistent OVS bridge with NetworkManager. It will take a device such as eth1 (substitute as appropriate) and convert it into an ovs-bridge.

Install dependencies

You will need openvswitch installed as well as the OVS NetworkManager plugin.

sudo dnf install -y NetworkManager-ovs openvswitch
sudo systemctl enable --now openvswitch
sudo systemctl restart NetworkManager

Create the bridge

Let’s create the bridge, its port and interface with these three commands.

sudo nmcli con add type ovs-bridge conn.interface ovs-bridge con-name ovs-bridge
sudo nmcli con add type ovs-port conn.interface port-ovs-bridge master ovs-bridge con-name ovs-bridge-port
sudo nmcli con add type ovs-interface slave-type ovs-port conn.interface ovs-bridge master ovs-bridge-port con-name ovs-bridge-int

Patch in our physical interface

Next, create another port on the bridge and patch in our physical device as an Ethernet interface so that real traffic can flow across the network. Make sure you have exported your device as NET_DEV and its existing NetworkManager connection name as NM_NAME from above, you will use them below.

sudo nmcli con add type ovs-port conn.interface ovs-port-eth master ovs-bridge con-name ovs-port-eth
sudo nmcli con add type ethernet conn.interface "${NET_DEV}" master ovs-port-eth con-name ovs-port-eth-int

OK now you should have an OVS bridge configured and patched to your local network via your Ethernet device, but not yet active.

Configure the bridge

By default the OVS bridge will be sending untagged traffic and requesting an IP address for ovs-bridge via DHCP. If you don’t want it to be on the network (you might have another dedicated interface) then disable DHCP on the interface.

sudo nmcli con modify ovs-bridge-int ipv4.method disabled ipv6.method disabled

Or if you need to set a static IP you can do that too.

sudo nmcli con modify ovs-bridge-int ipv4.method static ipv4.address 192.168.123.100/24

If you need to set a specific MTU like 9000 (defaults to 1500), you can do that.

sudo nmcli con modify ovs-bridge-int 802-3-ethernet.mtu 9000
sudo nmcli con modify ovs-port-eth-int 802-3-ethernet.mtu 9000

Bring up the bridge

Before you bring up the bridge, note that ovs-bridge will probably have a MAC address which is different to your physical interface. Keep that in mind if you manage DHCP static leases, and make sure you can find the new IP so that you can log back in once the bridge is brought up.

Now you can either simply reboot, or stop the current interface and bring up the bridge and its interfaces (in theory we just need to bring up ovs-port-eth-int, but let’s make sure and do it in one command in case you’re using the one interface, else you’ll get disconnected and not be able to log back in). Note that your MAC address may change here, so if you’re using DHCP you’ll get a new IP and your session will freeze, so be sure you can find the new IP so you can log back in.

sudo nmcli con down "${NM_NAME}" ; \
sudo nmcli con up ovs-port-eth-int ; \
sudo nmcli con up ovs-bridge-int

Now you have a working Open vSwitch implementation!

Create OVS VLAN ports

From there you might want to create some port groups for specific VLANs. For example, if your network does not have a native VLAN, you will need to create a VLAN interface on the OVS bridge to get onto the network.

Let’s create a new port and interface for VLAN 123 which will use DHCP by default to get an address and bring it up.

sudo nmcli con add type ovs-port conn.interface vlan123 master ovs-bridge ovs-port.tag 123 con-name ovs-port-vlan123
sudo nmcli con add type ovs-interface slave-type ovs-port conn.interface vlan123 master ovs-port-vlan123 con-name ovs-int-vlan123
sudo nmcli con up ovs-int-vlan123

If you need to set a static address on the VLAN interface instead, you can do so by modifying the interface.

sudo nmcli con modify ovs-int-vlan123 ipv4.method static ipv4.address 192.168.123.100/24

View the OVS configuration

Show the switch config and bridge with OVS tools.

sudo ovs-vsctl show

Clean up old interface profile

It’s not really necessary, but you can disable the current NetworkManager config for the device so that it doesn’t conflict with the bridge, if you want to.

sudo nmcli con modify "${NM_NAME}" ipv4.method disabled ipv6.method disabled

Or you can even delete the old interface’s NetworkManager configuration if you want to (but it’s not necessary).

sudo nmcli con delete "${NM_NAME}"

That’s it!

,

Adrian ChaddRFI from crappy electronics, or "how's this sold in the US?"

I picked up a cheap charging cable for my Baofeng UV-9S. (https://www.amazon.com/gp/product/B07TSDSQ4Z/). It .. well, it works.

But it messes up operating my radios! I heard super strong interference on my HF receiver and my VHF receivers.

So, let's take a look. I set up a little antenna in my shack. The Baofeng was about 6ft away.



Here's DC to 120MHz. Those peaks to the right? Broadcast FM. The marker is at 28.5MHz.

Ok, let's plug in the baofeng and charger.

Ok, look at that noise. Ugh. That's unfun.

What about VHF? Let's look at that. 100-300MHz.

Ok, that's expected too. I think that's digital TV or something in there. Ok, now, let's plug in the charger, without it charging..


Whaaaaaaaaattttttt oh wait. Yeah, this is likely an unshielded buck converter and it's unloaded. Ok, let's load it up.


Whaaaaaa oh ok. Well that explains everything.

Let's pull it open:



Yup. A buck converter going from 5v to 9v; no shielding, no shielded power cable and no ground plane on the PCB. This is just amazing. The 3ft charge cable is basically an antenna. "Unintentional radiator" indeed.

So - even with a ferrite on the cable, it isn't quiet.


It's quiet at 28MHz now so I can operate on the 10m band with it charging, but this doesn't help at all at VHF.

Ew.




,

Dave HallIf You're not Using YAML for CloudFormation Templates, You're Doing it Wrong

Learn why you should be using YAML in your CloudFormation templates.

,

Adrian ChaddFixing up ath_rate_sample to actually work well with 11n

Way back in 2011 when I was working on FreeBSD's Atheros 802.11n support I needed to go and teach some rate control code about 802.11n MCS rates. (As a side note, the other FreeBSD wifi hackers and I at the time taught wlan_amrr - the AMRR rate control in net80211 - about basic MCS support too, and fixing that will be the subject of a later post.)

The initial hacks I did to ath_rate_sample made it kind of do MCS rates OK, but it certainly wasn't great. To understand why then and what I've done now, it's best to go for a little trip down journey lane - the initial sample rate control algorithm by John Bicket. You can find a copy of the paper he wrote here - https://pdos.csail.mit.edu/papers/jbicket-ms.pdf .

Now, sample didn't try to optimise maximum throughput. Instead, it attempts to optimise for minimum airtime to get the job done, and also attempted to minimise the time spent sampling rates that had a low probability of working. Note this was all done circa 2005 - at the time the other popular rate control methods tried to maintain the highest PHY rate that met some basic success rate (eg packet loss, bit error rate, etc, etc.) The initial implementation in FreeBSD also included multiple packet size bins - 250 and 1600 bytes - to allow rate selection based on packet length.

However, it made some assumptions about rates that don't quite hold in the 802.11n MCS world. Notably, it didn't take the PHY bitrate into account when comparing rates. It mostly assumed that going up in rate code - except between CCK and OFDM rates - meant it was faster. Now, this is true for 11b, 11g and 11a rates - again except when you transition between 11b and 11g rates - but this definitely doesn't hold true in the 802.11n MCS rate world. Yes, between MCS0 to MCS7 the PHY bitrate goes up, but then MCS8 is MCS0 times two streams, and MCS16 is MCS0 times three streams.

So my 2011/2012 work just did the minimum hacks to choose /some/ MCS rates. It didn't take the length of aggregates into account; it just used the length of the first packet in the aggregate. Very suboptimal, but it got MCS rates going.

Now fast-forward to 2020. This works fine if you're close to the other end, but it's very terrible if you're at the fringes of acceptable behaviour. My access points at home are not well located and thus I'm reproducing this behaviour very often - so I decided to fix it.

First up - packet length.  I had to do some work to figure out how much data was in the transmit queue for a given node and TID. (Think "QoS category.") The amount of data in the queue wasn't good enough - chances are we couldn't transmit all of it because of 802.11 state (block-ack window, management traffic, sleep state, etc.) So I needed a quick way to query the amount of traffic in the queue taking into account 802.11 state. That .. ended up being a walk of each packet in the software queue for that node/TID list until we hit our limit, but for now that'll do.

So then I can call ath_rate_lookup() to get a rate control schedule knowing how long a packet may be. But depending up on the rate it returns, the amount of data that may be transmitted could be less - there's a 4ms limit on 802.11n aggregates, so at lower MCS rates you end up only sending much smaller frames (like 3KB at the slowest rate.) So I needed a way to return how many bytes to form an aggregate for as well as the rate. That informed the A-MPDU formation routine how much data it could queue in the aggregate for the given rate.

I also stored that away to use when completing the transmit, just to line things up OK.

Ok, so now I'm able to make rate control decisions based on how much data needs to be sent. ath_rate_sample still only worked with 250 and 1600 byte packets. So, I extended that out to 65536 bytes in mostly-powers-of-two values.  This worked pretty well right out of the box, but the rate control process was still making pretty trash decisions.

The next bit is all "statistics". The decisions that ath_rate_sample makes depend upon accurate estimations of how long packet transmissions took. I found that a lot of the logic was drastically over-compensating for failures by accounting a LOT more time for failures at each attempted rate, rather than only accounting how much time failed at that rate. Here's two examples:
  • If a rate failed, then all the other rates would get failure accounted for the whole length of the transmission to that point. I changed it to only account for failures for that rate - so if three out of four rates failed, each failed rate would only get their individual time accounted to that rate, rather than everything.
  • Short (RTS/CTS) and long (no-ACK) retries were being accounted incorrectly. If 10 short retries occurred, then the maximum failed transmission for that rate can't be 10 times the "it happened" long retry style packet accounting. It's a short retry; the only thing that could differ is the rate that RTS/CTS is being exchanged at. Penalising rates because of bursts of short failures was incorrect and I changed that accounting.
There are a few more, but you can look at the change log / change history for sys/dev/ath/ath_rate/sample/ to see.

By and large, I pretty accurately nailed making sure that failed transmit rates account for THEIR failures, not the failures of other rates in the schedule. It was super important for MCS rates because mis-accounting failures across the 24-odd rates you can choose in 3-stream transmit can have pretty disastrous effects on throughput - channel conditions change super frequently and you don't want to penalise things for far, far too long and have it take a lot of subsequent successful samples just to try using that rate again.

So that was the statistics side done.

Next up - choices.

Choices was a bit less problematic to fix. My earlier hacks mostly just made it possible to choose MCS rates but it didn't really take into account their behaviour. When you're doing 11a/11g OFDM rates, you know that you go in lock-step from 6, 12, 18, 24, 36, 48, 54MB, and if a rate starts failing the higher rate will likely also fail. However, MCS rates are different - the difference between MCS0 (1/2 BPSK, 1 stream) and MCS8 (1/2 BPSK, 2 streams) is only a couple dB of extra required signal strength. So given a rate, you want to sample at MCS rates around it but also ACROSS streams. So I mostly had to make sure that if I was at say MCS3, I'd also test MCS2 and MCS4, but I'd also test MCS10/11/12 (the 2-stream versions of MCS2/3/4) and maybe MCS18/19/20 for 3-stream. I also shouldn't really bother testing too high up the MCS chain if I'm at a lower MCS rate - there's no guarantee that MCS7 is going to work (5/6 QAM64 - fast but needs a pretty clean channel) if I'm doing ok at MCS2. So, I just went to make sure that the sampling logic wouldn't try all the MCS rates when operating at a given MCS rate. It works pretty well - sampling will try a couple MCS rates either side to see if the average transmit time for that rate is higher or lower, and then it'll bump it up or down to minimise said average transmit time.

However, the one gotcha - packet loss and A-MPDU.

ath_rate_sample was based on single frames, not aggregates. So the concept of average transmit time assumed that the data either got there or it didn't. But, with 802.11n A-MPDU aggregation we can have the higher rates succeed at transmitting SOMETHING - meaning that the average transmit time and long retry failure counts look great - but most of the frames in the A-MPDU are dropped. That means low throughput and more actual airtime being used.

When I did this initial work in 2011/2012 I noted this, so I kept an EWMA of the packet loss both of single frames and aggregates. I wouldn't choose higher rates whose EWMA was outside of a couple percent of the current best rate. It didn't matter how good it looked at the long retry view - if only 5% of sub-frames were ACKed, I needed a quick way to dismiss that. The EWMA logic worked pretty well there and only needed a bit of tweaking.


A few things stand out after testing:

  • For shorter packets, it doesn't matter if it chooses the one, two or three stream rate; the bulk of the airtime is overhead and not data. Ie, the difference between MCS4, MCS12 and MCS20 is any extra training symbols for 2/3 stream rates and a few dB extra signal strength required. So, typically it will alternate between them as they all behave roughly the same.
  • For longer packets, the bulk of the airtime starts becoming data, so it begins to choose rates that are obviously providing lower airtime and higher packet success EWMA. MCS12 is the choice for up to 4096 byte aggregates; the higher rates start rapidly dropping off in EWMA. This could be due to a variety of things, but importantly it's optimising things pretty well.
There's a bunch of future work to tidy this all up some more but it can wait.

Adrian ChaddI'm back into the grind of FreeBSD's wireless stack and 802.11ac

hi!

Yes, it's been a while since I posted here and yes, it's been a while since I was actively working on FreeBSD's wireless stack. Life's been .. well, life. I started the ath10k port in 2015. I wasn't expecting it to take 5 years, but here we are. My life has changed quite a lot since 2015 and a lot of the things I was doing in 2015 just stopped being fun for a while.

But the stars have aligned and it's fun again, so here I am.

Here's where things are right now.

First up - if_run. This is the Ralink (now Mediatek) 11abgn USB driver for stuff that they made before Mediatek acquired them. A contributor named Ashish Gupta showed up on the #freebsd-wifi IRC channel on efnet to start working on 11n support for if_run and he got it to the point where the basics worked - and I took it and ran with it enough to land 20MHz 11n support. It turns out I had a couple of suitable NICs to test with and, well, it just happened. I'm super happy Ashish came along to get 11n working on another NIC.

The if_run TODO list (which anyone is welcome to contribute to):

  • Ashish is looking at 40MHz wide channel support right now;
  • Short and long-GI support would be good to have;
  • we need to get 11n TX aggregation working via the firmware interface - it looks like the Linux driver has all the bits we need and it doesn't need retransmission support in net80211. The firmware will do it all if we set up the descriptors correctly.

net80211 work


Next up - net80211. So, net80211 has basic 11ac bits, even if people think it's not there. It doesn't know about MU-MIMO streams yet but it'll be a basic 11ac AP and STA if the driver and regulatory domain support it.

However, as I implement more of the ath10k port, I find more and more missing bits that really need to be in net80211.

A-MPDU / A-MSDU de-encapsulation


The hardware does A-MPDU and A-MSDU de-encapsulation in hardware/firmware, pushing up individual decrypted and de-encapsulated frames to the driver. It supports native wifi and 802.3 (ethernet) encapsulation, and right now we only support native wifi. (Note - net80211 supports 802.3 as well; I'll try to get that going once the driver lands.)

I added support to handle decryption offload with the ath10k supplied A-MPDU/A-MSDU frames (where there's no PN/MIC at all, it's all done in firmware/hardware!) so we could get SOME traffic. However, receive throughput just plainly sucked when I last poked at this. I also added A-MSDU offload support where we wouldn't drop the A-MSDU frames with the same receive 802.11 sequence number. However...

It turns out that my mac was doing A-MSDU in A-MPDU in 11ac, and the net80211 receive A-MPDU reordering was faithfully dropping all A-MSDU frames with the same receive 802.11 sequence number. So TCP would just see massive packet loss and drop the throughput in a huge way. Implementing this feature requires buffering all A-MSDU frames in an A-MPDU sub-frame in the reordering queue rather than tossing them, and then reordering them as if they were a single frame.

So I modified the receive reordering logic to reorder queues of mbufs instead of mbufs, and patched things to allow queuing multiple mbufs as long as they were appropriately stamped as being A-MSDUs in a single A-MPDU subframe .. and now the receive traffic rate is where it should be (> 300mbit UDP/TCP.) Phew.


U-APSD support


I didn't want to implement full U-APSD support in the Atheros 11abgn driver because it requires a lot of driver work to get it right, but the actual U-APSD negotiation support in net80211 is significantly easier. If the NIC supports U-APSD offload (like ath10k does) then I just have to populate the WME QoS fields appropriately and call into the driver to notify them about U-APSD changes.

Right now net80211 doesn't support the ADD-TS / DEL-TS methods for clients requesting explicit QoS requirements.

Migrating more options to per-VAP state


There is a bunch of net80211 state which was still global rather than per-VAP. It makes sense in the old world - NICs that do things in the driver or net80211 side are driven in software, not in firmware, so things like "the current channel", "short/long preamble", etc are global state. However the later NICs that offload various things into firmware can now begin to do interesting things like background channel switching for scan, background channel switching between STA and P2P-AP / P2P-STA. So a lot of state should be kept per-VAP rather than globally so the "right" flags and IEs are set for a given VAP.

I've started migrating this state into per-VAP fields rather than global, but it showed a second shortcoming - because it was global, we weren't explicitly tracking these things per-channel. Ok, this needs a bit more explanation.

Say you're on a 2GHz channel and you need to determine whether you care about 11n, 11g or 11b clients. If you're only seeing and servicing 11n clients then you should be using the short slot time, short preamble and not require RTS/CTS protection to interoperate with pre-11n clients.

But then an 11g client shows up.

The 11g client doesn't need to interoperate with 11b, only 11n - so it doesn't need RTS/CTS. It can use short preamble and short slot time still. But the 11n client needs to interoperate, so it needs to switch protection mode into legacy - and it will do RTS/CTS protection.

But then, an 11b client shows up.

At this point the 11g protection kicks in; everyone does RTS/CTS protection and long preamble/slot time kicks in.

Now - is this a property of a VAP, or of a channel? Technically speaking, it's the property of a channel. If any VAP on that channel sees an 11b or 11g client, ALL VAPs need to transition to update protection mode.

I migrated all of this to be per-VAP, but I kept the global state for literally all the drivers that currently consume it. The ath10k driver now uses the per-VAP state for the above, greatly simplifying things (and finishing TODO items in the driver!)


ath10k changes


And yes, I've been hacking on ath10k too.

Locking issues


I've had a bunch of feedback and pull requests from Bjorn and Geramy pointing out lock ordering / deadlock issues in ath10k. I'm slowly working through them; the straight conversion from Linux to FreeBSD showed the differences in our locking and how/when driver threads run. I will rant about this another day.

Encryption key programming


The encryption key programming is programmed using firmware calls, but net80211 currently expects them to be done synchronously. We can't sleep in the net80211 crypto key updates without changing net80211's locks to all be SX locks (and I honestly think that's a bad solution that papers over non-asynchronous code that honestly should just be made asynchronous.) Anyway, so it and the node updates are done using deferred calls - but this required me to take complete copies of the encryption key contents. It turns out net80211 can pretty quickly recycle the key contents - including the key that is hiding inside the ieee80211_node. This fixed up the key reprogramming and deletion - it was sometimes sending garbage to the firmware. Whoops.


What's next?


So what's next? Well, I want to land the ath10k driver! There are still a whole bunch of things to do in both net80211 and the driver before I can do this.

Add 802.11ac channel entries to regdomain.xml


Yes, I added it - but only for FCC. I didn't add them for all the other regulatory domain codes. It's a lot of work because of how this file is implemented and I'd love help here.


Add MU-MIMO group notification


I'd like to make sure that we can at least support associating to a MU-MIMO AP. I think ath10k does it in firmware but we need to support the IE notifications.

Block traffic from being transmitted during a node creation or key update


Right now net80211 will transmit frames right after adding a node or sending a key update - it assumes the driver is completing it before returning. For software driven NICs like the pre-11ac Atheros chips this holds true, but for everything USB and newer firmware based devices this definitely doesn't hold.

For ath10k in particular if you try transmitting a frame without a node in firmware the whole transmit path just hangs. Whoops. So I've fixed that so we can't queue a frame if the firmware doesn't know about the node but ...

... net80211 will send the association responses in hostap mode once the node is created. This means the first association response doesn't make it to the associating client. Since net80211 doesn't yet do this traffic buffering, I'll do it in ath10k - I'll buffer frames during a key update and during node addition/deletion to make sure that nothing is sent OR dropped.

Clean up the Linux-y bits


There's a bunch of dead code which we don't need or don't use; as well as some compatibility bits that define Linux mac80211/nl80211 bits that should live in net80211. I'm going to turn these into net80211 methods and remove the Linux-y bits from ath10k. Bjorn's work to make linuxkpi wifi shims can then just translate the calls to the net80211 API bits I'll add, rather than having to roll full wifi methods inside linuxkpi.


To wrap up ..


.. job changes, relationship changes, having kids, getting a green card, buying a house and paying off old debts from your old hosting company can throw a spanner in the life machine. On the plus side, hacking on FreeBSD and wifi support are fun again and I'm actually able to sleep through the night once more, so ... here goes!

If you're interested in helping out, I've been updating the net80211/driver TODO list here: https://wiki.freebsd.org/WiFi/TodoStuff . I'd love some help, even on the small things!


,

Dave HallLogging Step Functions to CloudWatch

A quick guide on how to stream AWS Step Function logs to AWS CloudWatch.

,

Craig SandersFuck Grey Text

fuck grey text on white backgrounds
fuck grey text on black backgrounds
fuck thin, spindly fonts
fuck 10px text
fuck any size of anything in px
fuck font-weight 300
fuck unreadable web pages
fuck themes that implement this unreadable idiocy
fuck sites that don’t work without javascript
fuck reactjs and everything like it

thank fuck for Stylus. and uBlock Origin. and uMatrix.

Fuck Grey Text is a post from: Errata

,

Hamish TaylorBlog: A new beginning

Earlier today I launched this site. It is the result of a lot of work over the past few weeks. It began as an idea to publicise some of my photos, and morphed into the site you see now, including a store and blog that I’ve named “Photekgraddft”.

In the weirdly named blog, I want to talk about photography, the stories behind some of my more interesting shots, the gear and software I use, my technology career, my recent ADHD diagnosis and many other things.

This scares me quite a lot. I’ve never really put myself out onto the internet before. If you Google me, you’re not going to find anything much. Google Images has no photos of me. I’ve always liked it that way. Until now.

ADHD’ers are sometimes known for “oversharing”, one of the side-effects of the inability to regulate emotions well. I’ve always been the opposite, hiding, because I knew I was different, but didn’t understand why.

The combination of the COVID-19 pandemic and my recent ADHD diagnosis have given me a different perspective. I now know why I hid. And now I want to engage, and be engaged, in the world.

If I can be a force for positive change, around people’s knowledge and opinion of ADHD, then I will.

If talking about Business Analysis (my day job), and sharing my ideas for optimising organisations helps anyone at all, then I will.

If I can show my photos and brighten someone’s day by allowing them to enjoy a sunset, or a flying bird, then I will.

And if anyone buys any of my photos, then I will be shocked!

So welcome to my little vanity project. I hope it can be something positive, for me, if for no one else in this new, odd world in which we now find ourselves living together.

,

Hamish TaylorPhoto: Rain on leaves

,

Hamish TaylorVideo: A Foggy Autumn Morning

Video: A Foggy Autumn Morning

Hamish TaylorPhoto: Walking the dog on a cold Autumn morning

Photo: Walking the dog on a cold Autumn morning


,

Rusty Russell57 Varieties of Pyrite: Exchanges Are Now The Enemy of Bitcoin

TL;DR: exchanges are casinos and don’t want to onboard anyone into bitcoin. Avoid.

There’s a classic scam in the “crypto” space: advertize Bitcoin to get people in, then sell suckers something else entirely. Over the last few years, this bait-and-switch has become the core competency of “bitcoin” exchanges.

I recently visited the homepage of Australian exchange btcmarkets.net: what a mess. There was a list of dozens of identical-looking “cryptos”, with bitcoin second after something called “XRP”; seems like it was sorted by volume?

Incentives have driven exchanges to become casinos, and they’re doing exactly what you’d expect unregulated casinos to do. This is no place you ever want to send anyone.

Incentives For Exchanges

Exchanges make money on trading, not on buying and holding. Despite the fact that bitcoin is the only real attempt to create an open source money, scams with no future are given false equivalence, because more assets means more trading. Worse than that, they are paid directly to list new scams (the crappier, the more money they can charge!) and have recently taken the logical step of introducing and promoting their own crapcoins directly.

It’s like a gold dealer who also sells 57 varieties of pyrite, which give more margin than selling actual gold.

For a long time, I thought exchanges were merely incompetent. Most can’t even give out fresh addresses for deposits, batch their outgoing transactions, pay competent fee rates, perform RBF or use segwit.

But I misunderstood: they don’t want to sell bitcoin. They use bitcoin to get you in the door, but they want you to gamble. This matters: you’ll find subtle and not-so-subtle blockers to simply buying bitcoin on an exchange. If you send a friend off to buy their first bitcoin, they’re likely to come back with something else. That’s no accident.

Looking Deeper, It Gets Worse.

Regrettably, looking harder at specific exchanges makes the picture even bleaker.

Consider Binance: this mainland China backed exchange pretending to be a Hong Kong exchange appeared out of nowhere with fake volume and demonstrated the gullibility of the entire industry by being treated as if it were a respected member. They lost at least 40,000 bitcoin in a known hack, and they also lost all the personal information people sent them to KYC. They aggressively market their own coin. But basically, they’re just MtGox without Mark Karpales’ PHP skills or moral scruples and much better marketing.

Coinbase is more interesting: an MBA-run “bitcoin” company which really dislikes bitcoin. They got where they are by spending big on regulations compliance in the US so they could operate in (almost?) every US state. (They don’t do much to dispel the wide belief that this regulation protects their users, when in practice it seems only USD deposits have any guarantee). Their natural interest is in increasing regulation to maintain that moat, and their biggest problem is Bitcoin.

They have much more affinity for the centralized coins (Ethereum) where they can have influence and control. The anarchic nature of a genuine open source community (not to mention the developers’ oft-stated aim to improve privacy over time) is not culturally compatible with a top-down company run by the Big Dog. It’s a running joke that their CEO can’t say the word “Bitcoin”, but their recent “what will happen to cryptocurrencies in the 2020s” article is breathtaking in its boldness: innovation is mainly happening on altcoins, and they’re going to overtake bitcoin any day now. Those scaling problems which the Bitcoin developers say they don’t know how to solve? This non-technical CEO knows better.

So, don’t send anyone to an exchange, especially not a “market leading” one. Find some service that actually wants to sell them bitcoin, like CashApp or Swan Bitcoin.

,

Matthew OliverGNS3 FRR Appliance

In my spare time, what little I have, I’ve been wanting to play with some OSS networking projects. For those playing along at home, during last Suse hackweek I played with wireguard, and to test the environment I wanted to set up some routing.
For which I used FRR.

FRR is a pretty cool project: it brings the network routing stack to Linux, or rather gives us a full open source routing stack - as most routers are actually Linux anyway.

Many years ago I happened to work at Fujitsu in a gateway environment, and started playing around with networking. That was my first experience with GNS3, an open source network simulator. Back then I needed a copy of Cisco IOS images to really play with routing protocols, which made things harder - a great open source product, but it needed access to proprietary router OSes.

FRR provides a CLI _very_ similar to Cisco's, which made me think: hey, I wonder if there is an FRR appliance we can use in GNS3?
And there was!!!

When I downloaded it and decompressed the qcow2 image it was 1.5GB!!! For a single router image. It works great, but what if I wanted a bunch of routers to play with things like OSPF or BGP? Surely we can make a smaller one.

Kiwi

At Suse we use kiwi-ng to build machine images and release media. And to make things even easier for me, we already have a kiwi config for small OpenSuse Leap JEOS images (JEOS is "just enough OS"). So I hacked one to include FRR. All extra tweaks needed to the image are also easily done by bash hook scripts.

I won't go into too much detail on how, because I created a git repo where I have it all, including a detailed README: https://github.com/matthewoliver/frr_gns3

So feel free to check that out and build and use the image.

But today, I went one step further. OpenSuse’s Open Build System, which is used to build all RPMs for OpenSuse, but can also build debs and whatever build you need, also supports building docker containers and system images using kiwi!

So have now got the OBS to build the image for me. The image can be downloaded from: https://download.opensuse.org/repositories/home:/mattoliverau/images/

And if you want to send any OBS requests to change it the project/package is: https://build.opensuse.org/package/show/home:mattoliverau/FRR-OpenSuse-Appliance

To import it into GNS3 you need the gns3a file, which you can find in my git repo or in the OBS project page.

The best part is this image is only 300MB, which is much better than 1.5GB!
I did have it a little smaller, 200-250MB, but unfortunately the JEOS cut down kernel doesn't contain the MPLS modules, so I had to pull in the full default SUSE kernel. If this became a real thing and not a pet project, I could go and build a FRR cutdown kernel to get the size down, but 300MB is already a lot better than where it was at.

Hostname Hack

When using GNS3 and you place a router, you want to be able to name the router, and when you access the console it's _really_ nice to see the router name you specified in GNS3 as the hostname. Why? Because if you have a bunch of routers, you don't want a bunch of tabs all showing the localhost hostname on the command line… that doesn't really help.

The FRR image is using qemu, and there wasn't a nice way to access the name of the VM from inside the container, nor an easy way to insert the name from outside. But I found one approach that seems to be working - enter my dodgy hostname hack!

I also wanted to do it without hacking the gns3server code. I couldn't easily pass the hostname in, but I could pass it in via a null device with the router name as its id:

/dev/virtio-ports/frr.router.hostname.%vm-name%

So I simply wrote a script that sets the hostname based on the existence of this device. Made the script a systemd oneshot service to start at boot and it worked!
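
The actual script lives in the git repo above; a minimal sketch of the idea (assuming a systemd based image with hostnamectl available) looks something like this:

#!/bin/bash
# Look for the virtio port whose name carries the router name set in GNS3.
for dev in /dev/virtio-ports/frr.router.hostname.*; do
    [ -e "$dev" ] || continue
    # Everything after the "hostname." prefix is the GNS3 router name.
    hostnamectl set-hostname "${dev##*hostname.}"
done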

This means that after changing the name of the FRR router in the GNS3 interface, all you need to do is restart the router (stop and start the device) and it'll apply the name to the router. This saves you having to log in as root and run hostname yourself.

Or better, if you name all your FRR routers before turning them on, then it’ll just work.

In conclusion…

Hopefully now we can have a fully opensource, GNS3 + FRR appliance solution for network training, testing, and inspiring network engineers.

,

Matt PalmerPrivate Key Redaction: UR DOIN IT RONG

Because posting private keys on the Internet is a bad idea, some people like to “redact” their private keys, so that it looks kinda-sorta like a private key, but it isn’t actually giving away anything secret. Unfortunately, due to the way that private keys are represented, it is easy to “redact” a key in such a way that it doesn’t actually redact anything at all. RSA private keys are particularly bad at this, but the problem can (potentially) apply to other keys as well.

I’ll show you a bit of “Inside Baseball” with key formats, and then demonstrate the practical implications. Finally, we’ll go through a practical worked example from an actual not-really-redacted key I recently stumbled across in my travels.

The Private Lives of Private Keys

Here is what a typical private key looks like, when you come across it:

-----BEGIN RSA PRIVATE KEY-----
MGICAQACEQCxjdTmecltJEz2PLMpS4BXAgMBAAECEDKtuwD17gpagnASq1zQTYEC
CQDVTYVsjjF7IQIJANUYZsIjRsR3AgkAkahDUXL0RSECCB78r2SnsJC9AghaOK3F
sKoELg==
-----END RSA PRIVATE KEY-----

Obviously, there’s some hidden meaning in there – computers don’t encrypt things by shouting “BEGIN RSA PRIVATE KEY!”, after all. What is between the BEGIN/END lines above is, in fact, a base64-encoded DER format ASN.1 structure representing a PKCS#1 private key.

In simple terms, it’s a list of numbers – very important numbers. The list of numbers is, in order:

  • A version number (0);
  • The “public modulus”, commonly referred to as “n”;
  • The “public exponent”, or “e” (which is almost always 65,537, for various unimportant reasons);
  • The “private exponent”, or “d”;
  • The two “private primes”, or “p” and “q”;
  • Two exponents, which are known as “dmp1” and “dmq1”; and
  • A coefficient, known as “iqmp”.
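
(If you want to see these numbers for a key of your own, openssl will happily dump both the raw ASN.1 structure and the decoded values – key.pem here is just a placeholder filename:)

openssl asn1parse -in key.pem
openssl rsa -in key.pem -noout -text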

Why Is This a Problem?

The thing is, only three of those numbers are actually required in a private key. The rest, whilst useful to allow the RSA encryption and decryption to be more efficient, aren’t necessary. The three absolutely required values are e, p, and q.

Of the other numbers, most of them are at least about the same size as each of p and q. So of the total data in an RSA key, less than a quarter of the data is required. Let me show you with the above “toy” key, by breaking it down piece by piece1:

  • MGI – DER for “this is a sequence”
  • CAQ – version (0)
  • CxjdTmecltJEz2PLMpS4BX – n
  • AgMBAA – e
  • ECEDKtuwD17gpagnASq1zQTY – d
  • ECCQDVTYVsjjF7IQ – p
  • IJANUYZsIjRsR3 – q
  • AgkAkahDUXL0RS – dmp1
  • ECCB78r2SnsJC9 – dmq1
  • AghaOK3FsKoELg== – iqmp

Remember that in order to reconstruct all of these values, all I need are e, p, and q – and e is pretty much always 65,537. So I could “redact” almost all of this key, and still give all the important, private bits of this key. Let me show you:

-----BEGIN RSA PRIVATE KEY-----
..............................................................EC
CQDVTYVsjjF7IQIJANUYZsIjRsR3....................................
........
-----END RSA PRIVATE KEY-----

Now, I doubt that anyone is going to redact a key precisely like this… but then again, this isn’t a “typical” RSA key. They usually look a lot more like this:

-----BEGIN RSA PRIVATE KEY-----
MIIEogIBAAKCAQEAu6Inch7+mWtKn+leB9uCG3MaJIxRyvC/5KTz2fR+h+GOhqj4
SZJobiVB4FrE5FgC7AnlH6qeRi9MI0s6dt5UWZ5oNIeWSaOOeNO+EJDUkSVf67wj
SNGXlSjGAkPZ0nRJiDjhuPvQmdW53hOaBLk5udxPEQbenpXAzbLJ7wH5ouLQ3nQw
HwpwDNQhF6zRO8WoscpDVThOAM+s4PS7EiK8ZR4hu2toon8Ynadlm95V45wR0VlW
zywgbkZCKa1IMrDCscB6CglQ10M3Xzya3iTzDtQxYMVqhDrA7uBYRxA0y1sER+Rb
yhEh03xz3AWemJVLCQuU06r+FABXJuY/QuAVvQIDAQABAoIBAFqwWVhzWqNUlFEO
PoCVvCEAVRZtK+tmyZj9kU87ORz8DCNR8A+/T/JM17ZUqO2lDGSBs9jGYpGRsr8s
USm69BIM2ljpX95fyzDjRu5C0jsFUYNi/7rmctmJR4s4uENcKV5J/++k5oI0Jw4L
c1ntHNWUgjK8m0UTJIlHbQq0bbAoFEcfdZxd3W+SzRG3jND3gifqKxBG04YDwloy
tu+bPV2jEih6p8tykew5OJwtJ3XsSZnqJMwcvDciVbwYNiJ6pUvGq6Z9kumOavm9
XU26m4cWipuK0URWbHWQA7SjbktqEpxsFrn5bYhJ9qXgLUh/I1+WhB2GEf3hQF5A
pDTN4oECgYEA7Kp6lE7ugFBDC09sKAhoQWrVSiFpZG4Z1gsL9z5YmZU/vZf0Su0n
9J2/k5B1GghvSwkTqpDZLXgNz8eIX0WCsS1xpzOuORSNvS1DWuzyATIG2cExuRiB
jYWIJUeCpa5p2PdlZmBrnD/hJ4oNk4oAVpf+HisfDSN7HBpN+TJfcAUCgYEAyvY7
Y4hQfHIdcfF3A9eeCGazIYbwVyfoGu70S/BZb2NoNEPymqsz7NOfwZQkL4O7R3Wl
Rm0vrWT8T5ykEUgT+2ruZVXYSQCKUOl18acbAy0eZ81wGBljZc9VWBrP1rHviVWd
OVDRZNjz6nd6ZMrJvxRa24TvxZbJMmO1cgSW1FkCgYAoWBd1WM9HiGclcnCZknVT
UYbykCeLO0mkN1Xe2/32kH7BLzox26PIC2wxF5seyPlP7Ugw92hOW/zewsD4nLze
v0R0oFa+3EYdTa4BvgqzMXgBfvGfABJ1saG32SzoWYcpuWLLxPwTMsCLIPmXgRr1
qAtl0SwF7Vp7O/C23mNukQKBgB89DOEB7xloWv3Zo27U9f7nB7UmVsGjY8cZdkJl
6O4LB9PbjXCe3ywZWmJqEbO6e83A3sJbNdZjT65VNq9uP50X1T+FmfeKfL99X2jl
RnQTsrVZWmJrLfBSnBkmb0zlMDAcHEnhFYmHFuvEnfL7f1fIoz9cU6c+0RLPY/L7
n9dpAoGAXih17mcmtnV+Ce+lBWzGWw9P4kVDSIxzGxd8gprrGKLa3Q9VuOrLdt58
++UzNUaBN6VYAe4jgxGfZfh+IaSlMouwOjDgE/qzgY8QsjBubzmABR/KWCYiRqkj
qpWCgo1FC1Gn94gh/+dW2Q8+NjYtXWNqQcjRP4AKTBnPktEvdMA=
-----END RSA PRIVATE KEY-----

People typically redact keys by deleting whole lines, and usually replacing them with [...] and the like. But only about 345 of those 1588 characters (excluding the header and footer) are required to construct the entire key. You can redact about 4/5ths of that giant blob of stuff, and your private parts (or at least, those of your key) are still left uncomfortably exposed.

But Wait! There’s More!

Remember how I said that everything in the key other than e, p, and q could be derived from those three numbers? Let’s talk about one of those numbers: n.

This is known as the “public modulus” (because, along with e, it is also present in the public key). It is very easy to calculate: n = p * q. It is also very early in the key (the second number, in fact).

Since n = p * q, it follows that q = n / p. Thus, as long as the key is intact up to p, you can derive q by simple division.

Real World Redaction

At this point, I’d like to introduce an acquaintance of mine: Mr. Johan Finn. He is the proud owner of the GitHub repo johanfinn/scripts. For a while, his repo contained a script that contained a poorly-redacted private key. He since deleted it, by making a new commit, but of course because git never really deletes anything, it’s still available.

Of course, Mr. Finn may delete the repo, or force-push a new history without that commit, so here is the redacted private key, with a bit of the surrounding shell script, for our illustrative pleasure:

#Add private key to .ssh folder
cd /home/johan/.ssh/
echo  "-----BEGIN RSA PRIVATE KEY-----
MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
KKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKK
ÄÄÄÄÄÄÄÄÄÄÄÄÄÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ
MIIJKgIBAAKCAgEAxEVih1JGb8gu/Fm4AZh+ZwJw/pjzzliWrg4mICFt1g7SmIE2
TCQMKABdwd11wOFKCPc/UzRH/fHuQcvWrpbOSdqev/zKff9iedKw/YygkMeIRaXB
fYELqvUAOJ8PPfDm70st9GJRhjGgo5+L3cJB2gfgeiDNHzaFvapRSU0oMGQX+kI9
ezsjDAn+0Pp+r3h/u1QpLSH4moRFGF4omNydI+3iTGB98/EzuNhRBHRNq4oBV5SG
Pq/A1bem2ninnoEaQ+OPESxYzDz3Jy9jV0W/6LvtJ844m+XX69H5fqq5dy55z6DW
sGKn78ULPVZPsYH5Y7C+CM6GAn4nYCpau0t52sqsY5epXdeYx4Dc+Wm0CjXrUDEe
Egl4loPKDxJkQqQ/MQiz6Le/UK9vEmnWn1TRXK3ekzNV4NgDfJANBQobOpwt8WVB
rbsC0ON7n680RQnl7PltK9P1AQW5vHsahkoixk/BhcwhkrkZGyDIl9g8Q/Euyoq3
eivKPLz7/rhDE7C1BzFy7v8AjC3w7i9QeHcWOZFAXo5hiDasIAkljDOsdfD4tP5/
wSO6E6pjL3kJ+RH2FCHd7ciQb+IcuXbku64ln8gab4p8jLa/mcMI+V3eWYnZ82Yu
axsa85hAe4wb60cp/rCJo7ihhDTTvGooqtTisOv2nSvCYpcW9qbL6cGjAXECAwEA
AQKCAgEAjz6wnWDP5Y9ts2FrqUZ5ooamnzpUXlpLhrbu3m5ncl4ZF5LfH+QDN0Kl
KvONmHsUhJynC/vROybSJBU4Fu4bms1DJY3C39h/L7g00qhLG7901pgWMpn3QQtU
4P49qpBii20MGhuTsmQQALtV4kB/vTgYfinoawpo67cdYmk8lqzGzzB/HKxZdNTq
s+zOfxRr7PWMo9LyVRuKLjGyYXZJ/coFaobWBi8Y96Rw5NZZRYQQXLIalC/Dhndm
AHckpstEtx2i8f6yxEUOgPvV/gD7Akn92RpqOGW0g/kYpXjGqZQy9PVHGy61sInY
HSkcOspIkJiS6WyJY9JcvJPM6ns4b84GE9qoUlWVF3RWJk1dqYCw5hz4U8LFyxsF
R6WhYiImvjxBLpab55rSqbGkzjI2z+ucDZyl1gqIv9U6qceVsgRyuqdfVN4deU22
LzO5IEDhnGdFqg9KQY7u8zm686Ejs64T1sh0y4GOmGsSg+P6nsqkdlXH8C+Cf03F
lqPFg8WQC7ojl/S8dPmkT5tcJh3BPwIWuvbtVjFOGQc8x0lb+NwK8h2Nsn6LNazS
0H90adh/IyYX4sBMokrpxAi+gMAWiyJHIHLeH2itNKtAQd3qQowbrWNswJSgJzsT
JuJ7uqRKAFkE6nCeAkuj/6KHHMPsfCAffVdyGaWqhoxmPOrnVgECggEBAOrCCwiC
XxwUgjOfOKx68siFJLfHf4vPo42LZOkAQq5aUmcWHbJVXmoxLYSczyAROopY0wd6
Dx8rqnpO7OtZsdJMeBSHbMVKoBZ77hiCQlrljcj12moFaEAButLCdZFsZW4zF/sx
kWIAaPH9vc4MvHHyvyNoB3yQRdevu57X7xGf9UxWuPil/jvdbt9toaraUT6rUBWU
GYPNKaLFsQzKsFWAzp5RGpASkhuiBJ0Qx3cfLyirjrKqTipe3o3gh/5RSHQ6VAhz
gdUG7WszNWk8FDCL6RTWzPOrbUyJo/wz1kblsL3vhV7ldEKFHeEjsDGroW2VUFlS
asAHNvM4/uYcOSECggEBANYH0427qZtLVuL97htXW9kCAT75xbMwgRskAH4nJDlZ
IggDErmzBhtrHgR+9X09iL47jr7dUcrVNPHzK/WXALFSKzXhkG/yAgmt3r14WgJ6
5y7010LlPFrzaNEyO/S4ISuBLt4cinjJsrFpoo0WI8jXeM5ddG6ncxdurKXMymY7
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::.::
:::::::::::::::::::::::::::.::::::::::::::::::::::::::::::::::::
LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLlL
ÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖÖ
ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ
ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ
YYYYYYYYYYYYYYYYYYYYYyYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY
gff0GJCOMZ65pMSy3A3cSAtjlKnb4fWzuHD5CFbusN4WhCT/tNxGNSpzvxd8GIDs
nY7exs9L230oCCpedVgcbayHCbkChEfoPzL1e1jXjgCwCTgt8GjeEFqc1gXNEaUn
O8AJ4VlR8fRszHm6yR0ZUBdY7UJddxQiYOzt0S1RLlECggEAbdcs4mZdqf3OjejJ
06oTPs9NRtAJVZlppSi7pmmAyaNpOuKWMoLPElDAQ3Q7VX26LlExLCZoPOVpdqDH
KbdmBEfTR4e11Pn9vYdu9/i6o10U4hpmf4TYKlqk10g1Sj21l8JATj/7Diey8scO
sAI1iftSg3aBSj8W7rxCxSezrENzuqw5D95a/he1cMUTB6XuravqZK5O4eR0vrxR
AvMzXk5OXrUEALUvt84u6m6XZZ0pq5XZxq74s8p/x1JvTwcpJ3jDKNEixlHfdHEZ
ZIu/xpcwD5gRfVGQamdcWvzGHZYLBFO1y5kAtL8kI9tW7WaouWVLmv99AyxdAaCB
Y5mBAQKCAQEAzU7AnorPzYndlOzkxRFtp6MGsvRBsvvqPLCyUFEXrHNV872O7tdO
GmsMZl+q+TJXw7O54FjJJvqSSS1sk68AGRirHop7VQce8U36BmI2ZX6j2SVAgIkI
9m3btCCt5rfiCatn2+Qg6HECmrCsHw6H0RbwaXS4RZUXD/k4X+sslBitOb7K+Y+N
Bacq6QxxjlIqQdKKPs4P2PNHEAey+kEJJGEQ7bTkNxCZ21kgi1Sc5L8U/IGy0BMC
PvJxssLdaWILyp3Ws8Q4RAoC5c0ZP0W2j+5NSbi3jsDFi0Y6/2GRdY1HAZX4twem
Q0NCedq1JNatP1gsb6bcnVHFDEGsj/35oQKCAQEAgmWMuSrojR/fjJzvke6Wvbox
FRnPk+6YRzuYhAP/YPxSRYyB5at++5Q1qr7QWn7NFozFIVFFT8CBU36ktWQ39MGm
cJ5SGyN9nAbbuWA6e+/u059R7QL+6f64xHRAGyLT3gOb1G0N6h7VqFT25q5Tq0rc
Lf/CvLKoudjv+sQ5GKBPT18+zxmwJ8YUWAsXUyrqoFWY/Tvo5yLxaC0W2gh3+Ppi
EDqe4RRJ3VKuKfZxHn5VLxgtBFN96Gy0+Htm5tiMKOZMYAkHiL+vrVZAX0hIEuRZ
JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ
MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
-----END RSA PRIVATE KEY-----" >> id_rsa

Now, if you try to reconstruct this key by removing the “obvious” garbage lines (the ones that are all repeated characters, some of which aren’t even valid base64 characters), it still isn’t a key – at least, openssl pkey doesn’t want anything to do with it. The key is very much still in there, though, as we shall soon see.
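
(The check I mean is something along these lines, with key.pem being the reconstructed file – it parses the key and errors out if it can’t:)

openssl pkey -in key.pem -noout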

Using a gem I wrote and a quick bit of Ruby, we can extract a complete private key. The irb session looks something like this:

>> require "derparse"
>> b64 = <<EOF
MIIJKgIBAAKCAgEAxEVih1JGb8gu/Fm4AZh+ZwJw/pjzzliWrg4mICFt1g7SmIE2
TCQMKABdwd11wOFKCPc/UzRH/fHuQcvWrpbOSdqev/zKff9iedKw/YygkMeIRaXB
fYELqvUAOJ8PPfDm70st9GJRhjGgo5+L3cJB2gfgeiDNHzaFvapRSU0oMGQX+kI9
ezsjDAn+0Pp+r3h/u1QpLSH4moRFGF4omNydI+3iTGB98/EzuNhRBHRNq4oBV5SG
Pq/A1bem2ninnoEaQ+OPESxYzDz3Jy9jV0W/6LvtJ844m+XX69H5fqq5dy55z6DW
sGKn78ULPVZPsYH5Y7C+CM6GAn4nYCpau0t52sqsY5epXdeYx4Dc+Wm0CjXrUDEe
Egl4loPKDxJkQqQ/MQiz6Le/UK9vEmnWn1TRXK3ekzNV4NgDfJANBQobOpwt8WVB
rbsC0ON7n680RQnl7PltK9P1AQW5vHsahkoixk/BhcwhkrkZGyDIl9g8Q/Euyoq3
eivKPLz7/rhDE7C1BzFy7v8AjC3w7i9QeHcWOZFAXo5hiDasIAkljDOsdfD4tP5/
wSO6E6pjL3kJ+RH2FCHd7ciQb+IcuXbku64ln8gab4p8jLa/mcMI+V3eWYnZ82Yu
axsa85hAe4wb60cp/rCJo7ihhDTTvGooqtTisOv2nSvCYpcW9qbL6cGjAXECAwEA
AQKCAgEAjz6wnWDP5Y9ts2FrqUZ5ooamnzpUXlpLhrbu3m5ncl4ZF5LfH+QDN0Kl
KvONmHsUhJynC/vROybSJBU4Fu4bms1DJY3C39h/L7g00qhLG7901pgWMpn3QQtU
4P49qpBii20MGhuTsmQQALtV4kB/vTgYfinoawpo67cdYmk8lqzGzzB/HKxZdNTq
s+zOfxRr7PWMo9LyVRuKLjGyYXZJ/coFaobWBi8Y96Rw5NZZRYQQXLIalC/Dhndm
AHckpstEtx2i8f6yxEUOgPvV/gD7Akn92RpqOGW0g/kYpXjGqZQy9PVHGy61sInY
HSkcOspIkJiS6WyJY9JcvJPM6ns4b84GE9qoUlWVF3RWJk1dqYCw5hz4U8LFyxsF
R6WhYiImvjxBLpab55rSqbGkzjI2z+ucDZyl1gqIv9U6qceVsgRyuqdfVN4deU22
LzO5IEDhnGdFqg9KQY7u8zm686Ejs64T1sh0y4GOmGsSg+P6nsqkdlXH8C+Cf03F
lqPFg8WQC7ojl/S8dPmkT5tcJh3BPwIWuvbtVjFOGQc8x0lb+NwK8h2Nsn6LNazS
0H90adh/IyYX4sBMokrpxAi+gMAWiyJHIHLeH2itNKtAQd3qQowbrWNswJSgJzsT
JuJ7uqRKAFkE6nCeAkuj/6KHHMPsfCAffVdyGaWqhoxmPOrnVgECggEBAOrCCwiC
XxwUgjOfOKx68siFJLfHf4vPo42LZOkAQq5aUmcWHbJVXmoxLYSczyAROopY0wd6
Dx8rqnpO7OtZsdJMeBSHbMVKoBZ77hiCQlrljcj12moFaEAButLCdZFsZW4zF/sx
kWIAaPH9vc4MvHHyvyNoB3yQRdevu57X7xGf9UxWuPil/jvdbt9toaraUT6rUBWU
GYPNKaLFsQzKsFWAzp5RGpASkhuiBJ0Qx3cfLyirjrKqTipe3o3gh/5RSHQ6VAhz
gdUG7WszNWk8FDCL6RTWzPOrbUyJo/wz1kblsL3vhV7ldEKFHeEjsDGroW2VUFlS
asAHNvM4/uYcOSECggEBANYH0427qZtLVuL97htXW9kCAT75xbMwgRskAH4nJDlZ
IggDErmzBhtrHgR+9X09iL47jr7dUcrVNPHzK/WXALFSKzXhkG/yAgmt3r14WgJ6
5y7010LlPFrzaNEyO/S4ISuBLt4cinjJsrFpoo0WI8jXeM5ddG6ncxdurKXMymY7
EOF
>> b64 += <<EOF
gff0GJCOMZ65pMSy3A3cSAtjlKnb4fWzuHD5CFbusN4WhCT/tNxGNSpzvxd8GIDs
nY7exs9L230oCCpedVgcbayHCbkChEfoPzL1e1jXjgCwCTgt8GjeEFqc1gXNEaUn
O8AJ4VlR8fRszHm6yR0ZUBdY7UJddxQiYOzt0S1RLlECggEAbdcs4mZdqf3OjejJ
06oTPs9NRtAJVZlppSi7pmmAyaNpOuKWMoLPElDAQ3Q7VX26LlExLCZoPOVpdqDH
KbdmBEfTR4e11Pn9vYdu9/i6o10U4hpmf4TYKlqk10g1Sj21l8JATj/7Diey8scO
sAI1iftSg3aBSj8W7rxCxSezrENzuqw5D95a/he1cMUTB6XuravqZK5O4eR0vrxR
AvMzXk5OXrUEALUvt84u6m6XZZ0pq5XZxq74s8p/x1JvTwcpJ3jDKNEixlHfdHEZ
ZIu/xpcwD5gRfVGQamdcWvzGHZYLBFO1y5kAtL8kI9tW7WaouWVLmv99AyxdAaCB
Y5mBAQKCAQEAzU7AnorPzYndlOzkxRFtp6MGsvRBsvvqPLCyUFEXrHNV872O7tdO
GmsMZl+q+TJXw7O54FjJJvqSSS1sk68AGRirHop7VQce8U36BmI2ZX6j2SVAgIkI
9m3btCCt5rfiCatn2+Qg6HECmrCsHw6H0RbwaXS4RZUXD/k4X+sslBitOb7K+Y+N
Bacq6QxxjlIqQdKKPs4P2PNHEAey+kEJJGEQ7bTkNxCZ21kgi1Sc5L8U/IGy0BMC
PvJxssLdaWILyp3Ws8Q4RAoC5c0ZP0W2j+5NSbi3jsDFi0Y6/2GRdY1HAZX4twem
Q0NCedq1JNatP1gsb6bcnVHFDEGsj/35oQKCAQEAgmWMuSrojR/fjJzvke6Wvbox
FRnPk+6YRzuYhAP/YPxSRYyB5at++5Q1qr7QWn7NFozFIVFFT8CBU36ktWQ39MGm
cJ5SGyN9nAbbuWA6e+/u059R7QL+6f64xHRAGyLT3gOb1G0N6h7VqFT25q5Tq0rc
Lf/CvLKoudjv+sQ5GKBPT18+zxmwJ8YUWAsXUyrqoFWY/Tvo5yLxaC0W2gh3+Ppi
EDqe4RRJ3VKuKfZxHn5VLxgtBFN96Gy0+Htm5tiMKOZMYAkHiL+vrVZAX0hIEuRZ
EOF
>> der = b64.unpack("m").first
>> c = DerParse.new(der).first_node.first_child
>> version = c.value
=> 0
>> c = c.next_node
>> n = c.value
=> 80071596234464993385068908004931... # (etc)
>> c = c.next_node
>> e = c.value
=> 65537
>> c = c.next_node
>> d = c.value
=> 58438813486895877116761996105770... # (etc)
>> c = c.next_node
>> p = c.value
=> 29635449580247160226960937109864... # (etc)
>> c = c.next_node
>> q = c.value
=> 27018856595256414771163410576410... # (etc)

What I’ve done, in case you don’t speak Ruby, is take the two “chunks” of plausible-looking base64 data, chuck them together into a variable named b64, unbase64 it into a variable named der, pass that into a new DerParse instance, and then walk the DER value tree until I got all the values I need.

Interestingly, the q value actually traverses the “split” in the two chunks, which means that there’s always the possibility that there are lines missing from the key. However, since p and q are supposed to be prime, we can “sanity check” them to see if corruption is likely to have occurred:

>> require "openssl"
>> OpenSSL::BN.new(p).prime?
=> true
>> OpenSSL::BN.new(q).prime?
=> true

Excellent! The chances of a corrupted file producing valid-but-incorrect prime numbers aren’t huge, so we can be fairly confident that we’ve got the “real” p and q. Now, with the help of another one of my creations we can use e, p, and q to create a fully-operational battle key:

>> require "openssl/pkey/rsa"
>> k = OpenSSL::PKey::RSA.from_factors(p, q, e)
=> #<OpenSSL::PKey::RSA:0x0000559d5903cd38>
>> k.valid?
=> true
>> k.verify(OpenSSL::Digest::SHA256.new, k.sign(OpenSSL::Digest::SHA256.new, "bob"), "bob")
=> true

… and there you have it. One fairly redacted-looking private key brought back to life by maths and far too much free time.
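As a purely optional aside (my own sketch, not part of the original post), the same end-to-end reconstruction can be done in Python with the cryptography library, assuming p, q and e are the integers recovered from the DER walk above:

# Hedged sketch: rebuild an RSA private key from recovered p, q and e.
from math import gcd

from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import rsa

def rebuild_rsa_key(p: int, q: int, e: int = 65537) -> bytes:
    n = p * q
    # The private exponent is the inverse of e modulo lambda(n).
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)
    d = pow(e, -1, lam)  # modular inverse; needs Python 3.8+
    key = rsa.RSAPrivateNumbers(
        p=p, q=q, d=d,
        dmp1=rsa.rsa_crt_dmp1(d, p),
        dmq1=rsa.rsa_crt_dmq1(d, q),
        iqmp=rsa.rsa_crt_iqmp(p, q),
        public_numbers=rsa.RSAPublicNumbers(e, n),
    ).private_key()
    return key.private_bytes(
        encoding=serialization.Encoding.PEM,
        format=serialization.PrivateFormat.TraditionalOpenSSL,
        encryption_algorithm=serialization.NoEncryption(),
    )

# print(rebuild_rsa_key(p, q).decode())  # p and q as recovered above

The library does the same CRT bookkeeping that from_factors handles in Ruby, so the resulting PEM should round-trip through openssl pkey without complaint.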

Sorry Mr. Finn, I hope you’re not still using that key on anything Internet-facing.

What About Other Key Types?

EC keys are very different beasts, but they have much the same problems as RSA keys. A typical EC key contains both private and public data, and the public portion is twice the size – so only about 1/3 of the data in the key is private material. It is quite plausible that you can “redact” an EC key and leave all the actually private bits exposed.
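As a rough, hedged illustration of that ratio (my own snippet, using the Python cryptography library rather than anything from the original post): generate a P-256 key and compare the size of the private scalar with the two public coordinates.

# Hedged sketch: how little of an EC key is actually secret.
from cryptography.hazmat.primitives.asymmetric import ec

key = ec.generate_private_key(ec.SECP256R1())
d = key.private_numbers().private_value    # the only secret: one ~256-bit scalar
pub = key.public_key().public_numbers()    # x and y are both ~256-bit, and public

print("private scalar bits:", d.bit_length())
print("public x bits:", pub.x.bit_length())
print("public y bits:", pub.y.bit_length())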

What Do We Do About It?

In short: don’t ever try to redact real private keys. For documentation purposes, just put “KEY GOES HERE” in the appropriate spot, or something like that. Store your secrets somewhere that isn’t a public (or even private!) git repo.

Generating a “dummy” private key and sticking it in there isn’t a great idea, for different reasons: people have this odd habit of reusing “demo” keys in real life. There’s no need to encourage that sort of thing.


  1. Technically the pieces aren’t 100% aligned with the underlying DER, because of how base64 works. I felt it was easier to understand if I stuck to chopping up the base64, rather than decoding into DER and then chopping up the DER. 

,

Jonathan Adamczewskif32, u32, and const

Some time ago, I wrote “floats, bits, and constant expressions” about converting a floating point number into its representative ones and zeros as a C++ constant expression – constructing the IEEE 754 representation without being able to examine the bits directly.

I’ve been playing around with Rust recently, and rewrote that conversion code as a bit of a learning exercise for myself, with a thoroughly contrived set of constraints: using integer and single-precision floating point math, at compile time, without unsafe blocks, while using as few unstable features as possible.

I’ve included the listing below, for your bemusement and/or head-shaking, and you can play with the code in the Rust Playground and rust.godbolt.org

// Jonathan Adamczewski 2020-05-12
//
// Constructing the bit-representation of an IEEE 754 single precision floating 
// point number, using integer and single-precision floating point math, at 
// compile time, in rust, without unsafe blocks, while using as few unstable 
// features as I can.
//
// or "What if this silly C++ thing http://brnz.org/hbr/?p=1518 but in Rust?"


// Q. Why? What is this good for?
// A. To the best of my knowledge, this code serves no useful purpose. 
//    But I did learn a thing or two while writing it :)


// This is needed to be able to perform floating point operations in a const 
// function:
#![feature(const_fn)]


// bits_transmute(): Returns the bits representing a floating point value, by
//                   way of std::mem::transmute()
//
// For completeness (and validation), and to make it clear the fundamentally 
// unnecessary nature of the exercise :D - here's a short, straightforward, 
// library-based version. But it needs the const_transmute flag and an unsafe 
// block.
#![feature(const_transmute)]
const fn bits_transmute(f: f32) -> u32 {
  unsafe { std::mem::transmute::<f32, u32>(f) }
}



// get_if_u32(predicate:bool, if_true: u32, if_false: u32):
//   Returns if_true if predicate is true, else if_false
//
// If and match are not able to be used in const functions (at least, not 
// without #![feature(const_if_match)] - so here's a branch-free select function
// for u32s
const fn get_if_u32(predicate: bool, if_true: u32, if_false: u32) -> u32 {
  let pred_mask = (-1 * (predicate as i32)) as u32;
  let true_val = if_true & pred_mask;
  let false_val = if_false & !pred_mask;
  true_val | false_val
}

// get_if_f32(predicate, if_true, if_false):
//   Returns if_true if predicate is true, else if_false
//
// A branch-free select function for f32s.
// 
// If either if_true or if_false is NaN or an infinity, the result will be NaN,
// which is not ideal. I don't know of a better way to implement this function
// within the arbitrary limitations of this silly little side quest.
const fn get_if_f32(predicate: bool, if_true: f32, if_false: f32) -> f32 {
  // can't convert bool to f32 - but can convert bool to i32 to f32
  let pred_sel = (predicate as i32) as f32;
  let pred_not_sel = ((!predicate) as i32) as f32;
  let true_val = if_true * pred_sel;
  let false_val = if_false * pred_not_sel;
  true_val + false_val
}


// bits(): Returns the bits representing a floating point value.
const fn bits(f: f32) -> u32 {
  // the result value, initialized to a NaN value that will otherwise not be
  // produced by this function.
  let mut r = 0xffff_ffff;

  // These floating point operations (and others) cause the following error:
  //     only int, `bool` and `char` operations are stable in const fn
  // hence #![feature(const_fn)] at the top of the file
  
  // Identify special cases
  let is_zero    = f == 0_f32;
  let is_inf     = f == f32::INFINITY;
  let is_neg_inf = f == f32::NEG_INFINITY;
  let is_nan     = f != f;

  // Writing this as !(is_zero || is_inf || ...) causes the following error:
  //     Loops and conditional expressions are not stable in const fn
  // so instead write this as type conversions, and bitwise operations
  //
  // "normalish" here means that f is a normal or subnormal value
  let is_normalish = 0 == ((is_zero as u32) | (is_inf as u32) | 
                        (is_neg_inf as u32) | (is_nan as u32));

  // set the result value for each of the special cases
  r = get_if_u32(is_zero,    0,           r); // if (is_zero)    { r = 0; }
  r = get_if_u32(is_inf,     0x7f80_0000, r); // if (is_inf)     { r = 0x7f80_0000; }
  r = get_if_u32(is_neg_inf, 0xff80_0000, r); // if (is_neg_inf) { r = 0xff80_0000; }
  r = get_if_u32(is_nan,     0x7fc0_0000, r); // if (is_nan)     { r = 0x7fc0_0000; }
 
  // It was tempting at this point to try setting f to a "normalish" placeholder 
  // value so that special cases do not have to be handled in the code that 
  // follows, like so:
  // f = get_if_f32(is_normal, f, 1_f32);
  //
  // Unfortunately, get_if_f32() returns NaN if either input is NaN or infinite.
  // Instead of switching the value, we work around the non-normalish cases 
  // later.
  //
  // (This whole function is branch-free, so all of it is executed regardless of 
  // the input value)

  // extract the sign bit
  let sign_bit  = get_if_u32(f < 0_f32,  1, 0);

  // compute the absolute value of f
  let mut abs_f = get_if_f32(f < 0_f32, -f, f);

  
  // This part is a little complicated. The algorithm is functionally the same 
  // as the C++ version linked from the top of the file.
  // 
  // Because of the various contrived constraints on this problem, we compute 
  // the exponent and significand, rather than extract the bits directly.
  //
  // The idea is this:
  // Every finite single precision float point number can be represented as a
  // series of (at most) 24 significant digits as a 128.149 fixed point number 
  // (128: 126 exponent values >= 0, plus one for the implicit leading 1, plus 
  // one more so that the decimal point falls on a power-of-two boundary :)
  // 149: 126 negative exponent values, plus 23 for the bits of precision in the 
  // significand.)
  //
  // If we are able to scale the number such that all of the precision bits fall 
  // in the upper-most 64 bits of that fixed-point representation (while 
  // tracking our effective manipulation of the exponent), we can then 
  // predictably and simply scale that computed value back to a range that can 
  // be converted safely to a u64, count the leading zeros to determine the 
  // exact exponent, and then shift the result into position for the final u32 
  // representation.
  
  // Start with the largest possible exponent - subsequent steps will reduce 
  // this number as appropriate
  let mut exponent: u32 = 254;
  {
    // Hex float literals are really nice. I miss them.

    // The threshold is 2^87 (think: 64+23 bits) to ensure that the number will 
    // be large enough that, when scaled down by 2^64, all the precision will 
    // fit nicely in a u64
    const THRESHOLD: f32 = 154742504910672534362390528_f32; // 0x1p87f == 2^87

    // The scaling factor is 2^41 (think: 64-23 bits) to ensure that a number 
    // between 2^87 and 2^64 will not overflow in a single scaling step.
    const SCALE_UP: f32 = 2199023255552_f32; // 0x1p41f == 2^41

    // Because loops are not available (no #![feature(const_loops)]), and 'if' is
    // not available (no #![feature(const_if_match)]), perform repeated branch-
    // free conditional multiplication of abs_f.

    // use a macro, because why not :D It's the most compact, simplest option I 
    // could find.
    macro_rules! maybe_scale {
      () => {{
        // care is needed: if abs_f is above the threshold, multiplying by 2^41 
        // will cause it to overflow (INFINITY) which will cause get_if_f32() to
        // return NaN, which will destroy the value in abs_f. So compute a safe 
        // scaling factor for each iteration.
        //
        // Roughly equivalent to :
        // if (abs_f < THRESHOLD) {
        //   exponent -= 41;
        //   abs_f *= SCALE_UP;
        // }
        let scale = get_if_f32(abs_f < THRESHOLD, SCALE_UP,      1_f32);    
        exponent  = get_if_u32(abs_f < THRESHOLD, exponent - 41, exponent); 
        abs_f     = get_if_f32(abs_f < THRESHOLD, abs_f * scale, abs_f);
      }}
    }
    // 41 bits per iteration means up to 246 bits shifted.
    // Even the smallest subnormal value will end up in the desired range.
    maybe_scale!();  maybe_scale!();  maybe_scale!();
    maybe_scale!();  maybe_scale!();  maybe_scale!();
  }

  // Now that we know that abs_f is in the desired range (2^87 <= abs_f < 2^128)
  // scale it down to be in the range (2^23 <= _ < 2^64), and convert without 
  // loss of precision to u64.
  const INV_2_64: f32 = 5.42101086242752217003726400434970855712890625e-20_f32; // 0x1p-64f == 2^-64
  let a = (abs_f * INV_2_64) as u64;

  // Count the leading zeros.
  // (C++ doesn't provide a compile-time constant function for this. It's nice 
  // that rust does :)
  let mut lz = a.leading_zeros();

  // if the number isn't normalish, lz is meaningless: we stomp it with 
  // something that will not cause problems in the computation that follows - 
  // the result of which is meaningless, and will be ignored in the end for 
  // non-normalish values.
  lz = get_if_u32(!is_normalish, 0, lz); // if (!is_normalish) { lz = 0; }

  {
    // This step accounts for subnormal numbers, where there are more leading 
    // zeros than can be accounted for in a valid exponent value, and leading 
    // zeros that must remain in the final significand.
    //
    // If lz < exponent, reduce exponent to its final correct value - lz will be
    // used to remove all of the leading zeros.
    //
    // Otherwise, clamp exponent to zero, and adjust lz to ensure that the 
    // correct number of bits will remain (after multiplying by 2^41 six times - 
    // 2^246 - there are 7 leading zeros ahead of the original subnormal's
    // computed significand of 0.sss...)
    // 
    // The following is roughly equivalent to:
    // if (lz < exponent) {
    //   exponent = exponent - lz;
    // } else {
    //   exponent = 0;
    //   lz = 7;
    // }

    // we're about to mess with lz and exponent - compute and store the relative 
    // value of the two
    let lz_is_less_than_exponent = lz < exponent;

    lz       = get_if_u32(!lz_is_less_than_exponent, 7,             lz);
    exponent = get_if_u32( lz_is_less_than_exponent, exponent - lz, 0);
  }

  // compute the final significand.
  // + 1 shifts away a leading 1-bit for normal, and 0-bit for subnormal values
  // Shifts are done in u64 (that leading bit is shifted into the void), then
  // the resulting bits are shifted back to their final resting place.
  let significand = ((a << (lz + 1)) >> (64 - 23)) as u32;

  // combine the bits
  let computed_bits = (sign_bit << 31) | (exponent << 23) | significand;

  // return the normalish result, or the non-normalish result, as appropriate
  get_if_u32(is_normalish, computed_bits, r)
}


// Compile-time validation - able to be examined in rust.godbolt.org output
pub static BITS_BIGNUM: u32 = bits(std::f32::MAX);
pub static TBITS_BIGNUM: u32 = bits_transmute(std::f32::MAX);
pub static BITS_LOWER_THAN_MIN: u32 = bits(7.0064923217e-46_f32);
pub static TBITS_LOWER_THAN_MIN: u32 = bits_transmute(7.0064923217e-46_f32);
pub static BITS_ZERO: u32 = bits(0.0f32);
pub static TBITS_ZERO: u32 = bits_transmute(0.0f32);
pub static BITS_ONE: u32 = bits(1.0f32);
pub static TBITS_ONE: u32 = bits_transmute(1.0f32);
pub static BITS_NEG_ONE: u32 = bits(-1.0f32);
pub static TBITS_NEG_ONE: u32 = bits_transmute(-1.0f32);
pub static BITS_INF: u32 = bits(std::f32::INFINITY);
pub static TBITS_INF: u32 = bits_transmute(std::f32::INFINITY);
pub static BITS_NEG_INF: u32 = bits(std::f32::NEG_INFINITY);
pub static TBITS_NEG_INF: u32 = bits_transmute(std::f32::NEG_INFINITY);
pub static BITS_NAN: u32 = bits(std::f32::NAN);
pub static TBITS_NAN: u32 = bits_transmute(std::f32::NAN);
pub static BITS_COMPUTED_NAN: u32 = bits(std::f32::INFINITY/std::f32::INFINITY);
pub static TBITS_COMPUTED_NAN: u32 = bits_transmute(std::f32::INFINITY/std::f32::INFINITY);


// Run-time validation of many more values
fn main() {
  let end: usize = 0xffff_ffff;
  let count = 9_876_543; // number of values to test
  let step = end / count;
  for u in (0..=end).step_by(step) {
      let v = u as u32;
      
      // reference
      let f = unsafe { std::mem::transmute::<u32, f32>(v) };
      
      // compute
      let c = bits(f);

      // validation
      if c != v && 
         !(f.is_nan() && c == 0x7fc0_0000) && // nans
         !(v == 0x8000_0000 && c == 0) { // negative 0
          println!("{:x?} {:x?}", v, c); 
      }
  }
}

,

Chris NeugebauerReflecting on 10 years of not having to update WordPress

Over the weekend, the boredom of COVID-19 isolation motivated me to move my personal website from WordPress on a self-managed 10-year-old virtual private server to a generated static site on a static site hosting platform with a content delivery network.

This decision was overdue. WordPress never fit my brain particularly well, and it was definitely getting to a point where I wasn’t updating my website at all (my last post was two weeks before I moved from Hobart; I’ve been living in Petaluma for more than three years now).

Settling on a website framework wasn’t a terribly difficult choice (I chose Jekyll; everyone else seems to be using it), and I’ve had friends who’ve had success moving their blogs over. The difficulty I ended up facing was that the standard exporter everyone uses to move from WordPress to Jekyll does not expect Debian’s package layout.

Backing up a bit: I made a choice, 10 years ago, to deploy WordPress on a machine that I ran myself, using the Debian system wordpress package, a simple aptitude install wordpress away. That decision was not particularly consequential then, but it chewed up 3 hours of my time on Saturday.

Why? The exporter plugin assumes that it will be able to find all of the standard WordPress files in the usual WordPress places, and when it couldn’t find them, it broke in unexpected ways. And why couldn’t it find them?

Debian makes packaging choices that prioritise all the software on a system living side-by-side with minimal difficulty. It sets strict permissions. It separates application code from configuration from user data (which in the case of WordPress, includes plugins), in a way that is consistent between applications. This choice makes it easy for Debian admins to understand how to find bits of an application. It also minimises the chance of one PHP application clobbering another.

10 years later, the install that I had set up was still working, having survived 3-4 Debian versions, and so 3-4 new WordPress versions. I don’t recall the last time I had to think about keeping my WordPress instance secure and updated. That’s quite a good run. I’ve had a working website despite not caring about keeping it updated for at least three years.

The same decisions that meant I spent 3 hours on Saturday doing a simple WordPress export saved me a bunch of time that I didn’t incrementally spend over the course of a decade. Am I even? I have no idea.

Anyway, the least I can do is provide some help to people who might run into this same problem, so here’s a 5-step howto.

How to migrate a Debian WordPress site to Jekyll

Should you find the Jekyll exporter not working on your Debian WordPress install:

  1. Use the standard WordPress export to export an XML feed of your site.
  2. Spin up a new instance of WordPress (using WordPress.com, or on a new Virtual Private Server, whatever, really).
  3. Import the exported XML feed.
  4. Install the Jekyll exporter plugin.
  5. Follow the documentation and receive a Jekyll export of your site.

Basically, the plugin works with a stock WordPress install. If you don’t have one of those, it’s easy to move it over.

,

Gary PendergastInstall the COVIDSafe app

I can’t think of a more unequivocal title than that. 🙂

The Australian government doesn’t have a good track record of either launching publicly visible software projects, or respecting privacy, so I’ve naturally been sceptical of the contact tracing app since it was announced. The good news is, while it has some relatively minor problems, it appears to be a solid first version.

Privacy

While the source code is yet to be released, the Android version has already been decompiled, and public analysis is showing that it only collects necessary information, and only uploads contact information to the government servers when you press the button to upload (you should only press that button if you actually get COVID-19, and are asked to upload it by your doctor).

The legislation around the app is also clear that the data you upload can only be accessed by state health officials. Commonwealth departments have no access, and neither do non-health departments (eg, law enforcement, intelligence).

Technical

It does what it’s supposed to do, and hasn’t been found to open you up to risks by installing it. There are a lot of people digging into it, so I would expect any significant issues to be found, reported, and fixed quite quickly.

Some parts of it are a bit rushed, and the way it scans for contacts could be more battery efficient (that should hopefully be fixed in the coming weeks when Google and Apple release updates that these contact tracing apps can use).

If it produces useful data, however, I’m willing to put up with some quirks. 🙂

Usefulness

I’m obviously not an epidemiologist, but those I’ve seen talk about it say that yes, the data this app produces will be useful for augmenting the existing contact tracing efforts. There were some concerns that it could produce a lot of junk data that wastes time, but I trust the expert contact tracing teams to filter and prioritise the data they get from it.

Install it!

The COVIDSafe site has links to the app in Apple’s App Store, as well as Google’s Play Store. Setting it up takes a few minutes, and then you’re done!

,

Craige McWhirterBuilding Daedalus Flight on NixOS

NixOS Daedalus Gears by Craige McWhirter

Daedalus Flight was recently released and this is how you can build and run this version of Daedalus on NixOS.

If you want to speed the build process up, you can add the IOHK Nix cache to your own NixOS configuration:

iohk.nix:

nix.binaryCaches = [
  "https://cache.nixos.org"
  "https://hydra.iohk.io"
];
nix.binaryCachePublicKeys = [
  "hydra.iohk.io:f/Ea+s+dFdN+3Y/G+FDgSq+a5NEWhJGzdjvKNGv0/EQ="
];

If you haven't already, you can clone the Daedalus repo and specifically the 1.0.0 tagged commit:

$ git clone --branch 1.0.0 https://github.com/input-output-hk/daedalus.git

Once you've cloned the repo and checked you're on the 1.0.0 tagged commit, you can build Daedalus flight with the following command:

$ nix build -f . daedalus --argstr cluster mainnet_flight

Once the build completes, you're ready to launch Daedalus Flight:

$ ./result/bin/daedalus

To verify that you have in fact built Daedalus Flight, first head to the Daedalus menu then About Daedalus. You should see a title such as "DAEDALUS 1.0.0". The second check is to press [Ctrl]+d to access Daedalus Diagnostics and your Daedalus state directory should have mainnet_flight at the end of the path.

If you've got these, give yourself a pat on the back and grab yourself a refreshing bevvy while you wait for blocks to sync.

Daedalus FC1 screenshot

,

Andrew RuthvenInstall Fedora CoreOS using FAI

I've spent the last couple of days trying to deploy Fedora CoreOS to some physical hardware/bare metal for a colleague using the official PXE installer from Fedora CoreOS. It wasn't very pleasant, and just wouldn't work reliably.

Maybe my expectations were too high, in that I thought I could use Ignition to prepare more of the system for me, as my colleague has been able to do bare metal installs correctly. I just tried to use Ignition as documented.

A few interesting aspects I encountered:

  1. The PXE installer for it has a 618MB initrd file. This takes quite a while to transfer via tftp!
  2. It can't build software RAID for the main install device (and the developers have no intention of adding this), and it seems very finicky to build other RAID sets for other partitions.
  3. And, well, I just kept having problems where the built systems would hang during boot for no obvious reason.
  4. The time to do an installation was incredibly long.
  5. The initrd image is really just running coreos-installer against the nominated device.

During the night I got fed up with that process and wrote a Fully Automatic Installer (FAI) profile that'd install CoreOS instead. I can now use setup-storage from FAI using its standard disk_config files. This allows me to build complicated disk configurations with software RAID and LVM easily.

A big bonus is that a rebuild is a lot faster, timed from typing reboot to a fresh login prompt is 10 minutes - and this is on physical hardware so includes BIOS POST and RAID controller set up, twice each.

I thought this might be of interest to other people, so the FAI profile I developed for this is located here: https://github.com/catalyst-cloud/fai-profile-fedora-coreos

FAI was initially developed to deploy Debian systems; it has since been extended to be able to install a number of other operating systems. However, I think this is a good example of how easy it is to deploy non-Debian derived operating systems using FAI without having to modify FAI itself.

,

Chris SmartAccessing USB serial devices in Fedora Silverblue

One of the things I do a lot on my Fedora machines is talk to devices via USB serial. While a device is correctly detected at /dev/ttyUSB0 and owned by the dialout group, adding myself to that group doesn’t work because the group can’t be found. This is because under Silverblue, there are two different group files (/usr/lib/group and /etc/group) with different content.

There are some easy ways to solve this, for example we can create the matching dialout group or write a udev rule. Let’s take a look!

On the host with groups

If you try to add yourself to the dialout group it will fail.

sudo gpasswd -a ${USER} dialout
gpasswd: group 'dialout' does not exist in /etc/group

Trying to re-create the group will also fail as it’s already in use.

sudo groupadd dialout -r -g 18
groupadd: GID '18' already exists

So instead, we can simply grab the entry from the OS group file and add it to /etc/group ourselves.

grep ^dialout: /usr/lib/group |sudo tee -a /etc/group

Now we are able to add ourselves to the dialout group!

sudo gpasswd -a ${USER} dialout

Activate that group in our current shell.

newgrp dialout

And now we can use a tool like screen to talk to the device (note that you will need to have installed screen with rpm-ostree and rebooted first).

screen /dev/ttyUSB0 115200

And that’s it. We can now talk to USB serial devices on the host.

Inside a container with udev

Inside a container is a little more tricky as the dialout group is not passed into it. Thus, inside the container the device is owned by nobody and the user will have no permissions to read or write to it.

One way to deal with this and still use the regular toolbox command is to create a udev rule and make yourself the owner of the device on the host, instead of root.

To do this, we create a generic udev rule for all usb-serial devices.

cat << EOF | sudo tee /etc/udev/rules.d/50-usb-serial.rules
SUBSYSTEM=="tty", SUBSYSTEMS=="usb-serial", OWNER="${USER}"
EOF

If you need to create a more specific rule, you can find other bits to match by (like kernel driver, etc) with the udevadm command.

udevadm info -a -n /dev/ttyUSB0

Once you have your rule, reload udev.

sudo udevadm control --reload-rules
sudo udevadm trigger

Now, unplug your serial device and plug it back in. You should notice that it is now owned by your user.

ls -l /dev/ttyUSB0
crw-rw----. 1 csmart dialout 188, 0 Apr 18 20:53 /dev/ttyUSB0

It should also be the same inside the toolbox container now.

[21:03 csmart ~]$ toolbox enter
⬢[csmart@toolbox ~]$ ls -l /dev/ttyUSB0 
crw-rw----. 1 csmart nobody 188, 0 Apr 18 20:53 /dev/ttyUSB0

And of course, as this is inside a container, you can just dnf install screen or whatever other program you need.

Of course, if you’re happy to create the udev rule then you don’t need to worry about the groups solution on the host.

Chris SmartMaking dnf on Fedora Silverblue a little easier with bash aliases

Fedora Silverblue doesn’t come with dnf because it’s an immutable operating system and uses a special tool called rpm-ostree to layer packages on top instead.

Most terminal work is designed to be done in containers with toolbox, but I still do a bunch of work outside of a container. Searching for packages to install with rpm-ostree still requires dnf inside a container, as rpm-ostree does not have a search function.

I add these two aliases to my ~/.bashrc file so that using dnf to search or install into the default container is possible from a regular terminal. This just makes Silverblue a little bit more like what I’m used to with regular Fedora.

cat >> ~/.bashrc << EOF
alias sudo="sudo "
alias dnf="bash -c '#skip_sudo'; toolbox -y create 2>/dev/null; toolbox run sudo dnf"
EOF

If the default container doesn’t exist, toolbox creates it. Note that the alias for sudo has a space at the end. This tells bash to also check the next command word for alias expansion, which is what makes sudo work with aliases. Thus, we can make sure that both dnf and sudo dnf will work. The first part of the dnf alias is used to skip the sudo command so the rest is run as the regular user, which makes them both work the same.

We need to source that file or run a new bash session to pick up the aliases.

bash

Now we can just use dnf command like normal. Search can be used to find packages to install with rpm-ostree while installing packages will go into the default toolbox container (both with and without sudo are the same).

sudo dnf search vim
dnf install -y vim
The container is automatically created with dnf

To run vim from the example, enter the container and it will be there.

Vim in a container

You can do whatever you normally do with dnf, like install RPMs like RPMFusion and list repos.

Installing RPMFusion RPMs into container
Listing repositories in the container

Anyway, just a little thing but it’s kind of helpful to me.

,

Craige McWhirterCrisis Proofing the Australian Economy

An Open Letter to Prime Minister Scott Morrison

To The Hon Scott Morrison MP, Prime Minister,

No doubt how to re-invigorate our economy is high on your mind, among other priorities in this time of crisis.

As you're acutely aware, the pandemic we're experiencing has accelerated a long-term high unemployment trajectory we were already on due to industry retraction, automation, off-shoring jobs etc.

Now is the right time to enact changes that will bring long-term crisis resilience, economic stability and prosperity to this nation.

  1. Introduce a 1% tax on all financial / stock / commodity market transactions.
  2. Use 100% of that to fund a Universal Basic Income for all adult Australian citizens.

Funding a Universal Basic Income will bring:

  • Economic resilience in times of emergency (bushfire, drought, pandemic)
  • Removal of the need for government financial aid in those emergencies
  • Removal of all forms of pension and unemployment benefits
  • A more predictable, reduced and balanced government budget
  • Dignity and autonomy to those impacted by economic events / crises
  • Space and security for the innovative amongst us to take entrepreneurial risks
  • A growth in social, artistic and economic activity that could not happen otherwise

This is both simple to collect and simple to distribute to all taxpayers. It can be done both swiftly and sensibly, enabling you to remove the JobKeeper band-aid and its related budgetary problems.

This is an opportunity to be seized, Mr Morrison.

There is also a second opportunity.

Post World War II, we had the Snowy River scheme. Today we have a housing affordability crisis, and many Australians will never own their own home. A public building programme to provide 25% of housing would create a permanent employment and building boom and, over time, resolve the housing affordability crisis.

If you cap repayments for those in public housing to 25% of their income, there will also be more disposable income circulating through the economy, creating prosperous times for all Australians.

Carpe diem, Mr Morrison.

Recognise the opportunity. Seize it.


Dear Readers,

If you support either or both of these ideas, please contact the Prime Minister directly and add your voice.

,

Pia AndrewsA temporary return to Australia due to COVID-19

The last few months have been a rollercoaster, and we’ve just had to make another big decision that we thought we’d share.

TL;DR: we returned to Australia last night, hopeful to get back to Canada when we can. Currently in Sydney quarantine and doing fine.

UPDATE: please note that this isn’t at all a poor reflection on Canada. To the contrary, we have loved even the brief time we’ve had there, the wonderful hospitality and kindness shown by everyone, and the excellent public services there.

UPDATE 2: as 2020 crawled on, and it became clear we couldn’t return to Canada, we thought long and hard about where we wanted to live, because the little one needed to start school in 2021. So we decided to return to our adopted home in Wellington, New Zealand, through the critical worker visa and a government to government arrangement between Service Canada and MSD for mutual learning and collaboration on transformation social services in the era of COVID.

We moved to Ottawa, Canada at the end of February, for an incredible job opportunity with Service Canada which also presented a great life opportunity for the family. We enjoyed 2 “normal” weeks of settling in, with the first week dedicated to getting set up, and the second week spent establishing a work / school routine – me in the office, little A in school and T looking at work opportunities and running the household.

Then, almost overnight, everything went into COVID lock down. Businesses and schools closed. Community groups stopped meeting. People are being affected by this every day, so we have been very lucky to be largely fine and in good health, and we thought we could ride it out safely staying in Ottawa, even if we hadn’t quite had the opportunity to establish ourselves.

But then a few things happened which changed our minds – at least for now.

Firstly, with the schools shut down before the A had really had a chance to make friends (she only attended for 5 days before the school shut down), she was left feeling very isolated. The school is trying to stay connected with its students by providing a half hour video class each day, with a half hour activity in the afternoons, but it’s no way to help her to make new friends. A has only gotten to know the kids of one family in Ottawa, who are also in isolation but have been amazingly supportive (thanks Julie and family!), so we had to rely heavily on video playdates with cousins and friends in Australia, for which the timezone difference only allows a very narrow window of opportunity each day. With every passing day, the estimated school closures have gone from weeks, to months, to very likely the rest of the school year (with the new school year commencing in September). If she’d had just another week or two, she would have likely found a friend, so that was a pity. It’s also affected the availability of summer camps for kids, which we were relying on to help us with A through the 2 month summer holiday period (July & August).

Secondly, we checked our health cover and luckily the travel insurance we bought covered COVID conditions, but we were keen to get full public health cover. Usually for new arrivals there is a 3 month waiting period before this can be applied for. However, in response to the COVID threat the Ontario Government recently waived that waiting period for public health insurance, so we rushed to register. Unfortunately, the one service office that is able to process applications from non-Canandian citizens had closed by that stage due to COVID, with no re-opening being contemplated. We were informed that there is currently no alternative ability for non-citizens to apply online or over the phone.

Thirdly, the Australian Government has strongly encouraged all Australian citizens to return home, warning of the closing window for international travel. We became concerned we wouldn’t have full consulate support if something went wrong overseas. A good travel agent friend of ours told us the industry is preparing for a minimum of 6 months of international travel restrictions, which raised the very real issue that if anything went wrong for us, then neither could we get home, nor could family come to us. And, as we can now all appreciate, it’s probable that international travel disruptions and prohibitions will endure for much longer than 6 months.

Finally, we had a real scare. For context, we signed a lease for an apartment in a lovely part of central Ottawa, but we weren’t able to move in until early April, so we had to spend 5 weeks living in a hotel room. We did move into our new place just last Sunday and it was glorious to finally have a place, and for little A to finally have her own room, which she adored. Huge thanks to those who generously helped us make that move! The apartment is only 2 blocks away from A’s new school, which is incredibly convenient for us – it will be particularly good during the worst of Ottawa’s winter. But little A, who is now a very active and adventurous 4 year old, managed to face plant off her scooter (trying to bunnyhop down a stair!) and she knocked out a front tooth, on only the second day in the new place! She is ok, but we were all very, very lucky that it was a clean accident with the tooth coming out whole and no other significant damage. But we struggled to get any non-emergency medical support.

The Ottawa emergency dental service was directing us to a number that didn’t work. The phone health service was so busy that we were told we couldn’t even speak to a nurse for 24 hours. We could have called emergency services and gone to a hospital, which was comforting, but several Ottawa hospitals reported COVID outbreaks just that day, so we were nervous to do so. We ended up getting medical support from the dentist friend of a friend over text, but that was purely by chance. It was quite a wake up call as to the questions of what we would have done if it had been a really serious injury. We just don’t know the Ontario health system well enough, can’t get on the public system, and the pressure of escalating COVID cases clearly makes it all more complicated than usual.

If we’d had another month or two to establish ourselves, we think we might have been fine, and we know several ex-pats who are fine. But for us, with everything above, we felt too vulnerable to stay in Canada right now. If it was just Thomas and I it’d be a different matter.

So, we have left Ottawa and returned to Australia, with full intent to return to Canada when we can. As I write this, we are on day 2 of the 14 day mandatory isolation in Sydney. We were apprehensive about arriving in Sydney, knowing that we’d be put into mandatory quarantine, but the processing and screening of arrivals was done really well, professionally and with compassion. A special thank you to all the Sydney airport and Qatar Airways staff, immigration and medical officers, NSW Police, army soldiers and hotel staff who were all involved in the process. Each one acted with incredible professionalism and are a credit to their respective agencies. They’re also exposing themselves to the risk of COVID in order to help others. Amazing and brave people. A special thank you to Emma Rowan-Kelly who managed to find us these flights back amidst everything shutting down globally.

I will continue working remotely for Service Canada, on the redesign and implementation of a modern digital channel for government services. Every one of my team is working remotely now anyway, so this won’t be a significant issue apart from the timezone. I’ll essentially be a shift worker for this period. Our families are all self-isolating, to protect the grandparents and great-grandparents, so the Andrews family will be self-isolating in a location still to be confirmed. We will be traveling directly there once we are released from quarantine, but we’ll be contactable via email, fb, whatsapp, video, etc.

We are still committed to spending a few years in Canada, working, exploring and experiencing Canadian cultures, and will keep the place in Ottawa with the hope we can return there in the coming 6 months or so. We are very, very thankful for all the support we have had from work, colleagues, little A’s school, new friends there, as well as that of friends and family back in Australia.

Thank you all – and stay safe. This is a difficult time for everyone, and we all need to do our part and look after each other best we can.

,

Gary PendergastBebo, Betty, and Jaco

Wait, wasn’t WordPress 5.4 just released?

It absolutely was, and congratulations to everyone involved! Inspired by the fine work done to get another release out, I finally completed the last step of co-leading WordPress 5.0, 5.1, and 5.2 (Bebo, Betty, and Jaco, respectively).

My study now has a bit more jazz in it. 🙂

,

Robert CollinsStrength training from home

For the last year I’ve been incrementally moving away from lifting static weights and towards body weight based exercises, or callisthenics. I’ve been doing this for a number of reasons, including better avoidance of injury (if I collapse, the entire stack is dynamic; if a bar held above my head drops on me, most of the weight is just dead weight – ouch), accessibility during travel – most hotel gyms are very poor, and functional relevance – I literally never need to put 100 kg on my back, but I do climb stairs, for instance.

Covid-19 shutting down the gym where I train is a mild inconvenience for me as a result, because even though I don’t do it, I am able to do nearly all my workouts entirely from home. And I thought a post about this approach might be of interest to other folk newly separated from their training facilities.

I’ve gotten most of my information from a few different youtube channels:

There are many more channels out there, and I encourage you to go and look and read and find out what works for you. Those 5 are my greatest hits, if you will. I’ve bought the FitnessFAQs exercise programs to help me with my training, and they are indeed very effective.

While you don’t need a gymnasium, you do need some equipment, particularly if you can’t go and use a local park. Exactly what you need will depend on what you choose to do – for instance, doing dips on the edge of a chair can avoid needing any equipment, but doing them with some portable parallel bars can be much easier. Similarly, doing pull ups on the edge of a door frame is doable, but doing them with a pull-up bar is much nicer on your fingers.

Depending on your existing strength you may not need bands, but I certainly did. Buying rings is optional – I love them, but they aren’t needed to have a good solid workout.

I bought parallettes for working on the planche. Parallel bars for dips and rows. A pull-up bar for pull-ups and chin-ups, though with the rings you can add flys, rows, face-pulls, unstable push-ups and more. The rings. And a set of 3 bands that combine for 7 different support amounts.

In terms of routine, I do a upper/lower split, with 3 days on upper body, one day off, one day on lower, and the weekends off entirely. I was doing 2 days on lower body, but found I was over-training with Aikido later that same day.

On upper body days I’ll do (roughly) chin ups or pull ups, push ups, rows, dips, hollow body and arch body holds, handstands and some grip work. Today, as I write this on Sunday evening, 2 days after my last training day on Friday, I can still feel my lats and biceps from training Friday afternoon. Zero issue keeping the intensity up.

For lower body, I’ll do pistol squats, nordic drops, quad extensions, wall sits, single leg calf raises, bent leg calf raises. Again, zero issues hitting enough intensity to achieve growth / strength increases. The only issue at home is having a stable enough step to get a good heel drop for the calf raises.

If you haven’t done bodyweight training at all before, when starting, don’t assume it will be easy – even if you’re a gym junkie, our bodies are surprisingly heavy, and there’s a lot of resistance just moving them around.

Good luck, train well!

OpenSTEMOnline Teaching

The OpenSTEM® materials are ideally suited to online teaching. In these times of new challenges and requirements, there are a lot of technological possibilities. Schools and teachers are increasingly being asked to deliver material online to students. Our materials can assist with that process, especially for Humanities and Science subjects from Prep/Kindy/Foundation to Year 6. […]


Brendan ScottCovid 19 Numbers – lag

Recording some thoughts about Covid 19 numbers.

Today’s figures

The Government says:

“As at 6.30am on 22 March 2020, there have been 1,098 confirmed cases of COVID-19 in Australia”.

The reference is https://www.health.gov.au/news/health-alerts/novel-coronavirus-2019-ncov-health-alert/coronavirus-covid-19-current-situation-and-case-numbers. However, that page is updated daily (ish), so don’t expect it to be the same if you check the reference.

Estimating Lag

If a person tests positive to the virus today, that means they were infected at some time in the past. So, what is the lag between infection and a positive test result?

Incubation Lag – about 5 days

When you are infected you don’t show symptoms immediately. Rather, there’s an incubation period before symptoms become apparent. The time between being infected and developing symptoms varies from person to person, but most of the time a person shows symptoms after about 5 days (I recall seeing somewhere that 1 in 1,000 cases will develop symptoms after 14 days).

Presentation Lag – about 2 days

I think it’s fair to also assume that people are not presenting at testing immediately they become ill. It is probably taking them a couple of days from developing symptoms to actually get to the doctor – I read a story somewhere (have since lost the reference) about a young man who went to a party, then felt bad for days but didn’t go for a test until someone else from the party had returned a positive test.  Let’s assume there’s a mix of worried well and stoic types and call it 2 days from becoming symptomatic to seeking a test.

Referral Lag – about a day

Assuming that a GP is available straight away and recommends a test immediately, logistically there will still be most of a day taken up between deciding to see a doctor and having a test carried out.

Testing lag – about 2 days

The graph of infections (the “epi graph”) today looks like this:

[Chart: new and cumulative COVID-19 cases in Australia by notification date, as at 22 March 2020]

One thing you notice about the graph is that the new cases bars seem to increase for a couple of days, then decrease – so about 100 new cases in the last 24 hours, but almost 200 in the 24 hours before that. From the graph, the last 3 “dips” have been today (Sunday), last Thursday and last Sunday.  This seems to be happening every 3 to 4 days. I initially thought that the dips might mean fewer (or more) people presenting over weekends, but the period is inconsistent with that. I suspect, instead, that this actually means that testing is being batched.

That would mean that neither the peaks nor troughs is representative of infection surges/retreats, but is simply reflecting when tests are being processed. This seems to be a 4 day cycle, so, on average it seems that it would be about 2 days between having the test conducted and receiving a result. So a confirmed case count published today is actually showing confirmed cases as at about 2 days earlier.

Total lag

From the date someone is infected to the time that they receive a positive confirmation is about:

lag = time for symptoms to show + time to seek a test + referral time + time for the test to return a result

So, the published figures on confirmed infections are probably lagging actual infections in the community by about 10 days (5+2+1+2).

If there’s about a 10 day lag between infection and confirmation, then what a figure published today says is that about a week and a half ago there were about this many cases in the community.  So, the 22 March figure of 1098 infections is actually really a 12 March figure.

What the lag means for Physical (ie Social) Distancing

The main thing that the lag means is that if we were able to wave a magic wand today and stop all further infections, we would continue to record new infections for about 10 days (and the tail for longer). In practical terms, implementing physical distancing measures will not show any effect on new cases for about a week and a half. That’s because today there are infected people who are yet to be tested.

The silver lining to that is that the physical distancing measures that have been gaining prominence since 15 March should start to show up in the daily case numbers from the middle of the coming week, possibly offset by overseas entrants rushing to make the 20 March entry deadline.

Estimating Actual Infections as at Today

How many people are infected, but unconfirmed as at today? To estimate actual infections you’d need to have some idea of the rate at which infections are increasing. For example, if infections increased by 10% per day for 10 days, then you’d multiply the most recent figure by 1.1 raised to the power of 10 (ie about 2.5).  Unfortunately, the daily rate of increase (see table on the wiki page) has varied a fair bit (from 20% to 27%) over the most recent 10 days of data (that is, over the 10 days prior to 12 March, since the 22 March figures roughly correspond to 12 March infections) and there’s no guarantee that since that time the daily increase in infections will have remained stable, particularly in light of the implementation of physical distancing measures. At 23.5% per day, the factor is about 8.
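To make that compounding explicit, here is a small worked example in Python (my own illustration, not from the original post) showing where the roughly 2.5 and 8 multipliers come from, and what an 8x factor would imply if applied to the published 1,098 figure:

# Worked example: growth multipliers over the ~10 day lag period.
lag_days = 10

factor_10pct = 1.10 ** lag_days       # ~2.6 ("about 2.5" in the text)
factor_observed = 1.235 ** lag_days   # ~8.3, at 23.5% growth per day

confirmed_22_march = 1098             # published figure for 22 March
estimate = confirmed_22_march * factor_observed

print(f"factor at 10%/day over {lag_days} days: {factor_10pct:.1f}")
print(f"factor at 23.5%/day over {lag_days} days: {factor_observed:.1f}")
print(f"rough implied infections as at 22 March: {estimate:,.0f}")

As the following paragraph notes, the growth rate during the lag period is itself uncertain, so this is an upper-end illustration rather than an estimate.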

There aren’t any reliable figures we can use to estimate the rate of infection during the current lag period (ie from 12 March to 22 March). This is because the vast majority of cases have not been from unexplained community transmission. Most of the cases are from people who have been overseas in the previous fortnight and they’re the cohort that has been most significantly impacted by recent physical distancing measures. From 15 March, they have been required to self isolate and from 20 March most of their entry into the country has stopped.  So I’d expect a surge in numbers up to about 30 March – ie reflecting infections in the cohort of people rushing to get into the country before the borders closed followed by a flattening. With the lag factor above, you’ll need to wait until 1 April or thereabouts to know for sure.

Note:

This post is just about accounting for the time lag between becoming infected and receiving a positive test result. It assumes, for example, that everyone who is infected seeks a test, and that everyone who is infected and seeks a test is, in fact, tested. As at today, neither of these things is true.

,

OpenSTEMCOVID-19 (of course)

We thought it timely to review a few facts and observations, relying on published medical papers (or those submitted for peer review) and reliable sources.


,

Ben MartinTerry2020 finally making the indoor beast more stable

Over time the old Terry robot had evolved from a basic "T" shape to have pan and tilt and a robot arm on board. The rear caster(s) were the weakest part of the robot, enabling the whole thing to rock around more than it should. I now have Terry 2020 on the cards.


Part of this is an upgrade to a Kinect2 for navigation. The power requirements of that (12v/3a or so) have led me to put a better dc-dc bus on board and some relays to be able to programmatically shut down and bring up features as needed and conserve power otherwise. The new base footprint is 300x400mm though the drive wheels stick out the side.

The wheels sticking out the sides are partially due to the planetary gear motors (on the under side) being quite long. If it is an issue I can recut the lowest layer alloy and move them inward, but I don't really need the absolute minimal turning circle. If that were the case I would move the drive wheels to the middle of the chassis so it could turn on its center.

There will be 4 layers at the moment and a mezzanine below the arm. So there will be expansion room included in the build :)

The rebuild will allow Terry to move at top speed when self driving. Terry will never move at the speed of an outdoor robot but can move closer to its potential when it rolls again.

,

Ben MartinBidirectional rc joystick

With a bit of tinkering one can use the https://github.com/bmellink/IBusBM library to send information back to the remote controller. The info is tagged as either temperature, rpm, or voltage, and the units are set based on that. There is a limit of 9 user feedback slots so I have 3 of each exposed.


To do this I used one of the Mega 2560 boards that is in a small form factor configuration. This gave me 5 volts to run the actual rc receiver from and more than one UART to talk to the usb, input and output parts of the buses. I think you only need 2 UARTs but as I had a bunch I just used separate ones.

The 2560 also gives a lavish amount of ram so using ROS topics doesn't really matter. I have 9 subscribers and 1 publisher on the 2560. The 9 subscribers allow sending temp, voltage and rpm info back to the remote, and give flexibility in what is sent so that can be adjusted on the robot itself.

I used a servo extension cable to carry the base 5v, ground, and rx signals from the ibus out on the rc receiver unit. Handy as the servo plug ends can be taped together for the more bumpy environment that the hound likes to tackle. I wound up putting the diode floating between two extension wires on the (to tx) side of the bus.



The 1 publisher just sends an array with the raw RC values in it. With minimal delays I can get a reasonably steady 120hz publication of rc values. So now the houndbot can tell me when it is getting hungry for more fresh electrons from a great distance!
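As a hedged sketch of the other end of that link (my own illustration, not from the post), a small rospy node on the robot's main computer could consume that array; the topic name rc/channels and the UInt16MultiArray message type are assumptions, as is which channel maps to what.

#!/usr/bin/env python
# Hedged sketch: listen to the raw RC channel array published by the 2560.
import rospy
from std_msgs.msg import UInt16MultiArray

def on_channels(msg):
    # iBUS channels carry 1000-2000 style values; treating channel 0 as
    # steering and channel 1 as throttle is an assumption for illustration.
    steering, throttle = msg.data[0], msg.data[1]
    rospy.loginfo_throttle(1.0, "steering=%d throttle=%d", steering, throttle)

if __name__ == "__main__":
    rospy.init_node("rc_listener")
    rospy.Subscriber("rc/channels", UInt16MultiArray, on_channels)
    rospy.spin()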

I had had some problems with the nano and the rc unit locking up. I think perhaps this was due to crystals as the uno worked ok. The 2560 board has been bench tested for 30 minutes which was enough time to expose the issues on the nano.