2026-03-03T09:08:49+02:00 foo.zone feed To be in the .zone! https://foo.zone/ RCM: The Ruby Configuration Management DSL https://foo.zone/gemfeed/2026-03-02-rcm-ruby-configuration-management-dsl.html 2026-03-02T00:00:00+02:00 Paul Buetow aka snonux paul@dev.buetow.org RCM is a tiny configuration management system written in Ruby. It gives me a small DSL for describing how I want my machines to look, then it applies the changes: create files and directories, manage packages, and make sure certain lines exist in configuration files. It's deliberately KISS and optimised for a single person's machines instead of a whole fleet.

RCM: The Ruby Configuration Management DSL



Published at 2026-03-02T00:00:00+02:00

RCM is a tiny configuration management system written in Ruby. It gives me a small DSL for describing how I want my machines to look, then it applies the changes: create files and directories, manage packages, and make sure certain lines exist in configuration files. It's deliberately KISS and optimised for a single person's machines instead of a whole fleet.

RCM DSL in action

Table of Contents




Why I built RCM



I've used (and still use) the usual suspects in configuration management: Puppet, Ansible, etc. They are powerful, but also come with orchestration layers, agents, inventories, and a lot of moving parts. For my personal machines I wanted something smaller: one Ruby process, one configuration file, a few resource types, and good enough safety features.

I've always been a fan of Ruby's metaprogramming features, and this project let me explore them in a focused, practical way.

Because of that metaprogramming support, Ruby is a great fit for DSLs. You can get very close to natural language without inventing a brand-new syntax. RCM leans into that: the goal is to read a configuration and understand what happens without jumping between multiple files or templating languages.

RCM repo on Codeberg

How the DSL feels



An RCM configuration starts with a configure block. Inside it you use DSL words such as file, package, given, and notify. RCM figures out dependencies between resources and runs them in the right order.

configure do
  given { hostname is :earth }

  file '/tmp/test/wg0.conf' do
    requires file '/etc/hosts.test'
    manage directory
    from template
    'content with <%= 1 + 2 %>'
  end

  file '/etc/hosts.test' do
    line '192.168.1.101 earth'
  end
end

Running it looks like this:

% sudo ruby example.rb
INFO 20260301-213817 dsl(0) => Configuring...
INFO 20260301-213817 file('/tmp/test/wg0.conf') => Registered dependency on file('/etc/hosts.test')
INFO 20260301-213817 file('/tmp/test/wg0.conf') => Evaluating...
INFO 20260301-213817 file('/etc/hosts.test') => Evaluating...
INFO 20260301-213817 file('/etc/hosts.test') => Writing file /etc/hosts.test
INFO 20260301-213817 file('/tmp/test/wg0.conf') => Creating parent directory /tmp/test
INFO 20260301-213817 file('/tmp/test/wg0.conf') => Writing file /tmp/test/wg0.conf

The idea is that you describe the desired state and RCM worries about the steps. The given block can short‑circuit the whole run (for example, only run on a specific hostname). Each file resource can either manage a complete file (from a template) or just make sure individual lines are present.
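
One plausible way to get that short-circuit behaviour in plain Ruby (an assumption for illustration, not necessarily RCM's actual mechanism) is throw/catch:

```ruby
# Hypothetical sketch: a `given` guard that aborts the configure run
# when its condition is false.
def given
  throw :skip_run unless yield
end

def configure(&block)
  # catch returns early if `given` throws :skip_run inside the block.
  catch(:skip_run, &block)
end

ran = false
configure do
  given { false }  # condition fails: the rest of the block is skipped
  ran = true
end
ran  # => false
```

With given { true } the rest of the block runs normally; the throw simply never fires.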

Keywords and resources



Under the hood, each DSL word is either a keyword or a resource:

  • Keyword is the base class for all top‑level DSL constructs.
  • Resource is the base class for things RCM can manage (files, packages, and so on).

Resources can declare dependencies with requires. Before a resource runs, RCM makes sure all its requirements are satisfied and only evaluates each resource once per run. This keeps the mental model simple even when you compose more complex configurations.
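
The evaluate-once, dependencies-first behaviour can be sketched in a few lines of Ruby (class and method names here are illustrative, not RCM's actual internals):

```ruby
# Illustrative sketch (not RCM's real classes): evaluate each resource
# at most once, and evaluate its requirements first.
class Resource
  attr_reader :name, :requires

  def initialize(name, requires: [], &action)
    @name, @requires, @action = name, requires, action
    @evaluated = false
  end

  def evaluate(registry)
    return if @evaluated            # run-once guard
    @evaluated = true
    @requires.each { |dep| registry.fetch(dep).evaluate(registry) }
    @action&.call
  end
end

order = []
registry = {
  hosts: Resource.new(:hosts) { order << :hosts },
  wg0:   Resource.new(:wg0, requires: [:hosts]) { order << :wg0 }
}
registry.each_value { |r| r.evaluate(registry) }
order  # => [:hosts, :wg0]
```

The run-once guard also means a resource required by several others is still only evaluated a single time per run.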

Files, directories, and templates



The file resource handles three common cases:

  • Managing parent directories (manage directory) so you don't have to create them manually.
  • Rendering ERB templates (from template) so you can mix Ruby expressions into config files.
  • Ensuring individual lines exist (line) for the many "append this line if missing" situations.
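
The template case is essentially Ruby's standard ERB library at work; a minimal sketch:

```ruby
require 'erb'

# Render embedded Ruby expressions inside a template string.
template = 'One plus two is <%= 1 + 2 %>!'
rendered = ERB.new(template).result
rendered  # => "One plus two is 3!"
```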

Every write operation creates a backup copy in .rcmbackup/, so you can always inspect what changed and roll back manually if needed.
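
A minimal sketch of the line-plus-backup behaviour (ensure_line is a hypothetical helper, not RCM's implementation):

```ruby
require 'fileutils'

# Hypothetical helper (not RCM's implementation): append a line to a
# file only if it is missing, backing the file up to .rcmbackup/ first.
def ensure_line(path, line, backup_dir: '.rcmbackup')
  content = File.exist?(path) ? File.read(path) : ''
  return if content.lines.map(&:chomp).include?(line)  # already there

  if File.exist?(path)
    FileUtils.mkdir_p(backup_dir)
    FileUtils.cp(path, File.join(backup_dir, File.basename(path)))
  end

  content << "\n" unless content.empty? || content.end_with?("\n")
  File.write(path, content + line + "\n")
end

# First call writes the line; a second call is a no-op.
ensure_line('./.line_example.rcmtmp', '192.168.1.101 earth')
```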

How Ruby's metaprogramming helps



The nice thing about RCM is that the Ruby code you write in your configuration is not that different from the Ruby code inside RCM itself. The DSL is just a thin layer on top.

For example, when you write:

file '/etc/hosts.test' do
  line '192.168.1.101 earth'
end

Ruby turns file into a method call and '/etc/hosts.test' into a normal argument. Inside RCM, that method builds a File resource object and stores it for later. The block you pass is just a Ruby block; RCM calls it with the file resource as self, so method calls like line configure that resource. There is no special parser here, just plain Ruby method and block dispatch.
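
That mechanism can be shown with a few lines of plain Ruby; the class below is an illustrative stand-in, not RCM's real file resource:

```ruby
# Illustrative stand-in for RCM's file resource, showing the block
# dispatch: the block runs with the resource as `self`.
class FileResource
  attr_reader :path, :lines

  def initialize(path)
    @path  = path
    @lines = []
  end

  # DSL word: record a line that must exist in the file.
  def line(text)
    @lines << text
  end
end

# Top-level DSL word: build the resource, then evaluate the block
# against it so calls like `line '...'` configure the new object.
def file(path, &block)
  resource = FileResource.new(path)
  resource.instance_eval(&block) if block
  resource
end

r = file '/etc/hosts.test' do
  line '192.168.1.101 earth'
end
r.lines  # => ["192.168.1.101 earth"]
```

instance_eval is what changes the meaning of self inside the block, which is why bare calls like line '...' land on the resource object.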

The same goes for constructs like:

given { hostname is :earth }

RCM uses Ruby's dynamic method lookup to interpret hostname and is in that block and to decide whether the rest of the configuration should run at all. Features like method_missing, blocks, and the ability to change what self means in a block make this kind of DSL possible with very little code. You still get all the power of Ruby (conditionals, loops, helper methods), but the surface reads like a small language of its own.

A bit more about method_missing



method_missing is one of the key tools that make the RCM DSL feel natural. In plain Ruby, if you call a method that does not exist, you get a NoMethodError. But before Ruby raises that error, it checks whether the object implements method_missing. If it does, Ruby calls that instead and lets the object decide what to do.

In RCM, you can write things like:

given { hostname is :earth }

Inside that block, calls such as hostname and is don't map to normal Ruby methods. Instead, RCM's DSL objects see those calls in method_missing, and interpret them as "check the current hostname" and "compare it to this symbol". This lets the DSL stay small and flexible: adding a new keyword can be as simple as handling another case in method_missing, without changing the Ruby syntax at all.

Put differently: you can write what looks like a tiny English sentence (hostname is :earth) and Ruby breaks it into method calls (hostname, then is) that RCM can interpret dynamically. Those "barewords" are not special syntax; they are just regular Ruby method names that the DSL catches and turns into configuration logic at runtime.

Here's a simplified sketch of how such a condition object could look in Ruby:

require 'socket'

class HostCondition
  def initialize
    @current_hostname = Socket.gethostname.to_sym
  end

  def method_missing(name, *args, &block)
    case name
    when :hostname
      @left = @current_hostname
      self               # allow chaining: hostname is :earth
    when :is
      @left == args.first
    else
      super
    end
  end

  # Keep respond_to? truthful for the DSL words handled above.
  def respond_to_missing?(name, include_private = false)
    %i[hostname is].include?(name) || super
  end
end

HostCondition.new.hostname.is(:earth)

RCM's real code is more sophisticated, but the idea is the same: Ruby happily calls method_missing for unknown methods like hostname and is, and the DSL turns those calls into a value (true/false) that decides whether the rest of the configuration should run.

Ruby metaprogramming: further reading



If you want to dive deeper into the ideas behind RCM's DSL, these books are great starting points:

  • "Metaprogramming Ruby 2" by Paolo Perrotta
  • "The Well-Grounded Rubyist" by David A. Black (and others)
  • "Eloquent Ruby" by Russ Olsen

They all cover Ruby's object model, blocks, method_missing, and other metaprogramming techniques in much more detail than I can in a single blog post.

Safety, dry runs, and debugging



RCM has a --dry mode: it logs what it would do without actually touching the file system. I use this when iterating on new configurations or refactoring existing ones. Combined with the built‑in logging and debug output, it's straightforward to see which resources were scheduled and in which order.

Because RCM is just Ruby, there's no separate agent protocol or daemon. The same process parses the DSL, resolves dependencies, and performs the actions. If something goes wrong, you can drop into the code, add a quick debug statement, and re‑run your configuration.

RCM vs Puppet and other big tools



RCM does not try to compete with Puppet, Chef, or Ansible on scale. Those tools shine when you manage hundreds or thousands of machines, have multiple teams contributing modules, and need centralised orchestration, reporting, and role‑based access control. They also come with their own DSLs, servers/agents, certificate handling, and a long list of resource types and modules. Of the three, Ansible is closest in spirit to RCM, but it is still far more complex.

For my personal use cases, that layer is mostly overhead. I want:

  • No extra daemon, message bus, or master node.
  • No separate DSL to learn besides Ruby itself.
  • A codebase small enough that I can understand and change all of it in an evening.
  • Behaviour I can inspect just by reading the Ruby code.

In that space RCM wins: it is small, transparent, and tuned for one person (me!) with a handful of personal machines and laptops. I still think tools like Puppet are the right choice for larger organisations and shared infrastructure, but RCM gives me a tiny, focused alternative for my own systems.

Cutting RCM 0.1.0



As of this post I'm tagging and releasing RCM 0.1.0. About 99% of the code has been written by me so far, and before AI agents take over more of the boilerplate and wiring work, it felt like a good moment to cut a release and mark this mostly‑human baseline.

Future changes will very likely involve more automated help, but 0.1.0 is the snapshot of the original, hand‑crafted version of the tool.

What's next



RCM already does what I need on my machines, but there are a few ideas I want to explore:

  • More resource types (for example, services and users) while keeping the core small.
  • Additional package backends beyond Fedora/DNF (in particular Homebrew on macOS).
  • Managing hosts remotely.
  • A slightly more structured way to organise larger configurations without losing the KISS spirit.

Feature overview (for now)



Here is a quick overview of what RCM can do today, grouped by area:

  • File management: file '/path', manage directory, from template, line '...'
  • Packages: package 'name' resources for installing and updating packages (currently focused on Fedora/DNF)
  • Conditions and flow: given { ... } blocks, predicates such as hostname is :earth
  • Notifications and dependencies: requires between resources, notify for follow‑up actions
  • Safety and execution modes: backups in .rcmbackup/, --dry runs, debug logging

Some small examples adapted from RCM's own tests:

Template rendering into a file



configure do
  file './.file_example.rcmtmp' do
    from template
    'One plus two is <%= 1 + 2 %>!'
  end
end

Ensuring a line is absent from a file



configure do
  file './.file_example.rcmtmp' do
    line 'Whats up?'
    is absent
  end
end

Guarding a configuration run on the current hostname



configure do
  given { hostname Socket.gethostname }
  ...
end

Creating and deleting directories, and purging a directory tree



configure do
  directory './.directory_example.rcmtmp' do
    is present
  end

  directory delete do
    path './.directory_example.rcmtmp'
    is absent
  end
end

Managing file and directory modes and ownership



configure do
  touch './.mode_example.rcmtmp' do
    mode 0o600
  end

  directory './.mode_example_dir.rcmtmp' do
    mode 0o705
  end
end

Using a chained, more natural language style for notifications



This example just prints a message; it doesn't change anything:

configure do
  notify hello dear world do
    thank you to be part of you
  end
end

Touching files and updating their timestamps



configure do
  touch './.touch_example.rcmtmp'
end

Expressing dependencies between notifications



configure do
  notify foo do
    requires notify bar and requires notify baz
    'foo_message'
  end

  notify bar

  notify baz do
    requires notify bar
    'baz_message'
  end
end

Creating a symlink


configure do
  symlink './.symlink_example.rcmtmp' do
    manage directory
    './.symlink_target_example.rcmtmp'
  end
end

Detecting duplicate resource definitions at configure time



configure do
  notify :foo
  notify :foo # raises RCM::DSL::DuplicateResource
end

If you find RCM interesting, feel free to browse the code, adapt it to your own setup, or just steal ideas for your own Ruby DSLs. I will probably extend it with more features over time as my own needs evolve.

E-Mail your comments to paul@nospam.buetow.org :-)

Other related posts:

2026-03-02 RCM: The Ruby Configuration Management DSL (You are currently reading this)
2025-10-11 Key Takeaways from The Well-Grounded Rubyist
2021-07-04 The Well-Grounded Rubyist
2016-04-09 Jails and ZFS with Puppet on FreeBSD

Back to the main site
Site Reliability Engineering - Part 5: System Design, Incidents, and Learning https://foo.zone/gemfeed/2026-03-01-site-reliability-engineering-part-5.html 2026-03-01T12:00:00+02:00 Paul Buetow aka snonux paul@dev.buetow.org Welcome to Part 5 of my Site Reliability Engineering (SRE) series. I'm currently working as a Site Reliability Engineer, and I'm here to share what SRE is all about in this blog series.

Site Reliability Engineering - Part 5: System Design, Incidents, and Learning



Published at 2026-03-01T12:00:00+02:00

Welcome to Part 5 of my Site Reliability Engineering (SRE) series. I'm currently working as a Site Reliability Engineer, and I'm here to share what SRE is all about in this blog series.

2023-08-18 Site Reliability Engineering - Part 1: SRE and Organizational Culture
2023-11-19 Site Reliability Engineering - Part 2: Operational Balance
2024-01-09 Site Reliability Engineering - Part 3: On-Call Culture
2024-09-07 Site Reliability Engineering - Part 4: Onboarding for On-Call Engineers
2026-03-01 Site Reliability Engineering - Part 5: System Design, Incidents, and Learning (You are currently reading this)

    ___
   /   \     resilience
  |  o  |  <----------  learning
   \___/

This time I want to share some themes that build on what we've already covered: how system design and incident analysis fit together, why observability should not be an afterthought, and how a design‑improvement loop keeps systems getting better. Let's dive in!

Table of Contents




System Design and Incident Analysis



A big chunk of SRE work revolves around system design and incident analysis. What separates a well-designed system from a mediocre one is its ability to minimise and contain cascading failures. Unchecked, those can spiral into global outages.

Resilience and cascading failures



There's a growing emphasis on building resilient systems so that when something fails, the blast radius stays small. That resilience needs to be baked in at design time: we identify weak points and address them before production. The goal is to keep services dependable and uninterrupted.

Learning from incidents



When incidents do happen, their analysis is a goldmine. Every incident exposes gaps—whether in tooling (ops tools that aren't up to the job) or in skills (engineers missing critical know-how). Blaming "human error" doesn't help. The job is to dig into root causes and fix the system. Postmortems that focus on customer impact help us distil lessons and make the system more robust so we're less likely to repeat the same failure.

System design and incident analysis form a feedback loop: we improve the design based on what we learn from incidents, and a better design reduces the impact of the next one.

Observability: Don't leave it for when it's too late



Product and features often get the spotlight; observability is often an afterthought. Teams agree that "we need better observability" when they're already in the middle of an incident—and by then it's too late. Good observability needs to be in place before things go wrong. Tools that can query high-cardinality data and give granular insight into system behaviour are what let us diagnose problems quickly when chaos hits. So invest in observability early. When the next incident happens, you'll be glad you did.

The iterative spirit



We also accept that system design is never "done." We refine it based on real-world performance, incident learnings, and changing needs. Every incident is a chance to learn and improve; the emphasis is on learning, not blame. SREs work with developers, backend teams, and incident response so that the whole system keeps getting better. Perfection is a journey, not a destination.

Book tips



If you want to go deeper, here are a few books I can recommend:

  • 97 Things Every SRE Should Know: Collective Wisdom from the Experts by Emil Stolarsky and Jaime Woo
  • Site Reliability Engineering: How Google Runs Production Systems by Jennifer Petoff, Niall Murphy, Betsy Beyer, and Chris Jones
  • Implementing Service Level Objectives by Alex Hidalgo

E-Mail your comments to paul@nospam.buetow.org :-)

Back to the main site
Loadbars 0.13.0 released https://foo.zone/gemfeed/2026-03-01-loadbars-0.13.0-released.html 2026-03-01T00:00:00+02:00 Paul Buetow aka snonux paul@dev.buetow.org Loadbars is a real-time server load monitoring tool. It connects to one or more Linux hosts via SSH and shows CPU, memory, network, load average, and disk I/O as vertical colored bars in an SDL window. You can run it locally or point it at your servers and see what's happening right now — like `top` or `vmstat`, but visual and across multiple hosts at once.

Loadbars 0.13.0 released



Published at 2026-03-01T00:00:00+02:00

Loadbars is a real-time server load monitoring tool. It connects to one or more Linux hosts via SSH and shows CPU, memory, network, load average, and disk I/O as vertical colored bars in an SDL window. You can run it locally or point it at your servers and see what's happening right now — like top or vmstat, but visual and across multiple hosts at once.

Loadbars in action

Loadbars can connect to hundreds of servers in parallel; the GIF above doesn't do it justice — at scale you get a wall of bars that makes it easy to spot outliers and compare hosts at a glance.

Loadbars on Codeberg

Table of Contents




What Loadbars is (and isn't)



Loadbars shows the current state only. It is not a tool for collecting loads and drawing graphs for later analysis. There is no history, no recording, no database. Tools like Prometheus or Grafana require significant setup before producing results. Loadbars lets you observe the current state immediately: one binary, SSH (or local), and you're done.

┌─ Loadbars 0.13.0 ─────────────────────────────────────────┐
│                                                           │
│  ████  ████  ████  ██  ████  ████  ████  ██  ░░██  ░░██   │
│  ████  ████  ████  ██  ████  ████  ████  ██  ░░██  ░░██   │
│  ████  ████  ████  ██  ████  ████  ████  ██  ░░██  ░░██   │
│   CPU   cpu0  cpu1  mem  CPU   cpu0  cpu1  mem  net   net │
│  └──── host1 ────┘      └──── host2 ────┘                 │
└───────────────────────────────────────────────────────────┘

Use cases



  • Deployments and rollouts: watch CPU, memory, and network across app servers or nodes while you deploy. Spot the one that isn't coming up or is stuck under load.
  • Load testing: run your load tool against a cluster and see which hosts (or cores) are saturated, whether memory or disk I/O is the bottleneck, and how load spreads.
  • Quick health sweep: no dashboards set up yet? SSH to a handful of hosts and run Loadbars. You get an instant picture of who's busy, who's idle, and who's swapping.
  • Comparing hosts: side-by-side bars make it easy to see if one machine is hotter than the rest (e.g. after a config change or migration).
  • Local tuning: run loadbars --hosts localhost while you benchmark or stress a single box; the bars and load-average view help correlate activity with what you're doing.

What's new since the Perl version



The original Loadbars (Perl + SDL, ~2010–2013) had CPU, memory, network, ClusterSSH, and a config file. The Go rewrite and subsequent releases added the following features, with a note on why each one matters:

  • Load average bars: the Perl version had no load average. Now you get 1/5/15-minute load per host. Useful because load average is the classic "how queued is this box" signal — you see saturation and trends at a glance without reading numbers.

  • Disk I/O bars: disk was invisible in the Perl version. You now get read/write throughput (and optionally utilization %) per host or per device. Whole-disk devices only (partitions, loop, ram, zram, and device-mapper are excluded). Useful when you need to tell "is this slow because of CPU or because of disk?" — especially with many hosts, one disk-heavy host stands out. Disk smoothing (config diskaverage, hotkeys b/x) lets you tune how much the bars are averaged.

  • Extended peak line on CPU: a 1px line shows max system+user over the last N samples. Useful to see short spikes that the stacked bar might smooth out, so you don't miss bursty load.

  • Tooltips and host highlight: hover the mouse over any bar to see a tooltip with exact values (CPU %, memory, network, load, or disk depending on bar type). The hovered host's bars are highlighted (inverted) so you can tell which host you're over. Useful when you have hundreds of bars and want to read a specific number or confirm which host a bar belongs to.

  • GuestNice in CPU bars: CPU bars now show GuestNice as a lime green segment (above Nice). One more breakdown for virtualized or container workloads.

  • Version in window title: the default SDL title is "Loadbars <version> (press h for help on stdout)". Override with --title when you need a custom label.

  • Global average CPU line (key g): a single red line across all hosts at the fleet-average CPU. Useful when you have hundreds of bars: you instantly see which hosts are above or below average without comparing bar heights in your head.

  • Global I/O average line (key i): same idea for iowait+IRQ. Useful to spot which hosts are waiting on I/O more than the rest — quick way to find the disk-bound or interrupt-heavy machines.

  • Host separator lines (key s): a thin red vertical line between each host's bars. Useful at scale so you don't lose track of where one host ends and the next begins when the window is full of bars.

  • Scale reset (key r): reset the auto-scale for load and disk back to the floor. Useful after a big spike so the bars don't stay compressed for the rest of the session.

  • Toggle CPU off (key 1 cycles through aggregate → per-core → off): the Perl version didn't let you turn CPU bars off. Useful when you want to focus only on memory, network, load, or disk and reduce clutter.

  • maxbarsperrow: wrap bars into multiple rows instead of one long row. Useful with many hosts so the window doesn't become impossibly wide; you get a grid and can still scan everything.

  • maxwidth: cap on window width in pixels (default 1900). Stops the window growing unbounded with many hosts; use together with maxbarsperrow for a predictable layout.

  • Startup visibility flags: --showmem, --shownet, --showload, --extended, --cpumode, --diskmode (and friends) let you start with the bars you care about already on. Useful so you don't have to press 2, 3, 4, 5 every time.

  • Window title (--title): set the SDL window title. Useful when you run several Loadbars windows (e.g. one per cluster or environment) and need to tell them apart in your taskbar or window list.

  • SSH options (--sshopts): pass extra flags to ssh (e.g. ConnectTimeout, ProxyJump). Useful on locked-down or jump-host setups so Loadbars works without changing your global SSH config for a one-off session.

  • hasagent: skip extra SSH agent checks when you know the key is already loaded. Useful to avoid startup delay or warnings when you've already run ssh-add and are monitoring many hosts.

  • Config file covers every option: any flag from --help can be set in ~/.loadbarsrc (no leading --). Perl had a config but the Go version supports the full set. Useful for reproducible setups and sharing.

  • Positional host arguments: you can run loadbars server1 server2 without --hosts. Convenience when you only have a few hosts.

  • macOS as client: run the Loadbars binary on a Mac and connect to Linux servers via SSH. The Perl version was Linux-only. Useful to watch production from a laptop without a Linux VM or second machine.

  • Single static binary: no Perl runtime, no SDL Perl modules, no CPAN. Useful for deployment — copy one file to a jump host or new machine and run it.

  • Unit tests: mage test (or go test). The Go version has proper tests; useful for development and catching regressions.

  • Window resize (arrow keys): resize the window with the keyboard (left/right = width, up/down = height). Useful to fit more or fewer bars on screen without touching the mouse. (The Perl version had mouse-based resize; Go uses arrow keys.)

  • Hundreds of hosts in parallel: the Go implementation connects to all hosts concurrently and keeps polling without blocking. The Perl version struggled with many hosts. Useful for large fleets; you get a real "wall of bars" instead of a subset.

Core features



Load average bars



Press 4 or l to toggle. Each host gets a bar: teal fill (1-min load), yellow 1px line (5-min), white 1px line (15-min). Scale: auto (floor 2.0) or fixed with --loadmax N. Press r to reset auto-scale.

Disk I/O bars



Press 5 to toggle: aggregate (all whole-disk devices per host) → per-device → off. Partitions, loop, ram, zram, and device-mapper are excluded. Purple fill from top = read, darker purple from bottom = write. Extended mode (e) adds a 3px disk-utilization line. Config: diskmode, diskmax, diskaverage. b/x change disk average samples.

Global reference lines and options



g: global average CPU line (1px red). i: global I/O average line (1px pink). s: host separator lines (1px red). Other options: --maxbarsperrow N, --title, --sshopts, --hasagent. Hotkeys m/n mirror 2/3 for memory and network. Hover over a bar for a tooltip with exact values and host highlight.

CPU monitoring



CPU usage as vertical stacked bars: System (blue), User (yellow), Nice (green), GuestNice (lime green), Idle (black), IOwait (purple), IRQ/SoftIRQ (white), Guest/Steal (red). Press 1 for aggregate vs. per-core. Press e for extended mode (1px peak line: max system+user over last N samples).

Memory and network



  • 2 / m: memory — left half RAM (dark grey/black), right half Swap (grey/black) per host
  • 3 / n: network — RX (top, light green) and TX (bottom) summed over non-loopback interfaces. Red bar = no non-lo interface. Use --netlink or f/v for link speed (utilization %). Default gbit.

All hotkeys



Key     Action
─────   ──────────────────────────────────────────────────
1       Toggle CPU (aggregate / per-core / off)
2 / m   Toggle memory bars
3 / n   Toggle network bars
4 / l   Toggle load average bars
5       Toggle disk I/O (aggregate / per-device / off)
r       Reset load and disk auto-scale peaks
e       Toggle extended (peak line on CPU; disk util line)
g       Toggle global average CPU line
i       Toggle global I/O average line
s       Toggle host separator lines
h       Print hotkey list to stdout
q       Quit
w       Write current settings to ~/.loadbarsrc
a / y   CPU average samples up / down
d / c   Net average samples up / down
b / x   Disk average samples up / down
f / v   Link scale up / down
Arrows  Resize window

SSH and config



Connect with public key auth; hosts need bash and /proc (Linux). No agent needed on the remote side.

loadbars --hosts server1,server2,server3
loadbars --hosts root@server1,root@server2
loadbars servername{01..50}.example.com --showcores 1
loadbars --cluster production

Config: ~/.loadbarsrc (key=value, no leading --; use # for comments). Any option from --help can be set there. Press w to save the current settings.
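
As an illustrative sketch (the option names come from flags mentioned in this post; treat the exact values as assumptions):

```
# ~/.loadbarsrc: any --help option, without the leading --
hosts=server1,server2,server3
title=production cluster
maxbarsperrow=25
sshopts=-o ConnectTimeout=5
```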

Building and platforms



Go 1.25+ and SDL2. Install SDL2 (e.g. sudo dnf install SDL2-devel on Fedora, brew install sdl2 on macOS), then:

mage build
./loadbars --hosts localhost
mage install   # to ~/go/bin
mage test

Tested on Fedora Linux 43 and common distros; macOS as client to remote Linux only (no local macOS monitoring — no /proc).

E-Mail your comments to paul@nospam.buetow.org :-)

Back to the main site
My desk rack: DeskPi RackMate T0 https://foo.zone/gemfeed/2026-02-22-my-desk-rack.html 2026-02-21T11:17:15+02:00 Paul Buetow aka snonux paul@dev.buetow.org On my desk sits a small rack that keeps audio gear, power, and network in one place: the DeskPi RackMate T0. Here's what lives in it and how it's wired.

My desk rack: DeskPi RackMate T0



Published at 2026-02-21T11:17:15+02:00

    ┌─────────────────┐
    │   ●  ●  AIR     │  ← air-quality monitor
    ├─────────────────┤
    │  ╔═╗  CD        │  ← CD transport
    │  ║ ◉║  S/PDIF   │
    │  ╚═╝            │
    ├─────────────────┤
    │  ▓▓▓  USB PWR   │  ← PinePower
    ├─────────────────┤
    │  ░░░  (phones)  │  ← 1U "empty" shelf
    ├─────────────────┤
    │  ◉◉◉◉◉  LAN     │  ← 5-port switch
    ├─────────────────┤
    │  [E50] [L50]    │  ← DAC + AMP
    │   DAC   AMP     │
    └─────────────────┘
         RackMate T0

On my desk sits a small rack that keeps audio gear, power, and network in one place: the DeskPi RackMate T0. Here's what lives in it and how it's wired.

DeskPi RackMate T0

DeskPi RackMate T0 on the desk

Table of Contents




What's in the rack (top to bottom)



Top: CD transport and air-quality monitor



At the top is the S.M.S.L PL200T, a CD transport with anti-vibration design. It outputs digital audio over coaxial S/PDIF into the DAC in the rack. On top of the transport sits a small air-quality monitor so I can keep an eye on the room.

S.M.S.L PL200T CD Transport

CD transport and air-quality monitor on top

A CD transport is not the same as a CD player. A CD player has a built-in DAC (digital-to-analog converter) and outputs analogue audio—you plug it into an amp or active speakers and you're done. A CD transport only reads the disc and outputs a digital signal (e.g. coaxial or optical S/PDIF). It has no DAC. You feed that digital stream into an external DAC, which then does the conversion. The idea is to separate the mechanical part (spinning the disc, reading the pits) from the conversion stage, so you can use one DAC for CDs, streaming, and other sources, and upgrade or swap the transport and the DAC independently.

In the age of streaming and files, putting on a real CD is still a pleasure. You own the disc and the sound isn't at the mercy of a subscription or a server. You pick an album, put it in, and listen from start to finish—no endless scrolling, no algorithm. The format is fixed (16-bit/44.1 kHz), so what you hear is consistent and often better than heavily compressed streams. And there's something satisfying about the ritual: handling the case, the disc, and the artwork instead of tapping a screen.

Power and charging: PinePower Desktop + 1U shelf



Below that is the PinePower Desktop from Pine64, used as a desktop power and USB charging station for phones and other devices. The rack has one free 1U space under the PinePower where I put the devices that are charging, so cables and gadgets stay in one spot.

PinePower Desktop (Pine64)

Network: 5-port mini switch



Next is a compact 5-port Ethernet switch. The uplink goes to a wall socket behind the desk; the other ports feed the computer, laptop, and anything else that needs wired LAN on the desk. Next to the switch you can see my Nothing ear buds.

Nothing ear buds

Bottom: DAC and headphone amp



At the bottom of the rack are the Topping E50 (DAC) and Topping L50 (headphone amplifier). The E50 converts the digital signal to analogue, and the L50 amplifies it to drive my Hifiman Sundara headphones.

Topping E50 DAC
Topping L50 Headphone Amplifier
Hifiman Sundara

Music sources



  • CD transport: coaxial (S/PDIF) from the S.M.S.L PL200T into the Topping E50.
  • Streaming: USB from the desktop computer and/or laptop on the desk into the E50, so I can play from either machine.

Left side: cable management



On the left of the rack are two cable holders to keep power and signal cables tidy.

Next to the rack



Right beside the rack is my Supernote Nomad, which I use for notes and reading and have written about elsewhere on this blog. It’s the small tablet-shaped device on the right side of the rack.

Supernote Nomad (small tablet on the right of the rack)
Supernote Nomad (product page)

Front view of the rack
Back of the rack

Bedside: another HiFi setup



I have a second setup for high-res listening next to my bed. On the nightstand sit my FiiO K13 R2R (an R2R DAC/amp) and my Denon AH-D9200 headphones. I connect the K13 to my laptop via USB and use it for high-resolution files and streaming when I'm not at the desk.

Fiio K13 R2R
Denon AH-D9200

That's the full desk rack: CD transport and air monitor on top, PinePower and charging shelf, switch, then Topping E50 and L50 at the bottom, with the Hifiman Sundara as the main output and the Supernote Nomad sitting next to it. I hope that you found this interesting.

E-Mail your comments to paul@nospam.buetow.org :-)
A tmux popup editor for Cursor Agent CLI prompts https://foo.zone/gemfeed/2026-02-02-tmux-popup-editor-for-cursor-agent-prompts.html 2026-02-01T20:24:16+02:00 Paul Buetow aka snonux paul@dev.buetow.org I spend some time in Cursor Agent (the CLI version of the Cursor IDE, I don't really like the IDE), and I also jump between Claude Code CLI, Ampcode, Gemini CLI, OpenAI Codex CLI, OpenCode, and Aider just to see how things are evolving. But for the next month I'll be with Cursor Agent.

A tmux popup editor for Cursor Agent CLI prompts



Published at 2026-02-01T20:24:16+02:00

...and any other TUI based application

Table of Contents




Why I built this



I spend some time in Cursor Agent (the CLI version of the Cursor IDE, I don't really like the IDE), and I also jump between Claude Code CLI, Ampcode, Gemini CLI, OpenAI Codex CLI, OpenCode, and Aider just to see how things are evolving. But for the next month I'll be with Cursor Agent.

https://cursor.com/cli

Short prompts are fine in the inline input, but for longer prompts I want a real editor: spellcheck, search/replace, multiple cursors, and all the Helix muscle memory I already have.

Cursor Agent has a Vim editing mode, but not Helix. And even in Vim mode I can't use my full editor setup. I want the real thing, not a partial emulation.

https://helix-editor.com
https://www.vim.org
https://neovim.io

So I built a tiny tmux popup editor. It opens $EDITOR (Helix for me), and when I close it, the buffer is sent back into the prompt. It sounds simple, but it feels surprisingly native.

This is how it looks:

Popup editor in action

What it is



The idea is straightforward:

  • A tmux key binding prefix-e opens a popup overlay near the bottom of the screen.
  • The popup starts $EDITOR on a temp file.
  • When I exit the editor, the script sends the contents back to the original pane with tmux send-keys.

It also pre-fills the temp file with whatever is already typed after Cursor Agent's prompt, so I can continue where I left off.
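Stripped of error handling and prompt capture, the round-trip boils down to a few lines. This is a hypothetical minimal sketch, not the actual script; the tmux invocations are echoed rather than executed so it can run anywhere:

```shell
#!/usr/bin/env bash
# Minimal sketch of the popup round-trip: edit a temp file, then
# replay its contents into a tmux pane line by line.
send_buffer() {
  local pane="$1" file="$2" line
  while IFS= read -r line || [ -n "$line" ]; do
    echo tmux send-keys -t "$pane" -l "$line"  # type the line literally
    echo tmux send-keys -t "$pane" Enter       # then press Enter
  done < "$file"
}

tmpfile="$(mktemp)"
printf 'my long prompt\n' > "$tmpfile"  # in the real script: "$EDITOR" "$tmpfile"
send_buffer "%1" "$tmpfile"             # "%1" is a placeholder pane id
rm -f "$tmpfile"
```

The real script wraps this core with target-pane resolution, validation, prompt prefilling, and logging.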

How it works (overview)



This is the tmux binding I use (trimmed to the essentials):

bind-key e run-shell -b "tmux display-message -p '#{pane_id}'
  > /tmp/tmux-edit-target-#{client_pid} \;
  tmux popup -E -w 90% -h 35% -x 5% -y 65% -d '#{pane_current_path}'
  \"~/scripts/tmux-edit-send /tmp/tmux-edit-target-#{client_pid}\""

Workflow diagram



This is the whole workflow:

┌────────────────────┐   ┌───────────────┐   ┌─────────────────────┐   ┌─────────────────────┐
│ Cursor input box   │-->| tmux keybind  │-->| popup runs script   │-->| capture + prefill   │
│ (prompt pane)      │   │ prefix + e    │   │ tmux-edit-send      │   │ temp file           │
└────────────────────┘   └───────────────┘   └─────────────────────┘   └─────────────────────┘
                                                                                 |
                                                                                 v
┌────────────────────┐   ┌────────────────────┐   ┌────────────────────┐   ┌────────────────────┐
│ Cursor input box   │<--| send-keys back     |<--| close editor+popup |<--| edit temp file     |
│ (prompt pane)      │   │ to original pane   │   │ (exit $EDITOR)     │   │ in $EDITOR         │
└────────────────────┘   └────────────────────┘   └────────────────────┘   └────────────────────┘

And this is how it looks after sending the text back to Cursor Agent's input:

Prefilled prompt text

And here is the full script. It is a bit ugly since it's shell (written with Cursor Agent and GPT-5.2-Codex), and I might rewrite it (or let an AI rewrite it) in Go with proper unit tests, a config file, and multi-agent support, and release it once I have time. But it works well enough for now.

Update 2026-02-08: This functionality has been integrated into the hexai project (https://codeberg.org/snonux/hexai) with proper multi-agent support for Cursor Agent, Claude Code CLI, and Ampcode. The hexai version includes unit tests, configuration files, and better agent detection. While still experimental, it's more robust than this shell script. See the hexai-tmux-edit command for details.

https://codeberg.org/snonux/hexai

#!/usr/bin/env bash
set -u -o pipefail

LOG_ENABLED=0
log_file="${TMPDIR:-/tmp}/tmux-edit-send.log"
log() {
  if [ "$LOG_ENABLED" -eq 1 ]; then
    printf '%s\n' "$*" >> "$log_file"
  fi
}

# Read the target pane id from a temp file created by tmux binding.
read_target_from_file() {
  local file_path="$1"
  local pane_id
  if [ -n "$file_path" ] && [ -f "$file_path" ]; then
    pane_id="$(sed -n '1p' "$file_path" | tr -d '[:space:]')"
    # Ensure pane ID has % prefix
    if [ -n "$pane_id" ] && [[ "$pane_id" != %* ]]; then
      pane_id="%${pane_id}"
    fi
    printf '%s' "$pane_id"
  fi
}

# Read the target pane id from tmux environment if present.
read_target_from_env() {
  local env_line pane_id
  env_line="$(tmux show-environment -g TMUX_EDIT_TARGET 2>/dev/null || true)"
  case "$env_line" in
    TMUX_EDIT_TARGET=*)
      pane_id="${env_line#TMUX_EDIT_TARGET=}"
      # Ensure pane ID has % prefix
      if [ -n "$pane_id" ] && [[ "$pane_id" != %* ]] && [[ "$pane_id" =~ ^[0-9]+$ ]]; then
        pane_id="%${pane_id}"
      fi
      printf '%s' "$pane_id"
      ;;
  esac
}

# Resolve the target pane id, falling back to the last pane.
resolve_target_pane() {
  local candidate="$1"
  local current_pane last_pane

  current_pane="$(tmux display-message -p "#{pane_id}" 2>/dev/null || true)"
  log "current pane=${current_pane:-<empty>}"
  
  # Ensure candidate has % prefix if it's a pane ID
  if [ -n "$candidate" ] && [[ "$candidate" =~ ^[0-9]+$ ]]; then
    candidate="%${candidate}"
    log "normalized candidate to $candidate"
  fi
  
  if [ -n "$candidate" ] && [[ "$candidate" == *"#{"* ]]; then
    log "format target detected, clearing"
    candidate=""
  fi
  if [ -z "$candidate" ]; then
    candidate="$(tmux display-message -p "#{last_pane}" 2>/dev/null || true)"
    log "using last pane as fallback: $candidate"
  elif [ "$candidate" = "$current_pane" ]; then
    last_pane="$(tmux display-message -p "#{last_pane}" 2>/dev/null || true)"
    if [ -n "$last_pane" ]; then
      candidate="$last_pane"
      log "candidate was current, using last pane: $candidate"
    fi
  fi
  printf '%s' "$candidate"
}

# Capture the latest multi-line prompt content from the pane.
capture_prompt_text() {
  local target="$1"
  tmux capture-pane -p -t "$target" -S -2000 2>/dev/null | awk '
    function trim_box(line) {
      sub(/^ *│ ?/, "", line)
      sub(/ *│ *$/, "", line)
      sub(/[[:space:]]+$/, "", line)
      return line
    }
    /^ *│ *→/ && index($0,"INSERT")==0 && index($0,"Add a follow-up")==0 {
      if (text != "") last = text
      text = ""
      capture = 1
      line = $0
      sub(/^.*→ ?/, "", line)
      line = trim_box(line)
      if (line != "") text = line
      next
    }
    capture {
      if ($0 ~ /^ *└/) {
        capture = 0
        if (text != "") last = text
        next
      }
      if ($0 ~ /^ *│/ && index($0,"INSERT")==0 && index($0,"Add a follow-up")==0) {
        line = trim_box($0)
        if (line != "") {
          if (text != "") text = text " " line
          else text = line
        }
      }
    }
    END {
      if (text != "") last = text
      if (last != "") print last
    }
  '
}

# Write captured prompt text into the temp file if available.
prefill_tmpfile() {
  local tmpfile="$1"
  local prompt_text="$2"
  if [ -n "$prompt_text" ]; then
    printf '%s\n' "$prompt_text" > "$tmpfile"
  fi
}

# Ensure the target pane exists before sending keys.
validate_target_pane() {
  local target="$1"
  local pane target_found
  if [ -z "$target" ]; then
    log "error: no target pane determined"
    echo "Could not determine target pane." >&2
    return 1
  fi
  target_found=0
  log "validate: looking for target='$target' in all panes:"
  for pane in $(tmux list-panes -a -F "#{pane_id}" 2>/dev/null || true); do
    log "validate: checking pane='$pane'"
    if [ "$pane" = "$target" ]; then
      target_found=1
      log "validate: MATCH FOUND!"
      break
    fi
  done
  if [ "$target_found" -ne 1 ]; then
    log "error: target pane not found: $target"
    echo "Target pane not found: $target" >&2
    return 1
  fi
  log "validate: target pane validated successfully"
}

# Send temp file contents to the target pane line by line.
send_content() {
  local target="$1"
  local tmpfile="$2"
  local prompt_text="$3"
  local first_line=1
  local line
  log "send_content: target=$target, prompt_text='$prompt_text'"
  while IFS= read -r line || [ -n "$line" ]; do
    log "send_content: read line='$line'"
    if [ "$first_line" -eq 1 ] && [ -n "$prompt_text" ]; then
      if [[ "$line" == "$prompt_text"* ]]; then
        local old_line="$line"
        line="${line#"$prompt_text"}"
        log "send_content: stripped prompt, was='$old_line' now='$line'"
      fi
    fi
    first_line=0
    log "send_content: sending line='$line'"
    tmux send-keys -t "$target" -l "$line"
    tmux send-keys -t "$target" Enter
  done < "$tmpfile"
  log "sent content to $target"
}

# Main entry point.
main() {
  local target_file="${1:-}"
  local target
  local editor="${EDITOR:-vi}"
  local tmpfile
  local prompt_text

  log "=== tmux-edit-send starting ==="
  log "target_file=$target_file"
  log "EDITOR=$editor"
  
  target="$(read_target_from_file "$target_file" || true)"
  if [ -n "$target" ]; then
    log "file target=${target:-<empty>}"
    rm -f "$target_file"
  fi
  if [ -z "$target" ]; then
    target="${TMUX_EDIT_TARGET:-}"
  fi
  log "env target=${target:-<empty>}"
  if [ -z "$target" ]; then
    target="$(read_target_from_env || true)"
  fi
  log "tmux env target=${target:-<empty>}"
  target="$(resolve_target_pane "$target")"
  log "fallback target=${target:-<empty>}"

  tmpfile="$(mktemp)"
  log "created tmpfile=$tmpfile"
  if [ ! -f "$tmpfile" ]; then
    log "ERROR: mktemp failed to create file"
    echo "ERROR: mktemp failed" >&2
    exit 1
  fi
  mv "$tmpfile" "${tmpfile}.md" 2>&1 | while read -r line; do log "mv output: $line"; done
  tmpfile="${tmpfile}.md"
  log "renamed to tmpfile=$tmpfile"
  if [ ! -f "$tmpfile" ]; then
    log "ERROR: tmpfile does not exist after rename"
    echo "ERROR: tmpfile rename failed" >&2
    exit 1
  fi
  trap 'rm -f "$tmpfile"' EXIT

  log "capturing prompt text from target=$target"
  prompt_text="$(capture_prompt_text "$target")"
  log "captured prompt_text='$prompt_text'"
  prefill_tmpfile "$tmpfile" "$prompt_text"
  log "prefilled tmpfile"

  log "launching editor: $editor $tmpfile"
  "$editor" "$tmpfile"
  local editor_exit=$?
  log "editor exited with status $editor_exit"

  if [ ! -s "$tmpfile" ]; then
    log "empty file, nothing sent"
    exit 0
  fi
  
  log "tmpfile contents:"
  log "$(cat "$tmpfile")"

  log "validating target pane"
  validate_target_pane "$target"
  log "sending content to target=$target"
  send_content "$target" "$tmpfile" "$prompt_text"
  log "=== tmux-edit-send completed ==="
}

main "$@"

Challenges and small discoveries



The problems were mostly small but annoying:

  • Getting the right target pane was the first hurdle. I ended up storing the pane id in a file because of tmux format expansion quirks.
  • The Cursor UI draws a nice box around the prompt, so the prompt line contains a → marker and other UI noise. I had to filter those out and strip the box-drawing characters.
  • When I prefilled text and then sent it back, I sometimes duplicated the prompt. Stripping the prefilled prompt text from the submitted text fixed that.

Test cases (for a future rewrite)



These are the cases I test whenever I touch the script:

  • Single-line prompt: capture everything after the → marker and prefill the editor.
  • Multi-line boxed prompt: capture the wrapped lines inside the │ ... │ box and join them with spaces (no newline in the editor).
  • Ignore UI noise: do not capture lines containing INSERT or Add a follow-up.
  • Preserve appended text: if I add juju to an existing line, the space before juju must survive.
  • No duplicate send: if the prefilled text is still at the start of the first line, it must be stripped once before sending back.
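The "no duplicate send" case, for example, comes down to stripping the prefilled prompt from the first line exactly once. A minimal sketch of that rule (the function name is mine, not from the script):

```shell
#!/usr/bin/env bash
# Strip the prefilled prompt text from the start of a line, once,
# so the prompt isn't typed back into the input twice.
strip_prefill() {
  local line="$1" prompt="$2"
  if [[ "$line" == "$prompt"* ]]; then
    printf '%s' "${line#"$prompt"}"
  else
    printf '%s' "$line"
  fi
}

strip_prefill "fix the typo juju" "fix the typo"  # -> " juju" (leading space preserved)
```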

(Almost) works with any editor (or any TUI)



Although I use Helix, this is just $EDITOR. If you prefer Vim, Neovim, or something more exotic, it should work. The same mechanism can be used to feed text into any TUI that reads from a terminal pane, not just Cursor Agent.

One caveat: different agents draw different prompt UIs, so the capture logic depends on the prompt shape. A future version of this script should be more modular in that respect; for now this is just a PoC tailored to Cursor Agent.

Another caveat: if Cursor changes the design of its TUI, I will need to update my script as well.

If I get a chance, I'll clean it up and rewrite it in Go (and release it properly or fold it into Hexai, another AI-related tool of mine, which I haven't blogged about yet). For now, I am happy with this little hack. It already feels like a native editing workflow for Cursor Agent prompts.

https://codeberg.org/snonux/hexai

E-Mail your comments to paul@nospam.buetow.org :-)

Other related posts are:

2026-02-02 A tmux popup editor for Cursor Agent CLI prompts (You are currently reading this)
2025-08-05 Local LLM for Coding with Ollama on macOS
2025-05-02 Terminal multiplexing with tmux - Fish edition
2024-06-23 Terminal multiplexing with tmux - Z-Shell edition

Back to the main site
Using Supernote Nomad offline https://foo.zone/gemfeed/2026-01-01-using-supernote-nomad-offline.html 2025-12-31T16:25:30+02:00 Paul Buetow aka snonux paul@dev.buetow.org I am a note taker. For years, I've been searching for a good digital device that could complement my paper notebooks. I've finally found it in the Supernote Nomad. I use it completely offline without cloud-sync, and in this post, I'll explain why this is a benefit.

Using Supernote Nomad offline



Published at 2025-12-31T16:25:30+02:00

I am a note taker. For years, I've been searching for a good digital device that could complement my paper notebooks. I've finally found it in the Supernote Nomad. I use it completely offline without cloud-sync, and in this post, I'll explain why this is a benefit.

Supernote Nomad

I initially bought it because Ratta (the manufacturer of the Supernote) stated on their website that an open-source Linux firmware would be released soon. However, after over a year, there still hasn't been any progress (hopefully there will be someday). So I looked into alternative ways to use this device.

⣿⣿⣿⣿⣿⣿⡿⠿⠿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿
⣿⣿⣿⣿⣿⣏⠀⢶⣆⡘⠉⠙⠛⠿⠿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿
⣿⣿⣿⣿⣿⠋⣤⣄⠘⠃⢠⣀⣀⠀⠀⠀⠀⠀⠉⠉⠛⠛⠿⢿⣿⣿⣿⣿⣿⣿
⣿⣿⣿⣿⡿⠀⡉⠻⡟⠀⠈⠉⠙⠛⠷⠶⣦⣤⣄⣀⠀⠀⠀⠀⠀⣾⣿⣿⣿⣿
⣿⣿⣿⣿⡄⠸⢿⣤⠀⢠⣤⣀⡀⠀⠀⠀⠀⠀⠉⠙⠛⠻⠶⠀⢰⣿⣿⠻⣿⣿
⣿⣿⣿⣿⠠⣶⣆⡉⠀⠀⠈⠉⠙⠛⠳⠶⠦⣤⣤⣄⣀⡀⢀⣴⠟⠋⠙⢷⣬⣿
⣿⣿⣿⠏⣠⡄⠹⠁⠰⢶⣤⣤⣀⡀⠀⠀⠀⠀⠀⠉⢉⣿⠟⠁⠀⠀⣠⣾⣿⣿
⣿⣿⡿⠂⠙⠻⡆⠀⠀⠀⠀⠈⠉⠛⠛⠷⠶⣦⣤⣴⠟⠁⠀⠀⣠⣾⣿⣿⣿⣿
⣿⣿⡇⠸⣿⣄⠀⠰⠶⢶⣤⣄⣀⡀⠀⠀⠀⣴⣟⠁⠀⠀⣠⣾⣿⣿⣿⣿⣿⣿
⣿⡟⠀⣶⣀⠃⠀⠀⠀⠀⠀⠈⠉⠙⠛⠓⢾⡟⢙⣷⣤⢾⣿⣿⣿⣿⣿⣿⣿⣿
⣿⠋⣀⡉⠻⠀⠘⠛⠻⠶⢶⣤⣤⣀⡀⢠⠿⠟⠛⠉⠁⣸⣿⣿⣿⣿⣿⣿⣿⣿
⣿⡀⠛⠳⠆⠀⠀⠀⠀⠀⠀⠀⠉⠉⠛⠛⠷⠶⣦⠄⢀⣿⣿⣿⣿⣿⣿⣿⣿⣿
⣿⣿⣿⣶⣦⣀⣀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣸⣿⣿⣿⣿⣿⣿⣿⣿⣿
⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣶⣶⣤⣤⣀⣀⠀⠀⠀⢠⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿
⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣷⣶⣾⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿

Table of Contents




The Joy of Being Offline



In a world of constant connectivity, the Supernote Nomad offers a sanctuary. By keeping it offline, I can focus on my thoughts and notes without compromising my privacy.

One of the most significant advantages of keeping Wi-Fi off is the battery life. The Supernote Nomad can last a week on a single charge when it's not constantly searching for a network. This makes it a good companion for long trips or intense note-taking sessions.

Privacy was my main concern. By not syncing my notes to Ratta's cloud service, I retain full ownership and control over my data. There's no risk of my personal thoughts and ideas being accessed or mined by third parties. It's a simple and effective way to ensure my privacy.

A picture of the Supernote Nomad

My Offline Workflow



My workflow is simple, only relying on a direct USB connection to my Linux laptop.

I connect my Supernote Nomad to my Linux laptop via a USB-C cable. The device is automatically recognized as a storage device, and I can directly access the Note folder, which contains all my notes as .note files. I then copy these files to a dedicated archive folder on my laptop.
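The copy step can be sketched as a short script. The mount point below is an example (it depends on your distribution and file manager), not something dictated by the device:

```shell
#!/usr/bin/env bash
# Archive all .note files from the device, preserving its folder
# structure. The source path is a hypothetical mount point; adjust it
# to wherever your desktop mounts the Nomad's storage.
archive_notes() {
  local src="$1" dest="$2"
  mkdir -p "$dest"
  # Include directories and .note files, exclude everything else.
  rsync -a --include='*/' --include='*.note' --exclude='*' "$src/" "$dest/"
}

# Only run when the device is actually plugged in and mounted:
if [ -d "/run/media/$USER/Supernote/Note" ]; then
  archive_notes "/run/media/$USER/Supernote/Note" "$HOME/notes-archive"
fi
```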

Converting Notes to PDF



To make my notes accessible and shareable, I convert them from the proprietary .note format to PDF. For this, I use a fantastic open-source tool called supernote-tool. It's not an official tool from Ratta, but it works flawlessly.

https://github.com/jya-dev/supernote-tool

I've created a small shell script to automate the conversion process using this tool. This script, convert-notes-to-pdfs.sh, resides in my notes archive folder:

#!/usr/bin/env bash

convert () {
  find . -name \*.note \
    | while read -r note; do
        echo supernote-tool convert -a -t pdf "$note" "${note/.note/.pdf}"
        supernote-tool convert -a -t pdf "$note" "${note/.note/.pdf}.tmp"
        mv "${note/.note/.pdf}.tmp" "${note/.note/.pdf}"
        du -hs "$note" "${note/.note/.pdf}"
        echo
      done
}

# Make the PDFs available on my Phone as well
copy () {
  if [ ! -d ~/Documents/Supernote ]; then
    echo "Directory ~/Documents/Supernote does not exist, skipping"
    exit 1
  fi

  rsync -av --delete --include='*/' --include='*.pdf' --exclude='*' . ~/Documents/Supernote/
  echo "This was copied from $(pwd), so don't edit manually" >~/Documents/Supernote/README.txt
}

convert
copy

This script does two things:

  • It finds all .note files in the current directory and converts them to PDF using supernote-tool.
  • It copies the generated PDFs to my ~/Documents/Supernote folder.

Syncing to my Phone



The ~/Documents/Supernote folder on my laptop is synchronized with my phone using Syncthing. This way, I have access to all my notes in PDF format on my phone, wherever I go, without relying on any cloud service.

https://syncthing.net/

Firmware updates



One usually updates the software or firmware of the Supernote Nomad via Wi-Fi. However, it is also possible to update it completely offline. To install the firmware update, follow the steps below (the following instructions were copied from the Supernote website):

  • Connect your Supernote to your PC with a USB-C cable. For macOS, an MTP software (e.g. OpenMTP or Android File Transfer) is required for your Supernote to show up on your Mac.
  • For Manta, Nomad, A5 X and A6 X devices, copy the firmware (DO NOT UNZIP) to the "Export" folder of Supernote; for A5 and A6 devices, copy the firmware (DO NOT UNZIP) to the root directory of Supernote.
  • Unplug the USB connection, tap “OK” on your Supernote to continue, and if no prompt pops up, please restart your device directly to proceed to update.

The Writing Experience



The writing feel of the Supernote Nomad is simply great. The combination of the screen's texture and the ceramic nib of the pen creates a feeling that is remarkably close to writing on real paper. The latency is almost non-existent, and the pressure sensitivity allows for a natural and expressive writing experience. It's great to write on, and it makes me want to take more notes.

Another picture of the Supernote Nomad

Conclusion



The Supernote Nomad has become an additional tool for me. By using it offline, I've created a distraction-free and private note-taking environment. The simple, manual workflow for transferring and converting notes gives me full control over my data, and the writing experience is second to none. If you're looking for a digital notebook that respects your privacy and helps you focus, I highly recommend giving the Supernote Nomad a try with an offline-first approach.

The Supernote didn't fully replace my traditional paper journals, though. Each of them has its own use case. However, that is outside the scope of this blog post.

Other related posts:

2026-01-01 Using Supernote Nomad offline (You are currently reading this)
2026-01-01 Cloudless Kobo Forma with KOReader

E-Mail your comments to paul@nospam.buetow.org :-)

Back to the main site
Posts from July to December 2025 https://foo.zone/gemfeed/2026-01-01-posts-from-july-to-december-2025.html 2025-12-31T15:49:06+02:00 Paul Buetow aka snonux paul@dev.buetow.org Hello there, I wish you all a happy new year! These are my social media posts from the last six months. I keep them here to reflect on them and also to not lose them. Social media networks come and go and are not under my control, but my domain is here to stay.

Posts from July to December 2025



Published at 2025-12-31T15:49:06+02:00

Hello there, I wish you all a happy new year! These are my social media posts from the last six months. I keep them here to reflect on them and also to not lose them. Social media networks come and go and are not under my control, but my domain is here to stay.

These are from Mastodon and LinkedIn. Have a look at my about page for my social media profiles. This list is generated with Gos, my social media platform sharing tool.

My about page
https://codeberg.org/snonux/gos

Table of Contents




July 2025



In #Golang, values are actually copied when ...



In #Golang, values are actually copied when assigned (boxed) into an interface. That can have performance impact.

goperf.dev/01-common-patterns/interface-boxing/

Same experiences I had, but it's a time saver. ...



Same experiences I had, but it's a time saver. And when done correctly, those tools are amazing: #llm #coding #programming

lucumr.pocoo.org/2025/06/21/my-first-ai-library/

We (programmers) all use them (I hope): ...



We (programmers) all use them (I hope): language servers. LSP stands for Language Server Protocol, which standardizes communication between coding editors or IDEs and language servers, facilitating features like autocompletion, refactoring, linting, error-checking, etc.... It's interesting to look under the hood a little bit to see how your code editor actually communicates with a language server. #LSP #coding #programming

packagemain.tech/p/understanding-the-language-server-protocol

Shells of the early unices didn't understand ...



Shells of the early unices didn't understand file globbing; that was done by the external glob command! #unix #history #shell

utcc.utoronto.ca/%7Ecks/space/blog/unix/EtcGlobHistory

I've picked up a few techniques from this blog ...



I've picked up a few techniques from this blog post and found them worth sharing here: #ai #llm #prompting #techniques

cracking-ai-engineering.com/writing/2025/07/07/four-prompting-paradigms/

I've published the sixth part of my "Kubernetes ...



I've published the sixth part of my "Kubernetes with FreeBSD" blog series. This time, I set up the storage, which will be used with persistent volume claims later on in the Kubernetes cluster. Have a lot of fun! #freebsd #nfs #ha #zfs #zrepl #carp #kubernetes #k8s #k3s #homelab

foo.zone/gemfeed/2025-07-14-f3s-kubernetes-with-freebsd-part-6.html (Gemini)
foo.zone/gemfeed/2025-07-14-f3s-kubernetes-with-freebsd-part-6.html

The book "Coders at Work" offers a fascinating ...



The book "Coders at Work" offers a fascinating glimpse into how programming legends emerged in the early days of computing. I especially enjoyed the personal stories and insights. It would be great to see a new edition reflecting today’s AI and LLM revolution—so much has changed since!

www.goodreads.com/book/show/6713575-coders-at-work

For me, that's all normal. Couldn't imagine a ...



For me, that's all normal. Couldn't imagine a simpler job. #software

0x1.pt/2025/04/06/the-insanity-of-being-a-software-engineer/

This is similar to my #dtail project. It got ...



This is similar to my #dtail project. It has some features that dtail doesn't, and dtail has some features that #nerdlog hasn't. But the principle is the same: neither tool has a centralised log store, and both use SSH to connect directly to the servers (the sources of the logs).

github.com/dimonomid/nerdlog

I also feel the most comfortable in the ...



I also feel the most comfortable in the #terminal. There are a few tasks where it doesn't always make a lot of sense, like browsing most of the web, but for most of the things I do, I prefer the terminal. I think it's a good idea to have a terminal-based interface for most of the things you do. It makes it easier to automate things and to work with other tools.

lambdaland.org/posts/2025-05-13_real_programmers/

I have been enjoying OpenCode lately as an ...



I have been enjoying OpenCode lately as an alternative TUI to Claude Code CLI. It is a 100% open-source agentic coding tool, which supports all models from models.dev, including local ones (e.g. DeepSeek), and has some nice tweaks like side-by-side diffs, and you can also use your favourite text $EDITOR for prompt editing! Highly recommend! #llm #coding #programming #agentic #ai

opencode.ai
models.dev

Jonathan's reflection of 10 years of ...



Jonathan's reflection of 10 years of programming!

jonathan-frere.com/posts/10-years-of-programming/

Some neat zero-copy #Golang tricks here ...



Some neat zero-copy #Golang tricks here

goperf.dev/01-common-patterns/zero-copy/

What was it like working at GitLab? A scary ...



What was it like working at GitLab? A scary moment was the deletion of the gitlab.com database, though fortunately, there was a six-hour-old copy on the staging server. More people don't necessarily produce better results. Additionally, Ruby's metaprogramming isn't ideal for large projects. A burnout. And many more insights....

yorickpeterse.com/articles/what-it-was-like-working-for-gitlab/

I have learned a lot from the Practical #AI ...



I have learned a lot from the Practical #AI #podcast, especially from episode 312, which discusses the #MCP (model context protocol). Are there any MCP servers you plan to use or to build?

practicalai.fm/312

August 2025



At the end of the article it's mentioned that ...



At the end of the article it's mentioned that it's difficult to stay in the zone when AI does the coding for you. I think it's possible to stay in the zone, but only when you use AI surgically. #llm #ai #programming

newsletter.pragmaticengineer.com/p/cur..-..email=true&r=4ijqut&triedRedirect=true

Great blog post about #OpenBSDAmsterdam, of ...



Great blog post about #OpenBSDAmsterdam, of which I have been a customer for some years now. #OpenBSD

www.tumfatig.net/2025/cruising-a-vps-at-openbsd-amsterdam/

Interesting. #llm #ai #slowdown ...



Interesting. #llm #ai #slowdown

m.slashdot.org/story/444304

With the help of genai, I could generate this ...



With the help of genai, I could generate this neat small showcase site of many of my small to medium sized side projects. The project descriptions were generated by Claude Code CLI with Sonnet 4 based on the git repo contents; the page content by gitsyncer, a tool I created (listed on the showcase page as well); and the HTML generation by gemtexter (another tool I wrote, also listed on the showcase page). The stats seem neat; over time a lot of stuff starts to pile up! In the age of AI (so far, only 8 projects were created AI-assisted), I think more projects will spin up faster (not just for me, but for everyone working on side projects). I have more (older) side projects archived on my local NAS, but they are not worth digging out...

📦 Total Projects: 55
📊 Total Commits: 10,379
📈 Total Lines of Code: 252,969
📄 Total Lines of Documentation: 24,167
💻 Languages: Java (22.4%), Go (17.6%), HTML (14.0%), C++ (8.9%), C (7.3%), Perl (6.3%), Shell (6.3%), C/C++ (5.8%), XML (4.6%), Config (1.5%), Ruby (1.1%), HCL (1.1%), Make (0.7%), Python (0.6%), CSS (0.6%), JSON (0.3%), Raku (0.3%), Haskell (0.2%), YAML (0.2%), TOML (0.1%)
📚 Documentation: Text (47.4%), Markdown (38.4%), LaTeX (14.2%)
🤖 AI-Assisted Projects: 8 out of 55 (14.5% AI-assisted, 85.5% human-only)
🚀 Release Status: 31 released, 24 experimental (56.4% with releases, 43.6% experimental)

#llm #genai #showcase #coding #programming

foo.zone/about/showcase.html (Gemini)
foo.zone/about/showcase.html

I tinkered a bit with local LLMs for coding: ...



I tinkered a bit with local LLMs for coding: #llm #local #ai #coding #ollama #qwen #deepseek #HelixEditor #LSP #codecompletion #aider

foo.zone/gemfeed/2025-08-05-local-coding-llm-with-ollama.html (Gemini)
foo.zone/gemfeed/2025-08-05-local-coding-llm-with-ollama.html

Good stuff: 10 years of functional options and ...



Good stuff: 10 years of functional options and key lessons learned along the way #golang

www.bytesizego.com/blog/10-years-functional-options-golang

Top 5 performance boosters #golang ...



Top 5 performance boosters #golang

blog.devtrovert.com/p/go-performance-boosters-the-top-5

This person found the balance.. although I ...



This person found the balance.. although I would use a different code editor: Why Open Source Maintainers Thrive in the LLM Era via @wallabagapp #ai #llm #coding #programming

mikemcquaid.com/why-open-source-maintainers-thrive-in-the-llm-era/

Let's rewrite all slow code in #assembly! Surely ...



Let's rewrite all slow code in #assembly! Surely it's not just about the language but also about the architecture and the algorithms used. Still, impressive.

x.com/FFmpeg/status/1945478331077374335

How to store data forever? #storage ...



How to store data forever? #storage #archiving

drewdevault.com/2020/04/22/How-to-store-data-forever.html

No wonder that almost everyone doing something ...



No wonder that almost everyone doing something with AI is releasing their own agentic coding tool now, as it's so dead simple to write one. #ai #llm #agenticcoding

ampcode.com/how-to-build-an-agent

Another drawback of running load tests in a ...



Another drawback of running load tests in a pre-prod environment is that it is not always possible to reproduce production load, especially in a complex environment. I personally prefer a combination of pre-prod load testing, production canaries, and gradual production deployment. What are your thoughts? #sre #loadtesting #lt

thefridaydeploy.substack.com/p/load-testing-prepare-for-the-growth

Interesting read Learnings from two years of ...



Interesting read: Learnings from two years of using AI tools for software engineering #ai #llm #genai

newsletter.pragmaticengineer.com/p/two-years-of-using-ai

Neat little story about a school girl writing her ...



Neat little story about a school girl who wrote her first (and only) malware and infected her school with it.

ntietz.com/blog/that-time-i-wrote-malware/

Happy, that I am not yet obsolete! #llm ...



Happy that I am not yet obsolete! #llm #sre

clickhouse.com/blog/llm-observability-challenge

September 2025



Loving this as well: #slackware #linux ...



Loving this as well: #slackware #linux

www.osnews.com/story/142145/what-makes-slackware-different/

Some #fun: Random Weird Things Part III blog ...



Some #fun: Random Weird Things Part III blog post

foo.zone/gemfeed/2025-08-15-random-weird-things-iii.html (Gemini)
foo.zone/gemfeed/2025-08-15-random-weird-things-iii.html

Yes, write more useless software. I agree that ...



Yes, write more useless software. I agree that play has a vital role in learning and experimentation. Also, programming is a lot of fun this way. I've learned programming mostly by writing useless software or almost useful tools for myself, but I can now apply all that knowledge to real work as well. #coding #programming

ntietz.com/blog/write-more-useless-software/

I learned a lot from this #OpenBSD #relayd ...



I learned a lot from this #OpenBSD #relayd talk, and I already put the information into production! I know the excellent OpenBSD manual pages document everything, but it is a bit different when you see it presented in a talk.

www.youtube.com/watch?v=yW8QSZyEs6E

Six weeks of claude code



blog.puzzmo.com/posts/2025/07/30/six-weeks-of-claude-code/

It's good that there is now a truly open-source ...



It's good that there is now a truly open-source LLM model; I'm just wondering how it will perform. The difference compared to other open models is that the others only provide open weights, but you can't reproduce the training! That issue would be solved with this Swiss model. I will definitely have a look! #llm #opensource #privacy

m.slashdot.org/story/446310

Have to try this at some point ...



Have to try this at some point, troubleshooting #k8s with the help of #genai

blog.palark.com/k8sgpt-ai-troubleshooting-kubernetes/

I could not agree more. For me, a personal ...



I could not agree more. For me, a personal (tech-oriented) website is not a business contact card, but a playground to experiment with and learn about technologies. The Value of a Personal Site #website #personal #tech

atthis.link/blog/2021/personalsite.html

The true enterprise developer can write Java in ...



The true enterprise developer can write Java in any language. #java #programming

#fx is a neat little tool for viewing JSON ...



#fx is a neat little tool for viewing JSON files!

fx.wtf

I wish I had as much time as this guy. He ...



I wish I had as much time as this guy. He writes entire operating systems, including a Unix clone called "Bunnix", in a month. He is also the inventor of the Hare programming language (if I am not mistaken). Now he is also creating a new shell, primarily for the other operating systems and kernels he is working on. #shell #unix #programming #operatingsystem #bunnix #hare

drewdevault.com/2023/04/18/2023-04-18-A-new-shell-for-Unix.html

What exactly was the point of [ “x$var” = ...



What exactly was the point of [ "x$var" = "xval" ]? #bash #shell #posix #sh #history

www.vidarholen.net/contents/blog/?p=1035

Neat #ZFS feature (here #FreeBSD) which I ...



Neat #ZFS feature (here on #FreeBSD) which I didn't know of before: pool checkpoints, which are different from snapshots of individual datasets:

it-notes.dragas.net/2024/07/01/enhanci..-..d-stability-with-zfs-pool-checkpoints/

Longer hours help only short term. About 40 ...



Longer hours help only short term. About 40 hours #productivity

thesquareplanet.com/blog/about-40-hours/

You could also use #bpf instead of #strace, ...



You could also use #bpf instead of #strace, although modern strace can use BPF if told to: How to use the new Docker Seccomp profiles

blog.jessfraz.com/post/how-to-use-new-docker-seccomp-profiles/

Some great things are approaching #bhyve on ...



Some great things are approaching #bhyve on #FreeBSD and VM Live Migration – Quo vadis? #freebsd #virtualization #bhyve

gyptazy.com/bhyve-on-freebsd-and-vm-live-migration-quo-vadis/

Another synchronization tool part of the ...



Another synchronization tool from the extended #golang standard library (golang.org/x/sync): singleflight! It's used to avoid overloading external resources (like DBs) with N concurrent identical requests. Useful!

victoriametrics.com/blog/go-singleflight/index.html
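The core idea is easy to sketch with nothing but the standard library: coalesce concurrent calls for the same key into one execution, and let everyone share the result. This is only an illustrative toy (the real golang.org/x/sync/singleflight API differs, and all names here are made up):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// call tracks one in-flight execution shared by all concurrent callers.
type call struct {
	wg  sync.WaitGroup
	val string
}

// Group coalesces concurrent lookups for the same key into a single call.
type Group struct {
	mu sync.Mutex
	m  map[string]*call
}

// Do runs fn once per in-flight key; concurrent callers for the same key
// wait for the first call's result instead of hitting the backend again.
func (g *Group) Do(key string, fn func() string) string {
	g.mu.Lock()
	if g.m == nil {
		g.m = make(map[string]*call)
	}
	if c, ok := g.m[key]; ok {
		g.mu.Unlock()
		c.wg.Wait() // someone is already fetching this key; wait for them
		return c.val
	}
	c := &call{}
	c.wg.Add(1)
	g.m[key] = c
	g.mu.Unlock()

	c.val = fn() // only the first caller executes the expensive call
	c.wg.Done()

	g.mu.Lock()
	delete(g.m, key) // allow a fresh call for this key later
	g.mu.Unlock()
	return c.val
}

// demo fires 10 concurrent lookups for the same key and returns how
// often the (simulated) backend was actually hit.
func demo() int32 {
	var g Group
	var backendCalls int32
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.Do("user:42", func() string {
				atomic.AddInt32(&backendCalls, 1)
				time.Sleep(50 * time.Millisecond) // simulate a slow DB query
				return "row-data"
			})
		}()
	}
	wg.Wait()
	return backendCalls
}

func main() {
	fmt.Println("backend calls:", demo()) // far fewer than 10, typically 1
}
```

Without coalescing, all 10 goroutines would hit the backend; with it, late arrivals piggyback on the in-flight call.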

Too many open files #linux ...



Too many open files #linux

mattrighetti.com/2025/06/04/too-many-files-open.html

Just posted Part 4 of my #Bash #Golf ...



Just posted Part 4 of my #Bash #Golf series:

foo.zone/gemfeed/2025-09-14-bash-golf-part-4.html (Gemini)
foo.zone/gemfeed/2025-09-14-bash-golf-part-4.html

#Perl is like a swiss army knife, as one of ...



#Perl is like a swiss army knife, as one of the comments states:

developers.slashdot.org/story/25/09/14..-..10th-most-popular-programming-language

Personally, mainly working with colorless ...



Personally, mainly working with colorless languages like #ruby and #golang, I now slowly understand the pain people have with Rust or JS. It wasn't just me when I got confused writing that Grafana DS plugin in TypeScript...

jpcamara.com/2024/07/15/ruby-methods-are.html

How do GPUs work? Usually, people only know ...



How do GPUs work? Usually, people only know about CPUs... I got the gist, at least. #gpu #cpu

blog.codingconfessions.com/p/gpu-computing

For unattended upgrades you must have a good ...



For unattended upgrades you must have a good testing (or canary) strategy. #sre #reliability #downtime #ubuntu #systemd #kubernetes

newsletter.pragmaticengineer.com/p/why-reliability-is-hard-at-scale

Surely, in the age of #AI and #LLM, people ...



Surely, in the age of #AI and #LLM, people are not writing as much code manually as before, but I don't think skills like using #Vim (or #HelixEditor) are obsolete just yet. You still need to understand what's happening under the hood, and being comfortable with these tools can make you much more efficient when you do need to edit or review code.

www.youtube.com/watch?v=tW0BSgzr2AM

On #AI changes everything... ...



On #AI changes everything...

lucumr.pocoo.org/2025/6/4/changes/

Maps in Go under the hood #golang ...



Maps in Go under the hood #golang

victoriametrics.com/blog/go-map/

"A project that looks complex might just be ...



"A project that looks complex might just be unfamiliar" - Quote from the Applied Go Weekly Newsletter

I must admit that partly I see myself there ...



I must admit that partly I see myself there (sometimes). But it is fun :-) #tools #happy

borretti.me/article/you-can-choose-tools-that-make-you-happy

Makes me think of good old times, where I ...



Makes me think of the good old times, when I shipped 5 times as fast: What happens when code reviews aren't mandatory? via @wallabagapp #productivity #code

testdouble.com/insights/when-code-reviews-arent-mandatory

Neat little blog post, showcasing various ...



Neat little blog post, showcasing various methods used for generic programming before the introduction of generics. Only reflection wasn't listed. #golang

bitfieldconsulting.com/posts/generics

Didn't know that on macOS, besides .so ...



Didn't know that on macOS, besides .so (shared object files, which can be dynamically loaded as well), there is also macOS' native .dylib format, which serves a similar purpose! #macos #dylib #so

cpu.land/becoming-an-elf-lord

I think this is the way: use LLMs for code you ...



I think this is the way: use LLMs for code you don't care much about and write code manually for what matters most to you. This way, most boring and boilerplate stuff can be auto-generated.

registerspill.thorstenball.com/p/surely-not-all-codes-worth-it

Always enable keepalive? I'd say most of the ...



Always enable keepalive? I'd say most of the time. I've seen cases where connections weren't reused but new additional ones were established, causing the servers to run out of worker threads. #sre Always. Enable. Keepalives.

www.honeycomb.io/blog/always-enable-keepalives

I just finished reading "Chaos Engineering" by ...



I just finished reading "Chaos Engineering" by Casey Rosenthal, an absolute must-read for anyone passionate about building resilient systems! Chaos Engineering is not about breaking things randomly; it's a disciplined approach to uncovering weaknesses before they become outages. SREs, this book is packed with practical insights and real-world strategies to strengthen your systems against failure. Highly recommended! #ChaosEngineering #Resilience

www.oreilly.com/library/view/chaos-engineering/9781492043850/

fx is a neat and tidy command-line tool for ...



fx is a neat and tidy command-line tool for interactively viewing JSON files! What I like about it is that it is not too complex (open the help with ? and it is only about one page long) but still very useful. #json #golang

github.com/antonmedv/fx

Some nice #Golang tricks there ...



Some nice #Golang tricks there

blog.devtrovert.com/p/12-personal-go-tricks-that-transformed

October 2025



Word! What Are We Losing With AI? #llm #ai ...



Word! What Are We Losing With AI? #llm #ai

josem.co/what-are-we-losing-with-ai/

It's not yet time for the friday #fun, but: ...



It's not yet time for the Friday #fun, but: OpenOffice does not print on Tuesdays ― Andreas Zwinkau :-)

beza1e1.tuxen.de/lore/print_on_tuesday.html

Finally, I retired my AWS/ECS setup for my ...



Finally, I retired my AWS/ECS setup for my self-hosted apps, as it was too expensive to operate: I had to pay $20 monthly just to run pods for only a day or so each month, so I rarely used them. Now, everything has been migrated to my FreeBSD-powered Kubernetes home cluster! Part 7 of this blog series covers the initial pod deployments. #freebsd #k8s #selfhosting

foo.zone/gemfeed/2025-10-02-f3s-kubernetes-with-freebsd-part-7.html (Gemini)
foo.zone/gemfeed/2025-10-02-f3s-kubernetes-with-freebsd-part-7.html

A great blog post about my favourite text ...



A great blog post about my favourite text editor: Why even Helix? #HelixEditor Now I am considering forking it myself as well :-)

axlefublr.github.io/why-even-helix/

One of the more confusing parts in Go, nil ...



One of the more confusing parts in Go, nil values vs nil errors: #golang

unexpected-go.com/nil-errors-that-are-non-nil-errors.html

Strong engineers are pragmatic, work fast, have ...



Strong engineers are pragmatic, work fast, have technical ability, don't need to be technical geniuses, and believe in their ability to solve almost any problem. #productivity

www.seangoedecke.com/what-makes-strong-engineers-strong/

I am currently binge-listening to the Google ...



I am currently binge-listening to the Google #SRE ProdCast. It's really great to learn about the stories of individual SREs and their journeys. It is not just about SREs at Google; there are also external guests.

sre.google/prodcast/

Looks like a neat library for writing ...



Looks like a neat library for writing script-like programs in #Golang. But honestly, why not directly use a scripting language like #RakuLang or #Ruby?

github.com/bitfield/script

Where Gen AI shines is the generation and ...



Where Gen AI shines is the generation and management of YAML files... e.g. Kubernetes manifests. Who likes to write YAML files by hand? #genai #llm #ai #yaml #kubernetes #k8s

At work, everybody is replacable. Some with a ...



At work, everybody is replaceable. Some with a hiccup, others with none. There will always be someone to step up after you leave.

adamstacoviak.com/im-a-cog/

I actually would switch back to #FreeBSD as ...



I would actually switch back to #FreeBSD as my main operating system for personal use on my laptop. FreeBSD used to be my daily driver a couple of years ago when I still used "normal" PCs.

www.osnews.com/story/140841/freebsd-to-invest-in-laptop-support/

Amazing Print is amazing ...



Amazing Print is amazing

github.com/amazing-print/amazing_print

Always worth a reminder: what are Bloom filters ...



Always worth a reminder: what are Bloom filters and how do they work? #bloom #bloomfilter #datastructure

micahkepe.com/blog/bloom-filters/

Some #Ruby book notes of mine: ...



Some #Ruby book notes of mine:

foo.zone/gemfeed/2025-10-11-key-takeaways-from-the-well-grounded-rubyist.html (Gemini)
foo.zone/gemfeed/2025-10-11-key-takeaways-from-the-well-grounded-rubyist.html

Sad story. #work #scrum #jira ...



Sad story. #work #scrum #jira

lambdaland.org/posts/2023-02-21_metric_worship/

One of my favorite books: "Some Thoughts on ...



One of my favorite books: "Some Thoughts on Deep Work"

atthis.link/blog/2020/deepwork.html

ltex-ls is great for integrating ...



ltex-ls is great for integrating #LanguageTool prose checking via #LSP into your #HelixEditor! There is also vale-ls, which I have enabled as well. Just download ltex-ls and configure it as an LSP for your .txt and .md docs... that's it!

valentjn.github.io/ltex/

supernote-tool is awesome, as I can now ...



supernote-tool is awesome, as I can now download my Supernote notes to my #Linux desktop and convert them into PDFs. This enables me to use the Supernote Nomad device completely offline!

Fun story! :-) The case of the 500-mile email ...



Fun story! :-) The case of the 500-mile email ― Andreas Zwinkau via @wallabagapp #unix #sunos #sendmail

beza1e1.tuxen.de/lore/500mile_email.html

Operating myself some software over 10 years of ...



Having operated software that is over 10 years old for more than a decade now, this podcast really resonated with me: #podcast #software #maintainability #maintenance

changelog.com/podcast/627

#git worktrees are awesome! ...



#git worktrees are awesome!

LLMs for anomaly detection? "While some ...



LLMs for anomaly detection? "While some ML-powered monitoring features have their place, good old-fashioned standard statistics remain hard to beat" Lessons from the pre-LLM AI in Observability: Anomaly Detection and AI-Ops vs. P99 | #llm #monitoring

quesma.com/blog-detail/aiops-observability

After having heavily vibe-coded (personal pet ...



After having heavily vibe-coded (personal pet projects) for two months over the summer, I've come back to more structured and intentional AI coding practices. Surely, it was a great learning experiment: #llm #ai #risk #code #sre #development #genai

www.okoone.com/spark/technology-innova..-..ode-is-quietly-increasing-system-risk/

Slowly, one after another, I am switching all ...



Slowly, one after another, I am switching all my Go projects to Mage. Having a Makefile or Taskfile in a native Go format is so much better.

magefile.org/

Some neat slice tricks for Go: #golang ...



Some neat slice tricks for Go: #golang

blog.devtrovert.com/p/12-slice-tricks-to-enhance-your-go

I spent way too much time on this site. It's ...



I spent way too much time on this site. It's full of tools for the #terminal! Terminal Trove - The $HOME of all things in the terminal. #linux #bsd #unix #terminal #cli #tools

terminaltrove.com/

I share similar experiences with #rust, but I ...



I share similar experiences with #rust, but I am sure one just needs a bit more time to feel productive in it. Trying Rust out just once isn't enough to become fluent in it.

m.slashdot.org/story/446164

Pipelines in Go using channels. #golang ...



Pipelines in Go using channels. #golang

go.dev/blog/pipelines

Some nifty #Ruby tricks: In my opinion, Ruby ...



Some nifty #Ruby tricks. In my opinion, Ruby is underrated. It's a great language even without Rails.

www.rubyinside.com/21-ruby-tricks-902.html

Reflects my experience ...



Reflects my experience

simonwillison.net/2025/Sep/12/matt-webb/#atom-everything

I like the fact that Markdown files, an RCS, a ...



I like the fact that Markdown files, an RCS, a text editor, and standard Unix tools like #grep and #find are all you need for taking notes digitally. I do the same :-) My favorite note-taking method

unixdigest.com/articles/my-favorite-note-taking-method.html

Rich Interactive Widgets for Terminal UIs, it ...



Rich interactive widgets for terminal UIs; it doesn't always have to be BubbleTea. #golang #terminal #widgets

github.com/rivo/tview

Always fun to dig in the #Perl @Perl woods. ...



Always fun to dig in the #Perl @Perl woods. Now, no more Perl 4 pseudo multi-dimensional hashes in Perl 5 (well, they are still there when you require an older version for compatibility via use flag, though)! :-)

www.effectiveperlprogramming.com/2024/..-..fake-multidimensional-data-structures/

How does #virtual #memory work? #ram ...



How does #virtual #memory work? #ram

drewdevault.com/2018/10/29/How-does-virtual-memory-work.html

flamelens - An interactive flamegraph viewer in ...



flamelens - An interactive flamegraph viewer in the terminal. - Terminal Trove

terminaltrove.com/flamelens/

You can now run Ansible Playbooks and shell ...



You can now run Ansible Playbooks and shell scripts from your Terraform more easily #ansible #terraform #iac

danielmschmidt.de/posts/2025-09-26-terraform-actions-introduction/

For people working with #k8s, this tool is ...



For people working with #k8s, this tool is useful. It lets you fuzzy find different k8s resource types and read a description about them: #kubernetes #fuzzy #cli #tools #devops

github.com/keisku/kubectl-explore

November 2025



Yes, using the right #tool for the job and ...



Yes, using the right #tool for the job and also learning along the way!

drewdevault.com/2016/09/17/Use-the-right-tool.html

Some neat Go tricks: #golang ...



Some neat Go tricks: #golang

harrisoncramer.me/15-go-sublteties-you-may-not-already-know/

There are some truths in this #SRE article: ...



There are some truths in this #SRE article. However, in my opinion, the more experience you have, the more you are expected to be able to resolve issues, so you can't always fall back to others. New starters are treated differently, of course. #oncall

ntietz.com/blog/what-i-tell-people-new-to-oncall/

The Go flight recorder is a tool that allows ...



The Go flight recorder is a tool that allows developers to capture and analyze the execution of Go programs. It provides insights into performance, memory usage, and other runtime characteristics by recording events and metrics during the program's execution. Yet another tool why Go is awesome! #go #golang #tools

go.dev/blog/flight-recorder

This is useful #golang ...



This is useful #golang

antonz.org/chans/

Great visually animated guide how #raft ...



Great visually animated guide how #raft #consensus works

thesecretlivesofdata.com/raft/

"Today’s junior devs who skip the “hard ...



"Today’s junior devs who skip the “hard way” may plateau early, lacking the depth to grow into senior engineers tomorrow." ... Avoiding Skill Atrophy in the Age of AI

addyo.substack.com/p/avoiding-skill-atrophy-in-the-age

I actually enjoyed reading through the #Fish ...



I actually enjoyed reading through the #Fish #shell docs. It's much cleaner than POSIX shells.

fishshell.com/docs/current/language.html

There can be many things which can go wrong, ...



There are many things that can go wrong, more than mentioned here: #linux

notes.eatonphil.com/2025-03-27-things-that-go-wrong-with-disk-io.html

IMHO, motivation is not always enough. There ...



IMHO, motivation is not always enough. There must also be some discipline. That helps when there's little or no motivation.

world.hey.com/jason/motivation-50ab8280

Have been generating those CPU flame graphs on ...



Have been generating those CPU flame graphs on bare metal, so being able to use them in k8s seems to be pretty useful to me. #flamegraphs #k8s #kubernetes

www.percona.com/blog/kubernetes-observability-code-profiling-with-flame-graphs/

I personally don't like the typical whiteboard ...



I personally don't like the typical whiteboard coding exercises, nor do I think LeetCode is the answer. It's impossible to assess the skills of a candidate in a few interviews, but it is possible to filter out the bad ones. The aim is to get an idea about the candidate and be positive about their potential. #interview #interviewing #hiring

danielabaron.me/blog/reimagining-technical-interviews/

If you've wondered how CPUs and operating ...



If you've wondered how CPUs and operating systems generally work and want the basics explained in an easily digestible format without going to college, have a look at CPU.land. I had a lot of fun reading it! #CPU

cpu.land

And there's an unexpected winner :-) #erlang ...



And there's an unexpected winner :-) #erlang #architecture

freedium.cfd/https://medium.com/@codep..-..t-wasn-t-what-we-expected-67f84c79dc34

Is it it? This is it. What Is It (in Ruby 3.4)? ...



Is it it? This is it. What Is It (in Ruby 3.4)? #ruby

kevinjmurphy.com/posts/what-is-it-in-ruby-34/

From my recent #London trip, I've uploaded ...



From my recent #London trip, I've uploaded some new street photography photos to my photo site. All photos were post-processed using open-source software, including #Darktable and #Shotwell. The site itself was generated with a simple #bash script! Not all photos are from London; just the recent additions are.

irregular.ninja

Agreed, you should make your own programming ...



Agreed, you should make your own programming language, even if it's only for the sake of learning. I also did so over a decade ago. Mine was called Fype - "For Your Program Execution"

ntietz.com/blog/you-should-make-a-new-terrible-programming-language/
foo.zone/gemfeed/2010-05-09-the-fype-programming-language.html (Gemini)
foo.zone/gemfeed/2010-05-09-the-fype-programming-language.html

Principles for C programming #C ...



Principles for C programming #C #programming

drewdevault.com/2017/03/15/How-I-learned-to-stop-worrying-and-love-C.html

#Typst appears to be a great modern ...



#Typst appears to be a great modern alternative to #LaTeX

Things you can do with a debugger but not with ...



Things you can do with a debugger but not with print debugging #debugger #debugging #coding #programming

mahesh-hegde.github.io/posts/what_debugger_can/

Neat tutorial, I think I've to try #jujutsu ...



Neat tutorial; I think I have to try #jujutsu out now! #git #vcs #jujutsu #jj

www.stavros.io/posts/switch-to-jujutsu-already-a-tutorial/

Wise words: Best practices are not rules. They ...



Wise words: Best practices are not rules. They are guidelines that help you make better decisions. They are not absolute truths, but rather suggestions based on experience and common sense. You should always use your own judgment and adapt them to your specific situation.

www.arp242.net/best-practices.html

How to build a #Linux #Container from ...



How to build a #Linux #Container from scratch without #Docker, #Podman, etc.

michalpitr.substack.com/p/linux-contai..-..rom-scratch?r=gt6tv&triedRedirect=true

When I reach the point where I am trying to ...



When I reach the point where I am trying to recover from panics in Go, something else has already gone wrong with the design of the codebase, IMHO. However, I must admit that my viewpoint may be flawed, as I code small, self-contained tools and rely on as few dependencies as possible. So I rarely rely on 3rd party libs, which may panic (which wouldn’t be nice to begin with; it would be better if they returned errors). #golang

blog.devtrovert.com/p/go-panic-and-recover-dont-make-these

Personally one of the main benefits of using ...



Personally, one of the main benefits of using #tmux over other solutions is that I can use the same setup on my personal devices (Linux and BSD) and for work (#macOS): you might not need tmux

bower.sh/you-might-not-need-tmux

December 2025



These are some nice #Ruby tricks (Ruby is one ...



These are some nice #Ruby tricks (Ruby is one of my favourite languages): 11 Ruby Tricks You Haven't Seen Before via @wallabagapp

www.rubyguides.com/2016/01/ruby-tricks/

That's fun, use the C preprocessor as a HTML ...



That's fun: use the C preprocessor as an HTML template engine! #c #cpp #fun

wheybags.com/blog/macroblog.html

#jq but for #Markdown? That's interesting, ...



#jq but for #Markdown? That's interesting; I never thought of that. mdq: jq for Markdown via @wallabagapp

github.com/yshavit/mdq

Elvish seems to be a neat little shell. It's ...



Elvish seems to be a neat little shell. It's implemented in #Golang and can make use of the great Go standard library. The language is more modern than other shells out there (e.g., supporting nested data structures) and eliminates backward compatibility issues (e.g., awkward string parsing with spaces that often causes problems in traditional shells). Elvish also comes with some neat interactive TUI elements. Furthermore, there will be a whole TUI framework built directly into the shell. If I weren't so deeply intertwined with #bash and #zsh, I would personally give #Elvish a try... Interesting, at least, it is.

elv.sh/

Google #SRE required better Wifi on the ...



Google #SRE required better Wifi on the toilet, otherwise YouTube could go down :-)

podcasts.apple.com/us/podcast/incident..-..ai-stacey/id1615778073?i=1000672365156

Indeed ...



Indeed

aaronfrancis.com/2024/because-i-wanted-to-12c5137c

Very interesting post how pods are scheduled ...



Very interesting post about how pods are scheduled and terminated, with some tips on how to improve reliability (pods may be terminated before ingress rules are updated, and some traffic may hit non-existing pods). #k8s #kubernetes

learnk8s.io/graceful-shutdown

I have added observability to the #Kubernetes ...



I have added observability to the #Kubernetes cluster in the eighth part of my #Kubernetes on #FreeBSD series. #Grafana #Loki #Prometheus #Alloy #k3s #OpenBSD #RockyLinux

foo.zone/gemfeed/2025-12-07-f3s-kubernetes-with-freebsd-part-8.html (Gemini)
foo.zone/gemfeed/2025-12-07-f3s-kubernetes-with-freebsd-part-8.html

Wondering where I could make use of it ...



Wondering where I could make use of it: An SVG is all you need #SVG

jon.recoil.org/blog/2025/12/an-svg-is-all-you-need.html

Trying out #COSMIC #Desktop... seems ...



Trying out #COSMIC #Desktop... seems snappier than #GNOME and I like the tiling features...

Best thing I've ever read about #container ...



Best thing I've ever read about #container #security in #kubernetes:

learnkube.com/security-contexts

While acknowledging luck in finding the right ...



While acknowledging luck in finding the right team and company culture, the author stresses that staying and choosing long-term ownership is a deliberate choice for those valuing deep technical ownership over external validation: Why I Ignore The Spotlight as a Staff Engineer #engineering

lalitm.com/software-engineering-outside-the-spotlight/

Great explanation #slo #sla #sli #sre ...



Great explanation #slo #sla #sli #sre

blog.alexewerlof.com/p/sla-vs-slo

Nice service, you send a drive, they host ...



Nice service, you send a drive, they host #ZFS for you!

zfs.rent/

Other related posts:

2025-01-01 Posts from October to December 2024
2025-07-01 Posts from January to June 2025
2026-01-01 Posts from July to December 2025 (You are currently reading this)

E-Mail your comments to paul@nospam.buetow.org :-)

Back to the main site
Cloudless Kobo Forma with KOReader https://foo.zone/gemfeed/2026-01-01-cloudless-kobo-forma-with-koreader.html 2025-12-31T16:08:33+02:00 Paul Buetow aka snonux paul@dev.buetow.org I am a reader, and for years I've been searching for a good digital e-reader to complement my paper books. I advocate for privacy-first and prefer open-source or self-hosted solutions. If that is not possible, I opt for offline solutions. Even if I don't have anything to hide, the tinkerer in me wants those things anyway. I found my ideal device in the Kobo Forma 7 years ago. Now, I use it without Kobo's cloud sync, and in this post, I'll show you how.

Cloudless Kobo Forma with KOReader



Published at 2025-12-31T16:08:33+02:00

I am a reader, and for years I've been searching for a good digital e-reader to complement my paper books. I advocate for privacy-first and prefer open-source or self-hosted solutions. If that is not possible, I opt for offline solutions. Even if I don't have anything to hide, the tinkerer in me wants those things anyway. I found my ideal device in the Kobo Forma 7 years ago. Now, I use it without Kobo's cloud sync, and in this post, I'll show you how.

Art by Donovan Bake

      __...--~~~~~-._   _.-~~~~~--...__
    //               `V'               \\ 
   //                 |                 \\ 
  //__...--~~~~~~-._  |  _.-~~~~~~--...__\\ 
 //__.....----~~~~._\ | /_.~~~~----.....__\\
====================\\|//====================
                dwb `---`

Table of Contents





I initially bought the Kobo Forma because I wanted a device with a large screen for reading PDFs and ePubs. However, as time went on, I became more concerned about the privacy implications of having all my reading data synced to the Kobo cloud. So, I looked into alternative ways to use this device.

KOReader running on Kobo Forma

The Kobo Forma is so old that it can't be purchased from Kobo directly anymore. But I love the form factor; it's much lighter than the Kobo Sage and still has a 7" screen. It's just that the stock firmware is becoming too slow and sluggish.

Kobo Forma

Note: Some of the screenshots in this post are taken from my Kobo Clara HD, which is another Kobo eReader I have. It's smaller and better for travel, and I use the same KOReader setup on both devices.

KOReader to the Rescue



In a world of constant connectivity, the Kobo Forma with the KOReader software offers a way out. By keeping it disconnected from the cloud, I can focus on my reading without compromising my privacy. KOReader is a versatile, open-source document and image viewer which can also be installed on some E Ink reader devices like the Kobo Forma.

KOReader

By not syncing my reading progress and library to Kobo's cloud service, I retain full ownership and control over my data. There's no risk of my personal reading habits being accessed or mined by third parties.

Installation



Installing KOReader is straightforward. You can follow the official guide for that. I used the Linux one:

https://github.com/koreader/koreader/wiki/Installation-on-desktop-linux

Basically, all I had to do was download a .zip file of the KOReader binary and an install.sh script. Then I plugged in the Kobo Forma via USB and ran the install script, which did the rest for me.

KOReader installation via USB

After the initial install, KOReader can update itself through its menus.

KOReader self-update menu

It is worth noting that after the KOReader install, the Kobo Forma still boots into the proprietary window manager. To start KOReader, you have to select it from the new "Nickel Menu". KOReader will then stay open until you reboot the device. It's a small annoyance, but it's well worth it!

Nickel Menu

Sideloaded Mode



To use the Kobo Forma completely without a Kobo account, you can enable "Sideloaded Mode". This mode allows you to use the device without being signed in to a Kobo account. When enabled, the home screen will default to your library instead of showing Kobo recommendations, and the sync button will disappear. This prevents the device from trying to sync with the Kobo cloud.

To enable it, you need to edit the configuration file. Connect your Kobo device to your computer via USB. Open the file .kobo/Kobo/Kobo eReader.conf and add the following lines:

[ApplicationPreferences]
SideloadedMode=true

After saving the file, eject the device. You might need to restart it for the changes to take effect.

KOReader is much faster than the stock firmware; it feels about three times as fast. Before trying out KOReader, I was thinking about selling the Forma as it felt too sluggish. But now there is new life in this 7-year-old device! It also offers a night mode (inverted colors), a feature that the stock firmware on the Forma is lacking.

KOReader dark mode (inverted colors)

My Workflow



My workflow is simple and efficient, relying on a direct USB connection to my Linux laptop for sideloading books and a self-hosted sync server for progress synchronization.

Sideloading Books



I connect my Kobo Forma to my Linux laptop via a USB-C cable. The device is automatically recognized as a storage device, and I can directly access its storage to copy over ePubs, PDFs, and other supported formats.

KOReader Sync Server



To keep my reading progress synchronized across multiple devices (my Kobo, my phone, and my Linux laptop), I run a koreader-sync-server instance in my k3s cluster. This allows me to pick up reading where I left off, no matter which device I'm using.

https://codeberg.org/snonux/conf/src/branch/master/f3s/kobo-sync-server

Custom sync server configuration

To configure the sync server in KOReader, open a document, go to "Settings" -> "Progress Sync", and select "Custom sync server". There you can enter the URL of your server and your credentials. The progress can then also be synced to and from KOReader running on other devices (e.g. a laptop or a smartphone).

KOReader sync menu

Exporting Book Notes and Highlights



KOReader allows you to export book notes and highlights directly from the device in various formats, including plain text and Markdown. Unfortunately, these are not automatically synced to the sync server, so I have an offline backup procedure where I regularly copy them via USB to my backup server. There is, however, a third-party KOReader plugin that appears to support syncing them automatically.

Wallabag Integration



KOReader has built-in Wallabag support. This allows me to save articles from the web to my self-hosted Wallabag instance and then read them comfortably on my Kobo.

https://wallabag.org/

I haven't tried it out yet, though. I may do so and will update this blog post once I have.

Purchasing e-books



If you search a little, you will also find stores that sell digital rights management (DRM) free e-books in ePub format. For example, buecher.de sells German and English books. Before purchasing, just make sure that the book is DRM-free (not all of their books are).

You can see all the books I've read here:

Novels I've read
Resources, Technical Books, Podcasts, Courses and Guides I recommend

Conclusion



The Kobo Forma with KOReader has become an indispensable tool for me. By using it offline and with self-hosted services, I've created a distraction-free and private reading environment. The simple, manual workflow for transferring books gives me full control over my data, and the reading experience is second to none. If you're looking for a digital e-reader that respects your privacy and helps you focus, I highly recommend giving the Kobo a try with an offline-first approach using KOReader.

Other related posts:

2026-01-01 Using Supernote Nomad offline
2026-01-01 Cloudless Kobo Forma with KOReader (You are currently reading this)

E-Mail your comments to paul@nospam.buetow.org :-)

Back to the main site
X-RAG Observability Hackathon https://foo.zone/gemfeed/2025-12-24-x-rag-observability-hackathon.html 2025-12-24T09:45:29+02:00 Paul Buetow aka snonux paul@dev.buetow.org This blog post describes my hackathon efforts adding observability to X-RAG, the extensible Retrieval-Augmented Generation (RAG) platform built by my brother Florian. I especially made time available over the weekend to join his 3-day hackathon (attending 2 days) with the goal of instrumenting his existing distributed system with observability. What started as 'let's add some metrics' turned into a comprehensive implementation of the three pillars of observability: tracing, metrics, and logs.

X-RAG Observability Hackathon



Published at 2025-12-24T09:45:29+02:00

This blog post describes my hackathon efforts adding observability to X-RAG, the extensible Retrieval-Augmented Generation (RAG) platform built by my brother Florian. I especially made time available over the weekend to join his 3-day hackathon (attending 2 days) with the goal of instrumenting his existing distributed system with observability. What started as "let's add some metrics" turned into a comprehensive implementation of the three pillars of observability: tracing, metrics, and logs.

X-RAG source code on GitHub

Table of Contents




What is X-RAG?



X-RAG is the extensible RAG (Retrieval-Augmented Generation) platform running on Kubernetes. The idea behind RAG is simple: instead of asking an LLM to answer questions from its training data alone, you first retrieve relevant documents from your own knowledge base, then feed those documents to the LLM as context. The LLM synthesises an answer grounded in your actual content—reducing hallucinations and enabling answers about private or recent information the model was never trained on.

X-RAG handles the full pipeline: ingest documents, chunk them into searchable pieces, generate vector embeddings, store them in a vector database, and at query time, retrieve relevant chunks and pass them to an LLM for answer generation. The system supports both local LLMs (Florian runs his on a beefy desktop) and cloud APIs like OpenAI. I configured an OpenAI API key since my laptop's CPU and GPU aren't fast enough for decent local inference.
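
The chunking step can be sketched in a few lines of Python. This is not X-RAG's actual implementation, just a minimal fixed-size chunker with overlap; the overlapping windows keep sentences that straddle a chunk boundary retrievable:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks with overlapping windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# 1200 characters with step 450 -> chunks starting at offsets 0, 450, 900.
chunks = chunk_text("a" * 1200, chunk_size=500, overlap=50)
```

Real chunkers usually split on sentence or paragraph boundaries instead of raw character offsets, but the overlap idea is the same.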

All services are implemented in Python. I'm more used to Ruby, Go, and Bash these days, but for this project it didn't matter—Python's OpenTelemetry integration is straightforward, I wasn't planning to write or rewrite tons of application code, and with GenAI assistance the language barrier was a non-issue. The OpenTelemetry concepts and patterns should translate to other languages too—the SDK APIs are intentionally similar across Python, Go, Java, and others.

X-RAG consists of several independently scalable microservices:

  • Search UI: FastAPI web interface for queries
  • Ingestion API: Document upload endpoint
  • Embedding Service: gRPC service for vector embeddings
  • Indexer: Kafka consumer that processes documents
  • Search Service: gRPC service orchestrating the RAG pipeline

The Embedding Service deserves extra explanation because in the beginning I didn't really know what it was. Text isn't directly searchable in a vector database—you need to convert it to numerical vectors (embeddings) that capture semantic meaning. The Embedding Service takes text chunks and calls an embedding model (OpenAI's text-embedding-3-small in my case, or a local model on Florian's setup) to produce these vectors. For the LLM search completion answer, I used gpt-4o-mini.

Similar concepts end up with similar vectors, so "What is machine learning?" and "Explain ML" produce vectors close together in the embedding space. At query time, your question gets embedded too, and the vector database finds chunks with nearby vectors—that's semantic search.
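
The "nearby vectors" intuition boils down to cosine similarity. A toy illustration with hand-made 3-dimensional vectors (real models like text-embedding-3-small return 1536 dimensions; the numbers below are made up purely to show the comparison):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

ml      = [0.9, 0.1, 0.2]  # made-up embedding for "What is machine learning?"
explain = [0.8, 0.2, 0.3]  # made-up embedding for "Explain ML"
weather = [0.1, 0.9, 0.1]  # made-up embedding for "Will it rain tomorrow?"

# The two ML questions point in nearly the same direction; the weather
# question does not.
assert cosine_similarity(ml, explain) > cosine_similarity(ml, weather)
```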

The data layer includes Weaviate (vector database with hybrid search), Kafka (message queue), MinIO (object storage), and Redis (cache). All of this runs in a Kind Kubernetes cluster for local development, with the same manifests deployable to production.

┌─────────────────────────────────────────────────────────────────────────┐
│                      X-RAG Kubernetes Cluster                           │
├─────────────────────────────────────────────────────────────────────────┤
│   ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐    │
│   │ Search UI   │  │Search Svc   │  │Embed Service│  │   Indexer   │    │
│   └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘    │
│          │                │                │                │           │
│          └────────────────┴────────────────┴────────────────┘           │
│                                    │                                    │
│                                    ▼                                    │
│          ┌─────────────┐  ┌─────────────┐  ┌─────────────┐              │
│          │  Weaviate   │  │   Kafka     │  │   MinIO     │              │
│          └─────────────┘  └─────────────┘  └─────────────┘              │
└─────────────────────────────────────────────────────────────────────────┘

Running Kubernetes locally with Kind



X-RAG runs on Kubernetes, but you don't need a cloud account to develop it. The project uses Kind (Kubernetes in Docker)—a tool originally created by the Kubernetes SIG for testing Kubernetes itself.

Kind - Kubernetes in Docker

Kind spins up a full Kubernetes cluster using Docker containers as nodes. The control plane (API server, etcd, scheduler, controller-manager) runs in one container, and worker nodes run in separate containers. Inside these "node containers," pods run just like they would on real servers—using containerd as the container runtime. It's containers all the way down.

Technically, each Kind node is a Docker container running a minimal Linux image with kubelet and containerd installed. When you deploy a pod, kubelet inside the node container instructs containerd to pull and run the container image. So you have Docker running node containers, and inside those, containerd running application containers. Network-wise, Kind sets up a Docker bridge network and uses CNI plugins (kindnet by default) for pod networking within the cluster.

$ docker ps --format "table {{.Names}}\t{{.Image}}"
NAMES                  IMAGE
xrag-k8-control-plane  kindest/node:v1.32.0
xrag-k8-worker         kindest/node:v1.32.0
xrag-k8-worker2        kindest/node:v1.32.0

The kindest/node image contains everything needed: kubelet, containerd, CNI plugins, and pre-pulled pause containers. Port mappings in the Kind config expose services to the host—that's how http://localhost:8080 reaches the search-ui running inside a pod, inside a worker container, inside Docker.

┌─────────────────────────────────────────────────────────────────────────┐
│                           Docker Host                                   │
├─────────────────────────────────────────────────────────────────────────┤
│  ┌───────────────────┐  ┌───────────────────┐  ┌───────────────────┐    │
│  │ xrag-k8-control   │  │ xrag-k8-worker    │  │ xrag-k8-worker2   │    │
│  │ -plane (container)│  │ (container)       │  │ (container)       │    │
│  │                   │  │                   │  │                   │    │
│  │ K8s API server    │  │ Pods:             │  │ Pods:             │    │
│  │ etcd, scheduler   │  │ • search-ui       │  │ • weaviate        │    │
│  │                   │  │ • search-service  │  │ • kafka           │    │
│  │                   │  │ • embedding-svc   │  │ • prometheus      │    │
│  │                   │  │ • indexer         │  │ • grafana         │    │
│  └───────────────────┘  └───────────────────┘  └───────────────────┘    │
└─────────────────────────────────────────────────────────────────────────┘

Why Kind? It gives you a real Kubernetes environment—the same manifests deploy to production clouds unchanged. No minikube quirks, no Docker Compose translation layer. Just Kubernetes. I already have a k3s cluster running at home, but Kind made collaboration easier—everyone working on X-RAG gets the exact same setup by cloning the repo and running make cluster-start.

Florian developed X-RAG on macOS, but it worked seamlessly on my Linux laptop. The only difference was Docker's resource allocation: on macOS you configure limits in Docker Desktop, on Linux it uses host resources directly. That's because on macOS the Linux Docker containers run inside a virtual machine, as macOS is not Linux.

My hardware: a ThinkPad X1 Carbon Gen 9 with an 11th Gen Intel Core i7-1185G7 (4 cores, 8 threads at 3.00GHz) and 32GB RAM (running Fedora Linux). During the hackathon, memory usage peaked around 15GB—comfortable headroom. CPU was the bottleneck; with ~38 pods running across all namespaces (rag-system, monitoring, kube-system, etc.), plus Discord for the remote video call and Tidal streaming hi-res music, things got tight. When rebuilding Docker images or restarting the cluster, Discord video and audio would stutter—my fellow hackers probably wondered why I kept freezing mid-sentence. A beefier CPU would have meant less waiting and smoother calls, but it was manageable.

Motivation



When I joined the hackathon, Florian's X-RAG was functional but opaque. With five services communicating via gRPC, Kafka, and HTTP, debugging was cumbersome. When a search request took 5 seconds, there was no visibility into where the time was being spent. Was it the embedding generation? The vector search? The LLM synthesis? Nobody would be able to figure it out quickly.

Distributed systems are inherently opaque. Each service logs its own view of the world, but correlating events across service boundaries is archaeology. Grepping through logs on many pods, trying to mentally reconstruct what happened—not fun. This made it the perfect hackathon project: explore this observability stack in greater depth.

The observability stack



Before diving into implementation, here's what I deployed. The complete stack runs in the monitoring namespace:

$ kubectl get pods -n monitoring
NAME                                  READY   STATUS
alloy-84ddf4cd8c-7phjp                1/1     Running
grafana-6fcc89b4d6-pnh8l              1/1     Running
kube-state-metrics-5d954c569f-2r45n   1/1     Running
loki-8c9bbf744-sc2p5                  1/1     Running
node-exporter-kb8zz                   1/1     Running
node-exporter-zcrdz                   1/1     Running
node-exporter-zmskc                   1/1     Running
prometheus-7f755f675-dqcht            1/1     Running
tempo-55df7dbcdd-t8fg9                1/1     Running

Each component has a specific role:

  • Grafana Alloy: The unified collector. Receives OTLP from applications, scrapes Prometheus endpoints, tails log files. Think of it as the central nervous system.
  • Prometheus: Time-series database for metrics. Stores counters, gauges, and histograms with 15-day retention.
  • Tempo: Trace storage. Receives spans via OTLP, correlates them by trace ID, enables TraceQL queries.
  • Loki: Log aggregation. Indexes labels (namespace, pod, container), stores log chunks, enables LogQL queries.
  • Grafana: The unified UI. Queries all three backends, correlates signals, displays dashboards.
  • kube-state-metrics: Exposes Kubernetes object metrics (pod status, deployments, resource requests).
  • node-exporter: Exposes host-level metrics (CPU, memory, disk, network) from each Kubernetes node.

Everything is accessible via port-forwards:

  • Grafana: http://localhost:3000 (unified UI for all three signals)
  • Prometheus: http://localhost:9090 (metrics queries)
  • Tempo: http://localhost:3200 (trace queries)
  • Loki: http://localhost:3100 (log queries)

Grafana Alloy: the unified collector



Before diving into the individual signals, I want to highlight Grafana Alloy—the component that ties everything together. Alloy is Grafana's vendor-neutral OpenTelemetry Collector distribution, and it became the backbone of the observability stack.

Grafana Alloy documentation

Why use a centralised collector instead of having each service push directly to backends?

  • Decoupling: Applications don't need to know about Prometheus, Tempo, or Loki. They speak OTLP, and Alloy handles the translation.
  • Unified timestamps: All telemetry flows through one system, making correlation in Grafana more reliable.
  • Processing pipeline: Batch data before sending, filter noisy metrics, enrich with labels—all in one place.
  • Backend flexibility: Switch from Tempo to Jaeger without changing application code.

Alloy uses a configuration language called River, which feels similar to Terraform's HCL—declarative blocks with attributes. If you've written Terraform, River will look familiar. The full Alloy configuration runs to over 1400 lines with comments explaining each section. It handles OTLP receiving, batch processing, Prometheus export, Tempo export, Kubernetes metrics scraping, infrastructure metrics, and pod log collection. All three signals—metrics, traces, logs—flow through this single component, making Alloy the central nervous system of the observability stack.

In the following sections, I'll cover each observability pillar and show the relevant Alloy configuration for each.

Centralised logging with Loki



Getting all logs in one place was the foundation. I deployed Grafana Loki in the monitoring namespace, with Grafana Alloy running as a DaemonSet on each node to collect logs.

┌──────────────────────────────────────────────────────────────────────┐
│                           LOGS PIPELINE                              │
├──────────────────────────────────────────────────────────────────────┤
│  Applications write to stdout → containerd stores in /var/log/pods   │
│                                    │                                 │
│                              File tail                               │
│                                    ▼                                 │
│                         Grafana Alloy (DaemonSet)                    │
│                    Discovers pods, extracts metadata                 │
│                                    │                                 │
│                       HTTP POST /loki/api/v1/push                    │
│                                    ▼                                 │
│                           Grafana Loki                               │
│                   Indexes labels, stores chunks                      │
└──────────────────────────────────────────────────────────────────────┘

Alloy configuration for logs



Alloy discovers pods via the Kubernetes API, tails their log files from /var/log/pods/, and ships to Loki. Importantly, Alloy runs as a DaemonSet on each worker node—it doesn't run inside the application pods. Since containerd writes all container stdout/stderr to /var/log/pods/ on the node's filesystem, Alloy can tail logs for every pod on that node from a single location without any sidecar injection:

loki.source.kubernetes "pod_logs" {
  targets    = discovery.relabel.pod_logs.output
  forward_to = [loki.process.pod_logs.receiver]
}

loki.write "default" {
  endpoint {
    url = "http://loki.monitoring.svc.cluster.local:3100/loki/api/v1/push"
  }
}

Querying logs with LogQL



Now I could query logs in Loki (e.g. via Grafana UI) with LogQL:

{namespace="rag-system", container="search-ui"} |= "ERROR"

Metrics with Prometheus



I added Prometheus metrics to every service. Following the Four Golden Signals (latency, traffic, errors, saturation), I instrumented the codebase with histograms, counters, and gauges:

from prometheus_client import Histogram, Counter, Gauge

search_duration = Histogram(
    "search_service_request_duration_seconds",
    "Total duration of Search Service requests",
    ["method"],
    buckets=[0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0, 20.0, 30.0, 60.0],
)

errors_total = Counter(
    "search_service_errors_total",
    "Error count by type",
    ["method", "error_type"],
)
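
One thing worth understanding about the buckets list above: Prometheus histogram buckets are cumulative, each counting every observation less than or equal to its upper bound. A minimal sketch of that bucketing logic (not prometheus_client's internals, just the idea):

```python
def histogram_counts(observations: list[float], buckets: list[float]) -> dict:
    """Cumulative bucket counts, the way Prometheus histograms report them."""
    counts = {}
    for le in buckets:
        # Each bucket counts ALL observations up to its upper bound.
        counts[f"le={le}"] = sum(1 for obs in observations if obs <= le)
    counts["le=+Inf"] = len(observations)  # +Inf bucket holds everything
    return counts

durations = [0.08, 0.3, 0.9, 4.2, 12.0]  # request durations in seconds
counts = histogram_counts(durations, [0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0])
# counts["le=1.0"] is 3 because 0.08, 0.3, and 0.9 all fall at or below 1.0.
```

This cumulative layout is what lets PromQL's histogram_quantile() estimate percentiles from bucket counters.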

Initially, I used Prometheus scraping—each service exposed a /metrics endpoint, and Prometheus pulled metrics every 15 seconds. This worked, but I wanted a unified pipeline.

Alloy configuration for application metrics



The breakthrough came with Grafana Alloy as an OpenTelemetry collector. Services now push metrics via OTLP (OpenTelemetry Protocol), and Alloy converts them to Prometheus format:

┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐
│ search-ui   │  │search-svc   │  │embed-svc    │  │  indexer    │
│ OTel Meter  │  │ OTel Meter  │  │ OTel Meter  │  │ OTel Meter  │
│      │      │  │      │      │  │      │      │  │      │      │
│ OTLPExporter│  │ OTLPExporter│  │ OTLPExporter│  │ OTLPExporter│
└──────┬──────┘  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘
       │                │                │                │
       └────────────────┴────────────────┴────────────────┘
                                 │
                                 ▼ OTLP/gRPC (port 4317)
                        ┌─────────────────────┐
                        │   Grafana Alloy     │
                        └──────────┬──────────┘
                                   │ prometheus.remote_write
                                   ▼
                        ┌─────────────────────┐
                        │    Prometheus       │
                        └─────────────────────┘

Alloy receives OTLP on ports 4317 (gRPC) or 4318 (HTTP), batches the data for efficiency, and exports to Prometheus:

otelcol.receiver.otlp "default" {
  grpc { endpoint = "0.0.0.0:4317" }
  http { endpoint = "0.0.0.0:4318" }
  output {
    metrics = [otelcol.processor.batch.metrics.input]
    traces  = [otelcol.processor.batch.traces.input]
  }
}

otelcol.processor.batch "metrics" {
  timeout = "5s"
  send_batch_size = 1000
  output { metrics = [otelcol.exporter.prometheus.default.input] }
}

otelcol.exporter.prometheus "default" {
  forward_to = [prometheus.remote_write.prom.receiver]
}

Instead of sending each metric individually, Alloy accumulates up to 1000 metrics (or waits 5 seconds) before flushing. This reduces network overhead and protects backends from being overwhelmed.
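
The size-or-timeout behaviour can be sketched as a toy model (this is not Alloy's implementation, just the principle):

```python
import time

class Batcher:
    """Accumulate items; flush when the batch size or the timeout is reached."""

    def __init__(self, flush, max_size=1000, timeout=5.0):
        self.flush = flush          # callback receiving one full batch
        self.max_size = max_size
        self.timeout = timeout
        self.items = []
        self.first_item_at = None

    def add(self, item):
        if not self.items:
            self.first_item_at = time.monotonic()
        self.items.append(item)
        too_big = len(self.items) >= self.max_size
        too_old = time.monotonic() - self.first_item_at >= self.timeout
        if too_big or too_old:
            self.flush(self.items)
            self.items = []

batches = []
b = Batcher(batches.append, max_size=3, timeout=5.0)
for metric in ["m1", "m2", "m3", "m4"]:
    b.add(metric)
# The size limit of 3 triggers one flush; "m4" waits for the next batch.
```

A real batcher would also flush on a background timer; here the timeout is only checked when a new item arrives, which keeps the sketch short.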

Kubernetes metrics: kubelet, cAdvisor, and kube-state-metrics



Alloy also pulls metrics from Kubernetes itself—kubelet resource metrics, cAdvisor container metrics, and kube-state-metrics for cluster state.

Why three separate sources? It does feel fragmented, but each serves a distinct purpose:

  • kubelet exposes resource metrics about pod CPU and memory usage from its own bookkeeping—lightweight summaries of what's running on each node.
  • cAdvisor (Container Advisor) runs inside kubelet and provides detailed container-level metrics: CPU throttling, memory working sets, filesystem I/O, network bytes. These are the raw runtime stats from containerd.
  • kube-state-metrics is different—it doesn't measure resource usage at all. Instead, it queries the Kubernetes API and exposes the *desired state*: how many replicas a Deployment wants, whether a Pod is pending or running, what resource requests and limits are configured.

You need all three because "container used 500MB" (cAdvisor), "pod requested 1GB" (kube-state-metrics), and "node has 4GB available" (kubelet) are complementary views. The fragmentation is a consequence of Kubernetes' architecture—no single component has the complete picture.

None of these components speak OpenTelemetry—they all expose Prometheus-format metrics via HTTP endpoints. That's why Alloy uses prometheus.scrape instead of receiving OTLP pushes. Alloy handles both worlds: OTLP from our applications, Prometheus scraping for infrastructure.

prometheus.scrape "kubelet_resource" {
  targets         = discovery.relabel.kubelet.output
  job_name        = "kubelet-resource"
  scheme          = "https"
  scrape_interval = "30s"
  bearer_token_file = "/var/run/secrets/kubernetes.io/serviceaccount/token"
  tls_config { insecure_skip_verify = true }
  forward_to      = [prometheus.remote_write.prom.receiver]
}

prometheus.scrape "cadvisor" {
  targets         = discovery.relabel.cadvisor.output
  job_name        = "cadvisor"
  scheme          = "https"
  scrape_interval = "60s"
  bearer_token_file = "/var/run/secrets/kubernetes.io/serviceaccount/token"
  tls_config { insecure_skip_verify = true }
  forward_to      = [prometheus.relabel.cadvisor_filter.receiver]
}

prometheus.scrape "kube_state_metrics" {
  targets = [
    {"__address__" = "kube-state-metrics.monitoring.svc.cluster.local:8080"},
  ]
  job_name        = "kube-state-metrics"
  scrape_interval = "30s"
  forward_to      = [prometheus.relabel.kube_state_filter.receiver]
}

Note that kubelet and cAdvisor require HTTPS with bearer token authentication (using the service account token mounted by Kubernetes), while kube-state-metrics is a simple HTTP target. cAdvisor is scraped less frequently (60s) because it returns many more metrics with higher cardinality.

Infrastructure metrics: Kafka, Redis, MinIO



Application metrics weren't enough. I also needed visibility into the data layer. Each infrastructure component has a specific role in X-RAG and got its own exporter:

Redis is the caching layer. It stores search results and embeddings to avoid redundant API calls to OpenAI. We collect 25 metrics via oliver006/redis_exporter running as a sidecar, including cache hit/miss rates, memory usage, connected clients, and command latencies. The key metric? redis_keyspace_hits_total / (redis_keyspace_hits_total + redis_keyspace_misses_total) tells you if caching is actually helping.

Kafka is the message queue connecting the ingestion API to the indexer. Documents are published to a topic, and the indexer consumes them asynchronously. We collect 12 metrics via danielqsj/kafka-exporter, with consumer lag being the most critical—it shows how far behind the indexer is. High lag means documents aren't being indexed fast enough.

MinIO is the S3-compatible object storage where raw documents are stored before processing. We collect 16 metrics from its native /minio/v2/metrics/cluster endpoint, covering request rates, error counts, storage usage, and cluster health.

You can verify these counts by querying Prometheus directly:

$ curl -s 'http://localhost:9090/api/v1/label/__name__/values' \
    | jq -r '.data[]' | grep -c '^redis_'
25
$ curl -s 'http://localhost:9090/api/v1/label/__name__/values' \
    | jq -r '.data[]' | grep -c '^kafka_'
12
$ curl -s 'http://localhost:9090/api/v1/label/__name__/values' \
    | jq -r '.data[]' | grep -c '^minio_'
16

Full Alloy configuration with detailed metric filtering

Alloy scrapes all of these and remote-writes to Prometheus:

prometheus.scrape "redis_exporter" {
  targets = [
    {"__address__" = "xrag-redis.rag-system.svc.cluster.local:9121"},
  ]
  job_name        = "redis"
  scrape_interval = "30s"
  forward_to      = [prometheus.relabel.redis_filter.receiver]
}

prometheus.scrape "kafka_exporter" {
  targets = [
    {"__address__" = "kafka-exporter.rag-system.svc.cluster.local:9308"},
  ]
  job_name        = "kafka"
  scrape_interval = "30s"
  forward_to      = [prometheus.relabel.kafka_filter.receiver]
}

prometheus.scrape "minio" {
  targets = [
    {"__address__" = "xrag-minio.rag-system.svc.cluster.local:9000"},
  ]
  job_name     = "minio"
  metrics_path = "/minio/v2/metrics/cluster"
  scrape_interval = "30s"
  forward_to   = [prometheus.relabel.minio_filter.receiver]
}

Note that MinIO exposes metrics at a custom path (/minio/v2/metrics/cluster) rather than the default /metrics. Each exporter forwards to a relabel component that filters down to essential metrics before sending to Prometheus.

With all metrics in Prometheus, I can use PromQL queries in Grafana dashboards. For example, to check Kafka consumer lag and see if the indexer is falling behind:

sum by (consumergroup, topic) (kafka_consumergroup_lag)

Or check Redis cache effectiveness:

redis_keyspace_hits_total / (redis_keyspace_hits_total + redis_keyspace_misses_total)

Distributed tracing with Tempo



Understanding traces, spans, and the trace tree



Before diving into the implementation, let me explain the core concepts I learned. A trace represents a single request's journey through the entire distributed system. Think of it as a receipt that follows your request from the moment it enters the system until the final response.

Each trace is identified by a trace ID—a 128-bit identifier (32 hex characters) that stays constant across all services. When I make a search request, every service handling that request uses the same trace ID: 9df981cac91857b228eca42b501c98c6.

Quick video explaining the difference between trace IDs and span IDs in OpenTelemetry

Within a trace, individual operations are recorded as spans. A span has:

  • A span ID: 64-bit identifier (16 hex characters) unique to this operation
  • A parent span ID: links this span to its caller
  • A name: what operation this represents (e.g., "POST /api/search")
  • Start time and duration
  • Attributes: key-value metadata (e.g., http.status_code=200)

The first span in a trace is the root span—it has no parent. When the root span calls another service, that service creates a child span with the root's span ID as its parent. This parent-child relationship forms a tree structure:

                        ┌─────────────────────────┐
                        │      Root Span          │
                        │  POST /api/search       │
                        │  span_id: a1b2c3d4...   │
                        │  parent: (none)         │
                        └───────────┬─────────────┘
                                    │
              ┌─────────────────────┴─────────────────────┐
              │                                           │
              ▼                                           ▼
┌─────────────────────────┐             ┌─────────────────────────┐
│      Child Span         │             │      Child Span         │
│  gRPC Search            │             │  render_template        │
│  span_id: e5f6g7h8...   │             │  span_id: i9j0k1l2...   │
│  parent: a1b2c3d4...    │             │  parent: a1b2c3d4...    │
└───────────┬─────────────┘             └─────────────────────────┘
            │
            ├──────────────────┬──────────────────┐
            ▼                  ▼                  ▼
     ┌────────────┐     ┌────────────┐     ┌────────────┐
     │ Grandchild │     │ Grandchild │     │ Grandchild │
     │ embedding  │     │ vector     │     │ llm.rag    │
     │ .generate  │     │ _search    │     │ _completion│
     └────────────┘     └────────────┘     └────────────┘

This tree structure answers the critical question: "What called what?" When I see a slow span, I can trace up to see what triggered it and down to see what it's waiting on.
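
Reconstructing that tree from raw spans is a small exercise in following parent IDs. A sketch, assuming spans arrive as hypothetical (span_id, parent_id, name) tuples:

```python
def build_trace_tree(spans):
    """Group spans by parent ID, then render the tree from the root down."""
    children = {}
    names = {}
    root = None
    for span_id, parent_id, name in spans:
        names[span_id] = name
        if parent_id is None:
            root = span_id  # the root span has no parent
        else:
            children.setdefault(parent_id, []).append(span_id)

    def render(span_id, depth=0):
        lines = ["  " * depth + names[span_id]]
        for child in children.get(span_id, []):
            lines.extend(render(child, depth + 1))
        return lines

    return render(root)

spans = [
    ("a1", None, "POST /api/search"),
    ("e5", "a1", "gRPC Search"),
    ("i9", "a1", "render_template"),
    ("g1", "e5", "embedding.generate"),
]
print("\n".join(build_trace_tree(spans)))
```

This is essentially what Tempo's trace view does for you, with timing and attributes layered on top.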

How trace context propagates



The magic that links spans across services is trace context propagation. When Service A calls Service B, it must pass along the trace ID and its own span ID (which becomes the parent). OpenTelemetry uses the W3C traceparent header:

traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01
             │   │                                │                 │
             │   │                                │                 └── flags
             │   │                                └── parent span ID (16 hex)
             │   └── trace ID (32 hex)
             └── version

For HTTP, this travels as a request header. For gRPC, it's passed as metadata. For Kafka, it's embedded in message headers. The receiving service extracts this context, creates a new span with the propagated trace ID and the caller's span ID as parent, then continues the chain.
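
Parsing that header is straightforward. A minimal sketch of extracting the four fields from the W3C layout shown above (real implementations also validate that the IDs are hex and non-zero):

```python
def parse_traceparent(header: str) -> dict:
    """Split a W3C traceparent header into its four fields."""
    version, trace_id, parent_span_id, flags = header.split("-")
    assert len(trace_id) == 32 and len(parent_span_id) == 16
    return {
        "version": version,
        "trace_id": trace_id,              # constant across all services
        "parent_span_id": parent_span_id,  # the caller's span
        "sampled": flags == "01",          # sampled flag bit set
    }

ctx = parse_traceparent(
    "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01"
)
# ctx["trace_id"] == "0af7651916cd43dd8448eb211c80319c"
```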

This is why all my spans link together—OpenTelemetry's auto-instrumentation handles propagation automatically for HTTP, gRPC, and Kafka clients.

Implementation



This is where distributed tracing made the difference. I integrated OpenTelemetry auto-instrumentation for FastAPI, gRPC, and HTTP clients, plus manual spans for RAG-specific operations:

from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor
from opentelemetry.instrumentation.grpc import GrpcAioInstrumentorClient

# Auto-instrument frameworks
FastAPIInstrumentor.instrument_app(app)
GrpcAioInstrumentorClient().instrument()

# Manual spans for custom operations
with tracer.start_as_current_span("llm.rag_completion") as span:
    span.set_attribute("llm.model", model_name)
    result = await generate_answer(query, context)

Auto-instrumentation is the quick win: one line of code and you get spans for every HTTP request, gRPC call, or database query. The instrumentor patches the framework at runtime, so existing code works without modification. The downside? You only get what the library authors decided to capture—generic HTTP attributes like http.method and http.status_code, but nothing domain-specific. Auto-instrumented spans also can't know your business logic, so a slow request shows up as "POST /api/search took 5 seconds" without revealing which internal operation caused the delay.

Manual spans fill that gap. By wrapping specific operations (like llm.rag_completion or vector_search.query), you get visibility into your application's unique behaviour. You can add custom attributes (llm.model, query.top_k, cache.hit) that make traces actually useful for debugging. The downside is maintenance: manual spans are code you write and maintain, and you need to decide where instrumentation adds value versus where it just adds noise. In practice, I found the right balance was auto-instrumentation for framework boundaries (HTTP, gRPC) plus manual spans for the 5-10 operations that actually matter for understanding performance.

The magic is trace context propagation. When the Search UI calls the Search Service via gRPC, the trace ID travels in metadata headers:

Metadata: [
  ("traceparent", "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01"),
  ("content-type", "application/grpc"),
]

Spans from all services are linked by this trace ID, forming a tree:

Trace ID: 0af7651916cd43dd8448eb211c80319c

├─ [search-ui] POST /api/search (300ms)
│   │
│   ├─ [search-service] Search (gRPC server) (275ms)
│   │   │
│   │   ├─ [search-service] embedding.generate (50ms)
│   │   │   └─ [embedding-service] Embed (45ms)
│   │   │       └─ POST https://api.openai.com (35ms)
│   │   │
│   │   ├─ [search-service] vector_search.query (100ms)
│   │   │
│   │   └─ [search-service] llm.rag_completion (120ms)
│   │       └─ [search-service] openai.chat (115ms)

Alloy configuration for traces



Traces are collected by Alloy and stored in Grafana Tempo. Alloy batches traces for efficiency before exporting via OTLP:

otelcol.processor.batch "traces" {
  timeout = "5s"
  send_batch_size = 500
  output { traces = [otelcol.exporter.otlp.tempo.input] }
}

otelcol.exporter.otlp "tempo" {
  client {
    endpoint = "tempo.monitoring.svc.cluster.local:4317"
    tls { insecure = true }
  }
}

In Tempo's UI, I can finally see exactly where time is spent. That 5-second query? Turns out the vector search was waiting on a cold Weaviate connection. Now I knew what to fix.

Async ingestion trace walkthrough



One of the most powerful aspects of distributed tracing is following requests across async boundaries like message queues. The document ingestion pipeline flows through Kafka, creating spans that are linked even though they execute in different processes at different times.

Step 1: Ingest a document



$ curl -s -X POST http://localhost:8082/ingest \
  -H "Content-Type: application/json" \
  -d '{
    "text": "This is the X-RAG Observability Guide...",
    "metadata": {
      "title": "X-RAG Observability Guide",
      "source_file": "docs/OBSERVABILITY.md",
      "type": "markdown"
    },
    "namespace": "default"
  }' | jq .
{
  "document_id": "8538656a-ba99-406c-8da7-87c5f0dda34d",
  "status": "accepted",
  "minio_bucket": "documents",
  "minio_key": "8538656a-ba99-406c-8da7-87c5f0dda34d.json",
  "message": "Document accepted for processing"
}

The ingestion API immediately returns—it doesn't wait for indexing. The document is stored in MinIO and a message is published to Kafka.
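
The shape of that hand-off can be sketched in a few lines. This is a stand-in with hypothetical names—the real pipeline uses MinIO for storage and Kafka for the queue, not an in-process queue:

```python
import queue
import uuid

# In-process stand-in for the Kafka topic (illustrative only).
events = queue.Queue()

def ingest(text: str, traceparent: str) -> dict:
    """Store the document, publish an event, return immediately."""
    doc_id = str(uuid.uuid4())
    # The message headers carry the traceparent, so the indexer's spans
    # can link back to this HTTP request across the async boundary.
    events.put({"doc_id": doc_id, "text": text,
                "headers": {"traceparent": traceparent}})
    return {"document_id": doc_id, "status": "accepted"}

resp = ingest("hello", "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01")
msg = events.get()  # the indexer consumes this later, at its own pace
```

The key detail is that the trace context rides along in the message headers, not in the payload, which is exactly what OpenTelemetry's Kafka instrumentation does.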

Step 2: Find the ingestion trace



Using Tempo's HTTP API (port 3200), we can search for traces by span name using TraceQL:

$ curl -s -G "http://localhost:3200/api/search" \
  --data-urlencode 'q={name="POST /ingest"}' \
  --data-urlencode 'limit=3' | jq '.traces[0].traceID'
"b3fc896a1cf32b425b8e8c46c86c76f7"

Step 3: Fetch the complete trace



$ curl -s "http://localhost:3200/api/traces/b3fc896a1cf32b425b8e8c46c86c76f7" \
  | jq '[.batches[] | ... | {service, span}] | unique'
[
  { "service": "ingestion-api", "span": "POST /ingest" },
  { "service": "ingestion-api", "span": "storage.upload" },
  { "service": "ingestion-api", "span": "messaging.publish" },
  { "service": "indexer", "span": "indexer.process_document" },
  { "service": "indexer", "span": "document.duplicate_check" },
  { "service": "indexer", "span": "document.pipeline" },
  { "service": "indexer", "span": "storage.download" },
  { "service": "indexer", "span": "/xrag.embedding.EmbeddingService/EmbedBatch" },
  { "service": "embedding-service", "span": "openai.embeddings" },
  { "service": "indexer", "span": "db.insert" }
]

The trace spans three services: ingestion-api, indexer, and embedding-service. The trace context propagates through Kafka, linking the original HTTP request to the async consumer processing.

Step 4: Analyse the async trace



ingestion-api | POST /ingest             |   16ms  ← HTTP response returns
ingestion-api | storage.upload           |   13ms  ← Save to MinIO
ingestion-api | messaging.publish        |    1ms  ← Publish to Kafka
              |                          |         
              | ~~~ Kafka queue ~~~      |         ← Async boundary
              |                          |         
indexer       | indexer.process_document | 1799ms  ← Consumer picks up message
indexer       | document.duplicate_check |    1ms
indexer       | document.pipeline        | 1796ms
indexer       | storage.download         |    1ms  ← Fetch from MinIO
indexer       | EmbedBatch (gRPC)        |  754ms  ← Call embedding service
embedding-svc | openai.embeddings        |  752ms  ← OpenAI API
indexer       | db.insert                | 1038ms  ← Store in Weaviate

The total async processing takes ~1.8 seconds, but the user sees a 16ms response. Without tracing, debugging "why isn't my document showing up in search results?" would require correlating logs from three services manually.

Key insight: The trace context propagates through Kafka message headers, allowing the indexer's spans to link back to the original ingestion request. This is configured via OpenTelemetry's Kafka instrumentation.

Viewing traces in Grafana



To view a trace in Grafana's UI:

1. Open Grafana at http://localhost:3000/explore
2. Select Tempo as the data source (top-left dropdown)
3. Choose TraceQL as the query type
4. Paste the trace ID: b3fc896a1cf32b425b8e8c46c86c76f7
5. Click Run query

The trace viewer shows a Gantt chart with all spans, their timing, and parent-child relationships. Click any span to see its attributes.

Async ingestion trace in Grafana Tempo

Ingestion trace node graph showing service dependencies

End-to-end search trace walkthrough



To demonstrate the observability stack in action, here's a complete trace from a search request through all services.

Step 1: Make a search request



Normally you'd use the Search UI web interface at http://localhost:8080, but for demonstration purposes curl makes it easier to show the raw request and response:

$ curl -s -X POST http://localhost:8080/api/search \
  -H "Content-Type: application/json" \
  -d '{"query": "What is RAG?", "namespace": "default", "mode": "hybrid", "top_k": 5}' | jq .
{
  "answer": "I don't have enough information to answer this question.",
  "sources": [
    {
      "id": "71adbc34-56c1-4f75-9248-4ed38094ac69",
      "content": "# X-RAG Observability Guide This document describes...",
      "score": 0.8292956352233887,
      "metadata": {
        "source": "docs/OBSERVABILITY.md",
        "type": "markdown",
        "namespace": "default"
      }
    }
  ],
  "metadata": {
    "namespace": "default",
    "num_sources": "5",
    "cache_hit": "False",
    "mode": "hybrid",
    "top_k": "5",
    "trace_id": "9df981cac91857b228eca42b501c98c6"
  }
}

The response includes a trace_id that links this request to all spans across services.

Step 2: Query Tempo for the trace



Using the trace ID from the response, query Tempo's API:

$ curl -s "http://localhost:3200/api/traces/9df981cac91857b228eca42b501c98c6" \
  | jq '.batches[].scopeSpans[].spans[] 
        | {name, service: .attributes[] 
           | select(.key=="service.name") 
           | .value.stringValue}'

The raw trace shows spans from multiple services:

  • search-ui: POST /api/search (root span, 2138ms total)
  • search-ui: /xrag.search.SearchService/Search (gRPC client call)
  • search-service: /xrag.search.SearchService/Search (gRPC server)
  • search-service: /xrag.embedding.EmbeddingService/Embed (gRPC client)
  • embedding-service: /xrag.embedding.EmbeddingService/Embed (gRPC server)
  • embedding-service: openai.embeddings (OpenAI API call, 647ms)
  • embedding-service: POST https://api.openai.com/v1/embeddings (HTTP client)
  • search-service: vector_search.query (Weaviate hybrid search, 13ms)
  • search-service: openai.chat (LLM answer generation, 1468ms)
  • search-service: POST https://api.openai.com/v1/chat/completions (HTTP client)

Step 3: Analyse the trace



From this single trace, I can see exactly where time is spent:

Total request:                     2138ms
├── gRPC to search-service:        2135ms
│   ├── Embedding generation:       649ms
│   │   └── OpenAI embeddings API:   640ms
│   ├── Vector search (Weaviate):    13ms
│   └── LLM answer generation:     1468ms
│       └── OpenAI chat API:       1463ms

The bottleneck is clear: 68% of time is spent in LLM answer generation. The vector search (13ms) and embedding generation (649ms) are relatively fast. Without tracing, I would have guessed the embedding service was slow—traces proved otherwise.
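
The percentages fall straight out of the span durations:

```python
# Span durations from the trace above, in milliseconds.
total = 2138
spans = {
    "llm_answer_generation": 1468,
    "embedding_generation": 649,
    "vector_search": 13,
}
# Truncated percentage of the total request each operation accounts for.
share = {name: int(100 * ms / total) for name, ms in spans.items()}
```

LLM synthesis dominates; vector search is negligible. Any optimisation effort should start at the chat-completion call, not the embedding service.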

Step 4: Search traces with TraceQL



Tempo supports TraceQL for querying traces by attributes:

$ curl -s -G "http://localhost:3200/api/search" \
  --data-urlencode 'q={resource.service.name="search-service"}' \
  --data-urlencode 'limit=5' | jq '.traces[:2] | .[].rootTraceName'
"/xrag.search.SearchService/Search"
"GET /health/ready"

Other useful TraceQL queries:

# Find slow searches (> 2 seconds)
{resource.service.name="search-ui" && name="POST /api/search" && duration > 2s}

# Find errors
{status=error}

# Find OpenAI calls
{name=~"openai.*"}

Viewing the search trace in Grafana



Follow the same steps as above, but use the search trace ID: 9df981cac91857b228eca42b501c98c6

Search trace in Grafana Tempo

Search trace node graph showing service flow

Correlating the three signals



The real power comes from correlating traces, metrics, and logs. When an alert fires for high error rate, I follow this workflow:

1. Metrics: Prometheus shows error spike started at 10:23:00
2. Traces: Query Tempo for traces with status=error around that time
3. Logs: Use the trace ID to find detailed error messages in Loki

{namespace="rag-system"} |= "trace_id=abc123" |= "error"

Prometheus exemplars link specific metric samples to trace IDs, so I can click directly from a latency spike to the responsible trace.
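
This correlation only works if the services actually emit the trace ID with their logs. Here's a minimal sketch of that convention (hypothetical logger setup; OpenTelemetry's logging instrumentation can inject these fields automatically):

```python
import logging

logging.basicConfig(format="%(message)s")
log = logging.getLogger("search-service")

def log_with_trace(trace_id: str, msg: str) -> str:
    # Emitting "trace_id=<id>" as a plain token lets Loki line filters
    # like |= "trace_id=abc123" match without any parsing stage.
    line = f"trace_id={trace_id} {msg}"
    log.error(line)
    return line

line = log_with_trace("abc123", "upstream timeout calling embedding-service")
```

With that convention in place, jumping from a Tempo trace to the matching Loki log lines is a single query.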

Grafana dashboards



During the hackathon, I also created six pre-built Grafana dashboards that are automatically provisioned when the monitoring stack starts:

| Dashboard | Description |
|-----------|-------------|
| **X-RAG Overview** | The main dashboard with 22 panels covering request rates, latencies, error rates, and service health across all X-RAG components |
| **OpenTelemetry HTTP Metrics** | HTTP request/response metrics from OpenTelemetry-instrumented services—request rates, latency percentiles, and status code breakdowns |
| **Pod System Metrics** | Kubernetes pod resource utilisation: CPU usage, memory consumption, network I/O, disk I/O, and pod state from kube-state-metrics |
| **Redis** | Cache performance: memory usage, hit/miss rates, commands per second, connected clients, and memory fragmentation |
| **Kafka** | Message queue health: consumer lag (critical for indexer monitoring), broker status, topic partitions, and throughput |
| **MinIO** | Object storage metrics: S3 request rates, error counts, traffic volume, bucket sizes, and disk usage |

All dashboards are stored as JSON files in infra/k8s/monitoring/grafana-dashboards/ and deployed via ConfigMaps, so they survive pod restarts and cluster recreations.

X-RAG Overview dashboard
Pod System Metrics dashboard

Results: two days well spent



What did two days of hackathon work achieve? The system went from flying blind to fully instrumented:

  • All three pillars implemented: logs (Loki), metrics (Prometheus), traces (Tempo)
  • Unified collection via Grafana Alloy
  • Infrastructure metrics for Kafka, Redis, and MinIO
  • Six pre-built Grafana dashboards covering application metrics, pod resources, and infrastructure
  • Trace context propagation across all gRPC calls

The biggest insight from testing? The embedding service wasn't the bottleneck I assumed. Traces revealed that LLM synthesis dominated latency, not embedding generation. Without tracing, optimisation efforts would have targeted the wrong component.

Beyond the technical wins, I had a lot of fun. The hackathon brought together people working on different projects, and I got to know some really nice folks during the sessions themselves. There's something energising about being in a (virtual) room with other people all heads-down on their own challenges—even if you're not collaborating directly, the shared focus is motivating.

SLIs, SLOs and SLAs



The system now has full observability, but there's always more. And to be clear: this is not production-grade yet. It works well for development and could scale to production, but that would need to be validated with proper load testing and chaos testing first. We haven't stress-tested the observability pipeline under heavy load, nor have we tested failure scenarios like Tempo going down or Alloy running out of memory. The Alloy config includes comments on sampling strategies and rate limiting that would be essential for high-traffic environments.

One thing we didn't cover: monitoring and alerting. These are related but distinct from observability. Observability is about collecting and exploring data to understand system behaviour. Monitoring is about defining thresholds and alerting when they're breached. We have Prometheus with all the metrics, but no alerting rules yet—no PagerDuty integration, no Slack notifications when latency spikes or error rates climb.

We also didn't define any SLIs (Service Level Indicators) or SLOs (Service Level Objectives). An SLI is a quantitative measure of service quality—for example, "99th percentile search latency" or "percentage of requests returning successfully." An SLO is a target for that indicator—"99th percentile latency should be under 2 seconds" or "99.9% of requests should succeed." Without SLOs, you don't know what "good" looks like, and alerting becomes arbitrary.

For X-RAG specifically, potential SLOs might include:

  • Search latency: 99th percentile search response time (over a 5-minute window) under 3 seconds
  • Uptime: 99.9% availability of the search API endpoint
  • Response quality: how good were the answers? There are metrics that could be used here...
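
A latency SLO like the first one is straightforward to evaluate against a window of samples. Here's an illustrative sketch with made-up numbers:

```python
def p99(samples_ms: list) -> float:
    """Nearest-rank 99th percentile of a list of latency samples."""
    ordered = sorted(samples_ms)
    idx = max(0, int(len(ordered) * 0.99) - 1)
    return ordered[idx]

# Made-up 5-minute window: mostly fast, a few slow LLM-bound requests.
window = [150] * 480 + [2500] * 20
slo_met = p99(window) <= 3000  # SLO: p99 under 3 seconds
```

In a real deployment you'd express this as a PromQL query over a latency histogram rather than computing it by hand, but the idea is the same: a measurable indicator plus a concrete target.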

SLAs (Service Level Agreements) are often confused with SLOs, but they're different. An SLA is a contractual commitment to customers—a legally binding promise with consequences (refunds, credits, penalties) if you fail to meet it. SLOs are internal engineering targets; SLAs are external business promises. Typically, SLAs are less strict than SLOs: if your internal target is 99.9% availability (SLO), your customer contract might promise 99.5% (SLA), giving you a buffer before you owe anyone money.

But then again, X-RAG is a proof-of-concept, a prototype, a learning system—there are no real customers to disappoint. SLOs would become essential if this ever served actual users, and SLAs would follow once there's a business relationship to protect.

Using Amp for AI-assisted development



I used Amp (formerly Ampcode) throughout this project. While I knew what I wanted to achieve, I let the LLM generate the actual configurations, Kubernetes manifests, and Python instrumentation code.

Amp - AI coding agent by Sourcegraph

My workflow was step-by-step rather than handing over a grand plan:

1. "Deploy Grafana Alloy to the monitoring namespace"
2. "Verify Alloy is running and receiving data"
3. "Document what we did to docs/OBSERVABILITY.md"
4. "Commit with message 'feat: add Grafana Alloy for telemetry collection'"
5. Hand off context, start fresh: "Now instrument the search-ui with OpenTelemetry to push traces to Alloy..."

Chaining many small, focused tasks worked better than one massive plan. Each task had clear success criteria, and I could verify results before moving on. The LLM generated the River configuration, the OpenTelemetry Python code, the Kubernetes manifests—I reviewed, tweaked, and committed.

I only ran out of the 200k token context window once, during a debugging session that involved restarting the Kubernetes cluster multiple times. The fix required correlating error messages across several services, and the conversation history grew too long. Starting a fresh context and summarising the problem solved it.

Amp automatically selects the best model for the task at hand. Based on the response speed and Sourcegraph's recent announcements, I believe it was using Claude Opus 4.5 for most of my coding and infrastructure work. The quality was excellent—it understood Python, Kubernetes, OpenTelemetry, and Grafana tooling without much hand-holding.

Let me be clear: without the LLM, I'd never have managed to write all these configuration files by hand in two days. The Alloy config alone is 1400+ lines. But I reviewed every change manually, verified it made sense, and understood what was being deployed. This wasn't vibe-coding—the whole point of the hackathon was to learn. I already knew Grafana and Prometheus from previous work, but OpenTelemetry, Alloy, Tempo, Loki and the X-RAG system overall were all pretty new to me. By reviewing each generated config and understanding why it was structured that way, I actually learned the tools rather than just deploying magic incantations.

Cost-wise, I spent around 20 USD on Amp credits over the two-day hackathon. For the amount of code generated, configs reviewed, and debugging assistance—that's remarkably affordable.

Other changes along the way



Looking at the git history, I made 25 commits during the hackathon. Beyond the main observability features, there were several smaller but useful additions:

OBSERVABILITY_ENABLED flag: Added an environment variable to completely disable the monitoring stack. Set OBSERVABILITY_ENABLED=false in .env and the cluster starts without Prometheus, Grafana, Tempo, Loki, or Alloy. Useful when you just want to work on application code without the overhead.

Load generator: Added a make load-gen target that fires concurrent requests at the search API. Useful for generating enough trace data to see patterns in Tempo, and for stress-testing the observability pipeline itself.

Verification scripts: Created scripts to test that OTLP is actually reaching Alloy and that traces appear in Tempo. Debugging "why aren't my traces showing up?" is frustrating without a systematic way to verify each hop in the pipeline.

Dedicated monitoring namespace: Refactored from having observability components scattered across namespaces to a clean monitoring namespace. Now kubectl get pods -n monitoring shows exactly what's running for observability.

Lessons learned



  • Start with metrics, but don't stop there—they tell you *what*, not *why*
  • Trace context propagation is the key to distributed debugging
  • Grafana Alloy as a unified collector simplifies the pipeline
  • Infrastructure metrics matter—your app is only as fast as your data layer
  • The three pillars work together; none is sufficient alone

All manifests and observability code live in Florian's repository:

X-RAG on GitHub (source code, K8s manifests, observability configs)

The best part? Everything I learned during this hackathon—OpenTelemetry instrumentation, Grafana Alloy configuration, trace context propagation, PromQL queries—I can immediately apply at work, where we are shifting to this same observability stack and I'll be meeting with developers to discuss how and what they need to implement for application instrumentation. Observability patterns are universal, and hands-on experience with a real distributed system beats reading documentation any day.

E-Mail your comments to paul@nospam.buetow.org

Back to the main site
f3s: Kubernetes with FreeBSD - Part 8: Observability https://foo.zone/gemfeed/2025-12-07-f3s-kubernetes-with-freebsd-part-8.html 2025-12-06T23:58:24+02:00 Paul Buetow aka snonux paul@dev.buetow.org This is the 8th blog post about the f3s series for my self-hosting demands in a home lab. f3s? The 'f' stands for FreeBSD, and the '3s' stands for k3s, the Kubernetes distribution I use on FreeBSD-based physical machines.

f3s: Kubernetes with FreeBSD - Part 8: Observability



Published at 2025-12-06T23:58:24+02:00

This is the 8th blog post about the f3s series for my self-hosting demands in a home lab. f3s? The "f" stands for FreeBSD, and the "3s" stands for k3s, the Kubernetes distribution I use on FreeBSD-based physical machines.

2024-11-17 f3s: Kubernetes with FreeBSD - Part 1: Setting the stage
2024-12-03 f3s: Kubernetes with FreeBSD - Part 2: Hardware and base installation
2025-02-01 f3s: Kubernetes with FreeBSD - Part 3: Protecting from power cuts
2025-04-05 f3s: Kubernetes with FreeBSD - Part 4: Rocky Linux Bhyve VMs
2025-05-11 f3s: Kubernetes with FreeBSD - Part 5: WireGuard mesh network
2025-07-14 f3s: Kubernetes with FreeBSD - Part 6: Storage
2025-10-02 f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments
2025-12-07 f3s: Kubernetes with FreeBSD - Part 8: Observability (You are currently reading this)

f3s logo

Table of Contents




Introduction



In this blog post, I set up a complete observability stack for the k3s cluster. Observability is crucial for understanding what's happening inside the cluster—whether it's tracking resource usage, debugging issues, or analysing application behaviour. The stack consists of four main components, all deployed into the monitoring namespace:

  • Prometheus: time-series database for metrics collection and alerting
  • Grafana: visualisation and dashboarding frontend
  • Loki: log aggregation system (like Prometheus, but for logs)
  • Alloy: telemetry collector that ships logs from all pods to Loki

Together, these form the "PLG" stack (Prometheus, Loki, Grafana), which is a popular open-source alternative to commercial observability platforms.

All manifests for the f3s stack live in my configuration repository:

codeberg.org/snonux/conf/f3s

Important Note: GitOps Migration



**Note:** After publishing this blog post, the f3s cluster was migrated from imperative Helm deployments to declarative GitOps using ArgoCD. The Kubernetes manifests, Helm charts, and Justfiles in the repository have been reorganised for ArgoCD-based continuous deployment.

**To view the exact configuration as it existed when this blog post was written** (before the ArgoCD migration), check out the pre-ArgoCD revision:

$ git clone https://codeberg.org/snonux/conf.git
$ cd conf
$ git checkout 15a86f3  # Last commit before ArgoCD migration
$ cd f3s/prometheus/

**Current master branch** contains the ArgoCD-managed versions with:
  • Application manifests organised under argocd-apps/{monitoring,services,infra,test}/
  • Resources organised under prometheus/manifests/, loki/, etc.
  • Justfiles updated to trigger ArgoCD syncs instead of direct Helm commands

The deployment concepts and architecture remain the same—only the deployment method changed from imperative (helm install/upgrade) to declarative (GitOps with ArgoCD).

Persistent storage recap



All observability components need persistent storage so that metrics and logs survive pod restarts. As covered in Part 6 of this series, the cluster uses NFS-backed persistent volumes:

f3s: Kubernetes with FreeBSD - Part 6: Storage

The FreeBSD hosts (f0, f1) serve as master-standby NFS servers, exporting ZFS datasets that are replicated across hosts using zrepl. The Rocky Linux k3s nodes (r0, r1, r2) mount these exports at /data/nfs/k3svolumes. This directory contains subdirectories for each application that needs persistent storage—including Prometheus, Grafana, and Loki.

For example, the observability stack uses these paths on the NFS share:

  • /data/nfs/k3svolumes/prometheus/data — Prometheus time-series database
  • /data/nfs/k3svolumes/grafana/data — Grafana configuration, dashboards, and plugins
  • /data/nfs/k3svolumes/loki/data — Loki log chunks and index

Each path gets a corresponding PersistentVolume and PersistentVolumeClaim in Kubernetes, allowing pods to mount them as regular volumes. Because the underlying storage is ZFS with replication, we get snapshots and redundancy for free.

The monitoring namespace



First, I created the monitoring namespace where all observability components will live:

$ kubectl create namespace monitoring
namespace/monitoring created

Installing Prometheus and Grafana



Prometheus and Grafana are deployed together using the kube-prometheus-stack Helm chart from the Prometheus community. This chart bundles Prometheus, Grafana, Alertmanager, and various exporters (Node Exporter, Kube State Metrics) into a single deployment. I'll explain what each component does in detail later when we look at the running pods.

Prerequisites



Add the Prometheus Helm chart repository:

$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm repo update

Create the directories on the NFS server for persistent storage:

[root@r0 ~]# mkdir -p /data/nfs/k3svolumes/prometheus/data
[root@r0 ~]# mkdir -p /data/nfs/k3svolumes/grafana/data

Deploying with the Justfile



The configuration repository contains a Justfile that automates the deployment. just is a handy command runner—think of it as a simpler, more modern alternative to make. I use it throughout the f3s repository to wrap repetitive Helm and kubectl commands:

just - A handy way to save and run project-specific commands
codeberg.org/snonux/conf/f3s/prometheus

To install everything:

$ cd conf/f3s/prometheus
$ just install
kubectl apply -f persistent-volumes.yaml
persistentvolume/prometheus-data-pv created
persistentvolume/grafana-data-pv created
persistentvolumeclaim/grafana-data-pvc created
helm install prometheus prometheus-community/kube-prometheus-stack \
    --namespace monitoring -f persistence-values.yaml
NAME: prometheus
LAST DEPLOYED: ...
NAMESPACE: monitoring
STATUS: deployed

The persistence-values.yaml configures Prometheus and Grafana to use the NFS-backed persistent volumes I mentioned earlier, ensuring data survives pod restarts. It also enables scraping of etcd and kube-controller-manager metrics:

kubeEtcd:
  enabled: true
  endpoints:
    - 192.168.2.120
    - 192.168.2.121
    - 192.168.2.122
  service:
    enabled: true
    port: 2381
    targetPort: 2381

kubeControllerManager:
  enabled: true
  endpoints:
    - 192.168.2.120
    - 192.168.2.121
    - 192.168.2.122
  service:
    enabled: true
    port: 10257
    targetPort: 10257
  serviceMonitor:
    enabled: true
    https: true
    insecureSkipVerify: true

By default, k3s binds the controller-manager to localhost only, so the "Kubernetes / Controller Manager" dashboard in Grafana will show no data. To expose the metrics endpoint, add the following to /etc/rancher/k3s/config.yaml on each k3s server node:

[root@r0 ~]# cat >> /etc/rancher/k3s/config.yaml << 'EOF'
kube-controller-manager-arg:
  - bind-address=0.0.0.0
EOF
[root@r0 ~]# systemctl restart k3s

Repeat for r1 and r2. After restarting all nodes, the controller-manager metrics endpoint will be accessible and Prometheus can scrape it.

The persistent volume definitions bind to specific paths on the NFS share using hostPath volumes—the same pattern used for other services in Part 7:

f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments

Exposing Grafana via ingress



The chart also deploys an ingress for Grafana, making it accessible at grafana.f3s.foo.zone. The ingress configuration follows the same pattern as other services in the cluster—Traefik handles the routing internally, while the OpenBSD edge relays terminate TLS and forward traffic through WireGuard.

Once deployed, Grafana is accessible and comes pre-configured with Prometheus as a data source. You can verify the Prometheus service is running:

$ kubectl get svc -n monitoring prometheus-kube-prometheus-prometheus
NAME                                    TYPE        CLUSTER-IP      PORT(S)
prometheus-kube-prometheus-prometheus   ClusterIP   10.43.152.163   9090/TCP,8080/TCP

Grafana connects to Prometheus using the internal service URL http://prometheus-kube-prometheus-prometheus.monitoring.svc.cluster.local:9090. The default Grafana credentials are admin/prom-operator, which should be changed immediately after first login.

Grafana dashboard showing Prometheus metrics

Grafana dashboard showing cluster metrics

Installing Loki and Alloy



While Prometheus handles metrics, Loki handles logs. It's designed to be cost-effective and easy to operate—it doesn't index the contents of logs, only the metadata (labels), making it very efficient for storage.

Alloy is Grafana's telemetry collector (the successor to Promtail). It runs as a DaemonSet on each node, tails container logs, and ships them to Loki.

Prerequisites



Create the data directory on the NFS server:

[root@r0 ~]# mkdir -p /data/nfs/k3svolumes/loki/data

Deploying Loki and Alloy



The Loki configuration also lives in the repository:

codeberg.org/snonux/conf/f3s/loki

To install:

$ cd conf/f3s/loki
$ just install
helm repo add grafana https://grafana.github.io/helm-charts || true
helm repo update
kubectl apply -f persistent-volumes.yaml
persistentvolume/loki-data-pv created
persistentvolumeclaim/loki-data-pvc created
helm install loki grafana/loki --namespace monitoring -f values.yaml
NAME: loki
LAST DEPLOYED: ...
NAMESPACE: monitoring
STATUS: deployed
...
helm install alloy grafana/alloy --namespace monitoring -f alloy-values.yaml
NAME: alloy
LAST DEPLOYED: ...
NAMESPACE: monitoring
STATUS: deployed

Loki runs in single-binary mode with a single replica (loki-0), which is appropriate for a home lab cluster. This means there's only one Loki pod running at any time. If the node hosting Loki fails, Kubernetes will automatically reschedule the pod to another worker node—but there will be a brief downtime (typically under a minute) while this happens. For my home lab use case, this is perfectly acceptable.

For full high-availability, you'd deploy Loki in microservices mode with separate read, write, and backend components, backed by object storage like S3 or MinIO instead of local filesystem storage. That's a more complex setup that I might explore in a future blog post—but for now, the single-binary mode with NFS-backed persistence strikes the right balance between simplicity and durability.

Configuring Alloy



Alloy is configured via alloy-values.yaml to discover all pods in the cluster and forward their logs to Loki:

discovery.kubernetes "pods" {
  role = "pod"
}

discovery.relabel "pods" {
  targets = discovery.kubernetes.pods.targets

  rule {
    source_labels = ["__meta_kubernetes_namespace"]
    target_label  = "namespace"
  }

  rule {
    source_labels = ["__meta_kubernetes_pod_name"]
    target_label  = "pod"
  }

  rule {
    source_labels = ["__meta_kubernetes_pod_container_name"]
    target_label  = "container"
  }

  rule {
    source_labels = ["__meta_kubernetes_pod_label_app"]
    target_label  = "app"
  }
}

loki.source.kubernetes "pods" {
  targets    = discovery.relabel.pods.output
  forward_to = [loki.write.default.receiver]
}

loki.write "default" {
  endpoint {
    url = "http://loki.monitoring.svc.cluster.local:3100/loki/api/v1/push"
  }
}

This configuration automatically labels each log line with the namespace, pod name, container name, and app label, making it easy to filter logs in Grafana.

Adding Loki as a Grafana data source



Loki doesn't have its own web UI—you query it through Grafana. First, verify the Loki service is running:

$ kubectl get svc -n monitoring loki
NAME   TYPE        CLUSTER-IP    PORT(S)
loki   ClusterIP   10.43.64.60   3100/TCP,9095/TCP

To add Loki as a data source in Grafana:

  • Navigate to Configuration → Data Sources
  • Click "Add data source"
  • Select "Loki"
  • Set the URL to: http://loki.monitoring.svc.cluster.local:3100
  • Click "Save & Test"

Once configured, you can explore logs in Grafana's "Explore" view. I'll show some example queries in the "Using the observability stack" section below.

Exploring logs in Grafana with Loki
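The data source can also be provisioned declaratively instead of through the UI. A hedged sketch, assuming the kube-prometheus-stack chart's grafana.additionalDataSources value (verify the key against your chart version); it would go into the Helm values passed at install or upgrade time:

```yaml
grafana:
  additionalDataSources:
    - name: Loki
      type: loki
      access: proxy
      url: http://loki.monitoring.svc.cluster.local:3100
```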

The complete monitoring stack



After deploying everything, here's what's running in the monitoring namespace:

$ kubectl get pods -n monitoring
NAME                                                     READY   STATUS    RESTARTS   AGE
alertmanager-prometheus-kube-prometheus-alertmanager-0   2/2     Running   0          42d
alloy-g5fgj                                              2/2     Running   0          29m
alloy-nfw8w                                              2/2     Running   0          29m
alloy-tg9vj                                              2/2     Running   0          29m
loki-0                                                   2/2     Running   0          25m
prometheus-grafana-868f9dc7cf-lg2vl                      3/3     Running   0          42d
prometheus-kube-prometheus-operator-8d7bbc48c-p4sf4      1/1     Running   0          42d
prometheus-kube-state-metrics-7c5fb9d798-hh2fx           1/1     Running   0          42d
prometheus-prometheus-kube-prometheus-prometheus-0       2/2     Running   0          42d
prometheus-prometheus-node-exporter-2nsg9                1/1     Running   0          42d
prometheus-prometheus-node-exporter-mqr25                1/1     Running   0          42d
prometheus-prometheus-node-exporter-wp4ds                1/1     Running   0          42d

And the services:

$ kubectl get svc -n monitoring
NAME                                      TYPE        CLUSTER-IP      PORT(S)
alertmanager-operated                     ClusterIP   None            9093/TCP,9094/TCP
alloy                                     ClusterIP   10.43.74.14     12345/TCP
loki                                      ClusterIP   10.43.64.60     3100/TCP,9095/TCP
loki-headless                             ClusterIP   None            3100/TCP
prometheus-grafana                        ClusterIP   10.43.46.82     80/TCP
prometheus-kube-prometheus-alertmanager   ClusterIP   10.43.208.43    9093/TCP,8080/TCP
prometheus-kube-prometheus-operator       ClusterIP   10.43.246.121   443/TCP
prometheus-kube-prometheus-prometheus     ClusterIP   10.43.152.163   9090/TCP,8080/TCP
prometheus-kube-state-metrics             ClusterIP   10.43.64.26     8080/TCP
prometheus-prometheus-node-exporter       ClusterIP   10.43.127.242   9100/TCP

Let me break down what each pod does:

  • alertmanager-prometheus-kube-prometheus-alertmanager-0: the Alertmanager instance that receives alerts from Prometheus, deduplicates them, groups related alerts together, and routes notifications to the appropriate receivers (email, Slack, PagerDuty, etc.). It runs as a StatefulSet with persistent storage for silences and notification state.

  • alloy-g5fgj, alloy-nfw8w, alloy-tg9vj: three Alloy pods running as a DaemonSet, one on each k3s node. Each pod tails the container logs from its local node via the Kubernetes API and forwards them to Loki. This ensures log collection continues even if a node becomes isolated from the others.

  • loki-0: the single Loki instance running in single-binary mode. It receives log streams from Alloy, stores them in chunks on the NFS-backed persistent volume, and serves queries from Grafana. The -0 suffix indicates it's a StatefulSet pod.

  • prometheus-grafana-...: the Grafana web interface for visualising metrics and logs. It comes pre-configured with Prometheus as a data source and includes dozens of dashboards for Kubernetes monitoring. Dashboards, users, and settings are persisted to the NFS share.

  • prometheus-kube-prometheus-operator-...: the Prometheus Operator that watches for custom resources (ServiceMonitor, PodMonitor, PrometheusRule) and automatically configures Prometheus to scrape new targets. This allows applications to declare their own monitoring requirements.

  • prometheus-kube-state-metrics-...: generates metrics about the state of Kubernetes objects themselves: how many pods are running, pending, or failed; deployment replica counts; node conditions; PVC status; and more. Essential for cluster-level dashboards.

  • prometheus-prometheus-kube-prometheus-prometheus-0: the Prometheus server that scrapes metrics from all configured targets (pods, services, nodes), stores them in a time-series database, evaluates alerting rules, and serves queries to Grafana.

  • prometheus-prometheus-node-exporter-...: three Node Exporter pods running as a DaemonSet, one on each node. They expose hardware and OS-level metrics: CPU usage, memory, disk I/O, filesystem usage, network statistics, and more. These feed the "Node Exporter" dashboards in Grafana.

Using the observability stack



Viewing metrics in Grafana



The kube-prometheus-stack comes with many pre-built dashboards. Some useful ones include:

  • Kubernetes / Compute Resources / Cluster: overview of CPU and memory usage across the cluster
  • Kubernetes / Compute Resources / Namespace (Pods): resource usage by namespace
  • Node Exporter / Nodes: detailed host metrics like disk I/O, network, and CPU

Querying logs with LogQL



In Grafana's Explore view, select Loki as the data source and try queries like:

# All logs from the services namespace
{namespace="services"}

# Logs from pods matching a pattern
{pod=~"miniflux.*"}

# Filter by log content
{namespace="services"} |= "error"

# Parse JSON logs and filter
{namespace="services"} | json | level="error"

Creating alerts



Prometheus supports alerting rules that can notify you when something goes wrong. The kube-prometheus-stack includes many default alerts for common issues like high CPU usage, pod crashes, and node problems. These can be customised via PrometheusRule CRDs.
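As an illustration, a custom alert could look like the following. This is a hedged sketch: the alert name, expression, and threshold are mine and not part of the stack's defaults, while the release: prometheus label is what lets the operator pick the rule up in this setup:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: custom-alerts
  namespace: monitoring
  labels:
    release: prometheus
spec:
  groups:
    - name: custom
      rules:
        - alert: PodRestartingTooOften
          # Fires when a container restarted more than 3 times within an hour
          expr: increase(kube_pod_container_status_restarts_total[1h]) > 3
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "{{ $labels.namespace }}/{{ $labels.pod }} is restarting frequently"
```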

Monitoring external FreeBSD hosts



The observability stack can also monitor servers outside the Kubernetes cluster. The FreeBSD hosts (f0, f1, f2) that serve NFS storage can be added to Prometheus using the Node Exporter.

Installing Node Exporter on FreeBSD



On each FreeBSD host, install the node_exporter package:

paul@f0:~ % doas pkg install -y node_exporter

Enable the service to start at boot:

paul@f0:~ % doas sysrc node_exporter_enable=YES
node_exporter_enable:  -> YES

Configure node_exporter to listen on the WireGuard interface. This ensures metrics are only accessible through the secure tunnel, not the public network. Replace the IP with the host's WireGuard address:

paul@f0:~ % doas sysrc node_exporter_args='--web.listen-address=192.168.2.130:9100'
node_exporter_args:  -> --web.listen-address=192.168.2.130:9100

Start the service:

paul@f0:~ % doas service node_exporter start
Starting node_exporter.

Verify it's running:

paul@f0:~ % curl -s http://192.168.2.130:9100/metrics | head -3
# HELP go_gc_duration_seconds A summary of the wall-time pause...
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0

Repeat for the other FreeBSD hosts (f1, f2) with their respective WireGuard IPs.

Adding FreeBSD hosts to Prometheus



Create a file additional-scrape-configs.yaml in the prometheus configuration directory:

- job_name: 'node-exporter'
  static_configs:
    - targets:
      - '192.168.2.130:9100'  # f0 via WireGuard
      - '192.168.2.131:9100'  # f1 via WireGuard
      - '192.168.2.132:9100'  # f2 via WireGuard
      labels:
        os: freebsd

The job_name must be node-exporter to match the existing dashboards. The os: freebsd label allows filtering these hosts separately if needed.

Create a Kubernetes secret from this file:

$ kubectl create secret generic additional-scrape-configs \
    --from-file=additional-scrape-configs.yaml \
    -n monitoring

Update persistence-values.yaml to reference the secret:

prometheus:
  prometheusSpec:
    additionalScrapeConfigsSecret:
      enabled: true
      name: additional-scrape-configs
      key: additional-scrape-configs.yaml

Upgrade the Prometheus deployment:

$ just upgrade

After a minute or so, the FreeBSD hosts appear in the Prometheus targets and in the Node Exporter dashboards in Grafana.

FreeBSD hosts in the Node Exporter dashboard

FreeBSD memory metrics compatibility



The default Node Exporter dashboards are designed for Linux and expect metrics like node_memory_MemAvailable_bytes. FreeBSD uses different metric names (node_memory_size_bytes, node_memory_free_bytes, etc.), so memory panels will show "No data" out of the box.

To fix this, I created a PrometheusRule that generates synthetic Linux-compatible metrics from the FreeBSD equivalents:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: freebsd-memory-rules
  namespace: monitoring
  labels:
    release: prometheus
spec:
  groups:
    - name: freebsd-memory
      rules:
        - record: node_memory_MemTotal_bytes
          expr: node_memory_size_bytes{os="freebsd"}
        - record: node_memory_MemAvailable_bytes
          expr: |
            node_memory_free_bytes{os="freebsd"}
              + node_memory_inactive_bytes{os="freebsd"}
              + node_memory_cache_bytes{os="freebsd"}
        - record: node_memory_MemFree_bytes
          expr: node_memory_free_bytes{os="freebsd"}
        - record: node_memory_Buffers_bytes
          expr: node_memory_buffer_bytes{os="freebsd"}
        - record: node_memory_Cached_bytes
          expr: node_memory_cache_bytes{os="freebsd"}

This file is saved as freebsd-recording-rules.yaml and applied as part of the Prometheus installation. The os="freebsd" label (set in the scrape config) ensures these rules only apply to FreeBSD hosts. After applying, the memory panels in the Node Exporter dashboards populate correctly for FreeBSD.

freebsd-recording-rules.yaml on Codeberg

Disk I/O metrics limitation



Unlike memory metrics, disk I/O metrics (node_disk_read_bytes_total, node_disk_written_bytes_total, etc.) are not available on FreeBSD. The Linux diskstats collector that provides these metrics doesn't have a FreeBSD equivalent in the node_exporter.

The disk I/O panels in the Node Exporter dashboards will show "No data" for FreeBSD hosts. FreeBSD does expose ZFS-specific metrics (node_zfs_arcstats_*) for ARC cache performance, and per-dataset I/O stats are available via sysctl kstat.zfs, but mapping these to the Linux-style metrics the dashboards expect is non-trivial. Creating custom ZFS-specific dashboards is left as an exercise for another day.
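That said, a rough ARC hit-ratio panel could be built from those arcstats metrics. A hedged sketch of a Grafana panel query (the exact metric names depend on the node_exporter version, so verify against what your hosts actually expose):

```
# Hypothetical panel query: ZFS ARC hit ratio over the last 5 minutes
rate(node_zfs_arcstats_hits{os="freebsd"}[5m])
  / (rate(node_zfs_arcstats_hits{os="freebsd"}[5m])
     + rate(node_zfs_arcstats_misses{os="freebsd"}[5m]))
```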

Monitoring external OpenBSD hosts



The same approach works for OpenBSD hosts. I have two OpenBSD edge relay servers (blowfish, fishfinger) that handle TLS termination and forward traffic through WireGuard to the cluster. These can also be monitored with Node Exporter.

Installing Node Exporter on OpenBSD



On each OpenBSD host, install the node_exporter package:

blowfish:~ $ doas pkg_add node_exporter
quirks-7.103 signed on 2025-10-13T22:55:16Z
The following new rcscripts were installed: /etc/rc.d/node_exporter
See rcctl(8) for details.

Enable the service to start at boot:

blowfish:~ $ doas rcctl enable node_exporter

Configure node_exporter to listen on the WireGuard interface. This ensures metrics are only accessible through the secure tunnel, not the public network. Replace the IP with the host's WireGuard address:

blowfish:~ $ doas rcctl set node_exporter flags '--web.listen-address=192.168.2.110:9100'

Start the service:

blowfish:~ $ doas rcctl start node_exporter
node_exporter(ok)

Verify it's running:

blowfish:~ $ curl -s http://192.168.2.110:9100/metrics | head -3
# HELP go_gc_duration_seconds A summary of the wall-time pause...
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0

Repeat for the other OpenBSD host (fishfinger) with its respective WireGuard IP (192.168.2.111).

Adding OpenBSD hosts to Prometheus



Update additional-scrape-configs.yaml to include the OpenBSD targets:

- job_name: 'node-exporter'
  static_configs:
    - targets:
      - '192.168.2.130:9100'  # f0 via WireGuard
      - '192.168.2.131:9100'  # f1 via WireGuard
      - '192.168.2.132:9100'  # f2 via WireGuard
      labels:
        os: freebsd
    - targets:
      - '192.168.2.110:9100'  # blowfish via WireGuard
      - '192.168.2.111:9100'  # fishfinger via WireGuard
      labels:
        os: openbsd

The os: openbsd label allows filtering these hosts separately from FreeBSD and Linux nodes.

OpenBSD memory metrics compatibility



OpenBSD uses the same memory metric names as FreeBSD (node_memory_size_bytes, node_memory_free_bytes, etc.), so a similar PrometheusRule is needed to generate Linux-compatible metrics:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: openbsd-memory-rules
  namespace: monitoring
  labels:
    release: prometheus
spec:
  groups:
    - name: openbsd-memory
      rules:
        - record: node_memory_MemTotal_bytes
          expr: node_memory_size_bytes{os="openbsd"}
          labels:
            os: openbsd
        - record: node_memory_MemAvailable_bytes
          expr: |
            node_memory_free_bytes{os="openbsd"}
              + node_memory_inactive_bytes{os="openbsd"}
              + node_memory_cache_bytes{os="openbsd"}
          labels:
            os: openbsd
        - record: node_memory_MemFree_bytes
          expr: node_memory_free_bytes{os="openbsd"}
          labels:
            os: openbsd
        - record: node_memory_Cached_bytes
          expr: node_memory_cache_bytes{os="openbsd"}
          labels:
            os: openbsd

This file is saved as openbsd-recording-rules.yaml and applied alongside the FreeBSD rules. Note that OpenBSD doesn't expose a buffer memory metric, so that rule is omitted.

openbsd-recording-rules.yaml on Codeberg

After running just upgrade, the OpenBSD hosts appear in Prometheus targets and the Node Exporter dashboards.

Summary



With Prometheus, Grafana, Loki, and Alloy deployed, I now have complete visibility into the k3s cluster, the FreeBSD storage servers, and the OpenBSD edge relays:

  • Metrics: Prometheus collects and stores time-series data from all components
  • Logs: Loki aggregates logs from all containers, searchable via Grafana
  • Visualisation: Grafana provides dashboards and exploration tools
  • Alerting: Alertmanager can notify on conditions defined in Prometheus rules

This observability stack runs entirely on the home lab infrastructure, with data persisted to the NFS share. It's lightweight enough for a three-node cluster but provides the same capabilities as production-grade setups.

Other *BSD-related posts:

2025-12-07 f3s: Kubernetes with FreeBSD - Part 8: Observability (You are currently reading this)
2025-10-02 f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments
2025-07-14 f3s: Kubernetes with FreeBSD - Part 6: Storage
2025-05-11 f3s: Kubernetes with FreeBSD - Part 5: WireGuard mesh network
2025-04-05 f3s: Kubernetes with FreeBSD - Part 4: Rocky Linux Bhyve VMs
2025-02-01 f3s: Kubernetes with FreeBSD - Part 3: Protecting from power cuts
2024-12-03 f3s: Kubernetes with FreeBSD - Part 2: Hardware and base installation
2024-11-17 f3s: Kubernetes with FreeBSD - Part 1: Setting the stage
2024-04-01 KISS high-availability with OpenBSD
2024-01-13 One reason why I love OpenBSD
2022-10-30 Installing DTail on OpenBSD
2022-07-30 Let's Encrypt with OpenBSD and Rex
2016-04-09 Jails and ZFS with Puppet on FreeBSD

E-Mail your comments to paul@nospam.buetow.org

Back to the main site
'The Courage To Be Disliked' book notes https://foo.zone/gemfeed/2025-11-02-the-courage-to-be-disliked-book-notes.html 2025-11-01T17:28:38+02:00 Paul Buetow aka snonux paul@dev.buetow.org These are my personal book notes from Ichiro Kishimi and Fumitake Koga's 'The Courage To Be Disliked'. They are for me, but I hope they might be useful to you too.

"The Courage To Be Disliked" book notes



Published at 2025-11-01T17:28:38+02:00

These are my personal book notes from Ichiro Kishimi and Fumitake Koga's "The Courage To Be Disliked". They are for me, but I hope they might be useful to you too.

         ,..........   ..........,
     ,..,'          '.'          ',..,
    ,' ,'            :            ', ',
   ,' ,'             :             ', ',
  ,' ,'              :              ', ',
 ,' ,'............., : ,.............', ',
,'  '............   '.'   ............'  ',
 '''''''''''''''''';''';''''''''''''''''''
                    '''

Table of Contents




The Nature of Life and Happiness



  • Life and the world are fundamentally simple; we are the ones who make them complicated. Drama does not exist.
  • Happiness is a choice and is attainable for everyone. Often, we lack the courage to be happy because it's easier to stay in a familiar, albeit unhappy, situation than to choose a new lifestyle, which may bring anxiety and unknowns.
  • Unhappiness is something you choose for yourself.

Subjective Reality and Perception



  • Our perception of the world is subjective. We don't see the world as it is, but as we are.
  • The world you see is different from the one I see, and it's impossible to truly share your world with anyone else.

This is illustrated by the "10 people" example: if one person dislikes you, two love you, and seven are indifferent, focusing only on the one who dislikes you gives a distorted and negative view of your life. You are focusing on a tiny, insignificant part and judging the whole by it.

The challenge is to find the courage to see the world directly, without the filters of our own subjective views.

The Power to Change and the Role of the Past



  • We are not defined by our past experiences but by the meaning we assign to them. The past does not determine our future.
  • The book rejects Freudian etiology (the idea that past trauma defines us) in favor of teleology (the idea that we are driven by our present goals).
  • Change is possible for everyone at any moment, regardless of their circumstances or age. This change must come from your own doing, not from others.
  • We live in accordance with our present goals, not past causes. The past does not exist; the only issue is the present.
  • Emotions, like anger, can be fabricated tools used to achieve a goal (e.g., to control or shout at someone) rather than uncontrollable forces that rule us.

Self-Acceptance, Lifestyle, and Life Lies



  • Your "lifestyle"—your worldview and outlook on life—is a choice, not a fixed personality trait. You can change it instantly.
  • The key is self-acceptance, not self-affirmation. Accept what you cannot change and have the courage to change what you can.
  • You cannot be reborn as someone else. It is better to learn to love yourself and make the best use of the "equipment" you were born with.
  • Workaholism is a "life lie." It is a form of being in disharmony with life, using work as an excuse to avoid other life tasks and responsibilities.

Interpersonal Relationships



  • All problems are, at their core, problems of interpersonal relationships. To escape all problems would mean to live alone in the universe, which is impossible.
  • The book identifies three "Life Tasks" that everyone faces: the task of work, the task of friendship, and the task of love.
  • Competition: Life is not a competition. When we stop comparing ourselves to others, we cease to see them as enemies. They become comrades, and we can genuinely celebrate their successes. This removes the fear of losing and allows for peace.
  • Power Struggles: When someone is angry with you, recognize it as their attempt at a power struggle. The person who attacks you is the one with the problem. Do not get drawn in. Arguing about who is right or wrong is a trap. Admitting a fault is not a defeat.
  • Horizontal vs. Vertical Relationships: Strive for "horizontal relationships" based on equality, rather than "vertical relationships" based on hierarchy. Praise and rebuke are forms of manipulation found in vertical relationships. Instead, offer encouragement. (Note: personally, I disagree with applying this to children; I feel some hierarchy is necessary and that children appreciate praise.)
  • Separation of Tasks: Understand what is your responsibility and what is someone else's. For example, if someone takes advantage of your trust, that is their task. Your task is to decide whether to trust them in the first place.
  • Confidence in Others: Having unconditional confidence in others helps build deep relationships and a sense of belonging, turning others into comrades.

Inferiority and Superiority



  • A feeling of inferiority is not inherently bad; it can be a catalyst for growth when we compare ourselves to our ideal self. This "pursuit of superiority" drives progress.
  • This is different from an "inferiority complex," which is using feelings of inadequacy as an excuse to avoid change and responsibility.
  • Value is based on a social context. An object's worth is subjective and can be reinterpreted.

Community, Contribution, and Happiness



  • The definition of happiness is the feeling of contribution.
  • A true sense of self-worth comes from feeling useful to a community (the "community feeling").
  • This contribution doesn't have to be grand. You can be of worth to the community simply by being.
  • When you have a genuine feeling of contribution, you no longer need recognition or praise from others.

Living in the Here and Now



  • Life is a series of moments ("dots"), not a continuous line. We should live fully in the "here and now."
  • The greatest life lie is to dwell on the past and the future, which do not exist, instead of focusing on the present moment.
  • Focus on the process, not just the outcome. The goal of a dance is the dancing itself, not just reaching a destination.

The Courage to Be Normal



  • Why does everyone want to be special? Is it inferior to be normal?
  • Embracing being normal, instead of striving for a special status, is a form of courage. In the grander sense, isn't everyone normal?

Freedom is Being Disliked



  • The price of true freedom is to be disliked by other people. It is a sign that you are living in accordance with your own principles.

The Meaning of Life



  • Life has no inherent meaning. It is up to each individual to assign meaning to their own life.
  • Do not be afraid of being disliked by others for living your life according to the meaning you create.
  • You have the power to change yourself, and in doing so, you change your world. No one else can change it for you.

E-Mail your comments to paul@nospam.buetow.org :-)

Other book notes of mine are:

2025-11-02 'The Courage To Be Disliked' book notes (You are currently reading this)
2025-06-07 'A Monk's Guide to Happiness' book notes
2025-04-19 'When: The Scientific Secrets of Perfect Timing' book notes
2024-10-24 'Staff Engineer' book notes
2024-07-07 'The Stoic Challenge' book notes
2024-05-01 'Slow Productivity' book notes
2023-11-11 'Mind Management' book notes
2023-07-17 'Software Developers Career Guide and Soft Skills' book notes
2023-05-06 'The Obstacle is the Way' book notes
2023-04-01 'Never split the difference' book notes
2023-03-16 'The Pragmatic Programmer' book notes

Back to the main site
Perl New Features and Foostats https://foo.zone/gemfeed/2025-11-02-perl-new-features-and-foostats.html 2025-11-01T16:10:35+02:00 Paul Buetow aka snonux paul@dev.buetow.org Perl recently reached rank 10 in the TIOBE index. That headline made me write this blog post as I was developing the Foostats script for simple analytics of my personal websites and Gemini capsules (e.g. `foo.zone`) and there were a couple of new features added to the Perl language over the last releases. The book *Perl New Features* by brian d foy documents the changes well; this post shows how those features look in a real program that runs every morning for my stats generation.

Perl New Features and Foostats



Published at 2025-11-01T16:10:35+02:00

Perl recently reached rank 10 in the TIOBE index. That headline prompted this blog post: I was developing the Foostats script for simple analytics of my personal websites and Gemini capsules (e.g. foo.zone), and a number of new features have been added to the Perl language over the last several releases. The book *Perl New Features* by brian d foy documents the changes well; this post shows how those features look in a real program that runs every morning for my stats generation.

Perl re-enters the top ten
Perl New Features by Joshua McAdams and brian d foy

$b="24P7cP3dP31P3bPaP28P24P64P31P2cP24P64P32P2cP24P73P2cP24P67P2cP24P7
2P29P3dP28P22P31P30P30P30P30P22P2cP22P31P30P30P30P30P30P22P2cP22P4aP75
P7                                                                  3P
74                                                                  P2
0P  41P6eP6fP74P     68P65P72P20P50 P65P72P6cP2     0P48P           61
P6  3P6bP65P72P22P   29P3bPaP40P6dP 3dP73P70P6cP6   9P74P           20
P2  fP2fP    2cP22P  2cP2eP3aP21P2  bP2aP    30P4f  P40P2           2P
3b  PaP24      P6eP3 dP6c           P65P6      eP67 P74P6           8P
20  P24P7      3P3bP aP24           P75P3      dP22 P20P2           2P
78  P24P6      eP3bP aPaP           70P72      P69P 6eP74           P2
0P  22P5c    P6eP20  P20P           24P75    P5cP7  2P22P           3b
Pa  PaP66P6fP72P2    8P24P7aP20P    3dP20P31P3bP    20P24           P7
aP  3cP3dP24P6       eP3bP20P24     P7aP2bP2bP      29P20           P7
bP  aPaP9            P77P28P24P6    4P31P29P        3bPaP           9P
24  P72P3            dP69           P6eP74P28       P72P6           1P
6e  P64P2            8P24           P6eP2 9P29P     3bPaP           9P
24  P67P3            dP73           P75P6  2P73P    74P72           P2
0P  24P73            P2cP24P72P2cP  31P3b   PaP9P   24P67P20P3fP20  P6
4P  6fP20            P9P7bP20PaP9P9 P9P9P    9P66P  6fP72P20P28P24  P6
bP  3dP30            P3bP24P6bP3cP3 9P3bP    24P6bP 2bP2bP29P20P7b  Pa
P9                                                                  P9
P9                                                                  P9
P9  P9P73P75P6     2P73   P74P  72P2       8P24P75P2c     P24P72    P2
cP  31P29P3dP24P   6dP5   bP24  P6bP       5dP3bP20Pa   P9P9  P9P9  P9
P9  P70P    72P69  P6eP   74P2  0P22       P20P20P24P  75P      5cP 72
P2  2P3b      PaP9 P9P9   P9P9  P9P7       7P28       P24        P6 4P
32  P29P      3bPa P9P9   P9P9  P9P7       dPaP       9P9           P9
P9  P9P7      3P75 P62P   73P7  4P72       P28P        24P7         5P
2c  P24P    72P2c  P31P   29P3  dP24       P67P3bP20P   aP9P9       P9
P9  P7dP20PaP9P    9P3a   P20P  72P6       5P64P6fP3b      PaP9     P7
3P  75P62P73P      74P7   2P28  P24P       73P2cP24P7        2P2c   P3
1P  29P3dP2        2P30   P22P  3bPa       P9P7                0P7  2P
69  P6eP74P2       0P22   P20P  20P2       4P75                 P5c P7
2P  22P3 bPaPa     P7dP   aPaP  77P2       0P28                 P24 P6
4P  32P2  9P3bP    aP70   P72P  69P6       eP74       P2        0P2 2P
20  P20P   24P75   P20P21P5cP7  2P22P3bPaP 73P6cP65P6 5P7     0P20  P3
2P  3bPa    P70P7  2P69P6eP74P  20P22P20P2 0P24P75P20  P21P  5cP6   eP
22  P3bP     aPaP7  3P75P62P2   0P77P20P7b PaP9P24P6c    P3dP73     P6
8P                                                                  69
P6                                                                  6P
74P3bPaP9P66P6fP72P28P24P6aP3dP30P3bP24P6aP3cP24P6cP3bP24P6aP2bP2bP29P
7bP7dPaP7dP";$b=~s/\s//g;split /P/,$b;foreach(@_){$c.=chr hex};eval $c

The above Perl script prints out "Just Another Perl Hacker !" in an
animation of sorts.


Table of Contents




Motivation



I've been running foo.zone for a while now, but I've never looked into visitor statistics or analytics. I value privacy (not just my own, but also that of this site's visitors), so I hesitated to use any off-the-shelf analytics plugins. My requirements were simple:

  • Know which blog posts had the most (unique) visitors
  • Exclude, if possible, any bots and scrapers from the stats
  • Track only anonymized IP addresses, never store raw addresses

With Foostats I've created a Perl script which does that for my highly opinionated website/blog setup, which consists of:

Gemtexter, my static site and Gemini capsule generator
How I host this site highly-available using OpenBSD

Why I used Perl



Even though nowadays I code more in Go and Ruby, I stuck with Perl for Foostats for four simple reasons:

  • I wanted an excuse to explore the newer features of my first programming love.
  • Sometimes, I miss Perl.
  • Perl ships with OpenBSD (the operating system on which my sites run) by default.
  • It really lives up to its name, Practical Extraction and Report Language, for the kind of log grinding Foostats does.

Inside Foostats



Foostats is simply a log file analyser for the OpenBSD httpd and relayd logs.

https://man.openbsd.org/httpd.8
https://man.openbsd.org/relayd.8

Log pipeline



A cron job starts Foostats, which reads the OpenBSD httpd and relayd access logs and produces the numbers published at stats.foo.zone, both as a website and as a Gemini capsule. The dashboards are humble because traffic on my sites is still light, yet the trends are useful for spotting patterns. The script is opinionated (I am repeating myself here, I know), and I will probably be the only one ever using it for my own sites. However, the code demonstrates how Perl's newer features help keep a small script like this exciting and fun!

Foostats (HTTP)
Foostats (Gemini)

On OpenBSD, I've configured the job via /etc/daily.local on both servers, fishfinger.buetow.org and blowfish.buetow.org. One is the master and the other the standby, but the script runs on both and the stats are merged later in the process:

fishfinger$ grep foostats /etc/daily.local
perl /usr/local/bin/foostats.pl --parse-logs --replicate --report

Internally, Foostats::Logreader parses each line of the log files /var/log/daemon* and /var/www/logs/access_log*, turns timestamps into YYYYMMDD/HHMMSS values, hashes IP addresses with SHA3 (for anonymization), and hands a normalized event to Foostats::Filter. The filter compares the URI against entries in fooodds.txt, tracks how many requests each IP address makes within the same second, and drops anything suspicious (e.g., from web crawlers or malicious attackers). Valid events reach Foostats::Aggregator, which counts requests per protocol, records unique visitors for the Gemtext and Atom feeds, and remembers page-level IP sets. Foostats::FileOutputter writes the result as gzipped JSON files, one per day and per protocol, with IPv4/IPv6 splits, filtered counters, feed readership, and hashes for long URLs.
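The anonymization step can be illustrated in isolation. This is a minimal sketch of the idea, not Foostats' actual code, and it assumes an OpenSSL build with SHA3 support:

```shell
#!/bin/sh
# Hash an IP address with SHA3-256 so only the digest is stored.
# The same address always yields the same digest, so unique-visitor
# counting still works, but the raw address is never written to disk.
ip="203.0.113.7"   # documentation address, not a real visitor
printf '%s' "$ip" | openssl dgst -sha3-256 | awk '{print $NF}'
```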

fooodds.txt



fooodds.txt is a plain text list of substrings of URLs to be blocked, making it quick to shut down web crawlers. Foostats also detects rapid requests (an indicator of excessive crawling) and blocks the IP. Audit lines are written to /var/log/fooodds, which can later be reviewed for false or true positives (I do this around once a month). The Justfile even has a gather-fooodds target that collects suspicious paths from remote logs so new patterns can be added quickly.
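The substring matching itself is simple enough to sketch in a few lines of shell (the file names here are illustrative, not Foostats' real paths): grep's fixed-string mode drops every log line whose URI contains a blocked pattern.

```shell
#!/bin/sh
# A tiny blocklist and a fake access log, then the filter step.
cat > /tmp/fooodds-demo.txt <<'EOF'
/wp-login.php
/.env
EOF
cat > /tmp/access-demo.log <<'EOF'
GET /gemfeed/index.html
GET /wp-login.php
GET /.env
GET /index.html
EOF
# -F: fixed strings, -f: patterns from a file, -v: keep non-matching lines
grep -v -F -f /tmp/fooodds-demo.txt /tmp/access-demo.log
```

Only the two legitimate lines survive the filter.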

Feed kinds



There are different kinds of feeds being tracked by Foostats:

  • The Atom web-feed
  • The same feed via Gemini
  • The Gemfeed (a special format popular in the Geminispace)

Aggregation and output



As mentioned, Foostats merges the stats from both hosts, master and standby. For the master-standby setup description, read:

KISS high-availability with OpenBSD

Those gzipped files land in stats/. From there, Foostats::Replicator can pull matching files from the partner host (fishfinger or blowfish) so the view covers both servers, Foostats::Merger combines them into daily summaries, and Foostats::Reporter rebuilds Gemtext and HTML reports.

Those are the raw stats files:

https://blowfish.buetow.org/foostats/
https://fishfinger.buetow.org/foostats/

These are the 30-day reports generated (already linked earlier in this post, but adding here again for clarity):

stats.foo.zone Gemini capsule dashboard
stats.foo.zone HTTP dashboard

Command-line entry points



foostats_main is the command entry point. --parse-logs refreshes the gzipped files, --replicate runs the cross-host sync, and --report rebuilds the HTML and Gemini report pages. --all performs everything in one go. Defaults point to /var/www/htdocs/buetow.org/self/foostats for data, /var/gemini/stats.foo.zone for Gemtext output, and /var/www/htdocs/gemtexter/stats.foo.zone for HTML output. Replication always forces the three most recent days' worth of data across HTTPS and leaves older files untouched to save bandwidth.

The complete source lives on Codeberg here:

Foostats on Codeberg

Now let's look at some new Perl features:

Packages as real blocks



Scoped packages



Recent Perl versions allow the block form package Foo { ... }. Foostats uses it for every package. Imports stay local to the block, helper subs do not leak into the global symbol table, and configuration happens where the code needs it.

The old way:

package foo;

sub hello {
    print "Hello from package foo\n";
}

package bar;

sub hello {
    print "Hello from package bar\n";
}

But now it is also possible to do this:

package foo {
    sub hello {
        print "Hello from package foo\n";
    }
}

package bar {
    sub hello {
        print "Hello from package bar\n";
    }
}

Postfix dereferencing keeps data structures tidy



Clear dereferencing



The script handles nested hashes and arrays. Postfix dereferencing ($hash->%*, $array->@*) keeps that readable.

E.g. instead of having to write:

for my $elem (@{$array_ref}) {
    print "$elem\n";
}

one can now do:

for my $elem ($array_ref->@*) {
    print "$elem\n";
}

This feature becomes increasingly useful with nested data structures, e.g. to print all keys of a nested hash:

print for keys $hash->{stats}->%*;

Loops over expressions like $stats->{page_ips}->{urls}->%* or $merge{$key}->{$_}->%* make it clear which level of the structure is in play. The merger in Foostats updates host and URL statistics without building temporary arrays, and the reporter code mirrors the layout of the final tables. Before postfix dereferencing, the same code relied on braces within braces and was harder to read.

say is the default voice now



say became available (and the default way to print) once the script switched to use v5.38;. It adds a newline to every message printed, comparable to Ruby's puts, making log messages like "Processing $path" or "Writing report to $report_path" cleaner:

use v5.38;

print "Hello, world!\n";    # old way
say "Hello, world!";        # new way

Lexical subs promote local reasoning



Lexical subroutines keep helpers close to the code that needs them. In Foostats::Logreader::parse_web_logs, functions such as my sub parse_date and my sub open_file live only inside that scope.

This is an example of a lexical sub named trim, which is only visible within the outer sub named process_lines:

use v5.38;

sub process_lines (@lines) {
    my sub trim ($str) {
        $str =~ s/^\s+|\s+$//gr;
    }
    return [ map { trim($_) } @lines ];
}

my @raw = ("  foo  ", " bar", "baz ");
my $cleaned = process_lines(@raw);
say for @$cleaned; # prints "foo", "bar", "baz"

Reference aliasing makes intent explicit



Reference aliasing can be enabled with use feature qw(refaliasing) and helps communicate intent more clearly (if you remember the Perl syntax, of course—otherwise, it can look rather cryptic). The filter starts with \my $uri_path = \$event->{uri_path} so any later modification touches the original event. This is an example with ref aliasing in action:

use feature qw(refaliasing);

my $hash = { foo => 42 };
\my $foo = \$hash->{foo};

$foo = 99;
print $hash->{foo}; # prints 99

The aggregator in Foostats aliases $self->{stats}{$date_key} before updating counters, so the structure remains intact. Combined with subroutine signatures, this makes it obvious when a piece of data is shared instead of copied, preventing silent bugs. This enables having shorter names for long nested data structures.

Persistent state without globals



A Perl state variable is declared with state $var and retains its value between calls to the enclosing subroutine. Foostats uses that for rate limiting and de-duplicated logging.

This is a small example demonstrating the use of a state variable in Perl:

sub counter {
    state $count = 0;
    $count++;
    return $count;
}

say counter(); # 1
say counter(); # 2
say counter(); # 3

The state feature arrived in Perl 5.10 for scalars (and for arrays and hashes without initialisers); initialising hash and array state variables only became possible later, in Perl 5.28.

Rate limiting state



In Foostats, state variables store run-specific state without using package globals. state %blocked remembers IP hashes that already triggered the odd-request filter, and state $last_time and state %count track how many requests an IP makes within the same second.
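
As a rough sketch (the sub name and the threshold are hypothetical, not Foostats' actual code), per-second tracking with state variables looks like this:

```perl
use v5.38;

# Hypothetical sketch of per-second rate tracking with state variables.
# $ip_hash is the anonymized IP, $now the current epoch second.
sub too_many_requests ($ip_hash, $now) {
    state $last_time = 0;
    state %count;

    %count = () if $now != $last_time;   # a new second began: reset all counters
    $last_time = $now;

    return ++$count{$ip_hash} > 10;      # threshold of 10/s is made up here
}
```

Because both variables persist between calls but are invisible outside the sub, no package global is needed.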

De-duplicated logging



state %dedup limits the logging of suspicious calls to one warning per URI. Early versions used global hashes for the same tasks, producing inconsistent results during tests. Switching to state removed those edge cases.

Subroutine signatures



Perl now supports subroutine signatures like other modern languages do. Foostats uses them everywhere. Examples:

# Old way
sub greet_old { my $name = shift; print "Hello, $name!\n" }

# Another old way
sub greet_old2 ($) { my $name = shift; print "Hello, $name!\n" }

# New way
sub greet ($name) { say "Hello, $name!"; }

greet("Alice"); # prints "Hello, Alice!"

In Foostats, constructors declare sub new ($class, $odds_file, $log_path), anonymous callbacks expose sub ($event), and helper subs list the values they expect, e.g.:

my $anon = sub ($name) {
    say "Hello, $name!";
};

$anon->("World"); # prints "Hello, World!"

Defined-or assignment for defaults without boilerplate



The operator //= keeps configuration and counters simple. Environment variables may be missing when cron runs the script, so //=, combined with signatures, sets defaults without warnings. Example use of that operator:

my $foo;
$foo //= 42;
say $foo; # prints 42

$foo //= 99;
say $foo; # still prints 42, because $foo was already defined

Cleanup with defer



Even though not used in Foostats, this feature (similar to Go's defer) is neat to have in Perl now.

The defer block (use feature 'defer') schedules a piece of code to run when the current scope exits, regardless of how it exits (e.g. normal return, exception). This is perfect for ensuring resources, such as file handles, are closed.

use feature qw(defer);

sub parse_log_file ($path) {
    open my $fh, '<', $path or die "Cannot open $path: $!";
    defer { close $fh };

    while (my $line = <$fh>) {
        # ... parsing logic that might throw an exception ...
    }
    # $fh is automatically closed here
}

This pattern replaces manual close calls in every exit path of the subroutine and is more robust than relying solely on object destructors.

Builtins and booleans



The script also utilizes other modern additions that often go unnoticed. use builtin qw(true false); provides real boolean values (the experimental::builtin warnings category still needs to be silenced on current Perls).
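
A minimal sketch of what that looks like (is_bool comes from the same builtin module):

```perl
use v5.38;
use builtin qw(true false is_bool);
no warnings 'experimental::builtin';

my $ok = true;
say $ok;                                             # prints 1, but carries boolean identity
say is_bool($ok) ? 'real boolean' : 'plain scalar';  # prints "real boolean"
say is_bool(1)   ? 'real boolean' : 'plain scalar';  # prints "plain scalar"
```

Unlike a plain 1, such values survive serialisation as true booleans, e.g. when emitting JSON.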

Conclusion



I want to code more in Perl again. The newer features make it a joy to write small scripts like Foostats. If you haven't looked at Perl in a while, give it another try! The main thing holding me back from writing more Perl is the lack of good tooling. For example, there is no proper LSP or Tree-sitter support available that works as well as the ones available for Go and Ruby.

A reader pointed out that there's now a third-party Perl Tree-sitter implementation one could use:

https://github.com/tree-sitter-perl/tree-sitter-perl

E-Mail your comments to paul@nospam.buetow.org :-)

Other related posts are:

2025-11-02 Perl New Features and Foostats (You are currently reading this)
2023-05-01 Unveiling guprecords.raku: Global Uptime Records with Raku
2022-05-27 Perl is still a great choice
2011-05-07 Perl Daemon (Service Framework)
2008-06-26 Perl Poetry

Back to the main site
Key Takeaways from The Well-Grounded Rubyist https://foo.zone/gemfeed/2025-10-11-key-takeaways-from-the-well-grounded-rubyist.html 2025-10-11T15:25:14+03:00 Paul Buetow aka snonux paul@dev.buetow.org Some time ago, I wrote about my journey into Ruby and how 'The Well-Grounded Rubyist' helped me to get a better understanding of the language. I took a lot of notes while reading the book, and I think it's time to share some of them. This is not a comprehensive review, but rather a collection of interesting tidbits and concepts that stuck with me.

Key Takeaways from The Well-Grounded Rubyist



Published at 2025-10-11T15:25:14+03:00

Some time ago, I wrote about my journey into Ruby and how "The Well-Grounded Rubyist" helped me to get a better understanding of the language. I took a lot of notes while reading the book, and I think it's time to share some of them. This is not a comprehensive review, but rather a collection of interesting tidbits and concepts that stuck with me.

Table of Contents




My first post about the book.



The Object Model



One of the most fascinating aspects of Ruby is its object model. The book does a great job of explaining the details.

Everything is an object (almost)



In Ruby, most things are objects. This includes numbers, strings, and even classes themselves. This has some interesting consequences. For example, you can't use i++ like in C or Java. Integers are immutable objects. 1 is always the same object. 1 + 1 returns a new object, 2.
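
A quick irb-style check (a small sketch of my own, not from the book) makes this visible:

```ruby
# Integers are immediate values: every 1 in the program is the very same object
a = 1
b = 1
puts a.object_id == b.object_id  # => true
puts a.equal?(b)                 # => true, identical object identity
# 1 + 1 doesn't mutate anything; it returns the one and only object 2
puts (a + 1).equal?(2)           # => true
```

This is why i++ cannot exist: there is no "the variable's 1" to increment in place, only the shared, immutable object 1.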

The self keyword



There is always a current object, self. If you call a method without an explicit receiver, it's called on self. For example, puts "hello" is actually self.puts "hello".

# At the top level, self is the main object
p self
# => main
p self.class
# => Object

def foo
  # Inside a method, self is the object that received the call
  p self
end

foo
# => main

This code demonstrates how self changes depending on the context. At the top level, it's main, an instance of Object. When foo is called without a receiver, it's called on main.

Singleton Methods



You can add methods to individual objects. These are called singleton methods.

obj = "a string"

def obj.shout
  self.upcase + "!"
end

p obj.shout
# => "A STRING!"

obj2 = "another string"
# obj2.shout would raise a NoMethodError

Here, the shout method is only available on the obj object. This is a powerful feature for adding behavior to specific instances.

Classes are Objects



Classes themselves are objects, instances of the Class class. This means you can create classes dynamically.

MyClass = Class.new do
  def say_hello
    puts "Hello from a dynamically created class!"
  end
end

instance = MyClass.new
instance.say_hello
# => Hello from a dynamically created class!

This shows how to create a new class and assign it to a constant. This is what happens behind the scenes when you use the class keyword.

Control Flow and Methods



The book clarified many things about how methods and control flow work in Ruby.

case and the === operator



The case statement is more powerful than I thought. It uses the === (threequals or case equality) operator for comparison, not ==. Different classes can implement === in their own way.

# For ranges, it checks for inclusion
p (1..5) === 3 # => true

# For classes, it checks if the object is an instance of the class
p String === "hello" # => true

# For regexes, it checks for a match
p /llo/ === "hello" # => true

def check(value)
  case value
  when String
    "It's a string"
  when (1..10)
    "It's a number between 1 and 10"
  else
    "Something else"
  end
end

p check(5) # => "It's a number between 1 and 10"

Blocks and yield



Blocks are a cornerstone of Ruby. You can pass them to methods to customize their behavior. The yield keyword is used to call the block.

def my_iterator
  puts "Entering the method"
  yield
  puts "Back in the method"
  yield
end

my_iterator { puts "Inside the block" }
# Entering the method
# Inside the block
# Back in the method
# Inside the block

This simple iterator shows how yield transfers control to the block. You can also pass arguments to yield and get a return value from the block.

def with_return
  result = yield(5)
  puts "The block returned #{result}"
end

with_return { |n| n * 2 }
# => The block returned 10

This demonstrates passing an argument to the block and using its return value.

Fun with Data Types



Ruby's core data types are full of nice little features.

Symbols



Symbols are like immutable strings. They are great for keys in hashes because they are unique and memory-efficient.

# Two strings with the same content are different objects
p "foo".object_id
p "foo".object_id

# Two symbols with the same content are the same object
p :foo.object_id
p :foo.object_id

# Modern hash syntax uses symbols as keys
my_hash = { name: "Paul", language: "Ruby" }
p my_hash[:name] # => "Paul"

This code highlights the difference between strings and symbols and shows the convenient hash syntax.

Arrays and Hashes



Arrays and hashes have a rich API. The %w and %i shortcuts for creating arrays of strings and symbols are very handy.

# Array of strings
p %w[one two three]
# => ["one", "two", "three"]

# Array of symbols
p %i[one two three]
# => [:one, :two, :three]

A quick way to create arrays. You can also retrieve multiple values at once.

arr = [10, 20, 30, 40, 50]
p arr.values_at(0, 2, 4)
# => [10, 30, 50]

hash = { a: 1, b: 2, c: 3 }
p hash.values_at(:a, :c)
# => [1, 3]

The values_at method is a concise way to get multiple elements.

Final Thoughts



These are just a few of the many things I learned from "The Well-Grounded Rubyist". The book gave me a much deeper appreciation for the language and its design. If you are a Ruby programmer, I highly recommend it. Meanwhile, I also read "Programming Ruby 3.3", but I haven't had time to process my notes from it yet.

E-Mail your comments to paul@nospam.buetow.org :-)

Other Ruby-related posts:

2026-03-02 RCM: The Ruby Configuration Management DSL
2025-10-11 Key Takeaways from The Well-Grounded Rubyist (You are currently reading this)
2021-07-04 The Well-Grounded Rubyist

Back to the main site
f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments https://foo.zone/gemfeed/2025-10-02-f3s-kubernetes-with-freebsd-part-7.html 2025-10-02T11:27:19+03:00, last updated Tue 30 Dec 10:11:58 EET 2025 Paul Buetow aka snonux paul@dev.buetow.org This is the seventh blog post about the f3s series for my self-hosting demands in a home lab. f3s? The 'f' stands for FreeBSD, and the '3s' stands for k3s, the Kubernetes distribution I use on FreeBSD-based physical machines.

f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments



Published at 2025-10-02T11:27:19+03:00, last updated Tue 30 Dec 10:11:58 EET 2025

This is the seventh blog post about the f3s series for my self-hosting demands in a home lab. f3s? The "f" stands for FreeBSD, and the "3s" stands for k3s, the Kubernetes distribution I use on FreeBSD-based physical machines.

2024-11-17 f3s: Kubernetes with FreeBSD - Part 1: Setting the stage
2024-12-03 f3s: Kubernetes with FreeBSD - Part 2: Hardware and base installation
2025-02-01 f3s: Kubernetes with FreeBSD - Part 3: Protecting from power cuts
2025-04-05 f3s: Kubernetes with FreeBSD - Part 4: Rocky Linux Bhyve VMs
2025-05-11 f3s: Kubernetes with FreeBSD - Part 5: WireGuard mesh network
2025-07-14 f3s: Kubernetes with FreeBSD - Part 6: Storage
2025-10-02 f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments (You are currently reading this)
2025-12-07 f3s: Kubernetes with FreeBSD - Part 8: Observability

f3s logo

Table of Contents




Introduction



In this blog post, I am finally going to install k3s (the Kubernetes distribution I use) to the whole setup and deploy the first workloads (helm charts, and a private registry) to it.

https://k3s.io

Important Note: GitOps Migration



Note: After publishing this blog post, the f3s cluster was migrated from imperative Helm deployments to declarative GitOps using ArgoCD. The Kubernetes manifests and Helm charts in the repository have been reorganized for ArgoCD-based continuous deployment.

To view the exact manifests and charts as they existed when this blog post was written (before the ArgoCD migration), check out the pre-ArgoCD revision:

$ git clone https://codeberg.org/snonux/conf.git
$ cd conf
$ git checkout 15a86f3  # Last commit before ArgoCD migration
$ cd f3s/

The current master branch contains the ArgoCD-managed versions with:
  • Application manifests organized under argocd-apps/{monitoring,services,infra,test}/
  • Additional resources under */manifests/ directories (e.g., prometheus/manifests/)
  • Justfiles updated to trigger ArgoCD syncs instead of direct Helm commands

The deployment concepts and architecture remain the same—only the deployment method changed from imperative (helm install/upgrade) to declarative (GitOps with ArgoCD).
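
As a sketch of what an ArgoCD-managed deployment looks like (the path, names, and sync options here are illustrative assumptions, not copied from the repository), an Application manifest is roughly:

```yaml
# Illustrative ArgoCD Application; path and names are assumptions
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: prometheus
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://codeberg.org/snonux/conf.git
    targetRevision: HEAD
    path: f3s/argocd-apps/monitoring/prometheus
  destination:
    server: https://kubernetes.default.svc
    namespace: monitoring
  syncPolicy:
    automated:
      prune: true      # remove resources deleted from Git
      selfHeal: true   # revert manual drift back to the Git state
```

ArgoCD then keeps the cluster state converged on whatever the repository says, instead of me running helm install/upgrade by hand.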

Updating



Before proceeding, I brought all systems involved up to date. On all three Rocky Linux 9 boxes r0, r1, and r2:

dnf update -y
reboot

On the FreeBSD hosts, I upgraded from FreeBSD 14.2 to 14.3-RELEASE, running this on all three hosts f0, f1 and f2:

paul@f0:~ % doas freebsd-update fetch
paul@f0:~ % doas freebsd-update install
paul@f0:~ % doas reboot
.
.
.
paul@f0:~ % doas freebsd-update -r 14.3-RELEASE upgrade
paul@f0:~ % doas freebsd-update install
paul@f0:~ % doas freebsd-update install
paul@f0:~ % doas reboot
.
.
.
paul@f0:~ % doas freebsd-update install
paul@f0:~ % doas pkg update
paul@f0:~ % doas pkg upgrade
paul@f0:~ % doas reboot
.
.
.
paul@f0:~ % uname -a
FreeBSD f0.lan.buetow.org 14.3-RELEASE FreeBSD 14.3-RELEASE
        releng/14.3-n271432-8c9ce319fef7 GENERIC amd64

Installing k3s



Generating K3S_TOKEN and starting the first k3s node



I generated the k3s token on my Fedora laptop with pwgen -n 32 and selected one of the results. Then, on all three r hosts, I ran the following (replace SECRET_TOKEN with the actual secret):

[root@r0 ~]# echo -n SECRET_TOKEN > ~/.k3s_token

The following steps are also documented on the k3s website:

https://docs.k3s.io/datastore/ha-embedded

To bootstrap k3s on the first node, I ran this on r0:

[root@r0 ~]# curl -sfL https://get.k3s.io | K3S_TOKEN=$(cat ~/.k3s_token) \
        sh -s - server --cluster-init \
        --node-ip=192.168.2.120 \
        --advertise-address=192.168.2.120 \
        --tls-san=r0.wg0.wan.buetow.org
[INFO]  Finding release for channel stable
[INFO]  Using v1.32.6+k3s1 as release
.
.
.
[INFO]  systemd: Starting k3s

Note: The --node-ip and --advertise-address flags are important to ensure that the embedded etcd cluster communicates over the WireGuard interface (192.168.2.x) rather than the LAN interface (192.168.1.x). This ensures that all control plane traffic is encrypted via WireGuard.

Adding the remaining nodes to the cluster



Then I ran on the other two nodes r1 and r2:

[root@r1 ~]# curl -sfL https://get.k3s.io | K3S_TOKEN=$(cat ~/.k3s_token) \
        sh -s - server --server https://r0.wg0.wan.buetow.org:6443 \
        --node-ip=192.168.2.121 \
        --advertise-address=192.168.2.121 \
        --tls-san=r1.wg0.wan.buetow.org

[root@r2 ~]# curl -sfL https://get.k3s.io | K3S_TOKEN=$(cat ~/.k3s_token) \
        sh -s - server --server https://r0.wg0.wan.buetow.org:6443 \
        --node-ip=192.168.2.122 \
        --advertise-address=192.168.2.122 \
        --tls-san=r2.wg0.wan.buetow.org
.
.
.


Once done, I had a three-node Kubernetes cluster control plane:

[root@r0 ~]# kubectl get nodes
NAME                STATUS   ROLES                       AGE     VERSION
r0.lan.buetow.org   Ready    control-plane,etcd,master   4m44s   v1.32.6+k3s1
r1.lan.buetow.org   Ready    control-plane,etcd,master   3m13s   v1.32.6+k3s1
r2.lan.buetow.org   Ready    control-plane,etcd,master   30s     v1.32.6+k3s1

[root@r0 ~]# kubectl get pods --all-namespaces
NAMESPACE     NAME                                      READY   STATUS      RESTARTS   AGE
kube-system   coredns-5688667fd4-fs2jj                  1/1     Running     0          5m27s
kube-system   helm-install-traefik-crd-f9hgd            0/1     Completed   0          5m27s
kube-system   helm-install-traefik-zqqqk                0/1     Completed   2          5m27s
kube-system   local-path-provisioner-774c6665dc-jqlnc   1/1     Running     0          5m27s
kube-system   metrics-server-6f4c6675d5-5xpmp           1/1     Running     0          5m27s
kube-system   svclb-traefik-411cec5b-cdp2l              2/2     Running     0          78s
kube-system   svclb-traefik-411cec5b-f625r              2/2     Running     0          4m58s
kube-system   svclb-traefik-411cec5b-twrd7              2/2     Running     0          4m2s
kube-system   traefik-c98fdf6fb-lt6fx                   1/1     Running     0          4m58s

In order to connect with kubectl from my Fedora laptop, I had to copy /etc/rancher/k3s/k3s.yaml from r0 to ~/.kube/config and then replace the value of the server field with r0.lan.buetow.org. kubectl can now manage the cluster. Note that this step has to be repeated when I want to connect to another node of the cluster (e.g. when r0 is down).
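
The copy-and-edit step can be sketched like this (assuming k3s wrote its default 127.0.0.1 server address; here a stand-in file is fabricated so the commands are runnable as-is, while on the real laptop ~/.kube/config comes from r0:/etc/rancher/k3s/k3s.yaml, copied e.g. with scp):

```shell
# Stand-in for the kubeconfig fetched from r0
printf 'server: https://127.0.0.1:6443\n' > /tmp/kubeconfig-demo

# Rewrite the server field to point at the node's LAN name
sed -i 's|https://127.0.0.1:6443|https://r0.lan.buetow.org:6443|' /tmp/kubeconfig-demo

cat /tmp/kubeconfig-demo   # server: https://r0.lan.buetow.org:6443
```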

Test deployments



Test deployment to Kubernetes



Let's create a test namespace:

> ~ kubectl create namespace test
namespace/test created

> ~ kubectl get namespaces
NAME              STATUS   AGE
default           Active   6h11m
kube-node-lease   Active   6h11m
kube-public       Active   6h11m
kube-system       Active   6h11m
test              Active   5s

> ~ kubectl config set-context --current --namespace=test
Context "default" modified.

And let's also create an Apache test pod:

> ~ cat <<END > apache-deployment.yaml
# Apache HTTP Server Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: apache-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: apache
  template:
    metadata:
      labels:
        app: apache
    spec:
      containers:
      - name: apache
        image: httpd:latest
        ports:
        # Container port where Apache listens
        - containerPort: 80
END

> ~ kubectl apply -f apache-deployment.yaml
deployment.apps/apache-deployment created

> ~ kubectl get all
NAME                                     READY   STATUS    RESTARTS   AGE
pod/apache-deployment-5fd955856f-4pjmf   1/1     Running   0          7s

NAME                                READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/apache-deployment   1/1     1            1           7s

NAME                                           DESIRED   CURRENT   READY   AGE
replicaset.apps/apache-deployment-5fd955856f   1         1         1       7s

Let's also create a service:

> ~ cat <<END > apache-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: apache
  name: apache-service
spec:
  ports:
    - name: web
      port: 80
      protocol: TCP
      # Expose port 80 on the service
      targetPort: 80
  selector:
  # Link this service to pods with the label app=apache
    app: apache
END

> ~ kubectl apply -f apache-service.yaml
service/apache-service created

> ~ kubectl get service
NAME             TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
apache-service   ClusterIP   10.43.249.165   <none>        80/TCP    4s

Now let's create an ingress:

Note: I've modified the hosts listed in this example after I published this blog post to ensure that there aren't any bots scraping it.

> ~ cat <<END > apache-ingress.yaml

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: apache-ingress
  namespace: test
  annotations:
    spec.ingressClassName: traefik
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - host: f3s.foo.zone
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: apache-service
                port:
                  number: 80
    - host: standby.f3s.foo.zone
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: apache-service
                port:
                  number: 80
    - host: www.f3s.foo.zone
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: apache-service
                port:
                  number: 80
END

> ~ kubectl apply -f apache-ingress.yaml
ingress.networking.k8s.io/apache-ingress created

> ~ kubectl describe ingress
Name:             apache-ingress
Labels:           <none>
Namespace:        test
Address:          192.168.2.120,192.168.2.121,192.168.2.122
Ingress Class:    traefik
Default backend:  <default>
Rules:
  Host                    Path  Backends
  ----                    ----  --------
  f3s.foo.zone
                          /   apache-service:80 (10.42.1.11:80)
  standby.f3s.foo.zone
                          /   apache-service:80 (10.42.1.11:80)
  www.f3s.foo.zone
                          /   apache-service:80 (10.42.1.11:80)
Annotations:              spec.ingressClassName: traefik
                          traefik.ingress.kubernetes.io/router.entrypoints: web
Events:                   <none>

Notes:

  • In the ingress, I use plain HTTP (web) for the Traefik rule, as all the "production" traffic will be routed through a WireGuard tunnel anyway, as I will show later.

So I tested the Apache web server through the ingress rule:

> ~ curl -H "Host: www.f3s.foo.zone" http://r0.lan.buetow.org:80
<html><body><h1>It works!</h1></body></html>

Test deployment with persistent volume claim



Next, I modified the Apache example to serve the htdocs directory from the NFS share I created in the previous blog post. I used the following manifests. Most of them are the same as before, except for the persistent volume claim and the volume mount in the Apache deployment.

> ~ cat <<END > apache-deployment.yaml
# Apache HTTP Server Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: apache-deployment
  namespace: test
spec:
  replicas: 2
  selector:
    matchLabels:
      app: apache
  template:
    metadata:
      labels:
        app: apache
    spec:
      containers:
      - name: apache
        image: httpd:latest
        ports:
        # Container port where Apache listens
        - containerPort: 80
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 15
          periodSeconds: 10
        volumeMounts:
        - name: apache-htdocs
          mountPath: /usr/local/apache2/htdocs/
      volumes:
      - name: apache-htdocs
        persistentVolumeClaim:
          claimName: example-apache-pvc
END

> ~ cat <<END > apache-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: apache-ingress
  namespace: test
  annotations:
    spec.ingressClassName: traefik
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - host: f3s.foo.zone
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: apache-service
                port:
                  number: 80
    - host: standby.f3s.foo.zone
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: apache-service
                port:
                  number: 80
    - host: www.f3s.foo.zone
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: apache-service
                port:
                  number: 80
END

> ~ cat <<END > apache-persistent-volume.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-apache-pv
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /data/nfs/k3svolumes/example-apache-volume-claim
    type: Directory
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-apache-pvc
  namespace: test
spec:
  storageClassName: ""
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
END

> ~ cat <<END > apache-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: apache
  name: apache-service
  namespace: test
spec:
  ports:
    - name: web
      port: 80
      protocol: TCP
      # Expose port 80 on the service
      targetPort: 80
  selector:
  # Link this service to pods with the label app=apache
    app: apache
END

I applied the manifests:

> ~ kubectl apply -f apache-persistent-volume.yaml
> ~ kubectl apply -f apache-service.yaml
> ~ kubectl apply -f apache-deployment.yaml
> ~ kubectl apply -f apache-ingress.yaml

Looking at the deployment, I could see it failed because the directory didn't exist yet on the NFS share (note that I also increased the replica count to 2 so if one node goes down there's already a replica running on another node for faster failover):

> ~ kubectl get pods
NAME                                 READY   STATUS              RESTARTS   AGE
apache-deployment-5b96bd6b6b-fv2jx   0/1     ContainerCreating   0          9m15s
apache-deployment-5b96bd6b6b-ax2ji   0/1     ContainerCreating   0          9m15s

> ~ kubectl describe pod apache-deployment-5b96bd6b6b-fv2jx | tail -n 5
Events:
  Type     Reason       Age                   From               Message
  ----     ------       ----                  ----               -------
  Normal   Scheduled    9m34s                 default-scheduler  Successfully
    assigned test/apache-deployment-5b96bd6b6b-fv2jx to r2.lan.buetow.org
  Warning  FailedMount  80s (x12 over 9m34s)  kubelet            MountVolume.SetUp
    failed for volume "example-apache-pv" : hostPath type check failed:
    /data/nfs/k3svolumes/example-apache is not a directory

That's intentional—I needed to create the directory on the NFS share first, so I did that (e.g. on r0):

[root@r0 ~]# mkdir /data/nfs/k3svolumes/example-apache-volume-claim/

[root@r0 ~]# cat <<END > /data/nfs/k3svolumes/example-apache-volume-claim/index.html
<!DOCTYPE html>
<html>
<head>
  <title>Hello, it works</title>
</head>
<body>
  <h1>Hello, it works!</h1>
  <p>This site is served via a PVC!</p>
</body>
</html>
END

The index.html file gives us some actual content to serve. After deleting the pod, the deployment recreates it and the volume mounts correctly:

> ~ kubectl delete pod apache-deployment-5b96bd6b6b-fv2jx

> ~ curl -H "Host: www.f3s.foo.zone" http://r0.lan.buetow.org:80
<!DOCTYPE html>
<html>
<head>
  <title>Hello, it works</title>
</head>
<body>
  <h1>Hello, it works!</h1>
  <p>This site is served via a PVC!</p>
</body>
</html>

Scaling Traefik for faster failover



Traefik (used for ingress on k3s) ships with a single replica by default, but for faster failover I bumped it to two replicas so that two of the three nodes each run one pod. That way, if a node disappears, the service stays up while Kubernetes schedules a replacement. Here's the command I used:

> ~ kubectl -n kube-system scale deployment traefik --replicas=2

And the result:

> ~ kubectl -n kube-system get pods -l app.kubernetes.io/name=traefik
kube-system   traefik-c98fdf6fb-97kqk   1/1   Running   19 (53d ago)   64d
kube-system   traefik-c98fdf6fb-9npg2   1/1   Running   11 (53d ago)   61d
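One caveat: k3s deploys Traefik through its packaged Helm chart, so a manual kubectl scale may be reverted when k3s upgrades the chart. To make the replica count persistent, k3s supports overriding chart values with a HelmChartConfig manifest. A sketch (I haven't wired this in myself; the field names follow the k3s Helm integration docs):

```yaml
# /var/lib/rancher/k3s/server/manifests/traefik-config.yaml
# k3s picks up manifests in this directory automatically.
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    deployment:
      replicas: 2
```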

Make it accessible from the public internet



Next, I made this accessible from the public internet via the www.f3s.foo.zone hostnames. As a reminder, here is the relevant part from the section "OpenBSD/relayd to the rescue for external connectivity" in part 1 of this series:

f3s: Kubernetes with FreeBSD - Part 1: Setting the stage

All apps should be reachable through the internet (e.g., from my phone or computer when travelling). For external connectivity and TLS management, I've got two OpenBSD VMs (one hosted by OpenBSD Amsterdam and another hosted by Hetzner) handling public-facing services like DNS, relaying traffic, and automating Let's Encrypt certificates.

All of this (every Linux VM to every OpenBSD box) will be connected via WireGuard tunnels, keeping everything private and secure. There will be 6 WireGuard tunnels (3 k3s nodes times two OpenBSD VMs).

So, when I want to access a service running in k3s, I will hit an external DNS endpoint (with the authoritative DNS servers being the OpenBSD boxes). The DNS will resolve to the master OpenBSD VM (see my KISS highly-available with OpenBSD blog post), and from there, the relayd process (with a Let's Encrypt certificate—see my Let's Encrypt with OpenBSD and Rex blog post) will accept the TCP connection and forward it through the WireGuard tunnel to a reachable node port of one of the k3s nodes, thus serving the traffic.

> ~ curl https://f3s.foo.zone
<html><body><h1>It works!</h1></body></html>

> ~ curl https://www.f3s.foo.zone
<html><body><h1>It works!</h1></body></html>

> ~ curl https://standby.f3s.foo.zone
<html><body><h1>It works!</h1></body></html>

This is how it works in relayd.conf on OpenBSD:

OpenBSD relayd configuration



The OpenBSD edge relays keep the addresses of the f3s ingress endpoints in a shared backend table, so TLS traffic for every f3s hostname lands on the same pool of k3s nodes. The table points to the WireGuard IP addresses of those nodes: remember, they run locally in my LAN, whereas the OpenBSD edge relays sit on the public internet:

table <f3s> {
  192.168.2.120
  192.168.2.121
  192.168.2.122
}

Inside the http protocol "https" block, each public hostname gets its own Let's Encrypt certificate. The protocol configures TLS keypairs for all f3s services and other public endpoints. For f3s hosts specifically, there are no explicit forward to rules in the protocol; they use the relay-level failover mechanism described later. Non-f3s hosts get explicit localhost routing to prevent them from falling through to the f3s backends:

http protocol "https" {
    # TLS certificates for all f3s services
    tls keypair f3s.foo.zone
    tls keypair www.f3s.foo.zone
    tls keypair standby.f3s.foo.zone
    tls keypair anki.f3s.foo.zone
    tls keypair www.anki.f3s.foo.zone
    tls keypair standby.anki.f3s.foo.zone
    tls keypair bag.f3s.foo.zone
    tls keypair www.bag.f3s.foo.zone
    tls keypair standby.bag.f3s.foo.zone
    tls keypair flux.f3s.foo.zone
    tls keypair www.flux.f3s.foo.zone
    tls keypair standby.flux.f3s.foo.zone
    tls keypair audiobookshelf.f3s.foo.zone
    tls keypair www.audiobookshelf.f3s.foo.zone
    tls keypair standby.audiobookshelf.f3s.foo.zone
    tls keypair gpodder.f3s.foo.zone
    tls keypair www.gpodder.f3s.foo.zone
    tls keypair standby.gpodder.f3s.foo.zone
    tls keypair radicale.f3s.foo.zone
    tls keypair www.radicale.f3s.foo.zone
    tls keypair standby.radicale.f3s.foo.zone
    tls keypair vault.f3s.foo.zone
    tls keypair www.vault.f3s.foo.zone
    tls keypair standby.vault.f3s.foo.zone
    tls keypair syncthing.f3s.foo.zone
    tls keypair www.syncthing.f3s.foo.zone
    tls keypair standby.syncthing.f3s.foo.zone
    tls keypair uprecords.f3s.foo.zone
    tls keypair www.uprecords.f3s.foo.zone
    tls keypair standby.uprecords.f3s.foo.zone

    # Explicitly route non-f3s hosts to localhost
    match request header "Host" value "foo.zone" forward to <localhost>
    match request header "Host" value "www.foo.zone" forward to <localhost>
    match request header "Host" value "dtail.dev" forward to <localhost>
    # ... other non-f3s hosts ...

    # NOTE: f3s hosts have NO match rules here!
    # They use relay-level failover (f3s -> localhost backup)
    # See the relay configuration below for automatic failover details
}
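Those thirty tls keypair lines are tedious to maintain by hand. A tiny (hypothetical) shell loop regenerates them from the service list whenever a service is added:

```shell
# Emit one "tls keypair" stanza per f3s hostname:
# the bare, www. and standby. variant of each service.
for svc in "" anki. bag. flux. audiobookshelf. gpodder. \
           radicale. vault. syncthing. uprecords.; do
    for prefix in "" www. standby.; do
        echo "    tls keypair ${prefix}${svc}f3s.foo.zone"
    done
done
```

Piping the output into the relayd.conf template keeps the certificate list in sync with the deployed services.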

Both IPv4 and IPv6 listeners reuse the same protocol definition, making the relay transparent for dual-stack clients while still health checking every k3s backend before forwarding traffic over WireGuard:

relay "https4" {
    listen on 46.23.94.99 port 443 tls
    protocol "https"
    # Primary: f3s cluster (with health checks) - Falls back to localhost when all hosts down
    forward to <f3s> port 80 check tcp
    forward to <localhost> port 8080
}

relay "https6" {
    listen on 2a03:6000:6f67:624::99 port 443 tls
    protocol "https"
    # Primary: f3s cluster (with health checks) - Falls back to localhost when all hosts down
    forward to <f3s> port 80 check tcp
    forward to <localhost> port 8080
}

In practice, that means relayd terminates TLS with the correct certificate, keeps the three WireGuard-connected backends in rotation, and forwards each request to one of the healthy k3s nodes.

Automatic failover when f3s cluster is down



Update: This section was added at Tue 30 Dec 10:11:44 EET 2025

One important aspect of this setup is graceful degradation: when all three f3s nodes are unreachable (e.g., during maintenance or a power outage in my LAN), users should see a friendly status page instead of an error message.

OpenBSD's relayd supports automatic failover through its health check mechanism. According to the relayd.conf manual:

This directive can be specified multiple times - subsequent entries will be used as the backup table if all hosts in the previous table are down.

The key is the order of forward to statements in the relay configuration. By placing the f3s table first with check tcp health checks, followed by localhost as a backup, relayd automatically routes traffic based on backend availability:

When f3s cluster is UP:

  • Health checks on port 80 succeed for f3s nodes
  • All f3s traffic routes to the Kubernetes cluster
  • Localhost backup remains idle

When f3s cluster is DOWN:

  • All health checks fail (nodes unreachable)
  • The <f3s> table becomes unavailable
  • Traffic automatically falls back to <localhost> on port 8080
  • OpenBSD's httpd serves a static fallback page

# NEW configuration - supports automatic failover
http protocol "https" {
    # Explicitly route non-f3s hosts to localhost
    match request header "Host" value "foo.zone" forward to <localhost>
    match request header "Host" value "dtail.dev" forward to <localhost>
    # ... other non-f3s hosts ...

    # f3s hosts have NO protocol rules - they use relay-level failover
    # (no match rules for f3s.foo.zone, anki.f3s.foo.zone, etc.)
}

relay "https4" {
    # f3s FIRST (with health checks), localhost as BACKUP
    forward to <f3s> port 80 check tcp
    forward to <localhost> port 8080
}

This way, f3s traffic uses the relay's default behavior: try the first table, fall back to the second when health checks fail.

OpenBSD httpd fallback configuration



The localhost httpd service on port 8080 serves the fallback content from /var/www/htdocs/f3s_fallback/. This directory contains a simple HTML page explaining the situation.

The key configuration detail is using request rewrite to ensure the fallback page is served for ALL paths, not just the root. Without this, accessing paths like /login?redirect=/files/ would return 404 instead of the fallback page:

# OpenBSD httpd.conf
# Fallback for f3s hosts - serve fallback page for ALL paths
server "f3s.foo.zone" {
  listen on * port 8080
  log style forwarded
  location * {
    # Rewrite all requests to /index.html to show fallback page regardless of path
    request rewrite "/index.html"
    root "/htdocs/f3s_fallback"
  }
}

server "anki.f3s.foo.zone" {
  listen on * port 8080
  log style forwarded
  location * {
    request rewrite "/index.html"
    root "/htdocs/f3s_fallback"
  }
}

# ... similar blocks for all f3s hostnames ...

The request rewrite "/index.html" directive ensures that whether someone accesses /, /login, /api/status, or any other path, they all receive the same fallback page. This prevents confusing 404 errors when users have bookmarked specific URLs or follow deep links while the cluster is down.
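Since every f3s hostname gets an identical server block, the blocks could also be generated instead of copy-pasted. A hypothetical helper (hostname list abbreviated):

```shell
# Emit one httpd.conf fallback server block per f3s hostname
emit_fallback() {
    cat <<EOF
server "$1" {
  listen on * port 8080
  log style forwarded
  location * {
    request rewrite "/index.html"
    root "/htdocs/f3s_fallback"
  }
}
EOF
}

for host in f3s.foo.zone anki.f3s.foo.zone; do
    emit_fallback "$host"
done
```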

The fallback page itself is straightforward:

<!DOCTYPE html>
<html>
<head>
    <title>Server turned off</title>
    <style>
        body {
            font-family: sans-serif;
            text-align: center;
            padding-top: 50px;
        }
        .container {
            max-width: 600px;
            margin: 0 auto;
        }
    </style>
</head>
<body>
    <div class="container">
        <h1>Server turned off</h1>
        <p>The servers are all currently turned off.</p>
        <p>Please try again later.</p>
        <p>Or email <a href="mailto:paul@nospam.buetow.org">paul@nospam.buetow.org</a>
           - so I can turn them back on for you!</p>
    </div>
</body>
</html>

This approach provides several benefits:

  • Automatic detection: Health checks run continuously; no manual intervention needed
  • Instant fallback: When all f3s nodes go down, the next request automatically routes to localhost
  • Transparent recovery: When f3s comes back online, health checks pass and traffic resumes automatically
  • User experience: Visitors see a helpful message instead of connection errors
  • No DNS changes: The same hostnames work whether f3s is up or down

This fallback mechanism has proven invaluable during maintenance windows and unexpected outages, ensuring that users always get a response even when the home lab is offline.

Exposing services via LAN ingress



In addition to external access through the OpenBSD relays, services can also be exposed on the local network using LAN-specific ingresses. This is useful for accessing services from within the home network without going through the internet, reducing latency and providing an alternative path if the external relays are unavailable.

The LAN ingress architecture leverages the existing FreeBSD CARP (Common Address Redundancy Protocol) failover infrastructure that's already in place for NFS-over-TLS (see Part 5). Instead of deploying MetalLB or another LoadBalancer implementation, we reuse the CARP virtual IP (192.168.1.138) by adding HTTP/HTTPS forwarding alongside the existing stunnel service on port 2323.

Architecture overview



The LAN access path differs from external access:

External access (*.f3s.foo.zone):
Internet → OpenBSD relayd (TLS termination, Let's Encrypt)
        → WireGuard tunnel
        → k3s Traefik :80 (HTTP)
        → Service

LAN access (*.f3s.lan.foo.zone):
LAN → FreeBSD CARP VIP (192.168.1.138)
    → FreeBSD relayd (TCP forwarding)
    → k3s Traefik :443 (TLS termination, cert-manager)
    → Service

The key architectural decisions:

  • FreeBSD relayd performs pure TCP forwarding (Layer 4) for ports 80 and 443, not TLS termination
  • Traefik inside k3s handles TLS offloading using certificates from cert-manager
  • Self-signed CA for LAN domains (no external dependencies)
  • CARP provides automatic failover between f0 and f1
  • No code changes to applications—just add a LAN ingress resource

Installing cert-manager



First, install cert-manager to handle certificate lifecycle management for LAN services. The installation is automated with a Justfile:

codeberg.org/snonux/conf/f3s/cert-manager

$ cd conf/f3s/cert-manager
$ just install
kubectl apply -f cert-manager.yaml
# ... cert-manager CRDs and resources created ...
kubectl apply -f self-signed-issuer.yaml
clusterissuer.cert-manager.io/selfsigned-issuer created
clusterissuer.cert-manager.io/selfsigned-ca-issuer created
kubectl apply -f ca-certificate.yaml
certificate.cert-manager.io/selfsigned-ca created
kubectl apply -f wildcard-certificate.yaml
certificate.cert-manager.io/f3s-lan-wildcard created

This creates:

  • A self-signed ClusterIssuer
  • A CA certificate (f3s-lan-ca) valid for 10 years
  • A CA-signed ClusterIssuer
  • A wildcard certificate (*.f3s.lan.foo.zone) valid for 90 days with automatic renewal
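For reference, the wildcard Certificate that just install applies looks roughly like this (a sketch reconstructed from the resources listed above, not a verbatim copy of wildcard-certificate.yaml):

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: f3s-lan-wildcard
  namespace: cert-manager
spec:
  secretName: f3s-lan-tls
  duration: 2160h # 90 days, renewed automatically by cert-manager
  dnsNames:
    - "*.f3s.lan.foo.zone"
  issuerRef:
    name: selfsigned-ca-issuer
    kind: ClusterIssuer
```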

Verify the certificates:

$ kubectl get certificate -n cert-manager
NAME               READY   SECRET                 AGE
f3s-lan-wildcard   True    f3s-lan-tls            5m
selfsigned-ca      True    selfsigned-ca-secret   5m

The wildcard certificate (f3s-lan-tls) needs to be copied to any namespace that uses it:

$ kubectl get secret f3s-lan-tls -n cert-manager -o yaml | \
    sed 's/namespace: cert-manager/namespace: services/' | \
    kubectl apply -f -

Configuring FreeBSD relayd for LAN access



On both FreeBSD hosts (f0, f1), install and configure relayd for TCP forwarding:

paul@f0:~ % doas pkg install -y relayd

Create /usr/local/etc/relayd.conf:

# k3s nodes backend table
table <k3s_nodes> { 192.168.1.120 192.168.1.121 192.168.1.122 }

# TCP forwarding to Traefik (no TLS termination)
relay "lan_http" {
    listen on 192.168.1.138 port 80
    forward to <k3s_nodes> port 80 check tcp
}

relay "lan_https" {
    listen on 192.168.1.138 port 443
    forward to <k3s_nodes> port 443 check tcp
}

Note: The IP addresses 192.168.1.120-122 are the LAN IPs of the k3s nodes (r0, r1, r2), not their WireGuard IPs.

FreeBSD relayd requires PF (Packet Filter) to be enabled. Create a minimal /etc/pf.conf:

# Basic PF rules for relayd
set skip on lo0
pass in quick
pass out quick

Enable PF and relayd:

paul@f0:~ % doas sysrc pf_enable=YES pflog_enable=YES relayd_enable=YES
paul@f0:~ % doas service pf start
paul@f0:~ % doas service pflog start
paul@f0:~ % doas service relayd start

Verify relayd is listening on the CARP VIP:

paul@f0:~ % doas sockstat -4 -l | grep 192.168.1.138
_relayd  relayd   2903  11  tcp4   192.168.1.138:80      *:*
_relayd  relayd   2903  12  tcp4   192.168.1.138:443     *:*

Repeat the same configuration on f1. Both hosts will run relayd listening on the CARP VIP, but only the CARP MASTER will respond to traffic. When failover occurs, the new MASTER takes over seamlessly.

Adding LAN ingress to services



To expose a service on the LAN, add a second Ingress resource to its Helm chart. Here's an example:

---
# LAN Ingress for f3s.lan.foo.zone
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-lan
  namespace: services
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web,websecure
spec:
  ingressClassName: traefik
  tls:
    - hosts:
        - f3s.lan.foo.zone
      secretName: f3s-lan-tls
  rules:
    - host: f3s.lan.foo.zone
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: service
                port:
                  number: 4533

Key points:

  • Use web,websecure entrypoints (both HTTP and HTTPS)
  • Reference the f3s-lan-tls secret in the tls section
  • Use the *.f3s.lan.foo.zone subdomain pattern
  • Same backend service as the external ingress

Apply the ingress and test:

$ kubectl apply -f ingress-lan.yaml
ingress.networking.k8s.io/ingress-lan created

$ curl -kI https://f3s.lan.foo.zone
HTTP/2 302 
location: /app/

Client-side DNS and CA setup



To access LAN services, clients need DNS entries and must trust the self-signed CA.

Add DNS entries to /etc/hosts on your laptop:

$ sudo tee -a /etc/hosts << 'EOF'
# f3s LAN services
192.168.1.138  f3s.lan.foo.zone
EOF

The CARP VIP 192.168.1.138 provides high availability—traffic automatically fails over to the backup host if the master goes down.
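With more services, maintaining those /etc/hosts lines by hand gets tedious; a small loop prints one entry per service, all pointing at the CARP VIP (the service list here is illustrative):

```shell
# One /etc/hosts line per LAN service, all on the CARP VIP
for svc in anki bag flux syncthing; do
    echo "192.168.1.138  ${svc}.f3s.lan.foo.zone"
done
```

Appending the output to /etc/hosts (with sudo tee -a, as above) wires up all services in one go.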

Export the self-signed CA certificate:

$ kubectl get secret selfsigned-ca-secret -n cert-manager -o jsonpath='{.data.ca\.crt}' | \
    base64 -d > f3s-lan-ca.crt

Install the CA certificate on Linux (Fedora/Rocky):

$ sudo cp f3s-lan-ca.crt /etc/pki/ca-trust/source/anchors/
$ sudo update-ca-trust

After trusting the CA, browsers will accept the LAN certificates without warnings.

Scaling to other services



The same pattern can be applied to any service. To add LAN access:

1. Copy the f3s-lan-tls secret to the service's namespace (if not already there)
2. Add a LAN Ingress resource using the pattern above
3. Configure DNS: 192.168.1.138 service.f3s.lan.foo.zone

No changes needed to:

  • relayd configuration (forwards all traffic)
  • cert-manager (wildcard cert covers all *.f3s.lan.foo.zone)
  • CARP configuration (VIP shared by all services)

TLS offloaders summary



The f3s infrastructure now has three distinct TLS offloaders:

  • OpenBSD relayd: External internet traffic (*.f3s.foo.zone) using Let's Encrypt
  • Traefik (k3s): LAN HTTPS traffic (*.f3s.lan.foo.zone) using cert-manager
  • stunnel: NFS-over-TLS (port 2323) using custom PKI

Each serves a different purpose with appropriate certificate management for its use case.

Deploying the private Docker image registry



As not all Docker images I want to deploy are available on public registries, and as I also build some of them myself, I need a private registry.

All manifests for the f3s stack live in my configuration repository:

codeberg.org/snonux/conf/f3s

Within that repo, the f3s/registry/ directory contains the Helm chart, a Justfile, and a detailed README. Here's the condensed walkthrough I used to roll out the registry with Helm.

Prepare the NFS-backed storage



Create the directory that will hold the registry blobs on the NFS share (I ran this on r0, but any node that exports /data/nfs/k3svolumes works):

[root@r0 ~]# mkdir -p /data/nfs/k3svolumes/registry

Install (or upgrade) the chart



Clone the repo (or pull the latest changes) on a workstation that has helm configured for the cluster, then deploy the chart. The Justfile wraps the commands, but the raw Helm invocation looks like this:

$ git clone https://codeberg.org/snonux/conf/f3s.git
$ cd conf/f3s/examples/conf/f3s/registry
$ helm upgrade --install registry ./helm-chart --namespace infra --create-namespace

Helm creates the infra namespace if it does not exist, provisions a PersistentVolume/PersistentVolumeClaim pair that points at /data/nfs/k3svolumes/registry, and spins up a single registry pod exposed via the docker-registry-service NodePort (30001). Verify everything is up before continuing:

$ kubectl get pods --namespace infra
NAME                               READY   STATUS    RESTARTS      AGE
docker-registry-6bc9bb46bb-6grkr   1/1     Running   6 (53d ago)   54d

$ kubectl get svc docker-registry-service -n infra
NAME                      TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
docker-registry-service   NodePort   10.43.141.56   <none>        5000:30001/TCP   54d

Allow nodes and workstations to trust the registry



The registry listens on plain HTTP, so both the Docker daemons on my workstations and the k3s nodes need to treat it as an insecure registry. That's fine for my personal needs, as:

  • I don't store any secrets in the images
  • I only access the registry this way via my LAN
  • I may change this later on...

On my Fedora workstation where I build images:

$ cat <<"EOF" | sudo tee /etc/docker/daemon.json >/dev/null
{
  "insecure-registries": [
    "r0.lan.buetow.org:30001",
    "r1.lan.buetow.org:30001",
    "r2.lan.buetow.org:30001"
  ]
}
EOF
$ sudo systemctl restart docker
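A malformed daemon.json keeps the Docker daemon from starting, so it's worth validating the JSON before the restart. A self-contained sketch (writing to a temp file here instead of touching the real path):

```shell
# Validate the insecure-registries JSON before restarting Docker.
# Uses a temp file so the check is self-contained; point it at
# /etc/docker/daemon.json on a real system.
tmp=$(mktemp)
cat <<"EOF" > "$tmp"
{
  "insecure-registries": [
    "r0.lan.buetow.org:30001",
    "r1.lan.buetow.org:30001",
    "r2.lan.buetow.org:30001"
  ]
}
EOF
python3 -m json.tool "$tmp" >/dev/null && echo "daemon.json OK"
rm -f "$tmp"
```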

On each k3s node, make registry.lan.buetow.org resolve locally and point k3s at the NodePort:

$ for node in r0 r1 r2; do
>   ssh root@$node "echo '127.0.0.1 registry.lan.buetow.org' >> /etc/hosts"
> done

$ for node in r0 r1 r2; do
> ssh root@$node "cat <<'EOF' > /etc/rancher/k3s/registries.yaml
mirrors:
  "registry.lan.buetow.org:30001":
    endpoint:
      - "http://localhost:30001"
EOF
systemctl restart k3s"
> done

Thanks to the relayd configuration earlier in the post, the external hostnames (f3s.foo.zone, etc.) can already reach NodePort 30001, so publishing the registry to the outside world later is just a matter of wiring up the DNS the same way as for the ingress hosts. For security reasons, though, that's deliberately not enabled for now.

Pushing and pulling images



Tag any locally built image with one of the node hostnames on port 30001, then push it. I usually target whichever node is closest to me, but any of the three will do:

$ docker tag my-app:latest r0.lan.buetow.org:30001/my-app:latest
$ docker push r0.lan.buetow.org:30001/my-app:latest

Inside the cluster (or from other nodes), reference the image via the service name that Helm created:

image: docker-registry-service:5000/my-app:latest

You can test the pull path straight away:

$ kubectl run registry-test \
>   --image=docker-registry-service:5000/my-app:latest \
>   --restart=Never -n test --command -- sleep 300

If the pod pulls successfully, the private registry is ready for use by the rest of the workloads. Note that the commands above don't actually work as-is; they are mentioned here for illustration purposes only.

Example: Anki Sync Server from the private registry



One of the first workloads I migrated onto the k3s cluster after standing up the registry was my Anki sync server. The configuration repo ships everything in examples/conf/f3s/anki-sync-server/: a Docker build context plus a Helm chart that references the freshly built image.

Build and push the image



The Dockerfile lives under docker-image/ and takes the Anki release to compile as an ANKI_VERSION build argument. The accompanying Justfile wraps the steps, but the raw commands look like this:

$ cd conf/f3s/examples/conf/f3s/anki-sync-server/docker-image
$ docker build -t anki-sync-server:25.07.5b --build-arg ANKI_VERSION=25.07.5 .
$ docker tag anki-sync-server:25.07.5b \
    r0.lan.buetow.org:30001/anki-sync-server:25.07.5b
$ docker push r0.lan.buetow.org:30001/anki-sync-server:25.07.5b

Because every k3s node treats registry.lan.buetow.org:30001 as an insecure mirror (see above), the push succeeds regardless of which node answers. If you prefer the shortcut, just f3s in that directory performs the same build/tag/push sequence.

Create the Anki secret and storage on the cluster



The Helm chart expects the services namespace, a pre-created NFS directory, and a Kubernetes secret that holds the credentials the upstream container understands:

$ ssh root@r0 "mkdir -p /data/nfs/k3svolumes/anki-sync-server/anki_data"
$ kubectl create namespace services
$ kubectl create secret generic anki-sync-server-secret \
    --from-literal=SYNC_USER1='paul:SECRETPASSWORD' \
    -n services

If the services namespace already exists, skip that line (kubectl create will just report that the namespace already exists).

Deploy the chart



With the prerequisites in place, install (or upgrade) the chart. It pins the container image to the tag we just pushed and mounts the NFS export via a PersistentVolume/PersistentVolumeClaim pair:

$ cd ../helm-chart
$ helm upgrade --install anki-sync-server . -n services

Helm provisions everything referenced in the templates:

containers:
- name: anki-sync-server
  image: registry.lan.buetow.org:30001/anki-sync-server:25.07.5b
  volumeMounts:
  - name: anki-data
    mountPath: /anki_data

Once the release comes up, verify that the pod pulled the freshly pushed image and that the ingress we configured earlier resolves through relayd just like the Apache example:

$ kubectl get pods -n services
$ kubectl get ingress anki-sync-server-ingress -n services
$ curl https://anki.f3s.foo.zone/health

All of this runs solely on first-party images that now live in the private registry, proving the full flow from local build to WireGuard-exposed service.

NFSv4 UID mapping for Postgres-backed (and other) apps



NFSv4 only sees numeric user and group IDs, so the postgres account created inside the container must exist with the same UID/GID on the Kubernetes worker and on the FreeBSD NFS servers. Otherwise the pod starts with UID 999, the export sees it as an unknown anonymous user, and Postgres fails to initialise its data directory.
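The container image already creates the postgres user with UID 999, but the expectation can also be made explicit in the pod spec via a securityContext. A hypothetical sketch; this documents the assumption rather than replacing the host-side account setup below:

```yaml
# Pod-level securityContext pinning the IDs the NFS export expects
securityContext:
  runAsUser: 999   # must match postgres on r0-r2 and f0-f2
  runAsGroup: 999
  fsGroup: 999     # mounted volumes get group ownership 999
```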

To verify things line up end-to-end I run id in the container and on the hosts:

> ~ kubectl exec -n services deploy/miniflux-postgres -- id postgres
uid=999(postgres) gid=999(postgres) groups=999(postgres)

[root@r0 ~]# id postgres
uid=999(postgres) gid=999(postgres) groups=999(postgres)

paul@f0:~ % doas id postgres
uid=999(postgres) gid=999(postgres) groups=999(postgres)

The Rocky Linux workers get their matching user with plain useradd/groupadd (repeat on r0, r1, and r2):

[root@r0 ~]# groupadd --gid 999 postgres
[root@r0 ~]# useradd --uid 999 --gid 999 \
                --home-dir /var/lib/pgsql \
                --shell /sbin/nologin postgres

FreeBSD uses pw, so on each NFS server (f0, f1, f2) I created the same account and disabled shell access:

paul@f0:~ % doas pw groupadd postgres -g 999
paul@f0:~ % doas pw useradd postgres -u 999 -g postgres \
                -d /var/db/postgres -s /usr/sbin/nologin

Once the UID/GID exist everywhere, the Miniflux chart in examples/conf/f3s/miniflux deploys cleanly. The chart provisions both the application and its bundled Postgres database, mounts the exported directory, and builds the DSN at runtime. The important bits live in helm-chart/templates/persistent-volumes.yaml and deployment.yaml:

# Persistent volume lives on the NFS export
hostPath:
  path: /data/nfs/k3svolumes/miniflux/data
  type: Directory
...
containers:
- name: miniflux-postgres
  image: postgres:17
  volumeMounts:
  - name: miniflux-postgres-data
    mountPath: /var/lib/postgresql/data

Follow the README beside the chart to create the secrets and the target directory:

$ cd examples/conf/f3s/miniflux/helm-chart
$ mkdir -p /data/nfs/k3svolumes/miniflux/data
$ kubectl create secret generic miniflux-db-password \
    --from-literal=fluxdb_password='YOUR_PASSWORD' -n services
$ kubectl create secret generic miniflux-admin-password \
    --from-literal=admin_password='YOUR_ADMIN_PASSWORD' -n services
$ helm upgrade --install miniflux . -n services --create-namespace

And to verify it's all up:

$ kubectl get all --namespace=services | grep mini
pod/miniflux-postgres-556444cb8d-xvv2p   1/1     Running   0             54d
pod/miniflux-server-85d7c64664-stmt9     1/1     Running   0             54d
service/miniflux                   ClusterIP   10.43.47.80     <none>        8080/TCP             54d
service/miniflux-postgres          ClusterIP   10.43.139.50    <none>        5432/TCP             54d
deployment.apps/miniflux-postgres   1/1     1            1           54d
deployment.apps/miniflux-server     1/1     1            1           54d
replicaset.apps/miniflux-postgres-556444cb8d   1         1         1       54d
replicaset.apps/miniflux-server-85d7c64664     1         1         1       54d

Or I run the equivalent shortcut from the repository root.

Helm charts currently in service



These are the charts that already live under examples/conf/f3s and run on the cluster today (and I'll keep adding more as new services graduate into production):

  • anki-sync-server — custom-built image served from the private registry, stores decks on /data/nfs/k3svolumes/anki-sync-server/anki_data, and authenticates through the anki-sync-server-secret.
  • koreade-sync-server — Sync server for KOReader.
  • audiobookshelf — media streaming stack with three hostPath mounts (config, audiobooks, podcasts) so the library survives node rebuilds.
  • example-apache — minimal HTTP service I use for smoke-testing ingress and relayd rules.
  • example-apache-volume-claim — Apache plus PVC variant that exercises NFS-backed storage for walkthroughs like the one earlier in this post.
  • miniflux — the Postgres-backed feed reader described above, wired for NFSv4 UID mapping and per-release secrets.
  • opodsync — podsync deployment with its data directory under /data/nfs/k3svolumes/opodsync/data.
  • radicale — CalDAV/CardDAV (and gpodder) backend with separate collections and auth volumes.
  • registry — the plain-HTTP Docker registry exposed on NodePort 30001 and mirrored internally as registry.lan.buetow.org:30001.
  • syncthing — two-volume setup for config and shared data, fronted by the syncthing.f3s.foo.zone ingress.
  • wallabag — read-it-later service with persistent data and images directories on the NFS export.

I hope you enjoyed this walkthrough. Read the next post of this series:

f3s: Kubernetes with FreeBSD - Part 8: Observability

Other *BSD-related posts:

2025-12-07 f3s: Kubernetes with FreeBSD - Part 8: Observability
2025-10-02 f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments (You are currently reading this)
2025-07-14 f3s: Kubernetes with FreeBSD - Part 6: Storage
2025-05-11 f3s: Kubernetes with FreeBSD - Part 5: WireGuard mesh network
2025-04-05 f3s: Kubernetes with FreeBSD - Part 4: Rocky Linux Bhyve VMs
2025-02-01 f3s: Kubernetes with FreeBSD - Part 3: Protecting from power cuts
2024-12-03 f3s: Kubernetes with FreeBSD - Part 2: Hardware and base installation
2024-11-17 f3s: Kubernetes with FreeBSD - Part 1: Setting the stage
2024-04-01 KISS high-availability with OpenBSD
2024-01-13 One reason why I love OpenBSD
2022-10-30 Installing DTail on OpenBSD
2022-07-30 Let's Encrypt with OpenBSD and Rex
2016-04-09 Jails and ZFS with Puppet on FreeBSD

E-Mail your comments to paul@nospam.buetow.org

Back to the main site
Bash Golf Part 4 https://foo.zone/gemfeed/2025-09-14-bash-golf-part-4.html 2025-09-13T12:04:03+03:00 Paul Buetow aka snonux paul@dev.buetow.org This is the fourth blog post about my Bash Golf series. This series is random Bash tips, tricks, and weirdnesses I have encountered over time.

Bash Golf Part 4



Published at 2025-09-13T12:04:03+03:00

This is the fourth blog post about my Bash Golf series. This series is random Bash tips, tricks, and weirdnesses I have encountered over time.

2021-11-29 Bash Golf Part 1
2022-01-01 Bash Golf Part 2
2023-12-10 Bash Golf Part 3
2025-09-14 Bash Golf Part 4 (You are currently reading this)

    '\       '\        '\        '\                   .  .        |>18>>
      \        \         \         \              .         ' .   |
     O>>      O>>       O>>       O>>         .                 'o |
      \       .\. ..    .\. ..    .\. ..   .                      |
      /\    .  /\     .  /\     .  /\    . .                      |
     / /   .  / /  .'.  / /  .'.  / /  .'    .                    |
jgs^^^^^^^`^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                        Art by Joan Stark, mod. by Paul Buetow

Table of Contents




Split pipelines with tee + process substitution



Sometimes you want to fan out one stream to multiple consumers and still continue the original pipeline. tee plus process substitution does exactly that:

somecommand \
    | tee >(command1) >(command2) \
    | command3

All of command1, command2, and command3 see the output of somecommand. Example:

printf 'a\nb\n' \
    | tee >(sed 's/.*/X:&/; s/$/ :c1/') >(tr a-z A-Z | sed 's/$/ :c2/') \
    | sed 's/$/ :c3/'

Output:

a :c3
b :c3
A :c2 :c3
B :c2 :c3
X:a :c1 :c3
X:b :c1 :c3

This relies on Bash process substitution (>(...)). Make sure your shell is Bash and not a POSIX /bin/sh.

Example (fails under dash/POSIX sh):

/bin/sh -c 'echo hi | tee >(cat)'
# /bin/sh: 1: Syntax error: "(" unexpected

A word of caution about error handling: set -o pipefail only covers the main pipeline stages (somecommand, tee, command3). The exit status of a process substitution is not part of the pipeline status, so a failing side branch is noticed only indirectly, when it exits before reading its input and tee dies with a broken-pipe write error:

Example:

set -o pipefail
printf 'ok\n' | tee >(false) | cat >/dev/null
echo $?   # racy: 0 or 1, depending on whether tee hit the broken pipe

If a side branch's failure matters, capture its result explicitly (e.g. via a status file) instead of relying on pipefail.

Further reading:

Splitting pipelines with tee

Heredocs for remote sessions (and their gotchas)



Heredocs are great to send multiple commands over SSH in a readable way:

ssh "$SSH_USER@$SSH_HOST" <<EOF
    # Go to the work directory
    cd "$WORK_DIR"
  
    # Make a git pull
    git pull
  
    # Export environment variables required for the service to run
    export AUTH_TOKEN="$APP_AUTH_TOKEN"
  
    # Start the service
    docker compose up -d --build
EOF

Tips:

Quoting the delimiter changes interpolation. Use <<'EOF' to avoid local expansion and send the content literally.

Example:

FOO=bar
cat <<'EOF'
$FOO is not expanded here
EOF

Prefer explicit quoting for variables (as above) to avoid surprises. Example (spaces preserved only when quoted):

WORK_DIR="/tmp/my work"
ssh host <<EOF
    cd $WORK_DIR      # may break if unquoted
    cd "$WORK_DIR"   # safe
EOF

Consider set -euo pipefail at the top of the remote block for stricter error handling. Example:

ssh host <<'EOF'
    set -euo pipefail
    false   # causes immediate failure
    echo never
EOF

Indent-friendly variant: use a dash to strip leading tabs in the body:

cat <<-EOF > script.sh
	#!/usr/bin/env bash
	echo "tab-indented content is dedented"
EOF
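One gotcha: <<- strips leading tab characters only; leading spaces stay. A quick self-contained check (piping the heredoc through bash via printf so the literal tab survives):

```shell
# <<- removes leading TABs, not spaces; \t embeds a real tab
printf 'cat <<-EOF\n\tdedented line\nEOF\n' | bash
```

This prints "dedented line" without any indentation; with space-indented body lines, the indentation would be kept.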

Further reading:

Heredoc headaches and fixes

Namespacing and dynamic dispatch with ::



You can emulate simple namespacing by encoding hierarchy in function names. One neat pattern is pseudo-inheritance via a tiny super helper that maps pkg::lang::action to a pkg::base::action default.

#!/usr/bin/env bash
set -euo pipefail

super() {
    local -r fn=${FUNCNAME[1]}
    # Split name on :: and dispatch to base implementation
    local -a parts=( ${fn//::/ } )
    "${parts[0]}::base::${parts[2]}" "$@"
}

foo::base::greet() { echo "base: $@"; }
foo::german::greet()  { super "Guten Tag, $@!"; }
foo::english::greet() { super "Good day,  $@!"; }

for lang in german english; do
    foo::$lang::greet Paul
done

Output:

base: Guten Tag, Paul!
base: Good day,  Paul!

Indirect references with namerefs



declare -n creates a name reference — a variable that points to another variable. It’s cleaner than eval for indirection:

user_name=paul
declare -n ref=user_name
echo "$ref"       # paul
ref=julia
echo "$user_name" # julia

Output:

paul
julia

Namerefs are local to functions when declared with local -n. Requires Bash ≥4.3.
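If a script may run on older systems (macOS ships Bash 3.2), a defensive version check before relying on namerefs looks like this (a sketch; BASH_VERSINFO holds the running shell's version):

```shell
# Namerefs (declare -n / local -n) need Bash 4.3 or newer
if (( BASH_VERSINFO[0] > 4 || (BASH_VERSINFO[0] == 4 && BASH_VERSINFO[1] >= 3) )); then
    echo 'namerefs available'
else
    echo 'Bash too old for namerefs' >&2
fi
```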

You can also construct the target name dynamically:

make_var() {
    local idx=$1; shift
    local name="slot_$idx"
    printf -v "$name" '%s' "$*"   # create variable slot_$idx
}

get_var() {
    local idx=$1
    local -n ref="slot_$idx"      # bind ref to slot_$idx
    printf '%s\n' "$ref"
}

make_var 7 "seven"
get_var 7

Output:

seven

Function declaration forms



All of these work in Bash, but only the first one is POSIX-ish:

foo() { echo foo; }
function foo { echo foo; }
function foo() { echo foo; }

Recommendation: prefer name() { ... } for portability and consistency.
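The portable form also runs under a plain POSIX sh, which the function keyword does not guarantee:

```shell
# name() { ...; } is POSIX; this works in dash, ash, bash, ...
sh -c 'greet() { echo ok; }; greet'
```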

Chaining function calls in conditionals



Functions return a status like commands. You can short-circuit them in conditionals:

deploy_check() { test -f deploy.yaml; }
smoke_test()   { curl -fsS http://localhost/healthz >/dev/null; }

if deploy_check || smoke_test; then
    echo "All good."
else
    echo "Something failed." >&2
fi

You can also compress it golf-style:

deploy_check || smoke_test && echo ok || echo fail >&2
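Be careful with golfed chains, though: they are parsed left to right, not as if/then/else. In particular, the "else" branch of a && b || c also runs when b itself fails:

```shell
# Not an if/else: echo runs because the middle command (false) failed,
# even though the first command succeeded
true && false || echo 'fallback runs even though true succeeded'
```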

Grep, sed, awk quickies



Word match and context: grep -w word file; with context: grep -C3 foo file (same as -A3 -B3). Example:

cat > /tmp/ctx.txt <<EOF
one
foo
two
three
bar
EOF
grep -C1 foo /tmp/ctx.txt

Output:

one
foo
two

Skip a directory while recursing: grep -R --exclude-dir=foo 'bar' /path. Example:

mkdir -p /tmp/golf/foo /tmp/golf/src
printf 'bar\n' > /tmp/golf/src/a.txt
printf 'bar\n' > /tmp/golf/foo/skip.txt
grep -R --exclude-dir=foo 'bar' /tmp/golf

Output:

/tmp/golf/src/a.txt:bar

Insert lines with sed: sed -e '1isomething' -e '3isomething' file. Example:

printf 'A\nB\nC\n' > /tmp/s.txt
sed -e '1iHEAD' -e '3iMID' /tmp/s.txt

Output:

HEAD
A
B
MID
C

Drop last column with awk: awk 'NF{NF-=1};1' file. Example:

printf 'a b c\nx y z\n' > /tmp/t.txt
cat /tmp/t.txt
echo
awk 'NF{NF-=1};1' /tmp/t.txt

Output:

a b c
x y z

a b
x y

Safe xargs with NULs



Avoid breaking on spaces/newlines by pairing find -print0 with xargs -0:

find . -type f -name '*.log' -print0 | xargs -0 rm -f

Example with an item containing a space (items are NUL-delimited):

printf 'a\0b c\0' | xargs -0 -I{} printf '<%s>\n' {}

Output:

<a>
<b c>

Efficient file-to-variable and arrays



Read a whole file into a variable without spawning cat:

cfg=$(<config.ini)

Read lines into an array safely with mapfile (aka readarray):

mapfile -t lines < <(grep -v '^#' config.ini)
printf '%s\n' "${lines[@]}"

Assign formatted strings without a subshell using printf -v:

printf -v msg 'Hello %s, id=%04d' "$USER" 42
echo "$msg"

Output:

Hello paul, id=0042

Read NUL-delimited data (pairs well with -print0):

mapfile -d '' -t files < <(find . -type f -print0)
printf '%s\n' "${files[@]}"

Quick password generator



Pure Bash with /dev/urandom:

LC_ALL=C tr -dc 'A-Za-z0-9_' </dev/urandom | head -c 16; echo

Alternative using openssl:

openssl rand -base64 16 | tr -d '\n' | cut -c1-22

yes for automation



yes streams a string repeatedly; handy for feeding interactive commands or quick load generation:

yes | rm -r large_directory        # auto-confirm
yes n | dangerous-command          # auto-decline
yes anything | head -n1            # prints one line: anything

Forcing true to fail (and vice versa)



You can shadow builtins with functions:

true()  { return 1; }
false() { return 0; }

true  || echo 'true failed'
false && echo 'false succeeded'

# Bypass function with builtin/command
builtin true # returns 0
command true # returns 0

To disable a builtin entirely: enable -n true (re-enable with enable true).
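To drop a shadowing function again, unset -f restores normal command lookup:

```shell
true() { return 1; }
true || echo 'function shadows the builtin'
unset -f true               # remove the function definition
true && echo 'builtin true is back'
```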

Further reading:

Force true to return false

Restricted Bash



bash -r (or rbash) starts a restricted shell that limits potentially dangerous actions, for example:

  • Changing directories (cd).
  • Modifying PATH, SHELL, BASH_ENV, or ENV.
  • Redirecting output.
  • Running commands with / in the name.
  • Using exec.

It’s a coarse sandbox for highly constrained shells; read man bash (RESTRICTED SHELL) for details and caveats.

Example session:

rbash -c 'cd /'            # cd: restricted
rbash -c 'PATH=/tmp'       # PATH: restricted
rbash -c 'echo hi > out'   # redirection: restricted
rbash -c '/bin/echo hi'    # commands with /: restricted
rbash -c 'exec ls'         # exec: restricted

Useless use of cat (and when it’s ok)



Avoid the extra process if a command already reads files or STDIN:

# Prefer
grep -i foo file
<file grep -i foo        # or feed via redirection

# Over
cat file | grep -i foo

But cat is fine for interactive composition, or when you genuinely need to concatenate multiple sources into a single stream. Interactively you often think, "First I need the content, then I do X," and fixing a "useless use of cat" in retrospect is really a waste of time for one-time interactive use:

cat file1 file2 | grep -i foo


Atomic locking with mkdir



Portable advisory locks can be emulated with mkdir because it’s atomic:

lockdir=/tmp/myjob.lock
if mkdir "$lockdir" 2>/dev/null; then
    trap 'rmdir "$lockdir"' EXIT INT TERM
    # critical section
    do_work
else
    echo "Another instance is running" >&2
    exit 1
fi

This works on any POSIX filesystem, because mkdir either creates the directory or fails atomically. The trap removes the lock on normal exit and on INT/TERM; note that a kill -9 can still leave a stale lock behind.
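A self-contained demonstration that the second mkdir fails while the lock is held (using a throwaway temp directory so nothing is left behind):

```shell
tmpdir=$(mktemp -d)
lockdir=$tmpdir/myjob.lock
mkdir "$lockdir" 2>/dev/null && echo 'acquired'
mkdir "$lockdir" 2>/dev/null || echo 'busy: lock already held'
rm -r "$tmpdir"   # clean up the demo directory
```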

Smarter globs and faster find-exec



  • Enable extended globs when useful: shopt -s extglob; then patterns like !(tmp|cache) work.
  • Use -exec ... {} + to batch many paths in fewer process invocations:

find . -name '*.log' -exec gzip -9 {} +

Example for extglob (exclude two dirs from listing):

shopt -s extglob
ls -d -- !(.git|node_modules) 2>/dev/null

E-Mail your comments to paul@nospam.buetow.org :-)

Other related posts are:

2025-09-14 Bash Golf Part 4 (You are currently reading this)
2023-12-10 Bash Golf Part 3
2022-01-01 Bash Golf Part 2
2021-11-29 Bash Golf Part 1
2021-06-05 Gemtexter - One Bash script to rule it all
2021-05-16 Personal Bash coding style guide

Back to the main site
Random Weird Things - Part Ⅲ https://foo.zone/gemfeed/2025-08-15-random-weird-things-iii.html 2025-08-14T23:21:32+03:00 Paul Buetow aka snonux paul@dev.buetow.org Every so often, I come across random, weird, and unexpected things on the internet. It would be neat to share them here from time to time. This is the third run.

Random Weird Things - Part Ⅲ



Published at 2025-08-14T23:21:32+03:00

Every so often, I come across random, weird, and unexpected things on the internet. It would be neat to share them here from time to time. This is the third run.

2024-07-05 Random Weird Things - Part Ⅰ
2025-02-08 Random Weird Things - Part Ⅱ
2025-08-15 Random Weird Things - Part Ⅲ (You are currently reading this)

 /\_/\        /\_/\        /\_/\
( o.o ) WHOA!( o.o ) WHOA!( o.o )
 > ^ <        > ^ <        > ^ <
 /   \  MEOW! /   \  MOEEW!/   \
/_____\      /_____\      /_____\

Table of Contents




21. Doom in TypeScript’s type system



Yes, really. Someone has implemented Doom to run within the TypeScript type system—compile-time madness, but fun to watch.

Doom in the TS type system

TypeScript’s type checker is surprisingly expressive: conditional types, recursion, and template literal types let you encode nontrivial logic that “executes” during compilation. The demo exploits this to build a tiny ray-caster that renders as compiler errors or types. It’s wildly impractical, but a great reminder that enough expressiveness plus recursion tends to drift toward Turing completeness.

Run it in a PDF



22. Doom inside a PDF



Running Doom embedded in a PDF file. No separate binary—just a cursed document.

doompdf

This relies on features like PDF JavaScript and interactive objects, which some viewers still support. Expect mixed results: many modern readers sandbox or disable scripting by default for security. If you try it, use a compatible desktop viewer and be prepared for portability quirks.

23. Linux inside a PDF



Boot a tiny Linux inside a PDF. This rabbit hole goes deep.

linuxpdf

Like the Doom-in-PDF trick, this leans on the PDF runtime to host unconventional logic and rendering. It’s more of an art piece than a daily driver, but it shows how “document” formats can accidentally become platforms. The security posture of PDF viewers varies significantly, so expect inconsistent behaviour across different apps.

24. SQLite loves Tcl



SQLite was initially designed as a Tcl extension and still relies heavily on Tcl today: the amalgamated C source is generated by mksqlite3c.tcl, tests are written in Tcl, and even the documentation is built with it.

Tcl 2017 paper

The famous single-file sqlite3.c is not hand-edited—developers maintain the individual source files plus build scripts that knit everything together deterministically. Their Tcl-centric tooling gives them reproducible builds and a very opinionated workflow. It's a great counterexample to the idea that "serious" projects must standardise on the most popular build stacks.

25. Fossil, “e”, and a Tcl/Tk chat



The SQLite folks use a custom Tcl/Tk editor called “e”, a homegrown VCS (Fossil), and even a Tcl/Tk chat room for development—peak bespoke tooling.

More details in the paper

Fossil bundles source control, tickets, wiki, and a web UI into a single portable binary—no external services required. The “e” editor and chat complete a tight, integrated loop tailored to their team’s needs and constraints. It’s delightfully “boring tech” that has produced one of the most reliable databases on earth.

26. Kubernetes from an Excel spreadsheet



Drive kubectl from an .xlsx file because clusters belong in spreadsheets, apparently.

xlskubectl

Resources are rows; columns map to fields; the tool renders YAML and applies it for you. It’s oddly ergonomic for demos, audits, or letting non‑YAML‑native teammates propose changes. Obviously, be careful—permissions and review gates still matter even if your “IDE” is Excel.

27. SRE means “Sorry…”



An industry joke (or truth?) that SRE (short for Site Reliability Engineer) stands for “Sorry…”.

Anecdotes are a good reminder that failure is inevitable and empathy is essential. The best takeaways are about clear communication, graceful degradation, and blameless postmortems. Laughing helps, but guardrails and good on‑call hygiene help even more.

28. Touch Grass, the app



When screens consume too much, this site/app nudges you to go outside.

Touch grass

It’s simple and playful—sometimes that’s the nudge you need to break doomscroll loops. Treat it like a micro‑ritual: set a reminder, step outside, reset. Your eyes (and nervous system) will thank you.

29. Blogging with the C preprocessor



Use the C preprocessor to assemble a blog. It shouldn’t work this well—and yet.

Macroblog with cpp

Posts are stitched together with #includes and macros, giving you DRY content blocks and repeatable builds. It’s hacky, fast, and delightfully text‑only—perfect for people who think makefiles are a UI. Would I recommend it for everyone? No. Is it charming and effective? Absolutely.

30. Accidentally Turing-complete



A delightful catalogue of systems that unintentionally become Turing-complete.

Accidentally Turing-complete

Give a system conditionals, state, and unbounded composition, and it often crosses the threshold into general computation—whether that was the goal or not. The list includes items such as CSS, regular expression dialects, and even card games. It’s a fun lens for understanding why “just a configuration language” can get complicated fast.

I hope you had some fun. E-Mail your comments to paul@nospam.buetow.org :-)

Back to the main site
Local LLM for Coding with Ollama on macOS https://foo.zone/gemfeed/2025-08-05-local-coding-llm-with-ollama.html 2025-08-04T16:43:39+03:00 Paul Buetow aka snonux paul@dev.buetow.org With all the AI buzz around coding assistants, and being a bit concerned about being dependent on third-party cloud providers here, I decided to explore the capabilities of local large language models (LLMs) using Ollama.

Local LLM for Coding with Ollama on macOS



Published at 2025-08-04T16:43:39+03:00

      [::]
     _|  |_
   /  o  o  \                       |
  |    ∆    |  <-- Ollama          / \
  |  \___/  |                     /   \
   \_______/             LLM --> / 30B \
    |     |                     / Qwen3 \
   /|     |\                   /  Coder  \
  /_|     |_\_________________/ quantised \

Table of Contents




With all the AI buzz around coding assistants, and being a bit concerned about being dependent on third-party cloud providers here, I decided to explore the capabilities of local large language models (LLMs) using Ollama.

Ollama is a powerful tool that brings AI capabilities directly to your local hardware. By running AI models locally, you can enjoy the benefits of intelligent assistance without relying on cloud services. This document outlines my initial setup and experiences with Ollama, with a focus on coding tasks and agentic coding.

https://ollama.com/

Why Local LLMs?



Using local AI models through Ollama offers several advantages:

  • Data Privacy: Keep your code and data completely private by processing everything locally.
  • Cost-Effective: Reduce reliance on expensive cloud API calls.
  • Reliability: Works seamlessly even with spotty internet or offline.
  • Speed: Avoid network latency and enjoy instant responses while coding. That said, I mostly found Ollama slower than commercial LLM providers; this may change as models and hardware evolve.

Hardware Considerations



Running large language models locally is currently limited by consumer hardware capabilities:

  • GPU Memory: Most consumer-grade GPUs (even in 2025) top out at 16–24GB of VRAM, making it challenging to run larger models such as 30B (30 billion) parameter LLMs; the largest open models exceed 100 billion parameters.
  • RAM Constraints: On my MacBook Pro with M3 CPU and 36GB RAM, I chose a 14B model (qwen2.5-coder:14b-instruct) as it represents a practical balance between capability and resource requirements.

For reference, here are some key points about running large LLMs locally:

  • Models larger than 30B: I don't even think about running them locally. One (e.g. from Qwen, DeepSeek or Kimi K2) with several hundred billion parameters could match the "performance" of commercial LLMs (Claude Sonnet 4, etc.). Still, for personal use, the hardware demands are just too high (or temporarily "rent" the hardware via the public cloud?).
  • 30B models: Require at least 48GB of GPU VRAM for full inference without quantisation. Currently only feasible on high-end professional GPUs (or an Apple-silicon Mac with enough unified RAM).
  • 14B models: Can run with 16-24GB GPU memory (VRAM), suitable for consumer-grade hardware (or use a quantised larger model)
  • 7B-13B models: Best fit for mainstream consumer hardware, requiring minimal VRAM and running smoothly on mid-range GPUs, but with limited capabilities compared to larger models and more hallucinations.

The model I'll be mainly using in this blog post (qwen2.5-coder:14b-instruct) is particularly interesting as:

  • instruct: Indicates this is the instruction-tuned variant, optimised for diverse tasks including coding
  • coder: Tells me that this model was trained on a mix of code and text data, making it especially effective for programming assistance

https://ollama.com/library/qwen2.5-coder
https://huggingface.co/Qwen/Qwen2.5-Coder-14B-Instruct

For general thinking tasks, I found deepseek-r1:14b to be useful (in the future, I also want to try other qwen models here). For instance, I utilised deepseek-r1:14b to format this blog post and correct some English errors, demonstrating its effectiveness in natural language processing tasks. Additionally, it has proven invaluable for adding context and enhancing clarity in technical explanations, all while running locally on the MacBook Pro. Admittedly, it was a lot slower than "just using ChatGPT", but still within a minute or so.

https://ollama.com/library/deepseek-r1:14b
https://huggingface.co/deepseek-ai/DeepSeek-R1

A quantised LLM (as mentioned above) is one whose weights have been converted from high-precision representations (typically 16- or 32-bit floating point) to lower-precision formats, such as 8-bit integers. This reduces the overall memory footprint of the model, making it significantly smaller and enabling it to run more efficiently on hardware with limited resources, or allowing higher throughput on GPUs and CPUs. The benefits of quantisation include reduced storage and faster inference times due to simpler computations and better memory bandwidth utilisation. However, quantisation can introduce a drop in model accuracy because the lower numerical precision means the model cannot represent parameter values as precisely. In some cases, it may lead to instability or unexpected outputs in specific tasks or edge cases.
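The savings are easy to ballpark: memory ≈ parameter count × bytes per weight. A rough back-of-the-envelope calculation for a 14B model (the bytes-per-weight figures are approximations and ignore activation and KV-cache overhead):

```shell
# fp16 = 2 bytes/param, int8 = 1 byte/param, 4-bit ≈ 0.5 bytes/param
awk 'BEGIN {
    p = 14e9   # 14 billion parameters
    printf "fp16: ~%.0f GiB, int8: ~%.0f GiB, 4-bit: ~%.0f GiB\n",
           p*2/2^30, p*1/2^30, p*0.5/2^30
}'
```

This shows why a 4-bit quantised 30B model can fit where an unquantised one cannot.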

Basic Setup and Manual Code Prompting



Installing Ollama and a Model



To install Ollama, I performed these steps (this assumes that you have already installed Homebrew on your macOS system):

brew install ollama
rehash
ollama serve

This started up the Ollama server with output like this (the screenshot already shows some requests that were made):

Ollama serving

And then, in a new terminal, I pulled the model with:

ollama pull qwen2.5-coder:14b-instruct

I was ready to go, and it wasn't so difficult! Now, let's see how I used this model for coding tasks.

Example Usage



I ran the following command to get a Go function for calculating Fibonacci numbers:

time echo "Write a function in golang to print out the Nth fibonacci number, \
  only the function without the boilerplate" | ollama run qwen2.5-coder:14b-instruct

Output:

func fibonacci(n int) int {
    if n <= 1 {
        return n
    }
    a, b := 0, 1
    for i := 2; i <= n; i++ {
        a, b = b, a+b
    }
    return b
}

Execution Metrics:

Executed in    4.90 secs      fish           external
   usr time   15.54 millis    0.31 millis   15.24 millis
   sys time   19.68 millis    1.02 millis   18.66 millis

Note: after writing this blog post, I tried the same with the newer model qwen3-coder:30b-a3b-q4_K_M (a quantised 30B model which had "just" come out), and it was much faster:

Executed in    1.83 secs      fish           external
   usr time   17.82 millis    4.40 millis   13.42 millis
   sys time   17.07 millis    1.57 millis   15.50 millis

https://ollama.com/library/qwen3-coder:30b-a3b-q4_K_M

Agentic Coding with Aider



Installation



Aider is a tool that enables agentic coding by leveraging AI models, including local ones as in our case. While setting up OpenAI Codex and OpenCode with Ollama proved challenging (those tools either didn't know how to work with the "tools", i.e. the capability to execute external commands or to edit files, or didn't connect to Ollama at all for some reason), Aider worked smoothly.

To get started, the only thing I had to do was to install it via Homebrew, initialise a Git repository, and then start Aider with the Ollama model ollama_chat/qwen2.5-coder:14b-instruct:

brew install aider
mkdir -p ~/git/aitest && cd ~/git/aitest && git init
aider --model ollama_chat/qwen2.5-coder:14b-instruct

https://aider.chat
https://opencode.ai
https://github.com/openai/codex

Agentic coding prompt



This is the prompt I gave:

Create a Go project with these files:

* `cmd/aitest/main.go`: CLI entry point
* `internal/version.go`: Version information (0.0.0), should be printed when the
   program was started with `-version` flag
* `internal/count.go`: File counting functionality, the program should print out
   the number of files in a given subdirectory (the directory is provided as a
   command line flag with `-dir`), if none flag is given, no counting should be
   done
* `README.md`: Installation and usage instructions

It then generated something, but the result did not work out of the box, as it had some issues with imports and package names. So I had to send some follow-up prompts like this to fix those issues:

* Update import paths to match module name, github.com/yourname/aitest should be
  aitest in main.go
* The package names of internal/count.go and internal/version.go should be
  internal, and not count and version.

Aider fixing the packages

Compilation & Execution



Once that was done, the project was ready and I could compile and run it:

go build cmd/aitest/main.go
./main -v
0.0.0
./main -dir .
Number of files in directory .: 4

The code



The code it generated was simple, but functional. The ./cmd/aitest/main.go file:

package main

import (
	"flag"
	"fmt"
	"os"

	"aitest/internal"
)

func main() {
	var versionFlag bool
	flag.BoolVar(&versionFlag, "v", false, "print version")
	dir := flag.String("dir", "", "directory to count files in")
	flag.Parse()

	if versionFlag {
		fmt.Println(internal.GetVersion())
		return
	}

	if *dir != "" {
		fileCount, err := internal.CountFiles(*dir)
		if err != nil {
			fmt.Fprintf(os.Stderr, "Error counting files: %v\n", err)
			os.Exit(1)
		}
		fmt.Printf("Number of files in directory %s: %d\n", *dir, fileCount)
	} else {
		fmt.Println("No directory specified. No count given.")
	}
}

The ./internal/version.go file:

package internal

var Version = "0.0.0"

func GetVersion() string {
	return Version
}

The ./internal/count.go file:

package internal

import (
	"os"
)

func CountFiles(dir string) (int, error) {
	files, err := os.ReadDir(dir)
	if err != nil {
		return 0, err
	}

	count := 0
	for _, file := range files {
		if !file.IsDir() {
			count++
		}
	}

	return count, nil
}

The code is quite straightforward. Especially for generating boilerplate code like this, it will be useful for many use cases!

In-Editor Code Completion



To leverage Ollama for real-time code completion in my editor, I have integrated it with Helix, my preferred text editor. Helix supports the LSP (Language Server Protocol), which enables advanced code completion features. The lsp-ai is an LSP server that can interface with Ollama models for code completion tasks.

https://helix-editor.com
https://github.com/SilasMarvin/lsp-ai

Installation of lsp-ai



I installed lsp-ai via Rust's Cargo package manager. (If you don't have Rust installed, you can install it via Homebrew as well.):

cargo install lsp-ai

Helix Configuration



I edited ~/.config/helix/languages.toml to include:

[[language]]
name = "go"
auto-format= true
diagnostic-severity = "hint"
formatter = { command = "goimports" }
language-servers = [ "gopls", "golangci-lint-lsp", "lsp-ai", "gpt" ]

Note that there is also a gpt language server configured, which is for GitHub Copilot, but that is out of scope for this blog post. Let's also configure the lsp-ai settings in the same file:

[language-server.lsp-ai]
command = "lsp-ai"

[language-server.lsp-ai.config.memory]
file_store = { }

[language-server.lsp-ai.config.models.model1]
type = "ollama"
model =  "qwen2.5-coder"

[language-server.lsp-ai.config.models.model2]
type = "ollama"
model = "mistral-nemo:latest"

[language-server.lsp-ai.config.models.model3]
type = "ollama"
model = "deepseek-r1:14b"

[language-server.lsp-ai.config.completion]
model = "model1"

[language-server.lsp-ai.config.completion.parameters]
max_tokens = 64
max_context = 8096

## Configure the messages per your needs
[[language-server.lsp-ai.config.completion.parameters.messages]]
role = "system"
content = "Instructions:\n- You are an AI programming assistant.\n- Given a
piece of code with the cursor location marked by \"<CURSOR>\", replace
\"<CURSOR>\" with the correct code or comment.\n- First, think step-by-step.\n
- Describe your plan for what to build in pseudocode, written out in great
detail.\n- Then output the code replacing the \"<CURSOR>\"\n- Ensure that your
completion fits within the language context of the provided code snippet (e.g.,
Go, Ruby, Bash, Java, Puppet DSL).\n\nRules:\n- Only respond with code or
comments.\n- Only replace \"<CURSOR>\"; do not include any previously written
code.\n- Never include \"<CURSOR>\" in your response\n- If the cursor is within
a comment, complete the comment meaningfully.\n- Handle ambiguous cases by
providing the most contextually appropriate completion.\n- Be consistent with
your responses."

[[language-server.lsp-ai.config.completion.parameters.messages]]
role = "user"
content = "func greet(name) {\n    print(f\"Hello, {<CURSOR>}\")\n}"

[[language-server.lsp-ai.config.completion.parameters.messages]]
role = "assistant"
content = "name"

[[language-server.lsp-ai.config.completion.parameters.messages]]
role = "user"
content = "func sum(a, b) {\n    return a + <CURSOR>\n}"

[[language-server.lsp-ai.config.completion.parameters.messages]]
role = "assistant"
content = "b"

[[language-server.lsp-ai.config.completion.parameters.messages]]
role = "user"
content = "func multiply(a, b int ) int {\n    a * <CURSOR>\n}"

[[language-server.lsp-ai.config.completion.parameters.messages]]
role = "assistant"
content = "b"

[[language-server.lsp-ai.config.completion.parameters.messages]]
role = "user"
content = "// <CURSOR>\nfunc add(a, b) {\n    return a + b\n}"

[[language-server.lsp-ai.config.completion.parameters.messages]]
role = "assistant"
content = "Adds two numbers"

[[language-server.lsp-ai.config.completion.parameters.messages]]
role = "user"
content = "// This function checks if a number is even\n<CURSOR>"

[[language-server.lsp-ai.config.completion.parameters.messages]]
role = "assistant"
content = "func is_even(n) {\n    return n % 2 == 0\n}"

[[language-server.lsp-ai.config.completion.parameters.messages]]
role = "user"
content = "{CODE}"

As you can see, I have also added other models, such as Mistral Nemo and DeepSeek R1, so that I can switch between them in Helix. Other than that, the completion parameters are interesting. They define how the LLM should interact with the text in the text editor based on the given examples.

If you want to see more lsp-ai configuration examples, there are some for Vim and Helix in the lsp-ai Git repository!

Code completion in action



The screenshot shows how Ollama's qwen2.5-coder model provides code completion suggestions within the Helix editor. LSP auto-completion is triggered by leaving the cursor at position <CURSOR> for a short period in the code snippet, and Ollama responds with relevant completions based on the context.

Completing the fib-function

In the LSP auto-completion, the one prefixed with ai - was generated by qwen2.5-coder, the other ones are from other LSP servers (GitHub Copilot, Go linter, Go language server, etc.).

I found GitHub Copilot to still be faster than qwen2.5-coder:14b, but the local LLM is already workable for me. And, as mentioned earlier, local models will likely improve. So I am excited about the future of local LLMs and coding tools like Ollama and Helix.

After trying qwen3-coder:30b-a3b-q4_K_M (following the publication of this blog post), I found it to be significantly faster and more capable than the previous model, making it a promising option for local coding tasks. Experimentation reveals that even current local setups are surprisingly effective for routine coding tasks, offering a glimpse into the future of on-machine AI assistance.

Conclusion



Will there ever be a time when we can run larger models (60B, 100B, ... and larger) on consumer hardware, or even on our phones? We are not quite there yet, but I am optimistic that we will see improvements in the next few years. As hardware capabilities improve and/or become cheaper, and as more efficient models are developed (or new techniques are invented to make language models more effective), the landscape of local AI coding assistants will continue to evolve.

For now, even the models listed in this blog post are very promising already, and they run on consumer-grade hardware (at least in the realm of the initial tests I've performed... the ones in this blog post are overly simplistic, though! But they were good for getting started with Ollama and initial demonstration)! I will continue experimenting with Ollama and other local LLMs to see how they can enhance my coding experience. I may cancel my Copilot subscription, which I currently use only for in-editor auto-completion, at some point.

However, truth be told, I don't think the setup described in this blog post currently matches the performance of commercial models like Claude Code (Sonnet 4, Opus 4), Gemini 2.5 Pro, the OpenAI models and others. Maybe we could get close if we had the high-end hardware needed to run the largest Qwen Coder model available. But, as mentioned already, that is out of reach for occasional coders like me. Furthermore, I want to continue coding manually to some degree, as otherwise I will start to forget how to write for-loops, which would be awkward... However, do we always need the best model when AI can help generate boilerplate or repetitive tasks even with smaller models?

E-Mail your comments to paul@nospam.buetow.org :-)

Other related posts are:

2025-08-05 Local LLM for Coding with Ollama on macOS (You are currently reading this)
2025-06-22 Task Samurai: An agentic coding learning experiment

Back to the main site
f3s: Kubernetes with FreeBSD - Part 6: Storage https://foo.zone/gemfeed/2025-07-14-f3s-kubernetes-with-freebsd-part-6.html 2025-07-13T16:44:29+03:00, last updated Tue 27 Jan 10:09:08 EET 2026 Paul Buetow aka snonux paul@dev.buetow.org This is the sixth blog post about the f3s series for self-hosting demands in a home lab. f3s? The 'f' stands for FreeBSD, and the '3s' stands for k3s, the Kubernetes distribution used on FreeBSD-based physical machines.

f3s: Kubernetes with FreeBSD - Part 6: Storage



Published at 2025-07-13T16:44:29+03:00, last updated Tue 27 Jan 10:09:08 EET 2026

This is the sixth blog post about the f3s series for self-hosting demands in a home lab. f3s? The "f" stands for FreeBSD, and the "3s" stands for k3s, the Kubernetes distribution used on FreeBSD-based physical machines.

2024-11-17 f3s: Kubernetes with FreeBSD - Part 1: Setting the stage
2024-12-03 f3s: Kubernetes with FreeBSD - Part 2: Hardware and base installation
2025-02-01 f3s: Kubernetes with FreeBSD - Part 3: Protecting from power cuts
2025-04-05 f3s: Kubernetes with FreeBSD - Part 4: Rocky Linux Bhyve VMs
2025-05-11 f3s: Kubernetes with FreeBSD - Part 5: WireGuard mesh network
2025-07-14 f3s: Kubernetes with FreeBSD - Part 6: Storage (You are currently reading this)
2025-10-02 f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments
2025-12-07 f3s: Kubernetes with FreeBSD - Part 8: Observability

f3s logo

Table of Contents




Introduction



In the previous posts, we set up a WireGuard mesh network. In the future, we will also set up a Kubernetes cluster. Kubernetes workloads often require persistent storage for databases, configuration files, and application data. Local storage on each node has significant limitations:

  • No data sharing: Pods (once we run Kubernetes) on different nodes can't access the same data
  • Pod mobility: If a pod moves to another node, it loses access to its data
  • No redundancy: Hardware failure means data loss

This post implements a robust storage solution using:

  • CARP: For high availability with automatic IP failover
  • NFS over stunnel: For secure, encrypted network storage
  • ZFS: For data integrity, encryption, and efficient snapshots
  • zrepl: For continuous ZFS replication between nodes

The result is a highly available, encrypted storage system that survives node failures while providing shared storage to all Kubernetes pods.

Contrary to what was mentioned in the first post of this blog series, we aren't using HAST but zrepl for data replication. Read more about that later in this post.

Additional storage capacity



We add 1 TB of additional storage to each of the nodes (f0, f1, f2) in the form of an SSD drive. The Beelink mini PCs have enough room in the chassis for the extra drive.



Upgrading the storage was as easy as unscrewing, plugging the drive in, and then screwing it back together again. The procedure was uneventful! We're using two different SSD models (Samsung 870 EVO and Crucial BX500) to avoid simultaneous failures from the same manufacturing batch.

We then create the zdata ZFS pool on all three nodes:

paul@f0:~ % doas zpool create -m /data zdata /dev/ada1
paul@f0:~ % zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
zdata   928G  12.1M   928G        -         -     0%     0%  1.00x    ONLINE  -
zroot   472G  29.0G   443G        -         -     0%     6%  1.00x    ONLINE  -

paul@f0:/ % doas camcontrol devlist
<512GB SSD D910R170>               at scbus0 target 0 lun 0 (pass0,ada0)
<Samsung SSD 870 EVO 1TB SVT03B6Q>  at scbus1 target 0 lun 0 (pass1,ada1)
paul@f0:/ %

To verify that we have a different SSD on the second node (the third node has the same drive as the first):

paul@f1:/ % doas camcontrol devlist
<512GB SSD D910R170>               at scbus0 target 0 lun 0 (pass0,ada0)
<CT1000BX500SSD1 M6CR072>          at scbus1 target 0 lun 0 (pass1,ada1)

ZFS encryption keys



ZFS native encryption requires encryption keys to unlock datasets. We need a secure method to store these keys that balances security with operational needs:

  • Security: Keys must not be stored on the same disks they encrypt
  • Availability: Keys must be available at boot for automatic mounting
  • Portability: Keys should be easily moved between systems for recovery

Using USB flash drives as hardware key storage provides a convenient and elegant solution. The encrypted data is unreadable without physical access to the USB key, protecting against disk theft or improper disposal. In production environments, you may use enterprise key management systems; however, for a home lab, USB keys offer good security with minimal complexity.

UFS on USB keys



We'll format the USB drives with UFS (Unix File System) rather than ZFS, which would be overkill for a handful of small key files.

Let's see the USB keys:

USB keys

To verify that the USB key (flash disk) is there:

paul@f0:/ % doas camcontrol devlist
<512GB SSD D910R170>               at scbus0 target 0 lun 0 (pass0,ada0)
<Samsung SSD 870 EVO 1TB SVT03B6Q>  at scbus1 target 0 lun 0 (pass1,ada1)
<Generic Flash Disk 8.07>          at scbus2 target 0 lun 0 (da0,pass2)
paul@f0:/ %

Let's create the UFS file system and mount it (done on all three nodes f0, f1 and f2):

paul@f0:/ % doas newfs /dev/da0
/dev/da0: 15000.0MB (30720000 sectors) block size 32768, fragment size 4096
        using 24 cylinder groups of 625.22MB, 20007 blks, 80128 inodes.
        with soft updates
super-block backups (for fsck_ffs -b #) at:
 192, 1280640, 2561088, 3841536, 5121984, 6402432, 7682880, 8963328, 10243776,
11524224, 12804672, 14085120, 15365568, 16646016, 17926464, 19206912, 20487360,
...

paul@f0:/ % echo '/dev/da0 /keys ufs rw 0 2' | doas tee -a /etc/fstab
/dev/da0 /keys ufs rw 0 2
paul@f0:/ % doas mkdir /keys
paul@f0:/ % doas mount /keys
paul@f0:/ % df | grep keys
/dev/da0             14877596       8  13687384     0%    /keys

USB keys stuck in

Generating encryption keys



The following keys will later be used to encrypt the ZFS file systems. They will be stored on all three nodes, serving as a backup in case one of the keys is lost or corrupted. When we later replicate encrypted ZFS volumes from one node to another, the keys must also be available on the destination node.

paul@f0:/keys % doas openssl rand -out /keys/f0.lan.buetow.org:bhyve.key 32
paul@f0:/keys % doas openssl rand -out /keys/f1.lan.buetow.org:bhyve.key 32
paul@f0:/keys % doas openssl rand -out /keys/f2.lan.buetow.org:bhyve.key 32
paul@f0:/keys % doas openssl rand -out /keys/f0.lan.buetow.org:zdata.key 32
paul@f0:/keys % doas openssl rand -out /keys/f1.lan.buetow.org:zdata.key 32
paul@f0:/keys % doas openssl rand -out /keys/f2.lan.buetow.org:zdata.key 32
paul@f0:/keys % doas chown root *
paul@f0:/keys % doas chmod 400 *

paul@f0:/keys % ls -l
total 20
-r--------  1 root wheel 32 May 25 13:07 f0.lan.buetow.org:bhyve.key
-r--------  1 root wheel 32 May 25 13:07 f1.lan.buetow.org:bhyve.key
-r--------  1 root wheel 32 May 25 13:07 f2.lan.buetow.org:bhyve.key
-r--------  1 root wheel 32 May 25 13:07 f0.lan.buetow.org:zdata.key
-r--------  1 root wheel 32 May 25 13:07 f1.lan.buetow.org:zdata.key
-r--------  1 root wheel 32 May 25 13:07 f2.lan.buetow.org:zdata.key

After creation, these are copied to the other two nodes, f1 and f2, into the /keys partition (I won't provide the commands here; create a tarball, copy it over, and extract it on the destination nodes).
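One way to do that copy, sketched here on the assumption that plain scp between the nodes is available (paths and user names as used elsewhere in this post):

```shell
# On f0: bundle the keys, preserving permissions and ownership (-p).
doas tar -C /keys -cpf /tmp/keys.tar .

# Copy the tarball over to f1 and f2.
scp /tmp/keys.tar paul@f1:/tmp/
scp /tmp/keys.tar paul@f2:/tmp/

# On f1 and f2: extract into the mounted /keys partition and clean up.
doas tar -C /keys -xpf /tmp/keys.tar
rm /tmp/keys.tar
```

The tarball contains the raw keys, so make sure it is removed from /tmp on all nodes afterwards.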

Configuring zdata ZFS pool encryption



Let's encrypt our zdata ZFS pool. We are not encrypting the whole pool, but everything within the zdata/enc data set:

paul@f0:/keys % doas zfs create -o encryption=on -o keyformat=raw -o \
  keylocation=file:///keys/`hostname`:zdata.key zdata/enc
paul@f0:/ % zfs list | grep zdata
zdata                                          836K   899G    96K  /data
zdata/enc                                      200K   899G   200K  /data/enc

paul@f0:/keys % zfs get all zdata/enc | grep -E -i '(encryption|key)'
zdata/enc  encryption            aes-256-gcm                               -
zdata/enc  keylocation           file:///keys/f0.lan.buetow.org:zdata.key  local
zdata/enc  keyformat             raw                                       -
zdata/enc  encryptionroot        zdata/enc                                 -
zdata/enc  keystatus             available                                 -

All future data sets within zdata/enc will inherit the same encryption key.
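A quick way to convince yourself of that inheritance (the child dataset name zdata/enc/example below is made up for illustration):

```shell
# Create a child dataset without passing any encryption options...
doas zfs create zdata/enc/example

# ...and check that it inherits encryption from zdata/enc:
zfs get encryption,encryptionroot,keystatus zdata/enc/example
# Expect encryption=aes-256-gcm and encryptionroot=zdata/enc.

# Clean up the test dataset again.
doas zfs destroy zdata/enc/example
```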

Migrating Bhyve VMs to an encrypted bhyve ZFS volume



We set up Bhyve VMs in a previous blog post. Their ZFS data sets rely on zroot, which is the default ZFS pool on the internal 512GB NVME drive. They aren't encrypted yet, so we encrypt the VM data sets as well now. To do so, we first shut down the VMs on all three nodes:

paul@f0:/keys % doas vm stop rocky
Sending ACPI shutdown to rocky

paul@f0:/keys % doas vm list
NAME     DATASTORE  LOADER     CPU  MEMORY  VNC  AUTO     STATE
rocky    default    uefi       4    14G     -    Yes [1]  Stopped

After this, we rename the unencrypted data set to _old, create a new encrypted data set, and also snapshot it as @hamburger.

paul@f0:/keys % doas zfs rename zroot/bhyve zroot/bhyve_old
paul@f0:/keys % doas zfs set mountpoint=/mnt zroot/bhyve_old
paul@f0:/keys % doas zfs snapshot zroot/bhyve_old/rocky@hamburger

paul@f0:/keys % doas zfs create -o encryption=on -o keyformat=raw -o \
  keylocation=file:///keys/`hostname`:bhyve.key zroot/bhyve
paul@f0:/keys % doas zfs set mountpoint=/zroot/bhyve zroot/bhyve
paul@f0:/keys % doas zfs set mountpoint=/zroot/bhyve/rocky zroot/bhyve/rocky

Once done, we import the snapshot into the encrypted dataset and also copy some other metadata files from vm-bhyve back over.

paul@f0:/keys % doas zfs send zroot/bhyve_old/rocky@hamburger | \
  doas zfs recv zroot/bhyve/rocky
paul@f0:/keys % doas cp -Rp /mnt/.config /zroot/bhyve/
paul@f0:/keys % doas cp -Rp /mnt/.img /zroot/bhyve/
paul@f0:/keys % doas cp -Rp /mnt/.templates /zroot/bhyve/
paul@f0:/keys % doas cp -Rp /mnt/.iso /zroot/bhyve/

We also have to make encrypted ZFS data sets mount automatically on boot:

paul@f0:/keys % doas sysrc zfskeys_enable=YES
zfskeys_enable:  -> YES
paul@f0:/keys % doas vm init
paul@f0:/keys % doas reboot
.
.
.
paul@f0:~ % doas vm list
NAME     DATASTORE  LOADER     CPU  MEMORY  VNC           AUTO     STATE
rocky    default    uefi       4    14G     0.0.0.0:5900  Yes [1]  Running (2265)

As you can see, the VM is running. This means the encrypted zroot/bhyve was mounted successfully after the reboot! Now we can destroy the old, unencrypted, and now unused bhyve dataset:

paul@f0:~ % doas zfs destroy -R zroot/bhyve_old

To verify once again that zroot/bhyve and zroot/bhyve/rocky are now both encrypted, we run:

paul@f0:~ % zfs get all zroot/bhyve | grep -E '(encryption|key)'
zroot/bhyve  encryption            aes-256-gcm                               -
zroot/bhyve  keylocation           file:///keys/f0.lan.buetow.org:bhyve.key  local
zroot/bhyve  keyformat             raw                                       -
zroot/bhyve  encryptionroot        zroot/bhyve                               -
zroot/bhyve  keystatus             available                                 -

paul@f0:~ % zfs get all zroot/bhyve/rocky | grep -E '(encryption|key)'
zroot/bhyve/rocky  encryption            aes-256-gcm            -
zroot/bhyve/rocky  keylocation           none                   default
zroot/bhyve/rocky  keyformat             raw                    -
zroot/bhyve/rocky  encryptionroot        zroot/bhyve            -
zroot/bhyve/rocky  keystatus             available              -

ZFS Replication with zrepl



Data replication is the cornerstone of high availability. While CARP handles IP failover (see later in this post), we need continuous data replication to ensure the backup server has current data when it becomes active. Without replication, failover would result in data loss or require shared storage (like iSCSI), which introduces a single point of failure.

Understanding Replication Requirements



Our storage system has different replication needs:

  • NFS data (/data/nfs/k3svolumes): Soon, it will contain active Kubernetes persistent volumes. Needs frequent replication (every minute) to minimise data loss during failover.
  • VM data (/zroot/bhyve/freebsd): Contains VM images that change less frequently. Can tolerate longer replication intervals (every 10 minutes).

The 1-minute replication window is perfectly acceptable for my personal use cases. This isn't a high-frequency trading system or a real-time database—it's storage for personal projects, development work, and home lab experiments. Losing at most 1 minute of work in a disaster scenario is a reasonable trade-off for the reliability and simplicity of snapshot-based replication. Additionally, in the case of a "1 minute of data loss," I would likely still have the data available on the client side.

Why use zrepl instead of HAST? While HAST (Highly Available Storage) is FreeBSD's native solution for high-availability storage and supports synchronous replication—thus eliminating the mentioned 1-minute window—I've chosen zrepl for several important reasons:

  • HAST can cause ZFS corruption: HAST operates at the block level and doesn't understand ZFS's transactional semantics. During failover, in-flight transactions can lead to corrupted zpools. I've experienced this firsthand (though I'm fairly sure I had misconfigured something): the automatic failover would trigger while ZFS was still writing, resulting in an unmountable pool.
  • ZFS-aware replication: zrepl understands ZFS datasets and snapshots. It replicates at the dataset level, ensuring each snapshot is a consistent point-in-time copy. This is fundamentally safer than block-level replication.
  • Snapshot history: With zrepl, you get multiple recovery points (every minute for NFS data in our setup). If corruption occurs, you can roll back to any previous snapshot. HAST only gives you the current state.
  • Easier recovery: When something goes wrong with zrepl, you still have intact snapshots on both sides. With HAST, a corrupted primary often means a corrupted secondary as well.

FreeBSD HAST

Installing zrepl



First, install zrepl on both hosts involved (we will replicate data from f0 to f1):

paul@f0:~ % doas pkg install -y zrepl

Then, we verify the pools and datasets on both hosts:

# On f0
paul@f0:~ % doas zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
zdata   928G  1.03M   928G        -         -     0%     0%  1.00x    ONLINE  -
zroot   472G  26.7G   445G        -         -     0%     5%  1.00x    ONLINE  -

paul@f0:~ % doas zfs list -r zdata/enc
NAME        USED  AVAIL  REFER  MOUNTPOINT
zdata/enc   200K   899G   200K  /data/enc

# On f1
paul@f1:~ % doas zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
zdata   928G   956K   928G        -         -     0%     0%  1.00x    ONLINE  -
zroot   472G  11.7G   460G        -         -     0%     2%  1.00x    ONLINE  -

paul@f1:~ % doas zfs list -r zdata/enc
NAME        USED  AVAIL  REFER  MOUNTPOINT
zdata/enc   200K   899G   200K  /data/enc

Since we have a WireGuard tunnel between f0 and f1, we'll use TCP transport over the secure tunnel instead of SSH. First, check the WireGuard IP addresses:

# Check WireGuard interface IPs
paul@f0:~ % ifconfig wg0 | grep inet
	inet 192.168.2.130 netmask 0xffffff00

paul@f1:~ % ifconfig wg0 | grep inet
	inet 192.168.2.131 netmask 0xffffff00

Let's create a dedicated dataset for NFS data that will be replicated:

# Create the nfsdata dataset that will hold all data exposed via NFS
paul@f0:~ % doas zfs create zdata/enc/nfsdata

Afterwards, we create the zrepl configuration on f0:

paul@f0:~ % doas tee /usr/local/etc/zrepl/zrepl.yml <<'EOF'
global:
  logging:
    - type: stdout
      level: info
      format: human

jobs:
  - name: f0_to_f1_nfsdata
    type: push
    connect:
      type: tcp
      address: "192.168.2.131:8888"
    filesystems:
      "zdata/enc/nfsdata": true
    send:
      encrypted: true
    snapshotting:
      type: periodic
      prefix: zrepl_
      interval: 1m
    pruning:
      keep_sender:
        - type: last_n
          count: 10
        - type: grid
          grid: 4x7d | 6x30d
          regex: "^zrepl_.*"
      keep_receiver:
        - type: last_n
          count: 10
        - type: grid
          grid: 4x7d | 6x30d
          regex: "^zrepl_.*"

  - name: f0_to_f1_freebsd
    type: push
    connect:
      type: tcp
      address: "192.168.2.131:8888"
    filesystems:
      "zroot/bhyve/freebsd": true
    send:
      encrypted: true
    snapshotting:
      type: periodic
      prefix: zrepl_
      interval: 10m
    pruning:
      keep_sender:
        - type: last_n
          count: 10
        - type: grid
          grid: 4x7d
          regex: "^zrepl_.*"
      keep_receiver:
        - type: last_n
          count: 10
        - type: grid
          grid: 4x7d
          regex: "^zrepl_.*"
EOF

We're using two separate replication jobs with different intervals:

  • f0_to_f1_nfsdata: Replicates NFS data every minute for faster failover recovery
  • f0_to_f1_freebsd: Replicates FreeBSD VM every ten minutes (less critical)

The FreeBSD VM is only used for development purposes, so it doesn't require as frequent replication as the NFS data. It's off-topic for this blog series, but it showcases zrepl's flexibility in handling different datasets with varying replication needs.

Furthermore:

  • We're specifically replicating zdata/enc/nfsdata instead of the entire zdata/enc dataset. This dedicated dataset will contain all the data we later want to expose via NFS, keeping a clear separation between replicated NFS data and other local encrypted data.
  • We use send: encrypted: true to keep the replication stream encrypted. While WireGuard already encrypts in transit, this provides additional protection. For reduced CPU overhead, you could set encrypted: false since the tunnel is secure.

Configuring zrepl on f1 (sink)



On f1 (the sink, meaning it's the node receiving the replication data), we configure zrepl to receive the data as follows:

# First, create a dedicated sink dataset
paul@f1:~ % doas zfs create zdata/sink

paul@f1:~ % doas tee /usr/local/etc/zrepl/zrepl.yml <<'EOF'
global:
  logging:
    - type: stdout
      level: info
      format: human

jobs:
  - name: sink
    type: sink
    serve:
      type: tcp
      listen: "192.168.2.131:8888"
      clients:
        "192.168.2.130": "f0"
    recv:
      placeholder:
        encryption: inherit
    root_fs: "zdata/sink"
EOF

Enabling and starting zrepl services



We then enable and start zrepl on both hosts via:

# On f0
paul@f0:~ % doas sysrc zrepl_enable=YES
zrepl_enable:  -> YES
paul@f0:~ % doas service zrepl start
Starting zrepl.

# On f1
paul@f1:~ % doas sysrc zrepl_enable=YES
zrepl_enable:  -> YES
paul@f1:~ % doas service zrepl start
Starting zrepl.

To check the replication status, we run:

# On f0, check `zrepl` status (use raw mode for non-tty)
paul@f0:~ % doas pkg install jq
paul@f0:~ % doas zrepl status --mode raw | grep -A2 "Replication" | jq .
"Replication":{"StartAt":"2025-07-01T22:31:48.712143123+03:00"...

# Check if services are running
paul@f0:~ % doas service zrepl status
zrepl is running as pid 2649.

paul@f1:~ % doas service zrepl status
zrepl is running as pid 2574.

# Check for `zrepl` snapshots on source
paul@f0:~ % doas zfs list -t snapshot -r zdata/enc | grep zrepl
zdata/enc@zrepl_20250701_193148_000    0B      -   176K  -

# On f1, verify the replicated datasets  
paul@f1:~ % doas zfs list -r zdata | grep f0
zdata/f0             576K   899G   200K  none
zdata/f0/zdata       376K   899G   200K  none
zdata/f0/zdata/enc   176K   899G   176K  none

# Check replicated snapshots on f1
paul@f1:~ % doas zfs list -t snapshot -r zdata | grep zrepl
zdata/f0/zdata/enc@zrepl_20250701_193148_000     0B      -   176K  -
zdata/f0/zdata/enc@zrepl_20250701_194148_000     0B      -   176K  -
.
.
.

Monitoring replication



You can monitor the replication progress with:

paul@f0:~ % doas zrepl status

zrepl status

With this setup, both zdata/enc/nfsdata and zroot/bhyve/freebsd on f0 will be automatically replicated to f1 every 1 minute (or 10 minutes in the case of the FreeBSD VM), with encrypted snapshots preserved on both sides. The pruning policy ensures that we keep the last 10 snapshots while managing disk space efficiently.
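To sanity-check the pruning policy, you can count the zrepl snapshots on the sender; a small sketch (dataset name as configured above):

```shell
# Count zrepl snapshots of the NFS dataset on f0. With last_n count=10
# plus the grid buckets, this number should level off over time instead
# of growing by one every minute forever.
doas zfs list -H -t snapshot -o name -r zdata/enc/nfsdata | grep -c 'zrepl_'
```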

The replicated data appears on f1 under zdata/sink/ with the source host and dataset hierarchy preserved:

  • zdata/enc/nfsdata → zdata/sink/f0/zdata/enc/nfsdata
  • zroot/bhyve/freebsd → zdata/sink/f0/zroot/bhyve/freebsd

This is by design - zrepl preserves the complete path from the source to ensure there are no conflicts when replicating from multiple sources.

Verifying replication after reboot



The zrepl service is configured to start automatically at boot. After rebooting both hosts:

paul@f0:~ % uptime
11:17PM  up 1 min, 0 users, load averages: 0.16, 0.06, 0.02

paul@f0:~ % doas service zrepl status
zrepl is running as pid 2366.

paul@f1:~ % doas service zrepl status
zrepl is running as pid 2309.

# Check that new snapshots are being created and replicated
paul@f0:~ % doas zfs list -t snapshot | grep zrepl | tail -2
zdata/enc/nfsdata@zrepl_20250701_202530_000                0B      -   200K  -
zroot/bhyve/freebsd@zrepl_20250701_202530_000               0B      -  2.97G  -
.
.
.

paul@f1:~ % doas zfs list -t snapshot -r zdata/sink | grep 202530
zdata/sink/f0/zdata/enc/nfsdata@zrepl_20250701_202530_000      0B      -   176K  -
zdata/sink/f0/zroot/bhyve/freebsd@zrepl_20250701_202530_000     0B      -  2.97G  -
.
.
.

The timestamps confirm that replication resumed automatically after the reboot, ensuring continuous data protection. We can also write a test file to the NFS data directory on f0 and verify whether it appears on f1 after a minute.
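Such a test could look like this (the file name is arbitrary):

```shell
# On f0: write a small marker file into the replicated dataset.
echo "replication test $(date)" | doas tee /data/nfs/repltest.txt

# Wait for at least one 1-minute snapshot cycle, then on f1 verify
# that the file arrived via the read-only standby mount:
cat /data/nfs/repltest.txt
```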

Understanding Failover Limitations and Design Decisions



Our system intentionally fails over to a read-only copy of the replica if the primary fails. This is due to the nature of zrepl, which replicates data in one direction only. If we mounted the dataset read-write on the sink node, the ZFS dataset would diverge from the original and replication would break. It can still be remounted read-write in case of a genuine issue on the primary node, but that step is left intentionally manual, so we don't end up having to repair broken replication afterwards.

So in summary:

  • Split-brain prevention: Automatic failover to a read-write copy can cause both nodes to become active simultaneously if network communication fails. This leads to data divergence that's extremely difficult to resolve.
  • False positive protection: Temporary network issues or high load can trigger unwanted failovers. Manual intervention ensures that failovers occur only when truly necessary.
  • Data integrity over availability: For storage systems, data consistency is paramount. A few minutes of downtime is preferable to data corruption in this specific use case.
  • Simplified recovery: With manual failover, you always know which dataset is authoritative, making recovery more straightforward.
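For completeness, a manual promotion of f1 in a genuine f0 outage could be sketched like this (dataset name as configured above; treat this as an outline, not a tested runbook):

```shell
# On f1, only after confirming f0 is really down:
doas zfs set readonly=off zdata/sink/f0/zdata/enc/nfsdata

# Clients using the CARP VIP now reach f1's copy read-write.
# Once f0 is back, replication has to be re-established manually,
# e.g. by rolling back f1's dataset to the last common snapshot.
```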

Mounting the NFS datasets



To make the NFS data accessible on both nodes, we need to mount it. On f0, this is straightforward:

# On f0 - set mountpoint for the primary nfsdata
paul@f0:~ % doas zfs set mountpoint=/data/nfs zdata/enc/nfsdata
paul@f0:~ % doas mkdir -p /data/nfs

# Verify it's mounted
paul@f0:~ % df -h /data/nfs
Filesystem           Size    Used   Avail Capacity  Mounted on
zdata/enc/nfsdata    899G    204K    899G     0%    /data/nfs

On f1, we need to handle the encryption key and mount the standby copy:

# On f1 - first check encryption status
paul@f1:~ % doas zfs get keystatus zdata/sink/f0/zdata/enc/nfsdata
NAME                             PROPERTY   VALUE        SOURCE
zdata/sink/f0/zdata/enc/nfsdata  keystatus  unavailable  -

# Load the encryption key (using f0's key stored on the USB)
paul@f1:~ % doas zfs load-key -L file:///keys/f0.lan.buetow.org:zdata.key \
    zdata/sink/f0/zdata/enc/nfsdata

# Set mountpoint and mount (same path as f0 for easier failover)
paul@f1:~ % doas mkdir -p /data/nfs
paul@f1:~ % doas zfs set mountpoint=/data/nfs zdata/sink/f0/zdata/enc/nfsdata
paul@f1:~ % doas zfs mount zdata/sink/f0/zdata/enc/nfsdata

# Make it read-only to prevent accidental writes that would break replication
paul@f1:~ % doas zfs set readonly=on zdata/sink/f0/zdata/enc/nfsdata

# Verify
paul@f1:~ % df -h /data/nfs
Filesystem                         Size    Used   Avail Capacity  Mounted on
zdata/sink/f0/zdata/enc/nfsdata    896G    204K    896G     0%    /data/nfs

Note: The dataset is mounted at the same path (/data/nfs) on both hosts to simplify failover procedures. The dataset on f1 is set to readonly=on to prevent accidental modifications, which, as mentioned earlier, would break replication. If we did, replication from f0 to f1 would fail like this:

cannot receive incremental stream: destination zdata/sink/f0/zdata/enc/nfsdata has been modified since most recent snapshot

To fix a broken replication after accidental writes, we can do:

# Option 1: Rollback to the last common snapshot (loses local changes)
paul@f1:~ % doas zfs rollback zdata/sink/f0/zdata/enc/nfsdata@zrepl_20250701_204054_000

# Option 2: Make it read-only to prevent accidents again
paul@f1:~ % doas zfs set readonly=on zdata/sink/f0/zdata/enc/nfsdata

And replication should work again!

Troubleshooting: Files not appearing in replication



If you write files to /data/nfs/ on f0 but they don't appear on f1, first check whether the dataset is actually mounted on f0:

paul@f0:~ % doas zfs list -o name,mountpoint,mounted | grep nfsdata
zdata/enc/nfsdata                             /data/nfs             yes

If it shows no, the dataset isn't mounted! This means files are being written to the root filesystem, not ZFS. Next, we should check whether the encryption key is loaded:

paul@f0:~ % doas zfs get keystatus zdata/enc/nfsdata
NAME               PROPERTY   VALUE        SOURCE
zdata/enc/nfsdata  keystatus  available    -
# If "unavailable", load the key:
paul@f0:~ % doas zfs load-key -L file:///keys/f0.lan.buetow.org:zdata.key zdata/enc/nfsdata
paul@f0:~ % doas zfs mount zdata/enc/nfsdata

You can also verify that files are in the snapshot (not just the directory):

paul@f0:~ % ls -la /data/nfs/.zfs/snapshot/zrepl_*/

This issue commonly occurs after a reboot if the encryption keys aren't configured to load automatically.

Configuring automatic key loading on boot



To ensure all additional encrypted datasets are mounted automatically after reboot as well, we do:

# On f0 - configure all encrypted datasets
paul@f0:~ % doas sysrc zfskeys_enable=YES
zfskeys_enable: YES -> YES
paul@f0:~ % doas sysrc zfskeys_datasets="zdata/enc zdata/enc/nfsdata zroot/bhyve"
zfskeys_datasets:  -> zdata/enc zdata/enc/nfsdata zroot/bhyve

# Set correct key locations for all datasets
paul@f0:~ % doas zfs set \
  keylocation=file:///keys/f0.lan.buetow.org:zdata.key zdata/enc/nfsdata

# On f1 - include the replicated dataset
paul@f1:~ % doas sysrc zfskeys_enable=YES
zfskeys_enable: YES -> YES
paul@f1:~ % doas sysrc \
  zfskeys_datasets="zdata/enc zroot/bhyve zdata/sink/f0/zdata/enc/nfsdata"
zfskeys_datasets:  -> zdata/enc zroot/bhyve zdata/sink/f0/zdata/enc/nfsdata

# Set key location for replicated dataset
paul@f1:~ % doas zfs set \
  keylocation=file:///keys/f0.lan.buetow.org:zdata.key zdata/sink/f0/zdata/enc/nfsdata

Important notes:

  • Each encryption root needs its own key load entry
  • The replicated dataset on f1 uses the same encryption key as the source on f0
  • Always verify datasets are mounted after reboot with zfs list -o name,mounted
  • Critical: Always ensure the replicated dataset on f1 remains read-only with doas zfs set readonly=on zdata/sink/f0/zdata/enc/nfsdata

Troubleshooting: zrepl Replication Not Working



If zrepl replication is not working, here's a systematic approach to diagnose and fix common issues:

Check if zrepl Services are Running



First, verify that zrepl is running on both nodes:

# Check service status on both f0 and f1
paul@f0:~ % doas service zrepl status
paul@f1:~ % doas service zrepl status

# If not running, start the service
paul@f0:~ % doas service zrepl start
paul@f1:~ % doas service zrepl start

Check zrepl Status for Errors



Use the status command to see detailed error information:

# Check detailed status (use --mode raw for non-tty environments)
paul@f0:~ % doas zrepl status --mode raw

# Look for error messages in the replication section
# Common errors include "no common snapshot" or connection failures

Fixing "No Common Snapshot" Errors



This is the most common replication issue, typically occurring when:

  • The receiver has existing snapshots that don't match the sender
  • Different snapshot naming schemes are in use
  • The receiver dataset was created independently

Error message example:
no common snapshot or suitable bookmark between sender and receiver

Solution: Clean up conflicting snapshots on the receiver:

# First, identify the destination dataset on f1
paul@f1:~ % doas zfs list | grep sink

# Check existing snapshots on the problematic dataset
paul@f1:~ % doas zfs list -t snapshot | grep nfsdata

# If you see snapshots with different naming (e.g., @daily-*, @weekly-*)
# these conflict with zrepl's @zrepl_* snapshots

# Destroy the entire destination dataset to allow clean replication
paul@f1:~ % doas zfs destroy -r zdata/sink/f0/zdata/enc/nfsdata

# For VM replication, do the same for the freebsd dataset
paul@f1:~ % doas zfs destroy -r zdata/sink/f0/zroot/bhyve/freebsd

# Wake up zrepl to start fresh replication
paul@f0:~ % doas zrepl signal wakeup f0_to_f1_nfsdata
paul@f0:~ % doas zrepl signal wakeup f0_to_f1_freebsd

# Check replication status
paul@f0:~ % doas zrepl status --mode raw

Verification that replication is working:

# Look for "stepping" state and active zfs send processes
paul@f0:~ % doas zrepl status --mode raw | grep -A5 "State.*stepping"

# Check for active ZFS commands
paul@f0:~ % doas zrepl status --mode raw | grep -A10 "ZFSCmds.*Active"

# Monitor progress - bytes replicated should be increasing
paul@f0:~ % doas zrepl status --mode raw | grep BytesReplicated

Network Connectivity Issues



If replication fails to connect:

# Test connectivity between nodes
paul@f0:~ % nc -zv 192.168.2.131 8888

# Check if zrepl is listening on f1
paul@f1:~ % doas netstat -an | grep 8888

# Verify WireGuard tunnel is working
paul@f0:~ % ping 192.168.2.131

Encryption Key Issues



If encrypted replication fails:

# Verify encryption keys are available on both nodes
paul@f0:~ % doas zfs get keystatus zdata/enc/nfsdata
paul@f1:~ % doas zfs get keystatus zdata/sink/f0/zdata/enc/nfsdata

# Load keys if unavailable
paul@f1:~ % doas zfs load-key -L file:///keys/f0.lan.buetow.org:zdata.key \
    zdata/sink/f0/zdata/enc/nfsdata

Monitoring Ongoing Replication



After fixing issues, monitor replication health:

# Monitor replication progress (run repeatedly to check status)
paul@f0:~ % doas zrepl status --mode raw | grep -A10 BytesReplicated

# Or install watch from ports and use it
paul@f0:~ % doas pkg install watch
paul@f0:~ % watch -n 5 'doas zrepl status --mode raw | grep -A10 BytesReplicated'

# Check for new snapshots being created
paul@f0:~ % doas zfs list -t snapshot | grep zrepl | tail -5

# Verify snapshots appear on receiver
paul@f1:~ % doas zfs list -t snapshot -r zdata/sink | grep zrepl | tail -5

This troubleshooting process resolves the most common zrepl issues and ensures continuous data replication between your storage nodes.

CARP (Common Address Redundancy Protocol)



High availability is crucial for storage systems. If the storage server goes down, all NFS clients (which will also be Kubernetes pods later on in this series) lose access to their persistent data. CARP provides a solution by creating a virtual IP address that automatically migrates to a different server during failures. This means that clients point to that VIP for NFS mounts and are always contacting the current primary node.

How CARP Works



In our case, CARP allows two hosts (f0 and f1) to share a virtual IP address (VIP). The hosts communicate using multicast to elect a MASTER, while the other remains BACKUP. When the MASTER fails, the BACKUP automatically promotes itself and takes over the VIP. This happens within seconds.

Key benefits for our storage system:

  • Automatic failover: No manual intervention is required for basic failures, although there are a few limitations. The backup will have read-only access to the available data by default, as we have already learned.
  • Transparent to clients: Pods continue using the same IP address
  • Works with stunnel: Behind the VIP, there will be a stunnel process running, which ensures encrypted connections follow the active server.

FreeBSD CARP
Stunnel

Configuring CARP



First, we add the CARP configuration to /etc/rc.conf on both f0 and f1:

Update: Sun 4 Jan 00:17:00 EET 2026 - Added advskew 100 to f1 so f0 always wins CARP elections when it comes back online after a reboot.

# On f0 - The virtual IP 192.168.1.138 will float between f0 and f1
ifconfig_re0_alias0="inet vhid 1 pass testpass alias 192.168.1.138/32"

# On f1 - Higher advskew means lower priority, so f0 wins elections
ifconfig_re0_alias0="inet vhid 1 advskew 100 pass testpass alias 192.168.1.138/32"

Where:

  • vhid 1: Virtual Host ID - must match on all CARP members
  • advskew: Advertisement skew - higher value means lower priority (f1 uses 100, f0 uses default 0)
  • pass testpass: Password for CARP authentication (if you follow this, use a different password!)
  • alias 192.168.1.138/32: The virtual IP address with a /32 netmask

Next, update /etc/hosts on all nodes (f0, f1, f2, r0, r1, r2) to resolve the VIP hostname:

192.168.2.138 f3s-storage-ha f3s-storage-ha.wg0 f3s-storage-ha.wg0.wan.buetow.org
fd42:beef:cafe:2::138 f3s-storage-ha f3s-storage-ha.wg0 f3s-storage-ha.wg0.wan.buetow.org

This allows clients to connect to f3s-storage-ha regardless of which physical server is currently the MASTER.

CARP State Change Notifications



To correctly manage services during failover, we need to detect CARP state changes. FreeBSD's devd system can notify us when CARP transitions between MASTER and BACKUP states.

Add this to /etc/devd.conf on both f0 and f1:

paul@f0:~ % cat <<END | doas tee -a /etc/devd.conf
notify 0 {
        match "system"          "CARP";
        match "subsystem"       "[0-9]+@[0-9a-z.]+";
        match "type"            "(MASTER|BACKUP)";
        action "/usr/local/bin/carpcontrol.sh $subsystem $type";
};
END

paul@f0:~ % doas service devd restart

Next, we create the CARP control script that will restart stunnel when the CARP state changes:

Update: Fixed the script at Sat 3 Jan 23:55:11 EET 2026 - changed $1 to $2 because devd passes $subsystem $type, so the state is in the second argument.

paul@f0:~ % doas tee /usr/local/bin/carpcontrol.sh <<'EOF'
#!/bin/sh
# CARP state change control script

case "$2" in
    MASTER)
        logger "CARP state changed to MASTER, starting services"
        ;;
    BACKUP)
        logger "CARP state changed to BACKUP, stopping services"
        ;;
    *)
        logger "CARP state changed to $2 (unhandled)"
        ;;
esac
EOF

paul@f0:~ % doas chmod +x /usr/local/bin/carpcontrol.sh

# Copy the same script to f1
paul@f0:~ % scp /usr/local/bin/carpcontrol.sh f1:/tmp/
paul@f1:~ % doas mv /tmp/carpcontrol.sh /usr/local/bin/
paul@f1:~ % doas chmod +x /usr/local/bin/carpcontrol.sh

Note that carpcontrol.sh doesn't do anything useful yet. We will provide more details (including starting and stopping services upon failover) later in this blog post.
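The devd action expands to carpcontrol.sh $subsystem $type, so the subsystem (e.g. 1@re0) arrives in $1 and the CARP state in $2 - this is exactly what the update above fixed. A standalone sketch of that argument handling (handle_carp_event is a hypothetical stand-in for the real script, and the log lines are illustrative):

```shell
#!/bin/sh
# Simulates how devd invokes the CARP control script:
#   action "/usr/local/bin/carpcontrol.sh $subsystem $type"
# => $1 = subsystem (e.g. "1@re0"), $2 = CARP state (MASTER/BACKUP).
handle_carp_event() {
    case "$2" in
        MASTER) echo "vhid/interface $1: promoting, would start services" ;;
        BACKUP) echo "vhid/interface $1: demoting, would stop services" ;;
        *)      echo "vhid/interface $1: ignoring state $2" ;;
    esac
}

handle_carp_event "1@re0" "MASTER"   # prints: vhid/interface 1@re0: promoting, would start services
```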

To enable CARP in /boot/loader.conf, run:

paul@f0:~ % echo 'carp_load="YES"' | doas tee -a /boot/loader.conf
carp_load="YES"
paul@f1:~ % echo 'carp_load="YES"' | doas tee -a /boot/loader.conf  
carp_load="YES"

Then reboot both hosts or run doas kldload carp to load the module immediately.

NFS Server Configuration



With ZFS replication in place, we can now set up NFS servers on both f0 and f1 to export the replicated data. Since native NFS over TLS (RFC 9289) has compatibility issues between Linux and FreeBSD (not digging into the details here, but I couldn't get it to work), we'll use stunnel to provide encryption.

Setting up NFS on f0 (Primary)



First, enable the NFS services in rc.conf:

paul@f0:~ % doas sysrc nfs_server_enable=YES
nfs_server_enable: YES -> YES
paul@f0:~ % doas sysrc nfsv4_server_enable=YES
nfsv4_server_enable: YES -> YES
paul@f0:~ % doas sysrc nfsuserd_enable=YES
nfsuserd_enable: YES -> YES
paul@f0:~ % doas sysrc nfsuserd_flags="-domain lan.buetow.org"
nfsuserd_flags: "" -> "-domain lan.buetow.org"
paul@f0:~ % doas sysrc mountd_enable=YES
mountd_enable: NO -> YES
paul@f0:~ % doas sysrc rpcbind_enable=YES
rpcbind_enable: NO -> YES

Update: 08.08.2025: I've added the domain to nfsuserd_flags

And we also create a dedicated directory for Kubernetes volumes:

# First, ensure the dataset is mounted
paul@f0:~ % doas zfs get mounted zdata/enc/nfsdata
NAME               PROPERTY  VALUE    SOURCE
zdata/enc/nfsdata  mounted   yes      -

# Create the k3svolumes directory
paul@f0:~ % doas mkdir -p /data/nfs/k3svolumes
paul@f0:~ % doas chmod 755 /data/nfs/k3svolumes

We also create the /etc/exports file. Since we're using stunnel for encryption, ALL clients must connect through stunnel, which appears as localhost (127.0.0.1) to the NFS server:

paul@f0:~ % doas tee /etc/exports <<'EOF'
V4: /data/nfs -sec=sys
/data/nfs -alldirs -maproot=root -network 127.0.0.1 -mask 255.255.255.255
EOF

The exports configuration:

  • V4: /data/nfs -sec=sys: Sets the NFSv4 root directory to /data/nfs
  • -alldirs: Allows clients to mount any subdirectory of the export
  • -maproot=root: Maps the client's root user to root on the server
  • -network 127.0.0.1: Only accepts connections from localhost (stunnel)

To start the NFS services, we run:

paul@f0:~ % doas service rpcbind start
Starting rpcbind.
paul@f0:~ % doas service mountd start
Starting mountd.
paul@f0:~ % doas service nfsd start
Starting nfsd.
paul@f0:~ % doas service nfsuserd start
Starting nfsuserd.

Configuring Stunnel for NFS Encryption with CARP Failover



Using stunnel with client certificate authentication for NFS encryption provides several advantages:

  • Compatibility: Works with any NFS version and between different operating systems
  • Strong encryption: Uses TLS/SSL with configurable cipher suites
  • Transparent: Applications don't need modification, encryption happens at the transport layer
  • Performance: Minimal overhead (~2% in benchmarks)
  • Flexibility: Can encrypt any TCP-based protocol, not just NFS
  • Strong Authentication: Client certificates provide cryptographic proof of identity
  • Access Control: Only clients with valid certificates signed by your CA can connect
  • Certificate Revocation: You can revoke access by removing certificates from the CA

Stunnel integrates seamlessly with our CARP setup:

                    CARP VIP (192.168.1.138)
                           |
    f0 (MASTER) ←---------→|←---------→ f1 (BACKUP)
    stunnel:2323           |           stunnel:stopped
    nfsd:2049              |           nfsd:stopped
                           |
                    Clients connect here

The key insight is that stunnel binds to the CARP VIP. When CARP fails over, the VIP is moved to the new master, and stunnel starts there automatically. Clients maintain their connection to the same IP throughout.
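That behaviour can be illustrated with a small self-contained check. This is a hypothetical helper with made-up ifconfig-style sample input; the real stunnel needs no such helper, its bind to the VIP simply fails on the BACKUP node:

```shell
#!/bin/sh
# Decide whether this host currently owns the CARP VIP, i.e. whether a
# bind to 192.168.1.138:2323 could succeed here.
VIP="192.168.1.138"

owns_vip() {
    # $1: interface configuration text, as "ifconfig re0" might print it
    echo "$1" | grep -q "inet $VIP "
}

# Sample output as it might look on the current MASTER (made up):
SAMPLE="inet 192.168.1.138 netmask 0xffffffff vhid 1"
if owns_vip "$SAMPLE"; then
    echo "VIP present, stunnel can bind and accept clients"
else
    echo "VIP absent, stunnel's bind would fail"
fi
```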

Creating a Certificate Authority for Client Authentication



First, create a CA to sign both server and client certificates:

# On f0 - Create CA
paul@f0:~ % doas mkdir -p /usr/local/etc/stunnel/ca
paul@f0:~ % cd /usr/local/etc/stunnel/ca
paul@f0:~ % doas openssl genrsa -out ca-key.pem 4096
paul@f0:~ % doas openssl req -new -x509 -days 3650 -key ca-key.pem -out ca-cert.pem \
  -subj '/C=US/ST=State/L=City/O=F3S Storage/CN=F3S Stunnel CA'

# Create server certificate
paul@f0:~ % cd /usr/local/etc/stunnel
paul@f0:~ % doas openssl genrsa -out server-key.pem 4096
paul@f0:~ % doas openssl req -new -key server-key.pem -out server.csr \
  -subj '/C=US/ST=State/L=City/O=F3S Storage/CN=f3s-storage-ha.lan'
paul@f0:~ % doas openssl x509 -req -days 3650 -in server.csr -CA ca/ca-cert.pem \
  -CAkey ca/ca-key.pem -CAcreateserial -out server-cert.pem

# Create client certificates for authorised clients
paul@f0:~ % cd /usr/local/etc/stunnel/ca
paul@f0:~ % doas sh -c 'for client in r0 r1 r2 earth; do 
  openssl genrsa -out ${client}-key.pem 4096
  openssl req -new -key ${client}-key.pem -out ${client}.csr \
    -subj "/C=US/ST=State/L=City/O=F3S Storage/CN=${client}.lan.buetow.org"
  openssl x509 -req -days 3650 -in ${client}.csr -CA ca-cert.pem \
    -CAkey ca-key.pem -CAcreateserial -out ${client}-cert.pem
  # Combine cert and key into a single file for stunnel client
  cat ${client}-cert.pem ${client}-key.pem > ${client}-stunnel.pem
done'

Install and Configure Stunnel on f0



# Install stunnel
paul@f0:~ % doas pkg install -y stunnel

# Configure stunnel server with client certificate authentication
paul@f0:~ % doas tee /usr/local/etc/stunnel/stunnel.conf <<'EOF'
cert = /usr/local/etc/stunnel/server-cert.pem
key = /usr/local/etc/stunnel/server-key.pem

setuid = stunnel
setgid = stunnel

[nfs-tls]
accept = 192.168.1.138:2323
connect = 127.0.0.1:2049
CAfile = /usr/local/etc/stunnel/ca/ca-cert.pem
verify = 2
requireCert = yes
EOF

# Enable and start stunnel
paul@f0:~ % doas sysrc stunnel_enable=YES
stunnel_enable:  -> YES
paul@f0:~ % doas service stunnel start
Starting stunnel.

# Restart stunnel to apply the CARP VIP binding
paul@f0:~ % doas service stunnel restart
Stopping stunnel.
Starting stunnel.

The configuration includes:

  • verify = 2: Verify client certificate and fail if not provided
  • requireCert = yes: Client must present a valid certificate
  • CAfile: Path to the CA certificate that signed the client certificates

Setting up NFS on f1 (Standby)



Repeat the same configuration on f1:

paul@f1:~ % doas sysrc nfs_server_enable=YES
nfs_server_enable: NO -> YES
paul@f1:~ % doas sysrc nfsv4_server_enable=YES
nfsv4_server_enable: NO -> YES
paul@f1:~ % doas sysrc nfsuserd_enable=YES
nfsuserd_enable: NO -> YES
paul@f1:~ % doas sysrc mountd_enable=YES
mountd_enable: NO -> YES
paul@f1:~ % doas sysrc rpcbind_enable=YES
rpcbind_enable: NO -> YES

paul@f1:~ % doas tee /etc/exports <<'EOF'
V4: /data/nfs -sec=sys
/data/nfs -alldirs -maproot=root -network 127.0.0.1 -mask 255.255.255.255
EOF

paul@f1:~ % doas service rpcbind start
Starting rpcbind.
paul@f1:~ % doas service mountd start
Starting mountd.
paul@f1:~ % doas service nfsd start
Starting nfsd.
paul@f1:~ % doas service nfsuserd start
Starting nfsuserd.

And to configure stunnel on f1, we run:

# Install stunnel
paul@f1:~ % doas pkg install -y stunnel

# Copy certificates from f0
paul@f0:~ % doas tar -cf /tmp/stunnel-certs.tar \
  -C /usr/local/etc/stunnel server-cert.pem server-key.pem ca
paul@f0:~ % scp /tmp/stunnel-certs.tar f1:/tmp/

paul@f1:~ % cd /usr/local/etc/stunnel && doas tar -xf /tmp/stunnel-certs.tar

# Configure stunnel server on f1 with client certificate authentication
paul@f1:~ % doas tee /usr/local/etc/stunnel/stunnel.conf <<'EOF'
cert = /usr/local/etc/stunnel/server-cert.pem
key = /usr/local/etc/stunnel/server-key.pem

setuid = stunnel
setgid = stunnel

[nfs-tls]
accept = 192.168.1.138:2323
connect = 127.0.0.1:2049
CAfile = /usr/local/etc/stunnel/ca/ca-cert.pem
verify = 2
requireCert = yes
EOF

# Enable and start stunnel
paul@f1:~ % doas sysrc stunnel_enable=YES
stunnel_enable:  -> YES
paul@f1:~ % doas service stunnel start
Starting stunnel.

# Restart stunnel to apply the CARP VIP binding
paul@f1:~ % doas service stunnel restart
Stopping stunnel.
Starting stunnel.

CARP Control Script for Clean Failover



With stunnel configured to bind to the CARP VIP (192.168.1.138), only the server that is currently the CARP MASTER will accept stunnel connections. This provides automatic failover for encrypted NFS:

  • When f0 is CARP MASTER: stunnel on f0 accepts connections on 192.168.1.138:2323
  • When f1 becomes CARP MASTER: stunnel on f1 starts accepting connections on 192.168.1.138:2323
  • The backup server's stunnel process will fail to bind to the VIP and won't accept connections

This ensures that clients always connect to the active NFS server through the CARP VIP. To ensure clean failover behaviour and prevent stale file handles, we'll update our carpcontrol.sh script so that it:

  • Stops NFS services on BACKUP nodes (preventing split-brain scenarios)
  • Starts NFS services only on the MASTER node
  • Manages stunnel binding to the CARP VIP

This approach ensures clients can only connect to the active server, eliminating stale handles from the inactive server:

Update: Fixed the script at Sat 3 Jan 23:55:11 EET 2026 - changed $1 to $2 because devd passes $subsystem $type, so the state is in the second argument.

# Create CARP control script on both f0 and f1
paul@f0:~ % doas tee /usr/local/bin/carpcontrol.sh <<'EOF'
#!/bin/sh
# CARP state change control script

HOSTNAME=`hostname`

if [ ! -f /data/nfs/nfs.DO_NOT_REMOVE ]; then
    logger '/data/nfs not mounted, mounting it now!'
    if [ "$HOSTNAME" = 'f0.lan.buetow.org' ]; then
        zfs load-key -L file:///keys/f0.lan.buetow.org:zdata.key zdata/enc/nfsdata
        zfs set mountpoint=/data/nfs zdata/enc/nfsdata
    else
        zfs load-key -L file:///keys/f0.lan.buetow.org:zdata.key zdata/sink/f0/zdata/enc/nfsdata
        zfs set mountpoint=/data/nfs zdata/sink/f0/zdata/enc/nfsdata
        zfs mount zdata/sink/f0/zdata/enc/nfsdata
        zfs set readonly=on zdata/sink/f0/zdata/enc/nfsdata
    fi
    service nfsd stop 2>&1
    service mountd stop 2>&1
fi


case "$2" in
    MASTER)
        logger "CARP state changed to MASTER, starting services"
        service rpcbind start >/dev/null 2>&1
        service mountd start >/dev/null 2>&1
        service nfsd start >/dev/null 2>&1
        service nfsuserd start >/dev/null 2>&1
        service stunnel restart >/dev/null 2>&1
        logger "CARP MASTER: NFS and stunnel services started"
        ;;
    BACKUP)
        logger "CARP state changed to BACKUP, stopping services"
        service stunnel stop >/dev/null 2>&1
        service nfsd stop >/dev/null 2>&1
        service mountd stop >/dev/null 2>&1
        service nfsuserd stop >/dev/null 2>&1
        logger "CARP BACKUP: NFS and stunnel services stopped"
        ;;
    *)
        logger "CARP state changed to $2 (unhandled)"
        ;;
esac
EOF

paul@f0:~ % doas chmod +x /usr/local/bin/carpcontrol.sh

CARP Management Script



To simplify CARP state management and failover testing, create this helper script on both f0 and f1:

# Create the CARP management script
paul@f0:~ % doas tee /usr/local/bin/carp <<'EOF'
#!/bin/sh
# CARP state management script
# Usage: carp [master|backup|auto-failback enable|auto-failback disable]
# Without arguments: shows current state

# Find the interface with CARP configured
CARP_IF=$(ifconfig -l | xargs -n1 | while read if; do
    ifconfig "$if" 2>/dev/null | grep -q "carp:" && echo "$if" && break
done)

if [ -z "$CARP_IF" ]; then
    echo "Error: No CARP interface found"
    exit 1
fi

# Get CARP VHID
VHID=$(ifconfig "$CARP_IF" | grep "carp:" | sed -n 's/.*vhid \([0-9]*\).*/\1/p')

if [ -z "$VHID" ]; then
    echo "Error: Could not determine CARP VHID"
    exit 1
fi

# Function to get the current state
get_state() {
    ifconfig "$CARP_IF" | grep "carp:" | awk '{print $2}'
}

# Check for auto-failback block file
BLOCK_FILE="/data/nfs/nfs.NO_AUTO_FAILBACK"
check_auto_failback() {
    if [ -f "$BLOCK_FILE" ]; then
        echo "WARNING: Auto-failback is DISABLED (file exists: $BLOCK_FILE)"
    fi
}

# Main logic
case "$1" in
    "")
        # No argument - show current state
        STATE=$(get_state)
        echo "CARP state on $CARP_IF (vhid $VHID): $STATE"
        check_auto_failback
        ;;
    master)
        # Force to MASTER state
        echo "Setting CARP to MASTER state..."
        ifconfig "$CARP_IF" vhid "$VHID" state master
        sleep 1
        STATE=$(get_state)
        echo "CARP state on $CARP_IF (vhid $VHID): $STATE"
        check_auto_failback
        ;;
    backup)
        # Force to BACKUP state
        echo "Setting CARP to BACKUP state..."
        ifconfig "$CARP_IF" vhid "$VHID" state backup
        sleep 1
        STATE=$(get_state)
        echo "CARP state on $CARP_IF (vhid $VHID): $STATE"
        check_auto_failback
        ;;
    auto-failback)
        case "$2" in
            enable)
                if [ -f "$BLOCK_FILE" ]; then
                    rm "$BLOCK_FILE"
                    echo "Auto-failback ENABLED (removed $BLOCK_FILE)"
                else
                    echo "Auto-failback was already enabled"
                fi
                ;;
            disable)
                if [ ! -f "$BLOCK_FILE" ]; then
                    touch "$BLOCK_FILE"
                    echo "Auto-failback DISABLED (created $BLOCK_FILE)"
                else
                    echo "Auto-failback was already disabled"
                fi
                ;;
            *)
                echo "Usage: $0 auto-failback [enable|disable]"
                echo "  enable:  Remove block file to allow automatic failback"
                echo "  disable: Create block file to prevent automatic failback"
                exit 1
                ;;
        esac
        ;;
    *)
        echo "Usage: $0 [master|backup|auto-failback enable|auto-failback disable]"
        echo "  Without arguments: show current CARP state"
        echo "  master: force this node to become CARP MASTER"
        echo "  backup: force this node to become CARP BACKUP"
        echo "  auto-failback enable:  allow automatic failback to f0"
        echo "  auto-failback disable: prevent automatic failback to f0"
        exit 1
        ;;
esac
EOF

paul@f0:~ % doas chmod +x /usr/local/bin/carp

# Copy to f1 as well
paul@f0:~ % scp /usr/local/bin/carp f1:/tmp/
paul@f1:~ % doas cp /tmp/carp /usr/local/bin/carp && doas chmod +x /usr/local/bin/carp

Now you can easily manage CARP states and auto-failback:

# Check current CARP state
paul@f0:~ % doas carp
CARP state on re0 (vhid 1): MASTER

# If auto-failback is disabled, you'll see a warning
paul@f0:~ % doas carp
CARP state on re0 (vhid 1): MASTER
WARNING: Auto-failback is DISABLED (file exists: /data/nfs/nfs.NO_AUTO_FAILBACK)

# Force f0 to become BACKUP (triggers failover to f1)
paul@f0:~ % doas carp backup
Setting CARP to BACKUP state...
CARP state on re0 (vhid 1): BACKUP

# Disable auto-failback (useful for maintenance)
paul@f0:~ % doas carp auto-failback disable
Auto-failback DISABLED (created /data/nfs/nfs.NO_AUTO_FAILBACK)

# Enable auto-failback
paul@f0:~ % doas carp auto-failback enable
Auto-failback ENABLED (removed /data/nfs/nfs.NO_AUTO_FAILBACK)

Automatic Failback After Reboot



When f0 reboots (planned or unplanned), f1 takes over as CARP MASTER. To ensure f0 automatically reclaims its primary role once it's fully operational, we'll implement an automatic failback mechanism:

Update: Fixed the script at Sun 4 Jan 00:04:28 EET 2026 - removed the NFS service check because when f0 is BACKUP, NFS services are intentionally stopped by carpcontrol.sh, which would prevent auto-failback from ever triggering.

paul@f0:~ % doas tee /usr/local/bin/carp-auto-failback.sh <<'EOF'
#!/bin/sh
# CARP automatic failback script for f0
# Ensures f0 reclaims MASTER role after reboot when storage is ready

LOGFILE="/var/log/carp-auto-failback.log"
MARKER_FILE="/data/nfs/nfs.DO_NOT_REMOVE"
BLOCK_FILE="/data/nfs/nfs.NO_AUTO_FAILBACK"

log_message() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOGFILE"
}

# Check if we're already MASTER
# carp may print an extra WARNING line, so only parse the first line
CURRENT_STATE=$(/usr/local/bin/carp | head -1 | awk '{print $NF}')
if [ "$CURRENT_STATE" = "MASTER" ]; then
    exit 0
fi

# Check if /data/nfs is mounted
if ! mount | grep -q "on /data/nfs "; then
    log_message "SKIP: /data/nfs not mounted"
    exit 0
fi

# Check if the marker file exists
# (identifies that the ZFS data set is properly mounted)
if [ ! -f "$MARKER_FILE" ]; then
    log_message "SKIP: Marker file $MARKER_FILE not found"
    exit 0
fi

# Check if failback is blocked (for maintenance)
if [ -f "$BLOCK_FILE" ]; then
    log_message "SKIP: Failback blocked by $BLOCK_FILE"
    exit 0
fi

# All conditions met - promote to MASTER
log_message "CONDITIONS MET: Promoting to MASTER (was $CURRENT_STATE)"
/usr/local/bin/carp master

# Log result
sleep 2
NEW_STATE=$(/usr/local/bin/carp | head -1 | awk '{print $NF}')
log_message "Failback complete: State is now $NEW_STATE"

# If successful, log to the system log too
if [ "$NEW_STATE" = "MASTER" ]; then
    logger "CARP: f0 automatically reclaimed MASTER role"
fi
EOF

paul@f0:~ % doas chmod +x /usr/local/bin/carp-auto-failback.sh

The marker file identifies that the ZFS data set is mounted correctly. We create it with:

paul@f0:~ % doas touch /data/nfs/nfs.DO_NOT_REMOVE

We add a cron job to check every minute:

paul@f0:~ % echo "* * * * * /usr/local/bin/carp-auto-failback.sh" | doas crontab -

The enhanced CARP script provides integrated control over auto-failback. To temporarily turn off automatic failback (e.g., for f0 maintenance), we run:

paul@f0:~ % doas carp auto-failback disable
Auto-failback DISABLED (created /data/nfs/nfs.NO_AUTO_FAILBACK)

And to re-enable it:

paul@f0:~ % doas carp auto-failback enable
Auto-failback ENABLED (removed /data/nfs/nfs.NO_AUTO_FAILBACK)

To check whether auto-failback is enabled, we run:

paul@f0:~ % doas carp
CARP state on re0 (vhid 1): MASTER
# If disabled, you'll see: WARNING: Auto-failback is DISABLED

The failback attempts are logged to /var/log/carp-auto-failback.log!

So, in summary:

  • After f0 reboots: f1 is MASTER, f0 boots as BACKUP
  • Cron runs every minute: Checks that f0 is currently BACKUP (no need to act if already MASTER), that /data/nfs is mounted and the marker file exists (confirming the ZFS dataset is ready), and that failback hasn't been blocked by the admin
  • Failback occurs: Typically 2-3 minutes after boot completes
  • Logging: All attempts logged for troubleshooting

This ensures f0 automatically resumes its role as primary storage server after any reboot, while providing administrative control when needed.

Client Configuration for NFS via Stunnel



To mount NFS shares with stunnel encryption, clients must install and configure stunnel using their client certificates.

Configuring Rocky Linux Clients (r0, r1, r2)



On the Rocky Linux VMs, we run:

# Install stunnel on client (example for `r0`)
[root@r0 ~]# dnf install -y stunnel nfs-utils

# Copy client certificate and CA certificate from f0
[root@r0 ~]# scp f0:/usr/local/etc/stunnel/ca/r0-stunnel.pem /etc/stunnel/
[root@r0 ~]# scp f0:/usr/local/etc/stunnel/ca/ca-cert.pem /etc/stunnel/

# Configure stunnel client with certificate authentication
[root@r0 ~]# tee /etc/stunnel/stunnel.conf <<'EOF'
cert = /etc/stunnel/r0-stunnel.pem
CAfile = /etc/stunnel/ca-cert.pem
client = yes
verify = 2

[nfs-ha]
accept = 127.0.0.1:2323
connect = 192.168.1.138:2323
EOF

# Enable and start stunnel
[root@r0 ~]# systemctl enable --now stunnel

# Repeat for r1 and r2 with their respective certificates

Note: Each client must use its own certificate file (r0-stunnel.pem, r1-stunnel.pem, r2-stunnel.pem, or earth-stunnel.pem; the latter is for my laptop, which can also mount the NFS shares).

NFSv4 user mapping config on Rocky



Update: This section was added 08.08.2025!

For this, we need to set the Domain in /etc/idmapd.conf on all three Rocky hosts to lan.buetow.org (remember, earlier in this blog post we set the nfsuserd domain on the NFS server side to lan.buetow.org as well).

[General]

Domain = lan.buetow.org
.
.
.

We also need to increase the inotify limit, otherwise nfs-idmapd may fail to start with "Too many open files":

[root@r0 ~]# echo 'fs.inotify.max_user_instances = 512' > /etc/sysctl.d/99-inotify.conf
[root@r0 ~]# sysctl -w fs.inotify.max_user_instances=512

And afterwards, we need to run the following on all 3 Rocky hosts:

[root@r0 ~]# systemctl start nfs-idmapd
[root@r0 ~]# systemctl enable --now nfs-client.target

and then, to be safe, reboot them.

Testing NFS Mount with Stunnel



To mount NFS through the stunnel encrypted tunnel, we run:

# Create a mount point
[root@r0 ~]# mkdir -p /data/nfs/k3svolumes

# Mount through stunnel (using localhost and NFSv4)
[root@r0 ~]# mount -t nfs4 -o port=2323 127.0.0.1:/k3svolumes /data/nfs/k3svolumes

# Verify mount
[root@r0 ~]# mount | grep k3svolumes
127.0.0.1:/k3svolumes on /data/nfs/k3svolumes 
  type nfs4 (rw,relatime,vers=4.2,rsize=131072,wsize=131072,
  namlen=255,hard,proto=tcp,port=2323,timeo=600,retrans=2,sec=sys,
  clientaddr=127.0.0.1,local_lock=none,addr=127.0.0.1)

# For persistent mount, add to /etc/fstab:
127.0.0.1:/k3svolumes /data/nfs/k3svolumes nfs4 port=2323,_netdev,soft,timeo=10,retrans=2,intr 0 0

Note: The mount uses localhost (127.0.0.1) because stunnel is listening locally and forwarding the encrypted traffic to the remote server.

Testing CARP Failover with Mounted Clients and Stale File Handles



To test the failover process:

# On f0 (current MASTER) - trigger failover
paul@f0:~ % doas ifconfig re0 vhid 1 state backup

# On f1 - verify it becomes MASTER
paul@f1:~ % ifconfig re0 | grep carp
    inet 192.168.1.138 netmask 0xffffffff broadcast 192.168.1.138 vhid 1

# Check stunnel is now listening on f1
paul@f1:~ % doas sockstat -l | grep 2323
stunnel  stunnel    4567  3  tcp4   192.168.1.138:2323    *:*

# On client - verify NFS mount still works
[root@r0 ~]# ls /data/nfs/k3svolumes/
[root@r0 ~]# echo "Test after failover" > /data/nfs/k3svolumes/failover-test.txt

After a CARP failover, NFS clients may experience "Stale file handle" errors because they cached file handles from the previous server. To resolve this manually, we can run:

# Force unmount and remount
[root@r0 ~]# umount -f /data/nfs/k3svolumes
[root@r0 ~]# mount /data/nfs/k3svolumes

For the automatic recovery, we create a script:

[root@r0 ~]# cat > /usr/local/bin/check-nfs-mount.sh << 'EOF'
#!/bin/bash
# Fast NFS mount health monitor - runs every 10 seconds via systemd timer

MOUNT_POINT="/data/nfs/k3svolumes"
LOCK_FILE="/var/run/nfs-mount-check.lock"

# Use a lock file to prevent concurrent runs
if [ -f "$LOCK_FILE" ]; then
    exit 0
fi
touch "$LOCK_FILE"
trap "rm -f $LOCK_FILE" EXIT

fix_mount () {
    echo "Attempting to remount NFS mount $MOUNT_POINT"
    if mount -o remount -f "$MOUNT_POINT" 2>/dev/null; then
        echo "Remount command issued for $MOUNT_POINT"
    else
        echo "Failed to remount NFS mount $MOUNT_POINT"
    fi

    echo "Checking if $MOUNT_POINT is a mountpoint"
    if mountpoint "$MOUNT_POINT" >/dev/null 2>&1; then
        echo "$MOUNT_POINT is a valid mountpoint"
    else
        echo "$MOUNT_POINT is not a valid mountpoint, attempting mount"
        if mount "$MOUNT_POINT"; then
            echo "Successfully mounted $MOUNT_POINT"
            return
        else
            echo "Failed to mount $MOUNT_POINT"
        fi
    fi

    echo "Attempting to unmount $MOUNT_POINT"
    if umount -f "$MOUNT_POINT" 2>/dev/null; then
        echo "Successfully unmounted $MOUNT_POINT"
    else
        echo "Failed to unmount $MOUNT_POINT (it might not be mounted)"
    fi

    echo "Attempting to mount $MOUNT_POINT"
    if mount "$MOUNT_POINT"; then
        echo "NFS mount $MOUNT_POINT mounted successfully"
        return
    else
        echo "Failed to mount NFS mount $MOUNT_POINT"
    fi

    echo "Failed to fix NFS mount $MOUNT_POINT"
    exit 1
}

if ! mountpoint "$MOUNT_POINT" >/dev/null 2>&1; then
    echo "NFS mount $MOUNT_POINT not found"
    fix_mount
fi

if ! timeout 2s stat "$MOUNT_POINT" >/dev/null 2>&1; then
    echo "NFS mount $MOUNT_POINT appears to be unresponsive"
    fix_mount
fi
EOF

[root@r0 ~]# chmod +x /usr/local/bin/check-nfs-mount.sh

And we create the systemd service as follows:

[root@r0 ~]# cat > /etc/systemd/system/nfs-mount-monitor.service << 'EOF'
[Unit]
Description=NFS Mount Health Monitor
After=network-online.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/check-nfs-mount.sh
StandardOutput=journal
StandardError=journal
EOF

And we also create the systemd timer (runs every 10 seconds):

[root@r0 ~]# cat > /etc/systemd/system/nfs-mount-monitor.timer << 'EOF'
[Unit]
Description=Run NFS Mount Health Monitor every 10 seconds
Requires=nfs-mount-monitor.service

[Timer]
OnBootSec=30s
OnUnitActiveSec=10s
AccuracySec=1s

[Install]
WantedBy=timers.target
EOF

To enable and start the timer, we run:

[root@r0 ~]# systemctl daemon-reload
[root@r0 ~]# systemctl enable nfs-mount-monitor.timer
[root@r0 ~]# systemctl start nfs-mount-monitor.timer

# Check status
[root@r0 ~]# systemctl status nfs-mount-monitor.timer
● nfs-mount-monitor.timer - Run NFS Mount Health Monitor every 10 seconds
     Loaded: loaded (/etc/systemd/system/nfs-mount-monitor.timer; enabled)
     Active: active (waiting) since Sat 2025-07-06 10:00:00 EEST
    Trigger: Sat 2025-07-06 10:00:10 EEST; 8s left

# Monitor logs
[root@r0 ~]# journalctl -u nfs-mount-monitor -f

Note: Stale file handles are inherent to NFS failover because file handles are server-specific. The best approach depends on your application's tolerance for brief disruptions. Of course, all the changes made to r0 above must also be applied to r1 and r2.

Complete Failover Test



Here's a comprehensive test of the failover behaviour with all optimisations in place:

# 1. Check the initial state
paul@f0:~ % ifconfig re0 | grep carp
    carp: MASTER vhid 1 advbase 1 advskew 0
paul@f1:~ % ifconfig re0 | grep carp
    carp: BACKUP vhid 1 advbase 1 advskew 100

# 2. Create a test file from a client
[root@r0 ~]# echo "test before failover" > /data/nfs/k3svolumes/test-before.txt

# 3. Trigger failover (f0 → f1)
paul@f0:~ % doas ifconfig re0 vhid 1 state backup

# 4. Monitor client behaviour
[root@r0 ~]# ls /data/nfs/k3svolumes/
ls: cannot access '/data/nfs/k3svolumes/': Stale file handle

# 5. Check automatic recovery (within 10 seconds)
[root@r0 ~]# journalctl -u nfs-mount-monitor -f
Jul 06 10:15:32 r0 nfs-monitor[1234]: NFS mount unhealthy detected at \
  Sun Jul 6 10:15:32 EEST 2025
Jul 06 10:15:32 r0 nfs-monitor[1234]: Attempting to fix stale NFS mount at \
  Sun Jul 6 10:15:32 EEST 2025
Jul 06 10:15:33 r0 nfs-monitor[1234]: NFS mount fixed at \
  Sun Jul 6 10:15:33 EEST 2025

Failover Timeline:

  • 0 seconds: CARP failover triggered
  • 0-2 seconds: Clients get "Stale file handle" errors (not hanging)
  • 3-10 seconds: Soft mounts ensure quick failure of operations
  • Within 10 seconds: Automatic recovery via systemd timer

Benefits of the Optimised Setup:

  • No hanging processes - Soft mounts fail quickly
  • Clean failover - Old server stops serving immediately
  • Fast automatic recovery - No manual intervention needed
  • Predictable timing - Recovery within 10 seconds with systemd timer
  • Better visibility - systemd journal provides detailed logs

Important Considerations:

  • Recent writes (within 1 minute) may not be visible after failover due to replication lag
  • Applications should handle brief NFS errors gracefully
  • For zero-downtime requirements, consider synchronous replication or distributed storage (see "Future storage explorations" section later in this blog post)

Update: Upgrade to 4TB drives



Update: 27.01.2026: I have since replaced the 1TB drives with 4TB drives for more storage capacity. The upgrade procedure was different for each node!

Upgrading f1 (simpler approach)



Since f1 is the replication sink, the upgrade was straightforward:

  • 1. Physically replaced the 1TB drive with the 4TB drive
  • 2. Re-setup the drive as described earlier in this blog post
  • 3. Re-replicated all data from f0 to f1 via zrepl
  • 4. Reloaded the encryption keys as described in this blog post
  • 5. Set the mount point again for the encrypted dataset, explicitly as read-only (since f1 is the replication sink)

Upgrading f0 (using ZFS resilvering)



For f0, which is the primary storage node, I used ZFS resilvering to avoid data loss:

  • 1. Plugged the new 4TB drive into an external USB SSD drive reader
  • 2. Attached the 4TB drive to the zdata pool for resilvering
  • 3. Once resilvering completed, detached the 1TB drive from the zdata pool
  • 4. Shut down f0 and physically replaced the internal drive
  • 5. Booted with the new drive in place
  • 6. Expanded the pool to use the full 4TB capacity:

paul@f0:~ % doas zpool online -e zdata ada1

  • 7. Reloaded the encryption keys as described in this blog post
  • 8. Set the mount point again for the encrypted dataset
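The resilvering part of this procedure maps to a handful of zpool commands, roughly like this (device names - ada1 for the old internal 1TB drive, da1 for the new 4TB drive in the USB reader - are from my setup and will differ elsewhere; treat this as a sketch):

```shell
# Hypothetical sketch of steps 2 and 3 above.
doas zpool attach zdata ada1 da1   # mirror the new drive onto the pool
doas zpool status zdata            # watch until resilvering completes
doas zpool detach zdata ada1       # drop the old 1TB drive from the mirror
# After the physical swap and reboot, step 6 expands the pool capacity.
```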

This was a one-time effort on both nodes - after a reboot, everything was remembered and came up normally. Here are the updated outputs:

paul@f0:~ % doas zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
zdata  3.63T   677G  2.97T        -         -     3%    18%  1.00x    ONLINE  -
zroot   472G  68.4G   404G        -         -    13%    14%  1.00x    ONLINE  -

paul@f0:~ % doas camcontrol devlist
<512GB SSD D910R170>               at scbus0 target 0 lun 0 (pass0,ada0)
<SD Ultra 3D 4TB 530500WD>         at scbus1 target 0 lun 0 (pass1,ada1)
<Generic Flash Disk 8.07>          at scbus2 target 0 lun 0 (da0,pass2)

We're still using a different SSD model on f1 (WD Blue SA510 4TB) than on f0, to avoid simultaneous failures:

paul@f1:~ % doas camcontrol devlist
<512GB SSD D910R170>               at scbus0 target 0 lun 0 (pass0,ada0)
<WD Blue SA510 2.5 4TB 530500WD>   at scbus1 target 0 lun 0 (pass1,ada1)
<Generic Flash Disk 8.07>          at scbus2 target 0 lun 0 (da0,pass2)

Conclusion



We've built a robust, encrypted storage system for our FreeBSD-based Kubernetes cluster that provides:

  • High Availability: CARP ensures the storage VIP moves automatically during failures
  • Data Protection: ZFS encryption protects data at rest, stunnel protects data in transit
  • Continuous Replication: 1-minute RPO for the data, automated via zrepl
  • Secure Access: Client certificate authentication prevents unauthorised access

Some key lessons learned are:

  • Stunnel vs Native NFS/TLS: While native encryption would be ideal, stunnel provides better cross-platform compatibility
  • Manual vs Automatic Failover: For storage systems, controlled failover often prevents more problems than it causes
  • Client Compatibility: Different NFS implementations behave differently - test thoroughly

Future Storage Explorations



While zrepl provides excellent snapshot-based replication for disaster recovery, there are other storage technologies worth exploring for the f3s project:

MinIO for S3-Compatible Object Storage



MinIO is a high-performance, S3-compatible object storage system that could complement our ZFS-based storage. Some potential use cases:

  • S3 API compatibility: Many modern applications expect S3-style object storage APIs. MinIO could provide this interface while using our ZFS storage as the backend.
  • Multi-site replication: MinIO supports active-active replication across multiple sites, which could work well with our f0/f1/f2 node setup.
  • Kubernetes native: MinIO has excellent Kubernetes integration with operators and CSI drivers, making it ideal for the f3s k3s environment.

MooseFS for Distributed High Availability



MooseFS is a fault-tolerant, distributed file system that could provide proper high-availability storage:

  • True HA: Unlike our current setup, which requires manual failover, MooseFS provides automatic failover with no single point of failure.
  • POSIX compliance: Applications can use MooseFS like any regular filesystem, no code changes needed.
  • Flexible redundancy: Configure different replication levels per directory or file, optimising storage efficiency.
  • FreeBSD support: MooseFS has native FreeBSD support, making it a natural fit for the f3s project.

Both technologies could run on top of our encrypted ZFS volumes, combining ZFS's data integrity and encryption features with distributed storage capabilities. This would be particularly interesting for workloads that need either S3-compatible APIs (MinIO) or transparent distributed POSIX storage (MooseFS).

What about Ceph and GlusterFS? Unfortunately, there doesn't seem to be great native FreeBSD support for them. However, the alternatives above appear suitable for my use case.

Read the next post of this series:

f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments

Other *BSD-related posts:

2025-12-07 f3s: Kubernetes with FreeBSD - Part 8: Observability
2025-10-02 f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments
2025-07-14 f3s: Kubernetes with FreeBSD - Part 6: Storage (You are currently reading this)
2025-05-11 f3s: Kubernetes with FreeBSD - Part 5: WireGuard mesh network
2025-04-05 f3s: Kubernetes with FreeBSD - Part 4: Rocky Linux Bhyve VMs
2025-02-01 f3s: Kubernetes with FreeBSD - Part 3: Protecting from power cuts
2024-12-03 f3s: Kubernetes with FreeBSD - Part 2: Hardware and base installation
2024-11-17 f3s: Kubernetes with FreeBSD - Part 1: Setting the stage
2024-04-01 KISS high-availability with OpenBSD
2024-01-13 One reason why I love OpenBSD
2022-10-30 Installing DTail on OpenBSD
2022-07-30 Let's Encrypt with OpenBSD and Rex
2016-04-09 Jails and ZFS with Puppet on FreeBSD

E-Mail your comments to paul@nospam.buetow.org

Back to the main site
Posts from January to June 2025 https://foo.zone/gemfeed/2025-07-01-posts-from-january-to-june-2025.html 2025-07-01T22:39:29+03:00 Paul Buetow aka snonux paul@dev.buetow.org These are my social media posts from the last six months. I keep them here to reflect on them and also to not lose them. Social media networks come and go and are not under my control, but my domain is here to stay.

Posts from January to June 2025



Published at 2025-07-01T22:39:29+03:00

These are my social media posts from the last six months. I keep them here to reflect on them and also to not lose them. Social media networks come and go and are not under my control, but my domain is here to stay.

These are from Mastodon and LinkedIn. Have a look at my about page for my social media profiles. This list is generated with Gos, my social media platform sharing tool.

My about page
https://codeberg.org/snonux/gos

Table of Contents




January 2025



I am currently binge-listening to the Google ...



I am currently binge-listening to the Google #SRE ProdCast. It's really great to learn about the stories of individual SREs and their journeys. It is not just about SREs at Google; there are also external guests.

sre.google/prodcast/

Recently, there was a >5000 LOC #bash ...



Recently, there was a >5000 LOC #bash codebase at work that reported the progress of a migration. Nobody understood it, and it was wonky (sometimes it would not return the desired results). On top of that, the coding style was very bad as well (I could rant forever here). The engineer who wrote it left the company. I rewrote it in #Perl in about 300 LOC. Colleagues asked why not Python. Perl is the perfect choice here - it's even in its name: Practical Extraction and Report Language!

Ghostty is a terminal emulator that was ...



Ghostty is a terminal emulator that was recently released publicly as open-source. I love that it works natively on both Linux and macOS; it looks great (font rendering) and is fast and customizable via a config file (which I manage with a config mng system). Ghostty is a passion project written in Zig, the author loved the community so much while working on it that he donated $300k to the Zig Foundation. #terminal #emulator

ghostty.org

Go is not an easy programming language. Don't ...



Go is not an easy programming language. Don't confuse easy with simple syntax. I'd agree with this. With the recent addition of generics to the language, I also feel that even the syntax stops being simple. Also, simplicity is complex (especially under the hood - there are many mechanics you need to know if you really want to master the language). #golang

www.arp242.net/go-easy.html

How will AI change software engineering (or has ...



How will AI change software engineering (or has it already)? The bottom line is that less experienced engineers may have problems (accepting incomplete or incorrect programs, only reaching 70 percent solutions), while experienced engineers can leverage AI to boost their performance as they know how to fix the remaining 30 percent of the generated code. #ai #engineering #software

newsletter.pragmaticengineer.com/p/how-ai-will-change-software-engineering

Eliminating toil - Toil is not always a bad ...



Eliminating toil. Toil is not always a bad thing - some even enjoy it - it is calming in small amounts, but it becomes toxic in large amounts. #SRE

sre.google/sre-book/eliminating-toil/

Fun read. How about using the character ...



Fun read. How about using the character sequence :-) as a statement separator in a programming language?

ntietz.com/blog/researching-why-we-use-semicolons-as-statement-terminators/

That's unexpected, you can't remove a NaN key ...



That's unexpected, you can't remove a NaN key from a map without clearing it! #golang via @wallabagapp

unexpected-go.com/you-cant-remove-a-nan-key-from-a-map-without-clearing-it.html

Nice refresher for #shell #bash #zsh ...



Nice refresher for #shell #bash #zsh redirection rules

rednafi.com/misc/shell_redirection/

I think discussing action items in incident ...



I think discussing action items in incident reviews is important. At least the obvious should be captured and noted down. It does not mean that the action items need to be fully refined in the review meeting; that would be out of scope, in my opinion.

surfingcomplexity.blog/2024/09/28/why-..-..-action-items-during-incident-reviews/

At first, functional options add a bit of ...



At first, functional options add a bit of boilerplate, but they turn out to be quite neat, especially when you have very long parameter lists that need to be made neat and tidy. #golang

www.calhoun.io/using-functional-options-instead-of-method-chaining-in-go/

In the "Working with an SRE Interview" I have ...



In the "Working with an SRE Interview" I was asked about what it's like working with an SRE! We covered much more in depth, but we decided not to make the final version too long! #sre #interview

foo.zone/gemfeed/2025-01-15-working-with-an-sre-interview.html (Gemini)
foo.zone/gemfeed/2025-01-15-working-with-an-sre-interview.html

Small introduction to the #Android ...



Small introduction to the #Android distribution called #GrapheneOS For myself, I am using a Pixel 7 Pro, which comes with "only" 5 years of support (not yet 7 years like the Pixel 8 and 9 series). I also wrote about GrapheneOS here once:

dataswamp.org/~solene/2025-01-12-intro-to-grapheneos.html
foo.zone/gemfeed/2023-01-23-why-grapheneos-rox.html (Gemini)
foo.zone/gemfeed/2023-01-23-why-grapheneos-rox.html

Helix 2025.01 has been released. The completion ...



Helix 2025.01 has been released. The completion of path names and the snippet functionality will be particularly useful for me. Overall, it's a great release. The release notes cover only some highlights, but there are many more changes in this version so also have a look at the Changelog! #HelixEditor

helix-editor.com/news/release-25-01-highlights/

I found these are excellent examples of how ...



I found these are excellent examples of how #OpenBSD's #relayd can be used.

www.tumfatig.net/2023/using-openbsd-relayd8-as-an-application-layer-gateway/

LLMs for Ops? Summaries of logs, probabilities ...



LLMs for Ops? Summaries of logs, probabilities about correctness, auto-generating Ansible - some use cases are there. Wouldn't trust it fully, though.

youtu.be/WodaffxVq-E?si=noY0egrfl5izCSQI

Enjoying an APC Power-UPS BX750MI in my ...



Enjoying an APC Power-UPS BX750MI in my #homelab with #FreeBSD and apcupsd. I can easily use the UPS status to auto-shutdown a cluster of FreeBSD machines on a power cut. One FreeBSD machine acts as the apcupsd master, connected via USB to the APC, while the remaining machines read the status remotely via the apcupsd network port from the master. However, it won't work when the master is down. #APC #UPS

"Even in the projects where I'm the only ...



"Even in the projects where I'm the only person, there are at least three people involved: past me, present me, and future me." - Quote from #software #programming

liw.fi/40/#index1h1

Connecting an #UPS to my #FreeBSD cluster ...



Connecting an #UPS to my #FreeBSD cluster in my #homelab, protecting it from power cuts!

foo.zone/gemfeed/2025-02-01-f3s-kubernetes-with-freebsd-part-3.html (Gemini)
foo.zone/gemfeed/2025-02-01-f3s-kubernetes-with-freebsd-part-3.html

So, the Co-founder and CTO of honeycomb.io and ...



So, the Co-founder and CTO of honeycomb.io and author of the book Observability Engineering always hated observability. And the Distinguished Software Engineer hosting The Pragmatic Engineer can't pronounce the word Observability. :-) No, jokes aside, I liked this podcast episode of The Pragmatic Engineer: Observability: the present and future, with Charity Majors #sre #observability

newsletter.pragmaticengineer.com/p/observability-the-present-and-future

February 2025



I don't know about you, but at work, I usually ...



I don't know about you, but at work, I usually deal with complex setups involving thousands of servers and work in a complex hybrid microservices-based environment (cloud and on-prem), where homelabbing (as simple as described in my blog post) is really relaxing and recreative. So, I was homelabbing a bit again, securing my #FreeBSD cluster from power cuts. #UPS #recreative

foo.zone/gemfeed/2025-02-01-f3s-kubernetes-with-freebsd-part-3.html (Gemini)
foo.zone/gemfeed/2025-02-01-f3s-kubernetes-with-freebsd-part-3.html

Great proposal (got accepted by the Goteam) for ...



Great proposal (accepted by the Go team) for safer file system open functions #golang

github.com/golang/go/issues/67002

My Gemtexter has only 1320 LOC.... The Biggest ...



My Gemtexter has only 1320 LOC.... The Biggest Shell Programs in the World are huuuge... #shell #sh

github.com/oils-for-unix/oils/wiki/The-Biggest-Shell-Programs-in-the-World

Against /tmp - He is making a point #unix ...



Against /tmp - He is making a point #unix #linux #bsd #filesystem via @wallabagapp

dotat.at/@/2024-10-22-tmp.html

Random Weird Things Part 2: #blog ...



Random Weird Things Part 2: #blog #computing

foo.zone/gemfeed/2025-02-08-random-weird-things-ii.html (Gemini)
foo.zone/gemfeed/2025-02-08-random-weird-things-ii.html

As a former #Pebble user and fan, that's ...



As a former #Pebble user and fan, that's awesome news. PebbleOS is now open source and there will soon be a new watch. I don't know about you, but I will be the first getting one :-) #foss

ericmigi.com/blog/why-were-bringing-pebble-back

I think I am slowly getting the point of Cue. ...



I think I am slowly getting the point of Cue. For example, it can replace both a JSON file and a JSON Schema. Furthermore, you can convert it from and into different formats (Cue, JSON, YAML, Go data types, ...), and you can nicely embed this into a Go project as well. #cue #cuelang #golang #configuration

cuelang.org

Jonathan's reflection of 10 years of ...



Jonathan's reflection on 10 years of programming!

jonathan-frere.com/posts/10-years-of-programming/

Really enjoyed reading this. Easily digestible ...



Really enjoyed reading this. Easily digestible summary of what's new in Go 1.24. #golang

antonz.org/go-1-24/

Some great advice from 40 years of experience ...



Some great advice from 40 years of experience as a software developer. #software #development

liw.fi/40/#index1h1

I enjoyed this talk, some recipes I knew ...



I enjoyed this talk; some recipes I knew already, others were new to me. The "line of sight" is my favourite, which I always tend to follow. I also liked the example where the speaker simplified a "complex" nested function into two non-nested if statements. #golang

www.youtube.com/watch?v=zdKHq9Xo4OY&list=WL&index=5

A way of how to add the version info to the Go ...



A way to add the version info to the Go binary. ... I personally just hardcode the version number in version.go and update it there manually for each release. But with Go 1.24, I will try embedding it! #golang

jerrynsh.com/3-easy-ways-to-add-version-flag-in-go/

In other words, using t.Parallel() for ...



In other words, using t.Parallel() for lightweight unit tests will likely make them slower.... #golang

threedots.tech/post/go-test-parallelism/

Neat little blog post, showcasing various ...



Neat little blog post, showcasing various methods used for generic programming before the introduction of generics. Only reflection wasn't listed. #golang

bitfieldconsulting.com/posts/generics

The smallest thing in Go #golang ...



The smallest thing in Go #golang

bitfieldconsulting.com/posts/iota

Fun with defer in #golang, I didn't know that ...



Fun with defer in #golang. I didn't know that a defer object can be either heap or stack allocated. And there are some rules for inlining, too.

victoriametrics.com/blog/defer-in-go/

What I like about Go is that it is still ...



What I like about Go is that it is still possible to understand what's going on under the hood, whereas in JVM-based languages (for example) or dynamic languages, there are too many optimizations and abstractions. However, you don't need to know too much about how it works under the hood in Go (like memory management in C). It's just the fact that you can—you have a choice. #golang

blog.devtrovert.com/p/goroutine-scheduler-revealed-youll

March 2025



Television has somewhat transformed how I work ...



Television has somewhat transformed how I work in the shell on a day-to-day basis. It is especially useful for me in navigating all the local Git repositories on my laptop. I have bound Ctrl+G in my shell for that now. #television #tv #tool #shell

github.com/alexpasmantier/television

Once in a while, I like to read a book about a ...



Once in a while, I like to read a book about a programming language I have been using for a while to find new tricks or to refresh and sharpen my knowledge about it. I just finished reading "Programming Ruby 3.3," and I must say this is my favorite Ruby book now. What makes this one so special is that it is quite recent and covers all the new features. #ruby #programming #coding

pragprog.com/titles/ruby5/programming-ruby-3-3-5th-edition/

As you may have noticed, I like to share on ...



As you may have noticed, I like to share on Mastodon and LinkedIn all the technical things I find interesting, and this blog post is technically all about that. Having said that, I love these tiny side projects. They are so relaxing to work on! #gos #golang #tool #programming #fun

foo.zone/gemfeed/2025-03-05-sharing-on-social-media-with-gos.html (Gemini)
foo.zone/gemfeed/2025-03-05-sharing-on-social-media-with-gos.html

Personally, I think AI (LLMs) are pretty ...



Personally, I think AI (LLMs) is pretty useful. But there's really some hype around it. However, AI is here to stay - it's not all hype.

unixdigest.com/articles/i-passionately-hate-hype-especially-the-ai-hype.html

Type aliases in #golang, soon also work with ...



Type aliases in #golang will soon also work with generics. It's an interesting feature, useful for refactorings and simplifications.

go.dev/blog/alias-names

#Perl, my "first love" of programming ...



#Perl, my "first love" of programming languages. Still there, and I still use it now and then (but not as my primary language at the moment). And others do so as well, apparently. Which makes me happy! :-)

dev.to/fa5tworm/why-perl-remains-indis..-..e-of-modern-programming-languages-2io0

I guess there are valid reasons for phttpdget, ...



I guess there are valid reasons for phttpdget that I just don't know about. Maybe complexity and/or licensing of other tools? #FreeBSD

l33t.codes/2024/12/05/Updating-FreeBSD-and-Re-Inventing-the-Wheel/

This is one of the reasons why I like ...



This is one of the reasons why I like terminal-based applications so much—they are usually more lightweight than GUI-based ones (and also more flexible).

www.arp242.net/stupid-light.html

Advanced Concurrency Patterns with #Golang ...



Advanced Concurrency Patterns with #Golang

blogtitle.github.io/go-advanced-concurrency-patterns-part-1/

#SQLite was designed as an #TCL extension. ...



#SQLite was designed as a #TCL extension. There are ~1 trillion SQLite databases in active use. SQLite heavily relies on #TCL: the amalgamation C code is generated via mksqlite3c.tcl (and isn't edited directly by the SQLite developers), and Tcl is also used for testing and for doc generation. The devs use a custom editor written in Tcl/Tk called "e" to edit the source! There's a custom versioning system, Fossil, and a custom chat-room written in Tcl/Tk!

www.tcl-lang.org/community/tcl2017/assets/talk93/Paper.html

Git provides automatic rendering of Markdown ...



"Git provides automatic rendering of Markdown files, including README.md, in a repository’s root directory" .... so much junk now in LLM-powered search engines.... #llm #ai

These are some neat little Go tips. Linters ...



These are some neat little Go tips. Linters already tell you when you silently omit a function return value, though. The slice filter without allocation trick is nice and simple. And I agree that switch statements are preferable to if-else statements. #golang

blog.devtrovert.com/p/go-ep5-avoid-contextbackground-make

This is a great introductory blog post about ...



This is a great introductory blog post about the Helix modal editor. It's also been my first choice for over a year now. I am really looking forward to the Steel plugin system, though. I don't think I need a lot of plugins, but one or two would certainly be on my wish list. #HelixEditor #Helix

felix-knorr.net/posts/2025-03-16-helix-review.html

Maps in Go under the hood #golang ...



Maps in Go under the hood #golang

victoriametrics.com/blog/go-map/

I found that working on multiple side projects ...



I found that working on multiple side projects concurrently is better than concentrating on just one. This seems inefficient, but if you start to lose motivation, you can temporarily switch to another one with full élan. Remember to stop starting and start finishing. This doesn't mean you should be working on 10+ side projects concurrently! Select your projects and commit to finishing them before starting the next thing. For example, my current limit of concurrent side projects is around five.

I have been in incidents. Understandably, ...



I have been in incidents. Understandably, everyone wants the issue to be resolved as quickly as possible, and others want to know how long the TTR will be. IMHO, providing no estimates at all is no solution either. So maybe give a rough estimate, but clearly communicate that the estimate is rough and that X, Y, and Z can interfere, meaning there is a chance it will take longer to resolve the incident. Just my thought. What's yours?

firehydrant.com/blog/hot-take-dont-provide-incident-resolution-estimates/

I don't understand what it is. Certificates are ...



I don't understand what it is. Certificates are so easy to monitor, but still, expirations cause so many incidents. #sre

securityboulevard.com/2024/10/dont-let..-..time-prevent-outages-with-a-smart-clm/

Don't just blindly trust LLMs. I recently ...



Don't just blindly trust LLMs. I recently trusted an LLM, spent 1 hour debugging, and ultimately had to verify my assumption about fcntl behavior regarding inherited file descriptors in child processes manually with a C program, as the manual page wasn't clear to me. I could have done that immediately and I would have been done within 10 minutes. #productivity #loss #llm #programming #C

April 2025



I knew about any being equivalent to ...



I knew about any being equivalent to interface{} in #Golang, but wasn't aware that it was introduced to Go because of generics.

Neat summary of new #Perl features per ...



Neat summary of new #Perl features per release

sheet.shiar.nl/perl

errors.As() checks for the error type, whereas ...



errors.As() checks for the error type, whereas errors.Is() checks for the exact error value. Interesting read about Errors in #golang - and there is also a cat meme in the middle of the blog post! And then, it continues with pointers to pointers to error values or how about a pointer to an empty interface?

adrianlarion.com/golang-error-handling..-..-errors-unwrap-custom-errors-and-more/

Good stuff: 10 years of functional options and ...



Good stuff: 10 years of functional options and key lessons learned along the way #golang

www.bytesizego.com/blog/10-years-functional-options-golang

I had some fun with #FreeBSD, #Bhyve and ...



I had some fun with #FreeBSD, #Bhyve and #Rocky #Linux. Not just for fun, also for science and profit! #homelab #selfhosting #self-hosting

foo.zone/gemfeed/2025-04-05-f3s-kubernetes-with-freebsd-part-4.html (Gemini)
foo.zone/gemfeed/2025-04-05-f3s-kubernetes-with-freebsd-part-4.html

The moment your blog receives PRs for typo ...



The moment your blog receives PRs for typo corrections, you notice that people are actually reading and care about your stuff :-) #blog #personal #tech

One thing not mentioned is that #OpenRsync's ...



One thing not mentioned is that #OpenRsync's origin is the #OpenBSD project (at least as far as I am aware! Correct me if I am wrong :-) )! #openbsd #rsync #macos #openrsync

derflounder.wordpress.com/2025/04/06/r..-..laced-with-openrsync-on-macos-sequoia/

This is an interesting #Elixir pipes operator ...



This is an interesting #Elixir pipes operator experiment in #Ruby. #Python has also been experimenting with such an operator. Raku (not mentioned in the linked article) already has the ==> sequence operator, of course (which can also be used backwards <== - who would have doubted it? :-) ). #syntax #codegolf #fun #coding #RakuLang

zverok.space/blog/2024-11-16-elixir-pipes.html

The story of how my favorite #Golang book was ...



The story of how my favorite #Golang book was written:

www.thecoder.cafe/p/100-go-mistakes

These are my personal book notes from Daniel ...



These are my personal book notes from Daniel Pink's "When: The Scientific Secrets of Perfect Timing." The notes are for me (to improve happiness and productivity). You still need to read the whole book to get your own insights, but maybe the notes will be useful for you as well. #blog #book #booknotes #productivity

foo.zone/gemfeed/2025-04-19-when-book-notes.html (Gemini)
foo.zone/gemfeed/2025-04-19-when-book-notes.html

I certainly learned a lot reading this #llm ...



I certainly learned a lot reading this #llm #coding #programming

simonwillison.net/2025/Mar/11/using-llms-for-code/

Writing idempotent #Bash scripts ...



Writing idempotent #Bash scripts

arslan.io/2019/07/03/how-to-write-idempotent-bash-scripts/

Regarding #AI for code generation. You should ...



Regarding #AI for code generation: you should be at least a bit curious and experiment a bit. You don't have to use it if you don't see it fit for your purpose.

registerspill.thorstenball.com/p/they-..-..email=true&r=2n9ive&triedRedirect=true

I like the Rocky metaphor. And this post also ...



I like the Rocky metaphor. And this post also reflects my thoughts on coding. #llm #ai #software

cekrem.github.io/posts/coding-as-craft-going-back-to-the-old-gym/

May 2025



There's now also a #Fish shell edition of my ...



There's now also a #Fish shell edition of my #tmux helper scripts: #fishshell

foo.zone/gemfeed/2025-05-02-terminal-multiplexing-with-tmux-fish-edition.html (Gemini)
foo.zone/gemfeed/2025-05-02-terminal-multiplexing-with-tmux-fish-edition.html

I loved this talk. It's about how you can ...



I loved this talk. It's about how you can create your own #Linux #container in less than 100 lines of shell code, without Docker or Podman and co. Why is this talk useful? If you understand how #containers work "under the hood," it becomes easier to make design decisions, write your own tools, or debug production systems. I also recommend his training courses, one of which I attended once.

www.youtube.com/watch?v=4RUiVAlJE2w

Some unexpected #golang stuff, ppl say, that ...



Some unexpected #golang stuff. People say that Go is a simple language; IMHO the devil is in the details.

unexpected-go.com/

With the advent of AI and LLMs, I have observed ...



With the advent of AI and LLMs, I have observed that being able to type quickly has become even more important for engineers. Previously, fast typing wasn't as crucial when coding, as most of the time was spent thinking or navigating through the code. However, with LLMs, you find yourself typing much more frequently. That's an unexpected personal win for me, as I recently learned fast touch typing: #llm #coding #programming

foo.zone/gemfeed/2024-08-05-typing-127.1-words-per-minute.html (Gemini)
foo.zone/gemfeed/2024-08-05-typing-127.1-words-per-minute.html

For science, fun and profit, I set up a ...



For science, fun and profit, I set up a #WireGuard mesh network for my #FreeBSD, #OpenBSD, #RockyLinux and #Kubernetes #homelab. There's also a mesh generator, which I wrote in #Ruby. #k3s #linux #k8s

foo.zone/gemfeed/2025-05-11-f3s-kubernetes-with-freebsd-part-5.html (Gemini)
foo.zone/gemfeed/2025-05-11-f3s-kubernetes-with-freebsd-part-5.html

Ever wondered about the hung task Linux ...



Ever wondered about the hung task Linux messages on a busy server? Every case is unique, and there is no standard approach to debug them, but here it gets a bit demystified: #linux #kernel

blog.cloudflare.com/searching-for-the-cause-of-hung-tasks-in-the-linux-kernel/

A bit of #fun: The FORTRAN hating gateway ― ...



A bit of #fun: The FORTRAN hating gateway ― Andreas Zwinkau

beza1e1.tuxen.de/lore/fortran_hating_gateway.html

So, Golang was invented while engineers at ...



So, Golang was invented while engineers at Google waited for C++ to compile. Here I am, waiting a long time for Java to compile...

I couldn't do without here-docs. If they did ...



I couldn't do without here-docs. If they did not exist, I would need to find another field and pursue a career there. #bash #sh #shell

rednafi.com/misc/heredoc_headache/

I started using computers as a kid on MS-DOS ...



I started using computers as a kid on MS-DOS and mainly used Norton Commander to navigate the file system in order to start games. Later, I became more interested in computing in general and switched to Linux, but there was no NC. However, there was GNU Midnight Commander, which I still use regularly to this day. It's absolutely worth checking out, even in the modern day. #tools #opensource

en.wikipedia.org/wiki/Midnight_Commander

That's interesting, running #Android in ...



That's interesting: running #Android in #Kubernetes

ku.bz/Gs4-wpK5h

Before wiping the pre-installed #Windows 11 ...



Before wiping the pre-installed #Windows 11 Pro on my new Beelink mini PC, I tested #WSL2 with #Fedora #Linux. I compiled my pet project, I/O Riot NG (ior), which requires many system libraries, including #BPF. I'm impressed - everything works just like on native Fedora, and my tool runs and traces I/O syscalls with BPF out of the box. I might now prefer Windows over macOS if I had to choose between those two for work.

codeberg.org/snonux/ior

Some might hate me for saying this, but didn't ...



Some might hate me for saying this, but didn't #systemd solve the problem of a shared /tmp directory by introducing PrivateTmp? But yes, why did it have to go that way...

www.osnews.com/story/140968/tmp-should-not-exist/

Wouldn't still do that, even with 100% test ...



I still wouldn't do that, even with 100% test coverage, LT and integration tests, unless there's an exception the business relies on #sre

medium.com/openclassrooms-product-desi..-..g/do-not-deploy-on-friday-92b1b46ebfe6

Some neat slice tricks for Go: #golang ...



Some neat slice tricks for Go: #golang

blog.devtrovert.com/p/12-slice-tricks-to-enhance-your-go

I understand that Kubernetes is not for ...



I understand that Kubernetes is not for everyone, but it still seems to be the new default for everything newly built. Despite the fact that Kubernetes is complex to maintain and use, there is still a lot of SRE/DevOps talent out there who have it on their CVs, which contributes significantly to the supportability of the infrastructure and the applications running on it. This way, you don't have to teach every new engineer your own way of doing infrastructure. It's like a standard language of infrastructure that many people speak. However, Kubernetes should not be the default solution for everything, in my opinion. #kubernetes #k8s

www.gitpod.io/blog/we-are-leaving-kubernetes

June 2025



Some great advice, will try out some of it! ...



Some great advice, I will try out some of it! #programming

endler.dev/2025/best-programmers/

In #Golang, values are actually copied when ...



In #Golang, values are actually copied when assigned (boxed) into an interface. That can have performance impact.

goperf.dev/01-common-patterns/interface-boxing/

This is a great little tutorial for searching ...



This is a great little tutorial for searching in the #HelixEditor #editor #coding

helix-editor-tutorials.com/tutorials/using-helix-global-search/

The mov instruction of a CPU is Turing ...



The mov instruction of a CPU is Turing complete. And there's an implementation of #Doom using only mov; it renders one frame every 7 hours! #fun

beza1e1.tuxen.de/articles/accidentally_turing_complete.html

I removed the social media profile from my ...



I removed the social media profile from my GrapheneOS phone. Originally, I created a separate profile just for social media to avoid using it too often. But I noticed that I switched to it too frequently. Not having social media within reach is probably the best option. #socialmedia #sm #distractions

So want a "real" recent UNIX? Use AIX! #macos ...



So want a "real" recent UNIX? Use AIX! #macos #unix #aix

www.osnews.com/story/141633/apples-macos-unix-certification-is-a-lie/

This episode, I think, is kind of an eye-opener ...



This episode, I think, is kind of an eye-opener for me personally. I knew that AI is here to stay, but you'd better start playing with it on your pet projects now; otherwise, your performance reviews will be awkward a year or two from now, when you are expected to use AI for your daily work. #ai #llm #coding #programming

changelog.com/friends/96

My #OpenBSD blog setup got mentioned in the ...



My #OpenBSD blog setup got mentioned in the BSDNow.tv Podcast (In the Feedback section) :-) #BSD #podcast #runbsd

www.bsdnow.tv/614

#Golang is the best when it comes to agentic ...



#Golang is the best when it comes to agentic coding: #llm

lucumr.pocoo.org/2025/6/12/agentic-coding/

Where #zsh is better than #bash ...



Where #zsh is better than #bash

www.arp242.net/why-zsh.html

I really enjoyed this talk about obscure Go ...



I really enjoyed this talk about obscure Go optimizations. None of it is really standard, though, and it can change from one Go version to another. #golang #talk

www.youtube.com/watch?v=rRtihWOcaLI

Commenting your regular expression is generally ...



Commenting your regular expressions is generally good advice! It works pretty well as described in the article, not just in #Ruby, but also in #Perl (@Perl), #RakuLang, ...

thoughtbot.com/blog/comment-your-regular-expressions

You have to make a decision for yourself, but ...



You have to make a decision for yourself, but generally, work smarter (and faster—but keep the quality)! About 40 hours #productivity #work #workload

thesquareplanet.com/blog/about-40-hours/

"100 Go Mistakes and How to Avoid Them" is one ...



"100 Go Mistakes and How to Avoid Them" is one of my favorite #Golang books. Julia Evans also stumbled across some issues she'd learned from this book. The book itself is an absolute must for every Gopher (or someone who wants to become one!)

jvns.ca/blog/2024/08/06/go-structs-copied-on-assignment/

The #Ruby Data class seems quite helpful ...



The #Ruby Data class seems quite helpful

allaboutcoding.ghinda.com/example-of-value-objects-using-rubys-data-class

Other related posts:

2025-01-01 Posts from October to December 2024
2025-07-01 Posts from January to June 2025 (You are currently reading this)
2026-01-01 Posts from July to December 2025

E-Mail your comments to paul@nospam.buetow.org :-)

Back to the main site
Task Samurai: An agentic coding learning experiment https://foo.zone/gemfeed/2025-06-22-task-samurai.html 2025-06-22T20:00:51+03:00 Paul Buetow aka snonux paul@dev.buetow.org Task Samurai is a fast terminal interface for Taskwarrior written in Go using the Bubble Tea framework. It displays your tasks in a table and allows you to manage them without leaving your keyboard.

Task Samurai: An agentic coding learning experiment



Published at 2025-06-22T20:00:51+03:00

Task Samurai Logo

Table of Contents




Introduction



Task Samurai is a fast terminal interface for Taskwarrior written in Go using the Bubble Tea framework. It displays your tasks in a table and allows you to manage them without leaving your keyboard.

https://taskwarrior.org
https://github.com/charmbracelet/bubbletea

Why does this exist?



I wanted to tinker with agentic coding. This project was implemented entirely using OpenAI Codex. (After this blog post was published, I also used the Claude Code CLI.)

  • I wanted a faster UI for Taskwarrior than other options, like Vit, which is Python-based.
  • I wanted something built with Bubble Tea, but I never had time to dive deep into it.
  • I wanted to build a toy project (like Task Samurai) first, before tackling the big ones, to get started with agentic coding.

https://openai.com/codex/

Given the current industry trend and the rapid advancements in technology, it has become clear that experimenting with AI-assisted coding tools is almost a necessity to stay relevant. Embracing these new developments doesn't mean abandoning traditional coding; instead, it means integrating new capabilities into your workflow to stay ahead in a fast-evolving field.

How it works



Task Samurai invokes the task command (that's the original Taskwarrior CLI command) to read and modify tasks. The tasks are displayed in a Bubble Tea table, where each row represents a task. Hotkeys trigger Taskwarrior commands such as starting, completing or annotating tasks. The UI refreshes automatically after each action, so the table is always up to date.
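To illustrate the mechanism (this is a sketch, not Task Samurai's actual code): Taskwarrior's `task export` emits tasks as JSON, which a UI can parse into table rows, and hotkeys then shell out to `task <id> start`, `task <id> done`, and so on. Here the export output is a hard-coded sample rather than a real `task` invocation:

```shell
# Sample of what `task export` emits (one task, abridged, hypothetical data)
export_json='[{"id":1,"description":"write blog post","status":"pending"}]'

# Pull out the description field for display in a table row
desc=$(printf '%s' "$export_json" | sed -n 's/.*"description":"\([^"]*\)".*/\1/p')
echo "$desc"

# Hotkeys would then shell out to Taskwarrior, e.g.:
#   task 1 start              # start task 1
#   task 1 done               # mark it completed
#   task 1 annotate "a note"  # add an annotation
```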

Task Samurai Screenshot

Where and how to get it



Go to:

https://codeberg.org/snonux/tasksamurai

And follow the README.md!

Lessons learned from building Task Samurai with agentic coding



Developer workflow



I was trying out OpenAI Codex because I regularly run out of Claude Code CLI (another agentic coding tool I am currently trying out) credits (it still happens!), but Codex was still available to me. So, I took the opportunity to push agentic coding a bit further with another platform.

I didn't really love the web UI you have to use for Codex, as I usually live in the terminal. But this is all I have for Codex for now, and I thought I'd give it a try regardless. The web UI is simple and pretty straightforward. There's also a Codex CLI one could use directly in the terminal, but I didn't get it working. I will try again soon.

Update: Codex CLI now works for me, after OpenAI released a new version!

For every task given to Codex, it spins up its own container. From there, you can drill down and watch what it is doing. At the end, the result (in the form of a code diff) will be presented. From there, you can make suggestions about what else to change in the codebase. What I found inconvenient is that for every additional change, there's an overhead because Codex has to spin up a container and bootstrap the entire development environment again, which adds extra delay. That could be eliminated by setting up predefined custom containers, but that feature still seems somewhat limited.

Once satisfied, you can ask Codex to create a GitHub PR (too bad only GitHub is supported and no other Git hosts); from there, you can merge it and then pull it to your local laptop or workstation to test the changes again. I found myself looping a lot between the Codex UI, GitHub PRs, and local checkouts.

How it went



Task Samurai's codebase came together quickly: the entire Git history spans from June 19 to 22, 2025, culminating in 179 commits:

  • June 19: Scaffolded the Go boilerplate, set up tests, integrated the Bubble Tea UI framework, and got the first table views showing up.
  • June 20: (The big one—120 commits!) Added hotkeys, colourized tasks, annotation support, undo/redo, and, for fun, fireworks on quit (which never worked and got removed at a later point). This is where most of the bugs, merges, and fast-paced changes happened.
  • June 21: Refined searching, theming, and column sizing and documented all those hotkeys. Numerous tweaks to make the UI cleaner and more user-friendly.
  • June 22: Final touches—added screenshots, polished the logo, fixed module paths… and then it was a wrap.

Most big breakthroughs (and bug introductions) came during that middle day of intense iteration. The latter stages were all about smoothing out the rough edges.

It's worth noting that I worked on it in the evenings when I had some free time, as I also had to fit in my regular work and family commitments during the day. So, I didn't spend full working days on this project.

What went wrong



Going agentic isn't all smooth. Here are the hiccups I ran into, plus a few lessons:

  • Merge Floods: Every minor feature or fix existed on its own branch, so merging was a constant process. It kept progress flowing but also drowned the commit history in noise and the occasional conflict. I found this to be an issue with OpenAI's Codex in particular, not so much with other agentic coding tools like Claude Code CLI (not covered in this blog post).
  • Fixes on fixes: Features like "fireworks on exit" had chains of "fix exit," "fix cell selection," etc. Sometimes, new additions introduced bugs that needed rapid patching.

Patterns that helped



Despite the chaos, a few strategies kept things moving:

  • Scaffolding First: I started with the basic table UI and command wrappers, then layered on features—never the other way around.
  • Tiny PRs: Small, atomic merges meant feedback came fast (and so did fixes).
  • Tests Matter: A solid base of unit tests for task manipulations kept things from breaking entirely when experimenting.
  • Live Documentation: Documentation, such as the README, was updated regularly to reflect all the hotkey and feature changes.

Maybe a better approach would have been to design the whole application up front before letting Codex do any of the coding. I will try that with my next toy project.

What I learned using agentic coding



Stepping into agentic coding with Codex as my "pair programmer" was a big shift. I learned a lot—not just about automating code generation, but also about how you have to tightly steer, guide, and audit every line as things move at high speed. I must admit, I sometimes lost track of what all the generated code was actually doing. But as the features seemed to work after a few iterations, I was satisfied—which is a bit concerning. Imagine if I approved a PR for a production-grade deployment without fully understanding what it was doing (and not a toy project like in this post).

How much time did I save?



Did it buy me speed?

  • Say each commit takes about 2 minutes of active review and guidance (Codex generates the code in the background), and you need to review/guide 179 commits = about _6 hours of active development_.
  • If you coded it all yourself, including all the bug fixes, features, design, and documentation, you might spend _10–20 hours_.
  • That's a couple of days of potential savings—and I am by no means an expert in agentic coding, since this was my first completed agentic coding project.

Conclusion



Building Task Samurai with agentic coding was a wild ride—rapid feature growth, countless fast fixes, and more merge commits than I'd expected. Keep the iterations short (or maybe, in my next experiment, much larger, with a better and more complete design before generating a single line of code), keep tests and documentation concise, and review and refine for final polish at the end. Even with the bumps along the way, shipping a terminal UI in days instead of weeks is a neat little showcase of vibe coding.

Am I an agentic coding expert now? I don't think so. There are still many things to learn, and the landscape is constantly evolving.

While working on Task Samurai, there were times I missed manual coding and the satisfaction that comes from writing every line yourself, debugging issues manually, and crafting solutions from scratch. However, this is the direction in which the industry seems to be shifting, unfortunately. If applied correctly, AI will boost performance, and if you don't use AI, your next performance review may be awkward.

Personally, I am not sure whether I like where the industry is going with agentic coding. I love "traditional" coding, and with agentic coding you operate at a higher level and don't interact directly with code as often, which I would miss. I think that in the future, designing, reviewing, and being able to read and understand code will be more important than writing code by hand.

Do you have any thoughts on that? I hope I am at least partially wrong.

E-Mail your comments to paul@nospam.buetow.org :-)

Other related posts are:

2025-08-05 Local LLM for Coding with Ollama on macOS
2025-06-22 Task Samurai: An agentic coding learning experiment (You are currently reading this)

Back to the main site
'A Monk's Guide to Happiness' book notes https://foo.zone/gemfeed/2025-06-07-a-monks-guide-to-happiness-book-notes.html 2025-06-07T10:30:11+03:00 Paul Buetow aka snonux paul@dev.buetow.org These are my personal book notes from Gelong Thubten's 'A Monk's Guide to Happiness: Meditation in the 21st century.' They are for my own reference, but I hope they might be useful to you as well.

"A Monk's Guide to Happiness" book notes



Published at 2025-06-07T10:30:11+03:00

These are my personal book notes from Gelong Thubten's "A Monk's Guide to Happiness: Meditation in the 21st century." They are for my own reference, but I hope they might be useful to you as well.

Table of Contents




Understanding Happiness



  • Happiness is a skill we can train.
  • Happiness is not about accomplishing goals, as those lie in the future.
  • Feel free now, with no urge about the past or the future.
  • We can learn to produce our own happiness independently of physical needs. When we walk in a park, how do we feel? We can train to reproduce that feeling independently.

The Role of Meditation



  • Meditation is not about clearing your mind. A busy mind does not interfere with your meditation.
  • Our problem is that we need to detect that awareness. Meditation connects us with awareness. Awareness is freedom.
  • We can let the mind be and not care about the thoughts. This will have benefits for your life and protect you from all kinds of stress.
  • It's better to meditate with open eyes so you don't associate it with the dark. You will also be able to enter a meditative state of mind outside of the meditation session.
  • Set a baseline meditation time to build up discipline.
  • We don't need to do anything about stress, just take a step back.

Managing Thoughts and Emotions



  • Our flow of emotions is really just habits. That can be changed through training, e.g., meditation training.
  • A part of the mind recognises that we are sad or angry. That part is not sad or angry by itself, obviously. So we can escape to that part of the mind, be the observer, and not be drawn into the constant flow of emotions and thoughts.
  • Leave the front and back doors of your house open, and let the thoughts come in and leave; just don't serve them tea. A great Zen master once said this.
  • Thoughts are friends and not enemies.
  • Thoughts help the meditation, as they make us notice that we have wandered off, and thereby we strengthen the reflection.

Practice and Discipline



  • The importance of habits to practice mindfulness. Bring mindfulness into the daily practice.
  • Integrating short moments of mindfulness during the day is the fast track to happiness. Start off with small tasks, e.g. while washing your hands.
  • Have many small doses of mindfulness and don't prolong as otherwise, your mind will revolt.
  • Have a small moment of mindfulness when you wake up and go to sleep.
  • Practice staying fully present in an uncomfortable situation and without judgement.
  • Don't become two persons who never meet: the meditator and the non-meditator. So integrate mindfulness during the day too.

Perspectives on Relationships and Interactions



  • Who is the opponent? The other person, the things they said, or our reactions to those things? Forgiveness is a high form of compassion.
  • Understand the suffering of the person who "hurt" us. Where is the aggressor really coming from?
  • People who are stressed or unhappy do and say things they wouldn't have said or done otherwise. Acting under anger is like being under the influence of alcohol.
  • People don't have a master plan to destroy others, even if it seems so. They are under a strong negative influence themselves; something terrible happened to them. Revenge makes no sense.
  • Be grateful for people "trying" to hurt you as they help you to practice your path.

Reflective Questions



  • Why do I do all the things I do? What do I try to achieve?
  • What am I doing about that?
  • Is it working?
  • What are the real causes of happiness and suffering?
  • What about meditation? How does that address the situation?

Miscellaneous Guidelines



  • Posture is important as the mind and body are connected.
  • Don't use music, so you don't come to rely on it to change your state of mind. The same goes for guided meditation: it is good for learning a technique, but you should not rely on another voice.
  • You are not trying to relax. Relaxing and trying are two different things.
  • When you love everything, even the bad things happening to you, then you are invincible.
  • Happiness is all in your mind. As if you flip a switch there.
  • Digging for answers will never end. It will always turn up more material to dig through.

If happiness is a matter of the mind, then clearly your free time is best spent training your mind rather than always being busy with other things: meditating, for example, or reflecting on the benefits of meditation. All that we do in our free time is search for happiness. Are the things we do actually working? There is always something around the corner...

E-Mail your comments to paul@nospam.buetow.org :-)

Other book notes of mine are:

2025-11-02 'The Courage To Be Disliked' book notes
2025-06-07 'A Monk's Guide to Happiness' book notes (You are currently reading this)
2025-04-19 'When: The Scientific Secrets of Perfect Timing' book notes
2024-10-24 'Staff Engineer' book notes
2024-07-07 'The Stoic Challenge' book notes
2024-05-01 'Slow Productivity' book notes
2023-11-11 'Mind Management' book notes
2023-07-17 'Software Developers Career Guide and Soft Skills' book notes
2023-05-06 'The Obstacle is the Way' book notes
2023-04-01 'Never split the difference' book notes
2023-03-16 'The Pragmatic Programmer' book notes

Back to the main site
f3s: Kubernetes with FreeBSD - Part 5: WireGuard mesh network https://foo.zone/gemfeed/2025-05-11-f3s-kubernetes-with-freebsd-part-5.html 2025-05-11T11:35:57+03:00, last updated Thu 15 Jan 19:30:46 EET 2026 Paul Buetow aka snonux paul@dev.buetow.org This is the fifth blog post about my f3s series for my self-hosting demands in my home lab. f3s? The 'f' stands for FreeBSD, and the '3s' stands for k3s, the Kubernetes distribution I will use on FreeBSD-based physical machines.

f3s: Kubernetes with FreeBSD - Part 5: WireGuard mesh network



Published at 2025-05-11T11:35:57+03:00, last updated Thu 15 Jan 19:30:46 EET 2026

This is the fifth blog post about my f3s series for my self-hosting demands in my home lab. f3s? The "f" stands for FreeBSD, and the "3s" stands for k3s, the Kubernetes distribution I will use on FreeBSD-based physical machines.

I will post a new entry every month or so (there are too many other side projects for more frequent updates — I bet you can understand).

This post has been updated to include two roaming clients (earth - Fedora laptop, pixel7pro - Android phone) that connect to the mesh via the internet gateways. The updated content is integrated throughout the post.

These are all the posts so far:

2024-11-17 f3s: Kubernetes with FreeBSD - Part 1: Setting the stage
2024-12-03 f3s: Kubernetes with FreeBSD - Part 2: Hardware and base installation
2025-02-01 f3s: Kubernetes with FreeBSD - Part 3: Protecting from power cuts
2025-04-05 f3s: Kubernetes with FreeBSD - Part 4: Rocky Linux Bhyve VMs
2025-05-11 f3s: Kubernetes with FreeBSD - Part 5: WireGuard mesh network (You are currently reading this)
2025-07-14 f3s: Kubernetes with FreeBSD - Part 6: Storage
2025-10-02 f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments
2025-12-07 f3s: Kubernetes with FreeBSD - Part 8: Observability

f3s logo

ChatGPT generated logo.

Let's begin...

Table of Contents




Introduction



By default, traffic within my home LAN, including traffic inside a k3s cluster, is not encrypted. While it resides in the "secure" home LAN, adopting a zero-trust policy means encryption is still preferable to ensure confidentiality and security. So we secure all traffic between the f3s-participating hosts by building a mesh network:

WireGuard mesh network topology

The mesh network consists of eight infrastructure hosts and two roaming clients:

Infrastructure hosts (full mesh):

  • f0, f1, and f2 are the FreeBSD base hosts in my home LAN
  • r0, r1, and r2 are the Rocky Linux Bhyve VMs running on the FreeBSD hosts
  • blowfish and fishfinger are two OpenBSD systems running on the internet (as mentioned in the first blog of this series—these systems are already built; in fact, this very blog is served by those OpenBSD systems)

Roaming clients (gateway-only connections):

  • earth is my Fedora laptop (192.168.2.200) which connects only to the internet gateways for remote access
  • pixel7pro is my Android phone (192.168.2.201) which routes all traffic through the VPN when activated

As we can see from the diagram, the eight infrastructure hosts form a true full-mesh network, where every host has a VPN tunnel to every other host. The benefit is that we do not need to route traffic through intermediate hosts (significantly simplifying the routing configuration). However, the downside is that there is some overhead in configuring and managing all the tunnels. The roaming clients take a simpler approach—they only connect to the two internet-facing gateways (blowfish and fishfinger), which is sufficient for remote access and internet connectivity.

For simplicity, we also establish VPN tunnels between f0 <-> r0, f1 <-> r1, and f2 <-> r2. Technically, this wouldn't be strictly required since the VMs rN are running on the hosts fN, and no network traffic is leaving the box. However, it simplifies the configuration as we don't have to account for exceptions, and we are going to automate the mesh network configuration anyway (read on).
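To quantify that configuration overhead: a full mesh of n hosts needs n*(n-1)/2 tunnels, so the eight infrastructure hosts amount to 28 peer configurations (a quick sanity check, counting only the mesh hosts, not the roaming clients):

```shell
n=8                          # infrastructure hosts in the full mesh
echo $(( n * (n - 1) / 2 ))  # unique host pairs, i.e. WireGuard tunnels
```

This prints 28, which is why automating the mesh configuration pays off.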

Expected traffic flow



The traffic is expected to flow between the host groups through the mesh network as follows:

Infrastructure mesh traffic:

  • fN <-> rN: The traffic between the FreeBSD hosts and the Rocky Linux VMs will be routed through the VPN tunnels for persistent storage. In a later post in this series, we will set up an NFS server on the fN hosts.
  • fN <-> blowfish,fishfinger: The traffic between the FreeBSD hosts and the OpenBSD hosts blowfish and fishfinger will be routed through the VPN tunnels for management. We may want to log in via the internet to set things up remotely. The VPN tunnel will also be used for monitoring purposes.
  • rN <-> blowfish,fishfinger: The traffic between the Rocky Linux VMs and the OpenBSD hosts blowfish and fishfinger will be routed through the VPN tunnels for usage traffic. Since k3s will be running on the rN hosts, the OpenBSD servers will route the traffic through relayd to the services running in Kubernetes.
  • fN <-> fM: The traffic between the FreeBSD hosts may be later used for data replication for the NFS storage.
  • rN <-> rM: The traffic between the Rocky Linux VMs will later be used by the k3s cluster itself, as every rN will be a Kubernetes worker node.
  • blowfish <-> fishfinger: The traffic between the OpenBSD hosts isn't strictly required for this setup, but I set it up anyway for future use cases.

Roaming client traffic:

  • earth,pixel7pro <-> blowfish,fishfinger: The roaming clients connect exclusively to the two internet gateways. All traffic from these clients (0.0.0.0/0) is routed through the VPN, providing secure internet access and the ability to reach services running in the mesh (via the gateways). The gateways use NAT to allow roaming clients to access the internet using the gateway's public IP address. The roaming clients cannot be reached by the LAN hosts—they are client-only and initiate all connections.
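For illustration, a roaming client's configuration might look roughly like this; the keys, endpoint, and listening port below are placeholders and assumptions, not the real values (written as a heredoc in the same style as the other examples):

```shell
# Hypothetical sketch of earth's wg0.conf (placeholder keys/endpoint/port).
# A second [Peer] block for fishfinger would follow the same pattern.
cat <<'END' > /tmp/wg0.conf.example
[Interface]
PrivateKey = <earth-private-key>
Address = 192.168.2.200/32, fd42:beef:cafe:2::200/128

[Peer]
# blowfish, one of the two internet gateways
PublicKey = <blowfish-public-key>
Endpoint = <blowfish-public-ip>:51820
AllowedIPs = 0.0.0.0/0, ::/0
PersistentKeepalive = 25
END
grep AllowedIPs /tmp/wg0.conf.example
```

The `AllowedIPs = 0.0.0.0/0, ::/0` line is what routes all of the client's traffic through the VPN, and the keepalive keeps the NAT mapping alive for the client-initiated connection.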

We won't cover all of those details here, as this post focuses only on setting up the mesh network. Subsequent posts in this series will cover the rest.

Deciding on WireGuard



I have decided to use WireGuard as the VPN technology for this purpose.

WireGuard is a lightweight, modern, and secure VPN protocol designed for simplicity, speed, and strong cryptography. It is an excellent choice due to its minimal codebase, ease of configuration, high performance, and robust security, utilizing state-of-the-art encryption standards. WireGuard is supported on various operating systems, and its implementations are compatible with each other. Therefore, establishing WireGuard VPN tunnels between FreeBSD, Linux, and OpenBSD is seamless. This cross-platform availability makes it suitable for setups like the one described in this blog series.

We could have used Tailscale to easily set up and manage the WireGuard network, but the benefits of creating our own mesh network are:

  • Learning about WireGuard configuration details
  • Having full control over the setup
  • Not relying on an external provider like Tailscale (even if some of its components are open source)
  • Having even more fun along the way
  • WireGuard is easy to configure on my target operating systems and, therefore, easier to maintain in the long run.
  • There are no official Tailscale packages available for OpenBSD and FreeBSD. However, getting Tailscale running on these systems is still possible, though some tinkering would be required. Instead, we use that tinkering time to set up WireGuard tunnels ourselves.

https://en.wikipedia.org/wiki/WireGuard
https://www.wireguard.com/
https://tailscale.com/

WireGuard Logo

Base configuration



In the following, we prepare the base configuration for the WireGuard mesh network. We will use a similar configuration on all participating hosts, with the exception of the host IP addresses and the private keys.

FreeBSD



On the FreeBSD hosts f0, f1 and f2, similar to last time, we first bring the system up to date:

paul@f0:~ % doas freebsd-update fetch
paul@f0:~ % doas freebsd-update install
paul@f0:~ % doas shutdown -r now
..
..
paul@f0:~ % doas pkg update
paul@f0:~ % doas pkg upgrade
paul@f0:~ % reboot

Next, we install wireguard-tools and configure the WireGuard service:

paul@f0:~ % doas pkg install wireguard-tools
paul@f0:~ % doas sysrc wireguard_interfaces=wg0
wireguard_interfaces:  -> wg0
paul@f0:~ % doas sysrc wireguard_enable=YES
wireguard_enable:  -> YES
paul@f0:~ % doas mkdir -p /usr/local/etc/wireguard
paul@f0:~ % doas touch /usr/local/etc/wireguard/wg0.conf
paul@f0:~ % doas service wireguard start
paul@f0:~ % doas wg show
interface: wg0
  public key: L+V9o0fNYkMVKNqsX7spBzD/9oSvxM/C7ZCZX1jLO3Q=
  private key: (hidden)
  listening port: 20246

We now have WireGuard up and running, but it is not yet in any functional configuration. We will come back to that later.

Next, we add all the participating WireGuard IPs to the hosts file. This is only for convenience, so we don't have to manage an external DNS server for this:

paul@f0:~ % cat <<END | doas tee -a /etc/hosts

192.168.1.120 r0 r0.lan r0.lan.buetow.org
192.168.1.121 r1 r1.lan r1.lan.buetow.org
192.168.1.122 r2 r2.lan r2.lan.buetow.org

192.168.2.130 f0.wg0 f0.wg0.wan.buetow.org
192.168.2.131 f1.wg0 f1.wg0.wan.buetow.org
192.168.2.132 f2.wg0 f2.wg0.wan.buetow.org

192.168.2.120 r0.wg0 r0.wg0.wan.buetow.org
192.168.2.121 r1.wg0 r1.wg0.wan.buetow.org
192.168.2.122 r2.wg0 r2.wg0.wan.buetow.org

192.168.2.110 blowfish.wg0 blowfish.wg0.wan.buetow.org
192.168.2.111 fishfinger.wg0 fishfinger.wg0.wan.buetow.org

fd42:beef:cafe:2::130 f0.wg0 f0.wg0.wan.buetow.org
fd42:beef:cafe:2::131 f1.wg0 f1.wg0.wan.buetow.org
fd42:beef:cafe:2::132 f2.wg0 f2.wg0.wan.buetow.org

fd42:beef:cafe:2::120 r0.wg0 r0.wg0.wan.buetow.org
fd42:beef:cafe:2::121 r1.wg0 r1.wg0.wan.buetow.org
fd42:beef:cafe:2::122 r2.wg0 r2.wg0.wan.buetow.org

fd42:beef:cafe:2::110 blowfish.wg0 blowfish.wg0.wan.buetow.org
fd42:beef:cafe:2::111 fishfinger.wg0 fishfinger.wg0.wan.buetow.org
END

As you can see, 192.168.1.0/24 is the network used in my LAN (with the fN and rN hosts) and 192.168.2.0/24 is the network used for the WireGuard mesh network. The wg0 interface will be used for all WireGuard traffic.

Rocky Linux



We bring the Rocky Linux VMs up to date as well with the following:

[root@r0 ~] dnf update -y
[root@r0 ~] reboot

Next, we prepare WireGuard on them. Same as on the FreeBSD hosts, we will only prepare WireGuard without any useful configuration yet:

[root@r0 ~] dnf install -y wireguard-tools
[root@r0 ~] mkdir -p /etc/wireguard
[root@r0 ~] touch /etc/wireguard/wg0.conf
[root@r0 ~] systemctl enable wg-quick@wg0.service
[root@r0 ~] systemctl start wg-quick@wg0.service
[root@r0 ~] systemctl disable firewalld

We also update the hosts file accordingly:

[root@r0 ~] cat <<END >>/etc/hosts

192.168.1.130 f0 f0.lan f0.lan.buetow.org
192.168.1.131 f1 f1.lan f1.lan.buetow.org
192.168.1.132 f2 f2.lan f2.lan.buetow.org

192.168.2.130 f0.wg0 f0.wg0.wan.buetow.org
192.168.2.131 f1.wg0 f1.wg0.wan.buetow.org
192.168.2.132 f2.wg0 f2.wg0.wan.buetow.org

192.168.2.120 r0.wg0 r0.wg0.wan.buetow.org
192.168.2.121 r1.wg0 r1.wg0.wan.buetow.org
192.168.2.122 r2.wg0 r2.wg0.wan.buetow.org

192.168.2.110 blowfish.wg0 blowfish.wg0.wan.buetow.org
192.168.2.111 fishfinger.wg0 fishfinger.wg0.wan.buetow.org

fd42:beef:cafe:2::130 f0.wg0 f0.wg0.wan.buetow.org
fd42:beef:cafe:2::131 f1.wg0 f1.wg0.wan.buetow.org
fd42:beef:cafe:2::132 f2.wg0 f2.wg0.wan.buetow.org

fd42:beef:cafe:2::120 r0.wg0 r0.wg0.wan.buetow.org
fd42:beef:cafe:2::121 r1.wg0 r1.wg0.wan.buetow.org
fd42:beef:cafe:2::122 r2.wg0 r2.wg0.wan.buetow.org

fd42:beef:cafe:2::110 blowfish.wg0 blowfish.wg0.wan.buetow.org
fd42:beef:cafe:2::111 fishfinger.wg0 fishfinger.wg0.wan.buetow.org
END

Unfortunately, the SELinux policy on Rocky Linux blocks WireGuard's operation. By making the wireguard_t domain permissive using semanage permissive -a wireguard_t, SELinux will no longer enforce restrictions for WireGuard, allowing it to work as intended:

[root@r0 ~] dnf install -y policycoreutils-python-utils
[root@r0 ~] semanage permissive -a wireguard_t
[root@r0 ~] reboot

https://github.com/angristan/wireguard-install/discussions/499

OpenBSD



Unlike the FreeBSD and Rocky Linux hosts, my OpenBSD hosts (blowfish and fishfinger, which run at OpenBSD Amsterdam and Hetzner on the internet) have already been up for much longer, so I can't provide the "from scratch" installation details here. In the following, we only focus on the additional configuration needed to set up WireGuard:

blowfish$ doas pkg_add wireguard-tools
blowfish$ doas mkdir /etc/wireguard
blowfish$ doas touch /etc/wireguard/wg0.conf
blowfish$ cat <<END | doas tee /etc/hostname.wg0
inet 192.168.2.110 255.255.255.0 NONE
up
!/usr/local/bin/wg setconf wg0 /etc/wireguard/wg0.conf
END

Note that on blowfish, we configure 192.168.2.110 in hostname.wg0, and on fishfinger, we configure 192.168.2.111. Those are the IP addresses of the WireGuard interfaces on those hosts.

And here, we also update the hosts file accordingly:

blowfish$ cat <<END | doas tee -a /etc/hosts

192.168.2.130 f0.wg0 f0.wg0.wan.buetow.org
192.168.2.131 f1.wg0 f1.wg0.wan.buetow.org
192.168.2.132 f2.wg0 f2.wg0.wan.buetow.org

192.168.2.120 r0.wg0 r0.wg0.wan.buetow.org
192.168.2.121 r1.wg0 r1.wg0.wan.buetow.org
192.168.2.122 r2.wg0 r2.wg0.wan.buetow.org

192.168.2.110 blowfish.wg0 blowfish.wg0.wan.buetow.org
192.168.2.111 fishfinger.wg0 fishfinger.wg0.wan.buetow.org
192.168.2.200 earth.wg0 earth.wg0.wan.buetow.org
192.168.2.201 pixel7pro.wg0 pixel7pro.wg0.wan.buetow.org

fd42:beef:cafe:2::130 f0.wg0 f0.wg0.wan.buetow.org
fd42:beef:cafe:2::131 f1.wg0 f1.wg0.wan.buetow.org
fd42:beef:cafe:2::132 f2.wg0 f2.wg0.wan.buetow.org

fd42:beef:cafe:2::120 r0.wg0 r0.wg0.wan.buetow.org
fd42:beef:cafe:2::121 r1.wg0 r1.wg0.wan.buetow.org
fd42:beef:cafe:2::122 r2.wg0 r2.wg0.wan.buetow.org

fd42:beef:cafe:2::110 blowfish.wg0 blowfish.wg0.wan.buetow.org
fd42:beef:cafe:2::111 fishfinger.wg0 fishfinger.wg0.wan.buetow.org
fd42:beef:cafe:2::200 earth.wg0 earth.wg0.wan.buetow.org
fd42:beef:cafe:2::201 pixel7pro.wg0 pixel7pro.wg0.wan.buetow.org
END

To enable roaming clients (like earth and pixel7pro) to access the internet through the VPN, we need to configure NAT on the OpenBSD gateways. This allows the roaming clients to use the gateway's public IP address for outbound traffic. We add the following to /etc/pf.conf on both blowfish and fishfinger:

# NAT for WireGuard clients to access internet
match out on vio0 from 192.168.2.0/24 to any nat-to (vio0)

# Allow inbound traffic on WireGuard interface
pass in on wg0

# Allow all UDP traffic on WireGuard port
pass in inet proto udp from any to any port 56709

The NAT rule translates outgoing traffic from the WireGuard network (192.168.2.0/24) to the gateway's public IP. The firewall rules permit WireGuard traffic on the wg0 interface and UDP port 56709. After updating /etc/pf.conf, reload the firewall:

blowfish$ doas pfctl -f /etc/pf.conf

WireGuard configuration



So far, we have only started WireGuard on all participating hosts without any useful configuration. This means that no VPN tunnel has been established yet between any of the hosts.

Example wg0.conf



Generally speaking, a wg0.conf looks like this (example from f0 host):

[Interface]
# f0.wg0.wan.buetow.org
Address = 192.168.2.130
PrivateKey = **************************
ListenPort = 56709

[Peer]
# f1.lan.buetow.org as f1.wg0.wan.buetow.org
PublicKey = **************************
PresharedKey = **************************
AllowedIPs = 192.168.2.131/32
Endpoint = 192.168.1.131:56709
# No KeepAlive configured

[Peer]
# f2.lan.buetow.org as f2.wg0.wan.buetow.org
PublicKey = **************************
PresharedKey = **************************
AllowedIPs = 192.168.2.132/32
Endpoint = 192.168.1.132:56709
# No KeepAlive configured

[Peer]
# r0.lan.buetow.org as r0.wg0.wan.buetow.org
PublicKey = **************************
PresharedKey = **************************
AllowedIPs = 192.168.2.120/32
Endpoint = 192.168.1.120:56709
# No KeepAlive configured

[Peer]
# r1.lan.buetow.org as r1.wg0.wan.buetow.org
PublicKey = **************************
PresharedKey = **************************
AllowedIPs = 192.168.2.121/32
Endpoint = 192.168.1.121:56709
# No KeepAlive configured

[Peer]
# r2.lan.buetow.org as r2.wg0.wan.buetow.org
PublicKey = **************************
PresharedKey = **************************
AllowedIPs = 192.168.2.122/32
Endpoint = 192.168.1.122:56709
# No KeepAlive configured

[Peer]
# blowfish.buetow.org as blowfish.wg0.wan.buetow.org
PublicKey = **************************
PresharedKey = **************************
AllowedIPs = 192.168.2.110/32
Endpoint = 23.88.35.144:56709
PersistentKeepalive = 25

[Peer]
# fishfinger.buetow.org as fishfinger.wg0.wan.buetow.org
PublicKey = **************************
PresharedKey = **************************
AllowedIPs = 192.168.2.111/32
Endpoint = 46.23.94.99:56709
PersistentKeepalive = 25

For roaming clients like pixel7pro (Android phone) or earth (Fedora laptop), the configuration looks different because they route all traffic through the VPN and only connect to the internet gateways:

[Interface]
# pixel7pro.wg0.wan.buetow.org
Address = 192.168.2.201
PrivateKey = **************************
ListenPort = 56709
DNS = 1.1.1.1, 8.8.8.8

[Peer]
# blowfish.buetow.org as blowfish.wg0.wan.buetow.org
PublicKey = **************************
PresharedKey = **************************
AllowedIPs = 0.0.0.0/0, ::/0
Endpoint = 23.88.35.144:56709
PersistentKeepalive = 25

[Peer]
# fishfinger.buetow.org as fishfinger.wg0.wan.buetow.org
PublicKey = **************************
PresharedKey = **************************
AllowedIPs = 0.0.0.0/0, ::/0
Endpoint = 46.23.94.99:56709
PersistentKeepalive = 25

Note the key differences for roaming clients:
  • DNS is configured to use external DNS servers (Cloudflare and Google)
  • AllowedIPs = 0.0.0.0/0, ::/0 routes all traffic (IPv4 and IPv6) through the VPN
  • Only two peers are configured (the internet gateways), not the full mesh
  • PersistentKeepalive = 25 is used for both peers to maintain NAT traversal

A wg0.conf consists of two main section types. The first is [Interface], which configures the current host (here: f0 or pixel7pro):

  • Address: Local virtual IP address on the WireGuard interface.
  • PrivateKey: Private key for this node.
  • ListenPort: Port on which this WireGuard interface listens for incoming connections.

And in the following, there is one [Peer] section for every peer node on the mesh network:

  • PublicKey: The public key of the remote peer, used to authenticate its identity.
  • PresharedKey: An optional symmetric key that enhances security (used in addition to the PublicKey).
  • AllowedIPs: IPs or subnets routed through this peer (traffic is allowed to/from these IPs).
  • Endpoint: The public IP:port combination of the remote peer for connection.
  • PersistentKeepalive: Keeps the tunnel alive by sending periodic packets; used for NAT traversal.

NAT traversal and keepalive



As all participating hosts except for blowfish and fishfinger (which are on the internet) are behind a NAT gateway (my home router), we need PersistentKeepalive to establish and maintain the VPN tunnel from the LAN to the internet. The official WireGuard documentation explains why:

By default, WireGuard tries to be as silent as possible when not being used; it is not a chatty protocol. For the most part, it only transmits data when a peer wishes to send packets. When it's not being asked to send packets, it stops sending packets until it is asked again. In the majority of configurations, this works well. However, when a peer is behind NAT or a firewall, it might wish to be able to receive incoming packets even when it is not sending any packets. Because NAT and stateful firewalls keep track of "connections", if a peer behind NAT or a firewall wishes to receive incoming packets, he must keep the NAT/firewall mapping valid, by periodically sending keepalive packets. This is called persistent keepalives. When this option is enabled, a keepalive packet is sent to the server endpoint once every interval seconds. A sensible interval that works with a wide variety of firewalls is 25 seconds. Setting it to 0 turns the feature off, which is the default, since most users will not need this, and it makes WireGuard slightly more chatty. This feature may be specified by adding the PersistentKeepalive = field to a peer in the configuration file, or setting persistent-keepalive at the command line. If you don't need this feature, don't enable it. But if you're behind NAT or a firewall and you want to receive incoming connections long after network traffic has gone silent, this option will keep the "connection" open in the eyes of NAT.

That's why you see PersistentKeepalive = 25 in the blowfish and fishfinger peer configurations. This means that every 25 seconds, a keep-alive packet is sent over the tunnel to maintain its connection. If the tunnel is not yet established, it will be created within 25 seconds at the latest.

Without this, we might never have a VPN tunnel open, as the systems in the LAN may not actively attempt to contact blowfish and fishfinger on their own. In fact, the opposite would likely occur, with the traffic flowing inward instead of outward (this is beyond the scope of this blog post but will be covered in a later post in this series!).

Preshared key



In a WireGuard configuration, the PSK (preshared key) is an optional additional layer of symmetric encryption used alongside the standard public key cryptography. It is a shared secret known to both peers, and it enhances security by requiring an attacker to compromise both the private keys and the PSK to decrypt communication. While optional, a PSK strengthens the cryptographic security and mitigates the risk of potential vulnerabilities in the key exchange process.

So, because it's better, we are using it.
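For illustration, here is how per-pair PSK handling can be sketched in Ruby. The psk_name helper and the SecureRandom stand-in are assumptions for this sketch; a real setup would call wg genpsk, which emits a base64-encoded 32-byte key:

```ruby
require 'securerandom'

# Illustration only: a real WireGuard PSK comes from `wg genpsk`;
# SecureRandom.base64(32) merely mimics its base64 32-byte format.
# One PSK is cached per unordered host pair, so the f0<->f1 tunnel
# and the f1<->f0 tunnel resolve to the same key.
def psk_name(a, b)
  [a, b].sort.join('_')  # canonical order: blowfish_f0, f0_f1, ...
end

psks = Hash.new { |cache, pair| cache[pair] = SecureRandom.base64(32) }

puts psk_name('f1', 'f0')                                      # f0_f1
puts psks[psk_name('f0', 'f1')] == psks[psk_name('f1', 'f0')]  # true
```

The canonical pair name is also why the key files later in this post are called things like keys/psk/f0_f1.key regardless of which side asked first.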

Mesh network generator



Manually generating wg0.conf files for every peer in a mesh network setup is cumbersome because each peer requires its own unique public/private key pair plus a preshared key for each VPN tunnel (28 preshared keys for the full mesh of 8 hosts alone). This complexity scales quadratically with the number of peers, as the relationships between all peers must be explicitly defined, including their unique configurations such as AllowedIPs and Endpoint and optional settings like PersistentKeepalive. Automating the process ensures consistency, reduces human error, saves considerable time, and allows for centralized management of configuration files.

Instead, a script can handle key generation, coordinate relationships, and generate all necessary configuration files simultaneously, making it scalable and far less error-prone.
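How quickly this grows can be checked with a few lines of Ruby (a counting sketch only; mesh_tunnels is not part of the generator):

```ruby
# A full mesh needs one tunnel (and one preshared key) per unordered
# pair of hosts: n * (n - 1) / 2.
def mesh_tunnels(n_hosts)
  n_hosts * (n_hosts - 1) / 2
end

infra   = mesh_tunnels(8)  # 8 infrastructure hosts in the full mesh
roaming = 2 * 2            # 2 roaming clients, each peering with 2 gateways

puts infra            # 28
puts infra + roaming  # 32
```

Doubling the number of full-mesh hosts roughly quadruples the number of tunnels to manage, which is exactly the kind of bookkeeping worth automating.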

I have written a Ruby script wireguardmeshgenerator.rb to do this for our purposes:

https://codeberg.org/snonux/wireguardmeshgenerator

I use Fedora Linux as my daily driver on my personal laptop, so the script was developed and tested only on Fedora Linux. However, it should also work on other Linux and Unix-like systems.

To set up the mesh generator on Fedora Linux, we run the following:

> git clone https://codeberg.org/snonux/wireguardmeshgenerator
> cd ./wireguardmeshgenerator
> bundle install
> sudo dnf install -y wireguard-tools

This assumes that Ruby and the bundler gem are already installed. If not, refer to the docs of your distribution.

wireguardmeshgenerator.yaml



The file wireguardmeshgenerator.yaml configures the mesh generator script.

---
hosts:
  f0:
    os: FreeBSD
    ssh:
      user: paul
      conf_dir: /usr/local/etc/wireguard
      sudo_cmd: doas
      reload_cmd: service wireguard reload
    lan:
      domain: 'lan.buetow.org'
      ip: '192.168.1.130'
    wg0:
      domain: 'wg0.wan.buetow.org'
      ip: '192.168.2.130'
      ipv6: 'fd42:beef:cafe:2::130'
    exclude_peers:
      - earth
      - pixel7pro
  f1:
    os: FreeBSD
    ssh:
      user: paul
      conf_dir: /usr/local/etc/wireguard
      sudo_cmd: doas
      reload_cmd: service wireguard reload
    lan:
      domain: 'lan.buetow.org'
      ip: '192.168.1.131'
    wg0:
      domain: 'wg0.wan.buetow.org'
      ip: '192.168.2.131'
      ipv6: 'fd42:beef:cafe:2::131'
    exclude_peers:
      - earth
      - pixel7pro
  f2:
    os: FreeBSD
    ssh:
      user: paul
      conf_dir: /usr/local/etc/wireguard
      sudo_cmd: doas
      reload_cmd: service wireguard reload
    lan:
      domain: 'lan.buetow.org'
      ip: '192.168.1.132'
    wg0:
      domain: 'wg0.wan.buetow.org'
      ip: '192.168.2.132'
      ipv6: 'fd42:beef:cafe:2::132'
    exclude_peers:
      - earth
      - pixel7pro
  r0:
    os: Linux
    ssh:
      user: root
      conf_dir: /etc/wireguard
      sudo_cmd:
      reload_cmd: systemctl reload wg-quick@wg0.service
    lan:
      domain: 'lan.buetow.org'
      ip: '192.168.1.120'
    wg0:
      domain: 'wg0.wan.buetow.org'
      ip: '192.168.2.120'
      ipv6: 'fd42:beef:cafe:2::120'
    exclude_peers:
      - earth
      - pixel7pro
  r1:
    os: Linux
    ssh:
      user: root
      conf_dir: /etc/wireguard
      sudo_cmd:
      reload_cmd: systemctl reload wg-quick@wg0.service
    lan:
      domain: 'lan.buetow.org'
      ip: '192.168.1.121'
    wg0:
      domain: 'wg0.wan.buetow.org'
      ip: '192.168.2.121'
      ipv6: 'fd42:beef:cafe:2::121'
    exclude_peers:
      - earth
      - pixel7pro
  r2:
    os: Linux
    ssh:
      user: root
      conf_dir: /etc/wireguard
      sudo_cmd:
      reload_cmd: systemctl reload wg-quick@wg0.service
    lan:
      domain: 'lan.buetow.org'
      ip: '192.168.1.122'
    wg0:
      domain: 'wg0.wan.buetow.org'
      ip: '192.168.2.122'
      ipv6: 'fd42:beef:cafe:2::122'
    exclude_peers:
      - earth
      - pixel7pro
  blowfish:
    os: OpenBSD
    ssh:
      user: rex
      port: 2
      conf_dir: /etc/wireguard
      sudo_cmd: doas
      reload_cmd: sh /etc/netstart wg0
    internet:
      domain: 'buetow.org'
      ip: '23.88.35.144'
    wg0:
      domain: 'wg0.wan.buetow.org'
      ip: '192.168.2.110'
      ipv6: 'fd42:beef:cafe:2::110'
    exclude_peers:
      - earth
      - pixel7pro
  fishfinger:
    os: OpenBSD
    ssh:
      user: rex
      port: 2
      conf_dir: /etc/wireguard
      sudo_cmd: doas
      reload_cmd: sh /etc/netstart wg0
    internet:
      domain: 'buetow.org'
      ip: '46.23.94.99'
    wg0:
      domain: 'wg0.wan.buetow.org'
      ip: '192.168.2.111'
      ipv6: 'fd42:beef:cafe:2::111'
    exclude_peers:
      - earth
      - pixel7pro
  earth:
    os: Linux
    wg0:
      domain: 'wg0.wan.buetow.org'
      ip: '192.168.2.200'
      ipv6: 'fd42:beef:cafe:2::200'
    exclude_peers:
      - f0
      - f1
      - f2
      - r0
      - r1
      - r2
      - pixel7pro
  pixel7pro:
    os: Android
    wg0:
      domain: 'wg0.wan.buetow.org'
      ip: '192.168.2.201'
      ipv6: 'fd42:beef:cafe:2::201'
    exclude_peers:
      - f0
      - f1
      - f2
      - r0
      - r1
      - r2
      - earth

The file specifies details such as SSH user settings, configuration directories, sudo or reload commands, and IP/domain assignments for both internal LAN-facing interfaces and WireGuard (wg0) interfaces. Each host is assigned specific roles, including internal participants and publicly accessible nodes with internet-facing IPs, enabling the creation of a fully connected mesh VPN.

Roaming clients: Note the earth and pixel7pro entries: these are configured differently from the infrastructure hosts. They have no lan or internet sections, which signals to the generator that they are roaming clients. The exclude_peers configuration ensures they only connect to the internet gateways (blowfish and fishfinger) and are not reachable by LAN hosts. The generator automatically configures these clients with AllowedIPs = 0.0.0.0/0, ::/0 to route all traffic through the VPN, includes DNS configuration (1.1.1.1, 8.8.8.8), and enables PersistentKeepalive for NAT traversal.

wireguardmeshgenerator.rb overview



The wireguardmeshgenerator.rb script consists of the following base classes:

  • KeyTool: Manages WireGuard key generation and retrieval. It ensures the presence of public/private key pairs and preshared keys (PSKs). If keys are missing, it generates them using the wg tool. It provides methods to read the public/private keys and retrieve or generate a PSK for communication with a peer. The keys are stored in a temp directory on the system from where the generator is run.
  • PeerSnippet: A Struct representing the configuration for a single WireGuard peer in the mesh. Based on the provided attributes and configuration, it generates the peer's WireGuard configuration, including public key, PSK, allowed IPs, endpoint, and keepalive settings.
  • WireguardConfig: This class generates WireGuard configuration files for the specified host in the mesh network. It includes the [Interface] section for the host itself and the [Peer] sections for all other peers. It can also clean up generated files and directories and create the required directory structure for storing configuration files locally on the system from which the script is run.
  • InstallConfig: Handles uploading, installing, and restarting the WireGuard service on remote hosts using SSH and SCP. It ensures the configuration file is uploaded to the remote machine, the necessary directories are present and correctly configured, and the WireGuard service reloads with the new configuration.
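To give a feel for the Struct-based approach, here is a rough sketch of what a PeerSnippet-like class could look like. The field names and the rendering details are my assumptions for this sketch; the real implementation lives in the Git repo:

```ruby
# Rough sketch (assumed fields; see the repo for the real PeerSnippet):
# render one [Peer] section from a peer's attributes.
PeerSnippet = Struct.new(:name, :public_key, :psk, :allowed_ips,
                         :endpoint, :keepalive) do
  def to_conf
    lines = ['[Peer]',
             "# #{name}",
             "PublicKey = #{public_key}",
             "PresharedKey = #{psk}",
             "AllowedIPs = #{allowed_ips}"]
    # Endpoint and keepalive are optional, e.g. roaming peers have no
    # fixed endpoint, and keepalive is only needed behind NAT.
    lines << "Endpoint = #{endpoint}" if endpoint
    lines << "PersistentKeepalive = #{keepalive}" if keepalive
    lines.join("\n")
  end
end

snippet = PeerSnippet.new('blowfish.wg0.wan.buetow.org', 'PUB', 'PSK',
                          '192.168.2.110/32', '23.88.35.144:56709', 25)
puts snippet.to_conf
```

A full wg0.conf is then just the [Interface] section followed by one such snippet per peer.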

At the end, we glue it all together in this block (if you want to see the code for the classes listed above, have a look at the Git repo):

begin
  options = { hosts: [] }
  OptionParser.new do |opts|
    opts.banner = 'Usage: wireguardmeshgenerator.rb [options]'
    opts.on('--generate', 'Generate Wireguard configs') do
      options[:generate] = true
    end
    opts.on('--install', 'Install Wireguard configs') do
      options[:install] = true
    end
    opts.on('--clean', 'Clean Wireguard configs') do
      options[:clean] = true
    end
    opts.on('--hosts=HOSTS', 'Comma separated hosts to configure') do |hosts|
      options[:hosts] = hosts.split(',')
    end
  end.parse!

  conf = YAML.load_file('wireguardmeshgenerator.yaml').freeze
  conf['hosts'].keys.select { options[:hosts].empty? || options[:hosts].include?(_1) }
               .each do |host|
    # Generate Wireguard configuration for the host.
    WireguardConfig.new(host, conf['hosts']).generate! if options[:generate]
    # Install Wireguard configuration for the host.
    InstallConfig.new(host, conf['hosts']).upload!.install!.reload! if options[:install]
    # Clean Wireguard configuration for the host.
    WireguardConfig.new(host, conf['hosts']).clean! if options[:clean]
  end
rescue StandardError => e
  puts "Error: #{e.message}"
  puts e.backtrace.join("\n")
  exit 2
end

And we also have a Rakefile:

task :generate do
  ruby 'wireguardmeshgenerator.rb', '--generate'
end

task :clean do
  ruby 'wireguardmeshgenerator.rb', '--clean'
end

task :install do
  ruby 'wireguardmeshgenerator.rb', '--install'
end

task default: :generate


Invoking the mesh network generator



Generating the wg0.conf files and keys



To generate everything (the wg0.conf of all participating hosts, including all keys involved), we run the following:

> rake generate
/usr/bin/ruby wireguardmeshgenerator.rb --generate
Generating dist/f0/etc/wireguard/wg0.conf
Generating dist/f1/etc/wireguard/wg0.conf
Generating dist/f2/etc/wireguard/wg0.conf
Generating dist/r0/etc/wireguard/wg0.conf
Generating dist/r1/etc/wireguard/wg0.conf
Generating dist/r2/etc/wireguard/wg0.conf
Generating dist/blowfish/etc/wireguard/wg0.conf
Generating dist/fishfinger/etc/wireguard/wg0.conf
Generating dist/earth/etc/wireguard/wg0.conf
Generating dist/pixel7pro/etc/wireguard/wg0.conf

It generated all the wg0.conf files listed in the output, plus these keys:

> find keys/ -type f
keys/f0/priv.key
keys/f0/pub.key
keys/psk/f0_f1.key
keys/psk/f0_f2.key
keys/psk/f0_r0.key
keys/psk/f0_r1.key
keys/psk/f0_r2.key
keys/psk/blowfish_f0.key
keys/psk/f0_fishfinger.key
keys/psk/f1_f2.key
keys/psk/f1_r0.key
keys/psk/f1_r1.key
keys/psk/f1_r2.key
keys/psk/blowfish_f1.key
keys/psk/f1_fishfinger.key
keys/psk/f2_r0.key
keys/psk/f2_r1.key
keys/psk/f2_r2.key
keys/psk/blowfish_f2.key
keys/psk/f2_fishfinger.key
keys/psk/r0_r1.key
keys/psk/r0_r2.key
keys/psk/blowfish_r0.key
keys/psk/fishfinger_r0.key
keys/psk/r1_r2.key
keys/psk/blowfish_r1.key
keys/psk/fishfinger_r1.key
keys/psk/blowfish_r2.key
keys/psk/fishfinger_r2.key
keys/psk/blowfish_fishfinger.key
keys/psk/blowfish_earth.key
keys/psk/earth_fishfinger.key
keys/psk/blowfish_pixel7pro.key
keys/psk/fishfinger_pixel7pro.key
keys/f1/priv.key
keys/f1/pub.key
keys/f2/priv.key
keys/f2/pub.key
keys/r0/priv.key
keys/r0/pub.key
keys/r1/priv.key
keys/r1/pub.key
keys/r2/priv.key
keys/r2/pub.key
keys/blowfish/priv.key
keys/blowfish/pub.key
keys/fishfinger/priv.key
keys/fishfinger/pub.key
keys/earth/priv.key
keys/earth/pub.key
keys/pixel7pro/priv.key
keys/pixel7pro/pub.key

Those keys are embedded in the resulting wg0.conf files, so later we only need to install the wg0.conf files and not all the keys individually.

Installing the wg0.conf files



Uploading the wg0.conf files to the participating hosts and reloading WireGuard on them is then just a matter of executing the following (this expects that all participating hosts are up and running):

> rake install
/usr/bin/ruby wireguardmeshgenerator.rb --install
Uploading dist/f0/etc/wireguard/wg0.conf to f0.lan.buetow.org:.
Installing Wireguard config on f0
Uploading cmd.sh to f0.lan.buetow.org:.
+ [ ! -d /usr/local/etc/wireguard ]
+ doas chmod 700 /usr/local/etc/wireguard
+ doas mv -v wg0.conf /usr/local/etc/wireguard
wg0.conf -> /usr/local/etc/wireguard/wg0.conf
+ doas chmod 644 /usr/local/etc/wireguard/wg0.conf
+ rm cmd.sh
Reloading Wireguard on f0
Uploading cmd.sh to f0.lan.buetow.org:.
+ doas service wireguard reload
+ rm cmd.sh
Uploading dist/f1/etc/wireguard/wg0.conf to f1.lan.buetow.org:.
Installing Wireguard config on f1
Uploading cmd.sh to f1.lan.buetow.org:.
+ [ ! -d /usr/local/etc/wireguard ]
+ doas chmod 700 /usr/local/etc/wireguard
+ doas mv -v wg0.conf /usr/local/etc/wireguard
wg0.conf -> /usr/local/etc/wireguard/wg0.conf
+ doas chmod 644 /usr/local/etc/wireguard/wg0.conf
+ rm cmd.sh
Reloading Wireguard on f1
Uploading cmd.sh to f1.lan.buetow.org:.
+ doas service wireguard reload
+ rm cmd.sh
Uploading dist/f2/etc/wireguard/wg0.conf to f2.lan.buetow.org:.
Installing Wireguard config on f2
Uploading cmd.sh to f2.lan.buetow.org:.
+ [ ! -d /usr/local/etc/wireguard ]
+ doas chmod 700 /usr/local/etc/wireguard
+ doas mv -v wg0.conf /usr/local/etc/wireguard
wg0.conf -> /usr/local/etc/wireguard/wg0.conf
+ doas chmod 644 /usr/local/etc/wireguard/wg0.conf
+ rm cmd.sh
Reloading Wireguard on f2
Uploading cmd.sh to f2.lan.buetow.org:.
+ doas service wireguard reload
+ rm cmd.sh
Uploading dist/r0/etc/wireguard/wg0.conf to r0.lan.buetow.org:.
Installing Wireguard config on r0
Uploading cmd.sh to r0.lan.buetow.org:.
+ '[' '!' -d /etc/wireguard ']'
+ chmod 700 /etc/wireguard
+ mv -v wg0.conf /etc/wireguard
renamed 'wg0.conf' -> '/etc/wireguard/wg0.conf'
+ chmod 644 /etc/wireguard/wg0.conf
+ rm cmd.sh
Reloading Wireguard on r0
Uploading cmd.sh to r0.lan.buetow.org:.
+ systemctl reload wg-quick@wg0.service
+ rm cmd.sh
Uploading dist/r1/etc/wireguard/wg0.conf to r1.lan.buetow.org:.
Installing Wireguard config on r1
Uploading cmd.sh to r1.lan.buetow.org:.
+ '[' '!' -d /etc/wireguard ']'
+ chmod 700 /etc/wireguard
+ mv -v wg0.conf /etc/wireguard
renamed 'wg0.conf' -> '/etc/wireguard/wg0.conf'
+ chmod 644 /etc/wireguard/wg0.conf
+ rm cmd.sh
Reloading Wireguard on r1
Uploading cmd.sh to r1.lan.buetow.org:.
+ systemctl reload wg-quick@wg0.service
+ rm cmd.sh
Uploading dist/r2/etc/wireguard/wg0.conf to r2.lan.buetow.org:.
Installing Wireguard config on r2
Uploading cmd.sh to r2.lan.buetow.org:.
+ '[' '!' -d /etc/wireguard ']'
+ chmod 700 /etc/wireguard
+ mv -v wg0.conf /etc/wireguard
renamed 'wg0.conf' -> '/etc/wireguard/wg0.conf'
+ chmod 644 /etc/wireguard/wg0.conf
+ rm cmd.sh
Reloading Wireguard on r2
Uploading cmd.sh to r2.lan.buetow.org:.
+ systemctl reload wg-quick@wg0.service
+ rm cmd.sh
Uploading dist/blowfish/etc/wireguard/wg0.conf to blowfish.buetow.org:.
Installing Wireguard config on blowfish
Uploading cmd.sh to blowfish.buetow.org:.
+ [ ! -d /etc/wireguard ]
+ doas chmod 700 /etc/wireguard
+ doas mv -v wg0.conf /etc/wireguard
wg0.conf -> /etc/wireguard/wg0.conf
+ doas chmod 644 /etc/wireguard/wg0.conf
+ rm cmd.sh
Reloading Wireguard on blowfish
Uploading cmd.sh to blowfish.buetow.org:.
+ doas sh /etc/netstart wg0
+ rm cmd.sh
Uploading dist/fishfinger/etc/wireguard/wg0.conf to fishfinger.buetow.org:.
Installing Wireguard config on fishfinger
Uploading cmd.sh to fishfinger.buetow.org:.
+ [ ! -d /etc/wireguard ]
+ doas chmod 700 /etc/wireguard
+ doas mv -v wg0.conf /etc/wireguard
wg0.conf -> /etc/wireguard/wg0.conf
+ doas chmod 644 /etc/wireguard/wg0.conf
+ rm cmd.sh
Reloading Wireguard on fishfinger
Uploading cmd.sh to fishfinger.buetow.org:.
+ doas sh /etc/netstart wg0
+ rm cmd.sh

Re-generating the mesh and installing the wg0.conf files again



The mesh network can be re-generated and re-installed as follows:

> rake clean
> rake generate
> rake install

That would also delete and re-generate all the keys involved.

Setting up roaming clients



For roaming clients like earth (Fedora laptop) and pixel7pro (Android phone), the setup process differs slightly since these devices are not always accessible via SSH:

Android phone (pixel7pro):

The configuration is transferred to the phone using a QR code. The official WireGuard Android app (from Google Play Store) can scan and import the configuration:

> sudo dnf install qrencode
> qrencode -t ansiutf8 < dist/pixel7pro/etc/wireguard/wg0.conf

Scan the QR code with the WireGuard app to import the configuration. The phone will then route all traffic through the VPN when the tunnel is activated. Note that WireGuard does not support automatic failover between the two gateways (blowfish and fishfinger): if one fails, you must manually disconnect and reconnect to switch to the other.

Fedora laptop (earth):

For the laptop, manually copy the generated configuration:

> sudo cp dist/earth/etc/wireguard/wg0.conf /etc/wireguard/
> sudo chmod 600 /etc/wireguard/wg0.conf
> sudo systemctl start wg-quick@wg0.service  # Start manually
> sudo systemctl disable wg-quick@wg0.service  # Prevent auto-start

The service is disabled from auto-start so the VPN is only active when manually started. This allows selective VPN usage based on need.

Adding IPv6 support to the mesh



After setting up the IPv4-only mesh network, I decided to add dual-stack IPv6 support to enable more networking capabilities and prepare for the future. All 10 hosts (8 infrastructure + 2 roaming clients) now have both IPv4 and IPv6 addresses on their WireGuard interfaces.

IPv6 addressing scheme



We use ULA (Unique Local Address) private IPv6 space, analogous to RFC1918 private IPv4 addresses:

  • Prefix: fd42:beef:cafe::/48
  • Subnet: fd42:beef:cafe:2::/64 (wg0 interfaces)

All hosts receive dual-stack addresses:

fd42:beef:cafe:2::110/64  - blowfish.wg0 (OpenBSD gateway)
fd42:beef:cafe:2::111/64  - fishfinger.wg0 (OpenBSD gateway)
fd42:beef:cafe:2::120/64  - r0.wg0 (Rocky Linux VM)
fd42:beef:cafe:2::121/64  - r1.wg0 (Rocky Linux VM)
fd42:beef:cafe:2::122/64  - r2.wg0 (Rocky Linux VM)
fd42:beef:cafe:2::130/64  - f0.wg0 (FreeBSD host)
fd42:beef:cafe:2::131/64  - f1.wg0 (FreeBSD host)
fd42:beef:cafe:2::132/64  - f2.wg0 (FreeBSD host)
fd42:beef:cafe:2::200/64  - earth.wg0 (roaming laptop)
fd42:beef:cafe:2::201/64  - pixel7pro.wg0 (roaming phone)
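In case you want a prefix of your own: per RFC 4193, a ULA /48 is the fd00::/8 block plus a random 40-bit Global ID. A small Ruby sketch (random_ula_prefix is an illustration helper, not part of the generator):

```ruby
require 'securerandom'

# RFC 4193 ULA: fd00::/8 plus a random 40-bit Global ID yields a /48
# prefix; fd42:beef:cafe::/48 in this post is one such choice.
def random_ula_prefix
  global_id = SecureRandom.hex(5)         # 40 random bits as 10 hex chars
  groups = "fd#{global_id}".scan(/..../)  # 12 hex chars -> 3 groups of 4
  "#{groups.join(':')}::/48"
end

puts random_ula_prefix  # e.g. fd9c:4a1b:02ef::/48
```

RFC 4193 recommends a randomly chosen Global ID precisely so that two independently built ULA networks are unlikely to collide if they are ever merged or bridged over a VPN.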

Updating the mesh generator for IPv6



The mesh generator required two modifications to support dual-stack configurations:

1. Address generation (address method)

The generator now outputs multiple Address directives when IPv6 is present:

def address
  return '# No Address = ... for OpenBSD here' if hosts[myself]['os'] == 'OpenBSD'

  ipv4 = hosts[myself]['wg0']['ip']
  ipv6 = hosts[myself]['wg0']['ipv6']

  # WireGuard supports multiple Address directives for dual-stack
  if ipv6
    "Address = #{ipv4}\nAddress = #{ipv6}/64"
  else
    "Address = #{ipv4}"
  end
end

2. AllowedIPs generation (peers method)

For mesh peers, both IPv4 and IPv6 addresses are included in AllowedIPs:

if is_roaming
  allowed_ips = '0.0.0.0/0, ::/0'
else
  # For mesh peers, allow both IPv4 and IPv6 if present
  ipv4 = data['wg0']['ip']
  ipv6 = data['wg0']['ipv6']
  allowed_ips = ipv6 ? "#{ipv4}/32, #{ipv6}/128" : "#{ipv4}/32"
end

Roaming clients keep AllowedIPs = 0.0.0.0/0, ::/0 to route all traffic (IPv4 and IPv6) through the VPN.

IPv6 NAT on OpenBSD gateways



To allow roaming clients to access the internet via IPv6, we added NAT66 rules to the OpenBSD gateways' pf.conf:

# NAT for WireGuard clients to access internet (IPv4)
match out on vio0 from 192.168.2.0/24 to any nat-to (vio0)

# NAT66 for WireGuard clients to access internet (IPv6)
# Uses NPTv6 (Network Prefix Translation) to translate ULA to public IPv6
match out on vio0 inet6 from fd42:beef:cafe:2::/64 to any nat-to (vio0)

# Allow all UDP traffic on WireGuard port (IPv4 and IPv6)
pass in inet proto udp from any to any port 56709
pass in inet6 proto udp from any to any port 56709

OpenBSD's PF firewall supports IPv6 NAT with the same syntax as IPv4, using NPTv6 (RFC 6296) to translate the ULA addresses to the gateway's public IPv6 address.

Manual OpenBSD interface configuration



Since OpenBSD doesn't use the Address directive in WireGuard configs, IPv6 must be manually configured on the wg0 interfaces. On blowfish:

rex@blowfish:~ $ doas vi /etc/hostname.wg0

Add the IPv6 address (note the order: IPv6 must be configured before up):

inet 192.168.2.110 255.255.255.0 NONE
inet6 fd42:beef:cafe:2::110 64
up
!/usr/local/bin/wg setconf wg0 /etc/wireguard/wg0.conf

Important: The IPv6 address must be specified before the up directive. This ensures the interface has both addresses configured before WireGuard peers are loaded.

Apply the configuration:

rex@blowfish:~ $ doas sh /etc/netstart wg0
rex@blowfish:~ $ ifconfig wg0 | grep inet6
inet6 fd42:beef:cafe:2::110 prefixlen 64

Repeat for fishfinger with address fd42:beef:cafe:2::111.

After reboot, the interface will automatically come up with both IPv4 and IPv6 addresses. WireGuard peers may take 30-60 seconds to establish handshakes after boot.

Verifying dual-stack connectivity



After regenerating and deploying the configurations, both IPv4 and IPv6 work across the mesh:

# From r0 (Rocky Linux VM)
root@r0:~ # ping -c 2 192.168.2.130  # IPv4 to f0
64 bytes from 192.168.2.130: icmp_seq=1 ttl=64 time=2.12 ms
64 bytes from 192.168.2.130: icmp_seq=2 ttl=64 time=0.681 ms

root@r0:~ # ping6 -c 2 fd42:beef:cafe:2::130  # IPv6 to f0
64 bytes from fd42:beef:cafe:2::130: icmp_seq=1 ttl=64 time=2.16 ms
64 bytes from fd42:beef:cafe:2::130: icmp_seq=2 ttl=64 time=0.909 ms

The dual-stack configuration is backward compatible: hosts without the ipv6 field in the YAML configuration will continue to generate IPv4-only configs.

Benefits of dual-stack



Adding IPv6 to the mesh network provides:

  • Future-proofing: Ready for IPv6-only services and networks
  • Compatibility: Dual-stack maintains full IPv4 compatibility
  • Learning: Hands-on experience with IPv6 networking
  • Flexibility: Roaming clients can access both IPv4 and IPv6 internet resources

Happy WireGuard-ing



All is set up now. E.g. on f0:

paul@f0:~ % doas wg show
interface: wg0
  public key: Jm6YItMt94++dIeOyVi1I9AhNt2qQcryxCZezoX7X2Y=
  private key: (hidden)
  listening port: 56709

peer: 8PvGZH1NohHpZPVJyjhctBX9xblsNvYBhpg68FsFcns=
  preshared key: (hidden)
  endpoint: 46.23.94.99:56709
  allowed ips: 192.168.2.111/32, fd42:beef:cafe:2::111/128
  latest handshake: 1 minute, 46 seconds ago
  transfer: 124 B received, 1.75 KiB sent
  persistent keepalive: every 25 seconds

peer: Xow+d3qVXgUMk4pcRSQ6Fe+vhYBa3VDyHX/4jrGoKns=
  preshared key: (hidden)
  endpoint: 23.88.35.144:56709
  allowed ips: 192.168.2.110/32, fd42:beef:cafe:2::110/128
  latest handshake: 1 minute, 52 seconds ago
  transfer: 124 B received, 1.60 KiB sent
  persistent keepalive: every 25 seconds

peer: s3e93XoY7dPUQgLiVO4d8x/SRCFgEew+/wP7+zwgehI=
  preshared key: (hidden)
  endpoint: 192.168.1.120:56709
  allowed ips: 192.168.2.120/32, fd42:beef:cafe:2::120/128

peer: 2htXdNcxzpI2FdPDJy4T4VGtm1wpMEQu1AkQHjNY6F8=
  preshared key: (hidden)
  endpoint: 192.168.1.131:56709
  allowed ips: 192.168.2.131/32, fd42:beef:cafe:2::131/128

peer: 0Y/H20W8YIbF7DA1sMwMacLI8WS9yG+1/QO7m2oyllg=
  preshared key: (hidden)
  endpoint: 192.168.1.122:56709
  allowed ips: 192.168.2.122/32, fd42:beef:cafe:2::122/128

peer: Hhy9kMPOOjChXV2RA5WeCGs+J0FE3rcNPDw/TLSn7i8=
  preshared key: (hidden)
  endpoint: 192.168.1.121:56709
  allowed ips: 192.168.2.121/32, fd42:beef:cafe:2::121/128

peer: SlGVsACE1wiaRoGvCR3f7AuHfRS+1jjhS+YwEJ2HvF0=
  preshared key: (hidden)
  endpoint: 192.168.1.132:56709
  allowed ips: 192.168.2.132/32, fd42:beef:cafe:2::132/128

All the hosts are pingable as well, e.g.:

paul@f0:~ % foreach peer ( f1 f2 r0 r1 r2 blowfish fishfinger )
foreach? ping -c2 $peer.wg0
foreach? echo
foreach? end
PING f1.wg0 (192.168.2.131): 56 data bytes
64 bytes from 192.168.2.131: icmp_seq=0 ttl=64 time=0.334 ms
64 bytes from 192.168.2.131: icmp_seq=1 ttl=64 time=0.260 ms

--- f1.wg0 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.260/0.297/0.334/0.037 ms

PING f2.wg0 (192.168.2.132): 56 data bytes
64 bytes from 192.168.2.132: icmp_seq=0 ttl=64 time=0.323 ms
64 bytes from 192.168.2.132: icmp_seq=1 ttl=64 time=0.303 ms

--- f2.wg0 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.303/0.313/0.323/0.010 ms

PING r0.wg0 (192.168.2.120): 56 data bytes
64 bytes from 192.168.2.120: icmp_seq=0 ttl=64 time=0.716 ms
64 bytes from 192.168.2.120: icmp_seq=1 ttl=64 time=0.406 ms

--- r0.wg0 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.406/0.561/0.716/0.155 ms

PING r1.wg0 (192.168.2.121): 56 data bytes
64 bytes from 192.168.2.121: icmp_seq=0 ttl=64 time=0.639 ms
64 bytes from 192.168.2.121: icmp_seq=1 ttl=64 time=0.629 ms

--- r1.wg0 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.629/0.634/0.639/0.005 ms

PING r2.wg0 (192.168.2.122): 56 data bytes
64 bytes from 192.168.2.122: icmp_seq=0 ttl=64 time=0.569 ms
64 bytes from 192.168.2.122: icmp_seq=1 ttl=64 time=0.479 ms

--- r2.wg0 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.479/0.524/0.569/0.045 ms

PING blowfish.wg0 (192.168.2.110): 56 data bytes
64 bytes from 192.168.2.110: icmp_seq=0 ttl=255 time=35.745 ms
64 bytes from 192.168.2.110: icmp_seq=1 ttl=255 time=35.481 ms

--- blowfish.wg0 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 35.481/35.613/35.745/0.132 ms

PING fishfinger.wg0 (192.168.2.111): 56 data bytes
64 bytes from 192.168.2.111: icmp_seq=0 ttl=255 time=33.992 ms
64 bytes from 192.168.2.111: icmp_seq=1 ttl=255 time=33.751 ms

--- fishfinger.wg0 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 33.751/33.872/33.992/0.120 ms

Note that the loop above uses tcsh syntax; tcsh is the default shell on FreeBSD. Of course, all the other peers can ping each other as well!

After the first pings, the VPN tunnels also show handshakes and the amount of data transferred through them:

paul@f0:~ % doas wg show
interface: wg0
  public key: Jm6YItMt94++dIeOyVi1I9AhNt2qQcryxCZezoX7X2Y=
  private key: (hidden)
  listening port: 56709

peer: 0Y/H20W8YIbF7DA1sMwMacLI8WS9yG+1/QO7m2oyllg=
  preshared key: (hidden)
  endpoint: 192.168.1.122:56709
  allowed ips: 192.168.2.122/32, fd42:beef:cafe:2::122/128
  latest handshake: 10 seconds ago
  transfer: 440 B received, 532 B sent

peer: Hhy9kMPOOjChXV2RA5WeCGs+J0FE3rcNPDw/TLSn7i8=
  preshared key: (hidden)
  endpoint: 192.168.1.121:56709
  allowed ips: 192.168.2.121/32, fd42:beef:cafe:2::121/128
  latest handshake: 12 seconds ago
  transfer: 440 B received, 564 B sent

peer: s3e93XoY7dPUQgLiVO4d8x/SRCFgEew+/wP7+zwgehI=
  preshared key: (hidden)
  endpoint: 192.168.1.120:56709
  allowed ips: 192.168.2.120/32, fd42:beef:cafe:2::120/128
  latest handshake: 14 seconds ago
  transfer: 440 B received, 564 B sent

peer: SlGVsACE1wiaRoGvCR3f7AuHfRS+1jjhS+YwEJ2HvF0=
  preshared key: (hidden)
  endpoint: 192.168.1.132:56709
  allowed ips: 192.168.2.132/32, fd42:beef:cafe:2::132/128
  latest handshake: 17 seconds ago
  transfer: 472 B received, 564 B sent

peer: Xow+d3qVXgUMk4pcRSQ6Fe+vhYBa3VDyHX/4jrGoKns=
  preshared key: (hidden)
  endpoint: 23.88.35.144:56709
  allowed ips: 192.168.2.110/32, fd42:beef:cafe:2::110/128
  latest handshake: 55 seconds ago
  transfer: 472 B received, 596 B sent
  persistent keepalive: every 25 seconds

peer: 8PvGZH1NohHpZPVJyjhctBX9xblsNvYBhpg68FsFcns=
  preshared key: (hidden)
  endpoint: 46.23.94.99:56709
  allowed ips: 192.168.2.111/32, fd42:beef:cafe:2::111/128
  latest handshake: 55 seconds ago
  transfer: 472 B received, 596 B sent
  persistent keepalive: every 25 seconds

peer: 2htXdNcxzpI2FdPDJy4T4VGtm1wpMEQu1AkQHjNY6F8=
  preshared key: (hidden)
  endpoint: 192.168.1.131:56709
  allowed ips: 192.168.2.131/32, fd42:beef:cafe:2::131/128

Managing Roaming Client Tunnels



Since roaming clients like earth and pixel7pro connect on-demand rather than being always-on like the infrastructure hosts, it's useful to know how to configure and manage the WireGuard tunnels.

Manual gateway failover configuration



The default configuration for roaming clients includes both gateways (blowfish and fishfinger) with AllowedIPs = 0.0.0.0/0, ::/0. However, WireGuard doesn't automatically fail over between multiple peers with identical AllowedIPs routes. When both gateways are configured this way, WireGuard uses the first peer with a recent handshake; if that gateway goes down, traffic won't automatically switch to the backup gateway.

To enable manual failover, separate configuration files can be created for roaming clients (earth laptop and pixel7pro phone), each containing only a single gateway peer. This provides explicit control over which gateway handles traffic.

Configuration files for pixel7pro (phone):

Two separate configs in /home/paul/git/wireguardmeshgenerator/dist/pixel7pro/etc/wireguard/:

  • wg0-blowfish.conf - Routes all traffic through blowfish gateway (23.88.35.144)
  • wg0-fishfinger.conf - Routes all traffic through fishfinger gateway (46.23.94.99)

Generate QR codes for importing into the WireGuard Android app:

qrencode -t ansiutf8 < dist/pixel7pro/etc/wireguard/wg0-blowfish.conf
qrencode -t ansiutf8 < dist/pixel7pro/etc/wireguard/wg0-fishfinger.conf

Import both QR codes using the WireGuard app to create two separate tunnel profiles. You can then manually enable/disable each tunnel to select which gateway to use. Only enable one tunnel at a time.

Configuration files for earth (laptop):

Two separate configs in /home/paul/git/wireguardmeshgenerator/dist/earth/etc/wireguard/:

  • wg0-blowfish.conf - Routes all traffic through blowfish gateway
  • wg0-fishfinger.conf - Routes all traffic through fishfinger gateway

Install both configurations:

sudo cp dist/earth/etc/wireguard/wg0-blowfish.conf /etc/wireguard/
sudo cp dist/earth/etc/wireguard/wg0-fishfinger.conf /etc/wireguard/

This approach provides explicit control over which gateway handles roaming client traffic, useful when one gateway needs maintenance or experiences connectivity issues.

Starting and stopping on earth (Fedora laptop)



On the Fedora laptop, WireGuard is managed via systemd. Using the separate gateway configs:

# Start with blowfish gateway
earth$ sudo systemctl start wg-quick@wg0-blowfish.service

# Or start with fishfinger gateway
earth$ sudo systemctl start wg-quick@wg0-fishfinger.service

# Check tunnel status (example with blowfish gateway)
earth$ sudo wg show
interface: wg0
  public key: Mc1CpSS3rbLN9A2w9c75XugQyXUkGPHKI2iCGbh8DRo=
  private key: (hidden)
  listening port: 56709
  fwmark: 0xca6c

peer: Xow+d3qVXgUMk4pcRSQ6Fe+vhYBa3VDyHX/4jrGoKns=
  preshared key: (hidden)
  endpoint: 23.88.35.144:56709
  allowed ips: 0.0.0.0/0, ::/0
  latest handshake: 5 seconds ago
  transfer: 15.89 KiB received, 32.15 KiB sent
  persistent keepalive: every 25 seconds

Stopping the tunnel:

earth$ sudo systemctl stop wg-quick@wg0-blowfish.service
# Or if using fishfinger:
earth$ sudo systemctl stop wg-quick@wg0-fishfinger.service

earth$ sudo wg show
# No output - WireGuard interface is down

Switching between gateways:

# Switch from blowfish to fishfinger
earth$ sudo systemctl stop wg-quick@wg0-blowfish.service
earth$ sudo systemctl start wg-quick@wg0-fishfinger.service

The services remain disabled to prevent auto-start on boot, allowing manual control over when the VPN is active and which gateway to use.

Starting and stopping on pixel7pro (Android phone)



On Android using the official WireGuard app, you now have two tunnel profiles (wg0-blowfish and wg0-fishfinger) after importing the QR codes:

Starting a tunnel:

  1. Open the WireGuard app
  2. Tap the toggle switch next to either the wg0-blowfish or wg0-fishfinger tunnel configuration
  3. The switch turns blue/green and shows "Active"
  4. A key icon appears in the notification bar, indicating the VPN is active
  5. All traffic now routes through the selected gateway

Stopping the tunnel:

  1. Open the WireGuard app
  2. Tap the toggle switch again to disable it
  3. The switch turns gray and shows "Inactive"
  4. The key icon disappears from the notification bar
  5. Normal internet routing resumes

Switching between gateways:

  1. Disable the currently active tunnel (e.g., wg0-blowfish)
  2. Enable the other tunnel (e.g., wg0-fishfinger)

Remember: only enable one tunnel at a time.

Quick toggling from notification:

  • Pull down the notification shade
  • Tap the WireGuard notification to quickly enable/disable the tunnel without opening the app

The WireGuard Android app supports automatically activating tunnels based on:

  • Mobile data connection (e.g., enable VPN when on cellular)
  • WiFi SSID (e.g., disable VPN when on trusted home network)
  • Ethernet connection status

These settings can be configured by tapping the pencil icon next to the tunnel name, then scrolling to "Toggle on/off based on" options.

Verifying connectivity



Once the tunnel is active on either device, verify connectivity:

# From earth laptop:
earth$ ping -c2 blowfish.wg0
earth$ ping -c2 fishfinger.wg0
earth$ curl https://ifconfig.me  # Should show gateway's public IP

Check which gateway is active: On earth, look at the output of sudo wg show to see which peer shows recent handshakes and increasing transfer bytes. On Android, the WireGuard app shows the active tunnel with its data transfer statistics.
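This check can also be scripted on earth. wg show wg0 latest-handshakes prints one "public-key TAB epoch-seconds" line per peer; the small sh helper below (a sketch of mine; the peer names in the demo are made up) picks the peer with the most recent handshake:

```shell
# Print the public key of the peer with the most recent handshake.
# Feed it the output of:  sudo wg show wg0 latest-handshakes
# (one "<public-key><TAB><epoch-seconds>" line per peer; 0 = never).
active_peer() {
    sort -k2 -n -r | awk 'NR == 1 && $2 > 0 { print $1 }'
}

# Demo with made-up data:
printf 'peerA\t1714000000\npeerB\t1714000100\npeerC\t0\n' | active_peer
# prints: peerB
```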

Conclusion



Having a mesh network on our hosts is great for securing all the traffic between them for our future k3s setup. A self-managed WireGuard mesh network is better than Tailscale as it eliminates reliance on a third party and provides full control over the configuration. It reduces unnecessary abstraction and "magic," enabling easier debugging and ensuring full ownership of our network.

Read the next post of this series:

f3s: Kubernetes with FreeBSD - Part 6: Storage

Other *BSD-related posts:

2025-12-07 f3s: Kubernetes with FreeBSD - Part 8: Observability
2025-10-02 f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments
2025-07-14 f3s: Kubernetes with FreeBSD - Part 6: Storage
2025-05-11 f3s: Kubernetes with FreeBSD - Part 5: WireGuard mesh network (You are currently reading this)
2025-04-05 f3s: Kubernetes with FreeBSD - Part 4: Rocky Linux Bhyve VMs
2025-02-01 f3s: Kubernetes with FreeBSD - Part 3: Protecting from power cuts
2024-12-03 f3s: Kubernetes with FreeBSD - Part 2: Hardware and base installation
2024-11-17 f3s: Kubernetes with FreeBSD - Part 1: Setting the stage
2024-04-01 KISS high-availability with OpenBSD
2024-01-13 One reason why I love OpenBSD
2022-10-30 Installing DTail on OpenBSD
2022-07-30 Let's Encrypt with OpenBSD and Rex
2016-04-09 Jails and ZFS with Puppet on FreeBSD

E-Mail your comments to paul@nospam.buetow.org

Back to the main site
Terminal multiplexing with `tmux` - Fish edition https://foo.zone/gemfeed/2025-05-02-terminal-multiplexing-with-tmux-fish-edition.html 2025-05-02T00:09:23+03:00 Paul Buetow aka snonux paul@dev.buetow.org This is the Fish shell edition of the same post (but for Z-Shell) of mine from last year:

Terminal multiplexing with tmux - Fish edition



Published at 2025-05-02T00:09:23+03:00

This is the Fish shell edition of the same post (but for Z-Shell) of mine from last year:

./2024-06-23-terminal-multiplexing-with-tmux.html

Tmux (Terminal Multiplexer) is a powerful, terminal-based tool that manages multiple terminal sessions within a single window. Here are some of its primary features and functionalities:

  • Session management
  • Window and Pane management
  • Persistent Workspace
  • Customization

https://github.com/tmux/tmux/wiki

            _______                           s
           |.-----.|                           s
           || Tmux||                          s
           ||_.-._||       |\   \\\\__     o          s
           `--)-(--`       | \_/    o \    o          s
          __[=== o]__      > _   (( <_  oo            s
         |:::::::::::|\    | / \__+___/               s
   jgs   `-=========-`()   |/     |/                  s
       mod. by Paul B.

Table of Contents




Before continuing...



Before continuing to read this post, I encourage you to get familiar with Tmux first (unless you already know the basics). You can go through the official getting started guide:

https://github.com/tmux/tmux/wiki/Getting-Started

I can also recommend this book (it's the one that got me started with Tmux):

https://pragprog.com/titles/bhtmux2/tmux-2/

Over the years, I have built a couple of shell helper functions to optimize my workflows. Tmux is extensively integrated into my daily workflows (personal and work). Colleagues have asked me several times about my Tmux config and helper scripts, so I thought it would be neat to blog about it all so that anyone interested can make a copy of my configuration and scripts.

The configuration and scripts in this blog post are only the non-work-specific parts. There are more helper scripts, which I only use for work (and aren't really useful outside of work due to the way servers and clusters are structured there).

Tmux is highly configurable, and I think I am only scratching the surface of what is possible with it. Nevertheless, it may still be useful for you. I also love that Tmux is part of the OpenBSD base system!

Shell aliases



Since last week, I have been playing a bit with the Fish shell. As a result, I also converted all my Tmux helper scripts (mentioned in this blog post) from Z-Shell to Fish.

https://fishshell.com

For the most common Tmux commands I use, I have created the following shell aliases:

alias tn 'tmux::new'
alias ta 'tmux::attach'
alias tx 'tmux::remote'
alias ts 'tmux::search'
alias tssh 'tmux::cluster_ssh'
alias tm tmux
alias tl 'tmux list-sessions'
alias foo 'tmux::new foo'
alias bar 'tmux::new bar'
alias baz 'tmux::new baz'

Note all the tmux::... names; those are custom shell functions, not part of the Tmux distribution. Let's run through the aliases one by one.

Two of them are pretty straightforward: tm is simply a shorthand for tmux, so I have to type less, and tl lists all Tmux sessions that are currently open. No magic here.

The tn alias - Creating a new session



The tn alias is referencing this function:

# Create new session and, if it already exists, attach to it
function tmux::new
    set -l session $argv[1]
    _tmux::cleanup_default
    if test -z "$session"
        tmux::new (string join "" T (date +%s))
    else
        tmux new-session -d -s $session
        tmux -2 attach-session -t $session || tmux -2 switch-client -t $session
    end
end

There is a lot going on here. Let's have a detailed look at what it is doing.

First, a Tmux session name can be passed to the function as the first argument. The session name is optional: without it, the function generates a default name with (string join "" T (date +%s)), which is T followed by the UNIX epoch, e.g. T1717133796.
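For readers unfamiliar with Fish, the generated fallback name is equivalent to this plain sh:

```shell
# sh equivalent of fish's (string join "" T (date +%s)):
# the letter T followed by the current UNIX epoch, e.g. T1717133796
session="T$(date +%s)"
echo "$session"
```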

Cleaning up default sessions automatically



Note also the call to _tmux::cleanup_default; it cleans up all already-opened default sessions that aren't attached. Those sessions are only temporary, and I had too many flying around after a while, so I decided to auto-delete them when they aren't attached. If I want to keep a session around, I rename it with the Tmux command prefix-key $. This is the cleanup function:

function _tmux::cleanup_default
    tmux list-sessions | string match -r '^T.*: ' | string match -v -r attached | string split ':' | while read -l s
        echo "Killing $s"
        tmux kill-session -t "$s"
    end
end

The cleanup function kills all open Tmux sessions that haven't been renamed properly yet—but only if they aren't attached (e.g., don't run in the foreground in any terminal). Cleaning them up automatically keeps my Tmux sessions as neat and tidy as possible.
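For readers not using fish, the same filtering idea can be sketched in plain sh (the demo session names are made up, and the actual kill-session call is shown only as a comment so the sketch is safe to run):

```shell
# Given `tmux list-sessions` output on stdin, print the names of
# temporary (T-prefixed) sessions that are not attached.
stale_default_sessions() {
    grep '^T' | grep -v attached | cut -d: -f1
}

# Demo with made-up `tmux list-sessions` output:
printf '%s\n' \
    'T1717133796: 1 windows (created Fri May 31 09:00:00 2024)' \
    'T1717133999: 1 windows (created Fri May 31 09:05:00 2024) (attached)' \
    'work: 3 windows (created Fri May 31 08:00:00 2024)' \
    | stale_default_sessions
# prints: T1717133796

# For real use:
#   tmux list-sessions | stale_default_sessions \
#       | xargs -r -n1 tmux kill-session -t
```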

Renaming sessions



Whenever I am in a temporary session (named T....), I may decide that I want to keep this session around. I have to rename the session to prevent the cleanup function from doing its thing. That's, as mentioned already, easily accomplished with the standard prefix-key $ Tmux command.

The ta alias - Attaching to a session



This alias refers to the following function, which tries to attach to an already-running Tmux session.

function tmux::attach
    set -l session $argv[1]
    if test -z "$session"
        tmux attach-session || tmux::new
    else
        tmux attach-session -t $session || tmux::new $session
    end
end

If no session is specified (as the argument of the function), it will try to attach to the first open session. If no Tmux server is running, it will create a new one with tmux::new. Otherwise, with a session name given as the argument, it will attach to it. If unsuccessful (e.g., the session doesn't exist), it will be created and attached to.

The tx alias - For a nested remote session



This SSHs into the remote server specified and then, remotely on the server itself, starts a nested Tmux session. So we have one Tmux session on the local computer and, inside of it, an SSH connection to a remote server with a Tmux session running again. The benefit of this is that, in case my network connection breaks down, the next time I connect, I can continue my work on the remote server exactly where I left off. The session name is the name of the server being SSHed into. If a session like this already exists, it simply attaches to it.

function tmux::remote
    set -l server $argv[1]
    tmux new -s $server "ssh -A -t $server 'tmux attach-session || tmux'" || tmux attach-session -d -t $server
end

Change of the Tmux prefix for better nesting



To make nested Tmux sessions work smoothly, one must change the Tmux prefix key locally or remotely. By default, the Tmux prefix key is Ctrl-b, so Ctrl-b $, for example, renames the current session. To change the prefix key from the standard Ctrl-b to, for example, Ctrl-g, you must add this to the tmux.conf:

set-option -g prefix C-g

This way, when I want to rename the remote Tmux session, I have to use Ctrl-g $, and when I want to rename the local Tmux session, I still have to use Ctrl-b $. In my case, I have this deployed to all remote servers through a configuration management system (out of scope for this blog post).

There might also be another way around this (without reconfiguring the prefix key), but that is cumbersome to use, as far as I remember.
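For reference, the alternative I can think of (an assumption on my part, not what I deploy) is to keep the default prefix everywhere and forward a literal prefix to the inner Tmux by pressing it twice, via the local tmux.conf:

```
bind-key C-b send-prefix
```

Renaming the remote session then becomes Ctrl-b Ctrl-b $, which quickly gets cumbersome.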

The ts alias - Searching sessions with fuzzy finder



Even though _tmux::cleanup_default keeps me from leaving a huge mess of Tmux sessions flying around, it can still be challenging to find exactly the session I am currently interested in. After a busy workday, I often end up with around twenty sessions on my laptop. This is where fuzzy searching for session names comes in handy, as I often don't remember the exact names.

function tmux::search
    set -l session (tmux list-sessions | fzf | cut -d: -f1)
    if test -z "$TMUX"
        tmux attach-session -t $session
    else
        tmux switch -t $session
    end
end

All it does is list all currently open sessions in fzf, where one of them can be searched and selected through fuzzy find, and then either switch (if already inside a session) to the other session or attach to the other session (if not yet in Tmux).

You must install the fzf command on your computer for this to work. This is what it looks like:

Tmux session fuzzy finder

The tssh alias - Cluster SSH replacement



Before I used Tmux, I was a heavy user of ClusterSSH, which allowed me to log in to multiple servers at once in a single terminal window and type and run commands on all of them in parallel.

https://github.com/duncs/clusterssh

However, since I started using Tmux, I have retired ClusterSSH: Tmux only needs to run inside the terminal, whereas ClusterSSH spawned separate terminal windows, which aren't easily portable (e.g., from a Linux desktop to macOS). The tmux::cluster_ssh function takes N arguments, where:

  • ...the first argument will be the session name (see tmux::tssh_from_argument helper function), and all remaining arguments will be server hostnames/FQDNs to connect to simultaneously.
  • ...or, the first argument is a file name, and the file contains a list of hostnames/FQDNs (see the tmux::tssh_from_file helper function)

This is the function definition behind the tssh alias:

function tmux::cluster_ssh
    if test -f "$argv[1]"
        tmux::tssh_from_file $argv[1]
        return
    end
    tmux::tssh_from_argument $argv
end

This function is just a wrapper around the more complex tmux::tssh_from_file and tmux::tssh_from_argument functions, as you have learned already. Most of the magic happens there.

The tmux::tssh_from_argument helper



This is the most magical helper function we will cover in this post. It looks like this:

function tmux::tssh_from_argument
    set -l session $argv[1]
    set first_server_or_container $argv[2]
    set remaining_servers $argv[3..-1]
    if test -z "$first_server_or_container"
        set first_server_or_container $session
    end

    tmux new-session -d -s $session (_tmux::connect_command "$first_server_or_container")
    if not tmux list-session | grep "^$session:"
        echo "Could not create session $session"
        return 2
    end
    for server_or_container in $remaining_servers
        tmux split-window -t $session "tmux select-layout tiled; $(_tmux::connect_command "$server_or_container")"
    end
    tmux setw -t $session synchronize-panes on
    tmux -2 attach-session -t $session || tmux -2 switch-client -t $session
end

It takes the session name as its first argument; all remaining arguments are server hostnames or FQDNs to connect to (if no servers are given, the session name itself is used as the host). The first server is used to create the initial session. All remaining ones are added to that session with tmux split-window -t $session.... At the end, we enable synchronized panes by default, so whatever you type is sent to every SSH connection, giving us the neat ClusterSSH feature of running commands on multiple servers simultaneously. Once done, we attach (or switch, if already inside Tmux) to the session.

Sometimes, I don't want the synchronized panes behavior and want to switch it off temporarily. I can do that with prefix-key p and prefix-key P after adding the following to my local tmux.conf:

bind-key p setw synchronize-panes off
bind-key P setw synchronize-panes on

The tmux::tssh_from_file helper



This one sets the session name to the file name and then reads a list of servers from that file, passing the list of servers to tmux::tssh_from_argument as the arguments. So, this is a neat little wrapper that also enables me to open clustered SSH sessions from an input file.

function tmux::tssh_from_file
    set -l serverlist $argv[1]
    set -l session (basename $serverlist | cut -d. -f1)
    tmux::tssh_from_argument $session (awk '{ print $1 }' $serverlist | sed 's/.lan./.lan/g')
end
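Here is a quick plain-sh demonstration of what that pipeline does (file path and host names are made up; the sed pattern is written with escaped dots here):

```shell
# Build a sample server list: the first column is the FQDN,
# anything after whitespace is ignored by awk '{ print $1 }'.
cat > /tmp/manyservers.txt <<'EOF'
blowfish.lan. some comment
fishfinger.example.org
EOF

# Session name: file name without its extension
basename /tmp/manyservers.txt | cut -d. -f1     # prints: manyservers

# Host list: first column, with the stray dot after ".lan" removed
awk '{ print $1 }' /tmp/manyservers.txt | sed 's/\.lan\./.lan/g'
# prints: blowfish.lan
#         fishfinger.example.org
```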

tssh examples



To open a new session named fish and log in to 4 remote hosts, run this command (Note that it is also possible to specify the remote user):

$ tssh fish blowfish.buetow.org fishfinger.buetow.org \
    fishbone.buetow.org user@octopus.buetow.org

To open a new session named manyservers, put many servers (one FQDN per line) into a file called manyservers.txt and simply run:

$ tssh manyservers.txt

Common Tmux commands I use in tssh



These are default Tmux commands that I make heavy use of in a tssh session:

  • Press prefix-key DIRECTION to switch panes. DIRECTION is by default any of the arrow keys, but I also configured Vi keybindings.
  • Press prefix-key <space> to change the pane layout (can be pressed multiple times to cycle through them).
  • Press prefix-key z to zoom in and out of the current active pane.

Copy and paste workflow



As you will see later in this blog post, I have configured a history limit of 100,000 lines in Tmux so that I can scroll back quite far. One main workflow of mine is to search for text in the Tmux history, select and copy it, and then switch to another window or session and paste it there (e.g., into my text editor to do something with it).

This works by pressing prefix-key [ to enter Tmux copy mode. From there, I can browse the Tmux history of the current window using either the arrow keys or vi-like navigation (see vi configuration later in this blog post) and the Pg-Dn and Pg-Up keys.

I often search the history backwards with prefix-key [ followed by a ?, which opens the Tmux history search prompt.

Once I have identified the terminal text to be copied, I enter visual select mode with v, highlight all the text to be copied (using arrow keys or Vi motions), and press y to yank it (sorry if this all sounds a bit complicated, but Vim/NeoVim users will know this, as it is pretty much how you do it there as well).

For v and y to work, the following has to be added to the Tmux configuration file:

bind-key -T copy-mode-vi 'v' send -X begin-selection
bind-key -T copy-mode-vi 'y' send -X copy-selection-and-cancel

Once the text is yanked, I switch to another Tmux window or session where, for example, a text editor is running and paste the yanked text from Tmux into the editor with prefix-key ]. Note that when pasting into a modal text editor like Vi or Helix, you would first need to enter insert mode before prefix-key ] would paste anything.

Tmux configurations



Some features I have configured directly in Tmux don't require an external shell alias to function correctly. Let's walk line by line through my local ~/.config/tmux/tmux.conf:

source ~/.config/tmux/tmux.local.conf

set-option -g allow-rename off
set-option -g history-limit 100000
set-option -g status-bg '#444444'
set-option -g status-fg '#ffa500'
set-option -s escape-time 0

There's not much magic happening here. I source a tmux.local.conf, which I sometimes use to override the default configuration that comes from my configuration management system. It is mostly just an empty file, so it doesn't throw any errors on Tmux startup when I don't use it.

I work with a lot of terminal output, which I also like to search within Tmux. So, I set a large enough history-limit, enabling me to search backwards in Tmux through up to 100,000 lines of output.

Besides changing some colours (personal taste), I also set escape-time to 0, which is just a workaround: otherwise, my Helix text editor's ESC key would take ages to register within Tmux. I don't remember the gory details any more; if everything works fine for you without it, you can leave it out.

The next lines in the configuration file are:

set-window-option -g mode-keys vi
bind-key -T copy-mode-vi 'v' send -X begin-selection
bind-key -T copy-mode-vi 'y' send -X copy-selection-and-cancel

I navigate within Tmux using Vi keybindings, so mode-keys is set to vi. I use the Helix modal text editor, which is close enough to Vi for simple navigation to feel "native" to me. (By the way, I was a long-time Vim and NeoVim user but eventually switched to Helix. That's off-topic here, but it may be worth another blog post some day.)

The two bind-key commands make it so that I can use v and y in copy mode, which feels more Vi-like (as already discussed earlier in this post).

The next set of lines in the configuration file are:

bind-key h select-pane -L
bind-key j select-pane -D
bind-key k select-pane -U
bind-key l select-pane -R

bind-key H resize-pane -L 5
bind-key J resize-pane -D 5
bind-key K resize-pane -U 5
bind-key L resize-pane -R 5

These allow me to use prefix-key h, prefix-key j, prefix-key k, and prefix-key l for switching panes and prefix-key H, prefix-key J, prefix-key K, and prefix-key L for resizing the panes. If you don't know Vi/Vim/NeoVim, the letters hjkl are commonly used there for left, down, up, and right, which is also the same for Helix, by the way.

The next set of lines in the configuration file are:

bind-key c new-window -c '#{pane_current_path}'
bind-key F new-window -n "session-switcher" "tmux list-sessions | fzf | cut -d: -f1 | xargs tmux switch-client -t"
bind-key T choose-tree

The first one makes any new window start in the current pane's directory. The second one is more interesting: it lists all open sessions in the fuzzy finder. I rely heavily on this during my daily workflow to switch between sessions depending on the task, e.g. from a remote cluster SSH session to a local code editor.

The third one, choose-tree, opens a tree view in Tmux listing all sessions and windows. This one is handy to get a better overview of what is currently running in any local Tmux session. It looks like this (it also allows me to press a hotkey to switch to a particular Tmux window):

Tmux session tree view

The last remaining lines in my configuration file are:

bind-key p setw synchronize-panes off
bind-key P setw synchronize-panes on
bind-key r source-file ~/.config/tmux/tmux.conf \; display-message "tmux.conf reloaded"

We discussed synchronized panes earlier. I use it all the time in clustered SSH sessions. When enabled, all panes (remote SSH sessions) receive the same keystrokes. This is very useful when you want to run the same commands on many servers at once, such as navigating to a common directory, restarting a couple of services at once, or running tools like htop to quickly monitor system resources.

The last one reloads my Tmux configuration on the fly.

E-Mail your comments to paul@nospam.buetow.org :-)

Other related posts are:

2026-02-02 A tmux popup editor for Cursor Agent CLI prompts
2025-05-02 Terminal multiplexing with tmux - Fish edition (You are currently reading this)
2024-06-23 Terminal multiplexing with tmux - Z-Shell edition

Back to the main site
'When: The Scientific Secrets of Perfect Timing' book notes https://foo.zone/gemfeed/2025-04-19-when-book-notes.html 2025-04-19T10:26:05+03:00 Paul Buetow aka snonux paul@dev.buetow.org These are my personal book notes from Daniel Pink's 'When: The Scientific Secrets of Perfect Timing.' They are for me, but I hope they might be useful to you too.

"When: The Scientific Secrets of Perfect Timing" book notes



Published at 2025-04-19T10:26:05+03:00

These are my personal book notes from Daniel Pink's "When: The Scientific Secrets of Perfect Timing." They are for me, but I hope they might be useful to you too.

	  __
 (`/\
 `=\/\ __...--~~~~~-._   _.-~~~~~--...__
  `=\/\               \ /               \\
   `=\/                V                 \\
   //_\___--~~~~~~-._  |  _.-~~~~~~--...__\\
  //  ) (..----~~~~._\ | /_.~~~~----.....__\\
 ===( INK )==========\\|//====================
__ejm\___/________dwb`---`______________________

Table of Contents




You are a different kind of organism depending on the time of day. For example, school tests taken later in the day show worse results, especially when there are fewer computers than students and some students must therefore take the test in the afternoon. Every person has a chronotype, such as a late or early peaker, or somewhere in the middle (like most people). You can assess your chronotype here:

Chronotype Assessment

Following your chronotype can lead to more happiness and higher job satisfaction.

Daily Rhythms



Peak, Trough, Rebound (Recovery): Most people experience these periods throughout the day. It's best to "eat the frog" or tackle daunting tasks during the peak. A twin peak exists every day, with mornings and early evenings being optimal for most people. Negative moods follow the opposite pattern, peaking in the afternoon. Light helps adjust but isn't the main driver of our internal clock. Like plants, humans have intrinsic rhythms.

Optimal Task Timing



  • Analytical work requiring sharpness and focus is best at the peak.
  • Creative work is more effective during non-peak times.
  • Biorhythms can sway performance by up to twenty percent.

Exercise Timing



Exercise in the morning to lose weight; you burn up to twenty percent more fat if you exercise before eating. Exercising after eating aids muscle gain, using the energy from the food. Morning exercises elevate mood, with the effect lasting all day. They also make forming a habit easier. The late afternoon is best for athletic performance due to optimal body temperature, reducing injury risk.

Drinking Habits



  • Drink water in the morning to counter mild dehydration upon waking.
  • Delay coffee consumption until cortisol production peaks, 60 to 90 minutes after waking. This helps avoid building up caffeine tolerance.
  • For an afternoon boost, have coffee once cortisol levels drop.

Afternoon Challenges ("Bermuda Triangle")



  • Mistakes are more common in hospitals during this period, like incorrect antibiotic prescriptions or missed handwashing.
  • Traffic accidents and unfavorable judge decisions occur more frequently in the afternoon.
  • 2:55 pm is the least productive time of the day.

Breaks and Productivity



Short, restorative breaks enhance performance. Student exam results improved with a half-hour break beforehand. Even micro-breaks can be beneficial: hourly five-minute walking breaks can increase productivity as much as 30-minute walks do, and physical activity during breaks boosts concentration and productivity more than sedentary rest. Nature-based breaks are more effective than indoor ones, and complete detachment from work during breaks is essential for restoration.

Napping



Short naps (10-20 minutes) significantly enhance mood, alertness, and cognitive performance, improving learning and problem-solving abilities. Napping increases with age, benefiting mood, flow, and overall health. A "nappuccino," or napping after coffee, offers a double boost, as caffeine takes around 25 minutes to kick in.

Scheduling Breaks



  • Track breaks just as you do with tasks—aim for three breaks a day.
  • Every 25 minutes, look away and daydream for 20 seconds, or engage in short exercises.
  • Meditating for even three minutes is a highly effective restorative activity.
  • The "Fresh Start Effect" (e.g., beginning a diet on January 1st or at the start of a new week) impacts motivation, as does recognizing progress. At the end of each day, spend two minutes writing down your accomplishments.

Final Impressions



  • The concluding experience of a vacation significantly influences overall memories.
  • Restaurant reviews often hinge on how the visit ends, such as a billing error or a complimentary dessert.
  • Considering one's older future self can motivate improvements in the present.

The Midlife U Curve



Life satisfaction tends to dip in midlife, around the forties, and rises again from around age 54.

Project Management Tips



  • Halfway through a project, there's a concentrated burst of effort (the "uh-oh effect"), like an alarm going off when you're slightly behind schedule.
  • Recognizing daily accomplishments can elevate motivation and satisfaction.

These insights from "When" can guide actions to optimize performance, well-being, and satisfaction across various aspects of life.

E-Mail your comments to paul@nospam.buetow.org :-)

Other book notes of mine are:

2025-11-02 'The Courage To Be Disliked' book notes
2025-06-07 'A Monk's Guide to Happiness' book notes
2025-04-19 'When: The Scientific Secrets of Perfect Timing' book notes (You are currently reading this)
2024-10-24 'Staff Engineer' book notes
2024-07-07 'The Stoic Challenge' book notes
2024-05-01 'Slow Productivity' book notes
2023-11-11 'Mind Management' book notes
2023-07-17 'Software Developers Career Guide and Soft Skills' book notes
2023-05-06 'The Obstacle is the Way' book notes
2023-04-01 'Never split the difference' book notes
2023-03-16 'The Pragmatic Programmer' book notes

Back to the main site
f3s: Kubernetes with FreeBSD - Part 4: Rocky Linux Bhyve VMs https://foo.zone/gemfeed/2025-04-05-f3s-kubernetes-with-freebsd-part-4.html 2025-04-04T23:21:01+03:00, last updated Fri 26 Dec 08:51:06 EET 2025 Paul Buetow aka snonux paul@dev.buetow.org This is the fourth blog post about the f3s series for self-hosting demands in a home lab. f3s? The 'f' stands for FreeBSD, and the '3s' stands for k3s, the Kubernetes distribution used on FreeBSD-based physical machines.

f3s: Kubernetes with FreeBSD - Part 4: Rocky Linux Bhyve VMs



Published at 2025-04-04T23:21:01+03:00, last updated Fri 26 Dec 08:51:06 EET 2025

This is the fourth blog post about the f3s series for self-hosting demands in a home lab. f3s? The "f" stands for FreeBSD, and the "3s" stands for k3s, the Kubernetes distribution used on FreeBSD-based physical machines.

2024-11-17 f3s: Kubernetes with FreeBSD - Part 1: Setting the stage
2024-12-03 f3s: Kubernetes with FreeBSD - Part 2: Hardware and base installation
2025-02-01 f3s: Kubernetes with FreeBSD - Part 3: Protecting from power cuts
2025-04-05 f3s: Kubernetes with FreeBSD - Part 4: Rocky Linux Bhyve VMs (You are currently reading this)
2025-05-11 f3s: Kubernetes with FreeBSD - Part 5: WireGuard mesh network
2025-07-14 f3s: Kubernetes with FreeBSD - Part 6: Storage
2025-10-02 f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments
2025-12-07 f3s: Kubernetes with FreeBSD - Part 8: Observability

f3s logo

Table of Contents




Introduction



In this blog post, we are going to install the Bhyve hypervisor.

Bhyve is a lightweight, modern hypervisor that enables virtualization on FreeBSD systems. Its strengths include minimal overhead, which allows it to achieve near-native performance for virtual machines, and it leverages the capabilities of the FreeBSD operating system for performance and network management.

https://wiki.freebsd.org/bhyve

Bhyve supports running various guest operating systems, including FreeBSD, Linux, and Windows, on hardware platforms that support hardware virtualization extensions (such as Intel VT-x or AMD-V). In our case, we are going to virtualize Rocky Linux, which will later in this series be used to run k3s.

Check for POPCNT CPU support



POPCNT is a CPU instruction that counts the number of set bits (ones) in a binary number. CPU virtualization and Bhyve support for the POPCNT instruction are important because guest operating systems utilize this instruction to perform various tasks more efficiently. If the host CPU supports POPCNT, Bhyve can pass this capability to virtual machines for better performance. Without POPCNT support, some applications might not run or perform sub-optimally in virtualized environments.

To check for POPCNT support, run:

paul@f0:~ % dmesg | grep 'Features2=.*POPCNT'
  Features2=0x7ffafbbf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,SDBG,
	FMA,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,
	OSXSAVE,AVX,F16C,RDRAND>

So it's there! All good.

Basic Bhyve setup



For managing the Bhyve VMs, we are using vm-bhyve, a tool not part of the FreeBSD operating system but available as a ready-to-use package. It eases VM management and reduces a lot of overhead. We also install the required package to make Bhyve work with the UEFI firmware.

https://github.com/churchers/vm-bhyve

The following commands are executed on all three hosts f0, f1, and f2, where re0 is the name of the Ethernet interface (which may need to be adjusted if your hardware is different):

paul@f0:~ % doas pkg install vm-bhyve bhyve-firmware
paul@f0:~ % doas sysrc vm_enable=YES
vm_enable:  -> YES
paul@f0:~ % doas sysrc vm_dir=zfs:zroot/bhyve
vm_dir:  -> zfs:zroot/bhyve
paul@f0:~ % doas zfs create zroot/bhyve
paul@f0:~ % doas vm init
paul@f0:~ % doas vm switch create public
paul@f0:~ % doas vm switch add public re0

Bhyve stores all its data in the bhyve dataset of the zroot ZFS pool:

paul@f0:~ % zfs list | grep bhyve
zroot/bhyve                                   1.74M   453G  1.74M  /zroot/bhyve

For convenience, we also create this symlink:

paul@f0:~ % doas ln -s /zroot/bhyve/ /bhyve


Now, Bhyve is ready to rumble, but no VMs are there yet:

paul@f0:~ % doas vm list
NAME  DATASTORE  LOADER  CPU  MEMORY  VNC  AUTO  STATE

Rocky Linux VMs



As guest VMs I decided to use Rocky Linux.

Using Rocky Linux 9 as a VM-based OS is beneficial primarily because of its long-term support and stable release cycle. This ensures a reliable environment that receives security updates and bug fixes for an extended period, reducing the need for frequent upgrades.

Rocky Linux is community-driven and aims to be fully compatible with enterprise Linux, making it a solid choice for consistency and performance in various deployment scenarios.

https://rockylinux.org/

ISO download



We're going to install Rocky Linux from the latest minimal ISO:

paul@f0:~ % doas vm iso \
 https://download.rockylinux.org/pub/rocky/9/isos/x86_64/Rocky-9.5-x86_64-minimal.iso
/zroot/bhyve/.iso/Rocky-9.5-x86_64-minimal.iso        1808 MB 4780 kBps 06m28s
paul@f0:/bhyve % doas vm create rocky

VM configuration



The default Bhyve VM configuration looks like this now:

paul@f0:/bhyve/rocky % cat rocky.conf
loader="bhyveload"
cpu=1
memory=256M
network0_type="virtio-net"
network0_switch="public"
disk0_type="virtio-blk"
disk0_name="disk0.img"
uuid="1c4655ac-c828-11ef-a920-e8ff1ed71ca0"
network0_mac="58:9c:fc:0d:13:3f"

The uuid and the network0_mac differ for each of the three VMs (the ones being installed on f0, f1 and f2).

But to make Rocky Linux boot, the configuration needs a few adjustments. And as we intend to run the majority of the workload in the k3s cluster running on those Linux VMs, we also give them beefy specs: 4 CPU cores and 14GB RAM. So we run doas vm configure rocky and modify the file to:

guest="linux"
loader="uefi"
uefi_vars="yes"
cpu=4
memory=14G
network0_type="virtio-net"
network0_switch="public"
disk0_type="virtio-blk"
disk0_name="disk0.img"
graphics="yes"
graphics_vga=io
uuid="1c45400b-c828-11ef-8871-e8ff1ed71cac"
network0_mac="58:9c:fc:0d:13:3f"

VM installation



To start the installer from the downloaded ISO, we run:

paul@f0:~ % doas vm install rocky Rocky-9.5-x86_64-minimal.iso
Starting rocky
  * found guest in /zroot/bhyve/rocky
  * booting...

paul@f0:/bhyve/rocky % doas vm list
NAME   DATASTORE  LOADER  CPU  MEMORY  VNC           AUTO  STATE
rocky  default    uefi    4    14G     0.0.0.0:5900  No    Locked (f0.lan.buetow.org)

paul@f0:/bhyve/rocky % doas sockstat -4 | grep 5900
root     bhyve       6079 8   tcp4   *:5900                *:*

Port 5900 is now open for VNC connections, so I connected to it with a VNC client and ran through the installation dialogues. This could be done unattended or more automated, but with only three VMs to install, the automation doesn't seem worth it, as we do this only once a year or less often.

Increase of the disk image



By default, the VM disk image is only 20G, which is a bit small for our purposes, so we have to stop the VM again, run truncate on the image file to enlarge it to 100G, and restart the installation:

paul@f0:/bhyve/rocky % doas vm stop rocky
paul@f0:/bhyve/rocky % doas truncate -s 100G disk0.img
paul@f0:/bhyve/rocky % doas vm install rocky Rocky-9.5-x86_64-minimal.iso
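truncate -s only grows the file sparsely, so the operation is instant and costs no disk space until blocks are actually written. A throwaway demo of the same mechanism (using a temp file instead of disk0.img):

```shell
# Throwaway demo: truncate grows a file to the requested size without
# writing any data blocks, just as when enlarging the VM disk image.
f=$(mktemp)
truncate -s 1M "$f"
wc -c < "$f"   # reports the full apparent size
rm "$f"
```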

Connect to VNC



For the installation, I opened the VNC client on my Fedora laptop (GNOME comes with a simple VNC client) and manually ran through the base installation for each of the VMs. Again, I am sure this could have been automated a bit more, but there were just three VMs, and it wasn't worth the effort. The three VNC addresses of the VMs were vnc://f0:5900, vnc://f1:5900, and vnc://f2:5900.





I primarily selected the default settings (auto partitioning on the 100GB drive and a root user password). After the installation, the VMs were rebooted.





After install



We perform the following steps for all three VMs. The examples below were all executed on f0 (for the VM r0 running on f0):

VM auto-start after host reboot



To automatically start the VM on the servers, we add the following to the rc.conf on the FreeBSD hosts:

paul@f0:/bhyve/rocky % cat <<END | doas tee -a /etc/rc.conf
vm_list="rocky"
vm_delay="5"
END
The vm_delay isn't strictly required: it makes vm-bhyve wait 5 seconds before starting each VM, but there is currently only one VM per host. Maybe later, when there are more, this will be useful. After adding these lines, vm list shows a Yes indicator in the AUTO column.

paul@f0:~ % doas vm list
NAME   DATASTORE  LOADER  CPU  MEMORY  VNC           AUTO     STATE
rocky  default    uefi    4    14G     0.0.0.0:5900  Yes [1]  Running (2063)

Static IP configuration



After that, we change the network configuration of the VMs to be static (from DHCP) here. As per the previous post of this series, the three FreeBSD hosts were already in my /etc/hosts file:

192.168.1.130 f0 f0.lan f0.lan.buetow.org
192.168.1.131 f1 f1.lan f1.lan.buetow.org
192.168.1.132 f2 f2.lan f2.lan.buetow.org

For the Rocky VMs, we add those to the FreeBSD host systems as well:

paul@f0:/bhyve/rocky % cat <<END | doas tee -a /etc/hosts
192.168.1.120 r0 r0.lan r0.lan.buetow.org
192.168.1.121 r1 r1.lan r1.lan.buetow.org
192.168.1.122 r2 r2.lan r2.lan.buetow.org
END

And we configure the IPs accordingly on the VMs themselves by opening a root shell via SSH to the VMs and entering the following commands on each of the VMs:

[root@r0 ~] % nmcli connection modify enp0s5 ipv4.address 192.168.1.120/24
[root@r0 ~] % nmcli connection modify enp0s5 ipv4.gateway 192.168.1.1
[root@r0 ~] % nmcli connection modify enp0s5 ipv4.dns 192.168.1.1
[root@r0 ~] % nmcli connection modify enp0s5 ipv4.method manual
[root@r0 ~] % nmcli connection down enp0s5
[root@r0 ~] % nmcli connection up enp0s5
[root@r0 ~] % hostnamectl set-hostname r0.lan.buetow.org
[root@r0 ~] % cat <<END >>/etc/hosts
192.168.1.120 r0 r0.lan r0.lan.buetow.org
192.168.1.121 r1 r1.lan r1.lan.buetow.org
192.168.1.122 r2 r2.lan r2.lan.buetow.org
END

Whereas:

  • 192.168.1.120 is the IP of the VM itself (here: r0.lan.buetow.org)
  • 192.168.1.1 is the address of my home router, which also does DNS.
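Since the three VMs only differ in the last octet of their IP, the per-VM commands can be generated instead of typed by hand. This sketch only prints them (interface name enp0s5 and the addressing scheme as above); piping each line into ssh root@r$i would actually apply them:

```shell
# Print (not run) the static-address nmcli command for r0..r2.
for i in 0 1 2; do
  echo "r$i: nmcli connection modify enp0s5 ipv4.addresses 192.168.1.12$i/24"
done
```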

Permitting root login



As these VMs aren't directly reachable via SSH from the internet, we enable root login by adding a line with PermitRootLogin yes to /etc/ssh/sshd_config.
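A sed one-liner can make that edit idempotently, whether the stock line is commented out or already set. Here it is sketched against a throwaway copy rather than the real sshd_config:

```shell
# Demo on a temp file: replace any existing (possibly commented)
# PermitRootLogin line with an explicit 'PermitRootLogin yes'.
f=$(mktemp)
echo '#PermitRootLogin prohibit-password' > "$f"
sed -i 's/^#\{0,1\}PermitRootLogin.*/PermitRootLogin yes/' "$f"
cat "$f"
rm "$f"
```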

Once done, we reboot the VM by running reboot inside the VM to test whether everything was configured and persisted correctly.

After the reboot, we copy a public SSH key over. For example, I did this from my laptop as follows:

% for i in 0 1 2; do ssh-copy-id root@r$i.lan.buetow.org; done

Then, we edit the /etc/ssh/sshd_config file again on all three VMs and configure PasswordAuthentication no to only allow SSH key authentication from now on.

Install latest updates



[root@r0 ~] % dnf update
[root@r0 ~] % reboot

Stress testing CPU



The aim is to prove that Bhyve VMs are CPU efficient. As I couldn't find an off-the-shelf benchmarking tool available in the same version for both FreeBSD and Rocky Linux 9, I wrote my own silly CPU benchmarking tool in Go:

package main

import "testing"

func BenchmarkCPUSilly1(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = i * i
	}
}

func BenchmarkCPUSilly2(b *testing.B) {
	var sillyResult float64
	for i := 0; i < b.N; i++ {
		sillyResult += float64(i)
		sillyResult *= float64(i)
		divisor := float64(i) + 1
		if divisor > 0 {
			sillyResult /= divisor
		}
	}
	_ = sillyResult // to avoid compiler optimization
}

You can find the repository here:

https://codeberg.org/snonux/sillybench

Silly FreeBSD host benchmark



To install it on FreeBSD, we run:

paul@f0:~ % doas pkg install git go
paul@f0:~ % mkdir ~/git && cd ~/git && \
  git clone https://codeberg.org/snonux/sillybench && \
  cd sillybench

And to run it:

paul@f0:~/git/sillybench % go version
go version go1.24.1 freebsd/amd64

paul@f0:~/git/sillybench % go test -bench=.
goos: freebsd
goarch: amd64
pkg: codeberg.org/snonux/sillybench
cpu: Intel(R) N100
BenchmarkCPUSilly1-4    1000000000               0.4022 ns/op
BenchmarkCPUSilly2-4    1000000000               0.4027 ns/op
PASS
ok      codeberg.org/snonux/sillybench 0.891s

Silly Rocky Linux VM @ Bhyve benchmark



OK, let's compare this with the Rocky Linux VM running on Bhyve:

[root@r0 ~]# dnf install golang git
[root@r0 ~]# mkdir ~/git && cd ~/git && \
  git clone https://codeberg.org/snonux/sillybench && \
  cd sillybench

And to run it:

[root@r0 sillybench]# go version
go version go1.22.9 (Red Hat 1.22.9-2.el9_5) linux/amd64
[root@r0 sillybench]# go test -bench=.
goos: linux
goarch: amd64
pkg: codeberg.org/snonux/sillybench
cpu: Intel(R) N100
BenchmarkCPUSilly1-4    1000000000               0.4347 ns/op
BenchmarkCPUSilly2-4    1000000000               0.4345 ns/op

The Linux benchmark is slightly slower than the FreeBSD one. The Go version is also a bit older. I tried the same with the up-to-date version of Go (1.24.x) with similar results. There could be a slight Bhyve overhead, or FreeBSD is just slightly more efficient in this benchmark. Overall, this shows that Bhyve performs excellently.
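For context, the gap between the two BenchmarkCPUSilly1 figures is small. A quick back-of-the-envelope from the numbers above (0.4022 ns/op on the FreeBSD host, 0.4347 ns/op in the Linux VM):

```shell
# Relative slowdown of the Linux VM run versus the FreeBSD host run.
awk 'BEGIN { printf "%.1f%%\n", (0.4347 - 0.4022) / 0.4022 * 100 }'
```

So roughly an 8% difference, some of which may simply be the different Go builds rather than Bhyve overhead.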

Silly FreeBSD VM @ Bhyve benchmark



But as I am curious and don't want to compare apples with bananas, I decided to install a FreeBSD Bhyve VM to run the same silly benchmark in it. I am not going through the details of how to install a FreeBSD Bhyve VM here; you can easily look it up in the documentation.

But here are the results of running the same silly benchmark in a FreeBSD Bhyve VM with the same FreeBSD and Go versions as the host system (I gave the VM 4 vCPUs and 14GB of RAM; the benchmark won't use that many CPUs (or much memory) anyway):

root@freebsd:~/git/sillybench # go test -bench=.
goos: freebsd
goarch: amd64
pkg: codeberg.org/snonux/sillybench
cpu: Intel(R) N100
BenchmarkCPUSilly1      1000000000               0.4273 ns/op
BenchmarkCPUSilly2      1000000000               0.4286 ns/op
PASS
ok      codeberg.org/snonux/sillybench  0.949s

It's a bit better than Linux! I am sure that this is not really a scientific benchmark, so take the results with a grain of salt!

Benchmarking with ubench



Let's run another, more sophisticated benchmark using ubench, the Unix Benchmark Utility available for FreeBSD. It was installed by simply running doas pkg install ubench. It can benchmark CPU and memory performance. Here, we limit it to one CPU for the first run with -s, and then let it run at full speed (using all available CPUs in parallel) in the second run.

FreeBSD host ubench benchmark



Single CPU:

paul@f0:~ % doas ubench -s 1
Unix Benchmark Utility v.0.3
Copyright (C) July, 1999 PhysTech, Inc.
Author: Sergei Viznyuk <sv@phystech.com>
http://www.phystech.com/download/ubench.html
FreeBSD 14.2-RELEASE-p1 FreeBSD 14.2-RELEASE-p1 GENERIC amd64
Ubench Single CPU:   671010 (0.40s)
Ubench Single MEM:  1705237 (0.48s)
-----------------------------------
Ubench Single AVG:  1188123


All CPUs (with all Bhyve VMs stopped):

paul@f0:~ % doas ubench
Unix Benchmark Utility v.0.3
Copyright (C) July, 1999 PhysTech, Inc.
Author: Sergei Viznyuk <sv@phystech.com>
http://www.phystech.com/download/ubench.html
FreeBSD 14.2-RELEASE-p1 FreeBSD 14.2-RELEASE-p1 GENERIC amd64
Ubench CPU:  2660220
Ubench MEM:  3095182
--------------------
Ubench AVG:  2877701

FreeBSD VM @ Bhyve ubench benchmark



Single CPU:

root@freebsd:~ # ubench -s 1
Unix Benchmark Utility v.0.3
Copyright (C) July, 1999 PhysTech, Inc.
Author: Sergei Viznyuk <sv@phystech.com>
http://www.phystech.com/download/ubench.html
FreeBSD 14.2-RELEASE-p1 FreeBSD 14.2-RELEASE-p1 GENERIC amd64
Ubench Single CPU:   672792 (0.40s)
Ubench Single MEM:   852757 (0.48s)
-----------------------------------
Ubench Single AVG:   762774

Wow, the CPU in the VM was a tiny bit faster than on the host! So this was probably just a glitch in the matrix. Memory seems slower, though.

All CPUs:

root@freebsd:~ # ubench
Unix Benchmark Utility v.0.3
Copyright (C) July, 1999 PhysTech, Inc.
Author: Sergei Viznyuk <sv@phystech.com>
http://www.phystech.com/download/ubench.html
FreeBSD 14.2-RELEASE-p1 FreeBSD 14.2-RELEASE-p1 GENERIC amd64
Ubench CPU:  2652857
swap_pager: out of swap space
swp_pager_getswapspace(27): failed
swap_pager: out of swap space
swp_pager_getswapspace(18): failed
Apr  4 23:02:43 freebsd kernel: pid 862 (ubench), jid 0, uid 0, was killed: failed to reclaim memory
swp_pager_getswapspace(6): failed
Apr  4 23:02:46 freebsd kernel: pid 863 (ubench), jid 0, uid 0, was killed: failed to reclaim memory
Apr  4 23:02:47 freebsd kernel: pid 864 (ubench), jid 0, uid 0, was killed: failed to reclaim memory
Apr  4 23:02:48 freebsd kernel: pid 865 (ubench), jid 0, uid 0, was killed: failed to reclaim memory
Apr  4 23:02:49 freebsd kernel: pid 861 (ubench), jid 0, uid 0, was killed: failed to reclaim memory
Apr  4 23:02:51 freebsd kernel: pid 839 (ubench), jid 0, uid 0, was killed: failed to reclaim memory

The multi-CPU benchmark in the Bhyve VM ran with almost identical results to the FreeBSD host system. However, the memory benchmark failed with out-of-swap space errors. I am unsure why, as the VM has 14GB RAM, but I am not investigating further.

Also, during the benchmark, I noticed the bhyve process on the host was constantly using 399% of the CPU (all 4 CPUs).

  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
 7449 root         14  20    0    14G    78M kqread   2   2:12 399.81% bhyve

Overall, Bhyve has a small overhead, but the CPU performance difference is negligible. The FreeBSD host is slightly faster than the FreeBSD VM running on Bhyve, but the difference is small enough for our use cases. The memory benchmark seems slightly off, but I'm not sure whether to trust it, especially due to the swap errors. Does ubench's memory benchmark use swap space for the memory test? That wouldn't make sense and might explain the difference to some degree, though. Do you have any ideas?

Rocky Linux VM @ Bhyve ubench benchmark



Unfortunately, I wasn't able to find ubench in any of the Rocky Linux repositories. So, I skipped this test.

Update: Improving Disk I/O Performance for etcd



Updated: Fri 26 Dec 08:51:23 EET 2025

After running k3s for some time, I noticed frequent etcd leader elections and "apply request took too long" warnings in the logs. Investigation revealed that etcd's sync writes were extremely slow - around 250 kB/s with the default virtio-blk disk emulation. etcd requires fast sync writes (ideally under 10ms fsync latency) for stable operation.

The Problem



The k3s logs showed etcd struggling with disk I/O:

{"level":"warn","msg":"apply request took too long","took":"4.996516657s","expected-duration":"100ms"}
{"level":"warn","msg":"slow fdatasync","took":"1.328469363s","expected-duration":"1s"}

A simple sync write benchmark confirmed the issue:

[root@r0 ~]# dd if=/dev/zero of=/tmp/test bs=4k count=2000 oflag=dsync
8192000 bytes copied, 31.7058 s, 258 kB/s

The Solution: Switch to NVMe Emulation



Bhyve's NVMe emulation provides significantly better I/O performance than virtio-blk.

Step 1: Prepare the Guest OS



Before changing the disk type, the guest needs NVMe drivers in the initramfs and LVM must be configured to scan all devices (not just those recorded during installation):

[root@r0 ~]# cat > /etc/dracut.conf.d/nvme.conf << EOF
add_drivers+=" nvme nvme_core "
hostonly=no
EOF

[root@r0 ~]# sed -i 's/# use_devicesfile = 1/use_devicesfile = 0/' /etc/lvm/lvm.conf
[root@r0 ~]# dracut -f
[root@r0 ~]# shutdown -h now

The hostonly=no setting ensures the initramfs includes drivers for hardware not currently present. The use_devicesfile = 0 tells LVM to scan all block devices rather than only those recorded in /etc/lvm/devices/system.devices - this is important because the device path changes from /dev/vda to /dev/nvme0n1.

Step 2: Update the Bhyve Configuration



On the FreeBSD host, update the VM configuration to use NVMe:

paul@f0:~ % doas vm stop rocky
paul@f0:~ % doas vm configure rocky

Change disk0_type from virtio-blk to nvme:

disk0_type="nvme"

Then start the VM:

paul@f0:~ % doas vm start rocky

Benchmark Results



After switching to NVMe emulation, the sync write performance improved dramatically:

[root@r0 ~]# dd if=/dev/zero of=/tmp/test bs=4k count=2000 oflag=dsync
8192000 bytes copied, 0.330718 s, 24.8 MB/s

That's approximately 100x faster than before (24.8 MB/s vs 258 kB/s).
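The factor can be sanity-checked from the two dd results (258 kB/s before, 24.8 MB/s = 24800 kB/s after):

```shell
# Speedup factor of the NVMe sync writes over the virtio-blk ones.
awk 'BEGIN { printf "%.0fx\n", 24800 / 258 }'
```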

The etcd metrics also showed healthy fsync latencies:

etcd_disk_wal_fsync_duration_seconds_bucket{le="0.001"} 347
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.002"} 396
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.004"} 408

Most fsyncs now complete in under 1ms, and there are no more "slow fdatasync" warnings in the logs. The k3s cluster is now stable without spurious leader elections.
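The histogram buckets are cumulative, so the share of fsyncs finishing under 1 ms, among the 408 recorded up to the 4 ms bucket, follows directly from the counts above:

```shell
# 347 of the 408 fsyncs counted up to the le="0.004" bucket were under 1 ms.
awk 'BEGIN { printf "%.0f%%\n", 347 / 408 * 100 }'
```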

Important Notes



  • Do NOT use disk0_opts="nocache,direct" with NVMe emulation - in my testing this actually made performance worse.
  • The guest OS must have NVMe drivers in the initramfs before switching, otherwise it won't boot.
  • LVM's devices file feature (enabled by default in RHEL 9 / Rocky Linux 9) must be disabled to allow booting from a different device path.

Conclusion



Having Linux VMs running inside FreeBSD's Bhyve is a solid move for future f3s hosting in my home lab. Bhyve provides a reliable way to manage VMs without much hassle. With Linux VMs, I can tap into all the cool stuff (e.g., Kubernetes, eBPF, systemd) in the Linux world while keeping the steady reliability of FreeBSD.

Future uses (out of scope for this blog series) would be additional VMs for different workloads. For example, how about a Windows or NetBSD VM to tinker with?

This flexibility is great for keeping options open and managing different workloads without overcomplicating things. Overall, it's a nice setup for getting the most out of my hardware and keeping things running smoothly.

Read the next post of this series:

f3s: Kubernetes with FreeBSD - Part 5: WireGuard mesh network

Other *BSD-related posts:

2025-12-07 f3s: Kubernetes with FreeBSD - Part 8: Observability
2025-10-02 f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments
2025-07-14 f3s: Kubernetes with FreeBSD - Part 6: Storage
2025-05-11 f3s: Kubernetes with FreeBSD - Part 5: WireGuard mesh network
2025-04-05 f3s: Kubernetes with FreeBSD - Part 4: Rocky Linux Bhyve VMs (You are currently reading this)
2025-02-01 f3s: Kubernetes with FreeBSD - Part 3: Protecting from power cuts
2024-12-03 f3s: Kubernetes with FreeBSD - Part 2: Hardware and base installation
2024-11-17 f3s: Kubernetes with FreeBSD - Part 1: Setting the stage
2024-04-01 KISS high-availability with OpenBSD
2024-01-13 One reason why I love OpenBSD
2022-10-30 Installing DTail on OpenBSD
2022-07-30 Let's Encrypt with OpenBSD and Rex
2016-04-09 Jails and ZFS with Puppet on FreeBSD

E-Mail your comments to paul@nospam.buetow.org

Back to the main site
Sharing on Social Media with Gos v1.0.0 https://foo.zone/gemfeed/2025-03-05-sharing-on-social-media-with-gos.html 2025-03-04T21:22:07+02:00 Paul Buetow aka snonux paul@dev.buetow.org As you may have noticed, I like to share on Mastodon and LinkedIn all the technical things I find interesting, and this blog post is technically all about that.

Sharing on Social Media with Gos v1.0.0



Published at 2025-03-04T21:22:07+02:00

As you may have noticed, I like to share on Mastodon and LinkedIn all the technical things I find interesting, and this blog post is technically all about that.

Gos logo

Table of Contents




Introduction



Gos is a Go-based replacement (which I wrote) for Buffer.com, providing the ability to schedule and manage social media posts from the command line. It can be run, for example, every time you open a new shell, or throttled to only once every N hours.

I used Buffer.com to schedule and post my social media messages for a long time. However, over time, there were more problems with that service, including a slow and unintuitive UI, and the free version only allows scheduling up to 10 messages. At one point, they started to integrate an AI assistant (which would seemingly randomly pop up in separate JavaScript-powered input boxes), and then I had enough and decided I had to build my own social sharing tool—and Gos was born.

https://buffer.com
https://codeberg.org/snonux/gos

Gos features



  • Mastodon and LinkedIn support.
  • Dry run mode for testing posts without actually publishing.
  • Configurable via flags and environment variables.
  • Easy to integrate into automated workflows.
  • OAuth2 authentication for LinkedIn.
  • Image previews for LinkedIn posts.

Installation



Prerequisites



The prerequisites are:

  • Go (version 1.24 or later)
  • A supported browser, such as Firefox or Chrome, for OAuth2.

Build and install



Clone the repository:

git clone https://codeberg.org/snonux/gos.git
cd gos

Build the binaries:

go build -o gos ./cmd/gos
go build -o gosc ./cmd/gosc
mv gos ~/go/bin
mv gosc ~/go/bin

Or, if you want to use the Taskfile:

go-task install

Configuration



Gos requires a configuration file to store API secrets and OAuth2 credentials for each supported social media platform. The configuration is managed using a Secrets structure, which is stored as a JSON file in ~/.config/gos/gos.json.

Example Configuration File (~/.config/gos/gos.json):

{
  "MastodonURL": "https://mastodon.example.com",
  "MastodonAccessToken": "your-mastodon-access-token",
  "LinkedInClientID": "your-linkedin-client-id",
  "LinkedInSecret": "your-linkedin-client-secret",
  "LinkedInRedirectURL": "http://localhost:8080/callback"
}
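Since a stray trailing comma makes the file invalid JSON, it's worth validating after editing. A sketch using Python's json.tool on a throwaway copy (the real path is ~/.config/gos/gos.json; the field values here are placeholders):

```shell
# Validate a gos-style config file; json.tool exits non-zero on bad JSON.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
{
  "MastodonURL": "https://mastodon.example.com",
  "MastodonAccessToken": "your-mastodon-access-token"
}
EOF
python3 -m json.tool "$cfg" > /dev/null && echo "valid JSON"
rm "$cfg"
```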

Configuration fields



  • MastodonURL: The base URL of the Mastodon instance you are using (e.g., https://mastodon.social).
  • MastodonAccessToken: Your access token for the Mastodon API, which is used to authenticate your posts.
  • LinkedInClientID: The client ID for your LinkedIn app, which is needed for OAuth2 authentication.
  • LinkedInSecret: The client secret for your LinkedIn app.
  • LinkedInRedirectURL: The redirect URL configured for handling OAuth2 responses.
  • LinkedInAccessToken: Gos will automatically update this after successful OAuth2 authentication with LinkedIn.
  • LinkedInPersonID: Gos will automatically update this after successful OAuth2 authentication with LinkedIn.

Automatically managed fields



Once you finish the OAuth2 setup (after the initial run of gos), some fields, like LinkedInAccessToken and LinkedInPersonID, will get filled in automatically. To check if everything's working without actually posting anything, you can run the app in dry run mode with the --dry option. After OAuth2 is successful, the file will be updated with LinkedInAccessToken and LinkedInPersonID. If the access token expires, Gos will go through the OAuth2 process again.

Invoking Gos



Gos is a command-line tool for posting updates to multiple social media platforms. You can run it with various flags to customize its behaviour, such as posting in dry run mode, limiting posts by size, or targeting specific platforms.

Flags control the tool's behaviour. Below are several common ways to invoke Gos and descriptions of the available flags.

Common flags



  • -dry: Run the application in dry run mode, simulating operations without making any changes.
  • -version: Display the current version of the application.
  • -compose: Compose a new entry. Default is set by composeEntryDefault.
  • -gosDir: Specify the directory for Gos' queue and database files. The default is ~/.gosdir.
  • -cacheDir: Specify the directory for Gos' cache. The default is based on the gosDir path.
  • -browser: Choose the browser for OAuth2 processes. The default is "firefox".
  • -configPath: Path to the configuration file. Default is ~/.config/gos/gos.json.
  • -platforms: The enabled platforms and their post size limits. The default is "Mastodon:500,LinkedIn:1000".
  • -target: Target number of posts per week. The default is 2.
  • -minQueued: Minimum number of queued items before a warning message is printed. The default is 4.
  • -maxDaysQueued: Maximum number of days' worth of queued posts before the target increases and pauseDays decreases. The default is 365.
  • -pauseDays: Number of days until the next post can be submitted. The default is 3.
  • -runInterval: Number of hours until the next post run. The default is 12.
  • -lookback: The number of days to look back in time to review posting history. The default is 30.
  • -geminiSummaryFor: Generate a Gemini Gemtext format summary specifying months as a comma-separated string.
  • -geminiCapsules: Comma-separated list of Gemini capsules. Used to detect Gemtext links.
  • -gemtexterEnable: Add special tags for Gemtexter, the static site generator, to the Gemini Gemtext summary.
  • -dev: For internal development purposes only.

Examples



*Dry run mode*

Dry run mode lets you simulate the entire posting process without actually sending the posts. This is useful for testing configurations or seeing what would happen before making real posts.

./gos --dry

*Normal run*

Sharing to all platforms is as simple as the following (assuming it is configured correctly):

./gos 

:-)

Gos Screenshot

However, you will notice that no messages are queued for posting yet (unlike in the screenshot!). Relax and read on...

Composing messages to be posted



To post messages using Gos, you need to create text files containing the posts' content. These files are placed inside the directory specified by the --gosDir flag (the default directory is ~/.gosdir). Each text file represents a single post and must have the .txt extension. You can also simply run gos --compose to compose a new entry; it will open a new text file in the gosDir.

Basic structure of a message file



Each text file should contain the message you want to post on the specified platforms. That's it. Example of a basic post file, ~/.gosdir/samplepost.txt:

This is a sample message to be posted on social media platforms.

Maybe add a link here: https://foo.zone

#foo #cool #gos #golang

The message is just arbitrary text, and, besides inline share tags (see later in this document) at the beginning, Gos does not parse any of the content other than ensuring the overall allowed size for the social media platform isn't exceeded. If it exceeds the limit, Gos will prompt you to edit the post using your standard text editor (as specified by the EDITOR environment variable). When posting, all the hyperlinks, hashtags, etc., are interpreted by the social platforms themselves (e.g., Mastodon, LinkedIn).

Adding share tags in the filename



You can control which platforms a post is shared to, and manage other behaviours, using tags embedded in the filename. Add a tag in the format share:platform1:-platform2 to the filename to target specific platforms. This instructs Gos to share the message only to platform1 (e.g., Mastodon) and to explicitly exclude platform2 (e.g., LinkedIn). You can include multiple platforms by listing them after share:, separated by a :. Use the - symbol to exclude a platform.

Currently, only linkedin and mastodon are supported, and the shortcuts li and ma also work.

**Examples:**

  • To share only on Mastodon: ~/.gosdir/foopost.share:mastodon.txt
  • To exclude sharing on LinkedIn: ~/.gosdir/foopost.share:-linkedin.txt
  • To explicitly share on both LinkedIn and Mastodon: ~/.gosdir/foopost.share:linkedin:mastodon.txt
  • To explicitly share only on LinkedIn and exclude Mastodon: ~/.gosdir/foopost.share:linkedin:-mastodon.txt
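
To illustrate the filename convention, here is a small Go sketch (hypothetical, not the actual Gos code) that parses the share tags out of such a filename, including the li/ma shortcuts:

```go
package main

import (
	"fmt"
	"strings"
)

// shareTargets parses the share tag in a filename such as
// "foopost.share:linkedin:-mastodon.txt" and returns the included and
// excluded platforms. Illustrative sketch only.
func shareTargets(filename string) (include, exclude []string) {
	short := map[string]string{"li": "linkedin", "ma": "mastodon"}
	for _, part := range strings.Split(filename, ".") {
		if !strings.HasPrefix(part, "share:") {
			continue // not the share tag segment of the filename
		}
		for _, p := range strings.Split(strings.TrimPrefix(part, "share:"), ":") {
			name := strings.TrimPrefix(p, "-")
			if full, ok := short[name]; ok {
				name = full // expand li/ma shortcuts
			}
			if strings.HasPrefix(p, "-") {
				exclude = append(exclude, name)
			} else if name != "" {
				include = append(include, name)
			}
		}
	}
	return include, exclude
}

func main() {
	in, ex := shareTargets("foopost.share:linkedin:-mastodon.txt")
	fmt.Println(in, ex) // [linkedin] [mastodon]
}
```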

Besides encoding share tags in the filename, they can also be embedded within the .txt file content to be queued. For example, a file named ~/.gosdir/foopost.txt with the following content:

share:mastodon The content of the post here

or

share:mastodon

The content of the post is here https://some.foo/link

#some #hashtags

Gos will parse this content, extract the tags, and queue it as ~/.gosdir/db/platforms/mastodon/foopost.share:mastodon.extracted.txt.... (see how post queueing works later in this document).
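
A sketch of how such a leading tag token could be extracted from the content (again a hypothetical illustration, not the actual Gos parser):

```go
package main

import (
	"fmt"
	"strings"
)

// extractEmbedded pulls a leading tag token like "share:mastodon,ask,prio"
// out of a post's content, assuming the tags appear as the very first
// whitespace-separated token, as in the examples above.
func extractEmbedded(content string) (tags []string, body string) {
	c := strings.TrimSpace(content)
	first := c
	if i := strings.IndexAny(c, " \n"); i >= 0 {
		first, body = c[:i], strings.TrimSpace(c[i:])
	}
	if !strings.HasPrefix(first, "share:") {
		return nil, c // no leading tag token: everything is content
	}
	return strings.Split(first, ","), body
}

func main() {
	tags, body := extractEmbedded("share:mastodon,ask,prio\n\nHello World :-)")
	fmt.Println(tags, body) // [share:mastodon ask prio] Hello World :-)
}
```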

Using the prio tag



Gos randomly picks any queued message without any specific order or priority. However, you can assign a higher priority to a message. The priority determines the order in which posts are processed, with messages without a priority tag being posted last and those with priority tags being posted first. If multiple messages have the priority tag, then a random message will be selected from them.

*Examples using the Priority tag:*

  • To share only on Mastodon: ~/.gosdir/foopost.prio.share:mastodon.txt
  • To not share on LinkedIn: ~/.gosdir/foopost.prio.share:-linkedin.txt
  • To explicitly share on both: ~/.gosdir/foopost.prio.share:linkedin:mastodon.txt
  • To explicitly share on only linkedin: ~/.gosdir/foopost.prio.share:linkedin:-mastodon.txt

There is more: you can also use the soon tag. It is almost the same as the prio tag, just one priority level lower.
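
The described selection rule could be sketched in Go like this (assumed behaviour for illustration; the real implementation may differ):

```go
package main

import (
	"fmt"
	"math/rand"
	"strings"
)

// pickNext selects the next queued file: files tagged .prio. come first,
// then .soon., then everything else; ties are broken randomly.
func pickNext(files []string) string {
	rank := func(f string) int {
		switch {
		case strings.Contains(f, ".prio."):
			return 0
		case strings.Contains(f, ".soon."):
			return 1
		default:
			return 2
		}
	}
	bestRank, best := 3, []string(nil)
	for _, f := range files {
		switch r := rank(f); {
		case r < bestRank:
			bestRank, best = r, []string{f} // strictly better rank: restart candidates
		case r == bestRank:
			best = append(best, f) // same rank: another tie candidate
		}
	}
	if len(best) == 0 {
		return ""
	}
	return best[rand.Intn(len(best))] // random pick among equal-rank files
}

func main() {
	queued := []string{"a.txt", "b.prio.share:mastodon.txt", "c.soon.txt"}
	fmt.Println(pickNext(queued)) // b.prio.share:mastodon.txt
}
```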

More tags



  • A .ask. in the filename will prompt you to choose whether to queue, edit, or delete a file before queuing it.
  • A .now. in the filename will schedule a post immediately, regardless of the target status.

So you could also have filenames like these:

  • ~/.gosdir/foopost.ask.txt
  • ~/.gosdir/foopost.now.txt
  • ~/.gosdir/foopost.ask.share:mastodon.txt
  • ~/.gosdir/foopost.ask.prio.share:mastodon.txt
  • ~/.gosdir/foopost.ask.now.share:-mastodon.txt
  • ~/.gosdir/foopost.now.share:-linkedin.txt

etc...

All of the above also works with embedded tags. E.g.:

share:mastodon,ask,prio Hello world :-)

or

share:mastodon,ask,prio

Hello World :-)

The gosc binary



gosc stands for Gos Composer and will simply launch your $EDITOR on a new text file in the gosDir. It's the same as running gos --compose, really. It is a quick way of composing new posts. Once composed, it will ask for your confirmation on whether the message should be queued or not.

How queueing works in gos



When you place a message file in the gosDir, Gos processes it by moving the message through a queueing system before posting it to the target social media platforms. A message's lifecycle includes several key stages, from creation to posting, all managed through the ./db/platforms/PLATFORM directories.

Step-by-step queueing process



1. Inserting a Message into gosDir: You start by creating a text file that represents your post (e.g., foo.txt) and placing it in the gosDir. When Gos runs, this file is processed. The easiest way is to use gosc here.

2. Moving to the Queue: Upon running Gos, the tool identifies the message in the gosDir and places it into the queue for the specified platform. The message is moved into the appropriate directory for each platform in ./db/platforms/PLATFORM. During this stage, the message file is renamed to include a timestamp indicating when it was queued and given a .queued extension.

*Example: If a message is queued for LinkedIn, the filename might look like this:*

~/.gosdir/db/platforms/linkedin/foo.share:-mastodon.txt.20241022-102343.queued

3. Posting the Message: Once a message is placed in the queue, Gos posts it to the specified social media platforms.

4. Renaming to .posted: After a message is successfully posted to a platform, the corresponding .queued file is renamed to have a .posted extension, and the filename timestamp is also updated. This signals that the post has been processed and published.

*Example - After a successful post to LinkedIn, the message file might look like this:*

./db/platforms/linkedin/foo.share:-mastodon.txt.20241112-121323.posted

How message selection works in gos



Gos decides which messages to post using a combination of priority, platform-specific tags, and timing rules. The message selection process ensures that messages are posted according to your configured cadence and targets while respecting pauses between posts and previously met goals.

The key factors in message selection are:

  • Target Number of Posts Per Week: The -target flag defines how many posts per week should be made to a specific platform. This target helps Gos manage the posting rate, ensuring that the right number of posts are made without exceeding the desired frequency.
  • Post History Lookback: The -lookback flag tells Gos how many days back to look in the post history to calculate whether the weekly post target has already been met. It ensures that previously posted content is considered before deciding to queue up another message.
  • Message Priority: Messages with no priority value are processed after those with priority. If two messages have the same priority, one is selected randomly.
  • Pause Between Posts: The -pauseDays flag allows you to specify a minimum number of days to wait between posts for the same platform. This prevents oversaturation of content and ensures that posts are spread out over time.

Database replication



I simply use Syncthing to back up/sync my gosDir. Note that I run Gos on my personal laptop; there is no need to run it from a server.

https://syncthing.net

Post summary as gemini gemtext



For my blog, I want to post a summary of all the social messages posted over the last couple of months. For an example, have a look here:

./2025-01-01-posts-from-october-to-december-2024.html

To accomplish this, run:

gos --geminiSummaryFor 202410,202411,202412

This outputs the summary for the three specified months, as shown in the example. The summary includes posts from all social media networks but removes duplicates.

Also, add the --gemtexterEnable flag, if you are using Gemtexter:


gos --gemtexterEnable --geminiSummaryFor 202410,202411,202412

Gemtexter

In case there are HTTP links that translate directly to the Geminispace for certain capsules, specify the Gemini capsules as a comma-separated list as follows:

gos --gemtexterEnable --geminiSummaryFor 202410,202411,202412 --geminiCapsules "foo.zone,paul.buetow.org"

It will then also generate Gemini Gemtext links in the summary page and flag them with (Gemini).

Conclusion



Overall, this was a fun little Go project with practical use for me personally. I hope you also had fun reading this, and maybe you will use it as well.

E-Mail your comments to paul@nospam.buetow.org :-)

Back to the main site
Random Weird Things - Part Ⅱ https://foo.zone/gemfeed/2025-02-08-random-weird-things-ii.html 2025-02-08T11:06:16+02:00 Paul Buetow aka snonux paul@dev.buetow.org Every so often, I come across random, weird, and unexpected things on the internet. I thought it would be neat to share them here from time to time. This is the second run.

Random Weird Things - Part Ⅱ



Published at 2025-02-08T11:06:16+02:00

Every so often, I come across random, weird, and unexpected things on the internet. I thought it would be neat to share them here from time to time. This is the second run.

2024-07-05 Random Weird Things - Part Ⅰ
2025-02-08 Random Weird Things - Part Ⅱ (You are currently reading this)
2025-08-15 Random Weird Things - Part Ⅲ

/\_/\           /\_/\
( o.o ) WHOA!! ( o.o )
> ^ <           > ^ <
/   \    MOEEW! /   \
/______\       /______\

Table of Contents




11. The SQLite codebase is a gem



Check this out:

SQLite Gem

Source:

https://wetdry.world/@memes/112717700557038278

Go Programming



12. Official Go font



The Go programming language has an official font called "Go Font." It was created to complement the aesthetic of the Go language, ensuring clear and legible rendering of code. The font includes a monospace version for code and a proportional version for general text, supporting consistent look and readability in Go-related materials and development environments.

Check out some Go code displayed using the Go font:

Go font code

https://go.dev/blog/go-fonts

The design emphasizes simplicity and readability, reflecting Go's philosophy of clarity and efficiency.

I found it interesting and/or weird, as Go is a programming language. Why should it bother having its own font? I have never seen another open-source project like Go do this. But I also like it. Maybe I will use it in the future for this blog :-)

13. Go functions can have methods



Methods on struct types? Well known. Methods on types like int and string? Also known, but a bit less common. Methods on function types? That sounds a bit funky, but it's possible, too! For demonstration, have a look at this snippet:

package main

import "log"

type fun func() string

func (f fun) Bar() string {
        return "Bar"
}

func main() {
        var f fun = func() string {
                return "Foo"
        }
        log.Println("Example 1: ", f())
        log.Println("Example 2: ", f.Bar())
        log.Println("Example 3: ", fun(f.Bar).Bar())
        log.Println("Example 4: ", fun(fun(f.Bar).Bar).Bar())
}

It runs just fine:

❯ go run main.go
2025/02/07 22:56:14 Example 1:  Foo
2025/02/07 22:56:14 Example 2:  Bar
2025/02/07 22:56:14 Example 3:  Bar
2025/02/07 22:56:14 Example 4:  Bar

macOS



For personal computing, I don't use Apple, but I have to use it for work.

14. ß and ss are treated the same



Know German? In German, the letter "sharp s" is written as ß. ß is treated the same as ss on macOS.

On a case-insensitive file system, as used by macOS, not only are uppercase and lowercase letters treated the same, but special characters like the German "ß" are also considered equivalent to their ASCII counterparts (in this case, "ss").

So, even though "Maß" and "Mass" are not strictly equivalent, the macOS file system still treats them as the same filename due to its handling of Unicode characters. This can sometimes lead to unexpected behaviour. Check this out:

❯ touch Maß
❯ ls -l
-rw-r--r--@ 1 paul  wheel  0 Feb  7 23:02 Maß
❯ touch Mass
❯ ls -l
-rw-r--r--@ 1 paul  wheel  0 Feb  7 23:02 Maß
❯ rm Mass
❯ ls -l

❯ touch Mass
❯ ls -ltr
-rw-r--r--@ 1 paul  wheel  0 Feb  7 23:02 Mass
❯ rm Maß
❯ ls -l


15. Colon as file path separator



Classic Mac OS used the colon as the file path separator on its HFS file system. A typical HFS file path might be:

Macintosh HD:Documents:Techwriter:Myfile

I can't reproduce this on my (work) Mac, though, as it now uses the APFS file system with the familiar / separator. In essence, HFS is an older file system, while APFS is a contemporary file system optimized for Apple's modern devices.

https://social.jvns.ca/@b0rk/113041293527832730

16. Polyglots - programs written in multiple languages



A coding polyglot is a program or script written so that it can be executed in multiple programming languages without modification. This is typically achieved by leveraging syntax overlaps or crafting valid and meaningful code in each targeted language. Polyglot programs are often created as a challenge or for demonstration purposes to showcase language similarities or clever coding techniques.

Check out my very own polyglot:

The fibonatti.pl.c Polyglot

17. Languages, where indices start at 1



Array indices start at 1 instead of 0 in some programming languages, known as one-based indexing. This can be controversial because zero-based indexing is more common in popular languages like C, C++, Java, and Python. One-based indexing can lead to off-by-one errors when developers switch between languages with different indexing schemes.

Languages with One-Based Indexing:

  • Fortran
  • MATLAB
  • Lua
  • R (for vectors and lists)
  • Smalltalk
  • Julia (by default, although zero-based indexing is also possible)

foo.lua example:

arr = {10, 20, 30, 40, 50}
print(arr[1]) -- Accessing the first element

❯ lua foo.lua
10

One-based indexing is more natural for human-readable, mathematical, and theoretical contexts, where counting traditionally starts from one.

18. Perl Poetry



Perl Poetry is a playful and creative practice within the programming community where Perl code is written as a poem. These poems are crafted to be syntactically valid Perl code and make sense as poetic text, often with whimsical or humorous intent. This showcases Perl's flexibility and expressiveness, as well as the creativity of its programmers.

See this poetry of my own; the Perl interpreter does not yield any syntax errors parsing it. But the poem also doesn't do anything useful when executed:

# (C) 2006 by Paul C. Buetow

Christmas:{time;#!!!

Children: do tell $wishes;

Santa: for $each (@children) { 
BEGIN { read $each, $their, wishes and study them; use Memoize#ing

} use constant gift, 'wrapping'; 
package Gifts; pack $each, gift and bless $each and goto deliver
or do import if not local $available,!!! HO, HO, HO;

redo Santa, pipe $gifts, to_childs;
redo Santa and do return if last one, is, delivered; 

deliver: gift and require diagnostics if our $gifts ,not break;
do{ use NEXT; time; tied $gifts} if broken and dump the, broken, ones;
The_children: sleep and wait for (each %gift) and try { to => untie $gifts };

redo Santa, pipe $gifts, to_childs;
redo Santa and do return if last one, is, delivered; 

The_christmas_tree: formline s/ /childrens/, $gifts;
alarm and warn if not exists $Christmas{ tree}, @t, $ENV{HOME};  
write <<EMail
 to the parents to buy a new christmas tree!!!!111
 and send the
EMail
;wait and redo deliver until defined local $tree;

redo Santa, pipe $gifts, to_childs;
redo Santa and do return if last one, is, delivered ;}

END {} our $mission and do sleep until next Christmas ;}

__END__

This is perl, v5.8.8 built for i386-freebsd-64int

More Perl Poetry of mine

19. CSS3 is Turing complete



CSS3 is (arguably) Turing complete because, together with HTML and minimal user interaction, it can simulate a Turing machine using only animations and styles, without any JavaScript or external logic. This is achieved by using keyframe animations to change the styles of HTML elements in a way that encodes computation, performing calculations and state transitions.

Is CSS turing complete?

It is surprising because CSS is primarily a styling language intended for the presentation layer of web pages, not for computation or logic. Its capability to perform complex computations defies its typical use case and showcases the unintended computational power that can emerge from the creative use of seemingly straightforward technologies.

Check out this 100% CSS implementation of Conway's Game of Life:



CSS Conway's Game of Life

Conway's Game of Life is Turing complete because it can simulate a universal Turing machine, meaning it can perform any computation a computer can, given the right initial conditions and sufficient time and space. So if a language can implement Conway's Game of Life, it demonstrates that the language has the necessary constructs (like iteration, conditionals, and data manipulation) to handle complex state transitions and simulate any algorithm, confirming its Turing completeness.
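
For comparison, here is what one generation step of the Game of Life looks like in a conventional language like Go; the CSS version encodes exactly this kind of state transition with selectors and keyframe animations instead of code:

```go
package main

import "fmt"

// step computes one Game of Life generation on a small bounded grid.
func step(g [][]bool) [][]bool {
	rows, cols := len(g), len(g[0])
	next := make([][]bool, rows)
	for y := range next {
		next[y] = make([]bool, cols)
		for x := range next[y] {
			n := 0 // count live neighbours of cell (y, x)
			for dy := -1; dy <= 1; dy++ {
				for dx := -1; dx <= 1; dx++ {
					if dx == 0 && dy == 0 {
						continue
					}
					yy, xx := y+dy, x+dx
					if yy >= 0 && yy < rows && xx >= 0 && xx < cols && g[yy][xx] {
						n++
					}
				}
			}
			// Conway's rules: a cell is born with 3 neighbours,
			// and survives with 2 or 3.
			next[y][x] = n == 3 || (g[y][x] && n == 2)
		}
	}
	return next
}

func main() {
	// A blinker: a vertical bar of three live cells flips to horizontal.
	g := [][]bool{
		{false, true, false},
		{false, true, false},
		{false, true, false},
	}
	fmt.Println(step(g)[1]) // [true true true]
}
```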

20. The biggest shell programs



One would think that shell scripts are only suitable for small tasks. Well, that turns out to be wrong, as there are huge shell programs out there (up to 87k LOC) which aren't auto-generated but hand-written!

The Biggest Shell Programs in the World

My Gemtexter (bash) is only 1329 LOC as of now. So it's tiny.

Gemtexter - One Bash script to rule it all

I hope you had some fun. E-Mail your comments to paul@nospam.buetow.org :-)

Back to the main site
f3s: Kubernetes with FreeBSD - Part 3: Protecting from power cuts https://foo.zone/gemfeed/2025-02-01-f3s-kubernetes-with-freebsd-part-3.html 2025-01-30T09:22:06+02:00 Paul Buetow aka snonux paul@dev.buetow.org This is the third blog post about my f3s series for my self-hosting demands in my home lab. f3s? The 'f' stands for FreeBSD, and the '3s' stands for k3s, the Kubernetes distribution we will use on FreeBSD-based physical machines.

f3s: Kubernetes with FreeBSD - Part 3: Protecting from power cuts



Published at 2025-01-30T09:22:06+02:00

This is the third blog post about my f3s series for my self-hosting demands in my home lab. f3s? The "f" stands for FreeBSD, and the "3s" stands for k3s, the Kubernetes distribution we will use on FreeBSD-based physical machines.

2024-11-17 f3s: Kubernetes with FreeBSD - Part 1: Setting the stage
2024-12-03 f3s: Kubernetes with FreeBSD - Part 2: Hardware and base installation
2025-02-01 f3s: Kubernetes with FreeBSD - Part 3: Protecting from power cuts (You are currently reading this)
2025-04-05 f3s: Kubernetes with FreeBSD - Part 4: Rocky Linux Bhyve VMs
2025-05-11 f3s: Kubernetes with FreeBSD - Part 5: WireGuard mesh network
2025-07-14 f3s: Kubernetes with FreeBSD - Part 6: Storage
2025-10-02 f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments
2025-12-07 f3s: Kubernetes with FreeBSD - Part 8: Observability

f3s logo

Table of Contents




Introduction



In this blog post, we are setting up the UPS for the cluster. A UPS, or Uninterruptible Power Supply, safeguards my cluster from unexpected power outages and surges. It acts as a backup battery that kicks in when the electricity cuts out—especially useful in my area, where power cuts are frequent—allowing for a graceful system shutdown and preventing data loss and corruption. This is especially important since I will also store some of my data on the f3s nodes.

Changes since last time



FreeBSD upgrade from 14.1 to 14.2



There has been a new release since the last blog post in this series. The upgrade from 14.1 was as easy as:

paul@f0: ~ % doas freebsd-update fetch
paul@f0: ~ % doas freebsd-update install
paul@f0: ~ % doas freebsd-update -r 14.2-RELEASE upgrade
paul@f0: ~ % doas freebsd-update install
paul@f0: ~ % doas shutdown -r now

And after rebooting, I ran:

paul@f0: ~ % doas freebsd-update install
paul@f0: ~ % doas pkg update
paul@f0: ~ % doas pkg upgrade
paul@f0: ~ % doas shutdown -r now

And after another reboot, I was on 14.2:

paul@f0:~ % uname -a
FreeBSD f0.lan.buetow.org 14.2-RELEASE FreeBSD 14.2-RELEASE 
 releng/14.2-n269506-c8918d6c7412 GENERIC amd64

And, of course, I ran this on all 3 nodes!

A new home (behind the TV)



I've put all the infrastructure behind my TV, as plenty of space is available. The TV hides most of the setup, which drastically improved the SAF (spouse acceptance factor).

New hardware placement arrangement

I got rid of the mini-switch I mentioned in the previous blog post. I have the TP-Link EAP615-Wall mounted on the wall nearby, which is my OpenWrt-powered Wi-Fi hotspot. It also has 3 Ethernet ports, to which I connected the Beelink nodes. That's the device you see at the very top.

The Ethernet cables go downward through the cable boxes to the Beelink nodes. In addition to the Beelink f3s nodes, I connected the TP-Link to the UPS as well (not discussed further in this blog post, but the positive side effect is that my Wi-Fi will still work during a power loss for some time—and during a power cut, the Beelink nodes will still be able to communicate with each other).

On the very left (the black box) is the UPS, with four power outlets. Three go to the Beelink nodes, and one goes to the TP-Link. A USB output is also connected to the first Beelink node, f0.

On the very right (halfway hidden behind the TV) are the 3 Beelink nodes stacked on top of each other. The only downside (or upside?) is that my 14-month-old daughter is now chaos-testing the Beelink nodes, as the red power buttons (now reachable for her) are very attractive for her to press when passing by randomly. :-) Luckily, that will only cause graceful system shutdowns!

The UPS hardware



I wanted a UPS that I could connect to via FreeBSD, and that would provide enough backup power to operate the cluster for a couple of minutes (it turned out to be around an hour, but this time will likely be shortened after future hardware upgrades, like additional drives and a backup enclosure) and to automatically initiate the shutdown of all the f3s nodes.

I decided on the APC Back-UPS BX750MI model because:

  • Zero noise level when there is no power cut (some light noise when the battery is in operation during a power cut).
  • Cost: It is relatively affordable (not costing thousands).
  • USB connectivity: Can be connected via USB to one of the FreeBSD hosts to read the UPS status.
  • A power output of 750VA (or 410 watts), suitable for an hour of runtime for my f3s nodes (plus the Wi-Fi router).
  • Multiple power outlets: Can connect all 3 f3s nodes directly.
  • User-replaceable batteries: I can replace the batteries myself after two years or more (depending on usage).
  • Its compact design. Overall, I like how it looks.

The APC Back-UPS BX750MI in operation.

Configuring FreeBSD to Work with the UPS



USB Device Detection



Once plugged in via USB on FreeBSD, I could see the following in the kernel messages:

paul@f0: ~ % doas dmesg | grep UPS
ugen0.2: <American Power Conversion Back-UPS BX750MI> at usbus0

apcupsd Installation



To make use of the USB connection, the apcupsd package had to be installed:

paul@f0: ~ % doas pkg install apcupsd

I have made the following modifications to the configuration file so that the UPS can be used via the USB interface:

paul@f0:/usr/local/etc/apcupsd % diff -u apcupsd.conf.sample  apcupsd.conf
--- apcupsd.conf.sample 2024-11-01 16:40:42.000000000 +0200
+++ apcupsd.conf        2024-12-03 10:58:24.009501000 +0200
@@ -31,7 +31,7 @@
 #     940-1524C, 940-0024G, 940-0095A, 940-0095B,
 #     940-0095C, 940-0625A, M-04-02-2000
 #
-UPSCABLE smart
+UPSCABLE usb

 # To get apcupsd to work, in addition to defining the cable
 # above, you must also define a UPSTYPE, which corresponds to
@@ -88,8 +88,10 @@
 #                            that apcupsd binds to that particular unit
 #                            (helpful if you have more than one USB UPS).
 #
-UPSTYPE apcsmart
-DEVICE /dev/usv
+UPSTYPE usb
+DEVICE

 # POLLTIME <int>
 #   Interval (in seconds) at which apcupsd polls the UPS for status. This

I left the remaining settings as the default ones; for example, the following are of main interest:

# If during a power failure, the remaining battery percentage
# (as reported by the UPS) is below or equal to BATTERYLEVEL,
# apcupsd will initiate a system shutdown.
BATTERYLEVEL 5

# If during a power failure, the remaining runtime in minutes
# (as calculated internally by the UPS) is below or equal to MINUTES,
# apcupsd, will initiate a system shutdown.
MINUTES 3

I then enabled and started the daemon:

paul@f0:/usr/local/etc/apcupsd % doas sysrc apcupsd_enable=YES
apcupsd_enable:  -> YES
paul@f0:/usr/local/etc/apcupsd % doas service apcupsd start
Starting apcupsd.

UPS Connectivity Test



And voila, I could now access the UPS information via the apcaccess command; how convenient :-) (I also read through the manual page, which provides a good understanding of what else can be done with it!).

paul@f0:~ % apcaccess
APC      : 001,035,0857
DATE     : 2025-01-26 14:43:27 +0200
HOSTNAME : f0.lan.buetow.org
VERSION  : 3.14.14 (31 May 2016) freebsd
UPSNAME  : f0.lan.buetow.org
CABLE    : USB Cable
DRIVER   : USB UPS Driver
UPSMODE  : Stand Alone
STARTTIME: 2025-01-26 14:43:25 +0200
MODEL    : Back-UPS BX750MI
STATUS   : ONLINE
LINEV    : 230.0 Volts
LOADPCT  : 4.0 Percent
BCHARGE  : 100.0 Percent
TIMELEFT : 65.3 Minutes
MBATTCHG : 5 Percent
MINTIMEL : 3 Minutes
MAXTIME  : 0 Seconds
SENSE    : Medium
LOTRANS  : 145.0 Volts
HITRANS  : 295.0 Volts
ALARMDEL : No alarm
BATTV    : 13.6 Volts
LASTXFER : Automatic or explicit self test
NUMXFERS : 0
TONBATT  : 0 Seconds
CUMONBATT: 0 Seconds
XOFFBATT : N/A
SELFTEST : NG
STATFLAG : 0x05000008
SERIALNO : 9B2414A03599
BATTDATE : 2001-01-01
NOMINV   : 230 Volts
NOMBATTV : 12.0 Volts
NOMPOWER : 410 Watts
END APC  : 2025-01-26 14:44:06 +0200
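
If you ever want to feed these values into your own scripts or monitoring, the key/value output is trivial to parse. Here is a small Go sketch of my own (an illustration, not part of apcupsd):

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseAPC parses apcaccess output like the transcript above into a
// key/value map. Values may themselves contain colons (e.g. timestamps),
// so only the first colon per line separates key from value.
func parseAPC(out string) map[string]string {
	m := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		k, v, ok := strings.Cut(sc.Text(), ":")
		if !ok {
			continue // skip lines without a key/value separator
		}
		m[strings.TrimSpace(k)] = strings.TrimSpace(v)
	}
	return m
}

func main() {
	out := "STATUS   : ONLINE\nBCHARGE  : 100.0 Percent\nTIMELEFT : 65.3 Minutes\n"
	m := parseAPC(out)
	fmt.Println(m["STATUS"], m["BCHARGE"]) // ONLINE 100.0 Percent
}
```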

APC Info on Partner Nodes



So far, so good. Host f0 would shut down itself when short on power. But what about the f1 and f2 nodes? They aren't connected directly to the UPS and, therefore, wouldn't know that their power is about to be cut off. For this, apcupsd running on the f1 and f2 nodes can be configured to retrieve UPS information via the network from the apcupsd server running on the f0 node, which is connected directly to the APC via USB.

Of course, this won't work when f0 is down. In this case, no operational node would be connected to the UPS via USB; therefore, the current power status would not be known. However, I consider this a rare circumstance. Furthermore, in case of an f0 system crash, sudden power outages on the two other nodes would occur at different times, making real data loss (the main concern here) less likely.

And if f0 is down and f1 and f2 receive new data and crash midway, it's likely that a client (e.g., an Android app or another laptop) still has the data stored on it, making data recoverable and data loss overall nearly impossible. I'd receive an alert if any of the nodes go down (more on monitoring later in this blog series).

Installation on partners



To do this, I installed apcupsd via doas pkg install apcupsd on f1 and f2, and then I could connect to it this way:

paul@f1:~ % apcaccess -h f0.lan.buetow.org | grep Percent
LOADPCT  : 12.0 Percent
BCHARGE  : 94.0 Percent
MBATTCHG : 5 Percent

But I want the daemon to be configured and enabled in such a way that it connects to the master UPS node (the one with the UPS connected via USB) so that it can also initiate a system shutdown when the UPS battery reaches low levels. For that, apcupsd itself needs to be aware of the UPS status.

On f1 and f2, I changed the configuration to use f0 (where apcupsd is listening) as a remote device. I also changed the MINUTES setting from 3 to 6 and the BATTERYLEVEL setting from 5 to 10 to ensure that the f1 and f2 nodes could still connect to the f0 node for UPS information before f0 decides to shut down itself. So f1 and f2 must shut down earlier than f0:

paul@f2:/usr/local/etc/apcupsd % diff -u apcupsd.conf.sample apcupsd.conf
--- apcupsd.conf.sample 2024-11-01 16:40:42.000000000 +0200
+++ apcupsd.conf        2025-01-26 15:52:45.108469000 +0200
@@ -31,7 +31,7 @@
 #     940-1524C, 940-0024G, 940-0095A, 940-0095B,
 #     940-0095C, 940-0625A, M-04-02-2000
 #
-UPSCABLE smart
+UPSCABLE ether

 # To get apcupsd to work, in addition to defining the cable
 # above, you must also define a UPSTYPE, which corresponds to
@@ -52,7 +52,6 @@
 #                            Network Information Server. This is used if the
 #                            UPS powering your computer is connected to a
 #                            different computer for monitoring.
-#
 # snmp      hostname:port:vendor:community
 #                            SNMP network link to an SNMP-enabled UPS device.
 #                            Hostname is the ip address or hostname of the UPS
@@ -88,8 +87,8 @@
 #                            that apcupsd binds to that particular unit
 #                            (helpful if you have more than one USB UPS).
 #
-UPSTYPE apcsmart
-DEVICE /dev/usv
+UPSTYPE net
+DEVICE f0.lan.buetow.org:3551

 # POLLTIME <int>
 #   Interval (in seconds) at which apcupsd polls the UPS for status. This
@@ -147,12 +146,12 @@
 # If during a power failure, the remaining battery percentage
 # (as reported by the UPS) is below or equal to BATTERYLEVEL,
 # apcupsd will initiate a system shutdown.
-BATTERYLEVEL 5
+BATTERYLEVEL 10

 # If during a power failure, the remaining runtime in minutes
 # (as calculated internally by the UPS) is below or equal to MINUTES,
 # apcupsd, will initiate a system shutdown.
-MINUTES 3
+MINUTES 6

 # If during a power failure, the UPS has run on batteries for TIMEOUT
 # many seconds or longer, apcupsd will initiate a system shutdown.

I then enabled and started the daemon on f1 and f2:

paul@f1:/usr/local/etc/apcupsd % doas sysrc apcupsd_enable=YES
apcupsd_enable:  -> YES
paul@f1:/usr/local/etc/apcupsd % doas service apcupsd start
Starting apcupsd.

And then I was able to connect to localhost via the apcaccess command:

paul@f1:~ % doas apcaccess | grep Percent
LOADPCT  : 5.0 Percent
BCHARGE  : 95.0 Percent
MBATTCHG : 5 Percent

Power outage simulation



Pulling the plug



I simulated a power outage by removing the power input from the APC. Immediately, the following message appeared on all the nodes:

Broadcast Message from root@f0.lan.buetow.org
        (no tty) at 15:03 EET...

Power failure. Running on UPS batteries.                                              

I ran the following command to confirm the available battery time:

paul@f0:/usr/local/etc/apcupsd % apcaccess -p TIMELEFT
63.9 Minutes
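During such a test, the discharge can also be tracked over time with a small polling loop. This is just a convenience sketch of mine, not part of the original setup; only the apcaccess -p TIMELEFT call from above is real, while the 60-second interval and the ups_sample helper name are arbitrary:

```shell
#!/bin/sh
# Print one timestamped sample of the UPS's remaining runtime.
# (ups_sample is a hypothetical helper name; apcaccess comes with apcupsd.)
ups_sample() {
    printf '%s TIMELEFT=%s\n' "$(date '+%H:%M:%S')" "$(apcaccess -p TIMELEFT)"
}

# During the outage test, sample once per minute until interrupted:
#   while true; do ups_sample; sleep 60; done
```

Redirecting the loop's output to a file gives a simple discharge log to compare against the UPS's own runtime estimate.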

And after around one hour (f1 and f2 a bit earlier, f0 a bit later due to the different BATTERYLEVEL and MINUTES settings outlined earlier), the following broadcast was sent out:

Broadcast Message from root@f0.lan.buetow.org
        (no tty) at 15:08 EET...

        *** FINAL System shutdown message from root@f0.lan.buetow.org ***

System going down IMMEDIATELY

apcupsd initiated shutdown

And all the nodes shut down safely before the UPS ran out of battery!

Restoring power



After restoring power, I checked the logs in /var/log/daemon.log and found the following on all three nodes:

Jan 26 17:36:24 f2 apcupsd[2159]: Power failure.
Jan 26 17:36:30 f2 apcupsd[2159]: Running on UPS batteries.
Jan 26 17:36:30 f2 apcupsd[2159]: Battery charge below low limit.
Jan 26 17:36:30 f2 apcupsd[2159]: Initiating system shutdown!
Jan 26 17:36:30 f2 apcupsd[2159]: User logins prohibited
Jan 26 17:36:32 f2 apcupsd[2159]: apcupsd exiting, signal 15
Jan 26 17:36:32 f2 apcupsd[2159]: apcupsd shutdown succeeded

All good :-)

Conclusion



I have the same UPS (but with a bit more capacity) for my main work setup, which powers my 28" screen, music equipment, etc. It has already been helpful a couple of times during power outages here, so I am sure that the smaller UPS for the f3s setup will be of great use.

Read the next post of this series:

f3s: Kubernetes with FreeBSD - Part 4: Rocky Linux Bhyve VMs

Other BSD related posts are:

2025-12-07 f3s: Kubernetes with FreeBSD - Part 8: Observability
2025-10-02 f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments
2025-07-14 f3s: Kubernetes with FreeBSD - Part 6: Storage
2025-05-11 f3s: Kubernetes with FreeBSD - Part 5: WireGuard mesh network
2025-04-05 f3s: Kubernetes with FreeBSD - Part 4: Rocky Linux Bhyve VMs
2025-02-01 f3s: Kubernetes with FreeBSD - Part 3: Protecting from power cuts (You are currently reading this)
2024-12-03 f3s: Kubernetes with FreeBSD - Part 2: Hardware and base installation
2024-11-17 f3s: Kubernetes with FreeBSD - Part 1: Setting the stage
2024-04-01 KISS high-availability with OpenBSD
2024-01-13 One reason why I love OpenBSD
2022-10-30 Installing DTail on OpenBSD
2022-07-30 Let's Encrypt with OpenBSD and Rex
2016-04-09 Jails and ZFS with Puppet on FreeBSD

E-Mail your comments to paul@nospam.buetow.org :-)

Back to the main site
Working with an SRE Interview https://foo.zone/gemfeed/2025-01-15-working-with-an-sre-interview.html 2025-01-15T00:16:04+02:00 Paul Buetow aka snonux paul@dev.buetow.org I have been interviewed by Florian Buetow on `cracking-ai-engineering.com` about what it's like working with a Site Reliability Engineer from the point of view of a Software Engineer, Data Scientist, and AI Engineer.

Working with an SRE Interview



Published at 2025-01-15T00:16:04+02:00

I have been interviewed by Florian Buetow on cracking-ai-engineering.com about what it's like working with a Site Reliability Engineer from the point of view of a Software Engineer, Data Scientist, and AI Engineer.

See original interview here
Cracking AI Engineering

Below, I am posting the interview here on my blog as well.

Table of Contents




Preamble



In this insightful interview, Paul Bütow, a Principal Site Reliability Engineer at Mimecast, shares over a decade of experience in the field. Paul highlights the role of an Embedded SRE, emphasizing the importance of automation, observability, and effective incident management. We also focused on the key question of how you can work effectively with an SRE, whether you are an individual contributor or a manager, a software engineer or a data scientist, and how you can learn more about site reliability engineering.

Introducing Paul



Hi Paul, please introduce yourself briefly to the audience. Who are you, what do you do for a living, and where do you work?

My name is Paul Bütow, I work at Mimecast, and I’m a Principal Site Reliability Engineer there. I’ve been with Mimecast for almost ten years now. The company specializes in email security, including things like archiving, phishing detection, malware protection, and spam filtering.

You mentioned that you’re an ‘Embedded SRE.’ What does that mean exactly?

It means that I’m directly part of the software engineering team, not in a separate Ops department. I ensure that nothing is deployed manually, and everything runs through automation. I also set up monitoring and observability. These are two distinct aspects: monitoring alerts us when something breaks, while observability helps us identify trends. I also create runbooks so we know what to do when specific incidents occur frequently.

Infrastructure SREs, on the other hand, handle the foundational setup, like providing the Kubernetes cluster itself or ensuring the operating systems are installed. They don't work on the application directly but ensure the base infrastructure is there for others to use. This works well when a company has multiple teams that need shared infrastructure.

How did you get started?



How did your interest in Linux or FreeBSD start?

It began during my school days. We had a PC with DOS at home, and I eventually bought Suse Linux 5.3. Shortly after, I discovered FreeBSD because I liked its handbook so much. I wanted to understand exactly how everything worked, so I also tried Linux from Scratch. That involves installing every package manually to gain a better understanding of operating systems.

https://www.FreeBSD.org
https://linuxfromscratch.org/

And after school, you pursued computer science, correct?

Exactly. I wasn’t sure at first whether I wanted to be a software developer or a system administrator. I applied for both and eventually accepted an offer as a Linux system administrator. This was before 'SRE' became a buzzword, but much of what I did back then (automation, infrastructure as code, monitoring) is now considered part of the typical SRE role.

Roles and Career Progression



Tell us about how you joined Mimecast. When did you fully embrace the SRE role?

I started as a Linux sysadmin at 1&1. I managed an ad server farm with hundreds of systems and later handled load balancers. Together with an architect, we managed F5 load balancers distributing around 2,000 services, including for portals like web.de and GMX. I also led the operations team technically for a while before moving to London to join Mimecast.

At Mimecast, the job title was explicitly 'Site Reliability Engineer.' The biggest difference was that I was no longer in a separate Ops department but embedded directly within the storage and search backend team. I loved that because we could plan features together, from automation to measurability and observability. Mimecast also operates thousands of physical servers for email archiving, which was fascinating since I already had experience with large distributed systems at 1&1. It was the right step for me because it allowed me to work close to the code while remaining hands-on with infrastructure.

What are the differences between SRE, DevOps, SysAdmin, and Architects?

SREs are like the next step after SysAdmins. A SysAdmin might manually install servers, replace disks, or use simple scripts for automation, while SREs use infrastructure as code and focus on reliability through SLIs, SLOs, and automation. DevOps isn’t really a job; it’s more of a way of working, where developers are involved in operations tasks like setting up CI/CD pipelines or on-call shifts. Architects focus on designing systems and infrastructures, such as load balancers or distributed systems, working alongside SREs to ensure the systems meet the reliability and scalability requirements. The specific responsibilities of each role depend on the company, and there is often overlap.

What are the most important reliability lessons you’ve learned so far?

  • Don’t leave SRE aspects as an afterthought. It’s much better to discuss automation, monitoring, SLIs, and SLOs early on. Traditional sysadmins often installed systems manually, but today we do everything via infrastructure as code, using tools like Terraform or Puppet.
  • I also distinguish between monitoring and observability. Monitoring tells us, 'The server is down, alarm!' Observability dives deeper, showing trends like increasing latency so we can act proactively.
  • SLI, SLO, and SLA are core elements. We focus on what users actually experience, for example how quickly an email is sent, and set our goals accordingly.
  • Runbooks are also crucial. When something goes wrong at night, you don’t want to start from scratch. A runbook outlines how to debug and resolve specific problems, saving time and reducing downtime.

Anecdotes and Best Practices



Runbooks sound very practical. Can you explain how they’re used day-to-day?

Runbooks are essentially guides for handling specific incidents. For instance, if a service won’t start, the runbook will specify where the logs are and which commands to use. Observability takes it a step further, helping us spot changes early, like rising error rates or latency, so we can address issues before they escalate.

When should you decide to put something into a runbook, and when is it unnecessary?

If an issue happens frequently, it should be documented in a runbook so that anyone, even someone new, can follow the steps to fix it. The idea is that 90% of the common incidents should be covered. For example, if a service is down, the runbook would specify where to find logs, which commands to check, and what actions to take. On the other hand, rare or complex issues, where the resolution depends heavily on context or varies each time, don’t make sense to include in detail. For those, it’s better to focus on general troubleshooting steps.

How do you search for and find the correct runbooks?

Runbooks should be linked directly in the alert you receive. For example, if you get an alert about a service not running, the alert will have a link to the runbook that tells you what to check, like logs or commands to run. Runbooks are best stored in an internal wiki, so if you don’t find the link in the alert, you know where to search. The important thing is that runbooks are easy to find and up to date because that’s what makes them useful during incidents.

Do you have an interesting war story you can share with us?

Sure. At 1&1, we had a proprietary ad server software that ran a SQL query during startup. The query got slower over time, eventually timing out and preventing the server from starting. Since we couldn’t access the source code, we searched the binary for the SQL and patched it. By pinpointing the issue, a developer was able to adjust the SQL. This collaboration between sysadmin and developer perspectives highlights the value of SRE work.

Working with Different Teams



You’re embedded in a team-how does collaboration with developers work practically?

We plan everything together from the start. If there’s a new feature, we discuss infrastructure, automated deployments, and monitoring right away. Developers are experts in the code, and I bring the infrastructure expertise. This avoids unpleasant surprises before going live.

How about working with data scientists or ML engineers? Are there differences?

The principles are the same. ML models also need to be deployed and monitored. You deal with monitoring, resource allocation, and identifying performance drops. Whether it’s a microservice or an ML job, at the end of the day, it’s all running on servers or clusters that must remain stable.

What about working with managers or the FinOps team?

We often discuss costs, especially in the cloud, where scaling up resources is easy. It’s crucial to know our metrics: do we have enough capacity? Do we need all instances? Or is the CPU only at 5% utilization? This data helps managers decide whether the budget is sufficient or if optimizations are needed.

Do you have practical tips for working with SREs?

Yes, I have a few:

  • Early involvement: Include SREs from the beginning in your project.
  • Runbooks & documentation: Document recurring errors.
  • Try first: Try to understand the issue yourself before immediately asking the SRE.
  • Basic infra knowledge: Kubernetes and Terraform aren’t magic. Some basic understanding helps every developer.

Using AI Tools



Let’s talk about AI. How do you use it in your daily work?

For boilerplate code, like Terraform snippets, I often use ChatGPT. It saves time, although I always review and adjust the output. Log analysis is another exciting application. Instead of manually going through millions of lines, AI can summarize key outliers or errors.

Do you think AI could largely replace SREs or significantly change the role?

I see AI as an additional tool. SRE requires a deep understanding of how distributed systems work internally. While AI can assist with routine tasks or quickly detect anomalies, human expertise is indispensable for complex issues.

SRE Learning Resources



What resources would you recommend for learning about SRE?

The Google SRE book is a classic, though a bit dry. I really like 'Seeking SRE,' as it offers various perspectives on SRE, with many practical stories from different companies.

https://sre.google/books/
Seeking SRE

Do you have a podcast recommendation?

The Google SRE Prodcast is quite interesting. It offers insights into how Google approaches SRE, along with perspectives from external guests.

https://sre.google/prodcast/

Blogging



You also have a blog. What motivates you to write regularly?

Writing helps me learn the most. It also serves as a personal reference. Sometimes I look up how I solved a problem a year ago. And of course, others tackling similar projects might find inspiration in my posts.

What do you blog about?

Mostly technical topics I find exciting, like homelab projects, Kubernetes, or book summaries on IT and productivity. It’s a personal blog, so I write about what I enjoy.

Wrap-up



To wrap up, what are three things every team should keep in mind for stability?

First, maintain runbooks and documentation to avoid chaos at night. Second, automate everything; manual installs in production are risky. Third, define SLIs, SLOs, and SLAs early so everyone knows what we’re monitoring and guaranteeing.

Is there a motto or mindset that particularly inspires you as an SRE?

"Keep it simple and stupid", KISS. Not everything has to be overly complex. And always stay curious. I’m still fascinated by how systems work under the hood.

Where can people find you online?

You can find links to my socials on my website paul.buetow.org, where I regularly post articles and link to everything else I’m working on outside of work.

https://paul.buetow.org

Thank you very much for your time and this insightful look into the world of site reliability engineering!

My pleasure, this was fun.

Closing comments



Dear reader, I hope this conversation with Paul Bütow provided an exciting peek into the world of Site Reliability Engineering. Whether you’re a software developer, data scientist, ML engineer, or manager, reliable systems are always a team effort. Hopefully, you’ve taken some insights or tips from Paul’s experiences for your own team or next project. Thanks for joining us, and best of luck refining your own SRE practices!

E-Mail your comments to paul@nospam.buetow.org or contact Florian via Cracking AI Engineering :-)

Back to the main site
Posts from October to December 2024 https://foo.zone/gemfeed/2025-01-01-posts-from-october-to-december-2024.html 2024-12-31T18:09:58+02:00 Paul Buetow aka snonux paul@dev.buetow.org Happy new year!

Posts from October to December 2024



Published at 2024-12-31T18:09:58+02:00

Happy new year!

These are my social media posts from the last three months. I keep them here to reflect on them and also to not lose them. Social media networks come and go and are not under my control, but my domain is here to stay.

These are from Mastodon and LinkedIn. Have a look at my about page for my social media profiles. This list is generated with Gos, my social media platform sharing tool.

My about page
https://codeberg.org/snonux/gos

Table of Contents




October 2024



First on-call experience in a startup. Doesn't ...



First on-call experience in a startup. Doesn't sound like a lot of fun! But the lessons were learned! #sre

ntietz.com/blog/lessons-from-my-first-on-call/

Reviewing your own PR or MR before asking ...



Reviewing your own PR or MR before asking others to review it makes a lot of sense. I have seen so many silly mistakes that would have been avoided this way, saving time for the real reviewer.

www.jvt.me/posts/2019/01/12/self-code-review/

Fun with defer in #golang, I didn't know that ...



Fun with defer in #golang, I didn't know that a defer object can either be heap or stack allocated. And there are some rules for inlining, too.

victoriametrics.com/blog/defer-in-go/

I have been in incidents. Understandably, ...



I have been in incidents. Understandably, everyone wants the issue to be resolved as quickly as possible, and others want to know how long the TTR will be. IMHO, providing no estimates at all is no solution either. So maybe give a rough estimate, but clearly communicate that the estimate is rough and that X, Y, and Z can interfere, meaning there is a chance it will take longer to resolve the incident. Just my thought. What's yours?

firehydrant.com/blog/hot-take-dont-provide-incident-resolution-estimates/

Little tips using strings in #golang and I ...



Little tips using strings in #golang and I personally think one must look more into the std lib (not just for strings, also for slices, maps,...), there are tons of useful helper functions.

www.calhoun.io/6-tips-for-using-strings-in-go/

Reading this post about #rust (especially the ...



Reading this post about #rust (especially the first part), I think I made a good choice in deciding to dive into #golang instead. There was a point where I wanted to learn a new programming language, and Rust was on my list of choices. I think the Go project does a much better job of deciding what goes into the language and how. What are your thoughts?

josephg.com/blog/rewriting-rust/

The opposite of #ChaosMonkey ... ...



The opposite of #ChaosMonkey ... automatically repairing and healing services, helping to reduce manual toil. Runbooks and scripts are only the first step, followed by a full-blown service written in Go. Could be useful, but IMHO, why not rather address the root causes of the manual toil? #sre

blog.cloudflare.com/nl-nl/improving-platform-resilience-at-cloudflare/

November 2024



I just became a Silver Patreon for OSnews. What ...



I just became a Silver Patreon for OSnews. What is OSnews? It is an independent, at times alternative, news site about IT. I have enjoyed it since my early student days. This one and other projects I financially support are listed here:

foo.zone/gemfeed/2024-09-07-projects-i-support.html (Gemini)
foo.zone/gemfeed/2024-09-07-projects-i-support.html

Until now, I wasn't aware that Go is under a ...



Until now, I wasn't aware that Go is under a BSD-style license (3-clause, as it seems). Neat. I don't know why, but I had always been under the impression it would be MIT. #bsd #golang

go.dev/LICENSE

These are some book notes from "Staff Engineer" ...



These are some book notes from "Staff Engineer" – there is some really good insight into what is expected from a Staff Engineer and beyond in the industry. I wish I had read the book earlier.

foo.zone/gemfeed/2024-10-24-staff-engineer-book-notes.html (Gemini)
foo.zone/gemfeed/2024-10-24-staff-engineer-book-notes.html

Looking at #Kubernetes, it's pretty much ...



Looking at #Kubernetes, it's pretty much following the Unix way of doing things. It has many tools, but each tool has its own single purpose: DNS, scheduling, container runtime, various controllers, networking, observability, alerting, and more services in the control plane. Everything is managed by different services or plugins, mostly running in their dedicated pods. They don't communicate through pipes, but network sockets, though. #k8s

There has been an outage at the upstream ...



There has been an outage at the upstream network provider for OpenBSD.Amsterdam (the hoster I am using). This was the first real-world test for my KISS HA setup, and it worked flawlessly! All my sites and services failed over automatically to my other #OpenBSD VM!

foo.zone/gemfeed/2024-04-01-KISS-high-availability-with-OpenBSD.html (Gemini)
foo.zone/gemfeed/2024-04-01-KISS-high-availability-with-OpenBSD.html
openbsd.amsterdam/

One of the more confusing parts in Go, nil ...



One of the more confusing parts in Go, nil values vs nil errors: #golang

unexpected-go.com/nil-errors-that-are-non-nil-errors.html

Agreed, writing things down with diagrams helps you ...



Agreed, writing things down with diagrams helps you think them through more thoroughly. And it keeps others on the same page. Only worthwhile for projects of a certain size, IMHO.

ntietz.com/blog/reasons-to-write-design-docs/

I like the idea of types in Ruby. Raku supports ...



I like the idea of types in Ruby. Raku supports that already, but in Ruby, you must specify the types in a separate .rbs file, which is, in my opinion, cumbersome and a reason not to use it extensively for now. I believe there are efforts to embed the type information in the standard .rb files, and that .rbs is just an experiment to see how types could work out without introducing changes into the core Ruby language itself right now? #Ruby #RakuLang

github.com/ruby/rbs

So, #Haskell is better suited for general ...



So, #Haskell is better suited for general purpose than #Rust? I thought deploying something in Haskell means publishing an academic paper :-) Interesting rant about Rust, though:

chrisdone.com/posts/rust/

At first, functional options add a bit of ...



At first, functional options add a bit of boilerplate, but they turn out to be quite neat, especially when you have very long parameter lists that need to be made neat and tidy. #golang

www.calhoun.io/using-functional-options-instead-of-method-chaining-in-go/

Revamping my home lab a little bit. #freebsd ...



Revamping my home lab a little bit. #freebsd #bhyve #rocky #linux #vm #k3s #kubernetes #wireguard #zfs #nfs #ha #relayd #k8s #selfhosting #homelab

foo.zone/gemfeed/2024-11-17-f3s-kubernetes-with-freebsd-part-1.html (Gemini)
foo.zone/gemfeed/2024-11-17-f3s-kubernetes-with-freebsd-part-1.html

Wondering to which #web #browser I should ...



Wondering to which #web #browser I should switch now personally ...

www.osnews.com/story/141100/mozilla-fo..-..dvocacy-for-open-web-privacy-and-more/

eks-node-viewer is a nifty tool, showing the ...



eks-node-viewer is a nifty tool, showing the compute nodes currently in use in the #EKS cluster. Especially useful when dynamically allocating nodes with #karpenter or auto scaling groups.

github.com/awslabs/eks-node-viewer

Have put more photos on my static photo ...



Have put more photos on my static photo sites, generated with a #bash script

irregular.ninja

In Go, passing pointers is not automatically ...



In Go, passing pointers is not automatically faster than passing values. Pointers often force the memory to be allocated on the heap, adding GC overhead. With values, Go can determine whether to put the memory on the stack instead. But with large structs/objects (however you want to call them), or if you want to modify state, then pointers are the semantics to use. #golang

blog.boot.dev/golang/pointers-faster-than-values/

Having been part of on-call rotations over ...



Having been part of on-call rotations over my whole professional life, I just learned this lesson: "Tell people who are new to on-call: Just have fun" :-) This is a neat blog post to read:

ntietz.com/blog/what-i-tell-people-new-to-oncall/

Feels good to code in my old love #Perl again ...



Feels good to code in my old love #Perl again after a while. I am implementing a log parser for generating site stats of my personal homepage! :-) @Perl

This is an interactive summary of the Go ...



This is an interactive summary of the Go 1.23 release, with a lot of examples utilising iterators in the slices and maps packages. Love it! #golang

antonz.org/go-1-23/

December 2024



That's unexpected, you can't remove a NaN key ...



That's unexpected: you can't remove a NaN key from a map without clearing it! #golang

unexpected-go.com/you-cant-remove-a-nan-key-from-a-map-without-clearing-it.html

My second blog post about revamping my home lab ...



My second blog post about revamping my home lab a little bit just hit the net. #FreeBSD #ZFS #n100 #k8s #k3s #kubernetes

foo.zone/gemfeed/2024-12-03-f3s-kubernetes-with-freebsd-part-2.html (Gemini)
foo.zone/gemfeed/2024-12-03-f3s-kubernetes-with-freebsd-part-2.html

Very insightful article about tech hiring in ...



Very insightful article about tech hiring in the age of LLMs. As an interviewer, I have experienced some of the scenarios already first-hand...

newsletter.pragmaticengineer.com/p/how-genai-changes-tech-hiring

For #bpf #ebpf performance debugging, have ...



For #bpf #ebpf performance debugging, have a look at bpftop from Netflix. A neat tool showing you the estimated CPU time and other performance statistics for all the BPF programs currently loaded into the #linux kernel. Highly recommended!

github.com/Netflix/bpftop

89 things he/she knows about Git commits is a ...



89 things he/she knows about Git commits is a neat list of #Git wisdom

www.jvt.me/posts/2024/07/12/things-know-commits/

I found that working on multiple side projects ...



I found that working on multiple side projects concurrently is better than concentrating on just one. This seems inefficient at first, but whenever you tend to lose motivation, you can temporarily switch to another one with full élan. However, remember to stop starting and start finishing. This doesn't mean you should be working on 10+ (and a growing list of) side projects concurrently! Select your projects and commit to finishing them before starting the next thing. For example, my current limit of concurrent side projects is around five.

Agreed? Agreed. Besides #Ruby, I would also ...



Agreed? Agreed. Besides #Ruby, I would also add #RakuLang and #Perl @Perl to the list of languages that are great for shell scripts - "Making Easy Things Easy and Hard Things Possible"

lucasoshiro.github.io/posts-en/2024-06-17-ruby-shellscript/

Plan9 assembly format in Go, but wait, it's not ...



Plan9 assembly format in Go, but wait, it's not the Operating System Plan9! #golang #rabbithole

www.osnews.com/story/140941/go-plan9-memo-speeding-up-calculations-450/

This is a neat blog post about the Helix text ...



This is a neat blog post about the Helix text editor, to which I personally switched around a year ago (from NeoVim). I should blog about my experience as well. To summarize: I am using it together with the terminal multiplexer #tmux. It doesn't bother me that Helix is purely terminal-based and therefore everything has to be in the same font. #HelixEditor

jonathan-frere.com/posts/helix/

This blog post is basically a rant against ...



This blog post is basically a rant against DataDog... Personally, I don't have much experience with DataDog (actually, I have never used it), but one way to work with logs cost-effectively at my day job (with over 2,000 physical server machines) is by using dtail! #dtail #logs #logmanagement

crys.site/blog/2024/reinventint-the-weel/
dtail.dev

Quick trick to get Helix themes selected ...



Quick trick to get Helix themes selected randomly #HelixEditor

foo.zone/gemfeed/2024-12-15-random-helix-themes.html (Gemini)
foo.zone/gemfeed/2024-12-15-random-helix-themes.html

Example where complexity attacks you from ...



Example where complexity attacks you from behind #k8s #kubernetes #OpenAI

surfingcomplexity.blog/2024/12/14/quic..-..ecent-openai-public-incident-write-up/

LLMs for Ops? Summaries of logs, probabilities ...



LLMs for Ops? Summaries of logs, probabilities about correctness, auto-generating Ansible; some use cases are there. Wouldn't trust it fully, though.

youtu.be/WodaffxVq-E?si=noY0egrfl5izCSQI

Excellent article about your dream Product ...



Excellent article about your dream Product Manager: Why every software team needs a product manager to thrive via @wallabagapp

testdouble.com/insights/why-product-ma..-..s-accelerate-improve-software-delivery

I just finished reading all chapters of CPU ...



I just finished reading all chapters of CPU land: ... not claiming to remember every detail, but it is a great refresher on how CPUs and operating systems actually work under the hood when you execute a program, which we tend to forget in our higher-abstraction world. I liked the "story" and some of the jokes along the way! Size-wise, it is pretty digestible (we're not talking about books here, but only 7 web articles/chapters)! #cpu #linux #unix #kernel #macOS

cpu.land/

Indeed, useful to know this stuff! #sre ...



Indeed, useful to know this stuff! #sre

biriukov.dev/docs/resolver-dual-stack-..-..resolvers-and-dual-stack-applications/

It's the small things that make Unix-like ...



It's the small things that make Unix-like systems, like GNU/Linux, interesting. I didn't know about this #GNU #Tar behaviour yet:

xeiaso.net/notes/2024/pop-quiz-tar/

My New Year's resolution is not to start any ...



My New Year's resolution is not to start any new non-fiction books (or only very few) but to re-read and listen to my favorites, which I read to reflect on and see things from different perspectives. Every time you re-read a book, you gain new insights.

Other related posts:

2026-01-01 Posts from July to December 2025
2025-07-01 Posts from January to June 2025
2025-01-01 Posts from October to December 2024 (You are currently reading this)

E-Mail your comments to paul@nospam.buetow.org :-)

Back to the main site
Random Helix Themes https://foo.zone/gemfeed/2024-12-15-random-helix-themes.html 2024-12-15T13:55:05+02:00 Paul Buetow aka snonux paul@dev.buetow.org I thought it would be fun to have a random Helix theme every time I open a new shell. Helix is the text editor I use.

Random Helix Themes



Published at 2024-12-15T13:55:05+02:00; Last updated 2024-12-18

I thought it would be fun to have a random Helix theme every time I open a new shell. Helix is the text editor I use.

https://helix-editor.com/

So I put this into my zsh dotfiles (in some editor.zsh.source in my ~ directory):


So every time I open a new terminal or shell, editor::helix::random_theme gets called, which randomly selects a theme from all installed ones and updates the helix config accordingly.
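The snippet itself is not reproduced in this archive copy, so here is only a rough sketch of what such a function could look like (the function name uses underscores instead of the post's :: namespacing, and the theme and config paths are my assumptions, not the original code):

```shell
# Hypothetical sketch: pick a random installed Helix theme and write it
# into the Helix config. Paths and the function name are assumptions.
editor_helix_random_theme() {
    themes_dir=${1:-/usr/share/helix/runtime/themes}
    config=${2:-$HOME/.config/helix/config.toml}
    # Pick one random theme file; shuf is available on GNU/Linux.
    theme_file=$(ls "$themes_dir"/*.toml 2>/dev/null | shuf -n 1)
    [ -z "$theme_file" ] && return 1
    theme=$(basename "$theme_file" .toml)
    # Rewrite the config, replacing any existing theme line.
    { [ -f "$config" ] && grep -v '^theme' "$config"
      printf 'theme = "%s"\n' "$theme"; } > "$config.tmp"
    mv "$config.tmp" "$config"
    echo "$theme"
}
```

Sourcing something like this from a zsh dotfile and calling it at shell startup gives the described effect.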


A better version



Update 2024-12-18: This is an improved version, which works cross platform (e.g., also on MacOS) and multiple theme directories:
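The improved snippet is likewise not shown in this archive copy; a portable sketch of the idea (my reconstruction, not the original code) would take the config path plus any number of theme directories as arguments and use awk's rand() instead of the GNU-only shuf:

```shell
# Hypothetical cross-platform reconstruction (not the author's original).
# Arguments: Helix config path, then one or more theme directories.
editor_helix_random_theme() {
    config=$1; shift
    # Collect theme names (basename without .toml) from all directories.
    themes=$(for d in "$@"; do
        for f in "$d"/*.toml; do
            [ -e "$f" ] && basename "$f" .toml
        done
    done)
    [ -z "$themes" ] && return 1
    # Portable random pick: awk's rand() works with BSD, macOS and GNU awk.
    theme=$(printf '%s\n' "$themes" \
        | awk 'BEGIN { srand() } { a[NR] = $0 } END { print a[int(rand() * NR) + 1] }')
    # Rewrite the config, replacing any existing theme line.
    { [ -f "$config" ] && grep -v '^theme' "$config"
      printf 'theme = "%s"\n' "$theme"; } > "$config.tmp"
    mv "$config.tmp" "$config"
    echo "$theme"
}

# Example call with a Linux package path and an assumed Homebrew path:
# editor_helix_random_theme ~/.config/helix/config.toml \
#     /usr/share/helix/runtime/themes \
#     /opt/homebrew/opt/helix/libexec/runtime/themes
```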


I hope you had some fun. E-Mail your comments to paul@nospam.buetow.org :-)

Back to the main site
f3s: Kubernetes with FreeBSD - Part 2: Hardware and base installation https://foo.zone/gemfeed/2024-12-03-f3s-kubernetes-with-freebsd-part-2.html 2024-12-02T23:48:21+02:00, last updated Sun 11 Jan 10:30:00 EET 2026 Paul Buetow aka snonux paul@dev.buetow.org This is the second blog post about my f3s series for my self-hosting demands in my home lab. f3s? The 'f' stands for FreeBSD, and the '3s' stands for k3s, the Kubernetes distribution I will use on FreeBSD-based physical machines.

f3s: Kubernetes with FreeBSD - Part 2: Hardware and base installation



Published at 2024-12-02T23:48:21+02:00, last updated Sun 11 Jan 10:30:00 EET 2026

This is the second blog post about my f3s series for my self-hosting demands in my home lab. f3s? The "f" stands for FreeBSD, and the "3s" stands for k3s, the Kubernetes distribution I will use on FreeBSD-based physical machines.

We set the stage last time; this time, we will set up the hardware for this project.

These are all the posts so far:

2024-11-17 f3s: Kubernetes with FreeBSD - Part 1: Setting the stage
2024-12-03 f3s: Kubernetes with FreeBSD - Part 2: Hardware and base installation (You are currently reading this)
2025-02-01 f3s: Kubernetes with FreeBSD - Part 3: Protecting from power cuts
2025-04-05 f3s: Kubernetes with FreeBSD - Part 4: Rocky Linux Bhyve VMs
2025-05-11 f3s: Kubernetes with FreeBSD - Part 5: WireGuard mesh network
2025-07-14 f3s: Kubernetes with FreeBSD - Part 6: Storage
2025-10-02 f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments
2025-12-07 f3s: Kubernetes with FreeBSD - Part 8: Observability

f3s logo

ChatGPT-generated logo.

Let's continue...

Table of Contents




Deciding on the hardware



Note that the OpenBSD VMs included in the f3s setup (which will be used later in this blog series for internet ingress - as you know from the first part of this blog series) are already there. These are virtual machines that I rent at OpenBSD Amsterdam and Hetzner.

https://openbsd.amsterdam
https://hetzner.cloud

This means that the FreeBSD boxes need to be covered, which will later be running k3s in Linux VMs via bhyve hypervisor.

I've been considering whether to use Raspberry Pis or look for alternatives. It turns out that complete N100-based mini-computers aren't much more expensive than Raspberry Pi 5s, and they don't require assembly. Furthermore, I like that they are AMD64 and not ARM-based, which increases compatibility with some applications (e.g., I might want to virtualize Windows (via bhyve) on one of those, though that's out of scope for this blog series).

Not ARM but Intel N100



I needed something compact, efficient, and capable enough to handle the demands of a small-scale Kubernetes cluster and preferably something I don't have to assemble a lot. After researching, I decided on the Beelink S12 Pro with Intel N100 CPUs.

Beelink Mini S12 Pro N100 official page

The Intel N100 CPUs are built on the "Alder Lake-N" architecture. These chips are designed to balance performance and energy efficiency well. With four cores, they're more than capable of running multiple containers, even with moderate workloads. Plus, they consume only around 8W of power (ok, that's more than the Pis...), keeping the electricity bill low enough and the setup quiet - perfect for 24/7 operation.

Beelink preparation

The Beelink comes with the following specs:

  • 12th Gen Intel N100 processor with four cores, four threads, and a maximum frequency of up to 3.4 GHz
  • 16 GB of DDR4 RAM, with an official maximum of 16 GB (though people have installed 32 GB)
  • 500 GB M.2 SSD, with the option to install a second 2.5" SSD drive (which I want to make use of later in this blog series)
  • Gigabit Ethernet
  • Four USB 3.2 Gen2 ports (maybe I want to mount something externally at some point)
  • Dimensions and weight: 115*102*39 mm, 280 g
  • Silent cooling system
  • HDMI output (needed only for the initial installation and maybe for troubleshooting later)
  • Auto power-on via WoL (may make use of it)
  • Wi-Fi (not going to use it)

I bought three of them for the cluster I intend to build.



Unboxing was uneventful. Every Beelink PC came with:

  • An AC power adapter
  • An HDMI cable
  • A VESA mount with screws (not using it as of now)
  • Some manuals
  • The pre-assembled Beelink PC itself.
  • A "Hello" postcard (??)

Overall, I love the small form factor.

Network switch



I went with the tp-link mini 5-port switch, as I had a spare one available. That switch will be plugged into my wall ethernet port, which connects directly to my fiber internet router with 100 Mbit/s down and 50 Mbit/s upload speed.

Switch

Installing FreeBSD



Base install



First, I downloaded the boot-only ISO of the latest FreeBSD release and dumped it on a USB stick via my Fedora laptop:

[paul@earth]~/Downloads% sudo dd \
  if=FreeBSD-14.1-RELEASE-amd64-bootonly.iso \
  of=/dev/sda conv=sync

Next, I plugged the Beelinks (one after another) into my monitor via HDMI (the resolution of the FreeBSD text console seems strangely stretched, as I am using the LG Dual Up monitor), connected Ethernet, an external USB keyboard, and the FreeBSD USB stick, and booted the devices up. With F7, I entered the boot menu and selected the USB stick for the FreeBSD installation.

The installation was uneventful. I selected:

  • Guided ZFS on root (pool zroot)
  • Unencrypted ZFS (I will encrypt separate datasets later; I want it to be able to boot without manual interaction)
  • Static IP configuration (to ensure that the boxes always have the same IPs, even after switching the router/DHCP server)
  • I decided to enable the SSH daemon, NTP server, and NTP time synchronization at boot, and I also enabled powerd for automatic CPU frequency scaling.
  • In addition to root, I added a personal user, paul, whom I placed in the wheel group.

After doing all that three times (once for each Beelink PC), I had three ready-to-use FreeBSD boxes! Their hostnames are f0, f1 and f2!

Beelink installation

Latest patch level and customizing /etc/hosts



After the first boot, I upgraded to the latest FreeBSD patch level as follows:

root@f0:~ # freebsd-update fetch
root@f0:~ # freebsd-update install
root@f0:~ # reboot

I also added the following entries for the three FreeBSD boxes to the /etc/hosts file:

root@f0:~ # cat <<END >>/etc/hosts
192.168.1.130 f0 f0.lan f0.lan.buetow.org
192.168.1.131 f1 f1.lan f1.lan.buetow.org
192.168.1.132 f2 f2.lan f2.lan.buetow.org
END

You might wonder: why bother with the hosts file? Why not use DNS properly? The reason is simplicity. I don't manage 100 hosts, only a few here and there. Having an OpenWRT router in my home, I could also configure everything there, but maybe I'll do that later. For now, keep it simple and straightforward.

After install



After that, I installed the following additional packages:

root@f0:~ # pkg install helix doas zfs-periodic uptimed

Helix editor



Helix? It's my favourite text editor. I have nothing against vi but like hx (Helix) more!

https://helix-editor.com/

doas



doas? It's a pretty neat (and KISS) replacement for sudo. It has far fewer features than sudo, which is supposed to make it more secure. Its origin is the OpenBSD project. For doas, I accepted the default configuration (where users in the wheel group are allowed to run commands as root):

root@f0:~ # cp /usr/local/etc/doas.conf.sample /usr/local/etc/doas.conf

https://man.openbsd.org/doas

Periodic ZFS snapshotting



zfs-periodic is a nifty tool for automatically creating ZFS snapshots. I decided to go with the following configuration here:

root@f0:~ # cat <<END >>/etc/periodic.conf
daily_zfs_snapshot_enable="YES"
daily_zfs_snapshot_pools="zroot"
daily_zfs_snapshot_keep="7"
weekly_zfs_snapshot_enable="YES"
weekly_zfs_snapshot_pools="zroot"
weekly_zfs_snapshot_keep="5"
monthly_zfs_snapshot_enable="YES"
monthly_zfs_snapshot_pools="zroot"
monthly_zfs_snapshot_keep="6"
END

https://github.com/ross/zfs-periodic

Note: We have not added zdata to the list of snapshot pools. This pool does not exist yet; it will be created later in this blog series. zrepl, which we will use for replication later in this series, will manage the zdata snapshots.

Uptime tracking



uptimed? I like to track my uptimes. This is how I configured the daemon:

root@f0:~ # cp /usr/local/etc/uptimed.conf-dist \
  /usr/local/etc/uptimed.conf
root@f0:~ # hx /usr/local/etc/uptimed.conf

In the Helix editor session, I changed LOG_MAXIMUM_ENTRIES to 0 to keep all uptime entries forever and not cut off at 50 (the default config). After that, I enabled and started uptimed:

root@f0:~ # service uptimed enable
root@f0:~ # service uptimed start
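For reference, the edited line in uptimed.conf then reads:

```
# 0 means: keep all uptime records (the default cuts off at 50)
LOG_MAXIMUM_ENTRIES=0
```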

To check the current uptime stats, I can now run uprecords:

 root@f0:~ # uprecords
     #               Uptime | System                                     Boot up
----------------------------+---------------------------------------------------
->   1     0 days, 00:07:34 | FreeBSD 14.1-RELEASE      Mon Dec  2 12:21:44 2024
----------------------------+---------------------------------------------------
NewRec     0 days, 00:07:33 | since                     Mon Dec  2 12:21:44 2024
    up     0 days, 00:07:34 | since                     Mon Dec  2 12:21:44 2024
  down     0 days, 00:00:00 | since                     Mon Dec  2 12:21:44 2024
   %up              100.000 | since                     Mon Dec  2 12:21:44 2024

This is how I track the uptimes for all of my hosts:

Unveiling guprecords.raku: Global Uptime Records with Raku
https://github.com/rpodgorny/uptimed

Hardware check



Ethernet



Works. Nothing eventful, really. It's a cheap Realtek chip, but it will do what it is supposed to do.

paul@f0:~ % ifconfig re0
re0: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
        options=8209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
        ether e8:ff:1e:d7:1c:ac
        inet 192.168.1.130 netmask 0xffffff00 broadcast 192.168.1.255
        inet6 fe80::eaff:1eff:fed7:1cac%re0 prefixlen 64 scopeid 0x1
        inet6 fd22:c702:acb7:0:eaff:1eff:fed7:1cac prefixlen 64 detached autoconf
        inet6 2a01:5a8:304:1d5c:eaff:1eff:fed7:1cac prefixlen 64 autoconf pltime 10800 vltime 14400
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>

RAM



All there:

paul@f0:~ % sysctl hw.physmem
hw.physmem: 16902905856


CPUs



They work:

paul@f0:~ % sysctl dev.cpu | grep freq:
dev.cpu.3.freq: 705
dev.cpu.2.freq: 705
dev.cpu.1.freq: 604
dev.cpu.0.freq: 604

CPU throttling



With powerd running, the CPU frequency is throttled down when the box isn't busy. To stress it a bit, I ran ubench and watched the frequencies get unthrottled again:

paul@f0:~ % doas pkg install ubench
paul@f0:~ % rehash # For tcsh to find the newly installed command
paul@f0:~ % ubench &
paul@f0:~ % sysctl dev.cpu | grep freq:
dev.cpu.3.freq: 2922
dev.cpu.2.freq: 2922
dev.cpu.1.freq: 2923
dev.cpu.0.freq: 2922

Idle, all three Beelinks plus the switch consumed 26.2W. But with ubench stressing all the CPUs, it went up to 38.8W.

Idle consumption.
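For context, a quick back-of-the-envelope calculation (my own, using the measured idle figure) of what that draw translates to per year:

```shell
# Yearly energy use of the three Beelinks plus switch at 26.2 W idle draw.
awk 'BEGIN {
    idle_w = 26.2                      # measured idle draw in watts
    kwh    = idle_w * 24 * 365 / 1000  # watts * hours/year -> kWh/year
    printf "%.1f kWh/year at %.1f W idle\n", kwh, idle_w
}'
# Prints: 229.5 kWh/year at 26.2 W idle
```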

Wake-on-LAN Setup



Updated Sun 11 Jan 10:30:00 EET 2026

As mentioned in the hardware specs above, the Beelink S12 Pro supports Wake-on-LAN (WoL), which allows me to remotely power on the machines over the network. This is particularly useful since I don't need all three machines running 24/7, and I can save power by shutting them down when not needed and waking them up on demand.

The good news is that FreeBSD already has WoL support enabled by default on the Realtek network interface, as evidenced by the WOL_MAGIC option in the ifconfig re0 output shown above.
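To check this from a script rather than by eyeballing ifconfig, a tiny helper can grep for the flag (my own sketch; it reads the ifconfig output on stdin so it is easy to test):

```shell
# Succeeds if the given ifconfig output advertises magic-packet Wake-on-LAN.
has_wol() {
    grep -q 'WOL_MAGIC'
}

# Typical use on the FreeBSD box itself:
# ifconfig re0 | has_wol && echo "WoL (magic packet) enabled on re0"
```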

Setting up WoL on the laptop



To wake the Beelinks from my Fedora laptop (earth), I installed the wol package:

[paul@earth]~% sudo dnf install -y wol

Next, I created a simple script (~/bin/wol-f3s) to wake and shutdown the machines:

#!/bin/bash
# Wake-on-LAN and shutdown script for f3s cluster (f0, f1, f2)

# MAC addresses
F0_MAC="e8:ff:1e:d7:1c:ac"  # f0 (192.168.1.130)
F1_MAC="e8:ff:1e:d7:1e:44"  # f1 (192.168.1.131)
F2_MAC="e8:ff:1e:d7:1c:a0"  # f2 (192.168.1.132)

# IP addresses
F0_IP="192.168.1.130"
F1_IP="192.168.1.131"
F2_IP="192.168.1.132"

# SSH user
SSH_USER="paul"

# Broadcast address for your LAN
BROADCAST="192.168.1.255"

wake() {
    local name=$1
    local mac=$2
    echo "Sending WoL packet to $name ($mac)..."
    wol -i "$BROADCAST" "$mac"
}

shutdown_host() {
    local name=$1
    local ip=$2
    echo "Shutting down $name ($ip)..."
    ssh -o ConnectTimeout=5 "$SSH_USER@$ip" "doas poweroff" 2>/dev/null && \
        echo "  ✓ Shutdown command sent to $name" || \
        echo "  ✗ Failed to reach $name (already down?)"
}

ACTION="${1:-all}"

case "$ACTION" in
    f0) wake "f0" "$F0_MAC" ;;
    f1) wake "f1" "$F1_MAC" ;;
    f2) wake "f2" "$F2_MAC" ;;
    all|"")
        wake "f0" "$F0_MAC"
        wake "f1" "$F1_MAC"
        wake "f2" "$F2_MAC"
        ;;
    shutdown|poweroff|down)
        shutdown_host "f0" "$F0_IP"
        shutdown_host "f1" "$F1_IP"
        shutdown_host "f2" "$F2_IP"
        echo ""
        echo "✓ Shutdown commands sent to all machines."
        exit 0
        ;;
    *)
        echo "Usage: $0 [f0|f1|f2|all|shutdown]"
        exit 1
        ;;
esac

echo ""
echo "✓ WoL packets sent. Machines should boot in a few seconds."

After making the script executable with chmod +x ~/bin/wol-f3s, I can now control the machines with simple commands:

[paul@earth]~% wol-f3s          # Wake all three
[paul@earth]~% wol-f3s f0       # Wake only f0
[paul@earth]~% wol-f3s shutdown # Shutdown all three via SSH

Testing WoL and Shutdown



To test the setup, I shut down all three machines using the script's shutdown function:

[paul@earth]~% wol-f3s shutdown
Shutting down f0 (192.168.1.130)...
  ✓ Shutdown command sent to f0
Shutting down f1 (192.168.1.131)...
  ✓ Shutdown command sent to f1
Shutting down f2 (192.168.1.132)...
  ✓ Shutdown command sent to f2

✓ Shutdown commands sent to all machines.

After waiting for them to fully power down (about 1 minute), I sent the WoL magic packets:

[paul@earth]~% wol-f3s
Sending WoL packet to f0 (e8:ff:1e:d7:1c:ac)...
Waking up e8:ff:1e:d7:1c:ac...
Sending WoL packet to f1 (e8:ff:1e:d7:1e:44)...
Waking up e8:ff:1e:d7:1e:44...
Sending WoL packet to f2 (e8:ff:1e:d7:1c:a0)...
Waking up e8:ff:1e:d7:1c:a0...

✓ WoL packets sent. Machines should boot in a few seconds.

Within 30-50 seconds, all three machines successfully booted up and became accessible via SSH!

WoL from WiFi



An important note: Wake-on-LAN works perfectly even when the laptop is connected via WiFi. As long as both the laptop and the Beelinks are on the same local network (192.168.1.x), the router bridges the WiFi and wired networks together, allowing the WoL broadcast packets to reach the machines.

This makes WoL very convenient - I can wake the cluster from anywhere in my home, whether I'm on WiFi or ethernet.

Remote Shutdown via SSH



While Wake-on-LAN handles powering on the machines remotely, I also added a shutdown function to the script for convenience. The wol-f3s shutdown command uses SSH to connect to each machine and execute doas poweroff, gracefully shutting them all down.

This is particularly useful for power saving - when I'm done working with the cluster for the day, I can simply run:

[paul@earth]~% wol-f3s shutdown

And all three machines will shut down cleanly. The next time I need them, a simple wol-f3s command wakes them all back up. This combination makes the cluster very energy-efficient while maintaining quick access when needed.

BIOS Configuration



For WoL to work reliably, make sure to check the BIOS settings on each Beelink:

  • Enable "Wake on LAN" (usually under Power Management)
  • Disable "ERP Support" or "ErP Ready" (this can prevent WoL from working)
  • Enable "Power on by PCI-E" or "Wake on PCI-E"

The exact menu names vary, but these settings are typically found in the Power Management or Advanced sections of the BIOS.

Conclusion



The Beelink S12 Pro with Intel N100 CPUs checks all the boxes for a k3s project: Compact, efficient, expandable, and affordable. Its compatibility with both Linux and FreeBSD makes it versatile for other use cases, whether as part of your cluster or as a standalone system. If you’re looking for hardware that punches above its weight for Kubernetes, this little device deserves a spot on your shortlist.

Beelinks stacked

To ease cable management, I need to get shorter ethernet cables. I will place the tower on my shelf, where most of the cables will be hidden (together with a UPS, which will also be added to the setup).

Read the next post of this series:

f3s: Kubernetes with FreeBSD - Part 3: Protecting from power cuts

Other *BSD-related posts:

2025-12-07 f3s: Kubernetes with FreeBSD - Part 8: Observability
2025-10-02 f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments
2025-07-14 f3s: Kubernetes with FreeBSD - Part 6: Storage
2025-05-11 f3s: Kubernetes with FreeBSD - Part 5: WireGuard mesh network
2025-04-05 f3s: Kubernetes with FreeBSD - Part 4: Rocky Linux Bhyve VMs
2025-02-01 f3s: Kubernetes with FreeBSD - Part 3: Protecting from power cuts
2024-12-03 f3s: Kubernetes with FreeBSD - Part 2: Hardware and base installation (You are currently reading this)
2024-11-17 f3s: Kubernetes with FreeBSD - Part 1: Setting the stage
2024-04-01 KISS high-availability with OpenBSD
2024-01-13 One reason why I love OpenBSD
2022-10-30 Installing DTail on OpenBSD
2022-07-30 Let's Encrypt with OpenBSD and Rex
2016-04-09 Jails and ZFS with Puppet on FreeBSD

E-Mail your comments to paul@nospam.buetow.org :-)

Back to the main site
f3s: Kubernetes with FreeBSD - Part 1: Setting the stage https://foo.zone/gemfeed/2024-11-17-f3s-kubernetes-with-freebsd-part-1.html 2024-11-16T23:20:14+02:00 Paul Buetow aka snonux paul@dev.buetow.org This is the first blog post about my f3s series for my self-hosting demands in my home lab. f3s? The 'f' stands for FreeBSD, and the '3s' stands for k3s, the Kubernetes distribution I will use on FreeBSD-based physical machines.

f3s: Kubernetes with FreeBSD - Part 1: Setting the stage



Published at 2024-11-16T23:20:14+02:00

This is the first blog post about my f3s series for my self-hosting demands in my home lab. f3s? The "f" stands for FreeBSD, and the "3s" stands for k3s, the Kubernetes distribution I will use on FreeBSD-based physical machines.

I will post a new entry every month or so (there are too many other side projects for more frequent updates—I bet you can understand).

These are all the posts so far:

2024-11-17 f3s: Kubernetes with FreeBSD - Part 1: Setting the stage (You are currently reading this)
2024-12-03 f3s: Kubernetes with FreeBSD - Part 2: Hardware and base installation
2025-02-01 f3s: Kubernetes with FreeBSD - Part 3: Protecting from power cuts
2025-04-05 f3s: Kubernetes with FreeBSD - Part 4: Rocky Linux Bhyve VMs
2025-05-11 f3s: Kubernetes with FreeBSD - Part 5: WireGuard mesh network
2025-07-14 f3s: Kubernetes with FreeBSD - Part 6: Storage
2025-10-02 f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments
2025-12-07 f3s: Kubernetes with FreeBSD - Part 8: Observability

f3s logo

ChatGPT-generated logo.

Let's begin...

Table of Contents




Why this setup?



My previous setup was great for learning Terraform and AWS, but it is too expensive. Costs are under control there, but only because I am shutting down all containers after use (so they are offline ninety percent of the time and still cost around $20 monthly). With the new setup, I could run all containers 24/7 at home, which would still be cheaper in terms of electricity consumption. I have a 400 MBit/s uplink (I could have more if I wanted, but it is more than plenty for my use case already).

From babylon5.buetow.org to .cloud

Migrating off all my containers from AWS ECS means I need a reliable and scalable environment to host my workloads. I wanted something:

  • To self-host all my open-source apps (Docker containers).
  • Fully under my control (goodbye cloud vendor lock-in).
  • Secure and redundant.
  • Cost-efficient (after the initial hardware investment).
  • Something I can poke around with and also pick up new skills.

The infrastructure



This is still in progress, and I don't own the hardware yet. But in this first part of the blog series, I will outline what I intend to do.

Diagram

Physical FreeBSD nodes and Linux VMs



The setup starts with three physical FreeBSD nodes deployed into my home LAN. On these, I'm going to run Rocky Linux virtual machines with bhyve. Why Linux VMs on FreeBSD and not Linux directly? I want to leverage the great ZFS integration in FreeBSD (among other features), and I have been using FreeBSD for a while in my home lab. And with bhyve, there is a very performant hypervisor available, which makes the Linux VMs run at effectively native speed (another use case of mine would be maybe running a Windows bhyve VM on one of the nodes - but that's out of scope for this blog series).

https://www.freebsd.org/
https://wiki.freebsd.org/bhyve

I selected Rocky Linux because it comes with long-term support (I don't want to upgrade the VMs every 6 months). Rocky Linux 9 will reach its end of life in 2032, which is plenty of time! Of course, there will be minor upgrades, but nothing will significantly break my setup.

https://rockylinux.org/
https://wiki.rockylinux.org/rocky/version/

Furthermore, I am already using "RHEL-family" related distros at work and Fedora on my main personal laptop. Rocky Linux belongs to the same type of Linux distribution family, so I already feel at home here. I also used Rocky 9 before I switched to AWS ECS. Now, I am switching back in one sense or another ;-)

Kubernetes with k3s



These Linux VMs form a three-node k3s Kubernetes cluster, where my containers will reside moving forward. The 3-node k3s cluster will be highly available (in etcd mode), and all apps will probably be deployed with Helm. Prometheus will also be running in k3s, collecting time-series metrics and handling monitoring. Additionally, a private Docker registry will be deployed into the k3s cluster, where I will store some of my self-created Docker images. k3s is the perfect distribution of Kubernetes for homelabbers due to its simplicity and the inclusion of the most useful features out of the box!

https://k3s.io/

HA volumes for k3s with HAST/ZFS and NFS



Persistent storage for the k3s cluster will be handled by highly available (HA) NFS shares backed by ZFS on the FreeBSD hosts.

On two of the three physical FreeBSD nodes, I will add a second SSD drive to each and dedicate it to a zhast ZFS pool. With HAST (FreeBSD's solution for highly available storage), this pool will be replicated at the byte level to a standby node.

A virtual IP (VIP) will point to the master node. When the master node goes down, the VIP will failover to the standby node, where the ZFS pool will be mounted. An NFS server will listen to both nodes. k3s will use the VIP to access the NFS shares.

FreeBSD Wiki: Highly Available Storage

You can think of DRBD as the Linux equivalent of FreeBSD's HAST.
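To make the plan concrete, a minimal /etc/hast.conf for such a replicated pool might look like this (hostnames, device, and peer addresses are purely illustrative, not the final configuration):

```
# Illustrative hast.conf sketch for a replicated "zhast" disk:
resource zhast {
        on f0 {
                local /dev/ada1
                remote 192.168.1.131
        }
        on f1 {
                local /dev/ada1
                remote 192.168.1.130
        }
}
```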

OpenBSD/relayd to the rescue for external connectivity



All apps should be reachable through the internet (e.g., from my phone or computer when travelling). For external connectivity and TLS management, I've got two OpenBSD VMs (one hosted by OpenBSD Amsterdam and another hosted by Hetzner) handling public-facing services like DNS, relaying traffic, and automating Let's Encrypt certificates.

All of this (every Linux VM to every OpenBSD box) will be connected via WireGuard tunnels, keeping everything private and secure. There will be 6 WireGuard tunnels (3 k3s nodes times two OpenBSD VMs).

https://en.wikipedia.org/wiki/WireGuard

So, when I want to access a service running in k3s, I will hit an external DNS endpoint (with the authoritative DNS servers being the OpenBSD boxes). The DNS will resolve to the master OpenBSD VM (see my KISS highly-available with OpenBSD blog post), and from there, the relayd process (with a Let's Encrypt certificate—see my Let's Encrypt with OpenBSD and Rex blog post) will accept the TCP connection and forward it through the WireGuard tunnel to a reachable node port of one of the k3s nodes, thus serving the traffic.

KISS high-availability with OpenBSD
Let's Encrypt with OpenBSD and Rex

The OpenBSD setup described here already exists and is ready to use. The only thing that does not yet exist is the configuration of relayd to forward requests to k3s through the WireGuard tunnel(s).
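As a sketch of that missing relayd piece, the forwarding rule might eventually look something like this (the public address, table entries, and node port are assumptions on my part, not a working configuration):

```
# Illustrative relayd.conf fragment on the OpenBSD VM; the table lists the
# k3s nodes' WireGuard addresses (all values are placeholders):
ext_ip = "203.0.113.10"

table <k3s> { 10.0.0.2, 10.0.0.3, 10.0.0.4 }

relay "www_tls" {
        listen on $ext_ip port 443 tls
        forward to <k3s> port 30443 mode roundrobin check tcp
}
```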

Data integrity



Periodic backups



Let's face it, backups are non-negotiable.

On the HAST master node, incremental and encrypted ZFS snapshots are created daily and automatically backed up to AWS S3 Glacier Deep Archive via cron. I have a bunch of scripts already available, which I currently use for a similar purpose on my FreeBSD home NAS server (an old ThinkPad T440 with an external USB drive enclosure, which I will eventually retire when the HAST setup is ready). I will copy them and slightly modify them to fit the purpose.

There's also zfstools in the ports, which helps set up an automatic snapshot regime:

https://www.freshports.org/sysutils/zfstools

The backup scripts also perform some zpool scrubbing now and then. A scrub once in a while keeps the trouble away.

Power protection



Power outages happen regularly in my area, so a UPS will keep the infrastructure running during short outages and protect the hardware. I'm still deciding which model to get (my previous NAS is simply an older laptop, which already has a battery to ride out outages). There are plenty of options to choose from; my main criterion is that the UPS should be silent, as the whole setup will be installed in an upper shelf unit in my daughter's room. ;-)

Monitoring: Keeping an eye on everything



Robust monitoring is vital to any infrastructure, especially one as distributed as mine. I've thought about a setup that ensures I'll always be aware of what's happening in my environment.

Prometheus and Grafana



Inside the k3s cluster, Prometheus will be deployed to handle metrics collection. It will be configured to scrape data from my Kubernetes workloads, nodes, and any services I monitor. Prometheus also integrates with Alertmanager to generate alerts based on predefined thresholds or conditions.

https://prometheus.io
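A typical alerting rule of the kind this pipeline would carry could look as follows (job name, duration, and labels are illustrative, not an actual configuration from this setup):

```
groups:
  - name: node
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "{{ $labels.instance }} has been down for more than 5 minutes"
```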

For visualization, Grafana will be deployed alongside Prometheus. Grafana lets me build dynamic, customizable dashboards that provide a real-time view of everything from resource utilization to application performance. Whether it's keeping track of CPU load, memory usage, or the health of Kubernetes pods, Grafana has it covered. This will also make troubleshooting easier, as I can quickly pinpoint where issues are arising.

https://grafana.com

Gogios: My custom alerting system



Alerts generated by Prometheus are forwarded to Alertmanager, which I will configure to work with Gogios, a lightweight monitoring and alerting system I wrote myself. Gogios runs on one of my OpenBSD VMs. At regular intervals, Gogios scrapes the alerts generated in the k3s cluster and notifies me via Email.

KISS server monitoring with Gogios

Ironically, I implemented Gogios to avoid using more complex alerting systems like Prometheus, but here we go—it integrates well now.

Conclusion



This setup may be just the beginning. Some ideas I'm thinking about for the future:

  • Adding more FreeBSD nodes (in different physical locations, maybe at my wider family's places? WireGuard would make it possible!) for better redundancy. (HA storage then might be trickier)
  • Deploying more Docker apps (data-intensive ones, like a picture gallery, my entire audiobook catalogue, or even a music server) to k3s.

For now, though, I'm focused on completing the migration from AWS ECS and getting all my Docker containers running smoothly in k3s.

What's your take on self-hosting? Are you planning to move away from managed cloud services? Stay tuned for the second part of this series, where I will likely write about the hardware and the OS setups.

Read the next post of this series:

f3s: Kubernetes with FreeBSD - Part 2: Hardware and base installation

Other *BSD-related posts:

2025-12-07 f3s: Kubernetes with FreeBSD - Part 8: Observability
2025-10-02 f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments
2025-07-14 f3s: Kubernetes with FreeBSD - Part 6: Storage
2025-05-11 f3s: Kubernetes with FreeBSD - Part 5: WireGuard mesh network
2025-04-05 f3s: Kubernetes with FreeBSD - Part 4: Rocky Linux Bhyve VMs
2025-02-01 f3s: Kubernetes with FreeBSD - Part 3: Protecting from power cuts
2024-12-03 f3s: Kubernetes with FreeBSD - Part 2: Hardware and base installation
2024-11-17 f3s: Kubernetes with FreeBSD - Part 1: Setting the stage (You are currently reading this)
2024-04-01 KISS high-availability with OpenBSD
2024-01-13 One reason why I love OpenBSD
2022-10-30 Installing DTail on OpenBSD
2022-07-30 Let's Encrypt with OpenBSD and Rex
2016-04-09 Jails and ZFS with Puppet on FreeBSD

E-Mail your comments to paul@nospam.buetow.org :-)

Back to the main site
'Staff Engineer' book notes https://foo.zone/gemfeed/2024-10-24-staff-engineer-book-notes.html 2024-10-24T20:57:44+03:00 Paul Buetow aka snonux paul@dev.buetow.org These are my personal takeaways after reading 'Staff Engineer' by Will Larson. Note that the book contains much more wisdom and that these notes only contain points I personally found worth writing down. This is mainly for my own use, but you might find it helpful too.

"Staff Engineer" book notes



Published at 2024-10-24T20:57:44+03:00

These are my personal takeaways after reading "Staff Engineer" by Will Larson. Note that the book contains much more wisdom and that these notes only contain the points I personally found worth writing down. This is mainly for my own use, but you might find it helpful too.

         ,..........   ..........,
     ,..,'          '.'          ',..,
    ,' ,'            :            ', ',
   ,' ,'             :             ', ',
  ,' ,'              :              ', ',
 ,' ,'............., : ,.............', ',
,'  '............   '.'   ............'  ',
 '''''''''''''''''';''';''''''''''''''''''
                    '''

Table of Contents




The Four Archetypes of a Staff Engineer



Larson breaks down the role of a Staff Engineer into four main archetypes, which can help frame how you approach the role:

  • Tech Lead: Focuses on the technical direction of a team, ensuring high-quality execution, architecture, and aligning the team around shared goals.
  • Solver: Gets pulled into complex, high-impact problems that often involve many teams or systems, operating as a fixer or troubleshooter.
  • Architect: Works on the long-term technical vision for an organization, setting standards and designing systems that will scale and last over time.
  • Right Hand: Functions as a trusted technical advisor to leadership, providing input on strategy, long-term decisions, and navigating organizational politics.

Influence and Impact over Authority



As a Staff Engineer, influence is often more important than formal authority. You’ll rarely have direct control over teams or projects but will need to drive outcomes by influencing peers, other teams, and leadership. It’s about understanding how to persuade, align, and mentor others to achieve technical outcomes.

Breadth and Depth of Knowledge



Staff Engineers often need to maintain a breadth of knowledge across various areas while maintaining depth in a few. This can mean keeping a high-level understanding of several domains (e.g., infrastructure, security, product development) but being able to dive deep when needed in certain core areas.

Mentorship and Sponsorship



An important part of a Staff Engineer’s role is mentoring others, not just in technical matters but in career development as well. Sponsorship goes a step beyond mentorship, where you actively advocate for others, create opportunities for them, and push them toward growth.

Managing Up and Across



Success as a Staff Engineer often depends on managing up (influencing leadership and setting expectations) and managing across (working effectively with peers and other teams). This is often tied to communication skills, the ability to advocate for technical needs, and fostering alignment across departments or organizations.

Strategic Thinking



While Senior Engineers may focus on execution, Staff Engineers are expected to think strategically, making decisions that will affect the company or product months or years down the line. This means balancing short-term execution needs with long-term architectural decisions, which may require challenging short-term pressures.

Emotional Intelligence



The higher you go in engineering roles, the more soft skills, particularly emotional intelligence (EQ), come into play. Building relationships, resolving conflicts, and understanding the broader emotional dynamics of the team and organization become key parts of your role.

Navigating Ambiguity


Staff Engineers are often placed in situations with high ambiguity—whether in defining the problem space, coming up with a solution, or aligning stakeholders. The ability to operate effectively in these unclear areas is critical to success.

Visible and Invisible Work



Much of the work done by Staff Engineers is invisible. Solving complex problems, creating alignment, or influencing decisions doesn’t always result in tangible code, but it can have a massive impact. Larson emphasizes that part of the role is being comfortable with this type of invisible contribution.

Scaling Yourself



At the Staff Engineer level, you must scale your impact beyond direct contribution. This can involve improving documentation, developing repeatable processes, mentoring others, or automating parts of the workflow. The idea is to enable teams and individuals to be more effective, even when you’re not directly involved.

Career Progression and Title Inflation



Larson touches on how different companies have varying definitions of "Staff Engineer," and titles don’t always correlate directly with responsibility or skill. He emphasizes the importance of focusing more on the work you're doing and the impact you're having, rather than the title itself.

These additional points reflect more of the strategic, interpersonal, and leadership aspects that go beyond the technical expertise expected at this level. The role of a Staff Engineer is often about balancing high-level strategy with technical execution, while influencing teams and projects in a sustainable, long-term way.

Not a faster Senior Engineer



  • A Staff engineer is more than just a faster Senior engineer.
  • A Staff engineer is not simply a Senior engineer who is a bit better.

It's important to know what work or which role most energizes you. A Staff engineer is not just a more senior engineer; a Staff engineer fits into a different archetype.

As a staff engineer, you are always expected to go beyond your comfort zone and learn new things.

Your job will sometimes feel like that of an SEM and sometimes strangely similar to your previous senior roles.

A Staff engineer is, like a Manager, a leader. However, being a Manager is a specific job, whereas leadership applies to any role, especially that of a Staff engineer.

The Balance



The more senior you become, the more responsibilities you will have to cope with in less time. Balance your speed of progress with your personal life: don't work late hours and don't skip personal care events.

Do fewer things but do them better. Everything done will accelerate the organization. Everything else will drag it down—quality over quantity.

Don't work at ten things and progress slowly; focus on one thing and finish it.

Spend only some of your time firefighting and keep time for deep thinking. But don't spend all of your time in deep thought either; otherwise, you lose touch with reality.

Sabbatical: Take at least six months; otherwise, it won't be as restorative.

More things



  • Provide simple but widely used tools. Complex and powerful tools will have power users but only a very few. All others will not use the tool.
  • In meetings, when someone is inactive, try to pull them in. Pull in at most one person at a time; don't open the discussion to multiple people.
  • Get used to writing things down and repeating yourself. You will scale yourself much more.
  • Title inflation: skills correspond to work, but the titles don't.

E-Mail your comments to paul@nospam.buetow.org :-)

Other book notes of mine are:

2025-11-02 'The Courage To Be Disliked' book notes
2025-06-07 'A Monk's Guide to Happiness' book notes
2025-04-19 'When: The Scientific Secrets of Perfect Timing' book notes
2024-10-24 'Staff Engineer' book notes (You are currently reading this)
2024-07-07 'The Stoic Challenge' book notes
2024-05-01 'Slow Productivity' book notes
2023-11-11 'Mind Management' book notes
2023-07-17 'Software Developers Career Guide and Soft Skills' book notes
2023-05-06 'The Obstacle is the Way' book notes
2023-04-01 'Never split the difference' book notes
2023-03-16 'The Pragmatic Programmer' book notes

Back to the main site
Gemtexter 3.0.0 - Let's Gemtext again⁴ https://foo.zone/gemfeed/2024-10-02-gemtexter-3.0.0-lets-gemtext-again-4.html 2024-10-01T21:46:26+03:00 Paul Buetow aka snonux paul@dev.buetow.org I proudly announce that I've released Gemtexter version `3.0.0`. What is Gemtexter? It's my minimalist static site generator for Gemini Gemtext, HTML and Markdown, written in GNU Bash.

Gemtexter 3.0.0 - Let's Gemtext again⁴



Published at 2024-10-01T21:46:26+03:00

I proudly announce that I've released Gemtexter version 3.0.0. What is Gemtexter? It's my minimalist static site generator for Gemini Gemtext, HTML and Markdown, written in GNU Bash.

https://codeberg.org/snonux/gemtexter

-=[ typewriters ]=-  1/98
                                      .-------.
       .-------.                     _|~~ ~~  |_
      _|~~ ~~  |_       .-------.  =(_|_______|_)
    =(_|_______|_)=    _|~~ ~~  |_   |:::::::::|    .-------.
      |:::::::::|    =(_|_______|_)  |:::::::[]|   _|~~ ~~  |_
      |:::::::[]|      |:::::::::|   |o=======.| =(_|_______|_)
      |o=======.|      |:::::::[]|   `"""""""""`   |:::::::::|
 jgs  `"""""""""`      |o=======.|                 |:::::::[]|
  mod. by Paul Buetow  `"""""""""`                 |o=======.|
                                                   `"""""""""`

Table of Contents




Why Bash?



Arguably, this project is too complex for a Bash script. I wrote it in Bash to find out how maintainable a "larger" Bash script could be. It's still pretty maintainable and lets me try out new Bash tricks now and then!

Let's list what's new!

HTML exact variant is the only variant



The last version of Gemtexter introduced the HTML exact variant, which wasn't enabled by default. This version of Gemtexter removes the previous (inexact) variant and makes the exact variant the default. This is a breaking change, which is why there is a major version bump of Gemtexter. Here is a reminder of what the exact variant was:

Gemtexter is there to convert your Gemini Capsule into other formats, such as HTML and Markdown. An HTML exact variant can now be enabled in the gemtexter.conf by adding the line declare -rx HTML_VARIANT=exact. The HTML/CSS output changed to reflect a more exact Gemtext appearance and to respect the same spacing as you would see in the Geminispace.

Table of Contents auto-generation



Just add...

 << template::inline::toc

...into a Gemtexter template file and Gemtexter will automatically generate a table of contents for the page based on its headings (see this page's ToC for example). The ToC will also link to the relevant sections in the HTML and Markdown output. The Gemtext format does not support inline links, so there the ToC is simply displayed as a bullet list.
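To illustrate the idea (this is not Gemtexter's actual implementation, which is written in Bash), here is a minimal Ruby sketch that collects the second-level headings of a Gemtext document into a bullet-list ToC:

```ruby
# Hypothetical sketch, not Gemtexter's Bash implementation: build a
# bullet-list table of contents from the "##" headings of a Gemtext
# document.
def gemtext_toc(gemtext)
  gemtext.lines
         .select { |line| line.start_with?('## ') }
         .map    { |line| "* #{line.delete_prefix('## ').strip}" }
         .join("\n")
end

doc = <<~GEMTEXT
  # My Page

  ## Why Bash?
  Some text.

  ## Configurable themes
  More text.
GEMTEXT

puts gemtext_toc(doc)
# * Why Bash?
# * Configurable themes
```

The real tool additionally rewrites each entry into a section link for the HTML and Markdown variants.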

Configurable themes



It was always possible to customize the style of a Gemtexter's resulting HTML page, but all the config options were scattered across multiple files. Now, the CSS style, web fonts, etc., are all configurable via themes.

Simply configure HTML_THEME_DIR in the gemtexter.conf file to the corresponding directory. For example:

declare -xr HTML_THEME_DIR=./extras/html/themes/simple

To customize the theme or create your own, simply copy the theme directory and modify it as needed. This makes it also much easier to switch between layouts.

No use of webfonts by default



The default theme is now "back to the basics" and does not utilize any web fonts. The previous themes are still part of the release and can be easily configured. These are currently the future and business themes. You can check them out from the themes directory.

More



Additionally, there were a couple of bug fixes, refactorings, and overall improvements to the documentation.

E-Mail your comments to paul@nospam.buetow.org :-)

Other related posts are:

2024-10-02 Gemtexter 3.0.0 - Let's Gemtext again⁴ (You are currently reading this)
2023-07-21 Gemtexter 2.1.0 - Let's Gemtext again³
2023-03-25 Gemtexter 2.0.0 - Let's Gemtext again²
2022-08-27 Gemtexter 1.1.0 - Let's Gemtext again
2021-06-05 Gemtexter - One Bash script to rule it all
2021-04-24 Welcome to the Geminispace

Back to the main site
Site Reliability Engineering - Part 4: Onboarding for On-Call Engineers https://foo.zone/gemfeed/2024-09-07-site-reliability-engineering-part-4.html 2024-09-07T16:27:58+03:00 Paul Buetow aka snonux paul@dev.buetow.org Welcome to Part 4 of my Site Reliability Engineering (SRE) series. I'm currently working as a Site Reliability Engineer, and I’m here to share what SRE is all about in this blog series.

Site Reliability Engineering - Part 4: Onboarding for On-Call Engineers



Published at 2024-09-07T16:27:58+03:00

Welcome to Part 4 of my Site Reliability Engineering (SRE) series. I'm currently working as a Site Reliability Engineer, and I’m here to share what SRE is all about in this blog series.

2023-08-18 Site Reliability Engineering - Part 1: SRE and Organizational Culture
2023-11-19 Site Reliability Engineering - Part 2: Operational Balance
2024-01-09 Site Reliability Engineering - Part 3: On-Call Culture
2024-09-07 Site Reliability Engineering - Part 4: Onboarding for On-Call Engineers (You are currently reading this)
2026-03-01 Site Reliability Engineering - Part 5: System Design, Incidents, and Learning

       __..._   _...__
  _..-"      `Y`      "-._
  \ Once upon |           /
  \\  a time..|          //
  \\\         |         ///
   \\\ _..---.|.---.._ ///
jgs \\`_..---.Y.---.._`//	

This time, I want to share some tips on how to onboard software engineers, QA engineers, and Site Reliability Engineers (SREs) to the primary on-call rotation. Traditionally, onboarding might take half a year (depending on the complexity of the infrastructure), but with a bit of strategy and structured sessions, we've managed to reduce it to just six weeks per person. Let's dive in!

Setting the Scene: Tier-1 On-Call Rotation



First things first, let's talk about Tier-1. This is where the magic begins. Tier-1 covers over 80% of the common on-call cases and is the perfect breeding ground for new on-call engineers to get their feet wet. It's designed to be a manageable training ground.

Why Tier-1?



  • Easy to Understand: Every on-call engineer should be familiar with Tier-1 tasks.
  • Training Ground: This is where engineers start their on-call career. It's purposefully kept simple so that it's not overwhelming right off the bat.
  • Runbook/recipe driven: Every alert is attached to a comprehensive runbook, making it easy for every engineer to follow.

Onboarding Process: From 6 Months to 6 Weeks



So how did we cut down the onboarding time so drastically? Here’s the breakdown of our process:

Knowledge Transfer (KT) Sessions: We kicked things off with more than 10 KT sessions, complete with video recordings. These sessions are comprehensive and cover everything from the basics to some more advanced topics. The recorded sessions mean that new engineers can revisit them anytime they need a refresher.

Shadowing Sessions: Each new engineer undergoes two on-call week shadowing sessions. This hands-on experience is invaluable. They get to see real-time incident handling and resolution, gaining practical knowledge that's hard to get from just reading docs.

Comprehensive Runbooks: We created 64 runbooks (by the time of writing this, probably more than 100) that are composable like Lego bricks. Each runbook covers a specific scenario and guides the engineer step-by-step to resolution. Pairing these with monitoring alerts linked directly to Confluence docs, and from there to the respective runbooks, ensures every alert can be navigated with ease (well, there are always exceptions to the rule...).

Self-Sufficiency & Confidence Building: With all these resources at their fingertips, our on-call engineers become self-sufficient for most of the common issues they'll face (new starters can now handle around 80% of the most common issues within six weeks of joining the company). This boosts their confidence and ensures they can handle Tier-1 incidents independently.

Documentation and Feedback Loop: Continuous improvement is key. We regularly update our documentation based on feedback from the engineers. This makes our process even more robust and user-friendly.

It's All About the Tiers



Let’s briefly touch on the Tier levels:

  • Tier 1: Easy and foundational tasks. Perfect for getting new engineers started. This covers around 80% of all on-call cases we face. This is what we trained on.
  • Tier 2: Slightly more complex, requiring more background knowledge. We trained on some of the topics but not all.
  • Tier 3: Requires a good understanding of the platform/architecture. Likely needs KT sessions with domain experts.
  • Tier DE (Domain Expert): The heavy hitters. Domain experts are required for these tasks.

Growing into Higher Tiers



From Tier-1, engineers naturally grow into Tier-2 and beyond. The structured training and gradual increase in complexity help ensure a smooth transition as they gain experience and confidence. The key here is that engineers stay curious and engaged in the on-call rotation so that they always keep learning.

Keeping Runbooks Up to Date



It is important that runbooks are not a "project to be finished"; runbooks have to be maintained and updated over time. Sections may change, new runbooks need to be added, and old ones can be deleted. So the acceptance criteria of an on-call shift would not just be reacting to alerts and incidents, but also reviewing and updating the current runbooks.

Conclusion



By structuring the onboarding process with KT sessions, shadowing, comprehensive runbooks, and a feedback loop, we've been able to fast-track the process from six months to just six weeks. This not only prepares our engineers for the on-call rotation quicker but also ensures they're confident and capable when handling incidents.

If you're looking to optimize your on-call onboarding process, these strategies could be your ticket to a more efficient and effective transition. Happy on-calling!

Continue with the fifth part of this series:

2026-03-01 Site Reliability Engineering - Part 5: System Design, Incidents, and Learning

E-Mail your comments to paul@nospam.buetow.org :-)

Back to the main site
Projects I financially support https://foo.zone/gemfeed/2024-09-07-projects-i-support.html 2024-09-07T16:04:19+03:00 Paul Buetow aka snonux paul@dev.buetow.org This is the list of projects and initiatives I support/sponsor.

Projects I financially support



Published at 2024-09-07T16:04:19+03:00

This is the list of projects and initiatives I support/sponsor.

||====================================================================||
||//$\\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\//$\\||
||(100)==================| FEDERAL SPONSOR NOTE |================(100)||
||\\$//        ~         '------========--------'                \\$//||
||<< /        /$\              // ____ \\                         \ >>||
||>>|  12    //L\\            // ///..) \\         L38036133B   12 |<<||
||<<|        \\ //           || <||  >\  ||                        |>>||
||>>|         \$/            ||  $$ --/  ||        One Hundred     |<<||
||<<|      L38036133B        *\\  |\_/  //* series                 |>>||
||>>|  12                     *\\/___\_//*   1989                  |<<||
||<<\      Open Source   ______/Franklin\________     Supporting   />>||
||//$\                 ~| SPONSORING AND FUNDING |~               /$\\||
||(100)===================  AWESOME OPEN SOURCE =================(100)||
||\\$//\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\\$//||
||====================================================================||
 

Table of Contents




Motivation



Sponsoring free and open-source projects, even for personal use, is important to ensure the sustainability, security, and continuous improvement of the software. It supports developers who often maintain these projects without compensation, helping them provide updates, new features, and security patches. By contributing, you recognize their efforts, foster a culture of innovation, and benefit from perks like early access or support, all while ensuring the long-term viability of the tools you rely on.

Although I am not putting a lot of money into my sponsoring efforts, it still helps the open-source maintainers: the more small sponsors there are, the higher the total sum.

OSnews



I am a silver Patreon member of OSnews. I have been following this site since my student years. It's always been a great source of independent and slightly alternative IT news.

https://osnews.com

Cup o' Go Podcast



I am a Patreon of the Cup o' Go Podcast. The podcast helps me stay updated with the Go community for around 15 minutes per week. I am not a full-time software developer, but my long-term ambition is to become better in Go every week by working on personal projects and tools for work.

https://cupogo.dev

Codeberg



Codeberg e.V. is a nonprofit organization that provides online resources for software development and collaboration. I am a user and a supporting member, paying an annual membership of €24. I wouldn't have to pay that membership fee, as Codeberg offers all the services I use for free.

https://codeberg.org
https://codeberg.org/snonux - My Codeberg page

GrapheneOS



GrapheneOS is an open-source project that improves Android's privacy and security with sandboxing, exploit mitigations, and a permission model. It does not include Google apps or services but offers a sandboxed Google Play compatibility layer and its own apps and services.

I've made a one-off €100 donation because I really like this project, and I run GrapheneOS on my personal phone as my main daily driver.

https://grapheneos.org/
Why GrapheneOS Rox

AnkiDroid



AnkiDroid is an app that lets you learn flashcards efficiently with spaced repetition. It is compatible with Anki software and supports various flashcard content, syncing, statistics, and more.

I've been learning vocabulary with this free app, and it is, in my opinion, the best flashcard app I know. I've made a $20 one-off donation to this project.

https://opencollective.com/ankidroid

OpenBSD through OpenBSD.Amsterdam



The OpenBSD project produces a FREE, multi-platform 4.4BSD-based UNIX-like operating system. Our efforts emphasize portability, standardization, correctness, proactive security and integrated cryptography. As an example of the effect OpenBSD has, the popular OpenSSH software comes from OpenBSD. OpenBSD is freely available from their download sites.

I implicitly support the OpenBSD project through a VM I have rented at OpenBSD Amsterdam. They donate €10 per VM and €15 per VM for every renewal to the OpenBSD Foundation, with dedicated servers running vmm(4)/vmd(8) to host opinionated VMs.

https://www.OpenBSD.org
https://OpenBSD.Amsterdam

ProtonMail



I am not directly funding this project, but I am a very happy paying customer, and I am listing it here as an alternative to big tech if you don't want to run your own mail infrastructure. I am listing ProtonMail here as it is a non-profit organization, and I want to emphasize the importance of considering alternatives to big tech.

https://proton.me/

Libro.fm



This is the alternative to Audible if you are into audiobooks (like I am). For every book or every month of membership, I am also supporting a local bookstore I selected. Their catalog is not as large as Audible's, but it's still pretty decent.

Libro.fm began as a conversation among friends at Third Place Books, a local bookstore in Seattle, Washington, about the growing popularity of audiobooks and the lack of a way for readers to purchase them from independent bookstores. Flash forward, and Libro.fm was founded in 2014.

https://libro.fm

E-mail your comments to paul@nospam.buetow.org :-)

Back to the main site
Typing `127.1` words per minute (`>100wpm average`) https://foo.zone/gemfeed/2024-08-05-typing-127.1-words-per-minute.html 2024-08-05T17:39:30+03:00 Paul Buetow aka snonux paul@dev.buetow.org After work one day, I noticed some discomfort in my right wrist. Upon research, it appeared to be a mild case of Repetitive Strain Injury (RSI). Initially, I thought that this would go away after a while, but after a week it became even worse. This led me to consider potential causes such as poor posture or keyboard use habits. As an enthusiast of keyboards, I experimented with ergonomic concave ortholinear split keyboards. Wait, what?...

Typing 127.1 words per minute (>100wpm average)



Published at 2024-08-05T17:39:30+03:00; Updated at 2025-02-22

,---,---,---,---,---,---,---,---,---,---,---,---,---,-------,
|1/2| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 0 | + | ' | <-    |
|---'-,-'-,-'-,-'-,-'-,-'-,-'-,-'-,-'-,-'-,-'-,-'-,-'-,-----|
| ->| | Q | W | E | R | T | Y | U | I | O | P | ] | ^ |     |
|-----',--',--',--',--',--',--',--',--',--',--',--',--'|    |
| Caps | A | S | D | F | G | H | J | K | L | \ | [ | * |    |
|----,-'-,-'-,-'-,-'-,-'-,-'-,-'-,-'-,-'-,-'-,-'-,-'---'----|
|    | < | Z | X | C | V | B | N | M | , | . | - |          |
|----'-,-',--'--,'---'---'---'---'---'---'-,-'---',--,------|
| ctrl |  | alt |                          |altgr |  | ctrl |
'------'  '-----'--------------------------'------'  '------'
      Nieminen Mika	

Table of Contents




Introduction



After work one day, I noticed some discomfort in my right wrist. Upon research, it appeared to be a mild case of Repetitive Strain Injury (RSI). Initially, I thought that this would go away after a while, but after a week it became even worse. This led me to consider potential causes such as poor posture or keyboard use habits. As an enthusiast of keyboards, I experimented with ergonomic concave ortholinear split keyboards. Wait, what?...

  • Concave: Some fingers are longer than others. A concave keyboard makes it so that the keycaps meant to be pressed by the longer fingers are further down (e.g., left middle finger for e on a Qwerty layout), and keycaps meant to be pressed by shorter fingers are further up (e.g., right pinky finger for the letter p).
  • Ortholinear: The keys are arranged in a straight vertical line, unlike most conventional keyboards. The conventional keyboards still resemble the old typewriters, where the placement of the keys was optimized so that the typewriter would not jam. There is no such requirement anymore.
  • Split: The keyboard is split into two halves (left and right), allowing one to place either hand where it is most ergonomic.

After discovering ThePrimagen on YouTube (I had found him long ago, but never bothered buying the same keyboard he uses) and reading/watching a couple of reviews, I thought that as a computer professional, my equipment would be expensive anyway (laptop, adjustable desk, comfortable chair), so why not invest a bit more into the keyboard? I purchased the Kinesis Advantage360 Professional keyboard.

Kinesis review



For an in-depth review, have a look at this great article:

Review of the Kinesis Advantage360 Professional keyboard

Top build quality



The keyboard feels to be of excellent quality and is robust. It has some weight to it; because of that, it is not ideally suited for travel, though. But I have a different keyboard for that (see later in this post). Overall, I love how it is built and how it feels.

Kinesis Adv.360 Pro at home

Bluetooth connectivity



Despite encountering concerns about Bluetooth connectivity issues with the Kinesis keyboard during my research, I purchased one anyway as I intended to use it only via USB. However, I discovered that the firmware updates available afterwards had addressed these reported Bluetooth issues, and as a result, I did not experience any difficulties with the Bluetooth functionality. This positive outcome allowed me to enjoy using the keyboard also wirelessly.

Gateron Brown key switches



Many voices on the internet seem to dislike the Gateron Brown switches, the only official choice for non-clicky tactile switches in the Kinesis, so I was also a bit concerned. I almost went with Cherry MX Browns for my Kinesis (a custom build from a 3rd-party provider that partners with Kinesis). Still, I decided on Gateron Browns to try different switches than the Cherry MX Browns I already have on my ZSA Moonlander keyboard (another ortholinear split keyboard, but without a concave keycap layout).

At first, I was disappointed by the Gaterons, as they initially felt a bit mushy compared to the Cherries. Still, over the weeks I grew to prefer them because of their smoothness. Over time, the tactile bumps also became more noticeable (as my perception of them improved). Because of their less pronounced tactile feedback, the Gaterons are less tiring for long typing sessions and better suited for a relaxed typing experience.

So, the Cherry MX switches feel sharper but are more tiring in the long run, while the Gaterons are easier to type on, with slightly less pronounced tactile feedback.

Keycaps



If you ever purchase a Kinesis keyboard, go with the PBT keycaps. They upgrade the typing experience a lot. The only thing you lose is that the backlighting won't shine through them, but that is a reasonable tradeoff. When do I need backlighting anyway? I am supposed to look at the screen, not the keyboard, while typing.

I went with the blank keycaps, by the way.

Kinesis Adv.360 Pro at home

Keymap editor



There is no official keymap editor. You have to edit a configuration file manually, build the firmware from scratch, and upload the firmware with the new keymap to both keyboard halves. The Professional version of this keyboard, by the way, runs on the ZMK open-source firmware.

Many users find the need for an easy-to-use keymap editor an issue. But this is the Pro model. You can also go with the non-Pro, which runs on non-open-source firmware and has no Bluetooth (it must be operated entirely on USB).

There is a 3rd-party solution which is supposed to make configuring the keymap for the Professional model a breeze, but I have never used it. As a part-time programmer and full-time Site Reliability Engineer, I am okay with configuring the keymap in my text editor and building the firmware in a local Docker container, which is one of the standard ways of doing it. You could also use a GitHub pipeline for the firmware build, but I prefer building it locally on my machine. This all seems natural to me, but it may be an issue for "the average Joe" user.

First steps



I didn't measure the usual words per minute (wpm) on my previous keyboard, the ZSA Moonlander, but I guess that it was around 40-50wpm. Once the Kinesis arrived, I started practising. The experience was quite different due to the concave keycaps, so I barely managed 10wpm on the first day.

I quickly noticed that I could not continue using the freestyle 6-finger typing system I was used to on my Moonlander or any previous keyboards I worked with. I learned ten-finger touch typing from scratch to be more efficient with the Kinesis keyboard. The keyboard forces you to embrace touch typing.

Sometimes, there were brain farts, and I couldn't type at all. The trick was not to freak out about it, but to move on. If your average goes down a bit for a day, it doesn't matter; the long-term trend over several days and weeks matters, not the one-off wpm high score.

Although my wrist pain seemed to go away after the first week of using the Kinesis, my fingers became tired from adjusting to the new way of typing. My hands were stiff, as if I had been training for the Olympics. Only after three weeks did I start to feel comfortable with it. If it weren't for the comments I read online, I would have sent it back after week 2.

I also had a problem with the right pinky finger: I could not comfortably reach the p key without moving the whole hand. An easy fix was to swap p with ; on the keyboard layout.

Considering alternate layouts



As I was going to learn 10-finger touch typing from scratch, I also played with the thought of switching from the Qwerty to the Dvorak or Colemak keymap, but after reading some comments on the internet, I decided against it:

  • These layouts (Dvorak and Colemak) will minimize the finger travel for the most commonly used English words, but they don't necessarily give you a better wpm score.
  • One comment on Reddit also mentioned that getting stiff fingers is more likely with these layouts than with Qwerty, as Qwerty makes you stretch out your fingers more often, which helps here.
  • There are also many applications and websites whose keyboard shortcuts are Qwerty-optimized.
  • You won't easily be able to use someone else's computer, as it will likely be Qwerty. Some report that after using an alternative layout for a while, they forget how to use Qwerty.

Training how to type



Tools



One of the most influential tools in my touch typing journey has been keybr.com. This site/app helped me learn 10-finger touch typing, and I practice daily for 30 minutes (in the first two weeks, up to an hour every day). The key is persistence and focus on technique rather than speed; the latter naturally improves with regular practice. Precision matters, too, so I always correct my errors using the backspace key.

https://keybr.com

I also used a command-line tool called tt, which is written in Go. It has a feature that I found very helpful: the ability to practice typing by piping custom text into it. Additionally, I appreciated its customization options, such as choosing a colour theme and specifying how statistics are displayed.

https://github.com/lemnos/tt

I wrote myself a small Ruby script that would randomly select a paragraph from one of my eBooks or book notes and pipe it to tt. This helped me remember some of the books I read and also practice touch typing.
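The Ruby script itself isn't shown here, but the idea can be sketched in a few lines of shell (a hypothetical stand-in, not the original script; it assumes GNU shuf is available, which on macOS comes from coreutils):

```shell
# Hypothetical stand-in for the Ruby script: pick one random,
# blank-line-separated paragraph from the given text files.
random_paragraph() {
    # RS='' puts awk into paragraph mode (blank lines separate records);
    # gsub flattens each paragraph onto a single line.
    cat "$@" | awk -v RS='' '{ gsub(/\n/, " "); print }' | shuf -n 1
}

# Usage (pipes a random paragraph into tt for a practice round):
#   random_paragraph ~/ebooks/*.txt | tt
```

Anything that emits text works as practice material this way, which is what makes tt's stdin feature so handy.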

My keybr.com statistics



Overall, I trained for around 4 months in more than 5,000 sessions. My top speed in a session was 127.1wpm (up from barely 10wpm at the beginning).

All time stats

My overall average speed over those 5,000 sessions was 80wpm. The average speed over the last week was over 100wpm. The green line represents the wpm average (increasing trend), the purple line represents the number of keys in the practices (not much movement there, as all keys are unlocked), and the red line represents the average typing accuracy.

Typing speed over lessons

Around the middle, you can see a dip in the wpm average. This was when I swapped the p and ; keys, but after some retraining, I came back to the previous level and beyond.

Tips and tricks



These are some tips and tricks I learned along the way to improve my typing speed:

Relax



It's easy to cramp up when trying to hit a new wpm mark, but this just holds you back. Relax and type at a natural pace. Now I also understand why my Karate Sensei back in London kept screaming "RELAAAX" at me during practice... It didn't help much back then, though, as it is difficult to relax while someone screams at you!

Focus on accuracy first



This goes with the previous point. Instead of trying to speed through sessions as quickly as possible, slow down and try to type the words correctly—so don't rush it. If you aren't fast yet, the reason is that your brain hasn't trained enough. It will come over time, and you will be faster.

Chording



A trick for getting faster is to type word by word, pausing between words, so you learn each word as a chord. From 80wpm onwards, this makes a real difference.

Punctuation and Capitalization



I included 10% punctuation and 20% capital letters in my keybr.com practice sessions to simulate real typing conditions, which improved my overall working efficiency. I guess I would have averaged 120wpm if I hadn't included these options...

Reverse shifting



Reverse shifting, aka left-right shifting, is to...

  • ...use the left shift key for letters on the right keyboard side.
  • ...use the right shift key for letters on the left keyboard side.

This makes using the shift key a breeze.

Enter the flow state



Listening to music helps me enter a flow state during practice sessions, which makes typing training a bit addictive (which is a good thing, isn't it?).

Repeat every word



There's a setting on keybr.com that repeats every word, so you type each word twice in a row. I liked this feature very much, and I think it also helped improve my practice.

Don't use the same finger for two consecutive keystrokes



Apparently, if you want to type fast, you should avoid using the same finger for two consecutive keystrokes. This means you don't always need to use the same finger for the same key.
However, there are no hard and fast rules, so everyone develops their own system for typing word combinations. An exception is typing the very same letter twice in a row (e.g., the t in letter); there, you use the same finger for both ts.

Warm-up



You can't reach your average typing speed first thing in the morning. You should warm up before an exercise or practice session later in the day. Also, some days are good, others not so much, e.g., after a bad night's sleep. What matters is the mid- and long-term trend, not these fluctuations.

Travel keyboard



As mentioned, the Kinesis is a great keyboard, but it is not meant for travel.

I guess keyboards will always be my expensive hobby, so I also purchased another ergonomic, ortho-linear, concave split keyboard, the Glove80 (with the Red Pro low-profile switches). This keyboard is much lighter and, in my opinion, much better suited for travel than the Kinesis. It also comes with a great travel case.

Here is a photo of me using it with my Surface Go 2 (it runs Linux, by the way) while waiting for the baggage drop at the airport:

Traveling with the Glove80 using my Surface Go 2

For everyday work, I prefer the tactile Browns on the Kinesis over the Red Pro I have on the Glove80 (normal profile vs. low profile). The Kinesis feels much more premium, whereas the Glove80 is much lighter and easier to store away in a rucksack (the official travel case is a bit bulky, so I simply wrapped it in bubble wrap).

The F-key row on the Glove80 is odd. I would have preferred more keys on the sides, like the Kinesis has; I use those for [] {} (), which is pretty handy. However, I like the Glove80's thumb cluster more than the Kinesis's.

The good thing is that I can switch between both keyboards instantly without retraining my muscle memory. I've configured (as much as possible) the same keymaps on both my Kinesis and Glove80, making it easy to switch between them on any occasion.

Interested in the Glove80? I suggest also reading this review:

Review of the Glove80 keyboard

Upcoming custom Kinesis build



As I mentioned, keyboards will remain an expensive hobby of mine. I don't regret anything here, though. After all, I use keyboards at my day job. I've ordered a custom Kinesis build with the Gateron Kangaroo switches, and I'm excited to see how it compares to my current setup. I'm still deciding whether to keep my Gateron Brown-equipped Kinesis as a secondary keyboard, possibly leave it at my in-laws' for use when visiting, or sell it.

Update 2025-02-22: I've received my custom Kinesis Adv. 360 build with the Gateron Baby Kangaroo key switches. I am absolutely in love! I will keep my Gateron Brown version around, though.

Conclusion



When I traveled with the Glove80 for work to the London office, a colleague stared at my keyboard and made jokes that it might be broken (split into two halves). But other than that...

Ten-finger touch typing has improved my efficiency and has become a rewarding discipline. Whether it's the keyboards I use, the tools I practice with, or the techniques I've adopted, each step has been a learning experience. I hope sharing my journey provides valuable insights and inspiration for anyone looking to improve their touch typing skills.

I also accidentally started using a 10-finger-like system (maybe still 6 fingers, but better than before) on my regular laptop keyboard. The form factor is different there (not ortholinear, no concave keycaps, etc.), but my typing has improved there too, even if only by a little bit.

I don't want to return to a non-concave keyboard as my default. I will still use other keyboards once in a while, but only for short periods or when I have to (e.g., travelling with my laptop when there is no space for an external keyboard).

Learning to touch type has been an eye-opening experience for me, not just for work but also for personal projects. Now, writing documentation is so much fun; who would have thought? Furthermore, working with Slack (communicating with colleagues) is more fun now as well.

E-Mail your comments to paul@nospam.buetow.org :-)

Back to the main site
'The Stoic Challenge' book notes https://foo.zone/gemfeed/2024-07-07-the-stoic-challenge-book-notes.html 2024-07-07T12:46:55+03:00 Paul Buetow aka snonux paul@dev.buetow.org These are my personal takeaways after reading 'The Stoic Challenge: A Philosopher's Guide to Becoming Tougher, Calmer, and More Resilient' by William B. Irvine.

"The Stoic Challenge" book notes



Published at 2024-07-07T12:46:55+03:00

These are my personal takeaways after reading "The Stoic Challenge: A Philosopher's Guide to Becoming Tougher, Calmer, and More Resilient" by William B. Irvine.

         ,..........   ..........,
     ,..,'          '.'          ',..,
    ,' ,'            :            ', ',
   ,' ,'             :             ', ',
  ,' ,'              :              ', ',
 ,' ,'............., : ,.............', ',
,'  '............   '.'   ............'  ',
 '''''''''''''''''';''';''''''''''''''''''
                    '''

Table of Contents




God sets you up for a challenge



Gods set you up for a challenge to see how resilient you are. Is getting angry worth the price? If you stay calm, then you can find the optimal workaround for the obstacle. Stay calm even with big setbacks. Practice minimalism of negative emotions.

Put a positive spin on everything. What should you do if someone wrongs you? Don't get angry; there is no point in that, it just makes you suffer. Do the best with what you've got now, and keep calm and carry on. A resilient person refuses to play the role of a victim. You can develop setback-response skills. Turn a setback, e.g. a handicap, into a personal triumph.

It is not the things done to you or that happen to you that matter, but how you take them and how you react to them.

Don't row against the other boats but against your own lazy self. It doesn't matter if you are first or last, as long as you defeat your lazy self.

Stoics are thankful that they are mortal, as it reminds them of how great it is to be alive at all. In dying, we are more alive than we have ever been, as everything you do could be the last time you do it. Rather than fighting your death, you should embrace it if there are no workarounds. Embrace a good death.

Negative visualization



It is easy to take what we have for granted.

  • Imagine the negative and then realise that things are actually much better than they seem to be.
  • Close your eyes and imagine you are colour blind for a minute, then open your eyes again and see all the colours. You will be grateful for being able to see them.
  • Now close your eyes for a minute and imagine you were blind, so that you would never be able to see the world again, and let it sink in. When you open your eyes again, you will feel a lot of gratitude.
  • Last-time meditation: it lets you appreciate life as it is now. Life gets revitalised.

Oh, nice trick, you stoic "god"! ;-)



Take setbacks as a challenge. Also, take them with some humor.

  • A setback in a setback, how genius :-)
  • A setback in a setback in a setback: the stoic gods work overtime, eh? :-)

What would the stoic gods do next? This is just a test strategy of theirs. Don't be frustrated at all; be astonished at what comes next. Thank the stoic gods for testing you. This is the Stoics' comfort zone extension, aka toughness training.

E-Mail your comments to paul@nospam.buetow.org :-)

Other book notes of mine are:

2025-11-02 'The Courage To Be Disliked' book notes
2025-06-07 'A Monk's Guide to Happiness' book notes
2025-04-19 'When: The Scientific Secrets of Perfect Timing' book notes
2024-10-24 'Staff Engineer' book notes
2024-07-07 'The Stoic Challenge' book notes (You are currently reading this)
2024-05-01 'Slow Productivity' book notes
2023-11-11 'Mind Management' book notes
2023-07-17 'Software Developers Career Guide and Soft Skills' book notes
2023-05-06 'The Obstacle is the Way' book notes
2023-04-01 'Never split the difference' book notes
2023-03-16 'The Pragmatic Programmer' book notes

Back to the main site
Random Weird Things - Part Ⅰ https://foo.zone/gemfeed/2024-07-05-random-weird-things.html 2024-07-05T10:59:59+03:00 Paul Buetow aka snonux paul@dev.buetow.org Every so often, I come across random, weird, and unexpected things on the internet. I thought it would be neat to share them here from time to time. As a start, here are ten of them.

Random Weird Things - Part Ⅰ



Published at 2024-07-05T10:59:59+03:00; Updated at 2025-02-08

Every so often, I come across random, weird, and unexpected things on the internet. I thought it would be neat to share them here from time to time. As a start, here are ten of them.

2024-07-05 Random Weird Things - Part Ⅰ (You are currently reading this)
2025-02-08 Random Weird Things - Part Ⅱ
2025-08-15 Random Weird Things - Part Ⅲ

		       /\_/\
WHOA!! 	     ( o.o )
		       > ^ <
		      /  -  \
		    /        \
		   /______\  \

Table of Contents




1. bad.horse traceroute



Run traceroute to get the poem (or song).

Update: A reader hinted that by specifying -m 60 (raising the maximum hop count), there will be even more output!

❯ traceroute -m 60 bad.horse
traceroute to bad.horse (162.252.205.157), 60 hops max, 60 byte packets
 1  _gateway (192.168.1.1)  5.237 ms  5.264 ms  6.009 ms
 2  77-85-0-2.ip.btc-net.bg (77.85.0.2)  8.753 ms  7.112 ms  8.336 ms
 3  212-39-69-103.ip.btc-net.bg (212.39.69.103)  9.434 ms  9.268 ms  9.986 ms
 4  * * *
 5  xe-1-2-0.mpr1.fra4.de.above.net (80.81.194.26)  39.812 ms  39.030 ms  39.772 ms
 6  * ae12.cs1.fra6.de.eth.zayo.com (64.125.26.172)  123.576 ms *
 7  * * *
 8  * * *
 9  ae10.cr1.lhr15.uk.eth.zayo.com (64.125.29.17)  119.097 ms  119.478 ms  120.767 ms
10  ae2.cr1.lhr11.uk.zip.zayo.com (64.125.24.140)  120.398 ms  121.147 ms  120.948 ms
11  * * *
12  ae25.mpr1.yyz1.ca.zip.zayo.com (64.125.23.117)  145.072 ms *  181.773 ms
13  ae5.mpr1.tor3.ca.zip.zayo.com (64.125.23.118)  168.239 ms  168.158 ms  168.137 ms
14  64.124.217.237.IDIA-265104-ZYO.zip.zayo.com (64.124.217.237)  168.026 ms  167.999 ms  165.451 ms
15  * * *
16  t00.toroc1.on.ca.sn11.net (162.252.204.2)  131.598 ms  131.308 ms  131.482 ms
17  bad.horse (162.252.205.130)  131.430 ms  145.914 ms  130.514 ms
18  bad.horse (162.252.205.131)  136.634 ms  145.295 ms  135.631 ms
19  bad.horse (162.252.205.132)  139.158 ms  148.363 ms  138.934 ms
20  bad.horse (162.252.205.133)  145.395 ms  148.054 ms  147.140 ms
21  he.rides.across.the.nation (162.252.205.134)  149.687 ms  147.731 ms  150.135 ms
22  the.thoroughbred.of.sin (162.252.205.135)  156.644 ms  155.155 ms  156.447 ms
23  he.got.the.application (162.252.205.136)  161.187 ms  162.318 ms  162.674 ms
24  that.you.just.sent.in (162.252.205.137)  166.763 ms  166.675 ms  164.243 ms
25  it.needs.evaluation (162.252.205.138)  172.073 ms  171.919 ms  171.390 ms
26  so.let.the.games.begin (162.252.205.139)  175.386 ms  174.180 ms  175.965 ms
27  a.heinous.crime (162.252.205.140)  180.857 ms  180.766 ms  180.192 ms
28  a.show.of.force (162.252.205.141)  187.942 ms  186.669 ms  186.986 ms
29  a.murder.would.be.nice.of.course (162.252.205.142)  191.349 ms  191.939 ms  190.740 ms
30  bad.horse (162.252.205.143)  195.425 ms  195.716 ms  196.186 ms
31  bad.horse (162.252.205.144)  199.238 ms  200.620 ms  200.318 ms
32  bad.horse (162.252.205.145)  207.554 ms  206.729 ms  205.201 ms
33  he-s.bad (162.252.205.146)  211.087 ms  211.649 ms  211.712 ms
34  the.evil.league.of.evil (162.252.205.147)  212.657 ms  216.777 ms  216.589 ms
35  is.watching.so.beware (162.252.205.148)  220.911 ms  220.326 ms  221.961 ms
36  the.grade.that.you.receive (162.252.205.149)  225.384 ms  225.696 ms  225.640 ms
37  will.be.your.last.we.swear (162.252.205.150)  232.312 ms  230.989 ms  230.919 ms
38  so.make.the.bad.horse.gleeful (162.252.205.151)  235.761 ms  235.291 ms  235.585 ms
39  or.he-ll.make.you.his.mare (162.252.205.152)  241.350 ms  239.407 ms  238.394 ms
40  o_o (162.252.205.153)  246.154 ms  247.650 ms  247.110 ms
41  you-re.saddled.up (162.252.205.154)  250.925 ms  250.401 ms  250.619 ms
42  there-s.no.recourse (162.252.205.155)  256.071 ms  251.154 ms  255.340 ms
43  it-s.hi-ho.silver (162.252.205.156)  260.152 ms  261.775 ms  261.544 ms
44  signed.bad.horse (162.252.205.157)  262.430 ms  261.410 ms  261.365 ms

2. ASCII cinema



Fancy watching Star Wars Episode IV in ASCII? Head to the ASCII cinema:

https://asciinema.org/a/569727

3. Netflix's Hello World application



Netflix has got a Hello World application running in production 😱

  • https://www.Netflix.com/helloworld

By the time this was posted, Netflix seems to have taken it offline... I should have taken a screenshot!

C programming



4. Indexing an array



In C, you can index an array like this: array[i] (not surprising). But this works as well and is valid C code: i[array] 🤯. That's because, per the spec, A[B] is equivalent to *(A + B), and operand order doesn't matter for the + operator. All 3 loops produce the same output. It would be funny to use i[array] in a merge request on April Fools' Day!

#include <stdio.h>

int main(void) {
  int array[5] = { 1, 2, 3, 4, 5 };

  for (int i = 0; i < 5; i++)
    printf("%d\n", array[i]);

  for (int i = 0; i < 5; i++)
    printf("%d\n", i[array]);

  for (int i = 0; i < 5; i++)
    printf("%d\n", *(i + array));
}

5. Variables with prefix $



In C, you can prefix variables with $! E.g., common compilers such as GCC and Clang accept the following as an extension to standard C 🫠:

#include <stdio.h>

int main(void) {
  int $array[5] = { 1, 2, 3, 4, 5 };

  for (int $i = 0; $i < 5; $i++)
    printf("%d\n", $array[$i]);

  for (int $i = 0; $i < 5; $i++)
    printf("%d\n", $i[$array]);

  for (int $i = 0; $i < 5; $i++)
    printf("%d\n", *($i + $array));
}

6. Object oriented shell scripts using ksh



Experienced software developers are aware that scripting languages like Python, Perl, Ruby, and JavaScript support object-oriented programming (OOP) concepts such as classes and inheritance. However, many might be surprised to learn that the latest version of the Korn shell (Version 93t+) also supports OOP. In ksh93, OOP is implemented using user-defined types:

#!/usr/bin/ksh93
 
typeset -T Point_t=(
    integer -h 'x coordinate' x=0
    integer -h 'y coordinate' y=0
    typeset -h 'point color'  color="red"

    function getcolor {
        print -r ${_.color}
    }

    function setcolor {
        _.color=$1
    }

    setxy() {
        _.x=$1; _.y=$2
    }

    getxy() {
        print -r "(${_.x},${_.y})"
    }
)
 
Point_t point
 
echo "Initial coordinates are (${point.x},${point.y}). Color is ${point.color}"
 
point.setxy 5 6
point.setcolor blue
 
echo "New coordinates are ${point.getxy}. Color is ${point.getcolor}"
 
exit 0

Using types to create object oriented Korn shell 93 scripts

7. This works in Go



There is no pointer arithmetic in Go like in C, but it is still possible to do some brain teasers with pointers 😧:

package main

import "fmt"

func main() {
	var i int
	f := func() *int {
		return &i
	}
	*f()++
	fmt.Println(i)
}

Go playground

8. "I am a Teapot" HTTP response code



Defined in 1998 as one of the IETF's traditional April Fools' jokes (RFC 2324), the Hyper Text Coffee Pot Control Protocol specifies an HTTP status code that is not intended for actual HTTP server implementation. According to the RFC, this code should be returned by teapots when asked to brew coffee. This status code also serves as an Easter egg on some websites, such as Google.com's "I'm a teapot" feature. Occasionally, it is used to respond to a blocked request, even though the more appropriate response would be the 403 Forbidden status code.

https://en.wikipedia.org/wiki/List_of_HTTP_status_codes#418

9. jq is a functional programming language



Many know jq, the handy little tool and Swiss Army knife for JSON parsing.

https://github.com/jqlang/jq

What many don't know is that jq is actually a full-blown functional programming language, jqlang. Have a look at the language description:

https://github.com/jqlang/jq/wiki/jq-Language-Description

As a matter of fact, the language is so powerful that there exists an implementation of jq in jq itself:

https://github.com/wader/jqjq

Here is a snippet from jqjq, to give you a feel for jqlang:

def _token:
	def _re($re; f):
	  ( . as {$remain, $string_stack}
	  | $remain
	  | match($re; "m").string
	  | f as $token
	  | { result: ($token | del(.string_stack))
	    , remain: $remain[length:]
	    , string_stack:
	        ( if $token.string_stack == null then $string_stack
	          else $token.string_stack
	          end
	        )
	    }
	  );
	if .remain == "" then empty
	else
	  ( . as {$string_stack}
	  | _re("^\\s+"; {whitespace: .})
	  // _re("^#[^\n]*"; {comment: .})
	  // _re("^\\.[_a-zA-Z][_a-zA-Z0-9]*"; {index: .[1:]})
	  // _re("^[_a-zA-Z][_a-zA-Z0-9]*"; {ident: .})
	  // _re("^@[_a-zA-Z][_a-zA-Z0-9]*"; {at_ident: .})
	  // _re("^\\$[_a-zA-Z][_a-zA-Z0-9]*"; {binding: .})
	  # 1.23, .123, 123e2, 1.23e2, 123E2, 1.23e+2, 1.23E-2 or 123
	  // _re("^(?:[0-9]*\\.[0-9]+|[0-9]+)(?:[eE][-\\+]?[0-9]+)?"; {number: .})
	  // _re("^\"(?:[^\"\\\\]|\\\\.)*?\\\\\\(";
	      ( .[1:-2]
	      | _unescape
	      | {string_start: ., string_stack: ($string_stack+["\\("])}
	      )
	    )
	 .
	 .
	 .

10. Regular expression to verify email addresses



This is a pretty old meme, but still worth posting here (as some may be unaware). The RFC822 Perl regex to validate email addresses is 😱:

(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]
)+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:
\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(
?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ 
\t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\0
31]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\
](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+
(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:
(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z
|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)
?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\
r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[
 \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)
?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t]
)*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[
 \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*
)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]
)+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)
*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+
|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r
\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:
\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t
]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031
]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](
?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?
:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?
:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?
:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?
[ \t]))*"(?:(?:\r\n)?[ \t])*)*:(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] 
\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|
\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>
@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"
(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t]
)*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\
".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?
:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[
\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-
\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(
?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;
:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([
^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\"
.\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\
]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\
[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\
r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] 
\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]
|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \0
00-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\
.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,
;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?
:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*
(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".
\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[
^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]
]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)(?:,\s*(
?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\
".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(
?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[
\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t
])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t
])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?
:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|
\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:
[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\
]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)
?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["
()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)
?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>
@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[
 \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,
;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t]
)*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\
".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?
(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".
\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:
\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[
"()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])
*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])
+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\
.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z
|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(
?:\r\n)?[ \t])*))*)?;\s*)

https://pdw.ex-parrot.com/Mail-RFC822-Address.html

I hope you had some fun. E-Mail your comments to paul@nospam.buetow.org :-)


Back to the main site
Terminal multiplexing with `tmux` - Z-Shell edition https://foo.zone/gemfeed/2024-06-23-terminal-multiplexing-with-tmux.html 2024-06-23T22:41:59+03:00, last updated Fri 02 May 00:10:49 EEST 2025 Paul Buetow aka snonux paul@dev.buetow.org This is the Z-Shell version. There is also a Fish version:

Terminal multiplexing with tmux - Z-Shell edition



Published at 2024-06-23T22:41:59+03:00, last updated Fri 02 May 00:10:49 EEST 2025

This is the Z-Shell version. There is also a Fish version:

./2025-05-02-terminal-multiplexing-with-tmux-fish-edition.html

Tmux (Terminal Multiplexer) is a powerful, terminal-based tool that manages multiple terminal sessions within a single window. Here are some of its primary features and functionalities:

  • Session management
  • Window and Pane management
  • Persistent Workspace
  • Customization

https://github.com/tmux/tmux/wiki

         _______
        |.-----.|
        || Tmux||
        ||_.-._||
        `--)-(--`
       __[=== o]___
      |:::::::::::|\
jgs   `-=========-`()
    mod. by Paul B.

Table of Contents




Before continuing...



Before continuing to read this post, I encourage you to get familiar with Tmux first (unless you already know the basics). You can go through the official getting started guide:

https://github.com/tmux/tmux/wiki/Getting-Started

I can also recommend this book (it is the book that got me started with Tmux):

https://pragprog.com/titles/bhtmux2/tmux-2/

Over the years, I have built a couple of shell helper functions to optimize my workflows. Tmux is deeply integrated into my daily workflows (personal and work). Colleagues have asked me about my Tmux config and helper scripts several times, so I thought it would be neat to blog about it, so that everyone interested can make a copy of my configuration and scripts.

The configuration and scripts in this blog post are only the non-work-specific parts. There are more helper scripts, which I only use for work (and aren't really useful outside of work due to the way servers and clusters are structured there).

Tmux is highly configurable, and I think I am only scratching the surface of what is possible with it. Nevertheless, it may still be useful for you. I also love that Tmux is part of the OpenBSD base system!

Shell aliases



I am a user of the Z-Shell (zsh), but I believe all the snippets mentioned in this blog post also work with Bash.

https://www.zsh.org

For the most common Tmux commands I use, I have created the following shell aliases:

alias tm=tmux
alias tl='tmux list-sessions'
alias tn=tmux::new
alias ta=tmux::attach
alias tx=tmux::remote
alias ts=tmux::search
alias tssh=tmux::cluster_ssh

Note all the tmux::... names; those are custom shell functions and aren't part of the Tmux distribution. Let's run through the aliases one by one.

The first two are pretty straightforward. tm is simply a shorthand for tmux, so I have to type less, and tl lists all Tmux sessions that are currently open. No magic here.

The tn alias - Creating a new session



The tn alias is referencing this function:

# Create new session and if already exists attach to it
tmux::new () {
    readonly session=$1
    local date=date
    if where gdate &>/dev/null; then
        date=gdate
    fi

    tmux::cleanup_default
    if [ -z "$session" ]; then
        tmux::new T$($date +%s)
    else
        tmux new-session -d -s $session
        tmux -2 attach-session -t $session || tmux -2 switch-client -t $session
    fi
}
alias tn=tmux::new

There is a lot going on here. Let's have a detailed look at what it is doing. As a note, the function relies on GNU date, so on macOS it looks for the gdate command and otherwise falls back to date. You need to install GNU date on macOS, as it isn't there by default. As I use Fedora Linux on my personal laptop and a MacBook for work, I have to make it work for both.

First, a Tmux session name can be passed to the function as the first argument. The session name is optional; without it, the function generates a default name of T$($date +%s), which is T followed by the UNIX epoch, e.g. T1717133796.
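To make the naming scheme concrete, the default-name logic can be isolated into a tiny helper (a sketch mirroring the function above, including the gdate detection for macOS):

```shell
# Generate the default temporary session name: "T" + UNIX epoch.
default_session_name() {
    local date_cmd=date
    # Prefer GNU date (gdate) when available, e.g. on macOS.
    command -v gdate >/dev/null 2>&1 && date_cmd=gdate
    printf 'T%s\n' "$("$date_cmd" +%s)"
}

# Example: default_session_name prints something like T1717133796
```

Because the epoch is unique per second, two quick invocations on the same machine still get distinct session names.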

Cleaning up default sessions automatically



Note also the call to tmux::cleanup_default; it cleans up all already-opened default sessions that aren't attached. Those sessions are only temporary, and I had too many flying around after a while. So, I decided to auto-delete sessions that aren't attached. If I want to keep a session around, I rename it with the Tmux prefix-key $ command. This is the cleanup function:

tmux::cleanup_default () {
    local s
    tmux list-sessions | grep '^T.*: ' | grep -F -v attached |
    cut -d: -f1 | while read -r s; do
        echo "Killing $s"
        tmux kill-session -t "$s"
    done
}

The cleanup function kills all open Tmux sessions that haven't been renamed properly yet, but only if they aren't attached (i.e., they don't run in the foreground in any terminal). Cleaning them up automatically keeps my Tmux sessions as neat and tidy as possible.
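To make the filter concrete, here is the same pipeline run over some made-up list-sessions output:

```shell
# Sample `tmux list-sessions` output: two temporary sessions (T...),
# one of them attached, plus one renamed session.
sample='T1717133796: 1 windows (created Fri May 31 09:00:00 2024)
T1717133800: 2 windows (created Fri May 31 09:05:00 2024) (attached)
work: 3 windows (created Fri May 31 08:00:00 2024)'

# Only the unattached temporary session survives the filter:
echo "$sample" | grep '^T.*: ' | grep -F -v attached | cut -d: -f1
# Prints: T1717133796
```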

Renaming sessions



Whenever I am in a temporary session (named T....), I may decide that I want to keep this session around. I have to rename the session to prevent the cleanup function from doing its thing. That's, as mentioned already, easily accomplished with the standard prefix-key $ Tmux command.

The ta alias - Attaching to a session



This alias refers to the following function, which tries to attach to an already-running Tmux session.

tmux::attach () {
    readonly session=$1

    if [ -z "$session" ]; then
        tmux attach-session || tmux::new
    else
        tmux attach-session -t $session || tmux::new $session
    fi
}
alias ta=tmux::attach

If no session is specified (as the argument of the function), it will try to attach to the first open session. If no Tmux server is running, it will create a new one with tmux::new. Otherwise, with a session name given as the argument, it will attach to it. If unsuccessful (e.g., the session doesn't exist), it will be created and attached to.

The tr alias - For a nested remote session



This SSHs into the remote server specified and then, remotely on the server itself, starts a nested Tmux session. So we have one Tmux session on the local computer and, inside of it, an SSH connection to a remote server with a Tmux session running again. The benefit of this is that, in case my network connection breaks down, the next time I connect, I can continue my work on the remote server exactly where I left off. The session name is the name of the server being SSHed into. If a session like this already exists, it simply attaches to it.

tmux::remote () {
    readonly server=$1
    tmux new -s $server "ssh -t $server 'tmux attach-session || tmux'" || \
        tmux attach-session -d -t $server
}
alias tr=tmux::remote

Change of the Tmux prefix for better nesting



To make nested Tmux sessions work smoothly, one must change the Tmux prefix key either locally or remotely, so that the inner and outer sessions don't both react to the same key. By default, the Tmux prefix key is Ctrl-b, so Ctrl-b $, for example, renames the current session. To change the prefix key from the standard Ctrl-b to, for example, Ctrl-g, you must add this to the tmux.conf:

set-option -g prefix C-g

This way, when I want to rename the remote Tmux session, I have to use Ctrl-g $, and when I want to rename the local Tmux session, I still have to use Ctrl-b $. In my case, I have this deployed to all remote servers through a configuration management system (out of scope for this blog post).
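For completeness, a typical remote-side tmux.conf that moves the prefix entirely might look like the following; the unbind-key and send-prefix lines are a common convention, not quoted from my actual config:

```
set-option -g prefix C-g
unbind-key C-b
bind-key C-g send-prefix
```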

There is also a way around this without reconfiguring the prefix key (by default, pressing the prefix key twice sends one prefix keystroke through to the nested session, as prefix Ctrl-b is bound to send-prefix), but that is cumbersome to use, as far as I remember.

The ts alias - Searching sessions with fuzzy finder



Even though tmux::cleanup_default keeps me from leaving a huge mess of Tmux sessions flying around, it can at times become challenging to find exactly the session I am currently interested in. After a busy workday, I often end up with around twenty sessions on my laptop. This is where fuzzy searching for session names comes in handy, as I often don't remember the exact session names.

tmux::search () {
    local -r session=$(tmux list-sessions | fzf | cut -d: -f1)
    # Bail out if the fzf selection was cancelled
    [ -z "$session" ] && return

    if [ -z "$TMUX" ]; then
        tmux attach-session -t "$session"
    else
        tmux switch -t "$session"
    fi
}
alias ts=tmux::search

All it does is list all currently open Tmux sessions in fzf, where one of them can be selected through fuzzy search, and then either switch to that session (if already inside a Tmux session) or attach to it (if not yet in Tmux).
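The switch-or-attach branch keys off the TMUX environment variable, which Tmux sets for all processes running inside a session. Here is a tiny standalone sketch of just that decision (the function name is illustrative, not from my dotfiles):

```shell
# Decide how to reach a target session: attach when outside Tmux,
# switch the current client when already inside ($TMUX is set).
tmux_reach_mode () {
    if [ -z "$TMUX" ]; then
        echo attach
    else
        echo switch
    fi
}
```

Outside Tmux this prints attach; inside a session it prints switch.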

You must install the fzf command on your computer for this to work. This is how it looks:

Tmux session fuzzy finder

The tssh alias - Cluster SSH replacement



Before I used Tmux, I was a heavy user of ClusterSSH, which allowed me to log in to multiple servers at once in a single terminal window and type and run commands on all of them in parallel.

https://github.com/duncs/clusterssh

However, since I started using Tmux, I have retired ClusterSSH: Tmux only needs a terminal to run in, whereas ClusterSSH spawned separate terminal windows, which aren't easily portable (e.g., from a Linux desktop to macOS). The tmux::cluster_ssh function takes N arguments, where:

  • ...the first argument will be the session name (see tmux::tssh_from_argument helper function), and all remaining arguments will be server hostnames/FQDNs to connect to simultaneously.
  • ...or, the first argument is a file name, and the file contains a list of hostnames/FQDNs (see the tmux::tssh_from_file helper function)

This is the function definition behind the tssh alias:

tmux::cluster_ssh () {
    if [ -f "$1" ]; then
        tmux::tssh_from_file "$1"
        return
    fi

    tmux::tssh_from_argument "$@"
}
alias tssh=tmux::cluster_ssh

This function is just a wrapper around the more complex tmux::tssh_from_file and tmux::tssh_from_argument functions, as you have learned already. Most of the magic happens there.

The tmux::tssh_from_argument helper



This is the most magic helper function we will cover in this post. It looks like this:

tmux::tssh_from_argument () {
    local -r session=$1; shift
    local first_server=$1; shift

    tmux new-session -d -s $session "ssh -t $first_server"
    if ! tmux list-sessions | grep -q "^$session:"; then
        echo "Could not create session $session"
        return 2
    fi

    for server in "${@[@]}"; do
        tmux split-window -t $session "tmux select-layout tiled; ssh -t $server"
    done

    tmux setw -t $session synchronize-panes on
    tmux -2 attach-session -t $session || tmux -2 switch-client -t $session
}

It expects at least two arguments. The first argument is the session name to create for the clustered SSH session. All other arguments are server hostnames or FQDNs to connect to. The first one is used to create the initial session. All remaining ones are added to that session with tmux split-window -t $session.... At the end, we enable synchronized panes by default, so whatever you type is sent to every SSH connection, replicating the neat ClusterSSH feature of running commands on multiple servers simultaneously. Once done, we attach to the session (or switch to it, if already in Tmux).

Sometimes, I don't want the synchronized panes behavior and want to switch it off temporarily. I can do that with prefix-key p and prefix-key P after adding the following to my local tmux.conf:

bind-key p setw synchronize-panes off
bind-key P setw synchronize-panes on

The tmux::tssh_from_file helper



This one derives the session name from the file name (the basename without its extension) and then reads a list of servers from that file, passing them to tmux::tssh_from_argument as arguments. So, this is a neat little wrapper that also enables me to open clustered SSH sessions from an input file.

tmux::tssh_from_file () {
    local -r serverlist=$1; shift
    local -r session=$(basename $serverlist | cut -d. -f1)

    tmux::tssh_from_argument $session $(awk '{ print $1} ' $serverlist | sed 's/.lan./.lan/g')
}
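Seen in isolation, the session-name and server-list derivation work like this (the file name and hostnames are made-up examples):

```shell
# Session name: basename of the list file, minus the extension.
serverlist=manyservers.txt
session=$(basename "$serverlist" | cut -d. -f1)
echo "$session"    # manyservers

# Server list: the first whitespace-separated field of every line,
# so extra columns after the hostname are ignored.
printf '%s\n' 'blowfish.buetow.org up' 'fishbone.buetow.org down' |
    awk '{ print $1 }'
```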

tssh examples



To open a new session named fish and log in to 4 remote hosts, run this command (Note that it is also possible to specify the remote user):

$ tssh fish blowfish.buetow.org fishfinger.buetow.org \
    fishbone.buetow.org user@octopus.buetow.org

To open a new session named manyservers, put many servers (one FQDN per line) into a file called manyservers.txt and simply run:

$ tssh manyservers.txt
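The file format is simply one host per line; only the first whitespace-separated field of each line is used, so extra columns after the host are ignored. A made-up example:

```
blowfish.buetow.org
fishbone.buetow.org
user@octopus.buetow.org
```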

Common Tmux commands I use in tssh



These are default Tmux commands that I make heavy use of in a tssh session:

  • Press prefix-key DIRECTION to switch panes. DIRECTION is by default any of the arrow keys, but I also configured Vi keybindings.
  • Press prefix-key <space> to change the pane layout (can be pressed multiple times to cycle through them).
  • Press prefix-key z to zoom in and out of the current active pane.

Copy and paste workflow



As you will see later in this blog post, I have configured a large history limit in Tmux so that I can scroll back quite far. One main workflow of mine is to search for text in the Tmux history, select and copy it, and then switch to another window or session and paste it there (e.g., into my text editor to do something with it).

This works by pressing prefix-key [ to enter Tmux copy mode. From there, I can browse the Tmux history of the current window using either the arrow keys or vi-like navigation (see vi configuration later in this blog post) and the Pg-Dn and Pg-Up keys.

I often search the history backwards with prefix-key [ followed by a ?, which opens the Tmux history search prompt.

Once I have identified the terminal text to be copied, I enter visual select mode with v, highlight all the text to be copied (using arrow keys or Vi motions), and press y to yank it (sorry if this all sounds a bit complicated, but Vim/NeoVim users will know this, as it is pretty much how you do it there as well).

For v and y to work, the following has to be added to the Tmux configuration file:

bind-key -T copy-mode-vi 'v' send -X begin-selection
bind-key -T copy-mode-vi 'y' send -X copy-selection-and-cancel

Once the text is yanked, I switch to another Tmux window or session where, for example, a text editor is running and paste the yanked text from Tmux into the editor with prefix-key ]. Note that when pasting into a modal text editor like Vi or Helix, you would first need to enter insert mode before prefix-key ] would paste anything.

Tmux configurations



Some features I have configured directly in Tmux don't require an external shell alias to function correctly. Let's walk line by line through my local ~/.config/tmux/tmux.conf:

source ~/.config/tmux/tmux.local.conf

set-option -g allow-rename off
set-option -g history-limit 100000
set-option -g status-bg '#444444'
set-option -g status-fg '#ffa500'
set-option -s escape-time 0

There's not much magic happening here. I source a tmux.local.conf, which I sometimes use to override the default configuration that comes from the configuration management system. Mostly, though, it is just an empty file, so it doesn't throw any errors on Tmux startup when I don't use it.

I work with many terminal outputs, which I also like to search within Tmux. So, I set a large history-limit, enabling me to search backwards in Tmux through up to 100,000 lines of output.

Besides changing some colours (personal taste), I also set escape-time to 0, which is just a workaround: otherwise, my Helix text editor's ESC key would take ages to trigger within Tmux. I don't remember the gory details anymore. If everything works fine for you without it, you can leave it out.

The next lines in the configuration file are:

set-window-option -g mode-keys vi
bind-key -T copy-mode-vi 'v' send -X begin-selection
bind-key -T copy-mode-vi 'y' send -X copy-selection-and-cancel

I navigate within Tmux using Vi keybindings, so mode-keys is set to vi. I use the Helix modal text editor, which is close enough to Vi bindings for simple navigation to feel "native" to me. (By the way, I have been a long-time Vim and NeoVim user, but I eventually switched to Helix. It's off-topic here, but it may be worth another blog post someday.)

The two bind-key commands make it so that I can use v and y in copy mode, which feels more Vi-like (as already discussed earlier in this post).

The next set of lines in the configuration file are:

bind-key h select-pane -L
bind-key j select-pane -D
bind-key k select-pane -U
bind-key l select-pane -R

bind-key H resize-pane -L 5
bind-key J resize-pane -D 5
bind-key K resize-pane -U 5
bind-key L resize-pane -R 5

These allow me to use prefix-key h, prefix-key j, prefix-key k, and prefix-key l for switching panes and prefix-key H, prefix-key J, prefix-key K, and prefix-key L for resizing the panes. If you don't know Vi/Vim/NeoVim, the letters hjkl are commonly used there for left, down, up, and right, which is also the same for Helix, by the way.

The next set of lines in the configuration file are:

bind-key c new-window -c '#{pane_current_path}'
bind-key F new-window -n "session-switcher" "tmux list-sessions | fzf | cut -d: -f1 | xargs tmux switch-client -t"
bind-key T choose-tree

The first one makes any new window start in the current pane's directory. The second one is more interesting: it lists all open sessions in the fuzzy finder. I rely heavily on this during my daily workflow to switch between various sessions depending on the task, e.g. from a remote cluster SSH session to a local code editor.

The third one, choose-tree, opens a tree view in Tmux listing all sessions and windows. This one is handy to get a better overview of what is currently running in any local Tmux session. It looks like this (it also allows me to press a hotkey to switch to a particular Tmux window):

Tmux session tree view


The last remaining lines in my configuration file are:

bind-key p setw synchronize-panes off
bind-key P setw synchronize-panes on
bind-key r source-file ~/.config/tmux/tmux.conf \; display-message "tmux.conf reloaded"

We discussed synchronized panes earlier. I use it all the time in clustered SSH sessions. When enabled, all panes (remote SSH sessions) receive the same keystrokes. This is very useful when you want to run the same commands on many servers at once, such as navigating to a common directory, restarting a couple of services at once, or running tools like htop to quickly monitor system resources.

The last one reloads my Tmux configuration on the fly.

E-Mail your comments to paul@nospam.buetow.org :-)

Other related posts are:

2026-02-02 A tmux popup editor for Cursor Agent CLI prompts
2025-05-02 Terminal multiplexing with tmux - Fish edition
2024-06-23 Terminal multiplexing with tmux - Z-Shell edition (You are currently reading this)

Back to the main site
Projects I currently don't have time for https://foo.zone/gemfeed/2024-05-03-projects-i-currently-dont-have-time-for.html 2024-05-03T16:23:03+03:00 Paul Buetow aka snonux paul@dev.buetow.org Over the years, I have collected many ideas for my personal projects and noted them down. I am currently in the process of cleaning up all my notes and reviewing those ideas. I don’t have time for the ones listed here and won’t have any soon due to other commitments and personal projects. So, in order to 'get rid of them' from my notes folder, I decided to simply put them in this blog post so that those ideas don't get lost. Maybe I will pick up one or another idea someday in the future, but for now, they are all put on ice in favor of other personal projects or family time.

Projects I currently don't have time for



Published at 2024-05-03T16:23:03+03:00

Over the years, I have collected many ideas for my personal projects and noted them down. I am currently in the process of cleaning up all my notes and reviewing those ideas. I don’t have time for the ones listed here and won’t have any soon due to other commitments and personal projects. So, in order to "get rid of them" from my notes folder, I decided to simply put them in this blog post so that those ideas don't get lost. Maybe I will pick up one or another idea someday in the future, but for now, they are all put on ice in favor of other personal projects or family time.

Art by Laura Brown

.'`~~~~~~~~~~~`'.
(  .'11 12 1'.  )
|  :10 \    2:  |
|  :9   @-> 3:  |
|  :8       4;  |
'. '..7 6 5..' .'
 ~-------------~  ldb


Table of Contents




Hardware projects I don't have time for



I use Arch, btw!



The idea was to build the ultimate Arch Linux setup on an old ThinkPad X200 booting with the open-source LibreBoot firmware, complete with a tiling window manager, dmenu, and all the elite tools. This is mainly for fun, as I am pretty happy (and productive) with my Fedora Linux setup. I ran EndeavourOS (close enough to Arch) on an old ThinkPad for a while, but then I switched back to Fedora because the rolling releases were annoying (there were too many updates).

OpenBSD home router



In my student days, I operated a 486DX PC with OpenBSD as my home DSL internet router. I bought the setup from my brother back then. The router's hostname was fishbone, and it performed very well until it became too slow for larger broadband bandwidth after a few years of use.

I had the idea to revive this concept, implement fishbone2, and place it in front of my proprietary ISP router to add an extra layer of security and control in my home LAN. It would serve as the default gateway for all of my devices, including a Wi-Fi access point, would run a DNS server, Pi-hole proxy, VPN client, and DynDNS client. I would also implement high availability using OpenBSD's CARP protocol.

https://openbsdrouterguide.net
https://pi-hole.net/
https://www.OpenBSD.org
https://www.OpenBSD.org/faq/pf/carp.html

However, I am putting this on hold as I have opted for an OpenWRT-based solution, which was much quicker to set up and runs well enough.

https://OpenWRT.org/

Pi-Hole server



Install Pi-hole on one of my Pis or run it in a container on Freekat. For now, I am putting this on hold as the primary use for this would be ad-blocking, and I am avoiding surfing ad-heavy sites anyway. So there's no significant use for me personally at the moment.

https://pi-hole.net/

Infodash



The idea was to implement my smart info screen using purely open-source software. It would display information such as the health status of my personal infrastructure, my current work tracker balance (I track how much I work to prevent overworking), and my sports balance (I track my workouts to stay within my quotas for general health). The information would be displayed on a small screen in my home office, on my Pine watch, or remotely from any terminal window.

I don't have this, and I haven't missed having it. So I guess it would have been nice to have, but it wouldn't provide any value other than the "fun of tinkering."

Reading station



I wanted to create the most comfortable setup possible for reading digital notes, articles, and books. This would include a comfy armchair, a silent barebone PC or Raspberry Pi running either Linux or *BSD, and an e-Ink display mounted on a flexible arm/stand. There would also be a small table for my paper journal for occasional note-taking. There is plenty of open-source software available for PDF and ePub reading. It would have been neat, but I am currently using the most straightforward solution: a Kobo Elipsa 2E, which I can use on my sofa.

Retro station



I had an idea to build a computer infused with retro elements. It wouldn't use actual retro hardware but would look and feel like a retro machine. I would call this machine HAL or Retron.

I would use an old ThinkPad laptop placed on a horizontal stand, running NetBSD, with a keyboard from ModelFkeyboards attached. I would use WindowMaker as the window manager and run terminal applications through Cool Retro Term. For the monitor, I would use an older (black) EIZO model with large bezels.

https://www.NetBSD.org
https://www.modelfkeyboards.com
https://github.com/Swordfish90/cool-retro-term

The computer would occasionally be used to surf the Gemini space, take notes, blog, or do light coding. However, I have abandoned the project for now because there isn't enough space in my apartment, as my daughter will have a room for herself.

Sound server



My idea involved using a barebone mini PC running FreeBSD with the Navidrome sound server software. I could remotely connect to it from my phone or workstation/laptop to listen to my music collection. The storage would be based on ZFS with at least two drives for redundancy. The app would run in a Linux Docker container under FreeBSD via Bhyve.

https://github.com/navidrome/navidrome
https://wiki.freebsd.org/bhyve

Project Freekat



My idea involved purchasing the Meerkat mini PC from System76 and installing FreeBSD. Like the sound-server idea (see previous idea), it would run Linux Docker through Bhyve. I would self-host a bunch of applications on it:

  • Wallabag
  • Ankidroid
  • Miniflux & Postgres
  • Audiobookshelf
  • ...

All of this would be within my LAN, but the services would also be accessible from the internet through either Wireguard or SSH reverse tunnels to one of my OpenBSD VMs, for example:

  • wallabag.awesome.buetow.org
  • ankidroid.awesome.buetow.org
  • miniflux.awesome.buetow.org
  • audiobookshelf.awesome.buetow.org
  • ...

I am abandoning this project for now, as I am currently hosting my apps on AWS ECS Fargate under *.cool.buetow.org, which is "good enough" for the time being and also offers the benefit of learning to use AWS and Terraform, knowledge that can be applied at work.

My personal AWS setup

Programming projects I don't have time for



CLI-HIVE



This was a pet project idea that my brother and I had. The concept was to collect all shell history of all servers at work in a central place, apply ML/AI, and return suggestions for commands to type or allow a fuzzy search on all the commands in the history. The recommendations for the commands on a server could be context-based (e.g., past occurrences on the same server type).

You could decide whether to share your command history with others so they would receive better suggestions depending on which server they are on, or you could keep all the history private and secure. The plan was to add hooks into zsh and bash shells so that all commands typed would be pushed to the central location for data mining.
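Such a hook could be sketched like this; the spool-file approach, its path, and the tab-separated record format are purely illustrative assumptions:

```shell
# Append every command about to run to a local spool file, which a
# separate uploader would later push to the central CLI-HIVE service.
CLI_HIVE_SPOOL="${CLI_HIVE_SPOOL:-$HOME/.cli-hive-spool}"

cli_hive_log () {
    # $1 is the full command line as typed
    printf '%s\t%s\t%s\n' "$(hostname)" "$(date +%s)" "$1" >> "$CLI_HIVE_SPOOL"
}

# In zsh, preexec functions run right before every command:
# preexec_functions+=(cli_hive_log)
# In bash, a DEBUG trap achieves something similar:
# trap 'cli_hive_log "$BASH_COMMAND"' DEBUG
```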

Enhanced KISS home photo albums



I don't use third-party cloud providers such as Google Photos to store/archive my photos. Instead, they are all on a ZFS volume on my home NAS, with regular offsite backups taken. Thus, my project would involve implementing the features I miss most or finding a solution simple enough to host on my LAN:

  • A feature I miss is one that presents me with a random day from the past and some photos from that day. This project would randomly select a day and generate a photo album for me to view and reminisce about.
  • Another feature I miss is the ability to automatically deduplicate all the photos, as I am sure there are tons of duplicates on my NAS.
  • Auto-enhancing the photos (perhaps using ImageMagick?)
  • I already have a simple photoalbum.sh script that generates an album based on an input directory. However, it would be great also to have a timeline feature to enable browsing through different dates.
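The deduplication point from the list above could start with nothing more than grouping files by checksum; a minimal sketch, assuming GNU coreutils (sha256sum, and uniq with -w/-D):

```shell
# Report groups of byte-identical files by SHA-256. GNU uniq's
# -w64 compares only the first 64 characters (the hash), and -D
# prints every line that belongs to a duplicate group.
find_dupes () {
    find "$1" -type f -exec sha256sum {} + | sort | uniq -w64 -D
}
```

Running find_dupes against the photo directory would print one line per duplicate file, hash first, so the copies are easy to review before deleting anything.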

KISS static web photo albums with photoalbum.sh

KISS file sync server with end-to-end encryption



I aimed to have a simple server to which I could sync notes and other documents, ensuring that the data is fully end-to-end encrypted. This way, only the clients could decrypt the data, while an encrypted copy of all the data would be stored on the server side. There are a few solutions (e.g., NextCloud), but they are bloated or complex to set up.
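Approximating end-to-end encryption on top of a dumb sync server can be as simple as encrypting client-side before a file ever leaves the machine; this openssl sketch is only an illustration of the idea, not a concrete design:

```shell
# Encrypt/decrypt locally so the sync server only ever sees
# ciphertext; only clients knowing the passphrase can read the data.
e2e_encrypt () {
    # $1 = file, $2 = passphrase; writes $1.enc
    openssl enc -aes-256-cbc -pbkdf2 -pass "pass:$2" -in "$1" -out "$1.enc"
}

e2e_decrypt () {
    # $1 = encrypted file, $2 = passphrase; writes the .enc name with .dec
    openssl enc -d -aes-256-cbc -pbkdf2 -pass "pass:$2" -in "$1" -out "${1%.enc}.dec"
}
```

A real solution would derive per-device keys instead of sharing one passphrase; the point is merely that the server never needs the plaintext.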

I currently use Syncthing for encrypted file sync across all my devices; however, the data is not end-to-end encrypted. It's a good-enough setup, though, as my Syncthing server is in my home LAN on an encrypted file system.

https://syncthing.net

I also had the idea of using this as a pet project for work and naming it Cryptolake, utilizing post-quantum-safe encryption algorithms and a distributed data store.

A language that compiles to bash



I had an idea to implement a higher-level language with strong typing that could be compiled into native Bash code. This would make all resulting Bash scripts more robust and secure by default. The project would involve developing a parser, lexer, and a Bash code generator. I planned to implement this in Go.

I had previously implemented a tiny scripting language called Fype (For Your Program Execution), which could have served as inspiration.

The Fype Programming Language

A language that compiles to sed



This is similar to the previous idea, but the difference is that the language would compile into a sed script. Sed has many features, but the brief syntax makes scripts challenging to read. The higher-level language would mimic sed but in a form that is easier for humans to read.

Renovate VS-Sim



VS-Sim is an open-source simulator programmed in Java for distributed systems. VS-Sim stands for "Verteilte Systeme Simulator," the German translation for "Distributed Systems Simulator." The VS-Sim project was my diploma thesis at Aachen University of Applied Sciences.

https://codeberg.org/snonux/vs-sim

The ideas I had were:

  • Translate the project into English.
  • Modernise the Java codebase to be compatible with the latest JDK.
  • Make it compile to native binaries using GraalVM.
  • Distribute the project using AppImages.

I have put this project on hold for now, as I want to do more things in Go and fewer in Java in my personal time.

KISS ticketing system



My idea was to program a KISS (Keep It Simple, Stupid) ticketing system for my personal use. However, I am abandoning this project because I now use the excellent Taskwarrior software. You can learn more about it at:

https://taskwarrior.org/

A domain-specific language (DSL) for work



At work, an internal service allocates storage space for our customers on our storage clusters. It automates many tasks, but many tweaks are accessible through APIs. I had the idea to implement a Ruby-based DSL that would make using all those APIs for ad-hoc changes effortless, e.g.:

Cluster :UK, :uk01 do
  Customer.C1A1.segments.volumes.each do |volume|
    puts volume.usage_stats
    volume.move_off! if volume.over_subscribed?
  end
end

I am abandoning this project because my workplace has stopped the annual pet project competition, and I have other more important projects to work on at the moment.

Creative universe (Work pet project contests)

Self-hosting projects I don't have time for



My own Matrix server



I value privacy. It would be great to run my own Matrix server for communication within my family. I haven't had time to look into this more closely yet.

https://matrix.org

Ampache music server



Ampache is an open-source music streaming server that allows you to host and manage your music collection online, accessible via a web interface. Setting it up involves configuring a web server, installing Ampache, and organising your music files, which can be time-consuming.

Librum eBook reader



Librum is a self-hostable e-book reader that allows users to manage and read their e-book collection from a web interface. Designed to be a self-contained platform where users can upload, organise, and access their e-books, Librum emphasises privacy and control over one's digital library.

https://github.com/Librum-Reader/Librum

I am using my Kobo devices or my laptop to read these kinds of things for now.

Memos - Note-taking service



Memos is a note-taking service that simplifies and streamlines information capture and organisation. It focuses on providing users with a minimalistic and intuitive interface, aiming to enhance productivity without the clutter commonly associated with more complex note-taking apps.

https://www.usememos.com

I am abandoning this idea for now, as I am currently using plain Markdown files for notes and syncing them with Syncthing across my devices.

Bepasty server



Bepasty is like a Pastebin for all kinds of files (text, image, audio, video, documents, binary, etc.). It seems very neat, but I only share a little nowadays. When I do, I upload files via SCP to one of my OpenBSD VMs and serve them via vanilla httpd there, keeping it KISS.

https://github.com/bepasty/bepasty-server

Books I don't have time to read



Fluent Python



I consider myself an advanced programmer in Ruby, Bash, and Perl. However, Python seems to be ubiquitous nowadays, and most of my colleagues prefer Python over any other language. Thus, it makes sense for me to also learn and use Python. After conducting some research, "Fluent Python" appears to be the best book for this purpose.

I don't have time to read this book at the moment, as I am focusing more on Go (Golang) and I know just enough Python to get by (e.g., for code reviews). Additionally, there are still enough colleagues around who can review my Ruby or Bash code.

Programming Ruby



I've read a couple of Ruby books already, but "Programming Ruby," which covers up to Ruby 3.2, was just recently released. I would like to read this to deepen my Ruby knowledge further and to revisit some concepts that I may have forgotten.

As stated in this blog post, I am currently more eager to focus on Go, so I've put the Ruby book on hold. Additionally, there wouldn't be enough colleagues who could "understand" my advanced Ruby skills anyway, as most of them are either Java developers or SREs who don't code a lot.

Peter F. Hamilton science fiction books



I am a big fan of science fiction, but my reading list is currently too long anyway. So, I've put the Hamilton books on the back burner for now. You can see all the novels I've read here:

https://paul.buetow.org/novels.html
https://paul.buetow.org/novels.gmi


New websites I don't have time for



Create a "Why Raku Rox" site



The website "Why Raku Rox" would showcase the unique features and benefits of the Raku programming language and highlight why it is an exceptional choice for developers. Raku, originally known as Perl 6, is a dynamic, expressive language designed for flexible and powerful software development.

This would be similar to the "Why OpenBSD rocks" site:

https://why-openbsd.rocks
https://raku.org

I am not working on this for now, as I currently don’t even have time to program in Raku.

Research projects I don't have time for



Project secure



For work: Implement a PoC that dumps Java heaps to extract secrets from memory. Based on the findings, write a Java program that encrypts secrets in the kernel using the memfd_secret() syscall to make it even more secure.

https://lwn.net/Articles/865256/

Due to other priorities, I am putting this on hold for now. The software we have built is pretty damn secure already!

CPU utilisation is all wrong



This research project, based on Brendan Gregg's blog post, could potentially significantly impact my work.

https://brendangregg.com/blog/2017-05-09/cpu-utilization-is-wrong.html

The research project would involve setting up dashboards that display actual CPU usage and the cycles versus waiting time for memory access.

E-Mail your comments to paul@nospam.buetow.org :-)

Related and maybe interesting:

Sweating the small stuff - Tiny projects of mine

Back to the main site