3 files changed, 1412 insertions, 66 deletions
diff --git a/gemfeed/2025-10-02-f3s-kubernetes-with-freebsd-part-7.html b/gemfeed/2025-10-02-f3s-kubernetes-with-freebsd-part-7.html
index 25d807c2..b4ae12dd 100644
--- a/gemfeed/2025-10-02-f3s-kubernetes-with-freebsd-part-7.html
+++ b/gemfeed/2025-10-02-f3s-kubernetes-with-freebsd-part-7.html
@@ -13,7 +13,7 @@
 </p>
 <h1 style='display: inline' id='f3s-kubernetes-with-freebsd---part-7-k3s-and-first-pod-deployments'>f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments</h1><br />
 <br />
-<span class='quote'>Published at 2025-10-02T11:27:19+03:00</span><br />
+<span class='quote'>Published at 2025-10-02T11:27:19+03:00, last updated Tue 30 Dec 10:11:58 EET 2025</span><br />
 <br />
 <span>This is the seventh blog post about the f3s series for my self-hosting demands in a home lab. f3s? The "f" stands for FreeBSD, and the "3s" stands for k3s, the Kubernetes distribution I use on FreeBSD-based physical machines.</span><br />
 <br />
@@ -43,6 +43,8 @@
 <li>⇢ ⇢ <a href='#scaling-traefik-for-faster-failover'>Scaling Traefik for faster failover</a></li>
 <li>⇢ <a href='#make-it-accessible-from-the-public-internet'>Make it accessible from the public internet</a></li>
 <li>⇢ ⇢ <a href='#openbsd-relayd-configuration'>OpenBSD relayd configuration</a></li>
+<li>⇢ ⇢ <a href='#automatic-failover-when-f3s-cluster-is-down'>Automatic failover when f3s cluster is down</a></li>
+<li>⇢ ⇢ <a href='#openbsd-httpd-fallback-configuration'>OpenBSD httpd fallback configuration</a></li>
 <li>⇢ <a href='#deploying-the-private-docker-image-registry'>Deploying the private Docker image registry</a></li>
 <li>⇢ ⇢ <a href='#prepare-the-nfs-backed-storage'>Prepare the NFS-backed storage</a></li>
 <li>⇢ ⇢ <a href='#install-or-upgrade-the-chart'>Install (or upgrade) the chart</a></li>
@@ -672,10 +674,11 @@ table &lt;f3s&gt; {
 }
 </pre>
 <br />
-<span>Inside the <span class='inlinecode'>http protocol "https"</span> block each public hostname gets its Let&#39;s Encrypt certificate and is matched to that backend table. Besides the primary trio, every service-specific hostname (<span class='inlinecode'>anki</span>, <span class='inlinecode'>bag</span>, <span class='inlinecode'>flux</span>, <span class='inlinecode'>audiobookshelf</span>, <span class='inlinecode'>gpodder</span>, <span class='inlinecode'>radicale</span>, <span class='inlinecode'>vault</span>, <span class='inlinecode'>syncthing</span>, <span class='inlinecode'>uprecords</span>) and their <span class='inlinecode'>www</span> / <span class='inlinecode'>standby</span> aliases reuse the same pool so new apps can go live just by publishing an ingress rule, whereas they will all map to a service running in k3s:</span><br />
+<span>Inside the <span class='inlinecode'>http protocol "https"</span> block, each public hostname gets its Let&#39;s Encrypt certificate via a <span class='inlinecode'>tls keypair</span> entry, covering all f3s services as well as the other public endpoints. The f3s hosts deliberately have no explicit <span class='inlinecode'>forward to</span> rules in the protocol; they rely on the relay-level failover mechanism described later. Non-f3s hosts get explicit localhost routing so that they never try the f3s backends:</span><br />
 <br />
 <pre>
 http protocol "https" {
+    # TLS certificates for all f3s services
     tls keypair f3s.foo.zone
     tls keypair www.f3s.foo.zone
     tls keypair standby.f3s.foo.zone
@@ -707,36 +710,15 @@ http protocol "https" {
     tls keypair www.uprecords.f3s.foo.zone
     tls keypair standby.uprecords.f3s.foo.zone
 
-    match request quick header "Host" value "f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "www.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "standby.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "anki.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "www.anki.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "standby.anki.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "bag.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "www.bag.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "standby.bag.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "flux.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "www.flux.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "standby.flux.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "audiobookshelf.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "www.audiobookshelf.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "standby.audiobookshelf.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "gpodder.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "www.gpodder.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "standby.gpodder.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "radicale.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "www.radicale.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "standby.radicale.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "vault.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "www.vault.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "standby.vault.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "syncthing.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "www.syncthing.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "standby.syncthing.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "uprecords.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "www.uprecords.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "standby.uprecords.f3s.foo.zone" forward to &lt;f3s&gt;
+    # Explicitly route non-f3s hosts to localhost
+    match request header "Host" value "foo.zone" forward to &lt;localhost&gt;
+    match request header "Host" value "www.foo.zone" forward to &lt;localhost&gt;
+    match request header "Host" value "dtail.dev" forward to &lt;localhost&gt;
+    # ... other non-f3s hosts ...
+
+    # NOTE: f3s hosts have NO match rules here!
+    # They use relay-level failover (f3s -&gt; localhost backup)
+    # See the relay configuration below for automatic failover details
 }
 </pre>
 <br />
@@ -746,18 +728,143 @@ http protocol "https" {
 relay "https4" {
     listen on 46.23.94.99 port 443 tls
     protocol "https"
+    # Primary: f3s cluster (with health checks) - Falls back to localhost when all hosts down
     forward to &lt;f3s&gt; port 80 check tcp
+    forward to &lt;localhost&gt; port 8080
 }
 
 relay "https6" {
     listen on 2a03:6000:6f67:624::99 port 443 tls
     protocol "https"
+    # Primary: f3s cluster (with health checks) - Falls back to localhost when all hosts down
     forward to &lt;f3s&gt; port 80 check tcp
+    forward to &lt;localhost&gt; port 8080
 }
 </pre>
 <br />
 <span>In practice, that means relayd terminates TLS with the correct certificate, keeps the three WireGuard-connected backends in rotation, and ships each request to whichever bhyve VM answers first.</span><br />
 <br />
+<h3 style='display: inline' id='automatic-failover-when-f3s-cluster-is-down'>Automatic failover when f3s cluster is down</h3><br />
+<br />
+<span class='quote'>Update: This section was added at Tue 30 Dec 10:11:44 EET 2025</span><br />
+<br />
+<span>One important aspect of this setup is graceful degradation: when all three f3s nodes are unreachable (e.g., during maintenance or a power outage in my LAN), users should see a friendly status page instead of an error message.</span><br />
+<br />
+<span>OpenBSD&#39;s relayd supports automatic failover through its health check mechanism. According to the relayd.conf manual:</span><br />
+<br />
+<span class='quote'>This directive can be specified multiple times - subsequent entries will be used as the backup table if all hosts in the previous table are down.</span><br />
+<br />
+<span>The key is the order of <span class='inlinecode'>forward to</span> statements in the relay configuration. By placing the f3s table first with <span class='inlinecode'>check tcp</span> health checks, followed by localhost as a backup, relayd automatically routes traffic based on backend availability:</span><br />
+<br />
+<span>When the f3s cluster is UP:</span><br />
+<br />
+<ul>
+<li>Health checks on port 80 succeed for f3s nodes</li>
+<li>All f3s traffic routes to the Kubernetes cluster</li>
+<li>Localhost backup remains idle</li>
+</ul><br />
+<span>When the f3s cluster is DOWN:</span><br />
+<br />
+<ul>
+<li>All health checks fail (nodes unreachable)</li>
+<li>The <span class='inlinecode'>&lt;f3s&gt;</span> table becomes unavailable</li>
+<li>Traffic automatically falls back to <span class='inlinecode'>&lt;localhost&gt;</span> on port 8080</li>
+<li>OpenBSD&#39;s httpd serves a static fallback page</li>
+</ul><br />
+<pre>
+# NEW configuration - supports automatic failover
+http protocol "https" {
+    # Explicitly route non-f3s hosts to localhost
+    match request header "Host" value "foo.zone" forward to &lt;localhost&gt;
+    match request header "Host" value "dtail.dev" forward to &lt;localhost&gt;
+    # ... other non-f3s hosts ...
+
+    # f3s hosts have NO protocol rules - they use relay-level failover
+    # (no match rules for f3s.foo.zone, anki.f3s.foo.zone, etc.)
+}
+
+relay "https4" {
+    # f3s FIRST (with health checks), localhost as BACKUP
+    forward to &lt;f3s&gt; port 80 check tcp
+    forward to &lt;localhost&gt; port 8080
+}
+</pre>
+<br />
+<span>This way, f3s traffic uses the relay&#39;s default behavior: try the first table, fall back to the second when health checks fail.</span><br />
+<br />
+<h3 style='display: inline' id='openbsd-httpd-fallback-configuration'>OpenBSD httpd fallback configuration</h3><br />
+<br />
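+<span>The localhost httpd service on port 8080 serves the fallback content from <span class='inlinecode'>/var/www/htdocs/f3s_fallback/</span>. Because OpenBSD&#39;s httpd runs chrooted to <span class='inlinecode'>/var/www</span> by default, the configuration refers to this directory as <span class='inlinecode'>/htdocs/f3s_fallback</span>. The directory contains a simple HTML page explaining the situation, and httpd is configured as follows:</span><br />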
+<br />
+<pre>
+# OpenBSD httpd.conf
+# Fallback for f3s hosts
+server "f3s.foo.zone" {
+  listen on * port 8080
+  log style forwarded
+  location * {
+    root "/htdocs/f3s_fallback"
+    directory auto index
+  }
+}
+
+server "anki.f3s.foo.zone" {
+  listen on * port 8080
+  log style forwarded
+  location * {
+    root "/htdocs/f3s_fallback"
+    directory auto index
+  }
+}
+
+# ... similar blocks for all f3s hostnames ...
+</pre>
+<br />
+<span>The fallback page itself is straightforward:</span><br />
+<br />
+<!-- Generator: GNU source-highlight 3.1.9
+by Lorenzo Bettini
+http://www.lorenzobettini.it
+http://www.gnu.org/software/src-highlite -->
+<pre><b><u><font color="#000000">&lt;!DOCTYPE</font></u></b> <b><font color="#000000">html</font></b><b><u><font color="#000000">&gt;</font></u></b>
+<b><u><font color="#000000">&lt;html&gt;</font></u></b>
+<b><u><font color="#000000">&lt;head&gt;</font></u></b>
+    <b><u><font color="#000000">&lt;title&gt;</font></u></b>Server turned off<b><u><font color="#000000">&lt;/title&gt;</font></u></b>
+    <b><u><font color="#000000">&lt;style&gt;</font></u></b>
+        body {
+            font-family: <font color="#808080">sans-serif</font>;
+            text-align: <font color="#808080">center</font>;
+            padding-top: <font color="#808080">50px</font>;
+        }
+        .container {
+            max-width: <font color="#808080">600px</font>;
+            margin: <font color="#808080">0</font> <font color="#808080">auto</font>;
+        }
+    <b><u><font color="#000000">&lt;/style&gt;</font></u></b>
+<b><u><font color="#000000">&lt;/head&gt;</font></u></b>
+<b><u><font color="#000000">&lt;body&gt;</font></u></b>
+    <b><u><font color="#000000">&lt;div</font></u></b> <b><font color="#000000">class</font></b>=<font color="#808080">"container"</font><b><u><font color="#000000">&gt;</font></u></b>
+        <b><u><font color="#000000">&lt;h1&gt;</font></u></b>Server turned off<b><u><font color="#000000">&lt;/h1&gt;</font></u></b>
+        <b><u><font color="#000000">&lt;p&gt;</font></u></b>The servers are all currently turned off.<b><u><font color="#000000">&lt;/p&gt;</font></u></b>
+        <b><u><font color="#000000">&lt;p&gt;</font></u></b>Please try again later.<b><u><font color="#000000">&lt;/p&gt;</font></u></b>
+        <b><u><font color="#000000">&lt;p&gt;</font></u></b>Or email <b><u><font color="#000000">&lt;a</font></u></b> <b><font color="#000000">href</font></b>=<font color="#808080">"mailto:paul@nospam.buetow.org"</font><b><u><font color="#000000">&gt;</font></u></b>paul@nospam.buetow.org<b><u><font color="#000000">&lt;/a&gt;</font></u></b>
+           - so I can turn them back on for you!<b><u><font color="#000000">&lt;/p&gt;</font></u></b>
+    <b><u><font color="#000000">&lt;/div&gt;</font></u></b>
+<b><u><font color="#000000">&lt;/body&gt;</font></u></b>
+<b><u><font color="#000000">&lt;/html&gt;</font></u></b>
+</pre>
+<br />
+<span>This approach provides several benefits:</span><br />
+<br />
+<ul>
+<li>Automatic detection: Health checks run continuously; no manual intervention needed</li>
+<li>Fast fallback: Once the health checks have marked all f3s nodes as down, new requests are automatically routed to localhost</li>
+<li>Transparent recovery: When f3s comes back online, health checks pass and traffic resumes automatically</li>
+<li>User experience: Visitors see a helpful message instead of connection errors</li>
+<li>No DNS changes: The same hostnames work whether f3s is up or down</li>
+</ul><br />
+<span>This fallback mechanism has proven invaluable during maintenance windows and unexpected outages, ensuring that users always get a response even when the home lab is offline.</span><br />
+<br />
 <h2 style='display: inline' id='deploying-the-private-docker-image-registry'>Deploying the private Docker image registry</h2><br />
 <br />
 <span>As not all Docker images I want to deploy are available on public Docker registries and as I also build some of them by myself, there is the need of a private registry. </span><br />
diff --git a/gemfeed/DRAFT-f3s-kubernetes-with-freebsd-part-8b.html b/gemfeed/DRAFT-f3s-kubernetes-with-freebsd-part-8b.html
new file mode 100644
index 00000000..d598c631
--- /dev/null
+++ b/gemfeed/DRAFT-f3s-kubernetes-with-freebsd-part-8b.html
@@ -0,0 +1,1132 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
+<head>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+<title>f3s: Kubernetes with FreeBSD - Part 9: Enabling etcd Metrics</title>
+<link rel="shortcut icon" type="image/gif" href="/favicon.ico" />
+<link rel="stylesheet" href="../style.css" />
+<link rel="stylesheet" href="style-override.css" />
+</head>
+<body>
+<p class="header">
+<a href="https://foo.zone">Home</a> | <a href="https://codeberg.org/snonux/foo.zone/src/branch/content-md/gemfeed/DRAFT-f3s-kubernetes-with-freebsd-part-8b.md">Markdown</a> | <a href="gemini://foo.zone/gemfeed/DRAFT-f3s-kubernetes-with-freebsd-part-8b.gmi">Gemini</a>
+</p>
+<h1 style='display: inline' id='f3s-kubernetes-with-freebsd---part-9-enabling-etcd-metrics'>f3s: Kubernetes with FreeBSD - Part 9: Enabling etcd Metrics</h1><br />
+<br />
+<h2 style='display: inline' id='introduction'>Introduction</h2><br />
+<br />
+<span>This post covers enabling etcd metrics monitoring for the k3s cluster. The etcd dashboard in Grafana initially showed no data because k3s uses an embedded etcd that doesn&#39;t expose metrics by default.</span><br />
+<br />
+<a class='textlink' href='./2025-12-07-f3s-kubernetes-with-freebsd-part-8.html'>Part 8: Observability</a><br />
+<br />
+<h2 style='display: inline' id='enabling-etcd-metrics-in-k3s'>Enabling etcd metrics in k3s</h2><br />
+<br />
+<span>On each control-plane node (r0, r1, r2), create /etc/rancher/k3s/config.yaml:</span><br />
+<br />
+<pre>
+etcd-expose-metrics: true
+</pre>
+<br />
+<span>Then restart k3s on each node:</span><br />
+<br />
+<pre>
+systemctl restart k3s
+</pre>
+<br />
+<span>After restarting, etcd metrics are available on port 2381:</span><br />
+<br />
+<pre>
+curl http://127.0.0.1:2381/metrics | grep etcd
+</pre>
+<br />
+<h2 style='display: inline' id='configuring-prometheus-to-scrape-etcd'>Configuring Prometheus to scrape etcd</h2><br />
+<br />
+<span>In persistence-values.yaml, enable kubeEtcd with the node IP addresses:</span><br />
+<br />
+<pre>
+kubeEtcd:
+  enabled: true
+  endpoints:
+    - 192.168.1.120
+    - 192.168.1.121
+    - 192.168.1.122
+  service:
+    enabled: true
+    port: 2381
+    targetPort: 2381
+</pre>
+<br />
+<span>Apply the changes:</span><br />
+<br />
+<pre>
+just upgrade
+</pre>
+<br />
+<h2 style='display: inline' id='verifying-etcd-metrics'>Verifying etcd metrics</h2><br />
+<br />
+<span>After the changes, all etcd targets are being scraped:</span><br />
+<br />
+<pre>
+kubectl exec -n monitoring prometheus-prometheus-kube-prometheus-prometheus-0 \
+  -c prometheus -- wget -qO- &#39;http://localhost:9090/api/v1/query?query=etcd_server_has_leader&#39; | \
+  jq -r &#39;.data.result[] | "\(.metric.instance): \(.value[1])"&#39;
+</pre>
+<br />
+<span>Output:</span><br />
+<br />
+<pre>
+192.168.1.120:2381: 1
+192.168.1.121:2381: 1
+192.168.1.122:2381: 1
+</pre>
+<br />
+<span>The etcd dashboard in Grafana now displays metrics including Raft proposals, leader elections, and peer round trip times.</span><br />
+<br />
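+<span>Once etcd_server_has_leader is being scraped, the same metric could also drive an alert. The following is a minimal sketch, assuming the release: prometheus label that the other rules in this post use for rule selection; the resource name, alert name, duration and severity are hypothetical, and the chart may already ship comparable etcd alerts:</span><br />
+<br />
+<pre>
+apiVersion: monitoring.coreos.com/v1
+kind: PrometheusRule
+metadata:
+  name: etcd-leader-alerts          # hypothetical name
+  namespace: monitoring
+  labels:
+    release: prometheus             # assumed to match the rule selector
+spec:
+  groups:
+    - name: etcd-leader
+      rules:
+        - alert: EtcdMemberHasNoLeader   # hypothetical alert name
+          # etcd_server_has_leader is 1 while the member sees a leader
+          expr: etcd_server_has_leader == 0
+          for: 1m
+          labels:
+            severity: critical
+          annotations:
+            summary: "etcd member {{ $labels.instance }} reports no leader"
+</pre>
+<br />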
+<h2 style='display: inline' id='complete-persistence-valuesyaml'>Complete persistence-values.yaml</h2><br />
+<br />
+<span>The complete updated persistence-values.yaml:</span><br />
+<br />
+<pre>
+kubeEtcd:
+  enabled: true
+  endpoints:
+    - 192.168.1.120
+    - 192.168.1.121
+    - 192.168.1.122
+  service:
+    enabled: true
+    port: 2381
+    targetPort: 2381
+
+prometheus:
+  prometheusSpec:
+    additionalScrapeConfigsSecret:
+      enabled: true
+      name: additional-scrape-configs
+      key: additional-scrape-configs.yaml
+    storageSpec:
+      volumeClaimTemplate:
+        spec:
+          storageClassName: ""
+          accessModes: ["ReadWriteOnce"]
+          resources:
+            requests:
+              storage: 10Gi
+          selector:
+            matchLabels:
+              type: local
+              app: prometheus
+
+grafana:
+  persistence:
+    enabled: true
+    type: pvc
+    existingClaim: "grafana-data-pvc"
+
+  initChownData:
+    enabled: false
+
+  podSecurityContext:
+    fsGroup: 911
+    runAsUser: 911
+    runAsGroup: 911
+</pre>
+<br />
+<h2 style='display: inline' id='zfs-monitoring-for-freebsd-servers'>ZFS Monitoring for FreeBSD Servers</h2><br />
+<br />
+<span>The FreeBSD servers (f0, f1, f2) that provide NFS storage to the k3s cluster have ZFS filesystems. Monitoring ZFS is crucial for understanding storage performance and cache efficiency.</span><br />
+<br />
+<h3 style='display: inline' id='node-exporter-zfs-collector'>Node Exporter ZFS Collector</h3><br />
+<br />
+<span>The node_exporter running on each FreeBSD server (v1.9.1) includes a built-in ZFS collector that reads ZFS statistics from sysctls and exposes them as Prometheus metrics. The ZFS collector is enabled by default and provides:</span><br />
+<br />
+<ul>
+<li>ARC (Adaptive Replacement Cache) statistics</li>
+<li>Cache hit/miss rates</li>
+<li>Memory usage and allocation</li>
+<li>MRU/MFU cache breakdown</li>
+<li>Data vs metadata distribution</li>
+</ul><br />
+<h3 style='display: inline' id='verifying-zfs-metrics'>Verifying ZFS Metrics</h3><br />
+<br />
+<span>On any FreeBSD server, check that ZFS metrics are being exposed:</span><br />
+<br />
+<pre>
+paul@f0:~ % curl -s http://localhost:9100/metrics | grep node_zfs_arcstats | wc -l
+      69
+</pre>
+<br />
+<span>The metrics are automatically scraped by Prometheus through the existing static configuration in additional-scrape-configs.yaml, which targets all FreeBSD servers on port 9100 and attaches the os: freebsd label.</span><br />
+<br />
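+<span>For illustration, a static scrape job of that shape might look roughly like the following. This is a sketch rather than the actual file: the job name is hypothetical, and the targets are taken from the instance labels that appear in the query output further down in this post:</span><br />
+<br />
+<pre>
+# Sketch of one entry in additional-scrape-configs.yaml
+- job_name: freebsd-node-exporter   # hypothetical job name
+  static_configs:
+    - targets:
+        - 192.168.2.130:9100
+        - 192.168.2.131:9100
+        - 192.168.2.132:9100
+      labels:
+        os: freebsd                 # label the recording rules filter on
+</pre>
+<br />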
+<h3 style='display: inline' id='zfs-recording-rules'>ZFS Recording Rules</h3><br />
+<br />
+<span>To make the dashboards easier to build, I created recording rules in zfs-recording-rules.yaml:</span><br />
+<br />
+<pre>
+apiVersion: monitoring.coreos.com/v1
+kind: PrometheusRule
+metadata:
+  name: freebsd-zfs-rules
+  namespace: monitoring
+  labels:
+    release: prometheus
+spec:
+  groups:
+    - name: freebsd-zfs-arc
+      interval: 30s
+      rules:
+        - record: node_zfs_arc_hit_rate_percent
+          expr: |
+            100 * (
+              rate(node_zfs_arcstats_hits_total{os="freebsd"}[5m]) /
+              (rate(node_zfs_arcstats_hits_total{os="freebsd"}[5m]) +
+               rate(node_zfs_arcstats_misses_total{os="freebsd"}[5m]))
+            )
+          labels:
+            os: freebsd
+        - record: node_zfs_arc_memory_usage_percent
+          expr: |
+            100 * (
+              node_zfs_arcstats_size_bytes{os="freebsd"} /
+              node_zfs_arcstats_c_max_bytes{os="freebsd"}
+            )
+          labels:
+            os: freebsd
+        # Additional rules for metadata %, target %, MRU/MFU %, etc.
+</pre>
+<br />
+<span>These recording rules calculate:</span><br />
+<br />
+<ul>
+<li>ARC hit rate percentage</li>
+<li>ARC memory usage percentage (current vs maximum)</li>
+<li>ARC target percentage (target vs maximum)</li>
+<li>Metadata vs data percentages</li>
+<li>MRU vs MFU cache percentages</li>
+<li>Demand data and metadata hit rates</li>
+</ul><br />
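+<span>The rules elided from the listing follow the same pattern. As an illustration, the ARC target percentage could be expressed roughly like this. This is a sketch that assumes node_exporter exposes the ARC target size as node_zfs_arcstats_c_bytes (worth confirming in the /metrics output first); the record name is hypothetical:</span><br />
+<br />
+<pre>
+        # Sketch of one more entry under "rules:" in the same group
+        - record: node_zfs_arc_target_percent   # hypothetical name
+          expr: |
+            100 * (
+              node_zfs_arcstats_c_bytes{os="freebsd"} /
+              node_zfs_arcstats_c_max_bytes{os="freebsd"}
+            )
+          labels:
+            os: freebsd
+</pre>
+<br />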
+<h3 style='display: inline' id='grafana-dashboards'>Grafana Dashboards</h3><br />
+<br />
+<span>I created two ZFS monitoring dashboards (zfs-dashboards.yaml):</span><br />
+<br />
+<span>Dashboard 1: FreeBSD ZFS (per-host detailed view)</span><br />
+<br />
+<span>Includes variables to select:</span><br />
+<ul>
+<li>FreeBSD server (f0, f1, or f2)</li>
+<li>ZFS pool (zdata, zroot, or all)</li>
+</ul><br />
+<span>Pool Overview Row:</span><br />
+<ul>
+<li>Pool Capacity gauge (with thresholds: green &lt;70%, yellow &lt;85%, red &gt;=85%)</li>
+<li>Pool Health status (ONLINE/DEGRADED/FAULTED with color coding)</li>
+<li>Total Pool Size stat</li>
+<li>Free Space stat</li>
+<li>Pool Space Usage Over Time (stacked: used + free)</li>
+<li>Pool Capacity Trend time series</li>
+</ul><br />
+<span>Dataset Statistics Row:</span><br />
+<ul>
+<li>Table showing all datasets with columns: Pool, Dataset, Used, Available, Referenced</li>
+<li>Automatically filters by selected pool</li>
+</ul><br />
+<span>ARC Cache Statistics Row:</span><br />
+<ul>
+<li>ARC Hit Rate gauge (red &lt;70%, yellow &lt;90%, green &gt;=90%)</li>
+<li>ARC Size time series (current, target, max)</li>
+<li>ARC Memory Usage percentage gauge</li>
+<li>ARC Hits vs Misses rate</li>
+<li>ARC Data vs Metadata stacked time series</li>
+</ul><br />
+<span>Dashboard 2: FreeBSD ZFS Summary (cluster-wide overview)</span><br />
+<br />
+<span>Cluster-Wide Pool Statistics Row:</span><br />
+<ul>
+<li>Total Storage Capacity across all servers</li>
+<li>Total Used space</li>
+<li>Total Free space</li>
+<li>Average Pool Capacity gauge</li>
+<li>Pool Health Status (worst case across cluster)</li>
+<li>Total Pool Space Usage Over Time</li>
+<li>Per-Pool Capacity time series (all pools on all hosts)</li>
+</ul><br />
+<span>Per-Host Pool Breakdown Row:</span><br />
+<ul>
+<li>Bar gauge showing capacity by host and pool</li>
+<li>Table with all pools: Host, Pool, Size, Used, Free, Capacity %, Health</li>
+</ul><br />
+<span>Cluster-Wide ARC Statistics Row:</span><br />
+<ul>
+<li>Average ARC Hit Rate gauge across all hosts</li>
+<li>ARC Hit Rate by Host time series</li>
+<li>Total ARC Size Across Cluster</li>
+<li>Total ARC Hits vs Misses (cluster-wide sum)</li>
+<li>ARC Size by Host</li>
+</ul><br />
+<span>Dashboard Visualization:</span><br />
+<br />
+<a href='./f3s-kubernetes-with-freebsd-part-8b/grafana-zfs-dashboard.png'><img alt='ZFS monitoring dashboard in Grafana showing pool statistics and ARC cache metrics' title='ZFS monitoring dashboard in Grafana showing pool statistics and ARC cache metrics' src='./f3s-kubernetes-with-freebsd-part-8b/grafana-zfs-dashboard.png' /></a><br />
+<br />
+<h3 style='display: inline' id='deployment'>Deployment</h3><br />
+<br />
+<span>I applied the resources to the cluster:</span><br />
+<br />
+<pre>
+cd /home/paul/git/conf/f3s/prometheus
+kubectl apply -f zfs-recording-rules.yaml
+kubectl apply -f zfs-dashboards.yaml
+</pre>
+<br />
+<span>I also updated the Justfile so that the install and upgrade targets include the ZFS recording rules. The install target now looks like this:</span><br />
+<br />
+<pre>
+install:
+    kubectl apply -f persistent-volumes.yaml
+    kubectl create secret generic additional-scrape-configs --from-file=additional-scrape-configs.yaml -n monitoring --dry-run=client -o yaml | kubectl apply -f -
+    helm install prometheus prometheus-community/kube-prometheus-stack --namespace monitoring -f persistence-values.yaml
+    kubectl apply -f freebsd-recording-rules.yaml
+    kubectl apply -f openbsd-recording-rules.yaml
+    kubectl apply -f zfs-recording-rules.yaml
+    just -f grafana-ingress/Justfile install
+</pre>
+<br />
+<h3 style='display: inline' id='verifying-zfs-metrics-in-prometheus'>Verifying ZFS Metrics in Prometheus</h3><br />
+<br />
+<span>Check that ZFS metrics are being collected:</span><br />
+<br />
+<pre>
+kubectl exec -n monitoring prometheus-prometheus-kube-prometheus-prometheus-0 -c prometheus -- \
+  wget -qO- &#39;http://localhost:9090/api/v1/query?query=node_zfs_arcstats_size_bytes&#39;
+</pre>
+<br />
+<span>Check recording rules are calculating correctly:</span><br />
+<br />
+<pre>
+kubectl exec -n monitoring prometheus-prometheus-kube-prometheus-prometheus-0 -c prometheus -- \
+  wget -qO- &#39;http://localhost:9090/api/v1/query?query=node_zfs_arc_memory_usage_percent&#39;
+</pre>
+<br />
+<span>Example output shows memory usage percentage for each FreeBSD server:</span><br />
+<br />
+<pre>
+"result":[
+  {"metric":{"instance":"192.168.2.130:9100","os":"freebsd"},"value":[...,"37.58"]},
+  {"metric":{"instance":"192.168.2.131:9100","os":"freebsd"},"value":[...,"12.85"]},
+  {"metric":{"instance":"192.168.2.132:9100","os":"freebsd"},"value":[...,"13.44"]}
+]
+</pre>
+<br />
+<h3 style='display: inline' id='accessing-the-dashboards'>Accessing the Dashboards</h3><br />
+<br />
+<span>The dashboards are automatically imported by the Grafana sidecar and accessible at:</span><br />
+<br />
+<a class='textlink' href='https://grafana.f3s.buetow.org'>https://grafana.f3s.buetow.org</a><br />
+<br />
+<span>Navigate to Dashboards and search for:</span><br />
+<ul>
+<li>"FreeBSD ZFS" - detailed per-host view with pool and dataset breakdowns</li>
+<li>"FreeBSD ZFS Summary" - cluster-wide overview of all ZFS storage</li>
+</ul><br />
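+<span>The sidecar discovers dashboards through labelled ConfigMaps. A minimal sketch of such a ConfigMap follows, assuming the chart&#39;s default sidecar label grafana_dashboard with value "1" (the label is configurable in the chart values); the names and the tiny dashboard JSON are placeholders, not the contents of zfs-dashboards.yaml:</span><br />
+<br />
+<pre>
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: example-dashboard            # hypothetical name
+  namespace: monitoring
+  labels:
+    grafana_dashboard: "1"           # assumed default sidecar label
+data:
+  # Each key becomes one dashboard file inside the Grafana pod
+  example-dashboard.json: |
+    {
+      "title": "Example dashboard",
+      "uid": "example-dashboard",
+      "panels": []
+    }
+</pre>
+<br />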
+<h3 style='display: inline' id='key-metrics-to-monitor'>Key Metrics to Monitor</h3><br />
+<br />
+<span>ARC Hit Rate: Should typically be above 90% for optimal performance. Lower hit rates indicate that the ARC is too small or that the workload has poor locality.</span><br />
+<br />
+<span>ARC Memory Usage: Shows how much of the maximum ARC size is being used. If it is consistently at or near the maximum, the ARC is making full use of the memory it is allowed to take.</span><br />
+<br />
+<span>Data vs Metadata: Typically data should dominate, but workloads with many small files will show higher metadata percentages.</span><br />
+<br />
+<span>MRU vs MFU: Most Recently Used vs Most Frequently Used cache. The ratio depends on workload characteristics.</span><br />
+<br />
+<span>Pool Capacity: Monitor pool usage to ensure adequate free space. ZFS performance tends to degrade when pools exceed roughly 80% capacity.</span><br />
+<br />
+<span>Pool Health: Should always show ONLINE (green). DEGRADED (yellow) indicates a disk issue requiring attention. FAULTED (red) requires immediate action.</span><br />
+<br />
+<span>Dataset Usage: Track which datasets are consuming the most space to identify growth trends and plan capacity.</span><br />
+<br />
+<h3 style='display: inline' id='zfs-pool-and-dataset-metrics-via-textfile-collector'>ZFS Pool and Dataset Metrics via Textfile Collector</h3><br />
+<br />
+<span>To complement the ARC statistics from node_exporter&#39;s built-in ZFS collector, I added pool capacity and dataset metrics using the textfile collector feature.</span><br />
+<br />
+<span>I created a script at /usr/local/bin/zfs_pool_metrics.sh on each FreeBSD server:</span><br />
+<br />
+<pre>
+#!/bin/sh
+# ZFS Pool and Dataset Metrics Collector for Prometheus
+
+OUTPUT_FILE="/var/tmp/node_exporter/zfs_pools.prom.$$"
+FINAL_FILE="/var/tmp/node_exporter/zfs_pools.prom"
+
+mkdir -p /var/tmp/node_exporter
+
+{
+    # Pool metrics
+    echo "# HELP zfs_pool_size_bytes Total size of ZFS pool"
+    echo "# TYPE zfs_pool_size_bytes gauge"
+    echo "# HELP zfs_pool_allocated_bytes Allocated space in ZFS pool"
+    echo "# TYPE zfs_pool_allocated_bytes gauge"
+    echo "# HELP zfs_pool_free_bytes Free space in ZFS pool"
+    echo "# TYPE zfs_pool_free_bytes gauge"
+    echo "# HELP zfs_pool_capacity_percent Capacity percentage"
+    echo "# TYPE zfs_pool_capacity_percent gauge"
+    echo "# HELP zfs_pool_health Pool health (0=ONLINE, 1=DEGRADED, 2=FAULTED, 6=any other state)"
+    echo "# TYPE zfs_pool_health gauge"
+
+    zpool list -Hp -o name,size,allocated,free,capacity,health | \
+    while IFS=$&#39;\t&#39; read name size alloc free cap health; do
+        case "$health" in
+            ONLINE)   health_val=0 ;;
+            DEGRADED) health_val=1 ;;
+            FAULTED)  health_val=2 ;;
+            *)        health_val=6 ;;
+        esac
+        cap_num=$(echo "$cap" | sed &#39;s/%//&#39;)
+
+        echo "zfs_pool_size_bytes{pool=\"$name\"} $size"
+        echo "zfs_pool_allocated_bytes{pool=\"$name\"} $alloc"
+        echo "zfs_pool_free_bytes{pool=\"$name\"} $free"
+        echo "zfs_pool_capacity_percent{pool=\"$name\"} $cap_num"
+        echo "zfs_pool_health{pool=\"$name\"} $health_val"
+    done
+
+    # Dataset metrics
+    echo "# HELP zfs_dataset_used_bytes Used space in dataset"
+    echo "# TYPE zfs_dataset_used_bytes gauge"
+    echo "# HELP zfs_dataset_available_bytes Available space"
+    echo "# TYPE zfs_dataset_available_bytes gauge"
+    echo "# HELP zfs_dataset_referenced_bytes Referenced space"
+    echo "# TYPE zfs_dataset_referenced_bytes gauge"
+
+    zfs list -Hp -t filesystem -o name,used,available,referenced | \
+    while IFS=$&#39;\t&#39; read name used avail ref; do
+        pool=$(echo "$name" | cut -d/ -f1)
+        echo "zfs_dataset_used_bytes{pool=\"$pool\",dataset=\"$name\"} $used"
+        echo "zfs_dataset_available_bytes{pool=\"$pool\",dataset=\"$name\"} $avail"
+        echo "zfs_dataset_referenced_bytes{pool=\"$pool\",dataset=\"$name\"} $ref"
+    done
+} &gt; "$OUTPUT_FILE"
+
+mv "$OUTPUT_FILE" "$FINAL_FILE"
+</pre>
+<br />
+<span>I deployed the script to all FreeBSD servers:</span><br />
+<br />
+<pre>
+for host in f0 f1 f2; do
+    scp /tmp/zfs_pool_metrics.sh paul@$host:/tmp/
+    ssh paul@$host &#39;doas mv /tmp/zfs_pool_metrics.sh /usr/local/bin/ &amp;&amp; \
+                    doas chmod +x /usr/local/bin/zfs_pool_metrics.sh&#39;
+done
+</pre>
+<br />
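+<span>Set up cron jobs to run every minute (note that <span class='inlinecode'>crontab -</span> replaces the whole root crontab, so merge any existing entries first):</span><br />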
+<br />
+<pre>
+for host in f0 f1 f2; do
+    ssh paul@$host &#39;echo "* * * * * /usr/local/bin/zfs_pool_metrics.sh &gt;/dev/null 2&gt;&amp;1" | \
+                    doas crontab -&#39;
+done
+</pre>
+<br />
+<span>The textfile collector (already configured with --collector.textfile.directory=/var/tmp/node_exporter) automatically picks up the metrics.</span><br />
+<br />
+<span>Verify metrics are being exposed:</span><br />
+<br />
+<pre>
+paul@f0:~ % curl -s http://localhost:9100/metrics | grep "^zfs_pool" | head -5
+zfs_pool_allocated_bytes{pool="zdata"} 6.47622733824e+11
+zfs_pool_allocated_bytes{pool="zroot"} 5.3338578944e+10
+zfs_pool_capacity_percent{pool="zdata"} 64
+zfs_pool_capacity_percent{pool="zroot"} 10
+zfs_pool_free_bytes{pool="zdata"} 3.48809678848e+11
+</pre>
+<br />
+<h2 style='display: inline' id='summary'>Summary</h2><br />
+<br />
+<span>Enhanced the f3s cluster observability by:</span><br />
+<br />
+<ul>
+<li>Enabling etcd metrics monitoring for the k3s embedded etcd</li>
+<li>Implementing comprehensive ZFS monitoring for FreeBSD storage servers</li>
+<li>Creating recording rules for calculated metrics (ARC hit rates, memory usage, etc.)</li>
+<li>Deploying Grafana dashboards for visualization</li>
+<li>Configuring automatic dashboard import via ConfigMap labels</li>
+</ul><br />
+<span>The monitoring stack now provides visibility into both cluster control plane health (etcd) and storage performance (ZFS).</span><br />
+<br />
+<a class='textlink' href='https://codeberg.org/snonux/conf/src/branch/master/f3s/prometheus'>prometheus configuration on Codeberg</a><br />
+<br />
+<h2 style='display: inline' id='distributed-tracing-with-grafana-tempo'>Distributed Tracing with Grafana Tempo</h2><br />
+<br />
+<span>After implementing logs (Loki) and metrics (Prometheus), the final pillar of observability is distributed tracing. Grafana Tempo provides distributed tracing capabilities that help understand request flows across microservices.</span><br />
+<br />
+<h3 style='display: inline' id='why-distributed-tracing'>Why Distributed Tracing?</h3><br />
+<br />
+<span>In a microservices architecture, a single user request may traverse multiple services. Distributed tracing:</span><br />
+<br />
+<ul>
+<li>Tracks requests across service boundaries</li>
+<li>Identifies performance bottlenecks</li>
+<li>Visualizes service dependencies</li>
+<li>Correlates with logs and metrics</li>
+<li>Helps debug complex distributed systems</li>
+</ul><br />
+<h3 style='display: inline' id='deploying-grafana-tempo'>Deploying Grafana Tempo</h3><br />
+<br />
+<span>Tempo is deployed in monolithic mode, following the same pattern as Loki&#39;s SingleBinary deployment.</span><br />
+<br />
+<span>#### Configuration Strategy</span><br />
+<br />
+<span>**Deployment Mode:** Monolithic (all components in one process)</span><br />
+<ul>
+<li>Simpler operation than microservices mode</li>
+<li>Suitable for the cluster scale</li>
+<li>Consistent with Loki deployment pattern</li>
+</ul><br />
+<span>**Storage:** Filesystem backend using hostPath</span><br />
+<ul>
+<li>10Gi storage at /data/nfs/k3svolumes/tempo/data</li>
+<li>7-day retention (168h)</li>
+<li>Local storage is sufficient at this scale (object storage backends would also work in monolithic mode)</li>
+</ul><br />
+<span>**OTLP Receivers:** Standard OpenTelemetry Protocol ports</span><br />
+<ul>
+<li>gRPC: 4317</li>
+<li>HTTP: 4318</li>
+<li>Bind to 0.0.0.0 to avoid Tempo 2.7+ localhost-only binding issue</li>
+</ul><br />
+<span>#### Tempo Deployment Files</span><br />
+<br />
+<span>Created in /home/paul/git/conf/f3s/tempo/:</span><br />
+<br />
+<span>**values.yaml** - Helm chart configuration:</span><br />
+<br />
+<pre>
+tempo:
+  retention: 168h
+  storage:
+    trace:
+      backend: local
+      local:
+        path: /var/tempo/traces
+      wal:
+        path: /var/tempo/wal
+  receivers:
+    otlp:
+      protocols:
+        grpc:
+          endpoint: 0.0.0.0:4317
+        http:
+          endpoint: 0.0.0.0:4318
+
+persistence:
+  enabled: true
+  size: 10Gi
+  storageClassName: ""
+
+resources:
+  limits:
+    cpu: 1000m
+    memory: 2Gi
+  requests:
+    cpu: 500m
+    memory: 1Gi
+</pre>
+<br />
+<span>**persistent-volumes.yaml** - Storage configuration:</span><br />
+<br />
+<pre>
+apiVersion: v1
+kind: PersistentVolume
+metadata:
+  name: tempo-data-pv
+spec:
+  capacity:
+    storage: 10Gi
+  accessModes:
+    - ReadWriteOnce
+  persistentVolumeReclaimPolicy: Retain
+  hostPath:
+    path: /data/nfs/k3svolumes/tempo/data
+---
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+  name: tempo-data-pvc
+  namespace: monitoring
+spec:
+  storageClassName: ""
+  accessModes:
+    - ReadWriteOnce
+  resources:
+    requests:
+      storage: 10Gi
+</pre>
+<br />
+<span>**Grafana Datasource Provisioning**</span><br />
+<br />
+<span>All Grafana datasources (Prometheus, Alertmanager, Loki, Tempo) are provisioned via a unified ConfigMap that is directly mounted to the Grafana pod. This approach ensures datasources are loaded on startup without requiring sidecar-based discovery.</span><br />
+<br />
+<span>In /home/paul/git/conf/f3s/prometheus/grafana-datasources-all.yaml:</span><br />
+<br />
+<pre>
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: grafana-datasources-all
+  namespace: monitoring
+data:
+  datasources.yaml: |
+    apiVersion: 1
+    datasources:
+      - name: Prometheus
+        type: prometheus
+        uid: prometheus
+        url: http://prometheus-kube-prometheus-prometheus.monitoring:9090/
+        access: proxy
+        isDefault: true
+      - name: Alertmanager
+        type: alertmanager
+        uid: alertmanager
+        url: http://prometheus-kube-prometheus-alertmanager.monitoring:9093/
+      - name: Loki
+        type: loki
+        uid: loki
+        url: http://loki.monitoring.svc.cluster.local:3100
+      - name: Tempo
+        type: tempo
+        uid: tempo
+        url: http://tempo.monitoring.svc.cluster.local:3200
+        jsonData:
+          tracesToLogsV2:
+            datasourceUid: loki
+            spanStartTimeShift: -1h
+            spanEndTimeShift: 1h
+          tracesToMetrics:
+            datasourceUid: prometheus
+          serviceMap:
+            datasourceUid: prometheus
+          nodeGraph:
+            enabled: true
+</pre>
+<br />
+<span>The kube-prometheus-stack Helm values (persistence-values.yaml) are configured to:</span><br />
+<ul>
+<li>Disable sidecar-based datasource provisioning</li>
+<li>Mount grafana-datasources-all ConfigMap directly to /etc/grafana/provisioning/datasources/</li>
+</ul><br />
+<span>This direct mounting approach is simpler and more reliable than sidecar-based discovery.</span><br />
+<br />
+<span>#### Installation</span><br />
+<br />
+<pre>
+cd /home/paul/git/conf/f3s/tempo
+just install
+</pre>
+<br />
+<span>Verify Tempo is running:</span><br />
+<br />
+<pre>
+kubectl get pods -n monitoring -l app.kubernetes.io/name=tempo
+kubectl exec -n monitoring &lt;tempo-pod&gt; -- wget -qO- http://localhost:3200/ready
+</pre>
+<br />
+<h3 style='display: inline' id='configuring-grafana-alloy-for-trace-collection'>Configuring Grafana Alloy for Trace Collection</h3><br />
+<br />
+<span>Updated /home/paul/git/conf/f3s/loki/alloy-values.yaml to add OTLP receivers for traces while maintaining existing log collection.</span><br />
+<br />
+<span>#### OTLP Receiver Configuration</span><br />
+<br />
+<span>Added to Alloy configuration after the log collection pipeline:</span><br />
+<br />
+<pre>
+// OTLP receiver for traces via gRPC and HTTP
+otelcol.receiver.otlp "default" {
+  grpc {
+    endpoint = "0.0.0.0:4317"
+  }
+  http {
+    endpoint = "0.0.0.0:4318"
+  }
+  output {
+    traces = [otelcol.processor.batch.default.input]
+  }
+}
+
+// Batch processor for efficient trace forwarding
+otelcol.processor.batch "default" {
+  timeout = "5s"
+  send_batch_size = 100
+  send_batch_max_size = 200
+  output {
+    traces = [otelcol.exporter.otlp.tempo.input]
+  }
+}
+
+// OTLP exporter to send traces to Tempo
+otelcol.exporter.otlp "tempo" {
+  client {
+    endpoint = "tempo.monitoring.svc.cluster.local:4317"
+    tls {
+      insecure = true
+    }
+    compression = "gzip"
+  }
+}
+</pre>
+<br />
+<span>The batch processor reduces network overhead by accumulating spans before forwarding to Tempo.</span><br />
+<br />
+<span>#### Upgrade Alloy</span><br />
+<br />
+<pre>
+cd /home/paul/git/conf/f3s/loki
+just upgrade
+</pre>
+<br />
+<span>Verify OTLP receivers are listening:</span><br />
+<br />
+<pre>
+kubectl logs -n monitoring -l app.kubernetes.io/name=alloy | grep -i "otlp.*receiver"
+kubectl exec -n monitoring &lt;alloy-pod&gt; -- netstat -ln | grep -E &#39;:(4317|4318)&#39;
+</pre>
+<br />
+<h3 style='display: inline' id='demo-tracing-application'>Demo Tracing Application</h3><br />
+<br />
+<span>Created a three-tier Python application to demonstrate distributed tracing in action.</span><br />
+<br />
+<span>#### Application Architecture</span><br />
+<br />
+<pre>
+User → Frontend (Flask:5000) → Middleware (Flask:5001) → Backend (Flask:5002)
+           ↓                          ↓                        ↓
+                    Alloy (OTLP:4317) → Tempo → Grafana
+</pre>
+<br />
+<span>**Frontend Service:**</span><br />
+<ul>
+<li>Receives HTTP requests at /api/process</li>
+<li>Forwards to middleware service</li>
+<li>Creates parent span for the entire request</li>
+</ul><br />
+<span>**Middleware Service:**</span><br />
+<ul>
+<li>Transforms data at /api/transform</li>
+<li>Calls backend service</li>
+<li>Creates child span linked to frontend</li>
+</ul><br />
+<span>**Backend Service:**</span><br />
+<ul>
+<li>Returns data at /api/data</li>
+<li>Simulates database query (100ms sleep)</li>
+<li>Creates leaf span in the trace</li>
+</ul><br />
+<span>#### OpenTelemetry Instrumentation</span><br />
+<br />
+<span>All services use Python OpenTelemetry libraries:</span><br />
+<br />
+<span>**Dependencies:**</span><br />
+<pre>
+flask==3.0.0
+requests==2.31.0
+opentelemetry-distro==0.49b0
+opentelemetry-exporter-otlp==1.28.0
+opentelemetry-instrumentation-flask==0.49b0
+opentelemetry-instrumentation-requests==0.49b0
+</pre>
+<br />
+<span>**Auto-instrumentation pattern** (used in all services):</span><br />
+<br />
+<!-- Generator: GNU source-highlight 3.1.9
+by Lorenzo Bettini
+http://www.lorenzobettini.it
+http://www.gnu.org/software/src-highlite -->
+<pre><b><u><font color="#000000">from</font></u></b> opentelemetry <b><u><font color="#000000">import</font></u></b> trace
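+<b><u><font color="#000000">from</font></u></b> opentelemetry.sdk.trace <b><u><font color="#000000">import</font></u></b> TracerProvider
+<b><u><font color="#000000">from</font></u></b> opentelemetry.sdk.trace.export <b><u><font color="#000000">import</font></u></b> BatchSpanProcessor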
+<b><u><font color="#000000">from</font></u></b> opentelemetry.exporter.otlp.proto.grpc.trace_exporter <b><u><font color="#000000">import</font></u></b> OTLPSpanExporter
+<b><u><font color="#000000">from</font></u></b> opentelemetry.instrumentation.flask <b><u><font color="#000000">import</font></u></b> FlaskInstrumentor
+<b><u><font color="#000000">from</font></u></b> opentelemetry.instrumentation.requests <b><u><font color="#000000">import</font></u></b> RequestsInstrumentor
+<b><u><font color="#000000">from</font></u></b> opentelemetry.sdk.resources <b><u><font color="#000000">import</font></u></b> Resource
+
+<i><font color="silver"># Define service identity</font></i>
+resource = Resource(attributes={
+    <font color="#808080">"service.name"</font>: <font color="#808080">"frontend"</font>,
+    <font color="#808080">"service.namespace"</font>: <font color="#808080">"tracing-demo"</font>,
+    <font color="#808080">"service.version"</font>: <font color="#808080">"1.0.0"</font>
+})
+
+provider = TracerProvider(resource=resource)
+
+<i><font color="silver"># Export to Alloy</font></i>
+otlp_exporter = OTLPSpanExporter(
+    endpoint=<font color="#808080">"http://alloy.monitoring.svc.cluster.local:4317"</font>,
+    insecure=True
+)
+
+processor = BatchSpanProcessor(otlp_exporter)
+provider.add_span_processor(processor)
+trace.set_tracer_provider(provider)
+
+<i><font color="silver"># Auto-instrument Flask and requests (app is the Flask application object of the service)</font></i>
+FlaskInstrumentor().instrument_app(app)
+RequestsInstrumentor().instrument()
+</pre>
+<br />
+<span>The auto-instrumentation automatically:</span><br />
+<ul>
+<li>Creates spans for HTTP requests</li>
+<li>Propagates trace context via W3C Trace Context headers</li>
+<li>Links parent and child spans across service boundaries</li>
+</ul><br />
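+<span>To make the context propagation less abstract, here is a minimal sketch of the W3C <span class='inlinecode'>traceparent</span> header that travels between the services. It is not how the OpenTelemetry libraries are implemented; the helper names are hypothetical, and the parent span ID in the usage example is made up. Only the header layout (version, 32-hex trace ID, 16-hex parent span ID, 2-hex flags) comes from the W3C Trace Context format.</span><br />
+<br />

```python
import re
import secrets

# Version 00 layout: 00-<trace id>-<parent span id>-<flags>
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent(trace_id=None, sampled=True):
    """Build a header for an outgoing request with a fresh span ID."""
    trace_id = trace_id or secrets.token_hex(16)  # 16 bytes -> 32 hex chars
    span_id = secrets.token_hex(8)                # 8 bytes -> 16 hex chars
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(header):
    """Split an incoming header into its fields."""
    match = TRACEPARENT_RE.match(header)
    if match is None:
        raise ValueError(f"not a version-00 traceparent header: {header!r}")
    trace_id, parent_id, flags = match.groups()
    return {
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 1),
    }


def child_traceparent(incoming):
    """Keep the trace ID of the caller, but announce a new span ID."""
    parsed = parse_traceparent(incoming)
    return new_traceparent(trace_id=parsed["trace_id"], sampled=parsed["sampled"])


if __name__ == "__main__":
    # Trace ID taken from the example trace in this post; span ID is invented.
    incoming = "00-4be1151c0bdcd5625ac7e02b98d95bd5-00f067aa0ba902b7-01"
    print(child_traceparent(incoming))
```

+<span>Because every hop keeps the trace ID and only replaces the span ID, Tempo can later stitch the spans of all three services into one trace.</span><br />
+<br />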
+<span>#### Deployment</span><br />
+<br />
+<span>Created Helm chart in /home/paul/git/conf/f3s/tracing-demo/ with three separate deployments, services, and an ingress.</span><br />
+<br />
+<span>Build and deploy:</span><br />
+<br />
+<pre>
+cd /home/paul/git/conf/f3s/tracing-demo
+just build
+just import
+just install
+</pre>
+<br />
+<span>Verify deployment:</span><br />
+<br />
+<pre>
+kubectl get pods -n services | grep tracing-demo
+kubectl get ingress -n services tracing-demo-ingress
+</pre>
+<br />
+<span>Access the application at:</span><br />
+<br />
+<a class='textlink' href='http://tracing-demo.f3s.buetow.org'>http://tracing-demo.f3s.buetow.org</a><br />
+<br />
+<h3 style='display: inline' id='visualizing-traces-in-grafana'>Visualizing Traces in Grafana</h3><br />
+<br />
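+<span>The Tempo datasource is provisioned through the directly mounted <span class='inlinecode'>grafana-datasources-all</span> ConfigMap described above, so it is available as soon as Grafana starts.</span><br />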
+<br />
+<span>#### Accessing Traces</span><br />
+<br />
+<span>Navigate to Grafana → Explore → Select "Tempo" datasource</span><br />
+<br />
+<span>**Search Interface:**</span><br />
+<ul>
+<li>Search by Trace ID</li>
+<li>Search by service name</li>
+<li>Search by tags</li>
+</ul><br />
+<span>**TraceQL Queries:**</span><br />
+<br />
+<span>Find all traces from demo app:</span><br />
+<pre>
+{ resource.service.namespace = "tracing-demo" }
+</pre>
+<br />
+<span>Find slow requests (&gt;200ms):</span><br />
+<pre>
+{ duration &gt; 200ms }
+</pre>
+<br />
+<span>Find traces from specific service:</span><br />
+<pre>
+{ resource.service.name = "frontend" }
+</pre>
+<br />
+<span>Find errors:</span><br />
+<pre>
+{ status = error }
+</pre>
+<br />
+<span>Combined query - demo app traces that also contain a span with an HTTP 5xx status:</span><br />
+<pre>
+{ resource.service.namespace = "tracing-demo" } &amp;&amp; { span.http.status_code &gt;= 500 }
+</pre>
+<br />
+<span>#### Service Graph Visualization</span><br />
+<br />
+<span>The service graph shows visual connections between services:</span><br />
+<br />
+<span>1. Navigate to Explore → Tempo</span><br />
+<span>2. Enable "Service Graph" view</span><br />
+<span>3. Shows: Frontend → Middleware → Backend with request rates</span><br />
+<br />
+<span>The service graph uses Prometheus metrics generated from trace data.</span><br />
+<br />
+<h3 style='display: inline' id='correlation-between-observability-signals'>Correlation Between Observability Signals</h3><br />
+<br />
+<span>Tempo integrates with Loki and Prometheus to provide unified observability.</span><br />
+<br />
+<span>#### Traces-to-Logs</span><br />
+<br />
+<span>Click on any span in a trace to see related logs:</span><br />
+<br />
+<span>1. View trace in Grafana</span><br />
+<span>2. Click on a span</span><br />
+<span>3. Select "Logs for this span"</span><br />
+<span>4. Loki shows logs filtered by:</span><br />
+<span>   * Time range (span duration ± 1 hour)</span><br />
+<span>   * Service name</span><br />
+<span>   * Namespace</span><br />
+<span>   * Pod</span><br />
+<br />
+<span>This helps correlate what the service was doing when the span was created.</span><br />
+<br />
+<span>#### Traces-to-Metrics</span><br />
+<br />
+<span>View Prometheus metrics for services in the trace:</span><br />
+<br />
+<span>1. View trace in Grafana</span><br />
+<span>2. Select "Metrics" tab</span><br />
+<span>3. Shows metrics like:</span><br />
+<span>   * Request rate</span><br />
+<span>   * Error rate</span><br />
+<span>   * Duration percentiles</span><br />
+<br />
+<span>#### Logs-to-Traces</span><br />
+<br />
+<span>From logs, you can jump to related traces:</span><br />
+<br />
+<span>1. In Loki, logs that contain trace IDs are automatically linked</span><br />
+<span>2. Click the trace ID to view the full trace</span><br />
+<span>3. See the complete request flow</span><br />
+<br />
+<h3 style='display: inline' id='generating-traces-for-testing'>Generating Traces for Testing</h3><br />
+<br />
+<span>Test the demo application:</span><br />
+<br />
+<pre>
+curl http://tracing-demo.f3s.buetow.org/api/process
+</pre>
+<br />
+<span>Load test (generates 50 traces):</span><br />
+<br />
+<pre>
+cd /home/paul/git/conf/f3s/tracing-demo
+just load-test
+</pre>
+<br />
+<span>Each request creates a distributed trace spanning all three services.</span><br />
+<br />
+<h3 style='display: inline' id='verifying-the-complete-pipeline'>Verifying the Complete Pipeline</h3><br />
+<br />
+<span>Check the trace flow end-to-end:</span><br />
+<br />
+<span>**1. Application generates traces:**</span><br />
+<pre>
+kubectl logs -n services -l app=tracing-demo-frontend | grep -i trace
+</pre>
+<br />
+<span>**2. Alloy receives traces:**</span><br />
+<pre>
+kubectl logs -n monitoring -l app.kubernetes.io/name=alloy | grep -i otlp
+</pre>
+<br />
+<span>**3. Tempo stores traces:**</span><br />
+<pre>
+kubectl logs -n monitoring -l app.kubernetes.io/name=tempo | grep -i trace
+</pre>
+<br />
+<span>**4. Grafana displays traces:**</span><br />
+<span>Navigate to Explore → Tempo → Search for traces</span><br />
+<br />
+<h3 style='display: inline' id='practical-example-viewing-a-distributed-trace'>Practical Example: Viewing a Distributed Trace</h3><br />
+<br />
+<span>Let&#39;s generate a trace and examine it in Grafana.</span><br />
+<br />
+<span>**1. Generate a trace by calling the demo application:**</span><br />
+<br />
+<pre>
+curl -H "Host: tracing-demo.f3s.buetow.org" http://r0/api/process
+</pre>
+<br />
+<span>**Response (HTTP 200):**</span><br />
+<br />
+<!-- Generator: GNU source-highlight 3.1.9
+by Lorenzo Bettini
+http://www.lorenzobettini.it
+http://www.gnu.org/software/src-highlite -->
+<pre>{
+  "middleware_response": {
+    "backend_data": {
+      "data": {
+        "id": <font color="#000000">12345</font>,
+        "query_time_ms": <font color="#000000">100.0</font>,
+        "timestamp": "<font color="#808080">2025-12-28T18:35:01.064538</font>",
+        "value": "<font color="#808080">Sample data from backend service</font>"
+      },
+      "service": "<font color="#808080">backend</font>"
+    },
+    "middleware_processed": <b><u><font color="#000000">true</font></u></b>,
+    "original_data": {
+      "source": "<font color="#808080">GET request</font>"
+    },
+    "transformation_time_ms": <font color="#000000">50</font>
+  },
+  "request_data": {
+    "source": "<font color="#808080">GET request</font>"
+  },
+  "service": "<font color="#808080">frontend</font>",
+  "status": "<font color="#808080">success</font>"
+}
+</pre>
+<br />
+<span>**2. Find the trace in Tempo via API:**</span><br />
+<br />
+<span>After a few seconds (for batch export), search for recent traces:</span><br />
+<br />
+<pre>
+kubectl exec -n monitoring tempo-0 -- wget -qO- \
+  &#39;http://localhost:3200/api/search?tags=service.namespace%3Dtracing-demo&amp;limit=5&#39; 2&gt;/dev/null | \
+  python3 -m json.tool
+</pre>
+<br />
+<span>Returns traces including:</span><br />
+<br />
+<!-- Generator: GNU source-highlight 3.1.9
+by Lorenzo Bettini
+http://www.lorenzobettini.it
+http://www.gnu.org/software/src-highlite -->
+<pre>{
+  "traceID": "<font color="#808080">4be1151c0bdcd5625ac7e02b98d95bd5</font>",
+  "rootServiceName": "<font color="#808080">frontend</font>",
+  "rootTraceName": "<font color="#808080">GET /api/process</font>",
+  "durationMs": <font color="#000000">221</font>
+}
+</pre>
+<br />
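+<span>The search URL from step 2 can also be built programmatically, which avoids hand-encoding the <span class='inlinecode'>=</span> inside the tag expression. A minimal sketch, assuming the same Tempo address as in the <span class='inlinecode'>wget</span> call; the helper names <span class='inlinecode'>tag_search_url</span> and <span class='inlinecode'>trace_url</span> are hypothetical:</span><br />
+<br />

```python
from urllib.parse import urlencode

TEMPO_BASE = "http://localhost:3200"  # same address as in the wget call


def tag_search_url(tags: dict[str, str], limit: int = 5) -> str:
    """Build a /api/search URL; tags are joined as space-separated key=value pairs."""
    tag_expr = " ".join(f"{key}={value}" for key, value in tags.items())
    return f"{TEMPO_BASE}/api/search?" + urlencode({"tags": tag_expr, "limit": limit})


def trace_url(trace_id: str) -> str:
    """Build the URL for fetching one trace by its 32-character hex ID."""
    if len(trace_id) != 32 or any(c not in "0123456789abcdef" for c in trace_id):
        raise ValueError(f"unexpected trace ID: {trace_id!r}")
    return f"{TEMPO_BASE}/api/traces/{trace_id}"


if __name__ == "__main__":
    print(tag_search_url({"service.namespace": "tracing-demo"}))
    print(trace_url("4be1151c0bdcd5625ac7e02b98d95bd5"))
```

+<span>The first call produces the same query string as the command in step 2, and the second one yields the URL used for fetching the complete trace.</span><br />
+<br />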
+<span>**3. Fetch complete trace details:**</span><br />
+<br />
+<pre>
+kubectl exec -n monitoring tempo-0 -- wget -qO- \
+  &#39;http://localhost:3200/api/traces/4be1151c0bdcd5625ac7e02b98d95bd5&#39; 2&gt;/dev/null | \
+  python3 -m json.tool
+</pre>
+<br />
+<span>**Trace structure (8 spans across 3 services):**</span><br />
+<br />
+<pre>
+Trace ID: 4be1151c0bdcd5625ac7e02b98d95bd5
+Services: 3 (frontend, middleware, backend)
+
+Service: frontend
+  └─ GET /api/process                 221.10ms  (HTTP server span)
+  └─ frontend-process                 216.23ms  (custom business logic span)
+  └─ POST                             209.97ms  (HTTP client span to middleware)
+
+Service: middleware
+  └─ POST /api/transform              186.02ms  (HTTP server span)
+  └─ middleware-transform             180.96ms  (custom business logic span)
+  └─ GET                              127.52ms  (HTTP client span to backend)
+
+Service: backend
+  └─ GET /api/data                    103.93ms  (HTTP server span)
+  └─ backend-get-data                 102.11ms  (custom business logic span with 100ms sleep)
+</pre>
+<br />
+<span>**4. View the trace in Grafana UI:**</span><br />
+<br />
+<span>Navigate to: Grafana → Explore → Tempo datasource</span><br />
+<br />
+<span>Search using TraceQL:</span><br />
+<pre>
+{ resource.service.namespace = "tracing-demo" }
+</pre>
+<br />
+<span>Or directly open the trace by pasting the trace ID in the search box:</span><br />
+<pre>
+4be1151c0bdcd5625ac7e02b98d95bd5
+</pre>
+<br />
+<span>**5. Trace visualization:**</span><br />
+<br />
+<span>The trace waterfall view in Grafana shows the complete request flow with timing:</span><br />
+<br />
+<a href='./f3s-kubernetes-with-freebsd-part-8b/grafana-tempo-trace.png'><img alt='Distributed trace visualization in Grafana Tempo showing Frontend → Middleware → Backend spans' title='Distributed trace visualization in Grafana Tempo showing Frontend → Middleware → Backend spans' src='./f3s-kubernetes-with-freebsd-part-8b/grafana-tempo-trace.png' /></a><br />
+<br />
+<span>For additional examples of Tempo trace visualization, see also:</span><br />
+<br />
+<a class='textlink' href='https://foo.zone/gemfeed/2025-12-24-x-rag-observability-hackathon.html'>X-RAG Observability Hackathon (more Grafana Tempo screenshots)</a><br />
+<br />
+<span>The trace reveals the distributed request flow:</span><br />
+<ul>
+<li>**Frontend (221ms)**: Receives GET /api/process, executes business logic, calls middleware</li>
+<li>**Middleware (186ms)**: Receives POST /api/transform, transforms data, calls backend</li>
+<li>**Backend (104ms)**: Receives GET /api/data, simulates database query with 100ms sleep</li>
+<li>**Total request time**: 221ms end-to-end</li>
+<li>**Span propagation**: W3C Trace Context headers automatically link all spans</li>
+</ul><br />
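+<span>The span durations listed above also show where the time is actually spent. The following sketch subtracts each child duration from its parent. It assumes that the eight spans form one linear chain (each span being the only child of the previous one), which is an interpretation of the listing rather than something read from the trace JSON; the function name <span class='inlinecode'>self_times</span> is hypothetical.</span><br />
+<br />

```python
# Durations in milliseconds, copied from the trace structure listing.
SPANS = [
    ("frontend", "GET /api/process", 221.10),
    ("frontend", "frontend-process", 216.23),
    ("frontend", "POST", 209.97),
    ("middleware", "POST /api/transform", 186.02),
    ("middleware", "middleware-transform", 180.96),
    ("middleware", "GET", 127.52),
    ("backend", "GET /api/data", 103.93),
    ("backend", "backend-get-data", 102.11),
]


def self_times(spans):
    """Return (service, name, self_time_ms) per span.

    Assumes a linear chain: the next entry is the only child of the current
    one, so self time = own duration - child duration.
    """
    result = []
    for index, (service, name, duration) in enumerate(spans):
        child = spans[index + 1][2] if index + 1 < len(spans) else 0.0
        result.append((service, name, round(duration - child, 2)))
    return result


if __name__ == "__main__":
    for service, name, own in self_times(SPANS):
        print(f"{service:<11} {name:<22} {own:7.2f} ms")
```

+<span>Under that assumption, the middleware business logic span accounts for roughly 53ms of its own time, which fits the 50ms transformation time in the response, and the two HTTP client spans each lose about 24ms between sending the request and the server span of the next service.</span><br />
+<br />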
+<span>**6. Service graph visualization:**</span><br />
+<br />
+<span>The service graph is automatically generated from traces and shows service dependencies. For examples of service graph visualization in Grafana, see the screenshots in the X-RAG Observability Hackathon blog post.</span><br />
+<br />
+<a class='textlink' href='https://foo.zone/gemfeed/2025-12-24-x-rag-observability-hackathon.html'>X-RAG Observability Hackathon (includes service graph screenshots)</a><br />
+<br />
+<span>This visualization helps identify:</span><br />
+<ul>
+<li>Request rates between services</li>
+<li>Average latency for each hop</li>
+<li>Error rates (if any)</li>
+<li>Service dependencies and communication patterns</li>
+</ul><br />
+<h3 style='display: inline' id='storage-and-retention'>Storage and Retention</h3><br />
+<br />
+<span>Monitor Tempo storage usage:</span><br />
+<br />
+<pre>
+kubectl exec -n monitoring &lt;tempo-pod&gt; -- df -h /var/tempo
+</pre>
+<br />
+<span>With 10Gi storage and 7-day retention, the system handles moderate trace volumes. If storage fills up:</span><br />
+<br />
+<ul>
+<li>Reduce retention to 72h (3 days)</li>
+<li>Implement sampling in Alloy</li>
+<li>Increase PV size</li>
+</ul><br />
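+<span>Of these options, sampling is the least obvious one. The idea can be sketched as a deterministic decision based on the trace ID, so that every service keeps or drops the same traces. This is a generic illustration and not the algorithm Alloy uses; the function name <span class='inlinecode'>keep_trace</span> and the ratios are assumptions.</span><br />
+<br />

```python
def keep_trace(trace_id_hex: str, ratio: float) -> bool:
    """Decide from the trace ID alone whether a trace is kept.

    The lower 64 bits of the ID are treated as a uniformly distributed number
    and compared against ratio * 2**64, so roughly `ratio` of all traces pass.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0.0 and 1.0")
    low64 = int(trace_id_hex, 16) & 0xFFFFFFFFFFFFFFFF
    return low64 < int(ratio * 2**64)


if __name__ == "__main__":
    example = "4be1151c0bdcd5625ac7e02b98d95bd5"  # trace ID from this post
    for ratio in (1.0, 0.5, 0.25, 0.0):
        print(ratio, keep_trace(example, ratio))
```

+<span>Because the decision depends only on the trace ID, a trace is either stored completely or not at all, which keeps the remaining traces usable for debugging.</span><br />
+<br />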
+<h3 style='display: inline' id='complete-observability-stack'>Complete Observability Stack</h3><br />
+<br />
+<span>The f3s cluster now has complete observability:</span><br />
+<br />
+<span>**Metrics** (Prometheus):</span><br />
+<ul>
+<li>Cluster resource usage</li>
+<li>Application metrics</li>
+<li>Node metrics (FreeBSD ZFS, OpenBSD edge)</li>
+<li>etcd health</li>
+</ul><br />
+<span>**Logs** (Loki):</span><br />
+<ul>
+<li>All pod logs</li>
+<li>Structured log collection</li>
+<li>Log aggregation and search</li>
+</ul><br />
+<span>**Traces** (Tempo):</span><br />
+<ul>
+<li>Distributed request tracing</li>
+<li>Service dependency mapping</li>
+<li>Performance profiling</li>
+<li>Error tracking</li>
+</ul><br />
+<span>**Visualization** (Grafana):</span><br />
+<ul>
+<li>Unified dashboards</li>
+<li>Correlation between metrics, logs, and traces</li>
+<li>Service graphs</li>
+<li>Alerts</li>
+</ul><br />
+<h3 style='display: inline' id='configuration-files'>Configuration Files</h3><br />
+<br />
+<span>All configuration files are available on Codeberg:</span><br />
+<br />
+<a class='textlink' href='https://codeberg.org/snonux/conf/src/branch/master/f3s/tempo'>Tempo configuration</a><br />
+<a class='textlink' href='https://codeberg.org/snonux/conf/src/branch/master/f3s/loki'>Alloy configuration (updated for traces)</a><br />
+<a class='textlink' href='https://codeberg.org/snonux/conf/src/branch/master/f3s/tracing-demo'>Demo tracing application</a><br />
+<p class="footer">
+	Generated with <a href="https://codeberg.org/snonux/gemtexter">Gemtexter 3.0.1-develop</a> |
+	served by <a href="https://www.OpenBSD.org">OpenBSD</a>/<a href="https://man.openbsd.org/relayd.8">relayd(8)</a>+<a href="https://man.openbsd.org/httpd.8">httpd(8)</a> |
+	<a href="https://foo.zone/site-mirrors.html">Site Mirrors</a>
+	<br />
+	Webring: <a href="https://shring.sh/foo.zone/previous">previous</a> | <a href="https://shring.sh">shring</a> | <a href="https://shring.sh/foo.zone/next">next</a>
+</p>
+</body>
+</html>
diff --git a/gemfeed/atom.xml b/gemfeed/atom.xml
index 7e7f3733..09efca2f 100644
--- a/gemfeed/atom.xml
+++ b/gemfeed/atom.xml
@@ -1,6 +1,6 @@
 <?xml version="1.0" encoding="utf-8"?>
 <feed xmlns="http://www.w3.org/2005/Atom">
-    <updated>2025-12-26T23:33:35+02:00</updated>
+    <updated>2025-12-30T10:15:58+02:00</updated>
     <title>foo.zone feed</title>
     <subtitle>To be in the .zone!</subtitle>
     <link href="https://foo.zone/gemfeed/atom.xml" rel="self" />
@@ -2579,7 +2579,7 @@ p hash.values_at(:a, :c)
         <title>f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments</title>
         <link href="https://foo.zone/gemfeed/2025-10-02-f3s-kubernetes-with-freebsd-part-7.html" />
         <id>https://foo.zone/gemfeed/2025-10-02-f3s-kubernetes-with-freebsd-part-7.html</id>
-        <updated>2025-10-02T11:27:19+03:00</updated>
+        <updated>2025-12-30T10:11:58+02:00</updated>
         <author>
             <name>Paul Buetow aka snonux</name>
             <email>paul@dev.buetow.org</email>
@@ -2589,7 +2589,7 @@ p hash.values_at(:a, :c)
             <div xmlns="http://www.w3.org/1999/xhtml">
                 <h1 style='display: inline' id='f3s-kubernetes-with-freebsd---part-7-k3s-and-first-pod-deployments'>f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments</h1><br />
 <br />
-<span class='quote'>Published at 2025-10-02T11:27:19+03:00</span><br />
+<span class='quote'>Published at 2025-10-02T11:27:19+03:00, last updated Tue 30 Dec 10:11:58 EET 2025</span><br />
 <br />
 <span>This is the seventh blog post about the f3s series for my self-hosting demands in a home lab. f3s? The "f" stands for FreeBSD, and the "3s" stands for k3s, the Kubernetes distribution I use on FreeBSD-based physical machines.</span><br />
 <br />
@@ -2619,6 +2619,8 @@ p hash.values_at(:a, :c)
 <li>⇢ ⇢ <a href='#scaling-traefik-for-faster-failover'>Scaling Traefik for faster failover</a></li>
 <li>⇢ <a href='#make-it-accessible-from-the-public-internet'>Make it accessible from the public internet</a></li>
 <li>⇢ ⇢ <a href='#openbsd-relayd-configuration'>OpenBSD relayd configuration</a></li>
+<li>⇢ ⇢ <a href='#automatic-failover-when-f3s-cluster-is-down'>Automatic failover when f3s cluster is down</a></li>
+<li>⇢ ⇢ <a href='#openbsd-httpd-fallback-configuration'>OpenBSD httpd fallback configuration</a></li>
 <li>⇢ <a href='#deploying-the-private-docker-image-registry'>Deploying the private Docker image registry</a></li>
 <li>⇢ ⇢ <a href='#prepare-the-nfs-backed-storage'>Prepare the NFS-backed storage</a></li>
 <li>⇢ ⇢ <a href='#install-or-upgrade-the-chart'>Install (or upgrade) the chart</a></li>
@@ -3248,10 +3250,11 @@ table &lt;f3s&gt; {
 }
 </pre>
 <br />
-<span>Inside the <span class='inlinecode'>http protocol "https"</span> block each public hostname gets its Let&#39;s Encrypt certificate and is matched to that backend table. Besides the primary trio, every service-specific hostname (<span class='inlinecode'>anki</span>, <span class='inlinecode'>bag</span>, <span class='inlinecode'>flux</span>, <span class='inlinecode'>audiobookshelf</span>, <span class='inlinecode'>gpodder</span>, <span class='inlinecode'>radicale</span>, <span class='inlinecode'>vault</span>, <span class='inlinecode'>syncthing</span>, <span class='inlinecode'>uprecords</span>) and their <span class='inlinecode'>www</span> / <span class='inlinecode'>standby</span> aliases reuse the same pool so new apps can go live just by publishing an ingress rule, whereas they will all map to a service running in k3s:</span><br />
+<span>Inside the <span class='inlinecode'>http protocol "https"</span> block each public hostname gets its Let&#39;s Encrypt certificate. The protocol configures TLS keypairs for all f3s services and other public endpoints. For f3s hosts specifically, there are no explicit <span class='inlinecode'>forward to</span> rules in the protocol—they use the relay-level failover mechanism described later. Non-f3s hosts get explicit localhost routing to prevent them from trying the f3s backends:</span><br />
 <br />
 <pre>
 http protocol "https" {
+    # TLS certificates for all f3s services
     tls keypair f3s.foo.zone
     tls keypair www.f3s.foo.zone
     tls keypair standby.f3s.foo.zone
@@ -3283,36 +3286,15 @@ http protocol "https" {
     tls keypair www.uprecords.f3s.foo.zone
     tls keypair standby.uprecords.f3s.foo.zone
 
-    match request quick header "Host" value "f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "www.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "standby.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "anki.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "www.anki.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "standby.anki.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "bag.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "www.bag.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "standby.bag.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "flux.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "www.flux.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "standby.flux.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "audiobookshelf.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "www.audiobookshelf.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "standby.audiobookshelf.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "gpodder.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "www.gpodder.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "standby.gpodder.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "radicale.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "www.radicale.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "standby.radicale.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "vault.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "www.vault.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "standby.vault.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "syncthing.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "www.syncthing.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "standby.syncthing.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "uprecords.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "www.uprecords.f3s.foo.zone" forward to &lt;f3s&gt;
-    match request quick header "Host" value "standby.uprecords.f3s.foo.zone" forward to &lt;f3s&gt;
+    # Explicitly route non-f3s hosts to localhost
+    match request header "Host" value "foo.zone" forward to &lt;localhost&gt;
+    match request header "Host" value "www.foo.zone" forward to &lt;localhost&gt;
+    match request header "Host" value "dtail.dev" forward to &lt;localhost&gt;
+    # ... other non-f3s hosts ...
+
+    # NOTE: f3s hosts have NO match rules here!
+    # They use relay-level failover (f3s -&gt; localhost backup)
+    # See the relay configuration below for automatic failover details
 }
 </pre>
 <br />
@@ -3322,18 +3304,143 @@ http protocol "https" {
 relay "https4" {
     listen on 46.23.94.99 port 443 tls
     protocol "https"
+    # Primary: f3s cluster (with health checks) - Falls back to localhost when all hosts down
     forward to &lt;f3s&gt; port 80 check tcp
+    forward to &lt;localhost&gt; port 8080
 }
 
 relay "https6" {
     listen on 2a03:6000:6f67:624::99 port 443 tls
     protocol "https"
+    # Primary: f3s cluster (with health checks) - Falls back to localhost when all hosts down
     forward to &lt;f3s&gt; port 80 check tcp
+    forward to &lt;localhost&gt; port 8080
 }
 </pre>
 <br />
 <span>In practice, that means relayd terminates TLS with the correct certificate, keeps the three WireGuard-connected backends in rotation, and ships each request to whichever bhyve VM answers first.</span><br />
 <br />
+<h3 style='display: inline' id='automatic-failover-when-f3s-cluster-is-down'>Automatic failover when f3s cluster is down</h3><br />
+<br />
+<span class='quote'>Update: This section was added on Tue 30 Dec 10:11:44 EET 2025</span><br />
+<br />
+<span>One important aspect of this setup is graceful degradation: when all three f3s nodes are unreachable (e.g., during maintenance or a power outage in my LAN), users should see a friendly status page instead of an error message.</span><br />
+<br />
+<span>OpenBSD&#39;s relayd supports automatic failover through its health check mechanism. According to the relayd.conf manual:</span><br />
+<br />
+<span class='quote'>This directive can be specified multiple times - subsequent entries will be used as the backup table if all hosts in the previous table are down.</span><br />
+<br />
+<span>The key is the order of <span class='inlinecode'>forward to</span> statements in the relay configuration. By placing the f3s table first with <span class='inlinecode'>check tcp</span> health checks, followed by localhost as a backup, relayd automatically routes traffic based on backend availability:</span><br />
+<br />
+<span>When the f3s cluster is UP:</span><br />
+<br />
+<ul>
+<li>Health checks on port 80 succeed for f3s nodes</li>
+<li>All f3s traffic routes to the Kubernetes cluster</li>
+<li>Localhost backup remains idle</li>
+</ul><br />
+<span>When the f3s cluster is DOWN:</span><br />
+<br />
+<ul>
+<li>All health checks fail (nodes unreachable)</li>
+<li>The <span class='inlinecode'>&lt;f3s&gt;</span> table becomes unavailable</li>
+<li>Traffic automatically falls back to <span class='inlinecode'>&lt;localhost&gt;</span> on port 8080</li>
+<li>OpenBSD&#39;s httpd serves a static fallback page</li>
+</ul><br />
+<pre>
+# NEW configuration - supports automatic failover
+http protocol "https" {
+    # Explicitly route non-f3s hosts to localhost
+    match request header "Host" value "foo.zone" forward to &lt;localhost&gt;
+    match request header "Host" value "dtail.dev" forward to &lt;localhost&gt;
+    # ... other non-f3s hosts ...
+
+    # f3s hosts have NO protocol rules - they use relay-level failover
+    # (no match rules for f3s.foo.zone, anki.f3s.foo.zone, etc.)
+}
+
+relay "https4" {
+    # f3s FIRST (with health checks), localhost as BACKUP
+    forward to &lt;f3s&gt; port 80 check tcp
+    forward to &lt;localhost&gt; port 8080
+}
+</pre>
+<br />
+<span>This way, f3s traffic uses the relay&#39;s default behavior: try the first table, fall back to the second when health checks fail.</span><br />
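+<br />
+<span>How quickly relayd notices that the cluster is down depends on its health check settings. As a minimal sketch, the global <span class='inlinecode'>interval</span> and <span class='inlinecode'>timeout</span> options in <span class='inlinecode'>relayd.conf</span> control this. The values below are illustrative assumptions, not necessarily the ones used in this setup:</span><br />
+<br />
+<pre>
+# relayd.conf global options (illustrative values, adjust to taste)
+# Run the health checks every 5 seconds (the default is 10 seconds)
+interval 5
+# Wait up to 1000 ms for each check (the default is 200 ms);
+# the timeout must not exceed the interval
+timeout 1000
+</pre>
+<br />
+<span>A shorter interval means a faster switch to the fallback page, at the cost of more frequent check connections through the WireGuard tunnels.</span><br />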
+<br />
+<h3 style='display: inline' id='openbsd-httpd-fallback-configuration'>OpenBSD httpd fallback configuration</h3><br />
+<br />
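+<span>The localhost httpd service on port 8080 serves the fallback content from <span class='inlinecode'>/var/www/htdocs/f3s_fallback/</span>. Because httpd runs chrooted to <span class='inlinecode'>/var/www</span>, the <span class='inlinecode'>root</span> in the configuration is given relative to that directory. The directory contains a simple HTML page explaining the situation:</span><br />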
+<br />
+<pre>
+# OpenBSD httpd.conf
+# Fallback for f3s hosts
+server "f3s.foo.zone" {
+  listen on * port 8080
+  log style forwarded
+  location * {
+    root "/htdocs/f3s_fallback"
+    directory auto index
+  }
+}
+
+server "anki.f3s.foo.zone" {
+  listen on * port 8080
+  log style forwarded
+  location * {
+    root "/htdocs/f3s_fallback"
+    directory auto index
+  }
+}
+
+# ... similar blocks for all f3s hostnames ...
+</pre>
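+<br />
+<span>Since every fallback server block serves the same directory, the repetition could probably be reduced with httpd&#39;s <span class='inlinecode'>alias</span> directive. This is only a sketch of an alternative, assuming all hostnames share the same fallback content. The aliases listed are just examples:</span><br />
+<br />
+<pre>
+# OpenBSD httpd.conf (alternative sketch, not the configuration shown above)
+# One server block answering for several f3s hostnames
+server "f3s.foo.zone" {
+  # Additional names served by this same block (example selection)
+  alias "www.f3s.foo.zone"
+  alias "anki.f3s.foo.zone"
+  listen on * port 8080
+  log style forwarded
+  location * {
+    root "/htdocs/f3s_fallback"
+    directory auto index
+  }
+}
+</pre>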
+<br />
+<span>The fallback page itself is straightforward:</span><br />
+<br />
+<!-- Generator: GNU source-highlight 3.1.9
+by Lorenzo Bettini
+http://www.lorenzobettini.it
+http://www.gnu.org/software/src-highlite -->
+<pre><b><u><font color="#000000">&lt;!DOCTYPE</font></u></b> <b><font color="#000000">html</font></b><b><u><font color="#000000">&gt;</font></u></b>
+<b><u><font color="#000000">&lt;html&gt;</font></u></b>
+<b><u><font color="#000000">&lt;head&gt;</font></u></b>
+    <b><u><font color="#000000">&lt;title&gt;</font></u></b>Server turned off<b><u><font color="#000000">&lt;/title&gt;</font></u></b>
+    <b><u><font color="#000000">&lt;style&gt;</font></u></b>
+        body {
+            font-family: <font color="#808080">sans-serif</font>;
+            text-align: <font color="#808080">center</font>;
+            padding-top: <font color="#808080">50px</font>;
+        }
+        .container {
+            max-width: <font color="#808080">600px</font>;
+            margin: <font color="#808080">0</font> <font color="#808080">auto</font>;
+        }
+    <b><u><font color="#000000">&lt;/style&gt;</font></u></b>
+<b><u><font color="#000000">&lt;/head&gt;</font></u></b>
+<b><u><font color="#000000">&lt;body&gt;</font></u></b>
+    <b><u><font color="#000000">&lt;div</font></u></b> <b><font color="#000000">class</font></b>=<font color="#808080">"container"</font><b><u><font color="#000000">&gt;</font></u></b>
+        <b><u><font color="#000000">&lt;h1&gt;</font></u></b>Server turned off<b><u><font color="#000000">&lt;/h1&gt;</font></u></b>
+        <b><u><font color="#000000">&lt;p&gt;</font></u></b>The servers are all currently turned off.<b><u><font color="#000000">&lt;/p&gt;</font></u></b>
+        <b><u><font color="#000000">&lt;p&gt;</font></u></b>Please try again later.<b><u><font color="#000000">&lt;/p&gt;</font></u></b>
+        <b><u><font color="#000000">&lt;p&gt;</font></u></b>Or email <b><u><font color="#000000">&lt;a</font></u></b> <b><font color="#000000">href</font></b>=<font color="#808080">"mailto:paul@nospam.buetow.org"</font><b><u><font color="#000000">&gt;</font></u></b>paul@nospam.buetow.org<b><u><font color="#000000">&lt;/a&gt;</font></u></b>
+           - so I can turn them back on for you!<b><u><font color="#000000">&lt;/p&gt;</font></u></b>
+    <b><u><font color="#000000">&lt;/div&gt;</font></u></b>
+<b><u><font color="#000000">&lt;/body&gt;</font></u></b>
+<b><u><font color="#000000">&lt;/html&gt;</font></u></b>
+</pre>
+<br />
+<span>This approach provides several benefits:</span><br />
+<br />
+<ul>
+<li>Automatic detection: Health checks run continuously; no manual intervention needed</li>
+<li>Fast fallback: As soon as the health checks have marked all f3s nodes as down (within one check interval), new requests automatically route to localhost</li>
+<li>Transparent recovery: When f3s comes back online, health checks pass and traffic resumes automatically</li>
+<li>User experience: Visitors see a helpful message instead of connection errors</li>
+<li>No DNS changes: The same hostnames work whether f3s is up or down</li>
+</ul><br />
+<span>This fallback mechanism has proven invaluable during maintenance windows and unexpected outages, ensuring that users always get a response even when the home lab is offline.</span><br />
+<br />
 <h2 style='display: inline' id='deploying-the-private-docker-image-registry'>Deploying the private Docker image registry</h2><br />
 <br />
 <span>As not all Docker images I want to deploy are available on public Docker registries and as I also build some of them by myself, there is the need of a private registry. </span><br />