From 388dc7f7b783cfb102ce9e04f3cae35ef458a796 Mon Sep 17 00:00:00 2001 From: Paul Buetow Date: Sat, 1 Nov 2025 16:11:32 +0200 Subject: Update content for gemtext --- .../2025-11-02-perl-new-features-and-foostats.gmi | 401 +++++++++++++++ ...25-11-02-perl-new-features-and-foostats.gmi.tpl | 373 ++++++++++++++ gemfeed/DRAFT-perl-new-features-and-foostats.gmi | 358 -------------- .../DRAFT-perl-new-features-and-foostats.gmi.tpl | 351 -------------- gemfeed/atom.xml | 537 ++++++++++++++++++--- gemfeed/index.gmi | 1 + gemfeed/stats.gmi | 7 + 7 files changed, 1243 insertions(+), 785 deletions(-) create mode 100644 gemfeed/2025-11-02-perl-new-features-and-foostats.gmi create mode 100644 gemfeed/2025-11-02-perl-new-features-and-foostats.gmi.tpl delete mode 100644 gemfeed/DRAFT-perl-new-features-and-foostats.gmi delete mode 100644 gemfeed/DRAFT-perl-new-features-and-foostats.gmi.tpl create mode 100644 gemfeed/stats.gmi (limited to 'gemfeed') diff --git a/gemfeed/2025-11-02-perl-new-features-and-foostats.gmi b/gemfeed/2025-11-02-perl-new-features-and-foostats.gmi new file mode 100644 index 00000000..9b917d1f --- /dev/null +++ b/gemfeed/2025-11-02-perl-new-features-and-foostats.gmi @@ -0,0 +1,401 @@ +# Perl New Features and Foostats + +> Published at 2025-11-01T16:10:35+02:00 + +Perl recently reached rank 10 in the TIOBE index. That headline made me write this blog post as I was developing the Foostats script for simple analytics of my personal websites and Gemini capsules (e.g. `foo.zone`) and there were a couple of new features added to the Perl language over the last releases. The book *Perl New Features* by brian d foy documents the changes well; this post shows how those features look in a real program that runs every morning for my stats generation. 
+ +=> https://developers.slashdot.org/story/25/09/14/0134239/is-perl-the-worlds-10th-most-popular-programming-language Perl re-enters the top ten +=> https://perlschool.com/books/perl-new-features/ Perl New Features by Joshua McAdams and brian d foy + +``` +$b="24P7cP3dP31P3bPaP28P24P64P31P2cP24P64P32P2cP24P73P2cP24P67P2cP24P7 +2P29P3dP28P22P31P30P30P30P30P22P2cP22P31P30P30P30P30P30P22P2cP22P4aP75 +P7 3P +74 P2 +0P 41P6eP6fP74P 68P65P72P20P50 P65P72P6cP2 0P48P 61 +P6 3P6bP65P72P22P 29P3bPaP40P6dP 3dP73P70P6cP6 9P74P 20 +P2 fP2fP 2cP22P 2cP2eP3aP21P2 bP2aP 30P4f P40P2 2P +3b PaP24 P6eP3 dP6c P65P6 eP67 P74P6 8P +20 P24P7 3P3bP aP24 P75P3 dP22 P20P2 2P +78 P24P6 eP3bP aPaP 70P72 P69P 6eP74 P2 +0P 22P5c P6eP20 P20P 24P75 P5cP7 2P22P 3b +Pa PaP66P6fP72P2 8P24P7aP20P 3dP20P31P3bP 20P24 P7 +aP 3cP3dP24P6 eP3bP20P24 P7aP2bP2bP 29P20 P7 +bP aPaP9 P77P28P24P6 4P31P29P 3bPaP 9P +24 P72P3 dP69 P6eP74P28 P72P6 1P +6e P64P2 8P24 P6eP2 9P29P 3bPaP 9P +24 P67P3 dP73 P75P6 2P73P 74P72 P2 +0P 24P73 P2cP24P72P2cP 31P3b PaP9P 24P67P20P3fP20 P6 +4P 6fP20 P9P7bP20PaP9P9 P9P9P 9P66P 6fP72P20P28P24 P6 +bP 3dP30 P3bP24P6bP3cP3 9P3bP 24P6bP 2bP2bP29P20P7b Pa +P9 P9 +P9 P9 +P9 P9P73P75P6 2P73 P74P 72P2 8P24P75P2c P24P72 P2 +cP 31P29P3dP24P 6dP5 bP24 P6bP 5dP3bP20Pa P9P9 P9P9 P9 +P9 P70P 72P69 P6eP 74P2 0P22 P20P20P24P 75P 5cP 72 +P2 2P3b PaP9 P9P9 P9P9 P9P7 7P28 P24 P6 4P +32 P29P 3bPa P9P9 P9P9 P9P7 dPaP 9P9 P9 +P9 P9P7 3P75 P62P 73P7 4P72 P28P 24P7 5P +2c P24P 72P2c P31P 29P3 dP24 P67P3bP20P aP9P9 P9 +P9 P7dP20PaP9P 9P3a P20P 72P6 5P64P6fP3b PaP9 P7 +3P 75P62P73P 74P7 2P28 P24P 73P2cP24P7 2P2c P3 +1P 29P3dP2 2P30 P22P 3bPa P9P7 0P7 2P +69 P6eP74P2 0P22 P20P 20P2 4P75 P5c P7 +2P 22P3 bPaPa P7dP aPaP 77P2 0P28 P24 P6 +4P 32P2 9P3bP aP70 P72P 69P6 eP74 P2 0P2 2P +20 P20P 24P75 P20P21P5cP7 2P22P3bPaP 73P6cP65P6 5P7 0P20 P3 +2P 3bPa P70P7 2P69P6eP74P 20P22P20P2 0P24P75P20 P21P 5cP6 eP +22 P3bP aPaP7 3P75P62P2 0P77P20P7b PaP9P24P6c P3dP73 P6 +8P 69 +P6 6P 
+74P3bPaP9P66P6fP72P28P24P6aP3dP30P3bP24P6aP3cP24P6cP3bP24P6aP2bP2bP29P +7bP7dPaP7dP";$b=~s/\s//g;split /P/,$b;foreach(@_){$c.=chr hex};eval $c + +The above Perl script prints out "Just Another Perl Hacker !" in an +animation of sorts. + +``` + +## Table of Contents + +* ⇢ Perl New Features and Foostats +* ⇢ ⇢ Motivation +* ⇢ ⇢ Why I used Perl +* ⇢ ⇢ Inside Foostats +* ⇢ ⇢ ⇢ Log pipeline +* ⇢ ⇢ ⇢ `fooodds.txt` +* ⇢ ⇢ ⇢ Feed kinds +* ⇢ ⇢ ⇢ Aggregation and output +* ⇢ ⇢ ⇢ Command-line entry points +* ⇢ ⇢ Packages as real blocks +* ⇢ ⇢ ⇢ Scoped packages +* ⇢ ⇢ Postfix dereferencing keeps data structures tidy +* ⇢ ⇢ ⇢ Clear dereferencing +* ⇢ ⇢ `say` is the default voice now +* ⇢ ⇢ Lexical subs promote local reasoning +* ⇢ ⇢ Reference aliasing makes intent explicit +* ⇢ ⇢ Persistent state without globals +* ⇢ ⇢ ⇢ Rate limiting state +* ⇢ ⇢ ⇢ De-duplicated logging +* ⇢ ⇢ Subroutine signatures +* ⇢ ⇢ Defined-or assignment for defaults without boilerplate +* ⇢ ⇢ Cleanup with `defer` +* ⇢ ⇢ Builtins and booleans +* ⇢ ⇢ Conclusion + +## Motivation + +I've been running `foo.zone` for a while now, but I've never looked into visitor statistics or analytics. I value privacy, not just my own but also that of others (the visitors of this site), so I hesitated to use any off-the-shelf analytics plugins. 
All I wanted was to: + +* See which blog posts had the most (unique) visitors +* Exclude, if possible, any bots and scrapers from the stats +* Track only anonymized IP addresses and never store raw addresses + +With Foostats, I've created a Perl script that does this for my highly opinionated website/blog setup, which consists of: + +=> https://foo.zone/gemfeed/2021-06-05-gemtexter-one-bash-script-to-rule-it-all.html Gemtexter, my static site and Gemini capsule generator +=> https://foo.zone/gemfeed/2024-04-01-KISS-high-availability-with-OpenBSD.html How I host this site highly-available using OpenBSD + +## Why I used Perl + +Even though nowadays I code more in Go and Ruby, I stuck with Perl for Foostats for four simple reasons: + +* I wanted an excuse to explore the newer features of my first programming love. +* Sometimes, I miss Perl. +* Perl ships with OpenBSD (the operating system on which my sites run) by default. +* It really does live up to its name, Practical Extraction and Report Language, for the kind of log grinding I did with Foostats. + +## Inside Foostats + +Foostats is simply a log file analyser for the OpenBSD httpd and relayd logs. + +=> https://man.openbsd.org/httpd.8 +=> https://man.openbsd.org/relayd.8 + +### Log pipeline + +A CRON job starts Foostats, reads OpenBSD httpd and relayd access logs, and produces the numbers published at `https://stats.foo.zone` and `gemini://stats.foo.zone`. The dashboards are humble because traffic on my sites is still light, yet the trends are interesting for spotting patterns. The script is opinionated (I am repeating myself here, I know), and I will probably be the only one ever using it for my own sites. However, the code demonstrates how Perl's newer features help keep a small script like this exciting and fun! 
+ +=> https://stats.foo.zone Foostats (HTTP) +=> gemini://stats.foo.zone Foostats (Gemini) + +On OpenBSD, I've configured the job via the `daily.local` on both of my OpenBSD servers (`fishfinger.buetow.org` and `blowfish.buetow.org` - note one is the master server, the other is the standby server, but the script runs on both and the stats are merged later in the process): + +```sh +fishfinger$ grep foostats /etc/daily.local +perl /usr/local/bin/foostats.pl --parse-logs --replicate --report +``` + +Internally, `Foostats::Logreader` parses each line of the log files `/var/log/daemon*` and `/var/www/logs/access_log*`, turns timestamps into `YYYYMMDD/HHMMSS` values, hashes IP addresses with SHA3 (for anonymization), and hands a normalized event to `Foostats::Filter`. The filter compares the URI against entries in `fooodds.txt`, tracks how many times an IP address requests within the exact second, and drops anything suspicious (e.g., from web crawlers or malicious attackers). Valid events reach `Foostats::Aggregator`, which counts requests per protocol, records unique visitors for the Gemtext and Atom feeds, and remembers page-level IP sets. `Foostats::FileOutputter` writes the result as gzipped JSON files—one per day and per protocol—with IPv4/IPv6 splits, filtered counters, feed readership, and hashes for long URLs. + +### `fooodds.txt` + +`fooodds.txt` is a plain text list of substrings of URLs to be blocked, making it quick to shut down web crawlers. Foostats also detects rapid requests (an indicator of excessive crawling) and blocks the IP. Audit lines are written to `/var/log/fooodds`, which can later be reviewed for false or true positives (I do this around once a month). The `Justfile` even has a `gather-fooodds` target that collects suspicious paths from remote logs so new patterns can be added quickly. 
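The two checks described above (substring matching against `fooodds.txt` and per-second request counting) could look roughly like the following. This is a minimal sketch and not the actual Foostats code: the pattern list, the helper names `is_odd` and `too_fast`, and the threshold of one request per second are all assumptions made for illustration.

```perl
use v5.38;

# Hypothetical blocklist; in Foostats these substrings come from fooodds.txt.
my @odds = ('/wp-login', '.php', '/.env');

# True if the URI contains any blocked substring.
sub is_odd ($uri) {
    for my $pattern (@odds) {
        return 1 if index($uri, $pattern) >= 0;
    }
    return 0;
}

# True once an IP hash exceeds $max requests within the same second.
# The default threshold is an assumption, not the value Foostats uses.
sub too_fast ($ip_hash, $time, $max = 1) {
    state $last_time = '';
    state %count;

    if ($time ne $last_time) {    # a new second begins: reset the counters
        $last_time = $time;
        %count     = ();
    }
    return ++$count{$ip_hash} > $max;
}

say is_odd('/wp-login.php')      ? 'blocked' : 'ok';    # blocked
say is_odd('/gemfeed/index.gmi') ? 'blocked' : 'ok';    # ok
```

Plain `index` is used instead of a regex so that entries in the list are treated as literal substrings and need no escaping.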
+ +### Feed kinds + +There are different kinds of feeds being tracked by Foostats: + +* The Atom web-feed +* The same feed via Gemini +* The Gemfeed (a special format popular in the Geminispace) + +### Aggregation and output + +As mentioned, Foostats merges the stats from both hosts, master and standby. For the master-standby setup description, read: + +=> ./2024-04-01-KISS-high-availability-with-OpenBSD.gmi KISS high-availability with OpenBSD + +Those gzipped files land in `stats/`. From there, `Foostats::Replicator` can pull matching files from the partner host (`fishfinger` or `blowfish`) so the view covers both servers, `Foostats::Merger` combines them into daily summaries, and `Foostats::Reporter` rebuilds Gemtext and HTML reports. + +Those are the raw stats files: + +=> https://blowfish.buetow.org/foostats/ +=> https://fishfinger.buetow.org/foostats/ + +These are the 30-day reports generated (already linked earlier in this post, but adding here again for clarity): + +=> gemini://stats.foo.zone stats.foo.zone Gemini capsule dashboard +=> https://stats.foo.zone stats.foo.zone HTTP dashboard + +### Command-line entry points + +`foostats_main` is the command entry point. `--parse-logs` refreshes the gzipped files, `--replicate` runs the cross-host sync, and `--report` rebuilds the HTML and Gemini report pages. `--all` performs everything in one go. Defaults point to `/var/www/htdocs/buetow.org/self/foostats` for data, `/var/gemini/stats.foo.zone` for Gemtext output, and `/var/www/htdocs/gemtexter/stats.foo.zone` for HTML output. Replication always forces the three most recent days' worth of data across HTTPS and leaves older files untouched to save bandwidth. + +The complete source lives on Codeberg here: + +=> https://codeberg.org/snonux/foostats Foostats on Codeberg + +Now let's go to some new Perl features: + +## Packages as real blocks + +### Scoped packages + +Recent Perl versions allow the block form `package Foo { ... }`. 
Foostats uses it for every package. Imports stay local to the block, helper subs do not leak into the global symbol table, and configuration happens where the code needs it. + +The old way: + +```perl +package foo; + +sub hello { + print "Hello from package foo\n"; +} + +package bar; + +sub hello { + print "Hello from package bar\n"; +} + +1; +``` + +But now it is also possible to do this: + +```perl +package foo { + sub hello { + print "Hello from package foo\n"; + } +} + +package bar { + sub hello { + print "Hello from package bar\n"; + } +} +``` + +## Postfix dereferencing keeps data structures tidy + +### Clear dereferencing + +The script handles nested hashes and arrays. Postfix dereferencing (`$hash->%*`, `$array->@*`) keeps that readable. + +E.g. instead of having to write: + +```perl +for my $elem (@{$array_ref}) { + print "$elem\n"; +} +``` + +one can now do: + +```perl +for my $elem ($array_ref->@*) { + print "$elem\n"; +} +``` + +This feature becomes increasingly useful with nested data structures, e.g. to print all keys of the nested hash: + +```perl +print for keys $hash->{stats}->%*; +``` + +Loops over expressions like `$stats->{page_ips}->{urls}->%*` or `$merge{$key}->{$_}->%*` show which level of the structure is in play. The merger in Foostats updates host and URL statistics without building temporary arrays, and the reporter code mirrors the layout of the final tables. Before postfix dereferencing, the same code relied on braces within braces and was harder to read. + +## `say` is the default voice now + +`say` became the default once the script switched to `use v5.38;`. It adds a newline to every message printed, comparable to Ruby's `puts`, making log messages like "Processing $path" or "Writing report to $report_path" cleaner: + +```perl +use v5.38; + +print "Hello, world!\n"; # old way + +say "Hello, world!"; # new way +``` + +## Lexical subs promote local reasoning + +Lexical subroutines keep helpers close to the code that needs them. 
In `Foostats::Logreader::parse_web_logs`, functions such as `my sub parse_date` and `my sub open_file` live only inside that scope. + +This is an example of a lexical sub named `trim`, which is only visible within the outer sub named `process_lines`: + +```perl +use v5.38; + +sub process_lines { + my @lines = @_; + + my sub trim ($str) { + $str =~ s/^\s+|\s+$//gr; + } + + return [ map { trim($_) } @lines ]; +} + +my @raw = (" foo ", " bar", "baz "); +my $cleaned = process_lines(@raw); +say for @$cleaned; # prints "foo", "bar", "baz" +``` + +## Reference aliasing makes intent explicit + +Reference aliasing can be enabled with `use feature qw(refaliasing)` and helps communicate intent more clearly (if you remember the Perl syntax, of course; otherwise, it can look rather cryptic). The filter starts with `\my $uri_path = \$event->{uri_path}` so any later modification touches the original event. This is an example with ref aliasing in action: + +```perl +use feature qw(refaliasing); + +my $hash = { foo => 42 }; +\my $foo = \$hash->{foo}; + +$foo = 99; +print $hash->{foo}; # prints 99 +``` + +The aggregator in Foostats aliases `$self->{stats}{$date_key}` before updating counters, so the structure remains intact. Combined with subroutine signatures, this makes it obvious when a piece of data is shared instead of copied, preventing silent bugs. It also gives long nested data structures shorter names. + +## Persistent state without globals + +A Perl state variable is declared with `state $var` and retains its value between calls to the enclosing subroutine. Foostats uses that for rate limiting and de-duplicated logging. + +This is a small example demonstrating the use of a state variable in Perl: + +```perl +use v5.38; + +sub counter { + state $count = 0; + $count++; + return $count; +} + +say counter(); # 1 +say counter(); # 2 +say counter(); # 3 +``` + +Scalar state variables have been available since `state` arrived in Perl 5.10. 
Initializing hash and array state variables in list context, as in `state %seen = (a => 1)`, has been supported since Perl 5.28. + +### Rate limiting state + +In Foostats, `state` variables store run-specific state without using package globals. `state %blocked` remembers IP hashes that already triggered the odd-request filter, and `state $last_time` and `state %count` track how many requests an IP makes within the same second. + +### De-duplicated logging + +`state %dedup` keeps the log output of the suspicious calls to one warning per URI. Early versions used global hashes for the same tasks, producing inconsistent results during tests. Switching to `state` removed those edge cases. + +## Subroutine signatures + +Perl now supports subroutine signatures like other modern languages do. Foostats uses them everywhere. Examples: + +```perl +# Old way +sub greet_old { my $name = shift; print "Hello, $name!\n" } + +# Another old way (with a prototype) +sub greet_old2 ($) { my $name = shift; print "Hello, $name!\n" } + +# New way +sub greet ($name) { say "Hello, $name!"; } + +greet("Alice"); # prints "Hello, Alice!" +``` + +In Foostats, constructors declare `sub new ($class, $odds_file, $log_path)`, anonymous callbacks expose `sub ($event)`, and helper subs list the values they expect, e.g.: + +```perl +my $anon = sub ($name) { + say "Hello, $name!"; +}; + +$anon->("World"); # prints "Hello, World!" +``` + +## Defined-or assignment for defaults without boilerplate + +The operator `//=` keeps configuration and counters simple. Environment variables may be missing when CRON runs the script, so `//=`, combined with signatures, sets defaults without warnings. Example use of that operator: + +```perl +my $foo; +$foo //= 42; +say $foo; # prints 42 + +$foo //= 99; +say $foo; # still prints 42, because $foo was already defined +``` + +## Cleanup with `defer` + +Even though not used in Foostats, this feature (similar to Go's defer) is neat to have in Perl now. 
+ +The `defer` block (`use feature 'defer'`) schedules a piece of code to run when the current scope exits, regardless of how it exits (e.g. normal return, exception). This is perfect for ensuring resources, such as file handles, are closed. + +```perl +use feature qw(defer); + +sub parse_log_file { + my ($path) = @_; + open my $fh, '<', $path or die "Cannot open $path: $!"; + defer { close $fh }; + + while (my $line = <$fh>) { + # ... parsing logic that might throw an exception ... + } + # the defer block closes $fh when this scope exits +} +``` + +This pattern replaces manual `close` calls in every exit path of the subroutine and is more robust than relying solely on object destructors. + +## Builtins and booleans + +The script also uses other modern additions that often go unnoticed. `use builtin qw(true false);`, with the `experimental::builtin` warning category silenced, provides real boolean values. + +## Conclusion + +I want to code more in Perl again. The newer features make it a joy to write small scripts like Foostats. If you haven't looked at Perl in a while, give it another try! The main thing that holds me back from writing more Perl is the lack of good tooling. For example, there is no LSP or Tree-sitter support that works as well as what is available for Go and Ruby. 
+ +E-Mail your comments to `paul@nospam.buetow.org` :-) + +Other related posts are: + +=> ./2023-05-01-unveiling-guprecords:-uptime-records-with-raku.gmi 2023-05-01 Unveiling `guprecords.raku`: Global Uptime Records with Raku +=> ./2022-05-27-perl-is-still-a-great-choice.gmi 2022-05-27 Perl is still a great choice +=> ./2011-05-07-perl-daemon-service-framework.gmi 2011-05-07 Perl Daemon (Service Framework) +=> ./2008-06-26-perl-poetry.gmi 2008-06-26 Perl Poetry + +=> ../ Back to the main site diff --git a/gemfeed/2025-11-02-perl-new-features-and-foostats.gmi.tpl b/gemfeed/2025-11-02-perl-new-features-and-foostats.gmi.tpl new file mode 100644 index 00000000..9852c6b6 --- /dev/null +++ b/gemfeed/2025-11-02-perl-new-features-and-foostats.gmi.tpl @@ -0,0 +1,373 @@ +# Perl New Features and Foostats + +> Published at 2025-11-01T16:10:35+02:00 + +Perl recently reached rank 10 in the TIOBE index. That headline made me write this blog post as I was developing the Foostats script for simple analytics of my personal websites and Gemini capsules (e.g. `foo.zone`) and there were a couple of new features added to the Perl language over the last releases. The book *Perl New Features* by brian d foy documents the changes well; this post shows how those features look in a real program that runs every morning for my stats generation. 
+ +=> https://developers.slashdot.org/story/25/09/14/0134239/is-perl-the-worlds-10th-most-popular-programming-language Perl re-enters the top ten +=> https://perlschool.com/books/perl-new-features/ Perl New Features by Joshua McAdams and brian d foy + +``` +$b="24P7cP3dP31P3bPaP28P24P64P31P2cP24P64P32P2cP24P73P2cP24P67P2cP24P7 +2P29P3dP28P22P31P30P30P30P30P22P2cP22P31P30P30P30P30P30P22P2cP22P4aP75 +P7 3P +74 P2 +0P 41P6eP6fP74P 68P65P72P20P50 P65P72P6cP2 0P48P 61 +P6 3P6bP65P72P22P 29P3bPaP40P6dP 3dP73P70P6cP6 9P74P 20 +P2 fP2fP 2cP22P 2cP2eP3aP21P2 bP2aP 30P4f P40P2 2P +3b PaP24 P6eP3 dP6c P65P6 eP67 P74P6 8P +20 P24P7 3P3bP aP24 P75P3 dP22 P20P2 2P +78 P24P6 eP3bP aPaP 70P72 P69P 6eP74 P2 +0P 22P5c P6eP20 P20P 24P75 P5cP7 2P22P 3b +Pa PaP66P6fP72P2 8P24P7aP20P 3dP20P31P3bP 20P24 P7 +aP 3cP3dP24P6 eP3bP20P24 P7aP2bP2bP 29P20 P7 +bP aPaP9 P77P28P24P6 4P31P29P 3bPaP 9P +24 P72P3 dP69 P6eP74P28 P72P6 1P +6e P64P2 8P24 P6eP2 9P29P 3bPaP 9P +24 P67P3 dP73 P75P6 2P73P 74P72 P2 +0P 24P73 P2cP24P72P2cP 31P3b PaP9P 24P67P20P3fP20 P6 +4P 6fP20 P9P7bP20PaP9P9 P9P9P 9P66P 6fP72P20P28P24 P6 +bP 3dP30 P3bP24P6bP3cP3 9P3bP 24P6bP 2bP2bP29P20P7b Pa +P9 P9 +P9 P9 +P9 P9P73P75P6 2P73 P74P 72P2 8P24P75P2c P24P72 P2 +cP 31P29P3dP24P 6dP5 bP24 P6bP 5dP3bP20Pa P9P9 P9P9 P9 +P9 P70P 72P69 P6eP 74P2 0P22 P20P20P24P 75P 5cP 72 +P2 2P3b PaP9 P9P9 P9P9 P9P7 7P28 P24 P6 4P +32 P29P 3bPa P9P9 P9P9 P9P7 dPaP 9P9 P9 +P9 P9P7 3P75 P62P 73P7 4P72 P28P 24P7 5P +2c P24P 72P2c P31P 29P3 dP24 P67P3bP20P aP9P9 P9 +P9 P7dP20PaP9P 9P3a P20P 72P6 5P64P6fP3b PaP9 P7 +3P 75P62P73P 74P7 2P28 P24P 73P2cP24P7 2P2c P3 +1P 29P3dP2 2P30 P22P 3bPa P9P7 0P7 2P +69 P6eP74P2 0P22 P20P 20P2 4P75 P5c P7 +2P 22P3 bPaPa P7dP aPaP 77P2 0P28 P24 P6 +4P 32P2 9P3bP aP70 P72P 69P6 eP74 P2 0P2 2P +20 P20P 24P75 P20P21P5cP7 2P22P3bPaP 73P6cP65P6 5P7 0P20 P3 +2P 3bPa P70P7 2P69P6eP74P 20P22P20P2 0P24P75P20 P21P 5cP6 eP +22 P3bP aPaP7 3P75P62P2 0P77P20P7b PaP9P24P6c P3dP73 P6 +8P 69 +P6 6P 
+74P3bPaP9P66P6fP72P28P24P6aP3dP30P3bP24P6aP3cP24P6cP3bP24P6aP2bP2bP29P +7bP7dPaP7dP";$b=~s/\s//g;split /P/,$b;foreach(@_){$c.=chr hex};eval $c + +The above Perl script prints out "Just Another Perl Hacker !" in an +animation of sorts. + +``` + +<< template::inline::toc + +## Motivation + +I've been running `foo.zone` for a while now, but I've never looked into visitor statistics or analytics. I value privacy, not just my own but also that of others (the visitors of this site), so I hesitated to use any off-the-shelf analytics plugins. All I wanted was to: + +* See which blog posts had the most (unique) visitors +* Exclude, if possible, any bots and scrapers from the stats +* Track only anonymized IP addresses and never store raw addresses + +With Foostats, I've created a Perl script that does this for my highly opinionated website/blog setup, which consists of: + +=> https://foo.zone/gemfeed/2021-06-05-gemtexter-one-bash-script-to-rule-it-all.html Gemtexter, my static site and Gemini capsule generator +=> https://foo.zone/gemfeed/2024-04-01-KISS-high-availability-with-OpenBSD.html How I host this site highly-available using OpenBSD + +## Why I used Perl + +Even though nowadays I code more in Go and Ruby, I stuck with Perl for Foostats for four simple reasons: + +* I wanted an excuse to explore the newer features of my first programming love. +* Sometimes, I miss Perl. +* Perl ships with OpenBSD (the operating system on which my sites run) by default. +* It really does live up to its name, Practical Extraction and Report Language, for the kind of log grinding I did with Foostats. + +## Inside Foostats + +Foostats is simply a log file analyser for the OpenBSD httpd and relayd logs. 
+ +=> https://man.openbsd.org/httpd.8 +=> https://man.openbsd.org/relayd.8 + +### Log pipeline + +A CRON job starts Foostats, reads OpenBSD httpd and relayd access logs, and produces the numbers published at `https://stats.foo.zone` and `gemini://stats.foo.zone`. The dashboards are humble because traffic on my sites is still light, yet the trends are interesting for spotting patterns. The script is opinionated (I am repeating myself here, I know), and I will probably be the only one ever using it for my own sites. However, the code demonstrates how Perl's newer features help keep a small script like this exciting and fun! + +=> https://stats.foo.zone Foostats (HTTP) +=> gemini://stats.foo.zone Foostats (Gemini) + +On OpenBSD, I've configured the job via the `daily.local` on both of my OpenBSD servers (`fishfinger.buetow.org` and `blowfish.buetow.org` - note one is the master server, the other is the standby server, but the script runs on both and the stats are merged later in the process): + +```sh +fishfinger$ grep foostats /etc/daily.local +perl /usr/local/bin/foostats.pl --parse-logs --replicate --report +``` + +Internally, `Foostats::Logreader` parses each line of the log files `/var/log/daemon*` and `/var/www/logs/access_log*`, turns timestamps into `YYYYMMDD/HHMMSS` values, hashes IP addresses with SHA3 (for anonymization), and hands a normalized event to `Foostats::Filter`. The filter compares the URI against entries in `fooodds.txt`, tracks how many times an IP address requests within the exact second, and drops anything suspicious (e.g., from web crawlers or malicious attackers). Valid events reach `Foostats::Aggregator`, which counts requests per protocol, records unique visitors for the Gemtext and Atom feeds, and remembers page-level IP sets. `Foostats::FileOutputter` writes the result as gzipped JSON files—one per day and per protocol—with IPv4/IPv6 splits, filtered counters, feed readership, and hashes for long URLs. 
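To make the normalization step concrete, here is a minimal sketch of how a log entry could be turned into an event. It is not the Foostats implementation. It assumes a Common Log Format style timestamp such as `01/Nov/2025:16:10:35 +0200`, the helper names `normalize_time` and `make_event` and the `ip_hash` and `time` keys are made up for illustration, and it uses SHA-256 from the core module `Digest::SHA` as a stand-in for the SHA3 hashing mentioned above, because SHA3 needs a CPAN module.

```perl
use v5.38;
use Digest::SHA qw(sha256_hex);    # core module; stand-in for SHA3 (assumption)

my %month = (
    Jan => 1, Feb => 2, Mar => 3, Apr => 4,  May => 5,  Jun => 6,
    Jul => 7, Aug => 8, Sep => 9, Oct => 10, Nov => 11, Dec => 12,
);

# Turn "01/Nov/2025:16:10:35 +0200" into "20251101/161035".
# Returns undef if the timestamp does not look as expected.
sub normalize_time ($stamp) {
    my ($day, $mon, $year, $h, $m, $s) =
        $stamp =~ m{^(\d{2})/([A-Z][a-z]{2})/(\d{4}):(\d{2}):(\d{2}):(\d{2})}
        or return undef;
    my $mon_num = $month{$mon} // return undef;
    return sprintf '%04d%02d%02d/%02d%02d%02d',
        $year, $mon_num, $day, $h, $m, $s;
}

# Build a normalized event from the already-split fields of a log line.
# Only the hash of the IP address is kept, never the raw address.
sub make_event ($ip, $stamp, $uri) {
    return {
        ip_hash  => sha256_hex($ip),
        time     => normalize_time($stamp),
        uri_path => $uri,
    };
}

# 203.0.113.7 is a documentation-only example address.
my $event = make_event('203.0.113.7', '01/Nov/2025:16:10:35 +0200', '/index.html');
say $event->{time};               # 20251101/161035
say length $event->{ip_hash};     # 64
```

Returning `undef` explicitly (rather than a bare `return`) keeps the hash constructor in `make_event` balanced even when the timestamp cannot be parsed.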
+ +### `fooodds.txt` + +`fooodds.txt` is a plain text list of substrings of URLs to be blocked, making it quick to shut down web crawlers. Foostats also detects rapid requests (an indicator of excessive crawling) and blocks the IP. Audit lines are written to `/var/log/fooodds`, which can later be reviewed for false or true positives (I do this around once a month). The `Justfile` even has a `gather-fooodds` target that collects suspicious paths from remote logs so new patterns can be added quickly. + +### Feed kinds + +There are different kinds of feeds being tracked by Foostats: + +* The Atom web-feed +* The same feed via Gemini +* The Gemfeed (a special format popular in the Geminispace) + +### Aggregation and output + +As mentioned, Foostats merges the stats from both hosts, master and standby. For the master-standby setup description, read: + +=> ./2024-04-01-KISS-high-availability-with-OpenBSD.gmi KISS high-availability with OpenBSD + +Those gzipped files land in `stats/`. From there, `Foostats::Replicator` can pull matching files from the partner host (`fishfinger` or `blowfish`) so the view covers both servers, `Foostats::Merger` combines them into daily summaries, and `Foostats::Reporter` rebuilds Gemtext and HTML reports. + +Those are the raw stats files: + +=> https://blowfish.buetow.org/foostats/ +=> https://fishfinger.buetow.org/foostats/ + +These are the 30-day reports generated (already linked earlier in this post, but adding here again for clarity): + +=> gemini://stats.foo.zone stats.foo.zone Gemini capsule dashboard +=> https://stats.foo.zone stats.foo.zone HTTP dashboard + +### Command-line entry points + +`foostats_main` is the command entry point. `--parse-logs` refreshes the gzipped files, `--replicate` runs the cross-host sync, and `--report` rebuilds the HTML and Gemini report pages. `--all` performs everything in one go. 
Defaults point to `/var/www/htdocs/buetow.org/self/foostats` for data, `/var/gemini/stats.foo.zone` for Gemtext output, and `/var/www/htdocs/gemtexter/stats.foo.zone` for HTML output. Replication always forces the three most recent days' worth of data across HTTPS and leaves older files untouched to save bandwidth. + +The complete source lives on Codeberg here: + +=> https://codeberg.org/snonux/foostats Foostats on Codeberg + +Now let's go to some new Perl features: + +## Packages as real blocks + +### Scoped packages + +Recent Perl versions allow the block form `package Foo { ... }`. Foostats uses it for every package. Imports stay local to the block, helper subs do not leak into the global symbol table, and configuration happens where the code needs it. + +The old way: + +```perl +package foo; + +sub hello { + print "Hello from package foo\n"; +} + +package bar; + +sub hello { + print "Hello from package bar\n"; +} + +1; +``` + +But now it is also possible to do this: + +```perl +package foo { + sub hello { + print "Hello from package foo\n"; + } +} + +package bar { + sub hello { + print "Hello from package bar\n"; + } +} +``` + +## Postfix dereferencing keeps data structures tidy + +### Clear dereferencing + +The script handles nested hashes and arrays. Postfix dereferencing (`$hash->%*`, `$array->@*`) keeps that readable. + +E.g. instead of having to write: + +```perl +for my $elem (@{$array_ref}) { + print "$elem\n"; +} +``` + +one can now do: + +```perl +for my $elem ($array_ref->@*) { + print "$elem\n"; +} +``` + +This feature becomes increasingly useful with nested data structures, e.g. to print all keys of the nested hash: + +```perl +print for keys $hash->{stats}->%*; +``` + +Loops over expressions like `$stats->{page_ips}->{urls}->%*` or `$merge{$key}->{$_}->%*` show which level of the structure is in play. 
The merger in Foostats updates host and URL statistics without building temporary arrays, and the reporter code mirrors the layout of the final tables. Before postfix dereferencing, the same code relied on braces within braces and was harder to read. + +## `say` is the default voice now + +`say` became the default once the script switched to `use v5.38;`. It adds a newline to every message printed, comparable to Ruby's `puts`, making log messages like "Processing $path" or "Writing report to $report_path" cleaner: + +```perl +use v5.38; + +print "Hello, world!\n"; # old way + +say "Hello, world!"; # new way +``` + +## Lexical subs promote local reasoning + +Lexical subroutines keep helpers close to the code that needs them. In `Foostats::Logreader::parse_web_logs`, functions such as `my sub parse_date` and `my sub open_file` live only inside that scope. + +This is an example of a lexical sub named `trim`, which is only visible within the outer sub named `process_lines`: + +```perl +use v5.38; + +sub process_lines { + my @lines = @_; + + my sub trim ($str) { + $str =~ s/^\s+|\s+$//gr; + } + + return [ map { trim($_) } @lines ]; +} + +my @raw = (" foo ", " bar", "baz "); +my $cleaned = process_lines(@raw); +say for @$cleaned; # prints "foo", "bar", "baz" +``` + +## Reference aliasing makes intent explicit + +Reference aliasing can be enabled with `use feature qw(refaliasing)` and helps communicate intent more clearly (if you remember the Perl syntax, of course—otherwise, it can look rather cryptic). The filter starts with `\my $uri_path = \$event->{uri_path}` so any later modification touches the original event. This is an example with ref aliasing in action: + +```perl +use feature qw(refaliasing); + +my $hash = { foo => 42 }; +\my $foo = \$hash->{foo}; + +$foo = 99; +print $hash->{foo}; # prints 99 +``` + +The aggregator in Foostats aliases `$self->{stats}{$date_key}` before updating counters, so the structure remains intact. 
Combined with subroutine signatures, this makes it obvious when a piece of data is shared instead of copied, preventing silent bugs. It also gives long nested data structures shorter names. + +## Persistent state without globals + +A Perl state variable is declared with `state $var` and retains its value between calls to the enclosing subroutine. Foostats uses that for rate limiting and de-duplicated logging. + +This is a small example demonstrating the use of a state variable in Perl: + +```perl +use v5.38; + +sub counter { + state $count = 0; + $count++; + return $count; +} + +say counter(); # 1 +say counter(); # 2 +say counter(); # 3 +``` + +Scalar state variables have been available since `state` arrived in Perl 5.10. Initializing hash and array state variables in list context, as in `state %seen = (a => 1)`, has been supported since Perl 5.28. + +### Rate limiting state + +In Foostats, `state` variables store run-specific state without using package globals. `state %blocked` remembers IP hashes that already triggered the odd-request filter, and `state $last_time` and `state %count` track how many requests an IP makes within the same second. + +### De-duplicated logging + +`state %dedup` keeps the log output of the suspicious calls to one warning per URI. Early versions used global hashes for the same tasks, producing inconsistent results during tests. Switching to `state` removed those edge cases. + +## Subroutine signatures + +Perl now supports subroutine signatures like other modern languages do. Foostats uses them everywhere. Examples: + +```perl +# Old way +sub greet_old { my $name = shift; print "Hello, $name!\n" } + +# Another old way (with a prototype) +sub greet_old2 ($) { my $name = shift; print "Hello, $name!\n" } + +# New way +sub greet ($name) { say "Hello, $name!"; } + +greet("Alice"); # prints "Hello, Alice!" 
+``` + +In Foostats, constructors declare `sub new ($class, $odds_file, $log_path)`, anonymous callbacks expose `sub ($event)`, and helper subs list the values they expect, e.g.: + +```perl +my $anon = sub ($name) { + say "Hello, $name!"; +}; + +$anon->("World"); # prints "Hello, World!" +``` + +## Defined-or assignment for defaults without boilerplate + +The operator `//=` keeps configuration and counters simple. Environment variables may be missing when cron runs the script, so `//=`, combined with signatures, sets defaults without warnings. Example use of that operator: + +```perl +my $foo; +$foo //= 42; +say $foo; # prints 42 + +$foo //= 99; +say $foo; # still prints 42, because $foo was already defined +``` + +## Cleanup with `defer` + +Although Foostats does not use it, this feature (similar to Go's `defer`) is neat to have in Perl now. + +The `defer` block (`use feature 'defer'`) schedules a piece of code to run when the current scope exits, regardless of how it exits (e.g. normal return, exception). This is perfect for ensuring resources, such as file handles, are closed. + +```perl +use feature qw(defer); + +sub parse_log_file { + my ($path) = @_; + open my $fh, '<', $path or die "Cannot open $path: $!"; + defer { close $fh }; + + while (my $line = <$fh>) { + # ... parsing logic that might throw an exception ... + } + # $fh is automatically closed here +} +``` + +This pattern replaces manual `close` calls in every exit path of the subroutine and is more robust than relying solely on object destructors. + +## Builtins and booleans + +The script also uses other modern additions that often go unnoticed. `use builtin qw(true false);` combined with `experimental::builtin` provides real boolean values. + +## Conclusion + +I want to code more in Perl again. The newer features make it a joy to write small scripts like Foostats. If you haven't looked at Perl in a while, give it another try! 
The main thing that holds me back from writing more Perl is the lack of good tooling. For example, there is no proper LSP or Tree-sitter support that works as well as what is available for Go and Ruby. + +E-Mail your comments to `paul@nospam.buetow.org` :-) + +Other related posts are: + +<< template::inline::rindex perl raku + +=> ../ Back to the main site diff --git a/gemfeed/DRAFT-perl-new-features-and-foostats.gmi b/gemfeed/DRAFT-perl-new-features-and-foostats.gmi deleted file mode 100644 index 87ba0898..00000000 --- a/gemfeed/DRAFT-perl-new-features-and-foostats.gmi +++ /dev/null @@ -1,358 +0,0 @@ -# Perl New Features and Foostats - -Perl just reached rank 10 in the TIOBE index. That headline matches my day-to-day reality because I keep developing the foostats script for simple analytics of my personal websites and Gemini capsules (e.g. `foo.zone`), and almost every Perl release adds new features. The book *Perl New Features* by brian d foy documents the changes well; this post shows how those features look in a real program that runs every morning for my stats generation. 
- -``` -$b="24P7cP3dP31P3bPaP28P24P64P31P2cP24P64P32P2cP24P73P2cP24P67P2cP24P7 -2P29P3dP28P22P31P30P30P30P30P22P2cP22P31P30P30P30P30P30P22P2cP22P4aP75 -P7 3P -74 P2 -0P 41P6eP6fP74P 68P65P72P20P50 P65P72P6cP2 0P48P 61 -P6 3P6bP65P72P22P 29P3bPaP40P6dP 3dP73P70P6cP6 9P74P 20 -P2 fP2fP 2cP22P 2cP2eP3aP21P2 bP2aP 30P4f P40P2 2P -3b PaP24 P6eP3 dP6c P65P6 eP67 P74P6 8P -20 P24P7 3P3bP aP24 P75P3 dP22 P20P2 2P -78 P24P6 eP3bP aPaP 70P72 P69P 6eP74 P2 -0P 22P5c P6eP20 P20P 24P75 P5cP7 2P22P 3b -Pa PaP66P6fP72P2 8P24P7aP20P 3dP20P31P3bP 20P24 P7 -aP 3cP3dP24P6 eP3bP20P24 P7aP2bP2bP 29P20 P7 -bP aPaP9 P77P28P24P6 4P31P29P 3bPaP 9P -24 P72P3 dP69 P6eP74P28 P72P6 1P -6e P64P2 8P24 P6eP2 9P29P 3bPaP 9P -24 P67P3 dP73 P75P6 2P73P 74P72 P2 -0P 24P73 P2cP24P72P2cP 31P3b PaP9P 24P67P20P3fP20 P6 -4P 6fP20 P9P7bP20PaP9P9 P9P9P 9P66P 6fP72P20P28P24 P6 -bP 3dP30 P3bP24P6bP3cP3 9P3bP 24P6bP 2bP2bP29P20P7b Pa -P9 P9 -P9 P9 -P9 P9P73P75P6 2P73 P74P 72P2 8P24P75P2c P24P72 P2 -cP 31P29P3dP24P 6dP5 bP24 P6bP 5dP3bP20Pa P9P9 P9P9 P9 -P9 P70P 72P69 P6eP 74P2 0P22 P20P20P24P 75P 5cP 72 -P2 2P3b PaP9 P9P9 P9P9 P9P7 7P28 P24 P6 4P -32 P29P 3bPa P9P9 P9P9 P9P7 dPaP 9P9 P9 -P9 P9P7 3P75 P62P 73P7 4P72 P28P 24P7 5P -2c P24P 72P2c P31P 29P3 dP24 P67P3bP20P aP9P9 P9 -P9 P7dP20PaP9P 9P3a P20P 72P6 5P64P6fP3b PaP9 P7 -3P 75P62P73P 74P7 2P28 P24P 73P2cP24P7 2P2c P3 -1P 29P3dP2 2P30 P22P 3bPa P9P7 0P7 2P -69 P6eP74P2 0P22 P20P 20P2 4P75 P5c P7 -2P 22P3 bPaPa P7dP aPaP 77P2 0P28 P24 P6 -4P 32P2 9P3bP aP70 P72P 69P6 eP74 P2 0P2 2P -20 P20P 24P75 P20P21P5cP7 2P22P3bPaP 73P6cP65P6 5P7 0P20 P3 -2P 3bPa P70P7 2P69P6eP74P 20P22P20P2 0P24P75P20 P21P 5cP6 eP -22 P3bP aPaP7 3P75P62P2 0P77P20P7b PaP9P24P6c P3dP73 P6 -8P 69 -P6 6P -74P3bPaP9P66P6fP72P28P24P6aP3dP30P3bP24P6aP3cP24P6cP3bP24P6aP2bP2bP29P -7bP7dPaP7dP";$b=~s/\s//g;split /P/,$b;foreach(@_){$c.=chr hex};eval $c - -The above Perl scripts prints out "Just Another Perl Hacker !" in an -animation of sorts. 
- -``` - -## Table of Contents - -* ⇢ Perl New Features and Foostats -* ⇢ ⇢ Motivation -* ⇢ ⇢ Why I used Perl -* ⇢ ⇢ Inside foostats -* ⇢ ⇢ ⇢ Log pipeline -* ⇢ ⇢ ⇢ Aggregation and output -* ⇢ ⇢ ⇢ Command-line entry points -* ⇢ ⇢ Packages as real blocks -* ⇢ ⇢ ⇢ Scoped packages -* ⇢ ⇢ Postfix dereferencing keeps data structures tidy -* ⇢ ⇢ ⇢ Clear dereferencing -* ⇢ ⇢ Lexical subs promote local reasoning -* ⇢ ⇢ ⇢ Helpers that stay local -* ⇢ ⇢ Reference aliasing makes intent explicit -* ⇢ ⇢ ⇢ Shared data on purpose -* ⇢ ⇢ Persistent state without globals -* ⇢ ⇢ ⇢ Rate limiting state -* ⇢ ⇢ ⇢ De-duplicated logging -* ⇢ ⇢ Subroutine signatures clarify every call site -* ⇢ ⇢ ⇢ "normal" subroutine signatures now -* ⇢ ⇢ Defined-or assignment keeps defaults obvious -* ⇢ ⇢ ⇢ Defaults without boilerplate -* ⇢ ⇢ `say` is the default voice now -* ⇢ ⇢ Cleanup with `defer` -* ⇢ ⇢ Builtins and booleans -* ⇢ ⇢ Conclusion - -## Motivation - -I've been running `foo.zone` for a while now, but I've never looked into visitor statistics or analytics. I value privacy—not just my own, but also the privacy of others (the visitors of this site) — so I hesitated to use any off-the-shelf analytics plugins. 
All I wanted to collect were: - -* Which blog posts had the most (unique) visitors -* Exclude, if possible, any bots and scrapers from the stats -* Track only anonymized IP addresses, never store raw addresses - -With Foostats I've created a Perl script which does that for my highly opinionated website/blog setup: - -=> https://foo.zone/gemfeed/2021-06-05-gemtexter-one-bash-script-to-rule-it-all.html Gemtexter, my static site and Gemini capsule generator -=> https://foo.zone/gemfeed/2024-04-01-KISS-high-availability-with-OpenBSD.html How I host this site highly-available using OpenBSD - -## Why I used Perl - -Even though nowadays I code more in Go and Ruby, I stuck with Perl for Foostats for three simple reasons: - -* I wanted an excuse to explore the newer features of my first programming love. -* Sometimes, I miss Perl. -* Perl ships with OpenBSD (the operating system on which my sites run) by default. -* It really does live up to its Practical Extraction and Report Language (that's where the name Perl means) for this kind of log grinding I did with foostats. - -=> https://developers.slashdot.org/story/25/09/14/0134239/is-perl-the-worlds-10th-most-popular-programming-language Perl re-enters the top ten -=> https://perlschool.com/books/perl-new-features/ Perl New Features by Joshua McAdams and brian d foy - -## Inside foostats - -Foostats is simply a log file analyser, which analyses the OpenBSD httpd and relayd logs. - -### Log pipeline - -A cron job starts Foostats, reads OpenBSD httpd and relayd access logs, and produces the numbers published at `https://stats.foo.zone` and `gemini://stats.foo.zone`. The dashboards are humble because traffic on my sites is still light, yet the trends are interesting for spotting patterns. The script is opinionated (I am repeating myself here, I know), and I will probably be the only one ever using it for my own sites. However, the code demonstrates how Perl's newer features help keep a small script like this exciting and fun! 
- -On OpenBSD, I've configured the job via the `daily.local` on both of my OpenBSD servers (`fishfinger.buetow.org` and `blowfish.buetow.org` - note one is the master server, the other is the standby server, but the script runs on both and the stats are merged later in the process): - -```sh -fishfinger$ grep foostats /etc/daily.local -perl /usr/local/bin/foostats.pl --parse-logs --replicate --report -``` - -Internally, `Foostats::Logreader` parses each line of the log files `/var/log/daemon*` and `/var/www/logs/access_log*`, turns timestamps into `YYYYMMDD/HHMMSS` values, hashes IP addresses with SHA3 (for anonymization), and hands a normalized event to `Foostats::Filter`. The filter compares the URI against entries in `fooodds.txt`, tracks how many times an IP address requests within the exact second, and drops anything suspicious (e.g., from web crawlers or malicious attackers). Valid events reach `Foostats::Aggregator`, which counts requests per protocol, records unique visitors for the Gemtext and Atom feeds, and remembers page-level IP sets. `Foostats::FileOutputter` writes the result as gzipped JSON files—one per day and per protocol—with IPv4/IPv6 splits, filtered counters, feed readership, and hashes for long URLs. - -Whereas, there are different kinds of feeds: - -* The Atom web-feed -* The same feed via Gemini -* The Gemfeed (a special format popular in the Geminispace) - -### Aggregation and output - -As mentioned, Foostats merges the stats from both hosts, master and standby. For the master-standby setup description, read: - -=> ./2024-04-01-KISS-high-availability-with-OpenBSD.gmi KISS high-availability with OpenBSD - -Those gzipped files land in `stats/`. From there, `Foostats::Replicator` can pull matching files from the partner host (`fishfinger` or `blowfish`) so the view covers both servers, `Foostats::Merger` combines them into daily summaries, and `Foostats::Reporter` rebuilds Gemtext and HTML reports. 
- -Those are the raw stats files: - -=> https://blowfish.buetow.org/foostats/ -=> https://fishfinger.buetow.org/foostats/ - -These are the 30-day reports generated: - -=> gemini://stats.foo.zone stats.foo.zone Gemini capsule dashboard -=> https://stats.foo.zone stats.foo.zone HTTP dashboard - -### Command-line entry points - -`foostats_main` is the command entry point. `--parse-logs` refreshes the gzipped files, `--replicate` runs the cross-host sync, and `--report` rebuilds the HTML and Gemini report pages. `--all` performs everything in one go. Defaults point to `/var/www/htdocs/buetow.org/self/foostats` for data, `/var/gemini/stats.foo.zone` for Gemtext output, and `/var/www/htdocs/gemtexter/stats.foo.zone` for HTML output. Replication always forces the three most recent days worth of the data across HTTPS and leaves older files untouched to save bandwidth. - -`fooodds.txt` is a plain text list of substrings of URLs to be blocked, making it quick to shut down web crawlers. Foostats also detects rapid requests (an indicator of excessive crawling) and blocks the IP. Audit lines are written to `/var/log/fooodds`, which can later be reviewed for false or true positives (I do this around once a month). The `Justfile` even has a `gather-fooodds` target that collects suspicious paths from remote logs so new patterns can be added quickly. - -The complete source lives on Codeberg here: - -=> https://codeberg.org/snonux/foostats foostats on Codeberg - -Now let's go to some new Perl features: - -## Packages as real blocks - -### Scoped packages - -Recent Perl versions allow the block form `package Foo { ... }`. Foostats uses it for every package. Imports stay local to the block, helper subs do not leak into the global symbol table, and configuration happens where the code needs it. 
- -The old way: - -```perl -package foo; - -sub hello { - print "Hello from package foo\n"; -} - -package bar; - -sub hello { - print "Hello from package bar\n"; -} - -1 -``` - -But now it is also possible to do this: - -```perl -package foo { - sub hello { - print "Hello from package foo\n"; - } -} - -package bar { - sub hello { - print "Hello from package bar\n"; - } -} -``` - -## Postfix dereferencing keeps data structures tidy - -### Clear dereferencing - -The script handles nested hashes and arrays. Postfix dereferencing (`$hash->%*`, `$array->@*`) keeps that readable. - -E.g. instead of having to write: - -```perl -for my $elem (@{$array_ref}) { - print "$elem\n"; -} -``` - -one can now do: - -```perl -for my $elem ($array_ref->@*) { - print "$elem\n"; -} -``` - -You see that this feature becomes increasingly useful the with nested data structures, e.g. to print all keys of the nested hash: - -```perl -print for keys $hash->{stats}->%*; -``` - -Loops over like `$stats->{page_ips}->{urls}->%*` or `$merge{$key}->{$_}->%*` show which level of the structure is in play. The merger in Foostats updates host and URL statistics without building temporary arrays, and the reporter code mirrors the layout of the final tables. Before postfix dereferencing, the same code relied on braces within braces and was harder to read. - -## Lexical subs promote local reasoning - -### Helpers that stay local - -Lexical subroutines keep helpers close to the code that needs them. In `Foostats::Logreader::parse_web_logs`, functions such as `my sub parse_date` and `my sub open_file` live only inside that scope. - -## Reference aliasing makes intent explicit - -### Shared data on purpose - -Ref aliasing is enabled with `use feature qw(refaliasing)` and helps communicate intent more clearly (if you remember the Perl syntax, of course. Otherwise, it's like chinese). The filter starts with `\my $uri_path = \$event->{uri_path}` so any later modification touches the original event. 
- -```perl -use feature qw(refaliasing); - -my $hash = { foo => 42 }; -\my $foo = \$hash->{foo}; - -$foo = 99; -print $hash->{foo}; # prints 99 -``` - -The aggregator in Foostats aliases `$self->{stats}{$date_key}` before updating counters, so the structure remains intact. Combined with subroutine signatures, this makes it obvious when a piece of data is shared instead of copied, preventing silent bugs. - -## Persistent state without globals - -A Perl state variable is declared with `state $var` and retains its value between calls to the enclosing subroutine. Foostats uses that for rate limiting and de-duplicated logging. - -This is a small example demonstrating the use of a state variable in Perl: - -```perl -sub counter { - state $count = 0; - $count++; - return $count; -} - -say counter(); # 1 -say counter(); # 2 -say counter(); # 3 -``` - -### Rate limiting state - -In Foostats, `state` variables store run-specific state without using package globals. `state %blocked` remembers IP hashes that already triggered the odd-request filter, and `state $last_time` and `state %count` track how many requests an IP makes in the exact second. Hash and array state variables have been supported since `state` arrived in Perl 5.10, so this code takes advantage of that long-standing capability. However, what's new is that hashes can now also be state variables. - -### De-duplicated logging - -`state %dedup` keeps the log output to one warning per URI. Early versions utilized global hashes for the same tasks, producing inconsistent results during tests. Switching to `state` removed those edge cases. - -## Subroutine signatures clarify every call site - -Perl now supports subroutine signatures like other modern languages do. Foostats uses them everywhere. 
- -```perl -# Old way -sub greet_old { my $name = shift; print "Hello, $name!\n" } - -# Another old way -sub greet_old ($) { $name = shift; print "Hello, $name!\n" } - -# New way -sub greet ($name) { say "Hello, $name!"; } - -greet("Alice"); # prints "Hello, Alice!" - -sub greet ($name) { - say "Hello, $name!"; -} - -greet("Alice"); # prints "Hello, Alice!" -``` - -### "normal" subroutine signatures now - -Subroutine signatures are active throughout foostats. Constructors declare `sub new ($class, $odds_file, $log_path)`, anonymous callbacks expose `sub ($event)`, and helper subs list the values they expect. - -## Defined-or assignment keeps defaults obvious - -### Defaults without boilerplate - -The operator `//=` keeps configuration and counters simple. Environment variables may be missing when cron runs the script, so `//=`, combined with signatures, sets defaults without warnings. - -## `say` is the default voice now - -`say` became the default once the script switched to `use v5.38;`. Log messages such as "Processing $path" or "Writing report to $report_path". It adds a newline to every message printed, comparable to Ruby's `put`. - -## Cleanup with `defer` - -Even though not used in Foostats, this (borrowed from Go?) feature is neat to have in Perl now. - -The `defer` block (`use feature 'defer"`) schedules a piece of code to run when the current scope exits, regardless of how it exits (e.g. normal return, exception). This is perfect for ensuring resources, such as file handles, are closed. `Foostats::Logreader` uses it to make sure log files are always closed, even if parsing fails mid-way. - -```perl -use feature qw(defer); - -sub parse_log_file { - my ($path) = @_; - open my $fh, '<', $path or die "Cannot open $path: $!"; - defer { close $fh }; - - while (my $line = <$fh>) { - # ... parsing logic that might throw an exception ... 
- } - # $fh is automatically closed here -} -``` - -This pattern replaces manual `close` calls in every exit path of the subroutine and is more robust than relying solely on object destructors. - -## Builtins and booleans - -The script also utilises other modern additions that often go unnoticed. `use builtin qw(true false);` combined with `experimental::builtin` provides more real boolean values. - -## Conclusion - -I want to code more in Perl again. The newer features make it a joy to write small scripts like Foostats. If you haven't looked at Perl in a while, give it another try! The main thing which holds me back from writing more Perl is the lack of good tooling. For example, there is no proper LSP and tree sitter support available, which would work as well as for Go and Ruby. - -E-Mail your comments to `paul@nospam.buetow.org` :-) - -Other related posts are: - -=> ./2023-05-01-unveiling-guprecords:-uptime-records-with-raku.gmi 2023-05-01 Unveiling `guprecords.raku`: Global Uptime Records with Raku -=> ./2022-05-27-perl-is-still-a-great-choice.gmi 2022-05-27 Perl is still a great choice -=> ./2011-05-07-perl-daemon-service-framework.gmi 2011-05-07 Perl Daemon (Service Framework) -=> ./2008-06-26-perl-poetry.gmi 2008-06-26 Perl Poetry - -=> ../ Back to the main site diff --git a/gemfeed/DRAFT-perl-new-features-and-foostats.gmi.tpl b/gemfeed/DRAFT-perl-new-features-and-foostats.gmi.tpl deleted file mode 100644 index 42d6fbf7..00000000 --- a/gemfeed/DRAFT-perl-new-features-and-foostats.gmi.tpl +++ /dev/null @@ -1,351 +0,0 @@ -# Perl New Features and Foostats - -Perl just reached rank 10 in the TIOBE index. That headline matches my day-to-day reality because I keep developing the foostats script for simple analytics of my personal websites and Gemini capsules (e.g. `foo.zone`), and almost every Perl release adds new features. 
The book *Perl New Features* by brian d foy documents the changes well; this post shows how those features look in a real program that runs every morning for my stats generation. - -``` -$b="24P7cP3dP31P3bPaP28P24P64P31P2cP24P64P32P2cP24P73P2cP24P67P2cP24P7 -2P29P3dP28P22P31P30P30P30P30P22P2cP22P31P30P30P30P30P30P22P2cP22P4aP75 -P7 3P -74 P2 -0P 41P6eP6fP74P 68P65P72P20P50 P65P72P6cP2 0P48P 61 -P6 3P6bP65P72P22P 29P3bPaP40P6dP 3dP73P70P6cP6 9P74P 20 -P2 fP2fP 2cP22P 2cP2eP3aP21P2 bP2aP 30P4f P40P2 2P -3b PaP24 P6eP3 dP6c P65P6 eP67 P74P6 8P -20 P24P7 3P3bP aP24 P75P3 dP22 P20P2 2P -78 P24P6 eP3bP aPaP 70P72 P69P 6eP74 P2 -0P 22P5c P6eP20 P20P 24P75 P5cP7 2P22P 3b -Pa PaP66P6fP72P2 8P24P7aP20P 3dP20P31P3bP 20P24 P7 -aP 3cP3dP24P6 eP3bP20P24 P7aP2bP2bP 29P20 P7 -bP aPaP9 P77P28P24P6 4P31P29P 3bPaP 9P -24 P72P3 dP69 P6eP74P28 P72P6 1P -6e P64P2 8P24 P6eP2 9P29P 3bPaP 9P -24 P67P3 dP73 P75P6 2P73P 74P72 P2 -0P 24P73 P2cP24P72P2cP 31P3b PaP9P 24P67P20P3fP20 P6 -4P 6fP20 P9P7bP20PaP9P9 P9P9P 9P66P 6fP72P20P28P24 P6 -bP 3dP30 P3bP24P6bP3cP3 9P3bP 24P6bP 2bP2bP29P20P7b Pa -P9 P9 -P9 P9 -P9 P9P73P75P6 2P73 P74P 72P2 8P24P75P2c P24P72 P2 -cP 31P29P3dP24P 6dP5 bP24 P6bP 5dP3bP20Pa P9P9 P9P9 P9 -P9 P70P 72P69 P6eP 74P2 0P22 P20P20P24P 75P 5cP 72 -P2 2P3b PaP9 P9P9 P9P9 P9P7 7P28 P24 P6 4P -32 P29P 3bPa P9P9 P9P9 P9P7 dPaP 9P9 P9 -P9 P9P7 3P75 P62P 73P7 4P72 P28P 24P7 5P -2c P24P 72P2c P31P 29P3 dP24 P67P3bP20P aP9P9 P9 -P9 P7dP20PaP9P 9P3a P20P 72P6 5P64P6fP3b PaP9 P7 -3P 75P62P73P 74P7 2P28 P24P 73P2cP24P7 2P2c P3 -1P 29P3dP2 2P30 P22P 3bPa P9P7 0P7 2P -69 P6eP74P2 0P22 P20P 20P2 4P75 P5c P7 -2P 22P3 bPaPa P7dP aPaP 77P2 0P28 P24 P6 -4P 32P2 9P3bP aP70 P72P 69P6 eP74 P2 0P2 2P -20 P20P 24P75 P20P21P5cP7 2P22P3bPaP 73P6cP65P6 5P7 0P20 P3 -2P 3bPa P70P7 2P69P6eP74P 20P22P20P2 0P24P75P20 P21P 5cP6 eP -22 P3bP aPaP7 3P75P62P2 0P77P20P7b PaP9P24P6c P3dP73 P6 -8P 69 -P6 6P -74P3bPaP9P66P6fP72P28P24P6aP3dP30P3bP24P6aP3cP24P6cP3bP24P6aP2bP2bP29P -7bP7dPaP7dP";$b=~s/\s//g;split 
/P/,$b;foreach(@_){$c.=chr hex};eval $c - -The above Perl scripts prints out "Just Another Perl Hacker !" in an -animation of sorts. - -``` - -<< template::inline::toc - -## Motivation - -I've been running `foo.zone` for a while now, but I've never looked into visitor statistics or analytics. I value privacy—not just my own, but also the privacy of others (the visitors of this site) — so I hesitated to use any off-the-shelf analytics plugins. All I wanted to collect were: - -* Which blog posts had the most (unique) visitors -* Exclude, if possible, any bots and scrapers from the stats -* Track only anonymized IP addresses, never store raw addresses - -With Foostats I've created a Perl script which does that for my highly opinionated website/blog setup: - -=> https://foo.zone/gemfeed/2021-06-05-gemtexter-one-bash-script-to-rule-it-all.html Gemtexter, my static site and Gemini capsule generator -=> https://foo.zone/gemfeed/2024-04-01-KISS-high-availability-with-OpenBSD.html How I host this site highly-available using OpenBSD - -## Why I used Perl - -Even though nowadays I code more in Go and Ruby, I stuck with Perl for Foostats for three simple reasons: - -* I wanted an excuse to explore the newer features of my first programming love. -* Sometimes, I miss Perl. -* Perl ships with OpenBSD (the operating system on which my sites run) by default. -* It really does live up to its Practical Extraction and Report Language (that's where the name Perl means) for this kind of log grinding I did with foostats. - -=> https://developers.slashdot.org/story/25/09/14/0134239/is-perl-the-worlds-10th-most-popular-programming-language Perl re-enters the top ten -=> https://perlschool.com/books/perl-new-features/ Perl New Features by Joshua McAdams and brian d foy - -## Inside foostats - -Foostats is simply a log file analyser, which analyses the OpenBSD httpd and relayd logs. 
- -### Log pipeline - -A CRON job starts Foostats, reads OpenBSD httpd and relayd access logs, and produces the numbers published at `https://stats.foo.zone` and `gemini://stats.foo.zone`. The dashboards are humble because traffic on my sites is still light, yet the trends are interesting for spotting patterns. The script is opinionated (I am repeating myself here, I know), and I will probably be the only one ever using it for my own sites. However, the code demonstrates how Perl's newer features help keep a small script like this exciting and fun! - -On OpenBSD, I've configured the job via the `daily.local` on both of my OpenBSD servers (`fishfinger.buetow.org` and `blowfish.buetow.org` - note one is the master server, the other is the standby server, but the script runs on both and the stats are merged later in the process): - -```sh -fishfinger$ grep foostats /etc/daily.local -perl /usr/local/bin/foostats.pl --parse-logs --replicate --report -``` - -Internally, `Foostats::Logreader` parses each line of the log files `/var/log/daemon*` and `/var/www/logs/access_log*`, turns timestamps into `YYYYMMDD/HHMMSS` values, hashes IP addresses with SHA3 (for anonymization), and hands a normalized event to `Foostats::Filter`. The filter compares the URI against entries in `fooodds.txt`, tracks how many times an IP address requests within the exact second, and drops anything suspicious (e.g., from web crawlers or malicious attackers). Valid events reach `Foostats::Aggregator`, which counts requests per protocol, records unique visitors for the Gemtext and Atom feeds, and remembers page-level IP sets. `Foostats::FileOutputter` writes the result as gzipped JSON files—one per day and per protocol—with IPv4/IPv6 splits, filtered counters, feed readership, and hashes for long URLs. 
- -Whereas, there are different kinds of feeds: - -* The Atom web-feed -* The same feed via Gemini -* The Gemfeed (a special format popular in the Geminispace) - -### Aggregation and output - -As mentioned, Foostats merges the stats from both hosts, master and standby. For the master-standby setup description, read: - -=> ./2024-04-01-KISS-high-availability-with-OpenBSD.gmi KISS high-availability with OpenBSD - -Those gzipped files land in `stats/`. From there, `Foostats::Replicator` can pull matching files from the partner host (`fishfinger` or `blowfish`) so the view covers both servers, `Foostats::Merger` combines them into daily summaries, and `Foostats::Reporter` rebuilds Gemtext and HTML reports. - -Those are the raw stats files: - -=> https://blowfish.buetow.org/foostats/ -=> https://fishfinger.buetow.org/foostats/ - -These are the 30-day reports generated: - -=> gemini://stats.foo.zone stats.foo.zone Gemini capsule dashboard -=> https://stats.foo.zone stats.foo.zone HTTP dashboard - -### Command-line entry points - -`foostats_main` is the command entry point. `--parse-logs` refreshes the gzipped files, `--replicate` runs the cross-host sync, and `--report` rebuilds the HTML and Gemini report pages. `--all` performs everything in one go. Defaults point to `/var/www/htdocs/buetow.org/self/foostats` for data, `/var/gemini/stats.foo.zone` for Gemtext output, and `/var/www/htdocs/gemtexter/stats.foo.zone` for HTML output. Replication always forces the three most recent days worth of the data across HTTPS and leaves older files untouched to save bandwidth. - -`fooodds.txt` is a plain text list of substrings of URLs to be blocked, making it quick to shut down web crawlers. Foostats also detects rapid requests (an indicator of excessive crawling) and blocks the IP. Audit lines are written to `/var/log/fooodds`, which can later be reviewed for false or true positives (I do this around once a month). 
The `Justfile` even has a `gather-fooodds` target that collects suspicious paths from remote logs so new patterns can be added quickly. - -The complete source lives on Codeberg here: - -=> https://codeberg.org/snonux/foostats foostats on Codeberg - -Now let's go to some new Perl features: - -## Packages as real blocks - -### Scoped packages - -Recent Perl versions allow the block form `package Foo { ... }`. Foostats uses it for every package. Imports stay local to the block, helper subs do not leak into the global symbol table, and configuration happens where the code needs it. - -The old way: - -```perl -package foo; - -sub hello { - print "Hello from package foo\n"; -} - -package bar; - -sub hello { - print "Hello from package bar\n"; -} - -1 -``` - -But now it is also possible to do this: - -```perl -package foo { - sub hello { - print "Hello from package foo\n"; - } -} - -package bar { - sub hello { - print "Hello from package bar\n"; - } -} -``` - -## Postfix dereferencing keeps data structures tidy - -### Clear dereferencing - -The script handles nested hashes and arrays. Postfix dereferencing (`$hash->%*`, `$array->@*`) keeps that readable. - -E.g. instead of having to write: - -```perl -for my $elem (@{$array_ref}) { - print "$elem\n"; -} -``` - -one can now do: - -```perl -for my $elem ($array_ref->@*) { - print "$elem\n"; -} -``` - -You see that this feature becomes increasingly useful the with nested data structures, e.g. to print all keys of the nested hash: - -```perl -print for keys $hash->{stats}->%*; -``` - -Loops over like `$stats->{page_ips}->{urls}->%*` or `$merge{$key}->{$_}->%*` show which level of the structure is in play. The merger in Foostats updates host and URL statistics without building temporary arrays, and the reporter code mirrors the layout of the final tables. Before postfix dereferencing, the same code relied on braces within braces and was harder to read. 
-
-## `say` is the default voice now
-
-`say` became the default once the script switched to `use v5.38;`. Log messages such as "Processing $path" or "Writing report to $report_path". It adds a newline to every message printed, comparable to Ruby's `put`:
-
-```perl
-use v5.38;
-
-print "Hello, world!\n"; # old way
-
-say "Hello, world!"; # new way
-```
-
-## Lexical subs promote local reasoning
-
-### Helpers that stay local
-
-Lexical subroutines keep helpers close to the code that needs them. In `Foostats::Logreader::parse_web_logs`, functions such as `my sub parse_date` and `my sub open_file` live only inside that scope.
-
-## Reference aliasing makes intent explicit
-
-### Shared data
-
-Reference aliasing can be enabled with `use feature qw(refaliasing)` and helps communicate intent more clearly (if you remember the Perl syntax, of course. Otherwise, it's like Chinese). The filter starts with `\my $uri_path = \$event->{uri_path}` so any later modification touches the original event. This is an example with ref aliasing in action:
-
-```perl
-use feature qw(refaliasing);
-
-my $hash = { foo => 42 };
-\my $foo = \$hash->{foo};
-
-$foo = 99;
-print $hash->{foo}; # prints 99
-```
-
-The aggregator in Foostats aliases `$self->{stats}{$date_key}` before updating counters, so the structure remains intact. Combined with subroutine signatures, this makes it obvious when a piece of data is shared instead of copied, preventing silent bugs.
-
-## Persistent state without globals
-
-A Perl state variable is declared with `state $var` and retains its value between calls to the enclosing subroutine. Foostats uses that for rate limiting and de-duplicated logging.
-
-This is a small example demonstrating the use of a state variable in Perl:
-
-```perl
-sub counter {
-    state $count = 0;
-    $count++;
-    return $count;
-}
-
-say counter(); # 1
-say counter(); # 2
-say counter(); # 3
-```
-
-### Rate limiting state
-
-In Foostats, `state` variables store run-specific state without using package globals. `state %blocked` remembers IP hashes that already triggered the odd-request filter, and `state $last_time` and `state %count` track how many requests an IP makes in the exact second. Hash and array state variables have been supported since `state` arrived in Perl 5.10, so this code takes advantage of that long-standing capability. However, what's new is that hashes can now also be state variables.
-
-### De-duplicated logging
-
-`state %dedup` keeps the log output of the suspicious calls to one warning per URI. Early versions utilized global hashes for the same tasks, producing inconsistent results during tests. Switching to `state` removed those edge cases.
-
-## Subroutine signatures
-
-Perl now supports subroutine signatures like other modern languages do. Foostats uses them everywhere. Examples:
-
-```perl
-# Old way
-sub greet_old { my $name = shift; print "Hello, $name!\n" }
-
-# Another old way
-sub greet_old ($) { $name = shift; print "Hello, $name!\n" }
-
-# New way
-sub greet ($name) { say "Hello, $name!"; }
-
-greet("Alice"); # prints "Hello, Alice!"
-
-sub greet ($name) {
-    say "Hello, $name!";
-}
-
-greet("Alice"); # prints "Hello, Alice!"
-```
-
-In Foostats, constructors declare `sub new ($class, $odds_file, $log_path)`, anonymous callbacks expose `sub ($event)`, and helper subs list the values they expect, e.g.:
-
-```perl
-my $anon = sub ($name) {
-    say "Hello, $name!";
-};
-
-$anon->("World"); # prints "Hello, World!"
-```
-
-## Defined-or assignment keeps defaults obvious
-
-### Defaults without boilerplate
-
-The operator `//=` keeps configuration and counters simple. Environment variables may be missing when CRON runs the script, so `//=`, combined with signatures, sets defaults without warnings. Example use of that operator:
-
-```perl
-my $foo;
-$foo //= 42;
-say $foo; # prints 42
-
-$foo //= 99;
-say $foo; # still prints 42, because $foo was already defined
-```
-
-## Cleanup with `defer`
-
-Even though not used in Foostats, this (borrowed from Go?) feature is neat to have in Perl now.
-
-The `defer` block (`use feature 'defer"`) schedules a piece of code to run when the current scope exits, regardless of how it exits (e.g. normal return, exception). This is perfect for ensuring resources, such as file handles, are closed. `Foostats::Logreader` uses it to make sure log files are always closed, even if parsing fails mid-way.
-
-```perl
-use feature qw(defer);
-
-sub parse_log_file {
-    my ($path) = @_;
-    open my $fh, '<', $path or die "Cannot open $path: $!";
-    defer { close $fh };
-
-    while (my $line = <$fh>) {
-        # ... parsing logic that might throw an exception ...
-    }
-    # $fh is automatically closed here
-}
-```
-
-This pattern replaces manual `close` calls in every exit path of the subroutine and is more robust than relying solely on object destructors.
-
-## Builtins and booleans
-
-The script also utilizes other modern additions that often go unnoticed. `use builtin qw(true false);` combined with `experimental::builtin` provides more real boolean values.
-
-## Conclusion
-
-I want to code more in Perl again. The newer features make it a joy to write small scripts like Foostats. If you haven't looked at Perl in a while, give it another try! The main thing which holds me back from writing more Perl is the lack of good tooling. For example, there is no proper LSP and tree sitter support available, which would work as good as the ones available for Go and Ruby.
- -E-Mail your comments to `paul@nospam.buetow.org` :-) - -Other related posts are: - -<< template::inline::rindex perl raku - -=> ../ Back to the main site diff --git a/gemfeed/atom.xml b/gemfeed/atom.xml index 49bd2103..865734a5 100644 --- a/gemfeed/atom.xml +++ b/gemfeed/atom.xml @@ -1,11 +1,471 @@ - 2025-10-28T20:14:24+02:00 + 2025-11-01T16:10:35+02:00 foo.zone feed To be in the .zone! gemini://foo.zone/ + + Perl New Features and Foostats + + gemini://foo.zone/gemfeed/2025-11-02-perl-new-features-and-foostats.gmi + 2025-11-01T16:10:35+02:00 + + Paul Buetow aka snonux + paul@dev.buetow.org + + Perl recently reached rank 10 in the TIOBE index. That headline made me write this blog post as I was developing the Foostats script for simple analytics of my personal websites and Gemini capsules (e.g. `foo.zone`) and there were a couple of new features added to the Perl language over the last releases. The book *Perl New Features* by brian d foy documents the changes well; this post shows how those features look in a real program that runs every morning for my stats generation. + +
+

Perl New Features and Foostats


+
+Perl recently reached rank 10 in the TIOBE index. That headline prompted this blog post: I had been developing Foostats, a script for simple analytics of my personal websites and Gemini capsules (e.g. foo.zone), and it uses several features added to the Perl language over the last few releases. The book *Perl New Features* by brian d foy documents the changes well; this post shows how those features look in a real program that runs every morning to generate my stats.
+
+Perl re-enters the top ten
+Perl New Features by Joshua McAdams and brian d foy
+
+
+$b="24P7cP3dP31P3bPaP28P24P64P31P2cP24P64P32P2cP24P73P2cP24P67P2cP24P7
+2P29P3dP28P22P31P30P30P30P30P22P2cP22P31P30P30P30P30P30P22P2cP22P4aP75
+P7                                                                  3P
+74                                                                  P2
+0P  41P6eP6fP74P     68P65P72P20P50 P65P72P6cP2     0P48P           61
+P6  3P6bP65P72P22P   29P3bPaP40P6dP 3dP73P70P6cP6   9P74P           20
+P2  fP2fP    2cP22P  2cP2eP3aP21P2  bP2aP    30P4f  P40P2           2P
+3b  PaP24      P6eP3 dP6c           P65P6      eP67 P74P6           8P
+20  P24P7      3P3bP aP24           P75P3      dP22 P20P2           2P
+78  P24P6      eP3bP aPaP           70P72      P69P 6eP74           P2
+0P  22P5c    P6eP20  P20P           24P75    P5cP7  2P22P           3b
+Pa  PaP66P6fP72P2    8P24P7aP20P    3dP20P31P3bP    20P24           P7
+aP  3cP3dP24P6       eP3bP20P24     P7aP2bP2bP      29P20           P7
+bP  aPaP9            P77P28P24P6    4P31P29P        3bPaP           9P
+24  P72P3            dP69           P6eP74P28       P72P6           1P
+6e  P64P2            8P24           P6eP2 9P29P     3bPaP           9P
+24  P67P3            dP73           P75P6  2P73P    74P72           P2
+0P  24P73            P2cP24P72P2cP  31P3b   PaP9P   24P67P20P3fP20  P6
+4P  6fP20            P9P7bP20PaP9P9 P9P9P    9P66P  6fP72P20P28P24  P6
+bP  3dP30            P3bP24P6bP3cP3 9P3bP    24P6bP 2bP2bP29P20P7b  Pa
+P9                                                                  P9
+P9                                                                  P9
+P9  P9P73P75P6     2P73   P74P  72P2       8P24P75P2c     P24P72    P2
+cP  31P29P3dP24P   6dP5   bP24  P6bP       5dP3bP20Pa   P9P9  P9P9  P9
+P9  P70P    72P69  P6eP   74P2  0P22       P20P20P24P  75P      5cP 72
+P2  2P3b      PaP9 P9P9   P9P9  P9P7       7P28       P24        P6 4P
+32  P29P      3bPa P9P9   P9P9  P9P7       dPaP       9P9           P9
+P9  P9P7      3P75 P62P   73P7  4P72       P28P        24P7         5P
+2c  P24P    72P2c  P31P   29P3  dP24       P67P3bP20P   aP9P9       P9
+P9  P7dP20PaP9P    9P3a   P20P  72P6       5P64P6fP3b      PaP9     P7
+3P  75P62P73P      74P7   2P28  P24P       73P2cP24P7        2P2c   P3
+1P  29P3dP2        2P30   P22P  3bPa       P9P7                0P7  2P
+69  P6eP74P2       0P22   P20P  20P2       4P75                 P5c P7
+2P  22P3 bPaPa     P7dP   aPaP  77P2       0P28                 P24 P6
+4P  32P2  9P3bP    aP70   P72P  69P6       eP74       P2        0P2 2P
+20  P20P   24P75   P20P21P5cP7  2P22P3bPaP 73P6cP65P6 5P7     0P20  P3
+2P  3bPa    P70P7  2P69P6eP74P  20P22P20P2 0P24P75P20  P21P  5cP6   eP
+22  P3bP     aPaP7  3P75P62P2   0P77P20P7b PaP9P24P6c    P3dP73     P6
+8P                                                                  69
+P6                                                                  6P
+74P3bPaP9P66P6fP72P28P24P6aP3dP30P3bP24P6aP3cP24P6cP3bP24P6aP2bP2bP29P
+7bP7dPaP7dP";$b=~s/\s//g;split /P/,$b;foreach(@_){$c.=chr hex};eval $c
+
+The above Perl script prints out "Just Another Perl Hacker !" in an
+animation of sorts.
+
+
+
+

Table of Contents


+
+
+

Motivation


+
+I've been running foo.zone for a while now, but I've never looked into visitor statistics or analytics. I value privacy, not just my own but also that of this site's visitors, so I hesitated to use any off-the-shelf analytics plugins. My requirements were simple:
+
+
    +
  • Which blog posts had the most (unique) visitors
  • +
  • Exclude, if possible, any bots and scrapers from the stats
  • +
  • Track only anonymized IP addresses, never store raw addresses
  • +

+With Foostats I've created a Perl script which does that for my highly opinionated website/blog setup, which consists of:
+
+Gemtexter, my static site and Gemini capsule generator
+How I host this site highly-available using OpenBSD
+
+

Why I used Perl


+
+Even though nowadays I code more in Go and Ruby, I stuck with Perl for Foostats for four simple reasons:
+
+
    +
  • I wanted an excuse to explore the newer features of my first programming love.
  • +
  • Sometimes, I miss Perl.
  • +
  • Perl ships with OpenBSD (the operating system on which my sites run) by default.
  • +
  • Perl really does live up to its name (Practical Extraction and Report Language) for the kind of log grinding I do with Foostats.
  • +

+

Inside Foostats


+
+Foostats is simply a log file analyser, which analyses the OpenBSD httpd and relayd logs.
+
+https://man.openbsd.org/httpd.8
+https://man.openbsd.org/relayd.8
+
+

Log pipeline


+
+A CRON job starts Foostats, which reads the OpenBSD httpd and relayd access logs and produces the numbers published at https://stats.foo.zone and gemini://stats.foo.zone. The dashboards are humble because traffic on my sites is still light, yet the trends are interesting for spotting patterns. The script is opinionated (I am repeating myself here, I know), and I will probably be the only one ever using it. However, the code demonstrates how Perl's newer features help keep a small script like this exciting and fun!
+
+Foostats (HTTP)
+Foostats (Gemini)
+
+On OpenBSD, I've configured the job via the daily.local on both of my OpenBSD servers (fishfinger.buetow.org and blowfish.buetow.org - note one is the master server, the other is the standby server, but the script runs on both and the stats are merged later in the process):
+
+ +
fishfinger$ grep foostats /etc/daily.local
+perl /usr/local/bin/foostats.pl --parse-logs --replicate --report
+
+
+Internally, Foostats::Logreader parses each line of the log files /var/log/daemon* and /var/www/logs/access_log*, turns timestamps into YYYYMMDD/HHMMSS values, hashes IP addresses with SHA3 (for anonymization), and hands a normalized event to Foostats::Filter. The filter compares the URI against entries in fooodds.txt, tracks how many requests an IP address makes within the same second, and drops anything suspicious (e.g., from web crawlers or malicious attackers). Valid events reach Foostats::Aggregator, which counts requests per protocol, records unique visitors for the Gemtext and Atom feeds, and remembers page-level IP sets. Foostats::FileOutputter writes the result as gzipped JSON files (one per day and per protocol) with IPv4/IPv6 splits, filtered counters, feed readership, and hashes for long URLs.
+
+

fooodds.txt


+
+fooodds.txt is a plain text list of substrings of URLs to be blocked, making it quick to shut down web crawlers. Foostats also detects rapid requests (an indicator of excessive crawling) and blocks the IP. Audit lines are written to /var/log/fooodds, which can later be reviewed for false or true positives (I do this around once a month). The Justfile even has a gather-fooodds target that collects suspicious paths from remote logs so new patterns can be added quickly.
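The substring check described above can be sketched roughly as follows. This is not the actual Foostats code: the pattern list and the helper name find_odd_pattern are made up for illustration, and the real fooodds.txt entries are not shown in this post.

```perl
use strict;
use warnings;
use feature qw(say signatures);
no warnings qw(experimental::signatures);

# Hypothetical patterns; in Foostats they would be read from fooodds.txt.
my @odd_patterns = ('/wp-login.php', '/.env', '/cgi-bin/');

# Returns the first pattern contained in the URI path, or undef if none matches.
sub find_odd_pattern ($uri_path, @patterns) {
    for my $pattern (@patterns) {
        # index() performs a plain substring search, no regex involved
        return $pattern if index($uri_path, $pattern) >= 0;
    }
    return;
}

for my $uri ('/gemfeed/index.html', '/blog/wp-login.php?x=1') {
    my $hit = find_odd_pattern($uri, @odd_patterns);
    say defined $hit ? "BLOCK $uri (matched $hit)" : "OK    $uri";
}
```

Plain substring matching keeps the list easy to maintain by hand, at the cost of occasionally matching more than intended.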
+
+

Feed kinds


+
+There are different kinds of feeds being tracked by Foostats:
+
+
    +
  • The Atom web-feed
  • +
  • The same feed via Gemini
  • +
  • The Gemfeed (a special format popular in the Geminispace)
  • +

+

Aggregation and output


+
+As mentioned, Foostats merges the stats from both hosts, master and standby. For the master-standby setup description, read:
+
+KISS high-availability with OpenBSD
+
+Those gzipped files land in stats/. From there, Foostats::Replicator can pull matching files from the partner host (fishfinger or blowfish) so the view covers both servers, Foostats::Merger combines them into daily summaries, and Foostats::Reporter rebuilds Gemtext and HTML reports.
+
+Those are the raw stats files:
+
+https://blowfish.buetow.org/foostats/
+https://fishfinger.buetow.org/foostats/
+
+These are the 30-day reports generated (already linked earlier in this post, but adding here again for clarity):
+
+stats.foo.zone Gemini capsule dashboard
+stats.foo.zone HTTP dashboard
+
+

Command-line entry points


+
+foostats_main is the command entry point. --parse-logs refreshes the gzipped files, --replicate runs the cross-host sync, and --report rebuilds the HTML and Gemini report pages. --all performs everything in one go. Defaults point to /var/www/htdocs/buetow.org/self/foostats for data, /var/gemini/stats.foo.zone for Gemtext output, and /var/www/htdocs/gemtexter/stats.foo.zone for HTML output. Replication always forces the three most recent days' worth of data across HTTPS and leaves older files untouched to save bandwidth.
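How such flags could be parsed is sketched below with the core module Getopt::Long. Only the flag names come from this post; whether Foostats actually uses Getopt::Long is an assumption, and parse_cli is a hypothetical helper.

```perl
use strict;
use warnings;
use feature qw(say signatures);
no warnings qw(experimental::signatures);
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical option parser; returns a hash reference of enabled steps.
sub parse_cli (@argv) {
    my %opt;
    GetOptionsFromArray(\@argv, \%opt, 'parse-logs', 'replicate', 'report', 'all')
        or die "Usage: foostats.pl [--parse-logs] [--replicate] [--report] [--all]\n";

    # --all simply switches on every individual step
    if ($opt{all}) {
        $opt{$_} = 1 for qw(parse-logs replicate report);
    }

    # Steps that were not requested default to 0
    $opt{$_} //= 0 for qw(parse-logs replicate report);
    return \%opt;
}

my $opt = parse_cli('--parse-logs', '--report');
say "$_: $opt->{$_}" for qw(parse-logs replicate report);
```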
+
+The complete source lives on Codeberg here:
+
+Foostats on Codeberg
+
+Now let's go to some new Perl features:
+
+

Packages as real blocks


+
+

Scoped packages


+
+Perl 5.14 and later allow the block form package Foo { ... }. Foostats uses it for every package. Imports stay local to the block, helper subs do not leak into the global symbol table, and configuration happens where the code needs it.
+
+The old way:
+
+ +
package foo;
+
+sub hello {
+    print "Hello from package foo\n";
+}
+
+package bar;
+
+sub hello {
+    print "Hello from package bar\n";
+}
+
+1
+
+
+But now it is also possible to do this:
+
+ +
package foo {
+    sub hello {
+        print "Hello from package foo\n";
+    }
+}
+
+package bar {
+    sub hello {
+        print "Hello from package bar\n";
+    }
+}
+
+
+

Postfix dereferencing keeps data structures tidy


+
+

Clear dereferencing


+
+The script handles nested hashes and arrays. Postfix dereferencing ($hash->%*, $array->@*) keeps that readable.
+
+E.g. instead of having to write:
+
+ +
for my $elem (@{$array_ref}) {
+    print "$elem\n";
+}
+
+
+one can now do:
+
+ +
for my $elem ($array_ref->@*) {
+    print "$elem\n";
+}
+
+
+You see that this feature becomes increasingly useful with nested data structures, e.g. to print all keys of the nested hash:
+
+ +
print for keys $hash->{stats}->%*;
+
+
+Loops over expressions like $stats->{page_ips}->{urls}->%* or $merge{$key}->{$_}->%* show which level of the structure is in play. The merger in Foostats updates host and URL statistics without building temporary arrays, and the reporter code mirrors the layout of the final tables. Before postfix dereferencing, the same code relied on braces within braces and was harder to read.
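To make the merging idea more concrete, here is a minimal sketch of summing per-host URL counters with postfix dereferencing. The data layout and the name merge_url_counts are assumptions for illustration; the real Foostats structures are richer.

```perl
use strict;
use warnings;
use feature qw(say signatures);
no warnings qw(experimental::signatures);

# Hypothetical per-host stats (URL => hits), one hash per server.
my $fishfinger = { urls => { '/index.html' => 3, '/about.html' => 1 } };
my $blowfish   = { urls => { '/index.html' => 2 } };

# Sums the URL counters of all given hosts into one hash reference.
sub merge_url_counts (@hosts) {
    my %merged;
    for my $host (@hosts) {
        # ->%* dereferences the inner hash without wrapping it in %{ ... }
        for my $url (keys $host->{urls}->%*) {
            $merged{$url} += $host->{urls}{$url};
        }
    }
    return \%merged;
}

my $merged = merge_url_counts($fishfinger, $blowfish);
say "$_ => $merged->{$_}" for sort keys $merged->%*;
```

The sketch assumes Perl 5.24 or newer, where postfix dereferencing is enabled by default.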
+
+

say is the default voice now


+
+say became the default once the script switched to use v5.38;. It adds a newline to every message printed, comparable to Ruby's puts, making log messages like "Processing $path" or "Writing report to $report_path" cleaner:
+
+ +
use v5.38;
+
+print "Hello, world!\n";    # old way
+
+say "Hello, world!";        # new way
+
+
+

Lexical subs promote local reasoning


+
+Lexical subroutines keep helpers close to the code that needs them. In Foostats::Logreader::parse_web_logs, functions such as my sub parse_date and my sub open_file live only inside that scope.
+
+This is an example of a lexical sub named trim, which is only visible within the outer sub named process_lines:
+
+ +
use v5.38;
+
+sub process_lines {
+    my @lines = @_;
+
+    my sub trim ($str) {
+        $str =~ s/^\s+|\s+$//gr;
+    }
+
+    return [ map { trim($_) } @lines ];
+}
+
+my @raw = ("  foo  ", " bar", "baz ");
+my $cleaned = process_lines(@raw);
+say for @$cleaned; # prints "foo", "bar", "baz"
+
+
+

Reference aliasing makes intent explicit


+
+Reference aliasing can be enabled with use feature qw(refaliasing) and helps communicate intent more clearly (if you remember the Perl syntax, of course—otherwise, it can look rather cryptic). The filter starts with \my $uri_path = \$event->{uri_path} so any later modification touches the original event. This is an example with ref aliasing in action:
+
+ +
use feature qw(refaliasing);
+no warnings qw(experimental::refaliasing);
+
+my $hash = { foo => 42 };
+\my $foo = \$hash->{foo};
+
+$foo = 99;
+print $hash->{foo}; # prints 99
+
+
+The aggregator in Foostats aliases $self->{stats}{$date_key} before updating counters, so the updates land in the original structure rather than in a copy. Combined with subroutine signatures, this makes it obvious when a piece of data is shared instead of copied, which helps prevent silent bugs. It also gives long nested data structures shorter names.
+
+

Persistent state without globals


+
+A Perl state variable is declared with state $var and retains its value between calls to the enclosing subroutine. Foostats uses that for rate limiting and de-duplicated logging.
+
+This is a small example demonstrating the use of a state variable in Perl:
+
+ +
use v5.38;
+
+sub counter {
+    state $count = 0;
+    $count++;
+    return $count;
+}
+
+say counter(); # 1
+say counter(); # 2
+say counter(); # 3
+
+
+Scalar state variables have been supported since state arrived in Perl 5.10. Array and hash state variables could be declared from the start as well, but initializing them directly in the declaration only became possible with Perl 5.28.
+
+

Rate limiting state


+
+In Foostats, state variables store run-specific state without using package globals. state %blocked remembers IP hashes that already triggered the odd-request filter, and state $last_time and state %count track how many requests an IP makes within the same second.
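A rough sketch of such a per-second rate limiter follows. The variable names mirror the ones mentioned above, but the function too_fast, the threshold of 3 and the overall logic are assumptions, not the real Foostats filter.

```perl
use strict;
use warnings;
use feature qw(say state signatures);
no warnings qw(experimental::signatures);

# Hypothetical limit; the threshold Foostats uses is not stated in this post.
my $MAX_PER_SECOND = 3;

# Returns 1 if the event should be dropped, 0 otherwise.
sub too_fast ($ip_hash, $time) {
    state $last_time = '';
    state %count;      # requests per IP hash within the current second
    state %blocked;    # IP hashes that already exceeded the limit

    # A new second started: forget the per-second counters
    if ($time ne $last_time) {
        %count     = ();
        $last_time = $time;
    }

    return 1 if $blocked{$ip_hash};
    if (++$count{$ip_hash} > $MAX_PER_SECOND) {
        $blocked{$ip_hash} = 1;
        return 1;
    }
    return 0;
}

say too_fast('abc123', '120000') for 1 .. 5;    # prints 0, 0, 0, 1, 1 (one per line)
```

All three state variables live inside the sub, so no other code can accidentally reset or read them.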
+
+

De-duplicated logging


+
+state %dedup limits the log output for suspicious calls to one warning per URI. Early versions used global hashes for the same tasks, which produced inconsistent results during tests. Switching to state removed those edge cases.
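The warn-once idea can be sketched in a few lines. The helper name warn_once and the message format are made up for this example.

```perl
use strict;
use warnings;
use feature qw(say state signatures);
no warnings qw(experimental::signatures);

# Returns 1 if a warning was emitted, 0 if this URI was already reported.
sub warn_once ($uri) {
    state %dedup;

    # The post-increment yields 0 (false) only on the first sighting of a URI
    return 0 if $dedup{$uri}++;

    say "WARN suspicious request: $uri";
    return 1;
}

# Emits two warnings, the repeated URI stays silent
warn_once($_) for qw(/wp-login.php /.env /wp-login.php);
```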
+
+

Subroutine signatures


+
+Perl now supports subroutine signatures like other modern languages do. Foostats uses them everywhere. Examples:
+
+ +
# Old way
+sub greet_old { my $name = shift; print "Hello, $name!\n" }
+
+# Another old way
+sub greet_old2 ($) { my $name = shift; print "Hello, $name!\n" }
+
+# New way
+sub greet ($name) { say "Hello, $name!"; }
+
+greet("Alice"); # prints "Hello, Alice!"
+
+
+In Foostats, constructors declare sub new ($class, $odds_file, $log_path), anonymous callbacks expose sub ($event), and helper subs list the values they expect, e.g.:
+
+ +
my $anon = sub ($name) {
+    say "Hello, $name!";
+};
+
+$anon->("World"); # prints "Hello, World!"
+
+
+

Defined-or assignment for defaults without boilerplate


+
+The operator //= keeps configuration and counters simple. Environment variables may be missing when CRON runs the script, so //=, combined with signatures, sets defaults without warnings. Example use of that operator:
+
+ +
my $foo;
+$foo //= 42;
+say $foo; # prints 42
+
+$foo //= 99;
+say $foo; # still prints 42, because $foo was already defined
+
+
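Applied to a possibly missing environment variable, the same operator might be used like this. FOOSTATS_DIR is a made-up variable name and stats_dir a hypothetical helper; only the default path is taken from this post.

```perl
use strict;
use warnings;
use feature qw(say signatures);
no warnings qw(experimental::signatures);

# Takes the environment as a hash reference so the sketch is easy to test.
sub stats_dir ($env = \%ENV) {
    my $dir = $env->{FOOSTATS_DIR};

    # //= only assigns when the left side is undef, so '' or 0 would be kept
    $dir //= '/var/www/htdocs/buetow.org/self/foostats';
    return $dir;
}

say stats_dir({});                                  # falls back to the default
say stats_dir({ FOOSTATS_DIR => '/tmp/stats' });    # keeps the explicit value
```

Unlike ||=, the defined-or form does not overwrite values that are defined but false.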
+

Cleanup with defer


+
+Even though not used in Foostats, this feature (similar to Go's defer) is neat to have in Perl now.
+
+The defer block (use feature 'defer') schedules a piece of code to run when the current scope exits, regardless of how it exits (e.g. normal return, exception). This is perfect for ensuring resources, such as file handles, are closed.
+
+ +
use feature qw(defer);
+
+sub parse_log_file {
+    my ($path) = @_;
+    open my $fh, '<', $path or die "Cannot open $path: $!";
+    defer { close $fh };
+
+    while (my $line = <$fh>) {
+        # ... parsing logic that might throw an exception ...
+    }
+    # $fh is automatically closed here
+}
+
+
+This pattern replaces manual close calls in every exit path of the subroutine and is more robust than relying solely on object destructors.
+
+

Builtins and booleans


+
+The script also uses other modern additions that often go unnoticed. use builtin qw(true false); provides real boolean values; on Perl 5.36 and 5.38 these functions are still experimental, so the experimental::builtin warning category has to be silenced.
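A small sketch of what real booleans look like in practice, assuming Perl 5.36 or newer. The sub over_limit is invented for this example, and is_bool is imported in addition to the two functions mentioned above.

```perl
use v5.36;
use builtin qw(true false is_bool);
no warnings qw(experimental::builtin);

# Returns a real boolean rather than 1 or the empty string.
sub over_limit ($count, $limit) {
    return $count > $limit ? true : false;
}

my $flag = over_limit(5, 3);
say is_bool($flag) ? 'real boolean' : 'plain scalar';    # real boolean
say is_bool(1)     ? 'real boolean' : 'plain scalar';    # plain scalar
```

The value keeps its boolean nature when copied into $flag, which is what is_bool reports here.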
+
+

Conclusion


+
+I want to code more in Perl again. The newer features make it a joy to write small scripts like Foostats. If you haven't looked at Perl in a while, give it another try! The main thing that holds me back from writing more Perl is the lack of good tooling. For example, there is no LSP or Tree-sitter support that works as well as what is available for Go and Ruby.
+
+E-Mail your comments to paul@nospam.buetow.org :-)
+
+Other related posts are:
+
+2023-05-01 Unveiling guprecords.raku: Global Uptime Records with Raku
+2022-05-27 Perl is still a great choice
+2011-05-07 Perl Daemon (Service Framework)
+2008-06-26 Perl Poetry
+
+Back to the main site
+
+
+
Key Takeaways from The Well-Grounded Rubyist @@ -14400,81 +14860,6 @@ http://www.gnu.org/software/src-highlite -->
Paul

-Back to the main site
- - -
- - Site Reliability Engineering - Part 1: SRE and Organizational Culture - - gemini://foo.zone/gemfeed/2023-08-18-site-reliability-engineering-part-1.gmi - 2023-08-18T22:43:47+03:00 - - Paul Buetow aka snonux - paul@dev.buetow.org - - Being a Site Reliability Engineer (SRE) is like stepping into a lively, ever-evolving universe. The world of SRE mixes together different tech, a unique culture, and a whole lot of determination. It’s one of the toughest but most exciting jobs out there. There's zero chance of getting bored because there's always a fresh challenge to tackle and new technology to play around with. It's not just about the tech side of things either; it's heavily rooted in communication, collaboration, and teamwork. As someone currently working as an SRE, I’m here to break it all down for you in this blog series. Let's dive into what SRE is really all about! - -
-

Site Reliability Engineering - Part 1: SRE and Organizational Culture


-
-Published at 2023-08-18T22:43:47+03:00
-
-Being a Site Reliability Engineer (SRE) is like stepping into a lively, ever-evolving universe. The world of SRE mixes together different tech, a unique culture, and a whole lot of determination. It’s one of the toughest but most exciting jobs out there. There's zero chance of getting bored because there's always a fresh challenge to tackle and new technology to play around with. It's not just about the tech side of things either; it's heavily rooted in communication, collaboration, and teamwork. As someone currently working as an SRE, I’m here to break it all down for you in this blog series. Let's dive into what SRE is really all about!
-
-2023-08-18 Site Reliability Engineering - Part 1: SRE and Organizational Culture (You are currently reading this)
-2023-11-19 Site Reliability Engineering - Part 2: Operational Balance
-2024-01-09 Site Reliability Engineering - Part 3: On-Call Culture
-2024-09-07 Site Reliability Engineering - Part 4: Onboarding for On-Call Engineers
-
-
-▓▓▓▓░░                                                                                  
-                                                                                          
-DC on fire:
-                                                                                          
-                ▓▓                                    ▓▓                ▓▓                
-      ░░  ░░    ▓▓▓▓                  ██                  ░░            ▓▓▓▓        ▓▓    
-    ▓▓░░░░  ░░  ▓▓▓▓                              ▓▓░░                  ▓▓▓▓              
-    ░░░░      ▓▓▓▓▓▓        ▓▓      ▓▓            ▓▓                  ▓▓▓▓▓▓      ▓▓      
-    ▓▓░░    ▓▓▒▒▒▒▓▓▓▓    ▓▓        ▓▓▓▓        ▓▓▓▓▓▓              ▓▓▒▒▒▒▓▓▓▓    ▓▓▓▓    
-  ██▓▓      ▓▓▒▒░░▒▒▓▓  ▓▓██      ▓▓▓▓▓▓        ▓▓▒▒▓▓              ▓▓▒▒░░▒▒▓▓  ██▓▓▓▓    
-  ▓▓▓▓██  ▓▓▒▒░░░░▒▒▓▓  ▓▓▓▓      ▓▓▒▒▒▒▓▓    ▓▓▒▒░░▒▒▓▓██▓▓      ▓▓▒▒░░░░▒▒▓▓  ▓▓▒▒▒▒▓▓  
-  ▓▓▒▒▒▒▓▓▓▓▒▒░░▒▒▓▓▓▓▓▓▒▒▒▒▓▓  ▓▓▓▓░░▒▒▓▓    ▓▓▒▒░░▒▒▓▓▒▒▒▒▓▓    ▓▓▒▒░░▒▒▓▓▓▓▓▓▓▓░░▒▒▓▓  
-  ▒▒░░▒▒▓▓▓▓▒▒░░▒▒▓▓▓▓▒▒░░▒▒▓▓  ▓▓▒▒░░▒▒▓▓    ▓▓░░░░▒▒▒▒░░░░▒▒██████▒▒░░▒▒██▓▓▓▓▒▒░░▒▒▓▓██
-  ░░░░▒▒▓▓▒▒░░▒▒▓▓▓▓▓▓▒▒░░▒▒▓▓██▒▒░░░░▒▒▓▓  ▓▓▒▒░░▒▒▓▓▒▒▒▒░░▒▒▓▓▓▓▒▒░░▒▒▓▓▓▓▓▓▒▒░░░░▒▒▓▓▓▓
-  ░░░░▒▒▓▓▒▒░░░░▓▓██▒▒░░░░▒▒▓▓██▒▒░░░░▒▒██▓▓▓▓▒▒░░▒▒▓▓▓▓▒▒░░░░▒▒▓▓▒▒░░░░██▓▓▓▓▒▒░░░░▒▒████
-  ▒▒░░▒▒▓▓▓▓░░░░▒▒▓▓▒▒▒▒░░░░▒▒▓▓▓▓▒▒░░░░▒▒▓▓▓▓▒▒░░░░▒▒▓▓▒▒░░▒▒▓▓▓▓▓▓░░░░▒▒▓▓▓▓▓▓▒▒░░░░▒▒▓▓
-  ▒▒░░▒▒▓▓▒▒▒▒░░▒▒██▒▒▒▒░░▒▒▒▒██▒▒▒▒░░░░░░▒▒▓▓▒▒░░░░▒▒▒▒░░░░▒▒████▒▒▒▒░░▒▒██▓▓▒▒▒▒░░░░░░▒▒
-  ░░░░░░▒▒░░░░░░░░▒▒▒▒▒▒░░░░▒▒▒▒▒▒░░░░░░░░▒▒▒▒░░░░░░▒▒▒▒░░░░░░▒▒▒▒░░░░░░░░▒▒▒▒▒▒░░░░░░░░▒▒
-  ░░░░░░░░░░▒▒░░░░░░░░░░░░░░░░░░░░░░░░▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░▒▒░░░░░░░░░░░░░░░░░░
-
-
-

SRE and Organizational Culture: Navigating the Nexus


-
-At the core of SRE is the principle of "prevention over cure." Unlike traditional IT setups that mostly react to problems, SRE focuses on spotting issues before they happen. This proactive approach involves using Service Level Indicators (SLIs) and Service Level Objectives (SLOs). These tools give teams specific metrics and targets to aim for, helping them keep systems reliable and users happy. It's all about creating a culture that prioritizes user experience and makes sure everything runs smoothly to meet their needs.
-
-Another key concept in SRE is the "error budget." It’s a clever approach that recognizes no system is perfect and that failures will happen. Instead of punishing mistakes, SRE culture embraces them as chances to learn and improve. The idea is to give teams a "budget" for errors, creating a space where innovation can thrive and failures are simply seen as lessons learned.
-
-SRE isn't just about tech and metrics; it's also about people. It tackles the "hero culture" that often ends up burning out IT teams. Sure, having a hero swoop in to save the day can be great, but relying on that all the time just isn’t sustainable. Instead, SRE focuses on collective expertise and teamwork. It recognizes that heroes are at their best within a solid team, making the need for constant heroics unnecessary. This way of thinking promotes a balanced on-call experience and highlights trust, ownership, good communication, and collaboration as key to success. I've been there myself, falling into the hero trap, and I know firsthand that it's just not feasible to be the go-to person for every problem that comes up.
-
-Also, the SRE model puts a big emphasis on good documentation. It's not enough to just have docs; they need to be top-notch and go through the same quality checks as code. This really helps with onboarding new team members, training, and keeping everyone on the same page.
-
-Adopting SRE can be a big challenge for some organizations. They might think the SRE approach goes against their goals, like preferring to roll out new features quickly rather than focusing on reliability, or seeing SRE practices as too much hassle. Building an SRE culture often means taking the time to explain things patiently and showing the benefits, like faster release cycles and a better user experience.
-
-Monitoring and observability are also big parts of SRE, highlighting the need for top-notch tools to query and analyze data. This aligns with the SRE focus on continuous learning and being adaptable. SREs naturally need to be curious, ready to dive into any strange issues, and always open to picking up new tools and practices.
-
-For SRE to really work in any organization, everyone needs to buy into its principles. It's about moving away from working in isolated silos and relying on SRE to just patch things up. Instead, it’s about making reliability a shared responsibility across the whole team.
-
-In short, bringing SRE principles into the mix goes beyond just the technical stuff. It helps shift the whole organizational culture to value things like preventing issues before they happen, always learning, working together, and being open with communication. When SRE and corporate culture blend well, you end up with not just reliable systems but also a strong, resilient, and forward-thinking workplace.
-
-Organizations that have SLIs, SLOs, and error budgets in place are already pretty far along in their SRE journey. Getting there takes a lot of communication, convincing people, and patience.
-
-Continue with the second part of this series:
-
-2023-11-19 Site Reliability Engineering - Part 2: Operational Balance
-
-E-Mail your comments to paul@nospam.buetow.org :-)
-
Back to the main site
diff --git a/gemfeed/index.gmi b/gemfeed/index.gmi index 39d99d9c..872d7690 100644 --- a/gemfeed/index.gmi +++ b/gemfeed/index.gmi @@ -2,6 +2,7 @@ ## To be in the .zone! +=> ./2025-11-02-perl-new-features-and-foostats.gmi 2025-11-02 - Perl New Features and Foostats => ./2025-10-11-key-takeaways-from-the-well-grounded-rubyist.gmi 2025-10-11 - Key Takeaways from The Well-Grounded Rubyist => ./2025-10-02-f3s-kubernetes-with-freebsd-part-7.gmi 2025-10-02 - f3s: Kubernetes with FreeBSD - Part 7: k3s and first pod deployments => ./2025-09-14-bash-golf-part-4.gmi 2025-09-14 - Bash Golf Part 4 diff --git a/gemfeed/stats.gmi b/gemfeed/stats.gmi new file mode 100644 index 00000000..28fb095a --- /dev/null +++ b/gemfeed/stats.gmi @@ -0,0 +1,7 @@ +# Stats + +Here, you can find some statistics! + +=> ./uptime-stats.gmi My machine uptime statistics +=> https://stats.foo.zone Site statistics (HTTP) +=> gemini://stats.foo.zone Site statistics (Gemini) -- cgit v1.2.3