> **🚧 PRE-ALPHA SOFTWARE:** This project is in a pre-alpha state and is intended for my own personal use only. Use at your own risk.
# I/O Riot NG (aka ior)
I/O Riot NG is an experiment with BPF. It traces synchronous I/O syscalls and analyses the time taken by each of them, which is especially useful for drawing FlameGraphs.
This is arguably a spiritual successor to one of my previous projects, I/O Riot (https://codeberg.org/snonux/ioriot). The original was based on SystemTap and C; the NG version is based on Go, C and BPF (via libbpfgo).
This works only on Linux!
## Requirements
- Go 1.26 or newer (ior relies on cgo via libbpfgo).
## Local libbpfgo Toolchain
`ior` links against a locally built `libbpfgo` checkout. By default
`Magefile.go` expects that checkout at `../libbpfgo` relative to this repo; set
`LIBBPFGO=/absolute/path/to/libbpfgo` if you keep it elsewhere.
Pin that checkout to `v0.9.2-libbpf-1.5.1` and rebuild the static artifacts
before running `mage` targets:
```shell
git -C ../libbpfgo checkout v0.9.2-libbpf-1.5.1
git -C ../libbpfgo submodule update --init --recursive
make -C ../libbpfgo libbpfgo-static
```
Validated commands for this pin:
```shell
env GOTOOLCHAIN=auto mage world
env GOTOOLCHAIN=auto mage integrationTest
```
Troubleshooting and rollback:
- If builds fail with `bpf/bpf.h` missing, re-run the checkout, submodule sync,
and `make libbpfgo-static` commands above, then retry `env GOTOOLCHAIN=auto mage world`.
- Prefer Mage targets over raw `go test` for packages that import `libbpfgo`;
Mage injects the required `CGO_CFLAGS`, `CGO_LDFLAGS`, and `LIBBPFGO` values.
- To roll back to the previous wrapper state, repin `go.mod` to
`github.com/aquasecurity/libbpfgo v0.6.0-libbpf-1.3.0.20240111220235-90dbffffbdab`,
then reset the sibling checkout and rebuild:
```shell
git -C ../libbpfgo checkout 90dbffffbdab
git -C ../libbpfgo submodule update --init --recursive
make -C ../libbpfgo libbpfgo-static
```
## Timing Semantics
Each reported event pair has two timing counters:
- `durationNs`: syscall runtime on the same thread (`exit(current) - enter(current)`).
- `durationToPrevNs`: inter-syscall gap on the same thread (`enter(current) - exit(previous)`).
Important details:
- `durationToPrevNs` is tracked per `tid` (thread), not globally across all threads.
- The first observed syscall pair for a thread has `durationToPrevNs = 0` because there is no prior exit timestamp.
- `durationToPrevNs` is attributed to the current syscall pair (the one whose `enter` closes the gap).
- There is no separate "idle" pseudo-event bucket; use the `durationToPrev` count field when aggregated flamegraph output should emphasize inter-syscall time.
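The two counters can be illustrated with a small sketch. It is not `ior`'s implementation: the `Pair`, `Timing`, and `Tracker` names are hypothetical, and it assumes timestamps come from a monotonic clock so that the unsigned subtractions never underflow.

```go
package main

import "fmt"

// Pair is a hypothetical enter/exit timestamp pair for one syscall.
type Pair struct {
	TID     uint32
	EnterNs uint64
	ExitNs  uint64
}

// Timing holds the two counters described above.
type Timing struct {
	DurationNs       uint64
	DurationToPrevNs uint64
}

// Tracker remembers the last exit timestamp per thread (tid).
type Tracker struct {
	lastExit map[uint32]uint64
}

func NewTracker() *Tracker {
	return &Tracker{lastExit: make(map[uint32]uint64)}
}

// Observe computes both counters for p and records its exit time.
// Assumes monotonic timestamps (exit >= enter >= previous exit).
func (t *Tracker) Observe(p Pair) Timing {
	out := Timing{DurationNs: p.ExitNs - p.EnterNs}
	// The first pair seen for a thread keeps DurationToPrevNs at 0.
	if prev, ok := t.lastExit[p.TID]; ok {
		out.DurationToPrevNs = p.EnterNs - prev
	}
	t.lastExit[p.TID] = p.ExitNs
	return out
}

func main() {
	tr := NewTracker()
	pairs := []Pair{
		{TID: 1, EnterNs: 100, ExitNs: 150},
		{TID: 2, EnterNs: 120, ExitNs: 130},
		{TID: 1, EnterNs: 400, ExitNs: 420},
	}
	for _, p := range pairs {
		tm := tr.Observe(p)
		fmt.Printf("tid=%d durationNs=%d durationToPrevNs=%d\n",
			p.TID, tm.DurationNs, tm.DurationToPrevNs)
	}
}
```

The third pair is the only one with a non-zero gap, because thread 2's syscall in between does not affect thread 1's bookkeeping.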
## Fedora
To get this running on Fedora 42, run:
```shell
mkdir -p ~/git
git clone https://codeberg.org/snonux/ior ~/git/ior
git clone https://github.com/aquasecurity/libbpfgo ~/git/libbpfgo
sudo dnf install -y golang clang bpftool elfutils-libelf-devel zlib-static glibc-static libzstd-static
git -C ~/git/libbpfgo checkout v0.9.2-libbpf-1.5.1
git -C ~/git/libbpfgo submodule update --init --recursive
make -C ~/git/libbpfgo libbpfgo-static
```
A static build also needs `libelf.a`, which is not available in any Fedora repository, so it has to be compiled from the elfutils source package:
```shell
sudo dnf install rpmdevtools dnf-utils
dnf download --source elfutils-libelf
rpm -ivh elfutils-*.src.rpm
cd ~
sudo dnf builddep rpmbuild/SPECS/*.spec
cd ~/rpmbuild/SPECS
rpmbuild -ba *.spec
mkdir -p ~/src
tar -C ~/src -xvjpf ~/rpmbuild/SOURCES/elfutils-*.tar.bz2
cd ~/src/elfutils-*
rm -Rf ~/rpmbuild
./configure
make
sudo cp -v ./libelf/libelf.a /usr/lib64/
```
## TUI Flamegraphs
Flamegraphs are available only inside the TUI dashboard.
Use `-fields` to change the stack order and `-count` to choose the metric.
The default stack order is `comm,path,tracepoint` (bottom to top).
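To make "bottom to top" concrete, here is a minimal sketch of how a field list could be turned into stack frames. It is an illustration only: the `Event` type and `Frames` function are hypothetical, the sample values are made up, and the `;` separator is borrowed from the common folded-stack convention rather than taken from `ior`.

```go
package main

import (
	"fmt"
	"strings"
)

// Event is a hypothetical traced row; keys mirror the field names above.
type Event map[string]string

// Frames returns the stack frames for ev, bottom frame first,
// in the order given by the comma-separated fields list.
func Frames(ev Event, fields string) []string {
	names := strings.Split(fields, ",")
	frames := make([]string, 0, len(names))
	for _, name := range names {
		frames = append(frames, ev[strings.TrimSpace(name)])
	}
	return frames
}

func main() {
	// Sample values are invented for the example.
	ev := Event{"comm": "cat", "path": "/etc/hosts", "tracepoint": "sys_enter_read"}

	// Default order: comm at the bottom, tracepoint at the top.
	fmt.Println(strings.Join(Frames(ev, "comm,path,tracepoint"), ";"))

	// A different order groups by path first.
	fmt.Println(strings.Join(Frames(ev, "path,comm"), ";"))
}
```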
## Recording Modes
`ior` has four distinct output flows. They are intentionally different:
| Mode | How to use it | What it writes | Filter behavior |
| --- | --- | --- | --- |
| TUI dashboard | default startup | nothing continuously; data stays in memory unless you export | current TUI/global filters drive what you see |
| TUI CSV snapshot export | press `e` in the dashboard | one `ior-stream-.csv` snapshot of the current filtered stream view | exports only the currently filtered in-memory rows |
| Headless `.ior.zst` export | start with `-flamegraph -name ` | one aggregated native trace artifact written at shutdown | no TUI filter stack; this is the native trace/integration workflow |
| Parquet recording | press `R` in the TUI, or start with `-parquet ` | a streaming Parquet file of traced syscall rows | TUI mode records rows that pass the active TUI filter; headless `-parquet` records all traced rows |
Important distinction:
- `.ior.zst` output is an aggregated native artifact, not a row-by-row event log.
- CSV export is a point-in-time snapshot of the ring buffer.
- Parquet recording is a streaming capture from start to stop.
- The ring buffer is capped, so CSV export is not a replacement for Parquet recording or `.ior.zst` output.
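Why a capped ring buffer cannot stand in for a full recording can be seen in a small sketch. The `Ring` type below is hypothetical and the capacity of 3 is chosen only for the example; the real buffer size is not stated here.

```go
package main

import "fmt"

// Ring is a hypothetical fixed-capacity buffer that overwrites its oldest row.
// The capacity must be positive.
type Ring struct {
	rows  []string
	next  int // index of the slot written next
	count int // number of valid rows, at most len(rows)
}

func NewRing(capacity int) *Ring {
	return &Ring{rows: make([]string, capacity)}
}

// Push stores row, dropping the oldest row once the buffer is full.
func (r *Ring) Push(row string) {
	r.rows[r.next] = row
	r.next = (r.next + 1) % len(r.rows)
	if r.count < len(r.rows) {
		r.count++
	}
}

// Snapshot returns the retained rows, oldest first.
func (r *Ring) Snapshot() []string {
	out := make([]string, 0, r.count)
	start := (r.next - r.count + len(r.rows)) % len(r.rows)
	for i := 0; i < r.count; i++ {
		out = append(out, r.rows[(start+i)%len(r.rows)])
	}
	return out
}

func main() {
	ring := NewRing(3)
	for _, row := range []string{"a", "b", "c", "d", "e"} {
		ring.Push(row)
	}
	// Only the newest three rows survive; "a" and "b" are gone.
	fmt.Println(ring.Snapshot())
}
```

A snapshot export sees only what is still in the buffer, whereas a streaming recorder would have written every row as it arrived.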
### Headless Native `.ior.zst` Output
Use `-flamegraph` when you want the native `ior` trace artifact instead of a streaming row log:
```shell
sudo ./ior -flamegraph -name trace-run -duration 60
```
Native `.ior.zst` behavior:
- writes one `*.ior.zst` file when the run ends
- stores aggregated counters for repeated syscall/path/process combinations
- is intended for `ior`'s native flamegraph and integration-style workflows
- does not preserve one output row per traced syscall
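The difference between aggregated counters and a row-by-row log can be sketched as follows. This is not the `.ior.zst` format: the `Row`, `Key`, and `Counter` types and the choice of counters are assumptions made for illustration.

```go
package main

import "fmt"

// Row is a hypothetical per-syscall record, as a row log would keep it.
type Row struct {
	Comm, Path, Syscall string
	DurationNs          uint64
}

// Key identifies one syscall/path/process combination.
type Key struct {
	Comm, Path, Syscall string
}

// Counter is what remains after folding repeated rows together.
type Counter struct {
	Count           uint64
	TotalDurationNs uint64
}

// Aggregate folds rows with the same key into one counter.
// Row order and individual durations are not recoverable afterwards.
func Aggregate(rows []Row) map[Key]Counter {
	agg := make(map[Key]Counter)
	for _, r := range rows {
		k := Key{Comm: r.Comm, Path: r.Path, Syscall: r.Syscall}
		c := agg[k]
		c.Count++
		c.TotalDurationNs += r.DurationNs
		agg[k] = c
	}
	return agg
}

func main() {
	rows := []Row{
		{Comm: "cat", Path: "/etc/hosts", Syscall: "read", DurationNs: 10},
		{Comm: "cat", Path: "/etc/hosts", Syscall: "read", DurationNs: 20},
		{Comm: "cat", Path: "/etc/hosts", Syscall: "close", DurationNs: 5},
	}
	agg := Aggregate(rows)
	fmt.Println(len(rows), "rows folded into", len(agg), "counters")
	fmt.Println(agg[Key{Comm: "cat", Path: "/etc/hosts", Syscall: "read"}])
}
```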
### TUI Parquet Recording
Start a recording from the dashboard with `R`.
- First `R`: open a filename prompt (`ior-recording-.parquet` by default).
- `Enter`: start recording to that file.
- Second `R`: stop and finalize the active Parquet file.
- Recording stops automatically when you quit the TUI or reselect PID/TID/session scope.
Lifecycle details:
- TUI recording uses the active TUI global filter at emission time.
- If a filter change restarts tracing, the recorder stays alive and continues writing matching rows after the restart.
- The dashboard footer shows the active recording path or the last recording error.
### Headless Parquet Recording
Use `-parquet` to skip the TUI and stream traced syscall rows directly to a Parquet file:
```shell
sudo ./ior -parquet trace.parquet -duration 60
```
Headless Parquet mode behavior:
- skips the TUI completely
- records all traced rows
- rejects content filters such as `-comm`, `-path`, `-pid`, and `-tid`
- cannot be combined with `-plain`, `-flamegraph`, `--testflames`, or `--testliveflames`
Use headless mode when you want a full recording, and TUI mode when you want interactive filtering plus optional start/stop recording from the dashboard.
### Choosing Between `.ior.zst` and Parquet
Both formats are useful, but they solve different problems:
| Question | Native `.ior.zst` | Parquet |
| --- | --- | --- |
| Data shape | aggregated counters | one row per traced syscall |
| Write pattern | collect in memory, write one compressed artifact at the end | stream rows continuously while recording |
| Best for | `ior`-native trace artifacts, flamegraph workflows, integration assertions | offline analysis in other tools, long captures, preserving per-event detail |
| Relative write cost | usually lower because repeated events are folded together before file write | usually higher because each traced row is serialized |
| Detail retained | loses original row order and per-event granularity | keeps per-event timing and syscall fields |
Rule of thumb:
- choose `.ior.zst` when you want the native `ior` artifact and do not need every traced syscall row preserved
- choose Parquet when you want a full event stream for downstream analysis outside `ior`
## TUI Navigation
The TUI provides an in-screen help panel (toggle with **H**) that lists all available keys. Use this help screen to discover navigation shortcuts.
You can move between dashboard tabs:
- **tab**: next dashboard tab
- **shift+tab**: previous dashboard tab
- **1**: Overview
- **2**: Syscalls
- **3**: Files
- **4**: Processes
- **5**: Latency+Gaps
- **6**: Stream
The bottom hint shows `press H for help` when the help is hidden.
The TUI has two key scopes:
- Global hotkeys: available from dashboard screens.
- Dashboard hotkeys: behavior that depends on the active dashboard tab (especially `6:Stream`).
Help visibility:
- `H`: toggle bottom help sections on/off.
- By default, help is hidden and the bottom hint shows `press H for help`.
### Global Hotkeys
- `tab`: next dashboard tab.
- `shift+tab`: previous dashboard tab.
- `1`: `Overview` tab.
- `2`: `Syscalls` tab.
- `3`: `Files` tab.
- `4`: `Processes` tab.
- `5`: `Latency+Gaps` tab.
- `6`: `Stream` tab.
- `7`: `Stream` tab (alias).
- `e`: export filtered stream rows to CSV (`ior-stream-.csv`) in current working directory.
- `R`: start or stop Parquet recording from the TUI dashboard.
- `p`: re-open process selector (PID selection flow).
- `t`: open TID selector flow.
- `o`: open probe selection/toggling dialog.
- `r`: refresh dashboard snapshot.
- `q` or `ctrl+c`: quit.
### Dashboard / Tab-Specific Hotkeys
- `d` in `3:Files`: toggle directory-grouped files view.
- `s` in sortable table tabs (`2:Syscalls`, `3:Files`, `4:Processes`): sort by the selected column using that table's default direction.
- `S` in sortable table tabs (`2:Syscalls`, `3:Files`, `4:Processes`): reverse-sort by the selected column.
- `j/k` or `up/down` in list-like tabs (`2:Syscalls`, `3:Files`, `4:Processes`): scroll list.
`left/right` and `h/l` do not switch tabs. In `6:Stream` pause mode they move the selected column.
### 6:Stream Hotkeys and Behavior
`6:Stream` has two modes:
- Live mode (`paused=false`): rows update continuously.
- Pause mode (`paused=true`): selection/cell/filter/search/export workflows are enabled.
Core controls:
- `space`: toggle live/pause.
- `g`/`G`: jump to top/tail.
- `c`: clear stream filters.
- `f`: open advanced filter modal.
- `j/k` or `up/down`: move selected row in pause mode; scroll in live mode.
- `left/right` or `h/l`: move selected column in pause mode.
#### Enter-Based Filter Stack (Pause Mode)
In pause mode, `enter` on the selected cell pushes a new filter onto a stack and immediately re-filters the current ring buffer snapshot. Filters are stackable.
- String columns use case-insensitive substring match:
- `Comm` -> `comm~`
- `Syscall` -> `syscall~`
- `File` -> `file~`
- Numeric exact match:
- `PID`, `TID`, `FD`, `Ret`, `Bytes`
- Numeric threshold (`>=`):
- `Latency` -> `latency>=selected_value`
- `Gap` -> `gap>=selected_value`
Undo:
- `esc` in pause mode pops the most recent filter from the stack (LIFO).
- Repeated `esc` keeps undoing until no stacked filters remain.
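The push/pop behavior and the three match kinds can be sketched as below. The types and constructor names are hypothetical, only one column of each kind is modeled, and the `pid=` label is invented for the example (only the `comm~` and `latency>=` label forms appear above).

```go
package main

import (
	"fmt"
	"strings"
)

// Row is a hypothetical stream row with one column of each kind.
type Row struct {
	Comm      string
	PID       int
	LatencyNs uint64
}

// Filter pairs a display label with a row predicate.
type Filter struct {
	Label string
	Match func(Row) bool
}

// CommContains is a case-insensitive substring match (string columns).
func CommContains(sub string) Filter {
	needle := strings.ToLower(sub)
	return Filter{Label: "comm~" + sub, Match: func(r Row) bool {
		return strings.Contains(strings.ToLower(r.Comm), needle)
	}}
}

// PIDEquals is an exact match (numeric columns). The label is made up.
func PIDEquals(pid int) Filter {
	return Filter{Label: fmt.Sprintf("pid=%d", pid), Match: func(r Row) bool {
		return r.PID == pid
	}}
}

// LatencyAtLeast is a threshold match (Latency and Gap columns).
func LatencyAtLeast(ns uint64) Filter {
	return Filter{Label: fmt.Sprintf("latency>=%d", ns), Match: func(r Row) bool {
		return r.LatencyNs >= ns
	}}
}

// Stack holds filters in push order; a row must pass all of them.
type Stack struct{ filters []Filter }

// Push adds a filter on top of the stack (the enter key).
func (s *Stack) Push(f Filter) { s.filters = append(s.filters, f) }

// Pop removes the newest filter (the esc key); false means the stack was empty.
func (s *Stack) Pop() bool {
	if len(s.filters) == 0 {
		return false
	}
	s.filters = s.filters[:len(s.filters)-1]
	return true
}

// Apply returns the rows that satisfy every stacked filter.
func (s *Stack) Apply(rows []Row) []Row {
	var out []Row
	for _, r := range rows {
		keep := true
		for _, f := range s.filters {
			if !f.Match(r) {
				keep = false
				break
			}
		}
		if keep {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	rows := []Row{
		{Comm: "Postgres", PID: 10, LatencyNs: 900},
		{Comm: "postgres", PID: 11, LatencyNs: 100},
		{Comm: "nginx", PID: 20, LatencyNs: 950},
	}
	var s Stack
	s.Push(CommContains("POST"))
	s.Push(LatencyAtLeast(500))
	fmt.Println(len(s.Apply(rows))) // both filters active
	s.Pop()
	fmt.Println(len(s.Apply(rows))) // only the comm filter remains
}
```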
#### Regex Search (Pause Mode)
- `/`: open regex prompt and search forward.
- `?`: open regex prompt and search backward.
- Search checks all stream columns/fields and wraps around ring-buffer rows.
- `n`: next match in the same direction as last `/` or `?`.
- `N`: previous match (opposite direction).
#### Stream CSV Export (Pause Mode)
- `x`: quick export filtered stream rows to CSV (`ior-stream-.csv`).
- `X`: export filtered stream rows to CSV with filename prompt.
- `E`: open the last stream-exported CSV in a foreground editor, chosen in this order: `EDITOR`, then `VISUAL`, then `SUDO_EDITOR`, then `hx` as a fallback, else `vi`.
Export behavior:
- `e` exports a fresh filtered stream snapshot using the current shared TUI filter, even outside paused mode.
- `x`/`X` export the currently paused stream rows, preserving the stream tab's exact paused view.
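The editor lookup order used by `E` can be sketched as follows. `PickEditor` is a hypothetical helper, and the sketch assumes that "fallback `hx`, else `vi`" means `hx` is used when it is found on `PATH` and `vi` otherwise.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// PickEditor resolves an editor command. getenv and hasCmd are injected
// so the lookup order can be exercised without touching the environment.
func PickEditor(getenv func(string) string, hasCmd func(string) bool) string {
	// Environment variables are checked in this fixed order.
	for _, name := range []string{"EDITOR", "VISUAL", "SUDO_EDITOR"} {
		if v := getenv(name); v != "" {
			return v
		}
	}
	// Assumption: hx is preferred only when it is actually installed.
	if hasCmd("hx") {
		return "hx"
	}
	return "vi"
}

func main() {
	onPath := func(cmd string) bool {
		_, err := exec.LookPath(cmd)
		return err == nil
	}
	fmt.Println(PickEditor(os.Getenv, onPath))
}
```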