| Age | Commit message | Author |
|
The staleness filter was incorrectly hiding stale OK checks from the
HTML report and email notifications. OK checks should always be shown
since stale OK alerts are not concerning.
|
|
Introduce peer URL monitoring with active/passive alert suppression, skip checks when passive, and bump version to v1.4.0.
Co-authored-by: Cursor <cursoragent@cursor.com>
|
|
Add a JSON report alongside the HTML status page with matching sections and
summary counts, plus a last-updated timestamp for remote consumption.
Co-authored-by: Cursor <cursoragent@cursor.com>
|
|
Add MinNotifyIntervalS config option to batch email notifications over a
time interval. When configured, Gogios only sends an email when:
1. The interval has elapsed since the last notification, AND
2. There's been a state change since the last notification.
HTML status page and text reports continue updating on every run.
The --force flag bypasses the interval for immediate notifications.
Notification state (timestamp + check states snapshot) is persisted to
{StateDir}/notify_state.json for comparison across runs.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
|
Co-authored-by: aider (ollama/qwen3-coder:latest) <aider@aider.chat>
|
|
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
|
OK checks should appear in "OK checks" section, not "Suppressed alerts".
Suppression now only applies to non-OK checks (Critical, Warning, Unknown).
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
|
Adds ability to suppress alerts during maintenance windows by checking
for the existence of a file. When the file exists and is recent (within
configured max age), matching alerts are excluded from email reports.
Features:
- Global PrometheusOnlyIfNotExists config for Prometheus alerts
- Per-check OnlyIfNotExists config for individual checks
- Configurable max age (default 86400s) for suppression file
- New "Suppressed alerts" section in email and HTML reports
- Suppressed checks excluded from counts and unhandled sections
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
|
Include a configurable link to the HTML status page in email
notifications. Defaults to https://gogios.buetow.org but can be
customized via the StatusPageURL config option.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
|
Stale OK alerts are not concerning and shouldn't clutter the stale
alerts section. Only report stale alerts with warning/critical/unknown
status.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
|
When Prometheus alerts stop firing, they were previously left in state
and became stale. Now they are automatically removed when no longer
in the firing alerts list from Prometheus.
Also fix the Magefile OpenBSD target to run build and then deploy sequentially
instead of using mg.Deps, which runs its dependencies in parallel.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
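The cleanup of resolved Prometheus alerts can be illustrated with a short pruning loop. The `checkState` type, its `FromPrometheus` flag, and the `pruneResolved` name are hypothetical; how Gogios actually distinguishes Prometheus alerts from regular checks in its state is not shown in the commit.

```go
package main

import "fmt"

// checkState is a hypothetical state entry. FromPrometheus marks entries that
// originated from a Prometheus alert rather than a locally executed check.
type checkState struct {
	Status         string
	FromPrometheus bool
}

// pruneResolved deletes Prometheus-sourced entries that are absent from the
// current set of firing alerts. Regular checks are left untouched.
func pruneResolved(state map[string]checkState, firing map[string]bool) {
	for name, cs := range state {
		if cs.FromPrometheus && !firing[name] {
			delete(state, name) // deleting while ranging over a map is safe in Go
		}
	}
}

func main() {
	state := map[string]checkState{
		"HighLoad":   {Status: "CRITICAL", FromPrometheus: true},
		"DiskFull":   {Status: "WARNING", FromPrometheus: true},
		"Check Ping": {Status: "OK"},
	}
	// Only DiskFull is still firing; HighLoad has resolved.
	pruneResolved(state, map[string]bool{"DiskFull": true})
	fmt.Println(len(state), "entries remain")
}
```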
|
|
- Treat firing Watchdog as OK status to confirm Alertmanager is working
- Treat absent/non-firing Watchdog as CRITICAL to alert on Alertmanager issues
- Add comprehensive tests for both scenarios
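The inverted Watchdog semantics can be shown in a few lines. This is a sketch only: the `watchdogStatus` function is hypothetical and takes the names of currently firing alerts as a plain string slice.

```go
package main

import "fmt"

// watchdogStatus inverts the usual alert logic: the Watchdog alert is expected
// to fire continuously, so its presence means the alerting pipeline works and
// its absence is the problem.
func watchdogStatus(firing []string) string {
	for _, name := range firing {
		if name == "Watchdog" {
			return "OK"
		}
	}
	return "CRITICAL"
}

func main() {
	fmt.Println(watchdogStatus([]string{"Watchdog", "HighLoad"}))
	fmt.Println(watchdogStatus([]string{"HighLoad"}))
}
```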
|
|
- Add configurable HTML status page generation after each check-run
- Default output: /var/www/htdocs/buetow.org/self/gogios/index.html
- New config options: HTMLStatusFile (path) and HTMLDisable (bool)
- Auto-creates output directory if it doesn't exist
- Uses atomic writes (tmp file + rename) to prevent corruption
- Auto-refresh every 5 minutes via meta tag
- Minimal, clean styling based on f3s_fallback template
- W3C HTML5 compliant output
- Email notifications on I/O errors via existing notifyError mechanism
- Mirrors email report structure (status changes, unhandled alerts, stale alerts)
- Comprehensive unit tests including W3C compliance validation
- All user-generated content properly HTML-escaped for security
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
This commit introduces two new optional parameters to the check configuration:
- `randomSpread`: This parameter allows specifying a random sleep time up to N seconds before a check is executed. This is useful to avoid all checks running at the same time.
- `RunInterval`: This parameter defines the minimum interval in seconds between two executions of a check. This is useful if gogios is run more frequently than a specific check should be.
The `README.md` has been updated to document these new features.
fix: Fix deadlock when skipping checks
This commit also fixes a deadlock that occurred when a check was skipped due to the `RunInterval` setting. The `inputWg.Done()` was not being called, causing the main goroutine to wait forever.
build: Replace Taskfile with Magefile
The `Taskfile.yml` has been replaced with a `Magefile.go` to manage the build process. This provides more flexibility and is more idiomatic for Go projects.
|
|
- Tests successful federation from remote endpoints
- Tests error handling (server errors, invalid JSON, timeouts)
- Tests multiple endpoint federation
- Tests duplicate check name conflict resolution
- Uses httptest.NewServer for realistic HTTP simulation
- Validates federated check marking and status reporting
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
|