authorPaul Buetow <paul@buetow.org>2026-02-07 16:32:10 +0200
committerPaul Buetow <paul@buetow.org>2026-02-07 16:32:10 +0200
commit3fd46f3977fb650974e5e936cba362c787c00637 (patch)
treeb49111ddd0b7af4a007bca6a304dba10efcd88ff /README.md
reimport this PoC
Diffstat (limited to 'README.md')
-rw-r--r--README.md1000
1 files changed, 1000 insertions, 0 deletions
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..ba10a76
--- /dev/null
+++ b/README.md
@@ -0,0 +1,1000 @@
+<div align="center">
+ <img src="logo.png" alt="Epimetheus Logo" width="400"/>
+</div>
+
+# Epimetheus
+
+A versatile Go tool for pushing metrics to Prometheus with support for both realtime and historic data ingestion.
+
+## Why "Epimetheus"?
+
+In Greek mythology, [Epimetheus](https://en.wikipedia.org/wiki/Epimetheus_(mythology)) is Prometheus's brother, whose name means "afterthought" or "hindsight" (while Prometheus means "forethought"). This name cleverly captures the tool's purpose: bringing data to Prometheus **after** collection, whether it's historic data from hours, days, or weeks ago, or realtime data pushed on-demand.
+
+While Epimetheus is sometimes depicted as foolish in myths (he accepted Pandora's box despite warnings), this tool embraces the "afterthought" aspect productively - it's never too late to bring your metrics home to Prometheus!
+
+## Architecture
+
+```
+┌─────────────────────────────────────────────────────────────────────────┐
+│                               Epimetheus                                │
+│                        (Metrics Ingestion Tool)                         │
+│                                                                         │
+│  Modes:                                                                 │
+│  • Realtime - Current metrics (< 5 min old)                             │
+│  • Historic - Historic metrics (≥ 5 min old)                            │
+│  • Backfill - Range of historic data                                    │
+│  • Auto     - Automatic routing based on timestamp age                  │
+└─────────────────────────────────────────────────────────────────────────┘
+           │                                         │
+           │ Realtime Data                           │ Historic Data
+           │ (via HTTP POST)                         │ (via Remote Write API)
+           │ Uses "now" timestamp                    │ Preserves timestamps
+           ▼                                         ▼
+┌─────────────────────┐                   ┌─────────────────────┐
+│     Pushgateway     │                   │     Prometheus      │
+│     (Port 9091)     │                   │     (Port 9090)     │
+│                     │                   │                     │
+│ • Buffers metrics   │                   │ Remote Write API:   │
+│ • Scraped by        │──── Scraped ─────▶│   /api/v1/write     │
+│   Prometheus        │   every 15-30s    │                     │
+│ • No timestamp      │                   │ Feature Required:   │
+│   preservation      │                   │ --enable-feature=   │
+│                     │                   │   remote-write-     │
+│                     │                   │   receiver          │
+└─────────────────────┘                   └─────────────────────┘
+                                                     │
+                                                     │ Prometheus Query API
+                                                     │ /api/v1/query
+                                                     ▼
+                                          ┌─────────────────────┐
+                                          │       Grafana       │
+                                          │     (Port 3000)     │
+                                          │                     │
+                                          │ • Prometheus as     │
+                                          │   datasource        │
+                                          │ • Dashboards:       │
+                                          │   - Epimetheus      │
+                                          │     Test Metrics    │
+                                          │ • Auto-refresh      │
+                                          └─────────────────────┘
+```
+
+### Data Flow
+
+1. **Realtime Path** (for current data):
+ - Epimetheus → Pushgateway (HTTP POST)
+ - Prometheus scrapes Pushgateway periodically
+ - Timestamp = "now" when Prometheus scrapes
+
+2. **Historic Path** (for old data):
+ - Epimetheus → Prometheus Remote Write API (HTTP POST)
+ - Direct write to Prometheus TSDB
+ - Timestamp preserved from original data
+
+3. **Visualization**:
+ - Grafana queries Prometheus
+ - Displays metrics in dashboards
+ - Auto-refresh every 10 seconds
+
+## Overview
+
+**epimetheus** is a standalone binary that:
+- **Generates** realistic example metrics simulating production applications
+- **Pushes** metrics via Pushgateway (realtime) or Remote Write API (historic)
+- **Automatically detects** timestamp age and chooses the optimal ingestion method
+- **Supports** multiple data formats (CSV, JSON) and all Prometheus metric types
+- **Provides** Grafana dashboard for visualizing test metrics
+
+## Quick Start
+
+### 1. Deploy Pushgateway (one-time setup)
+
+The Pushgateway Helm chart is available in the [conf repository](https://codeberg.org/snonux/conf) at `f3s/pushgateway/helm-chart`.
+
+```bash
+# Clone the conf repository if you haven't already
+git clone https://codeberg.org/snonux/conf.git
+cd conf/f3s/pushgateway/helm-chart
+
+# Deploy Pushgateway
+helm upgrade --install pushgateway . -n monitoring --create-namespace
+```
+
+Alternatively, deploy Pushgateway using the official chart:
+
+```bash
+helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
+helm install pushgateway prometheus-community/prometheus-pushgateway -n monitoring --create-namespace
+```
+
+### 2. Run in Realtime Mode
+
+```bash
+# Port-forward Pushgateway
+kubectl port-forward -n monitoring svc/pushgateway 9091:9091 &
+
+# Push test metrics continuously
+cd /home/paul/git/conf/f3s/epimetheus
+./epimetheus -mode=realtime -continuous
+```
+
+The binary pushes metrics every 15 seconds. Press Ctrl+C to stop.
+
+### 3. View Metrics
+
+```bash
+# Pushgateway UI
+open http://localhost:9091
+
+# Prometheus UI
+kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090 &
+open http://localhost:9090
+```
+
+## Operating Modes
+
+### 👁️ Watch Mode
+Monitor CSV files for changes and push metrics to Prometheus with file modification timestamps.
+
+**Works with ANY CSV format** - automatically detects numeric vs string columns and sanitizes names.
+
+**NEW: Automatic DNS Resolution** - IP addresses are automatically resolved to hostnames for better observability in Grafana.
+
+```bash
+./epimetheus -mode=watch \
+ -file=mydata.csv \
+ -metric-name=myapp \
+ -prometheus=http://localhost:9090/api/v1/write
+```
+
+**Features:**
+- 🔍 **Format-agnostic**: Works with any tabular CSV structure
+- 📊 **Automatic detection**: Numeric columns → metrics, String columns → labels
+- 🏷️ **Name sanitization**: `min(potatoes)`, `avg(time)`, `p99(latency)` → valid metric names
+- 🌐 **DNS Resolution**: IP addresses → hostnames (e.g., `10.50.52.61` → `foo.example.lan`)
+- 💾 **Smart Caching**: In-memory cache prevents redundant DNS lookups
+- ⏱️ **Timestamp preservation**: Uses file modification time
+- 🔄 **Continuous monitoring**: Polls file every 1 second
+- 💪 **Error resilient**: Continues watching despite failures
+- 🎯 **Remote Write**: Pushes to Prometheus (preserves timestamps)
+
+**CSV Format:**
+Works with any tabular CSV:
+- First row: column headers (automatically sanitized)
+- Subsequent rows: data values
+- Column names can be anything: `min(x)`, `avg(y)`, `p99(latency)`, etc.
+
+**Example 1** - Web metrics:
+```csv
+avg(response_time),p99(latency),endpoint,method
+45.2,120.5,/api/users,GET
+52.1,135.8,/api/orders,POST
+```
+
+Generates:
+```promql
+web_avg_response_time{endpoint="/api/users",method="GET"} 45.2
+web_p99_latency{endpoint="/api/users",method="GET"} 120.5
+web_avg_response_time{endpoint="/api/orders",method="POST"} 52.1
+web_p99_latency{endpoint="/api/orders",method="POST"} 135.8
+```
+
+**Example 2** - Food metrics:
+```csv
+min(potatoes),last(coke),avg(price),country,store_type
+5.2,10.5,12.99,USA,grocery
+3.8,8.2,9.99,Canada,convenience
+```
+
+Generates:
+```promql
+food_min_potatoes{country="USA",store_type="grocery"} 5.2
+food_last_coke{country="USA",store_type="grocery"} 10.5
+food_avg_price{country="USA",store_type="grocery"} 12.99
+# ... etc
+```
+
+Each row generates N samples (N = number of numeric columns).
+
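The column handling behind these examples can be sketched as follows: numeric cells become metric values, string cells become labels, and headers are sanitized into valid metric-name components. This is a simplified re-implementation for illustration; the real logic lives in `internal/parser` and may differ in detail:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"unicode"
)

// sanitize turns an arbitrary CSV header like "min(potatoes)" or
// "p99(latency)" into a valid metric-name component.
func sanitize(header string) string {
	var b strings.Builder
	for _, r := range strings.ToLower(header) {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			b.WriteRune(r)
		} else {
			b.WriteRune('_')
		}
	}
	return strings.Trim(b.String(), "_")
}

// rowToSamples splits one CSV row into samples: numeric columns become
// metric values (prefixed with the base name), string columns become
// labels shared by every sample from that row.
func rowToSamples(base string, headers, row []string) map[string]float64 {
	var labels []string
	numeric := map[string]float64{}
	for i, cell := range row {
		if v, err := strconv.ParseFloat(cell, 64); err == nil {
			numeric[base+"_"+sanitize(headers[i])] = v
		} else {
			labels = append(labels, fmt.Sprintf("%s=%q", sanitize(headers[i]), cell))
		}
	}
	out := map[string]float64{}
	for name, v := range numeric {
		out[name+"{"+strings.Join(labels, ",")+"}"] = v
	}
	return out
}

func main() {
	headers := []string{"avg(response_time)", "p99(latency)", "endpoint", "method"}
	row := []string{"45.2", "120.5", "/api/users", "GET"}
	for series, v := range rowToSamples("web", headers, row) {
		fmt.Println(series, v)
	}
}
```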
+See [CSV-FORMAT-FLEXIBILITY.md](CSV-FORMAT-FLEXIBILITY.md) for more examples.
+
+**Options:**
+- `-file` - CSV file to watch (required)
+- `-metric-name` - Base metric name (required, e.g., `food`, `network`, `database`)
+- `-prometheus` - Prometheus Remote Write URL (default: http://localhost:9090/api/v1/write)
+- `-clickhouse` - ClickHouse HTTP URL (e.g. http://localhost:8123) to also ingest metrics
+- `-clickhouse-table` - ClickHouse table name (default: epimetheus_metrics)
+- `-job` - Job name for metrics (default: example_metrics_pusher)
+- `-resolve-ip-labels` - Additional IP labels to resolve via DNS (default: ip is always resolved)
+
+**ClickHouse Support:**
+Watch mode can ingest to ClickHouse in addition to (or instead of) Prometheus:
+
+```bash
+# Ingest to both Prometheus and ClickHouse
+./epimetheus -mode=watch -file=data.csv -metric-name=myapp \
+ -prometheus=http://localhost:9090/api/v1/write \
+ -clickhouse=http://localhost:8123
+
+# ClickHouse only (use -prometheus= to disable Prometheus)
+./epimetheus -mode=watch -file=test-data/watch-clickhouse-test.csv \
+ -metric-name=watch_test -clickhouse=http://localhost:8123 -prometheus=
+
+# Verify data in ClickHouse
+./verify-clickhouse.sh
+```
+
+**DNS Resolution:**
+By default, the `ip` label is automatically resolved to a hostname. To resolve additional IP labels:
+
+```bash
+./epimetheus -mode=watch \
+ -file=network.csv \
+ -metric-name=network \
+ -resolve-ip-labels=source_ip,dest_ip
+```
+
+This will resolve: `ip` (default) + `source_ip` + `dest_ip`
+
+**Example:**
+- Input: `ip="10.50.52.61"`
+- Output: `ip="foo.example.lan"`
+- Failed lookups: IP remains unchanged
+
+**Documentation:**
+- [DNS-RESOLUTION-FEATURE.md](DNS-RESOLUTION-FEATURE.md) - Complete DNS resolution guide
+- [CSV-FORMAT-FLEXIBILITY.md](CSV-FORMAT-FLEXIBILITY.md) - Works with ANY CSV format
+- [DTAIL-METRICS-EXAMPLE.md](DTAIL-METRICS-EXAMPLE.md) - Detailed dtail.csv example
+
+### 🔄 Realtime Mode (Default)
+Push current metrics to Pushgateway with "now" timestamp.
+
+```bash
+./epimetheus -mode=realtime -continuous
+```
+
+**Options:**
+- `-pushgateway` - Pushgateway URL (default: http://localhost:9091)
+- `-job` - Job name (default: example_metrics_pusher)
+- `-continuous` - Keep pushing every 15 seconds
+
+### ⏰ Historic Mode
+Push a single datapoint from the past using Remote Write API.
+
+```bash
+# Port-forward Prometheus
+kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090 &
+
+# Push data from 24 hours ago
+./epimetheus -mode=historic -hours-ago=24
+```
+
+**Options:**
+- `-prometheus` - Prometheus URL (default: http://localhost:9090/api/v1/write)
+- `-hours-ago` - Hours in the past (default: 24)
+
+### 📦 Backfill Mode
+Import a range of historic data points.
+
+```bash
+# Backfill last 48 hours with 1-hour intervals
+./epimetheus -mode=backfill -start-hours=48 -end-hours=0 -interval=1
+
+# Backfill last week with 6-hour intervals
+./epimetheus -mode=backfill -start-hours=168 -end-hours=0 -interval=6
+```
+
+**Options:**
+- `-start-hours` - Start time in hours ago
+- `-end-hours` - End time in hours ago (0 = now)
+- `-interval` - Interval between points in hours
+
+### 🤖 Auto Mode (Recommended!)
+Automatically detect timestamp age and route to the correct ingestion method.
+
+```bash
+# Generate test data
+./generate-test-data.sh
+
+# Import mixed current and historic data
+./epimetheus -mode=auto -file=test-all-ages.csv
+```
+
+**Detection Logic:**
+- Data < 5 minutes old → Pushgateway (realtime)
+- Data ≥ 5 minutes old → Remote Write (historic)
+
+**Options:**
+- `-file` - Input file path
+- `-format` - Data format: csv or json (default: csv)
+- `-pushgateway` - Pushgateway URL
+- `-prometheus` - Prometheus Remote Write URL
+
+## Data Formats
+
+### CSV Format
+
+```csv
+# Format: metric_name,labels,value,timestamp_ms
+# Labels: key1=value1;key2=value2
+epimetheus_test_requests_total,instance=web1;env=prod,100,1767125148000
+epimetheus_test_temperature_celsius,instance=web2,22.5,1767038748000
+
+# Timestamp is optional (uses "now" if omitted)
+epimetheus_test_active_connections,instance=web3,42,
+```
+
+### JSON Format
+
+```json
+[
+ {
+ "metric": "epimetheus_test_requests_total",
+ "labels": {"instance": "web1", "env": "prod"},
+ "value": 100,
+ "timestamp_ms": 1767125148000
+ },
+ {
+ "metric": "epimetheus_test_temperature_celsius",
+ "labels": {"instance": "web2"},
+ "value": 22.5,
+ "timestamp_ms": 1767038748000
+ }
+]
+```
+
+## Test Metrics
+
+All generated metrics use the `epimetheus_test_` prefix to clearly identify them as test data.
+
+### Counter: `epimetheus_test_requests_total`
+- **Type:** Counter (monotonically increasing)
+- **Description:** Total number of requests processed
+- **Use case:** Counting total events, requests, errors
+
+### Gauge: `epimetheus_test_active_connections`
+- **Type:** Gauge (can increase or decrease)
+- **Description:** Current number of active connections (0-100)
+- **Use case:** Current state measurements, capacity
+
+### Gauge: `epimetheus_test_temperature_celsius`
+- **Type:** Gauge
+- **Description:** Current temperature in Celsius (0-50°C)
+- **Use case:** Environmental monitoring
+
+### Histogram: `epimetheus_test_request_duration_seconds`
+- **Type:** Histogram (distribution)
+- **Description:** Request duration distribution
+- **Buckets:** 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10 seconds
+- **Use case:** Latency measurements, SLO tracking
+
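Prometheus histogram buckets are cumulative: each observation increments every bucket whose upper bound it does not exceed, plus the implicit `+Inf` bucket. A small illustration of that semantics (not epimetheus code):

```go
package main

import "fmt"

// cumulativeBuckets shows how a histogram records observations: every
// bucket with upper bound >= the observed value is incremented, and the
// implicit +Inf bucket (last slot) always is.
func cumulativeBuckets(bounds []float64, observations []float64) []uint64 {
	counts := make([]uint64, len(bounds)+1) // last slot is +Inf
	for _, v := range observations {
		for i, le := range bounds {
			if v <= le {
				counts[i]++
			}
		}
		counts[len(bounds)]++ // +Inf always matches
	}
	return counts
}

func main() {
	bounds := []float64{0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10}
	obs := []float64{0.003, 0.02, 0.3, 7.5}
	fmt.Println(cumulativeBuckets(bounds, obs))
}
```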
+### Labeled Counter: `epimetheus_test_jobs_processed_total`
+- **Type:** Counter with labels
+- **Description:** Jobs processed by type and status
+- **Labels:**
+ - `job_type`: email, report, backup
+ - `status`: success, failed
+- **Use case:** Categorized counting, multi-dimensional metrics
+
+## Grafana Dashboard
+
+A comprehensive dashboard is available showcasing all test metrics.
+
+### Dashboard Features
+
+- **8 Panels:**
+ 1. Request Rate (line graph)
+ 2. Total Requests (stat panel)
+ 3. Active Connections (gauge with thresholds)
+ 4. Temperature (gauge with thresholds)
+ 5. Request Duration Histogram (p50, p90, p99)
+ 6. Average Request Duration (stat)
+ 7. Jobs Processed by Type (bar gauge)
+ 8. Jobs Status Breakdown (table)
+
+- **Auto-refresh:** Every 10 seconds
+- **Time range:** Last 15 minutes (customizable)
+- **Dark theme optimized**
+
+### Deploy Dashboard
+
+#### Option 1: Helm/Kubernetes ConfigMap (Recommended)
+
+```bash
+# Deploy via Kubernetes ConfigMap
+kubectl apply -f ../prometheus/epimetheus-dashboard.yaml
+```
+
+The dashboard will be automatically discovered by Grafana.
+
+#### Option 2: Manual Import
+
+```bash
+# Port-forward Grafana
+kubectl port-forward -n monitoring svc/prometheus-grafana 3000:80
+
+# Open Grafana
+open http://localhost:3000
+
+# Go to Dashboards → Import → Upload grafana-dashboard.json
+```
+
+#### Option 3: Automated Script
+
+```bash
+# Deploy via API
+./deploy-dashboard.sh
+
+# Or with custom credentials
+GRAFANA_URL="http://localhost:3000" \
+GRAFANA_USER="admin" \
+GRAFANA_PASSWORD="yourpassword" \
+./deploy-dashboard.sh
+```
+
+## Example Queries
+
+### Basic Queries
+
+```promql
+# View total requests
+epimetheus_test_requests_total
+
+# View request rate over last 5 minutes
+rate(epimetheus_test_requests_total[5m])
+
+# View current active connections
+epimetheus_test_active_connections
+
+# View current temperature
+epimetheus_test_temperature_celsius
+```
+
+### Histogram Queries
+
+```promql
+# 95th percentile request duration
+histogram_quantile(0.95, rate(epimetheus_test_request_duration_seconds_bucket[5m]))
+
+# 50th percentile (median)
+histogram_quantile(0.50, rate(epimetheus_test_request_duration_seconds_bucket[5m]))
+
+# Average request duration
+rate(epimetheus_test_request_duration_seconds_sum[5m]) /
+rate(epimetheus_test_request_duration_seconds_count[5m])
+```
+
+### Labeled Counter Queries
+
+```promql
+# Failed jobs by type
+epimetheus_test_jobs_processed_total{status="failed"}
+
+# Job success rate
+rate(epimetheus_test_jobs_processed_total{status="success"}[5m]) /
+rate(epimetheus_test_jobs_processed_total[5m])
+
+# Total jobs by type
+sum by (job_type) (epimetheus_test_jobs_processed_total)
+```
+
+### Curl Examples
+
+```bash
+# Port-forward Prometheus
+kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090 &
+
+# Query total requests
+curl -s "http://localhost:9090/api/v1/query?query=epimetheus_test_requests_total" | jq .
+
+# Query temperature
+curl -s "http://localhost:9090/api/v1/query?query=epimetheus_test_temperature_celsius" | jq .
+
+# Query request rate
+curl -s "http://localhost:9090/api/v1/query?query=rate(epimetheus_test_requests_total[5m])" | jq .
+
+# Query histogram p95
+curl -s "http://localhost:9090/api/v1/query?query=histogram_quantile(0.95,rate(epimetheus_test_request_duration_seconds_bucket[5m]))" | jq .
+```
+
+## Time Range Limitations
+
+### ✅ Supported Time Ranges
+
+| Time Range | Status | Method |
+|------------|--------|--------|
+| Current (< 5 min) | ✅ Works | Pushgateway |
+| 1 hour old | ✅ Works | Remote Write |
+| 1 day old | ✅ Works | Remote Write |
+| 1 week old | ✅ Works | Remote Write |
+| 1 month old | ✅ Works | Remote Write |
+
+### ⚠️ Potential Issues
+
+- **Future timestamps:** Rejected (> 5 minutes in future)
+- **Very old data (6+ months):** May be rejected depending on Prometheus retention
+- **Years old:** Likely rejected - use `promtool tsdb create-blocks-from` instead
+- **Out-of-order samples:** Can't insert older data into existing time series (use different labels)
+
+### Prometheus Configuration
+
+Check your retention settings:
+
+```bash
+# View retention
+kubectl get prometheus -n monitoring prometheus-kube-prometheus-prometheus \
+ -o jsonpath='{.spec.retention}'
+
+# Default is typically 15 days
+```
+
+For very old data:
+- Increase retention in Prometheus config
+- Enable out-of-order ingestion (experimental)
+- Use `promtool` for direct TSDB block creation
+
+## Project Structure
+
+```
+epimetheus/
+├── cmd/
+│   └── epimetheus/
+│       └── main.go            # Main entry point
+├── internal/
+│   ├── config/                # Configuration
+│   ├── metrics/               # Metric generators
+│   ├── parser/                # CSV/JSON parsers (includes tabular CSV)
+│   ├── ingester/              # Pushgateway & Remote Write ingesters
+│   └── watcher/               # File watcher for watch mode
+├── epimetheus                 # Compiled binary
+├── grafana-dashboard.json     # Grafana dashboard definition
+├── deploy-dashboard.sh        # Dashboard deployment script
+├── generate-test-data.sh      # Test data generator
+├── run.sh                     # Helper script
+└── README.md                  # This file
+```
+
+## Setup Requirements
+
+### 1. Enable Prometheus Remote Write Receiver ⚠️ **REQUIRED for Historic Data**
+
+**IMPORTANT**: To use historic mode, backfill mode, or auto mode with old data, you **must** enable the Prometheus Remote Write receiver. Without this feature, Epimetheus can only push realtime data via Pushgateway.
+
+The Remote Write receiver is configured in the [conf repository](https://codeberg.org/snonux/conf) at `f3s/prometheus/persistence-values.yaml`:
+
+```yaml
+# In prometheus/persistence-values.yaml (from conf repository)
+prometheus:
+ prometheusSpec:
+ # Enable Remote Write receiver endpoint and Admin API (Prometheus 3.x syntax)
+ additionalArgs:
+ - name: web.enable-remote-write-receiver
+ value: ""
+ - name: web.enable-admin-api
+ value: ""
+
+ # Enable out-of-order ingestion for backfilling
+ # Allows writing data points older than existing data for the same time series
+ enableFeatures:
+ - exemplar-storage
+ - otlp-write-receiver
+
+ # Allow backfilling up to 31 days in the past (provides 1-day buffer for 30-day datasets)
+ tsdb:
+ outOfOrderTimeWindow: 744h # 31 days
+```
+
+**What This Enables:**
+- **Remote Write API**: HTTP endpoint at `/api/v1/write` for ingesting metrics with custom timestamps
+- **Admin API**: HTTP endpoints at `/api/v1/admin/tsdb/*` for data deletion and management
+- **Out-of-Order Ingestion**: Allows writing data points older than existing data for the same time series
+- **31-Day Window**: Can backfill data up to 31 days in the past (provides 1-day buffer for 30-day datasets)
+
+After updating the configuration, upgrade your Prometheus installation:
+
+```bash
+cd conf/f3s/prometheus
+just upgrade # Or manually:
+# helm upgrade prometheus prometheus-community/kube-prometheus-stack \
+# -n monitoring -f persistence-values.yaml
+```
+
+Verify the features are enabled:
+
+```bash
+# Check Remote Write receiver flag
+kubectl get pod -n monitoring prometheus-prometheus-kube-prometheus-prometheus-0 \
+ -o jsonpath='{.spec.containers[0].args}' | grep -o "web.enable-remote-write-receiver"
+
+# Check out-of-order time window
+kubectl get prometheus -n monitoring prometheus-kube-prometheus-prometheus \
+ -o jsonpath='{.spec.tsdb.outOfOrderTimeWindow}'
+# Should output: 744h
+
+# Check admin API flag
+kubectl get pod -n monitoring prometheus-prometheus-kube-prometheus-prometheus-0 \
+ -o jsonpath='{.spec.containers[0].args}' | grep -o "web.enable-admin-api"
+```
+
+**Performance Considerations:**
+
+This configuration is designed for ad-hoc troubleshooting and development, **NOT production use**. Enabling these features has trade-offs:
+
+- **Increased Memory Usage**: Out-of-order ingestion requires additional memory for buffering and sorting time series
+- **Higher TSDB Overhead**: Prometheus TSDB needs to handle non-sequential writes, increasing disk I/O
+- **Query Performance**: Queries may be slower due to fragmented data blocks
+- **Storage Amplification**: Out-of-order samples can trigger additional compactions, increasing storage usage
+
+**Recommendation for Production:**
+- Keep `outOfOrderTimeWindow` as small as possible (or disabled)
+- Monitor Prometheus memory and disk usage closely
+- Use Remote Write only when necessary
+- Consider using dedicated testing/development Prometheus instances
+
+**Note**: The syntax changed in Prometheus 3.x - use `additionalArgs` with `web.enable-remote-write-receiver` instead of the deprecated `enableFeatures: [remote-write-receiver]`.
+
+### 2. Update Prometheus Scrape Config
+
+Ensure Pushgateway is in scrape targets:
+
+```yaml
+# additional-scrape-configs.yaml
+- job_name: 'pushgateway'
+ honor_labels: true
+ static_configs:
+ - targets:
+ - 'pushgateway.monitoring.svc.cluster.local:9091'
+```
+
+Apply the configuration:
+
+```bash
+kubectl create secret generic additional-scrape-configs \
+ --from-file=/home/paul/git/conf/f3s/prometheus/additional-scrape-configs.yaml \
+ --dry-run=client -o yaml -n monitoring | kubectl apply -f -
+```
+
+## Building from Source
+
+### Using Mage (Recommended)
+
+This project includes a [Magefile](./MAGEFILE.md) for easy building, testing, and running:
+
+```bash
+# Install Mage (one-time setup)
+go install github.com/magefile/mage@latest
+
+# Build binary
+mage build
+
+# Run tests
+mage test
+
+# Run with coverage report
+mage testCoverage
+
+# Run in realtime mode
+mage run
+
+# See all available targets
+mage -l
+```
+
+See [MAGEFILE.md](./MAGEFILE.md) for complete documentation.
+
+### Using Go directly
+
+```bash
+# Build binary
+go build -o epimetheus cmd/epimetheus/main.go
+
+# Run tests
+go test ./... -v
+
+# Check test coverage
+go test ./... -cover
+```
+
+## Troubleshooting
+
+### Binary can't connect to Pushgateway
+
+```bash
+# Check port-forward is running
+ps aux | grep "port-forward.*9091"
+
+# Restart port-forward
+kubectl port-forward -n monitoring svc/pushgateway 9091:9091
+```
+
+### Metrics not appearing in Prometheus
+
+```bash
+# Check Pushgateway has metrics
+curl http://localhost:9091/metrics | grep "epimetheus_test"
+
+# Check Prometheus scrape targets
+# Open http://localhost:9090/targets - look for "pushgateway" job
+
+# Check Prometheus logs
+kubectl logs -n monitoring -l app.kubernetes.io/name=prometheus
+```
+
+### "Remote write receiver not enabled" error
+
+```bash
+# Verify feature is enabled
+kubectl logs -n monitoring prometheus-prometheus-kube-prometheus-prometheus-0 | grep "remote-write-receiver"
+
+# Should see: msg="Experimental features enabled" features=[remote-write-receiver]
+```
+
+### "Out of order sample" error
+
+This occurs when trying to insert data older than existing data for the same time series.
+
+**Solutions:**
+- Use different job labels for historic data (e.g., `job="historic_data"`)
+- Enable out-of-order ingestion in Prometheus (experimental)
+- Ensure backfill goes from oldest to newest
+
+### Dashboard not appearing in Grafana
+
+```bash
+# Check ConfigMap exists
+kubectl get configmap -n monitoring | grep epimetheus
+
+# Check labels
+kubectl get configmap epimetheus-dashboard -n monitoring -o yaml | grep "grafana_dashboard"
+
+# Restart Grafana to force reload
+kubectl rollout restart deployment/prometheus-grafana -n monitoring
+```
+
+## Simplified Architecture
+
+```
+┌─────────────────┐
+│    Go Binary    │
+│  (epimetheus)   │──Push realtime──┐
+└─────────────────┘                 │
+         │                          ▼
+         │                ┌──────────────────┐
+         │                │   Pushgateway    │◄──Scrape──┐
+         │                │   (Port 9091)    │           │
+         │                └──────────────────┘           │
+         │                                               │
+         └──Push historic──────────┐                     │
+                                   ▼                     │
+                          ┌─────────────────┐            │
+                          │   Prometheus    │◄───────────┘
+                          │   (Port 9090)   │
+                          │ Remote Write API│
+                          └─────────────────┘
+                                   │
+                                   │ Datasource
+                                   ▼
+                          ┌─────────────────┐
+                          │     Grafana     │
+                          │   (Port 3000)   │
+                          │   Dashboards    │
+                          └─────────────────┘
+```
+
+## Best Practices
+
+### When to Use Pushgateway vs. Remote Write
+
+**Use Pushgateway (realtime mode):**
+- Short-lived batch jobs
+- Service-level metrics
+- Jobs behind firewalls
+- Current/recent data (< 5 minutes old)
+
+**Use Remote Write (historic mode):**
+- Historic data import
+- Backfilling gaps
+- Data migration
+- Data older than 5 minutes
+
+**Use Auto Mode:**
+- Mixed current and historic data
+- Importing from files
+- Unknown timestamp ages
+- General-purpose ingestion
+
+### Metric Design
+
+- **Use appropriate metric types:**
+ - Counter for cumulative values (requests, errors)
+ - Gauge for point-in-time values (temperature, connections)
+ - Histogram for distributions (latency, sizes)
+
+- **Label cardinality:**
+ - Include meaningful labels
+ - Avoid high-cardinality labels (user IDs, timestamps)
+ - Keep label combinations reasonable (< 1000 per metric)
+
+- **Naming conventions:**
+ - Use descriptive names
+ - Include units in gauge names (\_celsius, \_bytes)
+ - Use \_total suffix for counters
+
+## Cleanup
+
+### Cleaning Up Benchmark Data from Prometheus
+
+For cleaning up benchmark metrics from Prometheus, use the provided cleanup script:
+
+```bash
+# Port-forward to Prometheus
+kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090 &
+
+# Run the cleanup script
+./cleanup-benchmark-data.sh
+```
+
+The script will:
+1. Delete all `epimetheus_benchmark_*` metrics using the Prometheus Admin API
+2. Clean up tombstones to free disk space
+3. Provide clear success/error feedback
+
+**Manual cleanup** (if you prefer):
+
+```bash
+# Delete specific metric
+curl -X POST 'http://localhost:9090/api/v1/admin/tsdb/delete_series?match[]=epimetheus_benchmark_cpu_usage'
+
+# Clean up tombstones
+curl -X POST 'http://localhost:9090/api/v1/admin/tsdb/clean_tombstones'
+```
+
+### Other Cleanup Tasks
+
+```bash
+# Stop port-forwards
+pkill -f "port-forward.*9091"
+pkill -f "port-forward.*9090"
+pkill -f "port-forward.*3000"
+
+# Delete test metrics from Pushgateway
+curl -X DELETE http://localhost:9091/metrics/job/example_metrics_pusher
+
+# Uninstall Pushgateway (if needed)
+helm uninstall pushgateway -n monitoring
+```
+
+## macOS Setup
+
+### Basic Installation
+
+```bash
+brew install prometheus
+brew install grafana
+go install github.com/prometheus/pushgateway@latest
+brew services start grafana
+brew services start prometheus
+~/go/bin/pushgateway &
+```
+
+Once done, log in to http://localhost:3000 as admin:admin; you will be prompted to change the password. Afterwards, add http://localhost:9090 as a Prometheus datasource.
+
+### Enable Remote Write Receiver (Required for Watch Mode)
+
+⚠️ **Important**: Watch mode, historic mode, backfill mode, and auto mode require the Prometheus Remote Write receiver to be enabled.
+
+#### Option 1: Permanent Configuration (Recommended)
+
+Edit the Prometheus arguments file:
+
+```bash
+# Edit the arguments file
+nano /opt/homebrew/etc/prometheus.args
+```
+
+Add this line at the end:
+```
+--web.enable-remote-write-receiver
+```
+
+The complete file should look like:
+```
+--config.file /opt/homebrew/etc/prometheus.yml
+--web.listen-address=127.0.0.1:9090
+--storage.tsdb.path /opt/homebrew/var/prometheus
+--web.enable-remote-write-receiver
+--web.enable-admin-api
+```
+
+**Note:** `--web.enable-admin-api` is optional but recommended for easier data management (allows deleting old metrics).
+
+Restart Prometheus:
+```bash
+brew services restart prometheus
+```
+
+Verify it's working:
+```bash
+# Check Prometheus is healthy
+curl http://localhost:9090/-/healthy
+
+# Test Remote Write endpoint (should return 400, not 404)
+curl -X POST http://localhost:9090/api/v1/write
+```
+
+#### Option 2: Temporary (For Testing)
+
+Stop the service and start manually:
+
+```bash
+# Stop brew service
+brew services stop prometheus
+
+# Start with Remote Write enabled
+prometheus --web.enable-remote-write-receiver
+```
+
+Keep this terminal open. In another terminal, run your epimetheus commands.
+
+**Note**: This only lasts until you close the terminal session. Use Option 1 for a permanent setup.
+
+### Clearing Old Metrics (Optional)
+
+If you need to delete old metrics and start fresh:
+
+```bash
+# Delete specific metrics (e.g., blockstore)
+curl -X POST -g 'http://localhost:9090/api/v1/admin/tsdb/delete_series?match[]={__name__=~"blockstore_.*"}'
+
+# Clean up deleted data
+curl -X POST http://localhost:9090/api/v1/admin/tsdb/clean_tombstones
+
+# Wait a moment for cleanup
+sleep 2
+```
+
+**Note:** Admin API must be enabled (add `--web.enable-admin-api` to prometheus.args).
+
+### Verify Setup
+
+Once Remote Write is enabled, test watch mode:
+
+```bash
+# Create a test CSV
+cat > /tmp/test.csv << EOF
+status,count,method
+200,100,GET
+404,50,POST
+EOF
+
+# Watch the file
+./epimetheus -mode=watch \
+ -file=/tmp/test.csv \
+ -metric-name=test \
+ -prometheus=http://localhost:9090/api/v1/write
+```
+
+You should see:
+```
+✅ Successfully pushed X samples to Prometheus
+```
+
+Query in Prometheus (http://localhost:9090):
+```promql
+{__name__=~"test_.*"}
+```
+
+## Additional Resources
+
+- [Prometheus Documentation](https://prometheus.io/docs/)
+- [Pushgateway Documentation](https://github.com/prometheus/pushgateway)
+- [Prometheus Remote Write Spec](https://prometheus.io/docs/concepts/remote_write_spec/)
+- [Grafana Documentation](https://grafana.com/docs/)
+
+## Version
+
+Current version: 0.0.0
+
+## License
+
+See LICENSE file for details.