1 files changed, 43 insertions, 0 deletions
diff --git a/docs/operations/troubleshooting.md b/docs/operations/troubleshooting.md
new file mode 100644
index 0000000..9446508
--- /dev/null
+++ b/docs/operations/troubleshooting.md
@@ -0,0 +1,43 @@
+# Troubleshooting
+
+## Binary can't connect to Pushgateway
+
+- Confirm a port-forward or route to Pushgateway is running, e.g. `ps aux | grep "[p]ort-forward.*9091"` (the `[p]` keeps grep from matching its own process).
+- Restart the port-forward if it is missing: `kubectl port-forward -n monitoring svc/pushgateway 9091:9091`.
+- Ensure the value of `-pushgateway` matches the URL you are actually using (e.g. `http://localhost:9091`).
+
+## Metrics not appearing in Prometheus
+
+- **Pushgateway:** `curl -s http://localhost:9091/metrics | grep "prometheus_pusher_test"` (or your job/metric name). If the output is empty, Epimetheus may not be pushing, or the job name may differ.
+- **Scrape:** In the Prometheus UI (e.g. http://localhost:9090/targets), check that the Pushgateway job exists and its target is up.
+- **Logs:** Check `kubectl logs -n monitoring -l app.kubernetes.io/name=prometheus` (or your Prometheus pod) for scrape or remote-write errors.
+
+## "Remote write receiver not enabled" error
+
+Prometheus must be started with the remote write receiver enabled. To verify:
+
+```bash
+kubectl logs -n monitoring prometheus-prometheus-kube-prometheus-prometheus-0 | grep "remote-write-receiver"
+```
+
+You should see the remote write receiver listed among the enabled features. If not, start Prometheus with the `--web.enable-remote-write-receiver` flag (see [Setup: Prometheus](setup-prometheus.md)) and restart it.
+
+## "Out of order sample" error
+
+Prometheus rejected a sample whose timestamp is older than the newest sample already stored for the same series. To resolve this, do one of the following:
+
+- Use different labels for historic data (e.g. `job="historic_data"`), or
+- Enable out-of-order ingestion on Prometheus by setting `tsdb.outOfOrderTimeWindow` to a window that covers your oldest sample (see [Setup: Prometheus](setup-prometheus.md)), or
+- Run backfills from oldest to newest.
+
+## Dashboard not appearing in Grafana
+
+- Check that the dashboard ConfigMap exists: `kubectl get configmap -n monitoring | grep epimetheus`.
+- Ensure the ConfigMap carries the label Grafana uses for dashboard discovery (e.g. `grafana_dashboard: "1"`): `kubectl get configmap epimetheus-dashboard -n monitoring -o yaml | grep "grafana_dashboard"`.
+- Restart Grafana to reload dashboards: `kubectl rollout restart deployment/prometheus-grafana -n monitoring` (adjust the deployment name to your setup).
+
+## ClickHouse connection failed
+
+- Ensure ClickHouse is listening on its HTTP interface (default port 8123): `curl -sS http://localhost:8123/ping`.
+- If running on Kubernetes, check the ClickHouse Service and any port-forward. Test with the same URL you pass to `-clickhouse`.
+- See [Setup: ClickHouse](setup-clickhouse.md) and [ClickHouse backend](../backends/clickhouse.md).