author Paul Buetow <paul@buetow.org> 2025-12-24 09:44:16 +0200
committer Paul Buetow <paul@buetow.org> 2025-12-24 09:44:16 +0200
commit 4a8ef3480ecb3cd9a39ffdb7c1c9642340470d6d
tree 2c8556a38d410e4d0bbf5f0f4ad559c65ee3b763
parent c015404c963915ee3f1bc7d7982f0bd0ac75468f
fix
-rw-r--r-- gemfeed/DRAFT-x-rag-observability-hackathon.gmi.tpl | 32
1 files changed, 16 insertions, 16 deletions
diff --git a/gemfeed/DRAFT-x-rag-observability-hackathon.gmi.tpl b/gemfeed/DRAFT-x-rag-observability-hackathon.gmi.tpl
index dff3bddf..45af1017 100644
--- a/gemfeed/DRAFT-x-rag-observability-hackathon.gmi.tpl
+++ b/gemfeed/DRAFT-x-rag-observability-hackathon.gmi.tpl
@@ -8,7 +8,7 @@ This blog post describes my hackathon efforts adding observability to X-RAG, a d
## What is X-RAG?
-X-RAG is a distributed RAG (Retrieval-Augmented Generation) platform running on Kubernetes. The idea behind RAG is simple: instead of asking an LLM to answer questions from its training data alone, you first retrieve relevant documents from your own knowledge base, then feed those documents to the LLM as context. The LLM synthesises an answer grounded in your actual content—reducing hallucinations and enabling answers about private or recent information the model was never trained on.
+X-RAG is the extensible RAG (Retrieval-Augmented Generation) platform running on Kubernetes. The idea behind RAG is simple: instead of asking an LLM to answer questions from its training data alone, you first retrieve relevant documents from your own knowledge base, then feed those documents to the LLM as context. The LLM synthesises an answer grounded in your actual content—reducing hallucinations and enabling answers about private or recent information the model was never trained on.
X-RAG handles the full pipeline: ingest documents, chunk them into searchable pieces, generate vector embeddings, store them in a vector database, and at query time, retrieve relevant chunks and pass them to an LLM for answer generation. The system supports both local LLMs (Florian runs his on a beefy desktop) and cloud APIs like OpenAI. I configured an OpenAI API key since my laptop's CPU and GPU aren't fast enough for decent local inference.
@@ -30,15 +30,15 @@ The data layer includes Weaviate (vector database with hybrid search), Kafka (me
```
┌─────────────────────────────────────────────────────────────────────────┐
-│ X-RAG Kubernetes Cluster │
+│ X-RAG Kubernetes Cluster │
├─────────────────────────────────────────────────────────────────────────┤
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Search UI │ │Search Svc │ │Embed Service│ │ Indexer │ │
│ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ │
│ │ │ │ │ │
│ └────────────────┴────────────────┴────────────────┘ │
-│ │ │
-│ ▼ │
+│ │ │
+│ ▼ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Weaviate │ │ Kafka │ │ MinIO │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
@@ -67,7 +67,7 @@ The `kindest/node` image contains everything needed: kubelet, containerd, CNI pl
```
┌─────────────────────────────────────────────────────────────────────────┐
-│ Docker Host │
+│ Docker Host │
├─────────────────────────────────────────────────────────────────────────┤
│ ┌───────────────────┐ ┌───────────────────┐ ┌───────────────────┐ │
│ │ xrag-k8-control │ │ xrag-k8-worker │ │ xrag-k8-worker2 │ │
@@ -152,19 +152,19 @@ Getting all logs in one place was the foundation. I deployed Grafana Loki in the
```
┌──────────────────────────────────────────────────────────────────────┐
-│ LOGS PIPELINE │
+│ LOGS PIPELINE │
├──────────────────────────────────────────────────────────────────────┤
│ Applications write to stdout → containerd stores in /var/log/pods │
-│ │ │
-│ File tail │
-│ ▼ │
-│ Grafana Alloy (DaemonSet) │
-│ Discovers pods, extracts metadata │
-│ │ │
-│ HTTP POST /loki/api/v1/push │
-│ ▼ │
-│ Grafana Loki │
-│ Indexes labels, stores chunks │
+│ │ │
+│ File tail │
+│ ▼ │
+│ Grafana Alloy (DaemonSet) │
+│ Discovers pods, extracts metadata │
+│ │ │
+│ HTTP POST /loki/api/v1/push │
+│ ▼ │
+│ Grafana Loki │
+│ Indexes labels, stores chunks │
└──────────────────────────────────────────────────────────────────────┘
```
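
The logs pipeline in the last hunk ends with an HTTP POST to `/loki/api/v1/push`. As a rough illustration of what such a request body looks like, here is a minimal Python sketch that builds the JSON payload Loki's push endpoint accepts: a list of streams, each with a label set and `[timestamp, line]` pairs, where the timestamp is a string of nanoseconds since the epoch. The function name `build_loki_push_payload` and the label values are hypothetical and not taken from X-RAG or Alloy. In the real pipeline Alloy does this work, so the sketch only helps with reading the diagram.

```python
import json
import time


def build_loki_push_payload(labels, lines, now_ns=None):
    """Build a JSON-serialisable body for POST /loki/api/v1/push.

    labels: dict of label name -> value (this is what Loki indexes).
    lines:  list of log lines (stored as chunk content).
    now_ns: base timestamp in nanoseconds; defaults to the current time.
    """
    if now_ns is None:
        now_ns = time.time_ns()
    # Loki expects each entry as [<unix epoch nanoseconds as string>, <line>].
    # Each line gets a timestamp one nanosecond later to keep the order stable.
    values = [[str(now_ns + i), line] for i, line in enumerate(lines)]
    return {"streams": [{"stream": dict(labels), "values": values}]}


if __name__ == "__main__":
    # Hypothetical labels, similar to what pod discovery might attach.
    payload = build_loki_push_payload(
        {"namespace": "rag-system", "pod": "search-service-0"},
        ["query received", "query answered"],
    )
    print(json.dumps(payload, indent=2))
```

Sending this body with `Content-Type: application/json` to a reachable Loki instance would create or append to the stream identified by the label set. The sketch deliberately leaves out the network call.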