Extracts all classes from hyperstack.rb into focused library files:
- lib/hyperstack/config.rb — ConfigLoader + Config (TOML loading, validation)
- lib/hyperstack/state.rb — StateStore + PrefixedOutput (JSON state, threaded output)
- lib/hyperstack/client.rb — HyperstackClient (REST API + retry logic)
- lib/hyperstack/wireguard.rb — LocalWireGuard (wg1.conf peer management, /etc/hosts)
- lib/hyperstack/provisioning.rb — ProvisioningScripts + RemoteProvisioner (SSH bootstrap)
- lib/hyperstack/manager.rb — Manager (VM lifecycle orchestration)
- lib/hyperstack/watcher.rb — VllmWatcher (Prometheus + GPU dashboard)
- lib/hyperstack/cli.rb — CLI (OptionParser command dispatch)
hyperstack.rb becomes a 46-line entry point with require_relative calls.
All files pass the `ruby -c` syntax check, and `--help` runs correctly.
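The resulting entry point can be sketched roughly as follows; the exact file list matches the layout above, but the guard and constant names are illustrative, not the actual contents of hyperstack.rb:

```ruby
# Hypothetical sketch of the slimmed-down hyperstack.rb entry point,
# assuming the lib/hyperstack/ layout listed above.
LIB_FILES = %w[
  config state client wireguard provisioning manager watcher cli
].freeze

LIB_DIR = File.expand_path("lib/hyperstack", __dir__)

LIB_FILES.each do |name|
  path = File.join(LIB_DIR, "#{name}.rb")
  require path if File.exist?(path) # guard keeps the sketch runnable standalone
end

# Dispatch to the CLI (defined in lib/hyperstack/cli.rb) when run directly:
# Hyperstack::CLI.new(ARGV).run if __FILE__ == $PROGRAM_NAME
```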
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- Add hyperstack-vm1-gptoss.toml: A100x1 config for gpt-oss-120b (VM1)
and qwen3-coder-next (VM2) pair, replacing the H100x2 default
- Fix pi/agent/models.json: hyperstack provider URL was pointing at
hyperstack.wg1 (unresolvable); corrected to hyperstack1.wg1 (192.168.3.1)
- Update hyperstack.rb, hypr.fish: reference vm1-gptoss.toml for create-both
and pair commands; update fish abbrs for the new pair setup
- Update ask-mode/utils.ts: allow read-only 'ask' commands in ask-mode
- Update agent-plan-mode/utils.ts: tighten isAskCommand check
- Add state files for provisioned vm1/vm2 instances
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- workflows/photo-enhance.json: replace Real-ESRGAN-only pipeline with
SUPIR_Upscale (SUPIR-v0Q + SDXL base backbone). Parameters:
20 steps, scale_by=1.0 (original resolution), Wavelet colour fix,
tiled VAE + tiled sampling for L40 VRAM headroom.
- photo-enhance.rb: add ImageMagick colour corrections after download:
S-curve contrast (-sigmoidal-contrast 3,50%), +20% saturation
(-modulate 100,120,100), micro-contrast sharpening (-unsharp).
These stack on top of SUPIR's internal Wavelet colour fix.
- hyperstack.rb: update comfyui_install_script to install ComfyUI-SUPIR
custom node and download both SUPIR-v0Q + sd_xl_base_1.0 on provision.
- hyperstack.rb: extend status_config_loaders to auto-discover
hyperstack-vm-photo.toml so --watch shows the photo VM.
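The ImageMagick pass can be sketched as an argument list handed to the `magick` CLI; the flag values mirror the ones above, but the `-unsharp` geometry and the method name are assumptions, not what photo-enhance.rb actually uses:

```ruby
# Sketch of the post-download colour pass, assuming ImageMagick's `magick`
# binary is on PATH. Builds an argv array so it can run without a shell.
def colour_correct_args(input, output)
  [
    "magick", input,
    "-sigmoidal-contrast", "3,50%",  # S-curve contrast
    "-modulate", "100,120,100",      # +20% saturation, hue/brightness untouched
    "-unsharp", "0x1",               # micro-contrast sharpening (geometry assumed)
    output
  ]
end

# system(*colour_correct_args("in.png", "out.jpg"))
```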
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- status_config_loaders now includes hyperstack-vm-photo.toml so the
photo VM shows up in `watch` and `status` without --config
- VmSnapshot gains service_type (:vllm or :comfyui) to route rendering
- fetch_vm branches to fetch_vllm_vm or fetch_comfyui_vm based on config
- fetch_comfyui_stats: single SSH call for nvidia-smi + /queue + /history
- render_vm shows ComfyUI queue (running/queued/completed) for photo VM
- Header renamed from "vLLM watch" to "VM watch"
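The service_type routing above can be sketched like this; the real VmSnapshot carries far more fields, and the detection rule (presence of a [comfyui] config section) is an assumption:

```ruby
# Minimal sketch of service_type routing in the watcher. A config hash with
# a "comfyui" section routes to the ComfyUI fetcher; everything else is vLLM.
VmSnapshot = Struct.new(:name, :service_type, keyword_init: true)

def fetch_vm(config)
  if config["comfyui"]                                   # assumed detection rule
    VmSnapshot.new(name: config["name"], service_type: :comfyui)
  else
    VmSnapshot.new(name: config["name"], service_type: :vllm)
  end
end
```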
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- hyperstack-vm-photo.toml: L40 GPU VM config (192.168.3.4, ~$1/hr)
with [comfyui] section for port, model dirs, and pre-downloaded weights
- hyperstack.rb: full ComfyUI provisioning support alongside vLLM/Ollama —
config accessors, comfyui_install_script (git clone + venv + systemd),
RemoteProvisioner#install_comfyui, Manager#create integration, UFW rules,
status/service_mode_summary updates, --comfyui/--no-comfyui CLI flags
- photo-enhance.rb: standalone client — uploads photos, submits ComfyUI
workflow, polls for output, downloads PNG, converts to JPEG at quality 92
so file sizes match originals; --watch mode; processed-file manifest
- workflows/photo-enhance.json: Real-ESRGAN x4plus enhance-in-place workflow
(upscale 4x for enhancement, ImageScaleBy 0.25 back to original resolution)
- README.md: Photo enhancement section with quickstart, config reference,
workflow customisation notes, and performance table
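The submit-and-poll step can be sketched as below; `fetch` stands in for the HTTP GET against ComfyUI's `/history/<prompt_id>` endpoint, and the method name and timing defaults are illustrative rather than what photo-enhance.rb ships:

```ruby
# Sketch of polling ComfyUI for a finished prompt. `fetch` is injected so
# the loop is testable; in the real client it would be an HTTP GET.
def poll_history(prompt_id, fetch:, interval: 2, attempts: 30)
  attempts.times do
    history = fetch.call("/history/#{prompt_id}")
    entry = history[prompt_id]
    return entry["outputs"] if entry && entry["outputs"] # outputs appear when done
    sleep interval
  end
  nil # timed out
end
```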
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- hyperstack.rb: add VllmWatcher class and `watch` subcommand — live
terminal dashboard polling all active VMs every 5 s via SSH; shows
GPU util/VRAM/temp/power bars and vLLM throughput/requests/KV-cache/
cache-hit bars aligned in a shared column layout
- draw(): render two or more VM panels side-by-side (horizontal) with a
│ separator, padded to equal visible width; single VM falls back to
vertical layout
- pi/agent/extensions/modal-editor: start in INSERT mode instead of NORMAL
- README: document watch command and update fish script rename
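The horizontal layout can be sketched as padding each panel's lines to a shared width and zipping rows with the │ separator; this assumes unstyled strings (the real draw() must account for ANSI escape codes when computing visible width):

```ruby
# Sketch of the side-by-side panel layout: pad every line to the widest
# visible line, join row-by-row with " │ "; one panel falls back to vertical.
def draw_panels(panels)
  return panels.first.join("\n") if panels.size == 1
  width  = panels.flatten.map(&:length).max
  height = panels.map(&:size).max
  rows = (0...height).map do |i|
    panels.map { |p| (p[i] || "").ljust(width) }.join(" │ ")
  end
  rows.join("\n")
end
```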
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Switch VM1 from n3-H100x1 to n3-H100x2 to run Nemotron-3-Super with
1M token context window via tensor parallelism. The dual-GPU setup
(160 GB total VRAM) provides enough KV cache headroom to override the
model's config.json limit of 262144 tokens.
Key changes:
- flavor_name: n3-H100x1 → n3-H100x2
- tensor_parallel_size: 1 → 2
- max_model_len: 131072 → 1048576 (with VLLM_ALLOW_LONG_MAX_MODEL_LEN=1)
- gpu_memory_utilization: 0.92 → 0.85 (headroom for Mamba cache + sampler warmup)
- Remove --enforce-eager: no longer needed with dual-GPU VRAM budget
- Disable prefix caching: on NemotronH it forces Mamba "all" cache mode
which pre-allocates states for all max_num_seqs and OOMs before the
sampler warmup pass; per-request allocation is cheaper at startup
Add two new vllm config fields to hyperstack.rb:
- extra_docker_env: passes -e KEY=VALUE flags to Docker before the image
name (used for VLLM_ALLOW_LONG_MAX_MODEL_LEN=1 and
PYTORCH_ALLOC_CONF=expandable_segments:True)
- enable_prefix_caching: makes --enable-prefix-caching conditional
(default true for backward compat; false for NemotronH)
Both fields are supported in [vllm] defaults and [vllm.presets.*]
overrides with the same fallback semantics as existing fields.
Update pi/agent/models.json: Nemotron vm1 entry renamed to
"Nemotron 3 Super 120B 1M [vm1]" with contextWindow 1048576.
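The two fields shape the docker run command roughly as follows; the method name is illustrative and the preset-over-defaults fallback is elided:

```ruby
# Sketch of building the docker run argv from the two new [vllm] fields.
# -e flags must precede the image name; vLLM flags follow it.
def docker_run_args(image:, extra_docker_env: {}, enable_prefix_caching: true)
  args = ["docker", "run", "--gpus", "all"]
  extra_docker_env.each { |k, v| args += ["-e", "#{k}=#{v}"] } # before image
  args << image
  args << "--enable-prefix-caching" if enable_prefix_caching   # conditional flag
  args
end
```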
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Merged all still-relevant content from vllm-setup.txt into README.md:
- Why vLLM over Ollama section
- Full monitoring commands with engine metrics table
- Troubleshooting table
- VRAM sizing guide
- Performance characteristics table
Dropped the LiteLLM, Anthropic API, Claude Code, and OpenCode sections,
which are no longer applicable. Removed the vllm-setup.txt file.
|