Extracts all classes from hyperstack.rb into focused library files:
- lib/hyperstack/config.rb — ConfigLoader + Config (TOML loading, validation)
- lib/hyperstack/state.rb — StateStore + PrefixedOutput (JSON state, threaded output)
- lib/hyperstack/client.rb — HyperstackClient (REST API + retry logic)
- lib/hyperstack/wireguard.rb — LocalWireGuard (wg1.conf peer management, /etc/hosts)
- lib/hyperstack/provisioning.rb — ProvisioningScripts + RemoteProvisioner (SSH bootstrap)
- lib/hyperstack/manager.rb — Manager (VM lifecycle orchestration)
- lib/hyperstack/watcher.rb — VllmWatcher (Prometheus + GPU dashboard)
- lib/hyperstack/cli.rb — CLI (OptionParser command dispatch)
hyperstack.rb becomes a 46-line entry point with require_relative calls.
All files pass the `ruby -c` syntax check, and `--help` runs correctly.
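The resulting entry point can be sketched roughly as follows; the exact file list matches the layout above, but the guard and constant names are illustrative, not the actual contents of hyperstack.rb:

```ruby
# Hypothetical sketch of the slimmed-down hyperstack.rb entry point,
# assuming the lib/hyperstack/ layout listed above.
LIB_FILES = %w[
  config state client wireguard provisioning manager watcher cli
].freeze

LIB_DIR = File.expand_path("lib/hyperstack", __dir__)

LIB_FILES.each do |name|
  path = File.join(LIB_DIR, "#{name}.rb")
  require path if File.exist?(path) # guard keeps the sketch runnable standalone
end

# Dispatch to the CLI (defined in lib/hyperstack/cli.rb) when run directly:
# Hyperstack::CLI.new(ARGV).run if __FILE__ == $PROGRAM_NAME
```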
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- Add hyperstack-vm1-gptoss.toml: A100x1 config for gpt-oss-120b (VM1)
and qwen3-coder-next (VM2) pair, replacing the H100x2 default
- Fix pi/agent/models.json: hyperstack provider URL was pointing at
hyperstack.wg1 (unresolvable); corrected to hyperstack1.wg1 (192.168.3.1)
- Update hyperstack.rb, hypr.fish: reference vm1-gptoss.toml for create-both
and pair commands; update fish abbrs for the new pair setup
- Update ask-mode/utils.ts: allow read-only 'ask' commands in ask-mode
- Update agent-plan-mode/utils.ts: tighten isAskCommand check
- Add state files for provisioned vm1/vm2 instances
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- workflows/photo-enhance.json: replace Real-ESRGAN-only pipeline with
SUPIR_Upscale (SUPIR-v0Q + SDXL base backbone). Parameters:
20 steps, scale_by=1.0 (original resolution), Wavelet colour fix,
tiled VAE + tiled sampling for L40 VRAM headroom.
- photo-enhance.rb: add ImageMagick colour corrections after download:
S-curve contrast (-sigmoidal-contrast 3,50%), +20% saturation
(-modulate 100,120,100), micro-contrast sharpening (-unsharp).
These stack on top of SUPIR's internal Wavelet colour fix.
- hyperstack.rb: update comfyui_install_script to install ComfyUI-SUPIR
custom node and download both SUPIR-v0Q + sd_xl_base_1.0 on provision.
- hyperstack.rb: extend status_config_loaders to auto-discover
hyperstack-vm-photo.toml so --watch shows the photo VM.
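The ImageMagick pass can be sketched as an argument list handed to the `magick` CLI; the flag values mirror the ones above, but the `-unsharp` geometry and the method name are assumptions, not what photo-enhance.rb actually uses:

```ruby
# Sketch of the post-download colour pass, assuming ImageMagick's `magick`
# binary is on PATH. Builds an argv array so it can run without a shell.
def colour_correct_args(input, output)
  [
    "magick", input,
    "-sigmoidal-contrast", "3,50%",  # S-curve contrast
    "-modulate", "100,120,100",      # +20% saturation, hue/brightness untouched
    "-unsharp", "0x1",               # micro-contrast sharpening (geometry assumed)
    output
  ]
end

# system(*colour_correct_args("in.png", "out.jpg"))
```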
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- status_config_loaders now includes hyperstack-vm-photo.toml so the
photo VM shows up in `watch` and `status` without --config
- VmSnapshot gains service_type (:vllm or :comfyui) to route rendering
- fetch_vm branches to fetch_vllm_vm or fetch_comfyui_vm based on config
- fetch_comfyui_stats: single SSH call for nvidia-smi + /queue + /history
- render_vm shows ComfyUI queue (running/queued/completed) for photo VM
- Header renamed from "vLLM watch" to "VM watch"
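The service_type routing above can be sketched like this; the real VmSnapshot carries far more fields, and the detection rule (presence of a [comfyui] config section) is an assumption:

```ruby
# Minimal sketch of service_type routing in the watcher. A config hash with
# a "comfyui" section routes to the ComfyUI fetcher; everything else is vLLM.
VmSnapshot = Struct.new(:name, :service_type, keyword_init: true)

def fetch_vm(config)
  if config["comfyui"]                                   # assumed detection rule
    VmSnapshot.new(name: config["name"], service_type: :comfyui)
  else
    VmSnapshot.new(name: config["name"], service_type: :vllm)
  end
end
```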
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- hyperstack-vm-photo.toml: L40 GPU VM config (192.168.3.4, ~$1/hr)
with [comfyui] section for port, model dirs, and pre-downloaded weights
- hyperstack.rb: full ComfyUI provisioning support alongside vLLM/Ollama —
config accessors, comfyui_install_script (git clone + venv + systemd),
RemoteProvisioner#install_comfyui, Manager#create integration, UFW rules,
status/service_mode_summary updates, --comfyui/--no-comfyui CLI flags
- photo-enhance.rb: standalone client — uploads photos, submits ComfyUI
workflow, polls for output, downloads PNG, converts to JPEG at quality 92
so file sizes match originals; --watch mode; processed-file manifest
- workflows/photo-enhance.json: Real-ESRGAN x4plus enhance-in-place workflow
(upscale 4x for enhancement, ImageScaleBy 0.25 back to original resolution)
- README.md: Photo enhancement section with quickstart, config reference,
workflow customisation notes, and performance table
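The submit-and-poll step can be sketched as below; `fetch` stands in for the HTTP GET against ComfyUI's `/history/<prompt_id>` endpoint, and the method name and timing defaults are illustrative rather than what photo-enhance.rb ships:

```ruby
# Sketch of polling ComfyUI for a finished prompt. `fetch` is injected so
# the loop is testable; in the real client it would be an HTTP GET.
def poll_history(prompt_id, fetch:, interval: 2, attempts: 30)
  attempts.times do
    history = fetch.call("/history/#{prompt_id}")
    entry = history[prompt_id]
    return entry["outputs"] if entry && entry["outputs"] # outputs appear when done
    sleep interval
  end
  nil # timed out
end
```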
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- hyperstack.rb: add VllmWatcher class and `watch` subcommand — live
terminal dashboard polling all active VMs every 5 s via SSH; shows
GPU util/VRAM/temp/power bars and vLLM throughput/requests/KV-cache/
cache-hit bars aligned in a shared column layout
- draw(): render two or more VM panels side-by-side (horizontal) with a
│ separator, padded to equal visible width; single VM falls back to
vertical layout
- pi/agent/extensions/modal-editor: start in INSERT mode instead of NORMAL
- README: document watch command and update fish script rename
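The horizontal layout can be sketched as padding each panel's lines to a shared width and zipping rows with the │ separator; this assumes unstyled strings (the real draw() must account for ANSI escape codes when computing visible width):

```ruby
# Sketch of the side-by-side panel layout: pad every line to the widest
# visible line, join row-by-row with " │ "; one panel falls back to vertical.
def draw_panels(panels)
  return panels.first.join("\n") if panels.size == 1
  width  = panels.flatten.map(&:length).max
  height = panels.map(&:size).max
  rows = (0...height).map do |i|
    panels.map { |p| (p[i] || "").ljust(width) }.join(" │ ")
  end
  rows.join("\n")
end
```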
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Switch VM1 from n3-H100x1 to n3-H100x2 to run Nemotron-3-Super with
1M token context window via tensor parallelism. The dual-GPU setup
(160 GB total VRAM) provides enough KV cache headroom to override the
model's config.json limit of 262144 tokens.
Key changes:
- flavor_name: n3-H100x1 → n3-H100x2
- tensor_parallel_size: 1 → 2
- max_model_len: 131072 → 1048576 (with VLLM_ALLOW_LONG_MAX_MODEL_LEN=1)
- gpu_memory_utilization: 0.92 → 0.85 (headroom for Mamba cache + sampler warmup)
- Remove --enforce-eager: no longer needed with dual-GPU VRAM budget
- Disable prefix caching: on NemotronH it forces Mamba "all" cache mode
which pre-allocates states for all max_num_seqs and OOMs before the
sampler warmup pass; per-request allocation is cheaper at startup
Add two new vllm config fields to hyperstack.rb:
- extra_docker_env: passes -e KEY=VALUE flags to Docker before the image
name (used for VLLM_ALLOW_LONG_MAX_MODEL_LEN=1 and
PYTORCH_ALLOC_CONF=expandable_segments:True)
- enable_prefix_caching: makes --enable-prefix-caching conditional
(default true for backward compat; false for NemotronH)
Both fields are supported in [vllm] defaults and [vllm.presets.*]
overrides with the same fallback semantics as existing fields.
Update pi/agent/models.json: Nemotron vm1 entry renamed to
"Nemotron 3 Super 120B 1M [vm1]" with contextWindow 1048576.
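The two fields shape the docker run command roughly as follows; the method name is illustrative and the preset-over-defaults fallback is elided:

```ruby
# Sketch of building the docker run argv from the two new [vllm] fields.
# -e flags must precede the image name; vLLM flags follow it.
def docker_run_args(image:, extra_docker_env: {}, enable_prefix_caching: true)
  args = ["docker", "run", "--gpus", "all"]
  extra_docker_env.each { |k, v| args += ["-e", "#{k}=#{v}"] } # before image
  args << image
  args << "--enable-prefix-caching" if enable_prefix_caching   # conditional flag
  args
end
```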
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Merged all still-relevant content from vllm-setup.txt into README.md:
- Why vLLM over Ollama section
- Full monitoring commands with engine metrics table
- Troubleshooting table
- VRAM sizing guide
- Performance characteristics table
Dropped the LiteLLM, Anthropic API, Claude Code, and OpenCode sections,
which are no longer applicable. Removed the vllm-setup.txt file.
|