diff options
| author | Paul Buetow <paul@buetow.org> | 2026-03-24 23:49:42 +0200 |
|---|---|---|
| committer | Paul Buetow <paul@buetow.org> | 2026-03-24 23:49:42 +0200 |
| commit | 9e3ae0f5847f73eea73af6ed9f49f93bf2b811f4 (patch) | |
| tree | d9e8acc1931d56803b5d1e4a96931f5c9929534c /hyperstack-vm1.toml | |
| parent | 9731b82818a2a199a8d826ae3e406c61572c2b6f (diff) | |
gpt-oss-120b: enable reasoning via openai_gptoss parser
- Add --reasoning-parser openai_gptoss to gpt-oss-120b vLLM config in
all three toml files; extracts <|channel|>analysis thinking blocks
into reasoning_content in API responses
- Mark gpt-oss-120b as reasoning: true in pi/agent/models.json for all
three providers (hyperstack, hyperstack1, hyperstack2)
- Update vm1 state file
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Diffstat (limited to 'hyperstack-vm1.toml')
| -rw-r--r-- | hyperstack-vm1.toml | 1 |
1 files changed, 1 insertions, 0 deletions
diff --git a/hyperstack-vm1.toml b/hyperstack-vm1.toml index 35a330c..8df93f5 100644 --- a/hyperstack-vm1.toml +++ b/hyperstack-vm1.toml @@ -134,6 +134,7 @@ max_model_len = 131072 gpu_memory_utilization = 0.92 tensor_parallel_size = 1 tool_call_parser = "" +extra_vllm_args = ["--reasoning-parser", "openai_gptoss"] # Qwen2.5-Coder-32B-Instruct AWQ — best-in-class open coding model at 32B, ~18 GB on A100. [vllm.presets.qwen25-coder-32b] |
