vLLM
vLLM can serve open-source (and some custom) models via an OpenAI-compatible HTTP API. CoderClaw can connect to vLLM using the openai-completions API.
CoderClaw can also auto-discover available models from vLLM when you opt in with VLLM_API_KEY (any value works if your server doesn’t enforce auth) and you do not define an explicit models.providers.vllm entry.
Quick start

- Start vLLM with an OpenAI-compatible server (see the server example after this list). Your base URL should expose /v1 endpoints (e.g. /v1/models, /v1/chat/completions). vLLM commonly runs on:

```
http://127.0.0.1:8000/v1
```
- Opt in (any value works if no auth is configured):

```bash
export VLLM_API_KEY="vllm-local"
```

- Select a model (replace with one of your vLLM model IDs):
```json5
{
  agents: {
    defaults: {
      model: { primary: "vllm/your-model-id" },
    },
  },
}
```
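For the first step, one common way to launch an OpenAI-compatible vLLM server is the vllm serve CLI. This is a sketch, not the only option: the model ID below is a placeholder, and available flags vary by vLLM version, so check vllm serve --help for yours.

```bash
# Launch an OpenAI-compatible server on the default port (8000).
# The model ID is a placeholder; use any model your hardware supports.
vllm serve Qwen/Qwen2.5-7B-Instruct --host 127.0.0.1 --port 8000
```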
Model discovery (implicit provider)

When VLLM_API_KEY is set (or an auth profile exists) and you do not define models.providers.vllm, CoderClaw will query:
```
GET http://127.0.0.1:8000/v1/models
```
…and convert the returned IDs into model entries.
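For reference, the discovery call returns a standard OpenAI-style model list. A response looks roughly like the sketch below (exact fields vary by vLLM version, and the ID is a placeholder); each data[].id becomes addressable as vllm/&lt;id&gt;:

```json
{
  "object": "list",
  "data": [
    {
      "id": "Qwen/Qwen2.5-7B-Instruct",
      "object": "model",
      "owned_by": "vllm"
    }
  ]
}
```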
If you set models.providers.vllm explicitly, auto-discovery is skipped and you must define models manually.
Explicit configuration (manual models)
Use explicit config when:
- vLLM runs on a different host/port.
- You want to pin contextWindow/maxTokens values.
- Your server requires a real API key (or you want to control headers).
```json5
{
  models: {
    providers: {
      vllm: {
        baseUrl: "http://127.0.0.1:8000/v1",
        apiKey: "${VLLM_API_KEY}",
        api: "openai-completions",
        models: [
          {
            id: "your-model-id",
            name: "Local vLLM Model",
            reasoning: false,
            input: ["text"],
            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
            contextWindow: 128000,
            maxTokens: 8192,
          },
        ],
      },
    },
  },
}
```
Troubleshooting

- Check the server is reachable:

```bash
curl http://127.0.0.1:8000/v1/models
```

- If requests fail with auth errors, set a real VLLM_API_KEY that matches your server configuration, or configure the provider explicitly under models.providers.vllm (see the authenticated request example below).
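If your server enforces a key, a quick way to test it outside CoderClaw is a bearer-token request. This assumes vLLM's standard OpenAI-style auth (e.g. a server started with an --api-key flag); adjust if a proxy in front of vLLM handles auth instead.

```bash
# Send the key as a bearer token; a 200 response with a model list
# means auth is working end to end.
curl -H "Authorization: Bearer $VLLM_API_KEY" http://127.0.0.1:8000/v1/models
```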