Venice is our recommended setup for privacy-first inference, with optional anonymized access to proprietary models.
Venice AI provides privacy-focused AI inference with support for uncensored models and access to major proprietary models through their anonymized proxy. All inference is private by default—no training on your data, no logging.
Venice offers two privacy levels; understanding them is key to choosing your model:
| Mode | Description | Models |
|---|---|---|
| Private | Fully private. Prompts/responses are never stored or logged. Ephemeral. | Llama, Qwen, DeepSeek, Venice Uncensored, etc. |
| Anonymized | Proxied through Venice with metadata stripped. The underlying provider (OpenAI, Anthropic) sees anonymized requests. | Claude, GPT, Gemini, Grok, Kimi, MiniMax |
Venice exposes OpenAI-compatible /v1 endpoints for easy integration. First, get an API key from venice.ai (format: `vapi_xxxxxxxxxxxx`), then configure it one of three ways.

Option A: Environment Variable
```bash
export VENICE_API_KEY="vapi_xxxxxxxxxxxx"
```
Option B: Interactive Setup (Recommended)
```bash
coderclaw onboard --auth-choice venice-api-key
```
This will prompt for your API key and store it in your configuration (`VENICE_API_KEY`).

Option C: Non-interactive
```bash
coderclaw onboard --non-interactive \
  --auth-choice venice-api-key \
  --venice-api-key "vapi_xxxxxxxxxxxx"
```
Verify the setup:

```bash
coderclaw chat --model venice/llama-3.3-70b "Hello, are you working?"
```
After setup, CoderClaw shows all available Venice models. Pick based on your needs:
- `venice/llama-3.3-70b` for private, balanced performance.
- `venice/claude-opus-45` for hard jobs (Opus remains the strongest).

Change your default model anytime:
```bash
coderclaw models set venice/claude-opus-45
coderclaw models set venice/llama-3.3-70b
```
List all available models:
```bash
coderclaw models list | grep venice
```
You can also adjust these settings via `coderclaw configure`.

| Use Case | Recommended Model | Why |
|---|---|---|
| General chat | `llama-3.3-70b` | Good all-around, fully private |
| Best overall quality | `claude-opus-45` | Opus remains the strongest for hard tasks |
| Privacy + Claude quality | `claude-opus-45` | Best reasoning via anonymized proxy |
| Coding | `qwen3-coder-480b-a35b-instruct` | Code-optimized, 262k context |
| Vision tasks | `qwen3-vl-235b-a22b` | Best private vision model |
| Uncensored | `venice-uncensored` | No content restrictions |
| Fast + cheap | `qwen3-4b` | Lightweight, still capable |
| Complex reasoning | `deepseek-v3.2` | Strong reasoning, private |
Private models:

| Model ID | Name | Context (tokens) | Features |
|---|---|---|---|
| `llama-3.3-70b` | Llama 3.3 70B | 131k | General |
| `llama-3.2-3b` | Llama 3.2 3B | 131k | Fast, lightweight |
| `hermes-3-llama-3.1-405b` | Hermes 3 Llama 3.1 405B | 131k | Complex tasks |
| `qwen3-235b-a22b-thinking-2507` | Qwen3 235B Thinking | 131k | Reasoning |
| `qwen3-235b-a22b-instruct-2507` | Qwen3 235B Instruct | 131k | General |
| `qwen3-coder-480b-a35b-instruct` | Qwen3 Coder 480B | 262k | Code |
| `qwen3-next-80b` | Qwen3 Next 80B | 262k | General |
| `qwen3-vl-235b-a22b` | Qwen3 VL 235B | 262k | Vision |
| `qwen3-4b` | Venice Small (Qwen3 4B) | 32k | Fast, reasoning |
| `deepseek-v3.2` | DeepSeek V3.2 | 163k | Reasoning |
| `venice-uncensored` | Venice Uncensored | 32k | Uncensored |
| `mistral-31-24b` | Venice Medium (Mistral) | 131k | Vision |
| `google-gemma-3-27b-it` | Gemma 3 27B Instruct | 202k | Vision |
| `openai-gpt-oss-120b` | OpenAI GPT OSS 120B | 131k | General |
| `zai-org-glm-4.7` | GLM 4.7 | 202k | Reasoning, multilingual |
Anonymized models (proxied to the upstream provider with metadata stripped):

| Model ID | Original | Context (tokens) | Features |
|---|---|---|---|
| `claude-opus-45` | Claude Opus 4.5 | 202k | Reasoning, vision |
| `claude-sonnet-45` | Claude Sonnet 4.5 | 202k | Reasoning, vision |
| `openai-gpt-52` | GPT-5.2 | 262k | Reasoning |
| `openai-gpt-52-codex` | GPT-5.2 Codex | 262k | Reasoning, vision |
| `gemini-3-pro-preview` | Gemini 3 Pro | 202k | Reasoning, vision |
| `gemini-3-flash-preview` | Gemini 3 Flash | 262k | Reasoning, vision |
| `grok-41-fast` | Grok 4.1 Fast | 262k | Reasoning, vision |
| `grok-code-fast-1` | Grok Code Fast 1 | 262k | Reasoning, code |
| `kimi-k2-thinking` | Kimi K2 Thinking | 262k | Reasoning |
| `minimax-m21` | MiniMax M2.1 | 202k | Reasoning |
CoderClaw automatically discovers models from the Venice API when `VENICE_API_KEY` is set. If the API is unreachable, it falls back to a static catalog.
The `/models` endpoint is public (no auth needed for listing), but inference requires a valid API key.
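Since the listing endpoint is public, you can inspect the catalog without a key. A minimal sketch, assuming the standard OpenAI-style `/models` response shape (`{"data": [{"id": ...}, ...]}`); the helper names are illustrative, not part of CoderClaw:

```python
import json
from urllib import request

VENICE_BASE = "https://api.venice.ai/api/v1"

def model_ids(models_payload: dict) -> list[str]:
    """Extract sorted model IDs from an OpenAI-style /models response."""
    return sorted(item["id"] for item in models_payload.get("data", []))

def fetch_models() -> dict:
    """Fetch the public model catalog (no API key required for listing)."""
    with request.urlopen(f"{VENICE_BASE}/models") as resp:
        return json.load(resp)

# Example with a canned response in the OpenAI /models shape:
sample = {"data": [{"id": "llama-3.3-70b"}, {"id": "qwen3-4b"}]}
print(model_ids(sample))  # ['llama-3.3-70b', 'qwen3-4b']
```

Calling `model_ids(fetch_models())` against the live API returns whatever the catalog currently contains, which may differ from the tables above.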
| Feature | Support |
|---|---|
| Streaming | ✅ All models |
| Function calling | ✅ Most models (check `supportsFunctionCalling` in the API) |
| Vision/Images | ✅ Models marked with the "Vision" feature |
| JSON mode | ✅ Supported via `response_format` |
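Since Venice speaks the OpenAI-compatible protocol, JSON mode uses the standard `response_format` field in the request body. A minimal sketch of building that body (the payload shape follows the OpenAI chat-completions convention; the helper name is illustrative):

```python
import json

def chat_request(model: str, prompt: str, json_mode: bool = False) -> dict:
    """Build an OpenAI-compatible chat-completions request body."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if json_mode:
        # Ask the server to constrain output to valid JSON.
        body["response_format"] = {"type": "json_object"}
    return body

body = chat_request("llama-3.3-70b", "List three colors as JSON.", json_mode=True)
print(json.dumps(body, indent=2))
```

POST this body to `https://api.venice.ai/api/v1/chat/completions` with your `vapi_` key as a Bearer token, as with any OpenAI-compatible endpoint.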
Venice uses a credit-based system; check venice.ai/pricing for current rates. Here is how anonymized access compares to calling the provider's API directly:
| Aspect | Venice (Anonymized) | Direct API |
|---|---|---|
| Privacy | Metadata stripped, anonymized | Your account linked |
| Latency | +10-50ms (proxy) | Direct |
| Features | Most features supported | Full features |
| Billing | Venice credits | Provider billing |
```bash
# Use default private model
coderclaw chat --model venice/llama-3.3-70b

# Use Claude via Venice (anonymized)
coderclaw chat --model venice/claude-opus-45

# Use uncensored model
coderclaw chat --model venice/venice-uncensored

# Use vision model with image
coderclaw chat --model venice/qwen3-vl-235b-a22b

# Use coding model
coderclaw chat --model venice/qwen3-coder-480b-a35b-instruct
```
If the provider isn't working, check that your key is set and that Venice models are registered:

```bash
echo $VENICE_API_KEY
coderclaw models list | grep venice
```

Ensure the key starts with `vapi_`.
The Venice model catalog updates dynamically. Run `coderclaw models list` to see currently available models; some may be temporarily offline.
The Venice API is at https://api.venice.ai/api/v1. Ensure your network allows outbound HTTPS connections.
Example CoderClaw config with Venice as the default provider:

```jsonc
{
  env: { VENICE_API_KEY: "vapi_..." },
  agents: { defaults: { model: { primary: "venice/llama-3.3-70b" } } },
  models: {
    mode: "merge",
    providers: {
      venice: {
        baseUrl: "https://api.venice.ai/api/v1",
        apiKey: "${VENICE_API_KEY}",
        api: "openai-completions",
        models: [
          {
            id: "llama-3.3-70b",
            name: "Llama 3.3 70B",
            reasoning: false,
            input: ["text"],
            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
            contextWindow: 131072,
            maxTokens: 8192,
          },
        ],
      },
    },
  },
}
```