CoderClaw tracks tokens, not characters. Tokens are model-specific, but most OpenAI-style models average ~4 characters per token for English text.
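For example, at that ratio a 20,000-character file works out to roughly 5,000 tokens.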
CoderClaw assembles its own system prompt on every run. It includes:

- The injected workspace files (AGENTS.md, SOUL.md, TOOLS.md, IDENTITY.md, USER.md, HEARTBEAT.md, BOOTSTRAP.md when new, plus MEMORY.md and/or memory.md when present).
- Large files truncated per `agents.defaults.bootstrapMaxChars` (default: 20000), with total bootstrap injection capped by `agents.defaults.bootstrapTotalMaxChars` (default: 150000).
- `memory/*.md` files, which are loaded on demand via memory tools and are not auto-injected.

See the full breakdown in System Prompt.
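As a quick sketch of those truncation knobs, assuming the dotted key paths expand into the same `agents.defaults` YAML layout used elsewhere on this page, raising the per-file limit might look like:

```yaml
agents:
  defaults:
    bootstrapMaxChars: 40000        # per-file truncation limit (default: 20000)
    bootstrapTotalMaxChars: 150000  # cap on total injected bootstrap text (default)
```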
Everything the model receives counts toward the context limit: the system prompt, conversation history, tool results, and any images.
For images, CoderClaw downscales transcript/tool image payloads before provider calls.
Use `agents.defaults.imageMaxDimensionPx` (default: 1200) to tune this:
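A minimal sketch, assuming the same `agents.defaults` layout as the examples further down this page:

```yaml
agents:
  defaults:
    imageMaxDimensionPx: 800  # default is 1200; lower it to shrink screenshot payloads
```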
For a practical breakdown (per injected file, tools, skills, and system prompt size), use `/context list` or `/context detail`. See Context.
Use these in chat:

- `/status` → an emoji-rich status card with the session model, context usage, last response input/output tokens, and estimated cost (API key only).
- `/usage off|tokens|full` → appends a per-response usage footer to every reply (`responseUsage`).
- `/usage cost` → shows a local cost summary from CoderClaw session logs.

Other surfaces:

- `/status` + `/usage` are supported.
- `coderclaw status --usage` and `coderclaw channels list` show provider quota windows (not per-response costs).

Costs are estimated from your model pricing config:
`models.providers.<provider>.models[].cost`

These are USD per 1M tokens for `input`, `output`, `cacheRead`, and `cacheWrite`. If pricing is missing, CoderClaw shows tokens only. OAuth tokens never show a dollar cost.
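A hedged sketch of one pricing entry (the `id` field name and the dollar figures are illustrative assumptions, not published rates):

```yaml
models:
  providers:
    anthropic:
      models:
        - id: "claude-opus-4-6"  # hypothetical entry for illustration
          cost:
            input: 15.00      # USD per 1M input tokens
            output: 75.00     # USD per 1M output tokens
            cacheRead: 1.50   # USD per 1M cache-read tokens
            cacheWrite: 18.75 # USD per 1M cache-write tokens
```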
Provider prompt caching only applies within the cache TTL window. CoderClaw can optionally run cache-ttl pruning: it prunes the session once the cache TTL has expired, then resets the cache window so subsequent requests can re-use the freshly cached context instead of re-caching the full history. This keeps cache write costs lower when a session goes idle past the TTL.
Configure it in Gateway configuration and see the behavior details in Session pruning.
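Purely as an illustration (the key names below are hypothetical; the real option names are documented in Gateway configuration):

```yaml
# Hypothetical sketch of enabling cache-ttl pruning; not the actual schema.
gateway:
  pruning:
    cacheTtl: "1h"  # match your provider's cache TTL so pruning fires as the window expires
```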
Heartbeat can keep the cache warm across idle gaps. If your model cache TTL
is 1h, setting the heartbeat interval just under that (e.g., 55m) can avoid
re-caching the full prompt, reducing cache write costs.
For Anthropic API pricing, cache reads are significantly cheaper than input tokens, while cache writes are billed at a higher multiplier. See Anthropic’s prompt caching pricing for the latest rates and TTL multipliers: https://docs.anthropic.com/docs/build-with-claude/prompt-caching
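As a rough worked example using Anthropic's published multipliers: at a $3/MTok base input rate, 1-hour cache writes bill at 2× ($6/MTok) and cache reads at 0.1× ($0.30/MTok), so keeping a warm cache and paying mostly read rates is far cheaper than repeatedly re-writing the full prompt. Putting the heartbeat tip above into config: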
```yaml
agents:
  defaults:
    model:
      primary: "anthropic/claude-opus-4-6"
    models:
      "anthropic/claude-opus-4-6":
        params:
          cacheRetention: "long"
    heartbeat:
      every: "55m"
```
Anthropic’s 1M context window is currently beta-gated. CoderClaw can inject the required `anthropic-beta` value when you enable `context1m` on supported Opus or Sonnet models:
```yaml
agents:
  defaults:
    models:
      "anthropic/claude-opus-4-6":
        params:
          context1m: true
```
This maps to Anthropic’s `context-1m-2025-08-07` beta header.
To keep context usage down:

- `/compact` to summarize long sessions.
- `agents.defaults.imageMaxDimensionPx` for screenshot-heavy sessions.
- See Skills for the exact skill list overhead formula.