This document describes provider-specific fixes applied to transcripts before a run (building model context). These are in-memory adjustments used to satisfy strict provider requirements. These hygiene steps do not rewrite the stored JSONL transcript on disk; however, a separate session-file repair pass may rewrite malformed JSONL files by dropping invalid lines before the session is loaded. When a repair occurs, the original file is backed up alongside the session file.
Scope includes:

- Image payload sanitization
- Dropping invalid assistant tool-call blocks
- Inter-session message provenance and markers
- Provider-specific transcript policies
If you need transcript storage details, see:
All transcript hygiene is centralized in the embedded runner:
- `src/agents/transcript-policy.ts`
- `sanitizeSessionHistory` in `src/agents/pi-embedded-runner/google.ts`

The policy uses `provider`, `modelApi`, and `modelId` to decide what to apply.
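A minimal sketch of what such a policy decision can look like, assuming a simplified shape (the type and function names below are illustrative, not the actual exports of `src/agents/transcript-policy.ts`):

```ts
// Hypothetical sketch; field and function names are illustrative only.
interface TranscriptPolicyInput {
  provider: string; // e.g. "openai", "google", "anthropic", "openrouter"
  modelApi: string; // e.g. "openai-responses", "anthropic-messages"
  modelId: string;  // e.g. "gemini-2.5-flash", "mistral-large-latest"
}

interface TranscriptPolicy {
  sanitizeImages: boolean;        // always applied (see image sanitization below)
  dropEmptyToolCalls: boolean;    // drop tool calls missing input/arguments
  keepThoughtSignatures: boolean; // e.g. OpenRouter Gemini keeps base64 values
}

function resolveTranscriptPolicy(input: TranscriptPolicyInput): TranscriptPolicy {
  const isOpenRouterGemini =
    input.provider === "openrouter" && input.modelId.toLowerCase().includes("gemini");
  return {
    sanitizeImages: true,
    dropEmptyToolCalls: true,
    keepThoughtSignatures: isOpenRouterGemini,
  };
}
```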
Separate from transcript hygiene, session files are repaired (if needed) before load:
- `repairSessionFileIfNeeded` in `src/agents/session-file-repair.ts`
- `run/attempt.ts` and `compact.ts` (embedded runner)
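As a rough illustration of the repair pass (an assumption about the approach; `repairSessionFileIfNeeded` may differ in naming, backup location, and edge-case handling), a JSONL repair can drop unparseable lines and back up the original file first:

```ts
import { promises as fs } from "node:fs";

// Hypothetical sketch: keep only parseable JSONL lines; back up the original
// file before rewriting it. The ".bak" suffix is an assumption.
async function repairJsonlFile(path: string): Promise<void> {
  const raw = await fs.readFile(path, "utf8");
  const lines = raw.split("\n").filter((line) => line.trim().length > 0);
  const valid = lines.filter((line) => {
    try {
      JSON.parse(line);
      return true;
    } catch {
      return false; // malformed line: exclude it from the repaired file
    }
  });
  if (valid.length === lines.length) return; // nothing to repair
  await fs.copyFile(path, `${path}.bak`); // keep a backup alongside the session file
  await fs.writeFile(path, valid.join("\n") + "\n", "utf8");
}
```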
Image payloads are always sanitized to prevent provider-side rejection due to size limits (oversized base64 images are downscaled and recompressed). This also helps control image-driven token pressure for vision-capable models: lower max dimensions generally reduce token usage, while higher dimensions preserve detail.
Implementation:
- `sanitizeSessionMessagesImages` in `src/agents/pi-embedded-helpers/images.ts`
- `sanitizeContentBlocksImages` in `src/agents/tool-images.ts`
- `agents.defaults.imageMaxDimensionPx` (default: 1200)
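As a hedged sketch of the downscale/recompress step (the real helpers may use different tooling; the use of the `sharp` library and the JPEG re-encode here are assumptions), an oversized base64 image can be resized to fit within the configured maximum dimension:

```ts
import sharp from "sharp";

// Hypothetical sketch: downscale a base64 image so its longest side fits
// within maxDimensionPx (default mirrors agents.defaults.imageMaxDimensionPx),
// then re-encode it to keep the payload small.
async function downscaleBase64Image(
  base64: string,
  maxDimensionPx = 1200,
): Promise<string> {
  const input = Buffer.from(base64, "base64");
  const output = await sharp(input)
    .resize({
      width: maxDimensionPx,
      height: maxDimensionPx,
      fit: "inside",            // preserve aspect ratio
      withoutEnlargement: true, // never upscale small images
    })
    .jpeg({ quality: 80 })      // recompress; quality value is an assumption
    .toBuffer();
  return output.toString("base64");
}
```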
Assistant tool-call blocks that are missing both input and arguments are dropped before model context is built. This prevents provider rejections from partially persisted tool calls (for example, after a rate limit failure).
Implementation:
- `sanitizeToolCallInputs` in `src/agents/session-transcript-repair.ts`
- `sanitizeSessionHistory` in `src/agents/pi-embedded-runner/google.ts`
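A minimal sketch of the filtering step, assuming a simplified message shape (the real `sanitizeToolCallInputs` operates on CoderClaw's actual transcript types):

```ts
// Hypothetical message shapes for illustration only.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_call"; name: string; input?: unknown; arguments?: unknown };

interface AssistantMessage {
  role: "assistant";
  content: ContentBlock[];
}

// Drop tool-call blocks that carry neither `input` nor `arguments`, so a
// partially persisted call (e.g. after a rate limit failure) never reaches
// the provider.
function dropEmptyToolCalls(message: AssistantMessage): AssistantMessage {
  return {
    ...message,
    content: message.content.filter(
      (block) =>
        block.type !== "tool_call" ||
        block.input !== undefined ||
        block.arguments !== undefined,
    ),
  };
}
```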
When an agent sends a prompt into another session via `sessions_send` (including agent-to-agent reply/announce steps), CoderClaw persists the created user turn with:
`message.provenance.kind = "inter_session"`

This metadata is written at transcript append time and does not change the role (`role: "user"` remains for provider compatibility). Transcript readers can use it to avoid treating routed internal prompts as end-user-authored instructions.
During context rebuild, CoderClaw also prepends a short [Inter-session message]
marker to those user turns in-memory so the model can distinguish them from
external end-user instructions.
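A sketch of how the two pieces fit together, assuming a simplified transcript-message shape (the field access and marker text below mirror the description above but are not the actual implementation):

```ts
// Hypothetical transcript-message shape; the real types live in CoderClaw's
// transcript code and may differ.
interface TranscriptMessage {
  role: "user" | "assistant";
  content: string;
  provenance?: { kind?: string };
}

// During context rebuild, prepend an in-memory marker to user turns that were
// routed from another session. The stored transcript is left untouched.
function markInterSessionTurns(messages: TranscriptMessage[]): TranscriptMessage[] {
  return messages.map((message) =>
    message.role === "user" && message.provenance?.kind === "inter_session"
      ? { ...message, content: `[Inter-session message] ${message.content}` }
      : message,
  );
}
```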
The policy distinguishes the following provider families:

- OpenAI / OpenAI Codex
- Google (Generative AI / Gemini CLI / Antigravity)
- Anthropic / Minimax (Anthropic-compatible)
- Mistral (including model-id based detection)
- OpenRouter Gemini: `thought_signature` values (keep base64)
- Everything else
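As a rough illustration of how those families could be detected from `provider` and `modelId` (the string matching and provider names here are assumptions, not the real detection logic):

```ts
// Hypothetical classification of the provider families listed above.
type ProviderFamily =
  | "openai"
  | "google"
  | "anthropic"
  | "mistral"
  | "openrouter-gemini"
  | "other";

function detectProviderFamily(provider: string, modelId: string): ProviderFamily {
  const id = modelId.toLowerCase();
  if (provider === "openrouter" && id.includes("gemini")) return "openrouter-gemini";
  if (provider === "openai" || provider === "openai-codex") return "openai";
  if (provider === "google") return "google";
  if (provider === "anthropic" || provider === "minimax") return "anthropic";
  if (provider === "mistral" || id.includes("mistral")) return "mistral"; // model-id based detection
  return "other";
}
```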
Before the 2026.1.22 release, CoderClaw applied multiple layers of transcript hygiene, including stripping `<final>` tags from assistant text before persistence. This complexity caused cross-provider regressions (notably `openai-responses` `call_id`/`fc_id` pairing). The 2026.1.22 cleanup removed the extension, centralized the logic in the runner, and made OpenAI no-touch beyond image sanitization.