The SDK is a subprocess.
Anthropic ships a Claude Agent SDK that bundles a Claude Code binary and streams JSON events over stdio. The bridge wraps that stream in the same Redis envelope every other agent backend uses, so a chat in the BawtHub UI doesn't know, and shouldn't care, whether the tools running on the other side come from a local llama.cpp tool loop, a WebSocket gateway, or a Claude binary that an SDK spawned five seconds ago.
01 Why a bridge, not an import.
The Claude Agent SDK is a Python package, and it would be technically possible to import it inside the main llm-bawt-app FastAPI process. The bridge exists for three reasons, in increasing order of importance:
- Process isolation. The SDK spawns a child Claude binary. If that binary crashes (it has, with exit code: 1 on bad OAuth state) it shouldn't take the main API down with it.
- Restart-without-restart. The bridge is its own Docker service. docker compose restart claude-code-bridge recycles the SDK + binary without bouncing the API; the API just sees commands enqueue while the bridge is starting up.
- Uniformity. OpenClaw was the first agent backend, and it lives across a WebSocket on a different host. Claude Code came second. Codex came third. They all speak the same Redis protocol (chat.send in, AgentEvent out) because that's the only way the SSE generator on the API side can stay backend-agnostic.
02 Topology.
┌──────────────────────────────────────────────────┐
│ llm-bawt-app (FastAPI) │
│ ClaudeCodeBackend.stream_raw() │
│ → RedisSubscriber.send_command(backend= │
│ "claude-code", session_key="claude:nick", │
│ message, system_prompt, model, ...) │
└─────────────────────┬────────────────────────────┘
│ XADD agent:commands
▼
┌──────────────────────────────────────────────────┐
│ Redis Streams │
│ agent:commands (consumer group: │
│ "claude-code-bridge"; filter backend == │
│ "claude-code", ACK others) │
└─────────────────────┬────────────────────────────┘
│ XREADGROUP
▼
┌──────────────────────────────────────────────────┐
│ claude-code-bridge container │
│ • SessionQueue.lock(session_key) │
│ • _get_session(bot) → resume UUID if same model│
│ • claude_agent_sdk.query(options) │
│ • iterate async msg stream: │
│ StreamEvent → ASSISTANT_DELTA │
│ ToolUseBlock → TOOL_START │
│ ToolResultBlk → TOOL_END │
│ ResultMessage → ASSISTANT_DONE │
└─────────────────────┬────────────────────────────┘
│ XADD agent:run:{req_id}
▼
API subscribes
→ SSE to frontend
The chat path in AgentBridgeBackend.stream_raw() (which ClaudeCodeBackend inherits unchanged) creates a per-request UUID, opens a RedisSubscriber.subscribe_run(request_id) async iterator, and yields each event into the SSE generator. The bridge never knows there's a Next.js frontend on the other end — it publishes to a stream, and whoever's subscribed reads.
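The request/subscribe round trip described above can be sketched without Redis at all. What follows is a minimal illustration, not the actual stream_raw() code: InMemoryBus, fake_bridge, and the dict-shaped events are hypothetical stand-ins, and treating ASSISTANT_DONE or ERROR as the terminal event is an assumption.

```python
import asyncio
import uuid
from collections import defaultdict


class InMemoryBus:
    """Hypothetical stand-in for Redis Streams: one asyncio.Queue per stream name."""

    def __init__(self):
        self.streams = defaultdict(asyncio.Queue)

    async def xadd(self, stream, fields):
        await self.streams[stream].put(fields)

    async def read(self, stream):
        return await self.streams[stream].get()


async def stream_raw(bus, message, session_key):
    """API side: enqueue one command, then yield events from the per-request stream."""
    request_id = str(uuid.uuid4())  # per-request UUID
    await bus.xadd("agent:commands", {
        "backend": "claude-code",
        "request_id": request_id,
        "session_key": session_key,
        "message": message,
    })
    run_stream = f"agent:run:{request_id}"
    while True:
        event = await bus.read(run_stream)
        yield event
        # Assumption: these two kinds end the run.
        if event["kind"] in ("ASSISTANT_DONE", "ERROR"):
            break


async def fake_bridge(bus):
    """Bridge side (toy): echo the message back word by word, then finish."""
    cmd = await bus.read("agent:commands")
    run_stream = f"agent:run:{cmd['request_id']}"
    for word in cmd["message"].split():
        await bus.xadd(run_stream, {"kind": "ASSISTANT_DELTA", "text": word})
    await bus.xadd(run_stream, {"kind": "ASSISTANT_DONE", "text": cmd["message"]})


async def demo():
    bus = InMemoryBus()
    bridge_task = asyncio.create_task(fake_bridge(bus))
    events = [e async for e in stream_raw(bus, "hello there", "claude:nick")]
    await bridge_task
    return events
```

Neither side holds a reference to the other; the only coupling is the stream name derived from the request ID, which is the property the real design relies on.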
03 Session keys and resume semantics.
The Claude Agent SDK has its own session concept — a UUID it assigns when query() first runs. To resume a conversation across turns the bridge has to pass that UUID back as resume_id. llm-bawt stores it in the bot's agent_backend_config.session_key column.
But the SDK session UUID is not the routing key. The Redis routing key is built in ClaudeCodeBackend._resolve_session_key():
def _resolve_session_key(self, config: dict) -> str:
    bot_id = (config.get("bot_id") or "main").strip() or "main"
    user_id = (config.get("user_id") or "default").strip() or "default"
    return f"{bot_id}:{user_id}"
So Nick chatting with the claude bot routes through claude:nick; Megan chatting with the same bot routes through claude:megan. Two independent Claude conversations, same bot persona, same Postgres row. The SDK UUID is stored per (bot, user) pair in a small server-side map keyed by routing key.
When a new chat.send arrives, the bridge looks up the existing session for that bot via _get_session(). If the model hasn't changed, it passes resume_id to the SDK and Claude continues the prior conversation. If the model has changed (e.g. someone PATCH'd the bot from sonnet to opus[1m]), the bridge clears the session and lets the SDK start fresh — Claude can't resume a Sonnet thread on Opus.
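The resume-or-reset decision boils down to a few lines. This is a sketch under assumptions: StoredSession and resolve_resume_id are hypothetical names, and the real _get_session() may track more state than a UUID and a model string.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredSession:
    """Hypothetical record of what the bridge remembers per routing key."""
    sdk_session_id: str  # UUID the SDK assigned on the first query()
    model: str           # model the session was created with


def resolve_resume_id(stored: Optional[StoredSession], requested_model: str) -> Optional[str]:
    """Return the UUID to resume, or None to let the SDK start a fresh session."""
    if stored is None:
        return None  # first turn for this routing key
    if stored.model != requested_model:
        return None  # model changed: the old thread can't be resumed
    return stored.sdk_session_id
```

For example, a session stored with model "sonnet" resumes when the next turn also asks for "sonnet", and returns None once the bot has been switched to "opus[1m]".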
The user message /new short-circuits the same way: clear session, drop the prefix, run normally. If the message was only /new, the bridge synthesizes an ASSISTANT_DONE with the text "Session reset. Ready for a new conversation." and never calls the SDK.
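A sketch of the /new prefix handling, assuming the bridge requires whitespace between /new and the rest of the prompt (the source does not say); split_new_command is a hypothetical helper name.

```python
RESET_TEXT = "Session reset. Ready for a new conversation."


def split_new_command(message: str):
    """Return (reset_requested, remaining_prompt) for an incoming user message."""
    stripped = message.strip()
    if stripped == "/new":
        return True, ""  # bare reset: nothing left to send to the SDK
    # Assumption: "/new" only counts as a prefix when followed by whitespace,
    # so "/newsletter" is treated as an ordinary message.
    if stripped.startswith("/new") and stripped[4:5].isspace():
        return True, stripped[4:].lstrip()
    return False, message


def handle(message: str):
    """Toy dispatcher: a synthesized event for a bare /new, else the prompt to run."""
    reset, prompt = split_new_command(message)
    if reset and not prompt:
        return {"kind": "ASSISTANT_DONE", "text": RESET_TEXT}
    return {"run_prompt": prompt, "fresh_session": reset}
```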
04 SDK events → AgentEvent.
The bridge calls claude_agent_sdk.query(prompt, options=ClaudeAgentOptions(...)) and iterates its async message stream. The translation isn't fancy — it's just a long isinstance chain — but the mapping is the load-bearing part of the whole bridge.
| SDK message | Block / field | Emits |
|---|---|---|
| StreamEvent | content_block_delta · text_delta | ASSISTANT_DELTA with text=delta.text |
| StreamEvent | content_block_start · tool_use | (captures tool name; emits TOOL_START on the AssistantMessage) |
| AssistantMessage | ToolUseBlock | TOOL_START · tool_name=block.name, tool_arguments=block.input |
| UserMessage | ToolResultBlock | TOOL_END · tool_result truncated to 2000 chars |
| ResultMessage | full text + usage | ASSISTANT_DONE · token_usage (input/cache/output, context window, cost) |
| SDK exception | — | ERROR · run_done |
Every event is then wrapped in AgentEvent and published via RedisPublisher.publish_event() to agent:run:{request_id}. The provider="claude-code" field is stamped at the publisher boundary so the frontend's ClaudeToolCallCard can dispatch on it.
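The isinstance chain from the table can be sketched as below. The dataclasses are simplified, hypothetical stand-ins that only mirror the SDK's type names so the example runs without claude_agent_sdk installed; the field names and the dict-shaped output events are assumptions, not the bridge's actual AgentEvent.

```python
from dataclasses import dataclass

MAX_TOOL_RESULT_CHARS = 2000  # truncation limit from the table above


# Hypothetical stand-ins for the SDK message and block types.
@dataclass
class StreamEvent:
    event: dict  # raw streaming event payload

@dataclass
class ToolUseBlock:
    name: str
    input: dict

@dataclass
class ToolResultBlock:
    content: str

@dataclass
class AssistantMessage:
    content: list

@dataclass
class UserMessage:
    content: list

@dataclass
class ResultMessage:
    result: str
    usage: dict


def translate(msg):
    """Map one SDK message to zero or more event dicts."""
    events = []
    if isinstance(msg, StreamEvent):
        ev = msg.event
        delta = ev.get("delta", {})
        if ev.get("type") == "content_block_delta" and delta.get("type") == "text_delta":
            events.append({"kind": "ASSISTANT_DELTA", "text": delta["text"]})
        # content_block_start for tool_use emits nothing here; TOOL_START
        # is produced when the AssistantMessage arrives.
    elif isinstance(msg, AssistantMessage):
        for block in msg.content:
            if isinstance(block, ToolUseBlock):
                events.append({"kind": "TOOL_START", "tool_name": block.name,
                               "tool_arguments": block.input})
    elif isinstance(msg, UserMessage):
        for block in msg.content:
            if isinstance(block, ToolResultBlock):
                events.append({"kind": "TOOL_END",
                               "tool_result": str(block.content)[:MAX_TOOL_RESULT_CHARS]})
    elif isinstance(msg, ResultMessage):
        events.append({"kind": "ASSISTANT_DONE", "text": msg.result,
                       "token_usage": msg.usage})
    return events
```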
05 Token usage: the cumulative-vs-iteration trap.
One thing the bridge had to get right that the internal docs glossed over: ResultMessage.usage is cumulative across all internal API iterations in the turn. For a tool-heavy turn that re-reads the cached system prompt on every iteration, the summed cache_read_input_tokens can exceed the model's actual context window and produce nonsense over-100% counters in the UI's per-bubble usage pill.
The fix: the bridge captures AssistantMessage.usage on every assistant message during the turn, keeps only the last one, and uses that "iteration view" for the input_tokens / cache_read_tokens / cache_creation_tokens fields surfaced to the UI. That represents the model's final view of the context — what a human really wants to see as "context fullness." Only output_tokens and total_cost_usd are taken from the cumulative ResultMessage, because those genuinely accumulate.
CLAUDE_CODE_BRIDGE.md documents the bridge but doesn't mention the per-iteration usage handling — that was added later when the >100% context bar showed up in production. The bridge's docstring at bridge.py:752–810 is the current source of truth.
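The iteration-versus-cumulative split described above can be illustrated with a small merge function. This is a sketch, not the code at bridge.py: build_token_usage is a hypothetical name, the usage dicts are assumed to use the cache_read_input_tokens / cache_creation_input_tokens keys, and the numbers in the example are made up.

```python
def build_token_usage(last_assistant_usage: dict, result_usage: dict,
                      total_cost_usd: float) -> dict:
    """Combine the last iteration's context view with the turn's cumulative totals."""
    return {
        # Context-side fields: what the model saw on its final iteration.
        "input_tokens": last_assistant_usage.get("input_tokens", 0),
        "cache_read_tokens": last_assistant_usage.get("cache_read_input_tokens", 0),
        "cache_creation_tokens": last_assistant_usage.get("cache_creation_input_tokens", 0),
        # Fields that genuinely accumulate over the turn.
        "output_tokens": result_usage.get("output_tokens", 0),
        "total_cost_usd": total_cost_usd,
    }


# Illustrative turn: three iterations, each re-reading ~90k cached tokens.
last_iteration = {"input_tokens": 12, "cache_read_input_tokens": 90_000,
                  "cache_creation_input_tokens": 1_500}
cumulative = {"input_tokens": 36, "cache_read_input_tokens": 270_000,
              "output_tokens": 2_400}
usage = build_token_usage(last_iteration, cumulative, total_cost_usd=0.42)
```

In this made-up turn the cumulative cache reads (270,000) would overflow a 200,000-token window on paper, while the iteration view reports 90,000.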
06 OAuth, not API key.
The bridge authenticates through a Claude subscription (Max/Pro), not metered API billing. The OAuth bundle lives at ~/.claude/.credentials.json on the host and is bind-mounted into the container at /home/bridge/.claude/.credentials.json. The bundle has the shape:
{
  "claudeAiOauth": {
    "accessToken": "sk-ant-oat01-...",
    "refreshToken": "...",
    "expiresAt": 1762531200000,
    "scopes": ["user:inference", "user:profile"]
  }
}
The bridge's _get_fresh_oauth_token() reads this on each call, checks if expiresAt is within 5 minutes of now, and if so POSTs to https://platform.claude.com/v1/oauth/token with grant_type=refresh_token to mint a fresh access token. The refreshed bundle gets written back to the same file — which means a claude setup-token on the host self-heals the bridge on the next turn, no docker restart required.
CLAUDE_CODE_OAUTH_TOKEN env var is the fallback path, used only when the credentials file is unreadable. In normal operation the file is canonical.
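The expiry check and the env-var fallback might look roughly like this. It is a sketch with hypothetical helper names (needs_refresh, load_access_token); it assumes expiresAt is epoch milliseconds, as the sample value suggests, and it deliberately leaves out the refresh POST itself.

```python
import json
import time

REFRESH_SKEW_MS = 5 * 60 * 1000  # refresh when within 5 minutes of expiry


def needs_refresh(bundle: dict, now_ms=None) -> bool:
    """True if the access token expires within the skew window (or already has)."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    expires_at = bundle["claudeAiOauth"]["expiresAt"]
    return expires_at - now_ms <= REFRESH_SKEW_MS


def load_access_token(credentials_path: str, env: dict):
    """Prefer the credentials file; fall back to the env var if it is unreadable."""
    try:
        with open(credentials_path) as fh:
            return json.load(fh)["claudeAiOauth"]["accessToken"]
    except (OSError, ValueError, KeyError):
        # Missing file, bad JSON, or unexpected shape: use the fallback path.
        return env.get("CLAUDE_CODE_OAUTH_TOKEN")
```

Passing now_ms explicitly keeps the check deterministic in tests; in normal use it defaults to the current time.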
07 Permission mode and the non-root container.
The bridge runs as a non-root user named bridge with permission_mode="bypassPermissions". That mode is the Claude CLI's flag that disables every "should I run this tool?" prompt — which is the only behavior that makes sense for an unattended service. Bypass mode also refuses to run as root, which is why the Dockerfile creates the bridge user. Trying to USER 0 the container will hard-fail at SDK startup.
Mounts are deliberate. The bridge sees:
| Host | Container | Why |
|---|---|---|
| ~/dev | /home/bridge/dev | Repos. Claude can Read, Edit, Bash across the user's whole dev tree. |
| ~/.config/claude-code-bridge | /home/bridge/.claude | Settings, MCP config, CLAUDE.md, credentials. |
| ~/dev/agent-skills | /home/bridge/.claude/skills | Shared skills repo, directly mounted into Claude's expected path. |
| ~/.ssh | /home/bridge/.ssh (ro) | SSH keys for git pulls and remote host access. |
The skill mount means editing a skill on the host at ~/dev/agent-skills/<skill>/SKILL.md is visible to Claude on the next turn. No symlinks, no rebuilds.
08 MCP server bootstrap.
The bridge loads MCP server definitions from ~/.claude/settings.json at startup and passes them to the SDK as ClaudeAgentOptions(mcp_servers=...). The same llm-bawt-memory MCP server that powers memory tools also exposes the task tools — so when Caid is running through this bridge, he can call tasks_update and memory_search from inside the SDK loop.
One small piece of magic: the bridge appends a system-prompt suffix per call:
    ## MCP Tool Context
    Your bot_id is "caid".
    When using llm-bawt-memory MCP tools:
    - Memory/message tools: always pass bot_id="caid"
    - Profile tool with entity_type="user": use entity_id="nick"
    - Profile tool with entity_type="bot": use entity_id="caid" (yourself)
This injection means the agent never has to remember its own slug — the bridge slips that context in before every turn.
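Generating that suffix per call is a small templating job. The function below is a sketch: build_mcp_context_suffix is a hypothetical name, and the exact line breaks of the real suffix are an assumption.

```python
def build_mcp_context_suffix(bot_id: str, user_id: str) -> str:
    """Render the per-call system-prompt suffix for the given bot and user."""
    return "\n".join([
        "## MCP Tool Context",
        f'Your bot_id is "{bot_id}".',
        "When using llm-bawt-memory MCP tools:",
        f'- Memory/message tools: always pass bot_id="{bot_id}"',
        f'- Profile tool with entity_type="user": use entity_id="{user_id}"',
        f'- Profile tool with entity_type="bot": use entity_id="{bot_id}" (yourself)',
    ])


def with_mcp_context(system_prompt: str, bot_id: str, user_id: str) -> str:
    """Append the suffix to whatever system prompt the API sent."""
    return f"{system_prompt.rstrip()}\n\n{build_mcp_context_suffix(bot_id, user_id)}"
```

Because the suffix is derived from the command's bot and user on every call, the same bridge process can serve several bots without any of them carrying their slug in their own prompt.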
09 Stamping the originating message.
Every chat.send the API sends carries a trigger_message_id — the UUID of the user's chat bubble that kicked off this turn. The bridge stashes it in self._trigger_message_ids[request_id] when the send starts, and _publish_event() reads it back so every emitted AgentEvent for that request carries the same trigger_message_id. The frontend then buckets tool activity under the originating user bubble without ever having to guess from turn_id heuristics.
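The stash-and-stamp pattern can be sketched as follows. EventStamper is a hypothetical class; only the _trigger_message_ids map and the idea of stamping inside the publish path come from the description above, and the finish() cleanup is an assumption.

```python
class EventStamper:
    """Toy publisher that stamps every event with its originating message ID."""

    def __init__(self):
        self._trigger_message_ids = {}  # request_id -> trigger_message_id
        self.published = []             # stand-in for XADD to agent:run:{request_id}

    def start_send(self, request_id: str, trigger_message_id: str) -> None:
        # Stash the ID once, when the chat.send starts.
        self._trigger_message_ids[request_id] = trigger_message_id

    def publish_event(self, request_id: str, event: dict) -> dict:
        # Every event for the request gets the same trigger_message_id and provider.
        stamped = dict(
            event,
            provider="claude-code",
            trigger_message_id=self._trigger_message_ids.get(request_id),
        )
        self.published.append((f"agent:run:{request_id}", stamped))
        return stamped

    def finish(self, request_id: str) -> None:
        # Assumption: the entry is dropped once the run ends, so the map stays small.
        self._trigger_message_ids.pop(request_id, None)
```

Call sites only build the event payload; the stamping happens in one place, so no emitter can forget it.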
10 What it looks like in the UI.
11 Key files.
- src/claude_code_bridge/bridge.py
- src/claude_code_bridge/__main__.py
- src/llm_bawt/agent_backends/claude_code.py: Subclass of AgentBridgeBackend; only overrides name = "claude-code" and _resolve_session_key() to route per (bot, user).
- src/llm_bawt/agent_backends/agent_bridge.py: AgentBridgeBackend.stream_raw(), the actual implementation that all three integrations subclass. Sends the command, subscribes to the run stream, translates events back to deltas and tool dicts for the SSE generator.
- src/agent_bridge/events.py: AgentEvent dataclass with AgentEventKind enum: ASSISTANT_DELTA, ASSISTANT_DONE, TOOL_START, TOOL_END, USER_MESSAGE, RUN_STARTED, RUN_COMPLETED, SYSTEM_NOTE, ERROR.
- src/agent_bridge/publisher.py: agent:commands, agent:run:{id}, agent:events:{session}. Stamps provider at publish time so call sites don't have to.
- Dockerfile.bridge: python:3.12-slim, used by both claude-code-bridge and openclaw-bridge. Non-root bridge user.

main on 2026-05-13
Source: llm-bawt agent backends