BawtHub

The face of the bawts.

BawtHub is everything that isn't the LLM. It's the web UI you talk to, the voice pipeline that turns microphone bytes into transcripts and TTS audio back into your speakers, the 3D avatar that mouths the words, and the dashboards for managing memory, bots, agents, Unraid, and a thousand small operational details. llm-bawt does the thinking. BawtHub is the body.

Frontend: Next.js 16.1.6 · React 19.2 · Tailwind 4.2
Backend: Python 3.12 · FastAPI · fastrtc 0.0.23
Front door: Traefik on :80

01 Two services, one origin.

BawtHub is split across two processes that share a hostname. A Next.js app handles every page, every REST proxy, and every Server-Sent Event stream to llm-bawt. A Python FastAPI service handles the realtime voice pipeline — microphone WebSocket in, Opus audio WebSocket out — plus a few admin endpoints that need direct access to the GPU services (TTS module switching, container restarts, Home Assistant pulls). Traefik routes between them on a per-path basis.

Runtime topology · single host, five containers
Edge: traefik :80 · priority-based path routing · 12+ host aliases
Web: frontend :3000 · Next.js 16 + HMR · / · /api/chat/* · /api/unraid/* · /api/agents/* · /api/avatar/*
Voice: backend :80 · FastAPI + fastrtc · /v1/ws · /v1/health · /v1/tts/*
GPU: stt :8090 · tts :8089 · Moshi · Kyutai · msgpack/ws
Upstream: llm-bawt :8642 · separate repo · OpenAI-compat · SSE turns + tool events

The frontend is the orchestrator — every chat message, every memory query, every agent task dispatch goes through a Next.js API route that proxies to llm-bawt. The Python backend exists for one reason: realtime audio doesn't belong in a Node.js process. WebSockets to STT and TTS, Opus codec handling, and the turn-by-turn voice state machine all live in Python where fastrtc, sphn, and the Moshi msgpack protocol are first-class.

02 Traefik does the splitting.

The compose stack runs a single Traefik instance on port 80. Path rules send /api/chat/*, /api/unraid/*, /api/agents/*, /api/avatar/*, /api/notifications/*, /api/preferences/*, /api/llm/*, /api/docker/*, /api/bot-colors/*, /api/camera-presets/*, /api/uploads/*, /api/media/*, and /api/clog to the Next.js container with router priority 200. A backend catch-all on PathPrefix(`/api`) at priority 100 mops up everything else (the voice WS, TTS admin, health checks) after stripping the /api prefix so the FastAPI app sees clean paths.
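
The priority rule above can be sketched in a few lines. This is a simplified model rather than Traefik's actual matcher: the router names, the abbreviated prefix list, and the priority of the page router are assumptions made for illustration, and prefix matching is reduced to a plain string check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Router:
    name: str
    prefixes: tuple[str, ...]
    priority: int
    strip_prefix: str = ""  # removed from the path before forwarding, if set


# Hypothetical, abbreviated rule set; the real compose labels list more prefixes.
ROUTERS = [
    Router("frontend-api", ("/api/chat", "/api/unraid", "/api/agents", "/api/avatar"), 200),
    Router("backend", ("/api",), 100, strip_prefix="/api"),
    Router("frontend-pages", ("/",), 1),  # assumed low-priority page router
]


def route(path: str) -> tuple[str, str]:
    """Return (router name, path as the upstream service sees it)."""
    candidates = [r for r in ROUTERS if any(path.startswith(p) for p in r.prefixes)]
    if not candidates:
        raise LookupError(f"no router matches {path}")
    # The highest priority wins, regardless of declaration order.
    best = max(candidates, key=lambda r: r.priority)
    upstream = path
    if best.strip_prefix and path.startswith(best.strip_prefix):
        upstream = path[len(best.strip_prefix):] or "/"
    return best.name, upstream
```

In this model, route("/api/chat/bots") stays on the frontend, while a prefix that nobody registered, such as route("/api/new-thing"), falls through to the backend catch-all and arrives there as "/new-thing".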

Adding a new /api/ route is a Traefik decision, not a Next decision.

If you forget to add a new prefix to the frontend's priority-200 router, the Python backend's catch-all will swallow the request and you'll get a confusing 404 from the wrong service. The compose file has a long banner comment about this. To verify, run docker inspect bawthub-frontend-1 | grep traefik.http.routers.frontend- and check that the new prefix shows up in the frontend router labels.

The same container serves a dozen host aliases — echo.ferreri.us, echo.zenoran.com, app.bawthub.com, and matching echo.lan.* + dev.echo.lan.* internal names. There's no host-based service selection; the router is host-listed but path-driven. An optional frontend-prod snapshot container behind the snapshot compose profile serves snapshot.echo.lan.* with a baked image when you need a stable build to A/B against the live HMR container.

03 What the frontend looks like.

The Next.js app is organized by feature, not by component type. Almost everything lives under src/app/ using App Router conventions. The two main route groups split the layout in half:

Group | Layout | Pages
(app)/(dashboard) | Sidebar nav, chrome, sub-nav strip | Home tiles, tools/*, agents/*, docker, studio, unraid
(app)/(fullscreen) | No chrome, edge to edge | chat, voice, avatar

Beyond pages, the top level of src/app/ is a flat surface of orchestrator components (BawtHub.tsx, AnimatedOrb.tsx, AvatarViewer.tsx, FloatingTranscript.tsx), Zustand stores (useAppStore.ts, useAvatarStore.ts), and audio hooks (useAudioProcessor.ts, useRealtimeAudioOutput.ts, useSpeechRecognition.ts). The chat surface gets its own folder with the big ChatUI.tsx (3,500+ lines), a per-bot useChatStore, and a unified SSE event stream hook.

04 State, in three categories.

BawtHub doesn't pick one state pattern. It uses three, deliberately:

A PreferencesContext reconciles the persisted Zustand state with the server preferences on first load, treating the server as authoritative but never blocking the UI on the async fetch.
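
The reconciliation rule is easier to see in miniature. The sketch below is in Python rather than the frontend's TypeScript and is not taken from the codebase: the class name, the preference keys, and the choice to keep local values when the fetch fails are all assumptions. It only illustrates "server wins on conflict, UI never waits".

```python
import asyncio


def reconcile(local: dict, server: dict) -> dict:
    """Server values win on conflict; keys only known locally survive."""
    return {**local, **server}


class PreferencesSketch:
    """Hypothetical stand-in for the persisted store plus context."""

    def __init__(self, persisted: dict):
        self.values = dict(persisted)  # usable immediately, before any fetch
        self.synced = False

    async def sync(self, fetch_server) -> None:
        try:
            server = await fetch_server()
        except Exception:
            return  # assumption: keep local values if the server is unreachable
        self.values = reconcile(self.values, server)
        self.synced = True


async def demo():
    async def fake_fetch():  # stands in for the real preferences request
        await asyncio.sleep(0.01)
        return {"theme": "dark", "voice": "example-voice"}

    prefs = PreferencesSketch({"theme": "light", "sidebar": "collapsed"})
    task = asyncio.create_task(prefs.sync(fake_fetch))
    first_paint = dict(prefs.values)  # read before the fetch has resolved
    await task
    return first_paint, prefs.values
```

The first read returns the persisted values untouched; once the fetch resolves, the server's theme replaces the local one and the local-only sidebar setting is kept.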

05 What the Python backend does.

The Python service is small by design — about 1,500 lines in main_websocket.py plus an 885-line handler.py that owns the voice state machine. Its job is:

  1. Accept the browser's microphone WebSocket. Decode Opus into 24 kHz float frames.
  2. Stream those frames to the Kyutai Moshi STT container over msgpack/WebSocket. Read back word + pause-prediction messages.
  3. When the STT signals a pause (and a configurable VAD threshold confirms it), flush the audio buffer, hand the transcript to llm-bawt's chat completions endpoint, and stream the response word-by-word.
  4. Forward each word to a TTS provider (Moshi, Azure, xAI Grok, or Kokoro adapter — chosen by voice ID). Stream the audio chunks back to the browser as Opus frames over the same WebSocket.
  5. Handle interruptions: if the STT-VAD detects user speech while the bot is talking, cancel the LLM stream and stop the TTS mid-word.
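
The turn logic in steps 3 to 5 can be sketched as a small event-driven state machine. This is an illustration under stated assumptions, not the contents of handler.py: the class, state names, and threshold value are invented, side effects are recorded as strings instead of being performed, and an incoming STT word stands in for the STT-VAD speech signal.

```python
from enum import Enum, auto


class State(Enum):
    LISTENING = auto()
    THINKING = auto()   # transcript sent, waiting for the first LLM word
    SPEAKING = auto()   # words are being forwarded to TTS


class VoiceTurnSketch:
    def __init__(self, pause_threshold: float = 0.6):  # threshold is hypothetical
        self.state = State.LISTENING
        self.pause_threshold = pause_threshold
        self.words: list[str] = []
        self.actions: list[str] = []  # log of side effects, for illustration

    def on_stt_word(self, word: str) -> None:
        if self.state is not State.LISTENING:
            # Step 5: the user spoke over the bot, so abandon the current reply.
            self.actions += ["cancel_llm", "stop_tts"]
            self.words = []
            self.state = State.LISTENING
        self.words.append(word)

    def on_pause_prediction(self, score: float) -> None:
        # Step 3: flush only when a pause is confident and there is something to send.
        if self.state is State.LISTENING and self.words and score >= self.pause_threshold:
            self.actions.append("send_to_llm:" + " ".join(self.words))
            self.words = []
            self.state = State.THINKING

    def on_llm_word(self, word: str) -> None:
        # Step 4: forward each word to TTS; words from a cancelled reply are dropped.
        if self.state in (State.THINKING, State.SPEAKING):
            self.actions.append("tts:" + word)
            self.state = State.SPEAKING

    def on_llm_done(self) -> None:
        if self.state in (State.THINKING, State.SPEAKING):
            self.state = State.LISTENING
```

Keeping the transitions free of I/O makes the interruption path easy to test in isolation; a real handler would replace the action strings with calls that cancel the LLM stream and stop TTS playback.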

It also exposes admin endpoints — /v1/voices, /v1/tts/admin/config, /v1/tts/admin/restart, /v1/tts/preview, /v1/health — that the frontend's TTS admin page calls. Container restarts go through a small shell script (restart_tts_container.sh) that talks to the mounted Docker socket so the backend can recycle the TTS worker after a module-type switch.

06 How it consumes llm-bawt.

llm-bawt is the LLM orchestration layer — chat completions, memory, tools, agent bridges, bot personalities. It runs in its own Docker stack on port 8642, exposes an OpenAI-compatible API at /v1/*, and is reached from both halves of BawtHub via BAWTHUB_LLM_URL=http://host.docker.internal:8642.
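
As a rough sketch of what "OpenAI-compatible" means for a caller, the helper below builds a streaming chat request against BAWTHUB_LLM_URL using only the standard library. The function name and the model value are hypothetical, and /v1/chat/completions is assumed from the OpenAI convention rather than confirmed against llm-bawt's route list.

```python
import json
import os
from urllib.request import Request

DEFAULT_LLM_URL = "http://host.docker.internal:8642"


def build_chat_request(messages, model="example-bot", base_url=None) -> Request:
    """Build (but do not send) a streaming chat completions request."""
    base = (base_url or os.environ.get("BAWTHUB_LLM_URL", DEFAULT_LLM_URL)).rstrip("/")
    body = {"model": model, "messages": messages, "stream": True}
    return Request(
        f"{base}/v1/chat/completions",  # assumed OpenAI-style path
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "text/event-stream"},
        method="POST",
    )
```

Passing the result to urllib.request.urlopen would send it; the sketch stops short of that so it can be inspected without a running llm-bawt instance.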

frontend/src/app/api/chat/*
The proxy floor. Every chat-flavored endpoint — bots, models, history, memory, summaries, tool-calls, turn-logs — has a thin Next.js route that forwards to llm-bawt's /v1/*. Headers, auth, and SSE pass-through happen here. The proxy/[...path] catch-all handles long-tail endpoints not worth a dedicated route file.
bawthub/llm/llm_utils.py
The voice path. The Python backend talks to llm-bawt as a stateless OpenAI client and streams word-by-word for low TTS latency. StatelessLLMStream + rechunk_to_words are the relevant helpers.
frontend/src/app/chat/useUnifiedEventStream.ts
The SSE side. A single Server-Sent Events subscription per bot consumes tool_start / tool_end / turn_complete events from llm-bawt. Used by the chat UI to render live tool activity inline as agent bots execute steps.
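
To make the event stream concrete, here is a minimal Server-Sent Events parser, written in Python for illustration even though the real hook is TypeScript. It assumes the event name travels in the event: field and the payload is JSON in data: lines; llm-bawt's actual wire format and payload fields are not documented here, so the sample payloads used with it are invented.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[tuple[str, dict]]:
    """Yield (event name, decoded JSON payload) for each complete SSE event."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format drops one optional space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)
```

A consumer would then dispatch on the event name, for example marking a tool as running on tool_start and clearing it on tool_end.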

There's nothing that talks to llm-bawt's Postgres directly. Memory writes, summary rebuilds, embedding regeneration — all go through MCP-style HTTP endpoints. This is a deliberate firewall: BawtHub knows nothing about the schema in {bot_id}_memories or profile_attributes, and it can't accidentally corrupt them.
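
For the voice path described in this section, the idea behind word-by-word streaming can be sketched as follows. The function below is a hypothetical stand-in and not the code of rechunk_to_words: it takes LLM deltas that split text at arbitrary points and re-emits whole whitespace-delimited words as soon as each one is complete, so that TTS never has to speak half a word.

```python
import re
from typing import Iterable, Iterator


def rechunk_words_sketch(deltas: Iterable[str]) -> Iterator[str]:
    """Turn arbitrary text fragments into a stream of complete words."""
    buffer = ""
    for delta in deltas:
        buffer += delta
        parts = re.split(r"\s+", buffer)
        # The last piece may still be growing, so hold it back.
        buffer = parts.pop()
        for word in parts:
            if word:
                yield word
    if buffer:
        yield buffer  # flush whatever remains when the stream ends
```

For example, the fragments "Hel", "lo wor", "ld, how", " are", " you?" come out as the words "Hello", "world,", "how", "are", "you?". Punctuation stays attached to its word, which is usually what a TTS engine wants for prosody.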

07 Deployment is just file edits.

The frontend container bind-mounts ./frontend from the host and runs next dev --webpack. Editing a .ts, .tsx, or .css file on disk triggers HMR; users see the change on the next render. There is no separate build-and-deploy step for the live site — git pull on the host is the release.

Turbopack is intentionally off.

package.json pins next dev --webpack. Turbopack broke /public/ Web Workers + AudioWorklets (the Opus decoder and the audio-output-processor), which killed voice audio streaming. The webpack-based dev server is the supported path.

The Python backend is hot-reloading too: ./bawthub is bind-mounted, and the container runs fastapi run in dev-mode with autoreload. The Moshi STT/TTS containers are static — they don't reload, but they almost never change. make rebuild-ui exists for the rare case where package.json changes warrant a fresh image. make snapshot-up brings up the baked snapshot build on a sibling hostname when you want a stable comparison.

08 External dependencies.

The compose stack is intentionally narrow: Traefik, frontend, backend, TTS, STT, plus an optional NocoDB instance. Everything else BawtHub talks to is external:

Service | Where | Why
llm-bawt | host.docker.internal:8642 | LLM completions, memory, tools, bots; separate repo, separate lifecycle
PostgreSQL | Operator-provided | Prisma schema for UI persistence; not bundled, you bring your own
Home Assistant | Configured via env | Weather widget on the home page, smart-home context for bots
Unraid | Configured via env | Container management dashboard for the home server
HuggingFace | HF_TOKEN | Pulls Moshi STT + TTS weights on first run (~20 GB)

09 Where to go from here.

Frontend
Next.js 16, App Router, Zustand stores, the SSE event bus, and how the 3,500-line chat UI keeps track of its state.
Voice
The fastrtc handler, Moshi STT/TTS msgpack, the multi-provider TTS registry, and how a word becomes audio in under 200 ms.
Avatar
VRM, GLB, and FBX in a unified three.js scene with audio-amplitude lip sync and procedural idle motion.
Surfaces
A tour of every page — chat, voice, agents, tools, unraid — with the real bot roster doing real work.
Validated against main on 2026-05-13
Source: bawthub repo (private)