cross-cutting · data + schema

One Postgres. Two schemas.

BawtHub and llm-bawt share a database — TimescaleDB-flavored Postgres 16 with pgvector — but they don't share tables. llm-bawt owns the memory, sessions, profiles, turn logs, and the OpenClaw event archive. BawtHub owns its presentation tables: avatar settings, the Unraid container catalog, the agent task system, and notifications. Two ORMs, one connection string, zero foreign keys across the boundary.

DB engine: timescale/timescaledb-ha:pg16 llm-bawt ORM: SQLAlchemy + SQLModel BawtHub ORM: Prisma (TypeScript) Vector index: pgvector HNSW, 384d

01 Where the database lives.

The Postgres container is postgres-pgvector on Unraid, network br0.2, image timescale/timescaledb-ha:pg16. The choice of the TimescaleDB image isn't really about time series — it's that the HA image ships with pgvector and a fat catalog of useful extensions preinstalled. Nothing in BawtHub depends on hypertables today.

Both llm-bawt-app on echo (10.0.0.101) and bawthub-frontend (also on echo) reach Postgres over the LAN. Connection strings are configured via LLM_BAWT_POSTGRES_* environment variables for llm-bawt and a single DATABASE_URL for the Prisma client in the BawtHub frontend container.

02 Schema layout.

Despite living in the same database, the two services' tables don't overlap and there are no cross-schema joins. Each side treats the other's data as opaque — when BawtHub needs message history it calls llm-bawt's REST API; when llm-bawt needs avatar metadata it doesn't, because it isn't llm-bawt's concern.

Tables grouped by owning service

Memory

{bot}_messages{bot}_memories{bot}_forgotten_messagessessions

Identity

entity_profilesentity_profile_attributesbot_profiles

Runtime

turn_logstool_call_recordsscheduled_jobsjob_runsruntime_settings

OpenClaw

agent_eventsagent_session_stateagent_runs

Prompts

prompt_templatesprompt_template_versions

BawtHub

AgentProjectAgentTaskAgentStepAgentActivityUnraidContainerAvatarSettings+more

03 Memory (per bot, plus shared sessions).

The five-layer memory model is covered in its own page; this is the disk-level view. Per bot, llm-bawt creates three tables — {bot}_messages, {bot}_memories, {bot}_forgotten_messages — by sanitizing the slug into a table name and running CREATE TABLE IF NOT EXISTS on first access. The memories table has a 384-dim pgvector embedding column added via raw SQL because SQLAlchemy doesn't ship a native pgvector type.

Table	Per-bot?	Notable columns	Indexes
`{bot}_messages`	Yes	`id` (UUID), `role`, `content`, `timestamp`, `session_id`, `processed`, `summarized`, `recalled_history`	FTS on `content`
`{bot}_memories`	Yes	`id`, `content`, `tags JSONB`, `importance`, `source_message_ids`, `access_count`, `intent`, `stakes`, `emotional_charge`, `recurrence_keywords`, `embedding vector(384)`	HNSW on `embedding`
`{bot}_forgotten_messages`	Yes	Soft-deleted rows from messages, with `forgotten_at`	—
`sessions`	Shared (column `bot_id`)	`id`, `bot_id`, `started_at`, `ended_at`, `status`, `session_metadata JSONB`	`(bot_id, started_at)`, `(status)`

The per-bot pattern means table count grows linearly with bot count. That's deliberate — it keeps each bot's vector index small and makes "drop this bot" an actual DROP TABLE, not a tombstone migration. The downside is cross-bot search has to fan out, which the memory_search_all MCP tool handles.

04 Identity tables (shared).

Unlike memory, identity is shared across bots. A fact about Nick shouldn't have to be relearned by Snark, Mira, and Codex independently.

Table	Purpose
`entity_profiles`	One row per entity (a user or a bot). Keyed by `(entity_type, entity_id)`. Holds entity-level metadata.
`entity_profile_attributes`	Typed key/value attributes about each entity. Each row carries `confidence`, `source_message_id`, `extracted_at`. Bot attributes capture emergent personality drift; user attributes capture facts about humans.
`bot_profiles`	The persisted bot configuration — slug, name, color, system prompt, `agent_backend`, `agent_backend_config JSONB` (which holds session keys, model choices, timeouts), feature flags like `uses_tools`, `include_summaries`, `voice_optimized`, `tts_mode`.

The agent_backend_config.session_key field is where bridges write the SDK-internal thread/session UUID — for Claude Code it's the Agent SDK session, for Codex it's the thread_id. When you send /new as your first word, the chat-routes handler clears this field, and the next turn starts a fresh upstream conversation.

05 Runtime + turn logs.

Every chat completion gets a row in turn_logs, persisted at request-start with status="streaming" so even a backend crash leaves a trace. The row gets backfilled with prepared messages, tool call details, response text, latency, and final status as the turn progresses.

Table	What it stores
`turn_logs`	One row per chat completion. Holds `user_prompt`, `response_text`, `tool_calls_json`, `error_text`, `latency_ms`, `token_usage_json`, `trigger_message_id` (frontend UUID), `agent_session_key`, `agent_request_id`, `animation` (for avatar dispatch).
`tool_call_records`	Incrementally appended during a turn — one row per tool invocation. `turn_id`, `tool_name`, `arguments_json`, `result_text`, `iteration`, `started_at`/`ended_at`, `duration_ms`. Survives the turn-log `tool_calls_json` being out of date because that field is only written at finalization.
`scheduled_jobs` / `job_runs`	Cron-style automation: per-bot scheduled prompts (e.g. nightly summarization), with one `job_runs` row per execution recording status and output.
`runtime_settings`	Hot-reloadable configuration values. Edited via the BawtHub `/tools/settings` page; consumed by llm-bawt on the next request without restart.

Turn logs have a TTL — default 168 hours — and a periodic cleanup job. Tool call records persist alongside their parent turn. The trigger_message_id column on turn_logs is what lets the frontend tie a turn log back to the user message that started it; the column was backfilled retroactively when the field was added.

06 The OpenClaw event archive.

Streaming events live in Redis (see the streaming page) — but every event is also persisted to Postgres for replay and durability:

Table	Purpose
`agent_events`	Every `AgentEvent` the bridges have published, deduped by `event_dedupe_key` (SHA-256 of session/kind/payload/seq). Indexed on `(session_key, created_at)`, `(session_key, run_id)`, `(session_key, seq)`. `payload_json JSONB` holds the full raw event.
`agent_session_state`	Per-session cursor (`last_event_id`) for replay-gap detection. Tracks `ws_connected` and an opaque `last_cursor` for the upstream gateway.
`agent_runs`	One row per agent run with assembled `full_text`, `tool_calls_json`, status, and started/completed timestamps. This is the closest thing to a "chat transcript" the OpenClaw stack maintains.

✦

Why archive every event when they're also in Redis?

Redis Streams have a maxlen cap of 10,000 entries per session. That's enough for live consumption but it'll wrap during a long-running gateway session. The Postgres archive lets the OpenClaw dashboard at /tools/openclaw replay history beyond the Redis window, supports the "resume from event id" semantics when a subscriber attaches, and gives the bridge a place to assemble full_text from delta events without needing to keep them in memory.

07 The prompt registry.

System prompts and reusable prompt templates are versioned. prompt_templates holds the named template (e.g. "memory_consolidation", "fact_extraction"); prompt_template_versions stores every edit with the body and a timestamp. The /tools/prompts page in BawtHub is the editing UI.

This isn't speculative — most of the memory pipeline (consolidation, fact extraction, profile resolution) is driven by templated prompts the operator can iterate on without redeploying code. The active version is the latest version unless one is explicitly pinned.

08 BawtHub: presentation tables (Prisma).

The BawtHub frontend owns its own schema at frontend/prisma/schema.prisma. These tables back the UI's non-chat surfaces — dashboards, avatar config, the agent task system.

Model	What it stores
`UnraidContainer` / `UnraidContainerGroup`	Mirrored snapshot of Docker containers on the Unraid host. `ports JSONB`, `lanIpPorts JSONB`, `customUrl`, `iconUrl`, `autoStart`, `visible`, `sortOrder`. Synced periodically from the Unraid API.
`AgentProject`	A bucket of related tasks. `contextPrompt`, `color`, `icon`, `agentBotId` (which bot handles its tasks).
`AgentTask`	One unit of work. Statuses: `QUEUED` → `PLANNING` → `REFINED` → `IN_PROGRESS` → `REVIEW` → `COMPLETED`/`FAILED`/`CANCELLED`. Self-references via `TaskDependencies` for DAGs.
`AgentStep`	Sub-step of a task. Types: `PLAN`, `READ_FILE`, `EDIT_FILE`, `RUN_COMMAND`, `SEARCH`, `ASK_USER`, `REVIEW`, etc. Ordered by `orderIndex`.
`AgentActivity`	Activity feed across tasks/projects. Polymorphic (`actorType`: user/agent/system) with a `meta JSONB` bag. Indexed on every dimension.
`Upload`	SHA-256-addressed uploaded files. Bound to a task or a project (or both).
`AvatarSettings` / `AvatarSettingsPreset` / `BoneMapping`	Per-VRM-model rig config — mesh visibility toggles, morph overrides, material overrides, bone-name mappings between mixamo and the target rig. Editable via `/tools/avatar`.
`AvatarAnimation`	The catalog of animations the LLM can pick from for a TTS response. The FBX clips themselves live on disk; this is the semantic catalog with descriptions and enabled flags.
`BotReplyNotification`	One row per agent reply. Drives the bell-icon notification dropdown. Indexed on `(entityId, createdAt desc)` and `(entityId, readAt)`.
`BotColorMapping`	Maps a bot's display name to its accent color — used everywhere a bot bubble or pill renders in the UI.
`UserPreference`	Per-entity opaque key/value store for user-set frontend preferences (theme, layout, hidden bots, etc.).

09 Two ORMs, no foreign keys across the boundary.

The boundary is intentional. Prisma manages its own migrations via prisma migrate; llm-bawt manages its own DDL via SQLAlchemy create_all plus targeted ALTER TABLE IF NOT EXISTS statements for schema additions. Neither side knows about the other's tables — when the frontend wants chat history it goes through llm-bawt's REST surface at /v1/history, not a direct join.

This buys two things: clean ownership of migrations (no need to coordinate Prisma migrations with Python ones), and the freedom to host the two services against different databases later if it ever makes sense. Today they share one Postgres instance because that's the simplest thing that works.

⚠

The one shared concept: bot_id.

Both schemas use bot_id as a soft join key. BawtHub's BotReplyNotification.botId, AgentProject.agentBotId, and BotColorMapping.richName all reference bots that llm-bawt owns. There's no FK constraint — if a bot is renamed or deleted in llm-bawt's bot_profiles, BawtHub rows can dangle. In practice bots are stable enough that this hasn't been an operational issue.

10 Redis is the other data store.

Not everything is in Postgres. The Redis container (llm-bawt-redis, redis:7-alpine) is configured with --maxmemory 128mb --maxmemory-policy allkeys-lru — a transient bus, not a database. It holds:

Streams for events, commands, and run responses — capped at maxlen ~10,000 per session (per-session events), 5,000 per request (run streams), or 5,000 for unified bot+user event streams.
Short-lived RPC reply streams with 60-second TTLs for cross-process function calls.
History-persist commands (agent:history, maxlen 1,000) that the main app consumes to keep bot chat histories in sync with the gateway.

Nothing in Redis is durable past restart. Postgres is the source of truth; Redis is just the fast path between processes.

11 Backups.

The Postgres cluster is dumped nightly to NAS storage as part of the broader Unraid backup routine. Memory tables are large (snark_messages alone is ~2,500 rows of conversation history at the time of writing); the dump runs against the live database with pg_dump --clean --if-exists. There's no streaming replication today — single-node Postgres on Unraid, point-in-time recovery via dpkg-backed restore, daily granularity.

The infra-backups agent skill covers the monitoring side; verification of dump freshness is a separate routine on the host.

12 Key files.

src/llm_bawt/memory/postgresql.py

The memory backend. ~1,900 lines. Defines get_memory_table_pg(bot_id), get_message_table_pg(bot_id), the shared sessions_table, and the PostgreSQLMemoryBackend class that orchestrates them.

src/llm_bawt/profiles.py

Profile attributes. ProfileAttribute, EntityProfile, the EntityType enum. entity_profile_attributes is the shared K/V store.

src/llm_bawt/service/turn_logs.py

Turn logs + tool call records. The TurnLog SQLModel, ToolCallRecord, TurnLogStore with TTL-based cleanup.

src/llm_bawt/service/scheduler.py

Scheduled jobs. Cron-style automation tables — scheduled_jobs and job_runs.

src/llm_bawt/runtime_settings.py

Runtime settings + bot profiles. runtime_settings for hot-reload values; bot_profiles for the bot catalog with agent_backend_config.

src/llm_bawt/prompt_registry.py

Versioned prompt templates. Backs the /tools/prompts editor; every save is a new version row.

src/agent_bridge/store.py

OpenClaw event archive. Creates agent_events, agent_session_state, agent_runs; assembles run text from delta events.

bawthub/frontend/prisma/schema.prisma

BawtHub schema. ~270 lines of Prisma DSL covering avatars, the agent task system, container catalog, notifications, and user preferences.

PreviousStreaming NextDeployment topology

Validated against main on 2026-05-13 Source: llm-bawt + bawthub