Plan, dispatch, review. Repeat.
Agents in BawtHub don't live in a separate task tracker. The same Next.js app that renders chat also owns a full project / task / step model — and every operation on it is exposed twice: once as a REST endpoint for the UI, once as an MCP tool for the agents themselves. A bot can plan its own follow-up work, queue dependent tasks, mark its own steps complete, and hand off to another bot — all without a human round-trip.
01 The data model.
Three nouns: projects, tasks, steps. One verb that matters: dispatch. Everything else — activity entries, dependencies, attachments, cron schedules — hangs off those.
AgentProjectname + color + iconcontextPromptdefault agentBotIdAgentTaskshortId (TASK-216)status · prioritymodelIdresponsedependsOn[]AgentSteporderIndextypestatusoutputAgentActivitytypeactorType (user/bot)meta JSONappend-onlyTasks carry a human-friendly shortId (e.g. TASK-216) for use in chat, plus a UUID for foreign-key joins. Steps are ordered by orderIndex and typed: PLAN, READ_FILE, EDIT_FILE, CREATE_FILE, DELETE_FILE, RUN_COMMAND, SEARCH, ASK_USER, REVIEW. The type is a hint — the agent picks the actual tool to use at execution time.
02 The status machine.
A task moves through a small set of statuses. Two transitions are gated — IN_PROGRESS → REVIEW is set by the executing agent when it claims work is done; REVIEW → COMPLETED is set by a human, never by a bot. That single rule is what keeps the system honest.
| Status | Set by | Meaning |
|---|---|---|
QUEUED | creator | Created, not yet planned or started. |
PLANNING | plan-dispatcher | An agent is writing or refining the spec + steps. |
REFINED | planner | Plan written; awaiting execute dispatch. |
IN_PROGRESS | dispatcher (auto) | The dispatch route flipped this before sending the prompt. |
REVIEW | executing agent | "I'm done. Human, look at this." |
COMPLETED | human only | Signed off. Counts toward project progress. |
FAILED | agent or dispatcher | Hard error; response holds the explanation. |
CANCELLED | human | Won't do; kept for audit. |
The tasks_update MCP tool's docstring spells this out: IMPORTANT: Set status to REVIEW when done - only humans mark COMPLETED.
Nothing physically prevents an agent from sending status="COMPLETED" — it's a norm, not a permission check — but it's the norm that keeps the review queue meaningful.
03 Dispatch: clean-context handoff.
When a human (or another bot) clicks Dispatch, the frontend hits POST /api/agents/tasks/[id]/dispatch. That route does four things, in this order:
- Mark the task
IN_PROGRESSin Postgres and pre-fillmodelIdfrom the assigned bot's default model — the human doesn't have to specify which model is doing the work. - Touch the parent project's
updatedAtso it bubbles to the top of the sidebar. - Record an
AgentActivityrow of typetask.dispatchedwithactorType= user or bot. - Return
{ok: true, status: "dispatched"}immediately. The actual LLM call is deferred to Next'safter()background hook so the HTTP response is never blocked on a 30-second agent turn.
Inside the after() callback, the route fetches the full task with its project and ordered steps, renders the agents.task_execution prompt template (pulled live from the llm-bawt prompt store, cached 5 minutes), and POSTs to /v1/chat/completions with extract_memory: false and augment_memory: false. Task work doesn't pollute the bot's conversational memory.
The execution prompt is the interesting part. It includes the task's markdown context (title, description, project context, ordered checklist of steps with their UUIDs), then injects:
1. Start: Set task status to IN_PROGRESS immediately. modelId is auto-filled. 2. Per step: Before starting each step, set it to RUNNING. When done, set it to COMPLETED with a brief output. If it fails, FAILED with the error. 3. Finish: When all work is done, write a summary into the task response field and set status to REVIEW. If the task cannot be completed, set status to FAILED.
The agent is told to update its own task via the same REST endpoints the UI uses — PATCH /api/agents/tasks/[id] and PATCH /api/agents/tasks/[id]/steps/[stepId]. From its perspective there's nothing magic about the dashboard; it just makes the same HTTP calls a human would.
04 What the human sees.
BotDispatchPanel — pick a bot, hit Plan or Execute, the task flips to PLANNING or IN_PROGRESS in real time as the dispatched bot updates its own row.The frontend pieces — SortableTaskList, TaskRow, BotDispatchPanel, PlanDispatchButton, DispatchNoteDialog — all live under src/app/(app)/(dashboard)/agents/_components/. The drag-to-reorder uses POST /api/agents/tasks/reorder. Promote-to-project is a one-click action that wraps a task as its own project when scope creeps.
05 The MCP surface.
Every operation the UI can perform on tasks is also exposed as an MCP tool, registered in mcp_server/task_tools.py. Agents call these directly. The tools are thin httpx wrappers over the same /api/agents/* REST endpoints — X-Agent-Bot-Id is passed for activity attribution, and errors are wrapped as {error: ..., status: ...} dicts instead of raised exceptions, because LLMs handle failure better as data.
| Tool | Verb | Use |
|---|---|---|
tasks_list | GET | Filter by status, project, search query. Sorted by recency. |
tasks_get | GET | Full task by UUID or shortId. |
tasks_get_context | GET (derived) | Markdown briefing — title, description, dep list, step checklist, project context. Drop into prompt. |
tasks_create | POST | Queue a new task; optionally seed steps. |
tasks_update | PATCH | Status, response, modelId, title, description, priority, planned, projectId, agentBotId. |
tasks_delete | DELETE | Hard delete. Docstring nudges agents toward CANCELLED instead. |
tasks_add_dependency | POST | Cycles rejected server-side. |
tasks_remove_dependency | DELETE | Unblock. |
tasks_promote | POST | Promote task to its own project (title becomes project name, description becomes context). |
tasks_regenerate | POST | Server-side LLM rewrite of title + steps. Docstring explicitly warns agents off: RARELY USEFUL FOR AGENTS — you are already an LLM and can write better steps yourself. |
steps_add / steps_update / steps_delete | POST/PATCH/DELETE | Per-step lifecycle. Agents call steps_update as they work — RUNNING on entry, COMPLETED/FAILED on exit, with an output summary. |
projects_* family | CRUD | Same shape: list, get, create, update, delete, plus projects_get_context for a markdown-only briefing. |
activity_get | GET | Recent activity entries. Filterable by task or project. The audit trail. |
06 Task context as a markdown briefing.
The tasks_get_context tool builds a single markdown document agents can paste into their reasoning at the top of a turn. It's deliberately not a JSON blob — the agent doesn't need to parse it. It looks like this:
# TASK-216 — Wire up Codex tool-result diffs **Status:** IN_PROGRESS **Priority:** HIGH **Assigned to:** caid **Model:** claude-sonnet-4-6 ## Description The Codex bridge currently sends fileChange items without diff content... ## Dependencies - ✅ TASK-212 — Provider-aware tool rendering (COMPLETED) ## Steps - [x] Inspect codex item shape (READ_FILE) - [~] Map file_change → ClaudeToolCallCard (EDIT_FILE) - [ ] Add provider gate to FileChangeBody (EDIT_FILE) ## Project: llm-bawt ### Project Context This is the llm-bawt repo. Run `make restart` after Python changes...
The checkbox states [x], [~], [ ], [!], [-] map to COMPLETED, RUNNING, PENDING, FAILED, SKIPPED. The agent updates these via steps_update as it works; a parallel viewer in the human dashboard re-fetches the task and shows the same row state. Both sides are watching the same Postgres rows.
07 Two-phase dispatch: plan, then execute.
For larger tasks the UI offers two dispatch buttons. Plan sends the task to a bot with a different prompt template — agents.task_planning — that asks the bot to write the spec and seed an ordered step list, but not to do the work. Status moves QUEUED → PLANNING → REFINED. Execute dispatches the same or a different bot against the now-refined plan.
This split lets a fast-thinking bot (e.g. Snark or Loopy) plan, and a code-capable agent (Caid for code; Vex for security audits) execute. The planner doesn't have to be a coding agent, and the executor doesn't have to write the spec from a one-liner.
08 Cron and scheduled work.
The /api/agents/cron family — CronCreate, CronList, CronDelete — lets agents schedule recurring task creation. A common pattern is a daily 6am job: create a task titled "morning brief" assigned to
. The cron row holds a crontab expression plus a task template; on each fire it materializes a fresh task in snark with the description "summarize overnight Postgres logs"QUEUED and either auto-dispatches or waits for a human nudge depending on the row's autoDispatch flag.
09 Activity as the audit trail.
Every mutating operation — task.created, task.dispatched, task.status_changed, step.completed, project.updated — appends a row to AgentActivity. The row carries actorType (user or bot), actorId (user email or bot slug), and a free-form meta JSON for type-specific payload. The frontend's chat-side AgentActivityRow component renders these inline in the conversation when an agent touches a task while talking to you.
The activity_get MCP tool lets agents look up what they (or other agents) did recently — useful for follow-up tasks: List the last 10 things
returns the actual mutation history, not a summary.caid did on TASK-216
10 Key files.
llm-bawt/src/llm_bawt/mcp_server/task_tools.pytasks_*, steps_*, projects_*, activity_get. Thin httpx wrappers over the BawtHub REST API; X-Agent-Bot-Id for attribution.bawthub/frontend/src/app/api/agents/tasks/[id]/dispatch/route.tsafter()-defers the actual /v1/chat/completions call. Uses cached agents.task_execution prompt template.bawthub/frontend/src/app/(app)/(dashboard)/agents/_components/BotDispatchPanel, SortableTaskList, TaskRow, TaskCreateDialog, PlanDispatchButton, DispatchNoteDialog, TaskStatusIcon.bawthub/frontend/src/app/agents/taskDispatchPrompt.tsbawthub/prisma/schema.prismaAgentProject, AgentTask, AgentStep, AgentActivity, AgentCron. Foreign keys with cascade deletes only on steps; tasks survive project deletion as unassigned.bawthub/frontend/src/lib/agentActivity.tsrecordActivity, resolveActor — used by every mutating route to stamp who did what.main on 2026-05-13
Source: llm-bawt agent backends