Why I built it
The real bottleneck in agentic development isn't the agents. It's the context they share, the verification that bounds them, and the orchestration that holds them accountable. Run thirty agents against a codebase without those three pieces and you get confident slop. Get those three pieces right and a small operator starts to ship like a team an order of magnitude larger.
Optimus is the platform I needed and couldn't buy. It treats every artifact a team produces — plans, stories, docs, git commits, goals, daily notes, calendar events, agent memory — as a row in one composable, embedding-aware table, and gives every agent a single interface for asking the world what it knows.
The thesis
A team's "world" isn't a kanban board. It's the union of every artifact the team has ever produced, plus every signal from the systems they operate. Most tools store one slice of that world really well — Notion stores docs, Linear stores stories, GitHub stores commits — and ignore the rest. Agents that have to crawl five sources and reconcile them by hand are slow and brittle.
Optimus puts the union itself first. The data layer (Nexus) is a single table whose rows are typed and cross-linked, not a federation. The verification layer (Test-driver) writes its evidence back into that same table. Agents read and write through one interface — nothing leaks past the boundary as a side-channel.
What's inside
Nexus — the data layer
A universal world-context primitive. 16 ContextTypes (docs, stories, plans, git history, goals, daily notes, calendar, memory, …) live in one table, cross-linked, embedding-aware, queried through one API.
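To make the "one table, cross-linked" idea concrete, here is a minimal sketch using SQLite. Only the table name `nexus_items` comes from this page; the column names, the JSON-encoded `links` column, and the helpers `add_item` and `linked_items` are assumptions made for illustration, not the actual Nexus schema or API.

```python
import json
import sqlite3

# Hypothetical schema: only the table name `nexus_items` appears in the text.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE nexus_items (
        id           INTEGER PRIMARY KEY,
        context_type TEXT NOT NULL,              -- e.g. 'plan', 'story', 'git_commit'
        title        TEXT NOT NULL,
        body         TEXT NOT NULL DEFAULT '',
        links        TEXT NOT NULL DEFAULT '[]', -- JSON list of linked row ids
        embedding    TEXT                        -- JSON list of floats, optional
    )
    """
)


def add_item(context_type, title, body="", links=(), embedding=None):
    """Insert one typed row and return its id."""
    cur = conn.execute(
        "INSERT INTO nexus_items (context_type, title, body, links, embedding) "
        "VALUES (?, ?, ?, ?, ?)",
        (
            context_type,
            title,
            body,
            json.dumps(list(links)),
            json.dumps(embedding) if embedding is not None else None,
        ),
    )
    return cur.lastrowid


def linked_items(item_id):
    """Return (id, context_type, title) for every row the given row links to."""
    row = conn.execute(
        "SELECT links FROM nexus_items WHERE id = ?", (item_id,)
    ).fetchone()
    ids = json.loads(row[0]) if row else []
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    return conn.execute(
        "SELECT id, context_type, title FROM nexus_items "
        f"WHERE id IN ({placeholders}) ORDER BY id",
        ids,
    ).fetchall()


# A story that links to the plan it belongs to and the commit that shipped it.
plan_id = add_item("plan", "Q3 search revamp")
commit_id = add_item("git_commit", "Add vector index")
story_id = add_item("story", "Ship semantic search", links=[plan_id, commit_id])
```

Because every artifact type shares one table, following a link from a story to a plan or a commit is a lookup in the same place rather than a call into a second system.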
Test-driver — the verification layer
An agent that walks features end-to-end, captures evidence, scores severity, and writes the run manifest back into nexus_items. Runs on a weekly LaunchAgent or on demand.
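A rough sketch of what a run manifest could look like before it is written back. The `Finding` fields, the four-level severity scale, the `test_run` context type, and the functions `build_manifest` and `to_nexus_row` are all hypothetical; the text only says that evidence is captured, severity is scored, and the manifest lands in `nexus_items`.

```python
import json
from dataclasses import asdict, dataclass

# Hypothetical severity scale; the real scoring rules are not described here.
SEVERITY_RANK = {"info": 0, "minor": 1, "major": 2, "critical": 3}


@dataclass
class Finding:
    feature: str   # which feature was walked
    passed: bool
    severity: str  # one of SEVERITY_RANK's keys
    evidence: str  # e.g. a screenshot path or a log excerpt


def build_manifest(run_id, findings):
    """Summarize one run: counts, worst failing severity, and raw findings."""
    failures = [f for f in findings if not f.passed]
    worst = max(
        (f.severity for f in failures),
        key=SEVERITY_RANK.__getitem__,
        default=None,  # no failures means no worst severity
    )
    return {
        "run_id": run_id,
        "total": len(findings),
        "failed": len(failures),
        "worst_severity": worst,
        "findings": [asdict(f) for f in findings],
    }


def to_nexus_row(manifest):
    """Shape the manifest as a row destined for nexus_items (assumed columns)."""
    return {
        "context_type": "test_run",
        "title": f"Test run {manifest['run_id']}",
        "body": json.dumps(manifest),
    }


manifest = build_manifest(
    "r1",
    [
        Finding("login", True, "info", "login-ok.png"),
        Finding("checkout", False, "major", "checkout-error.log"),
        Finding("search", False, "minor", "search-empty.png"),
    ],
)
row = to_nexus_row(manifest)
```

Storing the manifest as an ordinary row means an agent can find the latest verification evidence the same way it finds a plan or a story.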
Harness — the inference layer
Provider-abstracted LLM routing. Every AI feature speaks one interface; the harness picks the model. Swap Claude → GPT → Gemini without touching feature code. Insurance against tokenizer drift and pricing whiplash.
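One way to picture the "one interface, harness picks the model" split is the sketch below. The `Provider` protocol, the `Harness` class, the task-to-provider routing table, and the `EchoProvider` stand-in are assumptions for illustration; no vendor SDK is called and this is not the actual harness code.

```python
from typing import Dict, Iterable, Protocol


class Provider(Protocol):
    """The single interface every feature talks to (hypothetical shape)."""

    name: str

    def complete(self, prompt: str) -> str: ...


class EchoProvider:
    """Stand-in for a real vendor client; it makes no network calls."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


class Harness:
    """Routes each task to a provider so feature code never names a model."""

    def __init__(
        self, providers: Iterable[Provider], routes: Dict[str, str], default: str
    ) -> None:
        self._providers = {p.name: p for p in providers}
        self._routes = dict(routes)
        self._default = default

    def complete(self, task: str, prompt: str) -> str:
        provider_name = self._routes.get(task, self._default)
        return self._providers[provider_name].complete(prompt)


harness = Harness(
    providers=[EchoProvider("claude"), EchoProvider("gpt"), EchoProvider("gemini")],
    routes={"summarize": "gpt"},  # swapping a model is a one-line routing change
    default="claude",
)
```

Feature code calls `harness.complete("summarize", text)` and never imports a provider, so changing vendors touches only the routing table.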
Goals + Daily Flow — the orchestration layer
Goals automatically roll up their status from the stories that contribute to them. Daily Flow is the rhythm: what got shipped, what's stuck, and what's contributing where, surfaced as a generated daily note that the operator can read in 30 seconds.
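As an illustration of status rollup, the sketch below derives a goal's status from its stories. The status names and the precedence order (all done, then any blocked, then any started) are assumptions; the text does not specify the actual rollup rules.

```python
def rollup_status(story_statuses):
    """Derive a goal's status from its contributing stories (assumed rules)."""
    statuses = list(story_statuses)
    if not statuses:
        return "not_started"  # a goal with no stories has nothing to roll up
    if all(s == "done" for s in statuses):
        return "done"
    if any(s == "blocked" for s in statuses):
        return "blocked"      # one stuck story surfaces on the goal
    if any(s in ("in_progress", "done") for s in statuses):
        return "in_progress"
    return "not_started"


goal_status = rollup_status(["done", "in_progress", "not_started"])
```

Letting "blocked" outrank "in_progress" is a deliberate choice in this sketch: it keeps stuck work visible in a note meant to be read in 30 seconds.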
Operating principles
- One operator, agentic fleet. Build for the case where one careful human is the only human in the loop and twenty agents are the workforce.
- Composable context, not federation. Storing data once in a typed primitive beats stitching several sources together at query time.
- Verification is first-class. Test-driver isn't a plugin — it's a peer of the data and inference layers.
- Provider-abstract everything LLM. Lock-in to a single model is the only failure mode that's strictly self-inflicted.