The paradigm

Reeve is built on assumptions that aren't universal in this moment of the AI market. They're load-bearing for the product working at all, and they're worth being explicit about — because they imply a very different shape of system than what most "AI for X" announcements describe.

Trust is a primitive, not a vibe

A typical AI product launches with the agent fully autonomous on day one: handle every customer, send every email, never check in. The marketing language is "we deploy AI agents to your inbox and they work." The reality is that the operator finds out six weeks in that the agent has been confidently telling customers things that aren't true, and then spends the next quarter unwinding it.

Reeve assumes the opposite. The agent has zero authority on day one. Every action — every reply, every booking, every invoice, every refund — is a proposal that the operator approves, edits, or rejects. The approve/edit/reject pattern is the operator's tacit knowledge being written down: "we don't quote over text", "Saturdays only by appointment", "always include the trip charge". After thirty days of clean operation, the routine actions (FAQs, confirmations, appointment reminders) graduate to ship-on-their-own. After ninety, more. The consequential actions — large quotes, refunds, pricing changes — never auto-ship; they always need a human glance.

We call this the phase model: Phase 1 (review queue), Phase 2 (shadow mode: the agent acts and the operator audits after the fact), Phase 3 (autonomous on routine, gated on consequential). Phase progression is per action class, not per tenant: an operator can be in Phase 3 for FAQs and Phase 1 for refunds indefinitely. If a Phase 3 action class starts failing its correlation regression tests more often, it falls back to Phase 2 automatically. Trust is something the system can lose, not just gain.
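As a rough illustration of per-action-class progression, here is a minimal Python sketch. The class names, the CONSEQUENTIAL set, and the max_failure_rate threshold are all hypothetical, and pinning consequential classes to Phase 1 is an assumption that follows from "never auto-ship". It is not Reeve's actual implementation.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Phase(IntEnum):
    REVIEW_QUEUE = 1  # every action is a proposal the operator reviews
    SHADOW = 2        # agent acts; operator audits after the fact
    AUTONOMOUS = 3    # routine actions ship on their own


# Hypothetical action classes that never leave the review queue.
CONSEQUENTIAL = frozenset({"refund", "large_quote", "pricing_change"})


@dataclass
class PhaseTable:
    """Tracks phase per (tenant, action class) pair, not per tenant."""
    phases: dict = field(default_factory=dict)

    def get(self, tenant_id: str, action_class: str) -> Phase:
        # Unknown pairs start with zero authority.
        return self.phases.get((tenant_id, action_class), Phase.REVIEW_QUEUE)

    def promote(self, tenant_id: str, action_class: str) -> Phase:
        current = self.get(tenant_id, action_class)
        if action_class in CONSEQUENTIAL or current == Phase.AUTONOMOUS:
            return current  # consequential classes stay gated; Phase 3 is the ceiling
        new_phase = Phase(current + 1)
        self.phases[(tenant_id, action_class)] = new_phase
        return new_phase

    def record_regression(self, tenant_id: str, action_class: str,
                          failure_rate: float,
                          max_failure_rate: float = 0.05) -> Phase:
        """Demote a Phase 3 class to Phase 2 when regressions exceed the limit."""
        current = self.get(tenant_id, action_class)
        if current == Phase.AUTONOMOUS and failure_rate > max_failure_rate:
            self.phases[(tenant_id, action_class)] = Phase.SHADOW
            return Phase.SHADOW
        return current
```

Keying the table on the (tenant, action class) pair is what lets FAQs sit in Phase 3 while refunds stay in Phase 1 for the same operator.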

Constraints in code and data, never in prompts

The fragile pattern: "you are a helpful assistant. Never quote a price below $200. Never book Sundays. Always include the trip charge." This works approximately ninety-five percent of the time, which is to say it fails several times a week, which is to say the operator can't trust it.

Reeve enforces constraints structurally:

Database constraints. The bookings table has an exclusion constraint on overlapping time slots — the database itself rejects a double-booking before it can happen, regardless of what the agent intended.
Tenant isolation via row-level security. Every business table has a Postgres RLS policy. A query without the tenant context set raises an error; one with the wrong tenant context returns zero rows. The agent never handles cross-tenant data because the database refuses to surface it.
Programmatic policy gates. "Never quote below $200" is a row in a tenant_policies table that the hypervisor consults on every send_quote action. A quote under $200 is rejected at the gate; the rule is not left to the agent's judgment or pleaded for in the prompt.
Append-only event log. The events table is immutable: triggers reject UPDATE, DELETE, and TRUNCATE. The audit chain is a database invariant, not an honor system.
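To make the append-only idea concrete, here is a small runnable sketch. It uses Python's built-in sqlite3 module as a stand-in for Postgres, so the trigger syntax differs from what a Postgres deployment would use, and SQLite has no TRUNCATE statement to block. The column names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (
    id        INTEGER PRIMARY KEY,
    tenant_id TEXT NOT NULL,
    kind      TEXT NOT NULL,
    payload   TEXT NOT NULL
);
-- Any UPDATE or DELETE aborts inside the database itself.
CREATE TRIGGER events_no_update BEFORE UPDATE ON events
BEGIN
    SELECT RAISE(ABORT, 'events is append-only');
END;
CREATE TRIGGER events_no_delete BEFORE DELETE ON events
BEGIN
    SELECT RAISE(ABORT, 'events is append-only');
END;
""")


def try_sql(sql: str) -> bool:
    """Return True if the statement succeeded, False if the database refused it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return True
    except sqlite3.DatabaseError:
        return False


inserted = try_sql(
    "INSERT INTO events (tenant_id, kind, payload) "
    "VALUES ('t1', 'quote_blocked', '{}')"
)
updated = try_sql("UPDATE events SET kind = 'edited'")
deleted = try_sql("DELETE FROM events")
row_count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

The application code never has to remember not to mutate history; the refusal comes from the storage layer regardless of which caller issues the statement.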

The agent's prompt is for thinking, not for enforcing. When the agent proposes a $50 quote and the policy sets a $200 floor, the hypervisor blocks the action and writes an event. The operator sees a clear message: "agent tried to quote $50; pricing floor said no; here's the proposal, edit if you want."
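The pricing-floor example can be sketched as a plain function that checks a proposal against policy rows. Everything here is a hypothetical shape: the Policy fields, the amount parameter name, and the event dictionary are assumptions, and failing closed on a missing amount is a design choice of this sketch, not something the system is known to do.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Policy:
    """One row of a hypothetical tenant_policies table."""
    tenant_id: str
    action: str       # e.g. "send_quote"
    param: str        # which proposal parameter the floor applies to
    min_value: float


@dataclass
class ProposedAction:
    tenant_id: str
    action: str
    params: dict


@dataclass
class GateResult:
    allowed: bool
    event: Optional[dict]  # event to append to the log when blocked


def check_policies(proposal: ProposedAction, policies: list) -> GateResult:
    for policy in policies:
        if (policy.tenant_id, policy.action) != (proposal.tenant_id, proposal.action):
            continue  # policy belongs to another tenant or action
        value = proposal.params.get(policy.param)
        # Fail closed: a missing value is treated like a violation.
        if value is None or value < policy.min_value:
            return GateResult(False, {
                "kind": "policy_blocked",
                "tenant_id": proposal.tenant_id,
                "action": proposal.action,
                "proposed": value,
                "floor": policy.min_value,
            })
    return GateResult(True, None)


policies = [Policy("t1", "send_quote", "amount", 200.0)]
blocked = check_policies(ProposedAction("t1", "send_quote", {"amount": 50.0}), policies)
allowed = check_policies(ProposedAction("t1", "send_quote", {"amount": 250.0}), policies)
```

Because the check runs in ordinary code over data, changing the floor is a row update, and the outcome does not depend on how the prompt was worded.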

The agent is a planner, not an enforcer

Most AI agent frameworks make the agent responsible for everything: deciding what to do, deciding whether it's allowed, executing it, logging it. The agent is judge, jury, and executioner.

Reeve splits these roles. The agent (Receptionist, Scheduler, Pricing, Collections — functional roles, not personas) reads state and proposes actions. A separate hypervisor evaluates every action against review-class, confidence threshold, operator phase, and active policies — and decides ship, queue, or reject. Only after the hypervisor's "ship" does an adapter (Telnyx, Gmail, Stripe) actually do anything. The agent never executes; it only proposes. The hypervisor never reads from the world; it only decides.

This division means the agent's reasoning can be wrong — confidently, persistently wrong — and the system still behaves safely. It also means the agent can be replaced (a different model, a fine-tuned successor, a per-tenant apprentice) without changing the safety story.
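The propose/decide/execute split described above might look like the following Python sketch. The rule ordering, the 0.8 confidence threshold, the "routine"/"consequential" labels, and the callable adapter are all assumptions made for illustration; the source names the inputs the hypervisor considers but not how it combines them.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Decision(Enum):
    SHIP = "ship"
    QUEUE = "queue"
    REJECT = "reject"


@dataclass
class Proposal:
    """What an agent emits: a description of an action, never the action itself."""
    action_class: str
    review_class: str      # "routine" or "consequential" (hypothetical labels)
    confidence: float      # agent's self-reported confidence, 0.0 to 1.0
    violates_policy: bool  # result of a prior policy-gate check


def decide(proposal: Proposal, operator_phase: int,
           confidence_threshold: float = 0.8) -> Decision:
    if proposal.violates_policy:
        return Decision.REJECT  # hard constraints win over everything else
    if proposal.review_class == "consequential":
        return Decision.QUEUE   # always needs a human glance
    if operator_phase < 2:
        return Decision.QUEUE   # Phase 1: review queue
    if proposal.confidence < confidence_threshold:
        return Decision.QUEUE
    return Decision.SHIP


def dispatch(proposal: Proposal, operator_phase: int,
             adapter: Callable[[Proposal], None]) -> Decision:
    """The adapter runs only after the hypervisor says ship."""
    decision = decide(proposal, operator_phase)
    if decision is Decision.SHIP:
        adapter(proposal)
    return decision
```

Note that decide takes only the proposal and stored state as inputs, and the agent has no reference to the adapter at all, which is why swapping the model behind the agent leaves these checks untouched.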

One Postgres, four logical layers

No Neo4j, no Pinecone, no MongoDB on day one. One Postgres instance with four logical roles: structural (the business tables — customers, bookings, invoices), semantic (pgvector embeddings for retrieval), graph (a projection — Apache AGE deferred until a graph query is actually hot), and event log. Tenant isolation is row-level security; per-tenant credentials live in a separate Signet vault. When the design partner wants to self-host (v2), they get one container, not a service mesh.

Vertical-agnostic by design

Reeve has no industry-specific code paths. There is no "plumber mode" and no "HVAC config." Claude's general world knowledge handles the vertical-specific reasoning (a "compressor" in HVAC is a part; in plumbing it's a different thing). The operator's particular preferences are taught to the system through their corrections, not through a config file. As a consequence, the same Reeve runs the plumber, the mobile mechanic, the appliance repair shop, the dog groomer. No new code per vertical.

The operator is the merchant of record

Reeve never holds funds. Stripe Connect routes every payment to the operator's account directly. Tax, refunds, disputes, customer contracts: all the operator's. Reeve is software, not a marketplace. This matters legally (the operator owns the customer relationship as the data controller; Reeve acts as their processor) and structurally (Reeve doesn't need to KYC every operator's customer, doesn't take a slice of the revenue, and doesn't get caught in the principal-agent ambiguity that sinks most "AI replaces the small business" startups).

The paradigm is the product as much as the code is. If a competitor copies our feature list but builds it on autonomous-from-day-one assumptions, they will fail the same way every prior agent product has — and we'll know the failure mode by name.