Reeve

Capabilities

What Reeve actually does, what's covered by an executable test, and what's deferred. The list is generated from the build plan and the simulation harness; if a claim isn't on this page with a status, it isn't a claim Reeve makes.

Architecture

Adversarial validation — the 28-scenario harness

Each scenario is a runnable integration test that drives the production code paths through synthesized adversarial input and asserts properties of the resulting state. Coverage is mapped against OWASP LLM Top 10 (2025) and MITRE ATLAS. Every scenario in this list is currently passing in the test suite.

Architecture-level (9)

Vertical-specific (19)

Each scenario maps a marketing claim on a /for/<vertical> page to an executable assertion. The architecture's response to a vertical-shaped customer message is verified to match the promise.

Few-shot in-context demonstrations (v0.5.5)

Every operator approve/edit/reject in the review queue produces an (input, draft, final) triple. On the next draft for that same (tenant, action class), Reeve inlines 1–2 prior approved/edited examples as ICL demonstrations. Per AdaptAgent (Verma et al., 2024), this boosts task success 3–7% absolute on unseen settings; gains saturate at ~5 examples.

Apprentice integration (v0.5)

Compliance components (per-tenant)

Governed under Pact

Coverage gaps — honestly enumerated

Things Reeve does NOT yet test or claim, with the trigger that would cause us to add coverage:


Last updated 2026-05-03. The full simulation harness lives at tests/integration/simulation/ in the reeve repo. Run any scenario locally with npm run simulate -- <scenario-id>.