Plan mode vs per-action approval

AIAcademy · 2026-05-16

Anthropic — Trustworthy Agents

The most useful number in Anthropic's Trustworthy Agents writeup is one most coverage glossed past: 93% of users in their internal study preferred plan-then-execute over per-action approval, and the plan-mode group also scored higher on task completion. Getting higher satisfaction and better outcomes together is unusual in safety-vs-usability tradeoffs; usually it is one or the other.

The mechanism is not subtle. Per-action approval asks a human to evaluate every tool call in isolation, without the surrounding context of what comes next. By step forty of a forty-step coding-agent run, the operator is rubber-stamping read_file calls whose purpose they no longer understand. Approval fatigue arrives fast and degrades into either reflex-yes (a security failure) or reflex-no (a capability failure). The 2026 Agentic Coding Trends Report flags the same pattern: long-horizon agent work breaks the assumptions that interactive-chat safety UX was built around.

Plan mode inverts the question. The agent produces a plan, the human reviews it once with full context, then execution runs without interruption. One decision instead of forty. The cost is a slightly more expensive up-front review; the gain is that the review is actually meaningful. The operator sees the shape of the work before any of it happens, and the agent's plan becomes a contract the run can be audited against.
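To make the "plan as contract" idea concrete, here is a minimal Python sketch. It is an illustration only, not how any particular agent implements plan mode: `Step`, `PlanRun`, and the tool names are all hypothetical, and the tool call itself is stubbed out.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Step:
    tool: str    # hypothetical tool name, e.g. "read_file"
    target: str  # what the tool acts on


@dataclass
class PlanRun:
    plan: list[Step]
    approved: bool = False
    log: list[Step] = field(default_factory=list)

    def approve(self) -> None:
        # The single human decision, made with the whole plan in view.
        self.approved = True

    def execute(self, step: Step) -> None:
        # Refuse anything that was not part of the reviewed plan.
        if not self.approved:
            raise PermissionError("plan has not been approved")
        if step not in self.plan:
            raise PermissionError(f"step not in approved plan: {step}")
        self.log.append(step)  # a real agent would call the tool here

    def unexecuted(self) -> list[Step]:
        # Audit helper: planned steps that never ran.
        return [s for s in self.plan if s not in self.log]
```

The human calls `approve()` once; after that, `execute()` only accepts steps that appear in the reviewed plan, and `unexecuted()` lets a reviewer compare what was promised with what actually ran.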

Per-action approval is not dead. It belongs exactly where the cost of a single wrong action is high and uncorrectable: sending money, sending email, deleting data, deploying to production. The Claude Code permission docs show the practical version of this split: plan/read-only modes for exploration, more permissive modes for routine edits, and explicit ask rules for actions that should still stop the run. That is the right design: plan-mode by default, per-action gates on the four or five operations that cannot be rolled back.
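The split described above can be sketched as a small routing function. This is a hedged illustration, not the Claude Code configuration format: the tool names in `IRREVERSIBLE` and the `gate` function are assumptions made up for the example.

```python
from enum import Enum


class Decision(Enum):
    RUN = "run"  # proceed without interrupting the operator
    ASK = "ask"  # stop the run and request explicit approval


# Hypothetical names for the few operations that cannot be rolled back.
IRREVERSIBLE = frozenset(
    {"send_payment", "send_email", "delete_data", "deploy_production"}
)


def gate(tool: str, in_approved_plan: bool) -> Decision:
    # Irreversible actions always stop for a human, even if planned.
    if tool in IRREVERSIBLE:
        return Decision.ASK
    # Everything else runs freely once it is covered by the approved plan.
    return Decision.RUN if in_approved_plan else Decision.ASK
```

Under these assumptions, a planned `read_file` call runs without interruption, while `deploy_production` still pauses for approval even though the plan was accepted.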