Write Tool Contracts Before Giving Agents Autonomy
On this page
The autonomy you grant is the autonomy you debug
An agent that can read a calendar is useful and boring. Give that same agent a send_email tool with no confirmation step and no idempotency key, and on the run where it misreads a retry as a fresh request, it sends the same wrong message to forty people before anyone refreshes their inbox. The model did not get dumber between those two versions. The only thing that changed was the access you granted it, and the contract you forgot to write around that access.
The instinct, when an agent demo works, is to widen its reach: more tools, more write access, fewer prompts asking permission. That is the wrong axis. Autonomy should be granted tool by tool, not vibe by vibe. Each tool an agent can call is a small contract between your system and a model that will, on some percentage of runs, do something you did not expect.
This guide is about writing that contract before you grant the access. It is the same discipline you would apply to any internal API exposed to an unreliable caller, except the caller here reasons in natural language and improvises.
What a tool actually is to an agent
In modern agent frameworks, a tool is a typed function the model can choose to call. The OpenAI Agents SDK describes agents as components that plan, call tools, and carry state across steps. The model reads a tool's name and description, decides whether the current step needs it, fills in the inputs, and reads the result back into its context.
That means the model is selecting tools from text. The name and description are not documentation for humans. They are the interface the model reasons over. A tool called update with the description "updates things" will be called in situations you never intended. A tool called update_ticket_status with a description that names its preconditions will be called far more predictably.
So the first contract is linguistic. Before you think about permissions, write a name and description precise enough that a competent but literal reader would never misuse it. If you cannot describe when the tool should and should not be called, the model cannot either.
The flow every tool call should pass through
A tool call is not one event. It is a short pipeline, and most of the safety lives between the steps. The model forms an intent, a planner selects a tool, the system checks permissions, the call runs, the output is validated, and only then does the result hand off to the next step.
Diagram
Tool-call pipeline with permission and validation gates
The two boxes teams skip are Permission check and Output validation. They feel like overhead during a demo. They are the entire difference between a prototype and something you can leave running. The permission check decides whether this call is allowed at all, given who the agent is acting for and what it has already done this run. The output validation decides whether the result is shaped correctly before it pollutes the model's context with garbage it will then reason over.
Anthropic's Building Effective Agents makes a related point at the workflow level: most reliable systems are not one freewheeling agent but composable patterns. Prompt chaining splits a task into checkable steps. Routing sends each input to the handler built for it. Orchestrator-workers has a lead agent break work into subtasks for narrower workers. In every one of these patterns, the tool boundary is where a step becomes auditable. The patterns give you the seams; the tool contract is what you write on each seam.
The tool contract card
Before a tool is exposed to an agent, fill in this card. It is short on purpose. If a field is blank, the tool is not ready for autonomy, only for a supervised trial where a human approves every call.
| Field | What it specifies | Example |
|---|---|---|
| Name | The exact identifier the model selects on | create_refund |
| Purpose | When to call it and, explicitly, when not to | Issue a refund for a paid order; never for trials or disputes |
| Input schema | Typed, validated, required vs optional | order_id: str, amount_cents: int, reason: enum |
| Output schema | The shape the agent reads back | {refund_id, status, amount_cents} |
| Read / write | Does it observe or change state | Write |
| Approval needed | Human confirmation before execution | Yes, above a value threshold |
| Idempotency | Key that makes a repeat call safe | idempotency_key per order+amount |
| Failure behavior | What happens on error or timeout | Return typed error, no partial write, log and stop |
The card forces decisions that are easy to defer. The most important pair is Read / write and Idempotency. A read tool can be retried freely; the worst case is wasted tokens. A write tool that lacks an idempotency key is a landmine, because agents retry. When a call times out or the model is unsure, it will often call again. Without a key, the second create_refund is a second refund.
A worked example
Take a support agent that can change subscription tiers. The naive version exposes change_tier(account_id, tier). The contract version specifies that it is a write tool, requires an idempotency_key, needs human approval for downgrades that cancel paid features, returns a typed error rather than throwing, and logs the before and after state. None of that changes the happy path. All of it changes what happens on the 1-in-50 run where the model misreads the request.
Make it concrete. Picture Halcyon Support, a four-person team at a fictional billing SaaS, rolling out an agent that resolves tier-change tickets. They started with the naive change_tier and watched it, during a noisy retry, downgrade a paying customer from Pro to Free and silently revoke their saved reports. After that, they rewrote the tool as a contract card: downgrades that touch paid features now route to a human approval queue, every call carries an idempotency_key derived from the ticket ID, and a denied call returns a typed error the agent reports back instead of looping. The agent kept handling the routine upgrades on its own; the one class of action that could destroy customer data is the only thing that now waits for a person.
Read-only first, then earn the writes
A reliable way to introduce a new agent is to start it with read tools only and watch what it does. Read tools let you observe the model's judgment, its tool selection, and its tendency to over-call, all without any state changing underneath you. You learn where its descriptions are ambiguous before a bad call costs anything.
Write access is granted on a ladder, not as a switch. The rungs below are a useful default. Each rung is a deliberate decision with an owner, not a config flag someone flips to unblock a demo.
| Rung | Capability | What must be true to grant it |
|---|---|---|
| 1 | Read internal, non-sensitive data | Tool descriptions reviewed; scope minimized |
| 2 | Read sensitive or user-scoped data | Permission check ties access to the acting user |
| 3 | Write to reversible internal state | Idempotency and rollback path proven |
| 4 | Write to external systems or users | Approval gate plus full audit log on every call |
| 5 | Unattended write at volume | Trace-based evals pass; rate limits and kill switch wired |
Most teams want to start at rung 4 because that is where the impressive demo lives. The cost of skipping rungs is paid later, usually in an incident review, where the question is always the same: who approved this tool, and where is the log.
Permissions belong to the system, not the prompt
A recurring mistake is to enforce permissions in the system prompt. "Do not refund more than the order total" is a suggestion, not a control. The model will follow it most of the time, which is exactly the property that makes it dangerous, because the failures are rare enough to be invisible until they are expensive.
Real permission checks live in code, between tool selection and execution. They are deterministic. The agent proposes a call; your code decides whether the acting user is allowed that action, on that resource, with that argument range, at this point in the run. The model never gets to overrule it, because the model never sees the check; it only sees the typed error when a call is denied.
The same applies to confirmation. For any high-consequence write, the contract should require a human approval step that the agent cannot satisfy on its own behalf. Anthropic's documentation on dynamic workflows in Claude Code treats this kind of structured, permissioned tool use as a first-class part of the design rather than an afterthought, and that framing is worth borrowing: the approval gate is part of the tool, not a wrapper around it.
Validate the output before it re-enters context
The step after the tool call matters as much as the call. Whatever a tool returns goes back into the model's context and becomes the basis for its next decision. If a retrieval tool can return untrusted text, that text is now instructions the model may act on. If a tool returns a malformed object, the model will improvise around it.
Output validation closes that gap. Check that the result matches the declared output schema. Treat any free-text field returned from an external or user-controlled source as data, not as commands. Strip or clearly mark content that should never be interpreted as a new instruction. This is plain defensive engineering: you are validating an untrusted boundary, the same way you would validate a request body from the public internet.
A short pre-autonomy checklist captures the whole discipline:
- Every tool has a filled-in contract card; no blank fields.
- Names and descriptions are precise enough to constrain selection.
- Write tools have idempotency keys and a proven rollback or no-op-on-repeat path.
- Permission checks run in code, not in the prompt.
- High-consequence writes require a human approval the agent cannot self-grant.
- Tool outputs are schema-validated; external text is treated as data.
- Every call is logged with inputs, decision, and outcome.
- The agent starts read-only and climbs the rungs deliberately.
Where this fits in your delivery system
Tool contracts are not a one-time setup task. They are part of how you ship and operate agents over time. A new tool is a change to the agent's blast radius, and it deserves the same review you would give a new API endpoint. The contract card is the artifact that makes that review fast: a reviewer can see in one table whether the tool is read or write, whether it can be retried safely, and who approves it.
If you are building these patterns hands-on, AI Agentic Patterns walks through prompt chaining, routing, and orchestrator-workers as composable building blocks, with the tool boundary as the seam where each step becomes reviewable. For the engineering side of safe tool calls, schema validation, and rollback paths, Full-Stack Developer with AI covers the implementation discipline that turns a demo into something operable. And to place tool contracts inside a repeatable operating loop with owners, controls, and review gates, AI Delivery Systems connects the per-tool work to how your team actually ships.
If you take one tool from read-only to unattended this quarter, do it on the ladder above rather than the demo's timeline — and AI Agentic Patterns is where you practice drawing those seams before a real create_refund is on the line.
The shift worth internalizing is small but durable. You are not deciding whether your agent should be autonomous. You are deciding, one tool at a time, which specific actions it has earned the right to take alone, and what evidence justified each grant. Write the contract first, and the autonomy takes care of itself.
Originally published April 9, 2026. Updated and re-verified June 14, 2026.
Continue learning with these courses
Sources and Further Reading
- OpenAI Agents SDK documentationdevelopers.openai.com
- Anthropic: Building Effective Agentsanthropic.com
- Anthropic: dynamic workflows in Claude Codecode.claude.com
Stay ahead with AI insights
Get practical AI tips, new course announcements, and career strategies delivered weekly.