How does this work for agent frameworks like LangChain or AutoGen?

The pattern is framework-agnostic. The agent framework holds the agent's reasoning logic. The framework produces tool-call requests as structured messages (typically JSON in modern frameworks). The gateway sits between the framework's tool-execution layer and the tool's API. The framework is responsible for attaching the delegation token to outbound calls. The gateway is responsible for verifying the token, evaluating the policy, and producing the record. The Microsoft May 2026 disclosure of prompt-to-shell escalation paths in mainstream agent frameworks reinforces the case for a gateway that does not depend on the framework's internal authorization.
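A minimal sketch of the framework-side half of that split, assuming the token travels as a bearer credential in an HTTP header. The gateway URL, the envelope fields, and the function name are hypothetical and belong to no particular framework's API.

```python
import json

# Hypothetical gateway endpoint; a real deployment would configure its own.
GATEWAY_URL = "https://gateway.example.internal/tool-calls"


def build_gateway_request(tool_name: str, arguments: dict, delegation_token: str) -> dict:
    """Wrap a framework's tool-call request so it can be sent to the gateway.

    The framework only attaches the token. It makes no authorization decision.
    """
    return {
        "url": GATEWAY_URL,
        "headers": {
            "Authorization": f"Bearer {delegation_token}",
            "Content-Type": "application/json",
        },
        # The structured tool-call message the framework already produces.
        "body": json.dumps({"tool": tool_name, "arguments": arguments}, sort_keys=True),
    }


request = build_gateway_request(
    "jira.create_issue",
    {"project": "OPS", "summary": "Rotate keys"},
    "token-abc",
)
```

The same wrapper shape applies whichever framework produced the tool call, because the only things it needs are the tool name, the arguments, and the token.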

What if the task is too dynamic to scope in advance?

Dynamic tasks are common, especially in conversational agents. The scoping pattern degrades gracefully. The gateway can accept a delegation token with a broader scope and apply per-call policy that narrows the actual decision based on the tool, the arguments, and the natural-person identity. The trade-off is operator complexity: a broader token means the policy at the gateway carries more of the decision. A narrower token means the application that mints the token carries more of the decision. Both patterns produce records. The narrower-token pattern produces a clearer audit trail because the delegation chain is explicit.
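To make the broader-token case concrete, here is a rough sketch in which the token carries a wildcard scope and a per-call rule at the gateway narrows it by the natural person's role. The scope syntax, the rule table, and the role names are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    tool: str
    action: str
    arguments: dict


# Hypothetical per-call rules: which roles may perform which action on which tool.
ROLE_RULES = {
    ("crm", "read"): {"marketing_analyst", "sales_rep"},
    ("crm", "write"): {"sales_rep"},
}


def permit(token_scopes: set, user_role: str, call: ToolCall) -> bool:
    """Apply the broad token scope first, then narrow by per-call policy."""
    broad = f"{call.tool}:*" in token_scopes
    exact = f"{call.tool}:{call.action}" in token_scopes
    if not (broad or exact):
        return False  # the token does not cover this tool and action at all
    # The policy carries the rest of the decision: unknown pairs are denied.
    return user_role in ROLE_RULES.get((call.tool, call.action), set())


scopes = {"crm:*"}
analyst_read = permit(scopes, "marketing_analyst", ToolCall("crm", "read", {}))
analyst_write = permit(scopes, "marketing_analyst", ToolCall("crm", "write", {}))
```

With the wildcard scope, the analyst's read is permitted and the write is denied by the rule table alone. That is the trade-off in miniature: the token says little, so the policy has to say more.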

How does per-call evaluation affect agent latency?

End-to-end enforcement overhead at the gateway measures under 50 ms in production tests. Agent reasoning latency is typically 500 ms to several seconds per turn, so a single gateway evaluation adds at most about a tenth of even the fastest turn. For agents that fire many tool calls in a single turn, the gateway evaluations run in parallel where the tool calls run in parallel; the turn's latency is then dominated by the slowest tool, not by the gateway.
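The parallel case can be simulated with a short sketch. The delays below are made-up stand-ins, not measurements, and the gateway check is modeled as a plain sleep in front of each tool call.

```python
import asyncio
import time

GATEWAY_EVAL_SECONDS = 0.01  # stand-in for one policy evaluation


async def gated_tool_call(name: str, tool_seconds: float) -> str:
    await asyncio.sleep(GATEWAY_EVAL_SECONDS)  # simulated gateway evaluation
    await asyncio.sleep(tool_seconds)          # simulated tool latency
    return name


async def run_turn(tools: dict) -> tuple:
    """Fire every tool call of one turn concurrently and time the turn."""
    start = time.perf_counter()
    results = await asyncio.gather(
        *(gated_tool_call(name, seconds) for name, seconds in tools.items())
    )
    return results, time.perf_counter() - start


# Hypothetical tool latencies in seconds.
tools = {"crm.read": 0.05, "sql.query": 0.20, "jira.search": 0.10}
results, elapsed = asyncio.run(run_turn(tools))
```

Run sequentially, these calls would take the sum of all delays. Run concurrently, the elapsed time tracks the slowest tool plus one gateway evaluation.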

Can tool permissions be evaluated entirely inside the model context?

Some patterns attempt this through prompt-level constraints (the system prompt tells the model "only call tool X when condition Y"). The model's adherence to these constraints is probabilistic, just as model guardrails are probabilistic. Prompt-injection attacks, adversarial framing, and chain-of-thought drift all degrade the constraint's effectiveness. The argument I made in "Model Guardrails Are Not a Security Control" applies: model-level constraints are part of defense in depth, but they are not the enforcement layer. The gateway in the tool-call path is the deterministic, recordable, externally auditable layer.

What's the relationship between tool permissions and OAuth scopes?

OAuth scopes are the existing primitive for delegated authorization to APIs. The pattern for AI agent tool permissions is to use OAuth scopes (or equivalent) as the delegation token format, with the gateway interpreting the scopes per call. The two concepts are aligned: scopes describe what authority has been delegated, and the gateway enforces against the scope. What the gateway adds is the per-call records and the per-task policy refinement that vanilla OAuth (one scope grant, used across many calls) does not by itself provide.

AI Agent Tool Permissions: The Authorization Layer Between Reasoning and Action

An AI agent that holds the union of every tool permission its operating role might ever need is over-privileged on every individual call. The agent that updates a Salesforce record, runs a SQL query, hits the Jira API, and reads a Google Drive folder probably needs all four permissions across some workflows but rarely all four on a single request. The agent's static permission profile is a service-account-grade least-privilege violation by design. Per-action attribution under NIST Pillar 3 requires per-call authorization. The pattern that satisfies it is per-task scoped delegation evaluated at the AI request boundary.

I want to walk through the four properties a tool-permission policy must have, why the typical static-role pattern fails each, and where the policy decision lands when the agent calls a tool.

The four properties

Per-task scoping

The agent's authority on a given task is the minimum subset of permissions the task needs, not the full permission set the agent's role holds. When the user asks the agent to "update the Salesforce record for Customer X," the agent's authority for that task is "write to Customer X's Salesforce record." It is not "write to Salesforce." It is not "read and write Salesforce." It is not the full Salesforce permission the role holds.

Per-task scoping is the property that the OAuth scopes pattern was designed for. Agents typically receive their permissions through delegation. The delegation token should carry the narrow scope appropriate to the task, not the maximal scope appropriate to the role.
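As an illustration, the claims for the Customer X task might be minted along these lines. The `sub`, `iat`, and `exp` names follow common JWT usage; the `agent` and `resource` fields, the scope string format, and the five-minute lifetime are assumptions of this sketch.

```python
import time


def mint_task_claims(user_id, agent_id, tool, action, resource, ttl_seconds=300, now=None):
    """Build claims that cover one task, not the agent's whole role."""
    issued_at = int(time.time()) if now is None else now
    return {
        "sub": user_id,                   # the natural person delegating authority
        "agent": agent_id,                # hypothetical field naming the acting agent
        "scope": f"{tool}:{action}",      # one tool, one action
        "resource": resource,             # hypothetical field pinning the target record
        "iat": issued_at,
        "exp": issued_at + ttl_seconds,   # short-lived: the task, not the session
    }


claims = mint_task_claims(
    "user-123", "crm-assistant", "salesforce", "write", "customer-x",
    now=1_700_000_000,
)
```

The claims name a single action on a single record, which is the difference between "write to Customer X's Salesforce record" and "write to Salesforce."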

Per-call evaluation

The agent's reasoning over tools is non-deterministic. A change in the user's prompt, the model's chain of thought, or the available context can lead the agent to invoke a tool it should not invoke. Per-call policy evaluation is the layer that catches the divergence between intended task and actual tool invocation.

The evaluation runs at the gateway between the agent and the tool. The agent makes the call; the gateway evaluates whether the call is permitted under the delegated scope; the decision is recorded.
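A minimal sketch of that check, assuming the delegated scope names one tool, one action, and one resource. The type and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskScope:
    tool: str
    action: str
    resource: str


@dataclass(frozen=True)
class ToolCall:
    tool: str
    action: str
    resource: str


def evaluate(scope: TaskScope, call: ToolCall) -> tuple:
    """Return (permitted, reason) for one call under one delegated scope."""
    if call.tool != scope.tool:
        return False, "tool outside delegated scope"
    if call.action != scope.action:
        return False, "action outside delegated scope"
    if call.resource != scope.resource:
        return False, "resource outside delegated scope"
    return True, "within delegated scope"


scope = TaskScope("salesforce", "write", "customer-x")
intended = evaluate(scope, ToolCall("salesforce", "write", "customer-x"))
diverged = evaluate(scope, ToolCall("salesforce", "write", "customer-y"))
```

The second call is the divergence case: the agent's reasoning drifted to a different record, and the check denies it with a reason that can go straight into the decision record.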

Identity context

The verified identity of the natural person on whose behalf the agent is acting must travel with each tool call. Authorization policies that allow tool invocation depending on the user's role (the marketing analyst can read CRM but not write, the customer-service rep can read and update tickets but not delete) need the user identity at the decision moment.

The identity travels as a signed delegation token. The application that hosts the agent attaches the token to the outbound tool call. The gateway verifies and uses the identity in the decision.
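The sign-and-verify step might look roughly like the following, using an HMAC over the encoded claims. This is a simplified stand-in: a production token would more likely be a JWT signed with an asymmetric key and would also check expiry, and the key and function names here are hypothetical.

```python
import base64
import hashlib
import hmac
import json


def sign_token(claims: dict, key: bytes) -> str:
    """Encode the claims and append an HMAC-SHA256 signature."""
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return f"{payload.decode()}.{signature}"


def verify_token(token: str, key: bytes):
    """Return the claims if the signature checks out, otherwise None."""
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


key = b"demo-shared-secret"  # illustration only; never hard-code real keys
token = sign_token({"sub": "user-123", "role": "marketing_analyst"}, key)
identity = verify_token(token, key)
```

The gateway trusts the identity only after verification succeeds, so a token edited in transit yields no identity and therefore no permitted call.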

Per-call records

Each tool invocation produces a record of who authorized the action, under what delegated scope, against which tool, with what arguments, and with what outcome. The records support per-action attribution for both compliance review and security incident reconstruction.
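One possible shape for such a record, sketched as a dataclass serialized to a JSON line. The field names and the timestamp format are assumptions; the fields themselves mirror the list above.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ToolCallRecord:
    user_id: str          # who authorized the action
    delegated_scope: str  # under what delegated scope
    tool: str             # against which tool
    arguments: dict       # with what arguments
    outcome: str          # with what outcome
    timestamp: str        # when the decision was made


record = ToolCallRecord(
    user_id="user-123",
    delegated_scope="salesforce:write:customer-x",
    tool="salesforce.update_record",
    arguments={"record_id": "customer-y", "field": "status"},
    outcome="denied",
    timestamp="2025-01-01T12:00:00Z",
)

# One JSON object per line keeps the log easy to append to and to query.
line = json.dumps(asdict(record), sort_keys=True)
```

A reviewer reading this line can answer the attribution question without consulting the agent's own logs.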

Why the static-role pattern fails

The static-role pattern is the default for agent deployments today. The agent is granted a role at deployment time. The role's permissions are the union of every tool the agent might use across every supported workflow. The pattern fails the four properties above in identifiable ways.

The agent over-grants on every call

A single task uses one or two tools, not the full set the role grants. The permission profile at the moment of each call is wider than the task requires. An OWASP "excessive agency" finding follows from the static role grant.

The reasoning layer is the only check

Without per-call authorization at a layer below the agent, the only thing standing between the agent's reasoning and a destructive tool invocation is the model's behavior. Model safety is probabilistic. The agent's prompt-injection vulnerabilities become tool-permission vulnerabilities.

The user identity disappears

If the agent calls the tool with a service credential, the per-call record attributes the action to the service account. The natural person on whose behalf the agent was acting disappears from the audit trail.

Revocation is service-wide

A misbehaving agent cannot be contained without affecting every other instance of the same agent in the deployment. The credential is shared; revoking it shuts down every instance.
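For contrast, a sketch of what narrower containment could look like if each delegation carried its own identifier. The per-delegation identifier and the registry class are assumptions of this example, not something the static-role pattern provides.

```python
class DelegationRegistry:
    """Track revoked delegations by a hypothetical per-delegation identifier."""

    def __init__(self):
        self._revoked = set()

    def revoke(self, delegation_id: str) -> None:
        self._revoked.add(delegation_id)

    def is_active(self, delegation_id: str) -> bool:
        return delegation_id not in self._revoked


registry = DelegationRegistry()
registry.revoke("task-42")  # contain one misbehaving task

task_42_active = registry.is_active("task-42")
task_43_active = registry.is_active("task-43")
```

Revoking one delegation leaves every other one active, which is exactly what a shared service credential cannot offer.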

Where the policy decision lands

The policy decision lands at the AI gateway, in the path of the tool call. The deployment pattern has three layers.

The agent runtime

The agent reasons about the task and produces tool-call requests. Each request is a structured message: the tool name, the arguments, the context. The agent does not authorize itself. It produces the request and forwards it.

The AI gateway

The gateway intercepts the tool call. It verifies the delegation token, extracts the natural-person identity, evaluates the per-task scope, and decides whether the call is permitted. The decision uses three inputs: the identity context from the token, the task scope from the delegation, and the policy for the requested tool under that scope.

The tool itself

The tool, downstream of the gateway, applies its own coarse-grained authorization (the API's RBAC). The tool sees only authorized calls. The gateway is the per-call enforcement boundary that the static role pattern omits.

Mandate vs. Compliance

EU AI Act Article 14 on human oversight, NIST's three-pillar framework, and the OWASP Top 10 for Agentic Applications converge on the per-task scoped delegation pattern. The vocabulary across the three differs. The architecture does not.

Disclosure test

When a regulator or auditor opens an inquiry into an agent's behavior, the question is "which user authorized this action under what scope at what moment, against which tool, with what outcome." The static-role pattern answers part of the question (the role) and skips the rest. The per-call records produced at the gateway answer the full question.

Vendor liability

Vendors building agent platforms ship with default permission models that the deploying enterprise inherits. The enterprise is the deployer under EU AI Act Article 26 and is liable for the deployment's compliance with high-risk obligations. A vendor's permission model is an input to the deployer's policy. The deployer cannot transfer the per-task scoping responsibility.

Compliance gap

Most production agent deployments today hold static roles. The per-call evaluation, the identity context, the per-task scope, and the per-call records all require architecture beyond what the agent runtime ships with by default. The gap closes with a gateway in the path of the tool calls.

DeepInspect

This is the architecture DeepInspect was built to provide. DeepInspect sits at the AI request boundary as a stateless proxy between any application and any LLM, and between agents and the tools they call. Per call, the gateway verifies the delegation token, extracts the identity context, evaluates the per-task scope, and decides whether the tool call is permitted.

Every decision produces a per-decision audit record containing the natural-person identity, the agent identity, the task scope, the tool, the arguments, the policy version, and the decision outcome. The record is signed and tamper-evident. The record is committed before the tool call returns to the agent.
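To illustrate what "signed and tamper-evident" can mean in general, here is a generic hash-chain sketch. It is not DeepInspect's implementation; the chaining scheme, the HMAC key, and every name in it are assumptions for illustration.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first record


def append_record(chain: list, record: dict, key: bytes) -> dict:
    """Link a record to its predecessor's hash, then sign the new hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    body = json.dumps(record, sort_keys=True)
    entry_hash = hashlib.sha256((prev_hash + body).encode()).hexdigest()
    signature = hmac.new(key, entry_hash.encode(), hashlib.sha256).hexdigest()
    entry = {"record": record, "prev_hash": prev_hash, "hash": entry_hash, "signature": signature}
    chain.append(entry)
    return entry


def verify_chain(chain: list, key: bytes) -> bool:
    """Recompute every hash and signature; any edit breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        body = json.dumps(entry["record"], sort_keys=True)
        expected_hash = hashlib.sha256((prev_hash + body).encode()).hexdigest()
        expected_sig = hmac.new(key, expected_hash.encode(), hashlib.sha256).hexdigest()
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected_hash:
            return False
        if not hmac.compare_digest(entry["signature"], expected_sig):
            return False
        prev_hash = entry["hash"]
    return True


key = b"demo-audit-key"  # illustration only
chain = []
append_record(chain, {"user": "user-123", "tool": "salesforce", "outcome": "denied"}, key)
append_record(chain, {"user": "user-123", "tool": "jira", "outcome": "permitted"}, key)
chain_ok = verify_chain(chain, key)
```

Because each entry's hash covers the previous entry's hash, changing an earlier record after the fact invalidates it and everything that follows.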

For NIST Pillar 3, this is action lineage at per-tool-call granularity. For OWASP excessive agency, the gateway is the layer that catches the divergence between intended and actual tool invocation. For EU AI Act Article 26 deployer obligations, the records show that human oversight applied to each agent action.

Book a demo today.