AI Agent Authorization: NIST Pillar 2 at the Request Boundary

AI agent authorization is the per-request decision about whether a specific caller, against a specific resource, under a specific policy, is allowed to act at this moment. NIST's AI agent identity and authorization framework calls this Pillar 2: delegated authority. Most production deployments solve Pillar 1 (verified identity) at the front door and then run the rest of the chain on a static service credential with permanent access to the full model API. The agent acts with the maximum privilege of the credential, not with the scoped authority the human granted at the entry point.
I want to walk through what authorization at the AI call layer requires, why authentication alone falls short, and where Pillar 2 has to be enforced architecturally.
Authentication answers who. Authorization answers what.
Authentication verifies identity. The user holds a valid token, the agent presents a verifiable claim, the credential is recognized by the system. Authorization is the next decision: given this verified identity, is this specific action against this specific resource permitted right now? The two questions look similar in slide decks and diverge sharply in production. A user can be authenticated and still attempt actions they have no right to perform.
The Meta March 18 incident is the canonical example. An internal AI agent exposed sensitive data to engineers who were fully authenticated and had no business reading it. The authentication layer worked. The authorization decision at the AI request layer was missing.
Static service credentials violate least privilege by design
The dominant pattern in production AI deployments is a static API key issued to the application. The key authenticates the application to the model provider. Every request the application makes carries the same credential. Every caller of the application, every prompt, every data classification gets evaluated against the same privilege level.
This is the inverse of least privilege. The credential has permanent access to the full model API for every possible request. There is no per-request, per-role, per-data-classification decision happening at the credential level. The only way to scope authority is to evaluate the request against policy at a layer above the credential.
Delegated authority requires per-request evaluation
Pillar 2 of the NIST framework calls for delegated authority: the scope of what an agent may do is bounded by the authority the human granted, evaluated per request. The evaluation has to take into account who the human is, what their role is, what data classification the prompt carries, what resource the action targets, and what policy version applies at this moment.
A static credential cannot encode any of this. The evaluation has to happen at a policy decision point above the credential, with identity context, role, classification, and policy state all available at the moment of evaluation. The decision is deterministic, not probabilistic, and it produces a record that says what was permitted and why.
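To make the shape of that evaluation concrete, here is a minimal sketch of a policy decision point. It is an illustration under stated assumptions, not a reference implementation: the names `Request`, `Policy`, `decide`, and the three-level classification scheme in `LEVELS` are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical classification scheme, lowest sensitivity first (an assumption).
LEVELS = {"public": 0, "internal": 1, "restricted": 2}


@dataclass(frozen=True)
class Request:
    user_id: str          # the human on whose behalf the agent acts
    role: str
    classification: str   # data classification carried by the prompt
    resource: str
    action: str


@dataclass
class Policy:
    version: str
    # (role, resource, action) -> highest classification that role may send
    rules: dict


def decide(req: Request, policy: Policy) -> dict:
    """Evaluate one request. The same inputs and policy version give the same result."""
    ceiling = policy.rules.get((req.role, req.resource, req.action))
    level = LEVELS.get(req.classification)
    if ceiling is None:
        outcome, reason = "deny", "no rule covers this role, resource, and action"
    elif level is None:
        outcome, reason = "deny", "unknown data classification"
    elif level > LEVELS[ceiling]:
        outcome, reason = "deny", f"{req.classification} exceeds ceiling {ceiling}"
    else:
        outcome, reason = "permit", f"{req.classification} within ceiling {ceiling}"
    # The returned dict says what was decided, why, and under which policy version.
    return {"outcome": outcome, "reason": reason, "policy_version": policy.version}
```

The function defaults to deny when no rule matches, and it contains no scoring or model call, so the outcome depends only on the request and the policy version.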
What an authorization record looks like
An authorization record at the AI request boundary contains the verified identity of the caller, the role and authorization context in effect, the policy version that governed the decision, the data classification applied to the request, the resource the action targeted, and the outcome. The record is committed before the response returns to the application. The application never has custody of the record's write path.
This is also the Article 12 evidence layer. EU AI Act Article 12 requires automatic recording of events over the lifetime of the system, sufficient to reconstruct risk situations. An authorization record satisfies the reconstruction requirement by construction: identity, policy, classification, resource, outcome.
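The record fields listed above could be modeled roughly as follows. This is a sketch, not a prescribed schema: the class name, field names, the `decided_at` timestamp, and the in-memory list standing in for an append-only store are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuthorizationRecord:
    caller_identity: str       # verified identity of the caller
    role: str                  # role and authorization context in effect
    policy_version: str        # policy version that governed the decision
    data_classification: str   # classification applied to the request
    resource: str              # resource the action targeted
    outcome: str               # "permit" or "deny"
    decided_at: str            # UTC timestamp (an added field, not named in the text)


def commit(record: AuthorizationRecord, store: list) -> str:
    """Serialize the record and append it. A list stands in for an append-only log."""
    line = json.dumps(asdict(record), sort_keys=True)
    store.append(line)
    return line


def decide_and_record(fields: dict, store: list) -> AuthorizationRecord:
    """Build and commit the record before anything is returned to the caller."""
    record = AuthorizationRecord(
        decided_at=datetime.now(timezone.utc).isoformat(), **fields
    )
    commit(record, store)
    return record
```

In this sketch the enforcement layer owns `store`; the application only ever receives the result of `decide_and_record`, never a handle to the write path.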
Where the enforcement layer sits
The architecturally correct place for AI agent authorization is the AI request boundary: a layer between the agent or orchestrator and the model API it calls. The layer holds the identity context, evaluates the per-request policy, makes the deterministic decision, and writes the audit record. Enforcement is inline. A blocked request never reaches the model. Latency overhead is under 50 ms in production, against an LLM inference baseline of 500 ms to 5 seconds, which puts the overhead below 10% of the fastest calls and below 1% of the slowest.
This is the same architectural argument that applies to inline enforcement generally. Log-and-alert produces forensic value. Prevention requires the decision to happen before the request reaches the model.
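The inline property can be sketched in a few lines. This is a simplified illustration, assuming a hypothetical `decide` callable that returns "permit" or "deny" and a hypothetical `call_model` callable that wraps the model API; neither name comes from a real library.

```python
class Blocked(Exception):
    """Raised when policy denies a request. The model is never called."""


def enforce(request: dict, decide, call_model, audit_log: list) -> str:
    """Decide, record, and only then forward a permitted request to the model."""
    outcome = decide(request)  # expected to be "permit" or "deny"
    # The decision is recorded before any response goes back to the application.
    audit_log.append({"caller": request["caller"], "outcome": outcome})
    if outcome != "permit":
        # Inline enforcement: the denied prompt stops here.
        raise Blocked(f"request from {request['caller']} denied by policy")
    return call_model(request["prompt"])
```

A log-and-alert design would call `call_model` first and inspect the exchange afterwards. Here the call sits after the policy check, so a denial prevents the request instead of reporting on it.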
DeepInspect
This is the architecture DeepInspect was built to provide. DeepInspect sits at the AI request boundary as a stateless proxy between authenticated agents and any LLM. Every request is evaluated against per-route, per-role policies using the identity context the application supplies. PII is detected and redacted or blocked based on data classification rules. Every decision produces a per-decision audit record bound to the natural person on whose behalf the agent is acting.
For Pillar 2, this is the missing infrastructure. The credential the model provider sees is the proxy's. The decision the regulator asks about is the proxy's. The audit record the regulator inspects is independent of the application and signed at the layer that made the decision.
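As a generic illustration of what signing a record at the decision layer can mean, the sketch below uses an HMAC over a canonical JSON encoding. It is not a description of DeepInspect's implementation, which the text does not specify; the function names and the choice of HMAC-SHA256 are assumptions.

```python
import hashlib
import hmac
import json


def _canonical(record: dict) -> bytes:
    # Sorted keys and fixed separators so the same record always yields the same bytes.
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_record(record: dict, key: bytes) -> dict:
    """Attach an HMAC-SHA256 signature computed by the layer that made the decision."""
    signature = hmac.new(key, _canonical(record), hashlib.sha256).hexdigest()
    return {"record": record, "signature": signature}


def verify_record(signed: dict, key: bytes) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = hmac.new(key, _canonical(signed["record"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed["signature"])
```

HMAC is symmetric, so the record is only independent of the application if the application never holds the key. An asymmetric signature would additionally let a third party verify records without being able to forge them.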
Frequently asked questions
- How does AI agent authorization differ from API authorization?
API authorization decides whether a caller may invoke an API endpoint. AI agent authorization decides whether the caller, against the resource the agent is about to act on, under the policy in effect, may take the specific action the model has planned. The decision depends on the data classification of the prompt, the role of the human behind the agent, and the policy version that applies at this moment. API authorization is the access decision. AI agent authorization is the action decision, evaluated against the model's intent.
- Can OAuth scopes serve as AI agent authorization?
OAuth scopes describe permissions a token holds at issue time. They do not evaluate the per-request decision about whether a specific action against a specific resource, under a specific policy, is permitted right now. Scopes can be one input to that decision, but the decision itself has to happen at a policy decision point that holds the identity context, the classification, the policy version, and the resource state at the moment of the request. Scopes are necessary, not sufficient.
- What does Pillar 2 require that Pillar 1 does not?
Pillar 1 (verified identity) requires the system to know who is calling. Pillar 2 (delegated authority) requires the system to know what that caller is permitted to do, scoped per request, per role, per resource, and per policy version. Pillar 1 is upstream application architecture. Pillar 2 is enforcement at the AI request layer. The application owns Pillar 1. An external enforcement layer owns Pillars 2 and 3.
- Does my application's existing RBAC system handle this?
A standard RBAC system maps roles to permissions on application resources. AI agent authorization extends the decision into the model interaction: the prompt content, the data classification of the context window, the policy version in effect at the request layer. RBAC is an input to the decision. The decision itself has to happen at the AI request layer with prompt and classification context available. Most enterprise RBAC systems do not see the prompt, so the decision cannot be made there.
- How does this map to ABAC?
Attribute-based access control evaluates a decision against attributes of the subject, action, resource, and environment. AI agent authorization is an ABAC decision applied at the AI request layer, with the prompt's data classification and the model interaction's risk profile as additional attributes. The ABAC engine has to receive the AI-specific context. Most enterprise ABAC implementations do not, and that gap is what an enforcement layer at the AI request boundary closes.
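The ABAC framing in the last answer can be sketched as an ordinary attribute evaluation with one extra attribute group for the AI-specific context. Everything here is hypothetical: the `Attributes` class, the `ai_context` keys, the example rules, and the scope name `agent:high_risk` are invented for illustration. OAuth scopes and roles appear as subject attributes, which is how they act as inputs to the decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attributes:
    subject: dict       # e.g. role, clearance, OAuth scopes
    action: str
    resource: dict
    environment: dict   # e.g. policy version
    ai_context: dict    # AI-specific: prompt classification, risk profile


def restricted_prompt_needs_clearance(a: Attributes) -> bool:
    # A restricted prompt is only allowed for subjects cleared for restricted data.
    if a.ai_context.get("prompt_classification") == "restricted":
        return a.subject.get("clearance") == "restricted"
    return True


def high_risk_needs_scope(a: Attributes) -> bool:
    # A high-risk interaction additionally requires a specific (hypothetical) scope.
    if a.ai_context.get("risk") == "high":
        return "agent:high_risk" in a.subject.get("scopes", ())
    return True


RULES = [restricted_prompt_needs_clearance, high_risk_needs_scope]


def evaluate(a: Attributes, rules=RULES) -> str:
    """Permit only if every rule passes for this combination of attributes."""
    return "permit" if all(rule(a) for rule in rules) else "deny"
```

Without the `ai_context` group, neither rule has anything to evaluate. That is the gap described above: an ABAC engine that never receives the prompt's classification cannot make this decision.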