AI Agent Runtime Protection: Where the Control Plane Has to Sit and Why Most Architectures Get It Wrong
AI agent runtime protection has to enforce policy at the moment the agent calls a tool or a model. Most architectures push protection into the framework layer (LangChain callbacks, AutoGen middlewares, Semantic Kernel filters), which the agent can bypass once a prompt-injection payload reshapes the call path. The placement that survives sits at the HTTP request layer between the agent and the LLM or tool endpoint. This walkthrough shows why framework-layer protection fails, where the gateway placement closes the gap, and which controls only the request layer can enforce.

Most AI agent runtime protection lives inside the framework. LangChain callbacks, AutoGen middlewares, and Semantic Kernel filters check the agent's intermediate steps and refuse the ones that look risky. That design works in the demo and fails in production. The failure mode is that the agent's call path is shaped by the prompt; an attacker who reshapes the prompt reshapes the call path; the framework callback receives the new path as legitimate and approves it. The protection that survives sits at the HTTP request layer between the agent and the LLM or the tool endpoint, where the policy decision runs against the actual request that leaves the process.
I want to walk through where framework-layer protection breaks, why the HTTP layer is the placement that survives, and which controls only the request layer can enforce.
Why framework-layer protection fails
A framework callback runs inside the agent process. The callback is invoked when the agent decides to call a tool or a model. The decision is made by the agent's reasoning loop, which is driven by the prompt and the conversation state. When the prompt carries an injection payload, the reasoning loop produces a call sequence the attacker selected. The callback receives the call sequence as a legitimate intermediate step.
Microsoft documented this pattern in the May 7, 2026 disclosure on prompt-to-shell escalation across mainstream agent frameworks. The injection payload reshapes the agent's tool-call graph; the framework callback sees the reshaped graph as the agent's own intent. The attacker's payload has the same trust level as the legitimate user prompt because the framework treats both as input to the reasoning loop.
The callback's view of the world is filtered through the same reasoning loop the attacker compromised. A control whose decision depends on the integrity of the layer above it is not a control; it is a courtesy check.
Why the HTTP request layer survives
A control that sits at the HTTP request layer evaluates the request that actually leaves the process. The agent's reasoning loop produces a network call; the network call carries the model name, the tool endpoint, the parameters, and the caller's identity token. The policy gateway terminates the connection, attaches the verified identity, classifies the parameters against the data policy, evaluates the policy in force, makes the decision, and forwards or refuses. The decision runs against the bytes on the wire, not against the agent's narration of what it intended.
The placement closes the prompt-injection bypass because the injection payload cannot change what the policy gateway evaluates. The gateway sees the model call or the tool call as a request from the agent's authenticated identity. The classification and policy decision apply to the request, regardless of why the agent decided to send it.
Controls that only the request layer can enforce
Five controls require the HTTP placement and cannot be enforced from inside the framework.
Identity-bound rate limits
The framework knows the user that started the session. The HTTP layer knows the agent's verified identity on every outgoing call. Rate limits applied at the framework layer are bypassed when the agent spawns sub-agents that inherit the session but bypass the limit accumulator. The HTTP layer accumulates against the verified identity, which is the dimension a regulator actually asks about.
Per-decision audit records
EU AI Act Article 12 requires an automatic record of each event over the lifetime of the system. Article 19 requires the record to identify the natural person involved. The framework callback can produce a record of its decisions, but the record lives inside the process the attacker may already control. The HTTP layer writes the record outside the process to durable, signed storage that the agent cannot modify.
Egress destination policy
The agent's tool call goes to a real endpoint at a real domain. The HTTP layer enforces which destinations are reachable from which agent identities under which policies. The framework callback sees the tool name; the destination is resolved by the framework after the callback runs. A policy that says "only the support-summarization agent can call the customer database tool" enforces at the HTTP layer where the call resolves; at the framework layer the policy applies to a name that can be remapped.
Output classification on the response path
The model's response carries content the agent will act on. A policy that says "responses containing customer PII cannot reach an outbound tool call" enforces on the response path at the HTTP layer. The framework callback runs after the agent has already incorporated the response into its reasoning state, which is too late for content-classification refusal.
Fail-closed behavior on policy outage
When the policy engine is unreachable, the framework callback usually fails open: the call proceeds because the protection is advisory. The HTTP layer fails closed: the call is refused because the policy decision cannot be made. Fail-closed is the design property a regulator expects from a control on the critical path.
A reference architecture
A reference agent architecture that survives the runtime threats looks like the figure below. The agent process runs the framework; the framework's outbound calls flow through the policy gateway; the gateway terminates the TLS, evaluates the policy, writes the audit record, and forwards or refuses.
The gateway and the agent process are separate failure domains. A compromise of the agent process does not compromise the policy decision because the policy decision happens outside the process. The audit record persists outside the process because the storage is outside the process.
How this maps to OWASP and to NIST
The OWASP Top 10 for Agentic Applications (2026) identifies the "agentic skills" intermediate behavior layer as the new vulnerable component. The HTTP placement covers the outbound side of that layer; the framework callbacks cover the intermediate decisions, but the enforceable point is at the request boundary.
The NIST AI RMF MANAGE function requires controls that operate on the running system. The forthcoming COSAiS overlays for single-agent and multi-agent systems specify identity-aware policy decisions at the request layer as a measured control. The HTTP placement is the natural home for the controls the overlays describe.
DeepInspect
DeepInspect is the policy gateway in the reference architecture. The agent process routes its outbound LLM calls and tool calls through DeepInspect. The gateway resolves the agent's verified identity, classifies the parameters against the data policy, evaluates the policy in force, writes the per-decision audit record to durable signed storage, and refuses the call on policy outage. The five controls listed above operate at the request layer where they cannot be reshaped by the prompt.
The placement is the difference between a protection that the prompt-injection payload can reroute and a control the regulator accepts as a system of record. Take the AI readiness self-assessment to see how your current agent architecture compares against the gateway placement.
Frequently asked questions
- Does this replace LangChain or AutoGen?
The framework continues to drive the agent's reasoning loop and the tool-call graph. The gateway sits below the framework on the outbound HTTP path. The framework callbacks remain useful for in-process telemetry and for catching obvious malformed calls; the gateway provides the enforceable control and the audit record. Both coexist.
- What about agents that run entirely in-process and never make external calls?
An agent that never makes an external call is rare in production. Most agent deployments call at least one LLM, one tool API, and one internal service. The HTTP placement protects those calls. The pure in-process agent is a research artifact; it has no enforceable control because there is no request to inspect.
- How does the gateway know which agent identity is calling?
The agent process attaches a verified token on each outbound call. The token can be an OAuth access token tied to the agent's service identity, a JWT signed by the identity provider, or a workload identity certificate. The gateway validates the token, resolves the identity, and binds the audit record to it. Tokens that the gateway cannot validate are refused, which is the fail-closed behavior described above.
- Does this add unacceptable latency?
A correctly sized gateway adds 4 to 20 milliseconds at the p99 for the policy decision and the audit write. The model's own response time is usually 200 milliseconds or more. The gateway latency is a small fraction of the end-to-end response time, and it is a fixed cost the platform team can size against.
- How does this interact with prompt-injection defenses?
Prompt-injection defenses (input sanitization, system-prompt hardening, output validators) continue to operate inside the agent process. The gateway is the layer that enforces what the agent can do regardless of whether an injection succeeded. The two are complementary: the agent's defenses reduce the likelihood of injection; the gateway bounds the damage when injection succeeds.