
Five Eyes Just Defined Agentic AI Risk in Five Categories. Three Live on the Traffic Plane.

On April 30, 2026, six national cybersecurity agencies published Careful Adoption of Agentic AI Services. It defines five risk categories for agentic AI: privilege, design and configuration, behavioral, structural, and accountability. Three of those (privilege, behavioral, accountability) are enforceable at the agent-to-LLM traffic boundary. The other two belong to deployment architecture. This post maps the three operational categories to the runtime control patterns that satisfy them.

ai-security · agentic-ai · ai-governance · five-eyes · nsa-cisa · audit · identity

On April 30, 2026, six national cybersecurity agencies (NSA, CISA, ASD ACSC, the Canadian Centre for Cyber Security, NCSC NZ, NCSC UK) published Careful Adoption of Agentic AI Services. It is the first multi-government joint guidance specifically about agentic AI. The headline most coverage picked up was that agencies now treat agentic AI as a live critical infrastructure risk.

The guidance defines five risk categories: privilege, design and configuration, behavioral, structural, and accountability. If you are sitting in front of a sprawl of LLM-driven internal tools and you have been asked to put a control plan on a board agenda, this is the closest thing to an accepted vocabulary you will get. It will be cited.

Two of the five (design and configuration, structural) belong to deployment architecture. They are decided by the people standing up the agent, the framework they pick, and how the runtime is laid out. By the time traffic is flowing, those decisions are already locked in. You re-architect to address them.

The other three (privilege, behavioral, accountability) are operational. They show up in production, on every model call. They are enforceable at the boundary between the agent and whatever it is calling: a model, a tool, an MCP server, another agent. That boundary is where I will spend the rest of this post.

Photo by Amanda Dalbjörn on Unsplash

Privilege risk

The Five Eyes definition: an agent receives more authority than its task requires. CSA's April 2026 research puts a number on it. About three-quarters of surveyed enterprises report agents holding broader access than they need. The reason is structural. Agents inherit a service account. The service account holds the union of permissions any agent in the system might ever need. Every individual agent gets that union.

A common response is to give each agent its own service account. That is the same problem with more rows. The right fix is identity-bound scope, evaluated per call.

The pattern, in pseudocode:

on_outbound_model_call(agent_call):
    # Resolve the principal to the originating user, never the agent
    # or the orchestrator's service account.
    user = resolve_originating_user(agent_call.delegation_chain)
    role = directory.role_for(user)
    purpose = agent_call.declared_purpose
    # Scope is computed per call from role, declared purpose, and target model.
    scope = policy.scope_for(role, purpose, agent_call.target_model)
    if not scope.contains(agent_call.intended_action):
        deny(agent_call, reason=...)
    else:
        # Forward with a signed token so the downstream can verify the scope.
        forward(agent_call, scope_token=scope.sign())

Three things to notice. The principal resolves to the originating user; the agent identity and the orchestrator's service account never appear as the actor in the audit trail. The scope is computed from role and declared purpose, so the same agent invoked by two different users sees two different envelopes. The decision happens before the call leaves the boundary, so the model never sees an over-scoped request, even from a misconfigured agent.
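To make the second point concrete, here is a minimal sketch of the role-and-purpose lookup behind scope_for (target model omitted for brevity). The roles, purposes, and actions in the policy table are hypothetical, not a fixed schema.

# Hypothetical policy table: (role, declared purpose) -> permitted actions.
POLICY = {
    ("support-analyst", "summarize-ticket"): {"model:complete"},
    ("support-admin", "summarize-ticket"): {"model:complete", "tool:crm.read"},
}

def scope_for(role, purpose):
    # Unknown (role, purpose) pairs resolve to the empty scope: deny by default.
    return POLICY.get((role, purpose), set())

# Same agent, same declared purpose, two different originating users:
assert "tool:crm.read" not in scope_for("support-analyst", "summarize-ticket")
assert "tool:crm.read" in scope_for("support-admin", "summarize-ticket")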

This is what NIST SP 800-53 CM-7 style "least functionality" looks like for agents. It is also what Five Eyes is asking for under privilege risk.

Behavioral risk

The Five Eyes definition is broader than I expected. It covers emergent action patterns, drift after model updates, sensitivity to prompt phrasing, and chained tool use that produces effects no single tool produces on its own.

A static allow-list says "this agent may call these five tools". It fires only when an agent tries to call a sixth. It stays quiet when the agent calls the first five tools forty times instead of three, or runs them in a sequence that exfiltrates data through a side channel each individual call permits.

The mechanism is a runtime baseline per agent role. What I mean by baseline:

baseline(agent_role) = {
    tool_call_distribution_per_session,   # which tools, how often
    argument_distribution_per_tool,       # what each tool is called with
    response_class_distribution,          # what classes of output come back
    chain_depth_distribution,             # how deep tool chains run
    egress_volume_per_session,            # how much data leaves
}

Each call updates the baseline and is also evaluated against it. Divergence beyond a tolerance does not auto-block. It escalates. The escalation lands in the same evidence chain as the policy decision, so the on-call has the full context: the call, the prior baseline, the divergence, the originating user, the agent version, the model. That last point matters because behavioral drift after a model upgrade is a pattern Five Eyes calls out specifically.
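What evaluating a call against that baseline can look like, reduced to a single dimension (tool calls per session): a minimal sketch, assuming a simple standard-deviation tolerance. Class and method names are illustrative, not an existing API.

from collections import defaultdict
from statistics import mean, pstdev

class RoleBaseline:
    def __init__(self):
        self.open_sessions = defaultdict(int)   # session_id -> tool calls so far
        self.history = []                       # per-session counts, closed sessions

    def record_call(self, session_id):
        self.open_sessions[session_id] += 1

    def close_session(self, session_id):
        self.history.append(self.open_sessions.pop(session_id, 0))

    def diverges(self, session_id, tolerance=3.0):
        if len(self.history) < 30:              # not enough history to judge
            return False
        mu, sigma = mean(self.history), pstdev(self.history)
        # Escalate, never auto-block, when a live session runs far past the norm.
        return sigma > 0 and (self.open_sessions[session_id] - mu) / sigma > tolerance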

Behavioral risk is the category most teams currently address with dashboards and offline reviews. Both are useful after the fact, but enforcement happens inline.

Accountability risk

Five Eyes is direct about this category: when an agent acts, you should be able to reconstruct what it did, on whose behalf, with what credentials, and under whose instructions. CSA's April 2026 whitepaper found that a majority of enterprises cannot reliably distinguish AI agent activity from human activity in their logs. That is the gap Five Eyes is closing with the accountability category.

The mechanism is a signed evidence record per model call. Two design points. The signature is applied at the boundary, so a downstream log pipeline cannot quietly drop fields without breaking verification. The session id is the join key for everything that happened inside one logical user request, including hops across agents. That join key is what lets a forensic reviewer pull the full chain in one query instead of stitching it from five log sources. I covered the broader design pattern in How to Build a Defensible AI Audit Trail.
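A minimal sketch of such a record and its signature, using the field set listed in the FAQ at the end of this post. HMAC over canonical JSON is one signing choice among several, and every value below is illustrative.

import hashlib, hmac, json

def sign_record(record, key):
    # Canonical JSON (stable key order, no whitespace) so verification is deterministic.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode()
    return {**record, "signature": hmac.new(key, payload, hashlib.sha256).hexdigest()}

record = sign_record({
    "session_id": "sess-0042",                  # join key across agent hops
    "timestamp": "2026-05-04T09:12:00Z",
    "originating_user": "jdoe",
    "delegation_chain": ["orchestrator", "retrieval-agent"],
    "agent": {"name": "retrieval-agent", "version": "1.4.2"},
    "model_target": "internal-llm",
    "prompt_class": "customer-data, redactions applied",
    "context_refs": ["kb-7781"],                # retrieved context references
    "response_class": "summary",
    "tool_calls": ["crm.read"],                 # downstream tool calls
    "policy_decision": {"decision": "allow", "rule_id": "scope-42"},
}, key=b"demo-key")                             # illustrative; use a managed signing key
# A pipeline that drops or rewrites any field breaks verification of "signature".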

This is what auditors will ask for under HIPAA 45 CFR 164.312(b), under SR 11-7 model risk management, and under EU AI Act Article 12 record-keeping. The Five Eyes accountability category restates an audit obligation that already existed in those regimes. The new part is having a boundary at which to capture it.

Categories that need other controls

Design and configuration risk is about how the agent is built and stood up. Framework choice, default tool registrations, sandboxing strategy, system prompt hygiene. These are decisions made by the team building the agent. A traffic plane control cannot retroactively fix a system prompt that leaks instructions or a framework that auto-registers tools the operator did not approve. This category belongs in your SDLC and your AI architecture review.

Structural risk is about how multiple agents and supporting components compose. Where state lives, who can mutate it, how agents discover each other, how memory is shared. The traffic plane sees the calls between components, not the composition decisions behind them. This category belongs in your reference architecture for agentic systems.

If a vendor tells you their AI security product addresses all five Five Eyes categories, ask them which control specifically addresses structural risk. The honest answer is that no traffic-plane product does.

Where this leaves DeepInspect

DeepInspect is an identity-aware enforcement plane on AI traffic. The three categories I covered (privilege, behavioral, accountability) are the ones it addresses. The two I set aside are the ones it does not. That split is the honest map of the runtime control plane's reach.

In practice an evaluator should expect three things from any traffic-plane control: identity-bound scope per call, runtime baselines that escalate on drift, and signed evidence records that survive regulator review. Each maps to one of the three operational categories. A vendor claiming a single product covers all five Five Eyes categories is making a capability claim the architecture does not support, and that overstatement is worth catching during a procurement review.

A four-step audit

If I were a CISO reading the Five Eyes document on Monday morning, I would do four things.

  1. Map every agentic system in production to the five categories. Per system, per category. The matrix is small enough to fit on one page and useful enough to drive your next investment decision; a sketch follows this list.

  2. For privilege, ask each agent owner what identity their agent uses to call models and tools today. If the answer is a service account, that is a finding.

  3. For behavioral, ask whether any agent in production has a baseline of normal behavior. If the answer is "we have logs," that is a finding.

  4. For accountability, stage a real incident response: ask the on-call to reconstruct what an agent did during the last anomalous session. Time them. The number you get is the gap to close.
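A sketch of that step-1 matrix, with hypothetical systems and statuses ("enforced" means a control exists, "gap" means a finding):

System             Privilege  Design/config  Behavioral  Structural  Accountability
ticket-summarizer  enforced   reviewed       gap         reviewed    enforced
expense-agent      gap        gap            gap         reviewed    gap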

These four steps will tell you which of the five categories you are running blind on, and that is the prerequisite for any plan that follows.

Common questions

What is the Five Eyes agentic AI guidance?

The Careful Adoption of Agentic AI Services document, published April 30, 2026, is the first multi-government joint guidance specifically about agentic AI. It was issued by six national cybersecurity agencies: the US National Security Agency (NSA), the Cybersecurity and Infrastructure Security Agency (CISA), the Australian Signals Directorate's Australian Cyber Security Centre (ASD ACSC), the Canadian Centre for Cyber Security, the UK National Cyber Security Centre, and New Zealand's National Cyber Security Centre. The document defines five risk categories that organizations should map their agentic AI deployments against: privilege, design and configuration, behavioral, structural, and accountability. It treats agentic AI as a live critical infrastructure risk. The recommended approach is to fold these systems into existing zero trust and defense-in-depth frameworks. Building a separate AI security discipline is explicitly discouraged.

Which Five Eyes risk categories can a runtime control plane enforce?

Three of the five categories (privilege, behavioral, accountability) are operational and enforceable at the boundary between an agent and the model or tool it calls. Privilege risk is enforced by binding scope to the originating user identity at every call. Behavioral risk is enforced by maintaining a runtime baseline of normal agent activity and escalating on divergence. Accountability risk is enforced by writing a signed evidence record for every model call, indexed by session id. The remaining two categories (design and configuration, structural) are decided when the agent is built and deployed. They belong to architecture review and reference design. A vendor claiming a single product addresses all five categories is making a capability claim the architecture does not support, and that overstatement is worth catching during evaluation.

How do you enforce privilege risk in agentic AI systems?

The mechanism is identity-bound scope evaluated per call. Before the agent's outbound request leaves your environment, resolve the originating user from the delegation chain. Look up the user's role and the agent's declared purpose. Compute the scope the user's role permits for that target model with that purpose. If the agent's intended action falls outside the scope, deny the request. If it falls inside, forward the request with a signed scope token. The same agent invoked by two different users sees two different scopes, because the principal is the user, not the agent. This pattern aligns with NIST SP 800-53 CM-7 "least functionality" applied to agents. It also addresses the finding from CSA's April 2026 research that approximately three-quarters of enterprises grant agents broader access than the assigned task requires.

Why are static allow-lists insufficient for behavioral risk in agentic AI?

An allow-list says "this agent may call these five tools". It fires only when an agent attempts to call a sixth tool. It stays quiet when the agent calls the first five tools forty times instead of three, or when it runs them in a sequence that produces an effect no individual call would. The Five Eyes guidance flags emergent action patterns, drift after model updates, and chained tool use as distinct behavioral risks. None of those are visible to a static allow-list. The mechanism that does see them is a runtime baseline per agent role: tool call distribution per session, argument distribution per tool, response class distribution, chain depth distribution, egress volume per session. Each call updates the baseline and is also evaluated against it. Divergence beyond a tolerance escalates rather than auto-blocks, with full context delivered to the on-call.

What evidence does the Five Eyes accountability category require for an AI session?

The minimum evidence record at the model call boundary needs to answer four questions: what did the agent do, on whose behalf, with what credentials, and under whose instructions. Concretely, a signed record per model call containing session id, timestamp, originating user, delegation chain across agents, agent name and version, model target, prompt classification with redactions, retrieved context references, model response classification, downstream tool calls, and the policy decision with rule id. The session id is the join key for the full chain across agent hops. The signature prevents downstream log pipelines from quietly dropping fields. This evidence pattern satisfies HIPAA 45 CFR 164.312(b) audit controls, SR 11-7 model risk management documentation, and EU AI Act Article 12 record-keeping. The accountability category restates an existing audit obligation, captured at a boundary that previously did not exist.