Non-Human Identity for AI Agents: Why Service Credentials Are the Wrong Primitive

Non-human identity covers the API keys, OAuth tokens, and workload identities that authenticate services and agents to APIs. AI agents have outgrown the static-service-credential model. A single agent can act on behalf of many users, hold delegated authority that varies by task, and produce decisions that need per-action attribution. This piece walks through the four properties an NHI for AI agents must have, why static API keys fail each of them, and how identity-bound policy at the AI request boundary closes the gap.

By Parminder Singh · Founder & CEO, DeepInspect Inc.
Platform & Architecture · non-human-identity · ai-agents · identity · engineering · nist-ai-rmf
Non-human identity, or NHI, is the working term for the API keys, OAuth tokens, machine certificates, and workload identities that authenticate services and agents to APIs. The category is growing as AI agents proliferate. A single agent can act on behalf of many users, hold delegated authority that varies by task, run with elevated permissions on some calls and reduced ones on others, and produce decisions that need to be attributable per action rather than per session. The static-API-key primitive that worked for service-to-service traffic does not carry that load. The May 4, 2026 Cisco-Astrix acquisition made the category a buying-center conversation, but the architectural questions sit upstream of any vendor decision.

I want to walk through the four properties an NHI for AI agents must have to work, why static service credentials fail each of them, and where the policy enforcement actually lands.

NHI for AI agents

An agent calling an LLM API today typically uses one of two identity primitives. The first is a static API key issued by the model provider to the deploying organization. The second is a workload identity (IAM role, service principal, Kubernetes service account) used by the runtime to authenticate to a corporate AI proxy. Both primitives carry the same architectural flaw at scale: the identity at the call layer is the service identity, not the agent's identity and not the natural person on whose behalf the agent acts.

The four properties an NHI must have for AI agents

The first property is per-action attribution. The records must support reconstruction of which action was authorized by which identity at which moment. The static-key model attributes every call to the holder of the key. Per-action attribution requires the identity to be more specific than the key.

The second property is delegated authority. An agent acting on behalf of a user holds authority delegated from that user. The delegation must be visible at the call layer so the policy can evaluate against the actual authority chain.

The third property is short lifetime. The agent's identity at any moment must reflect a specific operating context, not a permanent grant. Short-lived, scoped credentials let the policy enforce the principle of least privilege per call rather than per service.

The fourth property is revocation. The identity must be revocable without reissuing the entire service's credentials. An agent whose credential has been exfiltrated or misused should be containable without breaking every other agent in the deployment.

Static API keys fail all four

Static API keys held by the calling service attribute every call to the service, hold no delegation context, never expire under normal operation, and require a fleet-wide rotation to revoke. They are the wrong primitive for agents that are supposed to act with delegated authority on behalf of identified users.

What real NHI for agents looks like

The pattern emerging across enterprise deployments has three components.

Workload identity for the runtime

The agent runtime authenticates to the corporate AI gateway with a workload identity, not a model-provider API key. The workload identity is short-lived (typically minutes to hours), scoped to the runtime's expected behavior, and managed through the existing identity provider. Standards work in this area includes SPIFFE and the workload-identity patterns shipped in Kubernetes, Azure Workload Identity Federation, and AWS IAM Roles for Service Accounts.

Delegated authority via signed tokens

When the agent acts on behalf of a user, the application that hosts the agent fetches a short-lived token from the identity provider that encodes the natural-person identity, the delegated scopes, and the maximum authority the user wants the agent to use. The token is signed and verifiable independently. The token is attached to the outbound call to the AI gateway. The gateway verifies the signature, extracts the identity context, and uses it in the policy decision.

Action-level records

The per-decision audit record captures the workload identity, the delegated natural-person identity, the scopes, the policy version, and the decision. The record is attributable per action, which is what NIST Pillar 3 (action lineage) requires.
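
One way to picture such a record is as an immutable structure with one field per item listed above. The class name, field names, and example values below are assumptions made for illustration, not a published schema.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class DecisionRecord:
    workload_identity: str   # which runtime made the call
    natural_person: str      # on whose behalf the agent acted
    scopes: tuple            # delegated scopes presented with the call
    policy_version: str      # policy in effect at decision time
    decision: str            # "allow" or "deny"
    recorded_at: str         # ISO 8601 timestamp, UTC


def record_decision(workload_identity: str, natural_person: str, scopes,
                    policy_version: str, allowed: bool,
                    now: Optional[datetime] = None) -> DecisionRecord:
    """Build one record per policy decision, not one per session."""
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    return DecisionRecord(
        workload_identity=workload_identity,
        natural_person=natural_person,
        scopes=tuple(sorted(scopes)),
        policy_version=policy_version,
        decision="allow" if allowed else "deny",
        recorded_at=timestamp,
    )


def to_json_line(record: DecisionRecord) -> str:
    """Serialize to a single JSON line suitable for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)
```

Because the record carries both the workload identity and the natural person, a single line is enough to answer "which agent, for whom, under which policy" without joining against session logs.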

Compliance gap

Most production AI deployments today fail the NHI test in three identifiable ways.

The model-provider key as the only identity

The deploying organization holds a single OpenAI key. Every agent calls OpenAI with that key. The model provider's logs attribute every call to the organization. The internal logs, if they exist, attribute calls to whichever microservice made the outbound call. The natural person disappears between the application and the model. The EU AI Act Article 19 requirement to identify natural persons cannot be satisfied from this configuration.

Shared service accounts across agents

Multiple distinct agents share a single service-account credential for operational convenience. The policy enforcement at the AI gateway evaluates every call as that service account, regardless of which agent originated the call or which user the agent was acting for. The result is that agents with materially different authority profiles get evaluated under the same policy.
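
To make the contrast concrete, here is a minimal sketch of a policy check that keys on the individual agent and on the user's delegated scopes rather than on a shared service account. The agent names, scope strings, and the `AGENT_CEILINGS` table are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallContext:
    agent_id: str           # identity of the specific agent, not a shared account
    user_scopes: frozenset  # scopes the user delegated for this task
    action: str             # what the call is trying to do


# Hypothetical per-agent ceilings: the most each agent may ever do,
# regardless of what a user delegates to it.
AGENT_CEILINGS = {
    "agent/support-bot": frozenset({"tickets:read", "tickets:summarize"}),
    "agent/billing-bot": frozenset({"invoices:read", "invoices:refund"}),
}


def is_allowed(ctx: CallContext) -> bool:
    """Allow only actions inside both the agent ceiling and the delegation."""
    ceiling = AGENT_CEILINGS.get(ctx.agent_id, frozenset())
    effective = ceiling & ctx.user_scopes
    return ctx.action in effective
```

With the same user delegation, a refund request is denied for `agent/support-bot` and allowed for `agent/billing-bot`. Under a shared service account the two calls would be indistinguishable to the policy.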

Revocation is a fleet-wide event

An agent that misbehaves cannot be revoked individually because the credential is shared. The operational response to a compromised credential is a fleet-wide key rotation, which often takes hours and disrupts every legitimate agent in the deployment.

Mandate vs. Compliance

NIST's AI agent identity and authorization framework, EU AI Act Article 19, and the emerging OWASP Top 10 for Agentic Applications converge on the same architectural answer.

The NIST three pillars

Pillar 1 requires agent identity at the application layer. The application is responsible for verifying who is calling and attaching that identity context to outbound calls. Pillar 2 requires delegated authority, evaluated per call. Pillar 3 requires action lineage, recorded per call.

The static-key NHI satisfies none of the three. The signed-token-with-workload-identity pattern satisfies all three.

Records that survive the inquiry

When a regulator opens a high-risk inquiry under EU AI Act Article 26, the deployer must produce records that identify the natural person behind each AI decision. When a SOC opens an incident review on an agent that exfiltrated data, the investigator must reconstruct which agent took the action, under what authority, and what policy was in effect. Both require the records that the action-level NHI produces. Application logs that capture only the service-account identity cannot answer either question.

The structural gap

The structural gap is that the application has the natural-person identity (the user session is at the application layer) and the model provider sees only the service credential (the API key at the call layer). The information needed to bridge the two has to travel with the request, in the form of a signed delegation token that the AI gateway can verify and act on.

DeepInspect

This is the architecture DeepInspect was built to provide. DeepInspect sits at the AI request boundary as a stateless proxy between any application and any LLM. The agent runtime authenticates to DeepInspect using a workload identity. The application attaches a signed delegation token that encodes the natural person, the delegated scopes, and the maximum authority. DeepInspect verifies the token, evaluates the policy against the identity context, and forwards the permitted call.

Every decision produces a per-decision audit record containing the workload identity, the natural-person identity, the delegated scopes, the policy version, the data classification, and the decision outcome. The record is signed and tamper-evident. The record is committed before the model response returns to the application. The records satisfy NIST Pillar 3 action lineage and EU AI Act Article 19 identity of natural persons at the same granularity.
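
Tamper evidence can be achieved in several ways. The sketch below uses a simple hash chain, where each entry commits to the hash of the previous one, as one common technique. It is an illustration under that assumption and not a description of how DeepInspect implements its records. The function names and record fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def _entry_hash(prev_hash: str, record: dict) -> str:
    body = json.dumps(record, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_record(log: list, record: dict) -> dict:
    """Append a record whose hash covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {
        "record": record,
        "prev_hash": prev_hash,
        "hash": _entry_hash(prev_hash, record),
    }
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True
```

A chain like this only shows that the log was altered after the fact. Proving who wrote each entry would additionally require a signature over each hash, which is omitted here.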

Frequently asked questions

What is the relationship between NHI and AI agents specifically?

Non-human identity, or NHI, is the broader category. It covers any identity used by a non-human entity to authenticate to an API: service accounts, OAuth client credentials, workload identities, machine certificates, API keys. AI agents are one population inside that category. They differ from traditional service-to-service NHI in three ways: an agent often acts on behalf of a user, the agent's authority profile changes per task, and the agent's actions need per-action attribution for audit and compliance. The static-key primitive that is adequate for low-attribution service-to-service traffic falls short for agents. The 2026 buying-center conversation around NHI tools reflects that gap.

How does NHI for AI agents map to the NIST framework?

NIST's AI agent identity and authorization framework, discussed in detail in the singhspeak post, splits the problem into three pillars. Pillar 1 (identity) is an application-layer concern. The application verifies who is calling and attaches the identity context. Pillar 2 (authorization) is per-request policy evaluation at the AI call layer. Pillar 3 (action lineage) is the per-decision record. NHI for AI agents is the implementation of Pillar 1 in a form that the gateway at Pillars 2 and 3 can act on. Workload identity for the runtime plus signed delegation tokens for the user-on-whose-behalf is the working pattern.

Are static API keys ever acceptable for AI agents?

Static keys are acceptable for a narrow case: an internal service-to-service call where there is no user-on-whose-behalf, the data flowing through is non-sensitive, and the audit obligation is minimal. As soon as any of those three conditions changes, the static key becomes the wrong primitive. A static key for a model-provider call that does carry user data, runs in production, and is in scope of EU AI Act Article 12 leaves the deployer unable to produce the records the regulation requires. The right primitive in that case is a workload identity for the runtime plus a signed token for the natural person.

How does delegation work when the agent calls multiple downstream APIs in a chain?

The delegation token travels with the call. When an agent calls a tool, which calls another tool, which calls the model, the original delegation token (or a derived token with explicitly narrowed scopes) is attached at each hop. The pattern is similar to OAuth's token exchange: the calling agent exchanges its current token for a downstream token that retains the natural-person identity and the delegated scopes appropriate to the next hop. The AI gateway at the boundary sees the final-hop token, verifies the chain, and evaluates the policy. The per-decision record captures the chain so the lineage is reconstructable from the records alone.
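
The narrowing step can be sketched as a pure function over token claims. This is an illustration of the idea, not an implementation of OAuth token exchange: the `DelegationToken` type and `exchange` function are hypothetical, and signing and verification are left out so the scope logic stays visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DelegationToken:
    subject: str       # natural person, preserved across every hop
    scopes: frozenset  # authority available at this hop
    chain: tuple       # ordered hops the delegation has passed through


def exchange(token: DelegationToken, next_hop: str,
             requested_scopes) -> DelegationToken:
    """Derive a downstream token that can only narrow scopes, never widen."""
    narrowed = token.scopes & frozenset(requested_scopes)
    return DelegationToken(
        subject=token.subject,
        scopes=narrowed,
        chain=token.chain + (next_hop,),
    )
```

Because each derived token is the intersection of the parent's scopes and the requested ones, a scope dropped at one hop cannot reappear further down the chain, and the `chain` field gives the audit record the lineage it needs.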

What happens to revocation when the agent's delegation token is compromised?

Short-lived tokens are the first line of defense. A compromised token with a five-minute lifetime has an exploitation window of at most five minutes before it expires. Within that window, the gateway can revoke the token by adding it to a denylist that the policy checks on every call. The pattern is a small revocation list, refreshed every few seconds, that the gateway consults during the policy evaluation. Long-lived workload identities are revoked at the identity provider, which propagates the revocation to all gateways within the next token-refresh cycle. The result is that revocation is per-identity rather than fleet-wide, which is what the NHI properties require.
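
A denylist of this kind stays small because an entry only has to live as long as the token it blocks. The sketch below assumes each token carries an identifier and an expiry time. The `RevocationList` class and `accept` function are hypothetical names, and the periodic refresh across gateways is not modeled.

```python
import time
from typing import Optional


class RevocationList:
    """In-memory denylist keyed by token ID, pruned as tokens expire."""

    def __init__(self) -> None:
        self._revoked = {}  # token_id -> expiry time of the revoked token

    def revoke(self, token_id: str, expires_at: float) -> None:
        self._revoked[token_id] = expires_at

    def is_revoked(self, token_id: str) -> bool:
        return token_id in self._revoked

    def prune(self, now: float) -> None:
        # An expired token is rejected on expiry alone, so its entry can go.
        self._revoked = {
            tid: exp for tid, exp in self._revoked.items() if exp > now
        }

    def __len__(self) -> int:
        return len(self._revoked)


def accept(token_id: str, expires_at: float, denylist: RevocationList,
           now: Optional[float] = None) -> bool:
    """Accept a token only if it is unexpired and not on the denylist."""
    current = time.time() if now is None else now
    if expires_at <= current:
        return False
    return not denylist.is_revoked(token_id)
```

Revoking one token ID leaves every other agent's token untouched, which is the per-identity behavior described above.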