How is LLM05 different from LLM03 (training data poisoning)?

LLM03 is specifically about contamination of training or fine-tuning data. LLM05 covers the broader supply chain, including model weights, serving frameworks, tool definitions, MCP servers, and inference dependencies. A poisoned fine-tune is LLM03. A backdoored model downloaded from a public hub is LLM05.

Does the gateway need patching?

Yes. The gateway is security-critical infrastructure and has the same patching obligation as any other component in that category. The LiteLLM CVE wave in June 2026 is a recent reminder that gateway-class software has its own CVE history.

What is the value of statelessness for supply-chain risk?

A stateless gateway holds no long-lived data and no long-lived provider keys in application memory. A CVE that exposes the gateway's process memory exposes only the in-flight requests, not the credentials or the historical traffic. The blast radius of a gateway-side compromise is materially smaller than for a stateful equivalent.

How does this connect to MCP server security?

MCP servers are part of the supply chain when agent loops invoke them. The gateway can enforce identity-bound policy on MCP invocations: which identities can call which MCP server, with which capability scopes, at which data classifications. The policy layer constrains the blast radius of a compromised MCP server.

Where does this fit in OWASP AISVS?

OWASP AISVS Chapter 4 (supply chain) and Chapter 14 (governance, including supply chain controls) cover the verification requirements for the LLM05 surface. The chapters require documented AIBOM artifacts, signed model and tool provenance, and per-decision logs of which model version served each request.

OWASP LLM05 Supply Chain Vulnerabilities: Mapping the Surface a Gateway Can Cover

What is an AIBOM?

An AI Bill of Materials is the inventory artifact for an AI deployment. It lists the model identifier and version, the model card reference, the fine-tune data sources, the inference dependencies, and the tools the agent loop calls. The AIBOM is the supply-chain record auditors and regulators ask for.

OWASP LLM05 covers supply chain vulnerabilities across the AI stack. The entry catalogs a wider surface than most of the other Top 10 entries: foundation model weights distributed through public hubs, serving frameworks with their own CVE histories, third-party tools the agent loop calls into, vector databases and embedding model dependencies, and the long tail of Python and JavaScript packages the serving infrastructure pulls in. The June 2026 LiteLLM CVE wave (CVE-2026-12773 auth bypass, CVE-2026-42271 RCE added to CISA's KEV catalog) is a recent reminder that the gateway and proxy layer of the AI stack has its own CVE history and needs to be patched on the same cadence as any other security-critical infrastructure.

The defenses split across the supply chain itself, the runtime, and the network boundary. The boundary slice is where a policy gateway operates. The gateway is not the AIBOM. The gateway is not the patch manager. The gateway is the layer that sees every request and response and that can enforce identity-bound policy on the actions the supply chain produces.

I want to walk through the LLM05 surface, sort the controls by which layer enforces each one, show what an identity-aware gateway adds, and call out the surfaces where the gateway is the wrong layer for the primary defense.

The supply chain surface

Five sub-surfaces account for most of the LLM05 risk in production deployments.

Foundation model weights. The enterprise pulls a model from a public hub, a private hub, or a provider API. Hub-distributed weights have been documented to contain backdoors, modified tokenizers, and modified config files. Provider-API weights are opaque; the enterprise has only the provider's attestation that the weights match the model card.

Serving frameworks and proxies. vLLM, TGI, Ollama, LiteLLM (maintained by BerriAI), and the long tail of inference servers each have a CVE history. The CVEs cover auth bypasses (LiteLLM's June 2026 wave), unsafe deserialization, prompt-template injection, model-poisoning APIs that bypass the intended administrative boundary, and the usual class of memory-corruption issues that ship with C-extension-heavy Python stacks.

Tool and plugin definitions. Agent loops call out to tools through structured definitions. The tools wrap third-party APIs, internal services, or shell processes. A poisoned tool definition that promises one behavior and executes another is an LLM05 instance. A tool that has been compromised at the upstream service is also an LLM05 instance.

MCP servers. The Model Context Protocol formalizes the tool surface and ships with its own security considerations: authentication on the MCP server endpoint, capability negotiation, and the trust boundary between the host application and the MCP server. CVE-level issues in MCP server implementations have already surfaced; the surface will grow as MCP adoption scales.

Inference dependencies. The Python or JavaScript packages the serving infrastructure pulls in have the same supply-chain exposure as any other modern software stack. A typosquatted package, a compromised maintainer account, or a malicious update in a transitive dependency can introduce code into the inference path.

The controls by layer

The controls that actually do the work split across three layers.

Supply chain layer. Provenance attestations on model weights (signed model cards, hub-side verification of upload identity), reproducible builds for the serving framework, SBOM and AIBOM tracking for the inference stack, signed and pinned tool definitions, and regular CVE scanning of the serving infrastructure. The supply chain controls are upstream of any runtime control and they are where most of the LLM05 work has to happen.
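As a minimal sketch of what "signed and pinned" can mean at the artifact level, the check below compares a file's SHA-256 digest against a pin list. The pin list, the file name, and the function names are illustrative assumptions; a real deployment would load the pins from a signed AIBOM or lockfile rather than hard-coding them.

```python
import hashlib
import hmac
from pathlib import Path

# Hypothetical pin list: artifact file name -> expected SHA-256 hex digest.
# Hard-coded here for illustration only.
PINNED_DIGESTS = {
    "tool-definitions.json": hashlib.sha256(b'{"tools": []}').hexdigest(),
}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large weight files do not load into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, pins: dict = PINNED_DIGESTS) -> bool:
    """Return True only if the artifact is pinned and its digest matches."""
    expected = pins.get(path.name)
    if expected is None:
        return False  # unpinned artifacts are rejected, not trusted by default
    return hmac.compare_digest(sha256_of(path), expected)
```

Rejecting unpinned artifacts by default is the design choice that matters here: a new tool definition or weight file has to be added to the pin list deliberately before anything loads it.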

Runtime layer. Capability sandboxing on tools (the tool runs in a restricted process with limited filesystem and network access), schema validation on model and tool responses (the serving framework checks that the response shape matches the declared schema), and isolated execution boundaries between tenants. The runtime controls limit the blast radius when a supply-chain control fails.

Network boundary layer. Identity-bound policy on every model call and every tool invocation, per-decision audit logs that capture the identity, the route, the model version, and the policy decision, and detection signals for behavior that diverges from the model card or the tool's declared behavior. This is the layer a policy gateway covers.

Each layer contains the failures that slip past the layers before it. A poisoned model weight that the supply chain failed to catch can still be contained at the runtime layer if the runtime enforces output schema validation. A compromised tool that the runtime failed to catch can still be contained at the network boundary if the gateway enforces identity-bound policy that limits which identities can invoke the tool.
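The runtime-layer schema check mentioned above can be sketched in a few lines. The schema, the field names, and the function name are hypothetical; a production system would more likely use a full schema validator than this strict shape check.

```python
from typing import Any

# Hypothetical declared schema for one tool's response: field name -> type.
WEATHER_TOOL_SCHEMA = {
    "city": str,
    "temperature_c": float,
}


def matches_schema(response: Any, schema: dict) -> bool:
    """Accept a response only if it has exactly the declared fields and types."""
    if not isinstance(response, dict):
        return False
    if set(response) != set(schema):
        return False  # missing or undeclared fields are both rejected
    return all(isinstance(response[name], kind) for name, kind in schema.items())
```

Rejecting undeclared fields is deliberate in this sketch: a compromised tool that starts returning an extra field is diverging from its declared behavior, even if the expected fields are still present.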

What an identity-aware gateway adds

The gateway adds three concrete LLM05 controls that the other layers do not provide.

Per-identity authorization on model and tool invocations. The gateway evaluates whether the calling identity is permitted to invoke this specific model version with this specific tool set at this specific data scope. An attacker who has compromised a tool definition still gets blocked at the gateway if the calling identity is not permitted to invoke the affected tool. The control turns a supply-chain compromise into a per-identity blast-radius problem.
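A rough sketch of that evaluation, assuming a simple in-memory policy table: the identity names, model identifiers, tool names, and numeric data classification levels are all invented for illustration and do not describe any particular gateway's policy format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    models: frozenset      # model versions this identity may call
    tools: frozenset       # tools this identity may invoke
    max_data_class: int    # highest data classification allowed (higher = more sensitive)


# Hypothetical policy table keyed by verified identity.
POLICY = {
    "svc-support-bot": Grant(frozenset({"model-a@2026-05"}), frozenset({"search_kb"}), 1),
    "analyst-jane": Grant(frozenset({"model-a@2026-05"}), frozenset({"search_kb", "run_sql"}), 2),
}


def authorize(identity: str, model: str, tools: list, data_class: int) -> tuple:
    """Return (allowed, reason) for one request; deny unless every check passes."""
    grant = POLICY.get(identity)
    if grant is None:
        return False, "unknown identity"
    if model not in grant.models:
        return False, "model version not permitted"
    denied = set(tools) - grant.tools
    if denied:
        return False, f"tools not permitted: {sorted(denied)}"
    if data_class > grant.max_data_class:
        return False, "data classification exceeds grant"
    return True, "allow"
```

With this table, a compromised `run_sql` tool is reachable only through `analyst-jane`; a request from `svc-support-bot` that names it is denied before it leaves the gateway.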

Per-decision audit independent of the application and the serving framework. Every request and response produces a record with identity, route, provider, model version, tool invocations, and policy decisions. If a CVE in the serving framework is later discovered, the audit log allows the security team to reconstruct which requests went through the vulnerable version and which identities were exposed. The forensic trail is the source data for incident response.
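To make the reconstruction step concrete, here is a small sketch that scans JSON-lines audit records for requests served by a vulnerable framework version. The record layout is an assumption, and the `framework_version` field in particular is hypothetical: it is not in the field list above, and a log without it would need a join against deployment history instead.

```python
import json


def exposed_identities(jsonl_lines, vulnerable_versions):
    """Return the sorted identities whose requests hit a vulnerable version."""
    exposed = set()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        record = json.loads(line)
        # "framework_version" is an assumed field name for this sketch.
        if record.get("framework_version") in vulnerable_versions:
            exposed.add(record["identity"])
    return sorted(exposed)


# Example records with hypothetical identities and version strings.
SAMPLE_LOG = [
    '{"identity": "analyst-jane", "route": "/chat", "model_version": "model-a@2026-05", "framework_version": "1.4.0", "decision": "allow"}',
    '{"identity": "svc-support-bot", "route": "/chat", "model_version": "model-a@2026-05", "framework_version": "1.4.2", "decision": "allow"}',
    '{"identity": "analyst-jane", "route": "/tools", "model_version": "model-a@2026-05", "framework_version": "1.4.2", "decision": "deny"}',
]
```

Calling `exposed_identities(SAMPLE_LOG, {"1.4.0"})` narrows the incident to the single identity whose traffic touched the affected version.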

Centralized credential handling. Provider credentials and tool credentials sit in the gateway's secret store, not in the application or the serving framework. A CVE that exposes the serving framework's process memory does not expose the long-lived provider key, because the key is never resident in the serving framework's memory. The LiteLLM CVE-2026-42271 RCE is the canonical example: a stateless gateway that holds no long-lived provider keys gives the RCE materially less to reach.
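One way to picture the credential split is a gateway that swaps the caller's token for the provider key only at forward time. This is a sketch under stated assumptions: `fetch_provider_key` stands in for whatever secret-store client the gateway uses, and the header handling is simplified.

```python
from typing import Callable, Mapping


def build_upstream_headers(
    caller_headers: Mapping[str, str],
    fetch_provider_key: Callable[[], str],
) -> dict:
    """Drop the caller's credential and attach the provider key for one request."""
    # The caller's own Authorization header never travels upstream.
    headers = {
        name: value
        for name, value in caller_headers.items()
        if name.lower() != "authorization"
    }
    # The key is fetched per request and lives only in this local scope,
    # so neither the application nor the serving framework ever holds it.
    headers["Authorization"] = f"Bearer {fetch_provider_key()}"
    return headers
```

The application authenticates to the gateway with its own identity token, and the provider key is fetched only for the lifetime of a single forwarded request.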

What sits outside the gateway boundary

The model weights, the serving framework's binary, the tool implementations, the MCP server implementations, and the underlying inference dependencies all sit outside the gateway. The gateway sees the requests and responses flowing through it. The gateway does not see the code that produced the responses.

The gateway is the wrong layer for any control that requires inspecting the model's internal state, validating the weights, scanning the serving framework's binaries, or auditing the dependency tree. Those controls belong to the supply chain layer and the runtime layer.

This is the cleanest case in the OWASP series where the defense lives across multiple layers and where no single layer can carry the whole load. The architectural answer is depth: provenance and patching at the supply chain layer, sandboxing at the runtime layer, identity-bound policy at the network boundary layer.

How LLM05 maps to AIBOM and regulatory requirements

The AI Bill of Materials (AIBOM) is the supply-chain inventory artifact for the AI stack. An AIBOM includes the model identifier and version, the model card reference, the fine-tune data sources, the inference dependencies, and the tools the agent loop calls. The AIBOM is the supply-chain record that LLM05 implicitly requires.
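As an illustration of those fields, the sketch below models an AIBOM as a small record with a stable digest that could be signed or compared across releases. The class and field names are my own and do not follow any standard AIBOM schema.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass
class AIBOM:
    model_id: str
    model_version: str
    model_card_ref: str
    fine_tune_data_sources: list = field(default_factory=list)
    inference_dependencies: dict = field(default_factory=dict)  # package -> version
    tools: list = field(default_factory=list)

    def canonical_json(self) -> str:
        """Serialize with sorted keys so equal content yields equal bytes."""
        return json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))

    def digest(self) -> str:
        """SHA-256 over the canonical form; this is the value one would sign."""
        return hashlib.sha256(self.canonical_json().encode("utf-8")).hexdigest()
```

Because the digest covers every field, bumping a single dependency version produces a different digest, which makes drift between the documented and deployed stack easy to detect.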

EU AI Act Article 11 requires technical documentation for high-risk AI systems that includes the data sources, the methodology, the system architecture, and the dependencies. An AIBOM that is current and signed can supply the dependency portion of Article 11 documentation. A per-decision audit log that captures the model version on every request provides the production-side evidence that the documented dependencies were in use.

NIST AI RMF MAP and MEASURE functions both reference the supply chain explicitly. MAP requires documentation of third-party components; MEASURE requires monitoring for emergent behavior from those components. The combination of AIBOM (MAP) and per-decision audit (MEASURE) covers the LLM05 surface from the framework's perspective.

The CISA KEV catalog entry for LiteLLM CVE-2026-42271 (added June 8, 2026) is a forcing function for federal contractors and any organization that uses the catalog as a patch-priority signal. The gateway layer needs to be patched on the same cadence as any other security-critical infrastructure.

DeepInspect

This is the boundary-layer control DeepInspect provides for the LLM05 surface. DeepInspect is a stateless proxy between authenticated users or agents and the LLMs they call. It binds every request to a verified identity, enforces per-identity authorization on model and tool invocations, holds no long-lived provider keys in application memory, and writes a per-decision audit record outside the calling application.

The stateless design means the gateway has no long-lived state for an attacker to extract through a CVE in the gateway itself. The credential handling means a serving-framework CVE elsewhere in the stack does not expose the provider keys. The audit records mean a supply-chain compromise discovered later can be investigated against the per-request history rather than against application logs the compromised system may have suppressed.

The gateway is not the AIBOM. The gateway is not the patch manager for the serving framework. The gateway is the network boundary control that contains the blast radius when an upstream supply-chain control fails.

If you are mapping the OWASP LLM Top 10 controls against your current architecture and your LLM05 coverage depends on trusting every component in your AI stack to be CVE-free, let's talk today.