How does the inspection layer cover prompt injection (LLM01) when the injection sits in a retrieved document?

The inspection layer runs a classifier over the retrieved-content section of the prompt at request time. The classifier matches against a maintained library of injection signatures. A detected signature feeds the policy bundle, which decides between blocking the request, stripping the suspect span, or passing with a logged warning. The application marks the retrieved content with a provenance tag so the policy can apply a stricter rule to content from untrusted corpora. The architecture handles direct and indirect injection through the same control point.
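As a rough illustration of how a provenance tag could change the policy outcome, here is a minimal sketch. The labels (`trusted`, `untrusted`), the `RetrievedChunk` type, and the `decide` function are hypothetical names chosen for this example, not part of any product API, and the rule that anything not explicitly trusted is blocked is an assumption.

```python
from dataclasses import dataclass

# Hypothetical provenance labels; the application is assumed to set them.
TRUSTED, UNTRUSTED = "trusted", "untrusted"


@dataclass(frozen=True)
class RetrievedChunk:
    text: str
    provenance: str  # attached by the application at retrieval time


def decide(chunk: RetrievedChunk, signature_hit: bool) -> str:
    """Return 'pass', 'strip', or 'block' for one retrieved chunk."""
    if not signature_hit:
        return "pass"
    # Assumed stricter rule: anything not explicitly trusted is blocked.
    if chunk.provenance != TRUSTED:
        return "block"
    # Trusted corpus: strip the suspect span and keep the request alive.
    return "strip"
```

Treating an unknown tag the same as `untrusted` keeps the sketch fail-closed when the application forgets to label a chunk.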

How does the inspection layer cover sensitive information disclosure (LLM02) for outbound prompts?

The inspection layer runs a prompt-level classifier over the request text and the attachments before the request reaches the model endpoint. The classifier identifies data classes: PII, PHI, financial identifiers, source code, secrets. The policy fires on the classification: redact the matched span, block the request, or pass with logging depending on the data class and the caller's authorization. The audit record carries the classification signal and the policy decision for every request.

How does the inspection layer handle excessive agency (LLM06) in tool-using agents?

The inspection layer enforces per-tool authorization at the request boundary. The natural-person identity of the caller attaches to every request. The policy evaluates whether the proposed tool call is authorized for the caller. A successful injection that proposes an unauthorized tool call fails the policy at the inspection layer before the application executes the tool. The architecture preserves the agent's autonomy within the caller's authorized scope.

Does the inspection layer replace application-side output sanitization (LLM05)?

The inspection layer adds a control point upstream of application-side sanitization. The response classifier matches against payload signatures (SQL injection, command injection, encoded payloads, suspicious URLs) and blocks responses that match before the calling application receives them. The application's existing output handling sits downstream and catches anything that bypasses the inspection layer. Defense in depth produces the posture the OWASP guidance recommends.

How does the audit record series support the EU AI Act and DORA evidence obligations?

The per-decision audit record carries the timestamp, the natural-person identity of the caller, the model and version, the policy version, the classification signals, the policy decision, and the cryptographic integrity signature. An analyst querying the record series produces the EU AI Act Article 12 traceability artifact, the DORA Article 19 incident notification, and the Fannie Mae LL-2026-04 disclosure-on-demand response. The write-path independence of the inspection layer satisfies the auditor's question about whether the application could have modified the record.

OWASP LLM Top 10: How the 2025 Update Maps to Production AI Security Controls

The OWASP LLM Top 10 has become the working enumeration for LLM application risks. The 2025 update kept the ten-item structure introduced in 2023 and reordered, renamed, and replaced items to reflect what production teams actually saw across two years of incident response. Prompt injection (LLM01) stays at the top of the list. Sensitive information disclosure moved up from LLM06 to LLM02. Supply chain risk (LLM03) now ranks ahead of data and model poisoning (LLM04). Unbounded consumption (LLM10) broadens the earlier model denial-of-service category to capture the cost-amplification attacks that became a recurring incident pattern in 2024 and 2025.

I want to map each LLM Top 10 item to the inspection-layer control that produces a defensible posture, explain the architectural reason model-side and application-side defenses fall short, and show where the audit-record series intersects the EU AI Act Article 12 and DORA Article 19 evidence obligations.

LLM01: Prompt injection

Prompt injection covers any attack where adversarial content in the model's context window changes the model's behavior. Direct injection is the user typing instructions that override the system prompt. Indirect injection is the model reading injected content from a retrieved document, a tool output, or a long-term memory store. The 2025 update keeps direct and indirect injection under the same LLM01 category, which fits the architecture: the control point is the same in both cases.

The control point is the request boundary. An inspection layer that classifies prompt content for injection signatures before the request reaches the model produces a deterministic signal the policy can act on. The signatures cover instructions to disregard prior context, instructions to assume a different persona, instructions to encode responses in exfiltration-friendly formats, and instructions to call tools the caller is not authorized to invoke. The classifier evolves as new patterns surface in incident response and academic research.
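A minimal sketch of signature-based classification could look like the following. The three regular expressions are illustrative stand-ins for a much larger maintained library, and `SIGNATURES` and `classify_injection` are names invented for this example.

```python
import re

# Illustrative signatures only; a real library would be larger and maintained.
SIGNATURES = {
    "disregard_context": re.compile(
        r"\b(ignore|disregard)\b.{0,40}\b(previous|prior|above)\b"
        r".{0,20}\b(instructions?|context)\b",
        re.I,
    ),
    "persona_switch": re.compile(r"\byou are now\b|\bact as\b", re.I),
    "encode_output": re.compile(
        r"\b(base64|hex)[- ]?encode\b.{0,40}\b(response|answer|output)\b", re.I
    ),
}


def classify_injection(text: str) -> list[str]:
    """Return the sorted names of every signature that matches the text."""
    return sorted(name for name, pat in SIGNATURES.items() if pat.search(text))
```

The same input always yields the same list of signature names, which is what lets the policy treat the result as a deterministic signal.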

Model-side defense fails by construction. Models attend to the entire context window. Refusal training produces a probabilistic preference, not a structural separation. The Stanford Trustworthy AI and AIUC-1 Consortium briefing found refusal behaviors degrade under adversarial pressure. The boundary between trusted application instructions and untrusted retrieved content has to be enforced upstream of the model.

LLM02: Sensitive information disclosure

Sensitive information disclosure covers PII, PHI, financial data, source code, and internal secrets reaching the model through the prompt or through a tool result the model surfaces to the user. The 2025 update moved this risk up because the volume of leaks through generative AI prompts grew faster than any other category. IBM's Cost of a Data Breach Report studied 600 breached organizations and found that customer PII exposure jumped to 65% in shadow-AI breaches compared to 53% across all breaches.

Prompt-level classification at the request boundary catches the disclosure before the prompt reaches the model endpoint. The classifier runs over the prompt text and the attachments, identifies data classes (PII, PHI, financial identifiers, source code, secrets), and feeds the policy a deterministic signal. The policy fires on the classification: redact the matched span, block the request, or pass with logging depending on the data class and the caller's authorization. The audit record carries the classification signal regardless of the policy outcome.
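The classify-then-decide flow can be sketched as below. The patterns, the class names, and the `ACTIONS` mapping are assumptions made for illustration, and the caller's authorization is left out to keep the example short.

```python
import re

# Illustrative detectors; production classifiers cover far more data classes.
PATTERNS = {
    "pii_email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "pii_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "secret_aws_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}
# Hypothetical policy: which action each data class triggers.
ACTIONS = {"pii_email": "redact", "pii_ssn": "redact", "secret_aws_key": "block"}


def apply_policy(prompt: str) -> dict:
    """Classify the prompt, then block, redact, or pass it."""
    hits = sorted(name for name, pat in PATTERNS.items() if pat.search(prompt))
    if any(ACTIONS[h] == "block" for h in hits):
        return {"decision": "block", "classes": hits, "prompt": None}
    redacted = prompt
    for h in hits:
        # Replace each matched span with a placeholder naming the data class.
        redacted = PATTERNS[h].sub(f"[{h.upper()}]", redacted)
    decision = "redact" if hits else "pass"
    return {"decision": decision, "classes": hits, "prompt": redacted}
```

The returned `classes` list is the classification signal; in this sketch it is populated whatever the decision, matching the point that the audit record carries the signal regardless of outcome.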

Response inspection covers the case where the model surfaces sensitive content from its training data or from a tool result. The inspection layer runs the same classifier over the streamed response. A detected pattern blocks the response stream before the calling application receives it.
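Inspecting a streamed response raises one practical question: a pattern can straddle two chunks. One way to handle that, sketched here under the assumption of a single fixed-length pattern, is to hold back a short tail of the stream until the next chunk arrives. `inspect_stream`, `ResponseBlocked`, and `HOLDBACK` are hypothetical names.

```python
import re
from typing import Iterable, Iterator

SSN = re.compile(r"\d{3}-\d{2}-\d{4}")
HOLDBACK = 11  # length of the longest pattern we scan for


class ResponseBlocked(Exception):
    """Raised when the stream must be cut off."""


def inspect_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Yield response text, holding back a tail that could start a match."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        if SSN.search(buffer):
            raise ResponseBlocked("sensitive pattern in response")
        if len(buffer) > HOLDBACK:
            # Release everything except the tail that may hold a partial match.
            yield buffer[:-HOLDBACK]
            buffer = buffer[-HOLDBACK:]
    yield buffer  # stream ended cleanly; flush the held-back tail
```

Text released before a match is found cannot be recalled, so the holdback length trades latency against how much of a sensitive span could escape; this sketch only guarantees that the matched span itself is never forwarded.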

LLM03: Supply chain

Supply chain risk covers the model providers, the model weights, the tokenizers, the embedding models, the vector databases, the agent frameworks, and the third-party tools the application loads at runtime. A compromise at any layer propagates through every downstream application. The 2025 update elevated this risk because the surface grew dramatically with the proliferation of open-source model hubs and the broad adoption of agent frameworks.

The inspection layer attaches identity context to every outbound model call. A request to an unauthorized model provider, an unauthorized model variant, or an unauthorized endpoint fails the policy at the request boundary. The constraint scales with the policy bundle: a new approved model gets added to the bundle, a deprecated model gets removed, an endpoint that fails security review gets blocked. The application code does not change.
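The allowlist idea can be sketched in a few lines. The bundle layout, the host names, and the model names below are invented for illustration; the point is only that approval lives in data the inspection layer reads, not in application code.

```python
from urllib.parse import urlparse

# Hypothetical policy bundle: approved (host, model) pairs.
POLICY_BUNDLE = {
    "approved": {
        ("api.provider.example", "model-a"),
        ("llm.internal.example", "model-b"),
    }
}


def authorize_model_call(url: str, model: str, bundle: dict = POLICY_BUNDLE) -> bool:
    """Allow the call only if the endpoint host and model are both approved."""
    host = urlparse(url).hostname
    return (host, model) in bundle["approved"]
```

Approving a new model is then a change to the bundle, for example adding `("api.provider.example", "model-c")` to the `approved` set, with no change to the calling application.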

The audit record carries the model identifier, the endpoint, the request fingerprint, and the response fingerprint. An analyst querying the record series produces the inventory the EU AI Act Article 26 and the NIST AI RMF Govern function expect.

LLM04: Data and model poisoning

Data poisoning targets training data, fine-tuning data, and the embeddings that power RAG retrieval. The 2025 update broadened the earlier training-data-poisoning category to cover model poisoning as well. The operational defense at the inspection layer is the same for both: control the data that reaches the training pipeline, and inspect the embeddings that reach the retrieval corpus.

The inspection layer covers the retrieval side. A classifier over the documents the application ingests into the corpus catches injected content at ingestion time. A second classifier over the documents the retrieval surfaces at query time catches anything the ingestion classifier missed. The combination produces the defense-in-depth posture the OWASP guidance recommends.
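The two checkpoints can be sketched with a toy in-memory corpus. The `Corpus` class, the single regular expression, and the quarantine store are assumptions for illustration only.

```python
import re

# One illustrative signature; a real classifier would be far broader.
INJECTION = re.compile(r"ignore (all )?(previous|prior) instructions", re.I)


def is_suspect(text: str) -> bool:
    return bool(INJECTION.search(text))


class Corpus:
    """Toy retrieval corpus with a check at ingestion and at query time."""

    def __init__(self) -> None:
        self.docs: dict[str, str] = {}
        self.quarantine: dict[str, str] = {}

    def ingest(self, doc_id: str, text: str) -> None:
        # First checkpoint: suspect documents never enter the live corpus.
        target = self.quarantine if is_suspect(text) else self.docs
        target[doc_id] = text

    def retrieve(self, doc_ids: list[str]) -> list[str]:
        # Second checkpoint: re-check whatever is about to reach the model.
        return [
            self.docs[d]
            for d in doc_ids
            if d in self.docs and not is_suspect(self.docs[d])
        ]
```

The query-time check matters for documents that entered the corpus before a signature existed or through a path that skipped `ingest`.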

The training side is partially out of scope for the runtime inspection layer. The control covers the prompts and the tool results that reach the model, the retrieval corpus, and the model selection. Training-pipeline controls (data lineage, contributor identity, dataset signing) sit in the MLOps layer.

LLM05: Improper output handling

Improper output handling covers application code that treats model output as trusted. The application passes the output to a SQL query, an OS command, an HTML render, or another downstream system without validation. The 2025 update kept this category because it remains a high-frequency incident pattern: prompt injection causes the model to emit a payload, and the application executes the payload because the developer treated the model output as untainted.

The inspection layer's response classifier catches the dangerous payload patterns: SQL injection signatures, command injection signatures, encoded payloads, suspicious URLs. The classifier produces a signal the policy can act on before the calling application receives the response. The architecture does not replace application-side output sanitization. The architecture adds a control point upstream of the application's sanitization layer that produces an audit record for every detected pattern.
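A response-side classifier can follow the same shape as the prompt-side one. The sketch below uses two deliberately narrow, illustrative patterns; `PAYLOAD_SIGNATURES` and `inspect_response` are hypothetical names, and real payload detection needs far broader coverage.

```python
import re

# Narrow illustrative patterns, not a complete payload library.
PAYLOAD_SIGNATURES = {
    "sql_injection": re.compile(
        r"\bunion\s+select\b|;\s*drop\s+table\b|\bor\s+1\s*=\s*1\b", re.I
    ),
    "command_injection": re.compile(r"\brm\s+-rf\b|\|\s*(sh|bash)\b", re.I),
}


def inspect_response(text: str) -> dict:
    """Return the policy decision and the signals that drove it."""
    signals = sorted(
        name for name, pat in PAYLOAD_SIGNATURES.items() if pat.search(text)
    )
    return {"decision": "block" if signals else "pass", "signals": signals}
```

The `signals` list is what would be written to the audit record for each detected pattern; the application's own parameterized queries and output encoding still apply downstream.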

LLM06: Excessive agency

Excessive agency covers agents that have more tools, more permissions, or more autonomy than the use case requires. The 2025 update kept the category and added emphasis on tool-use frameworks. An agent with a delete_records tool, a send_email tool, and a transfer_funds tool in the same bundle has the surface for an injection to escalate from a benign user query to a destructive action.

The inspection layer enforces per-tool authorization at the request boundary. The natural-person identity of the caller attaches to the request, the policy evaluates whether the proposed tool call is authorized for the caller, and the inspection layer blocks unauthorized tool calls before the application executes them. The architecture preserves the agent's autonomy while confining it to the caller's authorized scope.
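Per-tool authorization reduces to a lookup keyed on the caller's identity. In this sketch the grant table, the caller addresses, and `filter_tool_calls` are assumptions; the tool names come from the example above.

```python
# Hypothetical grants: which tools each natural-person caller may invoke.
TOOL_GRANTS = {
    "alice@example.com": {"send_email"},
    "bob@example.com": {"send_email", "delete_records"},
}


def filter_tool_calls(caller: str, proposed: list[dict]) -> tuple[list, list]:
    """Split the agent's proposed tool calls into allowed and denied."""
    allowed, denied = [], []
    granted_tools = TOOL_GRANTS.get(caller, set())  # unknown caller gets nothing
    for call in proposed:
        (allowed if call["tool"] in granted_tools else denied).append(call)
    return allowed, denied
```

An injection that convinces the agent to propose `transfer_funds` on Alice's behalf ends up in `denied`, because the decision depends on Alice's grants and not on what the model was talked into.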

LLM07: System prompt leakage

System prompt leakage covers attacks where the model emits its system prompt or other confidential context. The 2025 update added this category because system prompts often contain proprietary business logic, customer-specific configuration, or hardcoded credentials the developer thought the model would never reveal.

The defense is twofold. First, the application moves secrets out of the system prompt and into the inspection layer's policy bundle, where the model never sees them. Second, the inspection layer's response classifier matches against known system-prompt-leakage signatures and blocks responses that match. The combination keeps the secrets from reaching the model and catches the cases where the model emits a different sensitive fragment.
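The text above describes signature matching for leakage. As one simple stand-in for such a signature, the sketch below flags a response that shares any run of five consecutive words with the system prompt. This word n-gram overlap check is a swapped-in illustrative technique, not the product's classifier, and the function names are hypothetical.

```python
def word_ngrams(text: str, n: int = 5) -> set:
    """Return the set of lowercase n-word sequences in the text."""
    words = text.lower().split()
    return {tuple(words[i : i + n]) for i in range(len(words) - n + 1)}


def leaks_system_prompt(system_prompt: str, response: str, n: int = 5) -> bool:
    """Flag a response that repeats any n consecutive words of the prompt."""
    return not word_ngrams(system_prompt, n).isdisjoint(word_ngrams(response, n))
```

A verbatim check like this misses paraphrased or translated leaks, which is one more reason to keep secrets out of the system prompt in the first place.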

LLM08: Vector and embedding weaknesses

Vector and embedding weaknesses cover attacks against the vector database, the embedding model, and the retrieval pipeline. Embedding-inversion attacks recover the original text from stored embeddings. Cross-tenant data leakage happens when a multi-tenant vector store fails to isolate tenants. Adversarial inputs produce embeddings that map to attacker-chosen documents.

The inspection layer covers the retrieval side. Per-request identity context flows into the retrieval call. The policy evaluates whether the caller is authorized for the requested namespace, the requested document set, and the requested similarity threshold. Cross-tenant retrieval that lacks the namespace context fails the policy. The audit record captures the retrieval source and the document identifiers the model received.
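A retrieval authorization check along these lines might look like the sketch below. The grant table, the similarity floor of 0.5, and the `RetrievalRequest` fields are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical grants and floor; real values would come from the policy bundle.
NAMESPACE_GRANTS = {"alice@tenant-a.example": {"tenant-a"}}
MIN_SIMILARITY_FLOOR = 0.5


@dataclass(frozen=True)
class RetrievalRequest:
    caller: str
    namespace: Optional[str]
    min_similarity: float


def authorize_retrieval(req: RetrievalRequest) -> bool:
    """Fail closed on missing namespace, foreign tenant, or loose threshold."""
    if req.namespace is None:
        return False  # retrieval without namespace context fails the policy
    if req.namespace not in NAMESPACE_GRANTS.get(req.caller, set()):
        return False  # cross-tenant or unknown caller
    return req.min_similarity >= MIN_SIMILARITY_FLOOR
```

Rejecting very low similarity thresholds limits how far an adversarial query can reach across the corpus, which is the reason the threshold is part of the authorization decision here.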

LLM09: Misinformation

Misinformation covers hallucinated content the user trusts because the application surfaced it. The 2025 update reframed the earlier Overreliance category as Misinformation because hallucination has become the dominant source of downstream incidents in customer-support and clinical decision-support deployments.

The inspection layer's contribution here is narrower than for the injection and disclosure categories. The layer captures the per-decision audit record (the prompt, the retrieval context, the model and version, the response, the policy state). An analyst investigating a hallucination-driven incident reconstructs the decision from the record. The record supports the post-incident analysis and the regulator notification under the EU AI Act Article 73 incident reporting regime.

LLM10: Unbounded consumption

Unbounded consumption covers cost-amplification and denial-of-service attacks against the application's LLM bill. An attacker fires a high-token prompt in a loop, or causes the agent to enter a tool-call cycle that consumes the rate limit and the budget.

The inspection layer enforces per-caller rate limits, per-route token budgets, and per-session loop detection at the request boundary. The control fails closed under load: a caller that exceeds the budget hits a 429 from the inspection layer rather than from the model provider. The audit record captures the budget state and the rate-limit decisions across the series.
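Budget enforcement and loop detection can be sketched with simple counters. The `BudgetGate` class and its thresholds are hypothetical, time windows are omitted for brevity, and the loop rule (the same tool with the same arguments repeated too often in one session) is one assumed heuristic among several possible ones.

```python
from collections import Counter, defaultdict


class BudgetGate:
    """Toy per-caller token budget and per-session loop detector."""

    def __init__(self, token_budget: int, max_repeats: int) -> None:
        self.token_budget = token_budget
        self.max_repeats = max_repeats
        self.spent = defaultdict(int)
        self.tool_calls = defaultdict(Counter)

    def check_tokens(self, caller: str, tokens: int) -> int:
        """Return 200 if the request fits the budget, else 429."""
        if self.spent[caller] + tokens > self.token_budget:
            return 429  # rejected here, before the model provider is called
        self.spent[caller] += tokens
        return 200

    def check_tool_call(self, session: str, tool: str, args: dict) -> int:
        """Return 429 once an identical tool call repeats too often."""
        key = (tool, repr(sorted(args.items())))
        self.tool_calls[session][key] += 1
        return 429 if self.tool_calls[session][key] > self.max_repeats else 200
```

Because a rejected request does not add to `spent`, a caller who is over budget for a large prompt can still send a smaller one that fits.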

How the audit record series intersects compliance evidence

The per-decision audit record series the inspection layer commits covers the evidence the EU AI Act Article 12 expects: timestamps, identity context, decision outcomes, model and version, policy state. The same record series covers the DORA Article 19 incident reporting evidence for the financial-services scope. The same series covers the Fannie Mae LL-2026-04 disclosure-on-demand obligation for mortgage lenders deploying AI. The OWASP LLM Top 10 risks are categories of operational threats. The audit record series is the evidence that the threats were detected, the policy acted, and the outcomes were preserved.

The architecture produces the categories of evidence the regulator and the customer auditor will accept. A separately-running inspection layer with cryptographically signed records satisfies the write-path independence test. An application-controlled log that the application can modify fails the test.
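To make the signed-record idea concrete, here is a minimal sketch using an HMAC over a canonical JSON encoding. The record fields mirror the ones listed in this article; the choice of HMAC-SHA256 and the function names are assumptions for illustration, and a deployment could equally use asymmetric signatures so that verifiers never hold the signing key.

```python
import hashlib
import hmac
import json


def _canonical(record: dict) -> bytes:
    # Sorted keys and fixed separators give a stable byte encoding.
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode()


def sign_record(record: dict, key: bytes) -> dict:
    """Return a copy of the record with an HMAC-SHA256 signature attached."""
    sig = hmac.new(key, _canonical(record), hashlib.sha256).hexdigest()
    return {**record, "signature": sig}


def verify_record(signed: dict, key: bytes) -> bool:
    """Recompute the signature over every field except the signature itself."""
    record = {k: v for k, v in signed.items() if k != "signature"}
    expected = hmac.new(key, _canonical(record), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed.get("signature", ""))
```

The independence argument rests on the key: if only the inspection layer holds it, the application cannot alter a record and produce a signature that still verifies.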

DeepInspect

This is the gap DeepInspect closes for the OWASP LLM Top 10 surface. DeepInspect sits inline between the calling application and any HTTP LLM endpoint. For every request, the inspection layer runs the prompt classifier (LLM01, LLM02), evaluates the per-tool authorization (LLM06), runs the model and endpoint policy (LLM03), runs the retrieval classifier (LLM04, LLM08), runs the response classifier (LLM05, LLM07), and enforces the budget and rate limits (LLM10). The per-decision audit record carries the natural-person identity, the policy state, the classification signals, and the outcomes in a format that auditors working to the EU AI Act Article 12, DORA Article 19, Fannie Mae LL-2026-04, and the NIST AI RMF accept.

The architecture covers the OpenAI, Anthropic, Vertex, and Bedrock endpoints, the agent frameworks built on top, and the retrieval pipelines the agents consume. The deployment integrates as a single HTTP hop. The application code keeps the same SDK calls, the inspection layer attaches the identity and the policy state at the request boundary, and the per-decision audit record series feeds the compliance program.

If you are running an LLM in production and the security review or the regulator is asking how you address the OWASP LLM Top 10, let's talk.