OWASP LLM01 (Prompt Injection)
OWASP LLM01 is the first entry in the OWASP Top 10 for Large Language Model Applications, covering prompt injection. The OWASP foundation published the LLM Top 10 in 2023 and updated it in 2025 (current version 2025). LLM01 covers both direct prompt injection (the attacker types instructions into the prompt the model receives) and indirect prompt injection (the attacker plants instructions in a data source the model later reads through retrieval, web browsing, or tool output). The entry is the OWASP community's consolidated threat model for the input channel a deployer cannot trust by default.
What OWASP LLM01 actually says about mitigation
The 2025 entry lists prevention practices that organize into three categories. The first category is input handling: privilege separation between the application's instructions and the user's input, sandboxed retrieval, content filtering on retrieved data. The second category is output handling: structured output validation, content classification on the model's response, action authorization before any tool call. The third category is monitoring and incident response: per-decision audit, prompt anomaly detection, replayable evidence the security team uses post-incident.
Why OWASP LLM01 cannot be solved at the model layer alone
The model is the asset being attacked, not the control point. A model trained to refuse injection still gets attacked at the input boundary, and the success rate of a sufficiently creative attacker against pure model-side defenses sits well above zero across every public evaluation. The mitigation that holds up under audit places the controls outside the model: at the request boundary where identity and classification get evaluated, on the retrieval path where content provenance gets checked, and on the output path where the tool authorization decision gets made before any side effect lands.
Related reading
- OWASP LLM Top 10: How the 2025 Update Maps to Production AI Security Controls
The OWASP LLM Top 10 enumerates the application-security risks that show up when an LLM is wired into a production application. The 2025 update reorganized the list to reflect what production teams actually see: prompt injection at the top, sensitive information disclosure and supply chain risk close behind, and a new category for unbounded resource consumption. This piece walks each risk to the inspection layer control that produces a defensible posture, the gap each risk exposes in standard application-side defenses, and where the audit record series intersects EU AI Act Article 12 and DORA Article 19 evidence obligations.
- OWASP LLM01 Prompt Injection: The 2025 Update and What the Inspection Layer Enforces
OWASP LLM01 captures both direct and indirect prompt injection in a single category in the 2025 update. The architectural reason is that the control point is the same: the request boundary. Application-side defenses fail by construction because the application cannot tell which spans of the prompt the model treats as instructions. Model-side defenses fail because refusal training is probabilistic. This piece walks through the LLM01 attack surface, the inspection-layer controls that produce a defensible posture, the audit record that survives review under EU AI Act Article 12 and DORA Article 19, and the deployment pattern that fits a production AI stack.
- Prompt Injection in Production: Where It Happens, What It Costs, and How To Prevent It at the Request Boundary
Prompt injection is the class of attacks where adversarial content in a prompt overrides the application instructions or extracts data the model was not authorized to reveal. The attack surface includes direct user prompts, indirect injection through retrieved documents and tool results, and chained injection through agent loops. OWASP has consistently ranked prompt injection as the top LLM vulnerability. This piece walks through the attack mechanisms in production, the failure modes of model-side defenses, the request-boundary controls that produce a defensible posture, and the audit record format that holds up after an attempt is detected.