AI DLP: Why Traditional Data Loss Prevention Misses the LLM Request Path and What Replaces It
Traditional DLP sits at the network edge or endpoint and inspects files and email. AI DLP has to sit at the HTTP request layer between authenticated users or agents and the LLM endpoint, because the prompt is the data and the prompt is inside an encrypted POST body the network DLP never sees. This piece walks through where each DLP layer terminates inspection, the regulatory framing under EU AI Act Article 12 and HIPAA, and the inspection architecture that produces a defensible record.

Traditional DLP sits at three places in the enterprise stack: at the network egress, at the endpoint, and at the email gateway. Each placement inspects bytes the user sends out of the corporate boundary. None of them inspect the HTTP POST body the browser or application sends to an LLM endpoint over TLS, because the inspection terminates underneath the encryption and the LLM endpoint is the legitimate destination. The result is that the prompt, which is the data, leaves the organization through a channel the DLP product treats as a normal API call. The IBM Cost of Data Breach report studied 600 breached organizations and found that one in five experienced breaches linked to shadow AI, with PII exposure jumping to 65% versus 53% across all breaches. The DLP did not stop those exposures because the DLP did not see the prompt.
I want to walk through where each traditional DLP layer terminates inspection, why the LLM request path slips through every one of them, where the inspection has to sit to produce a defensible record under the EU AI Act and HIPAA, and the architectural pattern that closes the gap.
Where traditional DLP terminates inspection
A network DLP terminates inspection at the corporate egress. It can read cleartext HTTP, classify outbound files in unencrypted protocols, and block known-bad destinations. When the user's browser opens a TLS connection to api.openai.com or claude.ai, the DLP can see the connection metadata (SNI, IP, certificate fingerprint) but not the body. Without an MITM appliance in the path, the prompt content is invisible. With an MITM appliance, the cost in operational complexity is high enough that most enterprises run it only against general web traffic, not specific SaaS APIs.
An endpoint DLP runs as an agent on the user's device. It can see the clipboard, the file system, and certain browser extensions' DOM events. It cannot see the network body of a request the browser composes against an LLM API once the request enters the browser's TLS stack. If the user types directly into a chat web app, the endpoint agent sees keystrokes but not the assembled POST body, and the classification windows for the keystroke stream are narrow.
An email DLP inspects outbound messages at the mail gateway. The LLM request path does not pass through the mail gateway.
Why the LLM request path slips through every layer
The mechanical reason is that the prompt is the data, and the prompt sits inside an encrypted POST body the user's stack composes for a legitimate destination. The data is not in a file the user is uploading to a personal account. The data is not in an email going to an external recipient. The data is in a JSON field inside the body of a normal API call to a SaaS LLM endpoint. Every traditional DLP layer was designed against a different threat model: file exfiltration, email exfiltration, or web upload. The LLM request fits none of those patterns.
A second mechanical reason is that the user identity does not travel in the request. The application calls the LLM API with a service credential or API key. The natural person behind the request is identified inside the application session, not at the model API boundary. Even if a DLP could decrypt the body, it could not bind the prompt to the user without external identity context. That binding is what Article 12 and HIPAA both require on the record.
Where AI DLP has to sit
The inspection point that can see the prompt and bind it to identity is the HTTP request layer between the authenticated user or agent and the LLM endpoint. The layer terminates the TLS connection from the caller, authenticates the caller against the corporate identity provider, runs the classifier against the prompt content, evaluates policy against the bound identity and the classification, and forwards or rejects the request before the model receives it.
This is the only placement where the same surface holds both the prompt and the verified identity at the same moment. The placement is what makes the per-decision record possible. Without it, the record either omits identity (because the prompt was inspected at the network) or omits the prompt (because identity was inspected at the application). Article 19 of the EU AI Act lists identification of natural persons involved as a required log field. The architecture has to support that field at the inspection moment, which means the inspection has to sit at the HTTP boundary between authenticated caller and LLM.
Regulatory framing under EU AI Act and HIPAA
EU AI Act Article 12 requires automatic recording of events over the lifetime of the system sufficient to ensure traceability. Article 19 specifies what the record carries, including identification of natural persons involved. Article 99 sets penalties at €15 million or 3% of global annual turnover for high-risk non-compliance. The August 2, 2026 deadline applies to high-risk AI systems including credit scoring, employment screening, education access, biometric identification, and several others enumerated in Annex III.
HIPAA Security Rule 45 CFR 164.312(b) expects audit controls on systems that process PHI. A SaaS LLM is processing PHI the moment a clinician pastes a SOAP note into a chat window. Cloud Radix found that 57% of healthcare professionals use unauthorized AI to process PHI without a Business Associate Agreement. The DLP does not see the paste because the paste went into a TLS-encrypted POST body to a SaaS endpoint the network treats as a normal API call.
What real AI DLP architecture requires
The architecture sits inline on the HTTP path between authenticated users or agents and the LLM. It carries four operating properties: it terminates TLS at the inspection layer, it authenticates the caller against the corporate IdP at request time, it runs a content classifier against the prompt with deterministic categories the policy can match against, and it commits a tamper-evident audit record before the model response returns to the caller.
The classifier categories include PII (names, addresses, government IDs, payment card numbers), PHI (clinical narratives, diagnosis codes, treatment plans), source code with secret patterns, customer data fields bound to specific data tenants, and free-form sensitive categories defined per organization. The policy matches on the classification plus the identity context (role, group membership, data tenant) and returns a deterministic decision: permit, redact, or block. The record carries identity, classification, policy version, decision, timestamp, and an integrity signature.
DeepInspect
DeepInspect is the inspection-layer shape of AI DLP. It sits inline between authenticated users or agents and any LLM, terminates TLS at the proxy, authenticates against the corporate IdP, runs deterministic classification against the prompt content, evaluates policy against identity and classification, and commits a per-decision audit record before the model response returns to the application. The records carry the fields the EU AI Act Article 12 and Article 19 expect on the same series the HIPAA audit control rule references.
For organizations preparing for the August 2, 2026 high-risk system deadline, the question is whether the AI DLP layer can produce identity-bound records of every prompt that touched a high-risk decision. If the current DLP stack terminates at the network, the endpoint, or the email gateway, the LLM request path is unmanaged at the granularity the regulation requires.
Book a demo today.
Frequently asked questions
- Does AI DLP replace traditional DLP?
The two solve adjacent problems and both stay in the stack. Traditional DLP keeps inspecting files, email, and general web traffic. AI DLP adds inspection at the HTTP request boundary to LLM endpoints, which is the surface traditional DLP was never designed to cover. Most enterprises run the two layers side by side, with the AI DLP terminating LLM-specific traffic and the traditional DLP handling the rest.
- Why can't a CASB cover the AI DLP problem?
A CASB sits at the network edge or reverse-proxies sanctioned SaaS traffic. It can enforce session policies against known SaaS catalogs and sometimes inspect API calls to specific apps. The classification windows for prompts inside LLM API bodies are narrower than for documents in storage APIs, and the CASB's policy surface was designed for OAuth-mediated access patterns, not for per-request identity-bound prompt inspection. CASBs help with SSO enforcement against AI SaaS apps. They do not produce the per-request record at the granularity Article 12 expects.
- How does AI DLP handle agentic AI traffic?
Agentic AI traffic is HTTP requests originated by an autonomous agent running under a service identity that the agent inherits from its instantiation. The inspection layer authenticates the agent's identity the same way it authenticates a human user, classifies the prompt content the agent assembled, and evaluates policy against the agent's role and the data classification. The record series carries the agent identity, the action lineage, and the policy state. This is the integration point for the NIST AI agent identity and authorization framework, which is covered in the NIST piece.
- Does the inspection layer slow the LLM round-trip down?
End-to-end inspection overhead measures under 50 ms in DeepInspect's internal testing. LLM inference itself takes 500 ms to 5 seconds depending on the model and the request size. The inspection-layer overhead is inside the variance of the inference round-trip, which means the user-perceived latency does not move when AI DLP is added inline.
- What happens when the user uses a personal account from a tethered phone?
When the user routes around the corporate network entirely, the inspection layer at the HTTP boundary cannot see the prompt. The control then moves to two adjacent layers: the corporate IdP (which can enforce SSO-only access to sanctioned AI apps) and the acceptable-use policy with monitored attestation (which makes the behavior a labor matter, not a technical one). The architecture for handling personal-device shadow AI is covered in the shadow AI monitoring piece.