LLM DLP: The Inspection Point Where Prompt Content Becomes Sensitive Data
LLM DLP is the inspection layer that catches PII, PHI, source code, and customer data inside the prompt body before it reaches an LLM endpoint. Network DLP, endpoint DLP, and email DLP each terminate inspection before the prompt is in scope. This piece walks through where each traditional layer stops, why the LLM request path slips through, the regulatory framing under EU AI Act Article 12 and HIPAA, and the architectural placement that produces a defensible per-request record.

The IBM Cost of Data Breach Report studied 600 breached organizations and found that one in five experienced breaches linked to shadow AI, with customer PII exposure jumping to 65% in shadow AI breaches versus 53% across all breaches. Cloud Radix found that 77% of employees using unauthorized AI admit to pasting sensitive business data into unsanctioned models. The DLP product did not stop the exposure because the DLP terminated inspection before the prompt was in scope. The prompt sat inside a TLS-encrypted POST body to a legitimate SaaS endpoint, and the network DLP, the endpoint DLP, and the email DLP each looked at a different surface.
I want to walk through where each traditional DLP layer terminates inspection, why the LLM request path slips through every one of them, and where LLM DLP has to sit to produce a per-request record an auditor can sample under EU AI Act Article 12 or HIPAA Security Rule 45 CFR 164.312(b).
Where network DLP terminates
A network DLP sits at the corporate egress. It can read cleartext HTTP, classify outbound files in unencrypted protocols, and block known-bad destinations. When the user's browser opens a TLS connection to an LLM endpoint, the network DLP can see the connection metadata (SNI, IP, certificate fingerprint) but not the body. The destination is a legitimate SaaS endpoint, so the connection metadata alone does not raise the DLP's policy. Without an MITM appliance in the path, the prompt content is invisible. With an MITM appliance, the operational cost is high enough that most enterprises run it against general web traffic rather than specific SaaS APIs.
The mechanical limit is the TLS termination point. The network DLP cannot inspect the body because the body is encrypted to the endpoint, and the endpoint is the LLM provider's API server.
Where endpoint DLP terminates
An endpoint DLP runs as an agent on the user's device. It can see the clipboard, the file system, and certain browser extensions' DOM events. It cannot see the network body of a request the browser composes against an LLM API once the request enters the browser's TLS stack. If the user types directly into a chat web app, the endpoint agent sees keystrokes but not the assembled POST body, and the classification windows for the keystroke stream are narrow.
The mechanical limit is the application's TLS context. The endpoint agent cannot read inside the application's encrypted session to the LLM provider without instrumenting the browser at a level most enterprises do not deploy.
Where email DLP terminates
An email DLP inspects outbound messages at the mail gateway. The LLM request path does not pass through the mail gateway. The two surfaces do not intersect.
Why the LLM request path slips through
The mechanical reason is that the prompt is the data, and the prompt sits inside an encrypted POST body the application stack composes for a legitimate destination. The data is not in a file the user is uploading to a personal account. The data is not in an email going to an external recipient. The data is in a JSON field inside the body of a normal API call to a SaaS LLM endpoint.
A second mechanical reason is that the user identity does not travel in the request. The application calls the LLM API with a service credential or API key. The natural person behind the request is identified inside the application session, not at the model API boundary. Even if a DLP could decrypt the body, it could not bind the prompt to the user without external identity context. The identity binding is the field that Article 19 of the EU AI Act expects on the record.
Where LLM DLP has to sit
The inspection point that can see the prompt and bind it to identity is the HTTP request layer between the authenticated user or agent and the LLM endpoint. The layer terminates the TLS connection from the caller, authenticates the caller against the corporate identity provider, runs the classifier against the prompt content, evaluates policy against the bound identity and the classification, and forwards or rejects the request before the model receives it.
This is the only placement where the same surface holds both the prompt and the verified identity at the same moment. The placement is what makes the per-decision record possible. Without it, the record either omits identity (because the prompt was inspected at the network) or omits the prompt (because identity was inspected at the application).
What real LLM DLP architecture requires
The architecture carries four operating properties. It terminates TLS at the inspection layer. It authenticates the caller against the corporate IdP at request time. It runs a content classifier against the prompt with deterministic categories the policy can match against. It commits a tamper-evident audit record before the model response returns to the caller.
The classifier categories include PII (names, addresses, government IDs, payment card numbers), PHI (clinical narratives, diagnosis codes, treatment plans), source code with secret patterns, customer data fields bound to specific data tenants, and free-form sensitive categories defined per organization. The policy matches on the classification plus the identity context (role, group membership, data tenant) and returns a deterministic decision: permit, redact, or block. The record carries identity, classification, policy version, decision, timestamp, and an integrity signature on a tamper-evident series.
Regulatory framing
EU AI Act Article 12 requires automatic recording of events over the lifetime of the system sufficient to ensure traceability. Article 19 specifies that records identify natural persons involved. Article 99 sets penalties at €15 million or 3% of global annual turnover for high-risk non-compliance. The August 2, 2026 deadline applies to high-risk AI systems including credit scoring, employment screening, education access, and biometric identification, all of which can touch prompt content with sensitive data.
HIPAA Security Rule 45 CFR 164.312(b) expects audit controls on systems that process PHI. A SaaS LLM is processing PHI the moment a clinician pastes a SOAP note into a chat window. Cloud Radix found that 57% of healthcare professionals use unauthorized AI to process PHI without a Business Associate Agreement. The audit control persists regardless of whether the PHI processing happened on a sanctioned platform or a shadow one.
DeepInspect
DeepInspect is the inspection-layer shape of LLM DLP. The proxy sits inline between authenticated users or agents and any LLM, terminates TLS at the inspection layer, authenticates against the corporate IdP, runs deterministic classification against the prompt content, evaluates policy against identity and classification, and commits a per-decision audit record before the model response returns. The records carry the fields the EU AI Act Article 12 and Article 19 expect on the same series the HIPAA audit control rule references.
For organizations preparing for the August 2, 2026 high-risk system deadline, the question is whether the LLM DLP layer can produce identity-bound records of every prompt that touched a high-risk decision. If the current DLP stack terminates at the network, the endpoint, or the email gateway, the LLM request path is unmanaged at the granularity the regulation expects.
Book a demo today.
Frequently asked questions
- Does LLM DLP replace traditional DLP?
The two surfaces solve adjacent problems and both stay in the stack. Traditional DLP keeps inspecting files, email, and general web traffic. LLM DLP adds inspection at the HTTP request boundary to LLM endpoints, which is the surface traditional DLP was never designed to cover. Most enterprises run the two layers side by side.
- How does LLM DLP differ from a content classifier alone?
A content classifier inspects the prompt and returns a category label. LLM DLP combines the classifier with identity binding, policy evaluation, and the per-decision record series. The classifier is one component of LLM DLP, not a replacement for it.
- Can LLM DLP cover personal-account use from a tethered phone?
When the user routes around the corporate network entirely, the inspection layer at the HTTP boundary cannot see the prompt. The control moves to the corporate IdP (which can enforce SSO-only access to sanctioned AI apps) and the acceptable-use policy with monitored attestation. The architecture for personal-device shadow AI sits adjacent to the LLM DLP placement.
- How does the latency profile compare to inline TLS termination at any proxy?
End-to-end inspection overhead measures under 50 ms in DeepInspect's internal testing. The TLS termination at the inspection point adds standard handshake cost. LLM inference itself takes 500 ms to 5 seconds, so the inspection overhead sits inside the variance of the round-trip.
- What is the audit record format LLM DLP produces?
The record series carries identity, classification, policy version, decision, timestamp, and an integrity signature in a structured format (typically JSON Lines with a signed schema). The records flow into SIEM, data lake, or compliance analytics platforms via standard log forwarding patterns. The schema aligns with the fields EU AI Act Article 19 references.