AI Prompt Redaction: The Substitution Step That Lets the Model Reason Without Touching the Raw Data

AI prompt redaction substitutes placeholders for sensitive content in the prompt before the model receives the request. The substitution preserves the structural cues the model needs to produce a coherent response while keeping the raw PII or PHI away from the model provider. This piece walks through the redaction pattern, how placeholders feed the model, the audit record fields that capture the redaction, and the EU AI Act and HIPAA framing.

By Parminder Singh · Founder & CEO, DeepInspect Inc.
AI Security Solutions · ai-prompt-redaction · llm-dlp · ai-dlp · ai-security · inline-enforcement
A redaction decision at the prompt level substitutes a placeholder for the sensitive span before the request reaches the model. The model sees a prompt where the PII, PHI, or customer data has been replaced with a typed token. The model produces a response against the redacted prompt, and the response does not contain the sensitive data because the model never saw it. The original prompt content lives only in the audit record on the inspection side, with stricter access controls than production traffic.

I want to walk through how the redaction pattern works on the request path, the placeholder shapes that preserve model coherence, how the audit record handles the redaction, and the regulatory framing that makes the pattern operationally consequential.

The redaction step on the request path

The inspection layer sits at the HTTP request boundary between authenticated users or agents and the LLM endpoint. The layer terminates TLS, authenticates the caller, runs the classifier, evaluates policy, and executes the decision. When the decision is redact, the layer rewrites the prompt body before forwarding the request to the model.

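A minimal sketch of the classify, decide, and rewrite sequence follows. It is illustrative only: the two regex detectors, the toy policy, and the names `classify`, `decide`, `rewrite`, and `inspect` are assumptions made for this example, and a production classifier would be far richer than a pair of patterns.

```python
import re
from dataclasses import dataclass

# Hypothetical detectors for illustration; not a real classifier.
DETECTORS = {
    "PII_EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PII_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


@dataclass(frozen=True)
class Span:
    start: int
    end: int
    label: str


def classify(prompt: str) -> list[Span]:
    """Return sensitive spans sorted by position (assumes they do not overlap)."""
    spans = [
        Span(m.start(), m.end(), label)
        for label, pattern in DETECTORS.items()
        for m in pattern.finditer(prompt)
    ]
    return sorted(spans, key=lambda s: s.start)


def decide(spans: list[Span]) -> str:
    """Toy policy: redact when anything sensitive is found, otherwise permit."""
    return "redact" if spans else "permit"


def rewrite(prompt: str, spans: list[Span]) -> str:
    """Replace each span with a typed, numbered placeholder."""
    out, cursor, counters = [], 0, {}
    for span in spans:
        counters[span.label] = counters.get(span.label, 0) + 1
        out.append(prompt[cursor:span.start])
        out.append(f"[{span.label}_{counters[span.label]}]")
        cursor = span.end
    out.append(prompt[cursor:])
    return "".join(out)


def inspect(prompt: str) -> tuple[str, str]:
    """Return the decision and the prompt body that would be forwarded."""
    spans = classify(prompt)
    decision = decide(spans)
    forwarded = rewrite(prompt, spans) if decision == "redact" else prompt
    return decision, forwarded


decision, forwarded = inspect("Email jane.doe@example.com about SSN 123-45-6789 today")
print(decision, "->", forwarded)
```

Because the sketch has no randomness and no hidden state, the same input always yields the same output, and running `inspect` on an already redacted prompt leaves it unchanged.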

The redact step is deterministic and idempotent. The same input plus the same policy version produces the same redacted output. The audit record carries the policy version, which lets the program reconstruct the redaction decision at any point.

Placeholder shapes that preserve model coherence

The placeholders matter for response quality. A naive [REDACTED] token strips the structural cue the model needs and tends to produce incoherent responses. Typed placeholders preserve the role of the entity in the sentence.

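The contrast between the two placeholder shapes can be sketched in a few lines. The prompt, the patient name, and the record number are made-up sample data, and the pre-labeled `entities` list stands in for whatever the classifier would return.

```python
# Made-up sample data; entities are (value, label) pairs a classifier might emit.
prompt = "Summarize the discharge notes for Maria Lopez, MRN 00482913."
entities = [("Maria Lopez", "PII_NAME"), ("00482913", "PHI_MRN")]


def redact_naive(text: str, entities: list[tuple[str, str]]) -> str:
    """Untyped redaction: every entity collapses to the same token."""
    for value, _label in entities:
        text = text.replace(value, "[REDACTED]")
    return text


def redact_typed(text: str, entities: list[tuple[str, str]]) -> str:
    """Typed redaction: the token keeps the entity type and a per-type index."""
    counters: dict[str, int] = {}
    for value, label in entities:
        counters[label] = counters.get(label, 0) + 1
        text = text.replace(value, f"[{label}_{counters[label]}]")
    return text


print(redact_naive(prompt, entities))
print(redact_typed(prompt, entities))
```

The naive version leaves the model with two indistinguishable tokens, while the typed version still says which token is the person and which is the record number.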

The typed placeholder tells the model that [PII_NAME_1] references a person and [PHI_MRN_1] references a medical record number. The model produces a response that refers to the placeholder at the appropriate grammatical position. Programs running redaction in production usually run a calibration window of thirty days to confirm that response quality stays acceptable.

A second pattern keeps the token consistent for the same entity across a conversation.

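One way to keep tokens stable is a per-conversation mapping from each detected value to its placeholder. The sketch below assumes a hypothetical `ConversationTokenMap` class and made-up account numbers; a real layer would also need to decide where this mapping lives and how it is protected, since it holds raw values in memory.

```python
class ConversationTokenMap:
    """Maps each (label, value) pair to one placeholder for a single conversation."""

    def __init__(self) -> None:
        self._tokens: dict[tuple[str, str], str] = {}
        self._counters: dict[str, int] = {}

    def token_for(self, label: str, value: str) -> str:
        key = (label, value)
        if key not in self._tokens:
            # First sighting in this conversation: allocate the next index.
            self._counters[label] = self._counters.get(label, 0) + 1
            self._tokens[key] = f"[{label}_{self._counters[label]}]"
        return self._tokens[key]

    def redact(self, text: str, entities: list[tuple[str, str]]) -> str:
        """entities are (value, label) pairs found in this turn."""
        for value, label in entities:
            text = text.replace(value, self.token_for(label, value))
        return text


conversation = ConversationTokenMap()
turn_1 = conversation.redact(
    "Why was account 8841-2207 charged twice?",
    [("8841-2207", "CUSTOMER_ACCOUNT")],
)
turn_3 = conversation.redact(
    "Refund 8841-2207 and check 5530-1186.",
    [("8841-2207", "CUSTOMER_ACCOUNT"), ("5530-1186", "CUSTOMER_ACCOUNT")],
)
print(turn_1)
print(turn_3)
```

The account from turn one keeps its token in turn three, and a new `ConversationTokenMap` starts numbering from 1 again.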

The model treats [CUSTOMER_ACCOUNT_1] as a consistent reference across the conversation, which preserves multi-turn coherence the way an unredacted account ID would.

The audit record fields the redaction lands on

The audit record carries the original prompt text (held encrypted), the redacted prompt text, the labels, the spans, the policy version, and the integrity signature.

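A possible shape for such a record is sketched below. The field names, the sample values, and the use of HMAC-SHA256 over canonical JSON as the integrity signature are all assumptions for illustration; the source only states which kinds of fields the record carries.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass


@dataclass
class AuditRecord:
    decision: str
    policy_version: str
    redacted_prompt: str
    labels: list
    spans: list               # [start, end] offsets into the original prompt
    original_prompt_ref: str  # pointer into the separate encrypted store
    decrypt_key_ref: str      # key reference used under break-glass
    signature: str = ""


def _canonical_body(record: AuditRecord) -> bytes:
    """Serialize every field except the signature in a stable order."""
    payload = asdict(record)
    payload.pop("signature")
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()


def sign(record: AuditRecord, key: bytes) -> AuditRecord:
    record.signature = hmac.new(key, _canonical_body(record), hashlib.sha256).hexdigest()
    return record


def verify(record: AuditRecord, key: bytes) -> bool:
    expected = hmac.new(key, _canonical_body(record), hashlib.sha256).hexdigest()
    return hmac.compare_digest(record.signature, expected)


key = b"demo-key-not-for-production"
record = sign(
    AuditRecord(
        decision="redact",
        policy_version="policy-v7",              # hypothetical version label
        redacted_prompt="Summarize the notes for [PII_NAME_1].",
        labels=["PII_NAME"],
        spans=[[24, 35]],
        original_prompt_ref="vault://prompts/example-id",  # hypothetical URI
        decrypt_key_ref="kms://keys/example-key",          # hypothetical URI
    ),
    key,
)
print(verify(record, key))
```

Any change to a signed field makes `verify` return `False`, which is the property a tamper-evident series relies on.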

The original prompt sits on a separate encrypted store with stricter access controls than the production audit log. The decrypt key reference lets a compliance officer recover the original under a documented break-glass procedure that itself produces a record. The redacted prompt sits in cleartext because the model already received it in cleartext.

Why redaction matters for the model provider relationship

The model provider receives the redacted prompt. Under the provider's terms of service, the redacted content is what the provider sees, stores (per the data-retention setting), and may use in downstream training (when training opt-out is not in effect). The raw PII and PHI never cross the inspection boundary. Programs running redaction usually find that the BAA scope of the model provider relationship can be tightened because the PHI exposure surface is the inspection layer, not the model API.

The same pattern applies to general SaaS LLM use. The model provider sees what the redaction policy let through. The raw data the user submitted to the application never reached the provider. The audit record explains exactly which spans were redacted and which were permitted.

Regulatory framing

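EU AI Act Article 12 requires high-risk AI systems to allow automatic recording of events (logs) over the lifetime of the system, to a degree that ensures traceability. Article 12(3) sets minimum log contents for remote biometric identification systems, including input data references and identification of the natural persons involved in verifying results, and Article 19 requires providers to retain the automatically generated logs. The August 2, 2026 deadline applies to high-risk AI systems.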

The redaction pattern lets the program carry both fields on the same record: the original prompt content (on the encrypted store) for traceability and the natural-person identity (from the IdP integration) on every record. The auditor's sample reconstructs the decision context from the record series without exposing the original PII to the operations team handling routine audit reviews.

HIPAA Security Rule 45 CFR 164.312(b) requires audit controls that record and examine activity in systems that contain or use electronic PHI. The redaction decision is the field that maps to the access-control requirement: PHI did not reach the model unless the policy explicitly permitted it. The audit record carries the redaction span and the decision, which lets the HIPAA review reconstruct exactly which PHI elements were exposed to the model and which were not.

What goes wrong without typed placeholders

A redaction pattern that uses untyped placeholders or that strips the content without a placeholder confuses the model. Response quality drops, application teams disable the redaction policy, and the program lands back on the permit-everything posture that produced the audit gap in the first place. The typed-placeholder pattern is the operational property that keeps redaction in production long-term.

A second failure mode is the inconsistent token across turns. If the placeholder for the same account ID changes between turn one and turn three, the model loses the multi-turn reference and the response coherence drops. The consistent-token pattern within a conversation scope is the fix.

DeepInspect

DeepInspect ships AI prompt redaction as part of the identity-aware enforcement gateway. The proxy authenticates the caller against the corporate IdP, classifies the prompt content, applies typed placeholders consistent across the conversation scope, evaluates policy against identity and classification, and commits a per-decision audit record with the original and redacted prompts on the same series. The records carry the integrity signature on a tamper-evident series, which supports the record-keeping expectations of EU AI Act Article 12 and the HIPAA Security Rule.

For programs preparing for the August 2, 2026 deadline, the redaction pattern is the control that lets the model receive only the content the policy explicitly permits, with the audit record carrying the full decision context.

Book a demo today.

Frequently asked questions

Does redaction work for code prompts?

Yes. The classifier identifies secret patterns inside source code (API keys, AWS access keys, private keys) and substitutes typed placeholders that preserve the syntactic role. The model receives code with placeholder tokens at the secret positions and produces a response that handles the tokens as opaque references. Programs running coding-assistant integrations rely on this pattern to keep production secrets out of the model.
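
As a rough sketch of secret redaction in source code, the example below uses two regular expressions: the widely documented `AKIA` prefix shape of AWS access key IDs and PEM private key blocks. The label names and the `redact_secrets` helper are assumptions for illustration, and real secret scanners cover many more formats.

```python
import itertools
import re

# Illustrative patterns only; production scanners use much larger rule sets.
SECRET_PATTERNS = [
    ("SECRET_AWS_ACCESS_KEY", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    (
        "SECRET_PRIVATE_KEY",
        re.compile(
            r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
            re.DOTALL,
        ),
    ),
]


def redact_secrets(source: str) -> str:
    """Swap each detected secret for a typed, numbered placeholder."""
    for label, pattern in SECRET_PATTERNS:
        counter = itertools.count(1)
        source = pattern.sub(
            lambda _match, label=label, counter=counter: f"[{label}_{next(counter)}]",
            source,
        )
    return source


# AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
snippet = 'client = connect(key_id="AKIAIOSFODNN7EXAMPLE")\n'
print(redact_secrets(snippet))
```

The placeholder stays inside the string literal, so the code the model sees still parses and the token occupies the same syntactic position as the secret did.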

Can we configure response-side redaction too?

The inspection layer can run classification on the model response before it returns to the caller. The redact-on-response pattern catches cases where the model produces sensitive data the prompt classification did not predict (for example, the model hallucinating a Social Security Number that resembles a real one). The audit record carries the response-side decision on the same series as the request-side decision.
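
A response-side check can be sketched the same way. The single SSN-shaped pattern, the `inspect_response` helper, and the record keys below are hypothetical; the point is only that the response decision is logged against the same request identifier as the request-side decision.

```python
import re

# One illustrative detector: anything shaped like a US Social Security Number.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def inspect_response(request_id: str, response_text: str) -> tuple[str, dict]:
    """Redact SSN-shaped strings in a model response and describe the decision."""
    redacted, hits = SSN_PATTERN.subn("[PII_SSN]", response_text)
    record = {
        "request_id": request_id,  # ties this entry to the request-side record
        "side": "response",
        "decision": "redact" if hits else "permit",
        "labels": ["PII_SSN"] * hits,
    }
    return redacted, record


text, record = inspect_response("req-001", "The SSN on file is 078-05-1120.")
print(text)
print(record["decision"])
```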

How does redaction interact with multi-turn conversations?

The placeholder for the same entity stays consistent across the conversation scope. The model treats the consistent token as the canonical reference for the entity, which preserves multi-turn coherence. When the conversation ends, the token mapping is discarded and the next conversation produces fresh placeholders.

What happens when the redacted prompt confuses the model?

A small fraction of redaction patterns produce model responses that explicitly reference the placeholder ("I cannot help with [PII_NAME_1] without more context"). Programs handle this by adjusting the placeholder shape, the redaction granularity, or the policy threshold. The detection-mode window before enforcement catches most of these cases before they reach production.

Does the redacted prompt count as the input for Article 19 purposes?

The audit record carries both the original and the redacted prompt. The Article 19 reference to input data points at the data the user sent; the original prompt is what the record carries on the encrypted store. The redacted prompt is the artifact the model received. The auditor receives both fields when sampling a decision.