LLM Prompt Logging: What an Article 12 Compliant Record Has to Contain
LLM prompt logging records every prompt sent to an LLM, the response the model returned, the identity that initiated the call, the policy that governed the decision, and the data classifications detected. The EU AI Act Article 12 obligation, the NIST AI RMF Manage function, and the Fannie Mae LL-2026-04 disclosure mandate each expect this record at a specific granularity. I walk through what the record contains, where most application logging falls short, and how the architectural pattern that produces a compliant record differs from application-side logging.

EU AI Act Article 12 requires high-risk AI systems to record events automatically over the lifetime of the system, with the August 2, 2026 deadline pending. Read together, Articles 12 and 19 cover the log content (timestamps, input data, identification of natural persons) and the retention floor (at least six months). Most application-side logging today captures only a fraction of that content and writes it to a log the application itself controls. The result is a record that lacks the audit independence a reviewer expects of an Article 12 log.
I want to walk through what an Article 12 compliant record actually contains, where most application logging falls short, and how the architectural pattern that produces the compliant record works on the wire.
What the record has to contain
A compliant LLM prompt log record covers seven required fields and three recommended fields that a regulatory reviewer expects to see for high-risk deployments.
Required:
- The verified identity of the natural person on whose behalf the request was made.
- The role and authorization context the identity carried.
- The model class and provider the request targeted.
- The prompt content the caller sent.
- The response content the model returned.
- The policy version in effect at the moment of the decision.
- The timestamp, with sufficient precision to correlate across systems.
Recommended:
- The data classifications detected in the prompt and the response.
- The rules that matched against the policy.
- The decision outcome (permit, redact, deny) and the rationale.
The record is committed over a write path the application has no access to, and it is signed with a key the audit store holds. The prompt and response content can be stored raw in the audit store (with access control that limits ad-hoc retrieval) or as hashes plus structured classifications, depending on the regulatory regime and the deployer's data retention posture.
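The field list above can be sketched as a small data structure. This is a minimal illustration, not a prescribed schema: the class name `PromptLogRecord`, the helper `build_record`, the `content_is_hashed` flag, and the `sha256:` prefix are all assumptions introduced here. The `store_raw` switch shows the two storage options (raw content versus hashes).

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


def sha256_digest(text: str) -> str:
    """Return a prefixed SHA-256 digest of the UTF-8 encoded text."""
    return "sha256:" + hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class PromptLogRecord:
    # Required fields (seven)
    subject_id: str          # verified natural-person identity
    role: str                # role and authorization context
    model: str               # model class and provider
    prompt: str              # raw prompt, or its digest in hash mode
    response: str            # raw response, or its digest in hash mode
    policy_version: str
    timestamp: datetime      # timezone-aware UTC
    # Recommended fields (outcome and rationale count as one item in the list)
    classifications: tuple[str, ...] = ()
    rules_matched: tuple[str, ...] = ()
    decision: str = ""       # "permit", "redact", or "deny"
    rationale: str = ""
    # Bookkeeping flag, an assumption of this sketch
    content_is_hashed: bool = False


def build_record(*, subject_id: str, role: str, model: str, prompt: str,
                 response: str, policy_version: str, store_raw: bool = True,
                 **recommended) -> PromptLogRecord:
    """Assemble a record, hashing the content when raw storage is disabled."""
    if not store_raw:
        prompt, response = sha256_digest(prompt), sha256_digest(response)
    return PromptLogRecord(
        subject_id=subject_id, role=role, model=model,
        prompt=prompt, response=response, policy_version=policy_version,
        timestamp=datetime.now(timezone.utc),
        content_is_hashed=not store_raw,
        **recommended,
    )
```

Freezing the dataclass keeps a record immutable once built, which matches the idea that the record is written once and never edited by the caller.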
Where application-side logging falls short
Most enterprise LLM deployments today log the LLM call inside the calling application. The application writes a log line that captures whatever fields the developer chose to capture, into whatever log infrastructure the application uses. Three failure modes recur.
The first is incomplete content. The application-side log captures a prompt summary, a response summary, or a hash, because the full prompt and response are large and the application's log infrastructure was not sized for them. A summary does not satisfy the input-data requirement.
The second is missing identity context. The application-side log captures the session identifier, the API key, or the username string the application was using internally, while the record-keeping provisions call for the identification of natural persons. The application-side log frequently captures the wrong identity (an application identity rather than the user's) or no identity at all.
The third is the self-attestation problem. The application that made the LLM decision also wrote the log. The application can suppress the log on crash, rewrite the log because it has write access, or selectively log because the developer chose what to log and what to skip. The audit independence property the regulation expects is not satisfied.
The combined failure produces a record that does not survive a regulatory review. The reviewer asks "produce the records of how this AI system handled this specific decision." The application-side log returns a partial record without verified identity, without the full content, written by the system under audit.
The architectural pattern that produces the compliant record
A compliant record requires an inspection point at the AI request boundary that reads the prompt and response, attaches the verified identity, and writes the record to a write path the application has no access to.
The inspection point operates as a stateless proxy between the calling application and the LLM provider. The proxy terminates the outbound TLS session, reads the JSON request body, verifies the identity context the application attached, classifies the prompt content, evaluates the identity-aware policy, and writes the per-decision record before forwarding the call to the upstream provider.
The same operations run on the return path. The model response is read, classified, and evaluated against the response-path policy. The record covers the response alongside the request.
The architectural property the pattern carries is audit independence. The application has no write path to the audit store. The application cannot suppress the record, cannot rewrite the record, and cannot selectively log. The record is the system of record at the AI request layer.
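The request and return paths described above can be sketched as one function. This is a simplified illustration under stated assumptions: every name here (`inspect_and_forward`, `verify_identity`, `classify`, `evaluate_policy`, `write_record`, `forward`) is hypothetical, the collaborators are passed in as plain callables, and TLS termination and HTTP handling are left out entirely.

```python
from typing import Callable


def inspect_and_forward(
    body: dict,
    identity_token: str,
    *,
    verify_identity: Callable[[str], str],
    classify: Callable[[str], list],
    evaluate_policy: Callable[[str, list], str],
    write_record: Callable[[dict], None],
    forward: Callable[[dict], dict],
) -> dict:
    """Inspect one LLM call on both paths, recording each decision."""
    # Request path: identity, classification, policy decision.
    subject = verify_identity(identity_token)
    labels = classify(body["prompt"])
    decision = evaluate_policy(subject, labels)

    # The request-phase record is written before anything goes upstream.
    write_record({
        "phase": "request", "subject": subject, "prompt": body["prompt"],
        "classifications": labels, "decision": decision,
    })
    if decision == "deny":
        return {"status": 403, "decision": "deny"}

    # Return path: the same operations run on the model response.
    text = forward(body)["text"]
    response_labels = classify(text)
    response_decision = evaluate_policy(subject, response_labels)

    # The response-phase record is written before the caller sees the text.
    write_record({
        "phase": "response", "subject": subject, "response": text,
        "classifications": response_labels, "decision": response_decision,
    })
    if response_decision == "deny":
        return {"status": 403, "decision": "deny"}
    return {"status": 200, "text": text}
```

The application only ever receives the return value. It never holds a reference to `write_record`, which is the code-level counterpart of having no write path to the audit store.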
Retention and the deployer's posture
Article 19 sets a retention floor of six months. The operational retention for most regulated enterprises runs much longer. Financial institutions in most EU jurisdictions face record-keeping obligations of five to ten years under existing financial regulation. Healthcare deployers face HIPAA-style retention obligations that depend on the data type. The Fannie Mae LL-2026-04 mandate expects records covering AI-assisted lending decisions for the duration of the loan and beyond.
The architecturally sensible default is seven years with the option to extend per regime. The audit store is sized for the retention. The cost is dominated by storage volume, which compresses well for structured records, and is small relative to the regulatory exposure the records mitigate.
The retention also has to handle the data-subject access request side of GDPR Article 15. A subject who requests their records receives the records covering their natural-person identifier. The audit store supports the query against the subject field and produces the relevant records.
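A rough sketch of the retention rule and the subject-access lookup follows. The constants and function names are assumptions made for illustration: six months is approximated as 183 days, records are plain dictionaries with `subject_id` and `timestamp` keys, and a real audit store would run the subject query against an index rather than a Python list.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

RETENTION_FLOOR = timedelta(days=183)        # approximation of six months
DEFAULT_RETENTION = timedelta(days=7 * 365)  # seven-year default


def effective_retention(configured: Optional[timedelta] = None) -> timedelta:
    """Use the default when nothing is configured; never go below the floor."""
    if configured is None:
        return DEFAULT_RETENTION
    return max(configured, RETENTION_FLOOR)


def is_purgeable(record_timestamp: datetime, now: datetime,
                 retention: timedelta) -> bool:
    """A record may be purged only after its retention period has elapsed."""
    return now - record_timestamp > retention


def records_for_subject(records: list, subject_id: str) -> list:
    """Return the records whose subject field matches the requester."""
    return [r for r in records if r["subject_id"] == subject_id]
```

Clamping the configured value to the floor means a misconfigured short retention cannot silently drop records before the six-month minimum.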
The cryptographic integrity property
Every record carries a tamper-evident signature, produced with a key the audit store holds. The signature is the property that makes the record admissible as evidence in regulatory review and in civil litigation.
A reviewer trusts a record by trusting the signature chain. The reviewer verifies that the record was produced at the time it claims, by the gate it claims, against the policy version it references. A record without a signature is not admissible. A record with a signature that does not validate has been tampered with after commit and is treated as evidence of suppression rather than evidence of the underlying decision.
The signing key lives in the audit store's KMS or HSM. The corporate operator does not have direct access to the key. The key rotation pattern follows the rotation cadence the corporate KMS or HSM runs for other signing keys. The records produced under each key version validate against the key version the record references, which preserves the integrity property across rotations.
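The way records stay verifiable across key rotations can be illustrated with a short sketch. One substitution is worth naming plainly: this example uses HMAC-SHA256 from the Python standard library as a stand-in, whereas a KMS or HSM deployment would typically use an asymmetric signature and never expose key material to the caller. The in-memory `keys` dictionary, the envelope layout, and the function names are assumptions for illustration.

```python
import hashlib
import hmac
import json


def canonical_bytes(record: dict) -> bytes:
    """Serialize deterministically so the same record always signs the same."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_record(record: dict, key_version: str, keys: dict) -> dict:
    """Wrap the record with its key version and a MAC over its canonical form."""
    mac = hmac.new(keys[key_version], canonical_bytes(record), hashlib.sha256)
    return {"record": record, "key_version": key_version,
            "signature": mac.hexdigest()}


def verify_record(envelope: dict, keys: dict) -> bool:
    """Validate against the key version the envelope references."""
    key = keys.get(envelope["key_version"])
    if key is None:
        return False
    expected = hmac.new(key, canonical_bytes(envelope["record"]),
                        hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, envelope["signature"])
```

Because each envelope names its key version, a record signed before a rotation still validates afterwards as long as the retired key remains available for verification.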
How the record satisfies the 2026 regulatory set
EU AI Act Article 12 expects automatic logging over the system lifetime. The inspection point writes the record on every decision regardless of the application's behavior, which is what "automatic" means in practice. The August 2, 2026 deadline applies.
Read together, Articles 12 and 19 expect the content (timestamps, input data, identity of natural persons) and the retention floor (six months). The record covers each field structurally. The retention is configured at the audit store level.
Article 99 sets the penalty at 15 million EUR or 3% of global annual turnover for high-risk non-compliance, whichever is higher. The exposure the compliant record mitigates is in that range.
NIST AI RMF Manage function expects incident response evidence. The record is the evidence. ISO 42001 clause 8.3 expects operational controls that produce evidence on demand. The audit store produces the records on demand.
Fannie Mae LL-2026-04, effective August 6, 2026 per the Cooley legal analysis, expects lenders to produce records of AI-assisted lending decisions and to disclose tools, providers, and safeguards on demand. The record carries each field. Texas TRAIGA, effective January 1, 2026, expects operators to maintain records of AI system operation in consequential-decision contexts.
Where most enterprises are exposed
The IBM Cost of Data Breach Report 2026 found that shadow AI breaches take 247 days to detect, six days longer than standard breaches. The detection window reflects the absence of a per-decision record at the AI request layer. The deployment that has no inspection point and no compliant log sees the breach only when the data shows up outside the boundary, months later.
Netwrix found that only 37% of organizations have any AI governance policy in place. Without policy, the inspection point has no decision input. Without a decision input, the record has nothing to reference. The two gaps compound.
Cloud Radix found that 86% of IT leaders are completely blind to employee AI interactions. The blindness reflects the absence of inspection at the prompt layer. The blindness is the inverse of the audit record: a deployment with the record has structural visibility, and a deployment without the record has structural blindness.
The architectural answer is the inspection point and the policy that drives the per-decision record. The cost of building the inspection point is recoverable. The cost of an unmet Article 12 obligation when the August 2, 2026 deadline arrives is not.
DeepInspect
This is exactly what DeepInspect produces. DeepInspect is the inspection point at the AI request boundary. The proxy reads every prompt, classifies the content, evaluates identity-aware policy, and writes a per-decision audit record to a write path the application has no access to.
The record carries the verified identity, the role, the model class, the prompt content (or a hash plus structured classifications, per the deployer's data retention posture), the response content, the policy version, the rules matched, the decision outcome, the timestamp, and a tamper-evident signature. The record commits before the model response returns to the application.
The audit store retains the records for the operational retention period the deployer configured. The store supports the data subject access request query, the regulatory production query, and the incident response query. The signing key lives in the corporate KMS or HSM. The compliance review interface is access-controlled.
Enforcement overhead runs under 50 milliseconds end-to-end in internal DeepInspect testing. The record commit happens in the request hot path. The proxy operates in front of any HTTP-accessible LLM endpoint.
If your Article 12 readiness depends on application-side LLM logs the application controls, let's talk today.
Frequently asked questions
- Does the record contain the full prompt or only a summary?
The deployer chooses between full content storage and hash plus structured classification storage. Full content storage produces the most complete record for regulatory review and the highest storage footprint. Hash plus classification storage produces a smaller footprint and supports the GDPR Article 15 data subject access query through the identity field while limiting the surface for accidental disclosure of past prompts. Both patterns satisfy Article 19; the regulator reads the input data requirement as the content that allows reconstruction of the decision, which the structured classifications support alongside the model and policy versions.
- How does the record handle multi-turn conversations?
A multi-turn conversation produces one record per turn. The records reconcile on the conversation identifier and carry the turn number. The conversation history visible to the model at each turn is captured as part of the prompt content for that turn, which lets the reviewer reconstruct what the model saw at each step. The cumulative size of the records across a long conversation is bounded by the audit store retention and storage policy, which the deployer configures.
- What happens if the audit store is unavailable?
The inspection point operates fail-closed against audit unavailability. A request the proxy cannot record is a request the proxy denies, because forwarding the call without writing the record produces an evidence gap the regulator does not accept. The pattern matches the standard fail-closed posture on identity verification and classification: any failure mode that prevents the decision from being made and recorded produces a deny. The audit store availability target is sized to the corporate SLA for the rest of the security-critical infrastructure.
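The fail-closed behavior described in this answer can be sketched in a few lines. The exception type `AuditStoreUnavailable`, the function `record_then_forward`, and the 503 status code are hypothetical choices made for illustration.

```python
from typing import Callable


class AuditStoreUnavailable(Exception):
    """Raised by the (hypothetical) audit writer when a commit cannot be made."""


def record_then_forward(record: dict,
                        write_record: Callable[[dict], None],
                        forward: Callable[[], dict]) -> dict:
    """Forward the call only if the audit record was committed first."""
    try:
        write_record(record)
    except AuditStoreUnavailable:
        # No record means no call: deny rather than leave an evidence gap.
        return {"status": 503, "decision": "deny",
                "reason": "audit store unavailable"}
    return forward()
```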
- Does the record cover embedded AI in SaaS tools?
Embedded AI in SaaS tools (the customer service platform that summarizes tickets with an LLM, the productivity suite that drafts content) calls the LLM from the vendor's environment, not from the corporate environment. The inspection point sees only the traffic the corporate environment routes. The architectural answer is the vendor due care path: procurement contracts require vendor-side logs the deployer can request on demand. The deployer's record covers the AI traffic the corporate environment originates; the vendor's record covers the rest. The reviewer reads both alongside the contract that requires the vendor record.
- How long does the inspection point retain the audit records?
The retention is configured per regime. The architecturally sensible default is seven years with the option to extend. Financial institutions in most EU jurisdictions face five to ten years under existing financial regulation. Healthcare deployers face HIPAA-style retention obligations that depend on the data type. The deployer configures the retention at the audit store layer and the inspection point writes records to that retention without modification.