
AI Audit Logs: The Format Spec That Survives EU AI Act, DORA, and Fannie Mae Review

AI audit logs that survive regulatory review carry a specific set of fields the EU AI Act Article 12, DORA Article 19, Fannie Mae LL-2026-04, NIST AI RMF, and HIPAA all expect on the same record. The fields cover identity, decision provenance, model identity, policy state, and integrity metadata. The format has to support per-record retrieval and per-series replay. The write path has to sit outside the application so the application cannot modify the record. This piece walks through the field-level format specification, the integrity model, the storage characteristics, and the deployment pattern that produces records the regulator and the customer auditor will accept.

By Parminder Singh · Founder & CEO, DeepInspect Inc.

AI audit logs that survive regulatory review carry a specific set of fields, a specific integrity model, and a specific write-path topology. The EU AI Act Article 12 requires automatic recording of events over the lifetime of the system sufficient to ensure traceability. DORA Article 19 requires major incident reporting evidence in the financial-services scope. Fannie Mae LL-2026-04 requires disclosure on demand for mortgage AI decisions. NIST AI RMF Manage 4 expects measurement and tracking of AI risks. HIPAA Security Rule 45 CFR 164.312(b) expects audit controls on PHI processing. The reviewers all read against the same record series.

I want to walk through the field-level format specification, the integrity model that closes the self-attestation gap, the storage characteristics that support the retrieval workflows, and the deployment pattern that produces records the regulator and the customer auditor will accept. The companion pillar article (how-to-build-a-defensible-ai-audit-trail) covers the broader architectural decisions; this piece focuses on the record format itself.

The required field set

Per-decision audit records carry fields in five groups: identity, request, model, policy, and integrity.

The identity group carries the natural-person caller (the authenticated end user or the named service account behind the request), the agent identity (the application, the function, the version) that submitted the call, and the session identifier if a multi-turn conversation is in progress. The natural-person field is the load-bearing one for EU AI Act Article 12 traceability. A record that lacks the natural-person identifier fails the Article 12 test because the auditor cannot trace the decision back to the responsible human.

The request group carries the timestamp (millisecond precision), the route identifier (the application endpoint or the API path), the prompt fingerprint (a content hash; the full prompt may also be stored depending on the data class), the prompt-level classification signals (PII detected, PHI detected, MNPI signals, source code signals), the retrieval source identifiers if RAG context is included, and the tool-call set if the agent proposed tool actions.

The model group carries the model provider, the model name, the model version, the endpoint URL, the request parameters (temperature, top_p, max tokens), and the prompt and response token counts. The model-identity fields support the supply-chain inventory under EU AI Act Article 26 and NIST AI RMF Govern.

The policy group carries the policy bundle version that evaluated the request, the per-route policy that matched, the per-role policy that applied, the per-tool authorization decisions, the policy decision outcome (passed, redacted, blocked, queued for review), and the response classifier outcome on the streamed response.

The integrity group carries the cryptographic signature over the record's canonical serialization, the public key identifier the signature verifies against, the hash chain pointer to the prior record in the series, and the periodic anchoring receipt for the segment of the series the record belongs to.
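To make the five groups concrete, here is a minimal sketch of the record as Python dataclasses. The class names, field names, and sample values are illustrative assumptions, not a published schema; only the grouping and the meaning of each field come from the description above.

```python
from dataclasses import dataclass, field, asdict
from typing import Optional


@dataclass
class Identity:
    natural_person_id: str            # authenticated end user or named service account
    agent_id: str                     # application / function / version that made the call
    session_id: Optional[str] = None  # set only when a multi-turn conversation is in progress


@dataclass
class Request:
    timestamp_ms: int                 # epoch time, millisecond precision
    route: str                        # application endpoint or API path
    prompt_sha256: str                # content hash of the prompt
    classification_signals: list[str] = field(default_factory=list)
    retrieval_source_ids: list[str] = field(default_factory=list)
    tool_calls: list[str] = field(default_factory=list)


@dataclass
class Model:
    provider: str
    name: str
    version: str
    endpoint_url: str
    parameters: dict                  # temperature, top_p, max tokens
    prompt_tokens: int
    response_tokens: int


@dataclass
class Policy:
    bundle_version: str
    route_policy: str
    role_policy: str
    tool_authorizations: dict
    decision: str                     # passed | redacted | blocked | queued_for_review
    response_classifier_outcome: str


@dataclass
class Integrity:
    signature: str
    key_id: str
    prev_record_hash: str
    anchor_receipt: Optional[str] = None  # filled in once the segment is anchored


@dataclass
class AuditRecord:
    schema_version: str
    identity: Identity
    request: Request
    model: Model
    policy: Policy
    integrity: Integrity


def is_traceable(record: AuditRecord) -> bool:
    """A record without a natural-person identifier cannot be traced to a human."""
    return bool(record.identity.natural_person_id.strip())


# Hypothetical sample values, for illustration only.
example = AuditRecord(
    schema_version="1.0",
    identity=Identity("user:alice@example.com", "claims-app/summarize/2.3.1"),
    request=Request(1767225600000, "/v1/summarize", "ab" * 32, ["PII"]),
    model=Model("example-provider", "example-model", "2026-01",
                "https://llm.example.com/v1/chat",
                {"temperature": 0.2, "top_p": 1.0, "max_tokens": 512}, 840, 212),
    policy=Policy("bundle-41", "summarize-default", "adjuster", {}, "redacted", "clean"),
    integrity=Integrity("<signature>", "key-2026-01", "0" * 64),
)
```

Calling `asdict(example)` yields a nested dictionary with one key per group, which is a convenient starting point for serialization.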

The canonical serialization

The record serializes to a deterministic JSON form that supports signature verification across the storage layer and across replay. The canonical form fixes the field order, the number encoding, the string normalization, and the array ordering. Two serializations of the same record produce the same byte sequence, and the signature verifies against either.
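One way to get a deterministic byte sequence is sketched below using only the Python standard library. The rules it applies (sorted keys, compact separators, NFC string normalization, no floats) are assumptions chosen for illustration; a production system might instead adopt a published scheme such as RFC 8785. The function name `canonicalize` is hypothetical.

```python
import json
import unicodedata


def _normalize(value):
    """Recursively normalize a JSON-like value so equal records serialize identically."""
    if isinstance(value, str):
        return unicodedata.normalize("NFC", value)      # one Unicode form per string
    if value is None or isinstance(value, (bool, int)):
        return value
    if isinstance(value, float):
        # Assumption of this sketch: floats are rejected so number encoding stays
        # unambiguous. Values such as temperature would be stored as strings.
        raise TypeError("floats are not allowed in the canonical form")
    if isinstance(value, list):
        return [_normalize(v) for v in value]           # array order is kept as committed
    if isinstance(value, dict):
        return {_normalize(k): _normalize(v) for k, v in value.items()}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def canonicalize(record: dict) -> bytes:
    """Serialize to UTF-8 JSON with sorted keys and no insignificant whitespace."""
    return json.dumps(
        _normalize(record),
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")


a = canonicalize({"route": "/v1/chat", "decision": "passed"})
b = canonicalize({"decision": "passed", "route": "/v1/chat"})
print(a == b)  # True: key order in the input does not affect the bytes
```

Because the signature is computed over these bytes, any writer or verifier that applies the same rules reproduces the same input to the signature check.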

The schema is versioned on the record itself. A schema-version field at the top of each record identifies which version of the format spec the record conforms to. Format upgrades preserve the prior records and add new fields without modifying the existing ones. An auditor reading a five-year-old record reads the schema version and verifies the signature against the field set the record was committed with.

The integrity model

Per-record signing covers the integrity of individual records. The inspection layer signs each record at the moment of commit using a key the application does not have access to. The signature ties the record to the inspection layer that produced it and lets a verifier confirm authenticity without trusting the storage layer.
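The sign-then-verify flow can be sketched as follows. To stay within the standard library, this example uses HMAC-SHA256 as a stand-in for the signature. That is a simplification: HMAC is symmetric, so the verifier would hold the same secret as the signer. A real deployment would more likely use an asymmetric scheme (for example Ed25519) so that verifiers need only the public key. The function names and the key identifier are hypothetical.

```python
import hashlib
import hmac
import json


def canonicalize(body: dict) -> bytes:
    # Simplified canonical form for the sketch: sorted keys, compact separators.
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_record(record: dict, key: bytes, key_id: str) -> dict:
    """Return a copy of the record with an integrity block attached.

    The tag covers every field except the integrity block itself.
    """
    body = {k: v for k, v in record.items() if k != "integrity"}
    tag = hmac.new(key, canonicalize(body), hashlib.sha256).hexdigest()
    return {**body, "integrity": {"key_id": key_id, "signature": tag}}


def verify_record(signed: dict, key: bytes) -> bool:
    """Recompute the tag over the record body and compare in constant time."""
    body = {k: v for k, v in signed.items() if k != "integrity"}
    expected = hmac.new(key, canonicalize(body), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed["integrity"]["signature"])


# The key lives only in the inspection layer; the application never sees it.
inspection_key = b"held-by-the-inspection-layer-only"
signed = sign_record({"route": "/v1/chat", "decision": "passed"}, inspection_key, "key-1")

tampered = {**signed, "decision": "blocked"}
print(verify_record(signed, inspection_key))    # True
print(verify_record(tampered, inspection_key))  # False
```

The storage layer is not trusted in this flow: a record altered at rest fails verification because the tag no longer matches the recomputed value.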

Hash chaining covers the integrity of the record series. Each record carries a pointer to the cryptographic hash of the prior record. A modification to any record in the series breaks the chain at the modified point and breaks every subsequent record's hash check. The chain catches retroactive insertion, deletion, and modification across the series.
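A minimal hash chain, sketched with SHA-256 from the standard library, shows how each of the three tampering cases surfaces. The record layout (`payload`, `prev_hash`) and the all-zero genesis value are assumptions made for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed pointer value for the first record in a series


def record_hash(record: dict) -> str:
    data = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def append(chain: list, payload: dict) -> None:
    """Commit a record whose pointer is the hash of the current chain head."""
    prev = record_hash(chain[-1]) if chain else GENESIS
    chain.append({"payload": payload, "prev_hash": prev})


def first_broken_link(chain: list):
    """Return the index of the first record whose pointer fails to verify, or None."""
    expected = GENESIS
    for i, rec in enumerate(chain):
        if rec["prev_hash"] != expected:
            return i
        expected = record_hash(rec)
    return None


chain = []
for n in range(4):
    append(chain, {"request_id": f"req-{n}"})
print(first_broken_link(chain))  # None: the untouched chain verifies

chain[1]["payload"]["request_id"] = "req-forged"
print(first_broken_link(chain))  # 2: the record after the modified one no longer matches
```

Note that the pointer is part of the hashed content, so an attacker who wants to hide a change has to rewrite every later record as well. The final record has no successor to vouch for it, which is the gap that external anchoring addresses.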

External anchoring covers the long-term integrity of the chain. The inspection layer periodically commits the chain head to an external trust anchor (a transparency log, a blockchain commitment, or a regulator-approved notary). The anchor receipt persists alongside the record series. An auditor checking the chain integrity walks from any record to the anchor and verifies that no retroactive modification has happened since the anchor commit.
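The walk from a record to its anchor can be sketched like this. The anchor receipt here is just a dictionary holding the head position and head hash; in practice the receipt would be whatever the external trust anchor returns, and its format is an assumption of this example, as are the function names.

```python
import hashlib
import json

GENESIS = "0" * 64


def record_hash(record: dict) -> str:
    data = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def build_chain(payloads: list) -> list:
    chain, prev = [], GENESIS
    for payload in payloads:
        rec = {"payload": payload, "prev_hash": prev}
        chain.append(rec)
        prev = record_hash(rec)
    return chain


def make_anchor(chain: list) -> dict:
    """The value committed externally: where the chain head is and what it hashes to."""
    return {"head_index": len(chain) - 1, "head_hash": record_hash(chain[-1])}


def verify_against_anchor(chain: list, index: int, anchor: dict) -> bool:
    """Walk from record `index` to the anchored head, checking every link on the way."""
    head = anchor["head_index"]
    if not (0 <= index <= head < len(chain)):
        return False  # the record is not covered by this anchor
    for i in range(index, head):
        if chain[i + 1]["prev_hash"] != record_hash(chain[i]):
            return False
    return record_hash(chain[head]) == anchor["head_hash"]


chain = build_chain([{"request_id": f"req-{n}"} for n in range(5)])
anchor = make_anchor(chain[:4])  # anchor committed when four records existed

print(verify_against_anchor(chain, 1, anchor))  # True: record 1 predates the anchor
print(verify_against_anchor(chain, 4, anchor))  # False: record 4 was written after it
```

Because the anchored head hash lives outside the audit store, rewriting any covered record and all of its successors still fails the final comparison against the anchor.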

The combination produces the tamper-evident property the EU AI Act Article 12 implies, the DORA Article 19 integrity test requires, and the Fannie Mae LL-2026-04 disclosure-on-demand obligation rests on.

The write-path independence requirement

The application that submitted the AI request cannot have custody of the write path. An application-controlled log fails the self-attestation test that regulators apply to compliance records. The reasoning is structural: the application is the entity whose behavior the log evidences. A log the entity controls is a self-attestation, not an independent record.

The inspection layer runs as a separate process or service with a distinct credential set. The write path goes from the inspection layer directly to the audit store using a credential the application does not have. The application reads the policy decision (passed, redacted, blocked) from the inspection layer's response header. The application cannot read or modify the audit record itself.
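The credential separation can be illustrated with a small in-process sketch. Real isolation comes from running the inspection layer as a separate service with its own credentials, which a single Python process cannot enforce, so treat this purely as a model of who holds what. The class names and the `X-Policy-Decision` header name are hypothetical.

```python
import hmac


class AuditStore:
    """Append-only store that accepts writes only with the inspection layer's credential."""

    def __init__(self, write_credential: bytes):
        self._write_credential = write_credential
        self._records = []

    def append(self, credential: bytes, record: dict) -> None:
        if not hmac.compare_digest(credential, self._write_credential):
            raise PermissionError("caller does not hold the audit write credential")
        self._records.append(dict(record))

    def count(self) -> int:
        return len(self._records)


class InspectionLayer:
    """Evaluates the request, writes the audit record, and returns only the decision."""

    def __init__(self, store: AuditStore, write_credential: bytes):
        self._store = store
        self._credential = write_credential

    def handle(self, request: dict) -> dict:
        # Toy policy for the sketch: block anything flagged as PHI.
        decision = "blocked" if "PHI" in request.get("signals", []) else "passed"
        self._store.append(
            self._credential,
            {"route": request["route"], "decision": decision},
        )
        # Only the decision travels back to the application, never the record.
        return {"X-Policy-Decision": decision}


credential = b"inspection-layer-only"
store = AuditStore(credential)
layer = InspectionLayer(store, credential)

headers = layer.handle({"route": "/v1/chat", "signals": ["PHI"]})
print(headers)  # {'X-Policy-Decision': 'blocked'}
```

An application that tries to call `store.append` with a guessed credential gets a `PermissionError`, which mirrors the point that the application never holds the write path.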

The architectural pattern matches what financial-services regulators have required for transaction logs for decades and what healthcare regulators have required for PHI access logs since the HIPAA Security Rule took effect.

The storage characteristics

The record store has to support three retrieval workflows. The first is per-record lookup: given a request identifier, return the full record with the signature verification. The second is per-series replay: given a date range and a filter (natural person, route, model), return the matching records in order with the chain verification. The third is anchored verification: given any record, prove that the record predates a specific anchor commit.
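The first two workflows can be sketched with an in-memory store. This is a toy model under stated assumptions: the flat record layout and the field names (`request_id`, `timestamp_ms`, `natural_person`, `route`) are hypothetical, and signature and chain verification are left out to keep the retrieval logic visible. Anchored verification is omitted here for the same reason.

```python
from typing import Iterator, Optional


class AuditStore:
    def __init__(self):
        self._records = []        # append-only, kept in commit order
        self._by_request_id = {}  # index for per-record lookup

    def commit(self, record: dict) -> None:
        self._by_request_id[record["request_id"]] = len(self._records)
        self._records.append(record)

    def lookup(self, request_id: str) -> Optional[dict]:
        """Per-record lookup: return the record for a request identifier, if any."""
        pos = self._by_request_id.get(request_id)
        return None if pos is None else self._records[pos]

    def replay(self, start_ms: int, end_ms: int, **filters) -> Iterator[dict]:
        """Per-series replay: matching records in commit order, within [start_ms, end_ms)."""
        for rec in self._records:
            if not (start_ms <= rec["timestamp_ms"] < end_ms):
                continue
            if all(rec.get(k) == v for k, v in filters.items()):
                yield rec


store = AuditStore()
store.commit({"request_id": "r1", "timestamp_ms": 1000, "natural_person": "alice", "route": "/chat"})
store.commit({"request_id": "r2", "timestamp_ms": 2000, "natural_person": "bob", "route": "/chat"})
store.commit({"request_id": "r3", "timestamp_ms": 3000, "natural_person": "alice", "route": "/search"})

print(store.lookup("r2")["natural_person"])                                      # bob
print([r["request_id"] for r in store.replay(0, 5000, natural_person="alice")])  # ['r1', 'r3']
```

A production store would back `lookup` and `replay` with persistent indexes, but the two access patterns stay the same: one keyed by request identifier, one ordered by time with filters.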

Storage cost scales with record volume. A production AI deployment that processes 100,000 requests per day produces about 36.5 million records per year. At 2 KB per record (including the canonical serialization, the signature, and the chain pointer), the annual storage requirement is roughly 73 GB. Object storage at archival pricing handles this volume comfortably. Active-query indexes (the per-record lookup index and the per-series replay index) sit in a faster tier.
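The sizing arithmetic is easy to reproduce and adapt to other volumes. The sketch below assumes a 365-day year and decimal units (1 KB = 1,000 bytes, 1 GB = 10^9 bytes); with those assumptions the result is consistent with the figures above after rounding. The function name is hypothetical.

```python
def annual_storage(requests_per_day: int, bytes_per_record: int, days_per_year: int = 365):
    """Return (records per year, gigabytes per year) using decimal units."""
    records = requests_per_day * days_per_year
    gigabytes = records * bytes_per_record / 1_000_000_000
    return records, gigabytes


records, gigabytes = annual_storage(requests_per_day=100_000, bytes_per_record=2_000)
print(records)    # 36500000
print(gigabytes)  # 73.0
```

The estimate covers the records themselves and excludes the active-query indexes, which depend on the indexing technology chosen.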

Retention follows the longest applicable regulatory requirement. EU AI Act Article 19 sets a retention floor of at least six months for the automatically generated logs of high-risk systems; in practice, the obligations under Articles 12, 16, and 17 typically push retention to the lifetime of the system and beyond. HIPAA expects six years. SOC 2 Type II reads against the audit period plus the look-back. The default retention setting for the record series is the maximum of the applicable regimes.
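The "maximum of the applicable regimes" rule can be written down directly. The table below is a sketch: the regime keys are hypothetical labels, six months is approximated as 183 days, and an open-ended obligation such as "lifetime of the system" is modeled as infinity. Which regimes apply to a given deployment is a legal determination that this code does not make.

```python
import math

# Assumed retention periods in days, taken from the regimes named in the text.
RETENTION_DAYS = {
    "eu_ai_act_article_19": 183,        # six-month floor, approximated
    "hipaa": 6 * 365,                   # six years
    "high_risk_lifetime": math.inf,     # lifetime of the system: no fixed end date
}


def default_retention_days(applicable: list[str]) -> float:
    """Return the longest retention period among the applicable regimes."""
    if not applicable:
        raise ValueError("at least one regime must apply")
    return max(RETENTION_DAYS[name] for name in applicable)


print(default_retention_days(["eu_ai_act_article_19", "hipaa"]))  # 2190
```

A result of `math.inf` signals that the series should not be given an expiry at all until the system is retired and the post-retirement period has been determined.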

The retrieval API

The audit store exposes a retrieval API for the auditor workflows. The API accepts a query by request identifier, by natural-person identifier, by route, by model, by date range, by policy decision class, and by classifier signal. The response carries the records with the signature verification metadata so the auditor's tooling can verify the records independently.

The same API supports the regulator disclosure workflow. A regulator requesting an Article 12 traceability artifact for a specific decision receives the per-decision record with the signature verification and the chain proof. A regulator requesting a series for an incident investigation receives the filtered series with the chain verification across the segment.

DeepInspect

This is the gap DeepInspect closes for AI audit logs. DeepInspect sits inline between the calling application and any HTTP LLM endpoint. For every request, DeepInspect captures the identity, request, model, and policy field groups, signs the canonical serialization with a key the application cannot access, commits the record to a write-path-independent store, and chains the record to the prior record in the series. Periodic anchoring commits the chain head to an external trust anchor. The retrieval API surfaces the records to the auditor with the verification metadata.

The architecture covers the OpenAI, Anthropic, Vertex, and Bedrock endpoints, the agent frameworks built on top, and the retrieval pipelines the agents consume. New fields added to the spec arrive as a new schema version: format upgrades preserve the prior records and add fields without modifying the existing ones, and the integrity verification flow handles the schema version explicitly.

If you are facing an EU AI Act Article 12, DORA Article 19, or Fannie Mae LL-2026-04 review and the audit record format is the gap, let's talk.

Frequently asked questions

Why does the natural-person identifier matter so much for EU AI Act Article 12 traceability?

The Article 12 language requires logs sufficient to ensure traceability of the AI system's functioning. The traceability test the auditor applies asks "given this AI decision, who was the human responsible?" A record that carries only the agent identity (the application that called the model) fails the test because the agent serves many users and the auditor cannot identify the responsible human. A record that carries the natural-person identifier closes the loop. The traceability requirement is the load-bearing reason DeepInspect attaches identity context at the request boundary rather than relying on the application's log.

How does the hash chain catch retroactive modification of a record?

Each record carries a pointer to the cryptographic hash of the prior record. A modification to any record in the series changes that record's hash. The next record was committed against the original hash, so its chain pointer fails to verify, and because each pointer is itself part of the hashed content, hiding the change would require rewriting every subsequent record. The chain catches retroactive insertion (the inserted record breaks the next record's chain check), deletion (the next record points to a hash that no longer exists), and modification (the modified record's hash no longer matches the pointer in the next record). External anchoring fixes the state of the chain at each anchor commit, so records anchored before an attacker gained access cannot be rewritten without contradicting the anchor.

What goes into the policy bundle version field and why does the field matter for the audit record?

The policy bundle version is the version identifier of the policy set that evaluated the request. The bundle includes the per-route rules, the per-role rules, the data-class classifications, the per-tool authorization map, and the response classifier signatures. The version field on the record lets an auditor reconstruct the policy state that applied at the moment of decision. A regulator investigating a months-old decision asks "what policy was in effect when this decision fired?" The version field plus a policy-bundle archive answers the question deterministically.

Can application logs be used instead of a write-path-independent audit store?

Application logs fail the write-path independence test that regulators apply. The reasoning is structural: the application is the entity whose AI behavior the log evidences, and a log the entity controls is a self-attestation rather than an independent record. The pattern matches what financial-services regulators have required for transaction logs and what healthcare regulators have required for PHI access logs. Application logs are useful for application-side debugging. They are not sufficient evidence for the EU AI Act Article 12, DORA Article 19, or Fannie Mae LL-2026-04 reviews.

What is the storage and retention cost for a production AI deployment?

A deployment processing 100,000 AI requests per day produces about 36.5 million records per year at roughly 2 KB per record (canonical serialization plus signature plus chain pointer). Annual storage runs at roughly 73 GB. Object storage at archival pricing handles this volume at well under $100 per year for the cold storage tier. Active-query indexes for the per-record lookup and per-series replay workflows sit in a faster tier at higher cost. Retention follows the longest applicable regulatory requirement: six months under EU AI Act Article 19 as a floor, six years under HIPAA, the lifetime of the system under Article 12 for high-risk deployments.