AI compliance reporting automation: turning per-decision audit records into board-ready evidence
AI compliance reporting automation turns the per-decision audit records the inspection layer writes into three artifacts the auditor, the control owner, and the board each consume. The raw log substrate covers EU AI Act Article 12 and DORA Article 19. The per-control evidence summary covers SOC 2 TSC and NIST AI RMF MEASURE. The board KPI rolls up to a single page. The three-layer stack is the automation target.

The per-decision audit record the inspection layer writes carries the verified identity, the policy version, the classification result, and the decision outcome. That record is the raw substrate. The auditor pulling EU AI Act Article 12 evidence wants a query against the substrate scoped to a model and a date range. The control owner preparing the SOC 2 walkthrough wants a per-control summary that names the population, the sampling method, and the exception count. The board reviewing AI risk wants a single KPI page. Three readers, three artifacts, one underlying record set. The automation target is the pipeline that produces all three from the same source.
I want to walk through the three-layer report stack, how the EU AI Act, NIST AI RMF, and SOC 2 evidence asks each map to a layer, what the control mapping registry contains, what the evidence freshness SLA enforces, and how the auditor-pull endpoint exposes the data without giving the auditor production access.
The three-layer report stack
The substrate sits at the bottom: every AI request sent over HTTP produces a signed record committed before the model returns. The record carries about 30 fields including identity, role, agent identity where applicable, classification, model, policy version, decision, and signature. The substrate is the inspection-layer output.
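A minimal sketch of what one substrate record and its signature could look like. The field names, the demo key, and the use of HMAC-SHA256 are assumptions made for illustration; the actual record has about 30 fields, and the real signature scheme is not specified here.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass, replace
from typing import Optional

# Hypothetical key for illustration only; a real deployment would use a managed key.
SIGNING_KEY = b"demo-signing-key"


@dataclass(frozen=True)
class DecisionRecord:
    request_id: str
    identity: str
    role: str
    agent_identity: Optional[str]  # set only when an agent made the call
    classification: str
    model: str
    policy_version: str
    decision: str   # e.g. "allow", "redact", "block"
    timestamp: str  # ISO 8601, UTC


def canonical_bytes(record: DecisionRecord) -> bytes:
    # Sorted keys and fixed separators give a stable byte string to sign.
    return json.dumps(asdict(record), sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(record: DecisionRecord, key: bytes = SIGNING_KEY) -> str:
    # HMAC-SHA256 stands in for whatever signature scheme the substrate uses.
    return hmac.new(key, canonical_bytes(record), hashlib.sha256).hexdigest()


def verify(record: DecisionRecord, signature: str, key: bytes = SIGNING_KEY) -> bool:
    return hmac.compare_digest(sign(record, key), signature)


record = DecisionRecord(
    "req-1", "alice@example.com", "analyst", None,
    "pii", "model-x", "policy-v7", "block", "2025-01-01T00:00:00Z",
)
signature = sign(record)
tampered = replace(record, decision="allow")  # any field change invalidates the signature
```

Signing a canonical serialization means a changed field, such as a flipped decision, no longer verifies against the stored signature.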
The middle layer is per-control evidence. Each control in the SOC 2 trust services criteria, the NIST AI RMF MEASURE function, or the EU AI Act Article 12 logging mandate maps to one or more queries against the substrate. The query returns a population, an exception count, a sampling method, and a freshness timestamp. The control owner uses the summary in the walkthrough.
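The shape of a per-control query result can be sketched as follows. This assumes records are plain dictionaries and that each control supplies its own exception predicate; the function and key names are hypothetical.

```python
from datetime import datetime, timezone


def control_summary(records, is_exception, sampling_method="full population"):
    """Summarize one control: population, exceptions, sampling method, freshness."""
    population = len(records)
    exception_count = sum(1 for r in records if is_exception(r))
    return {
        "population": population,
        "exception_count": exception_count,
        "sampling_method": sampling_method,
        # Freshness timestamp: when this summary was computed, in UTC.
        "generated_at": datetime.now(timezone.utc).isoformat(),
    }


# Hypothetical records: one request failed identity verification.
sample_records = [
    {"request_id": "r1", "identity_verified": True},
    {"request_id": "r2", "identity_verified": False},
    {"request_id": "r3", "identity_verified": True},
]
summary = control_summary(sample_records, lambda r: not r["identity_verified"])
```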
The top layer is the board KPI. The KPI rolls up control health into a single number per quarter. The CRO presents the number. The supporting evidence chain traces back to the substrate.
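How control health collapses into one number is a design choice the stack leaves open. One possible roll-up, sketched under the assumption that a control counts as healthy when it has zero exceptions and is within its freshness SLA (the `within_sla` key is hypothetical):

```python
def board_kpi(control_summaries):
    """Percentage of controls that are healthy, rounded to one decimal place."""
    if not control_summaries:
        raise ValueError("no controls to roll up")
    healthy = sum(
        1 for s in control_summaries
        if s["exception_count"] == 0 and s["within_sla"]
    )
    return round(100.0 * healthy / len(control_summaries), 1)


quarter = [
    {"control_id": "CC6.1", "exception_count": 0, "within_sla": True},
    {"control_id": "CC7.2", "exception_count": 0, "within_sla": True},
    {"control_id": "CC7.3", "exception_count": 2, "within_sla": True},
    {"control_id": "MEASURE 2", "exception_count": 0, "within_sla": True},
]
kpi = board_kpi(quarter)
```

Because each input is a per-control summary, the number stays traceable to the controls that pulled it down.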
EU AI Act Article 12 and Article 26 mapping
Article 12 requires high-risk AI systems to support automatic recording of events (logs) over the lifetime of the system. The logs must enable the identification of situations that may result in the system presenting a risk within the meaning of Article 79(1) or in a substantial modification, facilitate the post-market monitoring under Article 72, and support the monitoring of operation that Article 26(5) asks of deployers. Article 26 extends the obligation to deployers: under Article 26(6), the deployer must keep the logs the system generates for at least six months, to the extent those logs are under its control.
The automation pipeline extracts the Article 12 evidence pack as a JSON file. The pack contains the model identifier, the date range, the decision count, the redaction count, the block count, and the substantial-modification event list. The auditor pulls the pack against a scoped read endpoint. The query returns in under five seconds for a 30-day range against a model handling 100,000 daily requests.
NIST AI RMF MEASURE function reporting
The NIST AI Risk Management Framework MEASURE function asks the organization to track AI risks and trustworthiness characteristics with metrics. The MEASURE 2 category covers evaluation of the system across the trustworthiness characteristics. The MEASURE 3 category covers mechanisms for tracking identified AI risks over time.
The per-control evidence summary produces a MEASURE 2 metric pack: classification accuracy on prompts containing the sensitive-data taxonomy, false-positive rate on the block decisions, false-negative rate measured against the labeled regression set, and policy-version coverage on the population. The MEASURE 3 metric pack produces a time series of the same metrics by week, plus the incident count tagged with the AI RMF risk taxonomy. The summary is the artifact the NIST AI RMF lead presents at the quarterly governance review.
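The MEASURE 2 rates can be computed from a labeled regression set along these lines. The sketch assumes each labeled item is a pair of booleans (should the request have been blocked, was it blocked), and it uses the standard definitions: false-positive rate as FP / (FP + TN), false-negative rate as FN / (FN + TP).

```python
def measure2_metrics(labeled):
    """labeled: iterable of (should_block, did_block) boolean pairs."""
    tp = fp = tn = fn = 0
    for should_block, did_block in labeled:
        if should_block and did_block:
            tp += 1
        elif should_block and not did_block:
            fn += 1
        elif not should_block and did_block:
            fp += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else None,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else None,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else None,
    }


# Hypothetical regression set: 8 correct blocks, 2 misses, 1 wrong block, 9 correct allows.
regression_set = (
    [(True, True)] * 8 + [(True, False)] * 2
    + [(False, True)] * 1 + [(False, False)] * 9
)
metrics = measure2_metrics(regression_set)
```

Running the same function over weekly slices of the labeled set would give the time series the MEASURE 3 pack describes.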
SOC 2 control evidence pull
The SOC 2 trust services criteria for the security category include CC6.1 logical access, CC7.2 system monitoring, and CC7.3 security event evaluation. Applied to AI traffic, CC6.1 maps to the per-request identity verification on the substrate. CC7.2 maps to the policy decision and the classification outcome. CC7.3 maps to the block decisions and the incident tickets they generated.
The automation pulls a CC6.1 evidence file containing the count of decisions, the count of identity verification failures, and the population definition. The auditor receives the file with the signature chain that ties each row back to the substrate. The walkthrough takes 20 minutes instead of two days because the control owner does not assemble screenshots from five systems. The single source is the substrate.
Control mapping registry, freshness SLA, auditor-pull endpoint
The control mapping registry is the YAML file that ties each control identifier to its query. The registry has one entry per control. The entry names the framework (EU AI Act Article 12, NIST AI RMF MEASURE 2, SOC 2 CC6.1), the control identifier, the query against the substrate, the freshness SLA in hours, and the output schema.
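The registry itself is YAML; the sketch below mirrors its entries as a Python structure to show the five fields and a lookup by control identifier. The query names and schema names are hypothetical placeholders, and `None` is used here as an assumed way to express an on-demand refresh.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ControlMapping:
    framework: str
    control_id: str
    query: str                           # name of the query against the substrate
    freshness_sla_hours: Optional[int]   # None means the evidence refreshes on demand
    output_schema: str


REGISTRY = [
    ControlMapping("SOC 2", "CC6.1", "identity_verification_summary", 24, "control_summary_v1"),
    ControlMapping("NIST AI RMF", "MEASURE 2", "measure2_metric_pack", 168, "metric_pack_v1"),
    ControlMapping("EU AI Act", "Article 12", "article12_evidence_pack", None, "evidence_pack_v1"),
]


def lookup(control_id: str) -> ControlMapping:
    # One entry per control: anything other than exactly one match is an error.
    matches = [m for m in REGISTRY if m.control_id == control_id]
    if len(matches) != 1:
        raise KeyError(f"expected exactly one registry entry for {control_id!r}")
    return matches[0]
```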
The freshness SLA is the maximum age the evidence may have. SOC 2 control summaries refresh every 24 hours. EU AI Act Article 12 evidence packs refresh on demand. NIST AI RMF MEASURE metrics refresh weekly. The pipeline alarms when a control falls out of SLA. The control owner sees the alarm before the auditor arrives.
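The SLA check reduces to comparing the age of the evidence with the allowed maximum. A minimal sketch, assuming timezone-aware timestamps and treating on-demand evidence (an SLA of `None`) as never stale; the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def out_of_sla(last_refreshed, sla_hours, now=None):
    """Return True when the evidence is older than its freshness SLA."""
    if sla_hours is None:
        # On-demand evidence is regenerated at pull time, so it has no standing deadline.
        return False
    now = now or datetime.now(timezone.utc)
    return now - last_refreshed > timedelta(hours=sla_hours)


check_time = datetime(2025, 1, 2, 12, 0, tzinfo=timezone.utc)
stale = out_of_sla(datetime(2025, 1, 1, 11, 0, tzinfo=timezone.utc), 24, check_time)  # 25 hours old
fresh = out_of_sla(datetime(2025, 1, 1, 13, 0, tzinfo=timezone.utc), 24, check_time)  # 23 hours old
```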
The auditor-pull endpoint is the scoped read endpoint that exposes the evidence files. The auditor authenticates against a separate IdP role. The role permits read on the evidence files for a defined date range. The role does not permit production access. Every auditor pull writes its own audit record.
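The scoping rule and the pull logging can be sketched together. This assumes the IdP role resolves to an auditor identifier plus an allowed date range; the in-memory `PULL_LOG` list is a stand-in for wherever pull records are actually written.

```python
from dataclasses import dataclass
from datetime import date, datetime, timezone


@dataclass(frozen=True)
class AuditorScope:
    auditor_id: str
    start: date
    end: date


PULL_LOG = []  # stand-in for the audit record store


def authorize_pull(scope, requested_start, requested_end):
    """Allow the pull only if the requested range sits inside the scoped range."""
    allowed = scope.start <= requested_start <= requested_end <= scope.end
    # Every attempt is recorded, whether or not it is allowed.
    PULL_LOG.append({
        "auditor_id": scope.auditor_id,
        "requested": (requested_start.isoformat(), requested_end.isoformat()),
        "allowed": allowed,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return allowed


scope = AuditorScope("auditor-1", date(2025, 1, 1), date(2025, 3, 31))
inside = authorize_pull(scope, date(2025, 2, 1), date(2025, 2, 28))
outside = authorize_pull(scope, date(2025, 3, 15), date(2025, 4, 15))
```

Logging denied attempts as well as allowed ones keeps the access trail complete.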
Sample evidence-pack structure
The Article 12 evidence pack carries a fixed schema. The pack is signed against the same key chain the substrate uses.
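A hedged sketch of how such a pack could be assembled and signed. The field names follow the contents listed earlier (model identifier, date range, decision, redaction, and block counts, substantial-modification events) but are otherwise assumptions, as are the decision labels. HMAC-SHA256 stands in for the real key chain.

```python
import hashlib
import hmac
import json


def build_article12_pack(model_id, start, end, records, modification_events, key):
    """Assemble the pack body from substrate records, then sign its canonical JSON."""
    body = {
        "framework": "EU AI Act Article 12",
        "model_id": model_id,
        "date_range": {"start": start, "end": end},
        "decision_count": len(records),
        "redaction_count": sum(1 for r in records if r["decision"] == "redact"),
        "block_count": sum(1 for r in records if r["decision"] == "block"),
        "substantial_modification_events": list(modification_events),
    }
    payload = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {**body, "signature": signature}


# Hypothetical inputs for illustration.
demo_records = [
    {"decision": "allow"}, {"decision": "redact"},
    {"decision": "block"}, {"decision": "allow"},
]
pack = build_article12_pack(
    "model-x", "2025-01-01", "2025-01-30",
    demo_records, ["policy-v7 to policy-v8"], b"demo-signing-key",
)
print(json.dumps(pack, indent=2))
```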
DeepInspect
This is the substrate the three-layer report stack depends on. DeepInspect sits at the AI request boundary as a stateless proxy between authenticated users or agents and any LLM endpoint. Every HTTP request produces a per-decision audit record committed before the model response returns. The record carries the verified identity, the role and agent context, the data classification applied to the prompt, the model and version called, the policy version, the decision outcome, and a cryptographic signature.
The control mapping registry, the evidence freshness SLA, and the auditor-pull endpoint operate against the substrate. The EU AI Act Article 12 evidence pack, the NIST AI RMF MEASURE metric pack, and the SOC 2 CC6.1 control summary each query the same record set. The auditor sees a single signature chain. The control owner sees a single source. The board sees a single KPI.
Book a demo today.
Frequently asked questions
- How does the evidence pack differ from a raw log export?
The raw log export is the unfiltered substrate scoped to a date range. The evidence pack is the curated artifact mapped to a specific framework requirement. The pack carries the metadata the auditor expects: the framework reference, the control identifier, the population definition, the sampling method where applicable, and the freshness timestamp. The auditor accepts the pack without needing to interpret the substrate schema. The signature chain ties the pack back to the substrate, so the substrate remains the source of truth.
- What does the freshness SLA actually buy?
The freshness SLA is the maximum age the evidence may have when the auditor pulls it. The SLA prevents the situation where the control owner produces evidence from last quarter for a current-period audit. The pipeline alarms when a control falls out of SLA, typically 24 hours for SOC 2 control summaries and 168 hours for NIST AI RMF MEASURE weekly metrics. The control owner remediates before the auditor sees stale data. The alarm itself is logged.
- Why does the auditor pull through a scoped endpoint instead of getting a database read?
The scoped endpoint enforces three properties the database read does not. First, the auditor's role permits read only on the evidence files for the scoped date range, not on production. Second, every pull writes its own audit record, so the access is itself auditable. Third, the endpoint returns the signed pack, not the raw substrate, which prevents the auditor from accidentally seeing data outside the audit scope. The endpoint also rate-limits the pulls, which protects production load.
- Does this work for SOC 2 Type II or only Type I?
The three-layer stack supports both. SOC 2 Type I confirms the design of the controls at a point in time. The control mapping registry and the auditor-pull endpoint demonstrate the design. SOC 2 Type II confirms the operating effectiveness over a period, typically six or twelve months. The freshness SLA, the population definition, and the time-series metrics in the per-control evidence summary cover the period. The substrate retains the records over the audit window, which Article 12 and DORA Article 19 retention rules already require.
- How does this map to ISO 42001?
ISO/IEC 42001 is the standard for an AI management system (AIMS). The control mapping registry adds ISO/IEC 42001 as another framework alongside EU AI Act, NIST AI RMF, and SOC 2. The same substrate covers the AIMS clauses on operational planning and control (clause 8), on performance evaluation (clause 9), and on improvement (clause 10). The evidence pack format extends with the AIMS clause reference. The auditor-pull endpoint and the freshness SLA work the same way.