AI Governance Framework: The Operational Layers Between Policy Documents and the Audit Record

An AI governance framework that survives an audit has three operational layers: a policy layer that names what the program will and will not do, an enforcement layer that binds the policy to production traffic, and a record layer that produces the per-decision evidence. This piece walks through each layer, what artifacts each one produces, and how the layers map to EU AI Act Article 12, NIST AI RMF, and ISO 42001.

By Parminder Singh · Founder & CEO, DeepInspect Inc.
Compliance & Regulation · ai-governance-framework, ai-governance, eu-ai-act, nist-ai-rmf, iso-42001

An AI governance framework that holds up under audit has three operational layers. The first is the policy layer: the documents that name what the program will and will not do with AI, what data categories the policy permits, and what decisions trigger human review. The second is the enforcement layer: the technical placement that binds the policy to production AI traffic. The third is the record layer: the per-decision evidence series that lets the auditor reconstruct what the system did at a specific moment.

Most programs I see have the policy layer in good shape. Most fail the enforcement and record layers. The policy is written, signed, and circulated, but the production AI traffic does not pass through any inspection point that evaluates the policy and produces the per-decision record. The auditor's question (show me the records for a sample of high-risk decisions) does not have an answer.

Layer one: policy

The policy layer answers what. What AI systems the program operates, what data categories the AI can process, what decisions the AI can make autonomously, what decisions require human review, which users can access which models, what records the program retains, and for how long. The artifacts on the policy layer:

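A minimal sketch of how the policy-layer artifacts could be written down in machine-readable form, so the other two layers can reference them by version. Every name here (`AIPolicy`, `DataCategory`, the field names, the example values) is an assumption made for illustration, not a format prescribed by any of the regulations.

```python
from dataclasses import dataclass
from enum import Enum


class DataCategory(Enum):
    """Hypothetical data categories a policy might name."""
    PUBLIC = "public"
    INTERNAL = "internal"
    PII = "pii"
    PHI = "phi"


@dataclass(frozen=True)
class AIPolicy:
    """One versioned policy, reduced to the questions the policy layer answers."""
    version: str                                   # referenced by each audit record
    systems: frozenset[str]                        # AI systems the program operates
    permitted_categories: frozenset[DataCategory]  # data the AI may process
    autonomous_decisions: frozenset[str]           # decisions the AI may make alone
    human_review_decisions: frozenset[str]         # decisions that need a reviewer
    model_access: dict[str, frozenset[str]]        # role -> models the role may call
    retention_days: int                            # how long records are kept

    def permits(self, category: DataCategory) -> bool:
        return category in self.permitted_categories

    def requires_review(self, decision: str) -> bool:
        return decision in self.human_review_decisions


# Illustrative values only.
policy = AIPolicy(
    version="2025-01",
    systems=frozenset({"support-assistant"}),
    permitted_categories=frozenset({DataCategory.PUBLIC, DataCategory.INTERNAL}),
    autonomous_decisions=frozenset({"draft_reply"}),
    human_review_decisions=frozenset({"credit_decision"}),
    model_access={"support": frozenset({"general-llm"})},
    retention_days=183,
)
```

Keeping the version string on the policy object is what later lets a record point back at the exact policy it was decided under.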

The policy layer maps directly to the EU AI Act Article 9 risk management system requirement, the NIST AI RMF GOVERN function, and the ISO 42001 management system clauses. The policy layer is the foundation but it is not the audit record. The auditor reads the policy to understand what the program intends, then samples the record layer to confirm the intent shows up in production.

Layer two: enforcement

The enforcement layer answers how. How the policy gets bound to production AI traffic, where the decision happens, what blocks at the boundary versus what flows through with monitoring. The artifacts on the enforcement layer:

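As a rough illustration of what binding policy to traffic can mean in code, the sketch below evaluates one request against three hypothetical rules and returns either allow, block, or escalate together with the basis for the decision. The `Request` shape, the rule tables, and the rule order are assumptions; identity verification and prompt classification are taken as already done upstream.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ALLOW = "allow"        # flows through with monitoring
    BLOCK = "block"        # stopped at the boundary
    ESCALATE = "escalate"  # held for human review


@dataclass(frozen=True)
class Request:
    user: str             # identity asserted by the IdP (assumed already verified)
    role: str
    model: str            # target LLM endpoint
    categories: set[str]  # labels from a prompt classifier (not shown)
    decision_type: str    # what the AI is being asked to decide


# Hypothetical policy values; a real program would load these from the policy layer.
BLOCKED_CATEGORIES = {"phi"}
REVIEW_DECISIONS = {"credit_decision"}
MODEL_ACCESS = {"support": {"general-llm"}}


def evaluate(req: Request) -> tuple[Action, str]:
    """Return the action and its basis, checking the strictest rules first."""
    if req.model not in MODEL_ACCESS.get(req.role, set()):
        return Action.BLOCK, f"role {req.role!r} may not call {req.model!r}"
    hit = req.categories & BLOCKED_CATEGORIES
    if hit:
        return Action.BLOCK, f"blocked data category: {sorted(hit)}"
    if req.decision_type in REVIEW_DECISIONS:
        return Action.ESCALATE, "decision type requires human review"
    return Action.ALLOW, "no policy rule matched"
```

Returning the basis alongside the action matters because the record layer needs to store why a request was blocked, not only that it was.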

The enforcement layer is where most governance programs collapse during audit. The policy says PHI cannot leave the corporate boundary into a SaaS LLM. The enforcement layer is supposed to bind that policy to the LLM request path. If the binding is missing, the policy is aspirational and the audit record reflects the aspiration rather than the practice.

The enforcement layer maps to the NIST AI RMF MEASURE and MANAGE functions: the inspection point produces the measurement and the policy engine produces the management action. It maps to EU AI Act Article 14 on human oversight (the policy decides when to escalate to a human reviewer) and Article 26 on deployer obligations (the deployer operates the system in accordance with the instructions for use).

Layer three: record

The record layer answers when, who, what, and on what basis. The per-decision audit record carries the fields the auditor samples:

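One way to picture such a record is as a flat structure with one field per question. The sketch below is an assumption about what those fields might be called, not a schema taken from any regulation or product; it serializes each record as a single JSON line.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    timestamp: str         # when: UTC, ISO 8601
    user_id: str           # who: authenticated identity
    model: str             # what: the LLM endpoint called
    categories: list[str]  # what: classification of the prompt content
    decision: str          # what: allow / block / escalate
    basis: str             # on what basis: the rule that fired
    policy_version: str    # on what basis: policy in force at decision time


def make_record(user_id, model, categories, decision, basis, policy_version, now=None):
    """Build a record; `now` can be injected so the function is testable."""
    now = now or datetime.now(timezone.utc)
    return AuditRecord(now.isoformat(), user_id, model, sorted(categories),
                       decision, basis, policy_version)


def to_json_line(record: AuditRecord) -> str:
    """Serialize one record as a single JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)
```

Carrying all of these fields on one record is the point: the auditor should not have to join application, provider, and network logs to reconstruct a single decision.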

The record series sits in an access-controlled store, encrypted at rest, and is retained for at least as long as the applicable regulation expects (at least six months under EU AI Act Article 26(6), seven years under common financial-services audit obligations, and six years under the HIPAA documentation requirement at 45 CFR 164.316(b)(2)(i), which some organizations extend to ten).
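When a record falls under more than one regime, the longest minimum wins. A small sketch of that arithmetic, with hypothetical regime keys and day counts that only approximate the six-month and seven-year periods mentioned above:

```python
from datetime import date, timedelta

# Minimum retention per regime, in days. Keys and values are illustrative.
MIN_RETENTION_DAYS = {
    "eu_ai_act_art_26_6": 183,      # at least six months
    "financial_services": 7 * 365,  # seven years
}


def retention_days(regimes) -> int:
    """A record subject to several regimes is kept for the longest minimum."""
    return max(MIN_RETENTION_DAYS[r] for r in regimes)


def earliest_deletion(created: date, regimes) -> date:
    """First date on which the record may be deleted."""
    return created + timedelta(days=retention_days(regimes))
```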

The record layer is what the audit references. NIST AI RMF MEASURE outputs cite the record series. EU AI Act Article 12 and Article 19 require that the record series exist with the specific fields. ISO 42001 management review references the record series as evidence of the system's operational behavior.

How the layers tie together

A governance framework that ships as policy alone produces a document on a shelf. A framework that ships as enforcement alone catches things the policy never named. A framework that ships as records alone tells the auditor what happened without telling them whether the program intended what happened. The three layers have to ship together.

The flow:

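One plausible ordering of that flow, sketched as a single function with each step passed in as a callable. The step names and the message returned on a non-allow decision are assumptions; the ordering follows the sequence described for the inspection point later in this piece, where the record is committed before the response returns.

```python
from typing import Callable


def handle(prompt: str, token: str, *,
           authenticate: Callable[[str], str],
           classify: Callable[[str], set],
           evaluate: Callable[[str, set], str],
           commit: Callable[[dict], None],
           forward: Callable[[str], str],
           policy_version: str) -> str:
    user = authenticate(token)             # 1. bind the request to an identity
    categories = classify(prompt)          # 2. classify the prompt content
    decision = evaluate(user, categories)  # 3. evaluate policy
    commit({                               # 4. commit the record first
        "user": user,
        "categories": sorted(categories),
        "decision": decision,
        "policy_version": policy_version,
    })
    if decision != "allow":
        return f"request not forwarded: {decision}"
    return forward(prompt)                 # 5. only then call the LLM
```

Because the record is written before the branch, blocked and escalated requests leave the same evidence as allowed ones.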

Each layer has an artifact the auditor receives: the policy artifact (the document set), the enforcement artifact (the configuration of the inspection point and the policy engine), and the record artifact (the per-decision series). The three artifacts cross-reference each other: the policy points at the categories the enforcement evaluates; the enforcement points at the record series it produces; the record series points back at the policy version it decided under.
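The last of those cross-references is easy to test mechanically. A minimal sketch, assuming records are dictionaries with a hypothetical `policy_version` key and that the program keeps a list of published policy versions:

```python
def unresolved_policy_versions(records, known_versions):
    """Return the records whose policy_version does not match a published policy."""
    known = set(known_versions)
    return [r for r in records if r.get("policy_version") not in known]
```

An empty result means every sampled record can be traced back to a policy document; anything else is a gap to explain before the auditor finds it.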

Mapping to specific regulatory regimes

EU AI Act Chapter III requires all three layers for high-risk systems. Article 9 covers the policy layer (risk management). Articles 12-19 cover the enforcement and record layers (the system has to produce automatically generated records sufficient to ensure traceability). Article 26 binds the deployer to operate the system in accordance with instructions and retain the records.

NIST AI RMF covers all three layers through its four functions. GOVERN covers the policy layer. MAP covers the inventory and risk categorization that feeds policy. MEASURE covers the metrics the enforcement and record layers produce. MANAGE covers the loop between the records, the metrics, and the policy refinement.

ISO 42001 maps to a management-system structure across the same three layers. Clauses 5-7 cover leadership and the AI policy, planning, and support (including resources). Clause 8 covers operation, which includes the enforcement layer. Clauses 9-10 cover performance evaluation and improvement, which reference the record layer.

HIPAA Security Rule 45 CFR 164.308-164.312 covers administrative, physical, and technical safeguards. The audit control requirement under 164.312(b) maps directly to the record layer. The access management requirements under 164.308(a)(4) map to the enforcement layer's identity binding.

Where most programs land

The programs I talk to typically have a policy layer in good shape because the policy work was the obvious starting point for the governance program. They typically have a gap at the enforcement layer because the production AI traffic does not pass through any inspection point that evaluates the policy. They typically have a gap at the record layer because the records that exist (application logs, model provider logs, network logs) do not carry the fields the regulation expects on the same series.

The path forward for most programs is to close the enforcement and record gaps at the same time by placing the inspection layer at the HTTP request boundary between authenticated users or agents and the LLM endpoint. The placement supplies both layers because the same inspection point that evaluates the policy is the one that commits the record.

DeepInspect

DeepInspect is the enforcement and record layer of the framework. The proxy sits inline between authenticated users or agents and any LLM, terminates TLS at the inspection layer, authenticates against the corporate IdP, classifies the prompt content against the policy categories, evaluates policy against identity and classification, and commits a per-decision audit record before the response returns. The records carry the fields EU AI Act Article 12, NIST AI RMF MEASURE, and ISO 42001 performance evaluation reference.

For organizations that have the policy layer in good shape and need the enforcement and record layers, the proxy placement closes both gaps at the same time. The policy is what the program intends; the record is what the program produces; the inspection point is what binds the two together.

If you are facing the August deadline, let's talk.

Frequently asked questions

How does the framework handle agentic AI?

Agentic AI traffic adds a layer to the identity binding: the agent identity plus the originating user identity if the agent operates on behalf of a person. The enforcement layer treats agent traffic the same way as user traffic, with the agent's service identity authenticated against the IdP and the prompt classification running against the agent's assembled prompt. The record series carries both identity fields.
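A small sketch of that two-field identity binding. The claim names (`agent_id`, `on_behalf_of`, `user_id`) are hypothetical stand-ins for whatever the corporate IdP actually issues, and the claims are assumed to have been verified already.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Principal:
    agent_id: Optional[str]  # service identity of the agent, if any
    user_id: Optional[str]   # originating person, if there is one


def bind_identity(claims: dict) -> Principal:
    """Map verified token claims to the identity pair carried on the record."""
    agent = claims.get("agent_id")
    # For agent traffic the person is whoever the agent acts on behalf of.
    user = claims.get("on_behalf_of") if agent else claims.get("user_id")
    if not agent and not user:
        raise ValueError("request carries no authenticated identity")
    return Principal(agent_id=agent, user_id=user)
```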

Does the framework apply to internal RAG pipelines that do not call SaaS LLMs?

Yes. The enforcement layer sits in front of any LLM endpoint, including self-hosted models behind internal RAG pipelines. The framework's three layers apply identically: the policy defines what data the RAG can retrieve, the enforcement binds the policy to the RAG's LLM call, and the record series carries the per-decision evidence.

How long does the framework take to stand up?

Programs that start with a policy in good shape can stand up the enforcement and record layers in four to six weeks for a single LLM provider, then extend coverage to additional providers over the next quarter. Programs that need to build all three layers from scratch typically run a twelve-week program.

How does the framework interact with existing security operations?

The record series flows into the SIEM the security operations team already runs. The enforcement decisions can trigger SOAR playbooks the team already operates. The framework integrates into the existing security stack rather than replacing it.
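As a sketch of the first of those integrations, the snippet below writes each record as one JSON line through Python's standard `logging` module. The logger name and the record shape are assumptions; in practice the stream handler would be swapped for whatever transport the SIEM ingests, for example `logging.handlers.SysLogHandler` or a file that a log shipper tails.

```python
import json
import logging


def build_audit_logger(stream) -> logging.Logger:
    """Create a logger that writes one bare JSON line per record to `stream`."""
    logger = logging.getLogger("ai_audit")
    logger.setLevel(logging.INFO)
    logger.propagate = False   # keep audit lines out of the application log
    logger.handlers.clear()    # avoid duplicate lines if called twice
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger


def emit(logger: logging.Logger, record: dict) -> None:
    """Serialize the record with stable key order and hand it to the logger."""
    logger.info(json.dumps(record, sort_keys=True))
```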

What is the smallest viable version of the framework?

The smallest viable version covers the highest-risk AI use case (typically the one inside an Annex III category) with all three layers in place. The policy names the use case. The enforcement binds the inspection at the use case's LLM endpoint. The record series carries the per-decision evidence. Extending coverage to additional use cases proceeds from this baseline.