
Setting Up AI Policy Enforcement: From the First Rule to a Production Deployment

AI policy enforcement is the runtime control point that turns a written policy into a per-request decision. This guide walks through how to set up enforcement: the policy schema, the decision-point placement, the per-route and per-role rules, the audit format that proves the policy was applied, and the deployment sequence that gets a production-ready enforcement layer live in 8 to 12 weeks.

By Parminder Singh · Founder & CEO, DeepInspect Inc.

AI policy enforcement is the runtime control point that turns a written policy into a per-request decision. A policy document that lives in Confluence does not stop a sales engineer from pasting customer PHI into Claude. The enforcement layer is what closes the gap between the policy on paper and the policy on the wire.

I want to walk through the implementation: the policy schema, the decision point, the per-route and per-role rules, the audit format, and the deployment sequence that produces a production posture in 8 to 12 weeks.

What enforcement is and where it sits

Enforcement runs at the request boundary between the authenticated caller and the LLM endpoint. The decision point reads the request context (verified subject, route, classification verdicts), looks up the applicable rule from the policy bundle, and returns one of four outcomes: pass, block, redact, route-only.

The decision is synchronous. The request does not reach the model until the decision has been made and the audit record has been written. A request whose decision cannot be made (policy lookup timeout, identity verification failure, audit store unavailable) fails closed.
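A minimal Python sketch of the fail-closed behaviour described above. The names `Outcome` and `enforce`, and the `decide` and `write_audit` callables, are hypothetical and stand in for whatever the real gateway exposes.

```python
from enum import Enum
from typing import Callable


class Outcome(Enum):
    PASS = "pass"
    BLOCK = "block"
    REDACT = "redact"
    ROUTE_ONLY = "route-only"


def enforce(
    context: dict,
    decide: Callable[[dict], Outcome],
    write_audit: Callable[[dict, Outcome], None],
) -> Outcome:
    """Return the decided outcome only if the decision and the audit write both succeed."""
    try:
        outcome = decide(context)        # may raise on lookup timeout or identity failure
        write_audit(context, outcome)    # may raise if the audit store is unavailable
    except Exception:
        # Fail closed: any error before the audit commit becomes a block.
        return Outcome.BLOCK
    return outcome
```

The caller forwards the request to the model only when the returned outcome allows it, so an unavailable audit store stops traffic rather than letting it through unrecorded.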

Enforcement is the difference between a posture document and a posture. The posture document attests to a policy. The enforcement layer produces the evidence that the policy held.

The policy schema

A working policy bundle has three levels: identity, data, and route. Each level carries a set of rules. The decision point evaluates all three levels in order.

Identity rules

Who is making the request. Subject type (user, agent-on-behalf, service), role, group memberships, scope of permitted use cases.

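As an illustration, an identity rule could be sketched in Python as follows. The class name and every field name are assumptions made for this example, not the bundle's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdentityRule:
    rule_id: str
    subject_type: str            # "user", "agent-on-behalf", or "service"
    role: str
    groups: frozenset            # group memberships the subject must hold
    use_cases: frozenset         # use cases this identity is permitted to invoke

    def matches(self, subject: dict) -> bool:
        """True when the subject has the rule's type, role, and all required groups."""
        return (
            subject["type"] == self.subject_type
            and subject["role"] == self.role
            and self.groups <= set(subject["groups"])
        )
```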

Data rules

What data class the request carries. PII, PHI, source code, contract content, customer record, secret, mixed.

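A hedged sketch of how data rules might be expressed and checked in Python. The rule identifiers, the `allowed_if` field, and the `first_violation` helper are hypothetical; a real bundle would define its own vocabulary.

```python
from typing import Optional

# Hypothetical data rules: each names a data class, the condition under which
# that class may leave, and the outcome applied when the condition is not met.
DATA_RULES = [
    {"id": "data.secret.block", "data_class": "secret", "allowed_if": "never", "outcome": "block"},
    {"id": "data.phi.non-baa.block", "data_class": "PHI", "allowed_if": "baa", "outcome": "block"},
    {"id": "data.pii.non-baa.redact", "data_class": "PII", "allowed_if": "baa", "outcome": "redact"},
]


def first_violation(data_classes: set, destination_has_baa: bool) -> Optional[dict]:
    """Return the first rule the request violates, or None if no data rule fires."""
    for rule in DATA_RULES:                      # list order is the evaluation order
        if rule["data_class"] not in data_classes:
            continue
        satisfied = rule["allowed_if"] == "baa" and destination_has_baa
        if not satisfied:
            return rule
    return None
```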

Route rules

Where the request is going. Model vendor, model ID, deployment (direct, Bedrock, Vertex, Azure OpenAI).

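One way to picture route rules in Python is sketched below. The `Route` type, the rule identifiers, and the vendor strings are illustrative assumptions; the BAA flags mirror the routes this article labels as BAA-covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    vendor: str        # e.g. "openai", "anthropic", "google"
    model_id: str      # the vendor's model identifier (placeholder values here)
    deployment: str    # "direct", "bedrock", "vertex", or "azure-openai"


# Hypothetical route rules keyed by (vendor, deployment).
ROUTE_RULES = {
    ("anthropic", "bedrock"): {"id": "route.bedrock-claude", "baa": True},
    ("openai", "azure-openai"): {"id": "route.azure-openai", "baa": True},
    ("openai", "direct"): {"id": "route.openai-direct", "baa": False},
    ("google", "vertex"): {"id": "route.vertex-gemini", "baa": False},
}


def route_rule(route: Route) -> dict:
    """Look up the rule for a route; unknown routes get an explicit deny rule."""
    return ROUTE_RULES.get(
        (route.vendor, route.deployment),
        {"id": "route.unknown.block", "baa": False, "deny": True},
    )
```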

The bundle is versioned. Every change produces a new SHA-256 hash that the audit record stamps. A regulator who reads the audit record can pull the corresponding bundle version from the policy store and confirm the rule in effect at the time of the request.
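A small sketch of how a bundle version hash could be computed, assuming the bundle is loaded as a plain dictionary. Hashing a canonical JSON encoding (sorted keys, fixed separators) is an assumption of this example; the point is that the same bundle content always yields the same hash.

```python
import hashlib
import json


def bundle_hash(bundle: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the bundle."""
    canonical = json.dumps(bundle, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()
```

Because key order is normalized before hashing, reordering fields in the source file does not produce a new version, while any change to a rule does.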

The decision point

The decision point is the function that evaluates the policy. It takes the request context and the policy bundle and returns a decision.

The implementation pattern is a deterministic rules engine, not a model. A model-based "safety filter" returns different verdicts for the same input across calls, which makes the audit record unstable. The decision point's invariant is: same inputs produce the same decision, every time.
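A minimal sketch of a deterministic decision function in Python, assuming a first-match-wins rule list and a default-deny fallback. The `Decision` type, the `when` field, and the bundle layout are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    outcome: str          # "pass", "block", "redact", or "route-only"
    rule_id: str
    reason_code: str
    bundle_version: str


def decide(context: dict, bundle: dict) -> Decision:
    """Pure function of its inputs: no clock, no randomness, no network calls."""
    for rule in bundle["rules"]:                          # first match wins
        if all(context.get(key) == value for key, value in rule["when"].items()):
            return Decision(rule["outcome"], rule["id"], rule["reason"], bundle["version"])
    # No rule matched: default deny, so gaps in the bundle fail closed.
    return Decision("block", "default", "policy.default.block", bundle["version"])
```

Because `decide` reads nothing but its arguments, replaying a recorded context against the recorded bundle version reproduces the original decision.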

The audit record stamps:

  • The bundle version hash.
  • The matching rule's identifier.
  • The decision outcome.
  • The reason code (which rule fired and why).

The reason code is what the operator reads after the fact. "policy.phi.non-baa-destination.block" tells the on-call engineer exactly which rule fired without a deep dive into the bundle.

Per-route and per-role rules

Most enforcement programs start with two axes: route and role. The matrix gets dense:

| Role | OpenAI direct | Bedrock Claude (BAA) | Azure OpenAI (BAA) | Vertex Gemini |
|---|---|---|---|---|
| Clinician | block | pass | pass | block |
| Engineer | pass (no PHI) | pass | pass | pass (no PHI) |
| Sales | pass (no PHI, no contract) | block | block | pass (no PHI, no contract) |
| Service principal | scoped to allowlist | scoped to allowlist | scoped to allowlist | scoped to allowlist |

The matrix is what a policy author hands to the enforcement layer. The bundle compiles to YAML or JSON, and the decision point evaluates it on each request.
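The role-by-route matrix can be sketched as a lookup table in Python. The route keys and the `lookup` helper are assumptions for illustration, and the service principal row is left out because its allowlist is deployment-specific.

```python
ROUTES = ["openai-direct", "bedrock-claude", "azure-openai", "vertex-gemini"]

# Each cell is (base outcome, data classes that turn a pass into a block).
MATRIX = {
    "clinician": dict(zip(ROUTES, [
        ("block", set()), ("pass", set()), ("pass", set()), ("block", set()),
    ])),
    "engineer": dict(zip(ROUTES, [
        ("pass", {"PHI"}), ("pass", set()), ("pass", set()), ("pass", {"PHI"}),
    ])),
    "sales": dict(zip(ROUTES, [
        ("pass", {"PHI", "contract"}), ("block", set()),
        ("block", set()), ("pass", {"PHI", "contract"}),
    ])),
}


def lookup(role: str, route: str, data_classes: set) -> str:
    """Resolve one matrix cell for a request carrying the given data classes."""
    cell = MATRIX.get(role, {}).get(route)
    if cell is None:
        return "block"                   # unknown role or route: default deny
    outcome, forbidden = cell
    if outcome == "pass" and data_classes & forbidden:
        return "block"                   # conditional pass violated, e.g. "no PHI"
    return outcome
```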

The audit format

The enforcement record carries the policy state (bundle version hash, matched rule, decision outcome, reason code) separately from the request state (verified subject, route, classification verdicts).

The chained audit record set is the evidence that the policy was applied. A regulator asking "show me the requests on 2026-08-15 from clinical-team-A to non-BAA routes" runs one query against the record set. The answer is either a complete list (with each block decision recorded) or a list with gaps, in which case the regulator follows up.
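A sketch of what "chained" could mean in practice, assuming each record's hash covers its own content plus the previous record's hash. The field names and the `append_record` and `verify` helpers are hypothetical; the record layout keeps policy state and request state in separate blocks.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder previous-hash for the first record


def _digest(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_record(chain: list, policy: dict, request: dict) -> dict:
    """Append a record whose hash covers its content and the previous record's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    body = {"seq": len(chain), "prev_hash": prev_hash, "policy": policy, "request": request}
    record = {**body, "hash": _digest(body)}
    chain.append(record)
    return record


def verify(chain: list) -> bool:
    """Recompute every hash; a removed or edited record breaks the chain."""
    prev_hash = GENESIS
    for seq, record in enumerate(chain):
        body = {k: v for k, v in record.items() if k != "hash"}
        if record["seq"] != seq or record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != _digest(body):
            return False
        prev_hash = record["hash"]
    return True
```

With a structure like this, a gap in the record set is detectable rather than silent, which is what lets the query result be read as either complete or incomplete.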

The deployment sequence

A typical deployment from contract signature to production-ready enforcement runs 8 to 12 weeks.

Weeks 1-2: discovery and policy authoring. The deployer's compliance team and the enforcement vendor walk through the existing AI usage. The output is the first draft of the policy bundle. The bundle covers the three or four use cases that drive most of the traffic.

Weeks 3-4: gateway deployment. The gateway lands in the deployer's environment. TLS termination, identity provider integration, audit store wiring. The gateway runs in shadow mode: it sees the traffic, classifies, evaluates, and records, but does not yet block.

Weeks 5-6: dry-run. The shadow-mode records produce a report of which requests the policy would have blocked, redacted, or routed. The compliance team reviews the report and tunes the bundle. False positives get a rule narrowing. False negatives get a new rule.
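Shadow mode and the dry-run report can be sketched in a few lines of Python. The `handle` and `dry_run_report` names are hypothetical, and the enforce branch is simplified to blocking only; redact and route-only handling are omitted.

```python
from collections import Counter


def handle(context: dict, decide, forward, records: list, mode: str = "shadow"):
    """Evaluate and record every request; only enforce mode acts on the decision."""
    outcome = decide(context)
    records.append({"context": context, "would_be": outcome, "mode": mode})
    if mode == "enforce" and outcome == "block":
        return None                       # the request never reaches the model
    return forward(context)               # shadow mode: traffic is unchanged


def dry_run_report(records: list) -> Counter:
    """Count what the policy would have done to the recorded traffic."""
    return Counter(record["would_be"] for record in records)
```

The report is what the compliance team reviews: a high block count on a route that should be permitted points to a rule that needs narrowing before cutover.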

Weeks 7-8: cut over. The gateway flips from shadow to enforce on the first use case. The other use cases follow on a staggered schedule. Each cutover is gated on a no-regression review of the prior week's records.

Weeks 9-12: hardening. Edge cases surface in production traffic. The bundle gets adjusted. The audit store's retention is confirmed (six months minimum for Article 19 of the EU AI Act, longer for sector law). The first quarterly review against the bundle's effective controls completes.

By the end of week 12, the deployer has an enforcement layer that produces compliant records for every AI request in scope, a bundle that reflects production reality, and a process for keeping the bundle current as new use cases ship.

Where most programs slip

Three failure modes appear repeatedly:

Policy that does not compile. A policy written in prose ("PHI may not leave the BAA-covered environment except under approved use cases reviewed by the AI risk committee") does not compile to a deterministic rule. The decision point cannot evaluate ambiguity. The policy authoring step has to produce a bundle the decision point can run.

No identity at the gateway. The application authenticates the user, then forwards an OpenAI request with a service credential. The verified subject never reaches the gateway. The audit record names the service account, which fails the Article 19 natural-person requirement. The fix is to carry the identity context through the request, end to end.

Audit-write asymmetry. Some implementations write the audit record after the model responds, so the cleartext completion can be included. If the audit writer fails after the completion has returned, the action has been taken and the record is gone. The audit write has to commit before the completion returns, even if that means hashing the completion rather than storing it.
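The ordering fix can be sketched as follows, assuming hypothetical `call_model` and `commit_audit` callables where `commit_audit` raises if the write does not commit. The completion is hashed rather than stored, as suggested above.

```python
import hashlib


def complete(prompt: str, call_model, commit_audit) -> str:
    """Return the completion only after its audit record has been committed."""
    completion = call_model(prompt)
    digest = hashlib.sha256(completion.encode("utf-8")).hexdigest()
    # commit_audit must raise on failure; if it does, the exception propagates
    # and the caller never receives the completion text.
    commit_audit({"completion_sha256": digest})
    return completion
```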

DeepInspect

DeepInspect is the enforcement layer described above. The policy bundle is a versioned YAML object the deployer maintains in their own repository. The decision point is a deterministic rules engine that evaluates the bundle on each request. The audit record carries the bundle version hash, the matched rule, the decision, and the reason code. Shadow mode and cutover are part of the standard rollout.

The 8-to-12-week deployment sequence is the path we run with every new deployer.

Book a technical deep dive at deepinspect.ai.

Frequently asked questions

Can the policy bundle be authored by a non-technical compliance team?

The bundle's source is YAML, which compliance teams have not typically authored. In practice, the policy bundle is authored jointly: the compliance team writes the rule intent in prose, an engineer or solutions architect translates the intent into the bundle, and the compliance team reviews and signs off. A bundle that compliance has not signed is not the policy.

Does shadow mode introduce risk?

Shadow mode records what the gateway would have done without modifying the request path. The application's traffic to the LLM endpoint is unchanged. The risk is that the deployer's compliance posture is unchanged during shadow mode. Shadow mode is a measurement tool, not a control. Treat it as a 2-week window with a hard cutover date.

How do agent-on-behalf workflows get evaluated?

Agent-on-behalf requests carry two identities: the agent and the principal. The policy bundle evaluates rules against the principal's role and the agent's scope. An agent acting on behalf of a clinician inherits the clinician's data-class permissions, but only within the scope the agent was authorized to operate.
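A minimal sketch of the two-identity check in Python. The `agent_may` helper and the dictionary shapes are assumptions; the logic is that a request must satisfy both the principal's data-class permissions and the agent's authorized scope.

```python
def agent_may(principal: dict, agent: dict, data_class: str, use_case: str) -> bool:
    """Allow only what the principal may touch AND the agent is scoped to do."""
    return data_class in principal["data_classes"] and use_case in agent["scope"]
```

For example, an agent scoped to chart summaries and acting for a clinician could handle PHI in a chart summary, but not PHI in a use case outside its scope, and not a data class the clinician lacks.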

What happens when the policy needs an exception for a single edge case?

Exceptions go in the bundle as time-bounded rules with the explicit subject and reason. The exception's audit trail includes who authorized it and for how long. Ad-hoc exceptions outside the bundle do not exist; the bundle is the policy.
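A time-bounded exception might look like the sketch below. The field names and the `active_exception` helper are hypothetical. The evaluation time is passed in as an argument (for instance, the request timestamp) so the check stays a deterministic function of its inputs.

```python
from datetime import datetime, timezone
from typing import Optional


def active_exception(exceptions: list, subject: str, now: datetime) -> Optional[dict]:
    """Return the exception covering this subject at this time, if any."""
    for exc in exceptions:
        if exc["subject"] == subject and exc["not_before"] <= now < exc["expires"]:
            return exc
    return None


# Example exception entry; every value here is illustrative.
EXAMPLE = {
    "id": "exc-001",
    "subject": "user:alice",
    "reason": "one-off migration review",
    "authorized_by": "user:ciso",
    "not_before": datetime(2026, 8, 1, tzinfo=timezone.utc),
    "expires": datetime(2026, 9, 1, tzinfo=timezone.utc),
}
```

Once `expires` passes, the exception stops matching without anyone having to remember to remove it, and the entry itself records who authorized it and why.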