Why not just use the gateway's built-in policy UI?

The UI gets you to a first policy faster, at the cost of the review, test, and rollback properties that a code-based pipeline provides. Small deployments with two or three static policies can live with the UI. Enterprise deployments with dozens of policies that change monthly outgrow the UI at the first audit or the first incident, whichever comes first.

Which language should we pick?

For teams already running OPA elsewhere in the stack (Kubernetes admission, API authorization), Rego. For teams standardized on AWS Verified Permissions, Cedar. For deployments where the policy surface is small and the language dependency is a cost, JSON schema plus a purpose-built evaluator. The AI request authorization model piece covers the authorization semantics that inform the choice.

How do we test AI policies?

Use unit tests with known inputs, integration tests that exercise the policy engine in the gateway, and canary deployments that expose the new policy to a subset of production traffic before full rollout. Test cases should cover affirmative decisions (this identity plus this model plus this data class should allow), negative decisions (this combination should deny), and edge cases (missing data classification tag, unknown model identifier, expired identity claim).
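A table-driven unit test is one way to keep those three groups of cases side by side. The sketch below is illustrative only: the decide function, the role and model names, and the clearance table are hypothetical stand-ins for whatever the real policy artifact defines, not any gateway's API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical policy data; in practice this comes from the artifact in git.
ALLOWED_MODELS = {"model-a", "model-b"}
CLEARANCE = {
    "analyst": {"public", "internal"},
    "clinician": {"public", "internal", "phi"},
}


@dataclass
class Request:
    role: str
    model: str
    data_class: Optional[str]  # None models a missing classification tag
    claim_expires_at: int      # Unix seconds
    now: int                   # injected so tests are deterministic


def decide(req: Request) -> str:
    """Deny by default; allow only when every check passes."""
    if req.claim_expires_at <= req.now:
        return "deny"  # expired identity claim
    if req.model not in ALLOWED_MODELS:
        return "deny"  # unknown model identifier
    if req.data_class is None:
        return "deny"  # missing data classification tag
    if req.data_class not in CLEARANCE.get(req.role, set()):
        return "deny"  # role not cleared for this data class
    return "allow"


# (name, request, expected decision)
CASES: List[Tuple[str, Request, str]] = [
    ("clinician may send PHI", Request("clinician", "model-a", "phi", 200, 100), "allow"),
    ("analyst may send internal", Request("analyst", "model-b", "internal", 200, 100), "allow"),
    ("analyst may not send PHI", Request("analyst", "model-a", "phi", 200, 100), "deny"),
    ("missing classification tag", Request("clinician", "model-a", None, 200, 100), "deny"),
    ("unknown model identifier", Request("clinician", "model-x", "public", 200, 100), "deny"),
    ("expired identity claim", Request("clinician", "model-a", "public", 50, 100), "deny"),
    ("unknown role", Request("intern", "model-a", "public", 200, 100), "deny"),
]


def run_cases() -> List[str]:
    """Return the names of the cases whose decision differs from the expectation."""
    return [name for name, req, expected in CASES if decide(req) != expected]
```

CI would fail the build whenever run_cases() returns a non-empty list. Injecting the current time as a field, rather than reading the clock inside decide, keeps the expiry case repeatable.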

Does policy-as-code work with a hosted AI gateway?

It depends on whether the hosted gateway exposes a policy artifact API and version-controlled deployment. Gateways that only expose a UI cannot support policy-as-code without a custom integration. Gateways designed for the pattern (DeepInspect, and some open-source AI gateways) expose the policy artifacts directly.

How does policy-as-code interact with runtime policy updates?

Runtime updates through the pipeline are the pattern. A pull request merges, CI runs, the new policy artifact deploys to the gateway through a canary rollout, and the gateway swaps the policy version live. No gateway restart. The audit record includes the timestamp of the swap and the git SHA of the deployed version.
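The live swap can be pictured as replacing one reference under a lock. This is a minimal sketch assuming an in-process engine; PolicyVersion, PolicyHolder, and the short SHAs are hypothetical names chosen for illustration, not a real gateway interface.

```python
import threading
import time
from dataclasses import dataclass
from typing import Callable, Dict, List

Policy = Callable[[Dict[str, str]], bool]


@dataclass(frozen=True)
class PolicyVersion:
    git_sha: str      # commit the artifact was built from
    evaluate: Policy  # compiled policy; a plain function in this sketch


class PolicyHolder:
    """Holds the active policy and swaps it without restarting the process."""

    def __init__(self, initial: PolicyVersion) -> None:
        self._lock = threading.Lock()
        self._current = initial
        self.audit_log: List[dict] = []

    def swap(self, new: PolicyVersion) -> None:
        # Replace the reference and record the swap in one critical section.
        with self._lock:
            previous = self._current
            self._current = new
            self.audit_log.append({
                "event": "policy_swap",
                "from_sha": previous.git_sha,
                "to_sha": new.git_sha,
                "swapped_at": time.time(),
            })

    def decide(self, request: Dict[str, str]) -> dict:
        # Take a snapshot so one request is judged by exactly one version.
        with self._lock:
            version = self._current
        return {"allow": version.evaluate(request), "policy_sha": version.git_sha}


# Two hypothetical versions: v2 adds "model-b" to the allowed models.
v1 = PolicyVersion("a1b2c3d", lambda r: r.get("model") == "model-a")
v2 = PolicyVersion("e4f5a6b", lambda r: r.get("model") in {"model-a", "model-b"})

holder = PolicyHolder(v1)
before = holder.decide({"model": "model-b"})  # judged under v1
holder.swap(v2)
after = holder.decide({"model": "model-b"})   # judged under v2
```

Each decision carries the SHA of the version that produced it, so requests in flight during the swap remain attributable to a single policy version.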

Where does the policy engine sit relative to the AI proxy?

In-process is the low-latency default. The gateway loads the policy artifacts at startup, evaluates the policy on every request, and hot-reloads new versions as they deploy. Out-of-process (a policy sidecar) is the pattern when the policy engine has to be shared across multiple proxy instances or across proxy versions with different lifecycle requirements. The AI gateway architecture piece covers the deployment patterns.

Policy as Code for AI: The Review Pipeline That Turns AI Policy From a Config Screen Into a Reviewed Artifact

An AI policy that lives inside a gateway UI as a set of dropdowns and toggles changes without version control, without code review, and without a rollback plan. When the auditor asks who approved the change that permitted a new model, or who loosened the rate limit that let a runaway agent hit $18,000 in inference costs over a weekend, the gateway UI produces at best a timestamp and a user name. The same policy expressed as code, checked into git, reviewed through a pull request, and deployed through the same pipeline that ships the application produces the artifacts auditors, security teams, and platform engineers all recognize. I want to walk through the operational pattern, the language choices, and the evidence chain policy-as-code produces.

The pipeline that ships application code is the pipeline AI policy needs.

What "policy as code" means for AI

Policy-as-code in the AI context means expressing the rules that govern who can call which model with which data as machine-readable artifacts stored in version control. The artifacts are evaluated at request time by a policy engine at the AI gateway boundary. Changes to the artifacts flow through pull requests, code review, tests, and deployments. The audit record for any given AI decision references the specific policy version that applied.

Three properties distinguish policy-as-code from configuration-in-UI. First, the change history is the git history, not a database audit table. Second, the review is the pull request, not a click-through in a UI. Third, the tests run on the policy the way tests run on code, which means changes with unintended consequences get caught before they land in production.

The three languages in production

Three policy languages dominate production AI deployments in 2026.

Rego (from Open Policy Agent) is the most widely adopted. Rego is declarative, supports pattern matching on nested JSON structures, and evaluates in milliseconds. The policy engine (OPA) is embedded in the gateway or runs as a sidecar. The OpenAI API gateway patterns piece covers the OPA integration pattern.

Cedar (from AWS, open-sourced 2023) is the language behind AWS Verified Permissions. Cedar's type system enforces authorization semantics at policy-authoring time, which reduces the "policy compiled but is wrong" failure mode Rego permits. Cedar policies compose across authorization scopes, which suits agentic AI deployments where a single request needs authorization checks against the user, the agent, the model, and the data classification.

JSON schema plus a lightweight evaluator covers deployments where the policy surface is small enough that a full policy language is overkill. A schema that describes allowed models, allowed data classifications, and rate-limit configurations, evaluated by a purpose-built engine, gives the policy-as-code review pipeline without introducing a new language.

The choice between the three depends on team fluency, the complexity of the authorization semantics, and whether the policy engine has to run in-process at sub-millisecond latency.
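To make the third option concrete, here is a minimal sketch of a JSON policy artifact and a purpose-built evaluator. The field names (allowed_models, allowed_roles, phi_roles), the model and role names, and the version string are assumptions made for this example; a Rego or Cedar policy would express the same rules in its own syntax.

```python
import json

# Hypothetical policy artifact as it might sit in the repository.
POLICY_JSON = """
{
  "version": "example-sha",
  "allowed_models": ["model-a", "model-b"],
  "allowed_roles": ["clinician", "analyst"],
  "phi_roles": ["clinician"]
}
"""

REQUIRED_KEYS = {"version", "allowed_models", "allowed_roles", "phi_roles"}


def load_policy(text: str) -> dict:
    """Parse the artifact and reject it if a required key is missing."""
    policy = json.loads(text)
    missing = REQUIRED_KEYS - policy.keys()
    if missing:
        raise ValueError(f"policy missing keys: {sorted(missing)}")
    return policy


def evaluate(policy: dict, request: dict) -> dict:
    """Collect every failed check; allow only when none failed."""
    reasons = []
    if request.get("model") not in policy["allowed_models"]:
        reasons.append("model not allowed")
    if request.get("role") not in policy["allowed_roles"]:
        reasons.append("role not allowed")
    # PHI check: only roles listed in phi_roles may send PHI-tagged data.
    if request.get("data_class") == "phi" and request.get("role") not in policy["phi_roles"]:
        reasons.append("role not cleared for PHI")
    return {
        "allow": not reasons,
        "reasons": reasons,
        "policy_version": policy["version"],  # ties the decision to a version
    }


policy = load_policy(POLICY_JSON)
ok = evaluate(policy, {"role": "clinician", "model": "model-a", "data_class": "phi"})
blocked = evaluate(policy, {"role": "analyst", "model": "model-c", "data_class": "phi"})
```

Returning the reasons alongside the decision keeps denials explainable in review and in the audit log. In a real deployment the version field would more likely be stamped by the pipeline than written by hand.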

A policy written in any of these languages lands in a repo alongside the application code. A change (adding a new model, adjusting the role list, tightening the PHI check) goes through a pull request. The CI pipeline runs unit tests against the policy with known inputs. The deployment pipeline rolls the policy out to the gateway with the same canary and rollback semantics the application uses.

The evidence chain auditors accept

Three audit frameworks lean on the same evidence chain policy-as-code produces.

SOC 2 CC8.1 covers change management. The evidence is the pull request, the reviewer approval, the test run, and the deployment record. Policy changes that ship through the pipeline satisfy CC8.1 the same way code changes do. The SOC 2 AI controls piece covers the full test.

ISO 27001 Annex A control 8.28 covers secure coding. The 2022 revision introduced this control, and it applies across the software development lifecycle, which policy-as-code sits inside. The ISO 27001 AI Annex A piece covers the mapping.

EU AI Act Article 26 covers deployer obligations for high-risk AI systems, including using the system in accordance with its instructions for use, monitoring its operation, and keeping the logs it generates. When a policy change alters the AI system's operational parameters, the change history is part of the record deployers have to maintain. The EU AI Act Article 12 piece covers the logging side; Article 26 covers the operational side.

The rollback story

The rollback story is the property that separates policy-as-code from UI-managed policy. A policy change that shipped and broke a workflow (denying a legitimate request because of an incorrectly scoped role check, allowing a request that should have been denied because of a missing predicate) needs to revert in seconds, not hours. The revert is a git revert plus a redeploy through the same pipeline.

The AI gateway rollback strategy piece covers the operational pattern. In short, it combines canary deployments of policy changes to a subset of traffic, health checks against known-good and known-bad requests, and automated rollback on failure. The AI gateway canary deployment piece covers the traffic-shaping side.
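The health-check step can be sketched as a gate that probes a candidate policy with known-good and known-bad requests before it takes traffic. Everything below is hypothetical (the probe set, the two policy functions, the promote_or_rollback helper); in a real pipeline the rollback branch would trigger the git revert and redeploy rather than return a value.

```python
from typing import Callable, Dict, List, Tuple

Request = Dict[str, str]
Policy = Callable[[Request], bool]

# Known-good requests must be allowed; known-bad requests must be denied.
PROBES: List[Tuple[Request, bool]] = [
    ({"role": "clinician", "model": "model-a", "data_class": "phi"}, True),
    ({"role": "analyst", "model": "model-a", "data_class": "phi"}, False),
    ({"role": "analyst", "model": "unknown", "data_class": "public"}, False),
]


def failed_probes(policy: Policy, probes: List[Tuple[Request, bool]]) -> List[Request]:
    """Return the probe requests the policy decided incorrectly."""
    return [req for req, expected in probes if policy(req) != expected]


def promote_or_rollback(current: Policy, candidate: Policy,
                        probes: List[Tuple[Request, bool]]) -> Tuple[Policy, str]:
    """Keep the current policy unless the candidate passes every probe."""
    failures = failed_probes(candidate, probes)
    if failures:
        return current, f"rolled back: {len(failures)} probe(s) failed"
    return candidate, "promoted"


def good_policy(req: Request) -> bool:
    if req["model"] != "model-a":
        return False
    if req["data_class"] == "phi":
        return req["role"] == "clinician"
    return True


def broken_policy(req: Request) -> bool:
    # Missing PHI predicate: allows any role to send PHI to model-a.
    return req["model"] == "model-a"


active, status = promote_or_rollback(good_policy, broken_policy, PROBES)
```

Here the broken candidate allows the analyst PHI request, so the gate keeps good_policy active. The probe set is the same kind of artifact as the unit tests and can live in the same repository.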

DeepInspect

This is exactly what DeepInspect does. DeepInspect exposes its policy surface as machine-readable artifacts in the customer's git repository. Policy changes flow through the customer's own review pipeline. The gateway evaluates the deployed policy version on every request. The audit log for each decision references the specific policy commit that applied, which means the reconstruction question ("which policy was in effect when this decision landed") has a git SHA as the answer.

The policy language is Rego, with Cedar support for deployments that already run Cedar for other authorization scopes. Deployment integrates with the customer's existing CI/CD tooling: GitHub Actions, GitLab CI, ArgoCD, Flux, Jenkins.

Book a technical deep dive at deepinspect.ai.