Why is the policy decoupled from the provider?

The policy answers what the deployer permits a class of prompts to do. The provider answers which endpoint serves the request. The two concerns change at different rates: the policy changes when the regulatory regime or the internal risk appetite shifts, while the provider changes when the deployer evaluates a new model, hits a rate limit or responds to a DORA concentration finding. Binding the two artefacts together would mean every provider change forces a policy review. Keeping them separate keeps the policy stable across rotations.

What does the audit log show after a rotation event?

The log shows the provider field, the model family and the model version that served each request, against the policy version that governed the decision. A rotation from OpenAI to Anthropic on a route produces a sequence of records: the records before the rotation show openai/gpt-4.1, the records after show anthropic/claude-3.7-sonnet, and the policy version is the same across the boundary. The auditor reads the rotation as a routing change with the policy invariant preserved.

How does the gateway handle a provider rate limit?

The fail-over rule on the route activates the secondary provider on HTTP 429. The gateway records the rate-limit event on the log, the request is served by the secondary against the same policy version, and the per-decision log carries the secondary provider's model identity. Where the secondary also rate-limits, the fall-back activates. Where the fall-back is unavailable, the gateway fails closed and records the failure category. The fail-closed state preserves the policy invariant.

Does rotation affect the EU AI Act conformity-assessment file?

The conformity-assessment file under EU AI Act Article 11 and Annex IV documents the deployment of the high-risk system, including the model or models the deployer relies on. A rotation event that introduces a new model into the deployment requires an update to the file. The Article 18 documentation-keeping obligation on the file runs for 10 years after the system is placed on the market or put into service; Article 19 covers the automatically generated logs, which carry a separate minimum retention of six months. The routing artefact and the per-decision log together provide the evidence the file references.

How does rotation interact with DORA Article 28 concentration risk?

Article 28 expects the financial entity to assess concentration risk on critical ICT third-party providers and to maintain an exit strategy. The rotation mechanism is the operational expression of the exit strategy: the deployer demonstrates the capability to move traffic between providers, the routing artefact documents the split across providers, and the per-decision log validates that the operational reality matches the documentation. The supervisor inspecting the Article 28 file reads the routing artefact and samples the log.

What stops a rotation from breaking the policy?

The policy artefact does not name the provider. The routing artefact names the provider. A change to the routing artefact does not touch the policy artefact. The per-decision log binds the policy version and the provider name together at the moment of the request, the binding is verified at audit time, and the auditor at year three reads both. This architectural separation is what prevents a routing change from creating a policy gap.

AI provider rotation strategy: how to swap OpenAI for Anthropic without breaking policy or audit

DORA Article 28 requires financial entities to assess and manage concentration risk on critical ICT third-party providers, with the European Supervisory Authorities operating a Union-level register against the same obligation. A bank running its entire AI workload through one provider has a concentration-risk problem the moment the supervisor opens the file. The operational answer is provider rotation: the deployer moves traffic between OpenAI, Anthropic, Google and other endpoints without breaking the policy decision or the audit trail. The mechanism has five parts and each part has a regulatory anchor.

I want to walk through what a provider-agnostic policy model looks like, why the model identity has to be on every per-decision log, how per-route routing rules decouple the policy from the endpoint, what fail-over semantics keep the policy invariant under outage, and how the architecture aligns with DORA Article 28 and EU AI Act Article 26 provider documentation.

Provider-agnostic policy model

The policy is written against the prompt classification and the user identity, not against the provider. A rule that says "block PHI on egress" applies whether the egress endpoint is api.openai.com, api.anthropic.com or generativelanguage.googleapis.com. The provider is a routing target, not a policy axis. The separation is what makes rotation tractable.

The policy artefact is a versioned rule set. The rule set references the classification taxonomy described in our prompt-classification article, and it does not name the provider. The provider name enters the system through the routing layer, which is a separate artefact. The two artefacts are bound on the per-decision log: the log carries the policy version and the provider name together, but the policy version is independent of the provider name. The auditor at year three reading the log sees both, sees that the same policy version applied before and after a rotation between two providers, and reproduces the decision on either side of the rotation.
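A minimal sketch of what "the policy does not name the provider" can look like in code. Everything here is hypothetical and for illustration only: the PolicyRule and Policy types, the "policy-v12" version string, the role names and the default-allow behaviour are assumptions, not a description of any particular product. The point is that the decision function has no provider parameter at all.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PolicyRule:
    classification: str      # label from the classification taxonomy, e.g. "PHI"
    role: Optional[str]      # user role the rule applies to; None means any role
    action: str              # "block", "redact" or "allow"


@dataclass(frozen=True)
class Policy:
    version: str
    rules: Tuple[PolicyRule, ...]

    def decide(self, classification: str, role: str) -> str:
        # First matching rule wins. There is deliberately no provider argument.
        for rule in self.rules:
            if rule.classification == classification and rule.role in (None, role):
                return rule.action
        return "allow"  # assumed default for this sketch only


POLICY = Policy(
    version="policy-v12",  # hypothetical version label
    rules=(
        PolicyRule("PHI", None, "block"),
        PolicyRule("PII", "analyst", "redact"),
    ),
)


def decide_for_endpoint(endpoint: str, classification: str, role: str) -> str:
    # The endpoint is accepted only to show that it plays no part in the decision.
    return POLICY.decide(classification, role)
```

Calling decide_for_endpoint with api.openai.com or api.anthropic.com and the same classification and role returns the same action, which is the property that makes rotation a routing concern only.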

Model identity on every per-decision log

A per-decision log that records "the call went to an LLM" but not which model served it is not auditable. EU AI Act Article 26(6) requires deployers of high-risk systems to keep the logs the system generates automatically, to the extent those logs are under their control, and a review can only trace a decision back to the model that produced it if the logs name that model. The recorded fields are provider, model family, model version and the request identifier the provider returns.

The model identity travels with the record across the retention window. A rotation from one provider to another produces a clean handoff in the log: the request_id, the provider field and the model identity change, while the policy version stays the same. The auditor sees the rotation as a routing change, not as a policy change.
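The handoff can be made concrete with a small record type. This is a sketch under stated assumptions: the DecisionRecord field names mirror the fields listed above, but the exact schema, the request identifiers and the model version strings ("snapshot-a", "snapshot-b") are placeholders rather than real provider values.

```python
from dataclasses import dataclass, asdict
from typing import List


@dataclass(frozen=True)
class DecisionRecord:
    request_id: str       # identifier returned by the provider
    provider: str
    model_family: str
    model_version: str
    policy_version: str   # version of the policy that governed the decision
    outcome: str          # e.g. "allow", "redact", "block"


def rotation_diff(before: DecisionRecord, after: DecisionRecord) -> List[str]:
    """Return the sorted names of the fields that differ between two records."""
    a, b = asdict(before), asdict(after)
    return sorted(name for name in a if a[name] != b[name])


# Placeholder records on either side of a rotation on one route.
before = DecisionRecord("req-001", "openai", "gpt-4.1", "snapshot-a", "policy-v12", "allow")
after = DecisionRecord("req-002", "anthropic", "claude-3.7-sonnet", "snapshot-b", "policy-v12", "allow")

changed = rotation_diff(before, after)
```

Here changed contains request_id, provider, model_family and model_version, and it does not contain policy_version. An audit check along these lines is one way to confirm that a rotation shows up as a routing difference only.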

Per-route routing rules

The routing rule is a separate artefact from the policy rule. The routing rule says "for this route, send the request to this provider, with this model, under these fail-over conditions." The rule is versioned independently of the policy. A rotation event updates the routing rule. The policy rule is untouched.

The route binds the policy invariants explicitly. The invariants are enforced regardless of which provider the routing rule selects. A request that hits the secondary or the fall-back carries the same policy version and the same enforcement outcome as a request that hit the primary, with the provider field on the log reflecting the actual provider that served the request.
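As a rough illustration, a routing rule can be modelled as its own versioned object that knows nothing about policy content. The Target and RoutingRule types, the route name and the choice to demote the old primary to secondary on rotation are all assumptions made for this sketch.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Target:
    provider: str
    model: str


@dataclass(frozen=True)
class RoutingRule:
    route: str
    version: int       # versioned independently of the policy
    primary: Target
    secondary: Target
    fallback: Target


def rotate_primary(rule: RoutingRule, new_primary: Target) -> RoutingRule:
    """Return a new rule version with the primary replaced.

    Assumption for this sketch: the previous primary becomes the secondary.
    """
    return replace(
        rule,
        version=rule.version + 1,
        primary=new_primary,
        secondary=rule.primary,
    )


POLICY_VERSION = "policy-v12"  # hypothetical; lives in the policy artefact, not here

rule_v1 = RoutingRule(
    route="/support-chat",  # hypothetical route name
    version=1,
    primary=Target("openai", "gpt-4.1"),
    secondary=Target("anthropic", "claude-3.7-sonnet"),
    fallback=Target("google", "example-model"),  # placeholder model name
)

rule_v2 = rotate_primary(rule_v1, Target("anthropic", "claude-3.7-sonnet"))
```

The rotation produces rule_v2 with an incremented routing version, while POLICY_VERSION is never read or written by the rotation code. Keeping the rule immutable also leaves rule_v1 available for reconstructing how earlier requests were routed.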

Fail-over semantics under outage

The fail-over rule covers three conditions: provider rate-limiting (HTTP 429), provider unavailability (HTTP 503 or connection failure) and provider degradation (p99 latency above a configured threshold). The fail-over is bounded: the gateway tries the secondary, then the fall-back, then it fails closed. The fail-closed state is recorded on the log with the failure category.

The policy is invariant under the fail-over: a request that was going to be blocked by policy against the primary is also blocked against the secondary and the fall-back. A request that was going to be redacted is redacted against any provider the rule selects. The provider change does not relax the policy. DORA Article 11 expects the financial entity to maintain operational resilience under ICT incidents, and the policy-invariant fail-over is the mechanism that satisfies the expectation for AI workloads.
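The bounded chain described above can be sketched as follows. This is an illustration, not a definitive implementation: the Failure categories, the ProviderError exception, the serve function and the log entry shape are hypothetical names, each provider is represented by a plain callable, and how degradation is detected (for example by a latency monitor) is outside the sketch. Only the block action is modelled; redaction would be applied at the same point, before any provider is chosen.

```python
from enum import Enum


class Failure(Enum):
    RATE_LIMITED = "rate_limited"   # HTTP 429
    UNAVAILABLE = "unavailable"     # HTTP 503 or connection failure
    DEGRADED = "degraded"           # latency above the configured threshold


class ProviderError(Exception):
    def __init__(self, failure):
        super().__init__(failure.value)
        self.failure = failure


def serve(prompt, policy_action, chain, log):
    """Try each (name, call) in order; fail closed when the chain is exhausted."""
    # The policy decision is applied before any provider is selected,
    # so it is identical for the primary, the secondary and the fall-back.
    if policy_action == "block":
        log.append({"outcome": "blocked", "provider": None})
        return None
    last_failure = None
    for name, call in chain:
        try:
            response = call(prompt)
        except ProviderError as err:
            last_failure = err.failure
            log.append({"outcome": "failover", "provider": name,
                        "failure": err.failure.value})
            continue
        log.append({"outcome": "served", "provider": name})
        return response
    # Chain exhausted: fail closed and record the last failure category.
    log.append({"outcome": "fail_closed", "provider": None,
                "failure": last_failure.value if last_failure else None})
    return None


def rate_limited(prompt):
    raise ProviderError(Failure.RATE_LIMITED)


def healthy(prompt):
    return "ok:" + prompt


log = []
result = serve("hello", "allow",
               [("primary", rate_limited), ("secondary", healthy)], log)
```

In this run the primary raises a rate-limit error, the secondary serves the request, and the log holds a failover entry followed by a served entry. Because the chain is a finite list, the number of attempts is bounded by construction.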

DORA Article 28 concentration risk and EU AI Act Article 26 documentation

DORA Article 28 requires the financial entity to assess and document concentration risk on critical ICT third-party providers. A bank routing 100% of its AI traffic through one provider has a concentration-risk finding the moment the supervisor opens the assessment. The operational answer is a documented split of traffic across providers with the routing artefact as the evidence and the per-decision log as the validation.
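One way to use the per-decision log as validation is to compute the observed share of traffic per provider and compare it with the split the routing artefact documents. The sketch below assumes log records are dictionaries with a "provider" key, and the 5 percentage point tolerance is an arbitrary illustrative value, not a regulatory threshold.

```python
from collections import Counter


def provider_share(records):
    """Return the fraction of requests served by each provider."""
    counts = Counter(r["provider"] for r in records)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {provider: n / total for provider, n in counts.items()}


def matches_documented_split(records, documented, tolerance=0.05):
    """Check that every provider's observed share is within tolerance of the documented one."""
    observed = provider_share(records)
    providers = set(observed) | set(documented)
    return all(
        abs(observed.get(p, 0.0) - documented.get(p, 0.0)) <= tolerance
        for p in providers
    )


# Illustrative sample: 6 requests served by one provider, 4 by another.
sample = [{"provider": "openai"}] * 6 + [{"provider": "anthropic"}] * 4
shares = provider_share(sample)
```

With this sample, the check passes against a documented 60/40 split and fails against a documented 100% allocation to a single provider, which is the discrepancy a sampled log review is meant to surface.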

EU AI Act Article 26(5) requires deployers of high-risk AI systems to inform the provider of any serious incident. A rotation event that swaps a primary provider after an incident is the operational pattern Article 26 anticipates. The recorded model identity on the audit log, the routing rule version, and the per-decision log together give the deployer the artefacts the supervisor inspects when the file is opened.

DeepInspect

This is the rotation discipline DeepInspect operates against. DeepInspect sits as a stateless proxy between authenticated users or agents and any LLM endpoint, with the policy model written against the classification and the identity rather than against the provider. The provider, the model family and the model version are recorded on every per-decision log alongside the policy version that governed the decision. A rotation event changes the routing rule, not the policy rule, and the change is reflected on the log as a routing-field difference with the same policy version on either side.

The fail-over semantics are policy-invariant: a request blocked by policy against the primary is blocked against the secondary and the fall-back, with the failure category recorded on the log. The DORA Article 28 concentration-risk artefact is the routing rule, the per-decision log is the validation, and the EU AI Act Article 26 incident-notification path is the operational mechanism the rotation supports.

Book a demo today.