Does fail-closed mean the AI tool stops working when anything fails?

No. The gateway implements bounded retries and graceful degradation before falling back to deny. Most transient failures recover within the retry budget. The deny applies only when the failure persists past the budget. The user experience under typical operating conditions is the same as fail-open; the difference appears only under genuine sustained failure.

What about non-regulated use cases? Is fail-open ever defensible?

For non-regulated personal use, the trade-off is one of preference. For non-regulated low-stakes enterprise use, fail-open may be acceptable where the records produced under uncertainty have no operational consequence. For any high-risk use case under the EU AI Act, any financial-services use case under DORA, or any mortgage-origination use case under Fannie Mae LL-2026-04, fail-closed is the architectural default.

How does fail-closed interact with model fallback strategies?

Model fallback (try OpenAI, fall back to Anthropic if OpenAI is down) is orthogonal to fail-closed at the policy layer. The gateway applies policy and commits the audit record against whichever provider is actually selected, and the fallback decision is logged as part of the record. Fail-closed at the policy layer remains the posture: if the policy cannot be evaluated, the request is denied regardless of which provider was selected.
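The separation between the policy decision and provider fallback can be sketched as follows. This is a minimal illustration rather than any particular gateway's implementation: `handle`, the record fields, and the stub provider callables are hypothetical names, and a `None` policy result stands in for "policy could not be evaluated".

```python
from typing import Callable, List, Optional, Sequence, Tuple

# (provider name, callable that sends the prompt); both are stand-ins.
Provider = Tuple[str, Callable[[str], str]]


def handle(prompt: str,
           policy_allows: Optional[bool],
           providers: Sequence[Provider],
           audit: List[dict]) -> Optional[str]:
    """Apply policy first, then try providers in order, logging any fallback."""
    if policy_allows is not True:
        # None means the policy could not be evaluated: deny before any provider is tried.
        decision = "deny" if policy_allows is False else "deny_under_uncertainty"
        audit.append({"decision": decision, "provider": None, "fallback_from": []})
        return None

    failed: List[str] = []
    for name, call in providers:
        try:
            reply = call(prompt)
        except ConnectionError:
            failed.append(name)  # remember the provider we fell back from
            continue
        audit.append({"decision": "pass", "provider": name, "fallback_from": failed})
        return reply

    # Every provider was unreachable: upstream connectivity loss also ends in deny.
    audit.append({"decision": "deny_under_uncertainty", "provider": None,
                  "fallback_from": failed})
    return None
```

The policy check runs once, before the provider loop, so the choice of provider never changes the policy outcome. The record only notes which provider served the request and which ones were skipped.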

What's the deny response the user sees?

A typical deny response carries an explicit error code, a human-readable description, and a request identifier. The identifier lets the user contact support with a specific reference. The support team queries the audit log by request identifier and sees the failure category and the policy state at the time. The user-facing experience is informative rather than mysterious.
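As a rough sketch of that flow, the snippet below builds a deny body and lets support look up the matching audit record by request identifier. The error code string, field names, and in-memory `AUDIT_LOG` are illustrative assumptions, not a documented response format.

```python
import uuid
from typing import Dict

# In-memory stand-in for the audit log, keyed by request identifier (assumption).
AUDIT_LOG: Dict[str, dict] = {}


def deny_under_uncertainty(failure_category: str, policy_version: str) -> dict:
    """Commit the audit record, then build the body returned to the caller."""
    request_id = str(uuid.uuid4())
    AUDIT_LOG[request_id] = {
        "decision": "deny_under_uncertainty",
        "failure_category": failure_category,
        "policy_version": policy_version,
    }
    return {
        "error": {
            "code": "GATEWAY_DENY_UNCERTAIN",  # hypothetical error code
            "message": ("The request was blocked because a policy decision "
                        "could not be completed. Quote the request ID to support."),
            "request_id": request_id,
        }
    }


def support_lookup(request_id: str) -> dict:
    """What the support team would see when querying by request identifier."""
    return AUDIT_LOG[request_id]
```

In this sketch the user-facing body carries only the code, the description, and the identifier. The failure category and policy state stay in the audit record, where support can retrieve them.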

How does fail-closed handle streaming responses?

For streaming responses, the gateway applies the policy decision at the start of the stream before any tokens flow back. Under uncertainty, the stream does not start. Once the stream is in flight, the gateway can still apply per-token classification or response-side redaction; under uncertainty during the stream, the gateway can terminate the stream and commit a deny-mid-stream record. The streaming posture remains fail-closed.
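A minimal sketch of the streaming posture, assuming a per-token check that returns `True` (clean), `False` (violation), or `None` (uncertain). The names `guarded_stream`, `StreamDenied`, and the audit fields are hypothetical.

```python
from typing import Callable, Iterable, Iterator, List, Optional


class StreamDenied(Exception):
    """Raised when the policy decision blocks the stream before it starts."""


def guarded_stream(policy_allows: Optional[bool],
                   tokens: Iterable[str],
                   check_token: Callable[[str], Optional[bool]],
                   audit: List[dict]) -> Iterator[str]:
    """Decide before the first token; under uncertainty the stream never starts."""
    if policy_allows is not True:
        decision = "deny" if policy_allows is False else "deny_under_uncertainty"
        audit.append({"decision": decision, "stage": "pre_stream"})
        raise StreamDenied(decision)
    return _relay(tokens, check_token, audit)


def _relay(tokens: Iterable[str],
           check_token: Callable[[str], Optional[bool]],
           audit: List[dict]) -> Iterator[str]:
    """Forward tokens until a check fails or is uncertain, then terminate."""
    for index, token in enumerate(tokens):
        if check_token(token) is not True:
            # Stop before forwarding the offending token and record the cut-off.
            audit.append({"decision": "deny_mid_stream", "stage": "in_flight",
                          "tokens_sent": index})
            return
        yield token
    audit.append({"decision": "pass", "stage": "complete"})
```

The pre-stream check lives in a plain function rather than inside the generator, so a denied request raises immediately instead of on the first read.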

Fail-Closed AI Gateway: Why the Default Has to Be Deny in Regulated Environments

A fail-closed AI gateway defaults to block when the policy decision is unreachable, when the prompt classification is uncertain, or when the gateway loses upstream connectivity. The opposite posture, fail-open, defaults to pass under the same conditions and lets the request through to the model. The choice between the two postures is the most consequential operational decision the architect makes when deploying an AI gateway in a regulated environment. EU AI Act Article 12, DORA Article 19, and Fannie Mae LL-2026-04 each presume an architecture where the audit record exists for every decision. A fail-open gateway loses the record at exactly the moment a regulator would ask for it. A fail-closed gateway preserves the regulatory posture at the cost of operational investment in availability.

I want to walk through what fail-closed actually means at the gateway boundary, why the alternative is incompatible with regulated deployment, what the operational trade-offs look like, and how the architecture absorbs the availability cost.

What fail-closed means at the gateway boundary

Fail-closed is the default-deny posture under uncertainty. The gateway encounters three categories of uncertainty that exercise the default.

Category 1: policy evaluation failure

The gateway cannot obtain a policy decision: the policy store is unreachable, the policy fails to parse, or the policy refers to attributes the request does not carry. Under fail-closed, the gateway denies. Under fail-open, the gateway passes the request through with no policy evaluation, which leaves the policy state field of the regulatory record empty.

Category 2: classification uncertainty

The classification component cannot decisively classify the prompt. The detector returns low confidence, the prompt contains content categories the classifier was not trained on, or the classifier timed out. Under fail-closed, the gateway denies the request or routes it to a human-in-the-loop review. Under fail-open, the gateway passes the request and the classification field records "unknown."

Category 3: upstream connectivity loss

The gateway lost connectivity to the LLM provider or to a downstream dependency. Under fail-closed, the gateway returns a deny response to the application. Under fail-open, the gateway might bypass itself entirely and route the application directly to the provider, which leaves no record of the bypass.
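The three categories can be folded into one decision function. The sketch below is illustrative only: the `Failure` labels, the `decide` signature, and the 0.8 confidence threshold are assumptions chosen for the example, and unknown inputs are modelled as `None`.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Failure(Enum):
    POLICY_EVALUATION = "policy_evaluation_failure"
    CLASSIFICATION_UNCERTAIN = "classification_uncertainty"
    UPSTREAM_CONNECTIVITY = "upstream_connectivity_loss"


@dataclass(frozen=True)
class Decision:
    outcome: str                 # "pass" or "deny"
    failure: Optional[Failure]   # set only for a deny under uncertainty


def decide(policy_allows: Optional[bool],
           classifier_confidence: Optional[float],
           upstream_reachable: bool,
           min_confidence: float = 0.8) -> Decision:
    """Return pass only when every input is known and favourable."""
    if policy_allows is None:
        # Category 1: store unreachable, parse failure, or missing attributes.
        return Decision("deny", Failure.POLICY_EVALUATION)
    if policy_allows is False:
        # Ordinary policy deny: the policy was evaluated and said no.
        return Decision("deny", None)
    if classifier_confidence is None or classifier_confidence < min_confidence:
        # Category 2: classifier timed out or returned low confidence.
        return Decision("deny", Failure.CLASSIFICATION_UNCERTAIN)
    if not upstream_reachable:
        # Category 3: no route to the provider, so no silent bypass.
        return Decision("deny", Failure.UPSTREAM_CONNECTIVITY)
    return Decision("pass", None)
```

A fail-open variant would differ only in what the three uncertainty branches return, which is why the default is better treated as an explicit design decision than as an implementation detail.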

Why fail-open is incompatible with regulated deployment

Three regulatory expectations make fail-open untenable for high-risk AI under the EU AI Act and adjacent regimes.

The contemporaneous record requirement

Article 12 requires automatic recording of events over the lifetime of the system. Article 19 expects the records to include period of use, input data, and identity of natural persons. A fail-open posture under policy or classification uncertainty produces records with empty fields at exactly the moments most likely to be subject to regulatory inquiry. The auditor reviewing the record sees an empty policy state for the high-stakes request and the deployer cannot defend the gap.

The deterministic enforcement requirement

The regulatory posture under Article 26 expects deployer enforcement to be deterministic and to operate as the deployer documents in the conformity file. A fail-open gateway that passes requests under uncertainty is operationally non-deterministic: the same request under different conditions produces different outcomes, and the variability is not under the deployer's control. The conformity file becomes inaccurate at exactly the failure modes the regulation cares about.

The independent-record requirement

The audit record must be independent of the application that made the request. A fail-open gateway that defaults to pass effectively delegates the decision to the application, which means the application's record (or lack thereof) becomes the only artifact. The record then fails the test of write-path independence, because the application is recording its own decision instead of the gateway recording it externally.

What real fail-closed control looks like

A fail-closed gateway in production operates with three properties.

Property 1: default deny under any uncertainty

The default outcome under policy failure, classification uncertainty, or upstream connectivity loss is deny. The deny is recorded in the audit log with the failure reason. The application receives an explicit error code and a request identifier the support flow can use to investigate.

Property 2: bounded retries and graceful degradation

The gateway implements bounded retries against the policy store and the classification component before falling back to default deny. Where the failure is transient, a retry recovers the decision. Where the failure persists, the deny is committed as soon as the retry budget is exhausted. The bound prevents the gateway from holding the request indefinitely.
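One way to express a retry budget is a cap on both attempts and elapsed time, with default deny once either is spent. The sketch below assumes the policy call raises `ConnectionError` or `TimeoutError` on transient failure. The function names and default values are illustrative, not recommended settings.

```python
import time
from typing import Callable, Optional


def evaluate_with_budget(evaluate: Callable[[], bool],
                         max_attempts: int = 3,
                         budget_seconds: float = 0.5,
                         backoff_seconds: float = 0.05,
                         clock: Callable[[], float] = time.monotonic,
                         sleep: Callable[[float], None] = time.sleep) -> Optional[bool]:
    """Return the policy decision, or None if the retry budget is exhausted."""
    deadline = clock() + budget_seconds
    for attempt in range(max_attempts):
        try:
            return evaluate()
        except (ConnectionError, TimeoutError):
            remaining = deadline - clock()
            if attempt == max_attempts - 1 or remaining <= 0:
                break  # budget spent: stop retrying
            # Exponential backoff, never sleeping past the deadline.
            sleep(min(backoff_seconds * (2 ** attempt), remaining))
    return None


def gateway_decision(evaluate: Callable[[], bool], **budget) -> str:
    """Map the bounded evaluation onto the fail-closed outcome."""
    allowed = evaluate_with_budget(evaluate, **budget)
    if allowed is None:
        return "deny_under_uncertainty"
    return "pass" if allowed else "deny"
```

Because the loop is bounded by both attempt count and wall-clock deadline, the worst-case latency added to a request is known in advance, which is what keeps the deny path from turning into an indefinite hold.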

Property 3: operational instrumentation that detects the deny conditions

The gateway emits telemetry for each deny under uncertainty. The platform team monitors the rate of denials by failure category. A spike in policy-evaluation failures triggers an alert; a spike in classification uncertainty triggers a model review. The fail-closed posture is the safety net; the operational instrumentation is what prevents the safety net from becoming a continuous workload.
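A sliding-window counter per failure category is one simple way to turn those denials into alerts. The class below is a sketch under assumed names and thresholds. A production deployment would more likely emit these counts to an existing metrics system than keep them in process.

```python
from collections import Counter, deque
from typing import Deque, Dict, List, Tuple


class DenyRateMonitor:
    """Counts denials under uncertainty per category over a sliding time window."""

    def __init__(self, window_seconds: float, thresholds: Dict[str, int]) -> None:
        self.window_seconds = window_seconds
        self.thresholds = thresholds  # category -> count that triggers an alert
        self._events: Deque[Tuple[float, str]] = deque()

    def record(self, category: str, now: float) -> None:
        """Register one deny event with its timestamp in seconds."""
        self._events.append((now, category))

    def alerts(self, now: float) -> List[str]:
        """Return the categories whose count in the window meets the threshold."""
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] < cutoff:
            self._events.popleft()  # drop events that fell out of the window
        counts = Counter(category for _, category in self._events)
        return sorted(category for category, n in counts.items()
                      if category in self.thresholds and n >= self.thresholds[category])
```

Keeping the categories separate matters because the responses differ: a policy-evaluation spike points at infrastructure, while a classification-uncertainty spike points at the classifier.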

Where the operational trade-off lives

The trade-off between fail-closed and fail-open is the trade-off between regulatory posture and availability. Fail-closed under repeated policy or classification failures translates into rejected user requests, which the workforce experiences as the AI tool being broken. The operational investment in availability is the cost the deployer accepts to maintain the regulatory posture.

The investment includes horizontal scaling of the policy store, active-active deployment across availability zones, conservative timeouts on classification, redundant paths to the policy decision point, and continuous monitoring of the failure-condition rate. The investment is meaningful but contained: it is the same investment any production identity or policy infrastructure makes.

The alternative (fail-open) saves the operational investment and loses the regulatory record. For regulated deployment, the trade is not available.

Compliance angle

The EU AI Act Article 26 deployer obligation expects continuous monitoring. The Article 9 risk management system expects identification of residual risks and the controls that address them. A fail-open posture surfaces as a residual risk the deployer must accept and document. A fail-closed posture is the architectural answer to the residual risk. The same logic applies to DORA Article 19 retention obligations and to the Fannie Mae LL-2026-04 disclosure-on-demand requirement.

DeepInspect

This is the architectural posture DeepInspect ships with. DeepInspect runs as a fail-closed enforcement gateway that sits at the AI request boundary as an external enforcement layer, operating as a stateless proxy between authenticated users or agents and any LLM endpoint. The default decision under uncertainty is deny. Every HTTP request is evaluated against per-route, per-role, identity-bound policy. The per-decision audit record is committed by the proxy, independent of the application and independent of the LLM provider, regardless of whether the decision is pass, redact, deny, or deny-under-uncertainty.

The record contains the verified identity, the role and authorization context, the data classification applied to the prompt, the model and version called, the policy version, the decision outcome including the failure category where applicable, and a cryptographic signature that prevents post-hoc modification. The operational design includes bounded retries, redundant policy decision paths, and continuous monitoring of the deny-under-uncertainty rate.
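To illustrate how a record can be made tamper-evident, the sketch below attaches an HMAC-SHA256 tag computed over a canonical JSON encoding. This is a generic example and not DeepInspect's actual record schema or signing scheme: the field names are hypothetical, and HMAC with a shared key stands in for whatever signature mechanism a real deployment uses (an asymmetric signature would allow verification without the signing key).

```python
import hashlib
import hmac
import json


def _canonical(record: dict) -> bytes:
    """Stable byte encoding so the same record always yields the same tag."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_record(record: dict, key: bytes) -> dict:
    """Return a copy of the record with an HMAC-SHA256 tag attached."""
    tag = hmac.new(key, _canonical(record), hashlib.sha256).hexdigest()
    return {**record, "signature": tag}


def verify_record(signed: dict, key: bytes) -> bool:
    """Recompute the tag over every field except the signature and compare."""
    body = {k: v for k, v in signed.items() if k != "signature"}
    expected = hmac.new(key, _canonical(body), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, str(signed.get("signature", "")))
```

Any later change to a signed field, such as rewriting a `deny_under_uncertainty` outcome to `pass`, causes verification to fail.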

Book a technical deep dive at deepinspect.ai.