Fail-closed
Fail-closed is the architectural property that governs how a policy enforcement point behaves when it cannot reach a definitive decision. A fail-closed gateway blocks the request when the policy lookup errors, when the identity claim is missing, when the classification model times out, or when the audit writer cannot persist the record. A fail-open gateway, by contrast, forwards the request and records a soft warning. EU AI Act Article 12 traceability obligations, Fannie Mae LL-2026-04 disclosure-on-demand requirements, and NIST AI RMF action lineage all assume that the enforcement point is fail-closed, because a missing decision record leaves the same evidentiary gap as a missing decision.
How fail-closed shapes the gateway runtime
The gateway has four request-time dependencies that can fail: the identity provider that validates the subject claim, the classification model that labels the payload, the policy decision point that returns the verdict, and the audit writer that persists the record. A fail-closed configuration returns a 503 with a reason code to the caller when any of those four returns an error or exceeds the latency budget. The caller retries. The 22-second median between initial access and handoff to a secondary threat group (Mandiant M-Trends 2026) is the operational reason fail-open is unsafe; a request that bypasses the policy because the classifier timed out is the foothold the attacker uses.
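The four-dependency check described above can be sketched in Python as follows. This is a minimal illustration, not the gateway's actual implementation: the names `enforce` and `Verdict`, the reason-code strings, the 50 ms per-stage default, and the 403 returned on an explicit deny are all assumptions made for the example. The only behavior taken from the text is that an error or a blown latency budget in any of the four dependencies yields a 503 with a reason code.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    status: int   # HTTP-style status returned to the caller
    reason: str   # hypothetical reason code, e.g. "policy_timeout"


# Shared worker pool so each dependency call can be bounded by a timeout.
_POOL = ThreadPoolExecutor(max_workers=8)


def enforce(request, validate_identity, classify, decide, write_audit,
            budget_s=0.05):
    """Run the four dependencies in order; any failure blocks the request."""
    ctx = {}
    stages = [
        ("identity", lambda: validate_identity(request)),
        ("classification", lambda: classify(request)),
        ("policy", lambda: decide(request, ctx["identity"],
                                  ctx["classification"])),
        # The audit record is written for allow and deny verdicts alike.
        ("audit", lambda: write_audit(request, ctx["policy"])),
    ]
    for name, step in stages:
        try:
            result = _POOL.submit(step).result(timeout=budget_s)
        except FutureTimeout:
            # Latency budget exceeded: fail closed.
            return Verdict(503, f"{name}_timeout")
        except Exception:
            # Dependency raised: fail closed.
            return Verdict(503, f"{name}_error")
        if result is None:
            # No definitive answer (e.g. missing identity claim): fail closed.
            return Verdict(503, f"{name}_missing")
        ctx[name] = result
    if ctx["policy"] != "allow":
        return Verdict(403, "policy_denied")  # assumption: deny maps to 403
    return Verdict(200, "allowed")
```

Note that nothing in the loop has a branch that forwards the request on failure; the only path to a 200 runs through four successful results. A fail-open variant would differ in exactly that respect, logging a warning in the `except` branches and continuing. One limitation of this sketch is that a timed-out call keeps running on its worker thread, so a real gateway would also need cancellation or a bounded pool.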
Where fail-closed needs hardening
A fail-closed gateway with a single policy decision point that itself fails creates an availability incident, and that incident pressures the operator to flip the configuration to fail-open. The hardening pattern has three parts: a horizontally scaled policy decision point with a local rule cache, a circuit breaker around the classification model so the gateway degrades gracefully under load, and a write-ahead audit buffer so the audit writer never blocks the request path. In the DeepInspect benchmark, the decision stays under 50 ms at p99 under that configuration, which keeps fail-closed operationally defensible.
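As one illustration of the hardening pattern, the circuit breaker around the classification model might look like the sketch below. The class name `CircuitBreaker`, the `CircuitOpen` exception, and the threshold and cooldown defaults are hypothetical. The sketch also assumes that "degrades gracefully" means rejecting quickly while the circuit is open, so the gateway still fails closed but without waiting out the classifier's timeout on every request.

```python
import time


class CircuitOpen(Exception):
    """Raised when the call is rejected without reaching the dependency."""


class CircuitBreaker:
    """Minimal single-threaded breaker; a real gateway would add locking."""

    def __init__(self, failure_threshold=3, cooldown_s=5.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_s = cooldown_s
        self.clock = clock          # injectable so tests can control time
        self.failures = 0           # consecutive failures seen so far
        self.opened_at = None       # timestamp when the circuit opened

    def call(self, fn, *args):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown_s:
                # Open: reject immediately so the caller can fail closed fast.
                raise CircuitOpen("classifier circuit open")
            # Cooldown elapsed: half-open, let this one trial call through.
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                # Open (or re-open after a failed trial) and restart cooldown.
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A gateway using this sketch would wrap the classifier call in `breaker.call(classify, request)` and map `CircuitOpen` to the same 503 it returns for any other classifier failure. The breaker changes how fast the gateway fails, not whether it fails closed.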
Related reading
- AI Inline Enforcement Architecture: Where the Policy Decision Sits and What It Has To Commit
AI inline enforcement runs the policy decision in the request path, before the model API call returns to the calling application. The architecture places a deterministic policy decision point between the application identity and the model endpoint and commits a per-decision audit record before the response forwards. This piece walks through the architectural components, the decision-time data shape, the failure modes the implementation has to handle, and the regulatory profile that the inline placement satisfies (EU AI Act Article 12, NIST AI agent identity and authorization Pillar 2 and Pillar 3, Fannie Mae LL-2026-04, DORA Article 6).
- Identity-Aware AI Gateway Architecture: How Inline Enforcement Binds Decisions to Users and Agents
An identity-aware AI gateway sits at the AI request boundary, attaches verified identity context to every model API call, evaluates per-route and per-role policies, and commits a per-decision audit record before the model response returns to the calling application. The architecture closes the post-authentication gap that most enterprise AI deployments have inherited from the credential-pooling pattern used by SDKs and proxy frameworks. This piece walks through the architectural building blocks, the call path, the audit primitives, and where the identity-aware gateway sits relative to existing IAM, API gateway, and DLP infrastructure.