DeepInspect vs Cloudflare AI Gateway: When Each Architecture Fits
DeepInspect and Cloudflare AI Gateway both sit between applications and LLM endpoints, and both call themselves AI gateways. The architectures differ in what they enforce, what they record, and which compliance regimes they support. Cloudflare AI Gateway is built for observability, caching, and routing at the edge. DeepInspect is built for identity-bound policy enforcement and per-decision audit evidence in regulated environments. This piece compares the two on architecture, enforcement model, audit posture, and the buyer fit for each.

Cloudflare AI Gateway was announced in 2023 and has grown into Cloudflare's unified plane for observability, caching, rate limiting, and routing across model providers. Cloudflare positions the product as the operational layer for AI traffic: see what is happening, cache repeated calls, fall back across providers, and rate-limit by endpoint. DeepInspect sits at a different point in the stack. It is a stateless proxy between authenticated users or agents and any LLM, enforcing identity-bound policy on every request and producing a per-decision audit record that the deploying organization controls.
Both products call themselves AI gateways, but the architectures fit different buying motions.
I want to walk through what each product does at the request layer, where the architectural differences land in production, and which buyer profile each is built for.
TL;DR
Cloudflare AI Gateway is an edge-based observability, caching, and routing layer for AI calls, sold through Cloudflare's developer platform and priced by usage. DeepInspect is an identity-aware policy enforcement layer and audit system for regulated AI deployments, sold to enterprises with compliance obligations under the EU AI Act, NIST AI RMF, ISO 42001, HIPAA, and sector-specific regimes.
Cloudflare AI Gateway: where it sits
Cloudflare AI Gateway operates at Cloudflare's edge. Applications point their AI API calls at a Cloudflare endpoint, and Cloudflare proxies the calls to OpenAI, Anthropic, Workers AI, Google AI, Azure OpenAI, Amazon Bedrock, Groq, Mistral, and other supported providers. The gateway provides analytics on requests and responses, caching of LLM responses for repeated requests, rate limiting per route, fallback routing across providers, prompt management, and a logs surface that captures requests and responses for inspection.
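As a minimal sketch of what "pointing calls at a Cloudflare endpoint" can look like, the helper below builds a gateway base URL. The URL pattern follows Cloudflare's documented format as best understood at the time of writing and should be checked against the current documentation. The account ID, gateway ID, and the function name are placeholders invented for illustration.

```python
from urllib.parse import quote

# Base host for Cloudflare AI Gateway (assumption: verify against current docs).
GATEWAY_HOST = "https://gateway.ai.cloudflare.com/v1"


def gateway_base_url(account_id: str, gateway_id: str, provider: str) -> str:
    """Build the base URL an application uses in place of the provider's own URL."""
    parts = (account_id, gateway_id, provider)
    if not all(parts):
        raise ValueError("account_id, gateway_id, and provider are all required")
    # Percent-encode each segment so unexpected characters cannot alter the path.
    return "/".join([GATEWAY_HOST, *(quote(p, safe="") for p in parts)])


# Placeholder identifiers; substitute real values from the Cloudflare dashboard.
url = gateway_base_url("ACCOUNT_ID", "my-gateway", "openai")
```

An OpenAI-compatible client would then use this value as its base URL instead of the provider's default, which is the only application change the edge model requires.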
The enforcement model is request shaping at the edge. Cloudflare can throttle, cache, fail over, and capture telemetry. The enforcement happens against the operational properties of the request, not against the identity of the natural person behind it. Cloudflare AI Gateway does not have native concepts for the authenticated end-user, the data classification inside the prompt, or the policy state at the moment of decision.
The audit posture follows the same edge model. Cloudflare retains logs according to its retention configuration, and the logs are available through Cloudflare's APIs and dashboard. The records are not signed for tamper evidence, and the retention is governed by Cloudflare's data lifecycle, not by the deploying organization's compliance retention rules.
DeepInspect: where it sits
DeepInspect operates between authenticated users or agents and the LLM endpoints they call. The enforcement model is per-request policy evaluation against identity context that the application supplies. Every request is evaluated against per-route and per-role policies, with prompt-level classification applied to the content. The decision is permit, redact, or deny, and the decision is recorded with identity, role, policy version, data classification, and outcome.
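The evaluation flow described above can be sketched in a few lines. This is an illustration under stated assumptions, not DeepInspect's implementation or API: the policy table, the role names, the `classify` heuristic (a single regular expression standing in for a real classifier), and every identifier here are hypothetical.

```python
import re
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    PERMIT = "permit"
    REDACT = "redact"
    DENY = "deny"


@dataclass(frozen=True)
class Request:
    user_id: str
    role: str
    route: str
    prompt: str


@dataclass(frozen=True)
class Decision:
    user_id: str
    role: str
    route: str
    policy_version: str
    classification: str
    outcome: Outcome
    forwarded_prompt: str  # what may be sent to the model; empty on deny


POLICY_VERSION = "example-v1"  # hypothetical version label

# Hypothetical policy table: (route, role) -> {classification: outcome}.
POLICY = {
    ("/chat", "clinician"): {"public": Outcome.PERMIT, "sensitive": Outcome.REDACT},
    ("/chat", "contractor"): {"public": Outcome.PERMIT, "sensitive": Outcome.DENY},
}

# Toy classifier: treats anything shaped like a US SSN as sensitive.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def classify(prompt: str) -> str:
    return "sensitive" if SSN_PATTERN.search(prompt) else "public"


def evaluate(req: Request) -> Decision:
    """Evaluate one request against the per-route, per-role policy."""
    classification = classify(req.prompt)
    rules = POLICY.get((req.route, req.role), {})
    # Default deny: unknown routes, roles, or classifications are refused.
    outcome = rules.get(classification, Outcome.DENY)
    if outcome is Outcome.PERMIT:
        forwarded = req.prompt
    elif outcome is Outcome.REDACT:
        forwarded = SSN_PATTERN.sub("[REDACTED]", req.prompt)
    else:
        forwarded = ""
    return Decision(req.user_id, req.role, req.route, POLICY_VERSION,
                    classification, outcome, forwarded)


decision = evaluate(
    Request("u-17", "clinician", "/chat", "Patient SSN 123-45-6789 needs review")
)
```

The sketch defaults to deny when no rule matches, so a missing policy entry fails closed rather than letting unclassified traffic through.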
The audit posture is built around the regulator's question, not around operational telemetry. Every per-decision record is signed, tamper-evident, and committed before the model response returns to the application. Retention is governed by the deploying organization, which is necessary for the six-month floor under EU AI Act Article 19 and for the longer retention windows under HIPAA, MiFID II, and similar regimes.
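One common way to make a log tamper-evident is to sign each record and chain it to the hash of the previous one. The sketch below shows that technique with Python's standard library. It is an assumption about how such a log could work, not a description of DeepInspect's record format: the field names, the HMAC-based signature (a production system might prefer asymmetric signatures and a managed key), and the helper names are all hypothetical.

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-key-do-not-use"  # assumption: real keys come from a KMS or HSM
GENESIS_HASH = "0" * 64  # prev_hash value for the first record in the chain


def _payload(record: dict, prev_hash: str) -> bytes:
    # Canonical JSON so the same record always serializes to the same bytes.
    body = {"record": record, "prev_hash": prev_hash}
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode()


def sign_record(record: dict, prev_hash: str, key: bytes = SIGNING_KEY) -> dict:
    """Return a log entry bound to its predecessor by hash and signed with HMAC."""
    payload = _payload(record, prev_hash)
    return {
        "record": record,
        "prev_hash": prev_hash,
        "hash": hashlib.sha256(payload).hexdigest(),
        "signature": hmac.new(key, payload, hashlib.sha256).hexdigest(),
    }


def verify_chain(entries: list, key: bytes = SIGNING_KEY) -> bool:
    """Recompute every hash and signature; any edit or reordering fails."""
    prev = GENESIS_HASH
    for entry in entries:
        payload = _payload(entry["record"], entry["prev_hash"])
        expected_sig = hmac.new(key, payload, hashlib.sha256).hexdigest()
        if entry["prev_hash"] != prev:
            return False
        if hashlib.sha256(payload).hexdigest() != entry["hash"]:
            return False
        if not hmac.compare_digest(expected_sig, entry["signature"]):
            return False
        prev = entry["hash"]
    return True


def commit_then_respond(log: list, record: dict, call_model):
    """Append the signed record first, and only then obtain the model response."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append(sign_record(record, prev_hash))
    return call_model()


audit_log = []
reply = commit_then_respond(
    audit_log,
    {"user_id": "u-17", "role": "clinician", "policy_version": "example-v1",
     "classification": "sensitive", "outcome": "redact"},
    lambda: "model response",
)
```

Because each entry embeds the previous entry's hash, altering or deleting an earlier record invalidates every later one, which is what lets a reviewer detect changes after the fact.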
DeepInspect is model-agnostic. It works in front of any HTTP-based LLM endpoint, including OpenAI, Anthropic, Bedrock, Azure OpenAI, Vertex AI, self-hosted Llama, self-hosted Mistral, and on-prem inference deployments. The deploying organization can route traffic across providers without the policy layer or the audit trail changing.
Feature comparison
The buying differences land at a few specific points.
The blank rows on the DeepInspect side are areas where DeepInspect does not compete with Cloudflare's edge platform. The blank rows on the Cloudflare side are areas where the edge-based architecture cannot produce what regulated buyers ask for.
Pick Cloudflare AI Gateway if
The buyer profile fits Cloudflare AI Gateway in several common cases. The application is consumer-facing or non-regulated, and the operational goals are caching, rate limiting, and observability across providers. The deploying team already runs on Cloudflare and wants a single dashboard for AI telemetry alongside CDN, Workers, and DNS. The compliance posture does not require identity-bound audit evidence, and the retention requirements fit Cloudflare's defaults. The team values per-provider failover and prompt management more than per-decision policy enforcement.
Pick DeepInspect if
The buyer profile fits DeepInspect when the deployment carries compliance obligations. The system falls under EU AI Act high-risk classification and needs Article 12 logging and Article 14 oversight evidence. The organization operates in healthcare, financial services, government, or another regime with sector-specific audit obligations. The architecture has to support identity-bound policy because the AI traffic passes through authenticated user and agent contexts that have to land in the audit trail. The retention windows run beyond what an operational telemetry platform provides. The deploying organization needs the audit trail to be tamper-evident and controlled by the organization, not by the gateway vendor.
Pricing approach
Cloudflare AI Gateway is sold through the Cloudflare developer platform with usage-based pricing tied to request volume, log storage, and additional features. The cost structure tracks operational scale. Pricing is published on the Cloudflare site.
DeepInspect is sold through enterprise contracts with pricing based on the deployment scope, the number of authenticated identities, the policy complexity, and the retention windows. Pricing conversations include the compliance regimes in scope and the audit posture required. DeepInspect pricing is not published as a list price because the deployment shapes the cost.
Where DeepInspect and Cloudflare AI Gateway can coexist
The architectures are not mutually exclusive in every deployment. An organization can run Cloudflare AI Gateway at the edge for operational telemetry, caching, and provider failover, and run DeepInspect inside the application boundary for identity-aware policy enforcement and per-decision audit evidence. The two products record at different layers and answer different questions. The edge layer answers "what AI traffic is moving through the platform and how does it perform." The enforcement layer answers "who made this specific decision, under which policy, against what data classification, and what was the outcome."
For deployments that need both answers, the architectures stack rather than substitute.
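When the two layers stack, the enforcement layer can pass an opaque identity reference to the edge so that edge telemetry can be correlated with the audit record. The sketch below assumes Cloudflare AI Gateway's custom-metadata request header, `cf-aig-metadata`, carrying a flat JSON object; the header's exact limits and logging behavior should be confirmed in Cloudflare's documentation. The function name, metadata keys, and example values are hypothetical.

```python
import json

# Assumption: custom metadata values must be flat scalars (no nested objects).
ALLOWED_TYPES = (str, int, float, bool)


def edge_headers(api_key: str, subject_id: str, role: str, decision_id: str) -> dict:
    """Build request headers that carry identity context to the edge gateway."""
    metadata = {
        "subject_id": subject_id,    # opaque identifier, not a name or email
        "role": role,
        "decision_id": decision_id,  # links the edge log line to the audit record
    }
    for key, value in metadata.items():
        if not isinstance(value, ALLOWED_TYPES):
            raise TypeError(f"metadata value for {key!r} must be a flat scalar")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        "cf-aig-metadata": json.dumps(metadata),
    }


headers = edge_headers("PROVIDER_API_KEY", "u-17", "clinician", "dec-0001")
```

Forwarding an opaque subject identifier rather than a name or email keeps personal data out of the edge logs while still letting the two records be joined on `decision_id`.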
DeepInspect
This is the architectural pattern DeepInspect was built around. DeepInspect sits at the AI request boundary as an external enforcement layer that produces identity-bound, per-decision audit records, deterministic policy enforcement, and tamper-evident evidence under the deploying organization's control.
If your deployment falls under the EU AI Act high-risk obligations or under a sector-specific regime that asks for per-decision evidence, the August 2, 2026 effective date for the high-risk requirements is close. Book a technical deep dive at deepinspect.ai.
Frequently asked questions
- Is Cloudflare AI Gateway a security product?
Cloudflare AI Gateway provides operational controls that have security-adjacent properties. Rate limiting, fallback routing, observability, and prompt management are part of running AI in production. The product is not positioned as a security or compliance enforcement layer. The Cloudflare documentation places AI Gateway in the developer platform alongside Workers AI and other AI tooling. Buyers should not expect identity-bound audit evidence, per-decision policy enforcement, or compliance-grade retention from Cloudflare AI Gateway. Those properties belong to a different category of product.
- How does Cloudflare AI Gateway handle compliance logging?
Cloudflare AI Gateway captures request and response logs subject to the configured logging settings and Cloudflare's data lifecycle. The logs are useful for debugging, cost attribution, and operational analytics. They are not designed as audit evidence for a regulatory inspection. A regulator asking for the Article 12 record of a specific decision wants the identity context, the policy version, the data classification, the decision outcome, and a tamper-evident integrity guarantee. Cloudflare's operational logs do not carry those fields by default.
- Can Cloudflare AI Gateway enforce per-role or per-route policies?
Cloudflare AI Gateway can apply rate limits per route and per token, and can route requests to different providers based on routing rules. The product does not have native concepts for end-user roles, per-role policies, or per-decision permit and deny outcomes tied to identity context. Organizations that need that level of enforcement run a separate policy layer on top of or alongside the edge gateway.
- What happens to audit records when Cloudflare AI Gateway is the only logging layer?
The records reflect what Cloudflare's edge sees: the request, the response, the timing, and the routing decision. The records do not contain the authenticated user identity unless the application has forwarded it as a header that Cloudflare logs, and Cloudflare's log retention follows Cloudflare's defaults rather than the deploying organization's compliance schedule. For regulated workloads, that posture leaves the audit trail incomplete.
- Are Cloudflare AI Gateway and DeepInspect direct competitors?
The products overlap on the surface description ("AI gateway") and diverge in architecture, audit posture, and target buyer. Cloudflare AI Gateway is an operational platform for AI traffic. DeepInspect is a policy enforcement and audit layer for regulated AI deployments. Some buyers choose one and not the other. Some buyers run both. The choice depends on the compliance requirements, the identity model of the deployment, and the audit retention windows.